To clarify why WHATWG exists and why W3C lost power over the HTML spec:

Around 2004 the W3C abandoned organizational effort on HTML in favor of things like XHTML2, XEvents, the semantic web, etc. The WHATWG was formed in reaction to that, rewriting HTML completely from its W3C HTML 4.0 version, for example to make it better suited to web applications and to specify things in more detail.

Since the formation of WHATWG, W3C HTML specs became copy-pastes (often not even correct copy-pastes) in an effort to satisfy paying member companies (https://github.com/w3c/charter-html/issues/14#issuecomment-1...).

You can see everything WHATWG maintains here: https://platform.html5.org/ (they are the green question mark)

The W3C still maintains a good chunk of web standards though, such as CSS, Wasm, web security, etc.

The biggest problem is that in addition to going all in on xhtml, the w3c actively dismissed the need to correctly specify the actual behavior of the web, rather than some idealized model.

Essentially the w3c went “incorrect” html isn’t valid so we don’t need to specify it, even though all browsers support that “incorrect” content. Instead they said “everyone should just use xhtml as that makes sure the syntax is correct”.

Unfortunately they again failed to address the real world:

* xhtml is necessarily slower to display because you cannot do anything until you’ve got a full document (otherwise it will definitely fail to validate)

* because IE didn’t display xhtml it was published with the html mime type, so browsers that did support xhtml still had to parse it as html

* as a byproduct of the last one, invalid xml got added to documents, which would then cause the browsers that did try to treat them as xhtml to appear broken

* xml is also incredibly hard to actually get right - take RSS that was ostensibly XML from the outset. Even that has to be parsed as html because of the amount of broken xml.
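To make the strict-versus-lenient point in the list above concrete, here is a minimal sketch using only the Python standard library. The input string and the helper names (`MARKUP`, `TagLogger`, `parse_as_xml`, `parse_as_html`) are hypothetical, made up for illustration. Note that `html.parser` is only a forgiving tokenizer, not the WHATWG tree-construction algorithm; the sketch just shows that an XML parser rejects the whole document on the first well-formedness error while an HTML-style parser keeps going.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical input: misnested <b>/<i> tags, as often seen in hand-written markup.
MARKUP = "<p><b>bold <i>both</b> italic</i></p>"


class TagLogger(HTMLParser):
    """Records start/end tags as the tokenizer sees them; bad nesting is not an error."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def parse_as_xml(markup):
    """Return True if the markup is well-formed XML, False if the parser rejects it."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def parse_as_html(markup):
    """Feed the markup to the lenient tokenizer and return the tag events it saw."""
    logger = TagLogger()
    logger.feed(markup)
    logger.close()
    return logger.events


if __name__ == "__main__":
    print("well-formed XML:", parse_as_xml(MARKUP))
    print("HTML tokenizer events:", parse_as_html(MARKUP))
```

The XML path gives you nothing at all for this input, which is roughly what a user of an xhtml-mode browser would have seen. What the lenient path should do with the misnested tags afterwards is exactly the part that needed a real specification.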

By going all in on XML the w3c essentially went all in on a technology that people didn’t actually use or want.

But browsers did actually need an accurate specification that matched the real world, and that’s what became HTML5 through the hard work of people from Apple, Mozilla, (eventually) Google, Microsoft, and Opera. The w3c was not really involved. The end result is that in the modern DOM you are far less likely to need per-browser hacks than you once did.

That explains the HTML spec but not the DOM spec.

The DOM is specified as part of the HTML spec.

The whatwg HTML spec defines exactly how html is parsed, and exactly how every element interacts with the scripting environment. Just defining the grammar is not sufficient.

Historically for instance something like


Produced a different DOM tree in different browsers. WHATWG specified what should actually happen. IIRC IE managed to produce a DOM graph rather than a tree in the above example.

The DOM is a completely separate unrelated specification. https://dom.spec.whatwg.org

fair point. I was conflating whatwg spec and html5 (Actually, it's possible they were at one point the same spec - there was some work over the last 5 or so years to stop putting literally everything in a single spec document, unfortunately after a decade of web engine work everything turns into a single amorphous blob)

They were never the same. The closest thing is that the DOM spec and W3C’s XML Schema spec were joint publications for several years.

What are you talking about? I am not talking about the W3C's nonsense, I was fairly clear that I was talking about the actual real spec, which is html5, via the WHATWG [1]

What you claimed is absolutely wrong.

Please note that it defines the DOM interfaces for all of the core elements, and more or less every DOM API, including all elements, as well as most programmatic types - even things like the XHR objects. They used to all be in a single giant "HTML living standard" document, and have in the relatively recent past been split into separate spec docs (many of which reference the original "living standard").

[1] https://html.spec.whatwg.org

The WHATWG DOM spec was spun out of W3C DOM level 3. Neither document was ever associated with or aligned to any version of HTML, though the WHATWG did try something like that and backed off when all the browser vendors refused to give them the time of day. Now the WHATWG document is largely the document of record and the W3C's DOM level 4 is largely a snapshot of the WHATWG document.

None of this is either confusing or a mystery. It’s all out there in the open and the people who maintain these documents respond to email. I typically avoid talking about the DOM online because many developers aren’t aware of what it is and are less aware of its history and sometimes people get sensitive about it.
