These are the same people who created XHTML, an ivory tower idea nobody was waiting for... who didn't support the most popular layout method at the time, tables, in their new styling language CSS.
W3C became irrelevant because they kept thinking they could just tell the entire web what to do, that they'd make enormous technical investments to satisfy the W3C's latest fashions.
I interacted with them once, over their seamless iframe proposal. I had to point out that the two biggest uses of iframes at the time, i.e. Twitter embeds and Facebook apps, could not make use of it, despite being a perfect fit. They just hadn't considered that.
> These are the same people who created XHTML, an ivory tower idea nobody was waiting for... who didn't support the most popular layout method at the time, tables, in their new styling language CSS.
You certainly could lay out pages using tables in XHTML, but the point of the standard in the first place was to enable semantically sound documents for the sake of interoperability and to facilitate separation of concerns. So maybe if you authored XHTML documents that was already an important consideration to you.
On a side note, now that the web development industry seems to have collectively given up any ambitions for semantically sound documents it's strange that the criticism against table-based layouts prevails.
I honestly don't think either was the purpose. XHTML existed to make parsing easier.
The only things you couldn't do in XHTML 1.0 Transitional that you could do in HTML were having unclosed tags and using uppercase in tag names. That's it.
Now yeah, the strict version tried to force you into semantically sound documents... but that was completely orthogonal to XHTML vs. HTML. Both XHTML and HTML were available in both transitional and strict forms.
The real problem is that HTML is a massive pain in the ass to parse. You can close tags out of order, some tags can't even be closed (<hr>), some tags can optionally be closed (<p>), nothing is case-sensitive, and because of how flexible SGML is, you need a DTD to properly parse any SGML implementation (fun fact: SGML allows for a bunch of different markup styles, but the HTML DTD explicitly disallows most of them). XHTML sought to eliminate all that by mapping HTML onto XML, which was much easier to parse.
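To make the parsing complaint concrete, here is a minimal sketch in Python using only the standard library. The `SOUP` string and the helper names are made up for illustration. Note that `html.parser` is only a forgiving tokenizer, not a full HTML tree builder, so this shows the difference in error policy rather than a real browser's recovery algorithm.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Tag soup of the kind described above: uppercase names, an unclosed <p>,
# an <hr> that is never closed, and <b>/<i> closed out of order.
SOUP = "<P>first paragraph<hr><p><b><i>bold italic</b></i>"

# The same content written so that it is well-formed XML.
CLEAN = "<div><p>first paragraph</p><hr/><p><b><i>bold italic</i></b></p></div>"


class TagLogger(HTMLParser):
    """Records start tags; the stdlib HTML tokenizer does not reject input."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag names already lowercased.
        self.tags.append(tag)


def html_tags(markup):
    """Return the start tags the lenient HTML tokenizer sees."""
    logger = TagLogger()
    logger.feed(markup)
    logger.close()
    return logger.tags


def is_well_formed_xml(markup):
    """Return True only if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(html_tags(SOUP))            # the HTML side just keeps going
print(is_well_formed_xml(SOUP))   # the XML side refuses the whole document
print(is_well_formed_xml(CLEAN))
```

The XML parser gets away with being simple precisely because it is allowed to give up at the first error.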
> I honestly don't think either was the purpose. XHTML existed to make parsing easier.
That being the only goal would have resulted in a much simpler standard. Extension modules, for example, are completely orthogonal to that goal. I think that the standards themselves do a good job of describing what motivates them.
> The only things you couldn't do in XHTML 1.0 Transitional that you could do in HTML were having unclosed tags and using uppercase in tag names. That's it.
That's true, but ignores the opposite question—what you could do in XHTML that you could not in HTML.
> You certainly could lay out pages using tables in XHTML, but the point of the standard in the first place was to enable semantically sound documents for the sake of interoperability and to facilitate separation of concerns
That was a major focus of HTML5, as well, which was more successful at enabling that (arguably, it was more of the focus of HTML5, which didn't try to impose major syntactic change as well, even though it also supported an XML form).
> On a side note, now that the web development industry seems to have collectively given up any ambitions for semantically sound documents
Semantic soundness in the strict sense has always been something of a niche concern, not something that was a general web dev industry goal and then abandoned.
> it's strange that the criticism against table-based layouts prevails.
Table-based layout remains an accessibility problem, which is a more practical issue than abstract concern for semantic soundness (though related).
> That was a major focus of HTML5, as well, which was more successful at enabling that (arguably, it was more of the focus of HTML5, which didn't try to impose major syntactic change as well, even though it also supported an XML form).
I agree. Don't confuse my interpretation of the point of XHTML with some sort of endorsement. For the record, I think XHTML is an unnecessarily complex standard that results in more work for little added value. I do recognize, though, that the point of XHTML is unrelated to its success to that end.
Comparing HTML5 to XHTML is also a bit of a no-brainer. There are 14 years between their introductions, and HTML5 obviously had a history of lessons learned from XHTML to take into account.
> Semantic soundness in the strict sense has always been something of a niche concern, not something that was a general web dev industry goal and then abandoned.
What is the strict sense of semantic soundness? I'd agree that few people care about semantic soundness even in a general sense. I frequently see documents where the presentation seems prioritized over content. But the ambition did exist at some point, and was more of a mainstream concern in, say, 2005 than it seems to be now, and the idea of a more semantic web was peddled by big names like Tim Berners-Lee.
> Table-based layout remains an accessibility problem, which is a more practical issue than abstract concern for semantic soundness (though related).
I'd say that they are strictly related. You improve accessibility primarily by improving the semantic representation of documents. I don't see semantic soundness as an "abstract concern" for this reason. The most useful screen readers unfortunately seem to rely on some level of guesswork to make verbal sense of documents that rely on spatiality to communicate element relationships.
After writing XHTML for several years (giving it a full-faith attempt) I never understood the point. You keep repeating "semantically sound" but I can't fathom what that means in your context. I never saw any indication that XHTML brought significant practical semantic information or standardization over what HTML5 can do. It did add significant gratuitous verbosity that made XHTML documents much harder to read and edit.
> You keep repeating "semantically sound" but I can't fathom what that means in your context.
Documents built on the principle that their structure should relate to the meaning of their content rather than its presentation are what I consider "semantically sound". By negative example, an HTML document filled with div pyramids just to apply layout information, and obtuse class and id names are not what I consider semantically sound. However clumsy it was in practice, XHTML and CSS sought to address this by making the markup extensible and moving style information out of the document.
> I never saw any indication that XHTML brought significant practical semantic information or standardization over what HTML5 can do.
Obviously HTML5 had the advantage of hindsight and the perspective to learn from some of the mistakes of XHTML while adopting some of its more useful qualities. I should also note that I'm discussing the point of XHTML, not trying to tell anyone that it was particularly successful to that end. Don't confuse the two.
One of the interesting things you could do with XHTML was ditch the HTML entirely and write an XML document expressing the semantic content, and then couple that with an XSLT stylesheet to convert it into XHTML for display. This way the exact same resource could be read by machines to get the semantic content, and then read by browsers and transformed into the display content.
I'm not sure if anyone ever actually used this seriously though. Definitely very "ivory tower" design. But in the abstract it's a cool idea.
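As a rough illustration of the idea above: a semantic XML source document is transformed into XHTML for display. Python's standard library does not ship an XSLT processor, so this sketch swaps in hand-written ElementTree code as a stand-in for the stylesheet. The `<article>`, `<headline>` and `<para>` vocabulary is invented for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical semantic source document; these element names are made up.
SOURCE = """<article>
  <headline>W3C and the browsers</headline>
  <para>First paragraph.</para>
  <para>Second paragraph.</para>
</article>"""


def xhtml(tag):
    """Build a namespace-qualified XHTML tag name for ElementTree."""
    return f"{{{XHTML_NS}}}{tag}"


def to_xhtml(source_xml):
    """Map the semantic document onto XHTML, roughly as a stylesheet would."""
    article = ET.fromstring(source_xml)
    # Serialize XHTML elements with a default namespace instead of a prefix.
    ET.register_namespace("", XHTML_NS)

    html = ET.Element(xhtml("html"))
    head = ET.SubElement(html, xhtml("head"))
    ET.SubElement(head, xhtml("title")).text = article.findtext("headline")

    body = ET.SubElement(html, xhtml("body"))
    ET.SubElement(body, xhtml("h1")).text = article.findtext("headline")
    for para in article.findall("para"):
        ET.SubElement(body, xhtml("p")).text = para.text

    return ET.tostring(html, encoding="unicode")


print(to_xhtml(SOURCE))
```

A machine consumer would read `SOURCE` directly, while a browser would only ever see the output of `to_xhtml`.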
I wrote an XSLT stylesheet that could turn an XHTML document into a display of its own source, complete with indenting, code folding and syntax highlighting. I was quite pleased with that :)
When I worked at the BMJ all of our content was stored as XML and rendered to the browser using XSLT, although this was done on the back-end and the resulting XHTML embedded in our various Spring applications. This sort of thing is probably the most common use of XHTML today...
XHTML documents have exactly the same semantics as HTML. Neither is more semantically sound than the other.
XHTML just defined a slightly different syntax which was XML compatible. This was useful if you were using XML tools, but it didn't affect semantics at all.
XHTML 2.0 was very different from HTML semantically. Some of its tags (ARTICLE, SECTION, MENU, etc) wound up smashed into HTML5 later and became semantically meaningless again, but XHTML 2.0 tried to be more semantically sound than HTML and is a large part of how W3C lost the war to HTML5, because semantics are hard and most of the browsers didn't care about semantics.
A reason XHTML 2.0 got so bogged down in committee and never actually finished a standard was that the attempt was made to define semantically what an ARTICLE would be, how SECTIONs work, and what things a browser or semantic web crawler could infer, summarize, or build from such things. For instance, one group of the committee argued you couldn't have SECTIONs outside of an ARTICLE; that an ARTICLE consisted of zero or more SECTIONs (and maybe SECTIONs could be nested inside of each other). Folks argued for SECTIONs to have concepts of names that could be listed in auto-generated Tables of Contents.
HTML5 mostly just defines ARTICLE and SECTION as optional block-level content elements, with no other real importance. This leaves them as merely fancier synonyms for DIV. Semantics is almost entirely left to ARIA, and while HTML5 has come back around to the idea that an ARTICLE tag should imply, for instance, ARIA role="article", there's still a bunch of interesting reasons that people concerned with ARIA semantics continue to write "redundant" things like <ARTICLE role="article"...
This isn't against XHTML 1.1, which was HTML 4.01 shoved into a container of "if it isn't valid XML, display an error page instead;" rather, a lot of the hate is against XHTML 2.0, which decided to rip out HTML features such as forms, frames, most of the old elements such as <b> or <i>, and generally screw compatibility completely.
Ooph yes. I had forgotten how they were rewriting HTML essentially from scratch and this time basing it on XML. In fairness, it wasn't a completely dumb idea as HTML had picked up a lot of warts from its long history and anyway almost no browser implemented the HTML spec as written. The browser wars especially spread a lot of debris across implementations. The temptation to rip it up and start again is very strong with software engineers even in the best of circumstances (heck, just ask Google about this).
But getting everyone else to start over as well was always going to be an uphill struggle even if it was truly the best thing ever.
XHTML 2.0 also did screwy stuff with MIME types I believe. It required specific MIME types that a lot of browsers at the time didn't support, and because of that most browsers would straight-up refuse to render XHTML 2.0.
Yes, it was a requirement that XHTML 2.0 documents be served as application/xhtml+xml (which IE didn't support at the time), but that was really a non-issue: with so much renamed and moved around versus HTML 4.01 and its XML reformulations (XHTML 1.0, XHTML 1.1), there was no graceful fallback story anyway.
The bigger problem was that it required content to be served as application/xhtml+xml, and gave the same element and attribute names, in the same namespace, different semantics to what XHTML 1.0/XHTML5 gave them, with different implementation requirements (and it being impossible to satisfy both).
AFAIK this essentially got resolved by XHTML 2.0 being abandoned (in 2009, years after HTML 5 had moved to being jointly developed with the W3C).
IIRC, the main objection was that error handling was an all or nothing affair. Whereas most HTML on the web is (or was) broken to a greater or lesser degree. There was also the objection that stricter parsing made it harder for hobbyists.
In these days of Typescript, where web devs seem to like having stricter rules, it could play better. But that ship has already sailed.
I never understood that either except, as you said, for the hobbyist's sake. A lot of XHTML was generated from XML and, if one is using XML, chances are programming is involved in the transformation to XHTML. But programming has strict rules itself and will also fail if not adhered to so I never understood the complaint of "draconian" error checking in XML/XHTML.
People underestimated the extent to which markup may be mixed in from sources you didn't control, and the power that this ability gives to your users. Say you built a shiny new forum engine with from-scratch XHTML markup. You try to sell it. Most of your customers say that they've already been running forum software since 1995 but the existing posts allow inline HTML (which was less unsafe in 1995, because no Javascript), which is all badly misnested. As soon as they dump the previous data, their site stops working.
Or you import a small Javascript library from 1999 that generates its own innerHTML for a few elements, but does it with HTML. Oops.
Or you built a new CMS with shiny XHTML markup, but before you had the CMS your org just hand-wrote pages which you now need to parse and import into the CMS.
These were all very real considerations in the 2002-2004 period; I've dealt with all of them. Backwards compatibility is often the most important feature you can offer, because it directly affects the value the end user gets out of the product. Sites that were concerned with "doing it right" in that time period largely failed, while sites that "did it fast" in a hacky, XSS-prone way are now worth hundreds of billions of dollars.
As the scale of a program increases, the probability that someone will do something wrong increases polynomially. Consequently, as web sites got larger and larger, the probability that some component would break the XML went up quickly. This is a difficult pattern to deal with, pushing up the skill floor required, and as HTML5 shows, it isn't even all that necessary.
There was also similar exposure from the data side; as the amount of data you handled increased, the odds that some data would tickle some code path that you didn't even know could blow up went up. You write your news front page in XHTML, and everything seems fine for six months, until someone finally includes an ampersand in their headline, and your entire front page crashes for two hours (not in any way monitoring will pick up, either, so you're getting customer reports), and it takes you hours to discover that someone was passing through the headline (and just the headline!) unencoded.
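The headline failure is easy to reproduce. This is a minimal sketch in Python, assuming a hypothetical one-line template; the function names are invented for the example. The only difference between the two render paths is whether the headline passes through `xml.sax.saxutils.escape`.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical template fragment for a front-page headline.
TEMPLATE = "<h1>{headline}</h1>"


def render_raw(headline):
    """Pass the headline through unencoded (the bug described above)."""
    return TEMPLATE.format(headline=headline)


def render_escaped(headline):
    """Encode &, < and > before the text goes into the markup."""
    return TEMPLATE.format(headline=escape(headline))


def parses_as_xml(markup):
    """Return True only if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


headline = "Q&A with the editor"
print(parses_as_xml(render_raw(headline)))      # one bare ampersand is fatal
print(parses_as_xml(render_escaped(headline)))  # &amp; keeps the page alive
```

Every other headline works fine for months, which is exactly why this kind of bug survives until production.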
The problem isn't XHTML's rigidity per se; personally I'm inclined more in that direction myself. The problem is when you have a ton of sloppy systems working together (MySQL, old HTML generation code, plugins from third parties you don't control, open source written by people whose belief in their understanding of HTML exceeds their actual understanding, decade-old internal databases with poor validations and unknown provenance, and so on and so on indefinitely), and then try to suddenly, at the last minute, couple that big sloppy pile of technology to a strict technology. That sudden mismatch there at the end was a huge problem.
One of the reasons I tend to prefer being as strict as possible is that in general, starting with existing strict-tech and adding sloppy-tech to it is no big deal; the sloppy tech doesn't complain that it only gets a subset of possible data it will accept. And if you need to couple to strict-tech, you still can. But if you start with a sloppy-tech system and for some reason need to couple it to a strict-tech system... prepare for some long nights and blown deadlines. So, professionally, the correct default is to choose strict-tech whenever possible. But XHTML forced that at almost the worst possible place.
> But programming has strict rules itself and will also fail if not adhered to so I never understood the complaint of "draconian" error checking in XML/XHTML.
In most cases, if your code has some syntax error, the author of the code sees the syntax error; in the web case, if your code has some syntax error, the user sees the syntax error. That's the dramatic difference.
The other reality is that, unlike program code, (X)HTML far more often has user content intermixed, and it's rare for people to implement sanitisation correctly (do you handle U+0000? U+FFFF? U+1FFFF? most people outputting XHTML historically haven't, even if they get the security critical stuff (like "<") right).
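For the character question, here is a sketch of one possible sanitiser in Python, built from the `Char` production of the XML 1.0 specification. The function names are made up, and silently dropping characters is just one policy (replacing them with U+FFFD is another). Note that U+1FFFF is a Unicode noncharacter but still falls inside the range that production allows, so this filter keeps it; whether a given consumer is happy with it is a separate question.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Complement of the XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_NOT_XML_CHAR = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text):
    """Drop code points that can never appear in an XML 1.0 document."""
    return _NOT_XML_CHAR.sub("", text)


def safe_text_node(text):
    """Make user text safe for well-formedness, not just for injection."""
    return escape(strip_illegal_xml_chars(text))


user_input = "x\x00 < y & z"
markup = "<p>" + safe_text_node(user_input) + "</p>"
print(ET.fromstring(markup).text)
```

Escaping "<" and "&" alone covers markup injection, but the stripping step is what keeps stray control characters from breaking the document.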
I've been involved with multiple HTML and XML parsers, but never validators. :)
The reality ten years ago was that, when a number of prominent XML advocates were using XHTML (and actually using it as such, serving it as such), almost all of their sites had user input means where the input was sanitized well enough for HTML to be secure (and not have any markup injection), but not for XML well-formedness (they handled all the markup injection risks in XML, but not all the other WF requirements). If the very people who claim XML is easy can't get it right, can everyone else?
The error handling has already been mentioned. There also was that issue of what mime-type to send in your HTTP headers to get the page displayed correctly in various clients. While you could develop in XHTML, you needed to serve it as technically broken HTML to be compatible with older browsers that still had a significant market share.
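The usual workaround for the MIME type problem was some form of content negotiation. This is a simplified sketch in Python, where `pick_content_type` is a hypothetical helper and not any framework's API. It serves `application/xhtml+xml` only to clients whose `Accept` header explicitly lists it, and falls back to `text/html` otherwise. It deliberately ignores wildcards and relative q-value ranking.

```python
XHTML_TYPE = "application/xhtml+xml"
FALLBACK_TYPE = "text/html"


def pick_content_type(accept_header):
    """Return the XHTML media type only if the client explicitly accepts it."""
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        if media_type.strip().lower() != XHTML_TYPE:
            continue
        # Default quality is 1.0; an explicit q=0 means "do not send this".
        quality = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            return XHTML_TYPE
    return FALLBACK_TYPE


# A client that advertises XHTML support gets the XML parser...
print(pick_content_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
# ...and one that only sends a wildcard gets the same bytes as tag soup.
print(pick_content_type("image/gif, image/jpeg, */*"))
```

The catch is the one described above: the fallback branch means the same document has to survive being parsed as HTML, so it cannot rely on anything XML-specific.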
Yes. The standard wasn't designed with the people actually implementing browsers, and as a result it was unusable for content in the real world, whatever advantages it had in the theoretical world where it was widely and correctly implemented. That was a major problem.
The expression "ivory tower" has an entirely different meaning. Just because the couple of dominant players chose to ignore a standard, particularly in a time and age where they were outright hostile to interoperability initiatives, it doesn't mean the standard was not reasonable.
What good is a standard you can't use in practice? There also were other issues: XHTML was chasing the dream of the semantic, well-formed, machine-readable web, but it didn't do enough to help with pragmatic problems web designers trying to deliver a product actually faced. As much as I like the idea myself from a theoretical perspective, the market had other priorities.
I don't think the idea works from a theoretical perspective.
The browser is a communication channel between a publisher and a reader. They may want different things...
But the "semantic, well-formed, machine-readable web" idea is, in theory, a demand that the channel imposes on both parties. It's not something the publisher wants. It's not something the reader wants. Nobody cares what the channel wants; demanding extra information that isn't relevant to what any party to an actual transaction is trying to accomplish is always going to be doomed.
Readers care about positional information at a fairly minor level. Publishers care about it a lot. And hey, positional information has robust, if annoying, implementations.
I don't see this connection of XHTML and semweb. The preamble to the original XML spec reads as follows:
> The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML.
Basically, XML's purpose was to make parsing rules for web documents generic and DTD-less, in particular to support new vocabularies (eg. SVG and MathML) in addition to HTML/XHTML. The XML namespace spec back then was another pillar for advancing that goal.
HTML5 then simply imported SVG and MathML into HTML as external vocabulary, without a need for namespaces or other modularization meta-technique.
That XML has failed on the web (and was successful only in enterprise and publishing) shouldn't make fans of structured documents on the web bitter, though: SGML is exactly as capable as XML to describe the syntax of markup documents (with tag inference, a feature recently being discussed for re-inclusion into XML under the term invisible markup), and even can parse markdown and other Wiki syntaxes, can do HTML-aware, injection-free macro expansion, and a whole lot of other things waiting to be discovered by XML fans.
Also, XHTML 2 tried to push people to tags with defined semantics like ARTICLE, SECTION, MENU, etc. Most of the tags exist now in HTML5, but their semantics were neutered in the transition.
It doesn't really make development harder, because the semantic tags are defined based on properties that aren't visible to a parser. So it doesn't matter whether you use them correctly -- no validator will be able to tell.
> who didn't support the most popular layout method at the time, tables, in their new styling language CSS.
That's quite a misunderstanding of the past. Your proposition is only technically correct: CSS1 (Dec 1996) did not have table display properties, but they were added in CSS2 (Mar 1998). They were available early enough to matter.
The reason why many Web authors did not design with these properties is not the fault of the W3C, but rather because of the typical Microsoft sabotage. In the early 2000s, I personally did not give a shit about IE - my standard compliant CSS rendered fine in other popular browsers.
> And luckily, tables have become redundant as layout technique, since CSS would do that, already then.
It took well over a decade to get CSS to play nicely with layouts. It's disingenuous to present CSS as the obvious solution to a problem while criticising table usage to implement grid layouts, as this line of argument entirely ignores the recent history of the web.
Did you miss the 5-10 years during which every CSS designer tortured themselves replicating tables with floats? Google 'pure css page footer' to see the wreckage. CSS tables took years to be supported, and even then, only brought back what people had been irrationally told to stay away from. It took another 5 years for flex box to become usable, adding something actually new.
To this day, changing a site's entire design without touching its markup is a mirage. It only works in contrived scenarios with extremely artificial restrictions, and nobody does it in the real world. CSS Zen Garden was demoscene.
Tabular markup for non-tabulated data is not rational & semantic markup is certainly not irrational.
You don't come from print design by any chance?
With flexbox, for example, you can change order of appearance contrary to the code order in the markup. It's not a mirage. Anyone who's used a browser's reader mode, or distilled view (as Brave calls it), knows the usefulness of applying different styles to a fixed markup.
Separation of presentation is possible.
Indeed, now that we've moved to responsive design and the number of devices and UAs has exploded, the separation of design and data is coming into its own -- but instead pixel-perfection is still being chased with a billion @media declarations.
> Tabular markup for non-tabulated data is not rational & semantic markup is certainly not irrational.
Aren't you talking past each other? The issue is that with initial versions of CSS, replicating the powerful layout possibilities of tabular markup was not possible. Developers generally built layouts using "float" instead, and had trouble replicating some of the standard table features like keeping columns the same height.
Only with full CSS 2.1 implementation in IE8 did this state of affairs change, at that point you could apply rules like "display: table" to DIVs and get the same layout possibilities without using tables.
Flexbox, on the other hand, is actually an improvement (as are the Grid Layout and Template Layout Modules). But it didn't come until after CSS 2.1.
You've not been doing that whole web design thing for very long, by any chance? Those of us who have been around for a while do remember the inadequate layouting capabilities of CSS of bygone eras. While I was on the semantic markup side of the debate at the time, it's not as if the pragmatists that just went with the table for convenience's sake did so for no reason at all...
I started web design for lynx browser, back when using pine for email was hot, then moved on to Mosaic and NN.
There's a clear body of web design/dev people that think it's just a visual display medium, and a lot of those seem to come from print design.
The web for me has always been primarily a medium for information transfer, visual design is nice but not at the expense of semantic markup; table markup for visual layout is entirely unnecessary (and was terrible for accessibility).
People went with table layout because marketing people demanded pixel-matched presentation and/or they didn't care to make sure their content was machine readable. The same views led to IE-only sites and give us websites that don't bother with semantic blocks now, or that don't work without JavaScript when that JS is just being used for presentational flair.
My mistake, then. Point is, you didn't need to start off in print design to end up with certain attitudes under discussion. Customers demanded it, and you can only do so much trying to convince them otherwise.
For another, wishing for a certain amount of control over the layout seems like an entirely reasonable demand to me (eg you should not need to jump through hoops just to place something in the center of the viewport).
Lastly, if we're being honest, how much has semantic markup improved the web experience in practice, and how much did layout tables hurt us? When I used to do web design, I was a good citizen, avoiding layout table and arguing for semantic markup. But nowadays, I'm far more forgiving, though I still think there's some value in it insofar as proper markup can improve the screen reader experience.
Personally, I was fine with ditching tables, but stuff like differing box models, unequal CSS support, quirks modes, etc. did make for a rather painful cross-browser development experience: making things work out correctly (or at least gracefully degrade) in multiple IE versions, Mozilla, Opera, ... could be a challenge.
> Did you miss the 5-10 years during which every CSS designer tortured themselves replicating tables with floats?
I never really understood why people had such difficulty with this. I was able to execute table-less layouts while still supporting IE5 on Mac.
> To this day, changing a site's entire design without touching its markup is a mirage.
That's only because HTML authoring is dead. No one writes HTML well these days. Just look at tools like Elementor. How many nested divs do you need to add a faux button to a website? It's ridiculous.
Write well-structured, semantic HTML being mindful of a separation of concerns, and flipping between stylesheets is a piece of cake.
Can you give us a link to your elegant table-less IE5 supporting website? I would like to see how you achieved it.
I lost a lot of hair trying to make a simple 3 column layout where the middle column would scale to the width of the window and could consist of multiple DIVs in a vertical row, all of the same width. AKA "baby's first blog" layout. Something that should have been one of the design cases for CSS.
> Can you give us a link to your elegant table-less IE5 supporting website?
I cannot. They no longer exist. This was 14 years ago. I left web development shortly afterwards.
For your example, horizontal alignment was easy. One container div with a width of 99.9% and a left/right margin of auto. Inside you place three divs (columns) with a width of 33.3% and float left. Add another div at the end to clear the float.
You don't want the sidebars to scale with the screen though, just the middle. And as I recall the obvious solution of just setting a fixed width on the two outer divs and letting the middle one autoscale didn't work for some stupid reason. Maybe because they scaled to the content, not the width of the remaining space.
Exactly. And the same people who keep producing tons and tons of specs no one uses -- and even if anyone wants to use they're so complex it doesn't make any sense.
RDF, JSON-LD, "semantic web", piles and piles of garbage. And all written by the same group of 10 people who have never written a web app by themselves.
From what I can see, your example of "DOM v3 Document.load()" is a great example of why the W3C failed to convince browsers to implement their proposals.
That spec seems to be mostly related to Java and it seems to me that it hardly considers how ECMAScript could use it i.e. someone had an idea, but couldn't translate that into a useful feature...
Foolishness aside, it is important to understand that the DOM came out of work from the XML Schema spec, and for many years updates to those two documents were always released together.
The W3C isn't irrelevant, as there is more to the web than just HTML, just like OASIS isn't irrelevant for schematic design and business integration. This talk of irrelevance is a hard argument to make for many frontend developers who can't tell the difference between the standard DOM and React's VDOM.
Yes, tables. There is a reason why people were abusing tables for layout even though everyone hated it.
In the end, we finally have CSS grid now ( https://css-tricks.com/snippets/css/complete-guide-grid ), which does what we were trying to achieve back in the day with tables. We had to wait until 2016 to have a type of layout which is considered standard in most GUI toolkits ...
Which got released in 2009, but took another year to overtake IE6+7 in market share. This means `display:table` was of limited usefulness for nearly a decade after its introduction...
Sure but that's a different issue. It's not the fault of the standard if a major browser doesn't implement it.
In practice I was using display table in some designs in the mid 00's but also using conditional stylesheets to hack an IE layout. This was not a particularly good solution but it "worked".
But that's kind of the point: we had to do a lot of things that just "worked" because we didn't have the tools that were considered standard in almost any other GUI toolkit.