
HTML markup is a good example of premature design optimization. It may have been good in theory to separate content from presentation, but if you look at the web today, the way pages are generated is a huge mess.

Even this site, which uses tables for layout when "you're not supposed to use tables for layout", is a good example of why HTML is so bad for creating web pages. There's no reason I should have to jump through all the hoops I do to display two div blocks side by side in a horizontal box. All other XML based layouts I've used (Android and Flex) are a piece of cake compared to messing with HTML, CSS, and JavaScript.




I'm not sure it was particularly premature for the original use case, which was writing hypertext documents. HTML inherited the idea of structure/style separation (though initially implemented in a sort of half-assed way) from previous document markup languages, like SGML and LaTeX, where it had gotten a decent number of years of exercise, and worked fairly well. I mean it still works pretty well in LaTeX, though not perfectly.


You're exactly right... for the original purpose - hypertext documents - HTML and CSS work just fine. The problems arise from the fact that we've hijacked these documents to produce web applications which use the barest stub of a document to get running. And then we try to shoehorn a dynamic site into what was originally designed to be a static document. Dynamic content was originally the exception, not the norm.

Frames were actually a good effort to solve the problem of web-site "chrome" (navigation bars, headers, etc), and were somewhat useful. But they proved to not be flexible enough for designers (rightly so), and thus we have the mess that we have now.


This is a failing of CSS, specifically. CSS has extremely limited tools for building a page layout.

See http://www.w3.org/TR/css3-flexbox/ and http://dev.w3.org/csswg/css3-grid-align/
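
For illustration, here is a minimal sketch of the "two blocks side by side" case using flexbox. It assumes the `display: flex` syntax that the flexbox spec eventually settled on, which differs from the earlier drafts, and the class name `row` is made up for the example.

```html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Two blocks side by side</title>
<style>
  /* "row" is a hypothetical class name, used only for this sketch. */
  .row { display: flex; }                 /* lay the children out in one horizontal line */
  .row > div { flex: 1; padding: 1em; }   /* each child takes an equal share of the width */
</style>
</head>
<body>
<div class="row">
  <div>Left block</div>
  <div>Right block</div>
</div>
</body>
</html>
```

No floats or clearing are involved, but it only helps in browsers that implement this version of the spec.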


But it doesn't make sense to me: why does presentation need to be separated from content? On pretty much every single page on the internet, layout and presentation are so tightly coupled that it ends up being more of a hassle to maintain a separate style sheet than it is to declare stuff inline.


content and presentation should be separate so that:

- the browsers can cache presentation instead of loading it every time

- presentation only has to be defined once, instead of every time it's used

- presentation can change based on factors like screen size

- accessibility features, like high-contrast themes, stay possible (they are not with tightly coupled presentation logic)

if your presentation and content are so tightly coupled that it's easier to inline everything, you're doing it wrong.
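
a rough sketch of what the points above look like in practice, with made-up class names. the rules are inlined in a `<style>` block only to keep the example self-contained; the caching benefit assumes they live in an external file referenced with `<link rel="stylesheet">`.

```html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Separated presentation</title>
<style>
  /* "note" is a hypothetical class name. The rule is defined once
     and applies to every element that carries the class. */
  .note { padding: 1em; background: #eee; border-radius: 4px; }

  /* Presentation adapts to a narrow screen without touching the markup. */
  @media (max-width: 480px) {
    .note { padding: 0.5em; font-size: 90%; }
  }
</style>
</head>
<body>
<p class="note">First note.</p>
<p class="note">Second note, styled by the same single rule.</p>
</body>
</html>
```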


If you step back you'll realize these are all failures of HTML and the latency of networks that we have to work around, not requirements.

For caching, why do I have to reload a whole page just to change the article part? The HTML 'it's a document' obsession and terrible DOM model that took so long to be updated.

Why don't I define a high level presentation document and then sub-documents? Network latency and the total bizarreness of CSS selector priority.

Screen size. Why can't you dock elements on a page the way form-based programming languages have allowed for the last 10 to 15 years? HTML doesn't have the ability.

And as for accessibility? If accessibility is taking hints from markup about how to present the page, in reality the markup is tightly coupled to presentation anyway. Accessibility is tightly coupling the markup to the presentation; what it really means is that screen readers can format the content nicely without the need for a style sheet.


I also have a few questions to ask which support separation of content and presentation:

Have you ever worked on a project that used inline styles on almost every paragraph or heading? If yes, you will instantly know how much of a pain it is. If you've ever transferred said content to a totally different design you should be scarred by this experience.

Have you also tried adding a page to a website that uses a table-based layout (the prime example of presentational HTML)? When a modern design is fit onto one of these layouts, adding content becomes a really painful experience.

I can see first hand the benefits of good separation.


I can attest to this. As a client-side developer I can go into detail about how much pain is involved in changing the skin on a site when the original developers did not structure their HTML well, including inline styles all over the place. Plus, in many cases it seemed the developers had no idea how HTML actually works, creating pages that will never validate and causing all kinds of weird side issues from browser to browser.

I have CSS selectors that go five to six levels deep because of tables contained in tables contained in tables with no classes or ids on any elements. Oftentimes those tables in tables are totally unnecessary.

Div > span > table > tr > span > td > span > div

That's not the way to build a web page. My guess? They used Visual Studio for layout as if they were coding a desktop application.


Are you serious?

Don't get me wrong, there is a lot to hate about HTML and CSS. But unless you're working on a 5-page website, separating those two is a blessing. Do you not remember the nightmare of FONT tags?


> Do you not remember the nightmare of FONT tags?

Yes, but that's simply a failure from having no ability to define abstractions.

Separating form/design/layout from content is one possible abstraction boundary, but it's just one arbitrary one, so it's kind of strange that they tried to bake it into the platform instead of making it easier to define whatever abstractions are meaningful for you.


They deliberately don't let you define whatever abstractions are meaningful to you; it's called the Principle of Least Power: http://www.w3.org/DesignIssues/Principles.html#PLP

Just explaining that what you consider a "strange" design choice is actually deliberate and carefully thought out, not defending it. For all I know, it might have been carefully reasoned out but the wrong conclusions reached. But the results speak for themselves: a platform that is now synonymous with "the Internet".


or mobile scaling. or changing the design. or changing a detail on EVERY PAGE of your website (like the font).


> why does presentation need to be separated from content?

You can't think of any instance when you would want to access the content irrespective of the presentation?

Imagine if, instead of HTML and CSS, web pages were delivered as pre-rendered PNGs. Make a list of all the different things that would break. That's why presentation should be separated from content.


This is the exact problem cited in the original post. Why are we designing for hypothetical users when we have actual users to design for?


I'm not really following you. "Make a list of all the different things that would break" - these things are things that are in use today by real people. No hypothetical people involved.


One group that has always been chasing the "holy grail" of separation of content and presentation is big publishers. They want one content source that is controlled by "editors" and then the content can be rendered differently for the different avenues of publication. Maybe a transform of the content is sent to online databases like LexisNexis or Westlaw, or sent to a printing press for a book, or sent to the web, or abstracts sent to a bibliography service, etc.

This is why SGML was big in the publishing industry before the web.

It's mostly achievable but there are clearly problem areas such as tables where sometimes the presentation is an integral part of the content.


I agree with you, but the main reason to do this is to keep it DRY (Don't Repeat Yourself).

Templating systems help with this a lot, but they don't completely get you around being able to add the class 'round' to everything you want to have rounded corners. That's really why you keep things separate, so you can maintain a single point of change.


HTML had the opportunity to get better with XHTML and strict enforcement of schemas and XML syntax (which I'm willing to bet Android and Flex require).

The problem was that doing so broke most of the web, or was not internalized by page creators. So we're stuck with the current situation (bad markup, crufty designs, etc.).

Imagine if the first HTML editors forced a schema check before save. I think we'd be in a much better place now if they did...


The myth that the web would be better with strict XML parsing is convincingly debunked here: http://web.archive.org/web/20110514122249/http://diveintomar...

Web pages (unlike Android layout files) are complicated composites of multiple data sources that are generated on-the-fly, and are combined in complicated (but sometimes low-tech) ways like string manipulation. Under such circumstances, it is simply too hard to create perfect XML every time.


That isn't really related to what he said at all. Sure, web browsers should be forgiving in what they accept, but dev tools shouldn't be. How would you like the C compiler that tried to interpret any old chicken scratch as a valid C program? Never a compiler error again!


> How would you like the C compiler that tried to interpret any old chicken scratch as a valid C program? Never a compiler error again!

Oh fun, bad analogies time! How would you like it if your word processor refused to save a text document because it detected an incomplete sentence?


Nope, no analogy. If I expect a machine to understand a language, then it had better be able to determine whether or not a document written in that language is valid. It doesn't matter if it is a programming language or a markup language. Permissive modes are great for accepting the work of others, but when learning how to write that language in the first place a strict interpretation is best. That way you can focus on getting it right.


C compilers do this all the time. If you're lucky, they'll warn you about undefined behavior as they do it, instead of just silently making all sorts of optimizations because the standard allows them to.

Now as it happens, C compilers have a syntax validator as part of them. Lots of HTML editors, past and present, have used HTML validators too...


What dev tools? Any text editor is a dev tool that can be and is used to create HTML for browsers to (forgivingly) accept.

That's actually a fundamental design aspect of HTML which was partly responsible for its popularity.


Last I checked IE, Firefox and Chrome all had a dev mode. Sure, they might not have had them at first, but a strict mode or a parse check or something of that sort would have been helpful all along.


Of course, but there are ways of combining data sources that will produce valid XML.


I would argue the contrary, that the very reason why HTML took off so fast was that any fool was able to craft a site, and it would work even if it had a few bugs in it. Failure tolerance is a great feature. Especially if you consider it in the context of document authoring, for which HTML was originally designed, it's better to read a document that has one unclosed <b> tag in it than to be completely unable to read it because there's a syntax error in the markup.


But the problems sskates mentions would not be helped by better schema compliance; they start and end with the fact that CSS is a miserably poor layout engine that is not powerful enough to effectively separate content from presentation.

HTML itself has its warts, sure, but the fact that we end up constantly resorting to HTML and/or Javascript edits to do things that should be happening in CSS alone is, to me, a much worse problem.


Pages made by decent developers are not a mess. Certainly orders of magnitude less messy than handling all your presentation in code, mixed with business logic and controllers.

Separation of presentation and content allows for information to be accessible by everyone, indexed/readable by machines, adaptable to different displays. It's also easier to maintain (content editors can't mess with layouts) and makes many performance improvements possible.

The HN site has no special reason to use tables; it's perfectly possible to render this layout using semantic, clean markup. CSS has come a long way since 1999.


But nowhere near as easy or as cross browser compliant, especially at the time this code was written.


It may be true if HN code was written before 2001.


It's still true now; IE6 still has a decent number of users especially in the corporate world and no CSS solution is as simple and direct as tables even today.


Well, we are allowed to evolve.


Actually, is it so wrong to use tables (like we're not supposed to) in a professional CSS design? They do great in every browser from IE 6 to Opera and they don't need clear:both or other tricks.

I miss designing with tables... And I just don't do it anymore because I was told not to, for maybe no good reason.


There are perfectly good reasons. Read my comment above and the many answers here: http://stackoverflow.com/questions/83073/why-not-use-tables-...


Ceasing to adopt standards, best practices and support users who are disabled because "the old way seems to work better" is a terrible way to advance technology.


Well, I have done things with HTML which make folks cringe who don't understand why I am doing it.

Imagine a web page with a giant table, every other row of which contains another table, and where the entire page contains probably 20000 INPUT tags, half of which are potentially exposed to the user under one set of visibility rules or another.

Now if it wasn't a bulk payment interface for wiring money out to hundreds of clients, paying potentially up to 5000 invoices in a single run, it would be entirely insane. As it is, the insanity mostly comes from how much we have to do to handle concurrency issues over application protocols like HTTP, which leads to fun stuff in the database.


Switching to a new inferior technology simply because it's hip is not a good way to evolve technology.


Using semantic HTML and CSS to design pages is certainly not a "hip" technology. Its use case and benefits, while not perfect[1], are well understood and proven. It's not a "broken" use case, like using tables for layout is.

[1] highly interactive web apps.


I also like how CSS has less layout functionality than tables, specifically <td width="*" valign="middle">.

Sure, you can emulate this with "display: table;" but a) it's not backwards compatible and b) that's the same darn thing as just using a table!
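
For reference, a minimal sketch of that emulation, with hypothetical class names. It assumes a browser that supports the CSS 2.1 table display values; the 200px sidebar width is an arbitrary choice for the example.

```html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>display: table sketch</title>
<style>
  /* "layout" and "fixed" are hypothetical class names. */
  .layout { display: table; width: 100%; }
  .layout > div { display: table-cell; vertical-align: middle; } /* like valign="middle" */
  .fixed { width: 200px; }  /* the other cell absorbs the remaining width */
</style>
</head>
<body>
<div class="layout">
  <div class="fixed">Sidebar</div>
  <div>Main content</div>
</div>
</body>
</html>
```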


It's not "the same darn thing". The <table> tag is for tabular data; it implies relationships between its elements that are of utter importance for screen readers and machines. It might be ok visually, but structurally it's a mess.

display:table is not an emulation, it's exactly the same behavior as a table, exposed for use at will. You can use the flexbox model for the same effect: http://jsbin.com/ejeraq

Be very glad that you have good vision and Google engineers spend millions and decades on making some sense of tag soup.


Your passion is commendable. I'll not go into the "a <li> is more semantic and accessible than a <td>" debate.

I'll just say that 1) display:table-cell doesn't work in IE6/7 and 2) CSS's limitations make it annoying or impossible to achieve functionality that was easy with table layouts (e.g. mixing percentage widths with fixed widths, and vertical alignment).


What you fail to mention is that table layouts make a lot more things annoying or impossible to achieve.


> The <table> tag is for tabular data, it implies relationships between its elements that are of utter importance for screen readers and machines.

Nope. The <table> element was originally defined as an all-purpose chainsaw for two-dimensional relationships. The browser crackmonkeys locked us into that in the late '90s, and we have to deal with it forever. We are stuck with the legacy of billions of old documents, and thousands of old HTML parsers.

The W3C is pissing into the wind by trying to rewrite history. They cannot simply wish away technical lock-in by putting a "TABLE elements are semantic" clause in a standard. That needs to be repealed, and the we-must-ignore-history children must be given a spanking and sent to their rooms.

What the W3C should do is define a "tabulartext" attribute:

    <!-- A table of data to read -->
    <table tabulartext>
        <tr><th>Title</th> <th>Title</th></tr>
        <tr><td>Data</td> <td>Data</td></tr>
    </table>

    <!-- Structural markup -->
    <table>
      <tr>
        <td>{{left_sidebar}}</td>
        <td>{{content}}</td>
        <td>{{right_sidebar}}</td>
      </tr>
    </table>

If they did this, the screen readers and Alexa top 500 sites would STAMPEDE towards a bright new future of accessibility, machine readability, and backwards compatibility.


You're joking right?

The HTML table model allows authors to arrange data -- text, preformatted text, images, links, forms, form fields, other tables, etc. -- into rows and columns of cells. - HTML 4.01 specification, 1999

Tables were added in the HTML3 spec:

HTML 3.2 includes a widely deployed subset of the specification given in RFC 1942 and can be used to markup tabular material or for layout purposes. Note that the latter role typically causes problems when rending to speech or to text only user agents. - HTML3.2, 1997

The "for layout purposes" was a big mistake; they even acknowledged that it wasn't adequate. Fixed 2 years later, or 10 years ago. There is no reason to mark up things as tables when you can have a much clearer document outline using the proper header and grouping elements.

HTML5 parsers have no problem keeping up with old tag soup, but that doesn't mean we should keep writing crap. Implying that sites built using only tables are accessible and machine readable is just.. asinine.



