
Ask HN: Why is HN HTML laid out in tables? - joeybaker
Hacker News is certainly not HTML5 friendly. Heck it doesn't even validate in HTML4 http://validator.w3.org/check?uri=http%3A%2F%2Fnews.ycombinator.com%2F&charset=%28detect+automatically%29&doctype=Inline&group=0

<body><center><table border=0 cellpadding=0 cellspacing=0 width="85%" bgcolor=#f6f6ef><tr><td bgcolor=#ff6600><table border=0 cellpadding=0 cellspacing=0 width="100%" style="padding:2px"><tr><td style="width:18px;padding-right:4px"><a href="http://ycombinator.com"><img src="http://ycombinator.com/images/y18.gif" width=18 height=18 style="border:1px #ffffff solid;"></img></a></td><td style="line-height:12pt; height:10px;"><span class="pagetop"><b><a href="http://news.ycombinator.com/news">Hacker News</a></b><img src="http://ycombinator.com/images/s.gif" height=1 width=10><a href="http://news.ycombinator.com/newest">new</a> | <a href="http://news.ycombinator.com/newcomments">comments</a> | <a href="http://news.ycombinator.com/ask">ask</a> | <a href="http://news.ycombinator.com/jobs">jobs</a> | <a href="http://news.ycombinator.com/submit">submit</a> | <font color=#ffffff>joeybaker's comments</font></span></td><td style="text-align:right;padding-right:4px;"><span class="pagetop"><a href="http://news.ycombinator.com/x?fnid=0uLs1HxN7c">login</a></span></td></tr></table></td></tr><tr style="height:10px"></tr><tr><td><tr><td><table border=0><tr><td> …………
======
mixmax
Because it works. No mucking about with CSS overflows, stuff that jumps to the
next line, things that don't align, etc.

------
maukdaddy
Because despite what the CSS purists say, tables work damn well for laying out
content.

~~~
citricsquid
Just like an iPad _works_ as a door stop.

edit: heh, downvotes. I am not a "css purist"; there are times when tables are
the correct solution, but this isn't one of them. Being able to say "it works"
doesn't explain or justify their use.

------
profquail
I rewrote the HN markup to use XHTML + CSS + MicroFormats last year, but never
got a chance to re-write the actual templating code in news.arc
(<http://www.arclanguage.org>).

If anyone's interested in finishing that last part, let me know and I'll be
happy to send you the templates and stuff I made. I also added some
rudimentary support for mobile-specific stylesheets and scripts.

~~~
dpurp
i don't know if i'll have time to finish anything, but if you would like to
send it to me (email in profile), i'd be interested in looking it over and
trying it out.

------
Mz
I know a lot of folks here are actual programmers (unlike me, I mean) and can
nitpick anything to death, but maybe folks have missed the part about HN being
a little side project that exists to serve the goals of YC? It isn't a high
priority for PG. YC is the real business. This is a free service and it serves
some of YC's needs but there are no ads, it isn't monetized, etc. Unlike some
of the websites folks here own, it seems to me HN does not provide enough
value for pg to justify jumping through hoops backwards, blindfolded and on
fire to make folks here happy with some cutting edge, WOW! coding.

(Personal note: Now I feel better about my sites being laid out in tables.
:-P)

------
teej
Because it doesn't matter.

~~~
bradleyland
If someone writes C/Python/Lua/language-of-your-choice code that violates good
principles, misuses constructs, and generally looks like crap, everyone calls
them a lousy coder. Yet when it comes to HTML, using semantically incorrect
markup is somehow ok.

I understand why pg hasn't updated the display markup, and it's his website,
so he can do what he likes with it, but to say "it doesn't matter" is to give
anyone writing bad markup a pass. It degrades our artform. That matters.

~~~
wglb
In a previous lifetime, I worked on the code generator for a compiler for an
implementation language. We had a rule that said "A good compiler generates
assembly code that an assembly programmer would be fired for writing." Early
releases of that compiler would get bug reports from the OS implementation
team that "this code is wrong". We would sit down and walk through the
generated code, and there would be an "Oh" moment where the programmer would
realize that it was simply different, and in all cases, faster.

Similarly, the code generated by good Python, to a CPU, looks like crap
because it repeats stuff, allocates and releases memory, makes unnecessary
calls, swaps stuff between stack and registers, and a whole host of other
sins.

But we don't care, because what we care about is the end result: the Python or
Lua or Ruby code generates lovely computational results. Perhaps 1% of us spend
time looking in detail at the actual machine code generated. The purpose of
these languages is to 1) Let us write good, readable code and 2) generate a
useful, sometimes beautiful result.

Similarly, the purpose of this particular Arc program is to generate a useful
result, that is, a readable, useful set of html files that our browsers render
readably.

So really, "it doesn't matter" if the assembly language is junk, or if the
html is not something early versions of Patrick would turn out with notepad.

Think of the generated html as assembly language. It no more degrades our art
form than the goofy stuff that modern languages make our CPUs eat.

~~~
bradleyland
I'm sorry, but you have an incorrect view of HTML. HTML's sole purpose is not
display. If that were the case, it would be a simple matter of identifying
what renders fastest and most accurately, but again, that's not HTML's sole
purpose.

The destination goal for HTML is to convey information about the structure and
context of the content, not just how it is displayed. So we must care about
the HTML generated. It is requisite to its function in providing additional
machine-readable information.

~~~
wglb
So what about the information and structure of the HN site is missing by the
way that it is currently displayed? What additional machine-readable
information is missing in the way that HN (or really any other site) produces
the information? Search?

There are a number of aggregator projects that various HN members have built
by scraping this "broken" html and they seem to work quite nicely.

I spent a couple of years deep in the SGML world and am fully cognizant of all
the arguments about how content needs to be completely separated from
presentation. HTML is really a weak sister in that world. I don't think my
view of HTML is incorrect.

In the real world, the ship of requiring _correct_ HTML from a grammatical
perspective left the harbor back in the 90's. If what you say were true,
browsers would refuse to render broken HTML.

~~~
bradleyland
You clearly know the answer to your own question, but you don't think it's
important. I'd ask you to take a step back and have a look at what I'm saying.
I'm saying that I understand why pg has set his priorities as he has. All I'm
suggesting is that we be honest about it. Let's not say it "doesn't matter".

Currently, aggregator projects work with HN because they know and understand
the HN markup specifically. In an ideal world (and one in which we don't live,
obviously), a "scraper" library should be able to identify things like comment
streams based on contextual information. Think of the power that comes just
from having indexes that are able to identify the title and content body. Now,
what if we take that a step further and build an indexer that can recognize
comments. One that can infer that one comment is made in reference to another
based on its nested hierarchy. Are tables the right structure for that?

I'm asking you to dream. I'm asking you not to be complacent with the tools
that "work" today. That's all. If you're content to use what you've got, and
you don't care if we ever end up with markup that enables these powerful new
ways of relating to data, fine, but don't say it doesn't matter. It matters.
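
To make the nesting argument concrete, here is a minimal sketch of the kind of generic reader described above. It assumes a hypothetical markup in which each comment is an `<li>` holding a `<p>`, with replies in a nested `<ul>`; that is not what HN emits, and the names `CommentTreeParser` and `comment_depths` are made up for illustration. Under that assumption, reply depth falls straight out of the element nesting, with no site-specific knowledge:

```python
from html.parser import HTMLParser


class CommentTreeParser(HTMLParser):
    """Collects (depth, text) pairs from nested <ul>/<li> comment markup.

    Assumes the hypothetical structure described above: every comment body
    sits in a <p>, and replies live in a <ul> nested inside the parent <li>.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0          # how many <ul> elements are currently open
        self.comments = []      # collected (depth, text) pairs
        self._in_body = False   # True while inside a comment's <p>
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.depth += 1
        elif tag == "p":
            self._in_body = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "ul":
            self.depth -= 1
        elif tag == "p":
            self._in_body = False
            text = "".join(self._buffer).strip()
            # Top-level comments sit inside one <ul>, so they get depth 0.
            self.comments.append((self.depth - 1, text))

    def handle_data(self, data):
        if self._in_body:
            self._buffer.append(data)


def comment_depths(markup):
    """Return a list of (reply_depth, comment_text) in document order."""
    parser = CommentTreeParser()
    parser.feed(markup)
    parser.close()
    return parser.comments


SAMPLE = """
<ul>
  <li><p>Because it doesn't matter.</p>
    <ul>
      <li><p>It matters.</p>
        <ul><li><p>Think of it as assembly.</p></li></ul>
      </li>
    </ul>
  </li>
  <li><p>It works.</p></li>
</ul>
"""

for depth, text in comment_depths(SAMPLE):
    print("  " * depth + text)
```

The same reader would work on any site that used this nesting convention, which is the point being argued; with table rows and spacer images, the depth has to be reverse-engineered per site.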

------
olalonde
Here's pg's official answer (<http://news.ycombinator.com/item?id=1998708>) :

why is the UI so completely neglected?

Because when I spend time on HN my top priority is features that will make the
content better. I believe that matches the priorities of the users-- that
users would rather use a site with good stories and comments and a primitive
UI than one with a slick UI and worse stories and comments. And time is a
zero-sum game. Spending more time on UI = spending less on quality.

The focus on content quality above all is the reason you find yourself saying
later "If there were any other community like this..."

[...]

~~~
jpdugan
The OP isn't talking about Ajax calls and DOM transformations. He's talking
about using semantically correct markup (c'mon, it is literally an ordered
list) and a style sheet. Things that would make the site faster, easier to
maintain, and easier to parse.

If this were anyone other than pg, you'd all be excoriating the developer for
living in the 90s.

~~~
pg
It wouldn't make the site easier to maintain. This stuff is all generated by
software. All that would change is what the software generated as output.

How much faster would HN pages render if they used "semantically correct
markup?"

~~~
jpdugan
Why are you generating styling at all? It's not dynamic information. It's a
static asset that can be cached.

Semantically correct markup uses fewer tags, so the output is smaller (also,
there'd be no inline styling, which would also make the page smaller). Even a
small difference on a popular site could matter. But only you can answer that
for HN.

Look, do what you want. You want to write bullshit markup, be my guest. But
there isn't a single web dev in this community who would code output like this
for their own site. They're just too chickenshit to tell you.

~~~
pg
Generating and caching are orthogonal questions. You can generate stuff, and
then cache it. And in fact HN does do a lot of caching.

I wasn't asking how much faster "semantically correct markup" would make HN
pages render as a rhetorical question. I'm genuinely curious. If you want to
make claims that x is faster than y, you should be prepared to back them up
with numbers, not merely with more heated language.

~~~
sayrer
Generating and caching can be intertwined. For example, HN seems to generate
an inline script at the top of every page. The script doesn't change nearly as
often as the posts do, so that script is sent redundantly at the top of every
page. If that script were located at a separate URL, set to never expire, every
visitor would load it once and never again. If you need to change the script,
you can change the linked URL. Doing that can avoid an expensive calculation
for the server, but also a network request for the client which could be very,
very expensive.
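
A rough sketch of that "expire never, change the URL instead" pattern, for anyone who hasn't seen it. Everything named here is an assumption for illustration: the `/static/` prefix, the `vote.js` file name, and the helper functions are hypothetical and have nothing to do with how news.arc is actually written. The idea is to put a hash of the script's bytes into its URL and serve it with a long-lived `Cache-Control` header:

```python
import hashlib

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def versioned_url(name, content):
    """Return a URL that changes whenever the script's bytes change."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    stem, dot, ext = name.rpartition(".")
    if not dot:
        # No file extension: just append the digest.
        return "/static/%s.%s" % (name, digest)
    return "/static/%s.%s.%s" % (stem, digest, ext)


def cache_headers():
    """Headers for the static response: cache for a year, never revalidate."""
    return {"Cache-Control": "public, max-age=%d, immutable" % ONE_YEAR_SECONDS}


def script_tag(name, content):
    """The only thing each generated page needs to carry: one short tag."""
    return '<script src="%s"></script>' % versioned_url(name, content)


script = b"function vote(node) { /* hypothetical script body */ }"
print(script_tag("vote.js", script))
print(cache_headers())
```

Editing the script changes the digest, so pages start pointing at a new URL and the old cached copy is simply never requested again.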

To me, it sounds like the caching you're talking about is on the server side.
I think you mean something similar to memoization, so you can avoid doing some
expensive query or calculation. That is worth doing, but it's still possible
to organize the output of those caches in an inefficient manner, and incur
unnecessary network overhead. The "semantically correct" part of the markup
being advocated here isn't that interesting, if you ask me. What is
interesting is the claim that you can generate less markup per page, and
get the same display.

The balance between the repeat visitor cache behavior and the initial number
of HTTP requests and latency for a first-time visitor can get hard to judge.
Without lots of time to measure the various alternatives and mitigation
tactics, it's best to try to generate as little markup as possible. That's
where so-called "semantic markup" comes in. Usually it's just less markup, and
does better.

Another thing to take into account is the layout behavior triggered in various
browsers by the markup you're generating. HN is pretty simple and should
render instantly, but it doesn't in anything I've tested (Firefox, Chrome,
Safari). Each of them redraws the scrollbar one or more times. That could be
due to the tables.

------
Kilimanjaro
I said it once and got downvoted to hell. I'll say it again. By removing
tables and moving all styles to a separate css file you can cut page size in
half.

HN is in need of an urgent redesign.
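
A page-size claim like the one above is easy to check rather than argue about. The sketch below is only a measuring harness under loud assumptions: the two markup templates are simplified stand-ins written for this example, not HN's real output, and `table_row`, `list_item` and `page_sizes` are hypothetical names. It reports both raw and gzipped byte counts, since compression may narrow whatever gap the raw numbers show:

```python
import gzip


def table_row(rank, title, url):
    """One story as table markup (simplified, illustrative template)."""
    return (
        '<tr><td align=right valign=top class="title">%d.</td>'
        '<td class="title"><a href="%s">%s</a></td></tr>'
        '<tr style="height:5px"></tr>' % (rank, url, title)
    )


def list_item(title, url):
    """The same story as an ordered-list item styled from an external sheet."""
    return '<li><a href="%s">%s</a></li>' % (url, title)


def page_sizes(stories):
    """Return raw and gzipped byte counts for both renderings of `stories`."""
    table = "<table border=0 cellpadding=0 cellspacing=0>%s</table>" % "".join(
        table_row(rank, title, url)
        for rank, (title, url) in enumerate(stories, 1)
    )
    listing = "<ol>%s</ol>" % "".join(
        list_item(title, url) for title, url in stories
    )
    sizes = {}
    for name, markup in (("table", table), ("list", listing)):
        raw = markup.encode("utf-8")
        sizes[name] = {"raw": len(raw), "gzip": len(gzip.compress(raw))}
    return sizes


# Placeholder stories; swap in real titles and URLs to get meaningful numbers.
stories = [("Story %d" % i, "http://example.com/%d" % i) for i in range(30)]
for name, counts in page_sizes(stories).items():
    print(name, counts)
```

To test the claim against the real site, replace the templates with the markup HN actually serves and with the proposed alternative, then compare the printed counts.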

------
angdis
I've all but given up giving a damn about web standards. No one _REALLY_
cares. Not the users, not the developers, not the w3c with their
open-to-interpretation "recommendations", and ESPECIALLY NOT browser vendors
who, for various reasons, aren't able to nail down a consistent interpretation
of web standards.

The choices are to use a framework which does the shit-work for you, or resort
to lowest-common denominator "whatever works" techniques like HTML tables.

------
beoba
It doesn't claim to be HTML5 to begin with.

