
Ask HN: Why and how is Hacker News so fast? - i336_
I'm currently using an artificially-shaped 256k Internet connection (it
resets in a couple days), and over the past few weeks browsing the web,
I've very consistently found HN to be one of the fastest loading sites
I've come across.

I understand that HN is built on top of Arc (http://arclanguage.org), a
Lisp variant, and that the code
(https://github.com/wting/hackernews/blob/master/news.arc) uses flat files
to store post and vote data. This is, generally speaking, quite an
unusual/unexpected architecture for a site presumed to serve a consistently
moderate load, but HN manages to do so remarkably well, presumably because
the site itself is so incredibly lightweight. I don't think the minimal
HTML is the only factor, though it probably accounts for the majority.

I'm interested to understand as much as possible/practicable/relevant about
the server(s') configuration, uplink bandwidth, other tunables, etc., so I
can get an idea of what makes HN so exceptionally responsive and "a cut
above 99% of everything else". Does CloudFlare _really_ make _that_ much of
a difference? :P

I know there's some "secret sauce" in there somewhere, because
virtually-verbatim clones of HN such as http://firespotting.com/ seem
_almost_ as fast... but _just_ not as instantaneous as HN.

(PS: On the occasions I get hit by it, my ISP's shaping config seems quite
involved/nuanced, and I actually want to profile it because it's so
catalyzing and would be very useful to apply to my own projects for
"worst-case" testing. For the interested, the details I have thus far can
be found here:
http://serverfault.com/questions/709529/how-can-i-profile-my-isps-bandwidth-shaping-settings)
======
Twirrim
It's relatively straightforward. Take a look at what Pingdom tells you it does
when the site loads:

[http://tools.pingdom.com/fpt/#!/cnYAuq/https://news.ycombina...](http://tools.pingdom.com/fpt/#!/cnYAuq/https://news.ycombinator.com/item?id=9990630)

5 requests, 12.6 kB total, 661 ms total time: 1 request for the content, 1
for the CSS, and then 3 GIFs, all from a single domain (so just one DNS
lookup). If you look at the profile, 47% of the time was spent on that one
DNS lookup, so subsequent requests will be even quicker (plus caching will
handle those GIFs just fine).

That's almost no work for the end client to do. No JavaScript for the
client to download, parse, and process: just straight content to render,
and done.

The reduced request count is especially important when it comes to the
mobile experience, where bandwidth might be great but latency is terrible.
Every request you have to make to render a page significantly impacts the
loading time (HTTP/2 helps here). Along with trying to reduce the number of
resources being loaded on the page, it's especially important to reduce the
number of domains you're fetching them from, so you keep the DNS queries
down.
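
If you want to see that breakdown for yourself, a rough sketch along these
lines works; it times the DNS lookup separately from the fetch (the URL and
output format are just placeholders):

    import socket
    import time
    import urllib.request

    url = "https://news.ycombinator.com/"
    host = "news.ycombinator.com"

    # Time the DNS lookup on its own.
    t0 = time.monotonic()
    socket.getaddrinfo(host, 443)
    dns_ms = (time.monotonic() - t0) * 1000

    # Time the full request (connect + TLS + transfer). The resolver cache
    # warmed by the lookup above makes this roughly "second visit" timing.
    t0 = time.monotonic()
    body = urllib.request.urlopen(url).read()
    fetch_ms = (time.monotonic() - t0) * 1000

    print(f"DNS: {dns_ms:.0f} ms, fetch: {fetch_ms:.0f} ms, {len(body)} bytes")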

~~~
d357r0y3r
Might be able to get that down to 3 requests if they use .pngs instead of
.gifs and throw them into a sprite sheet.

~~~
supermatt
You can also have gif spritesheets.

~~~
dohertyjf
Woah didn't know that. Thanks!

------
paulsutter
HN pages are small, the HTML is generated on the server, and they render
nearly progressively. That is, the renderer doesn't need to block waiting for
many CSS, JavaScript, or font files to download.

Most development practices considered "modern" prevent progressive rendering.
Common tools like Bootstrap or jQuery, as normally used, cause page
rendering to block until large libraries are downloaded, despite the fact
that only a tiny portion of Bootstrap or jQuery is actually used by the
page.

The largest reason of all is that most developers dismiss such concerns as old
school worries and like to harp on and on about how tools like Angular enforce
wonderful software engineering practices, ignoring that most such frameworks,
as commonly used, bloat pages and cause incredible rendering delays.

The difficulty with improving performance is that most "modern" pages have
numerous render-blocking resources. The page doesn't get fast until you fix
all of them. This is why most sensible suggestions to speed up a page make
little difference when tested individually.
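
As a rough way of counting how many render-blocking resources a page
carries, something like this sketch can help. It simply treats every
external stylesheet and every synchronous external script as potentially
blocking, which is a simplification, and the URL is a placeholder:

    import urllib.request
    from html.parser import HTMLParser

    class BlockerAudit(HTMLParser):
        """Collects resources that typically block first render:
        external stylesheets and <script src> without async/defer."""
        def __init__(self):
            super().__init__()
            self.blocking = []

        def handle_starttag(self, tag, attrs):
            a = dict(attrs)
            if tag == "link" and (a.get("rel") or "").lower() == "stylesheet":
                self.blocking.append(("css", a.get("href")))
            elif tag == "script" and "src" in a and "async" not in a and "defer" not in a:
                self.blocking.append(("js", a["src"]))

    html = urllib.request.urlopen("https://news.ycombinator.com/").read()
    audit = BlockerAudit()
    audit.feed(html.decode("utf-8", "replace"))
    for kind, src in audit.blocking:
        print(kind, src)
    print(len(audit.blocking), "potentially render-blocking resource(s)")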

~~~
edanm
"The largest reason of all is that most developers dismiss such concerns as
old school worries and like to harp on and on about how tools like Angular
enforce wonderful software engineering practices, ignoring that most such
frameworks, as commonly used, bloat pages and cause incredible rendering
delays."

Alternatively, you could say that many people look at the tradeoffs and decide
they're comfortable with trading off a certain amount of speed for the user,
in order to increase development speed.

Whether these people are right or wrong really depends on the context.

~~~
paulsutter
The people making those tradeoffs are sitting on gigabit links close to the
server, while the people on the high-latency end of a 3G link, or on
airport wifi with a weak signal and 50% packet loss, have to live with
them.

~~~
edanm
Sometimes, yes, in which case the tradeoff might not make sense.

But sometimes it does: e.g. I think in most cases, something existing but
being slow is better than it not existing in the first place, which might
be the outcome if the cost of building it goes up.

------
fractalcat
Loading this webpage I downloaded ~12 KB. For comparison, I went to the NYT
homepage and downloaded ~1.2 MB. For people with shitty downlinks like you
and me, the absence of exorbitant quantities of JavaScript and images makes
a lot of difference - a lot more than time-to-first-byte.

~~~
mkawia
that 1.2Mb gets cached. Infact that will make the site faster because once the
.js are downloaded only json/partials will be fetched when visited again.

It must be the backend ,ie when tasked to render the same page ,which backend
would respond faster

~~~
CydeWeys
I have a fast FiOS connection, and the New York Times webpage still takes
around 6 seconds to fully load according to the Chrome Network panel. In
that time it makes 226 requests and downloads 168 KB. This is after several
page refreshes, so I'm fully taking advantage of browser caching. Simply
put, there are way more images, fonts, and network callbacks on that site
than on HN. It's way more heavyweight. HN, by contrast, downloads all of
its content in only SIX requests, at 14.8 KB total, and in under a second.

It's the number of requests that's killing the NYT site. HN is very simple and
old school, and doesn't do a single thing that isn't explicitly necessary to
render exactly the content you see on the page, which is presented cleanly and
without frills.

------
FraaJad
If you want to see an even faster "forum", see

[http://forum.dlang.org/](http://forum.dlang.org/)

~~~
srean
I have been puzzled by this being said about dlang forever, and today about
HN. These sites are certainly not noticeably slow, but I don't notice
anything remarkably fast about them either. Am I not visiting enough of the
slow sites?

My working hypothesis is that I don't have a good enough understanding of
web development to appreciate the nuances. Much akin to me listening to
Indian classical music or jazz: I often see trained listeners break into
(silent) applause, and at those moments I feel bad because I realize I've
missed something I wish I hadn't. So all pointers appreciated.

~~~
mootothemax
With an empty cache, for me, hacker news loads in ~700ms.

Under the same conditions, the dlang site loads in ~200ms.

That's half a second faster, which is definitely noticeable when clicking
around.

~~~
srean
Thanks for the numbers. A difference of 500 ms would definitely be
noticeable, especially if it's in one's daily workflow.

------
andrewstuart2
In addition to being so small (text-based, minimal JS), HN is cached by
CloudFlare, which grabs copies of dynamic pages and pushes those copies out to
their CDN edge servers around the globe so they're closer to end users.

When your browser asks for the IP for news.ycombinator.com, you get the
address of a CloudFlare CDN server. Which server the DNS server hands back
is based on a GeoIP lookup of your own IP address, so that they can give
you the address of the server physically closest to you, and therefore the
lowest possible latency.

Edit: Apparently this is not how CF does things (see the explanation by
benjojo12 below), but it is not uncommon in load balancing and
distribution.

    
    
        news.ycombinator.com.	296	IN	CNAME	news.ycombinator.com.cdn.cloudflare.net.
        news.ycombinator.com.cdn.cloudflare.net. 299 IN	A 198.41.190.47
        news.ycombinator.com.cdn.cloudflare.net. 299 IN	A 198.41.191.47

~~~
jonahx
So every time a new comment is posted, is the CloudFlare cache invalidated and
replaced with a new version of the comments page?

~~~
benjojo12
Actually, for now HN is not caching their HTML with us. We add a "CF-Cache-
Status" header when we are caching things.

~~~
jonahx
So when hitting a comment page, HN's backend is being hit in some way, whether
by reading a db or by concatenating flat files or something else? Or are they
just using a local cache that isn't pushed out to CloudFlare because there
would be so much churn that pushing to CF wouldn't be a benefit?

------
justinsaccount
Not pulling in a ton of external JavaScript, images, and ads is probably
the biggest factor, but one thing that helps is that most comment threads
have a relatively low number of comments. This page currently only pulls
down 12 KB (compressed to 3.75 KB).

A thread like
[https://news.ycombinator.com/item?id=9976298](https://news.ycombinator.com/item?id=9976298)
with ~600 comments pulls down 785 KB (compressed to 143 KB).

In general, looking at the network tab in the developer tools will show why
one site loads faster than another:

[https://developer.chrome.com/devtools/docs/network](https://developer.chrome.com/devtools/docs/network)

[https://developer.mozilla.org/en-US/docs/Tools/Network_Monitor](https://developer.mozilla.org/en-US/docs/Tools/Network_Monitor)
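
If you want to reproduce those raw-vs-compressed numbers yourself, a quick
sketch (using the ~600-comment thread linked above; the exact figures will
have drifted since):

    import gzip
    import urllib.request

    req = urllib.request.Request(
        "https://news.ycombinator.com/item?id=9976298",
        headers={"Accept-Encoding": "gzip"})
    resp = urllib.request.urlopen(req)
    raw = resp.read()

    # urllib doesn't transparently decompress, so 'raw' is what actually
    # went over the wire; decompress it to get the HTML size.
    if resp.headers.get("Content-Encoding") == "gzip":
        html = gzip.decompress(raw)
    else:
        html = raw

    print(f"on the wire: {len(raw) / 1024:.0f} KB, "
          f"uncompressed HTML: {len(html) / 1024:.0f} KB")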

~~~
brownbat
Can someone buy the .text TLD already and make a whole part of the internet
where this approach is standard?

~~~
sokoloff
Sorry for the offensive domain name (it's not mine), but this is relevant
here:

[http://motherfuckingwebsite.com/](http://motherfuckingwebsite.com/)

~~~
noir_lord
I prefer
[http://bettermotherfuckingwebsite.com/](http://bettermotherfuckingwebsite.com/)
which was my inspiration for my blog design
[http://benlowery.co.uk/](http://benlowery.co.uk/) :)

~~~
marcosdumay
You know, after 20 years of the www, you'd think browsers would be able to
competently format text files by now. The fact that we need even that small
amount of CSS (instead of it just being available by default) means the web
is still broken.

~~~
JupiterMoon
Check out Firefox reader view.

~~~
acchan
There should be a doctype declaration that automatically triggers this.

------
lolwuttt
Wow, really? Has it become such a generational sea change? Have people
_really_ failed to understand the nightmare that modern JavaScript, full-
motion animated GIFs, web videos, and ultra-high-resolution JPEGs represent
for bandwidth? Never mind their local CPUs?

JavaScript is really the black hole, though. When people defer their
client-side libraries to whatever CDN, and then let that library pull in
whatever it pleases for dependencies, forget it. That's like saying: I
don't even care if people's computers catch fire and explode when they try
to watch this awesome star field simulation.

Oh yeah, and all the ads and spyware analytics too, ya know...

------
hippiefahrzeug
As many others have mentioned, the secret sauce is probably quite simply this:
No ads, no trackers, no js libraries. I use Firefox with the 'disconnect'
plugin. On HN, disconnect stays completely inactive, nothing to be blocked,
just good old html to be rendered with a sprinkle of css. It's beautiful, it's
the way it should be.

------
FooBarWidget
I've found that not including Google Analytics, Twitter buttons, and that
sort of stuff plays a major part in loading time. Even though those scripts
have the 'async' attribute, they still impact loading time.

Heck, I get the feeling that they impact loading time more than my own CSS,
images, and JavaScript. I already heavily minify and concatenate
everything, as well as use SPDY for improved concurrent requests, but it
seems Google and Twitter add 1-2 seconds.

~~~
hussong
The fun part is when the google pagespeed tool starts nagging about google's
own resources (fonts, analytics etc.) -- so meta.

------
tfb
It's because it's only text with little to no JavaScript. I haven't looked
at the source behind HN, but presumably the HTML for every URL is re-cached
upon each change (comment, upvote) and requires very little processing time
for per-user customization.

------
dekhn
The site is just a bunch of text without any async loading. I think the
whole database is cached in RAM, so a page load is just "render this tiny
HTML with a few template substitutions and spit it out over the network".

I kind of wish the whole web worked that way. Yahoo was like that in the old,
old days (akebono.stanford.edu).

------
pcl
I haven't done any real analysis of it, but I've always assumed it's
because it serves up a single small HTML document rather than a large core
page plus dozens of external references (ads, analytics, etc.).

Looking at this page in my browser's network analyzer, I see a request for the
page itself (3.8k), followed by requests for the css and a few images which
don't impact layout. On top of that, JS isn't used for layout, so the HTML
renderer can do its job.

~~~
kmfrk
And the JS and images are probably cached anyway, since they don't change like
the rest of the design.

------
logicuce
For your need to simulate a 256 kbps connection, two things have worked for
me. Both require a good few-Mbps connection to begin with, though:

1. Use Chrome's device emulation mode. You can ignore all the other
settings like screen size, user agent, etc. and just use the speed
restriction.

2. Use a specialized WiFi router. I use ones from TP-Link that let me
restrict the speed for a particular device's IP. DD-WRT based routers can
also help here.
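
A third option, if you want something scriptable, is a small throttling
forwarder on localhost. This is only a sketch (the upstream host, listen
port, and rate are placeholders, and it caps the download direction only);
point a test client at 127.0.0.1:8888 instead of the real host:

    import asyncio

    UPSTREAM_HOST = "example.org"      # placeholder: the site under test
    UPSTREAM_PORT = 80
    LISTEN_PORT = 8888
    RATE_BYTES_PER_SEC = 256_000 // 8  # 256 kbit/s downstream
    CHUNK = 4096

    async def pump(reader, writer, throttle):
        while True:
            data = await reader.read(CHUNK)
            if not data:
                break
            writer.write(data)
            await writer.drain()
            if throttle:
                # Sleep so each chunk averages out to the target rate.
                await asyncio.sleep(len(data) / RATE_BYTES_PER_SEC)
        writer.close()

    async def handle(client_reader, client_writer):
        up_reader, up_writer = await asyncio.open_connection(
            UPSTREAM_HOST, UPSTREAM_PORT)
        await asyncio.gather(
            pump(client_reader, up_writer, throttle=False),  # requests, unthrottled
            pump(up_reader, client_writer, throttle=True),   # responses, ~256 kbit/s
        )

    async def main():
        server = await asyncio.start_server(handle, "127.0.0.1", LISTEN_PORT)
        async with server:
            await server.serve_forever()

    if __name__ == "__main__":
        asyncio.run(main())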

~~~
kalleboo
For OS X and iOS users, Apple has a "Network Link Conditioner" utility in the
developer tools that can simulate different throughput as well as packet loss.

~~~
logicuce
Just tried it. Looks great.

In fact, it can simulate the scenario better as compared to Chrome as the
speed is shared by everything on the system and not just that Chrome tab.

------
aw3c2
The server side is a different beast from the whole server-to-client part.

HN does not load particularly fast for me; 300-500 ms is fairly slow.

~~~
tormeh
Same for me. I don't think HN's particularly fast. Barely faster than
rockpapershotgun.com, which has lots of images, but not much. Given that it's
all pure html text, I would expect more. RPS might have better caching that's
closer to me though.

~~~
aikah
> Barely faster than rockpapershotgun.com, which has lots of images, but not
> much

rockpapershotgun.com takes 10 seconds to load entirely and serves 1.6 MB of
data. Which is OK compared to others like Polygon and co., which serve more
than 5 MB on their homepage and take 20 seconds to load on average.

------
ww520
Flat files are very fast. The OS keeps the files cached in memory most of
the time; OS caching only becomes a problem when the hot data exceeds
available memory.

By the nature of HN, there isn't that much hot data. Only the couple dozen
links on the HN front page are popular, and maybe a couple hundred or a
couple thousand people are actively voting. None of that takes much space,
so essentially the entire hot data set can be operated on in memory.

The front page has 30 articles. Assuming 100 KB each yields 3 MB. It has
about 1,000 vote points; assuming 50 bytes per vote, that's about 50 KB. So
you can fit the entire front page's data in under 4 MB. That's a tiny
amount of data to work with.
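
A quick back-of-envelope version of that estimate (the per-article and
per-vote sizes are the assumptions above, not measured figures):

    # Rough size of the "hot" front-page data set, using the assumed
    # figures above: 30 articles at ~100 KB each, ~1,000 votes at 50 bytes.
    articles = 30
    bytes_per_article = 100 * 1024
    votes = 1000
    bytes_per_vote = 50

    total = articles * bytes_per_article + votes * bytes_per_vote
    print(f"{total / (1024 * 1024):.2f} MB")  # ~2.98 MB, easily held in RAM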

------
dang
I'm late to the thread, but for what it's worth: Hacker News is too damn slow.

------
zkhalique
It's fast because everything is text based. Very little CSS. Very few files
to download.

The language that generates the page is largely irrelevant when the page
you're viewing rarely changes (and doesn't change from user to user). It's
very cacheable, which means you might effectively be loading a static file
served by a regular web server. Combine that with the above and you get
your answer.

Compare this to
[http://qbixstaging.com/Groups](http://qbixstaging.com/Groups), what do you
see?

------
noir_lord
Primarily it's the simplicity of what it sends back and the low number of
requests.

Pre-rendered pages fed over the wire with minimal assets are fast. My blog
[http://benlowery.co.uk/](http://benlowery.co.uk/) loads in under 70 ms on
my home connection; it downloads 3 assets (and by far the biggest hit is
the webfont).

I think we've grown so used to pages taking a couple of seconds to pull
hundreds of assets and then render them that fast pages stand out :).

------
i336_
PPS: I just clicked the serverfault link (I pulled it out of my address bar
history without loading it, because bandwidth) and realized it was [on
hold]. I don't seem to have "gotten" the Stack Exchange thing; this happens
to most of my questions, heh.

So, feel free to reply here instead of the SF question if you have a response
for that.

------
staticelf
I live in Sweden and for some reason HN takes several seconds to load most
of the time. I do not know why, but it is frustrating.

------
psibi
I just came to know that the site uses flat files, and I'm surprised. Is
that still the case? Why was such a design chosen?

------
halayli
Why is loading 5k faster than loading ~500k?

------
brador
Everything cached, static html pages, simple design and the big one: no
tracking cookies/tracking scripts.

------
jgalt212
I found that HN became much more performant for me when I got above 500 karma
points.

~~~
berntb
At least it isn't dead slow for us that are hellbanned anymore. :-)

------
davidgerard
It does not load 10,000 trackers and 10,000 Javascript libs.

That's the whole answer.

------
justwannasing
Without looking I can tell you it's two reasons:

1) It's text based and, therefore, can keep its total response small.

2) The page is cached on the backend so database lookups are kept infrequent
and small.

Those two reasons alone make it faster than, say, 80% of web sites.

~~~
klibertp
> Without looking

You risk being wrong this way. Your #1 is OK, but #2 is false: there's no
database (in the sense of RDBMS or NoSQL) behind HN. It's just plain files on
disk. Which, of course, makes fetching data blazingly fast, thanks to how
filesystems are already handled and cached by the OS.
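
A toy version of that storage model, one plain file per item with the OS
page cache doing the heavy lifting on repeat reads, might look like this
(the directory layout and field format are invented for illustration, not
taken from news.arc):

    import os

    DATA_DIR = "items"  # hypothetical layout: one small file per item id
    os.makedirs(DATA_DIR, exist_ok=True)

    def save_item(item_id, fields):
        # Write one flat file per item. No database round-trip anywhere.
        with open(os.path.join(DATA_DIR, str(item_id)), "w") as f:
            for key, value in fields.items():
                f.write(f"{key}\t{value}\n")

    def load_item(item_id):
        # Repeated reads come straight out of the OS page cache, which is
        # why "just files on disk" behaves almost like an in-memory store.
        fields = {}
        with open(os.path.join(DATA_DIR, str(item_id))) as f:
            for line in f:
                key, _, value = line.rstrip("\n").partition("\t")
                fields[key] = value
        return fields

    save_item(1, {"title": "Ask HN: Why is HN so fast?", "score": "100"})
    print(load_item(1))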

Anyway, I've long suspected that "gzip and cache the hell out of your site"
is not the best way to go about optimizing websites. In my experience,
websites come in two flavours: tiny ones with no need for optimization at
all, and huge ones where you need to optimize everything and you know it's
not going to be enough unless you're willing to rewrite some code in C.

~~~
justwannasing
You risk being wrong this way too, unless you are involved with the site
and can testify to that. In any case, the page could be, or should be,
cached in some form, either on the server or in the browser.

Which makes me wonder about your second paragraph. Gzipping and caching are
the easiest of the several ways to optimize the delivery of a web page.
Server-side code written in C plays no part in that, except when pages are
created on the fly and something else really does slow down their delivery,
i.e. page creation takes longer than 80 ms or so; but a properly coded site
won't let that happen if it can be helped.

And after running my own web dev business for 11 years, I didn't have to look
at anything to determine any of that.

------
mijoharas
I always just thought it was because Paul Graham is a wizard. Some might
say they don't know why that would have an effect on site loading times,
but I know it does.

------
pestaa
Speed isn't everything. Hacker News has recently been one of the most
unreliable websites I frequent.

~~~
kentt
What about it has been unreliable for you?

~~~
dekhn
About once every few weeks, the site is hard-down for posting for several
hours.

~~~
dekhn
Downvote me all you want,
[https://twitter.com/HNStatus](https://twitter.com/HNStatus) has all the data
supporting my assertion.

~~~
dang
HN hasn't been down for that long in a year.

