
Making a site that can handle #1 on Hacker News - tahoecoder
http://blog.appraptor.com/
======
saurik
Your site apparently got ~2250 visits per day (so, less than two per minute)
at the height of your "surge", and seems to consist of three pages (/, /about,
and /open-source). Most people are only going to look at /, so let's say
3000 pageviews. The day after was still seeing good amounts of traffic, so it
wasn't some kind of momentary "all 3000 hit the site in the same minute"
situation: it seems like a fairly benign decay. How could you possibly have
been dealing with 287 concurrent users?

My website (saurik.com) is seriously written in JavaScript. I was doing this
long before node was popular, and so it is designed "horribly sub-optimally":
it is using Rhino, which is not known for speed. I use XSL/T as a templating
engine to build the page layouts, which is also not known for speed. Every
request is synchronously logged to a database. I get over 50k HTML pageviews a
day, most for one recent article which I posted a few weeks ago: when I posted
it, I was getting well over 3k pageviews per hour.

I do not do any caching: I generate each page dynamically every time it is
accessed. I seriously dynamically generate the CSS every request (there are
some variables). Even with 3k HTML pageviews per hour, that's less than one
complex request per second. How does one even build a website that can't
handle that load? That is what I'd seriously be interested in seeing: not "how
do I handle being #1 on Hacker News", but "why is it that so many websites are
unable to handle two requests per minute".

~~~
paulsutter
Surely there's an opportunity here. It's mind boggling how people are having
performance issues here, but they are. It's an opportunity for someone to make
some money and improve the internet, by fixing whatever tools they're using.

To amplify your comment, processors today process billions of instructions per
second. Even if all 3000 pageviews _did_ hit within one minute, that's hundreds
of millions of instructions available per pageview. His pages just aren't
complex enough to require that many instructions to serve.

tahoecoder's image to "prove" his load indicates he had 287 visits within a
45-minute window. That allows hundreds of _billions_ of instructions per page
served. Give me a break.

At Quantcast we handle 800,000 HTTP requests per second, and process 30
petabytes a day, so it really is possible to handle actual high loads.

~~~
tahoecoder
Lots of people are misunderstanding that Chartbeat figure. It's not 287 visits
in a 45-minute window. These are visitors who are currently interacting with
your site. Some are idle but most aren't.

~~~
saurik
Can you give us a more useful understanding of how many pageviews your site
actually had? You only have three webpages. If we assume that for each visit
the user visited all three of them and then further reloaded your home page
twenty times, that's still fewer than one pageview per second.

As you had a spike to 282 concurrent visitors (four times your average), even
under that unrealistic amount of reloading, that's less than four requests per
second. (Again, really: one way to look at that figure is "in a 45-minute
window"; I provide the broken-out math below to make it clearer.)

------
kennywinker
While the sentiment is clear, I had an uncached Wordpress site on shared
hosting withstand #2 or #3 (I forget where exactly it peaked). HN isn't all
that huge a traffic deliverer. It's just about the quality of that traffic.

~~~
zalew
> HN isn't all that huge a traffic deliverer. It's just about the quality of
> that traffic.

what?

~~~
jacques_chester
He means that the readership of HN is high value. Name recognition amongst the
HN readership might reasonably be considered to increase your chances of
getting a nicer job, a book deal, a Y Combinator spot, a freelance contract,
an investment round etc etc.

------
jacques_chester
Without wanting to seem unnecessarily rude, it's not that hard to survive #1
at HN. I had a blog post submitted by someone else hit #1 for two days. It was
worth about 25k visits. I've had other stuff do 100k visits in a day on the
same system during a natural disaster; that little network is basically idling
at what used to be vanity numbers (millions of views per month, guys, time to
list on NASDAQ!)

If you're on Wordpress, install WP Supercache. That's 80% of the solution,
right there. Install equivalent whole-page caching for any other framework or
system and tell your HTTP server how to pick it up; that should leave you
prepared for hundreds of RPS.
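
For nginx, the "tell your HTTP server how to pick it up" step is roughly a `try_files` rule pointing at the cache directory. A minimal sketch, assuming WP Supercache's default paths (adjust to your install):

```nginx
# Serve WP Supercache's pre-built HTML directly, falling back to PHP
# only when no cached copy exists for this host/URI.
location / {
    try_files /wp-content/cache/supercache/$http_host/$request_uri/index.html
              $uri $uri/ /index.php?$args;
}
```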

We're at the stage where people are posting the equivalent of "how I survived
skipping lunch". It's not 1997 any more; tens of thousands of visits is what a
link from a moderately popular Twitter account or a medium-size metropolitan
newspaper delivers.

I'm sorry to seem so uncharitable. I'm just not sure what value these posts
add.

~~~
tahoecoder
A lot of people on HN are still learning things like this. I understand that a
post like this doesn't add any value to someone like you who probably knows
this stuff inside and out, but I'm grateful for posts that give me insight
into new tools that can help me. This isn't just a community for people with
10 years of programming experience.

Should I not try to help out the community with blog posts about my
experience? Should we just cater to the experts?

~~~
jacques_chester
I feel like we're going to sail off into old/new user debate territory. "It's
been done before" / "I'm new and so it's novel to me" etc etc.

I'm not sure what to tell you, except I fall squarely on the "old" side of
that divide, having had to nurse Wordpress installations for nigh on 10 years
now.

But every time -- _every_ time -- an uncached Wordpress blog is linked to and
dies with the famously unhelpful "Error establishing a database connection",
somebody pops up to mention WP Supercache and/or W3 Total Cache.

Actually, if I have a pet peeve, it's that non-terrible caching isn't part of
the Wordpress core. Probably breaks on gawdawfulhost.com or something, god
forbid that 99.999% of the internet be better off from core architectural
improvements when we could be working on the fifteenth new admin redesign!!1!

Edit: I realise now that you weren't talking about Wordpress and thus, my own
pet obsession is clearly revealed.

~~~
tahoecoder
Fair enough. I think we are all sick of hearing about WP Supercache.

My post was about Middleman + S3 + CloudFront, however. I think this
combination of tools isn't as well known, and some people could benefit from
knowing about them.
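
For anyone curious, one way to handle the S3 step is the middleman-s3_sync extension; for example (bucket name and region here are placeholders):

```ruby
# config.rb — sketch using the middleman-s3_sync extension.
activate :s3_sync do |s3_sync|
  s3_sync.bucket = 'example-bucket'   # placeholder bucket name
  s3_sync.region = 'us-east-1'        # placeholder region
  s3_sync.delete = true               # drop S3 objects removed from the build
end
```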

~~~
jacques_chester
In my opinion (given my superficial understanding of the prior situation),
your problem was Apache + mod_php. The default settings for that combination
are to chew memory until the bad people go away.

Out of curiosity, why Middleman when you're already using Jekyll/Octopress?

(My dog in the static site generator fight is Nanoc, fwiw).

~~~
tahoecoder
My main site, AppRaptor, isn't on Octopress. Just this blog. I agree with you
100% that the problem was Apache + mod_php. Nginx would probably have been
better. I decided to try out this S3 + CloudFront solution instead, though.
I'll take a look at Nanoc. Thanks for the info.

~~~
jacques_chester
Nanoc is hard to get used to but as far as I'm concerned it's a bucket filled
with magic.

------
baby
You have no dynamic content and your website crashes at ~200 concurrent users?

You're doing everything wrong then. I had a website that sometimes served 6000
concurrent users while hosted on a very cheap shared server! I didn't realize
so many people had no idea about simple caching techniques.

~~~
tahoecoder
The issue is that a static site shouldn't even need caching techniques.

When I'm building an app I will use memcached with Redis. But this post was
just about getting a hosting environment/workflow for situations in which you
want the ease of development that server-side languages provide (shared code
includes, etc.) without having to deal with things like caching.

~~~
eagsalazar2
This is your #1 most hilarious comment.

~~~
baby
I guess that's what happens when people learn a lot of technologies in school
and don't really know when to use them.

------
kintamanimatt
A lot of people commented on the original submission in which he asked for
feedback on his site and ... nothing's improved, not even the grammatical
errors!

The only thing that's changed is the site's migration to S3 from Linode, and
the addition of Cloudfront!

~~~
tahoecoder
I beg to differ. I changed a lot of the stuff HN recommended. What errors do
you see? The main thing I didn't change was the open-source section: I kept it
highlighted because the vast majority of people visiting the site will want to
find those resources.

~~~
kintamanimatt
"Get Perks at Local Restaurants & Bars" shouldn't be title cased. It should be
"early beta" not "early Beta". The ellipsis after the "etc" should be a
period. There are many others too.

There were a ton of comments that were incredibly constructive and valid. I
appreciate it's your site and you're not beholden to anybody, but almost all
the criticism that was given was ignored.

Also, you can edit comments on HN rather than leaving a second as an
afterthought.

~~~
xijuan
I just saw the webpage for the first time. I agree with you. The writing on
this webpage is very poor; many sentences are awkwardly worded.

>Perks include offerings like free appetizers, complimentary drinks and more.

This is one of the awkward sentences I found hard to comprehend at first.

And "early beta" still hasn't been changed.

------
romain_dardour
Actually you can combine Middleman and dynamic pages to get fast static pages
and still keep a few dynamic endpoints. We did this on our website
<http://hull.io> for email registration, and blogged about it here:
[http://blog.hull.io/post/45912703356/the-perfect-almost-
stat...](http://blog.hull.io/post/45912703356/the-perfect-almost-static-site-
setup). When 90% of your users only consume static content, you benefit
greatly from this.

------
bobfunk
If you're not super keen on spending time on this yourself and don't want to
give up any convenience of a fully dynamic site, that's part of what we built
Webpop (<http://www.webpop.com>) for...

Of course tweaking web servers and playing with your stack can be fun, but if
you just want to build your site and let someone else handle the back-end
performance and scaling issues, then there are solutions for that.

------
cliftonk
Another method, if you decide against a statically generated site, is
microcaching [1] with nginx. Your backend only needs to render the page once
each second and subsequent requests see the cached version. You should be able
to easily handle 2000 req/s using this method.

[1] [http://fennb.com/microcaching-speed-your-app-up-250x-with-
no...](http://fennb.com/microcaching-speed-your-app-up-250x-with-no-n)
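
For reference, the core of that setup is just a `proxy_cache` zone with a one-second TTL; a minimal sketch (the backend address is a placeholder):

```nginx
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=microcache:10m max_size=100m;

server {
    listen 80;
    location / {
        proxy_cache microcache;
        proxy_cache_valid 200 1s;         # each page is re-rendered at most once per second
        proxy_cache_use_stale updating;   # serve stale copies while one request refreshes
        proxy_pass http://127.0.0.1:8080; # your app backend (placeholder)
    }
}
```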

------
autotravis
Yep, static is the way to go if that's all you need. I didn't even hit #1 on
HN but got 5,000+ page views in a few hours. All on a 128MB RAM VPS:
<http://linuxterm.com/static-sites-for-fun-and-savings.html>

~~~
ngokevin
I believe I have touched #1, garnering 30,000+ views in a day. It was a static
site on an Amazon Micro instance served with nginx, no hiccups. I have not
minified any of my assets either.

------
chacham15
I don't understand what the problem is with using a slow backend, even for
static content (just because it is easier to program). Just use Varnish to
cache the pages and you're good, right? Or am I missing something?
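
For instance, I'd expect a minimal VCL along these lines to be enough (the backend address is a placeholder):

```vcl
vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Cookies would otherwise make requests uncacheable; safe to drop
    # when the site serves no per-user content.
    unset req.http.Cookie;
}

sub vcl_backend_response {
    set beresp.ttl = 5m;   # cache every backend response for five minutes
}
```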

------
tedchs
If only there was a way for young adults to learn how to be respectful and
understand "time and place" for various behaviors. Oh yeah, there is,
_parenting_.

------
Sami_Lehtinen
Google App Engine also provides instantly (almost infinitely) scalable
hosting, and super easy static content deployment.

------
pcl
So, it's down for me right now. Back to square one?

EDIT: it's back, one or two refreshes later.

~~~
tahoecoder
Must be your internet. My server load time is holding steady at 455 ms.

------
jaequery
All he had to do was host it on nginx.

