

Preparing for server-loads that make Digg spikes look like a joke - crockstar
http://www.stateofsearch.com/behind-the-code-hoxton-hotel-sale/

======
swombat
How many hits did they have to handle? Without this figure, it's a bit hard to
put things in context.

------
benologist
I might be missing something significant, but it looks to me like the site was
just static HTML and JavaScript, with the JavaScript being updated at some
interval.

If that's true, why wouldn't you just serve /xxxxx/clock.js and /xxxxx/data.js
from S3 + CloudFront? The /version/ prefix means caching is never going to get
in your way, because you're incrementing it each time you update.

Then the entire diagram is reduced to the web server + S3/CloudFront with
software pushing new versions of the JavaScript.
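
For what it's worth, that push side could be tiny. A rough sketch in Python
with boto3, where the bucket name, key layout, and version counter are all my
own assumptions rather than anything from the article:

    # Publish a new immutable version of data.js to S3. Because the
    # versioned path is never overwritten, CloudFront and browsers can
    # safely cache it forever.
    import boto3

    s3 = boto3.client("s3")

    def publish(version, payload):
        key = "%d/data.js" % version  # e.g. 42/data.js
        s3.put_object(
            Bucket="example-sale-static",  # hypothetical bucket name
            Key=key,
            Body=payload.encode("utf-8"),
            ContentType="application/javascript",
            CacheControl="public, max-age=31536000",
            ACL="public-read",
        )
        return key

    # A short-TTL clock.js would then point clients at the new path.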

Also, is that for real about GAE cutting you off under load regardless of
being a paying customer? What's the point of using it to scale if there's a
very low ceiling on how high you can go?

~~~
aidos
Having built the system, I can probably answer some of those questions.

The CloudFront solution won't work because you need to push from S3, and as we
found out during the previous sale, that can take up to 20 minutes. That's no
fun for anyone, believe me. I'd be pretty reluctant to ever use CloudFront
again.

GAE cuts you off at 500 rps: they downgrade you back off billing. You can
apply to have the limit lifted, which we did, but they didn't lift it for us.

There are other things too; for example, it can be a struggle to update
content in a bucket when it's under significant load.

I'm sure everyone looks at the architecture and thinks it's over-engineered,
but we got burned previously and just wanted as much stability and redundancy
as possible. It's an interesting problem too: how to survive DDoSing yourself
on a monumental scale.

Ultimately we ended up with a stable system at negligible cost and didn't get
slated on Twitter. That's good enough for me :)

------
checker
LivingSocial just did a similar $1 sale today in DC to launch their new
Instant Deals, which are location-based paid coupons. They advertised on the
front page of the free commuter newspaper and on the radio, and even had
people walking around Chinatown in distinctive orange shirts.

Needless to say, their servers had a rough time keeping up. The main web page
served up timeouts every few requests, and the member creation system didn't
work for a coworker. However, the order processing system seemed pretty
snappy, and the iPhone app stayed solid too (I'm referring to the server-side
parts, of course).

------
crockstar
In response to all of the comments: it would be great if you could direct them
to the post itself. The team behind the site/push have said they are happy to
answer questions, so please fire away.

------
peterwwillis
Every time I see an HN story about "how I made my site handle lots of
traffic", it's people re-learning the same lessons. We really need a
basics-of-building-big-webapps FAQ.

Step 1: Make your site as static as possible, or at least rely as little as
possible on server-side processing. Pretend your app is made of Java and thus
assume it will hog memory, be slow, and crash every other day, so plan to
handle those kinds of 'reliability' issues.

Step 2: Get a CDN or a buncha cloud-hosted servers (like, 100). You may need
5-10 of them to serve static content using a caching web server solution a la
Varnish (or just Apache proxying/caching; yes, it works fine), and 50-100 for
application and database processing.

Step 3: Make sure as little of your site as humanly possible makes use of
database calls, and for god's sake, try not to write to it too much.

Step 4: Use a lightweight (I SAID LIGHTWEIGHT!) key/value memory store as a
cache for database elements and other items you might normally [erroneously]
fetch from disk, network, or database (see the sketch after this list).

Step 5: Don't rely solely on 'cloud' resources. Eventually you will get bitten
because they're not designed to scale infinitely and probably do not care
about you (especially if you don't pay for them or pay very little).

Step 6: (optional) Re-create this solution on a different hoster in a
different datacenter on the opposite coast of your country. Not only will you
end up with a DR site, you'll see load and latency improvements for many
users. How to effectively and cheaply replicate content between datacenters
cross-country is left as a lifetime career goal for the reader.
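
To make step 4 concrete, here's a rough cache-aside sketch in Python using
python-memcached; the key scheme, the 60-second TTL, and fetch_from_db are all
made up for illustration:

    # Cache-aside: check memcached first, fall back to the database,
    # then populate the cache so the next request skips the DB.
    import memcache

    mc = memcache.Client(["127.0.0.1:11211"])

    def fetch_from_db(deal_id):
        # Stand-in for your real database query.
        return {"id": deal_id, "price": 1}

    def get_deal(deal_id):
        key = "deal:%s" % deal_id       # hypothetical key scheme
        deal = mc.get(key)
        if deal is None:
            deal = fetch_from_db(deal_id)
            mc.set(key, deal, time=60)  # short TTL bounds staleness
        return deal

The TTL is the tradeoff dial: longer means fewer database hits under load,
shorter means stale data is visible for less time.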

------
bborud
Digg still exists? Wow. It appears to be so!

------
jajoyce2
This scares me, yet I like it.

