

How to Survive a Slashdotting on a Small Apache Server - mocko
http://mocko.org.uk/b/2011/01/02/how-to-survive-a-slashdotting-on-a-small-apache-server/

======
chronomex
I'm a staff member on a community site (ticalc.org), the biggest fish in a
small pond. We get about 250,000 visitors in a normal month, averaging about 3
pages per visit. For the past 5 years or so, the site was hosted off a dual
Pentium Pro; before that we had a 486. Currently it's a VM living in Germany.
Postgresql, Apache, Linux.

We've been Slashdotted several times, without any appreciable slowdown. How
does that work? The whole site is static content. Dynamic content is either
rendered out to disk when it changes, or is a couple of static page fragments
that are combined at serving time--and this is only for logged-in users.

(Granted, the last time the site was redesigned was 2001. I'm working on a new
design right now. I have no plans to make it any slower.)

------
CitizenKane
If you want to save yourself the headache, just use nginx; it responds much
better under load than Apache. If you're using a dynamic backend such as PHP,
RoR, Django or something else, you can really boost the number of page loads
with reasonable caching strategies.
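
One such caching strategy for a PHP-style FastCGI backend is nginx micro-caching: cache every 200 response for a minute, so a traffic spike collapses into roughly one backend hit per URL per minute. A minimal sketch (directive values are illustrative, and a real config also needs the usual fastcgi_params):

```nginx
# Sketch of nginx micro-caching in front of a FastCGI (e.g. PHP) backend.
# http-level: define a small on-disk cache zone named "micro".
fastcgi_cache_path /var/cache/nginx levels=1:2 keys_zone=micro:10m;

server {
    listen 80;

    location ~ \.php$ {
        fastcgi_pass        127.0.0.1:9000;
        fastcgi_cache       micro;
        fastcgi_cache_key   "$scheme$request_method$host$request_uri";
        fastcgi_cache_valid 200 1m;                      # serve cached copies for 60s
        fastcgi_cache_use_stale updating error timeout;  # ride out backend hiccups
    }
}
```

Even a one-minute TTL turns thousands of requests per minute into a single dynamic render.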

~~~
jrockway
I just let Varnish do the hard work. I make sure my web app sets caching
headers correctly, and then I just point Varnish at it. Instant 12,000
requests per second. I am sure if I spent time tuning the machine and setup, I
could do even more. But my users make about 12,000 requests per decade :)
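
For reference, "sets caching headers correctly" mostly comes down to an explicit Cache-Control on the origin response. Something like this (values illustrative) is what lets Varnish keep and reuse a copy:

```http
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Cache-Control: public, max-age=60, s-maxage=300
```

s-maxage governs shared caches like Varnish, max-age the browser; responses carrying Set-Cookie or Cache-Control: private are skipped by Varnish's default VCL.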

~~~
alnayyir
Varnish is great, and it's something I'd recommend to many people in many
cases, but most people don't have their caching headers set granularly enough
to not get caught off-guard by unexpectedly stale content at some point.

Gotta be careful, but if you know what to expect, Varnish is great.

~~~
jrockway
Yeah, "web developers" should really take a few minutes to learn HTTP.

~~~
alnayyir
Silly, how could they possibly find time to learn HTTP when they have to keep
up with at least three blogs of mountebank victory bloggers in addition to
learning the latest Drupal plugin?

------
fexl
Although my Loom server code is completely standalone and does not use Apache,
I'll take this opportunity to discuss Keep-Alive policy since the article
mentions it.

The Loom server automatically scales down the Keep-Alive interval as the
server load increases. Each child process monitors its own life span,
basically like this:

    
    
      # These parameters are configurable.
    
      my $max_children = 128;
      my $min_life = 10;   # seconds
      my $max_life = 600;  # seconds
    
      # Now compute the lifespan.
    
      my $num_children = get_current_number_of_child_processes();
      my $free_slots = $max_children - $num_children;
    
      my $cur_life = int($max_life * ($free_slots / $max_children));
      $cur_life = $min_life if $cur_life < $min_life;
    

At this point $cur_life is the maximum number of seconds this child process
should live. If the child has been alive that long or longer, it voluntarily
exits.

An instance of this server code is running at <https://loom.cc> . You can find
the source code via the News page at <https://loom.cc/news> . The relevant
function is Loom::Sloop::Client::check_lifespan .

------
cstross
There's one thing you can do pre-emptively that'll help: if using a CMS of any
kind, use it to build your content as static HTML wherever possible. Dynamic
content sucks when your wee Athlon box is trying to field 100 requests per
second.

(My blog's currently quiet, fielding no more than 15,000 http requests per
hour at any time this year so far(!), although it's been more than an order of
magnitude above that in the past month. Srsly, unless you've got massive
clustering mojo you are _not_ going to be handling that load gracefully unless
you're serving static content.)

~~~
smutticus
Very good advice. How often does a slashdotted page need to accept user input?
And if it doesn't need user input it most likely shouldn't need to be
generated afresh with every hit.

    
    
      1) Render the page
      2) Save it to file
      3) Point everyone to it
      4) Enjoy your 15 min of fame
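
Steps 1-3 can be sketched as a small shell helper (the URL and docroot below are hypothetical). The important detail is the atomic mv: the web server either sees the old file or the new one, never a half-written page:

```shell
#!/bin/sh
# Render a dynamic page once and publish it as a static file.
# Refuses to install an empty result, and swaps the file in atomically.
snapshot() {
  url=$1
  dest=$2
  curl -fsS -o "$dest.tmp" "$url" && [ -s "$dest.tmp" ] \
    || { rm -f "$dest.tmp"; return 1; }
  mv "$dest.tmp" "$dest"
}

# e.g. snapshot "http://localhost/index.php" /var/www/html/index.html
```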

------
peterwwillis
Hopefully-constructive criticism/additions:

* If your ssh connection took forever, it doesn't really matter whether it's a load issue or a bandwidth issue. The site is ungodly slow and so will be your admin access time. Kill the site, put up an "under construction" sign, fix the performance issue, bring the site back. In the meantime, link to a Google cache, web archive, or other version if possible.

* Don't waste time running a ps. If you're running apache, just grep for MaxClients in the error log. Actually, there's really no point in checking because if you're being slashdotted, you hit MaxClients, I guarantee.

* Top isn't going to give you an accurate amount of memory used per process. You'd need to check smaps and some other things, and all of that will take too long. Remove all modules except what you need to serve whatever static content you want to get out there during the slashdotting. You definitely do want to reduce memory any way you can if you're constantly swapping and it's loading you to hell (confirm with vmstat/mpstat/iostat).

* `killall -9 httpd` works faster.

* Your estimates of RAM per client are going up when they should be static (25MB for 512MB of RAM and 54MB for 4GB?). If you're lucky your app won't even actually use up all this valuable memory - copy-on-write will save as much space as it can unless the individual process needs to reserve some anonymous memory of its own. Once you unburden yourself of extra modules (run 'ldd' on the individual apache modules if you want to see all the shit they can load into your box at runtime), run apache with one or two processes to test, look at the memory use, and go from there.

* I'm kind of on the fence about this one, but in some circumstances it can help a little to reduce MaxReqPerChild to something stupidly low, like 100-1000. You risk overloading with i/o when your process reaps and a new one loads up, but if your processes keep swelling up with more memory as they run (hi mod_perl!) killing them off and starting new may help you.

* Honestly, in a slashdotting situation, use 'wget' or 'curl' to take a snapshot of your dynamic page(s) and put those in place as static files to be served to users. If you don't have a proper caching layer don't even worry about your database because you will almost invariably kill it with queries, which will kill your webservers. If you want a 'dynamic' version of your site to show to people that updates regularly, set up a cron job to wget the dynamic pages every 1-2 minutes and overwrite the static copy (but for god's sake make it back up the old copy and only move it if it's not empty or an error page).

* Looking for the biggest files is good. You can also grep and sort the apache logs to see which files are being requested the most, and staticize/shrink them however possible (css/js can have excess whitespace removed with some tools, images can be shrunk with 'convert', dynamic pages can be made static as above, etc). `cat $LOG_DIR/access_log | sed -e 's/.*] "/"/' | sort | uniq -c | sort -g | tail` (sort doesn't print the count with 'sort -u' ... somebody should add that). Oh yeah, and anything that prints a log? You should disable that now, before /tmp or /var fills up.
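
On the smaps point: `top`'s per-process numbers overstate apache's real footprint because copy-on-write pages shared between children get counted once per child. A Linux-only sketch (the `httpd` process name is an assumption) that sums only the private pages each child actually owns:

```shell
#!/bin/sh
# Sum Private_Clean + Private_Dirty from /proc/<pid>/smaps for each
# apache child; shared (copy-on-write) pages are deliberately excluded.
for pid in $(pgrep httpd); do
  awk -v pid="$pid" '/^Private_(Clean|Dirty):/ { kb += $2 }
                     END { printf "pid %s: %d kB private\n", pid, kb }' \
      "/proc/$pid/smaps"
done
```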

------
sambeau
If your server is "business critical" and you have the budget, you should take
a good look at Zeus's Traffic Manager.

<http://www.zeus.com/products/traffic-manager/index.html>

It's faster than anything out there, easier to set up than anything out there
(except the one-button Apache on a Mac) and can accelerate Apache up to 100x
just by sitting in front of it. It will do the job of Nginx + Varnish as well
as controlling a whole server farm. It's a truly amazing piece of software
that sadly gets very little mention in the world. For instance, this is the
software that runs the Firefox download sites and the BBC news site. Joyent
and Amazon Web Services use it too.

(Disclaimer: I used to work for Zeus 4 years ago. I don't have anything to do
with the company other than having friends in the dev team. ZTM is still my
baby, though)

As many of HN's readers run web-app companies I recommend you take a look and
at least play with the downloadable VM. The software, while expensive by free
standards, is remarkably affordable by business standards.

~~~
sambeau
If you want to see some of the fun things you can do with it and get some
idea of its power and flexibility (especially the TrafficScript programming
language it comes with), take a look here:

<http://knowledgehub.zeus.com/articles>

The article explaining why TrafficScript was created is especially good: it
explains the inner workings of the software.

------
lazyant
Why wait until you're slashdotted to implement all these changes (limit
apache memory use, CDN, caching etc)?

Apache (MaxClients etc) shouldn't be configured to ever take more memory than
the server's RAM. During slow hours, or on a separate test server, you can
use 'ab' or any other tool for stress-testing.
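
The sizing rule above as back-of-envelope arithmetic (all numbers are illustrative: measure your own per-child footprint first, then confirm with 'ab' that the box stays out of swap at that concurrency):

```shell
#!/bin/sh
# Rough MaxClients sizing: reserve RAM for the OS and database, then
# divide what's left by the measured size of one apache child.
total_ram_mb=2048    # server RAM
reserved_mb=512      # OS, database, everything that isn't apache
per_child_mb=25      # measured footprint of one apache child
max_clients=$(( (total_ram_mb - reserved_mb) / per_child_mb ))
echo "MaxClients $max_clients"    # prints: MaxClients 61
```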

If you are 'slashdotted', basically you just want to flip to a version of the
site with static cached page(s).

Also, for web site performance, YSlow and Page Speed should be mentioned;
before these changes you want to make sure you are not missing any
low-hanging fruit, like serving uncompressed pages.

------
dholowiski
I just finished pushing out an application to Heroku. I plan on the traffic
ramping up sharply along with a couple of slashdottings, HN's and reddits
along the way. It gives me a warm fuzzy feeling when I realize I can just
"Crank up the dynos" and go have a beer.

~~~
zrail
Don't forget to crank them down when the traffic is over. There are no
refunds for accidentally leaving 1000 dynos running for a few hours longer
than necessary.

If/when I actually deploy something worthy of traffic, I plan on
incorporating something that automatically scales my app, both dynos [1] and
workers [2] (up to pre-defined limits, of course).

[1]: <https://github.com/ddollar/heroku-autoscale>

[2]: <http://blog.darkhax.com/2010/07/30/auto-scale-your-resque-workers-on-heroku>

------
alexwestholm
Why not do this stuff pre-emptively? I understand that moving content out to a
CDN when it's not necessary might not be the smartest plan, but shouldn't most
sites be optimizing the number of apache procs as a rule of thumb?

------
tedjdziuba
tl;dr run production with swap off

------
raz0r
Remove Apache and install a real web server.

