So the number of threads you have (e.g. setting StartServers, MaxSpareServers/MaxSpareThreads) is way less important than the keepalive timeout: you can/should start enough threads to use all the available resources, but it will only make a difference if you aren't idling all those threads with a high KeepAliveTimeout. Apparently, that's the Apache default setting.
Edit to elaborate:
Sorry, was eating dinner and Kindle is not exactly made for typing technical documentation. This comment is an abbreviated version of http://www.kalzumeus.com/2010/06/19/running-apache-on-a-memo... -- read that if you want a longer spiel. (It is my most cited blog post on HN. I don't know whether to be happy or sad about that.)
Basically, there are a couple of Apache MPMs available. You may have the prefork MPM installed. You can check by running "apache2 -l". If you see prefork in the output, take a look at your config file (quite possibly /etc/apache2/apache2.conf) and check for the setting "KeepAlive On". If KeepAlive is on, your blog is broken and you just haven't found the failure condition yet.
On my server (Ubuntu, has gone from Dapper to Lucid over the years), the package default Apache2 settings for the prefork MPM are: 15 second keepalive, 150 MaxClients. If your server has enough RAM to support 150 processes for Apache (and, if you're on a VPS, you probably don't), that will let you process a hard theoretical maximum of 600 clients per minute. There are many, many things you can do to exceed that maximum bound: getting on the front page of Reddit or getting retweeted by Jimmy Wales at the right hour of the day both qualify.
Calculation: any client requesting any file, regardless of whether it is dynamic, static, cached, generated by a PHP monstrosity, whatever, occupies one process for a hard minimum of 15 seconds. 4 clients saturate one process for 1 minute.
With special attention to fellow VPS owners: after having died hard several times when apache2 decided to use up all available RAM and then swap the machine to death, I eventually tweaked the MaxClients setting down to 24. This means that, even with KeepAliveTimeout at 2 seconds, my max throughput was 720 clients per minute (24 processes x 30 clients per process per minute). Again, that number is achievable under very plausible circumstances for a personal blog in 2010/2011.
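If you want to plug in your own numbers, here is a minimal Python sketch of the arithmetic above. The function name and parameters are mine, not anything Apache provides, and it assumes the simplified model from this comment: every client occupies one prefork process for at least the keepalive timeout.

```python
def max_clients_per_minute(max_clients: int, keepalive_timeout_s: float) -> float:
    """Upper bound on clients served per minute under the prefork MPM.

    Assumption: each client ties up one process for at least the keepalive
    timeout, so a single process serves at most 60 / timeout clients a minute.
    """
    clients_per_process = 60 / keepalive_timeout_s
    return max_clients * clients_per_process


# Package defaults quoted above: 150 MaxClients, 15 second keepalive.
print(max_clients_per_minute(150, 15))
# The tuned VPS settings: 24 MaxClients, 2 second keepalive.
print(max_clients_per_minute(24, 2))
```

These reproduce the 600 and 720 clients per minute figures. Real traffic will do worse, since the model ignores the time spent actually serving the request.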
There are a variety of countermeasures one can take against this. One is not using the prefork MPM, but you have to be a configuration Jedi to figure out how to actually do this and still run PHP on your server. "apt-get install apache2 libapache2-mod-php5", which is what substantially all guides will tell you to do, will force you to use the prefork MPM. If you had been using the worker MPM instead, you would have a much, much harder time crashing your server serving static content.
Another alternative: switch to Nginx. This problem goes away instantly. (If I didn't have 15 config files I would have to migrate, I would have done this years ago.)
The easiest alternative: turn off KeepAlive. This will give you a very modest throughput hit, but I'll trade "Blog stays up if mentioned in the NYT" for that hit any day of the week.
It's drastically oversimplified if you need to scale an application. "Step 2: Make content static". For an actual application, there is far more to say than 4 sentences.
In my opinion:
* There are few cases where Apache is better than nginx. I don't run PHP, so that may still be one of them.
* Varnish is awesome, use it, love it.
* Purely static blogs, like jekyll, are great.
One thing I found interesting was this remark of his:
Google Analytics failed to detect the surge: page load time was so high that visitors were closing the page before analytics could load.
He then remarks that it took Analytics 15 hours to detect the spike, but isn't that true of all Analytics instances? I'm not sure if the author is mistaking Google Analytics' delayed reporting as a fault or if I'm missing something.
You're right, of course. I considered it and started grepping them out, but got bored and did something else interesting. From a casual glance, there were a lot more Mozilla UAs than reported views, though.
Omniture SiteCatalyst is near-real time. I don't know if they do some error correction later like GA does for Ecommerce data.
Apache's memory usage varies based on what modules are enabled and the code they're serving and a number of other factors.
The rule of thumb is to take the average free RAM when Apache isn't running, divide it by the average RAM usage of a single Apache process on your system, and set MaxClients to a couple under that value.
For example, on a 512MB Linode box, if you've got 450MB free when Apache isn't running, and Apache takes up 12MB per process, you'd allow about 35 at the most.
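As a rough sketch of that rule of thumb in Python (the function name and the `headroom` parameter are made up for illustration; "a couple under" is taken to mean 2):

```python
def suggested_max_clients(free_ram_mb: float, per_process_mb: float, headroom: int = 2) -> int:
    """Free RAM divided by per-process RAM, minus a small safety margin.

    Both inputs are averages you measure yourself; `headroom` is the
    'couple under' margin and is an assumption, not an Apache setting.
    """
    fits_in_ram = int(free_ram_mb // per_process_mb)
    return max(1, fits_in_ram - headroom)


# The 512MB Linode example: 450MB free, 12MB per Apache process.
print(suggested_max_clients(450, 12))
```

For the example numbers this gives 35, matching the figure above. Measure the per-process size under real load, since it grows with the modules and code being served.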
Was written a few years back for a Linode 128MB.
It sounds like the author was making the same mistake that pretty much everybody makes: Treating your blog as though it were dynamic content. But it's not. It's static HTML, and you should never have to make any modifications to anything to make it scale.
Step one: Have your blog export all entries to plain HTML.
Step two (optional): move your imagery out to S3/Cloudfront.
That's it. That will allow your little out-of-the-box slice to handle all the traffic that we can throw your way.
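A minimal sketch of what step one could look like in Python, assuming your entries are already available as a slug-to-HTML mapping. The `export_entries` name, the mapping, and the page template are all hypothetical; a real blog engine would pull entries from wherever it stores them.

```python
import html
import tempfile
from pathlib import Path


def export_entries(entries: dict[str, str], out_dir: Path) -> list[Path]:
    """Write each entry to out_dir/<slug>.html so the web server can serve it as a file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, body_html in entries.items():
        target = out_dir / f"{slug}.html"
        page = f"<!doctype html>\n<title>{html.escape(slug)}</title>\n{body_html}\n"
        target.write_text(page, encoding="utf-8")
        written.append(target)
    return written


# Usage: export one hypothetical entry into a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    paths = export_entries({"hello-world": "<p>First post.</p>"}, Path(tmp))
    print([p.name for p in paths])
```

Once the files exist, the web server only has to read them from disk, which is the whole point.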
Scaling is an issue that you're meant to have with your product. Because your product actually needs to talk to databases and do things, it may have trouble doing those things when lots of people hit it at once. A website hosting a blog, on the other hand, needs to serve files. And that's been a solved problem for fifteen years.
There have been three posts on my blog this year which would, with absolute engineering certainty, have effectively DOSed Apache if I had kept the Ubuntu default settings. All involve numbers which are really small for computers, like 300k (hits in a day).
It took me years of blogging to realize why this happened and address it, despite my blog running on a beefy machine and me theoretically having experience with much harder problems than serving 20k of plain text repeatedly.
I'm afraid I'm going to have to stand by my assertion that static file hosting is a solved problem in 2011. I think the real issue we're seeing with all these "Slashdotted blogs" is that the database-based-blog is the 20 minute intro lesson for every new server-side tech. The result is that everybody thinks about blog hosting as a problem involving taking content from the database and displaying it to the user. This leads to things like caching and other performance hacks that could be done away with if you simply thought of the problem in terms of hosting files.
If you're stuck with traffic to your blog, you should look at an existing caching solution and maybe scale your VPS a little. It's not likely to be a permanent thing.
Step one: Have your blog export all entries to plain HTML.
He did that.
A website hosting a blog, on the other hand, needs to serve files. And that's been a solved problem for fifteen years.
"Too few Apache threads" is a known problem, which he recognized as soon as he saw the load numbers.
Sure, but it doesn't have to fall over. Outsourcing to Disqus is the easiest solution, but you can build your own AJAXy solution. Or just write out a new static file for each comment (if you're really overloaded, comments may take a while to be processed, but you can just serve the old page in the interim.)
No it doesn't. Comments aren't added all that much relative to how much your site is getting hit. It would make more sense that the page is just a static html file with a form. The form submits to your app engine (php/python/whatever) which adds the data to the database. That then triggers your static html file to be rewritten to display that comment as well. The only other modification you "may" want to make is set it so that browsers don't cache your html page so they can see new comments if they refresh.
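To make that concrete, here is a hedged Python sketch of the "rewrite the static file on each comment" idea. Everything in it is illustrative: the function names, the page template, and the in-memory list standing in for the database are my assumptions, not anyone's actual blog code.

```python
import html
import os
import tempfile
from pathlib import Path


def render_post(title: str, body_html: str, comments: list[str]) -> str:
    """Build the full static page, escaping user-supplied comment text."""
    items = "\n".join(f"<li>{html.escape(c)}</li>" for c in comments)
    return (
        f"<html><head><title>{html.escape(title)}</title></head><body>\n"
        f"{body_html}\n<ul>\n{items}\n</ul>\n"
        '<form method="post" action="/comment"><textarea name="text"></textarea>'
        "<button>Post</button></form>\n</body></html>\n"
    )


def write_atomically(path: Path, content: str) -> None:
    """Write to a temp file in the same directory, then rename it over the target,
    so readers get either the old page or the new one, never a partial file."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(content)
    os.replace(tmp, path)


def add_comment(page: Path, title: str, body_html: str,
                comments: list[str], new_comment: str) -> None:
    """Called by the form handler: store the comment, then regenerate the page."""
    comments.append(new_comment)  # stand-in for the database insert
    write_atomically(page, render_post(title, body_html, comments))
```

The web server keeps serving the plain file the whole time; only the (rare) comment submission touches the application code.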
Exactly my point. He did that and it worked.
If you have a blog that may one day see traffic, I'd recommend taking that bit of the post to heart and skipping the whole "serve blog content via the database" part altogether.
One of my web apps has the occasional spike in traffic that previously caused Apache to consume vast amounts of memory on my VPS, eventually crashing it due to lack of memory.
After reading many guides, experimenting, and generally getting quite frustrated (and working out what VPSs I could afford to upgrade to), I tried setting up Nginx on a separate port. It took maybe 1 hour for me to have my former LAMP stack set up and working, so I put it live, and haven't looked back since.
If you're on a VPS, use Nginx. The config file is wildly different to that of Apache, and you'll no doubt spend a few minutes cursing trying to figure out how to port over your rewrite rules, but after that it's plain sailing.
Worst case, you can serve static files with Nginx and route dynamic requests to your Apache instance (I still do this with a few old PHP apps I have).
"CPU utilization never exceeded 3%"? Really? Maybe the system load was at 3.0 for a few days?
That number comes from outside of the Linode, not within it. The 14-day graph is also a 2-hour average, which means each value is over a particularly large period and big spikes will be smoothed out; I don't think the results are too far from the real world here, though. Not a lot of CPU time is required to fetch things from various places and transmit over the network...it's mostly wait time.
Can you go into detail on how that works? I thought you had to have your site originally hosted on EC2 to do that ...
Wow. I had no idea that could make such a difference. I suppose the issue was with the low number of threads set in the Apache configuration. The server was spending its time sending out static content when it could have been doing more important things? I signed up with S3 to serve up my static content to keep that load off my server. I should probably be using CloudFront instead though.
I'm interested in looking for a backup host just in case. I currently use WebFaction and I love them to death -- but I'm worried that under incredible stress the shared hosting won't hold. With Linode, do you start from scratch with a blank OS and just install everything you need from there (Apache, mod_wsgi, etc. kind of thing) or do they have preset installs? With WebFaction I can select a particular setup and I'm up and running in minutes.
Something like this would work in .htaccess
RewriteEngine On
RewriteRule ^t/item/4372/$ /static/4372.html
It's saying "Hey, apache, if you see somebody asking you for website.tld/t/item/4372/, send them to website.tld/static/4372.html instead"
A blog post I wrote got about 100k hits in a day a few weeks ago, and using mod_rewrite in this fashion, I was able to keep the site running for the entire day.
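If the regex part of the rule is unfamiliar, this Python sketch mimics the mapping it performs. It is only an illustration of the pattern matching, not how mod_rewrite is implemented, and it generalises the single hard-coded id to any numeric id, which is my assumption rather than something in the rule above.

```python
import re
from typing import Optional

# Same shape as the RewriteRule pattern, with the id captured instead of fixed.
# In .htaccess context the leading slash is already stripped from the path.
RULE = re.compile(r"^t/item/(\d+)/$")


def rewrite(path: str) -> Optional[str]:
    """Return the static file path for a matching request, or None if no match."""
    match = RULE.match(path)
    if match is None:
        return None
    return f"/static/{match.group(1)}.html"


print(rewrite("t/item/4372/"))
print(rewrite("about/"))
```

The first call maps to /static/4372.html and the second returns None, meaning the request would fall through to whatever normally handles it.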