
Fighting Back After Hacker News Took Down My Site - rellik
http://www.plainlystated.com/2011/07/fighting-back-after-hacker-news-took-down-my-site/
======
bradleyland
I cringed a little bit when I read this: "...none get enough traffic for me to
have made caching or performance tuning a huge priority".

I don't mean to level criticism solely against the post author for this,
because he is not alone, but I find this mentality inexcusable. This is
especially true if you're using WordPress, for which there are a multitude of
caching plug-ins, all of which will offer orders of magnitude better
performance than allowing the blogging engine to build the page from scratch
every time.

In other words, caching is _always_ a priority. It's not an add-on or an
after-thought; it should be part of your design.

Consider a scenario where someone asks you to add up three arbitrary numbers:

123 + 456 + 789

Now, imagine you type those into a calculator to add them up. It takes you a
few seconds. The output is 1368. You can easily remember this number, so the
next time someone asks you what the result of 123 + 456 + 789 is, you can just
say 1368. Not caching is like keying the numbers into a calculator every
time, rather than just relying on your memory.
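
As a minimal sketch of that analogy (Python is used only for illustration;
`slow_sum` and `cached_sum` are hypothetical names, and the sleep merely stands
in for the cost of building a page), the second request below is answered from
memory without redoing the work:

```python
import time

_cache = {}   # remembered answers, keyed by the numbers that were asked about
calls = 0     # how many times the slow path actually ran

def slow_sum(numbers):
    """Stand-in for expensive work, like keying numbers into a calculator."""
    global calls
    calls += 1
    time.sleep(0.01)  # pretend this takes a while
    return sum(numbers)

def cached_sum(numbers):
    """Return a remembered answer if there is one, else compute and store it."""
    key = tuple(numbers)
    if key not in _cache:
        _cache[key] = slow_sum(key)
    return _cache[key]

first = cached_sum([123, 456, 789])   # computed the slow way
second = cached_sum([123, 456, 789])  # answered from the cache
```

A page cache applies the same idea to whole rendered pages rather than to sums.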

I know this is a rudimentary example, and I know that most people running a WP
blog probably know how caching works, but even if you don't "need" it today,
why would you leave your blog set up so that it's constantly re-building
content that can be cached by simply installing a WP plug-in?

I implore you. Make caching the second thing you do (after security) when
setting up or building your web app.

~~~
benmills
I agree with you when you're talking about web apps, but what about personal
blogs?

If I just want to set up a personal blog that I don't plan on promoting or
spending a lot of time on I wouldn't want to spend any extra time setting up
things that don't help me with my primary goal, writing blog posts.

~~~
bradleyland
I totally get that. I'm beyond the phase in my life where I enjoy tinkering
with blogging software settings. I'm far more interested in engaging in the
conversation. But here's the counter argument: it takes a sum total of 60
seconds to implement. If you're using WordPress to host your blog, installing
caching is beneath trivial. You click "Add New" under the Plugins menu, then
search for WP Super Cache and click install. The only thing remaining is to
turn it on.

If you're not using WP, you might use something like Tumblr or Posterous. In
that case, caching isn't your problem. If you rolled your own blogging
software, well, you already violated the principle that blogging is your
primary goal.

~~~
scott_s
It's only trivial if I know that such plugins exist. And if I don't consider
how to improve performance because at the moment I don't care, then it doesn't
matter how trivial it is.

~~~
bradleyland
Hrm. That's a really good point. It's hard for me to see outside the fact that
I've set up what seems like a hundred WP blogs in my lifetime. That adds a
solid 5 minutes of Googling, which I still think is a pretty minimal
commitment.

~~~
scott_s
But you don't know it's 5 minutes of Googling until you do it. Before you
start, it's an unbounded task.

------
anigbrowl
What a strange headline. Who is he fighting back against - Hacker News?
Traitorous Wordpress? Himself? If I needed a plumber I wouldn't want one that
talked about fighting back against the water. Nor would I want an architect
that talked of fighting back against gravity. The language of confrontation is
inappropriate here, since the problem as described stems from a failure to
spend time learning or configuring Apache. This might seem trivial, but to me
it suggests a fundamental flaw in the approach to the problem, which probably
increased the time needed to fix it.

------
js2
[http://www.google.com/search?q=site:news.ycombinator.com+pat...](http://www.google.com/search?q=site:news.ycombinator.com+patio11+keepalive)

I will also quote myself: "I first learned to disable Apache KeepAlives in
1998. Yes, 1998. It's disheartening that Apache still ships with it enabled by
default. It has always allowed a relatively small number of lingering clients
to completely DoS your server."

~~~
wheels
This is the most critical bit. We run our website (totally disconnected from
our webservices, naturally) on a 256 MB VPS and frequently handle top HN
stories on our wordpress blog (which runs on a host with several of our other
PHP / static subsites).

Key points:

• Make sure that wordpress supercaching is on. You can verify this by looking
at the last line of the HTML that comes back which has a timestamp for when it
was generated.

• Turn off KeepAlives.

• Set MaxClients to 8.

• Use monit to check to make sure it can connect and restart apache with a
kill -9 if it can't. (This is optional, but helps if you have some random
thing that ends up taking a very long time to execute and eats up
connections.)

With that in place even a much smaller server can easily handle a HN top story
without breaking a sweat.
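
The first check in the list can be sketched in Python. This assumes the caching
plugin appends an HTML comment to pages it serves from cache; the marker string
and the sample page below are illustrative assumptions, so compare them against
what your own site actually returns:

```python
import re

# Assumption: cached pages end with an HTML comment containing this text.
# Check the real output of your plugin before relying on it.
CACHE_MARKER = "Cached page generated"

def last_html_comment(html):
    """Return the text of the last <!-- ... --> comment in the page, or None."""
    comments = re.findall(r"<!--(.*?)-->", html, flags=re.DOTALL)
    return comments[-1].strip() if comments else None

def looks_cached(html, marker=CACHE_MARKER):
    """True if the last HTML comment in the page carries the cache marker."""
    comment = last_html_comment(html)
    return comment is not None and marker in comment

# Made-up sample page; the timestamp is invented for the example.
sample = (
    "<html><body>Hello</body></html>\n"
    "<!-- Cached page generated by WP-Super-Cache on 2011-07-20 12:00:00 -->"
)
cached = looks_cached(sample)
uncached = looks_cached("<html><body>Hello</body></html>")
```

Fetching the page a couple of times and seeing the same generation timestamp is
the sign that the cached copy is being served.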

~~~
peterwwillis
> Set MaxClients to 8.

LOL what? what's your mpm? did you enable this to keep the backend from
blowing up from too many queries? surely you can handle more than 8
connections at a time. is this the proxy layer, and if so are you using web
caching on top of wordpress caching?

if monit is restarting apache every time it can't connect (i hope you have a
long timeout) you're denying service to a lot of people. connections are
supposed to queue so they don't get dropped.

~~~
wheels
As mentioned, this is on a single VPS with 256 MB of RAM. Each Apache process
needs about 25 MB of (non-shared) RAM, so actually 8 is pushing it. We're
using mpm_prefork. There's no additional proxy nor cache.

My point, specifically, was how low you can go with a cheapo VPS. We hold up
fine during an HN spike. (We're B2B and not a destination site, so our usual
load is trivial.) Even during an HN spike you're getting tops of 2-3 visitors
per second, which can be dished out reasonably well with 8 workers.
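
The arithmetic behind that limit can be sketched as follows. The 256 MB and
25 MB figures come from the comment above; the 50 MB reserve for the OS and
everything else on the box is an assumption added purely for illustration:

```python
def max_workers(total_ram_mb, per_worker_mb, reserved_mb=0):
    """How many prefork workers fit in RAM after setting aside a reserve."""
    return max(1, (total_ram_mb - reserved_mb) // per_worker_mb)

def max_seconds_per_request(workers, requests_per_second):
    """Average time a request may take before all workers are busy."""
    return workers / requests_per_second

no_reserve = max_workers(256, 25)                     # 10 workers, nothing left over
with_reserve = max_workers(256, 25, reserved_mb=50)   # 8 workers (reserve is assumed)
budget = max_seconds_per_request(8, 3)                # about 2.67 seconds per request
```

With 8 workers and 3 visitors per second, each request can take up to roughly
2.7 seconds on average before connections start to queue, which is why cached
pages make such a small worker pool viable.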

The monit thing kicks in after a 30 second timeout. With the configuration
above, we don't get that because of load, but rather when something else has
gone wrong (specifically there's a wordpress plugin that our internal status
blog uses that sometimes hangs). But given the original poster's issue of
apache getting so out of control that it took him several minutes to get a
live SSH connection and a system load of 60, having monit kill things (and
restart them) is a preferable stop-gap.
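
A rough Python stand-in for that kind of watchdog is sketched below. This is
not monit's configuration syntax; `can_connect` and `check_and_restart` are
hypothetical helpers, and the restart action is left as a callable you would
supply yourself:

```python
import socket

def can_connect(host, port, timeout_seconds=30.0):
    """Return True if a TCP connection opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:  # covers refused connections and timeouts
        return False

def check_and_restart(probe, restart):
    """Run one watchdog cycle; return True if a restart was triggered."""
    if probe():
        return False
    restart()
    return True

# Example with stand-ins: the probe always fails, the restart only records a note.
restarts = []
triggered = check_and_restart(
    probe=lambda: False,
    restart=lambda: restarts.append("restart requested"),
)
```

In real use the probe would be something like
`lambda: can_connect("127.0.0.1", 80)` and the restart callable would run
whatever command restarts your web server.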

(Note: Our actual customer facing stuff is quite different; there we're using
multiple servers behind an nginx proxy and using a combination of Rails,
Sinatra and Java services. The basic web stuff is segregated off from those
primarily for security reasons.)

~~~
peterwwillis
Ah. It's kind of scary that private RSS of each process would be 25MB, but
it's certainly possible. I assume you've disabled every module you don't need?

If you have some free time try deploying your LAMP stack with Buildroot and
uClibc. The application size ends up being around an order of magnitude
smaller, but i've never bothered checking if private RSS on clunky apps like
PHP or Perl is minimized at all.

 _edit_ Also for prefork we used to have some scripts that would monitor
processes to see if they 'went crazy' and wouldn't ever return, and reap those
processes so Apache would fork a new one so we didn't have to restart the
whole server. When you're under peak load and you restart a server and it
sends all those clients to all your other servers which are already almost at
their peak things get very nasty very quickly. I know you only have the one
VPS here but for applications with many servers it can be handy.

------
rs
Nginx has proved extremely useful and capable, especially in memory-constrained
virtual environments. It effectively gives an easy way to move from a
synchronous web server to an asynchronous one.

------
sciurus
An important point that was missed is how he was running the php and rails
apps. mod_php? mod_rack? cgi? fastcgi? Proxying to an application server? The
answer to that is going to greatly affect how well Apache scales when you're
memory-constrained.

~~~
rellik
apache: rails & sinatra were on passenger. php was mod_php. nodejs was
proxied.

nginx: rails & sinatra are on passenger. php is php-fpm. nodejs is proxied

~~~
etherealG
thanks for this, useful. probably worth amending in the original post :)

------
ck2
Who runs WordPress without a page cache these days?

Just look at the query count even without plugins.

Add a few plugins and it's a total mess.

------
sc68cal
What about isolating the popular URL and temporarily serving a static version
of the page?

~~~
rellik
That's what the page caching plugin does. It puts a static file in
nginx/apache's search path, so it never hits PHP
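
The general idea can be sketched in Python. This is not the plugin's actual
code; `cache_path`, `serve`, and `render` are hypothetical names, and the
sketch folds the web server and the blog engine into one function just to show
the control flow:

```python
import tempfile
from pathlib import Path

def cache_path(cache_dir, url_path):
    """Map a URL path to a static file inside the cache directory."""
    name = url_path.strip("/").replace("/", "_") or "index"
    return Path(cache_dir) / f"{name}.html"

def serve(cache_dir, url_path, render):
    """Return the static copy if it exists; otherwise render once and save it."""
    path = cache_path(cache_dir, url_path)
    if path.exists():
        return path.read_text(encoding="utf-8")  # the slow renderer is skipped
    html = render(url_path)
    path.write_text(html, encoding="utf-8")
    return html

render_calls = []

def render(url_path):
    """Stand-in for the blog engine building a page from scratch."""
    render_calls.append(url_path)
    return f"<html><body>Post at {url_path}</body></html>"

with tempfile.TemporaryDirectory() as cache_dir:
    first = serve(cache_dir, "/2011/07/some-post/", render)   # rendered and saved
    second = serve(cache_dir, "/2011/07/some-post/", render)  # read from disk
```

In the real setup the existence check is done by the web server itself, which
is why the request never reaches PHP at all.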

~~~
sc68cal
Which plugin did you use? WP Super Cache? I'd like to take a look through the
code.

~~~
rellik
I used W3 Total Cache. It generates a static file and puts it in apache's
search path, so it never hits PHP

------
teadrinker
BETTER SOLUTION: Get on the cloud. It takes 15 minutes to upgrade your server
when load gets high, and you're billed by the hour at the RAM level you're
using. There's very little reason to stay on traditional hosting if you're
running anything close to a serious startup.

~~~
rellik
While the cloud does enable you to scale your resources on demand, you still
have to address the same issues I had to deal with (how to efficiently use
those resources). Plus, moving to a multi-server setup involves additional
complexity in the architecture of the system, which I'd rather not mess with
unless I have to :)

~~~
teadrinker
Multi-server? Nope, single. Same as you have now, just with added flexibility
on RAM: with 2 clicks I can handle a huge server load with ease, then drop back
when done. That way I don't waste time on low-level optimization tasks and can
keep to what I do best: building things.

Seriously, there's no reason to stay on a traditional style server setup and
you'll never go back once you've tried it. It can be pricey once you ramp up,
but give it a shot as you can literally pay by the hour while you play.

~~~
etherealG
which service are you on that increases ram without taking the box down?

~~~
teadrinker
-8 in downvotes with no explanation? really guys? Sorry for trying to help, i'm out.

~~~
iqster
I was on EC2+apache+wordpress+mysql and had my blog go down as well after
being featured on HN. In my case, simply rebooting the cloud instance fixed
the problem (I had lots of other old projects running on the same instance).
However, I did do some research into figuring out what I could have done as my
blog kept getting more traffic ... had I used RDS, I could have gone to a
bigger instance with little downtime. Unfortunately, my understanding is that
there would be no way to go back down to a smaller RDS instance.

P.S. I upvoted you. I also don't get why people downvoted you.

