
10 Million hits a day with Wordpress using a $15 server - EwanToo
http://www.ewanleith.com/blog/900/10-million-hits-a-day-with-wordpress-using-a-15-server
======
terryjsmith
As someone who worked as lead dev for a blog network that does 10M+ visitors a
month, here's the thing: if you've gotten your blog to 10M hits a day, you
likely have a massive amount of content (we had about 1M posts over 5 or 6
years and about as many comments); that is, far too much to get into cache on
a $15 server. Every page gets cached, every gallery page gets cached (if
you've got photo galleries), every comment page gets cached and all of those
have objects/actual DB rows that get cached as well. With that much content,
the GoogleBot alone will kill you if you're not careful.

These are all great tips to help you scale, but unless you've got a very small
WP site that is also doing 10M hits a day, there are many more
complications. Once I have a little more spare time, I'll have to blog about
some other solutions we've come up with.

~~~
EwanToo
Of course, all this really lets you do is survive a swarm of people from (for
example) here, digg, a tweet meme, stuff like that, which will direct everyone
at one single page on your site.

But I think that's the most likely cause of that kind of traffic on a small
cheap VPS, as anyone running a big site without clustering is just being silly
:)

~~~
terryjsmith
Yup, that's fair. And there are actually ways to scale WP beautifully with a
small number of servers (we're using fewer than 5 servers to handle our 10M
visitors a month and content archive). Again, I keep meaning to blog about it,
I just need to find the time.

Anyways, I'm all for articles like this that help optimize WP sites and get
rid of the stigma that WP doesn't scale without cost. Thanks for writing it.

~~~
EwanToo
Thanks :)

Hope I see a blog post from you about it sometime soon, I think the details
around that kind of stuff are far too hard to discover for people trying to
learn about it.

------
verisimilitude
This feels a bit like my "9 million hits per day" article from a while back:
<http://tumbledry.org/2011/08/31/9_million_hits_day_with_120> &
<http://news.ycombinator.com/item?id=2945185>

Now, with 11% more hits! :)

~~~
EwanToo
Yeah but my 11% extra cost me 400% more RAM, so you probably win :)

I knew I'd seen a similar post in the past, but couldn't find it.

------
etrain
TL;DR - Vanilla ubuntu, configure firewall, install nginx, install wordpress,
turn on wordpress caching, install and configure varnish.

~~~
masklinn
Isn't there any kind of static generator for Wordpress? I'd expect a static
wordpress + nginx would be sufficient to handle quite a serious load.

~~~
zippykid
Most of the caching plugins for WordPress will generate static html files, or
static html stored in memcached, which can further be written to disk using
nginx's fastcgi_cache, or something like varnish.

------
antirez
A blog is the simplest application to scale, just one step after
_static_ content. The fact that wordpress has traditionally not been very
scalable always used to puzzle me...

~~~
debacle
As a PHP dev with ~100 WP installs under my belt and plenty of customization,
I think I'm qualified to say that WordPress isn't written to be scalable. It's
actually kind of crap. Many of the things it does to make writing plugins
easier for newbies are Very Bad Things in PHP. WordPress is a memory hog, to
the point that foreaching over query data in the wrong way can cause you to
hit the memory limit, even if just unwinding your foreach into a copy/paste
wouldn't. The memory leaks are somewhat nonsensical, and they make scripting
with the WordPress API a minefield.

I'm not an expert on WordPress internals, but the scene is definitely ripe for
a replacement simply due to the quality of the API. WordPress has been good
enough for most people for a long time, but it has many weak points.

~~~
antirez
IMHO the problem is that it's old technology; the people who used to write free
software N years ago are now busy doing startups ;) So the "next generation" of
free software web stuff is missing in part.

------
kaiuhl
The hallmark of Wordpress is ease of use. That's why you can spin up a blog
right on their site, and they have a backend designed by Happy Cog. Wordpress
is blog software for the technically illiterate.

And then I take a look at this blog post that lists all the incantations
necessary to scale Wordpress to reasonable scale and I wonder _why_ anyone
should do this.

If you're setting up your own server, installing nginx, configuring PHP, and
doing automated load testing, maybe you should also consider rolling your own
software or using a different package that isn't supremely bloated.

~~~
showerst
The thing is this - configuring a server is one set of problems, building a
decent CMS database/backend is another set of problems, and building a decent
browser-based UX for content authoring is a third set.

Very few people have all three of these skills, and it's fair to say that 2
and 3 are not yet solved problems. Rolling your own CMS for any non-trivial
purpose is always something that sounds like a good idea until you try it, and
then you start hitting all of the incredible idiosyncrasies and speed bumps
that other CMS have already solved, even if they've done it poorly.

I'm not exactly defending Wordpress and its lousy code, but in my experience
with publishing CMSs, if it's powerful enough to flex to non-trivial needs
(Modern WP, Drupal, Django, CQ, etc), then it's probably going to feel like a
bloated/complex mess to a programmer, because of all the nuances in the
problem space.

Having worked on both sides of this one, I'd rather solve the 'scale this
crappy software' problem than the 'Build a usable UX solution that does
everything we need and works on mobile and in IE8' problem.

~~~
mortenjorck
> _if it's powerful enough to flex to non-trivial needs (Modern WP, Drupal,
> Django, CQ, etc), then it's probably going to feel like a bloated/complex
> mess to a programmer, because of all the nuances in the problem space._

This is a good rule of thumb, but I have found one shining exception, and it
is called ProcessWire. If you've never tried it, I highly recommend a look.
It's a CMS that essentially offers you a blank slate and a set of simple,
powerful tools that let you mold it into something else quite quickly.

~~~
navs
+1 for ProcessWire. I've been playing with it for a day now and I love its
flexibility.

------
rkalla
FWIW: Amazon's Micro instances will _burst_ to saturate unused CPU cycles on
the host machine up to a hard-capped limit before being choked to death by the
hypervisor.

Had he run the Blitz test for 10 mins or more, you would see the traffic spike
up beyond what you think a Micro should sustain, and then plummet to near 0
for a disturbingly long period of time[1].

If you are unlucky enough to have a Micro on a host that is fairly saturated,
the performance you get is untenable.

Micros are not "smaller than" Smalls -- they are a different type of
monetization product from AWS, allowing you to pay cheaply for little bursts
of underutilized hardware.

There is _no way_ (none... not possible, zero) that a Micro would be able to
provide the bandwidth and CPU power to host a realistic site doing 10 million
hits a day even if everything was straight from RAM.

Read through the EC2 forums for any length of time and you'll frequently find
people coming in with reports of their machines "stopping" or "crashing, and
totally inaccessible" for minutes at a time with a 100% ST ratio; every time
it is a Micro that has been hammered for a bit either through benchmarking or
use before the hypervisor puts it in a full nelson and brings it down to the
point that SSH connections to the host cannot even be maintained.

-- I would also point out that not only is the CPU time allocated in bursts
for a Micro, but the bandwidth is prioritized behind every other instance type
(unless you were using a CDN, this would make hosting a typical wordpress site
sluggish at best from a Micro -- and again, doing 10 mil hits a day... not
going to happen in reality) -- yes you can offload all your graphical assets
to a CDN, but now this article is about a $15/mo server and a $4217/mo CDN
bill which is a very different article.

Additionally, if you need to use EBS at all in here, the story gets even worse
with Micros (even using something like RDS, which requires network I/O in
addition to hosting site content, is all going to collapse on itself within the
first 5 mins of the site's life with traffic like that).

All that said, the tips in the article are great. I only mean to clarify
expectations with the use of Micros. A whole swarm of them grinding through a
work queue in random order is great; using them as the backbone of your web
presence will have pain points. (Yes you can put 20-50 of them behind an ELB,
but at that point why not run a handful of Mediums or a few Larges).

Anyone with a Blitz.io account, please feel free to set up this exact
configuration and run the benchmark for 1hr with the same load to verify the
meltdown.

[UPDATE] Ewan, I am not knocking your article, I wouldn't expect most people
to be familiar with the ins and outs of every cloud provider. The tips and
techniques are great regardless of the actual performance on a Micro (and
apply to anyone trying to scale WordPress). Just wanted to clarify for
anyone getting excited that they can now run their Fortune 500 website on a
single AWS Micro that there are nasty surprises lurking just under the
surface.

[1]
<http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/concepts_micro_instances.html>

~~~
EwanToo
No problem, no offence taken :)

I only picked the micro because it's cheap, and AWS let me fire one up, build
the config, break it, trash it, and start again, all in rapid succession.

One thing though, this configuration doesn't actually need much real CPU or
disk resources - it's pretty much all memory, and as far as I know, AWS
doesn't overcommit RAM. This means it should be "relatively" stable. The CPU
usage is at around 5-10% even at the peak.

Personally, my own blog runs on Linode, because I think EBS is broken, but
each to their own :)

~~~
taligent
EC2 may be a bit slow/clunky but at least it isn't a security nightmare like
Linode.

Months after and they still haven't said what happened or what they fixed
during the Bitcoin theft fiasco.

I wouldn't trust them again with ANY site (former customer).

~~~
nknight
> _Months after_

It's March 31. slush's post was on March 1.

------
zippykid
Great post, but the 10 million hits figure is a bit misleading. WordPress is a
cookie monster, so your default configuration of Varnish is going to have issues
when people start adding comments, etc.

WordPress itself isn't the scalability problem anymore, it's usually how the
blog is being used, and when the db needs to be interacted with.

At the same time, a lot of our customers use WordPress as a CMS, and have
static home pages and sub pages, which we serve in a similar fashion.

<https://github.com/zippykid/php-varnish> the plugin there, will be handy when
you actually start making blog posts.
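
To make the cookie point concrete, here is a minimal Python sketch of the decision a cache in front of WordPress has to make for each request. It is not Varnish configuration and not the code of the plugin linked above. The function names are hypothetical, and the cookie prefixes are an assumption based on the names WordPress is generally known to set for logged-in users, commenters, and password-protected posts.

```python
# Cookie name prefixes that suggest the response may be personalised.
# Assumption: these match what the WordPress install actually sets.
BYPASS_PREFIXES = ("wordpress_logged_in_", "comment_author_", "wp-postpass_")


def parse_cookie_names(cookie_header):
    """Return the cookie names found in a raw Cookie header value."""
    names = []
    for part in cookie_header.split(";"):
        name = part.split("=", 1)[0].strip()
        if name:
            names.append(name)
    return names


def is_cacheable(cookie_header):
    """True when no cookie in the request suggests a personalised page."""
    return not any(
        name.startswith(BYPASS_PREFIXES)
        for name in parse_cookie_names(cookie_header)
    )


print(is_cacheable("__utma=123; other=abc"))        # True
print(is_cacheable("comment_author_abc123=Alice"))  # False
```

Analytics cookies alone leave the request cacheable here, while a commenter's cookie sends it to the backend, which is roughly why a default setup starts missing the cache once people comment.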

------
Fluxx
I think this article title is misleading. It's not 10 million hits a day on
_Wordpress_, it's 10 million hits a day using _Varnish_ to a single URL.
Might as well say you can serve 10 million hits a day to a static file served
out of memory.

------
Brajeshwar
This is good, but let's look at it from a slightly different perspective. The
author (Ewan) is so good that I'll assume his hourly rate runs anywhere between
$50 and $80 an hour for sysop work.

Taking the lower figure of $50, if he spends just 5 hours a month trying
to keep his server alive, active, and updated, making sure it's not down, and
responding if anything happens, that will cost him $250 + $15 per month
to maintain his setup.

If you're running a well maintained, popular Wordpress powered blog, why
go to the extent of doing it yourself when you can spend a little more and let
the people who run such services handle it? That way, you can keep doing your
more productive work.

Unless, of course, he does nothing else but earn via the website, does it
full-time and is happy spending 4 hours a week on running the server.

~~~
EwanToo
You're absolutely right (except about me being good, I don't know about that),
and if someone asked me about hosting Wordpress for a serious business, I'd
probably just tell them to use WPEngine or similar.

But as a learning experience for me, I think this has been pretty priceless,
and I enjoy it, bizarre as that probably sounds to the people who don't read
HN :)

------
Kesty
While I agree with the main point of the post (optimization, caching), the title
is a bit of an exaggeration.

On a normal website, 10M hits will not be spread evenly during the day. In this
example, hits are only an average of 10Kb, and the number of concurrent users is
only 250.

Also the test seems to be done on a new wordpress installation. The more
content you have, the bigger the database, the longer the communication with
your DB, the bigger the cache, etc.

In a real world scenario you probably would need something more than a
$15/month virtual server to handle 10M hits in a single day.

Still optimization is always a good thing.

------
jcastro
We're including a tool in Ubuntu called juju [1] that will enable us to ship
cool setups like this to users.

We did something similar for Wordpress [2] and plan to ship it in 12.04.

I'd love to bring you on board so we can compare configs and make it even
better. Also rkalla is correct, micros are great for prototyping but you'll
need at least smalls for production. The nice thing about micros is that you
can set it up, tweak, and then later reboot them into larger instances, so
they're nice for playing, but I wouldn't run a live site on them.

[1]: <http://juju.ubuntu.com>

[2]: <http://www.jorgecastro.org/2012/03/18/redeploying-omg-ubuntu-onto-the-cloud-with-juju/>

------
hpaavola
What's the point of serving content from database by default if you have to
set up all kinds of caches just in case your blog happens to be on HN,
Reddit, your local newspaper, etc?

Shouldn't the database be more like an add-on, not the core? Sure, search is
something that is hard to do with flat files, but everything else should just
use files. It might be a good idea to also save all the data to the DB, just in
case you want to do some markup changes (which happen like once a year or
something). But querying the DB every time someone visits your blog? Crazy.

Also, when you don't need a database, backing up your whole site and/or
transferring it to another host is a lot easier.

More complex sites than a normal personal blog are of course a different thing.

------
kijin
> _echo "deb <http://nginx.org/packages/ubuntu/> lucid nginx" >>
> /etc/apt/sources.list_

Shouldn't that be "oneiric" since you're using Ubuntu 11.10? Or does the nginx
team compile everything statically so it works no matter which version you
choose?

I'm also a bit puzzled with your decision to make PHP run as the "nginx" user.
You probably did this to match the username that the nginx debs use (Ubuntu's
default package uses www-data), but what's the benefit of matching users
there? If you're going to change it anyway, why not make PHP run as "php", for
example? Some might even say that running both PHP and nginx under the same
user reduces security.

~~~
shuzchen
If you've got your php code and static assets in the same repo, then it's
generally easiest to have php and nginx run as the same user that owns the
files in that repo. php needs access to those files to parse/run them, and
nginx needs access to those (static) files to serve them.

Of course, you can manage this in other, more security conscious ways (move
static assets elsewhere, use group permissions, etc.) but this is probably the
simplest.

~~~
kijin
Most files and directories have 644/755 permissions by default, which means
they can be owned by any user and still be accessible (readable) to any other
user on the same system. What really matters is who can _write_ to those files
and directories, and there's no reason for anything other than "wp-content" to
be writable by the web server. WordPress blogs get exploited all the time, so
a bit of paranoia can't hurt.
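
For anyone who doesn't read octal modes fluently, here is a small Python sketch of what 644 and 755 actually grant. It uses only the standard library `stat` module; the helper name `describe` is made up for illustration, and it decodes a mode number rather than inspecting any real WordPress install.

```python
import stat


def describe(mode):
    """Decode a numeric permission mode into who may read or write."""
    return {
        "symbolic": stat.filemode(mode),
        "other_can_read": bool(mode & stat.S_IROTH),
        "group_can_write": bool(mode & stat.S_IWGRP),
        "other_can_write": bool(mode & stat.S_IWOTH),
    }


file_mode = stat.S_IFREG | 0o644  # a regular file with mode 644
dir_mode = stat.S_IFDIR | 0o755   # a directory with mode 755

print(describe(file_mode)["symbolic"])  # -rw-r--r--
print(describe(dir_mode)["symbolic"])   # drwxr-xr-x
print(describe(file_mode)["other_can_write"])  # False
```

Both modes are readable by every user but writable only by the owner, so the real question is which user owns the files, not whether PHP and nginx can read them.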

------
brokentone
Kudos for the step-by-step benchmarking. However, ideal situations and no
business requirements can give you pretty impressive statistics. The most load
producing elements of every Wordpress site I've worked on have been necessary
to pay the bills, allow a reasonable workflow for content creators, or provide
good user experience.

When you load in a sizable theme, warm up the DB with thousands of posts, tens
of thousands of comments and users, and have requirements of short TTLs for
new stories to be posted, comment administration, etc, and real world
traffic... the picture looks a little different.

------
datums
ProTip: Don't run your production MySQL server on a micro. Micros are great
for mail relay servers, load balancers (no static assets), or dns.

------
antihero
Or leverage a template cache on (insert any framework here) and get 2,000+
hits a second on a Linode, or 172,800,000 hits a day.
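
The conversion behind those numbers is easy to check. A quick Python sketch (the function names are just for illustration) that assumes traffic is spread perfectly evenly over the day, which real traffic is not:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def hits_per_day(hits_per_second):
    """Daily total if the given per-second rate is sustained all day."""
    return hits_per_second * SECONDS_PER_DAY


def hits_per_second(hits_per_day_total):
    """Average per-second rate for a given daily total."""
    return hits_per_day_total / SECONDS_PER_DAY


print(hits_per_day(2_000))                    # 172800000
print(round(hits_per_second(10_000_000), 1))  # 115.7
```

So the article's 10 million hits a day works out to an average of roughly 116 requests per second.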

~~~
todsul
Probably closer to 20,000 hits per second with Nginx configured correctly and
if the framework doesn't impose too much overhead. Alternatively, use ESI with
Varnish and the HTTP Cache spec and get 10,000+ hits per second on a
relatively dynamic website (read: micro-caching).

~~~
antihero
Indeed, 1M per day may sound like a lot but it's really not that impressive.

------
danielrm26
Well done on the article, man -- very detailed.

I would say you should dump the caching plugin, however, and just do
everything it's doing in nginx itself. My mix also adds CloudFlare as a
caching system:

<http://danielmiessler.com/blog/how-to-run-a-wicked-fast-wordpress-instance>

~~~
EwanToo
Thanks, will take a look at your post later :-)

------
ck2
tl;dr - Varnish

The only way to make WP "fast" is to make it nearly completely static.

Get a half dozen people working in the admin area and your server will cry.

~~~
antihero
Though if you have 6 people doing admin on WP, they are unlikely to be
doing things every second - most of it is writing stuff and looking at things.
Even if they were each doing a click every second, that's still only 6 requests
per second, which even for completely non-cached dynamic stuff is
irrelevant.

------
benohear
Assuming you are serving static content, a much easier method is to use a CDN
for all your assets (including the WP generated pages). It might cost a little
more for big spikes, but it makes you completely immune to almost any amount
of traffic for very little effort.

------
meow
Wish there was an Amazon AMI with your config... Much better than doing things
from scratch :)

~~~
EwanToo
Just run the Ubuntu AMI ami-baba68d3 and follow the instructions; it will only
take 10 minutes :)

------
4qbomb
Awesome, Wordpress promotes bad practices in multiple spaces now. Horrible PHP
habits formed from green developers learning on the "platform" and now basic
server architecture is going to crap for those poor newbies. Thanks
Wordpress...

------
sanswork
Try it again with just the last step (vanilla apache/vanilla wordpress/varnish).
What are your results like?

Keep the original setup and have your test script perform some random
searches. New figures?

(Everything is fast/non-server intensive when you're serving static data)

~~~
dugmartin
I think the main difference between using Apache vs nginx as the backend
server would be the amount of memory available for Varnish to use for caching.
nginx has a much smaller memory footprint.

~~~
cd34
apache-prefork has a larger footprint because it includes the php interpreter
in each process. If he's using php-fpm, he would use mpm-worker, in which case
apache and nginx threads have a similar memory footprint.

------
jstalin
What about memory usage with MySQL? I've found that to be a significant
bottleneck with wordpress sites. A basic wordpress installation doesn't use
the database much, but if you add just a few plugins, things start to jump
pretty quickly.

~~~
dugmartin
Varnish takes the load off by caching the results from Wordpress. The W3 Total
Cache plugin handles purging cache entries when you update posts/pages.

If you want to see how many queries Wordpress runs even for a simple post,
install this plugin I wrote at my last job. It shows a log of queries in the
page footer if you are logged in as admin.

<http://wordpress.org/extend/plugins/wpdb-profiling/>

------
debacle
I found this really digestible. The HowTos that read like an annotated bash
history are incredibly effective, and even though I don't need this for any of
the WP sites I host, I can use it for almost any (mostly) static site.

------
zschallz
I really like these articles; it's nice to see how things like this are done
from scratch. The firewall step may be unnecessary in AWS though, because AWS
already has a firewall enabled by default through "security groups."

------
dalore
Why set up a firewall when it's hosted on ec2? It's all firewalled off by
default.

~~~
EwanToo
Mostly because some people won't be using EC2, and I didn't want to leave them
unprotected. I thought I might as well include it - it's a trivial addition
after all.

------
vikstrous4
> Download the nginx secure key to verify the package
>
> cd /tmp/
>
> wget <http://nginx.org/keys/nginx_signing.key>

Verify a package with a key you got over http? Am I the only one who noticed
this?

~~~
EwanToo
A bit silly yes, but nginx.org doesn't support https, which is slightly more
silly and rules out most other options.

------
ricardobeat
On a single page, without any plugins, menus, custom data, sidebar/widgets
logic etc. Real sites will have 10-100x worse performance, so you might get
100k/day on the same machine.

------
IanDrake
Great article.

Question: If you have Varnish running as a cache, should you really have the
W3 Total Cache plugin running too? Seems redundant, but I'm not familiar with
this tech.

~~~
hkarthik
Varnish caches the actual HTTP requests, and speeds up static content
substantially by removing the necessity to load up PHP.

WP caching speeds up requests that spin up PHP by preventing them from making
a connection to the MySQL Database. This is useful for requests that may
peruse multiple blog posts, but load the exact same sidebar content by caching
things like post counts, categories, tag clouds, etc.
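
A rough Python sketch of that second kind of caching, for illustration only. This is not how any particular WordPress plugin is implemented; `FragmentCache`, `load_sidebar`, and the sample data are all hypothetical stand-ins for "an expensive database query whose result is shared by many pages".

```python
import time


class FragmentCache:
    """Tiny in-memory cache with a per-entry time to live."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh, skip the expensive call
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value


db_queries = 0


def load_sidebar():
    """Stand-in for the database queries behind a sidebar."""
    global db_queries
    db_queries += 1
    return {"post_count": 1234, "categories": ["news", "howto"]}


cache = FragmentCache(ttl_seconds=60)
for _ in range(3):  # three different page requests, same sidebar
    sidebar = cache.get_or_compute("sidebar", load_sidebar)

print(db_queries)  # 1
```

PHP still runs for every one of those three requests, but the database is only asked once, which is the gap this layer fills behind Varnish.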

~~~
IanDrake
Really? That's interesting. I assumed the WP caching was straight page output
caching.

~~~
hkarthik
Some of the cache plugins may just utilize page output caching, but IMO if
you're spinning up your language runtime (Ruby, PHP, Python, Java, .NET) just
to do page caching, you've already lost. Varnish and even built in Web Server
caching can be far more efficient.

It's better to use the language runtime for more specific caching inside your
application logic and for smarter cache expiration.

------
munyukim
This is quite interesting. You could use such a setup to run your prototype
without breaking your bank account.

------
brainless
Will the bandwidth not cost way more on AWS than the $15 mentioned?

~~~
EwanToo
Mmm, you're right it'll cost something, though AWS free tier covers 15GB per
month of bandwidth, then 12 cents per GB after that, so I'd expect it to be a
few $.

Obviously though, there's cheaper options than AWS out there - Linode include
200GB of transfer per month in their $20 option.

I mostly just used AWS because I could start the server, build it, and blow it
away, then restart, all without any hassles or full monthly charges.

~~~
imperialWicket
If you are going to host it for a long duration, you can reserve a heavy
utilization micro instance and end up with an instance cost of less than $6.50
per month (with a three year reserved instance).

As long as you aren't serving lots of images, this will keep you well below
$20/month - even with large amounts of traffic.

If you are serving lots of images, or worried about excessive bandwidth:
Linode gives you 20GB/200GB (storage/bandwidth) on a 512MB server for $20, and
MediaTemple gives you a 512MB server with 20GB/350GB for $30 (a little cheaper
if you are going to be in the 300GB range when you consider Linode's $.10/GB for
overages).
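
The bandwidth figures quoted in this thread can be turned into a quick comparison. A hedged Python sketch: the prices are only the ones mentioned above (15GB free then $0.12/GB on AWS, $20 with 200GB included and $0.10/GB overage on Linode), the function names are invented, and the AWS number covers transfer only, not the instance itself.

```python
def aws_transfer_cost(gb, free_gb=15, per_gb=0.12):
    """Monthly AWS data transfer charge, excluding the instance cost."""
    return max(0, gb - free_gb) * per_gb


def linode_total_cost(gb, base=20.0, included_gb=200, overage_per_gb=0.10):
    """Monthly Linode plan price plus any transfer overage."""
    return base + max(0, gb - included_gb) * overage_per_gb


for gb in (15, 100, 300):
    print(gb, round(aws_transfer_cost(gb), 2), round(linode_total_cost(gb), 2))
# 15 0.0 20.0
# 100 10.2 20.0
# 300 34.2 30.0
```

At around 300GB a month the Linode total lands on the same $30 as the MediaTemple plan mentioned above, while AWS transfer alone already costs more than that.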

------
Iv
15$ != 15$ per month

