It's more useful to know how the sites perform most of the time.
I'd also have been interested to know if tests were controlled for time of day. For example, the sites I host are almost all Australian and so the traffic I see follows a 12-hour wave as daylight crosses the continent from East to West, with small peaks around lunch time and after dinner.
What tends to happen is that spammers hit that file up and ask it to perform slow operations (the Wordpress team recently greatly expanded its capabilities). A PHP instance gets tied up for the duration. I've had instances lock up until killed by timeouts of 2 minutes.
When a bunch of traffic arrives at once, it only takes a handful of badly behaved instances of xmlrpc.php to render the server essentially inoperable.
I just went through this a few months ago. My bloggers don't use it and it can't be deactivated from within Wordpress any more, so I just 404 it in nginx.
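For anyone who wants to see the idea outside of nginx, here is a minimal sketch of the same "refuse it before the application ever runs" approach, written as Python WSGI middleware. This is an illustration only: the setup described above is an nginx rule, and the names `block_xmlrpc` and `blocked_paths` are made up for the example.

```python
def block_xmlrpc(app, blocked_paths=("/xmlrpc.php",)):
    """Wrap a WSGI app so blocked paths get a 404 before the app runs.

    `blocked_paths` is a hypothetical parameter for this sketch.
    """
    def middleware(environ, start_response):
        if environ.get("PATH_INFO", "") in blocked_paths:
            body = b"Not Found"
            start_response("404 Not Found", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            # The wrapped application is never invoked, so no worker
            # gets tied up performing a slow operation.
            return [body]
        return app(environ, start_response)
    return middleware
```

The point is only that the rejection is cheap: the expensive handler is never reached for that path.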
I guess as a marketing piece it gets some interest, but it lacked actual utility for me.
Of course, this is a guess from a priori reasoning. I would be genuinely interested if somebody tried file-based comments for a WordPress (or a similar platform) based site and it fell over or failed in some way.
The main reason you'd get failures in a flat file system would be the normal reasons you get a relational database in the first place. Either you get update anomalies, or you want to compact the data into single atoms, or you want ad hoc querying rather than a specific pattern of access set in advance.
It doesn't help that until 5.6 MySQL's query planner joined on disk if you tried to join any two tables in which one or both has a TEXT field, regardless of engine and regardless of the actual fields selected.
As you can imagine, this slows things down a bit, considering that's the core join performed for single post generation. It also makes every kind of "recent comments" plugin a performance nightmare (quite aside from their cache-busting power).
Wordpress divides posts + comments into 4 tables.
> which could be loaded asynchronously via Ajax for swift loading time and less load on the server (no slow view renders)
Wordpress can't assume any particular PHP + webserver configuration. Sometimes they're configured to send while rendering, sometimes the whole process blocks until it runs to completion. Ajax doesn't magically change that.
I also dislike the idea that I should need to download what is a small application to assemble a document ... when I could just download the document straight up.
When page loads get too long, I switch on pagination.
> cached with Nginx/Varnish, and further cached on the backend with Redis
My own setup is memcache for working objects. WP-supercache is configured to write gzip'd pages to disk and nginx is configured to serve the .gz versions of pages directly from disk whenever they're found. The OS file cache does the rest.
> I've seen WP sites that I had to debug/fix with 200+ queries per page.
When you see hundreds of queries per page, check your plugins and widgets. Wordpress is not my favourite software by any measure, but out of the box it's not that dumb.
The first blog software to get really popular was Movable Type (http://en.wikipedia.org/wiki/Movable_Type). MT was basically a set of Perl scripts that provided a nice GUI backend to manage content stored in a database. But the actual public-facing sites it created were all static files. When you hit "Publish," it would scan over the database and grind out a new set of static HTML files.
This made scaling a Movable Type site really easy, because even back then Apache could serve static files like a monster on practically any hardware. But it meant that when you changed something on your site, that change didn't show up there immediately; you had to wait for a rebuild operation to complete and a new set of static files to be ground out to see it. The larger your site got, the longer that rebuild operation could take. "Rebuilding" became to Movable Type what "buffering" used to be to RealPlayer (remember that?).
Then, after a few years of MT being the leading product in the then-rapidly growing blog market, the developers behind it decided to jack up the pricing structure for it (http://scott.yang.id.au/2004/05/license-changes-with-movable...). MT was not open source, it was proprietary, for-pay software, so if you wanted to use it you had to pay what they asked for it. And suddenly it looked like, instead of being twenty bucks, a valid MT license was going to cost hundreds.
As you can imagine, people freaked out and started looking at the available alternatives. There were lots, but none of them were particularly impressive at that stage. But some prominent people (see http://web.archive.org/web/20060410125402/http://diveintomar...) noticed a little project called WordPress that, while pretty feature-poor compared to MT, had two big things going for it: it was 100% GPL (so no worries about future price changes), and it used a dynamic publishing model where pages were generated live on request through database queries instead of being baked into static files.
"Hooray!" everyone yelled. "No more rebuilding!" And so the rush to WordPress began. It was pretty obvious at the time that using live, dynamic pages would be a scaling headache. But most blogs never get enough traffic for scaling to become an issue, and dynamic pages meant not having to wait for rebuilding anymore. Never underestimate the power of appealing to impatience.
The irony, of course, is that once WordPress got popular, people started noticing how inefficient dynamic pages could be, and the trend of the "static site generator" was born. SSGs are now very hot and buzzed-around, at least among programmers (they're generally too nerd-optimized to appeal to the general market). But there's nothing really new about the idea; they're just the pendulum swinging back to where Movable Type was in 2001.
Perhaps someday an SSG will dethrone WordPress. If that happens, mark your calendar; you'll want to be prepared when "instant updates!" become a selling point again, five years or so later.
Beyond that I couldn't understand their product feature priorities: no headings in their wysiwyg editor, no good themes. I was looking at something like http://om.co which is wordpress but using commercial themes not available to the general public.
I have said elsewhere that the solution to having to choose between static and dynamic is to make site updates POST-based with queuing logic. The research term for this is Staged, Event-Driven Architecture (SEDA).
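A rough sketch of what "POST-based updates with queuing logic" could look like, in Python. This is a single-stage simplification under my own assumptions, not a full SEDA implementation: the names `drain_and_rebuild`, `pending`, `render` and `store` are invented for the example, and a dict stands in for the static files on disk.

```python
import queue

def drain_and_rebuild(pending, render, store):
    """Rebuild each queued page once, however many updates it received."""
    dirty = set()
    while True:
        try:
            dirty.add(pending.get_nowait())
        except queue.Empty:
            break
    # Regenerate the static output for each distinct dirty page.
    for page_id in sorted(dirty):
        store[page_id] = render(page_id)
    return len(dirty)

pending = queue.Queue()
store = {}  # stands in for static files on disk

# Three updates arrive via POST; two of them touch the same page.
for page_id in ("post-1", "post-2", "post-1"):
    pending.put(page_id)

rebuilt = drain_and_rebuild(pending, lambda p: f"<html>{p}</html>", store)
print(rebuilt)  # 2
```

Reads keep hitting static output the whole time; writes only cost an enqueue, and a burst of updates to one page collapses into a single rebuild.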
mod_perl allowed programmers to sink their application code deep into the guts of Apache. In retrospect, this turned out to be a dead-end because it is difficult to install, difficult to update and difficult to scale up.
PHP's lack of power and low cost of operation ("here's your FTP account, have fun") made it much more attractive to shared hosts.
In short, operational advantages outweighed programming advantages.
Although Wordpress crashing under load is what caused me to switch to Octopress.
It's all about the caching plugins. I had my blog hit 300k visits in the span of 8 hours, peaking at about 100 visits per second on a shared web host. Utilizing caching plugins means every single person that visited my blog was served static content that refreshed itself every 10 minutes.
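To make "static content that refreshed itself every 10 minutes" concrete, here is a minimal time-based cache sketch in Python. It is not how any particular WordPress caching plugin is implemented; `TTLPageCache`, `render` and `clock` are hypothetical names for illustration.

```python
import time

class TTLPageCache:
    """Serve a stored copy of each page until it is older than the TTL."""

    def __init__(self, render, ttl_seconds=600, clock=time.monotonic):
        self._render = render      # the expensive dynamic page build
        self._ttl = ttl_seconds    # 600 s = the 10 minutes mentioned above
        self._clock = clock        # injectable so the expiry can be tested
        self._entries = {}         # path -> (expires_at, html)

    def get(self, path):
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]        # still fresh: no rendering at all
        html = self._render(path)
        self._entries[path] = (now + self._ttl, html)
        return html
```

With this shape, the dynamic render runs at most once per page per 10 minutes no matter how many visitors arrive in between.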
200 visits/second caused blank pages half the time.
300 visits/second? Boom.
Then again, 300 visits per second, estimating 250KB for the entire page (css, js, images, etc), is about 75MB/s, or roughly 600Mbps, which is going to be a lot to ask from any shared hosting provider.
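Working the estimate through, since bytes and bits are easy to mix up: 300 visits/s at 250KB each is 75,000KB/s, which is 75MB/s, and at 8 bits per byte that is 600Mbps. A quick Python check, assuming decimal units (1MB = 1000KB) and ignoring protocol overhead; `bandwidth` is just a helper name for this example.

```python
def bandwidth(visits_per_second, page_kb):
    """Return (megabytes per second, megabits per second) for a given load."""
    kb_per_second = visits_per_second * page_kb
    mbytes_per_second = kb_per_second / 1000   # decimal units: 1 MB = 1000 KB
    mbits_per_second = mbytes_per_second * 8   # 8 bits per byte
    return mbytes_per_second, mbits_per_second

mbytes, mbits = bandwidth(300, 250)
print(f"{mbytes:.0f} MB/s, {mbits:.0f} Mbps")  # 75 MB/s, 600 Mbps
```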
What I do with my VPS:
1. WP Supercache is configured to write .html.gz copies to disk.
2. Nginx is configured to check if such a file exists before referring anything to PHP. If it finds the .html.gz, it will serve it directly from disk. Ordinary file caching gives it the advantages Varnish promises.
3. WP Supercache appears to correctly handle invalidation when a new comment is posted.
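The lookup in step 2 can be sketched in Python as below. This only illustrates the "check for a pre-gzipped file first, fall through to the dynamic renderer otherwise" logic; it is not the nginx configuration itself. The directory layout (`<cache_root>/<request path>/index.html.gz`) and the names `find_cached_page`, `handle` and `render_with_php` are assumptions made for the example.

```python
import os

def find_cached_page(cache_root, request_path):
    """Return the path of a pre-gzipped page if one exists, else None."""
    # Hypothetical layout: <cache_root>/<request path>/index.html.gz
    relative = request_path.strip("/")
    candidate = os.path.normpath(
        os.path.join(cache_root, relative, "index.html.gz"))
    # Refuse anything that escapes the cache directory (e.g. "../" in the path).
    root = os.path.normpath(cache_root)
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate if os.path.isfile(candidate) else None

def handle(cache_root, request_path, render_with_php):
    """Serve the cached .gz bytes if present, otherwise render dynamically."""
    cached = find_cached_page(cache_root, request_path)
    if cached is not None:
        with open(cached, "rb") as f:
            # Bytes are already gzipped, so they go out untouched.
            return {"Content-Encoding": "gzip"}, f.read()
    # `render_with_php` stands in for passing the request on to PHP.
    return {}, render_with_php(request_path)
```

The cache hit path does no compression and no rendering, just a file existence check and a read, which the OS file cache makes cheap for hot pages.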
Wordpress can handle an insane amount of traffic if you know what you are doing. 99% of cases of wordpress borking under a sudden spike in traffic boil down to:
- Not using caching
- Using shared hosting
I'm not a Python guy, but Plone kicks its ass in pretty much every way as a CMS.
Wordpress can't seem to choose what they are, and they fail miserably from a technical perspective (yes, I know they have a huge market share, but I'm not one to believe that the markets actually pick the best option. McDonald's burgers are not the height of cuisine).
As with most website software, cache performance probably has a great deal to do with how WordPress performs.