
Handling server load for the Raspberry Pi Zero launch - benn_88
http://blog.mythic-beasts.com/2015/11/27/raspberry-pi-zero-not-executing-a-trillion-lines-of-php/
======
benn_88
I'm the sole developer and maintainer of raspberrypi.org - we're a small team
at the Foundation and I do this alongside outreach, educational resources,
teacher training, software development and other projects. We pay Mythic
Beasts for a basic hosting service and they kindly provide us with a full
support service in running the server and keeping the site and other services
on the machine alive. That one machine runs various Raspberry Pi sub-sites,
the Raspbian and NOOBS downloads server, the Raspbian apt mirror, MPEG licence
key generation and more.

1\. Why PHP?

We all know PHP sucks (we advocate Python in education), but WordPress is
ideal for authors and contributors and that's what we had to begin with.

2\. Why WordPress?

When I joined the Foundation in 2013 they had a WordPress site Eben set up in
2011 and I wanted to keep the great user experience WordPress provides to
authors and contributors so I built a bespoke theme expanding the content
beyond the simple blog, keeping all former URLs working despite changes to the
URL structure. Some of the newer content, like documentation and learning
resources, is outside of WordPress and is pulled in from github and rendered
into templates.

3\. But why not build your own bespoke framework?

Firstly, if I did that, you'd tell me perfectly good web frameworks already
exist. Secondly, WordPress is very mature and it's easy to add functionality
quickly. I'm not talking about BS plugins - I hardly use any. I mean it's easy
to add stuff into the templates, stuff you're likely to want to do. Finally,
I'd rather spend my time and the Foundation's money doing educational
projects.

4\. WordPress caching plugins exist. Why don't you just use them?

We use WP Super Cache. It does most of what we want, but has some pretty fatal
flaws: it doesn't go to cache for "logged in users" (including anyone who's
ever made a comment), it recreates the cache after every comment, it does some
work to figure out if it can give the user a cached page or not (this slows it
down a lot), it doesn't cache the 404 page (more important than you'd think)
and it doesn't cache requests with any GET parameters. Mythic created
their own static page generating script for the pages with the most hits, which
updates every minute, and that's what kept us alive this time.

5\. Why not use NGINX?

See a full article on this here:
http://blog.mythic-beasts.com/2014/11/14/hiphop-and-wordpress-if-youre-tired-of-tea-then-youre-tired-of-life/

6\. What about hiphop vm?

We tried it. Fast but unreliable. We saw a huge rise in performance but most
of that came from the upgrade from Wheezy's PHP 5.3. Compared to Jessie's PHP
5.6, the gain was less impressive.
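
Point 4 mentions a script that pre-generates static copies of the busiest
pages once a minute. That script is not shown in this thread, so what follows
is only a minimal Python sketch of the general idea. `snapshot_pages`, the
`render` callable and the hashed file-naming scheme are all assumptions made
for illustration, not Mythic's implementation.

```python
import hashlib
import os
import tempfile
from typing import Callable, Iterable


def snapshot_path(cache_dir: str, url_path: str) -> str:
    """Map a URL path to a file name inside cache_dir (hypothetical scheme)."""
    digest = hashlib.sha256(url_path.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".html")


def snapshot_pages(
    cache_dir: str,
    url_paths: Iterable[str],
    render: Callable[[str], str],
) -> int:
    """Render each page once and write it out as a static file.

    `render` stands in for whatever produces the HTML, for example an HTTP
    request to the dynamic site. Each page is written to a temporary file
    and then renamed into place, so a reader never sees a half-written page.
    Returns the number of pages written.
    """
    os.makedirs(cache_dir, exist_ok=True)
    written = 0
    for url_path in url_paths:
        html = render(url_path)
        fd, tmp_name = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(html)
        # os.replace swaps the new snapshot in over any older one.
        os.replace(tmp_name, snapshot_path(cache_dir, url_path))
        written += 1
    return written
```

Run from a scheduler once a minute, a job like this keeps the hot pages no
more than about a minute stale while the web server only ever reads files.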

~~~
jrcii
PHP doesn't "suck"; it successfully runs thousands of projects, including a
handful of huge ones, all over the world. There is plenty of bad Python code
out there as well. Maybe not as much, but it hasn't been popular for as long
either. In fact I don't recall hearing anything about it for web development
until Google standardized around it and everyone jumped on their bandwagon.
For those who would immediately reference the "fractal of bad design"
article, many of whom I suspect have never spent any real time with PHP,
here's a point-by-point rebuttal:
http://forums.devshed.com/php-development-5/php-fractal-bad-design-hardly-929746.html
which reveals that most of the arguments against PHP boil down to "PHP is not
X language which I'm used to".

~~~
nickpsecurity
Same argument can be made for COBOL, which sucks. Anyway, I see you have
something to counter a critique. It might be helpful if you provided a link
that shows PHP is equal or superior to alternatives in important attributes:

1\. Concise code to express common constructs.

2\. Easy for 3rd parties maintaining or extending software to understand what
it does.

3\. Easy for type system, tests, or static analysis to catch errors before
they hit production.

4\. CPU or memory efficient during runtime.

5\. Higher uptime for interpreter and critical libraries.

6\. Lower number of defects and vulnerabilities in both interpreter and common
libraries.

7\. Ease of writing portable code might matter to some in case they like
shopping around for cheaper cloud vendors.

So, there's a start at a list with important metrics for mission-critical,
long-term code in web apps. Got a link or links showing PHP is equivalent to
or better than the competition in these?

~~~
Udo
I don't think "_PHP sucks!_" - "_no, it doesn't!_" - "_unless you can
satisfy my 7 point list you are wrong and my biases are correct!_" is a
productive pattern of communication here, nor is chasing cheap applause lines
by attacking a language every 'real hacker' is supposed to reflexively
despise.

If we're being honest every language has its weak points and things that are
disliked even by its proponents. Efforts at improvement are more productively
directed towards making things better _on your own turf_. Taking derivative
and uninformed jabs at a language you already dislike and will never use is
not a good way to spend your energy, nor is the energy of your opponents well
spent in trying to overcome your biases.

~~~
nickpsecurity
I mostly agree with you. Hence my list of key points that have been the reason
for adoption of or migration from a language for production systems. Far from
a jab, it's a start at objectively measuring PHP's value vs other languages.

I actually have this data for a number of systems languages from studies done
in 90's and early 2000's. Despite much talk, I've yet to see a PHP proponent
post such objective information in a discussion. So I asked again as I know
_somebody_ has to have done an objective analysis or comparison. Yet again,
all we hear is (a) critiques are BS, (b) many people use it, and (c) therefore
PHP is a good choice. Sounds solid to me! ;)

~~~
Udo
> _the reason for adoption of or migration from a language for production
> systems_

People may think they choose languages and runtimes for objective reasons, but
objectivity is only a part of it. There are other factors like ecosystem
support, public image, fads, suitability for a specific purpose, aesthetic
aspects, and how well a certain language resonates with someone personally.

In this specific case, I believe there is zero value in convincing someone to
use, say, PHP for something (in production or otherwise) if they already
dislike the language for any reason. And it's not like you're missing out on
anything either for not choosing PHP. There are plenty of options out there.

> _I've yet to see a PHP proponent post such objective information in a
> discussion._

I'm probably not the best person to do this. While I use PHP and believe in
certain contexts it's a good choice for me, I'm less sure about the benefits
of evangelizing this in a greater context. There are lots of things in PHP
that I actively dislike, as there are in any language I am proficient with.

> _(a) critiques are BS, (b) many people use it, and (c) therefore PHP is a
> good choice_

I think for several reasons PHP's advantages are minimized or even badly
understood even by people who write big software packages in it. That's just
my opinion, though. If you still want my atypical response to your points
anyway, I'll give it.

~~~
nickpsecurity
"People may think they choose languages and runtimes for objective reasons,
but objectivity is only a part of it."

Oh I agree. It's why I don't knock them for whatever they use for personal
projects. Serious stuff I recommend we throw the objectively best we have at.
Not PHP or many others for that matter.

"I think for several reasons PHP's advantages are minimized or even badly
understood even by people who write big software packages in it. That's just
my opinion, though. If you still want my atypical response to your points
anyway, I'll give it."

The one advantage I see is a ton of existing code and programmers to draw on.
One of same advantages for C. Also, a great pre-processor for HTML crowd
stepping into programming. Original selling point unless I'm mistaken with
ColdFusion doing same thing in business for data-driven apps.

Those are only two I know vs languages in same space. Certainly add anything
that you think are advantages. I try to give everything a fair shake. ;)

~~~
Udo
> _The one advantage I see is a ton of existing code and programmers to draw
> on_

I would not consider this an advantage per se, because exactly like C, it's
difficult to find a certain type of programmer who cares about things like
performance or has a real understanding of what is happening when their code
executes. I see the same thing happening in the Node.js community. (PHP,
C/C++, and Node are my primary languages right now, I'm certain the same thing
happens everywhere else, too)

> _Certainly add anything that you think are advantages. I try to give
> everything a fair shake. ;)_

How gracious of you :P As I said, I feel I'm not really representing the PHP
community or the direction it's heading right now, but here's my view:

As far as your points 1, 4, 6, and 7 are concerned - it's pretty much on par
with other dynamic languages such as Lua or Ruby. My knowledge of PHP VM
internals is limited, but as far as I saw last time, it's not fundamentally
different from Lua's. There are some big WTFs in the syntax area, which I
think is one of PHP's weakest points to begin with, but as far as expressivity
and concision is concerned it's relatively powerful. Rubyists tend to talk
about how speed is not important, which I disagree with, and PHP has never let
me down in this regard.

As for 2 and 3, well it's a dynamic language with an optional class system. As
such it has the same weaknesses and strong points as other languages with
those characteristics. Personally, I think PHP is strongest in fact _without_
using the class system religiously, and instead making major use of its
array/list type and functional code patterns.

Which brings us to 5 and 6. PHP's unique strength and primary reason for me
using it is its execution model in a web server environment. Every other
language and web framework has gone the way of the application server, but PHP
is using a per-request execution model which is incredibly powerful if used
right. Per-request execution sounds like a high-overhead situation, and in the
past it was, but while there is undoubtedly _some_ cost associated with it,
the FPM engine is really quite efficient with it.

In exchange for this you get to benefit from a stateless app. If your code,
the interpreter, or a library screws up, you're not taking down the server
with it, you just disrupt the requests that actually hit the bug. You can
tailor your app's code paths with this execution model in mind, loading and
initializing only what you need. Since the environment is torn down after a
request finishes, there are no leaks or complex GC gotchas. Scaling up becomes
easy, because state is not stored in your code. You can spin up as many
servers as you like, distribute requests among them, and they don't have to be
aware of each other. Retracing what happens per request becomes very easy,
which is great for optimizations and debugging. And you get live code
reloading for free. I love this, it's so simple and elegant.
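
To make the isolation argument concrete, here is a toy Python sketch of a
shared-nothing dispatcher: every request gets a fresh state object that is
thrown away afterwards, and a failure is confined to the request that
triggered it. It only mimics the idea and is not how PHP-FPM is implemented;
`serve`, `handler` and the state dict are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

Handler = Callable[[str, Dict[str, int]], str]


def serve(requests: List[str], handler: Handler) -> List[Tuple[int, str]]:
    """Run each request in isolation and return (status, body) pairs."""
    responses: List[Tuple[int, str]] = []
    for path in requests:
        state: Dict[str, int] = {}  # built from scratch for every request
        try:
            responses.append((200, handler(path, state)))
        except Exception as exc:
            # Only the request that hit the bug fails; the loop keeps serving.
            responses.append((500, f"error: {exc}"))
        # `state` is rebound on the next iteration, so nothing carries over.
    return responses


def handler(path: str, state: Dict[str, int]) -> str:
    """Example application code with one deliberately broken route."""
    state["hits"] = state.get("hits", 0) + 1  # always 1: no carried-over state
    if path == "/buggy":
        raise RuntimeError("boom")
    return f"{path} hits={state['hits']}"
```

Calling `serve(["/a", "/buggy", "/a"], handler)` gives a normal response, one
error, and then another normal response whose counter is still 1.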

Of course, there are lots of things this is unsuitable for, especially when
you really do need a persistent app server. But even in these cases, it can be
advantageous to write a small broker-style server in C or Node (or Go, or
Erlang, whatever you want), and put the intelligence part in a PHP backend
API. I use this setup in some projects where Websockets or other persistent
protocols need to run: the broker needs to be updated very infrequently, so it
can have a gigantic uptime, and the backend PHP API gets updated all the time
transparently with a git pull - nothing needs to be restarted, everything just
works.
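
The broker/backend split described above can be modelled in a few lines. This
is a deliberately small, in-memory Python sketch and not the actual setup from
this comment: `Broker`, `connect`, `receive` and the swappable `backend`
callable are hypothetical names, and a real broker would speak WebSockets and
reach the backend over HTTP.

```python
from typing import Callable, Dict, List

Backend = Callable[[str, str], str]  # (client_id, message) -> reply


class Broker:
    """Long-lived process: owns connections, holds no application logic."""

    def __init__(self, backend: Backend) -> None:
        self.backend = backend
        self.outboxes: Dict[str, List[str]] = {}

    def connect(self, client_id: str) -> None:
        """Register a persistent connection (modelled as an outbox list)."""
        self.outboxes[client_id] = []

    def receive(self, client_id: str, message: str) -> None:
        # Every message is a fresh, stateless call into the backend, so the
        # backend can be redeployed without dropping any connection.
        reply = self.backend(client_id, message)
        self.outboxes[client_id].append(reply)


def backend_v1(client_id: str, message: str) -> str:
    return f"v1:{message}"


def backend_v2(client_id: str, message: str) -> str:
    return f"v2:{message.upper()}"
```

Swapping `broker.backend` from `backend_v1` to `backend_v2` between two
messages plays the role of the `git pull`: the connection stays registered
while the logic behind it changes.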

~~~
jkarneges
> _the broker needs to be updated very infrequently, so it can have a gigantic
> uptime, and the backend PHP API gets updated all the time transparently with
> a git pull_

I strongly advocate this architectural style. Even for developers well-versed
in Node, Go, etc, it's good defensive programming to split the application
into two processes (broker and backend).

You might find the Pushpin project interesting. It's an attempt at
generalizing the broker part, and it works great with a stateless PHP backend.

~~~
Udo
It's great to see I'm not alone :)

> _You might find the Pushpin project interesting._

That is interesting, I didn't know such a project existed (kinda surprised
actually). However, that's not a pain point for me at the moment. I maintain
my own reusable broker module for Node/Websocket things, plug it into the
NginX config and I'm ready to go. Plus it's not too hard to cook up one of
these things if the need arises.

------
Udo
I want to criticize the choice of headline here, which was no doubt the point
of having it. It's not about PHP. PHP runs fine on Pis. Not that I would ever
advocate running a website with serious traffic on that device, but it could
very well serve 50-100 dynamic requests per second. (I just tested this with
my Pi B running NginX and my custom web-based home automation software. The
baseline is 250 small static file requests per second, so not a great starting
point for publicly hosting things.)

Instead, this should be a critique of Wordpress, which, like almost any other
successful software, is immensely bloated. There is nothing inherent in PHP
promoting this verbosity. And even if you are dealing with large code bases,
PHP's request-based execution model gives you good tools to just execute the
codepaths you need. But in practice, little consideration goes into minimizing
execution bloat and overhead. Of course there are reasons why WP is bloated,
some of them bad, some good, some subject to debate. Caching and avoiding
dynamic code execution will be a big part of optimizing any Wordpress install
for the foreseeable future.

We're moving towards more bloat, not less. The best practices of many
programming environments actively encourage us to disregard cycles and
latencies. This happens in PHP, in Node, even in C++ frameworks.

~~~
thrownear
Don't use Php anyway.

[https://www.reddit.com/r/lolphp/](https://www.reddit.com/r/lolphp/)

~~~
djdjgd
And use what instead? Serious question.

~~~
thrownear
Python 3 or even Node is an improvement...

------
tacos
One form of this article might be: "Here are some best practices for hosting a
high-traffic WordPress site."

Another form is "I saved the world from a trillion lines of PHP! And if we did
this on AWS it would be _many tens of thousands of dollars per month_..."

Perhaps I should make sure I'm giving the author the benefit of the doubt:

"... you’d still need to make sure you can effortlessly scale to thousands of
cores ..."

10k users peak. _Thousands_ of cores? $30k+ a month? This is blatant
misinformation.

------
johansch
The lengths people go to in order to run Wordpress never cease to amaze me.

~~~
mrcarrot
In this case isn't it more the lengths they were going to to _not_ run
Wordpress?

~~~
johansch
Cute.

------
tyingq
Reading the article they finally get down to what it is...98 lines of perl
that serve as a url specific content cache.

A somewhat odd solution for a php driven website. Since you're in php already,
a more typical solution would be to use apcu to have the cache in shared
memory.

Or, if your needs are more complex, using an already established cache like
the one in nginx or varnish, etc.

Whatever they did served their needs, but I'm not clear on why "we made a one-
off url+param keyed cache" rates an article.

~~~
mfjordvald
APCu would not be enough at this level. Simply invoking PHP itself is enough
to kill servers here. But yes, using nginx or Varnish would probably have been
easier, and there are plenty of resources on it. In a case at my last job a
single nginx instance handled 24k concurrent connections just fine.

~~~
toast0
PHP can easily handle thousands of requests per second, as long as each
request doesn't do much. Fetching from a shared memory cache counts as not
doing much. You would need to do it before you initialize WordPress, though.

~~~
tyingq
Right, the idea would be to put the logic at the top of the main php file for
wordpress, something like:

    
    
      // Oversimplified, but...
      // Key on the full request URI (path plus query string), so that
      // different pages never share a cache entry.
      $key = md5($_SERVER['REQUEST_URI']);
      $content = apcu_fetch($key, $success);
      if ($success === true) {
          echo $content;
          return; // never invoke WordPress
      }
      // WordPress here
      // some logic at the bottom to store what WordPress
      //   just rendered into the cache
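
The comments in the snippet above leave the store step unwritten. As a rough
illustration of the whole check, render and store flow, here is a sketch in
Python rather than PHP, with an in-process dict standing in for shared memory.
`FrontCache`, `render` and the TTL value are hypothetical names chosen for the
example.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple
from urllib.parse import urlencode

Renderer = Callable[[str, Dict[str, str]], str]


class FrontCache:
    """Tiny time-limited page cache keyed by path and query string."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def key(path: str, query: Dict[str, str]) -> str:
        # Sort the parameters so ?a=1&b=2 and ?b=2&a=1 share one entry.
        canonical = path + "?" + urlencode(sorted(query.items()))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(self, path: str, query: Dict[str, str], render: Renderer) -> str:
        key = self.key(path, query)
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: the expensive renderer never runs
        body = render(path, query)  # miss: run the application once
        self._entries[key] = (now + self.ttl_seconds, body)
        return body
```

The injectable `clock` is only there so expiry can be exercised without
sleeping; in normal use the default monotonic clock is enough.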

------
rasz_pl
I was one of the original raspberrypi.org trolls for a couple of months. RS
Components scammed me with a "preorder" and kept moving the delivery date for
HALF A YEAR, while at the same time __already delivering__ to newer customers -
they prioritised fresh customers over me, because they already had my money :/

Similar thing happened again with Zero, all big fat distributors (this
includes Adafruit) engaged in scamming early customers by bundling $5 product
with $2 of cables and upselling whole thing as a $20-60 pack(!!). They went as
far as removing $5 product listing altogether.
[http://www.adafruit.com/pizero](http://www.adafruit.com/pizero) had $5 listed
for maybe a few hours, then they deleted it. They still list other out-of-stock
products there.

To this day Pee foundation acts shocked, SHOCKED I TELL YOU every time someone
complains about official distributors being scammy. The whole thing would go a
lot smoother (from the client perspective) if they didn't shield the predatory
marketing tactics of those resellers. Today I understand it wasn't rPi's fault,
but I'm sure other people don't, and acting as PR for element14 will never
work if you promise X and element14 silently delivers X + a super cheap piece
of crap at 4x the price (before taxes) behind your back.

------
pmlnr
This is a crappy article about realizing the need of cache in front of dynamic
websites.

Wow.

PHP haters: yes, PHP is crap, and we should all code in C, because nothing
beats C in speed. ( Except for Perl in regex land. )

WordPress hater: the WP core itself is not that bad. The themes and the
plugins, those are the real monsters.

Article owner & WordPress users: there are pretty good cache plugins already
for WP, you needn't write another.

~~~
walshemj
Fortran still beats c though

------
TazeTSchnitzel
Er, so they rolled their own custom caching system?

Isn't caching one of the first recommended ways to improve the performance of,
well, any mostly-static content site?

------
pronoiac
There are caching plugins for Wordpress, and you can configure Nginx to serve
pages from the cache if no relevant cookies are set. So this feels a little
like it's showing lack of knowledge about what's been done before.
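
The "no relevant cookies" test is the core of that approach, and it fits in a
few lines. The Python sketch below shows the decision only, not an Nginx
configuration. The prefixes are the cookie names WordPress uses for logged-in
users, commenters and password-protected posts; treat the exact list as an
assumption to check against your own install, and note that
`can_serve_from_cache` is a made-up helper name.

```python
from typing import List

# Cookie-name prefixes that mark a visitor as needing a personalised page.
BYPASS_PREFIXES = (
    "wordpress_logged_in_",
    "comment_author_",
    "wp-postpass_",
)


def cookie_names(cookie_header: str) -> List[str]:
    """Extract cookie names from a raw Cookie request header."""
    names = []
    for part in cookie_header.split(";"):
        name = part.split("=", 1)[0].strip()
        if name:
            names.append(name)
    return names


def can_serve_from_cache(method: str, cookie_header: str) -> bool:
    """Return True when a shared cached copy is safe for this request."""
    if method not in ("GET", "HEAD"):
        return False  # writes must always reach the application
    return not any(
        name.startswith(BYPASS_PREFIXES) for name in cookie_names(cookie_header)
    )
```

Unrelated cookies such as analytics IDs do not force a bypass here, which
matters because a check for "any cookie at all" would send most visitors past
the cache.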

------
acd
You can run PHP 7 and Nginx; that would be faster and stable. When running
Nginx you should use its proxy caching features together with Wordpress; that
is very, very fast!

Why not benchmark a raspberry pi cluster ?

------
cjfont
For some reason I thought this was going to have something to do with the
Raspberry Pi Zero.

~~~
joosters
It's about the people who are hosting the Pi website, and what they did to
manage the burst of activity when the Pi Zero was announced.

~~~
cjfont
Right, but the title strongly implies that the article is talking about
executing PHP on the Pi.

~~~
giancarlostoro
I had the same reaction, I was hoping to hear about a problem solved by the Pi
Zero, maybe configuration tweaks or something for PHP, it sounded almost like
it was going to give a performance tip as far as the Pi Zero is concerned.

------
stefantalpalaru
Wouldn't it be better to set up a caching reverse proxy like Varnish? I bet
there are configuration examples out there to help you run it with Wordpress.

~~~
jacquesm
Varnish would do just fine, ditto Nginx, or they could have used cloudflare or
some other service.

The funny thing for me is that they make it seem as if 10K concurrent users is
a 'big deal' for a bunch of static pages. If you're not interacting in real
time a little bit of caching or page generation will go a long way towards
being able to serve up a huge number of viewers / participants.

~~~
gabemart
I've had 10k concurrent users on a bandwidth-intensive app served statically
from 3 $5 digital ocean droplets using nginx. CPU usage and memory usage were
not that high. I agree it's not a big deal.

------
programminggeek
But I really like executing PHP code!

