
How Badoo Saved $1M Switching to PHP7 - DariaBo
https://techblog.badoo.com/blog/2016/03/14/how-badoo-saved-one-million-dollars-switching-to-php7/
======
romankolpak
I'm glad to see such positive news about PHP. It received way too much beating
from the community in the last couple of years while constantly improving.
It would be interesting to see some benchmarks on how PHP7 compares to Python
and Ruby in terms of performance.

~~~
FranOntanaya
I think one overlooked good thing is how little drama I hear about the
transition from 5.6 to 7. People seem to be moving on quickly (the large perf
improvements help a lot). I feel it's not going to be long before almost
anyone with a half skilled dev to care for it will be on 7 and we won't have a
Python 2 vs 3 or Perl 5 vs 6 scenario.

~~~
vpkaihla
The reason for that is that python2=>3 fixed several fundamental problems. 5.6
=> 7 is just an incremental upgrade, whatever the version numbering tells you.
Most of the PHP problems are still there.

~~~
MAGZine
> The reason for that is that python2=>3 fixed several fundamental problems.
> 5.6 => 7 is just a incremental upgrade, whatever the version numbering tells
> you. Most of the PHP problems are still there.

Too bad about py3's fundamental problem: it broke the interface so badly that
people _still_ don't use it. PHP7 has no such problem, and gets to add
boatloads of new features (which will be immediately used/adopted) while
deprecating old versions of the language.

~~~
vpkaihla
You don't understand. To fix problems like python3 did, you have to break
older code.

PHP7 did nothing like that; that's why the "porting" process was so light. But
it also means it didn't fix any of the big things.

~~~
TazeTSchnitzel
> Php7 did nothing like that, that's why the "porting" process was so light.

This is nonsense. PHP 7 contains a long list of backwards-compatibility
breaks:

[http://php.net/manual/en/migration70.php](http://php.net/manual/en/migration70.php)

If PHP 7.0 didn't break anything, it would have been called 5.7.

The upgrade is easy simply because these breaks do not have as much of an
impact as Python's did.

~~~
vpkaihla
No, it's not nonsense. The things you listed in there are incremental fixes,
not fundamental.

[http://php.net/manual/en/language.types.type-juggling.php](http://php.net/manual/en/language.types.type-juggling.php)

How would you fix this without breaking most PHP code out there?

How would you fix ==, > and < behaviour without breaking most PHP code out
there?

How would you fix Unicode without breaking most PHP code out there?

If PHP fixed things like this, it would be a decent programming language, but
it would also break a lot of things. Kind of like what python3 did, only that
it didn't have to break quite so much, since python2 was already pretty
decent.

------
jtchang
This is awesome. Not only did they produce an excellent write-up, but they
also contributed the patches back.

I'm not the biggest fan of PHP (mostly because I learned it on PHP4 and it
encouraged bad coding practices back then). However I can't deny how many
people use it and how easy it actually is to get started.

~~~
CiPHPerCoder
Up until last year, typing "PHP [anything security- or cryptography-related
here]" into any popular search engine would lead you to bad practices.

Then this happened:
[https://meta.stackoverflow.com/questions/293930/problematic-...](https://meta.stackoverflow.com/questions/293930/problematic-php-cryptography-advice-in-popular-questions)

Stack Exchange websites rank well, and they allowed me and others to
provide better answers than they had previously. This leads to more developers
being exposed to safer, cleaner ways to solve problems, which means less
terrible code being written from following terrible SO answers and tutorials.

And now things are a lot better. There's still obviously work that needs to be
done, but:

    
    
- The ecosystem (i.e. tutorials) is improving
- The language is improving drastically
- The community is pushing towards better practices
- Framework developers are taking security seriously
    

A lot of people don't like how PHP used to be, including most of the people
who put time and energy into making it better. You're not alone.

P.S. If you like the direction things are headed, help us make 7.1 even
better:

* [https://wiki.php.net/rfc/mcrypt-viking-funeral](https://wiki.php.net/rfc/mcrypt-viking-funeral)

* [https://wiki.php.net/rfc/libsodium](https://wiki.php.net/rfc/libsodium)

* [https://wiki.php.net/rfc/php71-crypto](https://wiki.php.net/rfc/php71-crypto)

~~~
voltagex_
Brilliant. Do other languages need the same cleanup?

~~~
CiPHPerCoder
Java, C#, and Node.js come to mind immediately as good candidates, but any
beginner-friendly language probably needs it to some degree.

The Node.js core is particularly bad: It only offers OpenSSL's userspace RNG
rather than the operating system's CSPRNG. (That's what
window.crypto.getRandomValues() hooks into.) Many of node's deficiencies are
tackled by the community, of course. For example:
[https://www.npmjs.com/package/scrypt-for-humans](https://www.npmjs.com/package/scrypt-for-humans)
versus the native PBKDF2 (with no constant-time comparison).
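
To make the two ingredients mentioned here concrete (randomness from the operating system's CSPRNG, and a constant-time comparison when checking a derived key), here is a minimal sketch in Python rather than Node.js. The iteration count, salt length, and the function names `hash_password` / `verify_password` are illustrative assumptions, not recommendations from this thread.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

# Illustrative parameters only (assumptions, not tuned recommendations).
ITERATIONS = 100_000
SALT_BYTES = 16


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a PBKDF2-HMAC-SHA256 key and return (salt, key)."""
    if salt is None:
        # secrets.token_bytes draws from the operating system's CSPRNG.
        salt = secrets.token_bytes(SALT_BYTES)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Recompute the key and compare it without leaking timing information."""
    _, key = hash_password(password, salt)
    # hmac.compare_digest takes time independent of where the inputs differ.
    return hmac.compare_digest(key, expected_key)
```

Comparing with `==` instead of `hmac.compare_digest` would still give the right answer, but the comparison could return early at the first mismatching byte, which is the timing leak a constant-time comparison avoids.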

I hear Ruby's SecureRandom still refuses to switch from /dev/random to
/dev/urandom as well. Feel free to throw your hat in the ring on that issue.

Historically, C# doesn't have as strong of an open source culture, but if you
wanted to make an impact, you could try to nurture it into growing
explosively.

But generally: Look at the languages you're already familiar with and see if
bad security advice gets more attention from non-experts than from experts. If
that's the case, go out of your way to learn how to do things better (if you
aren't already well informed) and publish your corrections.

------
giancarlostoro
PHP used to get a bad rap, and I myself looked on it unfavorably, but PHP7 is
indeed looking promising, and I have noticed that a lot of the tutorials I've
seen on the more major sites promote proper PHP development (from the little
PHP I know). For $5 a year I could host a PHP website; I can't effectively do
that in other languages / platforms. I really hope PHP7 gets more recognition
amongst most web hosts.

~~~
jshen
I can host a Python, Go, or Java site for free on App Engine. Cheap hosting
isn't exclusive to PHP.

~~~
brightball
Economically speaking, it really is. App Engine will run those languages
cheaper because it's subsidizing them.

PHP hosting is more naturally cheap which is why there are so many cheap PHP
hosts out there. It basically boils down to one simple principle: boot up.

With other languages, you have to boot up. You've got to start up the app, load
it into RAM and sit there waiting for a connection. With PHP, you can fill up
a hard drive with code and it will all run when called as long as the request
volume doesn't overload the RAM.

That's the entire reason that PHP hosting is so much cheaper. RAM doesn't get
used until a request comes in.

~~~
iso-8859-1
Isn't the language much bigger than the size of the parsed code? Using PHP
with FPM or with mod_php, you have that language runtime sitting there unused
even if there are no requests. By your logic, CGI would be the cheapest.

~~~
brightball
> By your logic, CGI would be the cheapest.

It would be. With PHP the language is loaded once regardless of how much code
is going to be parsed and executed. With other languages that have to boot up,
the amount loaded matches the size of the code base, which puts a cap on how
much low-traffic code you can put on a single server.

------
brianwawok
Or alternatively how Badoo could have saved $1.9M by coding in a compiled
language.

Ruby guys can always make the argument "Our code required 50% more servers, but
it increases developer productivity by 50%, and since developers cost more
than servers - it is a win".

Has anyone ever really made that argument with PHP? It seems one of the least
developer friendly languages possible on a large project. So developer
unfriendly + server unfriendly = ?????

~~~
mrweasel
>Or alternatively how Badoo could have saved $1.9M by coding in a compiled
> language.

Assuming you don't include the cost of converting their 3 million lines of
code to a new language and the cost of learning a new language. Of course they
did need to "port" from PHP5 to PHP7, but I feel that it's cheaper than going
to a completely different language.

You're not wrong, but the $1M saving is bought much more cheaply than the $1.9M.
PHP is a weird language though: it's easy to get started with, pretty
efficient at this point, and yet somehow people use it to write ugly and
sluggish code. I haven't written PHP in a long time, so this may have changed,
but it's almost like the language is missing ways of structuring the code in a
sane way.

~~~
jwdunne
This isn't really true anymore. Namespaces, as ugly as they are visually, plus
autoloading plus a package manager that handles it for you allows quick
structuring of a project. This can be configured to fit whatever conventions
you prefer but the community has put out there a set of good conventions in
the form of PSR docs so you can get started quickly.

The whole mixing of HTML with PHP with SQL is bad form and is widely
discouraged. In fact, this code is often seen with usage of mysql_ functions
plus manual escaping and interpolation. This is discouraged to the point where
that library has been removed.

I'm no evangelist, I prefer other languages, but with these "good parts" so to
speak, the language is usable and supportive of good structure. We have had
these improvements relatively late in the game but they do appear to come
thick and fast nowadays.

There are many warts in the language. Sometimes I'm angered when very good
ideas for progress are shot down via committee. There has, however, been
improvement since PHP 4.

~~~
maruhan2
Could you explain why HTML&PHP&SQL is bad? And is it only bad when all three
are combined?

~~~
jwdunne
I don't think you should have been downvoted for this question. It
discourages asking to learn.

What I meant was mixing them together in the same piece of code. The other
comments explain why: it's doing too much, and it's difficult to maintain and
reason about, so it's harder to find bugs, especially serious ones, and
difficult to fix them.

------
nerdy
It's hard not to love the CPU usage graphs. The same would go for the memory
usage graph if it weren't scary; it looks as though the application is using
no memory at all.

I've also been looking around for places using PHP7 in production, was glad to
hear it went well. While Etsy did it, they have Rasmus. Does anyone else have
examples of PHP7 running in production, especially if the development team has
written about it?

~~~
Judson
We ([https://judson.biz](https://judson.biz)) switched to PHP7 in production
last month. I need to review the git commits, but our modern (~2yr old,
running PHP 5.6) customer frontend and admin backend were very close to
production-ready without any work.

Our 10 year old inventory/vendor backend that relied on the (now removed)
mysql_* calls needed to be shimmed[0].

Our largest benefit was a significant drop (50%+) in memory usage when
allocating lots of objects (which we do when pre-fetching ORM relationships),
and a significant reduction (40%, 100-200ms) in render latency when rendering
many partial product templates.

Edit: One thing that did bite us post-rollout, which didn't show up in
development was this php7-fpm bug[1].

[0]: [https://github.com/dshafik/php7-mysql-shim](https://github.com/dshafik/php7-mysql-shim)

[1]: [https://github.com/php/php-src/pull/1720](https://github.com/php/php-src/pull/1720)

------
cubano
Let's face it, PHP is the Donald Trump of programming languages.

Tons of people love it, but admitting you do is pretty much socially
unacceptable.

:)

~~~
staticelf
I don't know why people downvote you, I think it was a hilarious comment and
I've written php for years.

------
jmngomes
> "Besides this, the overall load on the cluster fell below 50 percent thanks
> to Hyper-Threading technology, further contributing to the impressive
> results. In broad terms, when the load increases to over 50 percent, HT-
> engines, which aren’t as useful as physical engines, start working."

How did "overall load on the cluster fell below 50 percent thanks to Hyper-
Threading technology"? Isn't HT simply exposing a virtual core for every
physical core on the CPU?

When an HT CPU is monitored to be at 50%, it's actually at 100% or close to
it, so it appears that what fell 50% was the reported values for CPU usage. I
don't understand how there's a 50% decrease in load due to "when the load
increases to over 50 percent, HT-engines, which aren’t as useful as physical
engines, start working"
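
The arithmetic behind this question can be sketched with a deliberately simplified model. It assumes two logical CPUs per physical core and a scheduler that places threads on idle physical cores before doubling up; both are assumptions for illustration, and the function names are hypothetical.

```python
def reported_utilization(busy_logical: int, physical_cores: int,
                         threads_per_core: int = 2) -> float:
    """Fraction of logical CPUs that are busy, i.e. what a monitor reports."""
    return busy_logical / (physical_cores * threads_per_core)


def physical_cores_occupied(busy_logical: int, physical_cores: int) -> float:
    """Fraction of physical cores running at least one thread.

    Assumes threads are spread across idle physical cores first.
    """
    return min(busy_logical, physical_cores) / physical_cores


# 16 busy threads on a 16-core, 32-thread machine:
# the monitor shows 50%, yet every physical core already has work.
half_reported = reported_utilization(16, 16)
all_cores_busy = physical_cores_occupied(16, 16)
```

Under this model, a reported 50% means any further load has to share physical cores, so staying below that mark means each thread gets a core to itself. That would explain why the quoted passage treats 50% as a threshold, without implying that Hyper-Threading itself reduced the load.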

------
fauria
If anyone wants to try a PHP7 LAMP stack, I wrote a Docker image a few days
ago:
[https://hub.docker.com/r/fauria/lamp/](https://hub.docker.com/r/fauria/lamp/)

------
cdevs
I would care more about the response time improvement per request than the
tiny 100k saved in servers, though 300 fewer servers makes a happy sys admin
out there.

------
Im_a_throw_away
Do you guys know any good resource to learn PHP 7 for someone who learned
PHP years ago for fun (thus probably with a lot of bad practices)? Thanks!

------
jmnicolas
3 million lines of code for a dating website? Hopefully they don't decide to
create their own OS!

~~~
Mahn
That's what being online for 10 years will do to your product.

------
pmarreck
Might have been $3M had they switched to a functional language

------
johansch
"The idea that databases are a bottleneck in web-projects is an all-too-common
misconception. A well designed system is balanced: when the input load
increases, all parts of the system take the hit. Likewise, when a certain
threshold is reached, all components – not just the hard disk database, but
the processor and the network part – are hit. Given this reality, the
processing power of the application cluster is arguably the most important
factor"

Oh my. Where do we begin? I guess, using this logic, the language that is just
the exact amount of slow to match whatever database workload you have is the
optimal language.

Because, as we all know, all parts must be equally balanced, otherwise it's
not optimized. So if you need to use 100 mysql servers, you better darned use
100 computing servers of equal computing power, because if you only use 50
you're not balanced.

~~~
alexeyrybak
Pardon, are you sure there was anything about being "equally balanced"? Why
would 100 databases lead to 100 app servers? I'm afraid the key point is
missed: in a balanced system, when input load increases, all parts should slow
down together. That means you just need a "proper" proportion of resources in
the various parts of the system. No need to have 100 apps if your MySQL cluster
dies at only 25% CPU load on the apps cluster. This might just indicate you'd
better have 25 app servers vs 100 databases, and spend the rest of the money
proportionally on both clusters (assuming you only have apps and databases).
And a good balance also doesn't mean you have a perfectly designed or optimized
system. These are different things.
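
The 100-versus-25 example above works out as follows, assuming app CPU load scales linearly with traffic (a simplification). The function name is hypothetical.

```python
import math


def app_servers_needed(current_app_servers: int,
                       app_cpu_at_db_saturation: float) -> int:
    """App servers required to hit full CPU just as the database tier saturates.

    app_cpu_at_db_saturation is the app cluster's CPU load (0..1) at the
    moment the database cluster stops keeping up.
    """
    if not 0.0 < app_cpu_at_db_saturation <= 1.0:
        raise ValueError("CPU load must be in the range (0, 1]")
    return math.ceil(current_app_servers * app_cpu_at_db_saturation)


# 100 app servers sitting at 25% CPU when MySQL gives out:
# about 25 app servers would be in proportion to the database tier.
balanced = app_servers_needed(100, 0.25)
```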

~~~
cakoose
"Given [a well-designed/balanced system], the processing power of the
application cluster is arguably the most important factor."

That doesn't follow. You can have a well-designed system where the PHP code
only contributes 10% latency and resource usage, in which case the PHP code is
clearly not the most important factor.

"This might just indicate you better have 25 app servers vs 100 databases and
spend the rest of money proportionally on both clusters."

Exactly. What matters is proportion of latency or resource usage contributed
by the PHP code, and that can vary a lot between well-designed systems. The
property of being "balanced" doesn't automatically make your PHP application
cluster "the most important factor".
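
The point about proportion can be put in numbers with an Amdahl's-law style estimate. That framing and the 2x speedup figure are assumptions added for illustration; only the 10% share comes from the comment above.

```python
def overall_saving(component_share: float, speedup: float) -> float:
    """Fraction of total latency (or resources) saved when one component,
    responsible for component_share of the total, becomes `speedup` times faster.
    """
    if not 0.0 <= component_share <= 1.0:
        raise ValueError("component_share must be between 0 and 1")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return component_share * (1.0 - 1.0 / speedup)


# PHP contributes 10% of the total; making it twice as fast saves 5% overall.
small_share = overall_saving(0.10, 2.0)

# If PHP contributed 90% instead, the same speedup would save 45%.
large_share = overall_saving(0.90, 2.0)
```

Both systems could be equally "balanced", yet the payoff from a faster runtime differs by almost an order of magnitude.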

~~~
alexeyrybak
"The most important" phrase is misleading, I agree. The idea was to underline
the importance of CPU resources on the apps cluster, not to state that it's the
most important thing, the main, the one, always. You are correct.

