
PHP 7 deployment at Dailymotion - dailymotioneng
http://engineering.dailymotion.com/php-7-deployment-at-dailymotion/
======
nikcub
Hack and HHVM solve what is, IMO, the worst feature of the default PHP
runtime environment[0]: the superglobals.

It wasn't mentioned in the post from Slack, but default superglobals and the
earlier register_globals design decisions are the worst and most impactful
wart in PHP.

Because PHP was designed as a templating language, the default web server
interface (CGI) auto-exposes all request variables in global scope, e.g.

    
    
        echo $_POST['user_id'];
    

PHP has a horrible reputation with security for this reason - we all know that
somehow, somewhere, in almost every project someone is pulling in a user-
controlled variable from a superglobal and they aren't escaping or checking it
properly (since you can't be warned about it but the feature will work).

Worse - and I've seen this _a lot_, even with Laravel, CodeIgniter, Cake,
Symfony/Silex etc. - you end up with these well-structured projects that
declare request classes, methods and variables, but then sometime down the
road a developer takes a shortcut in a method and pulls in a $_GET or $_POST
inside a controller (usually because they don't know how to change all the
related classes, or can't be bothered to), going around the framework's
normal request handling.

I've seen this so often - because it's so easy to do it. The most common place
is where a designer has built a frontend AJAX form. They now need to build a
quick backend check, so they Google "php backend ajax username check" and
they'll likely get a result like this one:

[http://stackoverflow.com/questions/29459183/check-
username-a...](http://stackoverflow.com/questions/29459183/check-username-
availability-using-ajax)

where the 4th and 5th lines of the solution are:

    
    
        $username=$_POST['username'];
        $query="SELECT * FROM username_list WHERE username='$username' ";
    

they copy that into a file called ajax_username_check.php and save it to the
server - and they've now destroyed all that previous good work by opening up a
very blatant and easy to find SQLi vulnerability. Their database will be on
pastebin within a month.
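
For contrast, here is a minimal sketch of the same lookup with a bound parameter. It assumes a PDO connection; the function name `isUsernameTaken` is made up for illustration, while the table and column names come from the snippet above.

```php
<?php

/**
 * Hypothetical helper: returns true if $username already exists in the
 * username_list table. The value is bound as a parameter, so it never
 * becomes part of the SQL text.
 */
function isUsernameTaken(PDO $pdo, string $username): bool
{
    $stmt = $pdo->prepare(
        'SELECT 1 FROM username_list WHERE username = :username LIMIT 1'
    );
    $stmt->execute([':username' => $username]);

    // fetchColumn() returns false when no row matched.
    return $stmt->fetchColumn() !== false;
}
```

The caller would still need to check that `$_POST['username']` is actually a string before passing it in, but a quote character in the input can no longer change the query.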

You can spot this type of vulnerability from the frontend because the URLs
used in the AJAX calls don't match the URL router patterns for the rest of the
app (ex. GET /user/username_check_ajax.php vs /user/check_user).

In other languages you can't get that without using a standard library that
will escape the values by default. Any solution you search for will always be
a safe method to obtain the variable values by default.

Some good news: Hack doesn't expose superglobals in strict mode:

[http://cookbook.hacklang.org/recipes/get-and-
post/](http://cookbook.hacklang.org/recipes/get-and-post/)

I'd _strongly_ recommend that this is used in all PHP projects, since it
strictly enforces variable access - even in cases where you're using a
framework that is supposed to enforce it.

IMO PHP missed a big opportunity by not removing superglobals in version 7
and enforcing an explicit safe request object much like other languages do.
They likely wanted to avoid it because of the cluster of register_globals and
magic_quotes from earlier versions.

[0] I think it is important to distinguish PHP the language and PHP the
runtime. PHP the language is now decent - having caught up with a lot of
features (although I find it very verbose and harder to read) while PHP the
runtime is undoubtedly still a horrible runtime - hence HHVM

~~~
Akujin
I don't get why people keep harping on about superglobals being inherently bad.
The variables are there. You can use them or ignore them. A variable
definition harms you in no way other than a tiny bit of memory usage, which is
capped by the size limits on POST and GET requests anyway. What? You think you're
gonna get hacked because $_POST['ihaxyou'] is set to 'w00ts'?

No one does this anymore:

    mysql_query("SELECT * FROM `table` WHERE `id`=".$_POST['ID']);

There's absolutely NOTHING wrong with having $_POST['whatever'] inside a
controller as long as you're doing proper checks.

Are you expecting it to be an integer? Easy

    if (!ctype_digit($_POST['ID'])) {
        // throw exception here
    }

Contrary to the hive mind you don't need some special encapsulation class to
pull your post and get variables.

HHVM does not in any way solve this issue. You still have to write proper
validation into your code or you'll get hacked. What's with people expecting
frameworks to do everything for them these days?

~~~
developer2
> Are you expecting it to be an integer? Easy

> if(!ctype_digit($_POST['ID'])) { // throw exception here }

ctype_digit is broken. Try passing integer values. ctype_digit(50) === true,
but ctype_digit(100) === false. And "0000001" passes as true, which most
people in the majority of scenarios would prefer not to pass. I can't remember
what the even worse bug with ctype_digit is, but even if you cast the value
to string [ex: ctype_digit((string)$var)], there is some value that still
passes as true when it shouldn't - do _not_ use ctype_digit. is_numeric() is
also unusable for validation [is_numeric("123e4") === true]. is_int() is a
strict type check, so it can't be used on its own to validate request
variables, which are always strings (...or arrays, more below).

The _only_ correct way to verify that a variable contains either a valid
numeric string or integer is to compare the type, and then use a regex or a
double string-then-int cast.

ex: unsigned database ids:

    if ((is_int($var) || is_string($var)) && preg_match('/^[1-9]\d*\z/', $var)) {
        // definitely an int > 0
    }

ex: signed integer:

    if (is_int($var) || (is_string($var) && (string)(int)$var === $var)) {
        // valid int (including negative values)
    }
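
The string-then-int round trip can be wrapped in a pair of helpers so that every request variable goes through the same check. This is only a sketch; `parseIntParam` and `parsePositiveId` are names invented here, not functions from PHP or any framework.

```php
<?php

/**
 * Hypothetical helper: returns the value as an int if it is an int or a
 * canonical decimal integer string, otherwise null.
 * Arrays, floats, "007", "1e3", " 5" and "" are all rejected.
 */
function parseIntParam($value): ?int
{
    if (is_int($value)) {
        return $value;
    }
    // The round trip through int only reproduces the input for canonical
    // decimal strings that fit in the platform's integer range.
    if (is_string($value) && (string)(int)$value === $value) {
        return (int)$value;
    }
    return null;
}

/** Hypothetical helper: same idea, restricted to database-style ids > 0. */
function parsePositiveId($value): ?int
{
    $int = parseIntParam($value);
    return ($int !== null && $int > 0) ? $int : null;
}
```

With these, `parseIntParam('-5')` gives `-5`, while `parseIntParam('007')` and `parseIntParam(['42'])` both give `null`, so a single `=== null` check covers bad strings and unexpected arrays alike.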

Frankly, developers who don't understand how request variables are handled in
PHP have zero chance of properly validating input. Find any site/app written
in php, even if built on any of the major frameworks. You can instantly break
30-50% of them by passing an array where a string is expected.

Find an app that takes "?query=hello+world". Instead pass in
"?query[]=hello+world". Want an example? Log in to Facebook, then visit this
search page[1]. Look at the query string and then what was searched for - and
the contents of the search box. Bam, even Facebook gets it wrong! Same thing
with Symfony's search[2]. Or Packagist (composer's package manager
repository)[3]. More seriously at Yii[4], which exposes an internal error to
users as they try to string-trim an array ("Error - trim() expects parameter 1
to be string, array given").

Most developers - including many seniors who have been exclusively coding in
php for years - have no clue. You will either cause a 500 Internal Server
Error, or your input array will result in an output string of "Array" if they
typecast your array to the string they expected. Even the major frameworks,
when you pull user-submitted values, simply pass through the value submitted.
Your app expects a string (or a string that contains a numeric value), and
instead any user who knows the "[]" syntax can pass in an array.
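
A rough sketch of what guarding against this could look like: a helper that only ever hands back a string. The name `stringParam` is hypothetical, and `parse_str()` is used here just to reproduce how PHP turns the "[]" syntax into an array.

```php
<?php

/**
 * Hypothetical helper: fetch a request parameter only if it is a string.
 * "?query[]=hello" makes PHP populate the parameter with an array, which
 * is reported here as null instead of being passed along.
 */
function stringParam(array $params, string $name): ?string
{
    $value = $params[$name] ?? null;
    return is_string($value) ? $value : null;
}

// parse_str() decodes a query string the same way PHP fills $_GET.
parse_str('query=hello+world', $normal);
parse_str('query[]=hello+world', $tampered);

$fromNormal = stringParam($normal, 'query');     // the string "hello world"
$fromTampered = stringParam($tampered, 'query'); // null, because it is an array
```

In real code the first argument would be `$_GET` or `$_POST`, and a `null` result would be treated as a validation failure.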

Really reflect on this fact. Most applications start handling a submitted
array value as if it's a string. The bugs this produces are astronomical in
some cases.

If you think your framework protects you, think again. The frameworks' request
objects also do not have strict type checking. The same goes for their form
and model validation classes; if you're using the built-in "integer" or
"numeric" validators, you're probably doing things wrong.

It's a nightmare. You could try to blame PHP, but really it's the developers -
including the developers of every major well-known framework I've ever touched
- that have absolutely no clue.

Related tangent: comparing password and password confirmation fields. Many
developers do if ($password == $passwordConfirm) {}. In PHP 5.x, "10" == "0xA"
(so type "10" in password field and "0xA" in the confirmation field, and it
passes validation). This changed in PHP 7 though. There are only two correct
ways to verify that two strings are identical: $password === $passwordConfirm
(triple equals), or strcmp($password, $passwordConfirm) === 0.
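
The hex case no longer reproduces on PHP 7, but the same class of problem can still be shown with exponent notation, since both operands are numeric strings. A small sketch; the variable names are only illustrative.

```php
<?php

$password = '10';
$passwordConfirm = '1e1';

// Loose comparison: both operands are numeric strings, so PHP compares
// them as numbers (10 == 10.0) and reports a match.
$looseMatch = ($password == $passwordConfirm);

// Strict comparison: same type and same bytes required.
$strictMatch = ($password === $passwordConfirm);

// strcmp() returns 0 only for identical strings.
$strcmpMatch = (strcmp($password, $passwordConfirm) === 0);
```

Here `$looseMatch` is `true` while `$strictMatch` and `$strcmpMatch` are both `false`.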

[1]
[https://www.facebook.com/search/top/?q[]=hello](https://www.facebook.com/search/top/?q\[\]=hello)

[2]
[https://symfony.com/search?q[]=hello](https://symfony.com/search?q\[\]=hello)

[3]
[https://packagist.org/search/?q[]=hello](https://packagist.org/search/?q\[\]=hello)

[4]
[http://www.yiiframework.com/search/?q[]=hello](http://www.yiiframework.com/search/?q\[\]=hello)

~~~
babyrainbow
>try passing integer values. ctype_digit(50) === true, but ctype_digit(100)
=== false.

That is hilarious! I love how even stuff that is supposed to fix other stuff
itself ends up being completely broken. But hey, it is documented.

Learning PHP is like taking out a massive loan. It gets you started easily, but
causes eternal suffering in the long run...

~~~
developer2
It is in fact documented; what I didn't explain is that ctype_digit treats
integers from -128 to 255 as ASCII character codes (50 is "2", 100 is "d").
It's _designed_ to juggle both strings and integers, which indeed works
against PHP's usual method of type juggling. This is because ctype is a port
of or wrapper around the C lib, which behaves that way.

------
verelo
We did the same thing with HHVM, and had VERY similar results; getting it to
work was plain hard, and I had a lot of concerns about our ability to ever go
back.

Before we ever launched with HHVM completely, PHP7 came out. With only a few
weeks of work, we managed to make the switch. The gains were identical to what
we saw on HHVM, only the experience of working with PHP7 was so much easier
for everyone involved.

Having said all this, I think HHVM served a great purpose: It raised the bar
and PHP is better because of that. All in all, a great outcome for the people
of the Internet.

~~~
nkozyra
Hack's influence is all over PHP7, unsurprisingly. As someone still bound to
PHP due to technical debt, I'm thrilled this happened. PHP still has warts,
but changes in 7 are tantamount to ES5 :: ES6. The language feels more mature,
real, sensical.

~~~
kyriakos
A much needed overhaul to the utility functions (array, strings etc) should be
the next step.

~~~
nkozyra
Of course, but they won't do that because of backwards compatibility and I get
that.

It's one of the nice parts of rebranding. Hack could keep and throw out
anything they wanted because it was intended for private FB use. At some
point, PHP will have to start cutting off the stdlib PHP4.x warts. There's
enough about PHP 7 that's good enough to be compelling to anyone working in an
interpreted language on the web, but the (well earned) reputation keeps a lot
of people away.

~~~
kyriakos
They don't have to remove them. Just put the updated versions under a
namespace. This will slowly help developers migrate.

~~~
nkozyra
Even that will break existing code, which is the reason it hasn't happened in
the first place.

------
jsamuel
At ServerPilot, we decided early on not to support HHVM for similar reasons:
we could see PHP 7 was going to offer the same performance benefits without
the pain, breakage, and downtime of HHVM.

Early on, before PHP 7 was released, we had to explain this to many of our
users who use ServerPilot to host WordPress, Magento, Laravel, and other PHP
apps. They often thought there was no downside or risk with HHVM, it was as
simple as dropping it in as a replacement. Nowadays, with the hype around HHVM
dying down, we don't get requests for HHVM support much anymore.

For a huge company like Facebook, HHVM makes a lot of sense. And the existence
of HHVM really sped up the PHP 7 development efforts and provided a great
benchmark for how fast PHP 7 could be. So, the PHP community should be very
grateful to Facebook for that even if HHVM isn't the future of PHP.

------
nchelluri
> It took less than a week to migrate our codebase (a 10 years old PHP
> monolith)...

> And it took 4 hours to migrate our custom extensions.

That seems like a very small amount of work; I'm impressed at how smooth a
transition that must've been.

I'm also quite surprised that

> we can handle twice more traffic with same infrastructure.

Wow, I didn't think that PHP application code would be such a bottleneck.
Maybe it's not that, but if the entire codebase is written in PHP, and you
replace it all in one shot, you just get such an improvement. But I thought
DBs, etc. would play a bigger role.

~~~
brianwawok
Have you ever loaded a stock Magento server? Set it up, add maybe 10 products
with basic images, and turn it loose.

Give it a reasonable box. 2 CPU cores, 2 GB of RAM.

You are capped at something like 3-5 requests per second, with an average load
time of 5 seconds.

Just blows my mind. A simple Java web app on the same server is doing 500
requests per second. A Python app, with the horrible GIL and all that
nastiness, is doing 200 requests per second. And Magento is rocking 3
requests per second??!?!?!?!?!

I wonder how much global energy consumption would go down if PHP was not a
thing.

~~~
mgkimsal
comparing "simple java app" to something as complex (overly? needlessly in
some cases? sure) as magento is nowhere near apples and oranges. compare it to
broadleaf or konakart, maybe. I've no doubt java will probably still be
faster, but it won't be 500 rps vs 3 rps.

~~~
brianwawok
Except I have built Java apps that did the same kind of things as Magento, and
yes it really was 500 to 3.

~~~
corobo
Benchmarks or no you haven't.

~~~
kasey_junk
Without getting into a pissing war if Brian says he's done something on the
JVM, trust him.

Also if you need someone to validate that their system does 500 qps, you
really need to check your assumptions. I'm trying very hard to think of what
kind of system I'd build that would do less than that (hint each q would be
big)

~~~
corobo
With all due respect, no. It's trivially easy to just claim that you've done
something that happens to anecdotally prove the point you're making. I could
say I've written a PHP app that gets an easy 1000 qps without flinching.

Without anyone dropping any factual proof my app is definitely better.

------
maxpert
I used to run one of my sites (25K unique visitors a day) on PHP 5.3. When HHVM
came out with a stable version I shifted to it and had a similar experience.
Now I am running it on PHP 7 and I have to say I am more than happy with the
results. As much as PHP is not cool for today's developers, it has served
some really high-traffic sites and stayed useful through the test of time.

P.S. Now I wish somebody would implement a good async IO system and the
ability to run HTTP right off the PHP engine (I know there is php -S ...; I am
talking about a better async system).

~~~
elgenie
HHVM implements a good async I/O system [0] and has the ability to run HTTP.

[0]
[https://docs.hhvm.com/hack/async/introduction](https://docs.hhvm.com/hack/async/introduction)

~~~
maxpert
This seems interesting. Hopefully someone will pick this up and make even the
mysql_* and other sync functions async too. This could be the final nail in
the coffin.

------
kijin
> _In other languages you can't get that without using a standard library
> that will escape the values by default._

Escape for what context?

Escaping for SQL is different from escaping for HTML, which in turn is
different from escaping for JS.

How does your hypothetical Request object know how to escape any given
variable? Does it ping every open database handle to figure out how they want
their data escaped? Does it use some kind of static analysis to figure out in
what format (HTML? XML? JSON? CSV?) the app is going to spit out the value
later on?

Or does it simply run a bunch of cargo-cult functions like

    
    
        return htmlspecialchars(strip_tags(mysql_real_escape_string(addslashes($_POST['var']))));
    

and hope that everything will be okay?
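
To make the point concrete, here is a rough sketch of how one and the same value needs different treatment depending on where it ends up. The sample input is made up; the functions are standard PHP ones, and nothing here is meant as a complete output-encoding policy.

```php
<?php

$input = 'Tom & "Jerry" <script>';

// HTML body or attribute context: encode the HTML metacharacters.
$forHtml = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// JavaScript/JSON context: emit a JSON string literal. The JSON_HEX_* flags
// also keep <, >, & and quotes out of the output for inline <script> use.
$forJs = json_encode(
    $input,
    JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
);

// URL query context: percent-encode.
$forUrl = rawurlencode($input);

// SQL context: no string escaping at all. Bind the raw value as a
// parameter (e.g. via PDOStatement::execute()) and let the driver handle it.
```

`$forHtml` comes out as `Tom &amp; &quot;Jerry&quot; &lt;script&gt;` and `$forUrl` as `Tom%20%26%20%22Jerry%22%20%3Cscript%3E`. Neither is safe to drop into the other's context, which is why escaping has to happen at output time rather than inside a Request object.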

------
nodesocket
What tool are they using to show the memory footprint of PHP processes in
[https://cdn-images-1.medium.com/max/800/1*bFnYX8NE-V6P2U01Uc...](https://cdn-images-1.medium.com/max/800/1*bFnYX8NE-V6P2U01Uc3W1A.png)?

~~~
vbernat
It's Pinba. [http://pinba.org/](http://pinba.org/)

------
martin_
I had a similar experience deploying HHVM at a previous company and wrote up a
blog post on the issues we ran into and how we worked around them[0]. One thing
the Dailymotion blog omits is Hack, which has additional features like
lambdas, async support (though PHP 7 will soon?), strict typing, collections,
generics and more. That said, if you're just trying to squeeze more out of an
existing codebase, then PHP7 wins hands down.

[0] [https://ma.rtin.so/when-hhvm-doesnt-work-quite-
right](https://ma.rtin.so/when-hhvm-doesnt-work-quite-right)

------
porker
It's a shame that HHVM has so many compatibility issues compared to PHP7, as I
would love to be able to use Hack.

------
qaq
Wow today is PHP day on HN :)

~~~
toxican
At least this one isn't ripping it to shreds (so far). The other
discussion...that was rough to read as a PHP dev.

~~~
madeofpalk
I would assume by this point that PHP devs would be fairly confident and
comfortable with their decision to continue with PHP and would be used to
others bagging on it unnecessarily.

As a JS/Web developer you learn to ignore the hatred of the web that it seems
to get from the HN crowd.

~~~
wyager
>As a JS/Web developer you learn to ignore the hatred of the web that it seems
to get from the HN crowd.

As an occasional full stack developer (not by choice), I can confidently say
that the reason people hate on popular web tech is that it is uniformly
terrible compared to non-web tech. I'm no fan of Java, for example, but I'll
take it over PHP any day. JavaScript is so bad that I (and many other
developers) will put a lot of effort into using any alternative, such as
typescript, purescript, Elm, etc.

~~~
vcarl
This is the typical "hatred of the web" that I usually ignore. ES2015 brought
a ton of huge language improvements that are still filtering out into usage,
Babel means you can use them all now without waiting for browsers to implement
them, Webpack gives you a ton of flexibility for packaging it, Eslint allows
you to lint in a completely pluggable way, NPM (and now Yarn, which fixes many
of NPM's problems at scale) allows you to effectively manage dependencies,
Typescript or Flow allow you to incrementally add the benefits of static
types, and Javascript's "functions as a first class object" allow it to behave
as a powerful functional programming language.

It's very possible to write--and _deploy_ \--very high quality Javascript
today.

~~~
wvenable
The length of that paragraph and the number of tools mentioned is exactly one
of the problems of web development. It's like missing the forest for the
trees. And even with all the huge language improvements, it's still nowhere
near the capabilities and safety of non-web languages.

But I don't disagree that it's possible to write very high quality JavaScript
code -- it's just a little bit painful.

~~~
vcarl
It's certainly a little bit painful, but that's only because these things are
brand new. These tools let you create applications most comparable to native
apps, and could you imagine developing for iOS or Android without Xcode or
Android Studio? The current trajectory is very, very good, and it's with a
bunch of tools and ideas that came from the community.

------
smegel
Kind of ironic this wasn't a video post.

~~~
agumonkey
At least it's not hosted on youtube.

------
tedmiston
> During few months, this project wasn’t the priority, so we decided to wait
> the release of PHP 7 to compare performances.

I have no idea why the author is choosing to write in such a strange
grammatical style.

~~~
kyriakos
Not everyone is a native English speaker. I am not and probably sound as
weird. If it makes sense though who cares?

~~~
tylerwhipple
Exactly, the author is French. While I am sure this was reviewed by someone
else at Dailymotion (a French company), the reviewer was also French.

