

Facebook rewrites PHP runtime, will open source on Tuesday - VonGuard
http://www.sdtimes.com/blog/post/2010/01/30/Facebook-rewrites-PHP-runtime.aspx
Facebook has been rewriting the PHP virtual machine for the past 2 years in an effort to make it faster. On Tuesday, they will announce the release of this software as a new open source project.
======
matthew-wegner
This was mentioned in that anonymous Facebook employee interview:
<http://news.ycombinator.com/item?id=1089800>

Specifically:

 _Rumpus: So tell me about the engineers.

Employee: They’re weird, and smart as balls. For example, this guy right now
is single-handedly rewriting, essentially, the entire site. Our site is coded,
I’d say, 90% in PHP. All the front end — everything you see — is generated via
a language called PHP. He is creating HPHP, Hyper-PHP, which means he’s
literally rewriting the entire language. There’s this distinction in coding
between a scripted language and a compiled language. PHP is an example of a
scripted language. The computer or browser reads the program like a script,
from top to bottom, and executes it in that order: anything you declare at the
bottom cannot be referenced at the top. But with a compiled language, the
program you write is compiled into an executable file. It doesn’t have to read
the program from beginning to end in order to execute commands. It’s much
faster that way. So this engineer is converting the site from one that runs on
a scripted language to one that runs on a compiled language. However, if you
went to go talk to him about basketball, you would probably have the most
awkward conversation you’d have with a human being in your entire life. You
just can’t talk to these people on a normal level. If you wanted to talk about
basketball, talk about graph theory. Then he’d get it. And there’s a lot of
people like that. But by golly, they can do their jobs._

~~~
jrockway
_The computer or browser reads the program like a script, from top to bottom,
and executes it in that order: anything you declare at the bottom cannot be
referenced at the top. But with a compiled language, the program you write is
compiled into an executable file. It doesn’t have to read the program from
beginning to end in order to execute commands. It’s much faster that way._

I lost count of the number of untrue statements there after about ten.

~~~
olalonde
Wanna talk basketball?

~~~
VonGuard
_Terrified stare_

------
n8agrin
The article is completely unsubstantiated.

 _Well, I was able to put all the pieces together on this one, finally, and I
now understand exactly what is up: Facebook has rewritten the PHP runtime from
scratch._

Look, if it's true, this is cool, it will no doubt be a great contribution to
the OS world, but let's wait until Tuesday or until we have more concrete info
beyond this author's guess that FB has completely rewritten the PHP runtime. I
realize the author has little to gain from making this up, except for maybe 15
min of fame on HN, but still, don't believe everything you read.

------
kaens
I think that someone reimplementing PHP and cleaning up a lot of its . . .
quirks . . . could be a very good thing for the web. Someone who didn't care
about maintaining backwards compatibility with code that relies on those . . .
_quirks_ . . . and who did care about language design, and clean
implementation.

Unfortunately, I have this sinking feeling that _this_ is not going to be
_that_.

------
scorxn
Is there an implication that all this optimization could be merged into PHP
core? PHP's appeal is ubiquitous support. Even if it is open-sourced, I can't
imagine a bunch of hosting companies suddenly serving Facebook-flavored PHP.

~~~
pbiggar
The PHP internals developers heaped scorn on compilation when I talked to them
about phc (<http://phpcompiler.org>). I think it might be different with a
Facebook seal of approval though.

~~~
jerf
Should the current internal developers object to the idea even when
implemented, well, "a new set of PHP internals developers" would not be the
worst thing that ever happened to PHP....

------
jganetsk
Wouldn't it have been easier to have moved away from PHP? From what I
understand, Facebook only uses PHP for the most front of ends. Business logic
is all in other languages. Isn't it easy to port the front-end into something
else?

~~~
olalonde
What other languages? (genuine question)

~~~
adrianwaj
Reddit was rewritten from Lisp to Python:
<http://www.aaronsw.com/weblog/rewritingreddit>

Twitter from Ruby into Scala and JVM:
<http://www.artima.com/scalazine/articles/twitter_on_scala.html>

I think there are two aspects of a language (correct me if I am wrong): the
elegance, simplicity, and breadth of the code that programmers write in it,
and the efficiency of the hardware-optimized executable code that its compiler
produces. So how do PHP and Facebook fit in with all this?

~~~
munctional
Twitter's frontend is still definitely Ruby (on Rails). Based on what I've
heard from people who have consulted there, it's a gigantic ball of crap.
They've so heavily patched Rails 2.0 that they can't realistically migrate to
a more modern version of Rails.

~~~
rufugee
I do a fair amount of Rails, so I'm really curious here. How could it be that
they've so heavily patched 2.0 that they can't move on? Anyone from Twitter
care to comment?

I've worked on many Rails apps, and have upgraded the apps from version to
version. It's a pain when key elements of the API shift, but it's not _that_
bad...even when the project has monkey-patched Rails a lot. And Twitter
certainly has the resources to afford to dedicate a few programmers to this
task, so I'm just not sure I buy it.

~~~
munctional
One of the contractors I spoke with said that they had a branch running Rails
2.1 successfully. When they deployed it in production, the entire application
fell on its face.

Supposedly, the problem was caused by Cache Money, but nobody at Twitter
wanted to risk moving to a different version again. They're still on 2.0
today. :-)

Another fun fact: Twitter has over 1,500 remote git branches. They also have
bright green deer in the reception area of their office. :-)

------
senko
> That team were forced to sign NDA's, and taken to a very quiet, secluded
> meeting room where some cool new Facebook-backed open source project was
> described.

This caught my eye - an interesting use of the term "open source" that I
haven't previously been aware of. This is only a single datapoint (and the
article states the project is going to be opened up anyways), but I do have a
feeling the term has been diluted and joined the buzzword ranks.

~~~
jrockway
It's "open source" as in, "we'll release it as open source when we are good
and ready". So far, that's been never.

~~~
VonGuard
Yeah, open source is kinda a verb now. Anyway, Facebook has other open source
projects: see Hive <http://www.facebook.com/pages/Hive/43928506208>

Kinda silly to have a facebook page for an open source project, though.

~~~
jmatt
Here's the main page for open source facebook projects:

<http://developers.facebook.com/opensource.php>

I agree it's a bit silly. I think they hope that it'll catch on sometime in
the future and they'll have yet another type of group socializing on facebook.

~~~
jackowayed
You mean people actually socialize on Facebook pages?

It seems that the only effective purpose for pages is for
celebrities/companies to push updates to their fans (and almost always ignore
the reverse direction), and groups are even worse. Most pages and groups I see
are things that you join because you agree with/identify with the name and
then totally ignore. I wonder if facebook a) cares, and b) could do something
to fix it.

~~~
jmatt
Ya I used to agree with you completely. I still do for the majority of pages
out there.

I recently learned of a counter example. There is a locally owned sports bar
and restaurant that has a very active group. Most of the members are fans of
sports teams that are across country and the games play regularly at this bar.
The owner is active in the group too and definitely encourages and responds to
conversation. The other group I know that is relatively active has more or
less the same characteristics. It's people who identify with the group but are
otherwise disjoint - and this is the best way for them to casually
communicate. I think it's safe to say this was the original intent of groups
and pages. Celebrities and companies are just taking advantage of it. Of course,
I have a handful of pages on my facebook profile, so I guess I'm as guilty as
anyone else. (I'm With COCO!)

EDIT: abusing -> taking advantage

------
timdorr
This is particularly interesting because PHP's runtime was rewritten for 5.1
back in late-2005 (on top of 5.0's Zend Engine 2.0 improvements in mid-2004).

And is PHP really considered "pokey"? Sounds like this guy is making stuff up
because I get execution times of <0.01 seconds on my micro-framework.

~~~
pbiggar
PHP is dirt slow. Compared with the implementations of Lua or Python (which
have approximately the same design as PHP), it's about 4-5 times slower. Note
this is only in the interpreter -- lots of the library code is written in C,
which makes the difference somewhat less relevant.

The implication is that writing code that uses PHP's built-in libraries is
pretty fast, but the more you write in PHP itself, the slower it gets. For
example, my impression is that Yahoo isn't really written in PHP - it's written
in C patched together using PHP.
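
To make the "built-in library vs. interpreted code" point concrete, here is a rough sketch in Python rather than PHP (an analogy only, not a PHP measurement). It assumes CPython, where the built-in `sum()` is implemented in C; the function names are made up for illustration. Both functions return the same result, and the timing loop just prints whatever your machine measures.

```python
import timeit


def sum_in_interpreter(values):
    """Add the numbers one at a time in interpreted code."""
    total = 0
    for v in values:
        total += v
    return total


def sum_in_builtin(values):
    """Delegate the same work to the built-in sum() (C code in CPython)."""
    return sum(values)


data = list(range(100_000))

# Both approaches compute the same answer; only where the loop runs differs.
assert sum_in_interpreter(data) == sum_in_builtin(data)

if __name__ == "__main__":
    for fn in (sum_in_interpreter, sum_in_builtin):
        seconds = timeit.timeit(lambda: fn(data), number=20)
        print(f"{fn.__name__}: {seconds:.4f}s for 20 runs")
```

The exact ratio will vary by machine and interpreter version, so no numbers are claimed here.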

The Zend engine was "rewritten" for PHP 4, PHP 5, and to a certain extent for
PHP 5.1. I guess it wasn't a complete rewrite because the legacy code from
about 10 years ago is still in there. Anyway, it's still dirty, badly written,
slow, and very very badly commented. Hacks abound (and not the good kind).

~~~
zmimon
This is a point I always have trouble impressing on people. They run a simple
benchmark on a tiny code base that pulls some data from the database, spits
out some data, and they compare and PHP seems to be lightning fast.

The problem is that, because it's interpreted, at some point PHP slows down in
proportion to the _size of your code base_. A small app runs really fast; a
big app with 100,000 lines of code will kill your server unless you modularize
it really, really well, which is harder than it seems: the more modularized
you make it, the more separate "includes" you end up with in different files,
and then you come to realize that including a lot of files is itself a
problem. And the nature of PHP's very loose coupling tends to lead to code
that is nearly impossible to refactor at large scale once you have gone too
far down the path.

I work on an application that has a very thin PHP layer that performs some
simple web services that are the back end for a pure Java web app. Amazingly,
when we load test it, the PHP part is the bottleneck, burning CPU like crazy
just parsing all our files ... over ... and over ... and over. The Java code
meanwhile, while theoretically doing far more "work", is completely bored. We
will probably look at using an accelerator of some kind or maybe just
rewriting all the PHP in another language.

~~~
pbiggar
If your problem is parsing time, just use an accelerator. APC is the standard
and best integrated.

On the other hand, if you have an opportunity to switch out PHP for something
better (read: nearly anything), you should. Otherwise it might grow to a point
that you can't remove it.

