Hacker News
Ask HN: State of PHP at Facebook?
123 points by emadehsan on Oct 27, 2021 | 82 comments
Does Facebook continue to use Hack with HipHop Virtual Machine?

Has there been any consideration of using an open source language instead of an internally built one?




Ex-Facebooker here.

Hack isn't going away. Ever.

It's actually quite pleasant to develop in. The tooling now revolves around VS Code, which I personally don't like. I wish they'd based this on a Jetbrains core.

One thing I learned to love with the Hack type system is that nullability is a distinct type (eg you can assign a T to a ?T but not a ?T to a T; ? here means "nullable"). The static type checking is smart enough to figure out here:

    function foo(?Foo $foo): void {
      if ($foo is nonnull) {
        bar($foo);
      }
    }

    function bar(Foo $foo): void {
      ...
    }
that the null check changes the type from ?T to T within that block.
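
For readers who don't know Hack, roughly the same refinement exists in other gradually typed languages. A minimal Python sketch, assuming a static checker such as mypy or Pyright is run over it (plain Python does not enforce the annotations at runtime); the names Foo, bar and maybe_bar are only illustrative:

```python
from typing import Optional


class Foo:
    def __init__(self, label: str) -> None:
        self.label = label


def bar(foo: Foo) -> str:
    # Requires a non-None Foo, like bar(Foo $foo) in the Hack snippet.
    return "got " + foo.label


def maybe_bar(foo: Optional[Foo]) -> str:
    if foo is not None:
        # Inside this branch a checker narrows Optional[Foo] to Foo,
        # so passing it to bar() type-checks.
        return bar(foo)
    # Calling bar(foo) here would be flagged, since foo can only be None.
    return "nothing to do"
```

Optional[Foo] plays the role of ?Foo here, and the `is not None` test plays the role of `is nonnull`.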

There's literally no reason to justify the massive investment a complete rewrite would be, like literally none. It's not even close.

HHVM and Hack at this point are extremely mature technologies.


> The tooling now revolves around VS Code

When I was there a couple of years ago, doing Hack development using vim was well supported. Not as slick as the official IDE, but more than good enough for my day to day tasks.

> There's literally no reason to justify the massive investment a complete rewrite would be, like literally none. It's not even close.

Definitely, and my (relatively ancient) background is as a PHP-mocking Perl developer.

In the end, the tooling is far, far more important and relevant than the language, and the tooling at Facebook is better than anything I've seen elsewhere, by a long shot.


Are you able to share some more about what makes the tooling at Facebook so great?

edit: typo


I can answer only briefly, and that's a pretty big topic!

I'll skip the apparently pretty amazing integration between the IDE and source control and the build/test system since I didn't use the IDE very much.

I'll just very briefly mention Tupperware, which one might characterize as an internal kind of Kubernetes, except it's so much more, but more importantly, it's way easier to use.

I did write a small, fairly simple stand alone service in Python. It's super, super easy to get something like that 'going', as far as permissions, ingress and egress, storage, all of the real-life complications I've had to deal with, with variable levels of automation/'ease', at other organizations.

I'm out of time for now!


Thanks for sharing!


Wow, that's almost the same thing that Typescript is to Javascript. Is all of that openly available to outsiders? If it is, I think I might have to rethink the stance on PHP that I've had for the last 15 years.


All of the language features in ES6 / Flow / JSX that Facebook has been pushing through as part of the React ecosystem have been pioneered by Hack/HHVM at Facebook and then ported very closely to the JavaScript ecosystem.


It is available. My understanding is that, unlike PHP, it really only has first-class support on Linux. You can run it on macOS, but you may have to fix some issues yourself.

I've been tempted to use it for things like static website generation, since it has the "XHP" stuff, which is like JavaScript's JSX... and I'd undoubtedly prefer writing code in Hack to writing code in JavaScript.

https://docs.hhvm.com/hack/XHP/introduction

JSX is based on XHP, of course, not the other way around.


The warnings about MacOS are mostly being cautious, as I don't think anyone actually runs it in production.

I work with open source Hack code a lot, and I almost always use the MacOS binaries instead of Linux for development.

In my workflows I'm not seeing random issues; it's more that two major things are missing:

- a lot of the profiling features are Linux-only

- the mcrouter extension is Linux-only. If you're using memcached, this extension is useful even without mcrouterd, as it provides a true async (Awaitable) client.


That feature is also part of the PHP language. Also as far as I know Hack is available to everyone.


Does anyone use Hack outside of Facebook? Disclaimer: it’s been years since I looked at it and even at the time it was a very cursory look as PHP 7 was buzzing.


I believe Slack committed to it at some point, maybe they didn't get off it when Facebook announced they will not keep up with PHP itself anymore.


I got a 500 error with stack trace mentioning a hack file from slack last week.


This is the second such PHP thread I've seen in the last 24 hours (the other one was on a different web property); gotta love people shitting on PHP like it's 1998


I assume this thread was inspired by the other one.

I too am actually interested in the current state of PHP jobs. I previously assumed it mostly entailed working on legacy projects, which is what I do occasionally.

I have a younger friend who is primarily interested in working with PHP. I wanted to advise against pursuing it too much due to questionable future prospects, but he’s found gainful employment in a couple of PHP shops.

The knowledge is transferable to other languages anyway, so even if PHP jobs do dry up in the near future, it won’t have been a waste.


I haven't checked recently, but PHP jobs tend to pay less than other languages. I've also been at interviews where the interviewer mocked PHP and I would probably have been marked lower if I tried to write my answers in PHP.

Overall, PHP has matured a lot in recent years and gained features that had been missing relative to other popular languages (e.g. Composer/Packagist, Laravel, PHP 7 and PHP the Right Way probably contributing the most).


Personally, I still default toward PHP as a backend for most web apps. While it's not a joy like writing for Nodejs, you don't have to worry about routing or something unforeseen blocking the main thread. I also have to manage servers 24/7 for the apps and sites I build. I simply don't trust Node as much in production.


I wouldn't call nodejs a joy though.


Can you elaborate on this? Not the first time I've heard a similar sentiment (and the proliferation of PHP over the years def provides some justification) but I've never heard people explain why.


I trust Apache to spin up a new PHP process for each request. Except under DOS attack or with some horrendous memory leak, it's just solid. Obviously it's a lot more overhead. But if there's some uncaught exception in a corner of PHP code, it doesn't crash the main thread and drop all the other processing on incoming requests. This means that a bug a user discovers is usually something I can fix during business hours, not get woken up at 4am because "the server is down". I use PM2 when I manage Node and it does a fairly good job of keeping things running and well-logged, but to me the Node event loop is just a single point of failure I don't need unless I'm building something that really needs Node's architecture, like a fast-paced game or chat. (And I've even done those with PHP). And then in cases where I need the server to do a few seconds of heavy data processing like collating large reports once they come back from a database, with Node you need to spin up a worker pool to not block the loop with those operations, which complicates reliability even more; with PHP you can get away in a lot of cases with letting the thread spawned by the call just take its time to chew through a bunch of math.
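
The isolation argument above can be sketched in a few lines. This is not how Apache or PHP-FPM is implemented; it's a toy, assuming one fresh interpreter per request, with a made-up HANDLER script and made-up function names, just to show that a crash in one request leaves the others untouched:

```python
import subprocess
import sys

# Hypothetical request handler. It runs in a brand-new interpreter each time,
# so nothing it does (including crashing) survives into the next request.
HANDLER = """
import sys
path = sys.argv[1]
if path == "/buggy":
    raise RuntimeError("uncaught exception in a corner of the code")
print("200 OK " + path)
"""


def handle_in_fresh_process(path: str) -> str:
    # Spawn one short-lived child process for this request only.
    proc = subprocess.run(
        [sys.executable, "-c", HANDLER, path],
        capture_output=True,
        text=True,
        timeout=10,
    )
    if proc.returncode != 0:
        # The child died; the parent just reports an error for this request.
        return "500 Internal Server Error " + path
    return proc.stdout.strip()


def serve(paths):
    # Requests before and after a crashing one are unaffected.
    return [handle_in_fresh_process(p) for p in paths]
```

The cost is the overhead mentioned above: a process start per request, versus a single long-lived event loop where one uncaught exception can take everything down.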

I've always been a fan of stateless code wherever I can get away with it. My casino, for instance, ran on PHP, and although it upgraded the connection to a socket where possible, it created a new DB connection and loaded the game state from SQL, then stored a new game state, for every single action a player took within a game, e.g. every hit within a hand of blackjack. The database was at all times a consistent snapshot of the entire state of the casino. A single player's action might fail and roll back, but it wouldn't bring down the whole site and lose all the other actions in progress.
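
A rough sketch of that stateless pattern, assuming SQLite and a made-up `games` table rather than whatever the real casino used. Every action opens its own connection, loads the state, writes the new state in a transaction, and rolls back on failure. The simulate_failure flag is purely hypothetical, there to show the rollback:

```python
import json
import os
import sqlite3
import tempfile


def init_db(db_path):
    conn = sqlite3.connect(db_path)
    try:
        with conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS games "
                "(id TEXT PRIMARY KEY, state TEXT NOT NULL)"
            )
    finally:
        conn.close()


def load_state(db_path, game_id):
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT state FROM games WHERE id = ?", (game_id,)
        ).fetchone()
        return json.loads(row[0]) if row else {"hits": 0}
    finally:
        conn.close()


def apply_action(db_path, game_id, action, simulate_failure=False):
    # New connection per action: no state lives in the server process.
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            row = conn.execute(
                "SELECT state FROM games WHERE id = ?", (game_id,)
            ).fetchone()
            state = json.loads(row[0]) if row else {"hits": 0}
            if action != "hit":
                raise ValueError("unknown action: " + action)
            state["hits"] += 1
            conn.execute(
                "INSERT OR REPLACE INTO games (id, state) VALUES (?, ?)",
                (game_id, json.dumps(state)),
            )
            if simulate_failure:
                raise RuntimeError("simulated failure after the write")
        return state
    finally:
        conn.close()


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "casino.db")
    init_db(demo_path)
    print(apply_action(demo_path, "table-1", "hit"))
```

A failed action leaves the stored state exactly as it was, which is the "consistent snapshot" property described above.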

In some respects, it's also just that I think setting up an Express router (or Koa, my preference) and graceful handlers and 404 handlers and everything in Node is overkill if you just want to serve web pages, validate forms or handle RESTful API calls. DB connections go down and you may need to recover from a slew of failed queries in rapid succession instead of just counting on a new DB connection being tried for each incoming request.

Node really shines with anything that requires statefulness on the server, low latency, push data, etc. And for me the nicest thing about Node is the ability to share the same unified data classes between client and server. But I view Node kinda like a Porsche. High performance, a bit unreliable and dangerous. Great to have in the garage. But when you're just driving to the market you're better off taking the Subaru.


> I trust Apache to spin up a new PHP process for each request. Except under DOS attack or with some horrendous memory leak, it's just solid. Obviously it's a lot more overhead.

FWIW, PHP-FPM has been the standard way of deploying PHP in production for many years now [1]. No need for running PHP as an Apache module.

[1] https://cwiki.apache.org/confluence/display/httpd/PHP-FPM


Wow thanks for the detailed response! I really appreciate it.


Some people like pretending that other languages don't have their own warts. There are thousands of mature SAAS products that use PHP and don't rely on Wordpress for anything. I don't think that anyone who uses Javascript as part of their backend stack has any room to criticize.


Walmart.com (during Black Friday) would beg to differ. LinkedIn might want a word too. I'm sure every language has its success story(ies) along with a long list of failures. Shopify and Stripe are successful with Rails! JavaScript is no different here.


Maybe read what I said instead of leaping in to defend against things I didn't say.

For example:

>I'm sure every language has its success story(ies) along with a long list of failures.

is actually in your reply to:

>Some people like pretending that other languages don't have their own warts.

I don't think node is a bad platform. I think the language it's based on is terrible, and nobody who is willfully writing backend code in node has any real justification to complain about someone else's choice to use PHP.


I took your comment to instill a sort of "pecking order" among back-end languages. In this case, it sounded an awful lot like "it goes from _insert language(s) you approve of_ to PHP to JavaScript". Basically that you'd lump PHP and JavaScript into the "terrible" bin. I'm simply pointing out while you may have disdain for those languages (and perhaps even rightfully so) the users are free to love them and still hate other languages.

If I misrepresented your stance, I apologize.


> Shopify and Stripe are successful with Rails!

Shopify and Github are successful with Rails.

Stripe uses Ruby only.


> I don't think that anyone who uses Javascript as part of their backend stack has any room to criticize.

Nodejs is a solid server runtime these days, widely tested by big companies with big traffic as their frontend servers. And there is a mature option for static typing if you want that.


> This is the second such PHP thread I've seen in the last 24 hours (the other one was on a different web property); gotta love people shitting on PHP like it's 1998

I don't care for language wars (or the kind of petty language elitism that people get into with PHP), but: isn't Hack effectively its own language (and ecosystem) at this point?

I don't think that Facebook sinking tens of millions of dollars into a custom language dialect because of the initial technical decision to use PHP is a positive testament to it as a language. It instead suggests that Facebook bit the bullet.


I thought the original reason for it was that Zuckerberg didn't want to have the site rewritten by engineers in a language he didn't know. i.e. find a way to scale up this dorm room code without rewriting it.


Yeah, that's what I meant by the "initial technical decision." Perhaps "technical and ego driven" would have been more precise :-)


All of the “www” code (backend which serves all the web apps and APIs) is Hack. And a lot of that code is generated via internal frameworks like Ent which is a graph abstraction over database access.

There doesn’t seem to be any incentive to switch www away from Hack: it wouldn’t reduce the learning curve very much due to all those internal frameworks, and would remove the opportunity to optimize the language for the www codebase.

There are open source languages used in the internal backends though (services behind www). Most of it is C++.


For anyone interested, it looks like there's an open-source version (close, at least) to Ent at https://github.com/ent/ent.


There's also https://github.com/hhvm/hack-codegen/tree/master/examples/do..., along with the article at https://engineering.fb.com/2015/08/20/open-source/writing-co...

This is modern Hack code, but not representative of the state of the Ent framework nowadays - ent/ent is closer design-wise, but not usable from Hack.


Migrating from PHP to Hack was a huge undertaking that took years to complete, and that was with Hack keeping almost the same syntax and semantics where possible.

What would the benefit of spending thousands of engineer years (read: billions of dollars) on a rewrite be? Especially when it would nearly certainly come with a huge performance regression compared to continuing to use HHVM which has come with years of optimization for working especially with Facebook's web codebase.

In addition to the need to rewrite a gargantuan codebase and loss of the optimized runtime, it would mean throwing away a huge amount of expertise, tooling, and basically all of the source control history context.

There really isn't much to gain.

That said, the fraction of Facebook's code that is Hack continues to shrink, thanks to the continued growth of backend services and the clients being written in Hack (before there was server-rendered XHP, but now it's predominantly React).


> There really isn't much to gain.

We (external parties) really can't say. We don't know what's holding Facebook's system back, and we don't know the challenges it faces both internally and externally. We don't know what VM limits the code is hitting, or how good the GC/memory allocation story is for HHVM with Facebook's code (other than what we're told). We don't know how well Facebook's database is or isn't keeping up with the backend, or what future plans the database team has that would change the query profile, which could then cause the backend fleet to hit the limits of HHVM's tunables with current growth. We also don't know what the HHVM team is planning.

If there is a problem that Hack and HHVM haven't been able to solve, one would hope that the spending of billions of dollars is properly considered. Or to borrow from another expression, I'd hate to be billion-dollar-wise and future-of-the-company foolish. I don't begrudge the exec team for having to make that call and other calls like it. (Choosing to do Hack & HHVM in the first place was a similar judgement call.)


Although currently an external party, I did work on these things at Facebook in the past. And you're right, that any system does have scale limits on a given type of hardware.

Regarding memory allocation: HHVM has per-request memory arenas which get thrown away after each request. That combined with memory and time limits serves to compartmentalize the amount of memory pressure requests can place on the web server. Tuning of concurrent requests and workload mixes allows for some amount of exchanging memory pressure for throughput.
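
To make the arena idea concrete, here is a toy model in Python. It is only an illustration of "allocate per request, enforce a limit, throw everything away at the end"; RequestArena and handle_request are invented names and say nothing about how HHVM's memory manager is actually written:

```python
class RequestArena:
    """Toy region allocator: everything allocated for a request dies together."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0
        self._buffers = []

    def alloc(self, size):
        # Enforce the per-request memory limit before handing out memory.
        if self.used_bytes + size > self.limit_bytes:
            raise MemoryError("request exceeded its memory limit")
        buf = bytearray(size)
        self._buffers.append(buf)
        self.used_bytes += size
        return buf

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # End of request: drop the whole arena at once, success or failure.
        self._buffers.clear()
        self.used_bytes = 0
        return False


def handle_request(payload, limit_bytes=1024):
    with RequestArena(limit_bytes) as arena:
        scratch = arena.alloc(len(payload))
        scratch[:] = payload
        return bytes(scratch).upper()
```

A request that asks for too much fails on its own, and nothing it allocated outlives it, which is what keeps one request's memory pressure from spilling into the next.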

I'm not sure what the comments about the database are: queries to databases and other backend systems are fairly independent of the language the web server is written in (Instagram is written in Python, and hits many similar or the same systems and those systems don't really care which one is making the request).

For specific components where HHVM is not able to handle them, those pieces can be extracted to separate services written in a different language (C++ or Rust, I think would be the go-tos probably?), with a cost of being unable to depend on libraries written in Hack.


Do you know if Facebook uses Composer to manage PHP dependencies?


Facebook uses a monorepo, similar to the way Google does it. Dependencies are vendored and checked into the repo. Facebook has released a tool for migrating software between repos, which is much like Google's Copybara:

Facebook's fbshipit: https://github.com/facebook/fbshipit

Google's copybara: https://github.com/google/copybara

With these tools, you can make fairly sophisticated choices about how you do vendoring. You can make the internal version look just like the public version, in terms of commit history. And you can export internal commits to public commits, stripping out confidential information along the way (or integrations with internal systems and tooling).


Thanks for the info!

What does "Dependencies are vendored and checked into the repo" mean?


https://stackoverflow.com/questions/26217488/what-is-vendori...

ie copy software source into your own repository, ideally in such a way that you can keep it up to date (which is what shipit/copybara are for)


It means you literally commit the code from the dependency into your own repo. https://stackoverflow.com/questions/26217488/what-is-vendori...


Facebook doesn't. Open source users of Hack do, and it uses packagist.org - however if a package is written in PHP, Hack code can't use it.

Not 100% accurate, but if a package's composer.json does not require HHVM, it will most likely install into a hack project, but be unusable.


HHVM is open source and seeing active development with weekly releases: https://hhvm.com/blog/

My understanding is they're waiting on the language stabilizing before making a push to open it up to the broader software engineering community.

They're still heavily invested in Hack and releasing research papers like https://engineering.fb.com/2021/03/03/developer-tools/hhvm-j....

Migrating to another language without an incremental adoption path likely isn't feasible for them


The answer to your first question is yes.

Your second question is based on an incorrect premise. Hack is internally built, but it is also open source:

https://github.com/facebook/hhvm

I joined Facebook in 2019 and left this year. Prior to FB, I worked mostly with python, javascript, and C++. Even at FB, I worked mainly in python (instagram backend), but spent a lot of time in the Hack codebase.

My experience was that FB's Hack + HHVM stack is much easier to work with and felt more productive than any other backend stack I've used. It's important to consider that a huge portion of Facebook's backend is one giant HHVM monorepo (called www). The consistency and uniformity has allowed FB to build lots of tooling and developer productivity on top of this one stack. For example, when you add feature flags, the tooling will automatically create a diff (PR) to remove the feature flag once it has been fully rolled out for a few weeks.

There are rough edges and weirdnesses, but HHVM is pretty actively being improved. Old mutable builtins are being (or have been) removed, the type system has gotten better, better error and warning messages all the time. FB is very responsive to data on developer productivity.

EDIT: Another anecdote, when I first joined I was working on both the Instagram python codebase and the Hack API codebase (some of Instagram's APIs are in Hack/HHVM). I constantly wondered why we didn't migrate the Hack code to python, and talked with various people about proposals and potential paths to do the migration for the APIs I worked on.

After a few months of working in both codebases, I completely flipped. After witnessing insane bugs and horrible architectural contortions designed to mitigate python performance issues, I wondered why we didn't just migrate the python codebase to Hack. Python (on cpython/cinder) is just not ready for large scale web services, and Hack is a much more productive environment than Java/C++/Rust for backend.

Typescript may be an even better option, but has some issues of its own.


> the tooling will automatically create a diff (PR) to remove the feature flag

That's pretty cool. It just occurred to me that something like this would be almost trivial in a Lisp.


Out of curiosity, am I right in thinking that Instagram is built atop Django?


Yes. Instagram doesn't use the Django ORM though, and I believe they built their own async view implementation since it existed before Django went async.


Gotcha! Thanks for sharing.


What issues does Typescript have? Not saying it has none, just curious.

While much faster than python, wouldn't nodejs be a bad option for FB/Instagram too? Why not Go?


IMO Typescript will grow on the backend. It's mainly just not as mature as alternatives and will get better over time. There are small problems similar to python though, like module-level mutable global state. That interferes with certain tools and fast restarts, but can be mitigated the same way as is done in python, ie https://instagram-engineering.com/python-at-scale-strict-mod...
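
A small Python illustration of the module-level mutable state problem. This is not the strict-modules mechanism from the linked post, just the general shape of the issue and one common way around it; all names here are made up:

```python
from types import MappingProxyType

# Problem: a mutable module-level object is shared by every request the
# process serves, so state leaks from one request into the next.
SEEN_USERS = []  # hypothetical module-level global


def handle_leaky(user):
    SEEN_USERS.append(user)
    return list(SEEN_USERS)


# Mitigation sketch: module-level data is read-only, and anything mutable
# is created fresh for each request and passed around explicitly.
DEFAULTS = MappingProxyType({"theme": "light"})


class RequestContext:
    def __init__(self, user):
        self.user = user
        self.seen_users = [user]


def handle_isolated(user):
    ctx = RequestContext(user)
    return {"users": ctx.seen_users, "theme": DEFAULTS["theme"]}
```

With the leaky version, the second request sees the first request's user. With the isolated version, each call starts clean, and the module can be reloaded without losing anything that mattered.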

Go is not dynamic enough to build useful abstractions. It does not allow developers to make themselves more productive. It's also more difficult to build hot reloading and high quality reflection based debugging.


Hack is open source.

My understanding from a talk back in 2014, is hack was developed because of frustrations with PHP evolution. Honestly I think it may have lit a fire under the PHP developers because php7 had significant speed improvements[4].

Talk:

"Facebook recently introduced and open-sourced Hack (https://code.facebook.com/posts/264544830379293/hack-a-new-p...), a gradually-typed programming language for HHVM that interoperates seamlessly with PHP. Hack builds the bridge between the dynamically and statically-typed worlds – providing code correctness while maintaining a fast feedback loop. Facebook is committed to working with the community to improve and refine the Hack language, to help interested developers convert to Hack (https://code.facebook.com/posts/264544830379293/hack-a-new-p...), and to narrow the HHVM compatibility gap with PHP5 and popular frameworks."

https://www.meetup.com/bostonphp/events/184609542/

Slidedeck on Hack (https://github.com/gabelevi/Boston-PHP-Meetup-Examples/blob/...)

[4] PHPs response to hack, phpng (php7): https://www.sitepoint.com/php-fights-hhvm-zephir-phpng/

https://news-web.php.net/php.internals/73888


I used HHVM for a while and it was good before PHP7, but then it quickly became irrelevant once the PHP core and community got their act together.


PHP still doesn’t have XHP though :(


Why would you want this when you have blade templates or further down .jsx or .vue templates?


Though it is internally-built, Hack is already open source at https://github.com/facebook/hhvm/.

FB uses a pretty wide array of languages internally -- I don't know if they release statistics publicly, but you can filter/search their open-source projects by language at https://opensource.fb.com/projects/#filter.


I believe Hack is also used by a number of other companies, the most prominent being Slack [1]. FB probably uses almost every language in at least some context: Hack, C++, Python, Haskell, OCaml, Rust, JS, Obj-C, Swift, Java, Kotlin are all ones that I’ve heard referenced by friends who work there, though that’s likely nowhere near exhaustive.

[1] https://slack.engineering/hacklang-at-slack-a-better-php/


And Erlang (mainly in WhatsApp)


Over the last few years many contributions from Facebook have made it into PHP mainline 7 & 8, and we've seen respective performance gains.

Unsure how extensively they continue to use their PHP to C++ compiler internally.


The PHP to C++ compiler hasn't been used for something on the order of 10 years.

At least as of two years ago, pretty much the entire web-facing portion of Facebook was written in Hack (which descends from PHP, but has what feels like every syntax feature you could stuff into an Algol-like language). Hack is run on HHVM, which is a bytecode VM that can JIT compile to machine code.


The php-to-c++ compiler was replaced many years ago with a JIT called HHVM.


the problem with php is not php. the problem with php is the troves of terrible wordpress code examples out there that people learn from, modify, and use


Although your statement is valid I don't see how it's relevant to what the post is about.


I don't work there and don't know anybody who works there, but I do know that they have been using proprietary dialects of PHP for some time. React's JSX was influenced by XHP, and XHP was released in 2010 but apparently was in use for years before then. So from an outsider's perspective, it seems like Facebook has been using proprietary extensions or entirely proprietary languages for over a decade.

The fact that Facebook is such a big company means that they can afford to have entire teams working on these proprietary languages, much in the same way Google had the ability to assign full-time engineers to maintaining their internal container infrastructure (Borg, the predecessor to Kubernetes). Because of this, I don't see why they would give up control of some of their stack to people outside their company, who may not share their values. This isn't exclusive to Facebook's PHP developers...they have their own libraries for C/C++, Swift, JavaScript, and much more. It's a big company, the developer tools are going to be a lot bigger than most other workplaces.

Again, this is pure speculation. But most of these big companies do the same or a similar thing. Both Apple and Google have invented their own languages, the difference is that Facebook doesn't have an app platform that requires you to use Hack/PHP, thus nobody is going to just pick it up for no reason unless it compels them in other ways (e.g. Go)


Not sure what you mean by “proprietary”. They use hack/hhvm which is open source. It’s not super popular but slack also uses it.


Yes www still uses hack. No we probably won't ever migrate to another language for www.

That being said, I don't think anything BUT www is written in hack.


> That being said, I don't think anything BUT www is written in hack.

I was at FB a few years ending in 2020.

Wherever I work, I end up doing a fair bit of 'code/tech archeology' just for fun and curiosity's sake. While WWW is the vast majority of Hack, I did see it in active use and maintenance in a number of other areas.

Pretty interesting stuff really.


Given the big investment, I don't know what they would use instead. They seem to now have a very high level language well suited for the web, but with most of the advantages of a typed-from-the-start language, and great performance for the tasks it's doing. It's hard to think of an existing language that wouldn't be a downgrade in some way (productivity, performance, re-work) for them now.

Ironically, the only thing I could see reasonably replacing Hack/HHVM for them is...PHP. Assuming PHP has some success adding all the typing, async, etc, features. They already addressed the performance differences vs Hack/HHVM.

Note: All in context of what it's used for, and the investments already made. I get you wouldn't choose Hack/HHVM as a new company.


I don't work at FB but can say with second hand experience that the company is "all in" on Hack/HHVM. The language has diverged far enough from PHP that moving back is simply not feasible.

As for "considerations of using some open source language", individual teams are mostly free to use whatever best fits their use case, similar to what you may expect at any company of that size. Pretty much every major language is in use at FB somewhere or the other. And HHVM is also open source.


Are there any large Hack codebases I can explore?

I am interested to see how different from straight PHP it is.

Does anyone know if FB tracks changes to PHP so Hack is "up to date"?


Not quite an answer to your question but slack engineering wrote a paper [1] on migrating their code base from PHP proper to Hack.

Specifically they used [2] 'partial mode' since "this loosens several restrictions to ease migration"

[1] https://slack.engineering/hacklang-at-slack-a-better-php/

[2] https://docs.hhvm.com/hack/source-code-fundamentals/program-...

EDIT: This would indicate that Hack may perhaps support standard PHP syntax but not the other way around.

EDIT: According to this [3] they are very compatible.

[3] https://www.oreilly.com/library/view/hack-and-hhvm/978149192...


They stopped supporting interoperability a while ago [0].

[0]: https://hhvm.com/blog/2018/09/12/end-of-php-support-future-o...


> Are there any large Hack codebases I can explore?

depends what you mean by 'large' - perhaps https://github.com/hhvm/user-documentation ?

> Does anyone know if FB tracks changes to PHP so Hack is "up to date"?

No, for the most part, Hack no longer considers PHP 'upstream'. Exceptions are things like security fixes to extension functions, if that particular extension function was derived from PHP.


Hack features have actually driven a lot of the additions into recent PHP versions. So the more relevant question might be the other way around


At a high level, think of Hack as Typescript for PHP, although they completely replaced some data structures (i.e. arrays) that couldn't be made performant in their PHP implementation. Just like their customized version of MySQL, Facebook is stuck with Hack. It's too ingrained. Their customizations to MySQL became very specific to Facebook and they pretty much stopped open sourcing changes. Hack has taken the same path. They will never open source Hack fully because some of the customizations are only applicable to them.



Related question, to those who have experience in both PHP and Hack, is Hack worth it? Would you start a startup in it?


You cut yourself off from the whole PHP ecosystem. Hack has a very small community (if any nowadays) compared to PHP.

Meanwhile, Composer, Laravel and Symfony are very nice to work with.


Has anyone used both Hack and the modern React workflow of Typescript and a mixture of client and server-side rendering? Interested to know how they compare


I haven't seen a huge emphasis on SSI + client other than projects like Next (which is ok but kind of clunky).

My experience with Hack was that it was just server-side ... is there some front-end aspect to Hack?


There is: it's called XHP (https://docs.hhvm.com/hack/XHP/introduction).

That said, I'm not sure how you'd go about server-side rendering React from Hacklang, as it seems that the open-source solution to do so is no longer supported (https://github.com/hhvm/xhp-js).


To clarify, xhp-js was not server-side rendering - it's a framework for generating javascript code, and getting a DOM element ID for a react element that the hack code can refer to and use elsewhere, e.g. by passing to some other generated JS.



