It's actually quite pleasant to develop in. The tooling now revolves around VS Code, which I personally don't like. I wish they'd based this on a JetBrains core.
One thing I learned to love with the Hack type system is that nullability is a distinct type (e.g. you can assign a T to a ?T but not a ?T to a T; ? here means "nullable"). The static type checking is smart enough to figure out here:
function foo(?Foo $foo): void {
  if ($foo is nonnull) {
    bar($foo);
  }
}
function bar(Foo $foo): void {
  ...
}
that the null check changes the type from ?T to T within that block.
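For readers who know TypeScript better than Hack, roughly the same narrowing happens there when `strictNullChecks` is on. This is only an analog of the Hack snippet above, not Hack itself; the `Foo` shape and the return strings are made up for illustration.

```typescript
// Hypothetical type standing in for the Foo class in the Hack snippet.
interface Foo {
  name: string;
}

// bar() only accepts a non-null Foo, like bar(Foo $foo) in Hack.
function bar(foo: Foo): string {
  return `got ${foo.name}`;
}

// foo() accepts a nullable Foo, like foo(?Foo $foo) in Hack.
function foo(maybeFoo: Foo | null): string {
  if (maybeFoo !== null) {
    // Inside this block the checker has narrowed Foo | null to Foo,
    // so passing it to bar() type-checks.
    return bar(maybeFoo);
  }
  // Calling bar(maybeFoo) here would be rejected under strictNullChecks.
  return "nothing";
}
```

The point is the same in both languages: the null check is what turns the nullable type into the non-nullable one, and only inside the checked block.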
There's literally no reason to justify the massive investment a complete rewrite would be, like literally none. It's not even close.
HHVM and Hack at this point are extremely mature technologies.
When I was there a couple of years ago, doing Hack development using vim was well supported. Not as slick as the official IDE, but more than good enough for my day to day tasks.
> There's literally no reason to justify the massive investment a complete rewrite would be, like literally none. It's not even close.
Definitely, and my (relatively ancient) background is as a PHP-mocking Perl developer.
In the end, the tooling is far, far more important and relevant than the language, and the tooling at Facebook is better than anything I've seen elsewhere, by a long shot.
I can answer only briefly, and that's a pretty big topic!
I'll skip the apparently pretty amazing integration between the IDE and source control and the build/test system since I didn't use the IDE very much.
I'll just very briefly mention Tupperware, which one might characterize as an internal kind of Kubernetes, except it's so much more, but more importantly, it's way easier to use.
I did write a small, fairly simple stand alone service in Python. It's super, super easy to get something like that 'going', as far as permissions, ingress and egress, storage, all of the real-life complications I've had to deal with, with variable levels of automation/'ease', at other organizations.
Wow, that's almost the same thing that TypeScript is to JavaScript. Is all of that openly available to outsiders? If it is, I think I might have to rethink the stance on PHP that I've had for the last 15 years.
All of the language features in ES6 / Flow / JSX that Facebook has been pushing through as part of the React ecosystem have been pioneered by Hack/HHVM at Facebook and then ported very closely to the JavaScript ecosystem.
It is available. My understanding is that, unlike PHP, it really only has first-class support on Linux. You can run it on macOS, but you may have to fix some issues yourself.
I've been tempted to use it for things like static website generation, since it has the "XHP" stuff, which is like JavaScript's JSX... and I'd undoubtedly prefer writing code in Hack to writing code in JavaScript.
The warnings about MacOS are mostly being cautious, as I don't think anyone actually runs it in production.
I work with open source Hack code a lot, and I almost always use the MacOS binaries instead of Linux for development.
For my workflows, it's more that there's two major missing things - I'm not seeing random issues. The big things are:
- a lot of the profiling features are Linux-only
- the mcrouter extension is Linux-only. If you're using memcached, this extension is useful even without mcrouterd, as it provides a true async (Awaitable) client.
Does anyone use Hack outside of Facebook? Disclaimer: it’s been years since I looked at it and even at the time it was a very cursory look as PHP 7 was buzzing.
This is the second such PHP thread I've seen in the last 24 hours (the other one was on a different web property); gotta love people shitting on PHP like it's 1998
I assume this thread was inspired by the other one.
I too am actually interested in the current state of PHP jobs. I previously assumed it mostly entailed working on legacy projects, which is what I do occasionally.
I have a younger friend who is primarily interested in working with PHP. I wanted to advise against pursuing it too much due to questionable future prospects, but he’s found gainful employment in a couple of PHP shops.
The knowledge is transferable to other languages anyway, so even if PHP jobs do dry up in the near future, it won’t have been a waste.
I haven't checked recently, but PHP jobs tend to pay less than other languages. I've also been at interviews where the interviewer mocked PHP and I would probably have been marked lower if I tried to write my answers in PHP.
Overall, PHP has matured a lot in recent years and gained much of what it was missing compared to other popular languages (with Composer/Packagist, Laravel, PHP 7 and PHP the Right Way probably contributing the most).
Personally, I still default toward PHP as a backend for most web apps. While it's not a joy like writing for Nodejs, you don't have to worry about routing or something unforeseen blocking the main thread. I also have to manage servers 24/7 for the apps and sites I build. I simply don't trust Node as much in production.
Can you elaborate on this? Not the first time I've heard a similar sentiment (and the proliferation of PHP over the years def provides some justification) but I've never heard people explain why.
I trust Apache to spin up a new PHP process for each request. Except under DoS attack or with some horrendous memory leak, it's just solid. Obviously it's a lot more overhead. But if there's some uncaught exception in a corner of PHP code, it doesn't crash the main thread and drop all the other processing on incoming requests. This means that a bug a user discovers is usually something I can fix during business hours, not get woken up at 4am because "the server is down".
I use PM2 when I manage Node and it does a fairly good job of keeping things running and well-logged, but to me the Node event loop is just a single point of failure I don't need unless I'm building something that really needs Node's architecture, like a fast-paced game or chat. (And I've even done those with PHP).
And then in cases where I need the server to do a few seconds of heavy data processing like collating large reports once they come back from a database, with Node you need to spin up a worker pool to not block the loop with those operations, which complicates reliability even more; with PHP you can get away in a lot of cases with letting the thread spawned by the call just take its time to chew through a bunch of math.
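A toy TypeScript model of the isolation argument, assuming two made-up routes, one of which has a bug. It is only an analogy: a process-per-request PHP setup gets this failure boundary from the process model for free, whereas a single long-lived process has to build it explicitly around every handler (and still needs something like a process manager for the failures it misses).

```typescript
type Handler = () => string;

// Hypothetical handlers: one healthy, one with a bug in a corner case.
const handlers: Record<string, Handler> = {
  "/ok": () => "200 fine",
  "/buggy": () => {
    throw new Error("unhandled corner case");
  },
};

// Serve one request inside its own failure boundary, so a throwing handler
// becomes a 500 for that request instead of taking down everything else.
function serveIsolated(path: string): string {
  const handler = handlers[path];
  if (handler === undefined) {
    return "404 not found";
  }
  try {
    return handler();
  } catch {
    return "500 internal error";
  }
}
```

After a request to the buggy route fails, the next request to the healthy route is served normally, which is the property the comment is relying on.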
I've always been a fan of stateless code wherever I can get away with it. My casino, for instance, ran on PHP, and although it upgraded the connection to a socket where possible, it created a new DB connection and loaded the game state from SQL, then stored a new game state, for every single action a player took within a game, e.g. every hit within a hand of blackjack. The database was at all times a consistent snapshot of the entire state of the casino. A single player's action might fail and roll back, but it wouldn't bring down the whole site and lose all the other actions in progress.
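A minimal sketch of that load, apply, store cycle in TypeScript. The `SnapshotStore` class is a hypothetical in-memory stand-in for the SQL database, and `handleHit`, the `GameState` fields and the card range are invented for illustration; the real system would use a database transaction rather than a `Map`.

```typescript
// Hypothetical game state; field names are illustrative only.
type GameState = { playerId: string; hand: number[]; balance: number };

// In-memory stand-in for the database holding the committed snapshot.
class SnapshotStore {
  private rows = new Map<string, GameState>();

  load(playerId: string): GameState | undefined {
    const row = this.rows.get(playerId);
    // Return a copy so callers can never mutate the committed snapshot.
    return row ? { ...row, hand: [...row.hand] } : undefined;
  }

  commit(state: GameState): void {
    this.rows.set(state.playerId, { ...state, hand: [...state.hand] });
  }
}

// Handle one player action: load the state, apply the action, commit.
// If anything throws before commit, the stored snapshot is left untouched.
function handleHit(store: SnapshotStore, playerId: string, card: number): boolean {
  try {
    const state = store.load(playerId);
    if (state === undefined) throw new Error(`no game for ${playerId}`);
    if (card < 1 || card > 11) throw new Error(`invalid card ${card}`);
    state.hand.push(card);
    store.commit(state);
    return true;
  } catch {
    return false;
  }
}
```

Because nothing is kept in memory between actions, a failed action only affects that one call, and the store always holds the last consistent state.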
In some respects, it's also just that I think setting up an Express router (or Koa, my preference) and graceful handlers and 404 handlers and everything in Node is overkill if you just want to serve web pages, validate forms or handle RESTful API calls. DB connections go down and you may need to recover from a slew of failed queries in rapid succession instead of just counting on a new DB connection being tried for each incoming request.
Node really shines with anything that requires statefulness on the server, low latency, push data, etc. And for me the nicest thing about Node is the ability to share the same unified data classes between client and server. But I view Node kinda like a Porsche. High performance, a bit unreliable and dangerous. Great to have in the garage. But when you're just driving to the market you're better off taking the Subaru.
> I trust Apache to spin up a new PHP process for each request. Except under DOS attack or with some horrendous memory leak, it's just solid. Obviously it's a lot more overhead.
FWIW, PHP-FPM has been the standard way of deploying PHP in production for many years now [1]. No need for running PHP as an Apache module.
Some people like pretending that other languages don't have their own warts. There are thousands of mature SaaS products that use PHP and don't rely on WordPress for anything. I don't think that anyone who uses JavaScript as part of their backend stack has any room to criticize.
Walmart.com (during Black Friday) would beg to differ. LinkedIn might want a word too. I'm sure every language has its success story(ies) along with a long list of failures. Shopify and Stripe are successful with Rails! JavaScript is no different here.
Maybe read what I said instead of leaping in to defend against things I didn't say.
For example:
>I'm sure every language has its success story(ies) along with a long list of failures.
is actually in your reply to:
>Some people like pretending that other languages don't have their own warts.
I don't think node is a bad platform. I think the language it's based on is terrible, and nobody who is willfully writing backend code in node has any real justification to complain about someone else's choice to use PHP.
I took your comment to instill a sort of "pecking order" among back-end languages. In this case, it sounded an awful lot like "it goes from _insert language(s) you approve of_ to PHP to JavaScript". Basically that you'd lump PHP and JavaScript into the "terrible" bin. I'm simply pointing out that while you may have disdain for those languages (and perhaps even rightfully so), the users are free to love them and still hate other languages.
> I don't think that anyone who uses JavaScript as part of their backend stack has any room to criticize.
Node.js is a solid server runtime these days, widely tested by big companies with big traffic as their frontend servers. And there is a mature option for static typing if you want that.
> This is the second such PHP thread I've seen in the last 24 hours (the other one was on a different web property); gotta love people shitting on PHP like it's 1998
I don't care for language wars (or the kind of petty language elitism that people get into with PHP), but: isn't Hack effectively its own language (and ecosystem) at this point?
I don't think that Facebook sinking tens of millions of dollars into a custom language dialect because of the initial technical decision to use PHP is a positive testament to it as a language. It instead suggests that Facebook bit the bullet.
I thought the original reason for it was that Zuckerberg didn't want to have the site rewritten by engineers in a language he didn't know. i.e. find a way to scale up this dorm room code without rewriting it.
All of the “www” code (backend which serves all the web apps and APIs) is Hack. And a lot of that code is generated via internal frameworks like Ent which is a graph abstraction over database access.
There doesn’t seem to be any incentive to switch www away from Hack: it wouldn’t reduce the learning curve very much due to all those internal frameworks, and would remove the opportunity to optimize the language for the www codebase.
There are open source languages used in the internal backends though (services behind www). Most of it is C++.
Migrating to Hack from PHP was a huge undertaking that took years to complete, and that was while keeping almost the same syntax and semantics wherever possible.
What would the benefit of spending thousands of engineer years (read: billions of dollars) on a rewrite be? Especially when it would nearly certainly come with a huge performance regression compared to continuing to use HHVM which has come with years of optimization for working especially with Facebook's web codebase.
In addition to the need to rewrite a gargantuan codebase and loss of the optimized runtime, it would mean throwing away a huge amount of expertise, tooling, and basically all of the source control history context.
There really isn't much to gain.
That said, the fraction of Facebook's code that is Hack continues to shrink, thanks to the continued growth of backend services and of clients that are no longer written in Hack (before there was server-rendered XHP, but now it's predominantly React).
We (external parties) really can't say. We don't know what's holding Facebook's system back, we don't know the challenges it faces both internally and externally. We don't know what VM limits the code is hitting, how good the GC/memory allocation story is for HHVM with Facebook's code (other than what we're told). We don't know how well Facebook's database is or isn't keeping up with the backend, and what future plans the database team has that would change the query profile which could then cause the backend fleet to hit limits of HHVM's tunables with current growth. We also don't know what the HHVM team is planning.
If there is a problem that Hack and HHVM haven't been able to solve, one would hope that the spending of billions of dollars is properly considered. Or to borrow from another expression, I'd hate to be billion-dollar-wise and future-of-the-company foolish. I don't begrudge the exec team for having to make that call and other calls like it. (Choosing to do Hack & HHVM in the first place was a similar judgement call.)
Although currently an external party, I did work on these things at Facebook in the past. And you're right, that any system does have scale limits on a given type of hardware.
Regarding memory allocation: HHVM has per-request memory arenas which get thrown away after each request. That combined with memory and time limits serves to compartmentalize the amount of memory pressure requests can place on the web server. Tuning of concurrent requests and workload mixes allows for some amount of exchanging memory pressure for throughput.
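A toy model of the per-request arena idea, written in TypeScript purely to show the shape of it. This is not how HHVM implements its arenas; `RequestArena`, `serveRequest` and the byte limits are hypothetical names and numbers, and a garbage-collected language can only imitate the "throw it all away at the end" behaviour.

```typescript
// Every allocation made while serving a request is charged to the arena,
// and the arena enforces a per-request memory limit.
class RequestArena {
  private usedBytes = 0;
  private readonly limitBytes: number;
  private readonly buffers: Uint8Array[] = [];

  constructor(limitBytes: number) {
    this.limitBytes = limitBytes;
  }

  alloc(size: number): Uint8Array {
    if (this.usedBytes + size > this.limitBytes) {
      throw new Error("request memory limit exceeded");
    }
    const buf = new Uint8Array(size);
    this.buffers.push(buf);
    this.usedBytes += size;
    return buf;
  }

  get used(): number {
    return this.usedBytes;
  }
}

// Serve one request with a fresh arena. Whether the handler succeeds or
// exceeds its limit, the arena is dropped afterwards and the next request
// starts from zero.
function serveRequest(
  limitBytes: number,
  handler: (arena: RequestArena) => string,
): string {
  const arena = new RequestArena(limitBytes);
  try {
    return handler(arena);
  } catch {
    return "request aborted";
  }
}
```

The limit bounds how much memory pressure any single request can create, and lowering or raising it (together with the number of concurrent requests) is the kind of tuning knob the comment describes.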
I'm not sure what the comments about the database are getting at: queries to databases and other backend systems are fairly independent of the language the web server is written in (Instagram is written in Python, and hits many similar or the same systems and those systems don't really care which one is making the request).
For specific components where HHVM is not able to handle them, those pieces can be extracted to separate services written in a different language (C++ or Rust, I think would be the go-tos probably?), with a cost of being unable to depend on libraries written in Hack.
Facebook uses a monorepo, similar to the way Google does it. Dependencies are vendored and checked into the repo. Facebook has released a tool for migrating software between repos, which is much like Google's Copybara:
With these tools, you can make fairly sophisticated choices about how you do vendoring. You can make the internal version look just like the public version, in terms of commit history. And you can export internal commits to public commits, stripping out confidential information along the way (or integrations with internal systems and tooling).
I joined Facebook in 2019 and left this year.
Prior to FB, I worked mostly with python, javascript, and C++.
Even at FB, I worked mainly in python (instagram backend), but spent a lot of time in the Hack codebase.
My experience was that FB's Hack + HHVM stack is much easier to work with and felt more productive than any other backend stack I've used. It's important to consider that a huge portion of Facebook's backend is one giant HHVM monorepo (called www). The consistency and uniformity has allowed FB to build lots of tooling and developer productivity on top of this one stack. For example, when you add feature flags, the tooling will automatically create a diff (PR) to remove the feature flag once it has been fully rolled out for a few weeks.
There are rough edges and weirdnesses, but HHVM is pretty actively being improved. Old mutable builtins are being (or have been) removed, the type system has gotten better, better error and warning messages all the time. FB is very responsive to data on developer productivity.
EDIT:
Another anecdote, when I first joined I was working on both the Instagram python codebase and the Hack API codebase (some of Instagram's APIs are in Hack/HHVM). I constantly wondered why we didn't migrate the Hack code to python, and talked with various people about proposals and potential paths to do the migration for the APIs I worked on.
After a few months of working in both codebases, I completely flipped. After witnessing insane bugs and horrible architectural contortions designed to mitigate python performance issues, I wondered why we didn't just migrate the python codebase to Hack. Python (on cpython/cinder) is just not ready for large scale web services, and Hack is a much more productive environment than Java/C++/Rust for backend.
Typescript may be an even better option, but has some issues of its own.
Yes. Instagram doesn't use the Django ORM though, and I believe they built their own async view implementation since it existed before Django went async.
IMO Typescript will grow on the backend. It's mainly just not as mature as alternatives and will get better over time. There are small problems similar to python though, like module-level mutable global state. That interferes with certain tools and fast restarts, but can be mitigated the same way as is done in python, ie https://instagram-engineering.com/python-at-scale-strict-mod...
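To make "module-level mutable global state" concrete, here is a small TypeScript sketch with made-up names. The first half shows the pattern that causes trouble; the second shows one common way to avoid it by having the caller create the state explicitly. That second half is a generic alternative, not necessarily the approach the linked post describes.

```typescript
// Problematic pattern: mutable state created when the module is loaded.
// Every importer shares it, and it lives as long as the process does.
const globalCounts: Record<string, number> = {};

function recordGlobal(key: string): number {
  const next = (globalCounts[key] ?? 0) + 1;
  globalCounts[key] = next;
  return next;
}

// Alternative: the caller creates the state and holds on to it, so nothing
// mutable lives at module scope and each instance is independent.
function createCounter(): (key: string) => number {
  const counts = new Map<string, number>();
  return (key) => {
    const next = (counts.get(key) ?? 0) + 1;
    counts.set(key, next);
    return next;
  };
}
```

With the second form, reloading or re-running a piece of code cannot be affected by leftovers from an earlier run, which is what tooling such as fast restarts tends to depend on.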
Go is not dynamic enough to build useful abstractions. It does not allow developers to make themselves more productive. It's also more difficult to build hot reloading and high quality reflection based debugging.
My understanding, from a talk back in 2014, is that Hack was developed because of frustrations with PHP's evolution. Honestly I think it may have lit a fire under the PHP developers, because PHP 7 had significant speed improvements[4].
Talk:
"Facebook recently introduced and open-sourced Hack (https://code.facebook.com/posts/264544830379293/hack-a-new-p...), a gradually-typed programming language for HHVM that interoperates seamlessly with PHP. Hack builds the bridge between the dynamically and statically-typed worlds – providing code correctness while maintaining a fast feedback loop. Facebook is committed to working with the community to improve and refine the Hack language, to help interested developers convert to Hack (https://code.facebook.com/posts/264544830379293/hack-a-new-p...), and to narrow the HHVM compatibility gap with PHP5 and popular frameworks."
FB uses a pretty wide array of languages internally -- I don't know if they release statistics publicly, but you can filter/search their open-source projects by language at https://opensource.fb.com/projects/#filter.
I believe Hack is also used by a number of other companies, the most prominent being Slack [1]. FB probably uses almost every language in at least some context: Hack, C++, Python, Haskell, OCaml, Rust, JS, Obj-C, Swift, Java, Kotlin are all ones that I’ve heard referenced by friends who work there, though that’s likely nowhere near exhaustive.
The PHP to C++ compiler hasn't been used for something on the order of 10 years.
At least as of two years ago, pretty much the entire web-facing portion of Facebook was written in Hack (which descends from PHP, but has what feels like every syntax feature you could stuff into an Algol-like language). Hack is run on HHVM, which is a bytecode VM that can JIT compile to machine code.
the problem with php is not php. the problem with php is the troves of terrible wordpress code examples out there that people learn from, modify, and use
I don't work there and don't know anybody who works there, but I do know that they have been using proprietary dialects of PHP for some time. React's JSX was influenced by XHP, and XHP was released in 2010 but apparently was in use for years before then. So from an outsider's perspective, it seems like Facebook has been using proprietary extensions or entirely proprietary languages for over a decade.
The fact that Facebook is such a big company means that they can afford to have entire teams working on these proprietary languages, much in the same way Google had the ability to assign full-time engineers to maintaining their internal container infrastructure (Borg, the predecessor to Kubernetes). Because of this, I don't see why they would give up control of some of their stack to people outside their company, who may not share their values. This isn't exclusive to Facebook's PHP developers...they have their own libraries for C/C++, Swift, JavaScript, and much more. It's a big company, the developer tools are going to be a lot bigger than most other workplaces.
Again, this is pure speculation. But most of these big companies do the same or a similar thing. Both Apple and Google have invented their own languages, the difference is that Facebook doesn't have an app platform that requires you to use Hack/PHP, thus nobody is going to just pick it up for no reason unless it compels them in other ways (e.g. Go)
> That being said, I don't think anything BUT www is written in hack.
I was at FB a few years ending in 2020.
Wherever I work, I end up doing a fair bit of 'code/tech archeology' just for fun and curiosity's sake. While WWW is the vast majority of Hack, I did see it in active use and maintenance in a number of other areas.
Given the big investment, I don't know what they would use instead. They seem to now have a very high level language well suited for the web, but with most of the advantages of a typed-from-the-start language, and great performance for the tasks it's doing. It's hard to think of an existing language that wouldn't be a downgrade in some way (productivity, performance, re-work) for them now.
Ironically, the only thing I could see reasonably replacing Hack/HHVM for them is...PHP. Assuming PHP has some success adding all the typing, async, etc, features. They already addressed the performance differences vs Hack/HHVM.
Note: All in context of what it's used for, and the investments already made. I get you wouldn't choose Hack/HHVM as a new company.
I don't work at FB but can say with second hand experience that the company is "all in" on Hack/HHVM. The language has diverged far enough from PHP that moving back is simply not feasible.
As for "considerations of using some open source language", individual teams are mostly free to use whatever best fits their use case, similar to what you may expect at any company of that size. Pretty much every major language is in use at FB somewhere or the other. And HHVM is also open source.
> Does anyone know if FB tracks changes to PHP so Hack is "up to date"?
No, for the most part, Hack no longer considers PHP 'upstream'. Exceptions are things like security fixes to extension functions, if that particular extension function was derived from PHP.
At a high level think of Hack as Typescript for PHP. Although they completely replaced some data structures (i.e. arrays) that couldn't be made performant in their PHP implementation.
Just like their customized version of MySQL, Facebook is stuck with Hack. It's too ingrained.
Their customizations to MySQL became very specific to Facebook and they pretty much stopped open sourcing changes. Hack has taken the same path. They will never open source Hack fully because some of the customizations are only applicable to them.
Has anyone used both Hack and the modern React workflow of Typescript and a mixture of client and server-side rendering? Interested to know how they compare
That said, I'm not sure how you'd go about server-side rendering React from Hacklang, as it seems that the open-source solution to do so is no longer supported (https://github.com/hhvm/xhp-js).
To clarify, xhp-js was not server-side rendering - it's a framework for generating JavaScript code, and getting a DOM element ID for a React element that the Hack code can refer to and use elsewhere, e.g. by passing to some other generated JS.
Hack isn't going away. Ever.