Hack: a new programming language for HHVM (facebook.com)
576 points by bos 1362 days ago | 415 comments



I am baffled as to why you'd build your castle atop a crumbling foundation.

I have wondered why FB didn't use a proper language with proper typing to begin with. I mean, I "understand" logistically: they already had a giant codebase in PHP, migrating a codebase is expensive, and it's difficult to hire and train 1000s of hackers in e.g., OCaml. (They do have some OCaml people, but they are outliers. OCaml was my favorite thing to write there, though it didn't afford some of the same niceties and interactivity as the PHP code they had, only because the support was down by several orders of magnitude.)

But at the same time, layering FP with a home rolled static type checking server (??) is bug prone and is certainly yak shaving (which they have time and money to do). Now they've written (1) a compiler to C++, (2) a compiler to VM byte code, (3) a corresponding runtime for each, (4) extensions to PHP, (5) a type checker, and (6) an inference engine. That's a lot of stuff. And in the end, it's still PHP, which is duly disliked. (Though Facebookers don't seem to care. The prevalent attitude toward it is that "PHP, as it's coded here, is mostly like C++, and that's OK.")

Writing correct type checkers and inference engines is kind of difficult. They seemed to take the approach of just building onto it incrementally until it just seems to work. That approach led to many bugs in many cases that simply aren't thought of when one is trying to build inference engines by hand, as opposed to according to theory. Type checking and inference is an area rich in theory and attached formal, mathematical semantics. Standard ML's standard is perhaps the most infamous; it's a collection of mathematical statements about the language. That way, the compiler is now almost an engine to prove your code is correct. I don't see how the same guarantee can be made with something that is just cobbled together.


> I am baffled as to why you'd build your castle atop a crumbling foundation.

Because perfect is not the enemy of the good? Because "build atop a crumbling foundation" has demonstrated time and again to be, by far, the most successful way to accomplish anything in computing? Unless you have some example of perfect, now dominant, technologies that have been created ex nihilo that I'm missing? I mean we (facebook) are still using PHP and MySQL, improving both. And when we need to break things out, we head into C++, the queen mother of castles on broken foundations.

> But at the same time, layering FP with a home rolled static type checking server (??) is bug prone and is certainly yak shaving (which they have time and money to do). Now they've written (1) a compiler to C++, (2) a compiler to VM byte code, (3) a corresponding runtime for each, (4) extensions to PHP, (5) a type checker, and (6) an inference engine. That's a lot of stuff.

All languages, runtimes, and standard libraries (and databases, and source control, and on and on) are "broken" at sufficient scale. You're going to be spending time rebuilding things other people take for granted no matter who you are and what language and technology you are working in. The underlying assumption that a "proper language" gives you these things for free is completely false.

> Writing correct type checkers and inference engines is kind of difficult.

Just so we are clear, you ask why Facebook didn't rewrite 10,000 human-years of code into a mythical unnamed "proper" language, but you consider writing a type checker to be "difficult". I think you might have vastly inaccurate pictures of what is and isn't "difficult".

> That way, the compiler is now almost an engine to prove your code is correct. I don't see how the same guarantee can be made with something that is just cobbled together.

Computing history is littered with dead projects from people who believed that anything less than perfect is unworkable or non-valuable.


> Because "build atop a crumbling foundation" has demonstrated time and again to be, by far, the most successful way to accomplish anything in computing?

I can't imagine where that sort of conclusion comes from. Building on a crumbling foundation seems to be just about the most proven, reliable way to ensure your software project won't survive more than a short time without needing serious effort just to maintain it and/or a big rewrite.

Much of the world still runs on C, a language that was created long before many of us here were born. Other languages have risen, peaked, and fallen into relative obscurity since then, yet C endures, because for all its faults, it is simple and predictable, the epitome of a sound foundation.

Large amounts of COBOL are still driving back office systems in large organisations. The cost of hiring people with the skills to maintain them is probably horrific, but those systems are still there, doing their jobs decades later.

You can still run applications from nearly 20 years ago on Windows today, in no small part because of Microsoft's persistent focus on compatibility and keeping the basics reliable over that time. Similar stories can be told in the *nix world.

What major accomplishments in computing that have been built atop crumbling foundations can claim anything even close to the scale of success of these examples? You surely can't be talking about the move fast and break stuff philosophy that seems to drive everything at trendy software shops like Facebook and Google, or the kind of MVP/lean start-up hype we hear about ad nauseam on HN and the lasts-just-long-enough-to-exit web apps that result.


PHP is a "crumbling foundation" in exactly the same way DOS, Windows, COBOL, and C and C++ were, are, "crumbling foundations". I don't see how you are disagreeing with me. You are proving the point.

You aren't actually going to claim that Windows was well-conceived and theoretically well-founded, are you? Windows has been a never ending refinement on exceptionally shaky ground.


It seems like you're now arguing that PHP isn't a crumbling foundation -- a point that reasonable people could debate -- rather than that building atop crumbling foundations is by far the most successful way to accomplish anything in computing, which was the claim I challenged.

Incidentally, my point about languages like C or operating systems like Windows was not that they are theoretically wonderful under the hood, merely that they provided a reliable foundation. C has been standardised for a long time and is widely portable. Code that was designed for early incarnations of Windows will often run with little modification even on today's systems because the essential underlying models and APIs have been diligently preserved over the years even as many other changes were going on around them.


I'm curious just what definition you're using for "crumbling foundation". Is it that old software doesn't break on it? In that case PHP/HHVM/Hack isn't a crumbling foundation either - Facebook is built on it, and Facebook is clearly still running.

Or is it that maintenance is difficult and programmers will run into all sorts of ugly corner cases and features that are just grafted onto each other? Because those apply to C and C++ and Win32 and Google and basically every other large software system as well. That's the point the grandparent is making - if you look at pretty much any successful, evolving software system under the hood, you'll see a byzantine mess of complexity, and it's a wonder that it ever works.


I didn't have some specific, technical definition in mind, but if I were to try and pin it down, perhaps in software terms it would be something like "a dependency that is unreliable in the long term".

Clearly this isn't an absolute scale. As our industry evolves and we develop more reliable ways to achieve our goals, something that we regarded as being a relatively stable foundation in the past may no longer be regarded as such in the future when our standards have risen. Moreover, what constitutes "long term" might vary wildly among different projects.

I suppose my basic objection to the original claim (that building atop crumbling foundations is by far the most successful way to accomplish anything in computing) is that significant achievements in computing tend not to happen overnight but rather to develop over time, and the more stable your foundations, the better chance your project has of developing far enough to achieve significant things.


If you define "crumbling foundations" as "a dependency that is unreliable in the long term", then your conclusion is circular. Of course no long-term products will be built on crumbling foundations, because if a technology stack has long-term successes then, by your definition, it is not a crumbling foundation.

I think the other posters are using a definition of "crumbling foundation" as "one which most engineers hate, which slows them down through excess complexity". And by that definition, almost all successful projects are built on crumbling foundations, because the fact that the project was successful leads you to add features to it, and adapt it in ways that the original architecture didn't anticipate. This process only ends when the software becomes so complex that all further attempts at modification fail, at which point everybody hates the codebase and it is, by pretty much any definition, a "crumbling foundation".


I think it's funny that this whole thread is based on the initial haphazard word choice of reikonomusha yet he is not even a participant in the debate.


> If you define "crumbling foundations" as "a dependency that is unreliable in the long term", then your conclusion is circular.

Not at all. I'm arguing that in computing, worthwhile results often take time to achieve, and therefore that foundations that are likely to be around for longer will improve the chances of achieving such results. Alternatively, from the opposite point of view, the odds of achieving something worthwhile go down significantly if you have only a short time to achieve it, as inevitably you will if you are building on foundations that aren't themselves going to be around for long (whatever we choose to call them).


hm interestingly, PHP still powers about 70% of the web! the web should have certainly crumbled by now ...


I'd disagree if you say PHP powers about 70% of the web. Web servers (and all other kinds of servers) are not written in PHP, and neither are the protocols that drive the web. Operating systems are not written in it either. These (and many more) are what power the web, not web applications. Your definition of what "powers the web" is remarkably wrong.


Facebook doesn't use web servers or operating systems written in PHP either, right? So if the language that web applications are written in isn't very important to the overall dependability of web applications, then it stands to reason that Facebook's choice of PHP for their web applications shouldn't be a problem either, right?

What point are you trying to make?


> What point are you trying to make?

I had this question in mind while reading your reply. What point are you trying to make with your comment? For my part I wanted to point out that it's unacceptable to say PHP powers 70% of the web. (Check my previous comment's parent.) It has nothing to do with Facebook. And I didn't allude to your inference that choice of language doesn't affect the overall dependability of web applications.


>I didn't have some specific, technical definition in mind, but if I were to try and pin it down, perhaps in software terms it would be something like "a dependency that is unreliable in the long term"

In this case you are mostly hand-waving, and what you say amounts to "I don't like X language".


> It seems like you're now arguing that PHP isn't a crumbling foundation

I think you need to go back and read the message you replied to. He is not arguing that at all. He's pointing to a long range of other things that he is asserting are also crumbling foundations.


"Windows has been a never ending refinement on exceptionally shaky ground." No. Windows NT was a complete rewrite and it was "well-conceived and theoretically well-founded".


What you say is technically correct, which is the best kind of correct. However, fewer than 1% of Windows programmers ever talk to NTOSKRNL, and probably an order of magnitude fewer do it regularly.

Most of the time, you're talking to Win32/64 or high-level services based on DCOM or .NET, where the "well-conceived" and "theoretically well-founded" stuff doesn't turn up. You can go your whole career without knowing that there's a well-designed kernel under all that cruft.

I'd guess that less than half of Windows developers could say what the object manager does.


This isn't true. Every Linux user encounters problems with drivers. By integrating them into the kernel we have an ecosystem where drivers are out of reach of many users and OEMs. Consider the difficulty in making a desktop scanner for Linux. Additionally, kernel changes can break a driver with little recourse on the OEM's side; you must simply bend to Linus's will. At the end of the day we suffer from Linux's driver architecture.


So true. Windows' services and drivers system may not be beautiful, but it full-on works most of the time, and it allows even non-pro Windows users to change basic stuff with far greater ease than Linux, which is a real shame given Linux's potential for transparent usability (a potential it has had since the late 1990s, but I have personally given up waiting for it to come true).


For more than half of Windows developers, if you told them there was an object manager, they would ask how you can turn off that service to improve performance.


I'm appalled and a little bit insulted that you group PHP in the same group with COBOL, C and C++ in terms of their foundation. COBOL and C were designed by some of the greatest pioneers of the field, and indeed PHP is built on C.


Just because they are remembered as being first doesn't mean they were well designed, or actually pioneers. At the same time C was being written, others were working on Lisp and ML. It has taken nearly half a century for some of their innovations to be recognized as good ideas and taken up by mainstream languages, while C was a small improvement upon existing language design.


C's designers deliberately chose to ignore the other system-level programming languages of the time, which already offered more memory-safe constructs, in their quest for a portable macro assembler.

PHP could have been easily done in any language with native compiler toolchains.


Why would you feel insulted? Did you personally create any of those languages? In fact, did you personally create any language? If not, maybe you shouldn't be quite so dismissive of the accomplishment it was to create PHP. Sure, it's not the best language out there, and valid criticisms can be levelled at it, but as they say: It's better to have tried and failed...


What you just said is plain stupid. So only language designers can be critical of other languages? Bullshit! Since programmers can vote with their feet and gravitate towards better languages (is it miraculous that almost all programmers have a distaste for PHP?) we should give reasons why we use C and not PHP. If they were equally crappy, what would have been the cause of choosing one over the other? I can call PHP the worst language ever, and I don't need to have created a language already.


>So [only] language designers can be critical of other languages?

Yes. Or rather, other people can be critical too, but language designers' opinions have far more validity.

In other words, anybody can say whatever uninformed BS he wants (it's a free country). But that's no replacement for being an expert in what you're discussing.

>Since programmers can vote with their feet and gravitate towards better languages (is it miraculous that almost all programmers have a distaste for PHP?)

You'd be surprised. PHP is one of the most popular languages and one of the most used languages, so far from "all programmers have a distaste for PHP". So if we were to use that "voting" argument alone (which I find wrong), PHP should be considered a very good language. Not the intention you had, I guess.

While PHP has its warts, it's mainly the less pragmatic and more fad-prone programmers that have issues with PHP, those who look for silver bullets and like to feel superior by choice of programming language, editor and the like.

As for programmers "voting with their feet", well, they don't do a very good job of it. The best languages (like LISP, Smalltalk, OCaml, Haskell, to name but a few) are seldom the most popular.


> but language designers' opinions have far more validity

In this case we're talking about the work of a language designer. By the way, not all people capable of creating a programming language have created one yet. Marc-André Cournoyer, who wrote the book that Jeremy Ashkenas learnt language design from to write CoffeeScript, has not created a language himself.


Given the widespread use of PHP for web programming, it seems obvious that in the domain of web programming, developers have voted with their feet for PHP.


Energetic amateurs. They've not reached maturity so that their votes should be counted.


Given that these supposed "energetic amateurs" have created an enormous percentage of the code that web applications across the internet run on, their votes automatically count. If PHP developers had never produced anything of value, that would be a different discussion.


I also get a bit (not terribly, just a bit) insulted if somebody says "Beethoven sucks", because the person who said it is technically the same species as I am.


In the sense that any software written in C is built on C.


True, but a lot of PHP's ... design also reflects C, only that they warped it.


A few of the complaints about PHP (the inconsistent function library, for example) exist because PHP was originally a very thin scripting language layer for using existing C libraries.


C was not built by the "greatest pioneers". Just successful pioneers.

There were far more evolved and elegant languages at the time C was created (and during the time it took for C to rise, even more were made). LISP for one, but also languages with the same performance characteristics and systems programming capabilities as C.

C definitely wasn't seen as a "great language design" -- just a very useful and pragmatic one (see also the classic "Worse is better" essay).


C, simple and predictable? You've got to be kidding me: http://lwn.net/Articles/586838/


Compared to C++

Compared to C++ a language made of explosions and guttural sounds is an elegant way of communication.


It can definitely be simple, but if you're writing C you're compiling directly on hardware. If you change the hardware, your program will not run. That is literally the definition of unreliable.


It's not, actually. The Mars Exploration Rover won't work underwater. That doesn't mean it's unreliable.

When your compiled C program sometimes runs on x86 and sometimes doesn't, that is the definition of unreliable.


That's a stupid definition for unreliable.


Reliability can be measured in successes and failures. Failing reliably is not necessarily a bad thing. Some things you build may or may not work on other hardware. That's unreliable. You assert, with confidence, that C will fail. That's reliable. You can rely on it failing.


> Much of the world still runs on C

But these days, most interesting applications run on C++, which started from the arguably crumbling foundation (from C++'s point of view, not per se) of C, and grew organically over several major revisions into something hideously complex.

This trait of C++ is not a good thing, but the amount of successful software written in it seems to prove that it's not fatal either.

> You can still run applications from nearly 20 years ago on Windows today, in no small part because of Microsoft's persistent focus on compatibility and keeping the basics reliable over that time.

Layers of hacks on hacks to keep old software running correctly is exactly "building on a crumbling foundation", and probably the reason Microsoft is trying to get rid of Win32, having severely limited its availability on ARM in favor of the WinRT APIs. But I'd say the venerable success of Win32 demonstrates that the crumbling foundation works.


> This trait of C++ is not a good thing, but the amount of successful software written in it seems to prove that it's not fatal either.

That depends. It could also be argued that our current tools are limiting what we can accomplish and at some point we'll need a revolution, otherwise we'll hit a ceiling, just like when concrete replaced the need to carve and place rocks on top of each other.

For example, living organisms are way more adaptable and more self-healing than anything we've ever built. Our own body should be a textbook example of massive parallelism involving trillions of independent agents that cooperate with each other. In particular, the process of wound healing that happens when you cut yourself is quite fascinating: https://en.wikipedia.org/wiki/Wound_healing

If we wanted to simulate the human body, or at least the human brain, since that's the most interesting part, somehow I'm not seeing C++ in that picture.


C++ belongs to my list of languages that I enjoy using; sadly, it was built on quicksand foundations due to C compatibility as a way to make it mainstream.


Add JavaScript. Super crappy, but its ubiquity for web scripting is really saving it from some real bashing. When we finally have options, we'll relish our freedom and talk about what it was like to work with badly designed programming languages.


> What major accomplishments in computing that have been built atop crumbling foundations can claim anything even close to the scale of success of these examples?

Wikipedia and Facebook?


A fair question is whether they are successful thanks to PHP or in spite of PHP.


Yahoo!, too, is a heavy PHP user.


They've been slowly moving to Node and Java for quite some time. I interviewed there a few months ago and they drilled me on Java stuff.


Neither of those would even remotely qualify as accomplishments in computing. But they also are terrible examples. Facebook isn't using PHP anymore, that's what this discussion is all about. Wikimedia is terrible, and Wikipedia is almost exclusively static content being served by Squid.


What are we discussing? Are we discussing the tool or the things made with the tool?

It's like trolling a raw cast iron hammer used to build a furniture factory and then saying the whole factory is entirely useless just because someone invented a stainless steel hammer.


We're discussing both. The original question posed was something like "what amazing stuff has been built on crumbling foundations", and the response was "Wikipedia". I don't necessarily agree with the sentiment of the question, but Wikipedia is not a good example of amazing technology.


Given a programming task, it will be written faster, easier, and in a more maintainable and less bug-prone fashion if it is not PHP.

The opposite opinion is basically indefensible. Sure, you can still dig a trench with a spoon (if you have enough money to field a bunch of workers with spoons), even if a shovel would do a better job.

Let's begin.

1) PHP autocraptastically converts strings that look like numbers, into numbers, resulting in all sorts of weirdness like this: https://eval.in/111886

2) PHP 5.4's OWN TEST SUITE has 91 failures and only 70% coverage. There is NOTHING more "WTF" than that! Why even bother having a test suite?? http://gcov.php.net/viewer.php?version=PHP_5_4

3) Why the fuck are all of these different things equal, and how does this NOT result in problems? http://i.imgur.com/pyDTn2i.png

4) String increment is dumb to begin with, but why does it not even match the behavior of string decrement? https://eval.in/60631

5) Why the hell can you jump back into a try block from a catch block? Recipe for disaster: http://phpmanualmasterpieces.tumblr.com/post/33091353115/the...

6) PHP comparison operators. I'm sorry, but this level of complexity might make you feel smart once you master all its idiotsyncrasies [sic], but it's actually dumb: http://stackoverflow.com/questions/15813490/php-type-jugglin...
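
Point 1 in the list above can be made concrete with a small model. What follows is a minimal Python sketch, not PHP's actual implementation: `is_numeric` and `loose_eq` are hypothetical helper names, and the regular expression is a simplified assumption about what counts as a "numeric string". It only illustrates why two different-looking strings such as "1e3" and "1000" end up equal once a loose comparison decides to read both as numbers.

```python
import re

# Simplified assumption of a "numeric string": optional sign, digits with an
# optional fraction, optional exponent. PHP's real rules cover more cases.
_NUMERIC = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def is_numeric(s: str) -> bool:
    """Return True if the whole string looks like a decimal or exponent-form number."""
    return _NUMERIC.fullmatch(s) is not None


def loose_eq(a: str, b: str) -> bool:
    """Model of a loose string comparison (hypothetical helper).

    If both operands look numeric they are compared by numeric value;
    otherwise they are compared as ordinary strings.
    """
    if is_numeric(a) and is_numeric(b):
        return float(a) == float(b)
    return a == b


print(loose_eq("1e3", "1000"))  # both operands are read as the number 1000
print(loose_eq("0.0", "0"))     # both operands are read as zero
print(loose_eq("abc", "ABC"))   # neither is numeric, so a plain string comparison
print(loose_eq("1e3", "1e3x"))  # only one side is numeric, so a plain string comparison
```

A strict comparison, by contrast, would just be `a == b` on the raw strings, which is what most people expect when both operands are already strings.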

That's a small fraction of not-thought-out PHP language features that result in REAL bugs and security holes. Which consume large swathes of programmer time. Which, apparently, Facebook can afford to swallow.

I'm sorry, but your position, as valiant as you are defending it, is literally indefensible. And I don't give a fuck how big Facebook is, they would STILL be better-served by switching SOME of their code to a different language. ANY modern programming language wouldn't suffer from this imbecilic, immature language design.


You're making a mistake. The question is not whether to start a company with PHP vs language X. The company is long started. The question is not whether or not to poof into existence a port from all of FB to language X. That's not possible. The question is, given that PHP is the current language, with all its faults, will it cost more (including all definitions of cost) to make the switch? How long will it take? Does it get the job done? How bad is the damage?

The question more pertinent to your argument is, did they make a mistake years ago choosing PHP? That's when they could have conceivably gone with language X.

BTW the types of stuff you're listing are documented here. So thorough it's amusing to read: http://me.veekun.com/blog/2012/04/09/php-a-fractal-of-bad-de...


> The question is not whether or not to poof into existence a port from all of FB to language X. That's not possible.

And yet Twitter has slowly migrated on the backend from Ruby to Scala/Java. They still use Ruby for the frontend, though it's not clear to what extent, since they've also migrated to a single-page, fat client design on the desktop. And at the very least, their choice of Ruby for prototyping was sane.

I understand that large codebases can't be migrated easily. But you can migrate individual components when needed, if you have a modular, service-oriented design.

Also - building new functionality in PHP is indefensible, unless their code-base is one big monolithic hairball, which I doubt it is.


> The question is, given that PHP is the current language, with all its faults, will it cost more (including all definitions of cost) to make the switch? How long will it take? Does it get the job done? How bad is the damage?

An excellent point. Which in fact is an argument in favor of modularizing your code as much as possible, as early as possible. There are tools now like Apache Thrift which make this easier: http://thrift.apache.org/


Why thank you :)


It's interesting how most people seem to attribute the quality and longevity of a software to the language it is written in or the frameworks it uses rather than to the amount of thought that was put into its design. Sure, the former is important, but largely overrated.


I can't fathom why you're under the impression that some code hasn't been switched to another language. Furthermore, your vitriol seems quite effective at undermining your thesis.


Fair enough. And to be honest, not too many years ago I wrote up an auction site that had most of the functionality of eBay, in a single night, in PHP (someone else had already done the frontend work, fortunately; I just built the backend).


I think most of you guys are missing a crucial thing here -- how decisions get made. Facebook had two options: 1.) Rewrite all the PHP code in Java/Ruby/C++/whatever 2.) Write HHVM/Hack/etc to transparently convert the existing PHP code.

Option #2 ended up getting a lot of publicity for the engineers. If they had chosen option #1, not only would it have been a lot of hard, boring, and possibly error-prone work, there would have been no chance for any of the engineers to get the type of publicity they are currently getting. All in all, engineers tend to work on whatever will get them noticed, not necessarily the better technical choice nor what is in the best interest of the company... this is especially true in companies like Facebook and Google where there are a lot of very smart engineers doing relatively mundane work. So there you have it: Hack/HHVM are just publicity stunts, and the more you feed them, the happier their PR gets.


> Just so we are clear, you ask why Facebook didn't rewrite 10,000 human-years of code into a mythical unnamed "proper" ...

You don't need to rewrite anything (at least, not all at once.) Personally, I'd have expected you to make something akin to CoffeeScript or ClojureScript that targets PHP, and can "link with" your existing PHP modules (or rather, with their HHVM bytecode representations.) Then treat the PHP code as a constantly-dwindling Big Ball of Mud (http://laputan.org/mud/).


> Personally, I'd have expected you to make something akin to CoffeeScript or ClojureScript that targets PHP, and can "link with" your existing PHP modules.

We took a similar approach to what you described. Then we called it Hack.

This is what we mean by seamless inter-operation: the HHVM runtime understands both syntaxes and runs both <?php and <?hh code in the same process. Whether Hack integrates into the runtime at the parsing (current) or bytecode (future?) layer is an implementation detail.


But the syntax you've got now is effectively a superset of PHP, and comes with all the problems of PHP. You've effectively wrapped your Big Ball of Mud... in slightly different-colored mud. The whole point of a clean-break-targeting-interoperability like this is that you can stop using mud at all, and it'll still work with what you've got now. In fact, what other reason would you have?

If you mean that future versions of Hack will evolve to have a different syntax, while still targeting the runtime... then you'll still have code around from the intermediate era, and you'll have to interoperate with that too, won't you? You'll have PHP, PHP-looking-Hack, and actually-nice-to-code-in-Hack.


PHP's syntax is not what's preventing you from writing large maintainable systems in it. Many of the more successful languages throughout history became successful BECAUSE they used a syntax that's superficially similar to something familiar.

That's not to say that syntax doesn't matter, but semantics and pragmatics tend to matter more, despite Wadler's Law of Language Design stating that most people don't understand this.


I think you are putting way too much emphasis on syntax here. The important contribution of Hack is the type system, and this is something that a syntactic-sugar translator like CoffeeScript can't hope to achieve.

I also wouldn't count ClojureScript here. It's a whole different language that just happens to compile down to JS.


That is essentially what Hack is.


Very suitable name for a language then.


That's exactly what they did, that's what they are announcing here.


> Because "build atop a crumbling foundation" has demonstrated time and again to be, by far, the most successful way to accomplish anything in computing?

That's only because the currency for building things on top of crumbling foundations has been sweat and manpower. We aren't that far off from the Egyptians that were using hundreds of thousands of slaves per pyramid. It's a good thing that we've transcended the necessity for hundreds of thousands of slaves when raising buildings, don't you think?

And yet, here you are, claiming that building stuff with broken tools is the most successful way to accomplish anything in computing. Actually I view it as nothing short of a miracle, showing human determination in action ;-)

> All languages, runtimes, and standard libraries (and databases, and source control, and on and on) are "broken" at sufficient scale.

That's a fallacy. Just because both X and Y are broken, that doesn't mean they are equal, as some things are more broken than others and PHP is more broken than anything else mainstream (C++ at least has reasons). Also, I don't see how "at scale" changes things in PHP's favor, I really don't.

If you're trying to argue that "at scale" the level of brokenness converges to the same levels, then that's a stupid thing to say. After all, Twitter didn't have to build its own JVM and the stuff they run on top of the JVM is probably more power efficient than you'll ever be with HHVM. Probably saner too.


> We aren't that far off from the Egyptians that were using hundreds of thousands of slaves per pyramid. It's a good thing that we've transcended the necessity for hundreds of thousands of slaves when raising buildings, don't you think?

False, they were skilled, paid laborers: http://en.wikipedia.org/wiki/Egyptian_pyramid_construction_t...

Apologies for being irrelevant to the main point, but this is a tired myth.


Interesting. Thanks for the link. Is Wikipedia great or what? :-)


Facebook is a relatively mature company now that is in the business of making money. In fact, I would argue that [framework of the week] is more buggy and broken because it is less well understood.


Actually Twitter has been pretty successful in migrating a good chunk of their codebase from Ruby to Scala.

Facebook can keep PHP for the thin web layer that renders the page, but they could have migrated the meat of their code to a safer and more robust language.


The HN crowd seems to dislike (or despise perhaps?) PHP, but it's really not that bad. Yes it has a lot of warts, but it has a lot of things that make it nice for web development.

a) try your new code by saving in your editor, and hitting reload in your web browser.

b) it's very approachable. People who only know HTML and CSS can be expected to do a little bit of PHP work to integrate their changes. If you set up the right network mounts, they just need to edit files and reload (see a)

c) it's not super high overhead at runtime. If you're not using a framework, and you don't build up a crazy object hierarchy, it's not too hard to get your page out with about 10 ms of overhead beyond data fetching. For very simple webservices (fetch data, possibly from multiple sources, and do a little formatting for the consumer), I was able to get the overhead down to 2ms. You can certainly do better with other languages, but you can usually get better throughput improvement by working on getting data quickly. Btw, all the frameworks are terrible; many of them add 100 ms to the page just for the privilege of loading the includes; PHP is a framework for web programming thank you very much.

d) cleanup; you don't have to worry about it. If you don't do anything weird (c extensions, with non-preferred malloc), at the end of the request, everything is thrown away.

That said, there are plenty of things PHP isn't good at: I wouldn't run a long running process in PHP; and multithreaded PHP sounds like a bad idea.


Yes, PHP is its own web framework, but it's not a very good one. And, as you say, implementing a better one on top of it adds a lot of overhead due to the execution model. With other languages, where you don't throw everything away at the end of the request, you are free to implement a good web framework without suffering additional overhead.

As for a), many (most?) frameworks are able to monitor the source files and reload the application when they change, in order to enable that workflow. For example: http://cherrypy.readthedocs.org/en/latest/refman/process/plu...


I'm running multithreaded background processes in PHP pretty successfully. I did not see (and still do not see) a reason why I should have chosen another "proper" language for it.

P.S. The vast majority of arguments against PHP here lead me to the conclusion that most (not all) debaters don't understand how PHP works well enough.


Multithreaded using the pthreads API or something else? Any chance there's code I could take a look at? (I'm curious!)


I use curl_multi_* functions.


Oh... I don't count curl_multi as multithreaded (because it isn't really threads), but it's very useful, and I definitely use it.
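
For anyone wondering about the distinction: curl_multi drives several transfers from one thread by multiplexing I/O. A rough Python analogy (not curl, and not PHP) is an asyncio event loop. In this sketch `fake_fetch` is a stand-in that only sleeps, and all names are illustrative assumptions.

```python
import asyncio
import threading
import time


async def fake_fetch(name, delay):
    """Stand-in for an HTTP request: waits `delay` seconds without blocking."""
    await asyncio.sleep(delay)
    # Record which OS thread ran this coroutine.
    return name, threading.get_ident()


async def fetch_all(jobs):
    """Run all fake requests concurrently on a single event loop."""
    return await asyncio.gather(*(fake_fetch(n, d) for n, d in jobs))


def run_demo():
    """Return the job names, the set of thread ids used, and elapsed seconds."""
    jobs = [("a", 0.3), ("b", 0.3), ("c", 0.3)]
    start = time.monotonic()
    results = asyncio.run(fetch_all(jobs))
    elapsed = time.monotonic() - start
    thread_ids = {tid for _, tid in results}
    return [name for name, _ in results], thread_ids, elapsed
```

The three waits overlap, so the total time is close to one delay rather than the sum, yet only one thread id shows up. That is concurrency without threads, which is why it is useful for fetching from several sources but does not help CPU-bound work.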


If you want to know: just checked the one project where I did a long running PHP-process.

It is a job scheduler which takes jobs from a Redis queue and executes them.

It does fork a new process for 1 type of task (image resizing in this case) to make sure it doesn't leak memory.

It also reuses these forked processes so it doesn't need to fork for every task.

It was last started on Jan. 2013 (because of maintenance) and still running fine.

The reason? I didn't want to introduce an extra language for a project for which all the server code was already written in PHP.

So if it fits the task, you can do it.
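
The shape of that design (a parent that pulls jobs and hands them to a long-lived child process which is reused across jobs) can be sketched in Python as below. This is not the PHP code described above: the Redis queue is replaced by an in-memory `deque`, the "work" is a placeholder, and `ReusableWorker` and `drain` are hypothetical names.

```python
import subprocess
import sys
from collections import deque

# Code run by the child: read one job per line, reply with one result per line.
WORKER_SOURCE = """
import os, sys
for line in iter(sys.stdin.readline, ''):
    job = line.strip()
    # Placeholder for the real work (image resizing in the description above).
    print(f'{os.getpid()} done:{job}', flush=True)
"""


class ReusableWorker:
    """A long-lived child process that handles many jobs over pipes."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def run(self, job):
        """Send one job to the child and return (child_pid, result)."""
        self.proc.stdin.write(job + "\n")
        self.proc.stdin.flush()
        pid, result = self.proc.stdout.readline().split(maxsplit=1)
        return int(pid), result.strip()

    def close(self):
        self.proc.stdin.close()  # EOF ends the child's read loop
        self.proc.wait()


def drain(queue, worker):
    """Pop jobs until the queue is empty, sending each to the same worker."""
    results = []
    while queue:
        results.append(worker.run(queue.popleft()))
    return results
```

Keeping the leaky work in a child means the parent's memory stays flat, and if the child ever grows too large it can be closed and replaced without restarting the scheduler.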


Duly noted, will not be so negative on long running processes. :)


> a) try your new code by saving in your editor, and hitting reload in your web browser.

The only language for which that works is client-side Javascript. For PHP, you forgot the part in which you have to install and run a web server, then point your browser to localhost. Plus, to get anything useful done, you'll also need some sort of database to go along with it. I remember the first time I did that and it was pretty intimidating.

> c) it's not super high overhead at runtime ... it's not too hard to get your page out with about 10 ms of overhead beyond data fetching

To me much more interesting is the total time it took for the client to receive the response, possibly when multiple concurrent requests are happening. The comparison here should be versus a static page served by Nginx of course.

The best throughput possible that I got for an otherwise complex business logic happened on the JVM. I basically rewrote a web-service built on top of Django/Python, with a redesign with emphasis towards in-memory caching, parallelism and async I/O and the result was a server that was able to process more than 10,000 requests per second with an average of about 5ms per request (actually in production the instances ended up processing about 2000 reqs/sec of real traffic, since c1.medium EC2 instances don't have enough CPU power).

Of course, people that just need to slap something together, don't need this level of throughput. If a request takes 400ms for a dumb web form on a low traffic website, that's of no consequence to most people. The problem happens of course when such a piece of software evolves to something much bigger, like Wordpress. I'm always amazed at the gimmicks that people do just to keep their Wordpress powered blog alive.


Has anyone tried to copy these good parts of PHP, except with a not shitty language?

Is there anything about PHP, the language, that lends itself to this style of development, or could you get these same benefits with, say, Ruby or Python?


Yes, someone has indeed tried this. It's called "Hack".


There are a lot of people making money writing systems in PHP because it's the right tool for some jobs. There are companies making money using systems that are written in PHP, because it works. These people don't have as much to say as those working in other languages who may feel threatened or uncomfortable when something they think is bad seems to be used with success.


a) WTH? I hope you do not test your code in production. If you don't, that feature (an IDE that constantly compiles your code) is a common feature.

b) Theorem: Letting your web designers program is way worse than letting your programmers do web design.

c) Every simple project can become complex at some point. Facebook certainly is.


a) You don't have a development server/VM?

b) Not everyone has programmers. That was his point


Oh, for b) i know a bonmot. Sorry for the image. http://fettemama.org:6502/ed2499a7aa73df6dcefe95fbb649ece3


Not everyone that needs/wants to make a website wants to start a software development company. Often they just want to accomplish a few small things. To match up the rest of your examples:

I want to defend myself in small claims court

I want to learn some first aid

I want to build a shed behind my house


> I am baffled as to why you'd build your castle atop a crumbling foundation.

> I have wondered why FB didn't use a proper language with proper typing to begin with.

I would suggest watching Keith Adams's talk, "Taking PHP Seriously": http://www.infoq.com/presentations/php-history

He goes through why Facebook uses PHP and decided to build upon it to create Hack. I highly recommend watching the whole thing, but the main three things he points to are:

1. Frictionless programmer workflow with a short feedback cycle

2. All PHP requests start out with the same consistent state by default

3. Rigid style of concurrency
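
Point 2 is the easiest one to show in a few lines. The sketch below is a toy Python model, not PHP and not anything from the talk: `INITIAL_STATE`, `handle`, `serve_isolated` and `serve_long_lived` are invented names, and the "state" is just a dict.

```python
import copy

# Assumed initial state; in the PHP model this is rebuilt for every request.
INITIAL_STATE = {"user": None, "cache": {}}

# Shared state for the long-lived variant, which survives across requests.
_shared_state = copy.deepcopy(INITIAL_STATE)


def handle(state, user):
    """Toy request handler that mutates whatever state it is given."""
    leaked = state["user"]  # whatever the previous request left behind
    state["user"] = user
    return leaked


def serve_isolated(user):
    """Shared-nothing: every request gets a fresh copy of the initial state."""
    return handle(copy.deepcopy(INITIAL_STATE), user)


def serve_long_lived(user):
    """Long-lived process: requests see mutations made by earlier requests."""
    return handle(_shared_state, user)
```

Calling `serve_isolated` twice never shows the second request anything from the first, while the second call to `serve_long_lived` sees the first caller's user. The flip side is that the isolated variant also cannot keep warm caches or connections between requests.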


Which is nonsense, because:

1. I have never seen a framework that didn't go to great lengths to update to new changes quickly in development

2. Which means you lose resources between requests, unless you stuff them into the interpreter/httpd itself. And anyway this is only a problem for PHP, where by default everything runs in the top-level namespace, versus in separate functions.

3. That's a funny way to spin "no concurrency", but you can get that in any language by just not deploying it threaded.


blink blink

um. have you used java?


What, you mean with "hot deployment" of your changes?


Well, even in Play Framework 1.0 in 2009 there was hot reloading. Edit a file, refresh your browser => it's there.


On the Atari ST you didn't even have to hit reload, the webpage reloaded automatically as soon as you saved your file. No need to alt-tab even, just keep webpage open next to your editor. Now get off my lawn.


nope


(I work at Facebook but not on Hack.) There's a lot to be said about backward compatibility, and much of Hack's virtuosity stems from its smooth interoperability with PHP - many millions of lines of it. There's nothing like working on such a large codebase to convince one how difficult disruption of any kind is.

The language definition and semantic checker are difficult, but are stereotypically tasks that cannot be distributed to many engineers; instead, a few senior engineers took that task to the benefit of all others.


What editor does Facebook recommend for Hack? PhpStorm doesn't support it yet, unfortunately.


"I am baffled as to why you'd build your castle atop a crumbling foundation."

Congratulations for admitting your ignorance, and lending open ears to experts as to why they made certain engineering decisions.

Oh wait, you weren't doing that, you were just warming up to go on a diatribe about how stupid Facebook engineers must be.


> Congratulations for admitting your ignorance, and lending open ears to experts as to why they made certain engineering decisions.

Yes, they made certain engineering decisions now because the decisions they made back then were stupid, and they have to dig themselves out.


What was the stupid decision "made back then"?

That Zuck wrote the first version of thefacebook.com in PHP, the language he was the most productive at?

That the initial team didn't rewrite Facebook in Python/Perl/Ruby/Haskell during the fast growth phase? If you have ever experienced the growth phase, you understand how ludicrous the idea of rewrite would be. I've personally experienced and heard only horror stories about rewrites. We underestimate how much hidden wisdom a production code base has and that the messiness is often there for good reasons.


I once worked for a social network that was extremely popular in a particular geographic niche. It was even started before Facebook.

And it was also written in PHP. And we rewrote in Ruby (not rails, at the time it was not mature in the ways we needed it to be). We made a lot of great technical achievements in doing this and our codebase became much much better. On a technical level I don't regret us doing that one bit. We went long on schedule and we made mistakes, but some of the best work of my career went into that and I'm immensely proud of it.

But then, within a matter of months of Facebook opening up to people who weren't at a university or big organization (remember that?), our users, who had gotten bored with our site not changing while we rebuilt all the tech, completely abandoned us, and we went from profitable (having never taken any outside investment) to dead in another couple of years.

This isn't to say that this wouldn't have happened to Facebook had they done something like this, but it is always a huge risk, even if you do everything right.


> What was the stupid decision "made back then"?

Using PHP to begin with was a bad technical decision. Failing to establish a reasonable migration strategy was a bad business decision likely rooted in bad engineering management that fell out of starting with bad technical decisions.

It's much harder to hire people that can pull you out of a mess like PHP when, at the same time, you have to hire people that can keep writing PHP for you.

> That the initial team didn't rewrite Facebook in Python/Perl/Ruby/Haskell during the fast growth phase?

That would have been a good time to bring on new engineering blood as part of scaling out, which would have provided opportunities to enact mitigation and transition strategies. Imagine if the massive amount of talent currently devoted to HHVM had been devoted to Facebook's actual business?

There are migration strategies other than "rewrite everything immediately", and in fact, I'd bet that's exactly what HHVM is. It's just a shame they waited so long that the most cost-effective migration strategy was to tackle an enormously difficult computer science and engineering problem that the world's biggest software companies already invest hundreds of millions of dollars on and provide to the world largely for free.

> If you have ever experienced the growth phase, you understand how ludicrous the idea of rewrite would be.

Yes, and I've also (repeatedly) been the team brought in to rewrite the mess of a code base that was about to torpedo the growth phase.

There's not much correlation between funding, initial success, and engineering talent. Which is why you so often wind up with a mess that has to be cleaned up once you can hire people who know what the hell they're doing, instead of the ones you happened to be stuck with because you didn't know how to grow an engineering team.

> We underestimate how much hidden wisdom a production code base has and that the messiness is often there for good reasons.

Messiness is never there for good reasons other than that replacing it is more expensive than not touching it. You don't strive for hidden wisdom and inescapable messiness -- that's just what you get when you let engineering slip up.


As much as I prefer not to engage trolls...

Which language would have been a good technical decision in 2002-2003? It needs to be fast enough in terms of iteration. It needs to not require more resources than PHP. It must be easy to onboard people who don't know it onto. It needs to be easy to operate, and not be costly to deploy on the tens, hundreds, and then thousands of servers necessary. (Spending time learning a new technology that others think is cool or which seems cool, while trying to also build a product, has sunk more than a few startups...)

What was bad about the decision to keep the reasonably well-performing and reasonably suited-to-purpose PHP code for front-end code, and peel off suitable tasks into services like the feed, typeahead, messages, and so forth into languages like C++, Java, and so forth? What was bad about the decision to let hundreds of software engineers continue to build the PHP code-base while a much smaller group of people worked on projects to improve both the efficiency of the execution environment and the tooling and developer efficiency on that code-base? Their contribution there is multiplied out by the improved efficiency of those hundreds of developers and the code they wrote.

Seemingly Facebook survived its growth phase fairly well and didn't need a tiger team from outside to handle it - and without the reliability problems of others who chose languages other than PHP for their front-end, or who decided to rewrite their much smaller surface area in other languages.

As much as people may dislike PHP (and I'm one of them), it was definitely "good enough". Many languages may not have been, even if they are nicer languages in some objective way.

(Disclaimer: I actually have to write code in Facebook's PHP code-base every once in a while. But most of the code I work with is Python, Java, or C++.)


> Which language would have been a good technical decision in 2002-2003? It must be easy to onboard people who don't know it onto. It needs to be easy to operate, and not be costly to deploy on the tens, hundreds, and then thousands of servers necessary.

You mean, like the JVM? 2003 wasn't the pliocene epoch, we had a working JVM. If you remember back to the last bubble, in the 90s, we were shipping "easy to operate, not costly to deploy" software on Java post-1998. Java 1.4 was released in 2002, and Java 1.5 -- what most people would say is modern Java -- was only 2004.

Scala 2.0 was released in 2006, Clojure in 2007 -- that's 8 and 7 years ago, respectively.

You really don't think there were alternatives during that long period?

> What was bad about the decision to keep the reasonably well-performing and reasonably suited-to-purpose PHP code for front-end code, and peel off suitable tasks into services like the feed, typeahead, messages, and so forth into languages like C++, Java, and so forth.

In 2004? Nothing. In 2005-2008? Things should have been reassessed, especially before building out a millstone of an engineering team around PHP. Instead, Facebook doubled-down on an actively bad language with HPHP, and the results were hilarious:

   HPHPc required a very different push process,
   requiring a bigger than 1 GB binary to be compiled
   and distributed to many machines in short order.

So then in 2010, Facebook decides to embark on HHVM, and now four years later, we can run one of the most correctness-hostile programming languages around, quickly, with optional static typing.

That's a span of 6 years, and at the end of it, Facebook has functionality they could have gotten for free in 2003. On top of that, the intervening years allowed the PHP mess to become only more entrenched -- who on earth do you think the engineers are that accept a job writing PHP, for Facebook or otherwise?

If I had to hazard a guess, I'd guess that HHVM exists because of a large amount of political inertia in the organization that has everything to lose by PHP being eliminated entirely, and the lack of a strong hand by upper management.

I'd guess that lack of a strong hand by upper management came in no small part from hiring straight-out-of-college graduate Adam D'Angelo -- who had literally zero experience -- to serve as CTO from 2006-2008.

By the time FriendFeed was acquired and Bret Taylor along with it (2009), my guess would be entrenched interests made for a very difficult position for anyone wanting to change the ship's course.


"the intervening years allowed the PHP mess to become only more entrenched"

They specifically chose to do it and not move to another language. One of the reasons was: PHP programmers are cheap and plentiful and can do quick iterations.

Sounds a bit like you are complaining about other people's choices; it really is their choice. :-)

I'm not saying it is impossible to move to another language. Just look at PayPal: they moved their customer-facing code from Java to Node.js and got a very large productivity increase: http://www.youtube.com/watch?v=V5yk5SZxWX4

Obviously the reasons PayPal chose Node.js are similar to why Facebook chose PHP: quick iterations mean more iterations, which means more experimentation and better results.


> You mean, like the JVM?

In 2002-2003 I'd have quit in disgust if anyone tried to introduce the JVM anywhere I worked.... I still probably would, frankly.


> Using PHP to begin with was a bad technical decision. Failing to establish a reasonable migration strategy was a bad business decision likely rooted in bad engineering management that fell out of starting with bad technical decisions.

Facebook is a ten year old company with a market capitalization of $170 billion, so "bad" is probably not the most accurate word to describe their technical and business decisions.


> Facebook is a ten year old company with a market capitalization of $170 billion, so "bad" is probably not the most accurate word to describe their technical and business decisions.

How does that follow, exactly? They haven't failed, so any inefficient or sub-optimal decisions were the correct decisions?


I guess we should consider your decisions as the "optimal and correct decisions"? Are you a billionaire too?


A rewrite isn't such a ludicrous idea. Reddit is a prime example of a rewrite from Lisp to Python. I would say that's even a somewhat difficult rewrite.


The Reddit front-end is a relatively simple application. It has a page with a list of stories. It has a page with one story and a bunch of comments. It has a page to add comments. It has an endpoint to vote up or down stories, and to vote up or down comments. And maybe a few other less-interesting things, like admin interfaces.

Facebook is surprisingly easy to underestimate even as someone using it a fair bit.

Just try to find every single interface in the front-end. For your own profile. Feed is front-and-center, as is timeline. Then look at events, groups, &c. Look at messaging.

Then look at the interfaces for managing your privacy and permissions. The apps that you're using and information about when last they requested information. Security like login approvals. Then think about the flows involved in reporting content as abusive or inappropriate. For verifying your identity if you've forgotten your password. For adding more security if you log in from a new computer or from a new location.

Then look at pages, including insights and scheduled posts and so forth. Then look at advertising - boosting individual posts, creating campaigns, &c.. Then look at the interfaces for developers. For translators. Interest lists.

The backends for most of this are in C++ and Java. There's a large amount of data processing happening to track hidden things like spam and scam prevention. But the front-end surface area is quite clearly an order of magnitude or maybe two larger than Reddit (at least where it was when this happened).


Okay, Reddit is a good example of a successful rewrite during a growth phase. I had forgotten that. As I've personal knowledge of several unsuccessful rewrites, it would be great to hear more anecdotes about successful rewrites in growth phase and try to analyse what made them succeed.

Rewrite is more likely to succeed if you are not in a high growth phase, but even then it's risky.


Yes and the OP is criticizing the decisions being made now as "yak shaving." Could it perhaps be that the "yak shavers" made a conscious and well-reasoned decision to go in the direction of "extend PHP" vs "throw it all out and rewrite everything in language-of-the-month?"


I'm sure it was conscious and well-reasoned, but the OP's point still stands.

Trying to build a reasonable forward-looking high-performance managed language runtime platform is hard.

Trying to build it atop PHP is harder.

The only way I could see this as being a smart long-term strategy is if the eventual goal is to isolate and retire PHP projects entirely (and PHP usage itself) over time.

However, even then, with PHP gone, and Hack no longer necessary, you're still stuck maintaining your own incompatible VM / runtime. Is Facebook signing up to reproduce the CLR? Or do they have long term plans to somehow bridge the gap between HHVM and more established VMs/runtimes, where they can better share the maintenance load with the wider industry?


Maintaining your own VM is no big deal. Compared to Facebook, HHVM is a tiny codebase. A team made of relatively small number of people (high quality, but low quantity) can and do maintain VMs like HotSpot and V8. LuaJIT is maintained by a single person.


> Maintaining your own VM is no big deal.

As someone who has worked on a VM, I couldn't disagree more. It can take years to hash out things as simple as ideally performing primitives for a target architecture, and then things change.

Add to that the complexity of optimizing compilers, specification of byte code formats and a consistent virtual machine memory model that can be relied upon across architectures, and the art and science of highly concurrent garbage collectors, and your "no big deal" is a load of hogwash.

Hotspot alone is nearly 20 years of big deal.

> A team made of relatively small number of people (high quality, but low quantity) can and do maintain VMs ...

The number of people doesn't matter in this equation; your small team of (expensive, rare, high quality people) can't build a world-class VM in a day. Or a month. Or a year. Maybe in 5 or 10 years, just ask Microsoft.

> ... like HotSpot and V8. LuaJIT is maintained by a single person.

LuaJIT's said "single person" has been working on it for what, 10 years? It's an extremely impressive implementation and I don't want to bag on it, but even still, it lags in certain areas, eg, its GC implementation isn't up to par with the state of the art.

The author's skillset is extremely rare, and LuaJIT itself is an anomaly in the field. Using such a one-off example doesn't really hold water to prove that it's ideal for a company to internalize maintaining a VM for their own custom language built on top of PHP.


I am not trying to belittle the effort necessary for a state-of-the-art VM or programming language implementation. I get paid to do these stuffs, and I am on my third VM/PL project now. It is also true these things take time and are not very parallelizable, so while the man-months may not be that big, you can't make it faster by throwing more people at it.

On the other hand, I maintain it still is no big deal compared to rewriting Facebook. I also maintain that while the skillset is rare, Facebook apparently has had no trouble so far, and will have no trouble in the future, finding (I remind you, a small number of) people to work on the VM. I also remind you Facebook has been working on an alternative PHP implementation for 6 years now, 2 years in private (2008~2010) and 4 years in public (2010~2014). It has been profitable for them for 6 years, will be profitable in the future, and profitability does not need "sharing maintenance load with the wider industry". They can maintain it fine, thank you very much. Because, in the end, VM is no big deal.


If they're wasting money on bad management decisions, they're wasting stockholder money.

They're also continuing to propagate an outwardly facing engineering culture that will make it even harder to hire people to help dig them out of the PHP hole -- perpetuating this further.

Your argument is simply another take on survivor bias fallacy.

> I get paid to do these stuffs, and I am on my third VM/PL project now ... Because, in the end, VM is no big deal.

You keep saying that, and yet, there keep being so few high quality VMs.


What do you consider to be high quality VM? How many do you expect to see and how many do you find?

Adaptive JIT and generational GC would be a good baseline. Limiting myself to open source VMs, I think (at least) HotSpot, Mono, V8, JavaScriptCore, PyPy, SBCL, Racket qualify. J9, CLR, Chakra, Allegro CL also qualify, but are not open source. SpiderMonkey, LuaJIT, HHVM lack generational GC. All these projects are actively developed, and there are doubtlessly more, e.g. I am not familiar with Smalltalk VMs, some of which are commercial. Research VMs like JikesRVM, Maxine qualify. I believe Bartok qualifies too.

I am not sure what you are arguing for. If you are arguing for Quercus route(PHP-on-JVM), I think it's unclear Quercus route is better than HHVM route. If you are arguing for not running existing PHP codebase, I think you are being unrealistic.


It's not survivorship bias. Facebook is an existence proof that there is no "PHP hole" that they are in, that it's largely a myth propagated by programming language nerds who have never tried scaling a site in PHP. When was the last time you heard about a site closing up because of PHP-induced technical debt? You don't. People re-write sites because of poor architecture, not because of poor programming languages, and PHP (in general) does not prevent you from building a site with good architecture, both from a software structural standpoint and an operational standpoint.

PHP's APIs are ugly. Its language semantics are a bit hairy until you get the hang of it. But there are parts of PHP that are extremely elegant and easy to reason about. Its OOP support provides all that you need to produce re-usable and easily understood code.

Facebook's work on PHP has focused on largely two dimensions: reducing CPU cycles and increasing static/runtime type checking. The former is something that only really matters at massive scale: PHP is generally fast enough since most of the time PHP processes are I/O bound reading from a database or memcached. It's only for sites like Facebook where if you squeeze out an additional 10% TPS from your boxes that you will start seeing large absolute cost reduction that this level of optimization starts to make sense. On the type checking side, this is something you might start to want in any dynamically typed language when you have millions of lines of code and want to ensure basic guarantees that it will run, and is something that you'd probably see Facebook doing if they were a Ruby or Python shop anyway. It has nothing to do with PHP but with the classic dynamic vs static typing tradeoff.

Should you be writing your chat server in PHP? No. But 90% of the code you write for a large website is HTTP response code rendering HTML or JSON. PHP excels at this and you can pretty much hire any developer off the street to start cranking out code if you give them a solid foundation to build on.


Facebook has already proven that they are able to make improvements that have substantially helped them to the point where this team is likely paying for itself many times over. It doesn't need to be perfect - it needs to offer return on investment, and it has.

It's possible that they could eventually get a total rewrite to give a better return, but frankly I don't think you have any idea of the enormity of trying to convert a multi-million line production platform from one language to another.

In any case, one does not preclude the other. Arguably, many of the changes they have made, such as gradual typing, and their ability to now slowly introduce other changes without breaking their existing codebase, mean they have a platform for slowly firming up their codebase and migrating it towards a position where a full rewrite (should they decide to do one in the future) could be made substantially less painful.
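
As a rough illustration of what "gradual" means in practice, here is a sketch using Python's optional type hints as an analogy. It is not Hack syntax, and `legacy_total` and `format_total` are invented names: the point is only that annotated and unannotated code can live in the same codebase and call each other.

```python
from typing import Optional


def legacy_total(items):
    # Unannotated "legacy" code: a checker treats these values as dynamic.
    return sum(item["price"] * item["qty"] for item in items)


def format_total(total: float, currency: Optional[str] = None) -> str:
    # Annotated code: a static checker can verify how this is called.
    prefix = f"{currency} " if currency else ""
    return f"{prefix}{total:.2f}"
```

Annotations can then be added one function at a time, so the checked portion of the codebase grows without a flag day.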


My example, "OCaml", is not a "language of the month." Its roots are >30 years old and it wasn't developed by someone in their basement over the weekend. As stated, Facebook even uses OCaml, among other languages, for good reason.


You try hiring and/or retraining enough engineers to be able to make a switch to OCaml, and see how much it'll cost you.

I detest PHP, but I've still more than once made the choice to do apps in PHP motivated by developer availability alone.

It's not a great language, but with some discipline it is also not nearly as awful as some people like to think, and you can make up for a lot of awful with the difference in ability to hire experienced engineers who know PHP vs. many of the less common languages.


I don't like how your comment implies that "throw it all out and rewrite everything in language-of-the-month" is the only option. You could also "throw it all out and rewrite everything in better-AND-mature-language" or even "throw the worst parts out and rewrite incrementally in better-AND-mature-language".


I just want to clarify something: I am not calling anyone stupid. Generally the engineers at Facebook are smart, talented, and good at getting things done.


I think PHP has some goodness that makes it the poor man's Java.

Neither Ruby nor Python provides that, for instance.

Though Ruby is in my opinion well designed, and with duck typing you might not need all that Java-like OOP, I don't know, I feel like having interfaces makes me understand code faster and better.

Want to understand what an object does? Just read the interface: no magic, no bullshit. It's self-documenting, and that's important. That's why you can have these huge codebases like Zend, Symfony or Doctrine and still understand how complex elements work together, and even figure out how to use them without docs, just like Java.

I feel I can't go to Django's or Rails' source code and understand everything easily just by reading function signatures.

And the write/refresh cycle makes iteration pretty fast during development. But yeah, PHP's basic syntax sucks for sure.


Facebook is doing an admirable job replacing rotting pieces of their infrastructure with more robust replacements.

HHVM replaced the execution environment for their code with a more robust code generation/ runtime system.

Hack allows them to bootstrap their code base into higher degrees of reliability without a mass rewrite.

Also Bryan O'Sullivan is one of the best people on the face of the earth to be trying to find practical ways of integrating PL research into practical engineering.


Are not coral reefs both beautiful and built upon the dead bodies of those who came before?


People also cut themselves easily with them.


Yes, but unlike Facebook, they are largely submerged in seawater.


Facebook is literally 60% water, whence they came. Pretty wild when you think about it.


That 60% figure pertains to average humans. Those skinny hipsters are mostly calcium :)


Like coral reefs! And hipsters are built from the dead bodies of vintage things (that you probably never heard of because they're pretty underwater/ground/whatever).

Ok, I cede the point. PHP has the retro chic and Hack's type system is fixed-gear.


>I have wondered why FB didn't use a proper language with proper typing to begin with.

Because there are factors like an existing codebase that works that trump BS idealism.


(Disclaimer: FB employee.)

Do you also argue that C++ is built on top of a crumbling foundation, because it's based on C, an ancient language with almost no type-safety?


Why do you say C has no type safety? Only way around the type system is void, explicit casts and I guess unions.

It admittedly doesn't have a very advanced type system, that's true.


> Only way around the type system is void, explicit casts and I guess unions.

And typedefs. Given that there's no parametric types, you run into void* quite frequently as well, so saying "only" inaccurately minimizes the scale of how much C code isn't strictly type safe.


Right, void* is used quite a lot for "generic" data structures. But I'm not sure that's what he meant. The reason I said "only" was that from my own experience, most C data structures are tailored for a specific use and so I don't see void* too much in this context.

And may I ask why you say "typedef" is unsafe? It is merely a type alias, like e.g. Haskell's and ML's "type", or isn't it?


> It is merely a type alias, like e.g. Haskell's and ML's "type", or isn't it?

It is, but it freely allows conversion between the aliased types.

    typedef int Feet;
    typedef int Meters;

    int main() {
      Feet height = 6;
      Meters inEngland = height; // <-- OK. :(
    }


I'm pretty sure you and the parent were thinking of Haskell's `newtype` not `type`

    λ> :{
    Prelude| type Feet = Int
    Prelude| type Metres = Int
    Prelude| :}
    λ> let a = 5 :: Feet
    λ> let b = 4 :: Metres
    λ> :t a
    a :: Feet
    λ> :t b
    b :: Metres
    λ> a + b
    9
    λ> let f = id :: Feet -> Feet
    λ> f b
    4


C++ attempted to fix type safety issues in C.


You don't need to add a disclaimer about your workplace. It doesn't contribute at all to your question.

I do not think C is a "crumbling foundation". I would not suggest that C be used for large scale engineering efforts, but it's a relatively well-defined language with semantics dictated by a formal standard. A lot of research has gone into C compilers, which are state-of-the-art.

With that said, one of the biggest complaints I hear about C++ is that it still has the legacy of C embedded in it.


Something I didn't really learn in school, that I only picked up much later in my career, is that when applicable, applications should be built on top of mathematical models. A very contrived example would be.. would you build an ad rotator by keeping counts of all banners served and picking the next banner based on those counts and your weighting rules, or would you build the rotator on a foundation of statistics and probability with a little extra logic for handling caps and edge cases?


> would you build an ad rotator by keeping counts of all banners served and picking the next banner based on those counts and your weighting rules, or would you build the rotator on a foundation of statistics and probability with a little extra logic for handling caps and edge cases

Actually, building on a foundation of solid statistics and probability will often result in an algorithm that essentially counts views and applies weighting rules.
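
As a rough illustration of that point, here is a minimal Python sketch of a rotator that samples banners by weight and still ends up keeping per-banner counts to enforce caps. The class name `AdRotator`, its parameters, and the weights and caps are all hypothetical, made up for this example rather than taken from any real ad system.

```python
import random


class AdRotator:
    """Pick banners at random in proportion to their weights, honoring caps."""

    def __init__(self, weights, caps, rng=None):
        self.weights = dict(weights)   # banner -> relative weight (assumed > 0)
        self.caps = dict(caps)         # banner -> maximum number of impressions
        self.served = {banner: 0 for banner in self.weights}
        self.rng = rng or random.Random()

    def next_banner(self):
        # Only banners that have not reached their cap are eligible.
        eligible = [b for b in self.weights if self.served[b] < self.caps[b]]
        if not eligible:
            return None
        # Weighted random choice among the eligible banners.
        banner = self.rng.choices(
            eligible, weights=[self.weights[b] for b in eligible]
        )[0]
        self.served[banner] += 1
        return banner


rotator = AdRotator({"a": 3, "b": 1}, {"a": 5, "b": 5}, random.Random(0))
picks = [rotator.next_banner() for _ in range(12)]
```

Even in the "probabilistic" version, the `served` counts do not go away; they just move from driving the choice to handling the caps.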

As another example, Pagerank has a nice theoretical basis, but power iteration reduces to perhaps 5-10 lines of C.
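
To make the "5-10 lines" claim concrete, here is a minimal sketch of power iteration, written in Python rather than C for brevity. The function name `pagerank`, the damping value of 0.85, the iteration count, and the toy graph are illustrative assumptions, and the sketch assumes every link target also appears as a key in `links`.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each node to the list of nodes it links to."""
    nodes = list(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by dangling nodes (no outgoing links) is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links[node])
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        # Each node passes its rank on in equal shares to the nodes it links to.
        for node in nodes:
            for target in links[node]:
                new_rank[target] += damping * rank[node] / len(links[node])
        rank = new_rank
    return rank


graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
```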


I built an ad engine based on probability. But I am not sure which way is better.



This seems an unequivocal improvement for Facebook, since they're unwilling to move away from PHP. The better question is why would anyone else choose to build their company on this?


For the same reason there are companies writing software for Windows. Or companies putting Windows on all their workstations.


As far as I can tell they are using local inference, which is basically just unification. The set of possible types seems pretty narrow[1] as well so I don't see much room to go wrong. You're right that there is a lot of mathematical theory about type systems and that inference can easily go wrong (be undecidable) but that is mostly for type systems that try to do inference for higher rank polymorphism and other things, which it doesn't seem like Hack is. Also I guess the language is supposed to be a superset of "valid" PHP, although I don't know whether this is true without modifying the PHP program much.

[1] http://hacklang.org/manual/en/hack.annotations.types.php


Local inference isn't just unification. In particular, most local inference algorithms are designed to work with subtyping, which doesn't work in ML-like type systems.


I assume that "local inference" means, in practice, "no let-polymorphism". That's what causes most of the headaches with extensions to Hindley-Milner (including the undecidability of subtyping).


True, I didn't see any mention of subtyping on there but since it's PHP I guess they have to deal with it somehow.


I have seen many python programmers that are PHP haters, are you one of them? just curious, no offence. I like both Python and PHP, but use PHP for commercial applications.


I very occasionally write Python code if my job calls for it, but I am not a fan.


Spelling correction: I said "layering FP" but I meant "layering PHP". The edit window has since come and gone.


"And in the end, it's still PHP, which is duly disliked."

You might dislike it, but that doesn't mean it's disliked. PHP has a giant base of programmers, scales, is easy to learn, is extremely versatile and powerful, and as you point out, the code was already in PHP. Only an idiot would rewrite a giant working codebase simply to have it in a language that's "difficult to hire and train" in. Or any language for that matter. Perhaps they could have pulled a Netscape but instead decided to serve billions.


>> I am baffled as to why you'd build your castle atop a crumbling foundation

I think you're using a wrong metaphor. Facebook foundations can't be crumbling just because they're made in PHP. There wouldn't be Facebook as we know it today otherwise. You might say they used a "low quality" material to build them. I see Hack more as a better material, that can also bind with the previous one and make it stronger.


I think the short story is that the engineers at Facebook feel pretty productive with their stack today. Making a new language that basically fixes up PHP is ideal for them because it gives them good confidence they can get the benefits of static analysis without sacrificing much productivity. That level of productivity + the sheer numbers they have make Hack a more attractive option.


If I recall correctly, PHP is banned at Amazon, even for internal tools, mainly for security reasons. The team that gives security clearance won't even take a cursory look at the service if it depends on PHP in any way.

Disclaimer: I have zero experience with PHP.


Because your perfect OCaml, ML, Haskell, and any other fancy, magical, fabulous, eternal, fantastical, simplest, elegant ... languages are all atop crumbling foundations implemented in ugly, stupid, out-dated, evil, chaffy ... C, C++, and assembly running on inefficient, silly and fragile digital logical CPUs.

And when Facebook uses this stupid technique to build the world's largest social network for more than one billion users, those elegant and perfect solutions are serving ... how many?


> (3) a corresponding runtime for each

That's not really true, HPHP and HHVM share the runtime (mostly).


> But at the same time, layering FP with a home rolled static type checking server (??) is bug prone

Clearly the home rolled stuff is working out for facebook. Twitter was/is on RoR and continues(!!) to have major downtime issues.


Twitter could have been written in any language and worked, but given their old architecture decisions, they'd have been just as likely to mess it up in every language.

They blamed RoR a lot because they needed an explanation for their problems, but "many to many" messaging even at their scale is a "solved" problem and has been for decades and is fairly easy to scale.

(Think of Twitter as a bunch of mailing lists and mailboxes; you scale it by decomposing it: map accounts to virtual "buckets" that become the domain part, and map tweets to messages; break apart large follower lists into smaller ones and introduce a forwarding reflector; break apart large following lists by splitting "mailboxes" and doing zipper merges on reading them; add a caching layer -- this is not rocket science, and you could do it properly in any language)
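
The "zipper merge on read" step could look roughly like the Python sketch below. It assumes each split mailbox is already stored newest first as `(timestamp, text)` pairs; the function name `read_timeline` and the sample data are hypothetical and only there to show the idea.

```python
import heapq
from itertools import islice


def read_timeline(mailboxes, limit):
    """Merge per-shard mailboxes, each already sorted newest first."""
    # heapq.merge is lazy, so only the messages actually returned are consumed.
    merged = heapq.merge(
        *mailboxes, key=lambda message: message[0], reverse=True
    )
    return list(islice(merged, limit))


shard_a = [(105, "alice: shipping today"), (101, "alice: hello")]
shard_b = [(104, "bob: lunch?"), (102, "carol: back online")]
timeline = read_timeline([shard_a, shard_b], limit=3)
```

Because the merge is lazy, reading the first page of a timeline costs roughly the page size times the log of the number of shards, regardless of how large each mailbox has grown.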

Note: I think RoR was a horrible choice for them, though I love Ruby, but I also don't for a second believe RoR was their real problem.

Edit: Your overall point stands, though. Especially given that Facebook is a far more complicated application.


Twitter might have some legacy Rails hanging around, but it's a beastly Scala system these days.


What if you could build a castle on top of a highway?


I'm the manager of the team that developed Hack, and I'm sitting here with some of the language designers. Happy to answer your questions.


Hi Bryan, I know most people know you from your prolific work on many great Haskell libraries ( Criterion, Attoparsec, Aeson, ...). Did Haskell have any role in the development of Hack? Looking at the code base it seems like the type system is primarily written in ML, what made the team decide to use OCaml over Haskell?


As you note, the team developed the typechecker in OCaml, as that's what the founding engineers were familiar with. Many of ML's cousin languages happen to be well suited to this kind of work.


Have you considered using some standard library replacement like Core?


Say I'm starting out with an entirely new project and want to leave the legacy of dynamic typing behind. Is there a flag available to enforce the use of type signatures, causing Hack to throw a compilation error when they're omitted?


Yes, "<?hh // strict" at the top of a file will do the trick.


Great! Thanks.


Do you imagine a future where Hack will merge back into PHP (like PHP 7), in the same style that Beryl & Compiz then rejoined? Or does the team intend for the two to always be adjacent-yet-separate?


HHVM developer & PHP runtime developer here. I've got hands in both runtimes and all even I can say is: Maybe.

I think the most likely outcome is that PHP will adopt some of HHVM's additional features, but remain a separate project. IMO that's a great outcome, since we'll both likely drive the other to be better.


HHVM is not a fork of PHP. To my knowledge it's a radically different codebase that reimplements PHP's API.


My impression is that Facebook mostly write their stand-alone services in Java or C++, and are using PHP only where they're "stuck" with it due to a large existing code base.

Do you think Hack is a good language to start a new project in, compared to non-PHP languages? Are you using Hack for things besides the main web page?


Engineer working on Hack here.

Yeah, I think Hack is a good language to start a new project in. For as much flak as PHP gets, there are actually a lot of good things about the language. The fast development cycle -- edit php script, refresh -- is something amazing that you don't get in a lot of statically typed languages, which usually have a compilation step. The crazy dynamic things you can do also occasionally have their place, though it's certainly easy to shoot yourself in the foot.

On the other hand, a lot of the time you want the safety that strong static typing can give you. Even just the null propagation checking can immediately find tons and tons of silly little bugs without even running the code, and ensure that the code stays consistent as a "mini unit test" if you will.

Hack hits the sweet spot of both. Wiring the Hack typechecker into vim was really revolutionary for me -- having both the immediate feedback of the type system for all the silly bugs that I was writing, along with the fast reload/test cycle from PHP, is great.


Er, `paste serve --reload` restarts small-to-medium Python projects faster than I can alt-tab, which is actually faster than my static blog engine can regenerate itself too.


The Facebook codebase is not exactly a small-to-medium project, so there is a fair bit of value in edit-and-reload.


The parent comment was about Hack's being good for a new project.


Is there a statically typed variant of Python that would work with existing web servers, etc.? I am aware that there's Cython and I know that py3k technically permits type annotations (which JetBrains' Python IDE uses quite effectively), but that isn't true static type checking in the way Hack does it.


Statically-typed Python would ruin a great number of Python libraries you'd probably want to be using. It'd be a very different language.

Once you have type annotations, it wouldn't be too much of a stretch to enforce them statically with a separate tool. You could even go as far as rejecting first-party code if you can't statically determine every single value's type. Pylint's underlying astroid library has a bunch of inference tools you could perhaps build on top of.


Such a tool would be no less difficult to build than hack itself, though. Hack's "gradual typing" solves the problem of re-using existing code that you've mentioned.

FB already had a PyLint-like tool earlier that could do some static analysis, namely pfff (also open source and written in OCaml), but it did not provide a full-on static type system like Hack. (Background: I used pfff when I was in bootcamp at FB itself. This was however prior to hack, I worked solely on C++, Java, and a bit of Python at FB after bootcamp).

I am sure if FB started off with Python, a similar solution could have been found, but if you're looking for a tool that exists _right_ now, Hack is actually quite decent.

Creating a static type system, implementing local type inference, and working out "gradual typing" and its associated problems (all while being able to do type-checking at the speeds developers _expect_) is not a trivial problem.


The announcement post says the actual type-checking happens in a persistent server that watches for filesystem changes. That sounds pretty close to continuously running a linter. It also says "without breaking things", so I'm a little fuzzy on whether badly-typed code will actually execute or not.

For that matter, can you call a typed function from an untyped one? Or, worse, a typed method? If the typing is purely static, there's no way to know the method you're calling is actually typed, so there's no actual guarantee that it receives the types it's declared unless your entire program is typed. It doesn't seem like a very strong guarantee if both the caller and the callee have to opt into the typing.

If you're looking for a tool that exists right now, you either have an existing codebase and can't port it to Hack if it's not already PHP anyway (for the same reason Facebook couldn't port away from PHP), or you're starting from nothing and could just use a statically-typed language in the first place.

I don't know if I'd even be excited about the prospect of optional static typing in Python. (It hasn't gotten me interested in Dart, for example.) I'd kinda rather see the effort poured into something that could do static duck-typed analysis/inference, e.g. balk if I pass an argument that could be a non-string into a function that tries to call `.startswith` on it. (Ah, but maybe it could theoretically be a string or None, and I only know it isn't None for reasons the type system can't see, and now I hate the type system.)

I didn't say it was a trivial problem. I just don't feel excited by the solution.
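
To make the `.startswith` scenario concrete, here is a small Python sketch using standard `typing` annotations. The function names `is_comment` and `shout` are made up for illustration. A checker such as mypy would flag a direct attribute access on an `Optional[str]`, and the second function shows the "I know it isn't None, but the checker doesn't" case.

```python
from typing import Optional


def is_comment(line: Optional[str]) -> bool:
    # Without this check, a static checker such as mypy reports an error on
    # line.startswith, because the value may be None.
    if line is None:
        return False
    return line.startswith("#")


def shout(line: Optional[str]) -> str:
    # When the caller knows the value cannot be None for reasons the checker
    # cannot see, an assert narrows the type (and fails loudly if wrong).
    assert line is not None
    return line.upper()
```

The annotations do nothing at runtime; the checking only happens if a separate tool is run over the code.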


The default mode in Hack is partial: in partial mode, the code itself must be typed and must pass the typechecker, but it can call untyped code (that's in a separate compilation unit).

Another mode (you specify the mode per file/compilation unit) is "strict". In strict mode, you cannot call any untyped code (note that the standard library is typed with Hack).

(There is a bit more nuance here, but you can read that in the documentation.)

So the idea is to eventually migrate most of the code to strict, but code that relies on legacy can remain partial and you can write new code without waiting for a re-write to finish.

See http://docs.hhvm.com/manual/en/hack.modes.strict.php http://docs.hhvm.com/manual/en/hack.modes.partial.php

"Shapes" are also a neat feature specifically for parts where static typing can be frustrating for dealing with HTTP requests specifically: http://docs.hhvm.com/manual/en/hack.shapes.php

(FWIW I don't see myself using Hack, but I'm not a web developer. I'd say the ML family languages are my favourite, but for what I do day-to-day it's not really an option.)


Do you know if other languages have static null checking i.e. your null annotation/propagation (apart from Haskell's Maybe union type etc)?

I'm intrigued, because it's such a good idea (especially when null's originator claimed it was a "billion dollar mistake"), though Java doesn't have it. I'm wondering if there's some subtle problem with it...?

Also, how do you make your vim typechecker fast enough? Usually, even syntax colouring is local to ensure adequate performance - and a typechecker with inference/propagation would be very non-local.


Java 8 adds this exact feature through extending annotation capability and adding hooks for pluggable type checkers, including a null propagation checker: http://docs.oracle.com/javase/tutorial/java/annotations/type...

and an Optional<T> class (references to which can still be null for maximum hilarity): http://download.java.net/jdk8/docs/api/java/util/Optional.ht...

I'm stoked. And disappointed the "elvis" operator didn't make it in.

I hope some day Java breaks backwards compatibility and eliminates null entirely. Then again, that's already happened with the proliferation of other JVM languages. But that doesn't help me at my day job, where we have a large legacy code base... which would need to be ported to a backwards-incompatible version of non-null Java anyway. Hm.

Well, with Java 8 I can at least start to grow null-safe code within our codebase.


>The fast development cycle -- edit php script, refresh -- is something amazing that you don't get in a lot of statically typed languages

You seriously think that? That's how we do haskell web development. Both yesod and snap do this out of the box. That's how every java developer I know works.


I have a little bit by way of Haskell chops, and I'll venture that the performance of the Hack typechecker is a very big deal, and it is in a different breed than the turnaround time you get from snap or yesod (or Java).


what makes it a different breed?

Edit: I misread what you wrote. I thought you were saying there was something fundamentally different about the type checker, but it's more that the reload experience is different than what you get with Yesod.


"Type checking is faster than other statically typed languages" is quite different from "other statically typed languages don't offer this workflow", which is what was claimed.


How do you do it in java? Compile (takes time) -> hotswap (takes time), or can't hotswap since changes to signature, will need to restart server (takes lot of time).



Yeah, that's the one we're using. Still takes a couple of seconds, though. And as I said, big changes can't be reloaded, so the whole server will have to be restarted.


>Still takes a couple of seconds, though

How big is your code base, and is it all in one huge file or something? Our stuff is reloaded and ready before I've alt+tabbed back and hit refresh.


Thanks for taking the questions!

How extensively is Facebook using Hack at this point? Is it in production?

What has been the biggest learning/unlearning you've needed to do going from PHP to Hack?


100% of our web front end developers use Hack now. This has been an organic process of growth over the past year, by which I mean our engineers are using it because they like it and see value in it, not because there's someone standing over them with a big stick :-)

The biggest learning step for our engineering teams was to treat type errors from Hack as actual logic errors. We have a collection of "linters" that provide advice on code style and other nice-to-have factors. Some people (quite reasonably) initially thought of Hack errors as lint-like stuff that it was safe to ignore, when in fact they indicate real logical inconsistencies in code.


> Some people (quite reasonably) initially thought of Hack errors as lint-like stuff that it was safe to ignore, when in fact they indicate real logical inconsistencies in code.

Interesting, thanks!

One followup: what has it been like from an ops perspective? Similar to PHP, or is there a better frame of reference?


Can you rephrase your question about the ops perspective? Is there something in particular you want to know about?


Sure. I guess I'm wondering how it compares to running plain old Apache/PHP in a production environment. Or, is it more like a Django/Rails stack? Does it use the same memory footprint as PHP, etc?


Facebook hasn't used plain old Apache/PHP in production for several years (HipHop for PHP was announced in Feb 2010), so it is hard to compare.

HHVM is its own web server (although it supports FastCGI for easier use with existing infrastructure like nginx), and that's how we use it in production. It's hard to compare memory usage, except at scale, where it benefits from not having a whole bunch of interpreters (in different processes) running at the same time and some other benefits by using more appropriate types to store values through type inference.


On the scale of a single request (doing stuff from the command line), most benchmarks I've seen are about half the memory footprint. Obviously that's going to vary program to program ($data = file_get_contents("some2gbfile.txt"); is going to take 2GB, regardless), but for "normal usage" 1/2 looks fairly common.

On a webserver, that goes further since HHVM is single-process/multi-thread, whereas PHP (in its typical setup) is multi-process/single-thread. This cuts HHVM's memory overhead much further.

Yes, PHP can run multi-threaded, but it still has a number of instability issues in that mode.


At a glance it looks like Hack is to PHP what TypeScript is to JavaScript. Is that a fair analogy?


Yes and no. Yes, because TypeScript is bringing a type-system to a dynamically typed language and so did Hack. No: because Hack is bringing some additional language features affecting the runtime. Modest changes for now, but we intend to carry on in that direction.


> No: because Hack is bringing some additional language features [...]

TypeScript added classes, interfaces, modules, and arrow functions.


The key difference is affecting the runtime. All those typescript features can be compiled down to regular JS and typescript-generated javascript can be run in a browser without extending the JS engine.


Well, it seems Microsoft is doing something similar with TypeScript, except that they are waiting for the spec to catch up. I like both ideas.


I don't know what you mean by that... stuff like classes were already in the harmony proposal phase before Typescript implemented them. Stuff like type hints won't be in JS ever.


At least to date, Microsoft stresses the "TypeScript is just JavaScript." New language features are added to ES6 and then wrapped with types in TypeScript.


Great work!

A question about the type inference mechanism: in PHP, while it is possible to define object interfaces which classes may be defined against, by the dynamic nature of the language, functions don't necessarily need that interface specification to accept an object conforming to it, explicitly or implicitly. OCaml provides something "similar" with its object system, but much more powerful, with static inference of an object (super)type from its usage. Will Hack be able to infer an object interface as well from its usage in a function?


There seem to be 2 questions here: 1) Is Hack type inference total? Answer no: you must annotate parameters and return types.

It would be pretty much impossible to implement total type-inference without losing separate compilation in Hack. PHP projects are not organized around a module system, which means that you have "spaghetti" dependencies, and even better, cyclic dependencies all over the place.

So trying to implement total type-inference would be a bad idea, you would not be able to separate the code in independent entities and the checker would not scale.

2) Does Hack support structural sub-typing? Answer: No, but not for obvious reasons.

Fun fact, the first version of the type-checker was implementing structural sub-typing. And it was not scaling, for subtle reasons.

Hack allows covariant return types, so if we implemented structural sub-typing we would have to "dive" into the return types of each method to see if they are compatible. But in turn, these objects could have covariant return types etc ... The process of checking that was too inefficient. Caching is a bad idea (or at least a non trivial idea to implement), because of constraints and type-variables.

Since disallowing covariant return types was not an option (it was crucial to make a lot of code work), we had to kill structural sub-typing.

I hope this answers your question. As a big OCaml fan myself, I like the features you just mentioned (Well, Hack is written in OCaml), but they really didn't seem to be a good fit due to the nature of the language and the kind of checking speed we were shooting for.
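
For readers unfamiliar with the term: under structural sub-typing a class satisfies an interface simply by having the right methods, without declaring that it implements anything. The sketch below shows the idea in Python (not Hack) with `typing.Protocol`; the names `HasF`, `C`, `D` and `call_f` are hypothetical, and it says nothing about how the early Hack checker was actually implemented.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasF(Protocol):
    def f(self) -> None: ...


class C:
    # C never mentions HasF, but it has a matching method f.
    def f(self) -> None:
        pass

    def g(self) -> None:
        pass


class D:
    # D has no method f, so it does not match HasF.
    def g(self) -> None:
        pass


def call_f(obj: HasF) -> None:
    obj.f()


call_f(C())  # accepted structurally: C has f
c_matches = isinstance(C(), HasF)
d_matches = isinstance(D(), HasF)
```

Note that the runtime `isinstance` check only looks at whether the method exists; comparing full signatures, including return types of returned objects, is the part a static checker has to do, and that is where the cost described above comes from.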


Thanks! I was mostly thinking about point 2, and I understand your motivations in going in a different direction after trying it. Very good and enlightening answer!


Engineer working on Hack here.

We don't do any type inference across function boundaries, so we largely dodge the issue that I think you are getting at. (Please elaborate if I misunderstand!) We rely on interface and class definitions in order to know what methods are available, and even though the runtime resolves everything at runtime so you can call any method that happens to exist at that time ("duck typing"), we enforce statically that methods do exist where we can. So for example, although the following code will work at runtime since the method `g` does exist, our static typechecker will reject it since `g` does not exist on type `I`.

  <?hh // strict
  interface I { public function f(): void; }
  class C implements I {
    public function f(): void {}
    public function g(): void {}
  }
  function f(I $i): void {
    $i->g();
  }
  f(new C());


Thank you! You brought an interesting point with the example you gave, it is one of the things I was thinking about. Thank you for your work, and for making it available!


Why did you pick the name "Hack"?


because it is a HACK !!?!


It involves PHP so...


Pretty star-studded cast you have on the core team, there.

What's your motivation, aside from modernizing Facebook's code base? What language niche does Hack serve which is not served by other languages? Why Facebook at all (aside from the rarity of finding a company to pay you to write a compiler :))?


The sweet spot that Hack hits is that it combines gradual typing (an idea that hasn't yet seen much real-world adoption) with an incredibly fast typechecker.

This lets you choose the pace and extent to which you want to adopt the safety of static typing, while preserving your dynamically typed code -- and without sacrificing the rapid turnaround of PHP.

That's a unique combination, in my experience.


How does your approach compare to Typed Racket and Typed Clojure? Could they conceivably achieve the same performance or is there a fundamental difference?


Does Typed Racket have DrRacket support for instant feedback on errors? (It didn't 4+ years ago when I used Racket.)

Typed Racket also includes more types, like ": Integer [more precisely: Negative-Fixnum]" (from docs), as well as polymorphic data structures and higher-order functions. I don't know if this is a "fundamental difference" but it might mean Facebook's type checker can optimize in certain ways that Racket's cannot.


Yes, Typed Racket does have that -- DrRacket continuously expands in the background, so Typed Racket gets it for free by integrating into the macro system.

The Typed Racket type checker is much slower than Hack, though.

Hack does seem to have polymorphic data structures and higher-order functions.


It looks like you've done a good job with closures, type inference, etc...

Would you consider adding even more expressive features like algebraic types or hygienic macros to the language?

Also how likely do you think it is that typical PHP will eventually be overtaken by HHVM? I'm hoping it will!


Why use

public string $x = '';

instead of

public $x:string = '';

It seems inconsistent to me, probably because I've used AS3/Haxe.


PHP is based on C. It's way more consistent with C, Java, C#, etc. to do the former. So the real question is why

    function foobar() : int { .. }
instead of

    function int foobar() { .. }


1. function foobar() makes it easier to grep for. 2. If I recall correctly, it made some parts of the grammar easier.


Yeah, you could go either way on it really. It seems weird to mix them, though!


It's not just inconsistent with other languages, it's inconsistent within their own language.

Property types are declared before the property name, while function return types are declared after (and with a colon).

Very strange.


If you look at the original C syntax for function types, it is also a bit weird: part of the type (the return type) is placed before, and part (the parameters) after.

That syntax in Hack seems more "functional" to me. That said, I must admit that I kind of like the C syntax, because of its oddity.


But inconsistency is consistent with PHP's philosophy ;-)


I too programmed for a long time in AS3/Haxe, and while I prefer variable:type as I think it's more readable, most static languages (Java, C#) do it the other way.


Yeah, I'm just not sure why they have public function get_name(): string but use public string $x = '';

The inconsistency drives me a bit crazy.


Thank you for your work on this and for open sourcing it. It looks like it can make large code bases in PHP a great deal more manageable.

Do you expect Hack to be stable without large breaking changes going forward?

The documentation doesn't say much about the scheduling for asynchronous tasks. Can async functions be used to batch requests to e.g. caches and databases? Can Awaitable be used to interface with code that expects chainable Future or Promise-style interfaces?

Can HHVM/Hack use standard PHP extensions, or how much work is it to port an extension?

Is there a Hack plugin for IntelliJ?

Does the Hack project have a mailing list or forum?


"Can HHVM use standard PHP extensions?"

Our internal plumbing is different, so the extension APIs aren't the same, thus PHP extensions can't normally be used with HHVM. Paul has a source-compatibility transformer which works for simple extensions (no fancy stuff allowed), but we recommend anyone with an extension consider porting the code.

If your extension is one of those "This was written in PHP but we wanted it to be faster so we turned it into C" types, you should consider going back to PHP for it. In practice, we find that HHVM can JIT PHP code into running about as fast as (sometimes faster than) C/C++ extension code.

If you do need to dip into C/C++ (because you're calling an external library), the API we've designed is a LOT easier to work with than PHP's. And I say that with the authority of having written THE book on writing PHP extensions: http://amazon.com/dp/067232704X .

I'm working on proper documentation for extensions as we speak, but for now you can find some very basic info at https://github.com/facebook/hhvm/wiki/Extension-API .


I like the async/await additions. Seems very similar to the .NET implementation. Was this an inspiration?


Yep, we're happy to be inspired by good ideas when they're obviously the right path to walk.


With async functions, I'm not sure if the documentation is lacking or I'm just slightly thick; I believe the latter.

(I'm not a PHP developer, so excuse my ignorance.)

It seems you only have one method, 'await', to check whether the async function has completed its job (which itself, if I'm understanding correctly, is a blocking operation in whichever thread/worker it's called).

So is there a way to run something more similar to

      $i = 1;
      while (doingSomeThing)
      {
             $b = "";
             if (completed gen_foo())   // hypothetical non-blocking "is it done?" check
             {
                    $b .= await gen_foo();
                    do_critical_task($b);
             }
             else
             {
                    do_something($do_item[$i]);
             }
             $i++;
      }
Because as I see it now, with whatever I run asynchronously, i.e.:

       gen_foo(get_user_data); //async
       $x = do_something_for_a_while(); //do something while gen foo processes
       do_new_thing($x, await gen_foo()); //do something with rendezvous result
I'm forced to time my async calls so that they'll rendezvous in the same place, won't I?

Again, if I'm completely off the mark, please let me know.


async/await allows cooperative multitasking on a single thread: stay tuned for more...

The way it works is quite different from your example:

  async function gen_foo(): Awaitable<string> {
    echo "until we get to an await, eager execution\n";
    // ...
    await gen_bar();
    // this code is only executed after await
    return 'result';
  }
  $x = gen_foo(); // x is a handle, suspended at the point of its first 'await'
  await $x; // ... and now it's resumed
  
The benefit comes from being able to batch these async functions together:

  list($x, $y, $z) = await genva(
    gen_foo(),
    gen_bar(),
    gen_baz()
  ); // genva creates a wait handle for awaiting its args

  // ... and when we get here $x, $y, and $z are all assigned
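
For readers more at home in Python, the batching idea can be sketched with asyncio. This is only an analogy, not Hack: asyncio.gather stands in for genva, the gen helper and its names are made up for illustration, and unlike the eager execution described above, a Python coroutine does not start running until it is awaited or scheduled.

```python
import asyncio

log = []  # records the order in which the coroutines run


async def gen(name: str) -> str:
    """Hypothetical stand-in for gen_foo/gen_bar/gen_baz."""
    log.append("start " + name)
    await asyncio.sleep(0)  # suspension point: control returns to the event loop
    log.append("end " + name)
    return name


async def main() -> tuple:
    # gather plays the role genva plays above: a single awaitable covering
    # all three calls, with results delivered in argument order.
    x, y, z = await asyncio.gather(gen("foo"), gen("bar"), gen("baz"))
    return (x, y, z)


result = asyncio.run(main())
print(result)
print(log)
```

Everything runs on one thread; the log shows the three calls interleaving at their suspension points rather than running one after another.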


Generics, lambdas, type annotations. This looks awesome, seems like the next generation for PHP programming.

Can I embed it? Or extend it via compiled binaries written in C/C++?


I'm working on proper documentation for extensions as we speak, but for now you can find some very basic info at https://github.com/facebook/hhvm/wiki/Extension-API .

As the author of pretty much THE book on writing PHP Extensions ( http://amazon.com/dp/067232704X ), I can say with a fair bit of authority that writing HHVM extensions is SO SO SO much easier.


Is this just syntactic sugar, or are there performance improvements over PHP?


Static types are more than syntactic sugar: they allow you to make additional guarantees about the behavior of the code, like having unit tests without actually writing them.

I program in PHP every day and miss static typing (as an opt-in mechanism). Sadly I'm on Windows and use Oracle, so HHVM is not applicable.


We are just beginning to use the static type information that Hack provides to yield performance optimizations, and that's definitely on the cards as something we'd like to push further.


Hi and thanks,

Are you planning a Windows build?

Do you think Hack could replace all PHP stacks (PHP cloud stacks, not talking about shared-host WordPress toys) within a 5-year time frame?

What do you think about the current state of PHP?

What would you tell PHP core devs about that?

Does an async framework like React run with Hack?

What would you say to people who are concerned that it is Facebook that is developing this?


> Are you planning a Windows build?

Yes. auroraeosrose is working on it (the same person who did the port for PHP): https://github.com/auroraeosrose/hhvm/tree/win32_start


Why on earth would the solitary example for a shiny new language use the awful mysql_ library for PHP instead of PDO?


What's the state of the type system like? Here[1] it suggests that it may have bugs; is this a 'covering backside' thing or is it currently unsound?

[1] - http://hacklang.org/manual/en/hack.annotations.summary.php


Engineer working on Hack here.

It's complicated.

If you use partial mode, which doesn't enforce that every function is fully typed, then it's trivial to break the type system by just using an untyped function. In order to ease conversion, we just assume the programmer knows what they are doing with untyped functions and let anything pass.

If all of your code is in strict mode, then we believe the type system to be sound. We haven't done any formal proof of this of course, and there have been plenty of bugs in the past. But that's the goal.

Some types are enforced at runtime -- just like PHP5 enforces class type annotations on parameters. However, at least for now, the runtime doesn't do a lot of the clever things with the type system that the static typechecker does. It's much more conservative with using the type information, and doesn't do things like check generics at all. (At least right now! We probably will change this in the future.) This means that we can play a little more fast and loose with the type system in the static parts; we want it to be sound, but it doesn't have to be right now; it's not going to cause a JIT crash or anything.

There are more details on the modes here: http://hacklang.org/manual/en/hack.modes.php


What was the rationale for being inconsistent in the type notation? Why not be consistent like most languages?

For example, in Java you'd write: int add(int x, int y) { ... }

And in Go you'd write: func add(x int, y int) int { ... }

But it looks like in Hack you'd write: function add(int $x, int $y): int { ... }

Both Java and Go are consistent, either the type comes before or after the name, no matter which context you're dealing with, be it return type or argument type. Hack seems to mix the two styles, which seems like it would make code harder to read as things get more complex, like when functions accept other functions as arguments.

Am I reading the documentation right? And, if so, can you shed any light on the decision making process that went into this decision?


You are reading the documentation right. I wasn't here when the syntax was codified, but my understanding is the following. For parameter types our hands were tied, since PHP actually already has this syntax for object types (unless we wanted to needlessly break compatibility here). We just extended it with primitive types, generics, etc. For return types, we wanted to preserve the greppability of "function add" in large codebases, both for the actual "grep" tool itself as well as for any other tools like ctags that look for strings like this.


> If all of your code is in strict mode, then we believe the type system to be sound. We haven't done any formal proof of this of course, and there have been plenty of bugs in the past. But that's the goal.

When I look at this from the docs, it seems unsound:

"Hack treats traits as a stand-alone entity during the type checking process. In other words, it ensures type consistency within the trait (i.e., as a black box, so to speak), but does not "copy and paste" the code into all of the classes that use the trait and check for type consistency there. The reason this is done comes down to performance."

Is there something I'm missing that does make this sound?


I'm not sure why you think this is unsound. We check traits in isolation -- we ensure that methods you call are either defined in the trait or declared abstract (and so must be defined in the including class). We also added "trait requirements" to the language, so you can say "the including class must implement this interface" ("require implements IFoo") and we'll know that in the type system too. This means that we can ensure traits are sound even in isolation, and that including classes are sound when they include the traits.

Feel free to play around with the type system in the interactive code editor on http://hacklang.org/.
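
The "defined in the trait or declared abstract" rule can be loosely illustrated in Python with an abstract mixin. This is an analogy under stated assumptions, not Hack: the class names are invented, and Python enforces the requirement at instantiation time, whereas the checking described above is static.

```python
from abc import ABC, abstractmethod


class Greeter(ABC):
    """Plays the role of a trait: reusable code that relies on name()."""

    @abstractmethod
    def name(self) -> str:
        """Declared abstract, so the including class must define it."""

    def greet(self) -> str:
        # Safe to call name() here: it is declared on the "trait" itself.
        return "Hello, " + self.name()


class User(Greeter):
    def name(self) -> str:
        return "Ada"


class Broken(Greeter):
    pass  # forgets to define name()


print(User().greet())

try:
    Broken()
except TypeError as err:
    print("rejected:", err)
```

The mixin can be reasoned about on its own because every method it calls is either defined or declared on it; classes that fail to supply the abstract method are rejected.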


From playing with the editor, it looks like it is sound.

What I thought the quoted passage was saying was that consistency between a class and a trait used by the class was not checked. That's clearly not the case. I don't understand the performance point, though, since I don't think the copy-and-paste style would produce different answers.

Finally, the online editor doesn't seem to honor // strict -- it doesn't produce an error when some methods aren't annotated.


Hi Bryan,

How do you protect from type errors when a value crosses from the untyped to typed fragment? Do you use contracts?


Not Bryan, but I am an engineer who works on Hack :)

We don't (statically) protect against type errors when you cross from untyped into typed code. If you call an untyped function, we assume the programmer knows what they are doing -- just like PHP does. You might get a runtime exception if you are calling a method on null, for example.

This is actually a pretty important part of the conversion process. You don't have to convert all your code all at once. Typed code can be verified statically; untyped code is assumed to work just as before.

Parameter and return type annotations are enforced at runtime by HHVM (just like you can add a class type annotation to a parameter in PHP). This will protect against some inconsistencies -- again, just as if you weren't using Hack.


Oh I see, so there's no change in code generation for typed code, then?


At runtime, we check and enforce parameter types at function entry and return types at function exit. The extra type information can also enable the JIT to emit more efficient code in some cases, and work is ongoing to make that even more efficient. But having these types is independent of being in Hack -- you can have fully untyped Hack code, though I wouldn't advise it, as you are leaving one of the most powerful features of the language on the table.
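
As a rough picture of what checking at function entry and exit means, here is a small Python sketch. It is an analogy only and says nothing about how HHVM implements it: the enforce decorator and both functions are hypothetical, and only plain class annotations are checked (no generics).

```python
import functools
import inspect


def _matches(value, expected) -> bool:
    # Missing annotations and anything that is not a plain class pass unchecked.
    if expected is inspect.Signature.empty or not isinstance(expected, type):
        return True
    return isinstance(value, expected)


def enforce(func):
    """Check annotated parameters at entry and the annotated return at exit."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = sig.parameters[name].annotation
            if not _matches(value, expected):
                raise TypeError(f"{name} must be {expected.__name__}")
        result = func(*args, **kwargs)
        if not _matches(result, sig.return_annotation):
            raise TypeError(f"return value must be {sig.return_annotation.__name__}")
        return result

    return wrapper


def untyped_lookup(key):
    # Untyped code: nothing is checked here, and it may return None.
    return {"a": 1}.get(key)


@enforce
def double(n: int) -> int:
    return n * 2


print(double(untyped_lookup("a")))

try:
    double(untyped_lookup("missing"))
except TypeError as err:
    print("caught at entry:", err)
```

The bad value coming out of the untyped function is not caught where it is produced, only when it crosses into the annotated function.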


What do you do with higher-order functions (which PHP has I believe)?

Say I write a typed map : (a -> b) -> [a] -> [b], and it's then called with an untyped function as input that returns something that's not a b. Do you place some kind of barrier around it so that it's stopped immediately?


Does this presentation match the current version of Hack fairly well? It has a nice explanation of the type system that is more concise than the one on the web site.

https://raw.github.com/strangeloop/StrangeLoop2013/master/sl...

The idea of bolting a static checker onto a dynamic language and using an unresolved top type to make it work is cool. The presentation refers to your system as SoA gradual typing. Are there any papers or presentations which explain that approach and how it works in more detail? Particularly in how it might differ from gradual typing?


I'm a little confused as to the need for Hack unless you have a code base in PHP and need to ship tighter code (which is a problem Facebook has and my team probably has as well).

Adding lambdas makes PHP more Ruby-like, and generics and type checking are straight out of Java. I'm still unconvinced that this makes programming websites more efficient or bug-free than existing languages do. Can you please elaborate on that?


I rather think lambdas, generics and type inference are straight out of ML, especially as Hack is written in OCaml.


Any plans to have overrides for return types in the same class? I think that would be amazing. http://docs.hhvm.com/manual/en/hack.otherrulesandfeatures.ov...


Could the same techniques you applied to PHP be applied to other dynamic languages like Ruby or Python?


If you're looking for a statically-typed version of Python/Ruby, I'd say Nimrod gets fairly close. It's also fantastically fast and compiles to portable ANSI C.


Nimrod is not a statically typed version of Python in the sense Hack is a statically typed version of PHP. Not at all. Hack is gradually typed, Nimrod is not. Big difference.


I thought it was clear I wasn't trying to draw an analogy between PHP/Hack and Python/Nimrod, but rather pointing out that Nimrod is a very nice language for someone who enjoys Python but wants static typing.


Yes, definitely.


Just a heads up that the search page seems to be 404'ing.


Oops! Thanks - we're fixing it now.


Have you guys looked at the new Truffle/Graal back end in Java 8? It has seen some impressive numbers in other languages, did you explore this as a possibility?


Is there any interest in adding named parameters to Hack?


That's coming in PHP 5.7 anyhow. It's been implemented and works, but Nikita Popov hasn't had the time to change the function definitions of the standard library to work with it, hence PHP 5.6 won't have it, sadly.


Wow, I haven't kept up with things, but gosh that sounds pretty great. Type checking + named params is a fairly difficult combination of great language features to find in the same language.


I think both C# and Scala have both features.


What IDE do people use for Hack?


Stay tuned. You'll hear more in due course.

In the meantime, the open source release of Hack includes integration with both Emacs and Vim.


If it involves Jets and brains..., then you will have won the interweb awards so soon in the year.


Will the Vim plugin be separated out into a subsplit so it can be used with Vundle or any of the other package managers that exist for Vim?


Facebook IDE incoming?


I think they mentioned something in the browser using js_of_ocaml at CUFP last year.


The js_of_ocaml bits are actually what are powering our interactive tutorial (http://hacklang.org/tutorial/) -- the tutorial is actually a full build of the typechecker running in your browser :) The source for it is mostly at https://github.com/facebook/hhvm/tree/master/hphp/hack/src/j... minus a little bit of glue.


Perhaps Sublime Text 2?


We currently don't have Sublime Text 2 support, but it shouldn't be too hard to do, given Sublime's excellent plug-in design.

Feel free to send us a pull request: https://github.com/facebook/hhvm/tree/master/hphp/hack/edito...


How about an MVC type framework, like Laravel for Hack?


Noted! Great work you people have done there.


[deleted]


We wanted to preserve our investment in PHP, while introducing new capabilities and safety that make rapid development at scale less daunting.


Presumably because it brings them some value?
