Wait, Facebook uses Haskell? I had no idea.
There is little information about the size of the codebase though. I could only dig up a talk from 2009 [1] and a tool that modifies a PHP codebase using abstract-syntax-tree transformers written in Haskell [2]. Can somebody give some more insight into their internal use of the language?
Anyway I think it's great that they have a hybrid codebase. With projects like HipHop and this Haskell backend it looks like they are gradually moving away from PHP as their main language.
Facebook hasn't been a PHP shop in a long time. PHP is only used on the front-end site. New projects that come along are written in whatever language makes sense, like Facebook Chat being in Erlang, for example.
I heard a rumor a while back that Facebook Chat was rewritten in C++... but I can't seem to find anything to verify that. Do you know if it's still Erlang?
No idea. Sounds like the kind of rumor that gets started based on a misunderstanding, like someone saying "we wrote that in C++" in reference to part of the chat system. Either way, my point was that Facebook isn't a PHP shop. Erlang/C++ or just plain C++, neither is PHP.
I'm sort of surprised that Facebook's "friends of" is a directed graph, when the actual social graph is undirected. I guess they use enough directed edges for other things in that graph, but I'm sort of surprised their social graph isn't undirected and ridiculously optimised as its own separate thing.
An undirected graph is equivalent to a directed graph with doubled edges.
The only efficient way to implement an undirected graph is as a directed graph with doubled edges.
You want to list friends of Alice. Directed: query all friend(Alice, X). Undirected: query all friend(Alice, X) OR friend(X, Alice). Finding all friend(X, Alice) will be very slow without an index of all friend relationships with Alice in the second position. Storing this is harder than just storing friend(Alice, X) and friend(X, Alice).
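To make the comparison concrete, here is a minimal Python sketch of the two storage strategies described above. The class and method names (`DoubledEdgeGraph`, `SingleEdgeStore`, `add_friendship`, `friends_of`) are made up for illustration and say nothing about how Facebook actually stores its graph.

```python
from collections import defaultdict


class DoubledEdgeGraph:
    """Undirected friendships stored as two directed edges each (hypothetical sketch)."""

    def __init__(self):
        # Maps each user to the set of users they have an outgoing edge to.
        self._out = defaultdict(set)

    def add_friendship(self, a, b):
        # Store friend(a, b) and friend(b, a), so one lookup answers either side.
        self._out[a].add(b)
        self._out[b].add(a)

    def friends_of(self, user):
        # A single keyed lookup: all friend(user, X).
        return set(self._out.get(user, ()))


class SingleEdgeStore:
    """Each friendship stored once, with no index on the second position."""

    def __init__(self):
        self._edges = []  # list of (a, b) pairs

    def add_friendship(self, a, b):
        self._edges.append((a, b))

    def friends_of(self, user):
        # friend(user, X) OR friend(X, user): without a reverse index,
        # this has to scan every edge in the store.
        forward = {b for a, b in self._edges if a == user}
        backward = {a for a, b in self._edges if b == user}
        return forward | backward
```

Both return the same answer, but the second does work proportional to the total number of edges per query. Adding a reverse index to fix that amounts to storing the doubled edges anyway.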
More generally, in Facebook's social graph, connections can be of different types. For example, for a given user, 'likes' can be one type of edge, 'following' (people the user follows) is a different type, 'followed by' (people who follow the user) is yet another type, and so on. Many of these types can only be unidirectional (e.g., all three examples above). So, the social graph is a directed graph. Of course, the well-known 'friends' connection is a bidirectional type of edge.
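As a rough illustration of typed edges in one directed graph, here is a small Python sketch. The names (`TypedGraph`, `SYMMETRIC_TYPES`, `add_edge`, `neighbors`) and the edge-type strings are assumptions for the example, not Facebook's actual schema.

```python
from collections import defaultdict

# Hypothetical: edge types that should be mirrored in both directions.
SYMMETRIC_TYPES = {"friend"}


class TypedGraph:
    """Directed graph whose edges carry a type label (illustrative only)."""

    def __init__(self):
        # Maps (source, edge_type) to the set of targets.
        self._out = defaultdict(set)

    def add_edge(self, source, edge_type, target):
        self._out[(source, edge_type)].add(target)
        # A bidirectional type is just a directed edge stored twice.
        if edge_type in SYMMETRIC_TYPES:
            self._out[(target, edge_type)].add(source)

    def neighbors(self, source, edge_type):
        return set(self._out.get((source, edge_type), ()))
```

With this layout, `add_edge("alice", "likes", "some_page")` creates a one-way edge, while `add_edge("alice", "friend", "bob")` makes each user appear in the other's `neighbors(..., "friend")` result.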
I assumed, I think incorrectly with hindsight, that the 'friends' connection dwarfed all others and would be separately optimised. I think you're right though - 'likes' and 'friends' probably don't dwarf each other at a guess, and as Scaevolus pointed out it wouldn't be fundamentally different anyway.
For surveillance it's of great importance to know the difference between me being friends with Iranian nuclear scientists, them being friends with me, or us having mutual hugs going on.
The hot-swapping is very new and experimental, but if Facebook has a need then Simon Marlow definitely has the capability to bring that back to GHC core. He's already stated that at least some of the hot-swapping will make it.
[1]: http://cufp.galois.com/2009/abstracts.html#ChristopherPiroEu...
[2]: https://github.com/facebook/lex-pass/tree/master