
Scaling Mercurial at Facebook - jordigh
https://code.facebook.com/posts/218678814984400/scaling-mercurial-at-facebook/
======
danielrhodes
"We could have spent a lot of time making it more modular in a way that would
be friendly to a source control tool, but there are a number of benefits to
using a single repository. Even at our current scale, we often make large
changes throughout our code base, and having a single repository is useful for
continuous modernization. Splitting it up would make large, atomic
refactorings more difficult. On top of that, the idea that the scaling
constraints of our source control system should dictate our code structure
just doesn't sit well with us."

I remember reading about their Git troubles a while ago, and I still don't buy
this argument that it is better to have one large repository. One reason
modularization is important is the precise thing they are trying to get
around: it removes the ability to make large-scale changes easily, and thus
increases reliability.

However, my understanding is that their desire to have one large repo reflects
their "move fast and break things" philosophy, which means not being afraid of
making large-scale changes. So I would be interested in hearing how they
mitigate the obvious downsides, given how many people they have committing to
their codebase. It seems like you would just end up having to create
constraints in other ways, so which constraints end up being the lesser of two
evils?

~~~
zaphar
We use one repository at Google.

We've found that "removing the ability to make large-scale changes easily and
thus increasing reliability" isn't actually correct.

As an example, most of your codebase uses an RPC library. You discover an
issue with that library that requires an API change which will reduce network
usage fleetwide by an order of magnitude.

With a single repo it's easy to automate the API change everywhere, run all
the tests for the entire codebase, and then submit the changes, thus
accomplishing the API change safely in weeks when it might otherwise take
months, with higher risk.

Keep in mind that the API change equals real money in networking cost, so time
to ship is a very real factor.

It sounds like Facebook also has a very real time-to-ship need, even for core
libraries.
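
A concrete (hypothetical) sketch of what that automation can look like,
assuming a Python codebase; send_rpc/send_rpc_batched are made-up names for
illustration:

    #!/usr/bin/env python3
    """Sketch of a fleet-wide API codemod in a single repo.

    Hypothetical change: every caller of send_rpc(payload) moves to
    send_rpc_batched(payload, batch=True).
    """
    import pathlib
    import re
    import subprocess

    OLD_CALL = re.compile(r"\bsend_rpc\((?P<args>[^)]*)\)")

    def rewrite(path: pathlib.Path) -> bool:
        src = path.read_text()
        new = OLD_CALL.sub(
            lambda m: "send_rpc_batched(%s, batch=True)" % m.group("args"), src)
        if new != src:
            path.write_text(new)
        return new != src

    changed = [p for p in pathlib.Path(".").rglob("*.py") if rewrite(p)]
    print("rewrote %d files" % len(changed))

    # One repo means one test run covers every caller before submitting:
    subprocess.run(["python", "-m", "pytest"], check=True)

The point is that the rewrite, the full test run, and the submit are one
atomic unit of work instead of a months-long campaign across repositories.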

~~~
igravious
What version control system do you use, or is it a secret? Perforce? Git? I've
seen Linus's talk that he gave about Git at the GooglePlex so perhaps you use
Git. If so, how have you not run into Facebook's scaling issues?

~~~
zaphar
This question is actually really complicated to answer. The short answer is
that we run Perforce.

The long answer is that we run Perforce with a bunch of caching servers and
custom stuff in front of it, plus some special client wrappers. In fact there
is more than one client wrapper. One of them uses git to manage a local thin
client of the relevant section of the repo and sends the changes to Perforce.
This is the one I typically use, since I get the nice tooling of git for my
daily coding and can then package all the changes up into a single Perforce
change later.

Google has invested a lot of infrastructure into code management and tooling
to make one repo work. We've reaped a number of benefits as a result.

As others have mentioned, though, there are trade-offs. We made the trade-off
that best suited our needs.

~~~
sandGorgon
Interesting, has Google made the client parts open source?

It would be great if someone used that to write a high performance git server.

Could they?

~~~
thrownaway2424
There is no such thing as high-performance git anything. People at Google who
use git (I used to be one of them) suffer from far worse performance than the
people who just use Perforce directly. In particular, the cost of a git gc,
which invariably kicks in right when you're on the verge of an epic hack or
fighting a gigantic outage, is unreasonable, and Perforce has no analogous
periodic process.

------
agwa
For some context on why Facebook chose Hg over Git, here's the mailing list
thread where Facebook initially reached out to the Git developers:
[http://thread.gmane.org/gmane.comp.version-control.git/189776](http://thread.gmane.org/gmane.comp.version-control.git/189776)

~~~
novaleaf
I personally love Mercurial (simpler than git), and have been a bit nervous
over the last year or so with the mindshare shift to git.

So hearing about this (FB all in with hg) ensures that hg won't be falling
behind... at least in the near term.

~~~
dgesang
Don't believe the hype. hg will be around for some time.

~~~
twic
Mercurial is the new FreeBSD.

------
xkarga00
"Our engineers were comfortable with Git and we preferred to stay with a
familiar tool, so we took a long, hard look at improving it to work at scale.
After much deliberation, we concluded that Git's internals would be difficult
to work with for an ambitious scaling project."

[http://www.techradar.com/news/software/how-open-source-changed-google-and-how-google-changed-open-source-1206582](http://www.techradar.com/news/software/how-open-source-changed-google-and-how-google-changed-open-source-1206582)

"And then Git itself wasn't working for us anymore because it wasn't scaling
when we'd have an operating system release. So we ended up hiring most of the
Git team - there's like only one or two core committers now for Git who don't
work at Google,"

~~~
durin42
Note that Facebook is trying to scale a single large repository, not an army
of slightly smaller ones. It's a very different problem, and has to be solved
in a very different way.

~~~
daleharvey
I have seen presentations mentioning that Google development is done in a
single monolithic repository:

[http://www.infoq.com/presentations/Development-at-Google](http://www.infoq.com/presentations/Development-at-Google)

~~~
tonfa
This monolithic repo isn't a git repo.

~~~
spiffytech
Correct. See
[https://news.ycombinator.com/item?id=7020859](https://news.ycombinator.com/item?id=7020859)

------
LukeHoersten
I love the Mercurial community. We use Mercurial at work and I'm able to get
instant support in IRC for any issue we have with an awesome signal/noise
ratio. I'm glad Facebook is contributing back so much as well. My suspicion is
that open source projects tend towards Git because of GitHub but I think a lot
of companies who don't have the option of external code hosting lean towards
Mercurial. All anecdotal observations of course ;-)

~~~
VladRussian2
>My suspicion is that open source projects tend towards Git because of GitHub
but I think a lot of companies who don't have the option of external code
hosting lean towards Mercurial.

GitHub is a consequence, not the cause (ponder for a moment why there is no
MercurialHub...). It is about the ability to choose the best source control
tool for multi-versioned, distributed, concurrent development. Open source
devs have that choice, while corporate ones don't. FB choosing Mercurial says
a lot about the environment there.

~~~
ZenoArrow
There is a well-used "MercurialHub"; it's called Bitbucket:
[https://bitbucket.org/](https://bitbucket.org/)

It supports Git now as well, but it was Mercurial-only when it started.

~~~
warmwaffles
I could have sworn Bitbucket was an SVN host to start with, then patched in
Mercurial support and later Git support.

~~~
barkingcat
Bitbucket never supported SVN (and still doesn't). It was created with Django
and uses the pure-Python Mercurial implementation (it was one of the major
"poster child" stories for Django).

Not sure if it's still Django though.

You might be thinking of [http://beanstalkapp.com/](http://beanstalkapp.com/)
which supports svn.

------
ngoldbaum
Mercurial has seriously improved over the past couple of years. If you tried
mercurial a few years ago and were scared away due to speed or functionality
issues, you might want to give it another shot.

~~~
JoshTriplett
Does the standard branch workflow still expect you to have a separate
repository and directory per branch? I don't care about plugins here; if the
standard workflow doesn't include incredibly lightweight branches, I'll stick
with a version control system that does.

Likewise, does the standard workflow still intentionally make it painful to
rearrange changes in your local repository to construct a series of patches?
Does Mercurial provide built-in commands equivalent to "commit --amend",
"rebase -i", and "add -p"?

~~~
masklinn
> Does the standard branch workflow still expect you to have a separate
> repository and directory per branch?

You're probably confusing Mercurial with Bazaar. Mercurial has always had
branches (though they're not quite the same as git's; Mercurial's bookmarks
are more closely related to git branches) and anonymous heads (contrary to
git, an unnamed head is not stuck in limbo).

> I don't care about plugins

That's stupid; Mercurial is very much about plugins: there are dozens of
official plugins shipped in a standard Mercurial install.

> the standard workflow

there is no such thing.

> Does Mercurial provide built-in commands equivalent to "commit --amend",
> "rebase -i", and "add -p"?

All of them are provided in the base install; you just have to enable the
corresponding extensions.
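
For reference, a minimal sketch of that setup (assuming a roughly 2014-era
Mercurial; "commit --amend" has been built into core since 2.2):

    # ~/.hgrc
    [extensions]
    rebase =      # hg rebase
    histedit =    # hg histedit, the "rebase -i" equivalent
    record =      # hg record, the "add -p" equivalent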

~~~
berdario
Furthermore, he's probably confusing Mercurial with something that doesn't
even exist.

Bazaar expects you to have a separate working copy (directory) for each
branch, but you can have multiple branches stored in the same repository
without any problem.

(I just wish that git people actually knew how other tools work... but, alas!
Now it's too late for underdogs like Bazaar or Darcs to catch up.)

~~~
JoshTriplett
> bazaar expects you to have a separate working copy (directory) for each
> branch, but you can have multiple branches stored in the same repository
> without any problem

Those are functionally equivalent, in that they both mean branching is not
instantaneous.

~~~
rbehrends
_Those are functionally equivalent, in that they both mean branching is not
instantaneous._

They are not functionally equivalent. A "bzr branch" with a shared repository
will only populate the working tree and does not have to duplicate the
repository. It is functionally closer to "git-new-workdir" than "git clone" or
"git branch".

To have instant branching in Bazaar, you can use co-located branches.

Branches with their own directories exist for the use case where the
accumulated cost of rebuilds after branch switches exceeds the cost of
populating a new directory with a checkout, or where you need two checkouts in
parallel. They also exist in Bazaar for simulating a workflow with a central
server and no local repository data.
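
Concretely, the shared-repository setup looks like this (paths and URL are
made up):

    bzr init-repo ~/work/project        # branches under here share storage
    cd ~/work/project
    bzr branch http://example.com/trunk trunk
    bzr branch trunk feature-x          # cheap: only a working tree is populated

For instant switches without a new directory, co-located branches plus "bzr
switch" cover that case.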

------
boklm0
According to this page, the Mercurial project leader is currently working for
Facebook:
[http://mercurial.selenic.com/wiki/mpm](http://mercurial.selenic.com/wiki/mpm)

~~~
CmonDev
Google bought Git devs, Facebook bought Mercurial devs - wise choice for
companies that even order custom CPUs.

~~~
jordigh
There are some hg devs at Google too. :-)

~~~
CmonDev
Hedging to avoid the risk I guess.

------
moron4hire
> "Our code base has grown organically and its internal dependencies are very
> complex."

That's a polite way of saying "we write shitty code without any sort of plan."

> "Splitting it up would make large, atomic refactorings more difficult"

Actually, it's the other way around. Modularity tends to obviate the _need_
for large, atomic refactorings.

And what, exactly, is the meaning of these graphs? This is leading me to
believe that being a developer at Facebook is about quantity over quality.

~~~
ZenoArrow
To be honest, whilst we have no way to accurately determine whether the code
is a mess without a chance to see it, the most surprising line of this article
(in my opinion) was that the code base is larger than the Linux kernel. I'm
not seeing anything on the front end that would warrant such complexity, so
I'm guessing a large chunk of the code base is server code. I would be
interested in reading a summary of the components of the Facebook code base.

~~~
igravious
This rather surprised me as well. I tend to think of the Linux kernel as one
of the larger single code-bases out there. Am I wrong?

~~~
bpicolo
It's one of the largest open source projects, perhaps. But when you get into
a large company like Facebook, which creates a hell of a lot of different
things, the numbers are way higher.

------
mindjiver
At a previous job I migrated a quite large code base to git. IIRC it was at
least 500k files and a few tens of millions of lines of code. We had the same
"scaling" issues Facebook mentions here when trying to place all of this
inside one repository, so we ended up going with submodules. Another idea was
that this might enable re-use of repositories and/or let us disconnect old
legacy code.

We did take a quick look at Mercurial, but since lots of the upstream tools we
used were using Git (linux, uboot, yocto, etc.) Git was the obvious choice. I
seem to recall there being two hg extensions that were of interest at the time
(2010-ish): one to add inotify support and another to store large files
outside the repository (hg-largefiles?).

It seems like Facebook's approach to the lstat(2) issue, via watchman [1], is
to use inotify on Linux. This has been discussed a couple of times for git as
well, but nothing has come of it so far [2].

[1]
[https://github.com/facebook/watchman](https://github.com/facebook/watchman)
[2] [http://git.661346.n2.nabble.com/inotify-to-minimize-stat-calls-td7577352.html](http://git.661346.n2.nabble.com/inotify-to-minimize-stat-calls-td7577352.html)
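
For a sense of how watchman [1] gets used (the path and trigger script here
are made up):

    watchman watch ~/src/bigrepo
    # run a rebuild script whenever a .js file changes, instead of
    # having every tool crawl the whole tree with lstat(2):
    watchman -- trigger ~/src/bigrepo rebuild '*.js' -- ./rebuild.sh

A daemon subscribes to inotify (or the platform equivalent) once, so "what
changed?" becomes a cheap query rather than a full crawl.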

------
Touche
> We could have spent a lot of time making it more modular in a way that would
> be friendly to a source control tool, but there are a number of benefits to
> using a single repository.

Pray tell?

~~~
yonran
I worked at Google (on a team using Perforce) and now work at a different
company that manages multiple interdependent projects with Maven. Using a
single monolithic codebase along with a build tool that statically builds
everything at trunk has its advantages:

* You immediately get improvements from upstream projects without having to get them manually.

* You can unambiguously answer the question “What code am I using?” with a single number. With multiple repositories, you have to list all the versions of each project that you are using.

* Easy API refactoring. You don’t have to worry about coordinating version number bumps across different repositories/dependency manifests when you make major changes to inter-project APIs. With a monolithic repository, you fix all callers of an API using a single code commit. No need to edit the version numbers in your pom files.

* Low cost to split a project into multiple separate projects. With multiple repositories or version numbers, you are reluctant to create new projects because the APIs will forever be harder to refactor (since you will have to worry about version numbers).

* No diamond dependency problem of 2 dependencies using a different version of a base project. Everyone is using the same version of base.

With a monolithic repository+build system, upstream projects are responsible
for never making a commit that breaks downstream callers. I feel like it’s
similar to the question of optimal currency areas. If your organization is
growing in lock-step, then you can all happily share a single gold-standard
repository with little friction. But if you can’t trust your upstream
projects, you introduce versioning between the projects and have to deal with
the mental burden of wondering whether to upgrade to the newest upstream
release and whether you’re actually running the latest code.

Edit: added a couple more.

~~~
Touche
> * You immediately get improvements from upstream projects without having to
> get them manually.

You also immediately get regressions. Not trying to be dismissive, but we
fundamentally have different software philosophies if you think this point
(which is the essence of most of your points) is a good thing that should be
encouraged.

~~~
thrownaway2424
Immediate regressions are good! If someone at Google breaks my code, I will
know within half an hour at the latest and I will tell them to go fix it or
just revert their changes myself. Immediate regressions also go perfectly with
daily (or hourly!) releases. If there's a performance problem it will be
identified early and I will only have thousands of changes to investigate
instead of tens of millions.

Imagine if I had a regression and had to go to the other team saying, "We
just upgraded from the Foo you released 2 years ago to the one from last year
and the performance sucks. Help!" I would not get any help. However, I get
plenty of support when I go to Foo-team to tell them that my Foo-per-second
is 10% worse in the noon release compared to the midnight release.

Having artifacts and stable interfaces and library releases and all that is
very ivory tower hocus pocus stuff. In _practice_ instant integration is
better.

~~~
Touche
I don't just mean performance regressions. Someone upstream can change an API
in a way that doesn't fit well with your use-case, goes in and "fixes" your
code (makes sure all the tests pass) to fit the new API but makes it less
maintainable in the process.

~~~
thrownaway2424
Teams (at Google) can't change calling code without the review and approval of
the owners of the calling code, so what you state would not happen.

------
natrius
Branches are my biggest pain point in Mercurial. What branching workflow does
Facebook use?

~~~
k_bx
Why not use standard Mercurial branches? At least you're able to say in which
branch a commit was made, and to draw a clean history for them.

~~~
natrius
I do use standard Mercurial branches, and I vastly prefer Git's model. I ask
because there might be a better way to use them that I've overlooked.

~~~
masklinn
Mercurial's equivalent to git branches (a movable pointer to a commit, rather
than metadata embedded in the commit) is bookmarks.

~~~
fhd2
From what I've seen, bookmarks are not "branches" in any sense, really just
pointers to commits.

I've tried several times to use bookmarks for feature branches (read: branches
developed in parallel to each other and the default branch). I thought I just
couldn't figure it out, but it really seems impossible at this point.

~~~
masklinn
> From what I've seen, bookmarks are not "branches" in any sense, really just
> pointers to commits.

That's exactly what git branches are.
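
A minimal bookmark workflow, for comparison; it mirrors git's branch/checkout
dance:

    hg bookmark feature-x     # create and activate a pointer at the current commit
    hg commit -m "wip"        # feature-x advances with the commit
    hg update default         # switch away; feature-x stays put
    hg update feature-x       # resume work on the feature

Several bookmarks can coexist and diverge within one repository, just like git
branches.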

------
ed_blackburn
I wonder what they use at Microsoft. For their sake I hope they don't subject
their own engineers to TFS.

~~~
mindjiver
I read somewhere(?) that they were running Perforce with some internal
tooling built on top of it.

~~~
boyter
I saw/heard that as well (hopefully someone can come up with the link).
Things like Windows/Office are apparently in a custom Perforce.

I believe that they do use TFS on a lot of the internal projects, though, and
that Visual Studio is now done in TFS.

~~~
robaato
The following is a presentation in 2008 by Richard Erwin of Microsoft at the
BCS CMSG (Configuration Management Specialist Group):

[http://bcs-cmsg.org.uk/events/e20081124/2008-11-24-agile-scm-erwin.pdf](http://bcs-cmsg.org.uk/events/e20081124/2008-11-24-agile-scm-erwin.pdf)

Lots of good stuff there, but a key item is slide 8. In the question session,
Richard confirmed that Source Depot (the custom version of Perforce) was at
that time still used for source management, although TFS was used for bug
tracking etc. for Office and Windows.

Don't know what has happened in the intervening years...

------
riddlemethat
Can anyone explain to me how Mercurial is used at Facebook in conjunction with
Murder ([https://github.com/lg/murder](https://github.com/lg/murder))?

------
voidr
They didn't really scale Mercurial; they basically took it and replaced a lot
of its functionality with remote services.

I understand their reasoning that technology shouldn't dictate the way they
develop, but I think splitting the project up and using submodules would have
been a cleaner approach. If refactoring _everything_ is something you do all
the time, you might be doing it wrong.

------
cool-RR
I wonder whether putting all the code on a ramdisk (with backup, of course) is
feasible. If so, it might be a very cheap solution.

~~~
exDM69
No, using a ramdisk these days usually makes things worse, not better. The
reason is that the operating system already holds as much of the filesystem in
caches (in RAM) as possible. So as long as you have enough RAM in your system,
files will be cached and the result is better than using a ramdisk.
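
This is easy to see on Linux ("bigfile" is any large file you have handy):

    sync && echo 3 | sudo tee /proc/sys/vm/drop_caches  # flush the page cache
    time cat bigfile > /dev/null                        # cold: read from disk
    time cat bigfile > /dev/null                        # warm: served from RAM

The second read is typically orders of magnitude faster, with no ramdisk
involved.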

------
rottyguy
Any comments on the build system being used for a monolithic hive of this
size?

------
Crito
I wrote a comment about the scaling of repositories (and specifically
Facebook's issues) a few days ago that was wiped out by the HN crash, but I've
managed to recover it from HNSearch:

_<old-comment>_ Facebook's problem is that they were trying to scale with
Git improperly. With conventional CVS[sic] systems like Perforce, you can
scale a single repo nearly as large as any company will need. Emphasis on
that "nearly". At a certain point, with a large enough codebase (and,
critically, enough throughput) you start to realize that you are about to hit
a brick wall. With Perforce, this starts to manifest as service brownouts.

With Perforce, you can reasonably expect to run into this brick wall somewhere
in the neighborhood of terabytes of metadata and dozens of transactions per
second. That changes depending on what sort of beastly hardware you are
willing to throw at your version control team.

Git of course hits a brick wall much sooner, somewhere around single-digit
gigabytes of data (depending heavily on the average size of every object in
the DAG), even ignoring throughput.

Perforce is probably good enough for Facebook in the present, but if you are a
company that large and you are forward-looking, it becomes apparent that with
existing version control technology, "one repo per company" is not a long-term
solution. Neither is "one repo per department".

You can split it even further, but what you realize is that you are building
infrastructure that allows you to use many repos (for instance, your build
servers and internal code search/browsing tools will now need to understand
that concept), while losing many of the benefits of Perforce. While you are in
the process of adapting your infrastructure to wrap its mind around many
repositories, it makes sense to let dev teams really take advantage of this
splitting. Develop infrastructure that allows Perforce and git repos to
coexist in the company, allowing dev teams to spin up new git repos for their
every project at will. Done properly, git allows you to create massively
scalable systems that you can count on supporting your company's needs for the
foreseeable future.

Smooth migration, migration that does not disrupt development, takes months
(assuming the right initial conditions), so it is best to recognize the
problem and start early, before service becomes disrupted.

If I understand Facebook's current situation, they are still in the "try to
make Mercurial scale" stage of denial, burning developer time and effort to
push back that first brick wall (the same one that git hits, though Mercurial
hits it after git does, but before Perforce does...)

Here is a Google presentation about extreme scaling with Perforce:
[http://www.perforce.com/sites/default/files/still-all-one-server-perforce-scale.pdf](http://www.perforce.com/sites/default/files/still-all-one-server-perforce-scale.pdf)

An example of building multi-repo infrastructure for large projects with git
is Android's repo:
[http://en.wikipedia.org/wiki/Repo_(script)](http://en.wikipedia.org/wiki/Repo_\(script\))
Repo is just one example though; other, better, solutions are very possible.
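
For a flavor of that approach, a repo manifest is just an XML file naming the
git repositories that make up the tree (the remote and projects here are
invented):

    <manifest>
      <remote name="origin" fetch="https://git.example.com/" />
      <default remote="origin" revision="master" />
      <project path="frontend/www" name="www" />
      <project path="infra/rpc" name="rpc-lib" />
    </manifest>

"repo sync" then checks out every listed project in one pass.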

_</old-comment>_

I guess the TL;DR here is that I think it is great that Facebook is making a
single Mercurial repository work for their purposes right now, but I think
they are kidding themselves if they think that is a long term solution. They
are doing a pretty good job of making Mercurial scale like Perforce can scale,
but that will only work for them for so long.

(In the above comment I talk about building a scalable system _with_ git
("with" git, rather than making git itself scale), but the same can of course
be done with Mercurial instead of git. I don't mean to suggest that they
should use git instead of Mercurial.)

~~~
dubcanada
I think they just want to modify Git and don't have any solid C developers
that can make such things. So they turned to a python solution which is
perfectly fine.

But webkit, and chromium, as well as other GIANT projects which as far as I
know are larger then Facebook seem to work fine on Git.

~~~
khuey
You really believe that Facebook couldn't find a competent C programmer to
make some changes to git?

~~~
kanzure
Would you accept an answer that complains about how hard it is to hire
programmers? :)

------
greatsuccess
Partitioning is the answer. In a repo that size, 99% of the history is
useless to anyone. You wouldn't manage a database like this, so why force SCM
down this path?

If they used git with, say, only the last year of history in it, they would
have zero issues.
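
(git can at least approximate this on the client side with a shallow clone,
e.g.:

    git clone --depth 1000 https://git.example.com/bigrepo.git

though shallow clones have historically carried real limitations around
fetching and pushing.)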

~~~
exDM69
Partitioning may be the answer, but this is a huge problem for corporations
like Facebook (and the one I am working for). If things have been done with
one giant repo from day one, splitting it up is going to be a major
engineering/political/social problem when there are thousands of engineers
working on the code base and you can't just shut down the business for the
duration of the migration.

