I remember reading about their Git troubles a while ago, and I still don't buy this argument that it is better to have one large repository. One reason modularization is important is precisely what they are trying to get around: it removes the ability to make large-scale changes easily, and thus increases reliability.
However, my understanding is that their desire to have one large repo reflects their "move fast and break things" philosophy, which means not being afraid of making large-scale changes. So I would be interested to hear how they mitigate the obvious downsides, given how many people they have committing to their codebase. It seems like you would just end up having to create constraints in other ways, so which constraints end up being the lesser of two evils?
We've found that "removing the ability to make large scale changes easy and thus increasing reliability" isn't actually correct.
As an example, suppose most of your codebase uses an RPC library, and you discover an issue with that library that requires an API change, one which will reduce network usage fleetwide by an order of magnitude.
With a single repo it's easy to automate the API change everywhere, run all the tests for the entire codebase, and then submit the changes, accomplishing an API change safely in weeks that might otherwise take months, with higher risk.
Keep in mind that the API change equals real money in networking costs, so time to ship is a very real factor.
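The rewrite-everywhere step can be sketched in miniature. This is a toy illustration, not Google's actual large-scale-change tooling, and the API names (`rpc.call`, `rpc.call_compressed`) are invented for the example:

```python
# Toy sketch of a repo-wide API migration: rewrite every call site of a
# hypothetical rpc.call() to rpc.call_compressed(). Real tooling is far
# more sophisticated than a regex rewrite, but the shape is the same.
import pathlib
import re

def migrate(root: str) -> int:
    """Rewrite every Python call site under `root`; return files changed."""
    changed = 0
    for path in pathlib.Path(root).rglob("*.py"):
        src = path.read_text()
        # \b prevents matching inside longer identifiers.
        new = re.sub(r"\brpc\.call\(", "rpc.call_compressed(", src)
        if new != src:
            path.write_text(new)
            changed += 1
    return changed
```

The part that a single repo uniquely enables is what comes next: running the entire codebase's tests against the rewrite before submitting it as one change.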
It sounds like Facebook also has a very real time-to-ship need, even for core libraries.
Normally, libraries are written under the assumption that clients cannot be modified or updated when the library changes. This brings in the concept of a breaking change, and a set of design constraints for versioning. For example, modifying interfaces becomes verboten, final methods become preferable to virtual methods, implementation-detail classes require decreased visibility, etc.
The advantage is that library producers are decoupled from consumers. Ideally the library is developed with care, breaking changes between major versions are minimized, and breakage due to implementation changes is minimal, owing to the lack of scope (literally) for clients to depend on implementation details.
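That discipline usually shows up as deprecated shims: since clients can't be updated in lockstep, the old entry point must keep working across the change. A minimal sketch, with invented function names:

```python
# Sketch of the decoupled-library discipline: the old entry point survives
# as a deprecated shim so unmodified clients keep building and running.
# Both function names and the placeholder body are hypothetical.
import warnings

def fetch_compressed(url: str) -> bytes:
    """New v2 API. Placeholder body, for illustration only."""
    return b"payload"

def fetch(url: str) -> bytes:
    """Old v1 API, preserved because clients can't be updated with the library."""
    warnings.warn("fetch() is deprecated; use fetch_compressed()",
                  DeprecationWarning, stacklevel=2)
    return fetch_compressed(url)
```

In the monorepo model this shim can be deleted in the same change that updates its last caller; in the multi-repo model it may have to live until the next major version.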
But under the model you describe, you're leveraging the Theory of the Firm as much as possible - specifically, reducing the transaction costs of potentially updating clients of libraries simultaneously with the library itself.
The downside is the risk of unnecessary coupling between clients and libraries: the costs of a breaking change aren't so severe, so the incentive to avoid them is lessened, and so the abstraction boundaries between libraries are weakened. If the quality of engineers isn't kept high, or they don't know enough about how and why to minimize coupling, there's a risk of a kind of sclerosis that increases costs of change anyway.
All of our core libs are owned by a team, and you can't make changes to them without permission and a thorough code review. Our Perforce infrastructure allows us to block submits that don't meet these criteria, so we get the benefits of ownership, only using ACLs instead of separate repos. It has worked very well for us so far.
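Perforce's protections table supports this kind of per-path ACL directly. A sketch of what such a setup could look like (the depot paths and group name are invented; later lines override earlier ones, and a leading `-` on a path is an exclusion):

```
write  user   *               *   //depot/...
write  user   *               *   -//depot/corelibs/...
write  group  corelib-owners  *   //depot/corelibs/...
```

Everyone can write anywhere except `//depot/corelibs/...`, which only the owning group can submit to; the review requirement itself would be enforced separately, e.g. by a submit trigger.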
Commit in library -> successful library build, failed client build.
Commit in client -> successful library build, successful client build.
In both situations you don't really care about the intermediate broken build: you still have the previous library/client versions and can use those. Once everything is committed and fixed you have the new versions and you can upgrade.
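"Use the previous versions" here is ordinary dependency pinning; a sketch in requirements.txt style, with a made-up package name and version numbers:

```
# Client's dependency manifest (illustrative).
corelib==1.4.2   # last version the client builds against
# Once the client-side fix for the 2.x API lands, bump the pin:
# corelib==2.0.0
```

The client only ever sees the library at the pinned version, so the window between the two commits never breaks it.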
The only problem I see is if there's a long delay before the second commit. But this can be prevented by a fast CI cycle (always a good idea) and by sending notifications for failures across teams (i.e. the library committer is notified that the client build for his commit failed).
The long answer is that we run Perforce with a bunch of caching servers and custom stuff in front of it, plus some special client wrappers. In fact there is more than one client wrapper. One of them uses Git to manage a local thin client of the relevant section of the repo and sends the changes to Perforce. This is the one I typically use, since I get the nice tooling of Git for my daily coding and can then package all the changes up into a single Perforce change later.
Google has invested a lot of infrastructure into code management and tooling to make one repo work. We've reaped a number of benefits as a result.
As others have mentioned, though, there are trade-offs. We made the trade-off that best suited our needs.
It would be great if someone used that to write a high performance git server.
We actually have an open source tool that allows you to carve off parts of your Perforce server as Git repos. The repos can overlap in Perforce, allowing you to share code seamlessly between different Git repos. You can generate new repos from existing code easily, and can even generate shallow Git repos that are usable for development.
Details are at: http://www.perforce.com/product/components/git-fusion
I'm happy to answer questions here or on Twitter: @p4mataway
The story I got from a Googler was that there are many separate projects in a single repository (which is normal for Perforce), and dependencies are handled by making your libraries subprojects (or whatever they're called; like subrepos in Git, essentially symlinks), rather than using an artifact-centred approach.