
Advantages of monolithic version control - Tomte
http://danluu.com/monorepo/
======
tzs
> With a monorepo, projects can be organized and grouped together in whatever
> way you find to be most logically consistent, and not just because your
> version control system forces you to organize things in a particular way.
> Using a single repo also reduces overhead from managing dependencies.

This is the major thing I miss about Subversion: the fact that a subdirectory
in a repository can be checked out on its own.

At the top level of our Subversion tree, there were 'web', 'it', and 'server'
directories, reflecting the departments in the company (at least those
departments that dealt with source code).

In the 'server' directory, there were things like 'payments', 'reports', and
'support', for things like payment processing, reporting, and stuff to help
the customer support people.

So let's say we had a programmer in the server department working on the
credit card storage system, and on a script to make a quarterly tax report. That
programmer would just have to check out /server/payments/cc_storage and
/server/reports/quarterly_tax from the company Subversion repository. When he
checks the history in either of those, he only sees commits that affected
those directories or their subdirectories. It really is like they are separate
repositories.

Suppose another programmer is working on the whole payment system. He can
check out /server/payments, and automatically gets cc_storage under that, but
also order_processor, paypal_callback, cc_updater, and subscription_biller.

I was in charge of the whole server department. So I could check out /server
and have a copy of everything we did. The other programmers would usually save
work in progress to a personal development branch, and so every morning I
could update from the server, and then see what everyone had done the day
before. I could do a quick review to check the less experienced programmer's
work, and it also gave me what I needed to write a short note to my manager
letting him know what the server department was up to and how we were
progressing.
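
For anyone who never used Subversion this way, the workflow described above is
roughly the following (the repository URL is hypothetical):

    # each programmer checks out only the subtrees they work on
    svn checkout https://svn.example.com/repo/server/payments/cc_storage
    svn checkout https://svn.example.com/repo/server/reports/quarterly_tax

    # svn log inside a checkout shows only commits touching that subtree
    cd cc_storage && svn log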

~~~
hinkley
I have poked around in the git file format a tiny bit and I think the hash
tree semantics aren’t incompatible with subversion’s commit tree semantics,
which are what let you grab a particular sub tree cleanly.

I kinda think you might be able to convince git to let you check out a
subdirectory. But I’m not sure if any of the plumbing exposes that ability or
if it would take significant surgery.

~~~
DaiPlusPlus
Git supports this scenario as "sparse checkouts": [https://git-scm.com/docs/git-read-tree](https://git-scm.com/docs/git-read-tree) - see this QA: [https://stackoverflow.com/questions/4114887/is-it-possible-t...](https://stackoverflow.com/questions/4114887/is-it-possible-to-do-a-sparse-checkout-without-checking-out-the-whole-repository)
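
For reference, a minimal sketch of the classic recipe from that QA (the
repository URL and path are hypothetical):

    git clone --no-checkout https://git.example.com/monorepo.git
    cd monorepo
    git config core.sparseCheckout true
    echo "foo/bar/baz/" >> .git/info/sparse-checkout
    git checkout master

Note that this still fetches the full history; only the working tree is
restricted.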

~~~
hinkley
Does this allow you to file PRs against the repository you have checked out?
Or does this only work for read-only use? What about CI? How do you convince
TC or Jenkins or Bamboo to do the same thing?

If this does do all that, I think this functionality needs some SEO love
because this pretty much never comes up when I search for the latest ways to
grab part of a repo. All I find are conversations where people are trumpeting
the wrong tools for the job.

Edit: Also, this doesn't seem to let me check out a subtree the way people
mean "check out a subtree". When I check out 'just foo/bar/baz' I expect to
have a directory called baz as my project root. Not a directory named foo with
a single grandchild named baz.

~~~
foxhill
perhaps you could rename the repo to something else/move it somewhere else,
then ln -s the grandchild, if that’s important?
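
Something like this, say, assuming the sparse checkout lives in ~/monorepo
(paths hypothetical):

    ln -s ~/monorepo/foo/bar/baz ~/projects/baz
    cd ~/projects/baz  # work as if baz were the project root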

~~~
hinkley
It breaks down if you need more than a couple of them. But I guess that could
go both ways.

A problem we saw with perforce and the clientspec: when people see a directory
structure, they forget they don’t have all the bits. They make errors of
judgment based on bad info.

------
amag
On the other end of the spectrum, a colleague of mine recently told me about
how, when his previous company (big, 100k+ employees) was starting to adopt
git, some people seriously considered having a separate repository per _file_!
_"Because then you can just set up your project as using version X of file A
and version Y of file B"_

It's good we now have (at least) Google, Facebook and Microsoft as examples of
companies using monorepos. Those are names that carry some weight when thrown
into a conversation.

~~~
tdumitrescu
That sweet abuse of version control reminds me of the fabled JDSL from [http://thedailywtf.com/articles/the-inner-json-effect](http://thedailywtf.com/articles/the-inner-json-effect)

~~~
mathgladiator
That can't be real... is it? No. No way.

~~~
sandos
It has to be made up. I am with you; that can't be real. There are trolls for
everything.

~~~
mathgladiator
It is so well crafted that it must be rooted in truth... I just... no. No way.
This is too much madness...

~~~
icc97
Seems interesting that he would put examples in the comments that would delete
the entire database. Perhaps the guy telling the story is what happened when
little Bobby Tables grew up.

------
zdw
The argument against this is that other external-to-Google-but-started-by-
Google projects like Golang and Android use the multirepo model, with gerrit.

There are pros and cons to each - do you want to have a hugely churning "I
always have to rebase/merge" repo under you, or multiple repos and trouble
keeping them in sync?

Having done both, I'm not sure which is better - it's probably very project
specific.

~~~
fishywang
Android uses the multirepo model not because they want to but because they
have to: you can't have a single repo of Android's size in git and still have
it run snappily. All the tools, including the old repo tool and the new
tooling around gerrit, are there to make the actual multirepo model work like
a single repo.
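
For the curious, repo stitches hundreds of git repositories into one working
tree from a manifest; a rough sketch using Android's public manifest:

    repo init -u https://android.googlesource.com/platform/manifest
    repo sync
    repo start my-feature --all  # topic branch across all projects at once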

~~~
petters
Yes, and lack of proper tools is of course the main reason against a monorepo.
Google had to develop most of its dev tools internally.

------
shados
I always find it a bit funny how monorepos are now this big exotic new age
thing. "Monorepos are the future!".

That very well may be, but I think we're at a point where the majority of
developers started after git came out (since the industry experienced
explosive growth just in the last few years). They kind of forget (or didn't
know) that it's not that long ago we did monorepos because we HAD to. The
tooling to do a system in multi-repos was just not there. It wasn't practical.

Companies like Microsoft, Google and Facebook predate the days when building
your company on top of 3000 repos was practical, and they certainly were not
going to convert everything if they could help it. Thus, they built an
enormous amount of tooling to make it work. It certainly has benefits (and
tradeoffs). With similarly advanced tooling to support you, multiple repos
also have a lot of very nice properties and scale quite nicely.

To each their own.

~~~
beagle3
> They kind of forget (or didn't know) that it's not that long ago we did
> monorepos because we HAD to.

That was never ever the case. We did _centralized_ repos because we had to,
but every single place I worked at, whether they were using SCCS, RCS, CVS,
SourceSafe or SVN, had multiple repositories for different projects. No place
had more than 20 repos, but then, the largest of those was about 250 people
with VCS access.

~~~
shados
Most places I worked at had everything in one repo with folders for projects
/shrugs. Some did have multiple repos, though in my book that's just "many
monorepos". 20 repos for 250 people was quite manageable with old tools. When
one system is split into 4000 repos, that's a different story.

------
peterevans
Using multiple repositories seems like the most natural workflow to have. It's
easy to make a new one; it's lightweight to do, and it allows your code base
to scale naturally. You can set permissions for each repository, so if you did
include some sensitive code within one repository or another, it's easy to
narrow access to it. (Yes, don't include sensitive code in a
repository—but in an early-stage company, you or your engineers may not know
that.)

Multiple repositories make for a forgiving structure for your code base. You
can tailor them however you like.

But once you have a _lot_ of code, they become hard to manage. I see the
utility of a monolithic repository there—now you know exactly where all your
code is: it's in this one repository!

Package managers mitigate a lot of the trouble with pulling in internal
dependencies from other repositories. Nowadays, most languages have a package
manager that can work with a private codebase, so monorepos aren't necessary
to help with that. But monorepos can help if you have a ton of versions
floating around and you don't want to support version 1.5.x of libjohnny when
it's now on 4.3.x. Your code either works with libjohnny as it is right now,
or it doesn't. (Which in turn makes it very clear to you how important it is
to manage API-breaking changes!)

This feels a little bit rambling, but my thought is that there is some analogy
between monorepos and microservices; don't use them until you _need_ to use
them! You'll know it when you get there.

~~~
mnm1
With multiple repos, how do you solve the issue that the private dependencies
installed by package managers are not themselves under source control when you
edit them? They are in /vendor or /node_modules or whatever. It's easy to pull
them down and of course, for other people's dependencies, this is fine. But
say I pull out one component of my app that's used by multiple apps. I set up
my private repo or maybe the package manager works straight with git. But when
it pulls the dependencies down, it doesn't pull the .git directories, and so
that code is not under source control. I make edits, test, then I have to copy
that code back to its original repo, resolve any conflicts, and check it in.
If I need to make a change, I have to do that again. If someone worked on that
code, I have to pull that and merge it in separately. Even if I manually pull
it myself in the right place, I'm still dealing with a git repo inside a git
repo but the two are unrelated so many tools won't work with the inner repo.
Short of writing a bunch of custom scripts, is there a standard way to handle
this situation, which I assume anyone with multiple repos that share internal,
private dependencies runs into?
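
(The closest thing I've found in the npm world is npm link: clone the
component separately, so it stays a normal git repo, and symlink it into the
consuming app while iterating. The package name below is hypothetical:

    git clone https://git.example.com/shared-widget.git
    cd shared-widget && npm link
    cd ../my-app && npm link shared-widget

But that is still per-package-manager glue, not a general answer.)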

~~~
peterevans
Any changes you need to make to internal libraries you install via package
manager would need to be made to the separate repositories that house the
libraries, and then a release would be cut from those changes (generally via a
git tag, using semantic versioning). Once you update the version, your package
manager will allow you to install that update in the other places you're using
it, so all you need to change is the package file.
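
Concretely, the release-and-consume loop looks something like this sketch (the
version and URL are hypothetical, reusing libjohnny from earlier in the
thread):

    # in the library repo: cut a release
    git tag v1.4.0 && git push --tags

    # in the consuming repo: point the package file at the new tag
    npm install git+https://git.example.com/libjohnny.git#v1.4.0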

It might seem like a pain to do it this way, especially if you're rapidly
iterating on a library—it might be that your library is not really mature, or
even used by more than one repository, so it may not even make sense to have
that library in a separate repository to begin with! But once you do have a
mature code base, semantic versioning is a really sane way of managing
dependency updates for the N number of other projects which use your library.

~~~
hinkley
And how do you test if a change in one of these repos fixes the problem you’re
seeing? This is what we’re failing at with our multirepo.

That and resectioning code to split or combine responsibilities in different
ways. Something a monorepo makes trivial.

~~~
peterevans
Ideally, your library code that you're pulling in has some unit testing to
demonstrate that things are working as they should be. (If not, consider
adding unit testing! It's really useful!)

If that is the case, then you can isolate the likelihood of an issue as either
in the library (because unit tests fail there), or in the project consuming
the library (because unit tests succeed in the library).

Without testing, it's hard to have a ton of confidence in where the problem
lies—which is exactly the problem you cite. And while a monorepo (or just a
standalone repository with no separate libraries, which is frankly an easier
setup to manage than monorepos or multirepos!) may make debugging a bit
easier, it's not going to give you much more confidence in your code.

Once your organization outgrows the paradigm where you just have N standalone
projects with N completely separate code bases, and you do need to commit to
either a multi-repo or a mono-repo configuration, it's really, really helpful
to have unit testing to allow you to isolate where issues are.

~~~
hinkley
The class of problems I’m talking about are integration issues. Unit tests
look good but when you put the pieces together...

When the tests that matter cross version control boundaries you pay for it.
Whether the costs outweigh the benefits is something you have to think about.

~~~
peterevans
What does your test coverage look like? Perhaps you're missing something there
that would have caught that bug?

Testing is, of course, no silver bullet. Tests are written by humans, and
humans make mistakes—and it's pretty difficult to achieve 100% test coverage
in a production system. The goal of testing is to have confidence in the code
you've written.

Tests often don't need to cross version control boundaries. You can use mock
data—like what would be produced by the library—on the consumer side, because
you can delegate responsibility for testing that library to the library
repository itself. If your tests work great with the mock data, but things are
still failing, then you can infer that the mock data and the actual data are
different, and your bug is in the library.

~~~
hinkley
For instance, I have a piece of code in a particularly gnarly module that at
this moment is disabled and has been for two sprints due to emergent behavior.
First sprint it had adequate unit tests but not enough functional tests to
exhibit a problem. Second sprint I fixed the testing deficiency and got the
code to work end to end.

Or so I thought. The first time I turned it on in preprod I couldn't turn it
back off again because some piece of data that came from five function calls
away was being shared, and nobody who participated in the PR recalled that
fact.

Most of the code I'm dealing with is in a single module. I have been chipping
away at fixing the insane ball of mud as I can. My coworkers often aren't that
lucky. They come to me for advice on how to deal with this sort of problem,
but crossing two or three modules.

There's no low-friction way for them to fix any of this. They can't just
refactor because of the coordination costs, and also the loss of historical
information when you move a block of code across module boundaries or try to
change module boundaries. This is the prime argument for monorepos in the
literature - not making irreversible decisions on Law of Demeter problems.
It's not my biggest reason, but it's sufficient for most people.

~~~
peterevans
Yeah, I sympathize—that's a tough situation.

I think there are two ways you can look at your choice of configuration: ease of
debugging, and ease of organization. When Google lays out why they use a
monorepo, they are doing so because it simplifies their organization—there are
no longer so many versions of so many libraries and apps they need to support;
there's only one version of anything to support. Either everything works or
everything fails.

But in your case, you're looking at it from the debugging point of view. It's
easier to play around with the code in a monorepo. And that's a totally fair
point of view to have, particularly in your predicament.

That choice of a monorepo doesn't necessarily improve the quality of your code
organization and interoperability. It's still going to be a bad bug to fix.
It's just a little bit easier to debug.

------
iandanforth
I didn't notice a link to "Software Engineering at Google" which is a great
article that goes into the monorepo argument as well as a lot of other cool
practices.

[https://arxiv.org/abs/1702.01715](https://arxiv.org/abs/1702.01715)

~~~
caurusapulus
Even more relevant IMHO is this: Why Google Stores Billions of Lines of Code
in a Single Repository

[https://cacm.acm.org/magazines/2016/7/204032-why-google-stor...](https://cacm.acm.org/magazines/2016/7/204032-why-google-stores-billions-of-lines-of-code-in-a-single-repository/fulltext)

~~~
samkellett
wait hang on...

> At Google, we have found, with some investment, the monolithic model of
> source management can scale successfully to a codebase with more than one
> billion files, 35 million commits, and thousands of users around the globe.

so each commit adds on average over 28 new files?

~~~
icebraining
There are probably commits that add hundreds or thousands of auto-generated
files, raising the average.

Also, it depends on how much you squash - my latest 10 commits turned into
just one when merging to master.
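
A minimal sketch of that squash, assuming the commits live on a feature branch
named my-feature:

    git checkout master
    git merge --squash my-feature
    git commit  # one commit lands on master, however many were on the branch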

------
nextos
I initially found the idea of monolithic repositories hard to digest. But now
I think it's a good idea for some of the reasons outlined in the article.
Namely, it's very easy to depend on other code that the organization has
created.

In the open source world, I have found some Unix distros use the same model. I
know it's not as extreme, but the principle is quite similar. For example, in
Nixpkgs all package definitions (which are actually code in the functional
language Nix) are in the same repository and thus they can all depend on each
other in a very easy and transparent way.

~~~
stcredzero
_I initially found the idea of monolithic repositories hard to digest. But now
I think it's a good idea for some of the reasons outlined in the article.
Namely, it's very easy to depend on other code that the organization has
created._

I remember the days when monorepo was the norm, and distributed version
control was the weird, kooky idea. Mainstream programmers had knee-jerk
notions that all managed environments were too slow.

For game development, monorepo is simpler. If one is using git, one needs some
other software to handle the media part of the repository, otherwise the asset
files become a burden (git-annex, for example).

~~~
jameshart
Do you really mean a monorepo, though, or just a project repo whose scope is
one entire game? Because a monorepo for multiple games - including released
and in progress ones - seems likely to create a lot of pain in the long term.
The release cycle for games seems much more suited to a release branch model,
which would kind of require per-game repositories, and some sort of package
versioning for common dependencies. I guess maybe with things like mobile
games, where you have a constantly moving target platform, even ‘released’
games are live code, so maybe I’m just betraying an outdated ‘gold master’
kind of mindset here?

------
jedberg
The catch is the tooling. If you have the time and resources to make the
tooling that is necessary to make it work specifically for your org, then
great!

But if you don't, then a monorepo will generally slow you down because it will
require coordinating changes across a much bigger group of people.

Monorepos are great for very small companies with a low communication
overhead, and very large companies with the resources to build the tooling to
make it work.

For everyone in between, I feel that small repos and microservices give the
best developer velocity.

~~~
joshuamorton
But I can say the same thing about multiple repos!

I've seen more than one company now that has had the same problem: how do they
land atomic cross-repo changes across their multiple git repos? The reasons
for this can vary, but the core problem is always that. As far as I see it,
there are two solutions:
are two solutions:

- Use a monorepo

- Create some external database that ties multiple hashes together for use in
your ecosystem. This also requires re-inventing bisect on top of this
database. I'm sure the intelligent people of HN can come up with the multitude
of other tools you need to modify to make this work, but it's not trivial.

If you're willing to manage that overhead somehow, that's fine, but I can't
imagine it's fun.

~~~
shados
Eventual consistency. The great thing about multi repo is the ease of
decoupling the pieces so they can evolve separately (you can do that in
monorepos too, but it's not quite as natural).

You're free to PR changes gradually, making sure things work a couple of repos
at a time, until you eventually get everything. If you can tolerate temporary
inconsistencies, it allows you to scale to infinity, essentially for free.

~~~
joshuamorton
>Eventual consistency. The great thing about multi repo is the ease of
decoupling the pieces so they can evolve separately (you can do that in
monorepos too, but it's not quite as natural).

I don't see how you can do this any better in a multi-repo than a monorepo
though, unless you mean to the extent of simultaneously having multiple
versions of the same library in your transitive deps (and thus kind of
kludgily sidestepping the diamond dependency issues). Would you mind
elaborating?

>You're free to PR changes gradually

This is possible in a monorepo too, by much the same means, I'd expect: you
define an adaptor that you slowly migrate everyone onto, deprecate the old
thing, and then optionally remove the adaptor and deprecate it too. Am I
missing something?

~~~
shados
> I don't see how you can do this any better in a multi-repo than a monorepo
> though, unless you mean to the extent of simultaneously having multiple
> versions of the same library in your transitive deps (and thus kind of
> kludgily sidestepping the diamond dependency issues). Would you mind
> elaborating?

Leaf repos (projects nothing depends on, like apps) can do literally whatever
they want at any time without affecting anyone else. Right there is a big win.
Dependencies can have multiple versions, and a leaf can depend on whichever
version it wants at any given time. You can do the same thing in a monorepo
if everything is in independent folders, but that's just the worst of both
worlds.

> you define an adaptor that you slowly migrate everyone onto

No no. The way we do it is: make the breaking change, and people upgrade to it
whenever (with gentle pushes so that we're eventually all on the same version
sooner rather than later). No adapter, no transient state within a service.
Just upgrade repos one by one until you've got them all, no magic involved.
This assumes that your repos represent loosely coupled components
(microservices or micro apps).

~~~
joshuamorton
These two things you state only work if you have no shared deps. If I depend
on A and B, and B also depends on A, I can't use whatever version of A I want;
it has to be compatible with B. This means that I'm forced to delay my upgrade
of A until B has done so.

The longer the chain of deps, the worse this is. Even if everyone takes just a
couple of days to upgrade, which IME is generous, your leaves end up being
forced to wait weeks to upgrade in the worst case.

~~~
shados
Depends: if B declares that it is compatible with both versions of A (because
it uses methods that did not break), then you sure can upgrade whenever you
want (and potentially start using the new methods that contain the breaking
change).

If the change is breaking for B, then yeah, you have to wait until B upgrades
(or you can upgrade it yourself!). During that time, other projects that don't
depend on B can go on using the new A.

The alternative is "stop the press, everyone is upgrading to the latest A
NOOOOOOOOOW", which if the change is not automatable, might either be non-
realistic, be pretty large in scope, or require you to never make drastic
changes in A (which is tricky if A is a 3rd party you don't control). You can
also just have these long-running transient state where somehow A is always
compatible with everything no matter via compatibility wrappers.

It's tradeoffs. I like our world where we don't have to migrate everything at
the same time all the time, even with drastic changes. Works quite well for
us, with thousands of repos and 10s of millions of lines of code. We enjoy the
flexibility. It makes certain cross-project efforts harder. That's the
tradeoff.

------
jschrf
I think that the question of whether or not to use a monorepo is not a
function of "how to organize code for efficient coding". It's more of an
actual organizational question; how and how often do you deploy, what are the
actors and their needs, and what's the domain? Are you producing statically
linked C libs? interrelated NPM packages? SaaS?

If you look at monorepos from a package management standpoint, they are
usually just high-level graphs of dependencies. Or, more accurately, sources
of dependencies. How these are rolled up, shipped, and ultimately deployed is
a function of the operational culture and business needs more than it is
source control or even language choices, in my opinion. Business needs impact
source control in any sufficiently complex, source-controlling org.

That's not to say that, for example, small companies benefit from monorepos,
while large ones benefit from small repos and packages (or the opposite). I
think the pros and cons are entirely decoupled from codebase size and
complexity. In order to do one or the other well, you need the right business
needs, operational parameters, engineering culture, and resources. So, I
always find it interesting to read about Google or Microsoft leveraging one,
the other, or both approaches with their own codebases.

I liken the monorepo vs small packages approach a little bit to rendering a
frame on a CPU vs a GPU. Do you build/test/deploy each "frame"
(iteration) as a top-down, more-or-less-discrete block of work, or can you
parallelize it and "ship" multiple compatible streams at once?

I suggest it depends almost entirely on the problem space and the "hardware"
(business needs), far more than it does the actual code or volume of code.

Perhaps Conway's law applies here in a sense, i.e. any organization that
manages source code will produce a source control management scheme that is
representative of how it deploys to downstream consumers.

------
kingosticks
How do people handle the case where, in a large repository with people
committing almost constantly, pushing your changes to a git server on the
other side of the world becomes quite tricky? By the time my push gets to the
server, someone has gotten in before me and my repo is out of date. I have to
pull the new changes and try again. During busy parts of the day (my
afternoon, the U.S. morning) you might have to loop this process a few times.
If I were to check that the code still builds after each of those merges I'd
be there all day, so there is some risk there. When we had CVS this wasn't an
issue, since most changes are to completely different parts of the code base,
so you don't need their changes.
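
The loop we effectively run by hand, minus the build check, is just:

    until git push; do
        git pull --rebase
    done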

~~~
ambivalence
You don't ever push straight to the repo. You enqueue your changeset to be
pushed by a central system. At Facebook we call this "asynchronous landing"
and you're right, before this became a thing about five years back, at certain
times of day it was very tricky to push your changes out.

Once you switch to this model, you can do convenient things like landing
straight from the review system (by a "Ship it" button), landing after some
checks were successfully performed, and so on.

Incidentally, this is similar to how merging pull requests by rebasing works
on GitHub.

~~~
kingosticks
This does sound like the answer. I wonder if our Atlassian solution supports
this, perhaps they call it by another name.

------
meddlepal
I warmed up to the monorepo a while ago, but tooling and reference material
that help guide an org into adopting a monorepo and using it properly are hard
to find.

~~~
mikepurvis
People always talk about how it takes all this tooling to have an effective
monorepo— it definitely does; many moons ago I was a Googler too and
experienced all the fun little perforce wrappers that would check out the
portions of the tree you needed to build whatever it was you were trying to
build.

But having things split across many repos takes a lot of tooling too. So
really, for each approach you're looking at which portions of it are covered
by the version control system itself, and where the gaps are that have to be
plugged by auxiliary stuff. And of the auxiliary stuff, how much of it is
standard enough to be things you can inherit pretty much off the shelf from
your distro or other ecosystem vs. something you need to actually build and
maintain in-house.

Other organizational priorities come into play too, like the importance of
open source in your codebase, and what your relationships are to your
upstreams, if you have them. It actually surprises me that there aren't
more/better tools out there that help with synchronizing commits (or portions
of commits) in and out of external standalone repos. The main patch-management
tool I'm aware of is Debian's quilt, which pretty much just boils down to a
handful of bash scripts and arcane conventions. Why isn't there more stuff in
this space?

~~~
astral303
The difference is that multirepo large project dev and dep mgmt are problems
solved today with existing, widely used tooling. With monorepo, I’m DIYing
scalability onto a system not designed for it.

~~~
mikepurvis
Oh, I totally agree with you. I do robotics, and my company inherited a multi-
repo approach kind of by default from our upstream ecosystem (ROS). So we get
a lot of the benefit of our upstream having tackled a number of the issues
around having a product made up of hundreds of repos.

Now, our upstream doesn't _actually_ ship very much, and what they do ship is
relatively slow moving. So we've had to extend the supplied tooling in various
ways to truly meet our needs.

------
perfmode
Thinking that git is the only option is a harmful and pervasive form of
Stockholm syndrome.

~~~
JetSpiegel
Thinking that SVN is an option is even more harmful.

------
dpflan
I'm supportive of this approach; it's been an interesting shift from
"multirepo". I think an advantage does come when a common language is used
across projects, which is what I am familiar with. It can make it easier to
enforce design patterns and re-use common code/dependencies (grep, text
search, etc. all work in the same place you're working in), which I think has
a large positive effect on "loading working context" for whatever project you
are working on: when there are common phrases, patterns, and libraries in use,
it is easier to discuss with or get advice from co-workers.

------
jayd16
I personally find multi-repo thinking leads to better architecture. Cross-
project change history seems nice, but if the projects are that coupled in the
first place, why are they different solutions to begin with? If you're
building decoupled code you shouldn't need cross-project changes.

That said, I understand the worth of getting things done at the expense of
rigor so I chalk this topic of discussion up to personal taste. It's akin to
the dynamic vs static typing debate.

~~~
remus
> If you're building decoupled code you shouldn't need cross project changes.

That's the theory, but in practice designing robust, future proof APIs has
proven to be really hard in a lot of cases. You're then left with the option
of supporting old APIs forever or migrating dependent code to new APIs, both
of which are difficult in their own ways.

~~~
jayd16
That has nothing to do with requiring an atomic cross-project change. If the
library you depend on needs to be updated, update it as needed.

Upgrading other projects to support the new dependency version can come in a
different commit. You only run into trouble when you've set up your projects
to always use the latest version of their dependencies. That's a recipe for
disaster.

------
jncornett
I think it boils down to two things that are not mutually exclusive: 1. Are
you building a monolith? 2. What does your org chart look like?

I work for a fairly large company where a monorepo doesn’t make a whole lot of
sense because each team runs several services that get released or patched
independently. If you have a large product composed of many components that
need to function together as a cohesive whole, go ahead, use a monorepo.

------
9034725985
Here is a naive question:

Let's say I want a small application with flask and angular.

I create a single repo for both flask and angular. I put everything flask in
one subfolder called backend, and I put everything angular in another folder
called frontend.

WIP here:
[https://github.com/kusl/flaskexperiment](https://github.com/kusl/flaskexperiment)
or
[https://git.sr.ht/%7Ekus/flaskexperiment/](https://git.sr.ht/%7Ekus/flaskexperiment/)

Now the problems are just starting: how do I set up CI for all my projects?
Travis CI expects a single file at the root of the project, and so does GitLab
CI (hi Sid, big fan).

I am sad I can't talk to the experts at Google about how they navigated these
problems. I understand Google has many enemies that are constantly trying to
exploit whatever but I still wish we could have a more open conversation here.
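
The best I've come up with so far is a single root CI config whose script fans
out into the subfolders (run_tests.sh is a hypothetical per-project entry
point):

    # single CI entry point at the repo root
    for dir in backend frontend; do
        (cd "$dir" && ./run_tests.sh) || exit 1
    done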

~~~
skovorodkin
> Travis ci expects a single file at the root of the project and so does
> gitlab ci

GitLab has a related issue to support "Several .gitlab-ci.yml for monorepos"
([https://gitlab.com/gitlab-org/gitlab-ce/issues/18157](https://gitlab.com/gitlab-org/gitlab-ce/issues/18157)).

~~~
connorshea
We also have an issue for running jobs only when modifications are made to a
given file or files in a directory; it's tentatively scheduled for 10.7
(April's release).

[https://gitlab.com/gitlab-org/gitlab-ce/issues/19232](https://gitlab.com/gitlab-org/gitlab-ce/issues/19232)

------
vog
Is there any Free Software VCS that specifically targets monorepos?

If not, which of the "usual" VCS are best suited for monorepos? CVS?
Subversion? Darcs? Bazaar? Mercurial? Git?

Should one use Mercurial simply because Facebook uses (and patches) it, or are
there better choices for small-to-midsize organizations?

------
qwerty456127
OMG, someone is finally going to tell me how SVN can be better than Git in
particular cases! I have been asking this question for years out of pure
curiosity, yet everything I've received in response were accusations of
trolling.

------
adsharma
There is a people aspect to this trend.

There is a lot of developer movement between the companies cited and it's not
surprising that people take the practices they're familiar with to their new
employer.

Companies incentivize people to deprecate feature X and replace it with
feature Y and celebrate the win. Much harder without a monorepo.

The counter argument is open source, where development follows the distributed
model, and the difficulty of syncing the monorepo with a custom build system.
Figuring out a way to leverage the QA work distro people do in
coming up with a consistent cut + patches would benefit everyone regardless of
repo structure.

------
oselhn
You can successfully use this approach only if you do not have external
dependencies; if you use some library, you have basically the same problem as
with multirepos. I worked in companies with a monorepo, and it worked really
well, except that all external dependencies were copied into the monorepo,
outdated and with unpatched security problems, because no one ever updated
them unless some functionality was missing. Also, there were some internal
libraries for logging etc. which were worse than publicly available
alternatives, but it was easier to maintain them than to use something
standard.

------
vog
There should be " (2015)" added to the title.

(BTW, I was slightly confused not to find a single date on this blog post, and
the HTTP headers were also useless. But at least there are rough timestamps at
the main site.)

------
rkangel
As a non-Google/Microsoft/Twitter person, if my org wanted to start using a
monorepo, where would I begin? Or is the relevant tooling all closed?

------
oweiler
Doesn't history become a mess with Monorepos?

~~~
detaro
Alternatively, it is a mess to track connected changes through the history of
many repos. In that case, I'd think the default tooling is actually better at
solving the problem in the mono-repo scenario.

------
ot
Relevant talk by Google at CppCon last year; it focuses on C++ but really
talks about getting rid of versioning in general.

[https://www.youtube.com/watch?v=tISy7EJQPzI](https://www.youtube.com/watch?v=tISy7EJQPzI)

------
skybrian
re: "I’ve seen APIs with thousands of usages across hundreds of projects get
refactored and with a monorepo setup it’s so easy that it’s no one even thinks
twice."

This makes it sound much easier than it is. You still need to get approvals
from all the teams whose code is touched. In the meantime, code may be
changing out from under you. And the more code you touch, the more tests you
need to run.

For anything but the most obviously risk-free changes (where a global owner
can approve it), splitting up a large change into independent pieces and
sending out a bunch of changelists in parallel will make more sense. There are
tools to do that too.

------
ramshanker
Can anybody point me to a free version control tool to be used with a
MONOREPO?

I mean, do we simply use one git repo with multiple folders for each
Project/Library?

------
coldcode
We build our 2-4 iOS apps out of 70 separate git repos, each creating a
dynamic framework. I told some folks at Apple once and they were ROFL.

------
krig
Monorepos are great if you are a monoculture. I don't find it surprising that
Google, Facebook and Twitter all enjoy and benefit from monorepos. I also
don't find life inside the monochromatic empire particularly appealing.

I think the benefits that massive corporations derive from monorepos
demonstrate how massive corporations are a net negative to society. Imagine
if, instead of an increasingly centralised and closed culture in technology,
companies were small enough and interdependent enough that a decentralised
model became a net positive.

------
bahmboo
Monorepo? Why not just keep everything in one directory? Let's go 8.3.
Everyone in the same functional division? No way. Dependencies in your source
are managed at a higher level. This whole thing is nonsense because the orgs
can't figure it out. If you cannot figure out git or mercurial integration,
it's either institutional breakdown or...

