Monorepo Support

corytheboyd · on Sept 17, 2022

I don't get the case against monorepos, and why it's so polarizing.

You can share code without having to stand up infrastructure to host packages and whatnot.

You can separate concerns without introducing the infinite complexity of network io, queues etc. This is kind of a dig at microservices I guess, which have their place functionally (decoupled infrastructure, scaled independent of other services).

You can still deploy to separate targets, a code repository is not 1:1 with a deploy target, that is a fake constraint that never even existed.

Manyrepos ALWAYS end up being second class citizens. Test setup isn't as good as in the monorepo because that means duplicating it N times, and that is obviously wrong.

Common patterns are The Same But Different everywhere and/or there is crazy complexity in sharing code between repositories to alleviate this (often having its own problems)

It's just... all of that goes away with one/fewer code repositories. So... why? I'm not even anti micro-service, monorepo actually makes MORE sense with microservices IMO. Why do we do this?

Before someone points it out, I do recognize that a monorepo can still be poorly architected. We can all rest assured knowing that poor architecture is poor architecture whether it be monorepo, manyrepo, monolithic, microservice, PHP, Rust, blah blah.

camgunz · on Sept 17, 2022

Because in larger monorepos, by definition most of the stuff in there is irrelevant to most people. So you're waiting for checkouts to finish because of a bunch of irrelevant stuff. You're waiting for tests to finish because of a bunch of irrelevant stuff. You're waiting for requirements to update because of a bunch of irrelevant stuff. You're waiting for compilation, CI/CD, deploys, because of a bunch of irrelevant stuff.

Sure, tooling and configuration can mitigate a lot of that stuff, but most tools don't countenance codebases that are so large that most of the stuff in them is irrelevant to everyone. The natural thing to do is to split it up.

I'm personally against microservices; I think they go way too far the other way and tend to encourage some of the worst software development practices (NIH, code duplication, super weird architecture and viral explosion of dependency injection everywhere), but "we have one repo for everything" is also pretty weird. I mean, the most famous monorepos (Linux, OpenBSD, Google) literally invented tools to deal with them. That should say something.

frenchman99 · on Sept 17, 2022

> You're waiting for compilation, CI/CD, deploys, because of a bunch of irrelevant stuff.

You could setup your CI to only recompile what's changed, so you wouldn't be waiting for anything else than what you've changed. This usually requires a bit of work upfront, but once you've done it for 1 part of the codebase it is easy and low-maintenance to replicate to the entire codebase. With Gitlab CI (and others), you can import bits of yaml configuration here and there to avoid code duplication for such use cases.

> I'm personally against microservices

In some cases, you need microservices or at least being able to run only a single part of the monolith through configuration. For instance if you want to host parts of the codebase in a separate virtual machine for security or scalability.

I don't think the choice is a matter of opinion but a matter of technical/business requirements.

camgunz · on Sept 17, 2022

Well, that's my point. The default for pretty much all your tools is "one repo one project". If you don't want tools to work on the entire repo, you've got to configure them specially, if that's even possible, and you have to hope it works well.

You've gotta weigh the trade off of whatever benefit you're getting from monorepos against the irritation of tooling that doesn't really countenance them.

FridgeSeal · on Sept 18, 2022

> You could setup your CI to only recompile what's changed, so you wouldn't be waiting for anything else than what you've changed.

What’s it like having CI pipelines that aren’t trash? I’m stuck with the awful choice of CI tool my work has made, and there’s not a lot I can do about that.

azemetre · on Sept 17, 2022

There’s also project repos where all the code pertaining to a project is under one repo. I think this setup is the best of both worlds.

If you’re a team that has a client, several micro services, DB, etc it’s way better to have that under a single repo than spread to multiple. Monorepos don’t have to be gigantic monstrosities, they can encapsulated products.

lucasyvas · on Sept 17, 2022

I like to call this a "macroservice" or "business function" repository. If you are dealing with the Billing function, everything to do with that is in the billing repository (UI, API, scripts, workers/jobs, database migrations).

This is my favorite for sure in terms of balance - you don't need the overhead of one repo per deployable artifact, and the size of the repository has a reasonable theoretical maximum and can still come close to fitting in your head.

You can also test it all together extremely easily and it's the perfect slice for a team to work on, or be shoveled off to a different owner of that function in the future.

8note · on Sept 17, 2022

I'd certainly prefer to keep each microservice in its own repo.

That way in a reorg, the services can be redistributed in a way that makes sense without having to make code changes.

Having two monorepos where you only own half of each seems like the worst of both worlds

azemetre · on Sept 17, 2022

What if your team is the only ones that consume the micro service? Doesn’t make sense to me to split it. It’s also way easier to take stuff out of monorepos than put them in IMO.

drewcoo · on Sept 17, 2022

> I'd certainly prefer to keep each microservice in its own repo.

Having a single monorepo solves that, too.

unity1001 · on Sept 17, 2022

> code duplication

That makes it very easy to roll out changes to any given service without breaking others and helps a lot with the backwards compatibility of services. Makes everything more resilient.

Shared code is always an internal dependency. You think 'We are sharing code and this is efficient'. But you end up having to take into account many different parts of the application when rolling out one change for you can easily break something totally out of sight while trying to change another.

I were mostly against duplication of code. But the more I develop and maintain larger systems, the better I see the value of separating codebases, even if this includes code duplication.

camgunz · on Sept 17, 2022

Yes! Great OK a microservices debate. Let's do this.

My overarching argument is this: I would characterize microservices as a response to organizational and cultural challenges. However, separating a software project into multiple repositories with their own dependencies, deployments, tests, and philosophies adds needless overhead and complexity to systems. Further, because code can no longer be shared across these component parts of the system, requirements, code, deployments, documentation, etc. must be duplicated. But this duplication is done--if it is done at all--imperfectly, because it is laborious and tedious. Finally, microservices tend to drift. Some are written in Node, others are written Python or Go, etc. Someone wants to try functional programming, someone wants to try Hexagonal. This leads to brittle, badly documented systems that by definition no one completely understands and that require a mountain of ancillary software to manage and operate.

> That makes it very easy to roll out changes to any given service without breaking others

Functionally, there's no difference between this and copying a function to modify for the new functionality. You're saying "I need to modify [functionality X] in order to deliver [feature Y], but other systems rely on [functionality X], so I have to carefully modify [functionality X] to avoid breaking [feature A-X]".

A solution to this is to just make [functionality X_new] and use some if statements. It's not elegant, but neither is forking a new repo to avoid refactoring. I'd characterize that as extreme technical debt, and would recommend refactoring instead. I understand that in lots of shops, forking a new repo is actually easier--you can break free of sclerotic design processes or overbearing colleagues/managers/architects. But the systems that allowed those problems into that project will soon force them into your new project. Microservices are a short-term solution to a long-term organizational or cultural problem.

> helps a lot with the backwards compatibility of services

There's two ways to do this. You can set up integration tests that prevent you from deploying if you've broken compatibility. Or you can fork a new repo whenever you need to make a (potentially breaking--which is all of them) change. I would recommend integration tests. You're gonna need them eventually.

> Shared code is always an internal dependency. You think 'We are sharing code and this is efficient'. But you end up having to take into account many different parts of the application when rolling out one change for you can easily break something totally out of sight while trying to change another.

This more or less applies to any part of any software system unless you're very careful about shared state and side effects. The alternative here is unit testing, which, similar to integration testing, I recommend because you'll also need that eventually too.

> But the more I develop and maintain larger systems, the better I see the value of separating codebases, even if this includes code duplication.

I'm not at all a hardcore "don't repeat yourself" person (more "rule of three") but to the extent I am, I tend to think code duplication indicates you need to reconceive the mental model of your application, not merely create a helper method or whatever.

My point here is that for me, code duplication isn't really the worst part of microservices. The thing I find most objectionable is that they lead to weird, rickety systems that are hard to develop and maintain. You haven't lived until your tickets for a sprint involve fixing multiple bugs in code copy-pasted across half a dozen microservices, or you have to deploy your changes to this microservice you've never worked on before and it involves the most bonkers incantations you've ever read about (update this Jenkins script blah blah).

I think people have this idea that microservices reduce scope, interdependence, and complexity. Maybe sometimes that's true. But for the engineers who have to work across multiple microservices, you really get all of the bad and none of the good. You're still dealing with a big software project, but now it's sprawled across multiple repositories all with their own idioms, idiosyncrasies, languages, copy-pasted code, deployment setups, etc. etc. ad nauseum.

You might argue I've only seen bad implementations of microservices. Sure, that's possible. I'm not saying they can't work. I'm saying the forces that lead teams to adopt microservices inevitably corrupt all projects no matter their architecture, and that microservices incentivize a particular type of shorttermism and myopia that makes some of the scenarios I've described the path of least resistance. I think there are far fewer pitfalls with "1 repo per project" (not "1 repo per company") and that we have great tools and techniques to help teams using this structure.

unity1001 · on Sept 17, 2022

> However, separating a software project into multiple repositories with their own dependencies, deployments, tests, and philosophies adds needless overhead and complexity to systems

Yes, for the organization. However when the organization is larger, this may not be a disadvantage - if the teams are already as large as small startups, then it only makes sense that they have their own repo if they are doing microservices.

They drift. True. But that's no different from the dependencies in a repo that comes with reusing code. Its good practice, but it also has its drawbacks.

> A solution to this is to just make [functionality X_new] and use some if statements

That's a worse practice in the long run. Such exceptions and slightly modified functions complicate the codebase and make it more difficult to gain keep context for anyone working on that codebase in the long run. There are situations in which this is inevitable. But if it can be avoided, it should be avoided.

> You can set up integration tests that prevent you from deploying if you've broken compatibility

Nope. Trust me, you eventually can't. Things will get complicated in the long run. You wont be able to have tests for every important angle, use case or function and maintain it. User-facing interfaces and functions are even more difficult - they involve combining all of those different services and functionality in a coherent whole. Move one brick and everything will get disrupted. You can try. But your tests, your commits, deploys will take much longer and everything will get more complicated.

> This more or less applies to any part of any software system unless you're very careful about shared state and side effects

Yes it does. Microservices is a way of avoiding that as long as possible. Eventually testing will still get complicated in user-facing functionality. But until your app becomes such a large and well-featured and used one, you have pretty good runway with microservices.

> I tend to think code duplication indicates you need to reconceive the mental model of your application, not merely create a helper method or whatever.

You can do that at the start. And you will be able to do it for a good chunk of time. But when your application is large and complicated enough, it will become more difficult to do. Microservices is a way to keep things isolated and contexts understandable as long as its possible.

> they lead to weird, rickety systems that are hard to develop and maintain. You haven't lived until your tickets for a sprint involve fixing multiple bugs in code copy-pasted across half a dozen microservices

Yes, that is an inherent difficulty in microservices. Tracking bugs, logging must improve. They eventually will.

> or you have to deploy your changes to this microservice you've never worked on before and it involves the most bonkers incantations you've ever read about (update this Jenkins script blah blah).

That is not specific to microservices. It can easily be encountered when dealing with a service that is tightly integrated in a monorepo, or even the part of a singular monolithic app.

Simple standards must be applied to all code across all microservices for keeping them simple, easily understandable and modifiable.

> "1 repo per project"

Doesn't that converge to the microservice model...

camgunz · on Sept 18, 2022

> Yes, for the organization. However when the organization is larger, this may not be a disadvantage - if the teams are already as large as small startups, then it only makes sense that they have their own repo if they are doing microservices.

This is a pretty good point. I've not dealt with microservices at a huge company like a Netflix or a Wal-Mart. My experience is "we have a website with 10-20 pages, built by 1-3 teams of ~5 on 40 microservices". Maybe they make sense when you're at the scale of 100s of engineers, but they're not better than Rails/Django for 99% of companies.

> That's a worse practice in the long run. Such exceptions and slightly modified functions complicate the codebase and make it more difficult to gain keep context for anyone working on that codebase in the long run.

Oh, I was saying there's an easier solution than forking a repo. My overall point is that adding new functionality is a core software engineering thing; we should see it coming, and architect our code to--reasonably--accommodate it. To that end, sometimes we have to refactor stuff, which means leaning on your {integration,unit} tests. You're gonna have tests anyway, why not let them help you maintain compatibility?

> You wont be able to have tests for every important angle, use case or function and maintain it.

A lot of mission-critical industries (aerospace, transit, medical devices) more or less achieve this. It comes at a velocity cost, but it's not impossible. I'm just gonna hand wave and say you can probably get 80% of the effect with 20% of the effort, which is great. The point isn't to catch all bugs every time, the point is to let you modify functions/etc. without forking a new repo. And again, you're gonna have tests anyway.

>> This more or less applies to any part of any software system unless you're very careful about shared state and side effects

> Yes it does. Microservices is a way of avoiding that as long as possible.

Well, another way of saying this is "microservices have this problem too, just later". Sometimes that's useful, but if the agreement we're forging out here is "microservices are useful if you have a big company with lots of services/teams", it sounds like you'll inevitably have this problem. The solution, therefore, isn't microservices, it's managing shared state and side effects, which microservices doesn't have a monopoly on by any stretch.

> You can do that at the start. And you will be able to do it for a good chunk of time. But when your application is large and complicated enough, it will become more difficult to do.

This makes me think we disagree a little about what a microservice is--in fairness it's kind of a vague term. The microservices I've dealt with are like, a small web app in some micro framework (Node, Flask, Tornado, Go) that gets iframe'd into a website along with other microservices. My problem with this is: this should be a single website built in a single framework. Django, Rails, Phoenix, Symfony, etc. are all great at this, or you can get some backend-as-a-service like Hasura or PostgREST if you're willing to use minimally fancy JavaScript.

People will argue, "but $FRAMEWORK gives you no tools for managing shared state and side effects". But it does: the database. And I would also argue that having a single project lets you consolidate the implementation of your business logic. Using microservices, your business logic is copy-pasted across dozens of services. That's pretty clearly bad--wouldn't it be better if there were something like a libcompany that everyone shared?

You'd probably argue that it'd be hard to change libcompany and that every team should have lots of forks of libcompany that implement whatever bespoke changes they need so they can move fast. But that sounds nightmarish to me, especially as someone who's debugged this kind of situation before. And guess what, when you find the problem, you might not even get to fix it because microservices still have dependencies, some of which are unknown, so you lose that benefit also. What you'll probably do is fork another microservice with your libcompany changes.

>> or you have to deploy your changes to this microservice you've never worked on before and it involves the most bonkers incantations you've ever read about (update this Jenkins script blah blah).

> That is not specific to microservices.

It kind of is though, in that with "1 repo per project" you only have 1 weird pipeline to deal with. I can manage that. I find it hard to manage a dozen weird pipelines, or dozens of weird pipelines on different old versions of deployment/testing tools or custom scripts, etc. Maybe the vision of microservices is a fleet of Lambdas carefully managed by Terraform. That sounds nice! My experience with microservices is a hot mess of everything from Chef to Make to Jenkins. Scale matters here: dealing with 1 Chef and 1 Make is much better than dealing with 4 Chefs, 3 Makes, and 5 Jenkins.

>> "1 repo per project"

> Doesn't that converge to the microservice model...

Depends on what you mean by project I suppose. I'm not a fan of Domain-Driven Design, but one of the things I do like about it is how it defines a domain, which is a namespace without clashes. For example take the word "job". In the construction.com domain, "job" might mean the building you're building, which has certain attributes: building address, crew size, etc. In the softwareconsultant.com job domain, "job" might mean a contract you have with a client, so client name, requirements, etc. To me, these should be separate projects--two Rails codebases with two databases, testing setups, deployment pipelines, etc.

I don't think there's any convergence to the microservice model here. Microservices say, "what if construction.com was actually multiple Sinatra apps". Unless you're very into the microservice ideology, it sounds like these projects should stay single projects.

---

Leaving out all the "bad" implementations of microservices (which I think the microservices model makes really easy to do), I still think the "fleet of Lambdas managed by infra-as-code" leads to situations where you have copy-pasted versions of your business logic everywhere. This is solvable with a "libcompany", but microservices proponents are against that because it necessitates "coupling" and "coordination". But that's what a software system is: a bunch of interacting parts. I often find myself lost in a semantic graveyard in these debates because it's like, what is a project, is each microservice a project or is the project now defined in terms of dozens of microservices, blah blah blah. It's hard for me to not see these systems as simply expensive, complicated ways to deploy individual API endpoints with the attendant overhead of multiple deployment/maintenance/testing setups. Maybe that's useful for some organizations, like if construction.com is a spawling billion dollar enterprise with tons of complicated functionality. That sounds like a hard problem. But I think most people don't have hard problems, they don't need Martin Fowler, they just need Django or Hasura.

bckr · on Sept 17, 2022

What's NIH?

simlevesque · on Sept 17, 2022

Not invented here.

jayd16 · on Sept 17, 2022

Monorepo comes with its own set of challenges. Git doesn't scale all that well but is the most popular and supported VCS.

Assuming you get as far as actually having code in one repo, the advantages of monorepo do not come for free. You either use a consolidated build system or you're still linking code using packages. A mono-build is no small task especially if your org is of any sort of complexity.

You'll almost certainly never get away from packages entirely unless you want to pull in the source code of all your dependencies. Not only are you merging it in but you're integrating it into your mono-build system. Doesn't sound feasible or enjoyable to me.

Some people consider mono-repo a fools errand. At most places you can take the pragmatic approach of consolidating repos where it makes sense while keeping the decoupling advantages of packages where it makes sense. They both have trade offs.

drewcoo · on Sept 17, 2022

Believe it or not, there is not only one way to do a monorepo.

It's still possible to have different jobs for different projects, just like before, with some kind of build filter in front of them. Those different build jobs can be managed however they are now. This is common. There is no need for a single giant build mechanism that knows all the things.

Packages are a separate concern. And of course you should use versioned packages, just like you always have. Why re-invent a solved problem? Trying to force library upgrades in n-many services all at once, automatically, is a hard problem. Why invent that, too?

Repo consolidation really shines for me in infrastructure and testing - the things that can touch multiple services at once. That's stuff most devs aren't really involved in day-to-day. I think that a lot of the monorepo hatred comes from not understanding other people's problems.

jayd16 · on Sept 18, 2022

If you're using versioned packages, how does a mono-repo make sense? At least, what is the point? If your build machine creates the package, you can't even push the package update and the consumption of the package in the same commit.

NomDePlum · on Sept 17, 2022

Doing exactly this at a new organisation I recently joined.

We are treating it as an experiment. Got 2 teams with 5/6 Devs in each sharing the same monorepo. We are using nx.dev as the build tool and it's going pretty well so far.

Different tech stacks too but using nx.dev thats been abstracted away. Allows us to share practices and we've built out the CI/CD and supporting infrastructure on AWS together which has certainly saved duplication of effort. Possibly one more team coming on board too.

If in the future it's not paying off we can always split. Doesn't need to be a forever decision, is how we are viewing it.

curlftpfs · on Sept 17, 2022

What do/don't you like about Nx?

You describe my scenario (with fewer teams), difficult deciding between options. (Using Webpack + git to do "component management" and bundle to single-file ES6 modules.)

lxe · on Sept 17, 2022

Yes all these are positives in theory, but in my 5 years of supporting and maintaining monorepos, I've mostly been inundated with incredibly degraded developer experience and hair-pulling CI hacks.

I am convinced that the monorepo (a single repository to store code for all of the organization's microservices) is something google concocted to keep their 250+ infra and devops engineers very busy.

baslas · on Sept 17, 2022

My own experience is that most of the times people view this subject as a matter of personal preference. They do not see this as a technical choice that may (or not) solve problems or help reach particular goals. Thus, when discussing this subject from a logical/engineering perspective, laying down pros/cons arguments, the other party may take it personally.

hk1337 · on Sept 17, 2022

I found that treating each directory as its own separate application or package is helpful. If you need to share resources between multiple apps/packages, use your dependency manager, most of them seem to have some sort of way to use local packages.

echelon · on Sept 17, 2022

Exactly! Frontend, backend, or desktop monorepo structure for Java, Rust, Typescript, etc:

  /src
    /lib
      /mysql-common
      /redis-common
      /...
    /services
      /app-1
      /app-2
      /...

You can use other directories to group assets, database migrations, documentation - whatever. The core build, test, CI, etc. tools can live in the root.

The ability to share common code and not have to worry about packaging internal libraries, versioning them, and rolling out updates n-many times is a game changer.

hk1337 · on Sept 17, 2022

And if an app/package did grow to the point it needed its own repository, all the packages depending on it just need to update their dependency configuration to the new source.

echelon · on Sept 17, 2022

The most typical case of this happening I've seen is an ancient unsupported legacy app. You still need to deploy it, but you only build it every few months or so. You rarely touch the thing, and its dependencies are pinned to old versions. The lack of support tends to drag down the rest of the monorepo, so it'll sometimes get "kicked out" as a bad citizen.

Really only reserved for the worst cases.

majormajor · on Sept 17, 2022

In practice they often are rarely implemented as well as you describe and end up with a fun set of problems of both monorepos AND microservices.

You try to handwave this away with "poorly architected is poorly architected" but... I don't think you can say that and say you don't get the backlash in the same sentence.

You describe that as "all those problems go away" but they don't go away for free! Nor do I agree that they're worth the cost of unfamiliar tooling in most cases, anyway.

They only go away if you get the architecture right, and the monorepo does not have inherent architectural guardrails against getting it wrong. That's the big difference between the hype and the reality.

You can easily get independently-deployed services sharing code in ugly tangled ways. Weird weird ugly crap done because service A just isn't ready to upgrade to the new version of library B even though service C needs it for a new feature, instead of just keeping service A on the previous version of a published artifact for longer. Sometimes that sort of thing is billed as a good thing - "force the team to update all their consumers before making breaking changes" - but in practice you get hacks and weird workarounds.

Weird compilation or runtime errors because this team wanted to use a different JVM language in this part than this other team did in that part, and the tooling got super confused.

"Just use bazel from day 1 and make sure separation of concerns is good" and so on and so on - sure, sure, sure. But now we're not talking monorepos, we're talking specific tooling too. And every layer of doing it differently you add is another chance to fuck up.

0x457 · on Sept 17, 2022

I think it's due to people working in badly setup monorepos:

- CI/CD that runs on the entire thing

- merge policy that requires branch being up-to-date even for FF merges

  - I once spent an entire day trying to merge a PR because CI for other people changes was taking N minutes, but mine was N + 1 minutes. I kid you not, I resigned a week after that day because I just couldn't.

 - Messy organization within the repo

 - People wanted tag releases, so every single service within monorepo was on the same version and some of had months since last activity

 - Whole kitchen sink in one repo: terraform, Ansible, source code, fucking debian packages, secrets (sop) whatever else VPoE thinks is part of engineering.

 - include over an include in pipeline files (this was GitLab CI) that is impossible to navigate because of name collisions.

You gripes about many-repos are IMO wrong, though. If you have to duplicate things, then you're doing it wrong.

robust-cactus · on Sept 17, 2022

I agree with this take. The companies I've worked at that have had multi-repos basically just obfuscated related pieces of code - ultimately leading to lots of bugs.

The fact that I can grep across services is a godsend.

I think we should all actively be fighting against Conway's law: "your code resembles your org structure". Multi-repos are usually a thin facade that basically end up supporting this and makes it harder to dev in and make the architecture typically worse.

https://en.m.wikipedia.org/wiki/Conway%27s_law

drewcoo · on Sept 17, 2022

> we should all actively be fighting against Conway's law

I think the inverse of that, the org mirroring the code, happens as well. And maybe Conway really meant that, too.

I would argue that we should be organizing code in a way that we'd like our teams to be organized. We should use this as a tool for devs to self-organize. Eventually, management will see the cost savings in organizing the people similarly.

web007 · on Sept 17, 2022

You get everything for free, but you also have to take EVERYTHING.

I work primarily on "cloud" stuff, so 95% of my world is Terraform and other (asynchronous) configuration-based stuff. I feel the pain more acutely because of the overhead imposed on my work that shouldn't exist, vs compiled software deployed for end-use-cases where there is an expected burden of build + test.

Because monorail, all of my changes go through the same tests and restrictions and review as executable code. We have to "pass the build" - and waste resources testing all of that executable code for every change. It's even worse being in a pseudo-regulated space (SOC2 / similar) since anything applied to any sub-part of the repo applies to everything.

You can argue that parts of the repo can use different processes, but then what's the benefit of keeping it all together vs having different repos with those different processes?

The only reason monorepos are better today is that git sucks at multi-repo (think submodules), and humans suck at separate repos. It might also be that nobody takes it far enough, where you have separate repos for everything, but that feels like the first two points both together; maybe tooling and/or a massive DevEx team could make that work. I'm pretty sure something would need to supplant git for subtree repos to work and make sense, until then the monorepo is the least bad option of all the bad options.

throwaway894345 · on Sept 17, 2022

Ideally a monorepo doesn’t run all tests all the time, but only the ones that need to based on a given change. Unfortunately, there’s no great tooling for this. Bazel works well sometiems depending on your project, and Nix is sort of in the same boat. Both are beasts to learn. But I think it’s mostly a solvable problem, and the reason there aren’t better offerings is because of the sheer effort involved in building these systems.

web007 · on Sept 18, 2022

We actually do use bazel, where I have a love-hate relationship with that as well.

It's incredibly painful and a lot of overhead to bazel allthethings. Using bazel to define dependencies is another double-edged sword where you can get them enumerated, but EVERYTHING has to use it. You can't check out a repo and go, you have to check out the repo and bazel whatever you're trying to do. No "I know Python, let me venv && pip install -r requirements.txt or pipenv ...", now you have to incant the magic bazel incantations instead of native stuff. Same for NPM/Yarn, Rails, Go, etc. And that's assuming there's good support for your language/framework of choice; a Rails acquisition a few years ago was awful because bazel didn't have good support for it.

We have a double digit number of humans on our DevEx team dedicated to this stuff, and it's still painful. That speaks to the effort you're talking about.

On the love side, being able to query for "what does this dep trigger a rebuild for" without guessing is about as good as you can get, and having a requirements.txt-equivalent that magically packages itself into a standalone set of files with minimal effort is pretty sweet. The overhead to get from 0 to 1 is a lot (like... A _L_O_T_), but 1 to N is easy.

charcircuit · on Sept 17, 2022

>and waste resources testing all of that executable code for every change

If it's a waste why is it even running those tests? That's it's own problem.

web007 · on Sept 18, 2022

Of course, but how do you solve that problem in the general case?

When everything lives next to everything else it's harder to draw the line around the blast radius of a change. If there was a repo boundary, that's a really easy hard constraint.

bilalq · on Sept 17, 2022

Both monorepo and multirepo can work fine. However, you need tooling to be effective with either.

At Amazon, a small team of just 5 people might own around 100 repos. But they have tooling that makes that easier to organize and manage. Google has a ridiculous amount of code in a single repo, but they too have specialized tooling that makes that manageable.

Since you laid out the case for monorepos, I'll share some points about why one might prefer multirepo. These points mainly center around microservices, since you probably are monorepo by default if you have a monolith:

1. Tooling is more straight-forward for it out of the box. For a given service, create a repo from a template and you're live. You don't need to derp around with lerna or yarn workspaces and figure out how to make those work when some repos use maven or rubygems and cocoapods in addition to npm.

2. Better enforcement of requiring services to only communicate over API boundaries instead of code-share. Monorepos make violation of service boundaries too easy.

3. If you're doing a monolith, monorepos may be fine, but there are a ton of problems you run into if you use them for microservices. If you have one mega pipeline for all services, what happens if a single service fails to deploy or fails post-deployment CI checks? If you have multiple pipelines, what do you do when 4/12 pipelines fail? How do you track which commit each service is at in different stages? What happens if CI checks end up failing for some other service than the one you actually touched on your PR submission?

4. Less merge-conflict/out of date branch noise when developing.

5. Able to see which commits, PRs, issues, etc are associated with which service without needing to setup manual labor or build tooling to auto-label and tagging.

6. Possible to introduce fine-grained permissions on different repos/sections of the code. You can limit view permissions of top-secret projects, grant teams more ownership of their own repos, etc.

7. Fine-grained permissioning extends to automated tools. If you install a Github app to only one repo, it's limited to what it can do. This is a blast radius reduction you get only with multi-repo. With a mono-repo, you also have all Github secrets shared with the whole repo. If only one service should have access to certain secrets, you can't model that.

8. If you use git tags for service release annotations, it'd get very noisy to have tags for every service all in one monorepo.

9. If you want to generate automated release notes on deployment or library package publication, where can that go in a monorepo? The Github releases API gives you that for free, but if you're doing a monorepo without a monolith, you're going to have to find or build your own tooling here.

tetromino_ · on Sept 17, 2022

Excellent points! The usual answer to points 1, 2, 3, 5, 9 is "use Bazel (or something inspired by Bazel) for absolutely everything". The usual answer to points 6, 7, 8 is "don't use git (at least not plain vanilla git)". The answer to point 4 is along the lines of "don't hold it that way" and "add custom tooling", but honestly, doesn't seem to have an awesome solution.

gedy · on Sept 17, 2022

> I don't get the case against monorepos, and why it's so polarizing.

Because bad monorepos ala monolithic apps tend toward no separation of concerns, and with a large enough team everyone is stepping on each other's toes in conflicts, tests, etc. Some tech like Rails makes this even harder to enforce boundaries.

I get that a good team can manage a monorepo that's not one big ball of code, and a bad one wouldn't necessarily be better with microsystems either.

But it's super frustrating to hear teams blobbing everything together with no layering or separation and defending it as "but we're a monorepo"

vlunkr · on Sept 17, 2022

Seems like you’re conflating monolith and monorepo.

gedy · on Sept 17, 2022

No I'm suggesting many teams claiming they have "monorepos" conflate that with monolithic thinking and design.

candiddevmike · on Sept 17, 2022

In most organizations the gravitational pull of Conway's Law is so overpowering it forces every team to have at least one repo.

gregmac · on Sept 17, 2022

> You can still deploy to separate targets, a code repository is not 1:1 with a deploy target, that is a fake constraint that never even existed.

I've found repository:deployment is a very useful way of organizing code. I work on a mix of cloud and on-prem apps, and maybe I'm biased from my personal preferences and experience but I see most of the deployment/release problems, confusing merge conflicts and build pain comes from products/repositories where this isn't the case.

I'm not quite sure what you mean by "targets" though, so I'll explain what I mean when I say "deployment", in the context of the mix of stuff I work on: the entire repository is released (and/or deployed) in a monolithic way.

This can be a microservice, an executable or installer (on-prem), a package (consumed by other repositories), or even a big monolithic service (eg, multiple web apps + proxy servers + terraform templates).

My reasoning behind this is that a deployment has to be standalone.

For example a microservice should be able to be deployed independently at any time without breaking anything that consumes it -- meaning you always have to ensure backwards compatibility. If you can't (or don't) do that -- which means you have to do coordinated deploys and deploy several things at the same instant -- you actually don't have microservices; you really have a monolithic architecture with all the complications/overhead of managing a microservice.

I think when you combine multiple microservices into one repository, it's too easy to break this backwards compatibility contract because from the source code point of view, those mixed versions never exist. My experience is that lots of developers seem to struggle with this conceptually and argue about it being unnecessary when it's raised during a PR.

If you don't want to support backwards compatibility between your "microservices" that's totally fine too: but IMHO you aren't doing microservices, and you shouldn't even design the ability to deploy them independently. When you do coordinated deploys and one fails, the rollback process is awful and you can have extended downtime. Instead it makes sense to have a monolithic deployment process (eg: single terraform file) that deploys them all together, and the easiest way to manage this is to have them in a single repository.

There's another challenge with on-prem software. Having a branch or tag for "Foo v1.1" that also contains the source for "Bar" that is at an unfinished state somewhere between v3.4 and v3.5, and likewise a "Bar v3.5" that contains the source for "Foo v-not-quite-1.2" is just nonsensical. Depending on the branching strategy and types of changes happening it also leads to the team that works on "Foo" fixing merge conflicts they don't understand from changes in "Bar", which are really easy to get wrong. So once again: If they're released independently, they should have their own repositories.

jbverschoor · on Sept 17, 2022

To make people feel dumb or smart

malkia · on Sept 17, 2022

Monorepo + micro-services 4tw.

eropple · on Sept 17, 2022

Hey folks. I do devrel here at Render and it's nice to see this floating by on Hacker News--our new monorepo support does a lot to improve the ergonomics of running multiple services out of a single repo.

One note: in Render parlance, "services" includes static websites, so even for systems that wouldn't always be considered a monorepo in other contexts, this is useful to launch a static website alongside your code and, in so doing, more clearly communicate what you're doing to the next person to touch it. (Including six-months-from-now you.)

paulgb · on Sept 17, 2022

Render is one of those companies you don't see a lot on HN, but you talk to developers who use it and they have a lot of love for it. Stuff just works like you'd expect it to. We've been a happy user of monorepo support since it was in beta and it's worked great for us.

samwillis · on Sept 17, 2022

I disagree, any post about Heroku or any of the startups aiming to replace it (Fly or Railway for example) always mention it. From the perspective of someone looking to move to one of them from Heroku:

- Render is closest to what Heroku do, however being on their own hardware they can undercut Heroku who, by running on top of AWS, are both limited but apparently also don’t want to compete on price.

- Fly is both aiming to be a Heroku alternative but also have a really compelling use case of placing many smaller VMs closer to your customers. I believe they are heading towards a target of scale to zero on distributed VM regions for your app. They have some super clever ideas around distributed DB read replicas.

- Railway I think is somewhere in the middle, but I can’t understand their pricing… they say “you only pay for what you use” and the pricing page imply a that you can use fractional resources. It seems confusing as to if they have some sort of auto scale to zero or not. The docs are lacking on in the regard.

For me Fly I think are winning. Probably with Supabase or or Crunchy Data for the DB.

tiltrus · on Sept 17, 2022

Render.com has been a lot more stable for me than Fly.io. After 3-4 months of running into issues running on Fly.io, I went with Render.com and have had no issues so far.

I think Fly.io employees are a lot more active on HN which is why you see them mentioned more often. They also write great blog posts which get a lot of attention here.

I also agree with the sibling comment. I came to Fly.io being sold on servers running close to the users. But I didn't see much of a peformance improvement with my web apps as they all have to communicate with Firebase. There's probably situations where this technology makes sense, but I haven't found a use case yet.

detaro · on Sept 17, 2022

Curious what kind of issues you've seen with fly.io? (Most reports about them I've seen seem to be based on people trying it out relatively briefly, where they might miss things that crop up over a longer time)

jayp · on Sept 17, 2022

If you are going to use a centralize database coupled with edge compute, you may end up with far worse end-to-end latencies if your compute<->db does multiple roundtrips per request (which is fairly common in practice). In most cases, you need compute and db to be colocated.

I am skeptical of edge compute being generalizable. I do forsee a bright future for it for embarrassingly parallel problem space, where data sharding can be done cleanly based on an end user.

stefanvdw1 · on Sept 17, 2022

This is great! I’ve been using Render to host all my projects ranging from static sites to full blown web apps ever since moving away from Heroku. Their free tier is also pretty good for prototyping.

Recommend it to anyone moving away from Heroku or looking for a cheap place to start.

hyperhopper · on Sept 17, 2022

This is patching symptoms.

If you need some hosting/deployment tool to support the way you structure your code repos, there is a bigger problem.

Your own tooling for your repo should decide what gets triggered. It's not even as simple as this article makes it out to be. If I edit a readme.md, should this redeploy because it's in the same directory?

Of course not. Structure your code how you want. Then make good tools to trigger other tools. don't restrict yourself to every downstream job type and tool integrating vertically with your specific structures.

benced · on Sept 17, 2022

Many-repo is better at places that can’t fund a large infrastructure team, mono-repo is better at places that can.

The natural state of a mono-repo is Twitter-like paralysis, it requires concentrated work to avoid that but that work can make them better than many-repo.

porridgepie · on Sept 17, 2022

Tried it for a couple of years, and as many already said, its more that its different than that its a lot better or worse than the altrrnative.