Hacker News
Amazon's Build System (gist.github.com)
33 points by todsacerdoti 53 days ago | 22 comments

> I've heard descriptions and seen blog entries about many other large companies build systems, but to be honest, nothing even comes close to the amazing technology Amazon has produced.

> Say what you will about Amazon's frugality, their turnover, and their perks, but the tools available at Amazon make it a world-class place to build software.

Haha, funny. I mainly object to the fawning here. There are a lot of good ideas in the build system, but there's also more to it, like CI. Any (hypothetical) build system that runs the tests after commits have been merged is not great. Do constantly blocked pipelines sound fun? (Edit: one of the comments mentioned dry-runs, but for example, if the merge behaviour of the code review tool was atrocious, so people were accustomed to overriding it, and people did not pay attention to the dry-run, because there's no huge green tick like GitHub, then that may be ignored far too often.)

There are also other issues. A build system that was originally made for e.g. C++ might struggle with other languages, and the experience can be sucky. As good as something like Apollo sounds, it might not deal too well with deploying e.g. Lambdas or other cloud resources.

Don't drink the Kool-Aid. There are some good lessons to be learned, but you can have a pretty great system with e.g. GitHub or GitLab - maybe even better for developer productivity while not being weighed down by years of legacy and inertia.

> Any (hypothetical) build system that runs the tests after commits have been merged is not great.

This is true and it's a problem with Brazil at Amazon. Certainly, if every developer does a dry-run build they can verify that their change will build and test. Dry run builds are super-easy to do, run pretty fast, and there's a great interface for seeing the test output logs if it fails. But if a developer has permission to push to a package they can do it without doing a dry-run and then the tip of your main branch is not green.

That said, I've also worked with a CI/CD system (in a large monorepo) that tested every commit before merging. Even that system could not guarantee that the tip of the main branch was green, because the change velocity was so high it wasn't practical to serialize the changes. Instead, every change was tested against a known green revision and if it passed and merged cleanly it would be put on the master branch. However, the master branch still had to be tested again because it was possible for two changes to test green and merge cleanly into a revision that failed tests. Serializing would have taken a lot longer to run (since you couldn't test in parallel) and one bad commit would flush the entire queue, meaning your change might be delayed by someone else's bad commit. I think this was the right trade-off for that system.

My time at Amazon is what firmly convinced me that the “monorepo is the only way to develop at scale” folks just hadn’t seen something like Brazil.

It (and Apollo) were truly works of art, and they were designed and built long before everyone else was talking about reproducible builds.

Apollo had us deploying exact copies to our dev environments before Docker came along and made it easy.

I have no idea how the designers had so much foresight.

My first couple of years at Amazon I was completely mystified by Brazil, version sets, and VFIs. Eventually the lightbulb clicked and when I moved on to a company with a monorepo I really missed Brazil. My main gripe with Brazil is that there are few ways to use it properly and a lot of ways to use it wrong and when you use it wrong you easily fall into dependency hell.

You seem to be one of the rare people who has used both Brazil and a monorepo. Could you elaborate? What did you realize when the lightbulb clicked? What do you miss outside of Amazon? How can Brazil be misused?

So the main thing that clicked with Brazil was really understanding how to use versionsets effectively to model software stacks and in particular thinking about a package's dependencies as part of its public interface.

A common anti-pattern I saw at Amazon was for some team to need some functionality, see a package in the "live" version set that had it, and grab it without thinking about how that package's dependencies aligned with their application's. Teams also frequently used the wrong dependency type (runtime, build, test). Things will "work" if you declare a test dependency as a runtime dependency, but you are setting yourself up for unnecessary version conflicts down the road (you can have multiple versions of a package in your test dependency closure, but not your runtime dependency closure).
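A rough Python sketch of why the dependency type matters. The package names, the `name-major` naming scheme, and the graph layout are all hypothetical, and this is not Brazil's actual resolution algorithm; it only illustrates how declaring a test dependency as a runtime dependency can drag a second major version into the runtime closure.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical package graph: "name-major" -> dependencies by type.
Graph = Dict[str, Dict[str, List[str]]]

PACKAGES: Graph = {
    "MyApp-1": {"runtime": ["HttpLib-2"], "test": ["TestKit-1"]},
    "HttpLib-2": {"runtime": ["Json-2"], "test": []},
    "TestKit-1": {"runtime": ["Json-1"], "test": []},
    "Json-1": {"runtime": [], "test": []},
    "Json-2": {"runtime": [], "test": []},
}

def runtime_closure(root: str, packages: Graph) -> Set[str]:
    """Everything reachable from root by following runtime edges only."""
    seen: Set[str] = set()
    stack = [root]
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(packages[pkg]["runtime"])
    return seen

def major_conflicts(closure: Set[str]) -> Dict[str, List[str]]:
    """Package names that appear with more than one major version."""
    majors = defaultdict(set)
    for pkg in closure:
        name, major = pkg.rsplit("-", 1)
        majors[name].add(major)
    return {n: sorted(m) for n, m in majors.items() if len(m) > 1}

# TestKit declared correctly as a test dependency: no runtime conflict.
correct = major_conflicts(runtime_closure("MyApp-1", PACKAGES))

# Same graph, but TestKit mis-declared as a runtime dependency.
misdeclared_graph = dict(PACKAGES)
misdeclared_graph["MyApp-1"] = {"runtime": ["HttpLib-2", "TestKit-1"], "test": []}
misdeclared = major_conflicts(runtime_closure("MyApp-1", misdeclared_graph))

print(correct, misdeclared)
```

With the correct declaration the runtime closure contains only `Json-2`; with the mis-declaration it contains both `Json-1` and `Json-2`, which is the kind of unnecessary conflict described above.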

As for what I missed at the company with the monorepo there were two things: The first was that there was no way to deploy a version of the code that was exactly the same as what was in production plus some patch. The deployment system could only deploy the tip of the master branch, so if you needed to patch in prod you were going to bring with it all the other changes that had landed on the master branch, for better or for worse. There is no reason that a monorepo has to have this problem, though; it was a limitation of the deployment system.

The main thing I didn't like about the monorepo though was that it was next to impossible for me to track the changes that were relevant to me. In Brazil, every package is its own git repo and it's very straightforward to just list the git history of a package and see what has changed. In the monorepo 99% of the changes in the git history were completely irrelevant. Also, twice they needed to rewrite the history of the entire repo to expunge some secret that had accidentally been committed years ago. All the hashes changed and it was very disruptive.

So what's good about Brazil is that it gives you the advantages of a monorepo without all the source code literally being in the same repo. It's also good because you can maintain multiple versions of the same package together and migrate different consuming apps separately. What's not so great about Brazil is that there is really a subtle art to factoring your software into packages and version sets and most times you get it wrong in a way that leads to seemingly unnecessary pain. The fact that you don't have to update all consumers together means that it's really easy to just not update consumers and you pay the price for that eventually. There's another big pain point with Brazil which is that it was traditionally difficult to import open source software from open-source repositories (NPM, PyPI, Maven, etc.). This has gotten much better in recent years and there are efforts to improve it even more.

Not the OP but I am currently working at a company moving towards a Bazel monorepo and I previously worked at Amazon. For me the lightbulb of "this is a great system" didn't click until I actually left Amazon.

Compared to Gradle, Brazil is great because it lets you avoid thinking about a lot of smaller details. You declare a dependency on a major version in the Config file and it'll grab the latest minor version from your versionset. A versionset is basically just a giant package-lock.json or Cargo.lock file for all of your deps that is constantly updated in the CI/CD system. Since artifacts are deployed at a versionset level, you can always look up what specific version of a package is deployed anywhere. This means you don't have to think about minor versions 90% of the time, but when something breaks you can easily find it since the versionset interface is connected to the code browser. You can also easily ask "where does this package version exist" and see all the versionsets that are using a version of a package.
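The lockfile analogy can be sketched as a pair of lookups in Python. The version set names, package names, and version strings below are made up, and the data layout is an assumption for illustration rather than Brazil's real format: a package declares only a major version, the version set pins the exact version, and a reverse lookup answers "where does this package version exist".

```python
from typing import Dict, List, Tuple

# Hypothetical version sets: name -> {(package, major): pinned "major.minor"}.
VersionSet = Dict[Tuple[str, int], str]

VERSION_SETS: Dict[str, VersionSet] = {
    "MyService/mainline": {("Foo", 1): "1.42", ("Guava", 25): "25.3"},
    "OtherService/mainline": {("Foo", 1): "1.42", ("Guava", 29): "29.0"},
    "Legacy/mainline": {("Foo", 1): "1.17"},
}

def resolve(version_set: str, package: str, major: int) -> str:
    """A package asks for a major version; the version set supplies the pin."""
    return VERSION_SETS[version_set][(package, major)]

def where_is(package: str, version: str) -> List[str]:
    """Reverse lookup: every version set that pins exactly this version."""
    major = int(version.split(".")[0])
    return sorted(
        name
        for name, pins in VERSION_SETS.items()
        if pins.get((package, major)) == version
    )

pinned_foo = resolve("MyService/mainline", "Foo", 1)
foo_users = where_is("Foo", "1.42")
guava_users = where_is("Guava", "29.0")
print(pinned_foo, foo_users, guava_users)
```

The same declared dependency (`Foo`, major 1) resolves to different pinned versions in `MyService/mainline` and `Legacy/mainline`, which is why the deployed version is a property of the version set rather than of the package's own config.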

I think one of the other big things I miss about it is how well the tooling worked with multiple packages. Brazil had a concept of a "workspace". Normally if you're working with a package it pulls the deps from S3 or wherever. But if you wanted to work with multiple packages that depend on each other, you would run "brazil ws add package-foo-1.0" and it would clone that package to your workspace. Any other package in your workspace that depends on "package-foo-1.0" would now understand to use the local copy to build instead of pulling it remotely. This worked fairly seamlessly with the IntelliJ Brazil plugin, making cross package refactoring pretty easy. Doing the same with Gradle or npm requires manual work.
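The workspace behaviour described above boils down to "prefer the local checkout, otherwise fall back to the remote artifact". Here is a minimal Python sketch of that rule. The function name, the paths, and the artifact store are hypothetical; this is not how the real `brazil` tool is implemented.

```python
from typing import Dict, List, Tuple

def plan_build(
    deps: List[str],
    workspace: Dict[str, str],
    artifact_store: Dict[str, str],
) -> Dict[str, Tuple[str, str]]:
    """For each dependency, use the workspace checkout if present,
    otherwise the prebuilt artifact from the (hypothetical) remote store."""
    plan = {}
    for dep in deps:
        if dep in workspace:
            plan[dep] = ("local", workspace[dep])
        else:
            plan[dep] = ("remote", artifact_store[dep])
    return plan

# Hypothetical remote artifacts and an initially empty workspace.
artifact_store = {
    "package-foo-1.0": "store/package-foo-1.0.tar",
    "package-bar-2.0": "store/package-bar-2.0.tar",
}
workspace: Dict[str, str] = {}
deps = ["package-foo-1.0", "package-bar-2.0"]

before = plan_build(deps, workspace, artifact_store)

# Stand-in for cloning package-foo-1.0 into the workspace.
workspace["package-foo-1.0"] = "workspace/src/package-foo"
after = plan_build(deps, workspace, artifact_store)

print(before["package-foo-1.0"], after["package-foo-1.0"])
```

Once the package is in the workspace, every dependent build picks up the local source with no per-package configuration change, which is the part that takes manual work in Gradle or npm.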

One of the biggest ways that Brazil was misused was around handling of major versions. For context, only a single major version of a package is allowed to exist in a versionset at a time. If you tried to merge in a different major version of a package into your versionset, your pipeline would fail to build due to "Major version conflicts". One of the biggest sins was around bumping the major versions of the dependencies in a library without bumping the major version of that library at the same time. This would lead to many broken pipelines. Let's say you have a library Foo-1.0 with a bunch of users on other teams. You decide to bump up the Guava version from 25 to 29 and publish the new version of Foo-1.0. Anyone consuming Foo-1.0 would automatically pick up the new version of that lib because it's just a minor version change, however the merge would fail with a "major version conflict" because the major version of Guava they're using in their versionset is still 25. This means you would either have to pin that library back at a previous version, or bump your dependency on Guava in all of your packages to 29.
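The Foo/Guava scenario can be walked through with a small Python sketch. The check below is a simplified stand-in for whatever the real pipeline does, and the requirement maps are hypothetical; it only shows why a new build of Foo-1.0 breaks consumers, and that either remedy from the comment clears the conflict.

```python
from typing import Dict, List

Requirements = Dict[str, int]  # package name -> required major version

def major_version_conflicts(*requirement_sets: Requirements) -> Dict[str, List[int]]:
    """Names that different packages require at different major versions."""
    required: Dict[str, set] = {}
    for reqs in requirement_sets:
        for name, major in reqs.items():
            required.setdefault(name, set()).add(major)
    return {n: sorted(m) for n, m in required.items() if len(m) > 1}

# The consumer's own packages still depend on Guava 25.
consumer = {"Foo": 1, "Guava": 25}
# Two builds of the same library version, Foo-1.0.
foo_old_build = {"Guava": 25}
foo_new_build = {"Guava": 29}  # dependency bumped without bumping Foo's major

before_bump = major_version_conflicts(consumer, foo_old_build)
after_bump = major_version_conflicts(consumer, foo_new_build)

# Remedy 1: pin Foo back to the previous build.
pinned_back = major_version_conflicts(consumer, foo_old_build)
# Remedy 2: bump Guava to 29 in the consumer's own packages.
consumer_bumped = {**consumer, "Guava": 29}
bumped_everywhere = major_version_conflicts(consumer_bumped, foo_new_build)

print(before_bump, after_bump, pinned_back, bumped_everywhere)
```

Only the combination of the old consumer requirements with the new Foo build reports a conflict on Guava (majors 25 and 29). Bumping Foo's own major version alongside the Guava bump would have kept existing consumers on the old build until they opted in.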

I think this last point really highlights the big difference between Bazel and Brazil. Bazel makes bumping versions a pain because you have to upgrade everything at the same time. However it also ensures that if there's a security issue with a lib, everyone is forced to upgrade at once. Brazil allows teams to adopt newer versions at their own pace, however you need a more complex CI/CD system, you have to deal with major version conflicts, and you have to deal with longer campaigns to upgrade libs with security issues. I think the two systems just have different tradeoffs, though the biggest advantage Bazel has is that it doesn't require the tight integration with a CI/CD system so it's easier to open source and operate.

> For context, only a single major version of a package is allowed to exist in a version set at a time.

This is not entirely true. A version set can have an arbitrary number of major versions of the same package. This is convenient if you have multiple applications feeding from the same upstream version set. You can add a new major version to the upstream version set and the applications can all switch over independently. This is of course how different applications can all consume from the "live" VS which has multiple major versions of many packages.

What you can't have is multiple major versions of the same package in the runtime closure of your application. It's fine to have multiple versions of build and test tools, but you can't deploy multiple versions of the same package via a VFI, which includes just the runtime closure. (You also can't have multiple packages providing the same file at the same path, but it is surprising how uncommon that is when you think about it).

Ah right, I misspoke. I was conflating it with the common case, which is that one of the deps for a versionset's target was changed. I agree with your other comment: for the first year or so I just cargo-culted Brazil and versionset changes until I actually had to fix more complicated build issues (e.g. packaging up a third-party lib).

I will also say that it's been my experience that a lot of Amazon developers find Brazil very opaque and frustrating. It reminds me of my experience learning git. In my early days I'd just try to mimic procedures without understanding what I was doing and eventually I'd get in a painful state that I couldn't resolve. It took quite a while to understand how to use both tools to solve problems in idiomatic ways.

How is it an improvement over the monorepo? The article claims it is but doesn't say how.

The main power of Brazil is that every buildable unit ("package" in Brazil) can have multiple branches and then a software stack can be defined in terms of the set of branches ("version set" in Brazil) across all the software in the stack. The "version set" defines the complete build, test, and runtime closure of the application.

In that sense, a version set is very much like a branch in a monorepo. The difference is that version sets are more composable and a version set contains only a subset of the repo. Like branches, version sets can have an upstream version set, so a common pattern is to have three layers of version sets: the central one (called "live" at Amazon) which contains your build tools, open source packages etc.; a shared platform version set which contains code shared across many applications; and then a number of downstream application version sets.
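The three-layer pattern can be approximated in Python with `collections.ChainMap`, where lookups fall through from the application version set to the platform one and then to "live". Treating the layering as a fall-through lookup is an assumption made for illustration; the thread does not say how Brazil actually propagates versions between upstream and downstream version sets, and all package names and versions here are invented.

```python
from collections import ChainMap
from typing import Dict, Optional

# Hypothetical layers: package name -> pinned version.
live: Dict[str, str] = {"Guava": "25.3", "JUnit": "4.13", "BuildTool": "2.0"}
platform: Dict[str, str] = {"AuthClient": "3.1", "Guava": "29.0"}
application: Dict[str, str] = {"MyService": "1.7"}

LAYER_NAMES = ["application", "platform", "live"]

# Earlier maps win, so downstream layers override upstream ones.
effective = ChainMap(application, platform, live)

def providing_layer(package: str) -> Optional[str]:
    """Name of the most-downstream layer that pins the package, if any."""
    for name, layer in zip(LAYER_NAMES, effective.maps):
        if package in layer:
            return name
    return None

guava_version = effective["Guava"]
guava_layer = providing_layer("Guava")
junit_layer = providing_layer("JUnit")
print(guava_version, guava_layer, junit_layer)
```

In this model the platform layer overrides the Guava version from "live" for every downstream application, while build tools such as JUnit are still inherited from the central layer.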

Thanks, but that's what was in the article.

Why is this better than monorepo, specifically? What use cases does it allow that aren't possible in a monorepo?

BTW, the typically talked about giant monorepos do support branching, and the branch is never the entire repo.

A branch in a git repository defines a snapshot of the entire repository. You can't have two branches checked out simultaneously. If your repository has a file at a path, you can't have some part of the repository depending on that file at revision A and some other part of the repository depending on that file at revision B. Version Sets let you do that.

Not everything is git. Subversion lets you do that. So does Perforce. And of course Google's monorepo has its history in Perforce.

Again... People making this claim that Brazil is so much better than monorepos seem to know a lot about Brazil but not much about monorepos.

Brazil was originally built on a Perforce monorepo. It didn't start being git until 2013 or so and even then packages could migrate between the Perforce repository and git repository. So Brazil is not an alternative to a monorepo.

It starts with a distributed approach, from the bottom up. The primitives of Packages and Version Sets match how teams are organized and work. Package versions, VFIs, and package builds are the interfaces & immutable artifacts for teams and software stacks to interact.

Conversely, the monorepo approach seems to spend its time trying to serialize and separate the innate codependencies of its approach.

> Package versions, VFIs, and package builds are the interfaces & immutable artifacts for teams and software stacks to interact.

That doesn't explain anything, it's just a bunch of terms. Monorepos have lots of terms too. Why is it so hard to explain? I suspect it's because people claiming one is better than the other are only familiar with one, so they can't actually back up that claim.

There seems to be little information about it available although the people on the linked page give it a great rating:

> what Google, Facebook, and most other companies of comparable size and larger do is at best objectively less good and at worst wasting millions of dollars of lost productivity,

> Once you understand the build and deployment tools you first wonder how you ever did anything before and then start to fear how you'll do anything once you leave.

The closest thing to an open-source variant is https://qbtbuildtool.com/

I strongly doubt that Google has a less sophisticated build system. They open sourced Bazel, which has a pretty good feature set, e.g. reproducible builds, distributed caching etc.

OK, I should have read more carefully what it does ;-) It is not really comparable to Bazel, because it solves a different problem (stitching repos together). IMHO saying Google has nothing as sophisticated is still incorrect.

Someone elsewhere said it is more like Nix (than Bazel).

