
Building Uber’s Go Monorepo with Bazel
https://eng.uber.com/go-monorepo-bazel/
======
simfoo
Might make sense if you use Go or any other language with a straightforward
tooling landscape (compilers, package management etc.).

That changes drastically if your codebase contains a lot of C++ and your
software model doesn't quite match the one Bazel pushes you toward. For
instance, doing any of the following things will quickly turn into a nightmare
when using the Bazel C++ rules:

* dynamically linking libraries

* using a containerized approach for library includes (so transitive relative include paths)

* using different toolchains and cross compiling

* interfacing with third-party libraries
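
To give a flavor of the last point: Bazel has no notion of "just link against
the system library", so a prebuilt third-party shared library typically has to
be wrapped by hand. A rough sketch (the names and paths here are made up):

```starlark
# Hypothetical BUILD file -- library name and paths are illustrative only.
# A prebuilt shared library must be wrapped in cc_import before anything
# can depend on it:
cc_import(
    name = "thirdparty_ssl",
    shared_library = "lib/libssl.so",
    hdrs = glob(["include/**/*.h"]),
)

cc_binary(
    name = "server",
    srcs = ["server.cc"],
    deps = [":thirdparty_ssl"],
)
```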

We are talking seasoned build engineers ending up frustrated after literally
months of trying to achieve something that is easy as pie in CMake.

In addition, there is still no real IDE support. The CLion plugin is
permanently broken and lags behind releases. There is no real VSCode support
either. Using custom rules makes this even worse, to the point where CLion
will refuse to sync and there is no way to produce a compilation database
(compile_commands.json).

There are so many bugs open on those projects, with no progress or answers. I
cannot recommend Bazel if C++ or C is what you care about.

~~~
q3k
> We are talking seasoned build engineers ending up frustrated after literally
> months of trying to achieve something that is easy as pie in CMake.

But it's not the same as CMake. A properly set up Bazel project gives you so
much more: organization-wide incremental builds, build cache and build farm.
Also, actual full hermeticity of builds (no, taking ambient deps from a Docker
container doesn't replace that).

Comparing CMake to Bazel is like comparing some barely-working bash scripts on
a single box to a Kubernetes deployment. Maybe you're okay with just bash
scripts, but some of us aren't, and that's where Bazel comes in.

~~~
keithwinstein
I've been confused for a while about the common claim that Bazel gives "full
hermeticity" of builds -- it doesn't seem to be true in practice (at least for
packages with system dependencies). Maybe you can help me clear it up.

E.g. Google's protobuf libraries [1] can be built with Bazel, and they depend
heavily on system headers outside the repository, e.g. <iostream> and
<stdio.h> and lots more. If those headers subsequently change, Bazel will not
pick up on this and will not know to rebuild the parts of the build that
depend on them.

To reproduce: run `bazel build -c opt //:protoc_lib` and then put random
garbage in your /usr/include/stdio.h and /usr/include/c++/<version>/iostream
and then rerun the bazel command -- it will not know to invalidate the build
cache. If you `bazel clean` and then build again, you'll get different
results.

Bazel does a lot of really nice things and I can believe that within a
google3-like environment (where the source code never references a system
header?), it effectively provides hermetic builds, but in practice as used
outside Google (or even in Google's public OSS releases) it doesn't seem to
really match this description or _enforce_ a hermetic seal. What am I missing?

[1]
[https://github.com/protocolbuffers/protobuf](https://github.com/protocolbuffers/protobuf)

~~~
jeffbee
The best way to get your hermetic builds in order is to _always_ use remote
execution and ensure that your executor nodes don't have compilers and headers
just lying around. Force yourself to get the toolchain under Bazel's control.

I think the reason the public finds this so mysterious is the documentation
for CROSSTOOL is terrible, and virtually all of the people who learned how to
use blaze at Google go out in the industry with no understanding at all of
CROSSTOOL, because there's a small dedicated team who maintain it.
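
For reference, pointing Bazel at a remote executor is mostly a matter of
flags; a minimal .bazelrc might look like the following (the endpoint address
is a placeholder):

```
# .bazelrc -- the executor address is hypothetical.
# Route all build actions to the remote executor (which also serves as a cache):
build --remote_executor=grpc://buildfarm.example.com:8980
# Don't let the host environment leak into actions:
build --incompatible_strict_action_env
```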

~~~
thundergolfer
Does your company use remote execution across the board with Bazel? How much
effort was it?

~~~
jeffbee
I work for no man, as a character in a movie once said. But I recently went
through this exercise just to make sure I could do it, and it's easy enough to
put a bare-bones executor node in the cloud and use it temporarily. You can
even get ARM nodes, to make sure you aren't implicitly assuming the target
architecture.

Doing it on a company scale is probably harder. Last place I worked used bazel
but did not bother with remote.

------
filipn
I made a similar post about building a Go monorepo using Bazel two years ago,
check it out: [https://filipnikolovski.com/posts/managing-go-monorepo-with-bazel](https://filipnikolovski.com/posts/managing-go-monorepo-with-bazel)

We've been using Bazel not just to build and test our Go apps, but to build
Docker images, compile proto files, and even deploy to k8s. It's really
versatile, and the developers only need to know one tool to build and test the
whole environment.
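
As a sketch of what that looks like (target names are invented, and this
assumes rules_go and rules_docker are already set up in the WORKSPACE):

```starlark
# Illustrative BUILD file -- labels are made up.
load("@io_bazel_rules_go//go:def.bzl", "go_binary")
load("@io_bazel_rules_docker//go:image.bzl", "go_image")

go_binary(
    name = "api_server",
    srcs = ["main.go"],
)

# The same sources packaged as a container image, built by the same tool:
go_image(
    name = "api_server_image",
    srcs = ["main.go"],
)
```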

------
steeve
We have been using Go with Bazel for the last 2 1/2 years to build most of our
backends, and on mobile (gomobile) as well. It's not an easy tool, but it
delivers amazingly and is very, very reliable.

Haven't run a bazel clean in... 6 months maybe?

The multi-language support really is a killer feature when the project
inevitably becomes polyglot, such as when doing protobuf/gRPC or using CGo.

------
wgyn
I wish they had spent more time describing how/why Go's built-in build tool
stopped working. Or perhaps that's part of the basics of Bazel. Anyone able to
share more on that?

~~~
jchw
Honestly the main reason Bazel is useful in a monorepo configuration is simply
because it’s designed for it. It supports cross-language dependencies and
generated files, and is designed for large repositories with many nested
targets. Go’s build system is fine, but things like Go generate vs genrule
probably come into play. Bazel also offers target visibility, to protect
targets from being depended on in unintended ways, and build isolation, to
allow builds to be more predictable (and help enforce target visibility.) It
also has a few convenience features for vendoring 3rd party libraries and
handling their licenses.
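
For instance, target visibility and genrule look roughly like this in a BUILD
file (the package paths here are hypothetical):

```starlark
load("@io_bazel_rules_go//go:def.bzl", "go_library")

# Only //services/gateway and its subpackages may depend on this target:
go_library(
    name = "auth",
    srcs = ["auth.go"],
    importpath = "example.com/monorepo/auth",
    visibility = ["//services/gateway:__subpackages__"],
)

# A genrule is Bazel's rough analogue of `go generate`, except the output
# is a tracked build artifact that other targets can depend on:
genrule(
    name = "gen_version",
    outs = ["version.go"],
    cmd = "echo 'package auth' > $@",
)
```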

Of course Bazel is far from perfect, in fact sometimes you may be better off
running your own rules instead of the official ones in some cases, but IMO it
makes a pretty decent build system for putting all of your code in one place.

On the other hand, it’s one of those Google things where if you haven’t seen
it in action in a functional configuration it’s hard to explain why it’s
actually nice. Sadly, I feel unsatisfied with the Node.js rules, for example,
which is probably how a lot of people will first experience Bazel (since I
believe Angular supports Bazel this way).

------
yannoninator
I wish my company was at the scale of Uber to work on all these kinds of tech.
Makes me want to leave my current job just to go to Uber.

~~~
klodolph
Not personal experience, but I’ve had several coworkers who worked at Uber,
and they all gave me the same story. This is the story:

Uber reinvented a ton of technology because of the idea that off-the-shelf
solutions wouldn’t work at “Uber scale”. However, there was very little
accountability for whether the in-house solutions were necessary, and working
on these tools would get you promoted. So for every “Uber scale” problem that
a team actually solved, there were a couple other projects that were just
half-baked alternatives to the off-the-shelf software that they _should_ be
using.

It turns out that “Uber scale” is not really _that_ large, despite the name.
But engineers kept repeating “Uber scale” and building infrastructure.

The same problem occurs at the larger tech companies like Google, Facebook,
Microsoft, Amazon, and Apple, but in different ways and to different degrees.
And to a large extent, engineers are copying what other companies do, and
bringing ideas from one company to another when they hang out after work or
switch jobs. For example, you can bet that these companies mostly have their
own containerization and scheduling systems, many of which are undoubtedly not
competitive with Docker or K8s in 2020, but K8s only goes back to 2014 and all
these companies are older. I’m sure Borg and Tupperware are great if you work
at Google or Facebook but I’m also sure that they’re missing a bunch of
tooling that you’re used to. Same thing with build systems. Bazel, Buck,
Pants, Please, and that Frankensteined system that Chrome uses are all copies
of each other but Bazel is the only one with a decent size community and
ecosystem, as far as I can tell.

Uber is absolutely not on the scale of those companies, and _most of the time_
they should probably be using off-the-shelf solutions when they become
available.

~~~
schoolornot
* Nothing wrong with NIH syndrome if you have the means and resources to
refine existing ideas and rebuild them with scale in mind.

* Uber is a 50 billion dollar company and they have to de-risk themselves by
owning entire stacks, top to bottom. If it means re-creating something from
scratch... who cares, they have billions.

~~~
klodolph
That’s just apologia for NIH syndrome. It’s fine to reinvent software if that
serves your goals. It’s short-sighted to reinvent software without due
consideration for whether off-the-shelf solutions work. That’s not de-risking,
that’s _adding risk_ and _slowing development._ You see it happen at companies
like that and usually it’s a sign that there’s something wrong with the
culture, incentives, and the way that promotions work.

My take on it is that the engineers want to make complicated solutions to hard
problems to justify their salaries and get promoted, and that managers
encourage that behavior so they can defend their headcount. I’m not accusing
any of these people of acting in bad faith here—nobody’s reinventing tech to
sabotage the company, it’s just that the system encourages this kind of
behavior.

This problem is not unique to Uber, you’ll see similar things happen across
the industry to different degrees.

~~~
jeffbee
Well, the "due consideration" needs to duly consider the actual costs and
benefits of the off-the-shelf software, not make the kind of unsupported
blanket claims about its obvious superiority, like you just did. Every off-
the-shelf program has maintenance costs, flaws, and weak fitness for purpose.
You just made a sweeping and italicized claim that all off-the-shelf software
is both less risky and faster to develop than house-made software. That's a
dangerous philosophy and probably isn't true for every organization and
application.

~~~
klodolph
> …not make the kind of unsupported blanket claims about its obvious
> superiority, like you just did…

Not what I said. Let’s move on.

> That's a dangerous philosophy…

Go pick a fight with someone else.

------
blauditore
> Out-of-the-box software solutions rarely work for a codebase as large and
> complex as Uber’s Go monorepo.

That's a weird way to put it. Bazel is the open source variant of what Google
uses for its monorepo, which is several orders of magnitude larger and
arguably more complex.

~~~
antoinealb
The version of Bazel we (Google) use internally has a lot of features that are
not in out-of-the-box open source Bazel, mostly around distributed compilation
and object caching.

~~~
laurentlb
The public version also offers remote execution and remote caching.

[https://docs.bazel.build/versions/3.1.0/remote-execution.html](https://docs.bazel.build/versions/3.1.0/remote-execution.html)
[https://docs.bazel.build/versions/3.1.0/remote-caching.html](https://docs.bazel.build/versions/3.1.0/remote-caching.html)

What's not open-sourced is mostly the interaction with Google internal
infrastructure.

------
chetanbhasin
This is excellent! The only reason we haven't moved some of our Go codebase to
a Bazel monorepo is the IDE integration.

I have been tinkering with a few ideas to make the existing tooling work with
Bazel, but the effort is larger than I had originally expected.

------
vladaionescu
If you want to migrate off Makefiles and also want reproducible builds, try
out Earthly. Normal companies can't do Bazel because it's too alien and
requires deep investment.

[https://github.com/earthly/earthly](https://github.com/earthly/earthly)

Disclaimer: I am Earthly's creator.

------
quicklime
Does anyone have experience using Bazel (or a similar build system) together
with create-react-app? Specifically, is there a way to do it without ejecting?

I work on a project with a Go backend and a React frontend, and having to
update all those React dependencies myself is what keeps me from moving to a
system like Bazel.

------
giacaglia
Super interesting! There is also a video explaining their system that might be
worth watching: [https://vimeo.com/358691692](https://vimeo.com/358691692)

------
quux
Wait, didn't Uber decide that Bazel didn't scale enough (speaking as a
xoogler, lol) and made their own build tool? Or am I misremembering?

------
DavyJone
That seems like a ridiculously complex pipeline to maintain and onboard into.
I personally dislike these "do it all" monorepos in most cases; there are
cases where they work, but they also break a lot of other things.

I have not seen metrics or blogs that prove this "uptick in build efficiency"
or an increase in productivity.

While I do like the idea behind bazel, I hate repeating deps in things like
"go_repository" with gazelle.

~~~
parsnips
go mod vendor && bazel run gazelle

No go_repository duplication.

------
agounaris
The article says that monorepos are more efficient but also that "As the
monorepo grew, the build target list increased to a point where it became too
long to pass it through Bazel’s command line interface.".

Monorepos are not efficient. They are easier to manage when a team is small
but as the team grows and you have more and more deliverables with separate
versioning you are introducing control structures in your automation.
Complexity explodes!

Anyway, none of this matters if you don't make any profit :)

~~~
luckydata
Google is a monorepo as far as I know, and they are doing fine.

~~~
jeffbee
Indeed, in their paper from 5 years ago Google claimed 300,000 commits per day
across 9 million source files, compared to this article claiming 10,000
commits per month on 70,000 source files at Uber. Whatever the differences are
between Blaze and Bazel, it must be the case that the former can easily scale
to a repo of this size.

I like to look at Google's GitHub commit messages to get an idea of the pace
of their revision history. Yesterday they committed something with a Piper
revision of 311324901. A month ago it was 306514102, and a year ago it was
248381230. That's about 160k revision numbers per day.
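
The arithmetic checks out; a quick back-of-the-envelope using the revision
numbers quoted above:

```python
# Piper revision numbers quoted above (approximate dates).
yesterday = 311_324_901
month_ago = 306_514_102
year_ago = 248_381_230

per_day_last_month = (yesterday - month_ago) / 30   # ~160k revisions/day
per_day_last_year = (yesterday - year_ago) / 365    # ~172k revisions/day

print(round(per_day_last_month), round(per_day_last_year))
```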

