
Bazel 2.0 - the_alchemist
https://blog.bazel.build/2019/12/19/bazel-2.0.html
======
malkia
One issue we hit with our CI and mix of build systems is this: given a
changelist, find out which targets need to be built, which ones need to be
tested on pre-submit, and which on post-submit.

Because we can't answer that, we end up paying so much extra time building
everything over and over without need, and then not building things that we
ought to.
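
(This is exactly the question a static build graph can answer. With Bazel it
becomes a query; a rough sketch, with a made-up file label:)

    # Every target that transitively depends on a changed file:
    bazel query 'rdeps(//..., //src/lib:foo.cc)'
    # Just the tests among them, e.g. for pre-submit:
    bazel query 'kind(".*_test", rdeps(//..., //src/lib:foo.cc))'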

So that's one reason to switch, but at the same time lots of people simply do
not get it. To them it seems intrusive, new, and opinionated, and it makes
them unhappy to use it. I've used it for 2+ years at Google, and yes,
initially it was "WTF is this?" Then it hit me... And I'm sure the same is
true for Buck, Pants, please.build, GN and other similar systems.

At the end of the day, you need a way to express your build graph "end to
end", from any single source file, shell script, or configuration file down
to building your executables, deploying them, etc.

It's an industry tool that needs to be looked after, and if it takes 5 people
to support it, then it takes 5 people to support it. But you won't be wasting
other people's time on issues like "Why didn't this build trigger in CI?" or
"Why does presubmit take so long and waste my time waiting?", etc.

Yes, it does not come for free, but it's worth knowing and trying it out at
least.

If nothing else, here is the takeaway: try to use a system with a static
graph, where relationships are known before you start building things. That
information isn't always there up front, e.g. your #include "header.h" is a
dynamic dependency, but Bazel forces you to express even that, then checks
whether you've done it, and breaks the build until it's fixed.
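
(Concretely, the dynamic #include becomes a static declaration in a BUILD
file; a minimal sketch with made-up names:)

    cc_library(
        name = "mylib",
        srcs = ["mylib.cc"],
        hdrs = ["header.h"],        # the #include "header.h" declared here
        deps = ["//base:strings"],  # every cross-package include, too
    )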

~~~
klodolph
> Then it hit me... And I'm sure the same is for buck, pants, please.build, gn
> and other similar systems.

There’s an exercise you can do where you design a build system on the premise
that it shouldn't do unnecessary work (unnecessary work being very slow and
frustrating in practice).

My personal experience is that you can really quickly get to the point where
just reading the entire graph into memory gets expensive. People talk about
how Google is huge… but long before you get to that scale, you can end up with
a build graph that just takes forever to parse and evaluate. (At Google's
scale, it doesn't even fit in memory any more.)

So you decide that, as a hard design requirement, you should be able to load
only the portion of the repository that you are building. And then you want to
make this cacheable, so you can change the repository and know what’s changed
in some quick, reasonable way.

If you go down this path, you end up rediscovering some of the big design
decisions behind Bazel, Buck, Pants, Please, and GN.

------
habitue
Good heuristic for whether it's worth considering moving to bazel for your
build system:

- Do you have 200+ developers working on a monorepo?

- Are you willing to vendor all of your dependencies and maintain their
builds yourself?

If so, consider it. The productivity you're losing to unnecessary rebuilding
and re-running unchanged unit tests will probably be paid back if you can
contort your development process to the one Bazel expects.

If you're a small shop, the benefits Bazel is going to provide over, say, Make
(or whatever standard build system your primary language uses), are going to
be minimal. And the overhead of maintaining Bazel is going to cost you a ton
of developer time you may not be able to afford.

~~~
wereHamster
Another factor: are your languages supported by Bazel? If you use the same
languages that Google uses (C++, Python, Go), it's fair to say that those are
well supported. For all other languages, even ones widely used outside of
Google (JavaScript, Node.js), you may be out of luck.

~~~
zapita
Go support is not great either. Bazel can build Go just fine, but you will
need to throw away the standard Go tooling and use Bazel instead. There are
third-party helpers like Gazelle, but you know you’re in for a bumpy ride when
even basic operations require a helper.
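
(For reference, the typical Gazelle workflow, assuming the standard
//:gazelle target from its docs:)

    # Generate or refresh BUILD.bazel files from the Go sources:
    bazel run //:gazelle
    # Import third-party dependencies from go.mod into the WORKSPACE:
    bazel run //:gazelle -- update-repos -from_file=go.mod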

~~~
klodolph
Go support is awesome, IMO. Personally I have favored Bazel over “go build”
for a while, except for pure Go projects with no generated sources.

Gazelle is wonderful and it doesn’t belong in Bazel core. Bazel is a build
system for every language, and Gazelle is for a subset of Go developers. Since
it’s not part of Bazel core, you can always replace it with something else.

~~~
zapita
But would you recommend using Bazel and Go without Gazelle or an equivalent
third-party tool?

~~~
klodolph
I recommend Gazelle for importing third-party Go dependencies but not for your
own Go code. If you are using Bazel, just write the BUILD.bazel file yourself
with the appropriate go_library / go_binary / go_test rules.
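
For a small package that file is short; a sketch, with a made-up import path:

    load("@io_bazel_rules_go//go:def.bzl", "go_library", "go_test")

    go_library(
        name = "greeter",
        srcs = ["greeter.go"],
        importpath = "example.com/myrepo/greeter",
    )

    go_test(
        name = "greeter_test",
        srcs = ["greeter_test.go"],
        embed = [":greeter"],  # compile the test into the library's package
    )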

------
bobdobbs666
I was subjected to Bazel on a small project because the manager insisted we
use it. The rest of the company used a mix of custom tools, CMake, or
Premake.

It is utter hell when you have tons of third-party libraries (internal or
external to the company) that you don’t have the source to, and it is
especially painful when trying to integrate Bazel's behavior with other build
systems. Also, Bazel's packaging and use of internal symlink renaming was a
constant source of suffering. Bazel pretty much breaks a number of totally
valid Linux commands for locating .so files.

Bazel might be useful in the case of a monorepo with a massive engineering
pool AND a massive cloud infrastructure backing that repo to handle all the
artifact sharing. But after having used CMake, Premake, Waf, random Perl and
Ruby scripts, or just checking VS projects into Perforce manually, I’d pick
any of those before Bazel for most projects. I say that having worked on
codebases from a few tens of thousands to 25+ million LoC with teams small,
large, and distributed.

Bazel probably has its place but I have yet to find it.

~~~
klodolph
My personal experience is that Bazel cut through a bunch of the problems that
I’ve had with CMake, Waf/SCons, etc. Builds were fragile, they were not
reproducible, and there were implicit dependencies. This is mostly as someone
who’s rewritten a few build systems, rather than as someone who’s been
subjected to build systems by others (I mostly inflict these changes on other
people). With Bazel, I have much higher confidence that I’ll get consistent
results when I check out the repository on different computers or work with
other people.

That said, the major sore point with Bazel for me is the general lack of
expertise about how to work with it sanely. Depending on what part you’re
looking at, it’s somehow both “too opinionated” and “too flexible” at the same
time.

I think it will capture a big chunk of the mindshare for build systems over
the next few years, and you’ll see more and more of it. Over that time, people
will develop the expertise and best practices for different development
problems.

For managing third-party dependencies specifically, Bazel gives you a ton of
options, including options that only really make sense for huge orgs like
Google. Google vendors their third-party libraries directly into the monorepo.
If that doesn’t make sense for your org, Bazel lets you work with external Git
repos, with artifact repositories, with package repositories like NPM, or with
tools like pkg-config.
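
For example, pulling in an external archive is one WORKSPACE rule (the name,
URL, hash, and BUILD file here are made up):

    load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

    http_archive(
        name = "zlib",
        urls = ["https://example.com/zlib-1.2.11.tar.gz"],
        sha256 = "<content hash of the archive>",  # pins the exact artifact
        build_file = "//third_party:zlib.BUILD",   # build rules we supply
    )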

The thing that makes this hell, right now, is that few people know how to use
it well and the documentation is rough. I’m personally very happy with it,
even for small codebases, but I’ve used it a lot.

~~~
bobdobbs666
Lack of docs plus lack of a user base is also a giant failing of Bazel. It’s
almost always impossible to figure out how to make something work in Bazel
that I could make happen six different ways in most other build systems. And
there’s little community, so now instead of getting work done I am debugging
Bazel source.

Also, building distributable packages with Bazel never seemed to work well,
due to the constant aliasing of .so files. Things that would work in the
direct Bazel build would fail in packages and vice versa, so we had even more
pain.

Trying to suck up just the header files and multiple .so files was always
arcane bullshit as well.

We did work with Git and other such functionality, but if you had to build a
package from another build system to bring into Bazel, there were always
annoying pain points.

Also, Bazel managed to bring implicit dependencies into our system, so that
clearly isn’t something Bazel magically prevents; it was rather a product of
your expertise.

After reading "Build Systems à la Carte", I am just more convinced Bazel is
not the build system I would ever really need. I’m not sure that build system
exists yet, to be honest :). But in the work I do, other systems solve my
problems better.

------
ddevault
For anyone thinking about Bazel for their project/organization... run as fast
as you can in the opposite direction. It's easily the most complex and
unintuitive build system in the world, and I'm saying that as someone who has
used SCons. At the last job where I used it, I was on a team of 5 whose
responsibilities included Bazel upkeep, which required anywhere from 10 to 50%
of our time. This was used by a broader engineering team of 50, working on 3-5
"big" projects and a few dozen small ones.

~~~
zellyn
If you are an organization with a large enough codebase (especially if it's in
a monorepo) that you need a shared remote cache of build artifacts, or remote
build sharding and execution, and have multiple languages (even protocol
buffers) interacting in complex dependencies, then you should run as fast as
you can away from less rigorous Blaze-alikes (Pants, Buck, etc.) straight
towards Bazel.

Yes, it's complicated, but it's also quite rigorous, and the rigor pays off.

(We at Square had already found a Blaze-alike necessary. We are currently busy
converting our Java build from Pants to Bazel.)

~~~
shrewduser
I'll never understand the fascination with monorepos.

~~~
serverholic
Well, for one, you can commit to multiple projects in a single PR, which
makes coordinating changes across projects much easier.

~~~
dehrmann
It gives you that illusion; it doesn't solve versioning and deployment orders,
and I'd argue that that's the harder part of changes across projects.
Polyrepos make messy things...messy.

~~~
ecnahc515
Deployment ordering at large scale is avoided, usually by not making breaking
changes. Four-phase migrations, always: roll out the new API; update existing
software to use the new API; wait for everything to stop using the old API,
and backfill; remove the old API.

~~~
erik_seaberg
I agree that gradual adoption of new APIs is the way to go, but once you're
doing that you no longer _need_ an atomic commit across all projects.

~~~
dehrmann
You actually never want an atomic commit for that class of changes across
projects because HEAD should always be deployable to all services. It's
obviously messier at FAANG-scale, but with even 25 devs, not properly staging
API-breaking changes leads to a lot of "only deploy commits before xxxx to
service foo."

------
kylecordes
As with many projects using semantic versioning, the major version bump just
signifies that there are some breaking changes. Most projects will just switch
from 1.x to 2.0 without noticing.

~~~
kryptiskt
That "changed some corner cases that likely won't affect you" and "rewrite it
all" looks the same in SemVer makes it next to useless, not that any other
system would be better. We just shouldn't have any expectations about version
numbers conveying much information.

~~~
ivanbakel
How is SemVer next to useless? The major version bump informs you that you
should go look up what breaking changes have occurred before you upgrade. It
is inherently useful for under-approximating the "safe" range of versions of a
piece of software that can be used, which is seen in practice in many package
managers. (npm's "^1.3.0" constraint, for example, accepts any 1.x at or above
1.3.0 but refuses 2.0.0.)

That it can't differentiate between those two cases is because it's not meant
to. It's like complaining that the blurb of a novel is "next to useless"
because it doesn't tell you the complete story in a detailed way over several
hundred pages.

~~~
k__
SemVer isn't useless because of the major bumps, but because of the minor and
bugfix ones.

Theoretically, every version change can introduce a bug, which leads to an
implicit API change and as such should require a major version bump.

Fixing a bug can also introduce an API change, because the API can behave
differently with and without the bug.

SemVer just covers the intent, not what's actually happening, which makes it
kinda useless in most scenarios. I guess Elm gets it right, tho'.

~~~
afarrell
> SemVer just covers the intent, not what's actually happening

If I say "I'm leaving the office to get a sandwich", that statement only
covers my intent. If I then sprain my ankle badly, my statement doesn't say
what's actually happening.

SemVer has this flaw because it is a way for a human to say "this change does
not introduce a change to the API", and that human can be wrong. That seems to
me not _useless_, it just means it is only useful for projects that are
willing to trust the maintainers of their dependencies to avoid being wrong
about introducing bugs.

--------

It seems like you're arguing that a project which uses a dependency should:

1) Have humans check the dependencies anyway.

or

2) Wire up their automated test suite to something which can record calls to
the API of the dependency and the results of those calls. Turn the record of
those calls into a set of API contract test cases. Then, on any version bump
(minor, major, or patch), run those autogenerated test cases on the new
version.

... I think option 2 might be a good idea? It could be a required reviewer for
any dependabot PR.

------
JadeNB
I didn't know, so, just for anyone else who didn't:

> Bazel is an open-source build and test tool similar to Make, Maven, and
> Gradle. It uses a human-readable, high-level build language. Bazel supports
> projects in multiple languages and builds outputs for multiple platforms.
> Bazel supports large codebases across multiple repositories, and large
> numbers of users.

------
kovek
I’ll jump in here to say that Bazel 1 was awesome, and I’m looking forward to
trying out Bazel 2.

I was wondering: how do I make sure Bazel doesn’t rebuild something it has
built previously? (Caching)

~~~
jingwen
There are many layers of caching within Bazel (remote/local, in-memory/disk),
but the central functional incremental engine is called Skyframe [1]. Almost
every computation within Bazel that can be incrementally executed is managed
by this engine.

[1]:
[https://bazel.build/designs/skyframe.html](https://bazel.build/designs/skyframe.html)
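
In practice, turning on the disk and remote caches is a couple of lines in
.bazelrc (the remote endpoint here is hypothetical):

    # .bazelrc
    build --disk_cache=~/.cache/bazel-disk          # local on-disk cache
    build --remote_cache=grpcs://cache.example.com  # shared remote cache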

------
breatheoften
Does Bazel use the word "provenance" at all?

Provenance is a word I first saw advertised in a platform called
dotscience.io, and one that I find fundamentally interesting. It seems quite
relevant to hermetic builds.

Provenance is about giving any state derived from an arbitrary computation an
identity that is derived from the content hash of the inputs needed to
recompute that state... In dotscience they achieve this by instrumenting I/O
and creating ZFS filesystem snapshots when computing new provenance
artifacts.

I think this concept could be the ultimate building block for a build system,
and it could become the job of OSes/containers/runtimes/databases to
coordinate so that this abstraction can be tracked efficiently enough that
programmers would feel free to use the concept of provenance when building...
It seems to me like provenance could provide all the information needed to
support a distributed build cache? You wouldn’t actually need a build language
at all, just an API in each language to ask for the saving of provenance
artifacts. The artifact would hold all the info needed to recompute it with
the same state, which is also all the info needed to decide when the artifact
is out of date...?
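
(Roughly, the identity could be computed like this; a sketch, not any
particular system's exact scheme:)

    import hashlib

    def artifact_key(command_line, input_blobs):
        """Content-hash identity (provenance key) for a derived artifact."""
        h = hashlib.sha256()
        h.update(command_line.encode())
        for blob in input_blobs:
            # Hash each input, so any input change changes the key.
            h.update(hashlib.sha256(blob).digest())
        return h.hexdigest()

    # Same key: a cached artifact can be reused. New key: recompute.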

~~~
dub
Bazel is part of the story of how Google manages provenance for build
artifacts ([https://cloud.google.com/security/binary-authorization-for-borg/](https://cloud.google.com/security/binary-authorization-for-borg/))

~~~
Boulth
This is not entirely correct. It's not Bazel but a "build system very similar
to Bazel" (from your source), which I guess is their internal Blaze tool.

I wonder what the real usage of Bazel (not Blaze) is inside Google.

~~~
karlding
According to this comment [0] by laurentlb (one of the people working on Bazel
who also commented in this post) from a year ago, Blaze is just Bazel but with
integrations to Google-internal tools.

[0]
[https://news.ycombinator.com/item?id=18823546](https://news.ycombinator.com/item?id=18823546)

------
djsumdog
I hadn't heard of this, and I see there is a lot of concern over using this
project except for specific use cases.

I'm always wary of build tools that try to do multiple languages. On Scala
projects I use SBT, and anyone who has tried to hack on SBT itself or its
plugins knows it's a big mess under there. On other projects I've tried using
Gradle with Scala, but I found that a lot of the time Gradle just wasn't set
up for a Scala workflow or was missing essential tooling to make it as
effective as SBT (although its configuration is considerably more sane). Most
of the tooling and plugins around Scala are built around SBT as well (for
better or for worse).

I try to stick with the major tool for a given language: Cargo for Rust, SBT
for Scala, the built-in tooling for Go, with the exception of Java projects,
where I'd gladly take Gradle over the hellscape that is Maven.

~~~
vlovich123
That works for small teams/projects. If you're looking at orgs of hundreds,
thousands, or even tens of thousands of people, you're spending time training
everyone on every individual build system (very time-consuming and
error-prone). Additionally, because of the lack of consistency and the
unfamiliarity of dealing with multiple interconnected projects, these tools
fall apart spectacularly: each team will end up with their own flavor of the
build system. This makes transitioning between projects very hard & silos off
teams.

That can be fine, but it can make things even less efficient for someone
switching projects or contributing partially to another project. Uniformity
reduces costs on many fronts, but like anything else it's a tradeoff. Now you
need a team to maintain your Bazel/Buck/etc. for each language, and it may not
jibe 100% well with languages that already have opinionated package
managers/build systems (Node, Cargo, SBT, etc.). On the other hand, you'd
probably end up having to create teams to maintain your company's Node, Cargo,
and SBT builds anyway, except now you need to hire domain experts who not only
understand each language but also how it should integrate within your larger
infrastructure. A single uniform build system framework makes that easier.

------
zmmmmm
Build systems seem to sit in that perennial category of things that keep
getting re-invented and either recapitulate existing problems or create new
ones.

I don't think people will ever fundamentally all agree on:

    
    
    - static vs dynamic configuration
    - custom language vs piggybacking on an existing one
    - intelligent, deeply integrated / understands the code it is
      building vs "language agnostic" but necessarily shallow
      integration

All of these are fundamental tradeoffs, which means every tool will have
limitations that about 50% of people don't like. And so we will keep
reinventing them forever, I think.

------
klysm
Didn’t this just hit 1.0?

~~~
bru
Yes, 2 months ago.

~~~
dehrmann
Must have some people from the Chrome team working on it.

~~~
jerryr
Not that they can’t also be contributing to Bazel, but I believe that Chrome
uses GN.

------
theodorejb
I miss the days when JavaScript frameworks could be built with a simple npm
install and executing a Grunt/Gulp file. Now to build Angular I need Yarn,
Java, Bazel, and hundreds of megabytes of additional tooling downloaded by the
build script. On a slow connection it takes ages to download everything, and
even then the build often fails (on Windows I have yet to get it working
successfully).

Edit: I'm referring to building the framework itself (e.g. to contribute a
fix). Building an Angular project with the CLI works quite well.

~~~
teeray
I miss the days when JavaScript frameworks required only a script tag.

~~~
lenkite
Those days are BAAACK. Use Vue! (The best JS framework in the world!)

    <script type="module">
      import Vue from 'https://unpkg.com/vue@2.6.0/dist/vue.esm.browser.min.js';
      new Vue({ ... });
    </script>

