Bazel 1.0 (googleblog.com)
411 points by vips7L 4 months ago | 201 comments

One of the major time savers of bazel is Remote Build Execution (RBE), which allows you to build modules in parallel in the cloud. So if you have 1000 CPUs, you can really just have a client do `bazel build -j 1000 //...` and you can get a huge speed-up. Remote (and local) builds all happen in a sandbox, so you don't have to worry about e.g. preparing a docker image with the worker / build slave environment. (You do, however, have to register your dependencies with Bazel, which can be hard at first). To add to this, bazel also has a remote global cache which can benefit large teams.

For fairly large C++ codebases, RBE is really a competitive advantage. I've seen RBE cut down iteration time by an order of magnitude. I love CMake, and CMake can get you plenty of parallelism, but CMake doesn't really provide a tool for building several CMake sub-projects in parallel, and bazel handles this really well.

Sadly Bazel RBE is still primarily a Google Cloud product. Also, GCE is slow to work on supporting auto-scale, so you have to pay for unused workers. (Like most products in Google Cloud, Google is ripping you off with alpha-quality stuff). There's some very rough open source RBE stuff on GitHub that you can run yourself, but nothing really production-grade yet.

gg ( https://github.com/StanfordSNR/gg ) is a promising-looking alternative. It's research code, but it might be the community's best hope for a non-Google alternative (that e.g. supports AWS Lambda for parallelism). Bazel is great, but without independence (e.g. what Kubernetes achieved) it's difficult to see bazel as dependable as make or CMake long term.

Check out recc: https://gitlab.com/bloomberg/recc

It's a remote execution client that you can use with your current cmake project, along with buildgrid: https://gitlab.com/BuildGrid/buildgrid which is basically a FOSS RBE.

Along with workers: https://gitlab.com/BuildGrid/buildbox/buildbox-worker

With these FOSS alternatives, you can tailor/scale your builds yourself without relying on RBE.

All of these utilize the Remote Execution protocol, so you can plug and play with other clients/workers/servers that use the protocol.

Nix also supports using any SSH server with Nix installed as a remote builder[0]. Performance gains aside, it's also very useful for building for other platforms where you don't have a cross-toolchain set up. For example, I've had some colleagues use nix-docker to build packages for Linux from their Macs, even if the packages weren't cross-aware[1].

[0]: https://nixos.wiki/wiki/Distributed_build

[1]: https://github.com/LnL7/nix-docker#running-as-a-remote-build...

> GCE is slow to work on supporting auto-scale, so you have to pay for unused workers.

Can you expand on what you mean here? GCE has had autoscaling for years and it's quite configurable, I don't see why you would need to keep unused workers around unless your build load is extremely spiky.

Not auto-scaling VM compute instances, auto-scaling the Google Cloud RBE service (I think it's called Cloud Build right now?). If you have a worker pool of size 1,000, you have to pay for that even if your dev team only uses it at full capacity 8-10hrs per day. Or maybe the Google Cloud sales people working with my company are misinformed? IDK, but my experience with TPUs, GPUs, AI Engine, Dataproc, BigQuery, etc... every GCloud feature I've used (besides vanilla Compute VMs) has gotchas and/or time sinks that are costly unless you have a load of free credits. For AI Engine, it took like 6 months to upgrade their internal Kubernetes cluster; in the meantime, crashed jobs would burn an extra 15 minutes of GPU dollars.

There's a lot of good stuff in Google Cloud, but be sure to get Google to pay you to wade through the construction with free credits.

Cloud Build and Cloud RBE are two different products. The former is Google Cloud's CI service (https://cloud.google.com/cloud-build/), while the latter is a hosted backend for Bazel's remote execution feature (some info is here: https://groups.google.com/forum/#!forum/rbe-alpha-customers). They are not related.

Disclaimer: I work on the Bazel team at Google.

> Cloud Build

If you're talking about [1], it is billed in minutes. You can have 10 concurrent workers. Custom Workers is in alpha.

[1] https://cloud.google.com/cloud-build/

In Google Cloud Build you pay per minute, not per instance.

It isn't a native CMake solution, but the ROS (Robot Operating System) community has been dealing with the "build many CMake/python/other projects in parallel" problem for a long time, and has produced several tools which address this.

The most recent effort is called colcon, which you can find here: https://colcon.readthedocs.io/en/released/

In the past (for example with catkin_tools, colcon's predecessor), the biggest barrier to use with conventional CMake packages has always been that they don't export dependency information in an easily machine-consumable way, so any regular CMake package brought into the workspace needed to have a package.xml file patched into it.

Colcon still has the ability to bring in extra metadata for things like ensuring environment variables and so on are set up correctly in the final workspace, but it's gotten a lot better at parsing/interpreting the native expressions of dependency and figuring out what to do. See for example how you build the multiple projects which make up the Gazebo simulator with colcon:


Of course, colcon also supports adding python/setuptools repos, and includes experimental support for bazel, gradle, cargo, and other buildsystems which can do an "install" step into a reasonably sane FHS tree. So in that way it's kind of like a cross-language supercharged virtualenv— it gives you the benefits of building multiple pieces of a system from source, semi-isolated, without the hassle of dealing with containers or VMs, where you also have to have all your tools in the isolated environment.

Used to be able to do something similar with Gentoo back in, like, 2002, I think. Distribute build jobs to multiple remote machines.

For C++ we've had better luck, on our local cluster, with distcc than with bazel's remote execution, and we've found that the cache slowed things down. Has your experience all been in the google cloud?

I only wish Bazel had been written in a language that produced native binaries like C++ or Go instead of Java. It seems like a waste for me as a non-Java developer to install Java on my dev machine just for Bazel.

We didn't have Bazel on ARM for NixOS because we can't bootstrap openJDK on ARM. There's no source code for a JVM that compiles on ARM afaik. Though in theory we could package binary blobs from Oracle and bootstrap the JDK and Bazel from there, it would mean that the trust path for a critical build tool has a random Oracle blob in it that is extremely hard to get rid of.

Build systems should be easy to build. Bazel is not, by a long shot, because Java is a mess in this area (ironic given it being advertised as a cross platform language).

These issues are perhaps not important for everyone (heck, the entire Kubernetes ecosystem pulls random binaries from Docker Hub without any cryptographic signatures), but it's still a shame, especially for a build system.

Google builds stuff for themselves and then dumps it for extra reputation points on the dev community.

The part where this starts to suck is when their tools become pseudo-standards and everyone has to live with their decisions.

I'm biased because I work at Google on an open source project, but the complaint here seems to be:

1. A corporation spends significant resources developing a project.

2. Then they release to allow the public to use it at zero cost. Often they even invest in supporting that external use.

3. The project provides enough value to so many users (who chose it of their own volition) that it becomes nearly a standard.

And this is somehow a bad thing?

The complaint is that the tech mega-corps are too big and should be broken up. Then they wouldn't distort markets (among other things) by dumping free products subsidized through ads on them, only to forget about said products two years later.

Regarding your specific interpretation:

1. The product was developed for internal use AFAIK, then released externally.

2. Google is getting something out of this: a PR boost, better engineer retention and the engineers feel good about themselves.

3. All these projects not only don't become standards, but they just add to the pile of things one has to learn because there are still too many people that are fawning over anything published by FAANG.

I think it's unfair to sketch it as such. Isn't it clear that the parent is unhappy with the side-effects, rather than the project itself?

This is actually called the "throw over the fence" open source development model.

Most Google projects are like this, including Android, which is the main reason why vendors struggle with OS updates.

AFAICT Android is a pain to upgrade because of custom closed-source drivers and lack of ABI stability in Linux, so you need to recompile a driver for a newer kernel.

Android is not an internal Google product, unlike Bazel.

Last week we had a major security vulnerability in Android just because Google can't use the same Linux kernel that everyone else is using.

There exist LTS kernel versions for a reason, you know.

This. Treble is supposed to be their ABI.

Right. Like the v8 javascript engine. If you want to build it, you need to download gigabytes of sources and install Google's internal build system tools first...

Yes, and often lots of prebuilt build tools that no one really knows how to build from source. It's a real mess for open source.

They even maintain internal tools that are similar versions, but better. Google uses Borg internally, but they open sourced the cheap knock-off known as Kubernetes instead.

You could try cross-compiling OpenJDK from x86_64 to ARM. This is a lot easier in newer OpenJDK releases than it used to be.

Currently doing this for BSD using OpenJDK 11.

If needed, run with qemu userspace emulation.

> we can't bootstrap openJDK on ARM

Irrespective of Bazel, that's a lot of software that can't run on NixOS ARM, including build-related stuff:

* Elasticsearch, Cassandra

* Eclipse, IntelliJ, Netbeans

* Jenkins

* Java toolchain, Android toolchain

> (ironic given it being advertised as a cross platform language

Indeed that is one of Java's hallmark features. Servers, desktops, phones, toasters.


FWIW it's not unheard of to require compiling on one platform for another. This is how most of the embedded world works.

Isn't OpenJDK available for ARM? What do you need from Oracle? https://adoptopenjdk.net/releases.html#aarch64_linux

This is aarch64 (that is, 64-bit ARM), and NixOS seems to lean more toward 32-bit ARM, looking at their wiki https://nixos.wiki/wiki/NixOS_on_ARM On the other hand, they do 64-bit too, so maybe there would be some way to do it for one, then bootstrap the other?

Another idea: distributions like Debian seem to build OpenJDK as well, so maybe that would be a way to get the initial package (through an existing distro) https://buildd.debian.org/status/logs.php?pkg=openjdk-11 - see the armel/armhf/arm64 architectures (arm64 equals aarch64).

We only support aarch64 officially. 32 bit ARM is only community-supported.

Aren't there 32-bit ARM binaries available here? https://bell-sw.com or here? https://adoptopenjdk.net/releases.html?variant=openjdk11&jvm...

This was true for a long time, but we have actually had a bootstrapped openjdk for arm for a few months now: https://github.com/NixOS/nixpkgs/pull/65247

Isn't there a way to get the Android toolchain to give you what you need? After all Dalvik is just Java warmed over and has a working VM, and there are plenty of ARM based Android phones.

That might get you bootstrapped to the point where you can use that to re-compile the community edition of Java.

You are conflating the runtime environment with the development framework. Afaik, you can't develop native Android apps on Android.

You want to build Java from source on ARM? Can't you borrow what AdoptOpenJDK does?

Can't you bootstrap older openJDK with (now discontinued) GCJ?

Parts of Bazel are written in C/C++.

The command line client as well as some low level file utilities.

But Bazel is distributed as a standalone package in several packaging systems; no need to install Java yourself.

Though I doubt many people know this; there's virtually no reason you would be exposed to its language implementation choices.

EDIT: Actually I believe Bazel is changing its .deb package to not be standalone, in order to comply with Debian requirements.

> there's virtually no reason you would be exposed to its language implementation choices

Um, doesn't it still have a client that starts up a persistent server and you then have to wait for that server to start up (which, being Java, takes forever) and then deal with those server processes hanging around?

That sounds like being exposed to its language implementation choices.

As opposed to, say, having fast start-up times and not using a server design like basically every other build tool.

But my information might be out of date. That's what always soured me on it previously, though.

Bazel doesn't start a server for Java JIT reasons.

It starts a server for (1) concurrency control (2) management of worker processes some languages use (3) caching the build graph (recall that Bazel works with very large code bases).

These reasons are independent of implementation in C++, Go, Rust, Java AOT, etc.

(And yes it doesn't have to use a persistent process to solve these problems. That is the solution it chooses.)

> not using a server design like basically every other build too

Buck, Pants, sbt, Gradle

> (And yes it doesn't have to use a persistent process to solve these problems. That is the solution it chooses.)

How much did the fact that Java is very slow starting up influence the decision to use a persistent process instead of some other solution?

Java starts up in ~1s, something like Graal could make this faster. But I find this Java criticism more a symptom of Java-derangement syndrome, because python and node build systems have even worse startup times and no one says anything.

> python and node build systems have even worse startup times and no one says anything.

Noop build with Waf (build system written in Python) takes 0.13s on my system. Waf reports that the build actually takes 0.04s, so I guess 0.09s is Python's start-up time (and some other overhead).

Java Hello World takes 0.132s on my MacBook Pro. If I turn on -Xshare:on to use class data sharing, then it drops to 0.119s. Ergo, Java startup time is a non-factor. Graal could make this even quicker, for example, a GraalVM AOT helloworld can be reduced to .008s startup, see https://github.com/graalvm/graalvm-demos/tree/master/java-ko... for example.

"time bazel" returns 0.098s

Running a null build took 0.84s, but Bazel does significantly more work, as it's working on a big monorepo, as both a build tool and a package manager.

In short, it is not a problem. But I've had npm take many many seconds for simple operations. "npm list" takes 2.9 seconds.

Python can be slow, depending on the tool. Waf sounds like it is simple and fast, but there are lots of other examples of slow python frameworks out there.

> "time bazel" returns 0.098s

For me it returns 1.6s on first run, .9s on second (bazel 0.29.1).

    $ time (echo 'quit()' | python)

    real    0m0.025s
    user    0m0.000s
    sys     0m0.031s

    $ time (echo 'quit()' | python3)

    real    0m0.036s
    user    0m0.031s
    sys     0m0.016s

(both are after it was already cached)

Huh; I see < 0.15s.

> so I guess 0.09s is Python's start-up time (and some other overhead)

That is probably mostly Waf startup time. Python itself starts way faster than that (on the order of 10-20ms on my machine).

Startup time in any interpreted language is very quickly simply proportional to the amount of code loaded.

    time python -c ''                               # 0.024s
    time python -c 'import argparse'                # 0.036s
    time python -c 'import argparse, json'          # 0.040s
    time python -c 'import argparse, json, httplib' # 0.065s
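
If anyone wants to repeat this kind of measurement on their own machine, here is a rough sketch in Python 3. It is only an illustration: the `startup_seconds` helper and the snippet list are my own names, it spawns a fresh interpreter per run so that startup is included, and it leaves out `httplib` because that module only exists in Python 2. Numbers will vary by machine and by whether files are already in the page cache.

```python
import subprocess
import sys
import time


def startup_seconds(code, runs=5):
    """Best wall-clock time to start a fresh interpreter and run `code`."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # A new process every time, so interpreter startup is part of the cost.
        subprocess.run([sys.executable, "-c", code], check=True)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical snippet list: each entry loads a little more code than the last.
SNIPPETS = ["", "import argparse", "import argparse, json"]

if __name__ == "__main__":
    for snippet in SNIPPETS:
        print(f"{snippet!r:32} {startup_seconds(snippet):.3f}s")
```

Taking the best of several runs is a deliberate choice here: it filters out one-off noise from a cold cache, which is closer to what the numbers above seem to measure.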

Indeed, but it isn't a cool toy among this crowd.

My guess is that startup time is not the issue, but loading a large data structure (its cache) from disk can be.

Especially if you can't or don't want to use memory mapping because it's hard to do well.

But more importantly, I think that the server can monitor file changes ahead of time with things like inotify, saving the time of stat()-ing files when the user wants to perform a build.
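
To make the stat() cost concrete, here is a minimal sketch of what a build tool without a file-watching server has to do on every invocation: walk the tree, stat() every file, and compare against the previous snapshot. This is an assumption about the general technique, not how Bazel does it; `snapshot`, `changed_paths` and `demo` are made-up names.

```python
import os
import tempfile


def snapshot(root):
    """Map each file under `root` to (mtime_ns, size): one stat() call per file."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def changed_paths(before, after):
    """Paths that were added, removed, or whose (mtime, size) differs."""
    return {p for p in before.keys() | after.keys() if before.get(p) != after.get(p)}


def demo():
    with tempfile.TemporaryDirectory() as root:
        for name in ("a.txt", "b.txt"):
            with open(os.path.join(root, name), "w") as f:
                f.write("hello")
        before = snapshot(root)
        # Appending changes the size, so the change is seen even if mtime is coarse.
        with open(os.path.join(root, "b.txt"), "a") as f:
            f.write(" world")
        after = snapshot(root)
        return {os.path.basename(p) for p in changed_paths(before, after)}
```

The work in `snapshot` grows with the number of files, whether or not anything changed. A long-running process that is told about changes as they happen can skip that walk, which is the saving described above.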

Change monitoring with inotify is going to hit limits quickly.

I recently wanted to take a look at VSCode and it immediately shat itself over the maximum number of inotify watches being too low (the kernel I'm running restricts this to 8k for non-root users).

I then bumped the limit to the max (512k, I think, or about 7 copies of linux.git)…still no luck. Now imagine you have a Google-scale number of files.

> Especially if you can't or don't want to use memory mapping because it's hard to do well.

Why do new languages never address this issue?

In-kernel block caching of files is pretty crude - since the kernel has no knowledge of your data structures or which bits of your files you're going to be reading next, you end up having hundreds of page faults requiring single-sector disk reads for most workloads.

That’s often the case if your file format evolves without taking a mmap use case into account. But for a format that’s designed with mmapping in mind, it is often ridiculously faster and more effective.

But most languages and environments don’t let you do that easily.
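
For what it's worth, here is a small sketch of what "designed with mmapping in mind" can look like, using Python's standard `mmap` and `struct` modules. The record layout (two little-endian 32-bit integers, sorted by key) and the function names are hypothetical, picked only for illustration. Because every record has the same size, a lookup can binary-search the mapped file directly and touch only a handful of pages instead of parsing the whole file.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical on-disk record: little-endian u32 key followed by u32 value.
RECORD = struct.Struct("<II")


def write_table(path, pairs):
    """Write (key, value) pairs as fixed-size records, sorted by key."""
    with open(path, "wb") as f:
        for key, value in sorted(pairs):
            f.write(RECORD.pack(key, value))


def lookup(path, key):
    """Binary-search the mapped file; returns the value or None."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        lo, hi = 0, len(m) // RECORD.size
        while lo < hi:
            mid = (lo + hi) // 2
            k, v = RECORD.unpack_from(m, mid * RECORD.size)
            if k == key:
                return v
            if k < key:
                lo = mid + 1
            else:
                hi = mid
    return None


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "table.bin")
        write_table(path, [(30, 3), (10, 1), (20, 2)])
        return lookup(path, 20), lookup(path, 25)
```

Note that mapping an empty file raises an error in Python, so a real format would want a header or an explicit empty-file check.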

Aren't the hot compilers kept around under this mechanism to speed up the next incremental build round? I think it's not just caching of intermediate build output but also not letting the tools start from a cold start.

Yes, and I included that in my point. Certain compilers: Java, Swift, TypeScript are faster with long-running worker processes.

To be fair, that doesn't require a long-running Bazel process to manage them, but that does become a natural choice.


I do know that Go wasn't an option :)

(Because it didn't exist)

JVM startup is sub-second. Still not great for e.g. small command line utilities, but completely fine otherwise.

JVM startup for `java -version` is around 150ms on my machine:

    $ time java -version
    openjdk version "11.0.1" 2018-10-16
    OpenJDK Runtime Environment 18.9 (build 11.0.1+13)
    OpenJDK 64-Bit Server VM 18.9 (build 11.0.1+13, mixed mode)

    real 0m0.159s
    user 0m0.116s
    sys 0m0.052s

This is how long it takes for a program like git to do actual work. Once you start running an actual program like Maven, it blows up to 500ms for `mvn -version`:

    $ time  mvn -version > /dev/null

    real 0m0.570s
    user 0m0.431s
    sys 0m0.145s

Bazel takes 150ms to do nothing:

    $ time bazel --version > /dev/null

    real 0m0.144s
    user 0m0.028s
    sys 0m0.063s

As a baseline, make takes under 30ms to do nothing:

    $ time make -version > /dev/null

    real 0m0.025s
    user 0m0.002s
    sys 0m0.016s

Seems similar on my system. Some more comparisons:

Waf written in Python:

    $ time ./waf --version >/dev/null

    real 0m0,068s
    user 0m0,056s
    sys 0m0,012s

Ninja written in C++:

    $ time ninja --version >/dev/null

    real 0m0,001s
    user 0m0,001s
    sys 0m0,000s

Java 13 takes 97ms on Windows 10, and I haven't bothered to produce a Java native image before attempting it.

    PS C:\Workdir> Measure-Command { java -version }
    openjdk version "13" 2019-09-17
    OpenJDK Runtime Environment (build 13+33)
    OpenJDK 64-Bit Server VM (build 13+33, mixed mode, sharing)

    Days              : 0
    Hours             : 0
    Minutes           : 0
    Seconds           : 0
    Milliseconds      : 97
    Ticks             : 975572
    TotalDays         : 1.12913425925926E-06
    TotalHours        : 2.70992222222222E-05
    TotalMinutes      : 0.00162595333333333
    TotalSeconds      : 0.0975572
    TotalMilliseconds : 97.5572

About 60 ms more than make.

Hurray Make still wins, now what are we going to do with those 60ms... /s

FWIW, I don't think "java -version" is a good benchmark of java startup time - why should it start up the JVM just to print the version?

I think that measuring anything less than 1s for a CLI is a silly school-playground measuring competition.

The original comment was "JVM startup is sub-second. Still not great for e.g. small command line utilities, but completely fine otherwise."

For tooling it matters. Some scripts will have backticks and $() and each use of these could be up to a second and be acceptable? Not really. And if you want to run a command like `go fmt` every time you save a file in $EDITOR then you want it to be fast. Maybe when we use the Language Server Protocol everywhere, it will be a different world. But a lot of editors shell out for these features still - and you would never dream of shelling out to e.g. `mvn dependency:tree` because it's too slow.

Unless I am missing something, 97ms looks pretty much sub-second to me.

With Graal AOT you can make command line utilities that start on the order of 0.008s


> make takes under 30ms to do nothing:

But in large code bases, make can take several seconds to do nothing, depending on the number of `$(shell find)`.

Whereas Bazel's time remains constant. Could it have been constantly 30ms instead of 150ms?

Sure. But that's nowhere near to being a problem for the usual code bases Bazel operates on.

    $ time ./java -version
    openjdk version "14-ea" 2020-03-17
    OpenJDK Runtime Environment (build 14-ea+19-824)
    OpenJDK 64-Bit Server VM (build 14-ea+19-824, mixed mode, sharing)

    real 0m0.074s
    user 0m0.063s
    sys 0m0.025s

Does the initial startup time matter?

I think the server is kept around for a number of reasons that have nothing to do with Java startup time. Probably the biggest one is file-watching, using inotify, FSEvents, etc.


The server shuts itself down after some idle time, I’ve never had to actively manage any instances of it.

I always have the opposite reaction. When I see it's a native tool I immediately think, "oh great, now to build this I need to understand your weird glib / gcc version / compile / link errors for some ecosystem I don't understand". Instead, when I see Java / JVM, I know that I almost certainly will have no problems like that.

Eh, static binaries solve this & Gradle presents plenty of potential problems in my experience.

There are alternatives that satisfy this (valid, IMO) requirement. For example, there is build2 (https://build2.org ; full disclosure: I am involved with the project) that tries to achieve the same objectives (correct & fast builds, uniform interface across platforms, being language agnostic, etc) though not necessarily using the same mechanisms (build2, for example, tries to be fairly light-weight). Here is the introduction if anyone is interested: https://build2.org/build2/doc/build2-build-system-manual.xht...

I checked as I was interested, but it seems that build2 doesn't do distributed builds.

edit: remote->distributed

Not yet, but it's coming. And most of the really difficult infrastructure is already there, at least for the C/C++ compilation.

I think you don't need to install Java to use Bazel, it comes bundled. They have made some improvements to the installer so that the user requires as few additional tools as possible to run Bazel.

>I only wish Bazel had been written in a language that produced native binaries like C++ or Go instead of Java.

Java can be compiled into native binaries using GraalVM. Not to mention you don't actually have to install Java as it comes bundled.

native-image performance without PGO might be worse than with JIT, at least in some experiments that I did.

See also https://github.com/oracle/graal/issues/979

I have the same complaint about meson and python 3. You purposely use a low-level language to write dependency-free code, then your users need to install a bloated runtime just to build the damn thing.

You should look at Please: https://please.build

Bazel-like in many ways; written in Go.

Bazel has been embedding Java for a year or so now. There's no need to concern the host environment with installing the JRE anymore. As a user, the fact that Bazel is written in Java is transparent.

You don't need to install Java. It's bundled. In fact I'd recommend that you don't even install Bazel itself. Instead, install this: https://github.com/bazelbuild/bazelisk. This will install either the latest Bazel for you automatically, or you can pin a version of Bazel you want in each repository, to limit and control build system churn.

It's funny how Java people always marketed Java as "write once, run everywhere" and they often bundle compatible platform-dependent JRE with platform-specific plugins. ^_^

Is that such a big issue though?

The default install of my dev machine (Ubuntu) installed two different versions of python and perl5, neither of which I use. Other languages have been installed indirectly because I installed one tool or another. As long as the language itself doesn't get in the way (e.g. the tool is picky about the language version and requires extra work to install it) why should I care?

I bet you're using Python and Perl indirectly, because lots of Ubuntu's tools are written in those languages.

You can use SDKMAN (https://sdkman.io/usage). It is like nvm.

There are native code compilers for Java, one just has to bother to learn about them.

Plus, even if one does not use them, shipping a minimal runtime is hardly any different from having libc or libc++ dynamically linked.

I am going to be trying out Bazel next week and I intend to run it in a container. I assume this is possible?

I guess that's why they use "gn" in newer projects like Fuchsia or Chrome?

Why do you care if a JVM is on your dev machine?

- One more thing to install

- Does this work for 8/11/12/13/14/15/16?

A stripped-down JVM is bundled with bazel; you don't need to install it.

My 2015 MBP is crawling between Safari, Docker, VS Code, and BitDefender (antivirus). Not sure how bulky the JVM is these days. That said, I wish it supported Python.

Why do you even have Bitdefender installed?

A lot of people have it for compliance reasons.

A lot of regulations are actually a lot more reasonable than one might think and refrain from mandating any specific technical measures.

Yet, people always seem to point to regulations to justify all kinds of snake oil salesmen garbage, even if obviously unfit to fulfill the stated objective.

If you are building software for others to consume it is a good idea to have at least some protection. Even if it never picks up macOS malware or viruses, it might stop you from distributing something intended to infect another target OS or software.

For me, the difference between other build systems and bazel was like the shift from CVS to git... Profoundly better internal data model, but definitely a learning curve.

That makes sense but there aren't many people out there using CVS right now. What were you using before Bazel? I.e. what are you comparing Bazel to?

Bazel, the tool, is fantastic. Haven't had a better build tool that I've seen yet. The build rules are awful. Almost every single rule is broken in some catastrophic way and, as is typical with Google open source, it is very difficult to ask for things to either be fixed or designed differently.

Major examples: rules_python flat out does not work for python 3, rules_docker does not work without python 2, rules_proto cannot generate stubs for languages, you cannot specify third party deps at a package level and there is no explanation as to why (they will say it would destroy hermetic builds or reproducible builds but that doesn't make sense when you think about ways to support both).

This has improved lately. Rules_docker's image pusher has been replaced with a copy written in Go. This pusher is built using rules_go, so that it does not rely on any host tooling.

I'll have to update my tag and try again. I wish it could just use my docker daemon which likely already had the image and layers.

Problem: There are N confusing build systems with arcane rules and you just want to compile your program.

Solution: Design a simple, straightforward build tool.

Problem: There are N+1 confusing build systems with arcane rules.

I’m getting a little tired of this particular brand of lazy cynicism.

I’m happy to see people inventing new build systems, programming languages, game engines, ORMs, or anything else where the conventional wisdom is to use the existing tools. It may be crazy optimistic but we need people to keep these skill sets alive.

I agree at a personal level. If anyone is pushing a toy project, there is basically no reason to knock it. That's the reason the Serenity OS story a few days ago was awesome.

However, solutions pushed by corporations with grandiose claims do strike a nerve with me. If only because they are typically pushed with the idea that they will spread through some technical superiority. I'm much more open to this getting more use strictly from who is pushing it, not what it is capable of.

It is successful enough that it has spawned a few clones—Buck, Pants, Please.build, and the thing that Chromium uses now. From what I can tell, before Bazel was open-source, everyone who left Google wanted to use Blaze badly enough to write their own version.

Not exactly ringing endorsements here.

In large, most of the clones you are talking about are because this is a google pushed product. Chromium specifically is because google pushed it in.

Regarding everyone at google wanting this, before entering industry, most folks never used a proper build system at all. It is not uncommon for college or earlier users to just use whatever their IDE does for them. Such that, yes, any automation is welcome after that.

None of this is to say it isn't worthwhile. Heck, they have enough manpower on it that it should be a good solution. It still hits a sour note with me, though.

But the problem is that there were no proper build systems to use, not that they weren’t familiar with them.

Gmake, cmake, scons, waf, whatever - none of them provide hermetic deterministic builds, track compiler versions properly, etc.
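
To illustrate what "tracking compiler versions properly" buys you, here is a minimal sketch of a content-addressed cache key for a single build step. This is not Bazel's actual key format, just the general idea under my own assumptions: `action_key` and its parameters are hypothetical names, and the key covers the compiler version, the full command line, and a digest of every input file.

```python
import hashlib


def action_key(compiler_version, argv, inputs):
    """Hypothetical cache key that changes if the compiler, flags, or any input changes.

    `inputs` maps a file name to its content as bytes.
    """
    h = hashlib.sha256()
    h.update(compiler_version.encode())
    h.update(b"\0")
    for arg in argv:
        h.update(arg.encode())
        h.update(b"\0")
    # Sort by name so the key does not depend on dictionary ordering.
    for name in sorted(inputs):
        h.update(name.encode())
        h.update(b"\0")
        h.update(hashlib.sha256(inputs[name]).digest())
    return h.hexdigest()


base = action_key("gcc 9.2.0", ["gcc", "-O2", "-c", "a.c"], {"a.c": b"int x;", "a.h": b""})
same = action_key("gcc 9.2.0", ["gcc", "-O2", "-c", "a.c"], {"a.h": b"", "a.c": b"int x;"})
new_compiler = action_key("gcc 9.3.0", ["gcc", "-O2", "-c", "a.c"], {"a.c": b"int x;", "a.h": b""})
new_input = action_key("gcc 9.2.0", ["gcc", "-O2", "-c", "a.c"], {"a.c": b"int y;", "a.h": b""})
```

With timestamp-based tools a compiler upgrade usually goes unnoticed unless you clean the tree. With a key like this, `new_compiler` differs from `base`, so the step is rebuilt, and the same key can be used to look up results in a shared cache.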

The only build system I had used which could, with crazy effort, provide some of those features, is iirc ClearCase (or whatever it used for version control and management), and it was painful in every way and still didn’t enforce it - just had enough to allow you to do it with great effort (which gmake/cmake don’t even do)

If this is because Google is pushing it, two things don’t make sense to me.

- Why would you clone an open-source product because Google is pushing it?

- Why are some of the clones older than Bazel? Bazel was open-sourced on Sep 8, 2015, going by first tag, and Buck two days later, so Buck must have been in development for a while. Pants was late 2014, and Please was early 2016.

> Regarding everyone at google wanting this, before entering industry, most folks never used a proper build system at all.

Could you give an example of a “proper build system”? Even if this applies to most folks, Google hires so many people, and there are enough old-timers there (hell, Stuart Feldman was a VP of Engineering) with industry experience.

One of the fundamental problems here is that for large enough projects or multi-project repos, it becomes burdensome just to load and evaluate all of the build rules. The inevitable result is that you have to split the build rules across multiple files for different parts of the tree, and people have been doing this for decades, this is nothing new. If you look at existing build systems, there are very precious few that will actually traverse the dependency graph well across these different subinstances. Recursive make sure doesn't do this very well (see Recursive Make Considered Harmful, 1997). IDEs usually do a good job of this but they only solve a narrow set of problems to begin with.

Once you start with the goal of solving this particular problem well, my claim is that your solution will have some surprising similarities to Bazel (unless you ditch some other objective like incremental builds).

I don't even disagree with what you are saying. I just don't know if it is succeeding because of the quality of the solution, or the weight of the author.

My hunch is it is a little of both. Combined with a giant dose of not caring about existing solutions. Which is not to be scoffed at too heavily. Being able to say "I don't care" about existing users is a huge asset that any startup should not discard. One of your biggest assets is the lack of external assets to be liabilities.

Still leaves me sour to see the hubris that there is something intrinsic to this. There likely isn't.

First there was blaze, the closed-source google internal build tool.

People left google and wanted to copy blaze, hence pants and buck were created.

Then google released bazel, an open source version of blaze.

Companies using bazel [1]:

* Asana * Databricks * Dropbox * Google * Huawei * Lyft * Pinterest * Stripe * TwoSigma * Uber ...

This is not a new project that needs pushing.

[1] https://github.com/bazelbuild/bazel/wiki/Bazel-Users

This seems silly. If I was to make a similar list of companies using Make, would that have deterred building bazel? Because I guarantee the number was non-zero and included some major players.

Bazel is hardly new, it’s been in use at Google for many many years as Blaze.

This is misleading. Bazel is full of brand new code and only parts of it have been in use at Google as part of Blaze.

No. I work on Blaze/Bazel. The vast majority of the code is shared between Blaze and Bazel. Examples of Bazel-specific code include support for Windows and support for external deps (needed for multirepos and interaction with package managers).

The core is exactly the same.

My employer encountered lots of basic bugs, like file handle exhaustion, which together with the amount of recent activity in GitHub, suggested to me that there is a lot of new code in Bazel. If it's essentially the same as Blaze as you say, then why the new name?

I was there when the name was selected. It turns out "blaze" suggests speed, which is why it was a popular name for other software, and some of those names were registered as trademarks.

I side with you. Why build Angular, React? There was jQuery...

After all, there are only tools. If you need a better tool, build it and ignore the naysayers. Everything is an experiment. If it inspires people, the better.

Sure, but don't ignore the fact that new things are technical debt. Maybe they're worth it, but they're still debt.

I don’t think that you and I have the same idea of what “technical debt” is.

Technical debt is when you prioritize short-term gains and pay for it with additional maintenance burdens long-term. In my experience, Bazel adoption can be the opposite—replace your custom scripts with Bazel, make the investment now, and deal with less maintenance later.

I’ve dealt with some really painful in-house build systems, and migrated some of them to Bazel. That was paying off technical debt.

Oh, definitely. Painful in-house build systems are way more technical debt than a nice standardized one, so this could definitely be a winning move. Just remember that all code is still debt.

I'd like to share two responses to the sentence "all code is technical debt".

1. A user named gnus-migrate [1] says it well: "I think you completely misunderstood the concept of technical debt. It was just a metaphor created to explain to non-technical managers the need for refactoring. Saying all code is technical debt makes the term meaningless. / I think that you're saying that even clean code needs to be maintained. While that is true it has nothing to do with the idea of technical debt."

2. Eleenrood responds: "Actually it is logical consequence of Cunningham['s] idea. If you assume that overcomplicated solutions generate debt, you can also assume that best solution generate also some amount of debt. Just way smaller than non-optimal solutions. It actually simplify it, cause you don't have to come up with magical line between "debt clear code" and "code adding technical debt". Instead you weight code, which adds more technical debt than other. Your best solution in this case is the one with smallest technical debt weight."

I will admit my preference, at the start of writing this comment, was for the first. It just seems nicer to define perfect code as 0 technical debt, right?

On the other hand, I agree that all code has maintenance costs. (At the bare minimum, from time to time, someone has to read it to revel in its perfection and convince themselves that it still remains untarnished given changing business requirements!) So, more points for definition #2.

Next, let's compare software development to factories and machines (what could go wrong with this metaphor?). Yes, running and maintaining factory equipment has ongoing costs. But I would not say that planning for maintenance costs is the same as incurring debt. Hmmm, more points for definition #1.

On the other hand, if a factory is given the choice between (a) buying outright and doing its own maintenance versus (b) renting equipment that includes both the cost of the equipment and the maintenance, you would expect in a perfect (idealized) market that both would cost the same. ... (Slight tangent: Accountants are sticklers about how companies account for costs. Some systems differ in how owned equipment and rented equipment are treated. I'm no expert, but I have a healthy, painful respect for these considerations.) ... All of which would suggest that sooner or later, one way or another, you have to pay for that maintenance. Call it a cost or call it debt, does it really matter? In both cases, if you want to make widgets, you have to pay it. So, points for definition #2 for acknowledging that code requires maintenance which, like it or not, implies a debt/cost.

So, I can see why people prefer either. I kind of like to embrace the tension, so I like both.

But if you need resolution, it can be found! :) The difference between these views is only a matter of the baseline.

    Another Arbitrary Table

                     technical debt    technical debt
    option           definition 1      definition 2
    ------           --------------    --------------
    perfect code     0                 300
    imperfect code   400               700

In both cases, the net difference is 400.

Of course, this works for addition, but not so well for division. Since division by zero is undefined, I tend to prefer definition #2!

[1]: https://www.reddit.com/r/programming/comments/8w8s03/all_cod...

That's not really the problem Bazel is trying to solve.

Yes, a big design requirement is to be polyglot, but if that's all you want, Make has been around thirty years.

Bazel attempts to solve the problem of performance and correctness in large codebases.

Most build tools do not scale well.

Bazel is almost a standard for build systems at this point.

Did you ever use it, or at least see what a bazel build rule looks like [1] before calling it arcane?

[1] https://docs.bazel.build/versions/1.0.0/tutorial/cpp.html#un...

Build rules for bazel are straightforward when you’re on the happy path & trying to do something that bazel already understands. It’s when you’re not on the happy path that things get awkward.

E.g. How easy is it to integrate a new C++ compiler into bazel? (and I don’t just mean what you get by changing CC to point somewhere else. I want the full integration with the bazel build system in order to get those hermetic builds that are the whole point of using bazel in the first place).

How easy is it to build things which are not in the set of blessed languages? What if I have a Haskell component? Can I easily integrate that into my bazel build? Last time I tried the answer was definitely not.

If you can use the pre-defined BUILD rules then bazel is pretty great. If you can’t and have to start wading into the weeds of bazel’s hinterland then it rapidly stops being so. If you’re inside Google, then you’re probably fine because a blaze/bazel engineer is ultimately only an email away. For the rest of us outside the Googleplex, extending bazel is an extended exercise in frustration in my experience.

There's genrule() which basically runs a shell script. With that, you can build anything for which you can write a command-line. And you can go as far as invoke a binary that is built by another bazel rule.

You can also define your own rules in a subset of Python (by wrapping other rules including genrule()), so adding new languages should be simple enough.

Based on my experience with bazel the phrase “should be simple enough” is doing an awful lot of work in that sentence ;)

It’s just a SMOP, right?

> Bazel is almost a standard for build systems at this point.

what a standard, used by a whopping 2% of the C++ community : https://www.jetbrains.com/lp/devecosystem-2019/cpp/

Bazel has yet to even land in Debian, it's some years short of being the standard.

Bazel is the bane of every software packager who needs to use it and who does not work for Google. GN (or whatever it's actually called) is a close second in terms of horridness and NIH-ness, then GYP (and depot_tools too, may as well chuck that in there). Guess which company foisted all this on us?

I think it's THE standard. Almost all teams at Google use it. It's just the few people outside of Google who are still using legacy tools.

I wouldn’t call Bazel confusing or arcane, but it is powerful. I find it rather delightful to use.

As a relative newcomer to bazel, I did find it arcane:

- I couldn't find an easy way to list targets

- in particular, a way to show me the path to the build artifacts (so that I could inspect it in my editor)

- I couldn't find a way to do build configurations (especially intersections of build configurations)

- it didn't seem great at handling non-file build steps (set up a cluster / server / whatever so I can run my tests)

- the documentation was slow to load and hard to find things in (you basically had to use search, instead of browsing a good table of contents)

As I said, though, I'm a newcomer; it's quite possible there are good ways to do those things, and I just haven't managed to find them in the documentation yet. That, and the issues are by no means unique to bazel.

I also do like some aspects of it (like enforced out-of-tree builds), but that's off-topic ;)

Listing targets in a package (quoted so the shell does not try to expand the `*`):

    bazel query '//some/package:*'

Listing targets in a package including subpackages:

    bazel query //some/package/...

These and more are covered in the documentation on querying: https://docs.bazel.build/versions/1.0.0/query.html#target-pa...

Configs: look into select() and modes.

Non-file steps: normally in bazel you would consider the server setup a part of the test. So `bazel build :my_test` creates an artifact, and `bazel run` or `bazel test` runs the artifact, which does non-hermetic stuff like setting up a server.

Bazel's goal wasn't (explicitly) to be simpler or more straightforward (though I'd argue that its restrictions do cause that in many cases). It was built to solve problems that other build systems (make-like) don't: hermeticity and reliability of builds.

Except half of the N were Xooglers implementing their own clone of Blaze.

obligatory: https://xkcd.com/927/

That's the one!

What's the advantage of using Bazel over CMake? What does one do about package management?

For packages, you download things by specifying an archive (zip file, git repo, etc), and a SHA256 hash. Think of it as extremely-precise version pinning. You can also refer to other Bazel projects and use their internal rules if permitted.
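
To make the "archive plus SHA256" pinning concrete, here is a minimal Python sketch of the check itself. It is not Bazel's implementation; `verify_archive` and the archive bytes are hypothetical names and data made up for illustration.

```python
import hashlib


def verify_archive(data: bytes, expected_sha256: str) -> bytes:
    """Return the archive bytes unchanged if their SHA-256 matches the pin."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256.lower():
        raise ValueError(
            f"checksum mismatch: expected {expected_sha256}, got {actual}"
        )
    return data


# Hypothetical bytes standing in for a downloaded zip or tarball.
archive = b"pretend this is a release tarball"
# In a real project the pin is written down once and checked on every fetch.
pinned = hashlib.sha256(archive).hexdigest()
verify_archive(archive, pinned)
```

If a single byte of the archive changes, the digest no longer matches and the fetch is rejected, which is why this pins a dependency more tightly than a version number does.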

The biggest advantages of Bazel over something like CMake are:

1) It is very general, so you can easily express dependencies of this sort:

- concat three text files, then

- run them through a parser generator, then

- use the generated parser in a C library, then

- use that C library in a Go server library and a Rust client library, then

- package the Go server into a Docker container and build an iOS binary that uses the Rust client

This is a cross-language dependency chain that also runs a few Unix tools. In this example, if you edit the Rust client, only the Rust library and the iOS client will be rebuilt. If you edit one of the text files that compose the grammar, everything will get rebuilt.

The dependencies are also a DAG, so if your projects have a shared Go library and you change that, the Go server would get rebuilt, as would the Docker container.
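
The rebuild behaviour described above can be sketched as a walk over the dependency graph. This is a toy Python model, not how Bazel works internally; every target name here is a hypothetical label for one step in the example chain.

```python
from collections import deque

# Edges point from a target to the targets that depend on it directly.
# All names are hypothetical labels for the steps in the example above.
DEPENDENTS = {
    "grammar_txt": ["parser"],
    "parser": ["c_lib"],
    "c_lib": ["go_server_lib", "rust_client_lib"],
    "go_server_lib": ["docker_image"],
    "rust_client_lib": ["ios_binary"],
    "docker_image": [],
    "ios_binary": [],
}


def targets_to_rebuild(changed: str) -> set:
    """Return the changed target plus everything downstream of it."""
    dirty = {changed}
    queue = deque([changed])
    while queue:
        for dependent in DEPENDENTS[queue.popleft()]:
            if dependent not in dirty:
                dirty.add(dependent)
                queue.append(dependent)
    return dirty
```

Calling `targets_to_rebuild("rust_client_lib")` touches only the Rust library and the iOS binary, while `targets_to_rebuild("grammar_txt")` marks every target in the graph.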

While all this might seem distressingly general, it removes the need for things like "bootstrap.sh" files, "build.rs" files, and all of the unstated dependencies they entail.

2) If you download all of the toolchains (C++, Java, Rust, Go, etc), it's impossible to have "works on my machine" problems. Exceptions are use of "local_*" rules, and Apple binaries (you need machine-local SDKs).

3) This allows great caching. You can run the equivalent of ccache for all of this stuff, locally or shared on a LAN. RBE is even more advanced.
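
A rough Python sketch of why hermetic actions cache so well: if the output depends only on the command and the declared inputs, a hash of those is a safe cache key. The `ActionCache` class and its key layout are assumptions for illustration, not Bazel's actual cache format.

```python
import hashlib


class ActionCache:
    """Toy content-addressed cache keyed by command plus input contents."""

    def __init__(self):
        self._store = {}
        self.executions = 0  # counts how often an action really ran

    @staticmethod
    def key(command: str, inputs: dict) -> str:
        digest = hashlib.sha256()
        digest.update(command.encode())
        # Sort by file name so the key does not depend on dict ordering.
        for name in sorted(inputs):
            digest.update(b"\0" + name.encode() + b"\0")
            digest.update(hashlib.sha256(inputs[name]).digest())
        return digest.hexdigest()

    def run(self, command: str, inputs: dict, action):
        cache_key = self.key(command, inputs)
        if cache_key not in self._store:
            self.executions += 1
            self._store[cache_key] = action(inputs)
        return self._store[cache_key]
```

Because the key is derived from content rather than timestamps or machine state, the same dictionary could just as well live on a shared server, which is roughly the idea behind a remote cache.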

(1) sounds like table stakes for any build system - `make` has done all that for decades

Make cannot guarantee that this is the case. Bazel sandboxes all builds and hides them from non-reproducible things (internet, files they shouldn't get to see, etc.) to make it "impossible" for you to have a build that isn't specified correctly in the DAG.

Well, sort of. The way make tracks changes triggers more rebuilds and the build products are much less reusable. There's a good comparison table at the bottom of section 2.4 in https://www.microsoft.com/en-us/research/uploads/prod/2018/0...

Bazel's dependency graph tends to be much more detailed than Makefiles, extends all the way to toolchains at the base, and all the way up to build products and test results.

It can also be composed together across huge file layouts in ways make doesn't support.

I used buck at Facebook quite a bit, and it's my favorite build system. I assume most of the stuff I liked about buck translates to bazel:

* Build rules were extremely simple to write

* Sandboxing meant we could safely share a global cache of build artifacts, massively reducing build times

* It runs a persistent server process that caches dependency and file change state, so incremental builds are fast

* Buck had a lot of android-specific magic to make APK generation fast as well, not sure if that's in bazel.

A lot of the stuff that's nice about these build systems only really becomes an issue on large codebases (thousands of engineers, tens of millions of LOC).

The main weakness of buck (not sure about bazel) was handling 3rd party code; either you'd check in prebuilts or port the build to buck.

In my opinion Bazel/Blaze and Buck are mainly useful to solve the problem of having too much code to build on one machine. Reproducible builds which can be done in a distributed fashion aren’t really useful when you can build your whole project in a minute or two on one machine. That being said, once you get the hang of it the overhead is not too high compared to simpler systems.

Xoogler here who used the internal version of Bazel (blaze) extensively, liked it, and has now opted to use it for our 2-person startup (at least for the backend).

You don't need a giant codebase to start benefitting from it. Even if you have a smaller codebase, and properly use bazel to setup tests, you only test what needs testing, and that makes a huge difference in productivity.

What languages are you using with Bazel? I have a project that uses Typescript and Python. Are you using Bazel with interpreted languages?

Bazel has experimental but pretty sophisticated support for TypeScript, with arguably a better compilation experience for large projects than the best known alternatives. (Disclaimer: I'm biased because I work on it, but I also work on non-bazel TypeScript projects so I am aware of the tradeoffs.)


None of the last three companies I worked at could build in under a minute, sadly.

There are security reasons to want a reproducible build.

Does anyone know of a real beginner-level guide to Bazel that isn’t the docs? I've got a TypeScript monorepo with a few backend services/React frontends I wanted to build, and I got a bit lost in the complexity of it due to package.json handling.

We wrote a bit about how we build a TS monorepo with Bazel on our company's blog [1] if that can be helpful

[1] https://dataform.co/blog/typescript-monorepo-with-bazel/

thanks will check it out

One thing that I like about bazel is the way it integrates test and build actions to the same action graph and behind the same CLI. Tests are then also hermetic and sandboxed, and can be cached (or executed) remotely.

For example, you could configure your CI to push test results to a remote cache, and when developers check out a clean master, “bazel test //...” against that cache will report all tests as passing without running anything.

The lack of backwards compatibility between Bazel versions has been the deal-killer for me thus far. Old build instructions should work with newer releases of the same tools. It's important for the tools to remain compatible because you can't change the instructions contained in old releases.

I hope the release of 1.0 is an indication that Bazel is taking interface stability more seriously.

Right. Every time I build tensorflow I have to build bazel first, or get the bazel version for the tf version.

I have to say, having used a number of Google's tools over the years: bazel, coral sdk, maps api and perhaps others, I'm always surprised at how 'how ya doing' the experience is. Don't get me wrong, I appreciate their tools and their openness, but I wish Google spent a bit more time doing software engineering, and less throwing shit against a wall ascertaining stickiness.

> Every time I build tensorflow I have to build bazel first, or get the bazel version for the tf version.

This is fixed now: https://github.com/tensorflow/tensorflow/commit/991aec351b57...

If you use Bazelisk (https://github.com/bazelbuild/bazelisk), you'll automatically get the matching Bazel version for TensorFlow in the future (and all other projects that have adopted the .bazelversion file).
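
For anyone wondering what "getting the matching version" amounts to, here is a small Python sketch of the lookup as I understand it: an explicit `USE_BAZEL_VERSION` environment override wins, otherwise the nearest `.bazelversion` file found by walking up from the current directory is used. This is an approximation written for illustration, not Bazelisk's code, and `resolve_bazel_version` is a hypothetical helper name.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_bazel_version(start_dir, env=None) -> Optional[str]:
    """Pick a Bazel version: env override first, then nearest .bazelversion."""
    env = os.environ if env is None else env
    if env.get("USE_BAZEL_VERSION"):
        return env["USE_BAZEL_VERSION"]
    directory = Path(start_dir).resolve()
    # Check the starting directory, then each parent up to the filesystem root.
    for candidate in [directory, *directory.parents]:
        version_file = candidate / ".bazelversion"
        if version_file.is_file():
            lines = version_file.read_text().splitlines()
            return lines[0].strip() if lines else None
    return None  # Assumption: the caller falls back to some default version.
```

The point is that the version pin travels with the repository, so checking out an old release also tells the launcher which Bazel to download.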

Awesome. Bazelisk provides exactly the sort of compatibility I was hoping to see.

Though, manually specifying a maximum Bazel version doesn't seem like it should be necessary. The authors of Bazel know what features they have changed since the version specified as the minimum, and Bazel itself knows what features the BUILD file uses. That information together could be used to calculate the maximum version.

Yes. 1.0 indicates the beginning of semantic versioning, which was not in place during the alpha and beta phases.

Exactly this. It's not fair to complain about breaking changes in a product pre 1.0.

Also, unless they had wildly different versioning pre 1.0, they could have just said "we are releasing 1.0". semver already states that 0.y.z should not be considered a stable API [1].

[1] https://github.com/semver/semver/blob/master/semver.md
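
The semver rule being discussed fits in a few lines of Python. This sketch assumes plain `MAJOR.MINOR.PATCH` strings with no pre-release or build suffixes and assumes `new` is not older than `old`; `may_break` is a hypothetical helper, not part of any Bazel tooling.

```python
def parse(version: str) -> tuple:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def may_break(old: str, new: str) -> bool:
    """True if semver permits incompatible changes between old and new."""
    old_v, new_v = parse(old), parse(new)
    if old_v[0] == 0 or new_v[0] == 0:
        # Major version zero is initial development: any change may break.
        return old_v != new_v
    # From 1.0.0 on, only a major version bump may break compatibility.
    return new_v[0] != old_v[0]
```

So an upgrade between two 0.x releases is allowed to break under the spec, while moving from 1.0.0 to a later 1.x release is not.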

Is it worth giving up language-specific build tools (CMake, Maven, Gradle and others) for one build tool to rule them all?

My experience with using the Google-internal version of Bazel is that yes, you want to give up all other tools to use this. Having a proper dependency graph makes it possible to cache maximally and accurately, and that results in fast builds. You don't need it if you only have like one Go program that you're building, but you will start to see the disadvantages when you have a handful of things you build. Did you build all the apps that depend on the proto you updated? Do you need to build your PHP app because you changed an internal Go library? Tools like "docker build" have no idea and tend to rebuild too much or too little, and yield different results on different machines. This causes a lot of headaches.

The thing that kills me about Bazel is that it is quite painful to deal with Java. I never seem to get a working version on a Linux distribution. On Windows it's always sitting down in the taskbar showing ads. Oracle calls you to demand money. You check the bounds of an array and they sue you. It's just not worth it. So I don't actually use Bazel.

Dude, IDK if you have used bazel OUTSIDE of Google, but I've used it both inside and outside of Google and I can tell with 99.999% confidence that the experiences are dramatically different.

Bazel != Blaze (internal version of Bazel).

1. Google has a well-maintained monorepo. Most companies don't. That diminishes the meaning of a good build system in the very first place; even if Bazel is powerful, with separate repos its power isn't the shiniest.

2. Google open sources a part of Blaze, which is the external Bazel, but not all of it. That's what Google does with basically everything it open-sources. The outcome, then, is that the external tool requires you to build a lot of other tools that were not open-sourced to replicate the excellence of the entire Google-internal ecosystem that makes the internal tool so amazing. Same goes with Bazel. It works amazingly internally because there are so many people supporting it, and so many other tools work nicely with it. But not all the features and tools are open-sourced together with Bazel, so the difference in experience internally vs. externally is significant.

TL;DR: if you're a small company without a good monorepo strategy and someone experienced in this, DON'T EXPECT BAZEL TO BE YOUR MAGIC CURE.

Disclaimer: my current company uses Bazel pre-1.0 and it's definitely dramatically worse than what I used internally at Google. I haven't tried 1.0 but I don't expect the ecosystem problem to be solved in 1.0 anyways. Also Bazel sort of sucks for python anyways. Really hoping someone can prove me wrong and say 1.0 is actually the lit shit.

Can you quantify what you mean by worse? What makes you feel that way?

> I never seem to get a working version on a Linux distribution. On Windows it's always sitting down in the taskbar showing ads.

I've never experienced any of this. Maybe because I use OpenJDK?

I use Amazon Corretto. There are a few other OpenJDK distributions.

> Having a proper dependency graph makes it possible to cache maximally and accurately, and that results in fast builds.

Pretty much every other build system (e.g., make) is rooted in having a proper dependency graph. This is not a unique property of Bazel.

That's right. However, there are distinct implications on how this dependency graph is constructed, analyzed, and evaluated. "Build Systems à la Carte" (Mokhov, Mitchell, Peyton Jones) [1] goes deeper into formalizing these differences.

[1] https://www.microsoft.com/en-us/research/uploads/prod/2018/0...

Make (to use your example) does not do any sandboxing to ensure that the declared dependency graph is actually correct. This is the big innovation in Bazel (and Co).
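
As a toy illustration of why sandboxing catches missing dependency declarations, here is a Python sketch that runs a build step in a temporary directory containing only the declared inputs. Real sandboxes rely on OS-level isolation rather than copying files; `run_sandboxed` and `concat_action` are hypothetical names for this example.

```python
import shutil
import tempfile
from pathlib import Path


def run_sandboxed(workspace, declared_inputs, action):
    """Run action(sandbox_dir) in a temp dir holding only declared inputs."""
    with tempfile.TemporaryDirectory() as sandbox:
        sandbox_dir = Path(sandbox)
        for rel_path in declared_inputs:
            destination = sandbox_dir / rel_path
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(Path(workspace) / rel_path, destination)
        return action(sandbox_dir)


def concat_action(sandbox_dir):
    # Hypothetical build step: it reads two files, so both must be declared.
    first = (sandbox_dir / "a.txt").read_text()
    second = (sandbox_dir / "b.txt").read_text()
    return first + second
```

If `b.txt` is left out of `declared_inputs`, the step fails with `FileNotFoundError` instead of silently reading a file the build graph does not know about, which is the property a plain Makefile cannot enforce.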

Electric Make has had sandboxing since 2002, no conversion from your familiar make-based builds to a new shiny build tool required, and it can make on-the-fly corrections to execution order if that sandboxing reveals that incomplete dependency specifications caused something to run in the wrong order (relative to a strictly serial build).

"Sandboxing" in a build tool cannot be claimed as Bazel's innovation.

Can you link to the project?

Sorry for the belated response. Electric Make is part of CloudBees Accelerator: https://www.cloudbees.com/cloudbees-accelerator.

He was probably talking about a product from Electric Cloud:


How do you have a Java app that shows ads in the taskbar?

Doing Java since it was introduced in 1996 and never got to see Java ads on the taskbar, must be a special version.

On Linux I just install a headless OpenJDK 1.8. Something similar is available through brew on the Mac.

Oracle’s version is dead as far as I care.

Who do you think writes 90% of OpenJDK and drives language design?

Long-run, yes. Projects are too often multi-language these days, and if you mix C++ and Java, there might not be a clear or convenient seam between the C++ and Java parts.

Short-term, it will take a while after 1.0 for the surrounding tools and libraries to stabilize. It will take a little while for better documentation, blog posts, etc.

Personally, I find it a huge step up from e.g. CMake for C or C++ projects. But I’ve been using it for a while.

Depends on your project. My experience is that if you use a single language throughout, then your language-specific build tool probably suffices. But once you go polyglot, interactions between these language-specific build tools are bad and you'd be better off replacing all of them with one tool that understands everything. Especially if you have cross-language dependencies (e.g. your Go program depends on C++ using cgo, or your Python has a module written in C). Whether or not you want that tool to be Bazel is your choice.

My only experience with Bazel is using it for Tensorflow (TF). It seemed very roundabout for what was required and definitely had some painful version issues as TF got closer to 2.0. Hopefully this means more stability and ease of use going forward for TF users.

> We will have a window of at least three months between major (breaking) releases

That seems... short

Well you can keep using the LTS releases. I can't find offhand how long "long" is. Presumably much longer than three months...

It would be great to see benchmarks against gradle.

Gradle can become really slow in large projects.

Anyone had any experience using Bazel with Angular projects?

How is it different than Buck or Pants? What makes Bazel better than them? If I already use Buck or Pants what benefit would I get if I move to Bazel?

Bazel is more popular, with a larger community and better language support than the other two. It seems that Bazel has "won".

Looking forward to seeing this as a requirement in projects that already require npm, python, cmake, gradle and ninja to compile :(

What does Bazel do exactly?

Bazel is an open-source build and test tool similar to Make, Maven, and Gradle. It uses a human-readable, high-level build language. Bazel supports projects in multiple languages and builds outputs for multiple platforms. Bazel supports large codebases across multiple repositories, and large numbers of users.

It also builds code repeatably & hermetically (interpreted languages excepted; at Google, Python is built super hermetically, where a Python binary + environment is bundled with Python script(s) into an executable; some companies like Dropbox have copied this). You run build, and you get binaries for N kinds of systems (e.g. x86_64, ARM). It builds a dependency graph of your code that you can query, and that results in cacheable builds (only rebuild the part you need). Since building dependencies in a tree can be massively parallelized, it also allows you to parallelize your builds well across machines (with some effort).
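
The parallelism claim can be sketched with Python's standard `graphlib` module (Python 3.9+): targets whose dependencies are all built can run at the same time. The graph below is a hypothetical four-target project, and grouping into batches is a simplification of what a real scheduler does.

```python
from graphlib import TopologicalSorter

# Each target maps to the set of targets it depends on (hypothetical names).
DEPS = {
    "app": {"lib_a", "lib_b"},
    "lib_a": {"base"},
    "lib_b": {"base"},
    "base": set(),
}


def parallel_batches(deps):
    """Group targets into batches; everything in one batch can build at once."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all dependencies already built
        batches.append(ready)
        sorter.done(*ready)
    return batches
```

For this graph, `base` builds first, then `lib_a` and `lib_b` together, then `app`; with many independent libraries the middle batches get wide, and that width is what remote workers can exploit.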

You use the one tool to compile C++, Java, Python, Go, Rust, etc.

Learn CMake.


Maybe it's because your stated goal is baiting a flamewar in bad faith.

It's probably because you came off as a bit unhinged and opened with a salvo against HN (which many people here understandably like).

If you'd phrased it as "meh, I like make" you'd probably have half a dozen upvotes from the minimalists.

For me the obstacle to using make is that it is not available for Windows, and even if I install it via msys2 or cygwin, it is hard to make Makefiles independent of OS.

I am not sure Bazel is the answer. BUILD files feel verbose and need more effort to maintain than platform-specific files (go.mod, Cargo.toml, pom.xml).

As long as you've got a basic Unix-like environment, what's hard about being cross platform? Last time I tried it was very simple, at least with higher level languages and using GNU make-specific features.

For build tools I think "defining their own platform" tends to be a better description than being cross platform: they have their own languages, tools and idioms. In this case it's on top of the Java platform as well.

Visual Studio (used to?) come with nmake, a mostly compatible version of Make.

<Microsoft Product> used to come with mostly compatible version of <Unix tool>.


I'm not sure if MS is to blame there, in my very brief experience it was compatible enough, it was the lack of unix tools and gnu extensions that made it difficult to work with. You'd probably get the same issue trying to be posix compatible.
