
Bazel – Correct, reproducible, fast builds for everyone - drivebyubnt
http://bazel.io/
======
habosa
Working at Google, Blaze is one of the technologies that amazes me most. Any
engineer can build any Google product from source on any machine just by
invoking a Blaze command. I may not want to build GMail from source (could
take a while) but it's awesome to know that I can.

I think this could be hugely useful to very large open source projects (like
databases or operating systems) that may be intimidating for contributors to
build and test.

~~~
OneMoreGoogler
> Any engineer can build any Google product from source on any machine

A little too optimistic :) You can't build Android, Chrome, ChromeOS, iOS
apps, etc. via blaze.

~~~
ovidiup
When I worked at Google I built a Blaze extension to be able to build Android
apps. It worked really well, though I'm not sure how well it was maintained
after I left in 2010. Internally at Google, Blaze was extremely customizable,
and I hope Bazel is too, so that one can easily add support for building iOS
apps etc.

EDIT #1: I see support for building Objective-C apps is already present in
Bazel. EDIT #2: Bazel uses Skylark, a Python-like language, which could be
used to implement all sorts of extensions, including the one I was referring
to.

~~~
rictic
There's an extension language in bazel named Skylark, which will be familiar
to you if you wrote build_defs internally:
[http://bazel.io/docs/skylark/concepts.html](http://bazel.io/docs/skylark/concepts.html)

------
thechao
I've been burned by so many build tools over the years. I've finally settled
(for C/++/asm) on the combination of Make + ccache: I build a _very_ paranoid
Makefile that recompiles everything if it feels like anything changes. For
instance, every rule that compiles a C/++ file is invoked if _any_
header/inc/template file changes. I let ccache do the precise timestamp/check-
sum based analysis. The result is that (for large builds < 10MMLOC) I rarely
wait for more than a few hundred milliseconds on incremental, _and_ I have
confidence that I never miscompile.
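
A minimal sketch of that style of Makefile (file names are hypothetical):
every object file depends on every header, so any header change triggers a
full recompile, and ccache turns the redundant recompiles into cheap cache
hits.

```make
# Conservative approximation: any header/inc change rebuilds everything.
HEADERS := $(wildcard *.h) $(wildcard *.inc)
SRCS    := $(wildcard *.c)
OBJS    := $(SRCS:.c=.o)
CC      := ccache cc

app: $(OBJS)
	$(CC) -o $@ $(OBJS)

# Every compile rule lists *all* headers as prerequisites; ccache does the
# precise checksum-based analysis and skips the actual recompilations.
%.o: %.c $(HEADERS)
	$(CC) -c -o $@ $<
```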

I just wish that I had a high-performance replacement for linking that was
cross-platform (deterministic mode for ar), and for non-C/++ flows. Writing a
deterministic ar is about 20 lines of C code, but then I have to bake that
into the tool in awkward ways. For generalized flows, I've looked at
fabricate.py as a ccache replacement, but the overhead of spinning up the
Python VM always nukes performance.

~~~
beagle3
> I build a _very_ paranoid Makefile that recompiles everything if it feels
> like anything changes.

Do you have some kind of way to verify that your makefile dependencies conform
to your source dependencies? Is clang/gcc tracking sufficient for your use
case? What about upgrading the compiler itself, does your makefile depend on
that? If so, how?

Have you considered tup[0]? Or djb-redo[1]? Both seem infinitely better than
Make if you are paranoid. tup even claims to work on Windows, although I have
no idea how they do that (or what the slowdown is like). Personally, I'm in
the old Unix camp of many-small-executables, none of which goes over 1M
statically linked (modern "small"), so it's rarely more than 3 secs to rebuild
an executable from scratch.

> (deterministic mode for ar)

Why do you care about ar determinism? Shouldn't it be ld determinism you are
worried about?

[0] [http://gittup.org/tup/](http://gittup.org/tup/)

[1] [https://github.com/apenwarr/redo](https://github.com/apenwarr/redo)

~~~
thechao
> Do you have some kind of way to verify that your makefile dependencies
> conform to your source dependencies?

Nope. I explicitly use a conservative approximation—this guarantees
correctness, over speed. Building everything every time with a clean tree is
where I begin; I start optimizing after that.

> Is clang/gcc tracking sufficient for your use case? What about upgrading the
> compiler itself, does your makefile depend on that? If so, how?

Self-rewriting Makefiles (to consume the .d files), combined with the cleaning
necessary for them, become a large technical debt—especially given the
complexity of the Makefile needed to generate them. Modern CCen just aren't
capable of this. Perhaps Doug Gregor's module system will land in C21/C++21,
and we'll see some good, then.

> Have you considered tup[0]? Or djb-redo[1]?

Yes. Neither provides significantly better correctness guarantees, combined
with sufficiently better performance, to justify the cost of porting to older
Unixen. (This is a consensus opinion at my shop; I, personally, enjoy tup.)

> Why do you care about ar determinism? Shouldn't it be ld determinism you are
> worried about?

Determinism lets me cache *.o/a/so/dylib/exe/whatnot without getting false
positives due to time-stamp changes and owner/group permissions in the obj/ar
files (see ar(1)). ld is deterministic under all the CCen I use by setting the
moral equivalent of -frandom-seed.
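
To illustrate the ar(1) point (assuming GNU binutils): plain archives embed
each member's mtime, uid, gid and mode, so archiving identical inputs at
different times can yield different bytes. The D modifier zeroes out that
metadata.

```shell
printf 'hello' > member.txt
ar rcsD one.a member.txt   # D = deterministic: zero timestamps/uid/gid/mode

touch member.txt           # bump the mtime; with D this must not matter
ar rcsD two.a member.txt

cmp one.a two.a && echo "archives are byte-identical"
```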

~~~
beagle3
Thanks.

> this guarantees correctness, over speed.

Wouldn't "promotes" be a better word? What guarantee do you have?

> Self-rewriting Makefiles (to consume the .d files), combined with the
> cleaning necessary for them, become a large technical debt—especially given
> the complexity of the Makefile needed to generate them. Modern CCen just
> aren't capable of this.

Haven't needed it in a long time, but back when I did, generating one for me
was all of running the compiler with "-MD" in the compile phase and including
it in the Makefile: no special "make depend" phase, no noticeable slowdown.
What technical debt are you referring to?
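
For reference, roughly what that looks like (the layout here is made up):
-MD makes the compiler write foo.d next to foo.o, and -include pulls the
generated fragments back in on the next run, with no separate "make depend"
pass.

```make
SRCS := $(wildcard *.c)
OBJS := $(SRCS:.c=.o)

# -MD emits a .d file listing the headers each translation unit pulled in.
%.o: %.c
	$(CC) -MD -c -o $@ $<

# Include the generated dependency fragments; the leading '-' makes make
# ignore them on the first, clean build when they don't exist yet.
-include $(OBJS:.o=.d)
```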

> Yes. Neither provides significantly better correctness guarantees, combined
> with sufficiently better performance, to justify the cost of porting to
> older Unixen.

Interesting. It is my experience that redo (from apenwarr) is trivial to run
and use anywhere there's Python and isn't Windows -- it's almost as fast as
Make, and it makes correctness guarantees that Make cannot (e.g., .o file
replacement is atomic).

~~~
thechao
Maybe... 'prefer'. I'm more confident that a really conservative Makefile will
build my code correctly.

My issue with -MD was not that it didn't provide precise (and correct!)
dependencies; my issue was that the build system's most mysterious breakages
are when modules (and dependencies) are changed. In that case, there are three
situations:

1. Your .d files are out-of-date, and thus your build is broken;

2. You have to have a policy of "updating the .d files"; or,

3. Your makefile has to be .d savvy.

The last option is the one I see most often taken, but with rare success.

> it makes correctness guarantees that Make cannot (e.g., .o file replacement
> is atomic).

I wish Make wasn't so entrenched.

------
latkin
Correct, reproducible, fast builds for everyone _not running Windows_

~~~
jacquesm
Convince your employer to ship a half-decent Unix environment with its OS and
it will run on Windows too. It's mostly a choice by Microsoft to ship a half-
baked command line interface with its products; you can't blame Google for
that.

~~~
NeutronBoy
By 'half-baked' you mean a 'not *nix-compatible' command line. Powershell is
_amazing_.

~~~
tacos
I'm a Microsoft fanboy and Powershell is bad Python.

How 'bout we meet in the middle and y'all make the changes the Cygwin guys
need? And hey, where'd that POSIX subsystem go? You finally got C99 support
almost shipped; now how 'bout you reverse that other horrible decision, too?

------
ngd
This is an open sourcing of Google's internal build tool.

I know it as Blaze, which Bazel is an anagram of. Many files in the source
have references to Blaze.

------
mashraf
Is Google departing from just throwing white papers over the wall and letting
the community figure out the implementation details? The Blaze white paper was
dropped a while ago, and there are already two clones, Pants and Buck, at
Twitter and FB. It would be interesting to see how far off the clones are from
the original implementation.

~~~
cbgb
Do you have a link to that white paper? A quick search on their research site
doesn't really yield any results.

~~~
kchod
I'm a developer on Bazel, and AFAIK there is no white paper. We definitely
don't want to "throw it over the wall," we're going to try to push more and
more development into the open over time.

~~~
bruckie
There's not a white paper, but there's this series of posts: [http://google-
engtools.blogspot.com/2011/06/build-in-cloud-a...](http://google-
engtools.blogspot.com/2011/06/build-in-cloud-accessing-source-code.html)

(and an accompanying presentation)

------
jacquesm
Getting rid of the timestamps in jar files is a huge improvement. I really
hate that when I recompile some huge Java project I can't run a checksum on
the jar to verify that the build is identical to a previous run (or, when
being dumped into some project, that my current source tree is an accurate
reflection of what is running in production).

------
zobzu
I had a bit of a read but I didn't find where it explains (code or doc) how it
achieves reproducible builds.

It seems like a stricter, huge make-like harness (in fact it reminds me of the
mozilla firefox python build system a bit).

It's not bad by any means, but it seems to me that it doesn't "magically" fix
the "be reproducible" problem at all (which is what it seems to claim).

Am I missing something?

~~~
lberki
You are absolutely correct: Bazel by itself does not make your builds
reproducible. If a tool calls rand() or bakes the current time into its
output, reproducibility goes out of the window.

What Bazel does, however, is to make it possible to run build steps in a
sandbox (although the current one is kinda leaky) so that your build is
isolated from the environment and thus behaves in the same way on any
computer. It also tracks dependencies correctly so that it knows when a
specific action needs to be re-run.

This makes it possible to diagnose non-reproducible build steps easily. At
Google, the hit rate of our distributed build cache usually floats around 99%,
and this would be impossible without reproducible build steps.
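
As a toy illustration of the rand()/current-time failure mode (a made-up
build function, not Bazel code): a step that bakes the wall clock into its
output defeats content-based caching, while pinning the timestamp restores
reproducibility.

```python
import hashlib
import time

def build(source: str, timestamp: float) -> bytes:
    """A toy 'build step' that stamps its output with a timestamp."""
    return f"built:{source}@{timestamp}".encode()

# Baking in the wall clock makes two otherwise identical builds differ:
out1 = build("main.c", time.time())
out2 = build("main.c", time.time() + 1)
assert hashlib.sha256(out1).hexdigest() != hashlib.sha256(out2).hexdigest()

# Pinning the timestamp restores reproducibility, which is what lets a
# content-addressed build cache get high hit rates:
pinned1 = build("main.c", 0.0)
pinned2 = build("main.c", 0.0)
assert hashlib.sha256(pinned1).hexdigest() == hashlib.sha256(pinned2).hexdigest()
```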

~~~
jrockway
Does work done by Debian to make Linux packages build reproducibly help Bazel?

[https://wiki.debian.org/ReproducibleBuilds](https://wiki.debian.org/ReproducibleBuilds)

Would Bazel help with the remaining long tail of packages in Debian?

------
yarapavan
Surprisingly, significant parts of the code are not open source. According to
this page,
[http://bazel.io/docs/governance.html](http://bazel.io/docs/governance.html),

    Is Bazel developed fully in the open?

    Unfortunately not. We have a significant amount of code
    that is not open source; in terms of rules, only ~10% of
    the rules are open source at this point. We did an
    experiment where we marked all changes that crossed the
    internal and external code bases over the course of a few
    weeks, only to discover that a lot of our changes still
    cross both code bases.

~~~
spankalee
I don't think you're interpreting that section quite right. That section is
talking about whether or not Bazel is fully _developed_ in the open, and the
answer is "Unfortunately not".

What they mean is that changes to the internal source of Blaze often involve
changes to both the open sourced part, which is Bazel, and the closed parts,
which are additional rules that are neither open sourced nor included in
Bazel (Blaze has about 5x as many rules as Bazel).

It's best to make atomic changes. Rather than splitting them (reviewing and
submitting the open source changes externally and the closed rules changes
internally, which would complicate reviews, testing, syncing and rollbacks,
and then pulling in the external changes), they submit these cross-code-base
changes internally, then dump the change into the external repo. The next
paragraph on that page makes it clear that the code is open, even if not all
of the development process is.

To be clear, all of Bazel is open source and the source is available here:
[https://github.com/google/bazel](https://github.com/google/bazel)

~~~
tarblog
Can you explain or give an example of a "rule"? It's unclear to me what this
means.

~~~
DannyBee
[http://bazel.io/docs/build-ref.html#rules](http://bazel.io/docs/build-
ref.html#rules) For example, cc_binary is a rule. Rules are the things that
know how to take whatever is specified as their inputs, do something to them,
and then produce some specified set of outputs.
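
For instance, a small BUILD file using that rule might look like this (target
and file names are made up):

```python
# cc_binary's inputs are the listed sources plus the outputs of its deps;
# its output is the "hello" executable.
cc_binary(
    name = "hello",
    srcs = ["hello.cc"],
    deps = [":greet_lib"],
)

cc_library(
    name = "greet_lib",
    srcs = ["greet.cc"],
    hdrs = ["greet.h"],
)
```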

Google has a large number of rules (i.e., far, far more than just the rules
you see in Bazel). As part of open sourcing, they have started out by open
sourcing about 10% of those rules.

Some of this is because they are Google-entangled; some of them don't make
sense to the open source community; etc.

------
cies
What would be needed to get this to work with Haskell?

I read in the "Getting started":

> You can now create your own targets and compose them.

So does this mean it is a replacement for `make`? => Yes

Found the answer here:
[http://bazel.io/docs/FAQ.html](http://bazel.io/docs/FAQ.html)

~~~
kchod
If you're interested in adding rules for a new language, check out Skylark:
[http://bazel.io/docs/skylark/concepts.html](http://bazel.io/docs/skylark/concepts.html).

------
w4tson
It's another impressive feat from Google, and reading the comments I've kind
of established that:

1. Binaries are checked in to source
2. It's more structured than Gradle
3. It's for very large code bases
4. It's *nix only

But...

1. We've already had the "chuck it in a lib directory" approach. The
distributed approach of maven/ivy etc. seems to be working for the millions of
developers out there who just have to get through to the end of the day
without production going up in flames. I suppose it's like moving a portion of
Maven Central into your code base, checked in. Feels very odd, and kinda
against one of the pillars of the JVM: Maven. Love it or hate it, it's one of
the most mature build/repository types out there. npm, bower anyone?

2. Got to agree with astral303. This isn't really something to shout about.
Better reproducibility? Gradle/SBT have had incremental builds for quite a
while. We all know there's no silver bullet: if you don't declare your inputs
and outputs to gradle/blaze tasks, or seed with random values, then you're
only going to get unreproducible builds.

3. Very large, I get that.

4. Very large code bases tend to be enterprise systems. Enterprise systems
tend to have a plethora of platforms/OSes, so it being *nix only is a
drawback. However, I suppose that if I were in charge of a 10MLOC code base
then I could mandate *nix-only builds? In my experience, though, they also
tend to gravitate towards standards that seem to have longevity.

I'm yet to give it a go, so I'll reserve final judgement. However, I will say
that I do wonder how far we'd be if Google threw its brightest minds at, and
worked with, Maven/Gradle/SBT etc. to scale their builds. (Yes, I realise it's
multi-lang; so is Gradle.) Perhaps the whole community would see the
performance benefits.

Anyway, hats off, Google guys. It looks impressive, and no doubt I'll be
jumping all over it in 12 months. In the meantime I'm off to go read up on
Angular 2.0, or TypeScript, or ES6, or ES7, or whatever else I _need_ to know
to get me through the day.

Really I'm just jealous I don't have a 10MLOC code base :D

~~~
cromwellian
I don't know about Bazel, but Blaze doesn't "check in binaries". Build
artifacts are cached, but not "checked in".

The problem with Maven and Gradle is that their build actions/plugins can
have unobservable side effects.

This approach is more 'pure functional'. You have rules which take inputs, run
actions, produce outputs and memoize them. If inputs don't change, then you
use memoized outputs and don't run the action.

As long as your actions produce observable side effects in the outputs (and
don't produce side effects which are not part of the outputs, but produce
state which is depended upon in some manner), then you can do a lot of
optimizations on this graph.
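
A toy sketch of that pure-functional model (all names here are invented):
actions are memoized on a digest of their inputs, so an unchanged input set
means the action never re-runs.

```python
import hashlib

cache = {}

def run_action(inputs, action):
    """Memoize a build action on a digest of its declared inputs.

    Only correct if the action is a pure function of those inputs --
    exactly the no-unobservable-side-effects property described above.
    """
    key = hashlib.sha256(
        b"".join(name.encode() + b"\0" + data
                 for name, data in sorted(inputs.items()))
    ).hexdigest()
    if key not in cache:
        cache[key] = action(inputs)
    return cache[key]

runs = []

def compile_step(inputs):
    runs.append(1)  # count how often the action really executes
    return b"obj:" + ",".join(sorted(inputs)).encode()

run_action({"a.c": b"int main(){}"}, compile_step)
run_action({"a.c": b"int main(){}"}, compile_step)  # unchanged: cache hit
assert len(runs) == 1

run_action({"a.c": b"int main(){return 1;}"}, compile_step)  # changed: re-run
assert len(runs) == 2
```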

In my experience with Maven and Gradle, they are way, way slower, and that's
on relatively small projects.

~~~
w4tson
Apologies for the comment; I'd just gotten home from the pub and was drunk :D

I look forward to trying it out. The Objective-C rules sound interesting,
especially given the state of Xcode, which is a laughable IDE.

------
setheron
If I'm sticking primarily to Java, is there a benefit to using Bazel as
opposed to Maven / Gradle / SBT?

~~~
astral303
At first impression, unless you have a single gigantic source code base,
unlikely. From their FAQ:

>> "Gradle: Bazel configuration files are much more structured than Gradle's,
letting Bazel understand exactly what each action does. This allows for more
parallelism and better reproducibility"

The value of "more parallelism" depends on the complexity of your Java source
code base. I can easily imagine why this extra structure can lead to more
parallelism.

However, I am not buying "better reproducibility" without justification or
explanation. I've had very reproducible Maven builds for years (and I don't
see how Gradle would be different). So I would love to know which aspects are
improved upon with this structure, if someone could expand or explain.

Finally, I'm very wary of "much more structure". The worst thing about Maven
is its extreme insistence on structure and schema and very specific
architecture of your build tasks and components. In contrast, with Gradle, you
can freely shape your build scripts to reflect the "build architecture" of
your source tree in a minimal, maintainable way. Furthermore, when your
application's needs change, refactoring your build is far easier in Gradle,
thanks to its internal-DSL style (the build script is code).

If the structure isn't "free", you pay for structure with reduced build script
development speed. For Google, it's a tradeoff worth having with that massive
source tree.

~~~
ulfjack
I work on Bazel.

We've put a bunch of work into making sure that we know about every file that
goes into the Java compilation, and if any of them changes (and only then) do
we recompile. Within Google, we use a form of sandboxing to enforce that.

You're also right that it isn't free - we have reason to believe that larger
projects and larger teams will see benefits from using Bazel. Use your best
judgement.

------
malkia
Oh, but my favourite option "blaze menu" is missing :)

~~~
asuffield
Huh. I never knew that was there. I'll remember this next time I'm around
Charleston.

------
pacala
A couple of questions:

* If I have a Maven-based project with heavy reliance on pre-built jars from Maven Central, what's the recipe to port it to Bazel?

* Related: if I have multiple GitHub repos, say a couple of open source libraries and a couple of private repos, what's a good recipe in conjunction with Bazel?

~~~
kchod
Check out [http://bazel.io/docs/build-
encyclopedia.html#maven_jar](http://bazel.io/docs/build-
encyclopedia.html#maven_jar). In the root of your build, specify the jars you
want from maven and then add them as dependencies in your BUILD files. The
first time you run "bazel build", they'll be downloaded and cached from then
on. It's somewhat limited in functionality at the moment, but should work for
basic "download and depend on a jar".
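
Something like the following, for example (the artifact coordinates are made
up, and maven_jar's exact attribute and label spellings have varied across
early Bazel releases):

```python
# In the WORKSPACE file at the root of the build:
maven_jar(
    name = "guava",
    artifact = "com.google.guava:guava:18.0",
)

# Then, in a BUILD file, depend on the downloaded jar like any other target:
java_library(
    name = "my_lib",
    srcs = ["MyLib.java"],
    deps = ["@guava//jar"],
)
```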

For multiple Github repos, use [http://bazel.io/docs/build-
encyclopedia.html#http_archive](http://bazel.io/docs/build-
encyclopedia.html#http_archive) or [http://bazel.io/docs/build-
encyclopedia.html#new_http_archiv...](http://bazel.io/docs/build-
encyclopedia.html#new_http_archive) (depending on if it's a Bazel repository
or not). Let us know if you have any questions or issues!

~~~
pacala
Thanks for the tips. I'm super-hyped that blaze was open sourced, it is one of
the best systems I've ever had the pleasure to work with.

A couple more questions :)

* Any pointers for adding Scala (sbt?) support? I'd start here: [http://bazel.io/docs/skylark/rules.html](http://bazel.io/docs/skylark/rules.html).

* Suppose I develop using multiple repos and http_archive. I'd like to make changes both to a library and to a project that depends on it simultaneously, without committing the library patches to master github repo just yet. Is there a way to configure the http_archive, let's say by saying "bazel --mode=local", and have it customize the remote archive http to use a different url (say, my github's fork instead of the master github) for that build?

~~~
kchod
Scala support: yes, add it using Skylark. Definitely let us know if you run
into any rough edges; Skylark is a work in progress.

For multiple repos: there's no command line flag, but you could change the
WORKSPACE file to use [http://bazel.io/docs/build-
encyclopedia.html#local_repositor...](http://bazel.io/docs/build-
encyclopedia.html#local_repository). Unfortunately, this may be of limited use
to you. At the moment it's optimized/bugged to assume that your local repos
don't change, so it won't rebuild them (this is great for things like the JDK
and gcc, but not so much for actual in-development repos). Feel free to file
feature requests for any functionality you need, I'll be working on this a lot
over the next couple months.

------
mikojava
Here's the Gradle Team's perspective on Bazel

[https://www.gradle.org/gradle-team-perspective-on-
bazel/](https://www.gradle.org/gradle-team-perspective-on-bazel/)

------
pjjw
Any reason the python support was ripped out? I've got my suspicions about not
wanting/not being able to properly release the python packaging method in use
internally, but I'm curious if I'd be tilting at windmills to try and get it
to output pexes.

~~~
DannyBee
I suspect the reason was: "They need to start with something and go from
there".

So they started with the use cases likely to be the most popular.

Additionally, there are definitely cases where the implementations of rules at
Google are a morass, and rather than dump it on the open source community, it
makes more sense to clean them up when they get rebuilt.

------
cromwellian
If only our code search and code review systems were public too.

~~~
solomatov
BTW, do you have a Blaze build for GWT? Ant seems unwieldy to me.

~~~
cromwellian
Internally at Google, gwt_application, gwt_module, and gwt_test are built-in
rules. GWT itself is built with Blaze internally (not Ant) as well.

~~~
solomatov
Do you have plans to open source this stuff?

------
pron
How does it compare with Java 9's sjavac
([http://stackoverflow.com/a/26424760/750563](http://stackoverflow.com/a/26424760/750563))?

EDIT: I fully understand that this is a build tool for multiple languages.
But its raison d'être is speed, so I'm asking what techniques Bazel uses to
accelerate builds and how they differ from those used by sjavac, which is
also designed to accelerate builds of huge projects.

~~~
hanwenn
I work on Bazel.

Bazel also builds other languages, such as C++ and Objective-C.

We do invoke the Java compiler through a wrapper of our own. We think we can
make that work as a daemon process to benefit from a hot JVM, but haven't
gotten round to that.

~~~
moondowner
Any plans on supporting Windows? That will definitely increase the adoption of
Bazel.

~~~
hanwenn
[http://bazel.io/docs/FAQ.html](http://bazel.io/docs/FAQ.html) - "What about
Windows?

We have experimented with a Windows port using MinGW/MSYS, but have no plans
to invest in this port right now. Due to its Unix heritage, porting Bazel is
significant work. For example, Bazel uses symlinks extensively, which has
varying levels of support across Windows versions."

In other words: it's a lot of work, and frankly, our team doesn't know enough
about Windows to be very good at porting it. We would welcome contributions to
make it work on Windows, of course.

~~~
MrBuddyCasino
That would be great. I'm not a Windows user, but having Windows support is a
pre-requisite for adoption in many corporate environments, and proper symlinks
are available since Windows Vista.

------
Zariel
Is this the tool that Google uses to build its Golang source? Or is that
something else which is not available?

~~~
kchod
The Go source code for server code at Google is built with this tool. The
rules that accomplish this are rather complex due to their interactions with
our C++ libraries, and predate the open source "go" tool. The experience with
the Google-internal rules motivated some of the choices in the "go" tool, I
believe.

If you're interested, hanwen wrote a bunch of rules with similar semantics to
the internal rules; see
[https://github.com/google/bazel/tree/master/base_workspace/e...](https://github.com/google/bazel/tree/master/base_workspace/examples/go/)
.

It would be nice to make these semantics match the external ones better, but
it requires us to open up more tooling, so people won't need to write BUILD
files.

~~~
zzzhao
In what cases would using Bazel make sense to build Go projects? If they're
extremely large? If they have a lot of dependencies on code in other
languages? If you need sophisticated build/release tooling?

BTW, thanks for the release! I'll have a fun time digging through this over
the next few days. I heard some murmurs around the watercooler that Blaze was
going to be open sourced, but I didn't think it'd be so soon.

~~~
hanwenn
I guess if you want to integrate Go tools with builds in other languages. If
you are using pure Go for your entire ecosystem, there is not much point in
using Bazel, as the "go" tool is very capable for that scenario.

------
shmerl
_> Why doesn't Google use …? Make, Ninja: These tools give very exact control
over what commands get invoked to build files, but it's up to the user to
write rules that are correct._

 _> Users interact with Bazel on a higher level. For example, it has built-in
rules for "Java test", "C++ binary", and notions such as "target platform" and
"host platform". The rules have been battle tested to be foolproof._

But does it give the optional custom level of control that, for example,
CMake + Ninja provide? Or does it offer only high-level rules?

~~~
blinks
[http://bazel.io/docs/skylark/concepts.html](http://bazel.io/docs/skylark/concepts.html)

You can [at least internally] define custom rules to handle pretty much
anything, in almost-but-not-quite-python.
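
A minimal Skylark-style rule sketch (the rule name and attributes here are
invented, and the exact API surface has shifted between releases): the
implementation function receives a context with the declared attributes and
registers the action that produces the output.

```python
def _concat_impl(ctx):
    out = ctx.outputs.out
    # Register one action: concatenate the declared sources into the output.
    ctx.action(
        inputs = ctx.files.srcs,
        outputs = [out],
        command = "cat %s > %s" % (
            " ".join([f.path for f in ctx.files.srcs]),
            out.path,
        ),
    )

concat = rule(
    implementation = _concat_impl,
    attrs = {"srcs": attr.label_list(allow_files = True)},
    outputs = {"out": "%{name}.txt"},
)
```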

------
frownie
From the FAQ :

Multi-language support: Bazel supports Java, Objective-C and C++ out of the
box, and can be extended to support arbitrary programming languages.

C'mon, not even the Go language from Google itself?

------
jibu
Maven doesn't work so well when there are loads of small, self-contained
'micro-libraries' (yes, sub-projects, but they are so involved to set up that
they almost defeat the purpose). I was considering Pants, which doesn't seem
like it has great adoption, but this seems like it's substantially more fully
featured.

Presumably it will also make open-sourcing internal projects easier. That
can't be a bad thing :)

~~~
spullara
WRT Java support: since it doesn't appear to generate POMs or publish to
Maven repositories, it doesn't seem very useful on the open source side of
things. It seems explicitly for generating internal, proprietary software
from a monolithic source tree. I would much rather have seen the incremental
compiler and jar generator integrated into Maven than the entire build system
replaced.

~~~
alblue
Actually, Maven 3.3 was released recently, which has a smart builder for
building separate parts in parallel, and using Takari plugins you can use the
Eclipse compiler, which is parallelising in itself. See
[http://takari.io](http://takari.io) for more details.

------
nchelluri
I worked at Ning for a couple of years
([http://www.ning.com/](http://www.ning.com/)) and the internal codename of
our create-your-own social network was Bazel.

When I first saw the headline I thought they'd open-sourced it.

------
danneu
The "b"-with-leaves-sprouting-from-it logo is also used by
[http://beanstalkapp.com/](http://beanstalkapp.com/)

------
brooksbp
Will GYP/GN be deprecated in favor of Bazel?

What, if anything, does the convergence among these projects look like
longevity-wise?

------
forrestthewoods
Will there ever be Windows support?

------
hbhakhra
This seems very promising. Does anyone know if this would work with the OSGi
framework?

------
zerr
Fast - compared to what?

------
bubersson
Wohoo! This is awesome :)

------
toolslive
Depends what you mean by "reproducible": build a jar twice, and its md5sum
will change because there are timestamps in the archive.
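
This is easy to demonstrate, since jars are just zip archives whose entries
carry modification times (the file name and payload below are made up):

```python
import hashlib
import io
import zipfile

def make_jar(timestamp):
    """Build a tiny jar-like zip in memory with a fixed entry timestamp."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        info = zipfile.ZipInfo("Main.class", date_time=timestamp)
        zf.writestr(info, b"\xca\xfe\xba\xbe")
    return buf.getvalue()

# Same contents, different entry timestamps: the checksums differ.
# (Zip timestamps have 2-second resolution, hence the 2-second gap.)
a = make_jar((2015, 3, 24, 12, 0, 0))
b = make_jar((2015, 3, 24, 12, 0, 2))
assert hashlib.md5(a).hexdigest() != hashlib.md5(b).hexdigest()

# Pinning every entry to a fixed epoch makes the archive byte-identical.
assert make_jar((1980, 1, 1, 0, 0, 0)) == make_jar((1980, 1, 1, 0, 0, 0))
```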

------
rquirk
What is this lameness?
[https://github.com/google/bazel/tree/master/third_party](https://github.com/google/bazel/tree/master/third_party)
\- why not use gradle repos to download jars with known hashes? Sticking all
those jars in the git repo is just... well, I expected better from Google.

~~~
jdlshore
Try not to be so rude.

The FAQ is pretty clear about their reasons. It talks about tools, not other
dependencies, but I'm sure the reasoning is the same: "Your project never
works in isolation... To guarantee builds are reproducible even when we
upgrade our workstations, we at Google check most of these tools into version
control, including the toolchains and Bazel itself."

It's a sensible policy and one I use myself. Do you have a better reason for
disliking this policy than a knee-jerk "yuck"?

~~~
rquirk
Right, I'll try not to be so grouchy :-D

Some reasons are the bloat, the possibility of "accidental" forks when a
non-upstream version is compiled and checked in binary-only, crufty old
versions hanging around, and security problems. It also adds extra work for
downstream packagers who have to pick it apart for distros.

Bundling gets particularly bloaty for git repos, since the history is always
included in each clone. For Perforce or SVN it doesn't matter so much, as you
only get the latest version of everything. In git each time there's a
dependency update, it will pretty much add the size of the new jar to the .git
directory. Over time it's going to grow huge. If at a later date the
repository owner decides on a new policy where the third party files are not
bundled, then even removing the directory from the current head doesn't shrink
the repo size.

There are binaries in there for Mac, Linux and Windows (an .exe file at
least). You only ever need one of them, not all at the same time.

This sort of thing is fine for proprietary software used in a controlled
environment, but for open source it looks kludgy.

An alternative could be to have a "dependencies" repository that would be
shallow-cloned as needed. At least that way the source code repo only would
have source in it, not jars or executables. It'd ensure separation was
enforced and you could still track requirements per version or change the
policy later.

