
Working at Google, I find Blaze to be one of the technologies that amazes me most. Any engineer can build any Google product from source on any machine just by invoking a Blaze command. I may not want to build Gmail from source (could take a while), but it's awesome to know that I can.
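
For a flavor of what that looks like with the now open-sourced Bazel (all target and file names here are made up):

    # BUILD file; cc_binary is one of Bazel's built-in rules
    cc_binary(
        name = "server",
        srcs = ["server.cc"],
        deps = ["//common:logging"],  # another target in the same repo
    )

    $ bazel build //backend:server

Every dependency is itself a target declared this way, which is what makes "anyone can build anything" tractable.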

I think this could be hugely useful to very large open source projects (like databases or operating systems) that may be intimidating for contributors to build and test.




Standard caveat: I don't speak for my employer.

Using Bazel (a.k.a. Blaze) every day is one of the things that has made me dread ever leaving Google. Fast, reproducible builds are amazing. Once you have used this tool, it is very hard to go back. Personally, I'm thrilled that it has been open sourced.


Having recently left Google, gRPC (Stubby) was my biggest concern: I spent about two weeks hacking together a good code generator for GoRPC before gRPC came out and obviated that effort. Now I'm glad I haven't bothered with a build system, which was going to be next.

Nice to see a bunch of projects that are generalizable and heavily used internally finally see the light of day outside Google. Now, to start evangelizing them.


A huge +1 on this as well.

I left Google a couple of years ago. We ended up building our own RPC around protos (Thrift just doesn't cut it), and our Make/Maven-based build has the standard problems with such things, so I'm really looking forward to using gRPC and Bazel in the near future. A huge thumbs up to Google!


Can you give a (very quick) pointer/explanation to what about Thrift didn't cut it for you?


The interface description language is oddly bulky and can't decide what order its declarations go in.

The API for opening a connection is, again, needlessly bulky: I make a socket, I wrap it in a buffer, I wrap it in a protocol, I create a client using it, and then I connect? gRPC has the same level of complexity underneath, but it's offered through an options object with sane defaults.
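
To make that concrete, the classic Thrift client setup in Python looks roughly like this (Calculator is just the stock Thrift tutorial service, standing in for any generated service):

    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from tutorial import Calculator  # generated by the Thrift compiler

    socket = TSocket.TSocket("localhost", 9090)              # make a socket
    transport = TTransport.TBufferedTransport(socket)        # wrap it in a buffer
    protocol = TBinaryProtocol.TBinaryProtocol(transport)    # wrap it in a protocol
    client = Calculator.Client(protocol)                     # create a client
    transport.open()                                         # then connect

The gRPC equivalent collapses all of that into a channel and a stub, with everything else defaulted.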

There's no security protocol; or, if there is, nobody seems to use it. Is there an async call structure? If there is, nobody's ever heard of it. All the code I can find seems to be written at a preschool level. This may be due to working at a company that was an early adopter, or simply because the company is staffed by preschoolers.


As someone who used to work at Google and currently works at Facebook: there's a lot of legacy API that you can mostly ignore. Async is there, more or less just works, and uses C++11 lambdas quite nicely.

I think I preferred Google RPC, but it's not a huge difference to me.


It wasn't invented at Google and is therefore inferior. Google has a massive incentive to develop projects specific to their requirements and then open source and evangelize them to stomp out approaches not optimized for them.


Don't forget that there is a constant inflow of developers to Google. They all bring the tools and practices that they know, and will definitely leverage what they can.

It's not like we're run through a brainwashing machine when walking through the door :P

There just happen to be a lot of challenges to make things run on Google's infrastructure - and not all the tools we know and love happen to work well on it. Some of that is legacy, some of that doesn't have many parallels externally.

So, we write solutions that work well on Google's infrastructure. Some of that gets open sourced, in the hopes that it's also useful to the greater community.

But I definitely would not consider Google-written code to be superior to other solutions out there - just an alternative option to choose from - and certainly not the best for many situations.


Well, that's also largely due to all the source (transitive dependencies) being present in one monolithic repo.


Yeah, what's up with that? Sounds like a pretty terrible practice to me. Is that something that C++ forces you to do?


Just philosophical. I'm honestly not sure which approach I like more, after having done it both ways: highly isolated projects (in the open-source world, and at Amazon) and monolithic (at Google).

It all boils down to dependency management in the end.

---

For the monolithic world:

* You're always developing against the latest version of your dependencies (or very near it).

* This comes at the cost of a continuous, but minimal, maintenance burden as upstream folk make changes.

  * However, because things are monolithic, upstream projects can change _your_ code as well. You can be confident that you know exactly who is affected by your API change.

  * Similarly, being able to make an API change and run the tests of _everyone that depends on you_ is a huge benefit (see the query sketch after this list).

* You have to be more diligent when testing things that go out to users, as your code is constantly evolving.
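
In Bazel terms, that "run everyone's tests" step is mechanical. Something like the following, where //base/strings is a hypothetical library target:

    $ bazel query 'kind(".*_test", rdeps(//..., //base/strings))'

lists every test target anywhere in the repo that transitively depends on it, and the output can be fed straight to bazel test.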

---

For the isolated world:

* You can develop without fear of interruptions; your dependencies are pinned to specific versions (see the WORKSPACE sketch after this list).

* You get to choose when to pay the cost of upgrading dependencies (but typically, the cost is pretty high, and risks introducing bugs).

  * Security patches can be particularly annoying to manage, though (if you let your dependencies drift too far from current).

* During deployment, you can be extremely confident about the bits that go out.

* You can get away with less rigorous infrastructure (and the maintenance costs related to it).
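
To put the "pinned versions" point in Bazel terms: with today's Bazel, the isolated style shows up as explicit pins in a WORKSPACE file (every name, URL, and the checksum below are placeholders):

    load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

    http_archive(
        name = "some_lib",
        urls = ["https://example.com/some_lib-1.2.3.tar.gz"],
        sha256 = "0123abcd...",  # placeholder; pins the exact bytes you build against
    )

In the monolithic style there is no pin at all: you depend on a path like //third_party/some_lib and always get HEAD.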


No, it's something that Perforce allows (because it scales sufficiently). It has nothing to do with C++.

You can impose order within the monolithic repo by partitioning projects into their own branches or directories and only pulling down the necessary pieces.

Whether this is better than a bunch of small repos is debatable.


Perforce (the product) doesn't really scale sufficiently for the Googles/Amazons/Microsofts of the world, sadly.

I think they've all moved to custom forks/implementations due to the insane SPOF that a Perforce server is (and its hardware requirements). But up until that point, heck yeah!


Well, Perforce in its default state may not scale sufficiently, but at least two of the companies on that list have managed to make it work (presumably with a lot of investment, though). :)

I didn't know Amazon was using Perforce. I interviewed someone from Amazon recently and he indicated they were on Git for most things now.


Amazon phased out Perforce a year ago. There is very little left; most repositories (Perforce, SVN, etc.) were migrated to Git.


I started at Google 4 months ago, and Blaze is one of the best things I've discovered there. Now open-sourced :)


Though not really open source ;)



Whoops, I am an idiot.


But you're our idiot!


> Any engineer can build any Google product from source on any machine

A little too optimistic :) You can't build Android, Chrome, ChromeOS, iOS apps, etc. via Blaze.


When I worked at Google I built a Blaze extension to support building Android apps. It worked really well, though I'm not sure how well it was maintained after I left in 2010. Internally at Google, Blaze was extremely customizable, and I hope Bazel is too, so one can easily add support for building iOS apps, etc.

EDIT #1: I see support for building Objective-C apps is already present in Bazel. EDIT #2: Bazel uses Skylark, a Python-like language, which could be used to implement all sorts of extensions, including the one I was referring to.


There's an extension language in bazel named Skylark, which will be familiar to you if you wrote build_defs internally: http://bazel.io/docs/skylark/concepts.html
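
For anyone who hasn't seen it, a minimal custom rule is quite small. A sketch against the current Starlark API (the rule name and file are invented, and the API has evolved a bit since the 2015 Skylark docs linked above):

    # my_rules.bzl: a toy rule that generates a text file
    def _hello_impl(ctx):
        out = ctx.actions.declare_file(ctx.label.name + ".txt")
        ctx.actions.write(output = out, content = "hello from a custom rule\n")
        return [DefaultInfo(files = depset([out]))]

    hello_txt = rule(implementation = _hello_impl)

A BUILD file then loads it with load("//:my_rules.bzl", "hello_txt") and uses it like any built-in rule.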


The Chromium toolchain is pretty insane: Ninja, fetch, etc. There's really no excuse for this since Google has such a strong (and now open source) build system.



Look at Nix and NixOS (http://nixos.org/). It would be interesting to see a comparison of Bazel/Blaze to Nix.


Wow, sounds like enterprise Gentoo, in a good way.


Yes, Blaze and a hojillion computers will give you a spiffy build system. The public now has the former, but not the latter :)


One piece at a time! Also, who is to say that Google's way of orchestrating those hojillion computers is best? Separating the two pieces, as has been done here, makes it possible for others to create different (and maybe better) orchestrations.


So the builds are reproducible automatically?


See http://bazel.io/docs/FAQ.html, "Will Bazel make my builds reproducible automatically?

For Java and C++ binaries, yes, assuming you do not change the toolchain. If you have build steps that involve custom recipes (e.g. executing binaries through a shell script inside a rule), you will need to take some extra care:

* Do not use dependencies that were not declared. Sandboxed execution (--spawn_strategy=sandboxed, only on Linux) can help find undeclared dependencies.

* Avoid storing timestamps in generated files. ZIP files and other archives are especially prone to this.

* Avoid connecting to the network. Sandboxed execution can help here too.

* Avoid processes that use random numbers; in particular, dictionary traversal is randomized in many programming languages."
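
The archive-timestamp item is the one that bites most people. A minimal sketch of the usual fix in Python (the helper name and its dict-of-bytes interface are my invention):

    import zipfile

    # Hypothetical helper: given {name: bytes}, identical inputs yield a
    # byte-identical archive, because every entry gets a fixed timestamp
    # and entries are written in sorted order.
    def write_deterministic_zip(path, files):
        with zipfile.ZipFile(path, "w") as zf:
            for name in sorted(files):
                info = zipfile.ZipInfo(name, date_time=(1980, 1, 1, 0, 0, 0))
                zf.writestr(info, files[name],
                            compress_type=zipfile.ZIP_DEFLATED)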


Specifically, people should note that many code generators are not carefully designed for strict reproducibility and will stick timestamps in generated output.

Even if you undo that, code generation tools are liable at some point to traverse a dictionary without caring about whether the result is deterministic. I spent some time at Google fighting with ANTLR to try to get it to produce deterministic output, and I still think I left some corner case uncovered.
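
The general fix is the same everywhere: sort before you emit. A trivial Python illustration (the symbol table is invented):

    # Hash-map iteration order is unstable in many languages, so a code
    # generator that walks one directly produces different output per run.
    # Sorting the keys makes the emitted code identical every time.
    symbols = {"parse": "Ast", "lex": "Token", "emit": "str"}
    for name in sorted(symbols):
        print("def %s() -> %s: ..." % (name, symbols[name]))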


When you say Java, does that include Android? I see that Android is supported, but couldn't find anything about reproducibility.

Reproducible Android builds would be very interesting.


http://bazel.io/docs/roadmap.html

It looks like they only have iOS and not Android in this first release, but they are planning on adding Android support ~June of this year.



