Bazel For Open-Source C/C++ Libraries Distribution (liuliu.me)
64 points by todsacerdoti on Sept 16, 2020 | 76 comments



> ccv has a simple autoconf based feature detection / configuration system.

Hahaha, there is nothing simple about autoconf :D

I suppose if you don't feel like you need to understand all the moving parts, it's not horrific.

To be fair, autotools evolved in a different era, and to solve different problems than Bazel.


I think I understand how most of the AC_* macros I use work internally. Can't claim the same for the AX_* ones. I do write every line of configure.ac myself, though :D


If you want to upgrade your Bazel dependencies using pinned semver constraints and a lockfile I've made this [0] for you.

It supersedes http_archive and falls back to git_repository if needed. Just run `bazel sync`. See it working at [1]. Note: there's an open (and somewhat long-standing) issue WRT Bazel fetching from GitLab.

[0]: https://github.com/fenollp/bazel_upgradable [1]: https://github.com/voidstarHQ/voidstar/blob/master/WORKSPACE
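
For comparison, the stock approach it automates looks roughly like this: pinning each dependency by hand in the WORKSPACE with Bazel's built-in http_archive (the name, URL, and checksum below are placeholders):

    # WORKSPACE -- manual pinning with the built-in http_archive rule.
    load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

    http_archive(
        name = "some_lib",                                    # placeholder dependency
        urls = ["https://example.com/some_lib-1.2.3.tar.gz"],
        strip_prefix = "some_lib-1.2.3",
        sha256 = "...",                                       # fill in the real archive checksum
    )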


Man, I still just use a simple batch file. These build systems (and the dependencies they are built to handle) are insane.


Different strokes for different folks. The solution to your problems is a simple batch file, but that might not be the solution to other people's problems.

These "get of my grass" comments really miss the point. If you don't understand why you might need a more complex build system than "a simple batch file" then this really isn't software you need. That doesn't mean it's insane.


I start out with a shell script.

At some point "recompile everything" is a bit slow, so I swap it out for a makefile.

At some point the makefile gets too complicated and I swap it out for Bazel.
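
For the curious, a minimal sketch of that last step, with made-up file and target names; cc_library and cc_binary are Bazel's built-in C/C++ rules:

    # BUILD -- the whole package definition at this point.
    cc_library(
        name = "mylib",
        srcs = ["mylib.c"],
        hdrs = ["mylib.h"],
    )

    cc_binary(
        name = "app",
        srcs = ["main.c"],
        deps = [":mylib"],
    )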


And would it not be much easier to use apenwarr's redo?


A lot of the time I see these build systems try to solve a few issues:

1. Dependency handling

2. Single script for multiple platforms

3. Easy for a new user to just "run"

But I have yet to find a build system (for C/C++) that solves all these issues better than simply having a few scripts.

I usually just use a build.bat, linux_build.sh, and osx_build.sh, and I have never had problems.

A new user can just download those scripts, and as long as they haven't tampered with file/library locations it just works.

If people like their build systems - fine by me - but I just think they are unnecessarily complex a lot of the time. Granted, I don't have a huge project with hundreds of dependencies, but even then, those dependencies don't come all at once.


Problem is, I do tamper with those. I often cross-compile software (my day job is embedded Linux). Different distributions like to put things in different locations for their own reasons. I have a multi-core computer, and I'd like to use more than one core so my build goes faster.

As a developer I don't want to have to worry about how to make all of the above work. I learned CMake so I don't have to worry. Now that I know it (the learning curve is there), almost all my issues go away because it just works with all the weird stuff people do.


Nothing wrong with using a shovel to dig in your backyard for planting. But that experience does not extrapolate to using an excavator to build a skyscraper.


It sounds more like the problem is autoconf.

Replacing autoconf with CMake or a similar meta-build system will provide a lot more flexibility to build on different platforms, including e.g. Windows, and to cross-build.

If it's easy to build on any platform, you don't need to get into the custom-packaging business and can just focus on your library.


I did my fair share of setting up C and C++ builds in the past, and I can tell you: the difference between CMake and Autoconf is marginal. CMake is of course a bit more versatile in terms of target build systems, and its DSL is not as obscure as the autotools toolchain, but there's no meaningful difference.

What Bazel (or Buck) brings to the table is slightly different: better control of inter-module and external dependencies. In a way they are not only build tools but opinionated and effective 'linkers', bundling together your applications and libraries in a way that is hard to mess up and very reproducible.

The other option would be the Nix/Guix route, which I think might be better suited to the open-source model but requires you to adapt at least the package managers.


The meaningful difference for me is that I work in a place where most of the developers love their IDEs. CMake is literally the only cross-platform build system with first-class IDE support on both Windows and Linux. If you use VS Code on macOS, you get it there as well.


CMake does very well when you have a C/C++ project, but it is very poorly suited for cross-language dependencies. Blaze-like build systems, on the other hand, were built specifically for that.
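
For a concrete sketch of what that buys you: one proto_library consumed from both C++ and Go in the same dependency graph. Names are made up, and the go_proto_library load path and attributes follow rules_go as I remember them, so treat the details as an assumption:

    # BUILD -- one schema, two language bindings, all in one graph.
    load("@io_bazel_rules_go//proto:def.bzl", "go_proto_library")

    proto_library(
        name = "event_proto",
        srcs = ["event.proto"],
    )

    cc_proto_library(
        name = "event_cc_proto",
        deps = [":event_proto"],
    )

    go_proto_library(
        name = "event_go_proto",
        importpath = "example.com/event",
        proto = ":event_proto",
    )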


After digging into Bazel a bit recently, I really hope they invest significant energy into the IDE story next. That is a huge, gaping hole that is not really being solved in any of the language/IDE ecosystems and is a _massive_ roadblock to adoption.

IDE integration would remove 90% of my hesitation to push it at my company.


I really like Bazel as well. As you said, cross-language dependency management in CMake is hilariously annoying. However, I'm at a practically C++-only shop, so it rarely comes up.

Where I am we were using CMake 2 for our build system for a long time and it got into bad shape. When we made the decision to rewrite the build system (across a few dozen subprojects), we looked at a few - Meson, Bazel, CMake, etc. CMake was hard to argue against because of the recent work (last few years) by Microsoft to support CMake projects directly (File -> Open Folder). We wanted to try to get away from maintaining separate IDE projects for every platform for every project.


It’s not much work. Internally at Google they have IntelliJ IDEA integration. The challenge in open-sourcing it may be that the plugin is also responsible for Google3 integration, if I recall correctly.


"not much work" is absolutely not the feeling I have about it based on some conference talks about the topic (e.g. the various xcode talks, or the Wix talks on intelliJ), and my own thought experiments on getting it tied into Visual Studio / Visual Studio for Mac.

I think the Bazel team introduced Aspects and punted the rest for a later date. Now they hope each language/IDE community will take that infrastructure and build out the rest of the ecosystem. I think it's too important a piece of the puzzle to leave to the community.


> If you use VS Code on macOS, you get it there as well.

CMake can generate Xcode projects.


Right, what I meant by developers who love their IDEs is that they complain if they see a console. They just want to open an IDE and click "open folder" or "open project", so not having to use an intermediate IDE project generator is a massive boon.


The meaningful difference for me is that I cross-compile everything.

Every CMake project I've worked with just works. As in, it only takes a couple of hours to get it packaged for the custom (stupid...) package system I have to use, almost all of that being boilerplate work.

Only 1 in 50 autoconf projects I've worked with actually works - in theory autoconf supports this case well, but in practice there is always something broken, so I know I'll spend a couple of weeks (full time) getting it packaged.


Well, one "meaningful difference" is I can find devs who have experience with cmake...


You have to start somewhere, right? The same thing was said about things like Kubernetes a couple of years ago. Look at where we are now.


Autoconf has been around for years, CMake for far fewer, yet CMake was able to become the major build system for C++ and get everywhere.

I don't think autoconf has a future - even when it was "the only choice", everyone avoided it if they could. CMake took over and became the thing everyone knew.

Now if you (like the article) are proposing something else, you might have a chance. However, autoconf has lost.


Autoconf "lost" in the sense that it solved the problems that it needed to solve and is no longer needed.


Autoconf solves the problem of easily porting software to different systems because autodiscovery is its default modus operandi.

Now that we live in a post-innovation world, there is no need to port software: it's either Linux, Mac OS, or Windows.

Except, of course, there is still a lot of innovation going on, and there's a lot of porting and cross-building that happens outside the web domain, and a heck of a lot of things that are none of the big three development hosts.

CMake is starting to converge towards what the autotools have provided for some decades now, but it's not there yet. When it comes down to it, the only real difference is different domain-specific languages and that just results in tribalism.


Autodiscovery is not the norm anymore, though.


Sure. As I said, we're living in post-innovation times and we only develop natively for Linux, Windows, and Mac OS and only on 64-bit Intel.

Except for all the times we don't.


What I meant was that we're moving towards hard dependencies.


Hard dependencies on what? The new BSP release for the next-gen silicon on a new operating system?

Trust me, I was there in the nineties when hard dependencies proved unscalable. The only thing that has really changed since then is the scale (it's gone up orders of magnitude) and the flood of people who have no idea what goes on under the hood.


The problem is autoconf because the problems that autoconf was designed to solve are not the problems we want to solve today. Autoconf does an excellent job of solving the problems that it was designed to solve--configuring a build system on different POSIX-like systems--but times have changed.

My experience using CMake and Autotools and migrating between the two is... the only appreciable benefit to CMake is that it works well enough on Windows, and Autotools sucks on Windows. Whereas Bazel is in a different league, IMO. For people on Windows, the fact that CMake works well on Windows is enough to sing its praises, but I think Bazel will be eating the C and C++ world over the next five years.

In defense of Autotools, it's designed to work on systems which don't have Autotools installed, back in an era where you might download Apache and compile it from source to get a web server running on your system. It will work on a stock install of Linux, Solaris, AIX, or macOS. Almost nobody cares about this use case any more.


> For people on Windows, the fact that CMake works well on Windows is enough to sing its praises,

Maybe, but how relevant is this? Application development on Windows is historically not based on open source, and for developing anything other than Windows apps, supporting something different from a POSIX system is not really important.

> but I think Bazel will be eating the C and C++ world over the next five years.

Do you think Bazel is better than NixOS and GNU Guix? I think things like Bazel are pretty much geared to big companies which want to move ecosystems to the cloud and have a lot of manpower to support very complex dependency systems. I think that for distributed open-source projects, a system like Guix, which decouples individual packages while supporting (or even requiring) a build from source, is better. There is also the issue that with a Bazel build, there is potentially a whole lot of stuff which runs uncontrolled on your local system. It is basically "curl | bash" on steroids, which means you hand over your machine to the cloud.


One thing, however, that the autotools do and that I do not see any of the supposed replacements do: from the user's perspective (be it an application user who wants to compile the code, or a developer who wants to compile a library but has no plans to develop it, so for all intents and purposes they are just a "user" of that library), they do not require autotools to be installed unless you want to modify the project structure itself (even simple code changes are most of the time fine without autotools). They only require Make, but that is basically a standard and available anywhere, unlike the other systems they target (and CMake, etc. also need Make anyway).

I always found that annoying, because if I just want to compile something (be it a program or a library) from source, I also need to install a bunch of build systems I'm not going to ever use myself. That is just pure and unnecessary bloat for me.


The flip side of this is that it means most projects come with an old version of autotools--and if you need to do something where "old" is equivalent to "broken", it becomes a wonderful world of pain.

A specific example I ran into was compiling random projects with -flto. libtool, in its infinite wisdom, decided long ago that it was a bad idea to pass any -f flags to the linker invocation and so stripped them out, even if you the user manually told it to via LDFLAGS. This of course breaks -flto--so newer versions of libtool got the picture and whitelisted that flag. Except projects haven't updated libtool for a while, so you have to patch their source to get it to work.


AFAICT if you need to modify the build configuration (changing optimization levels is one such thing, since optimizations can potentially break applications), you are expected to install (and hence update) autotools as if you were a developer of the program/library.

But TBH my comment was more about the other build tools than autotools... Autotools have many issues and are way more complex than they need to be, but I do not see a build tool that does exactly what autotools do, despite there being so many tools out there.


As far as Windows is concerned, CMake + vcpkg is getting pretty sweet; now even binary dependencies are finally properly available, catching up with Conan.


Now I just finished three days of work to remove vcpkg.

Windows + vcpkg + MSVC works perfectly. Windows + vcpkg + clang is a nightmare of linking problems.


clang is supposed to have a compatible ABI with MSVC as long as you use link.exe for linking. Is this not the case?

http://blog.llvm.org/2018/03/clang-is-now-used-to-build-chro...


Clang, and most likely not the version Microsoft packages with Visual Studio - that is the problem right there.


I didn't encounter any problems with vanilla clang on Windows for building projects (installed with "scoop install llvm"); in fact I was surprised that it worked out of the box, even for code that extensively uses Windows APIs. I suspect that it requires an MSVC toolchain installation, or at least the Windows SDK, but since everything "just worked" I didn't dive too deeply into what's actually happening under the hood.


Right, I found a tutorial by someone explaining how to use vcpkg with VSCode + CMake [1] (without the VS2019 IDE), and he mentions the process required a lot of trial and error.

VS2019 otoh, in my experience, just worked out of the box (with the single package I tried ...).

1: https://gamefromscratch.com/vcpkg-cpp-easy-mode-step-by-step...


I tried vcpkg recently and I was pleasantly surprised!

The Win installation [1] (assuming VS is already present) was pretty easy. I tested by running `vcpkg install glfw3` and copy-pasted an example from the web into VS2019. The example ran with _zero_ configuration!

Is the experience as nice in Linux or Mac?

1: https://github.com/microsoft/vcpkg#quick-start-windows


Yes, the experience is just as nice on Linux, assuming you're using CMake to generate your Makefile. When you run the cmake generate command, you just need to add one option to specify the location of your vcpkg directory. Of course the same applies on Windows if you're using CMake to generate your Visual Studio project, except that you'll probably want to specify the triplet on the command line too (the default one is 32-bit).

Footnote: if you're doing this on Windows and want static linking, the experience is less smooth. You still need to specify the vcpkg directory and triplet (x64-windows-static), but that doesn't automatically change your application's CRT to the static one (which all your libraries are now assuming). There are a couple of ways around this, but the easiest is probably to use the MSVC_RUNTIME_LIBRARY CMake property [1] (requires at least CMake 3.15, which is pretty new).

[1] https://cmake.org/cmake/help/v3.15/prop_tgt/MSVC_RUNTIME_LIB...


I focus mostly on Windows, but given that even Google is integrating Android's NDK with vcpkg, I guess it works quite alright.


Bazel is terrible in terms of bootstrapping it, its dependency tree is huge, and it's a great pain for distributions to package.


I had to build Bazel from scratch; you need:

- a previous version of Bazel

- GCC/Clang

- JDK

- Python

- Zip and Unzip

Not sure I'd call that "huge". Perhaps the JDK?


> you need:

>

> - a previous version of Bazel

Why do all these systems seem to need to prove Goedel's theorem? What is wrong with providing a tool that builds portably with a simple C compiler?


Since GP said bootstrapping rather than just building... where do you get that previous version of Bazel?


From the bazel releases. It always starts with a binary.


I mean sure, you always need to start with a binary somewhere, but to put this into perspective... the binaries needed to bootstrap all the software on the distro I run on my laptop weigh in at about 60 MiB[0]. An x86_64 build of Bazel alone weighs in at about 27 MiB[1]. As someone who cares about source bootstrappability, that feels like sliding backwards.

I suppose you may be able to find a chain of older versions of Bazel starting with one that doesn't require Bazel itself to build, like the Guix devs did for Rust[2].

[0] https://guix.gnu.org/blog/2020/guix-further-reduces-bootstra...

[1] https://github.com/bazelbuild/bazel/releases/tag/3.5.0

[2] https://guix.gnu.org/blog/2018/bootstrapping-rust/


There are two supported ways of building a Bazel binary. One uses an existing Bazel; the other uses a bash script that calls the compilers.


And compare that to a few hundred bytes to bootstrap the whole Guix system.


He was talking about bootstrapping. But anyway, JDK and Python? Really?


Bazel is written in C++ and Java. The compile.sh (and the root of the repository) uses bash and probably a few other binaries. What are the issues exactly? Please file a bug if you have issues bootstrapping Bazel (it's not a need for most users, so the process might not be perfect).


Bazel is in Java...


I'm glad to see this. What people first and foremost forget is that Bazel is a package manager/overlay in itself (and a very good one too), which allows for very complex (and yet safe and reproducible) builds.

The learning curve is steep, but the payoff is so worth it.


I just started using Bazel this weekend for a personal (Rust) project. I understood the concepts which convinced me it was something I wanted to use. The documentation and tutorials definitely make the learning curve steeper than I think it needs to be.

It works so great though, now that I've managed to set it up.


Agreed. I flirted with it a couple of times in the past and just couldn't get an understanding through the docs. This last time I ended up binging the BazelCon videos and things mostly clicked. The main thing to understand is the WORKSPACE/BUILD distinction, and then you can mostly get there, I think.


Thanks for the feedback! Do you have more specific feedback about the documentation? We can definitely make clearer the distinction between BUILD and WORKSPACE. Are there other concepts to clarify, or specific pages we should revamp?

Thanks!


I don't have any specific example off the top of my head, unfortunately, and some of my confusion was likely because I was trying to pick it up along the road of Bazel's maturity, so things were changing. When I say BUILD vs WORKSPACE distinction, I actually mean it was tough as an outsider who hasn't used a Blaze-like system (or CMake, for that matter) to understand what the guardrails in Bazel were and how the WORKSPACE provides certain guarantees to the rest of the system. So it is a conceptual thing; once I understood that role, things clicked.
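
For anyone else hitting the same wall, the split in a nutshell (names below are made up): the WORKSPACE file declares, once per repository, which external repositories exist and where they come from; BUILD files declare the targets inside each package and their dependency edges, including references into those external repositories.

    # WORKSPACE (repository root) -- names external repositories, e.g.:
    #   http_archive(name = "some_dep", urls = [...], sha256 = "...")
    #
    # pkg/BUILD -- declares the targets in this package and their edges:
    cc_library(
        name = "compress",
        srcs = ["compress.cc"],
        deps = ["@some_dep//:some_dep"],   # external repo named in WORKSPACE
    )

    cc_binary(
        name = "tool",
        srcs = ["main.cc"],
        deps = [":compress"],              # target in the same package
    )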


My impression is that you can end up with your requirements spread over half of the Internet, with very little control over which code runs during your build.


Depends on how you do it. It's designed first and foremost to work with vendored dependencies in a monorepo, and the external repository support is newer.

Bazel won't pull in transitive dependencies unless you ask it to in the WORKSPACE file, which means that you can use a private mirror for all your dependencies if you like, which is fairly easy in practice (I do it for personal projects).

The experience will vary depending on your preferences and the languages you use. With Go + Bazel, my experience is that the Bazel version will have fewer dependencies than the equivalent "go mod" version, because "go mod" will pull in dependencies more coarsely and Bazel has more fine-grained control. Go mod will pull in dependencies for the entire repo that you depend on, but with Bazel, you only need dependencies for the individual subpackages. As a specific example... suppose you use github.com/jackc/pgx. With "go mod" you end up with "github.com/sirupsen/logrus" in your transitive dependencies, but with Bazel you don't, unless you use the part of pgx that requires logrus.
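
A rough sketch of what that fine-grained wiring looks like in a gazelle-style BUILD file; the external repository labels follow the usual naming convention but are illustrative rather than copied from a real build:

    load("@io_bazel_rules_go//go:def.bzl", "go_library")

    go_library(
        name = "db",
        srcs = ["db.go"],
        importpath = "example.com/myapp/db",
        deps = [
            # Only the pgx packages actually imported show up here, so
            # logrus never enters the graph unless something pulls it in.
            "@com_github_jackc_pgx_v4//:pgx",
            "@com_github_jackc_pgx_v4//pgxpool",
        ],
    )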

You can also end up with "half the internet" in your dependencies pretty easily, but I think these days that is just the price of including random libraries in your project.

As far as I can tell, the approach to avoid depending on half the internet is to be very conservative about your dependencies, but that is true regardless of the system you use to manage them. Just to pick on the JavaScript ecosystem, if you start a new JavaScript project and pull in TypeScript, Rollup, and Terser, you'll end up with a very SMALL lock file because each of these libraries has a very small number of carefully chosen dependencies.


> It's designed first and foremost to work with vendored dependencies in a monorepo

So how is that suitable for distributed open source projects? I understand this is how Google works, but open source?


That's up to you. Bazel can fetch and build code from anywhere: local or remote. Also, when fetching from git, it will strongly encourage you to use commits instead of tags, and when downloading archives over HTTP, to use a sha256.

Actually it will nag you until you do.
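
Concretely, the happy path it nags you towards looks something like this in the WORKSPACE (name, remote, and commit below are placeholders):

    load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")

    git_repository(
        name = "some_lib",                                    # placeholder name
        remote = "https://github.com/example/some_lib.git",   # placeholder remote
        commit = "0123456789abcdef0123456789abcdef01234567",  # pin a commit, not a tag
    )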


Disable outgoing network access in CI and vendor everything? For many modern toolchains, such as Python, JS, and Rust, you'd have to do this anyway if you want to "control" everything.
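
With Bazel specifically, one way to do it is a .bazelrc along these lines, assuming the archives have already been mirrored into the repo; --distdir and --repository_cache are real Bazel options, the paths are made up:

    # .bazelrc -- resolve external archives from a vendored directory
    # instead of the network.
    build --distdir=third_party/distfiles
    build --repository_cache=.cache/bazel-repositories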


This results in things like the official TensorFlow distribution needing patching in two places, IIRC, to compile against newer glibcs, because TensorFlow and some network dependency both vendor an outdated library that overrides a new glibc function (some logging thing...). It works, until you want to fix something. Good times.


I really wanted Bazel to work for Go (not even C++). I gave up. It's got fine goals --- really nice ones --- but the docs suck. We got stuck in crummy alleyways between the corners of Bazel that do work well.


Just let the distribution maintainers do their job. That's how you get a quality, user-centered software distribution.


Linux doesn't have 97% market share. You need to build applications for Windows, Android and others.


Does that not just encourage them?


Encourages who?


Not only does Linux not own the OS world, package managers are usually outdated and only allow for one specific version, and then there is the whole fragmentation of package formats.


It's probably because I'm not very familiar with C/C++ library distribution, but what exactly does this mean, in the context of this post?

> Just let the distributions maintainers do their job.

Is this sentiment supporting Bazel usage or recommending against it?


In the old-school Linux distro world, dependency management is handled by the package manager. This approach has benefits and drawbacks:

* there is a specific version of the library associated with the OS that the developer can develop against - if the OS updates, the developer also has to update their apps - so they don't leave old bugs and vulns open

* there is a specific version of the library associated with the OS that the developer has to develop against - if the OS updates, the developer also has to update their apps - which can sometimes be very annoying if the API breaks

There is more to it, but this is the thing that constantly has me changing my opinion on the matter.


> It's probably because I'm not very familiar with C/C++ library distribution, but what exactly does this mean, in the context of this post?

A vocal part of the Linux community gets extremely angry when packages don't use all the .so's provided by their (often outdated) distro of choice, but instead rely on the app's developer choosing the dependency versions which work best with their software.


I too love dependency hell.



