The only build system that might someday replace make (2010) (apenwarr.ca)
83 points by zdw 7 months ago | hide | past | favorite | 47 comments

From my recent Twitter mentions: Build Systems à la Carte (2018). This presents a theoretical analysis of build systems (which they have defined broadly enough to include Microsoft Excel) and how to pick and choose the various properties they provide.

This came up because I am promoting a conceptually similar unified theory of Reactive UI, which also has things in common with build systems. In fact, Svelte also happens to cite Excel as an inspiration for its approach to incremental updates. My blog post on this went live yesterday, and a birdie tells me it will hit the HN front page later today.

[1]: https://www.microsoft.com/en-us/research/uploads/prod/2018/0...

Have you considered making a compiler plugin for Rust so Svelte-style GUIs are possible? Reactivity should have first-class support, just like concurrency (which also has a runtime).

Wow, that's seriously off topic for a post about build systems, isn't it? :)

The Rust ecosystem does not support "compiler plugins" well (by contrast, this is a first-class concept in Kotlin). What is being actively pursued is "procedural macros," but we're running into the limitations of those - we're moving away from them for deriving lenses, for example.

I'm sure it would also bring up lifetime issues, but, to be fair, those are probably similar to issues for async, and that only took a couple of years or so to land. But people are actively thinking about using generators for UI.

I do think this is one of the open questions. It might be that really first-rate reactivity does require language support in some way. But this feels like increasing the scope of an already possibly overambitious problem.

To be fair, the UX for building Svelte-style GUIs is pretty unparalleled. It feels a lot like SwiftUI. The React variety with setState and Redux creates a ton of verbose code.

The biggest problem: "it can do everything make can do"

This means it cannot enforce pinning-on-a-hash ('hermetic' builds in bazel lingo): if you allow the dev team to say `git clone hxxp://whatever/whatever/master` somewhere in the bowels of the build system, they invariably go for it. It's easy, agile, and it works (for months if not years).

Result? On a large project, every week you get a build broken by some third-party change. Because make is too capable.
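For what it's worth, the pinning discipline a hermetic system enforces can be approximated by hand; a minimal sketch, with made-up file names and a local file standing in for the fetched archive:

```shell
#!/bin/sh
# Hash-pinning sketch: record a digest once, refuse to build if the input drifts.
set -e
cd "$(mktemp -d)"
printf 'third-party source\n' > dep.tar          # stand-in for a fetched archive
sha256sum dep.tar | cut -d' ' -f1 > dep.tar.sum  # done once, committed to the repo
# ...then, on every build:
actual=$(sha256sum dep.tar | cut -d' ' -f1)
if [ "$actual" = "$(cat dep.tar.sum)" ]; then
  echo "dep.tar verified"
else
  echo "dep.tar changed upstream!" >&2
  exit 1
fi
```

The point is not the mechanism but the default: make lets you skip this check entirely, whereas Bazel makes the hash mandatory for external dependencies.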

You can run the build in a container without network access. Admittedly, neither make nor redo make it any easier to create such a container, but they don't make it harder, either.

Blaze-like systems offer other benefits over makefiles that are, in their own right, amazing.

1. Distributing modules with BUILD files makes everything just work

2. You don't think at the level of commands/files, you think at the abstraction of targets. How targets happen is an implementation detail.

3. `select` for varying builds in a sane and readable way

4. Shared cache

5. Distributed build and test runner

6. Query language

7. All builds occur out of tree

8. A language for describing builds that doesn't feel obtuse

And the list keeps going on. A "new make" is not an improvement. It is likely where the world is going to go, but it just misses out on the obvious benefits of other build systems.
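To make item 3 concrete, here is a hypothetical BUILD fragment (target and condition names invented) showing how `select` varies a build per configuration:

```
# Hypothetical BUILD file: select() swaps sources per configuration,
# replacing the nested ifeq/else blocks a Makefile would need.
cc_library(
    name = "net",
    srcs = ["net.cc"] + select({
        "//config:linux": ["epoll.cc"],
        "//config:macos": ["kqueue.cc"],
        "//conditions:default": ["poll.cc"],
    }),
)
```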

All those things could be done with redo with varying degrees of effort.

"Varying degrees of effort" and "the default" are very different. Do you have an example of those features that can be demoed in redo?

Some of those sound like the benefits of Makefiles, actually. But if it floats your boat...

Which of these things is a benefit of make?

#1 (but they're called Makefiles), #2 (implicit rules), #5 (with -j on NUMA), and (unless you overuse the GNU function extensions) #8.

Any project building against HEAD deserves it. Use a tag, in a config file. Not make's fault.

Segfaults, double-frees and buffer overflows are not C's fault. Any programmer using C deserves them. Right?

Happily, not everyone thinks this way. That's why there is a myriad of safe languages -- from Ruby to Rust -- which replaced C in many areas. Sure, C is still around, but many people prefer to give up some control for lots more safety.

The same logic applies to build systems. Sure, you can use a "low-level" build system to hand-write gcc invocations in each project, but a bigger system makes your build files much safer.

This has got nothing to do with my point about compiling against a changing code base.

This has a lot to do with your point that "they deserve it" being not useful to improve things. How about we give that project a tool that just doesn't have this dangerous functionality?

One can say the same thing about C++ ever since the C++0x versions

What do people here think of Bazel [1]? In my understanding it is an open-sourced version of the build system that Google uses internally.

[1] https://bazel.build/

We’ve been using it to build and deploy a ~500k loc JavaScript monorepo and I have an unpopular opinion about it. While I see the benefits of it for google scale, for us it’s just been one headache after another.

Reimplementation of python for build file language? No thanks.

Reimplementation of package managers? No thanks.

Reimplementation of Docker? No thanks.

It’s just another thing to learn, worry about, and discover bugs in rather than something that just works. I already know how docker, yarn, and python work and that stuff works well, why are we boiling the oceans recreating and relearning all of this just to do it the “Bazel” way?

What a waste of time and energy!

Man, I'm sorry to hear about your experience :(

I've used bazel in a different environment (C++) and it has been amazing (compared with other C++ build systems).

Another comment below mentioned to stay away from bazel for anything else other than C++ (and maybe Java).

Bazel is also the build system to use if you have large iOS Swift applications, with its networked HTTP build cache and so on.

I've also seen it used with java android apps.

Thanks! As you probably know, C++ is definitely unique in that there is no built-in way to manage dependencies and no de facto package manager. I would definitely use it for a C++ project.

Could you talk a bit about your experience? The two points you mentioned feel more like theoretical / principled opinions rather than on the ground experience / issues

Everything feels beta. It crashes on about 10% of our builds with an inscrutable Java error deep in the bowels of Bazel. The solution, according to the Bazel authors, is to rebuild, which works, so our CI pipeline has an auto-retry mechanism because Bazel is so buggy. I'm not joking.

On top of that, everyone on my team had to learn Skylark, their rules for Docker, their rules for Node, their... I'm exhausted just thinking about it...

Why bother unless you’ve got google size? It’s good software written by smart people, but I know I made a mistake in choosing it for our build solution.

Would it be possible for you to link to the GitHub issue for that crash?

I have dealt with it as part of packaging and adapting tensorflow. Some of my opinions:

* Bazel needs gigabytes of memory to run;

* It needs minutes to bootstrap (which it will every time, because you don’t want a multi-gigabyte-sized daemon running around);

* It breaks every second day. If you are building version X of $something, you better go and get the exact version of Bazel that the developers of $something were using when tagging X;

* And when they fail to build there’s no chance you will be able to figure it out without spending days on it.

Don’t use Bazel unless writing code that only you yourself can maintain is your goal.

Bazel is about as far from redo as you can get: they aim to solve very different problems. If you have a huge monorepo, and thus need distributed caching and distributed builds, and cross-language dependency tracking and incremental rebuilds, Bazel is for you. It has the advantage of being built _after_ Google understood all the places where lack of rigor or accidentally-too-capable build language features cause trouble.

(At Square, we currently use Pants for our Java monorepo. Don’t do that; we’re switching.)

What's wrong with pants?

It's not widely used. Pure Java (rather than Scala) seems off the beaten track. The documentation is sparse. The IDE plugins are less capable, and less documented. It is sloppy about rigorously declaring needed files. It's slow just to start up. Actual Python rather than Starlark as a build language. The API from custom rules to internals is leakier (I believe). Bazel has a huge ecosystem of third-party rules to do all sort of things.

There are a lot of things I like about it, and I like how robust it is once it works. But what I don't like is probably more interesting:

Getting it to work can sometimes be a pain in the rear (although I'm not sure other systems fare much better). I'm still trying to figure out some missing header file errors... some are obvious and easy to fix, but some are confusing as heck. I also hate the fact that I have to specify header files redundantly, once in the source file and once in the build tool.

Their documentation can also be painful to quickly find stuff in (e.g. try finding how to display the command line options passed to the C++ compiler); sometimes some flags are just spread out in the prose rather than in the normal table format. Their @// notation is not exactly intuitive; I'm still not sure I fully understand it.

And lastly, there are lots of rough edges, e.g. there's no way to call an existing Makefile and pass it conditional compiler options, so good luck sharing your compile flags across the entire codebase in an elegant manner.

We had a difficult experience with Bazel as well, using it for a Go project. As mentioned, it brings its own reimplementation of any compiler or package manager. So if there is a new feature in the language, you have to wait until the Bazel rules for that language get updated; only then can you start using it.

For a Go project whose whole build command is a single `go build`, it takes lots of config files and tweaking until you get the exact same result. At least until there is an update, the whole build crashes, and you have to figure out why.

To me, Bazel is a great tool for Google or a project at the scale of Google projects. It is beneficial if you can take advantage of distributed builds and caching layers, i.e. only for a project big enough that those could make a difference.

But for all other projects, Bazel felt like bringing an 18-wheeler to go grocery shopping.

If you program C++ give it a shot. For all other languages stay far from it.

> It might not look like it, but yes, you can subscribe without having a Google Account. Just send a message here: redo-list+subscribe@googlegroups.com

Thank you. Sigh.

> with no baked-in assumptions about what you're building

This is probably the most underrated aspect of make that most other build systems immediately discard: it doesn't need to "support" your language or toolchain, which is really helpful when dealing with proprietary/in-house or otherwise unusual tools and processes.

That's a double-edged sword: because make doesn't "support" your language things get hairy (think sed) as soon as you need to support anything non-trivial (say header dependency extraction for C/C++). Doing it in a portable way is often just impossible (try to do header dependency extraction for MSVC on Windows with make).
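For contrast, the GCC/GNU-make side of header dependency extraction is the familiar `-MMD` idiom (file names illustrative) - and it is exactly this fragment that has no portable MSVC equivalent:

```
SRCS := main.c util.c
OBJS := $(SRCS:.c=.o)

%.o: %.c
	$(CC) -MMD -MP -c $< -o $@   # -MMD writes a .d fragment listing included headers

-include $(OBJS:.o=.d)           # pull in header deps discovered on prior builds
```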

So ideally you want both ad hoc rules (like make) as well as some kind of build system plugin support which allows you to write more complex rules in something more powerful and portable. This is the approach we have taken with build2[1].

[1] https://build2.org

In the subsequent decade that this tool fell short of world conquest, containers and CI/CD pipelines have grabbed much mindshare.

The question is less: "How do we derive a simpler tool?" and more "How do we convert enough of the prior art to hit a tipping point?"

That's a very old article you're reading, so it's a bit out of date. Here are some more recent redo docs, including how to use it to build containers: https://redo.readthedocs.io/en/latest/cookbook/container/

This gave me a weird déjà vu of reading about some software that solves all the problems elegantly and performantly and that I've never otherwise seen and will forget in a day. The other software might've been a build system too.

It might have even been this build system. This has been coming up off and on for many years.

I played a bit with redo some time ago. It struck a chord with the way I like to do things (small, minimalistic tools that do one thing right). But it was hard to find my way around it, and for certain things (commands that produce more than one "product") the only answers were, in all honesty, hacks. The best example I remember now is bison, which produces a .c and a .h file. The hack was to produce a tar which contained the .c and .h files, declare that as the output from bison, and then have rules that state that your .c and your .h depend on the tar. Blergh.
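The workaround described can be simulated without bison (all file names are stand-ins):

```shell
#!/bin/sh
# One command "produces" two files, so the single declared redo target
# is a tarball containing both; downstream rules extract what they need.
set -e
cd "$(mktemp -d)"
printf 'int parse(void);\n'           > parser.tab.h   # stand-ins for bison's
printf 'int parse(void){return 0;}\n' > parser.tab.c   # .h and .c outputs
tar cf parser.tar parser.tab.c parser.tab.h            # the one target redo tracks
tar xOf parser.tar parser.tab.h                        # what a .h rule would do
```

Every downstream target then depends on parser.tar rather than on the files the tool actually wrote, which is why it feels like a hack.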

Ohhh the tragedy of build systems... Maybe the best example of meta technical debt on the planet, for so many canonical reasons that technical debt emerges in general. Top reasons no software engineering team fixes this:

The build system will not directly improve the end product. So no company wants to pay for it / use the time.

Due to the previous reason, a build system is not a compelling value proposition as a product (except for speed, see next)

Hardware/cloud technology has progressed fast enough that raw horsepower can be thrown at a slow build system with reasonably good results.

The ubiquity of legacy build systems makes integrating anything new extremely painful. Almost no software project these days is free from non-trivial dependencies that will certainly be using a legacy build system. Again, for any project team, this pain and extra effort cannot be justified as it doesn't make a difference to the end customer.

Engineers hate arbitrary syntax changes with respect to some scripting language when there is equivalent common practice. This goes for new general purpose languages too (entirely separate rant, but what is the point of arbitrarily changing the most widely known, generic C-style syntax of a function call? What value does it provide not to just re-use the return_type name(arguments) pattern that 99%+ of all programmers must already know? What is the return on the cognitive load???)

Most engineers simply don't care. They aren't interested. They click a green arrow or type "make" as a ritual and then wait for what they care about, which is the end-result. The only time anyone cares is when it has to be setup for the first time, or when it breaks. Everyone at this time will moan and say "this could be better, this build system sucks, we should make a better one", but once it is setup/fixed, everyone rapidly forgets about it never to return.

It is not a sexy problem. Telling anyone outside of software engineers that you made this awesome build system is pointless - they won't even understand the premise of the problem. In fact, interviewing recent CS grads, even they don't really know what a build system is.

If you think about it for a moment, these sorts of reasons are the harbingers of technical debt.
The primary one is not having a direct contribution to the end-goal. Obviously old-salt engineers know a good build system will save thousands upon thousands of hours in the long run. Just ask anyone who had to deal with a make system where dependencies weren't defined properly and -j couldn't be used. Or had to work with a system where incremental re-builds didn't work. Or spent half a day debugging in the bowels of a legacy build system.

Another key attribute of this kind of problem is that it is deceptively complex. Similar to how everyone tries to re-invent logging frameworks and message queue frameworks only to discover that the perfect solution is a unicorn - a white whale. The deficits in the other implementations were actually trade-offs, not architectural flaws that some inferior engineers overlooked.

My own recent experience with this topic was learning Yocto+BitBake, and similarly Buildroot. Working with embedded systems, I was so optimistic that finally a build system had come along that would wrangle the workflow of source code -> image -> flash. Could something finally provide full bi-directional dependency mapping between an image and source code? Could I look at a binary in the build output and instantly discover what source went into that file, what scripts/rules created it, and what decided to put it in that filesystem location in the output? No. It improved many things significantly, especially when it comes to configuration management for custom Linux, but in the end you have to learn an entirely new (arbitrarily different) scripting syntax just to end up with something that is essentially as complex and daunting as a custom make-based system. Never mind trying to set something up from scratch - it would be an entire project in itself if you didn't have the base provided by Poky or the hardware manufacturer.

Anyways, build systems are a fascinating meta-topic in software engineering. My guess is that in 50 years, most software will still be built by make. And petabytes of bandwidth and megawatts of electricity per day will still be used to send "HTTP/1.1 200 OK\r\n" in ASCII plaintext as if the computer on the other end is reading the response aloud to a human being. Another tale of technical debt born of legacy compatibility and brute force compensation.

Companies invest in build systems if it will increase their developer productivity enough, which is usually a big-company problem; that's why Google built Blaze, why Twitter and FB reproduced it with Pants and Buck, and why Google has now open-sourced it as Bazel.

FWIW, I've used redo and it seems to me to be the make-killer.

The overall concept of automatically memoizing parts of a batch computation to turn it into an incremental computation is really interesting, because something like 95% of software amounts to different systems for managing caches. The old saying is that there are two hard problems in software: naming, cache invalidation, and off-by-one errors; and cache invalidation is what we're talking about here.
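A build system is exactly such a cache manager. A toy sketch of the idea (all names invented), using file timestamps for invalidation the way make does:

```shell
#!/bin/sh
# Toy file-level memoization, the same move make/redo apply to builds:
# recompute only when the input is newer than the cached output.
set -e
cd "$(mktemp -d)"
expensive() { tr a-z A-Z < "$1"; }       # stand-in for a slow computation

memo() {
  if [ -f "$1.memo" ] && [ ! "$1" -nt "$1.memo" ]; then
    echo "cache hit" >&2                 # cached result still valid: skip the work
  else
    echo "cache miss" >&2
    expensive "$1" > "$1.memo"           # invalidation here = a timestamp check
  fi
  cat "$1.memo"
}

echo hello > input.txt
memo input.txt                           # first call computes and stores HELLO
memo input.txt                           # second call is served from the cache
```

Everything hard about build systems lives in that `if`: deciding, correctly and cheaply, when the cached result is no longer valid.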

A difficult issue here is that choosing good caches in a computing system is inherently a global optimization problem. Suppose you're spending 95% of your time in function X, so you memoize function X, with a 90% hit rate, but probing and invalidating the cache, and the extra space it takes up, takes 5% of the time that X took. So your system overall does the non-X (100-95)% = 5% of the work it was originally doing, plus 10% of the X work (9% of the total), plus a new 5% of 95% = 4.75% in the cache; so the system is doing 18.75% as much work as before, so (modulo concurrency) it's a bit over 5× faster.

Working on that remaining 9% of the original that is cache misses invoking X, you notice that X is spending 99% of its time in several calls to, indirectly, another function Y, so you memoize Y, and this works out better: you have a 95% hit rate, and managing and probing the Y cache only takes 1% of the work that running Y took. So the 8.91% of the original runtime that was in Y is reduced to 0.0891% of the original in cache management, plus 0.4455% of the original runtime in the cache misses in Y, for a total of 0.5346% of the original runtime, so you've reduced the runtime from 18.75% of the original to 10.3746% of the original; you've almost doubled performance again by adding this other cache.

But you're still spending 4.75% of the original runtime managing the X cache. Now you can improve performance by removing the X cache. Originally you were spending 94.05% of the runtime in Y (although you didn't know it) which you can reduce to 4.7025% in misses on the Y cache, plus 0.9405% in Y-cache management, plus the 5% of the original runtime that wasn't in X, for 9.7025% of the original runtime.

So you quintupled performance by memoizing X, and then later you improved performance by 6.5% by undoing the memoization of X. And this is in a very simple model system which doesn't take into account things like the memory hierarchy (another kind of caching), space budgets, and cache miss rates varying across different parts of the system.

So we have a global optimization problem which we are trying to solve by making local changes to improve things. This is clearly not the right solution, but what is?

Umut Acar described an approach he calls "self-adjusting computation" which uses a single global caching/memoization system within a program to do fine-grained cache invalidation. His student Matthew Hammer has extended this work. I haven't figured out how their system works yet, but it seems like a promising start.

The topic topics/caching.html in Dercuano collects a bunch of notes on this topic more generally: http://canonical.org/~kragen/dercuano-20191110.tar.gz.

I really appreciate Raph posting the link to the Mokhov–Mitchell–Peyton-Jones paper in https://news.ycombinator.com/item?id=21617434.

djb? Pass.

Got burned hard on Qmail which was strangled to death by his ridiculous management practices. Never again.

If you actually read TFA you'd know that it's describing a non-djb implementation of a djb design.
