This came up because I am promoting a conceptually similar unified theory of Reactive UI, which also has things in common with build systems. In fact, Svelte also happens to cite Excel as an inspiration for its approach to incremental updates. My blog post on this went live yesterday, and a birdie tells me it will hit the HN front page later today.
The Rust ecosystem does not support "compiler plugins" well (by contrast, this is a first-class concept in Kotlin). What is being actively pursued is "procedural macros," but we're running into the limitations of those - we're moving away from them for deriving lenses, for example.
I'm sure it would also bring up lifetime issues, but, to be fair, those are probably similar to issues for async, and that only took a couple of years or so to land. But people are actively thinking about using generators for UI.
I do think this is one of the open questions. It might be that really first-rate reactivity does require language support in some way. But this feels like increasing the scope of an already possibly overambitious problem.
This means it cannot enforce pinning-on-a-hash ('hermetic' builds in bazel lingo): if you allow a dev team to say `git clone hxxp://whatever/whatever/master` somewhere in the bowels of the build system, they invariably go for it. It's easy, agile, and it works (for months, if not years).
Result? On a large project, every week you get a build broken by some third-party change. Because make is too capable.
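For contrast, this is roughly what pinning-on-a-hash looks like in a hermetic system: every external dependency is fetched as an exact archive with a checksum, so an upstream push can't silently change the build (the URL and hash here are placeholders):

```python
# WORKSPACE-style sketch (Starlark): the dependency is pinned to an exact
# release and checksum; a moving `master` branch simply cannot be expressed.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
    name = "whatever",
    urls = ["https://example.com/whatever/archive/v1.2.3.tar.gz"],  # placeholder
    sha256 = "<pinned checksum>",  # placeholder
)
```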
1. Distributing modules with BUILD files makes everything just work
2. You don't think at the level of commands/files, you think at the abstraction of targets. How targets happen is an implementation detail.
3. `select` for varying builds in a sane and readable way
4. Shared cache
5. Distributed build and test runner
6. Query language
7. All builds occur out of tree
8. A language for describing builds that doesn't feel obtuse
And the list keeps going on. A "new make" is not an improvement. Even if it is where the world is going to go, it misses out on the obvious benefits of other build systems.
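To make item 3 concrete, `select` lets a single target vary by configuration declaratively; a sketch (the `config_setting` labels here are illustrative):

```python
# BUILD-file sketch (Starlark): compiler options chosen per platform.
cc_binary(
    name = "app",
    srcs = ["main.cc"],
    copts = select({
        "//config:windows": ["/O2"],
        "//config:linux": ["-O2", "-fno-exceptions"],
        "//conditions:default": ["-O2"],
    }),
)
```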
Happily, not everyone thinks this way. That's why there is a myriad of safe languages -- from Ruby to Rust -- which have replaced C in many areas. Sure, C is still around, but many people prefer to give up some control for lots more safety.
The same logic applies to build systems. Sure, you can use a "low-level" build system to hand-write the gcc invocation in each project, but a bigger system makes your build files much safer.
Reimplementation of python for build file language? No thanks.
Reimplementation of package managers? No thanks.
Reimplementation of Docker? No thanks.
It’s just another thing to learn, worry about, and discover bugs in rather than something that just works. I already know how docker, yarn, and python work and that stuff works well, why are we boiling the oceans recreating and relearning all of this just to do it the “Bazel” way?
What a waste of time and energy!
I've used bazel in a different environment (C++) and it has been amazing (compared with other C++ build systems).
Another comment below mentioned staying away from bazel for anything other than C++ (and maybe Java).
I've also seen it used with java android apps.
On top of that, everyone on my team had to learn skylark, their rules for docker, their rules for node, their... I’m exhausted just thinking about it...
Why bother unless you’ve got google size? It’s good software written by smart people, but I know I made a mistake in choosing it for our build solution.
* Bazel needs gigabytes of memory to run;
* It needs minutes to bootstrap (which it will every time, because you don’t want a multi-gigabyte-sized daemon running around);
* It breaks every second day. If you are building version X of $something, you better go and get the exact version of Bazel that the developers of $something were using when tagging X;
* And when they fail to build there’s no chance you will be able to figure it out without spending days on it.
Don’t use Bazel unless writing code that only you yourself can maintain is your goal.
Getting it to work can sometimes be a pain in the rear (although I'm not sure other systems fare much better). I'm still trying to figure out some missing header file errors... some are obvious and easy to fix, but some are confusing as heck. I also hate the fact that I have to specify header files redundantly, once in the source file and once in the build tool.
Their documentation can also be painful to quickly find stuff in (e.g. try finding how to display the command line options passed to the C++ compiler); sometimes some flags are just spread out in the prose rather than in the normal table format. Their @// notation is not exactly intuitive; I'm still not sure I fully understand it.
And lastly, there are lots of rough edges, e.g. there's no way to call an existing Makefile and pass it conditional compiler options, so good luck sharing your compile flags across the entire codebase in an elegant manner.
For a Go project whose whole build command is a single line of `go build`, it takes lots of config files and tweaking until you get the exact same result.
That is, until there is an update and the whole build crashes, and you have to figure out why.
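For a sense of the overhead, the minimal Bazel scaffolding for that one-line Go build looks roughly like this (version strings, URLs, and checksums are placeholders you must pin and keep compatible yourself):

```python
# WORKSPACE (sketch) -- fetch and pin the Go rules.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
    name = "io_bazel_rules_go",
    urls = ["https://example.com/rules_go-vX.Y.Z.tar.gz"],  # placeholder
    sha256 = "<pinned checksum>",  # placeholder
)
load("@io_bazel_rules_go//go:deps.bzl", "go_rules_dependencies", "go_register_toolchains")
go_rules_dependencies()
go_register_toolchains(version = "1.20.0")

# BUILD.bazel (sketch) -- the target that replaces `go build`.
load("@io_bazel_rules_go//go:def.bzl", "go_binary")
go_binary(
    name = "app",
    srcs = ["main.go"],
)
```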
To me, Bazel is a great tool for Google or a project at the scale of Google's projects. It is beneficial if you can take advantage of distributed builds and caching layers, and only for a project big enough that those could make a difference.
But for all other projects, Bazel felt like bringing an 18-wheeler to go grocery shopping.
Thank you. Sigh.
This is probably the most underrated aspect of make that most other build systems immediately discard: it doesn't need to "support" your language or toolchain, which is really helpful when dealing with proprietary/in-house or otherwise unusual tools and processes.
So ideally you want both ad hoc rules (like make) as well as some kind of build system plugin support which allows you to write more complex rules in something more powerful and portable. This is the approach we have taken with build2.
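The ad hoc side is easy to see in make itself: any tool that reads files and writes files can be driven by a pattern rule, with no "language support" in the build system at all (the tool names below are made up):

```make
# Pattern rule for a hypothetical proprietary compiler; recipes must be
# indented with a tab. $@ is the target, $< the first prerequisite,
# $^ all prerequisites.
%.obj: %.src
	acmecc --strict -o $@ $<

firmware.img: a.obj b.obj
	acmelink -o $@ $^
```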
The question is less: "How do we derive a simpler tool?" and more "How do we convert enough of the prior art to hit a tipping point?"
The build system will not directly improve the end product. So no company wants to pay for it / use the time.
Due to the previous reason, a build system is not a compelling value proposition as a product (except for speed, see next)
Hardware/cloud technology has progressed fast enough that raw horsepower can be thrown at a slow build system with reasonably good results.
The ubiquity of legacy build systems makes integrating anything new extremely painful. Almost no software project these days is free from non-trivial dependencies that will certainly be using a legacy build system. Again, for any project team, this pain and extra effort cannot be justified as it doesn't make a difference to the end customer.
Engineers hate arbitrary syntax changes with respect to some scripting language when there is equivalent common practice. This goes for new general purpose languages too (entirely separate rant, but what is the point of arbitrarily changing the most widely known, generic C-style syntax of a function call? What value does it provide not to just re-use the return_type name(arguments) pattern that 99%+ of all programmers must already know? What is the return on the cognitive load???)
Most engineers simply don't care. They aren't interested. They click a green arrow or type "make" as a ritual and then wait for what they care about, which is the end-result. The only time anyone cares is when it has to be setup for the first time, or when it breaks. Everyone at this time will moan and say "this could be better, this build system sucks, we should make a better one", but once it is setup/fixed, everyone rapidly forgets about it never to return.
It is not a sexy problem. Telling anyone who isn't a software engineer that you made this awesome build system is pointless - they won't even understand the premise of the problem. In fact, when interviewing recent CS grads, I find even they don't really know what a build system is.
- If you think about it for a moment, these sorts of reasons are the harbingers of technical debt.
Another key attribute of this kind of problem is that it is deceptively complex. Similar to how everyone tries to re-invent logging frameworks and message queue frameworks only to discover that the perfect solution is a unicorn - a white whale. The deficits in the other implementations were actually trade-offs, not architectural flaws that some inferior engineers overlooked.
My own recent experience with this topic was learning Yocto+BitBake, and similarly Buildroot. Working with embedded systems, I was so optimistic that finally a build system had come along that would wrangle the workflow of source code -> image -> flash. Could something finally provide full bi-directional dependency mapping between an image and source code? Could I look at a binary in the build output and instantly discover what source went into that file, what scripts/rules created it, and what decided to put it in that filesystem location in the output? No. It improved many things significantly, especially when it comes to configuration management for custom Linux, but in the end you have to learn an entirely new (arbitrarily different) scripting syntax just to end up with something that is essentially as complex and daunting as a custom make-based system. Never mind trying to set something up from scratch - it would be an entire project in itself if you didn't have the base provided by Poky or the hardware manufacturer.
Anyways, build systems are a fascinating meta-topic in software engineering. My guess is that in 50 years, most software will still be built by make. And petabytes of bandwidth and megawatts of electricity per day will still be used to send "HTTP/1.1 200 OK\r\n" in ASCII plaintext as if the computer on the other end is reading the response aloud to a human being. Another tale of technical debt born of legacy compatibility and brute force compensation.
A difficult issue here is that choosing good caches in a computing system is inherently a global optimization problem. Suppose you're spending 95% of your time in function X, so you memoize X, with a 90% hit rate, but probing and invalidating the cache, and the extra space it takes up, takes 5% of the time that X took. So your system overall does the non-X (100-95)% = 5% of the work it was originally doing, plus 10% of the X work (9.5% of the total), plus a new 5% of 95% = 4.75% in the cache; so the system is doing 19.25% as much work as before, so (modulo concurrency) it's about 5× faster.
Working on the cache misses that still invoke X, you notice that X is spending 99% of its time in several calls to, indirectly, another function Y, so you memoize Y, and this works out better: you have a 99% hit rate, and managing and probing the Y cache only takes 1% of the work that running Y took. So the 9.4% of the original runtime that was Y work inside X's misses is reduced to about 0.094% in cache management plus 0.094% in Y cache misses, roughly 0.19% of the original runtime; your total drops from 19.25% of the original to about 10.03%, so you've almost doubled performance again by adding this other cache.
But you're still spending 4.75% of the original runtime managing the X cache. Now you can improve performance by removing the X cache. Originally you were spending 99% of 95% = 94.05% of the runtime in Y (although you didn't know it), which the Y cache reduces to 0.94% in misses plus 0.94% in management; add the 0.95% of X's time that isn't in Y and the 5% of the original runtime that wasn't in X, and you're at about 7.83% of the original runtime.
So you quintupled performance by memoizing X, and then later you improved performance by roughly another 28% by undoing the memoization of X. And this is in a very simple model system which doesn't take into account things like the memory hierarchy (another kind of caching), space budgets, and cache miss rates varying across different parts of the system.
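This kind of bookkeeping is easy to get wrong by hand, so here is a toy model of it in Python; the specific hit rates and management costs are illustrative assumptions:

```python
# Toy model of the caching trade-off above, in fractions of the original
# runtime. All parameters are illustrative.
NON_X = 0.05            # work outside X
X_TOTAL = 0.95          # work inside X
Y_FRAC = 0.99           # fraction of X's time spent (indirectly) in Y
X_HIT, X_MGMT = 0.90, 0.05   # X cache: hit rate, mgmt cost (fraction of X's time)
Y_HIT, Y_MGMT = 0.99, 0.01   # Y cache: hit rate, mgmt cost (fraction of Y work done)

def runtime(use_x_cache, use_y_cache):
    # Fraction of X that actually executes (cache misses only, if cached).
    x_exec = X_TOTAL * (1 - X_HIT) if use_x_cache else X_TOTAL
    y_work = x_exec * Y_FRAC             # Y work inside the executed part of X
    x_rest = x_exec * (1 - Y_FRAC)       # X work outside Y
    y_cost = y_work * (1 - Y_HIT) + y_work * Y_MGMT if use_y_cache else y_work
    x_cache_cost = X_TOTAL * X_MGMT if use_x_cache else 0.0
    return NON_X + x_rest + y_cost + x_cache_cost

print(runtime(True, False))   # X cache alone: ~0.19 of the original runtime
print(runtime(True, True))    # both caches:   ~0.10
print(runtime(False, True))   # Y cache only:  ~0.08 -- dropping the X cache wins
```

Tweak the parameters and the sign flips: with a lower Y hit rate the X cache earns its keep again, which is exactly what makes the problem global rather than local.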
So we have a global optimization problem which we are trying to solve by making local changes to improve things. This is clearly not the right solution, but what is?
Umut Acar described an approach he calls "self-adjusting computation," which uses a single global caching/memoization system within a program to do fine-grained cache invalidation. His student Matthew Hammer has extended this work. I haven't figured out how their system works yet, but it seems like a promising start.
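I can at least sketch the flavor of it: each computation memoizes its result and records which inputs it read, so changing an input dirties exactly the computations that depend on it. This is a crude toy along those lines, not their actual system:

```python
# Toy "self-adjusting" cells and thunks: fine-grained invalidation via
# recorded reads. (Illustrative sketch only.)
class Cell:
    def __init__(self, value):
        self.value = value
        self.readers = set()        # thunks that have read this cell

    def get(self, reader=None):
        if reader is not None:
            self.readers.add(reader)
        return self.value

    def set(self, value):
        if value != self.value:
            self.value = value
            for thunk in self.readers:   # dirty only true dependents
                thunk.dirty = True

class Thunk:
    def __init__(self, fn):
        self.fn, self.dirty, self.cached = fn, True, None

    def force(self):
        if self.dirty:                   # recompute only when an input changed
            self.cached = self.fn(self)  # fn records its reads via `self`
            self.dirty = False
        return self.cached

a, b = Cell(2), Cell(3)
total = Thunk(lambda t: a.get(t) + b.get(t))
print(total.force())   # 5 (computed)
print(total.force())   # 5 (cache hit, no recomputation)
a.set(10)              # dirties `total` only
print(total.force())   # 13 (recomputed on demand)
```

A real system also has to track dependencies between thunks, not just on input cells, and decide when stale entries can be collected, which is where it gets hard.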
The topic topics/caching.html in Dercuano collects a bunch of notes on this topic more generally: http://canonical.org/~kragen/dercuano-20191110.tar.gz.
I really appreciate Raph posting the link to the Mokhov–Mitchell–Peyton-Jones paper in https://news.ycombinator.com/item?id=21617434.
Got burned hard on qmail, which was strangled to death by its author's ridiculous management practices. Never again.