1) It's simple, standard, everywhere, and everyone knows it
2) I haven't seen anything better. Most of the other systems are disastrously complex.
I would certainly never, in a million years, use an untested, untrusted, proprietary build system from a little startup that might go away tomorrow.
I do think there is room for Make 2.0. Not Make reinvented; just cleaned up, better for distributed builds, etc.
Turns out it was easier to learn to use cmake and rewrite the relevant parts of the Makefile than fix everything in the Makefile that didn't work in a cross compilation scenario.
CMake is the 6th fastest-growing language on GitHub: https://octoverse.github.com/projects.html
Boost is switching to it, Qt may be switching to it for Qt 6, and it's the most used build system in the JetBrains survey (https://www.jetbrains.com/research/devecosystem-2018/cpp/)...
For the sake of everything holy, this is the chance for the C++ community to standardize on a build tool. Once this is done and settled, it will be much easier to improve CMake incrementally, since there will naturally be more people interested in improving it. It will also be much easier to switch from CMake to next-gen-buildsystem-which-solves-all-cmake-problems in 10 years than it would be from the current fragmented situation.
No sandboxing? This complaint amounts to “it isn’t pure”. There are reasons to allow reading a file in a build rule that isn’t a dependency. Allowing temp files in a multi-step rule without complicating or obfuscating the dependency graph is one of them. You can do anything you want in a build rule. How often is this flexibility actually a problem? If someone is reading from a file without declaring it to make as a dependency, they already know they’ll get the expected behavior: failure to rebuild when that file changes.
Unportable caches is asking make to solve a problem that make wasn’t built to solve. And one could argue this isn’t technically a build system problem at all. Wanting network caching for multiple hardware configurations is reasonable, but isn’t what make is for.
No language abstraction is something I like about make. I use make for automating analytics and image resizing and all kinds of things that aren’t C++ specific. Adding language abstractions would bloat and complicate make.
Timestamps are faster to compute than hashes. Occasionally this matters a lot.
I might be willing to do a PR soon
The kind of sandboxing you’re talking about can be mostly solved for any build system with some wrapper shell scripts, or an environment shell script. That’s how most people are doing it.
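For what it's worth, here is a minimal sketch of the "environment wrapper" idea, written in Python rather than shell. The helper name `run_in_clean_env` is made up for illustration, and this only scrubs environment variables; it does not restrict file system access, so it is nowhere near a real sandbox.

```python
import subprocess

def run_in_clean_env(cmd, allowed_env):
    """Run `cmd` so that it sees only the variables in `allowed_env`.

    This is an illustrative helper, not part of make or any build tool.
    It hides stray environment variables from the build, nothing more.
    """
    return subprocess.run(
        cmd,
        env=dict(allowed_env),  # replaces the inherited environment entirely
        capture_output=True,
        text=True,
        check=True,             # raise if the command fails
    )
```

Something like `run_in_clean_env(["make", "all"], {"PATH": "/usr/bin:/bin"})` would then run the build with a known PATH and nothing else, assuming those directories are where your toolchain lives.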
At my work, there’s a crazy sandboxing system for Linux written as a C program that actually creates an environment with a modified file system, and access to certain paths is strictly enforced... no process launched from inside this environment can access anything but the whitelisted paths, by any means. This is a very heavy hammer, and I suspect over the top and unnecessary. OTOH, this definitely prevents mistakes, and my company is large. I don’t know the history here, but the existence of a system like this leads me to assume there must have been some large and real pains that led to the development of a heavy-handed sandbox.
> Writing a Large, Correct Makefile is Really Hard
> No Sandboxing
Would it be possible to track dependencies on a file level, simply declaring everything that gets read as a dependency? In the end, it is very likely that the generated artifact is different if any of the files touched to build it changes.
So, why not track open() syscalls made by make and its sub-processes? This obviously has some requirements towards the kernel to allow such tracking. It would also report helper files used by the toolchain and maybe even the code (executables and .so files) used by the toolchain, but one may argue that they are actually dependencies -- the artifact must be regenerated if the toolchain gets updated.
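A rough sketch of what the tracing side could look like, assuming Linux and a trace captured with something like `strace -f -e trace=open,openat -o trace.log make`. The regex and the helper name `opened_paths` are my own illustration, and the sample lines are hand-written in strace's usual format rather than taken from a real build.

```python
import re

# Matches lines such as:
#   1234 openat(AT_FDCWD, "foo.h", O_RDONLY) = 3
# Failed calls return -1, so requiring a non-negative result skips them.
OPEN_RE = re.compile(r'\bopen(?:at)?\((?:[^,"]+,\s*)?"([^"]*)".*\)\s*=\s*(\d+)')

def opened_paths(trace_text):
    """Return the set of paths that were opened successfully.

    Note that this includes files opened for writing (build outputs),
    so a real tool would also have to look at the open flags.
    """
    paths = set()
    for line in trace_text.splitlines():
        match = OPEN_RE.search(line)
        if match:
            paths.add(match.group(1))
    return paths

# Hand-written sample in the format strace typically produces.
sample = '''\
1234 openat(AT_FDCWD, "src/main.c", O_RDONLY) = 3
1234 openat(AT_FDCWD, "missing.h", O_RDONLY) = -1 ENOENT (No such file or directory)
1235 open("/usr/include/stdio.h", O_RDONLY|O_NOCTTY) = 4
'''
deps = opened_paths(sample)
```

Here `deps` ends up containing `src/main.c` and `/usr/include/stdio.h`, but not `missing.h`, whose open failed. Whether failed lookups should also count as dependencies is a design question this sketch ignores.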
> Caches are Not Portable
It should be considered first if rebuilding on each node is really that bad. Next, caches are non-portable by nature if they use different formats, or if dependencies that were used to build the artifacts are actually different on the various nodes. So this sets limits on cache sharing in principle -- but this has nothing to do with make.
> No Language Abstractions are Provided
Not going to argue against that.
> Timestamps, Not Hashes
Is there anything fundamental to make that requires the use of timestamps? Otherwise this sounds more like "nobody has bothered yet to change make to use hashes".
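To make the question concrete, here is a minimal sketch of a hybrid check: use the timestamp and size as a cheap filter, and hash only when they differ. This is not how make works. The function names and the shape of the `db` record are assumptions of mine, and a real tool would have to persist `db` between runs.

```python
import hashlib
import os

def file_digest(path):
    """SHA-256 of the file contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def has_changed(path, db):
    """Return True if the content of `path` differs from what `db` recorded.

    `db` maps path -> (mtime_ns, size, digest) and is updated in place.
    """
    st = os.stat(path)
    old = db.get(path)
    # Fast path: same timestamp and size, so assume unchanged without hashing.
    if old is not None and old[0] == st.st_mtime_ns and old[1] == st.st_size:
        return False
    # Slow path: the metadata moved, so compare the actual content.
    digest = file_digest(path)
    db[path] = (st.st_mtime_ns, st.st_size, digest)
    return old is None or old[2] != digest
```

The fast path keeps most of the speed advantage of timestamps, and the hash is only computed for files whose metadata has changed. A file that was merely touched then reports as unchanged.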
I thought about Make because I remembered reading a nice article about how great it is for this kind of stuff. During my research, I also looked at Rake, which is Make in Ruby.
Since I was doing my project in Ruby, I felt it would be nicer.
I really enjoyed using Rake. It's very nice and I will definitely reuse it in the future.
Martin Fowler has a great article about Rake and how it compares to Make.
Without this, it reads as FUD.
Compiler (and standard library) might be changed to do testing with multiple compilers (very common use-case).
Often Makefiles are not written with these two cases in mind, forcing a "clean" between changes.
From the article: "In determining when to rebuild, Make uses the timestamps of build inputs rather than their hashes. This results in many unnecessary rebuilds when a file has been touched, but has not actually changed. If this happens deep in the dependency tree, then the rebuild will take a long time!"
That's kinda clear to me -- why would this read as FUD?
> Since You’re Here
> We recently announced BuildInfer, a new tool to optimize your C/C++ build scripts. Take a look!
In my experience this kind of incremental build is a matter of developer convenience so that small changes can be made and checked quickly, but builds for test and release are always built clean.
I guess the codebases we're talking about here are at such a scale that the incremental approach is actually used for releases too?
It still seems to me as though this is something that should be solved architecturally though, with an appropriate composition of modules ...
Even with a well-specified dependency tree there are areas where an incremental build can still bite you in the ass. Things like inlining and transitive dependencies ... I think you still need to build from scratch to be sure.
- A: Builds well for small projects, scales poorly for large projects
- B: Builds well for small projects and scales well for large projects
Why not choose B?
EDIT: my proposition is that the described build system actually scales poorly for large projects, because you have to maintain an entire tree of hashes! Notwithstanding, of course, having a helper process in the background to do this for me ...
Two bullet points jumped out at me straight away:
- Single monolithic code tree with mixed language code
- Development on head; all releases from source
Just one other concern I have now.
Something like Make, or most other build-systems run pretty much straight out of the box.
Something like Bazel, let's say, is apparently an order of magnitude more complicated. There are a few more moving parts.
My only remaining qualm relates to the additional cognitive load of this stuff. At scale, as at Google, this isn't so much of a problem, because these costs diminish once you can actually assign people to look after it.
I really like the idea though! Would be nice to work in a place like Google some day ...
Timestamps really suck when, e.g., you change header $foo, which is used by a lot of your software (say, a basic data structure, or the definition of a 3D point), and then build (which rebuilds everything because of timestamps). Then you do a git pull --rebase, so git undoes your changes, applies the remote commits, and redoes your change, and you have to rebuild everything again even though the header that is used everywhere has not actually changed.
After a rebase it makes sense too, but I see what you mean about cases where no actual changes have been made ... dare I suggest this sounds more like a bug in git ...
yes, me too, but in this case the rebuild happens twice while the header changes only once (from the point of view of the programmer of course, from the point of view of the filesystem it changes twice). If hashes were used this would not be a problem at all.
1. change header.h
2. rebuild. this rebuilds all your .cpp because they all use the header - this is fine and expected.
3. git pull --rebase because your coworker fixed a bug in src1.cpp which landed in the master branch
4. here when you rebuild with make, even though only src1.cpp has changed from your point of view, everything actually gets rebuilt because git stashes your change, applies the remote commits and unstashes your change, so header.h's timestamp gets changed again.
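The four steps above can be simulated in a few lines. This is only a sketch: the file names and contents are placeholders, the rebase is faked by bumping the header's timestamp with `os.utime`, and `stale_by_mtime` is my simplified stand-in for make's "dependency newer than target" rule.

```python
import hashlib
import os
import tempfile

def sha(path):
    """SHA-256 of the file contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def stale_by_mtime(dep, target):
    # Make-style rule: the target is stale if the dependency is newer.
    return os.stat(dep).st_mtime_ns > os.stat(target).st_mtime_ns

SECOND = 1_000_000_000  # nanoseconds

workdir = tempfile.mkdtemp()
header = os.path.join(workdir, "header.h")
obj = os.path.join(workdir, "src1.o")

# Steps 1-2: edit the header, then build the object file from it.
with open(header, "w") as f:
    f.write("struct point3d { float x, y, z; };\n")
with open(obj, "w") as f:
    f.write("object code placeholder\n")
os.utime(header, ns=(10 * SECOND, 10 * SECOND))
os.utime(obj, ns=(20 * SECOND, 20 * SECOND))
built_from = sha(header)  # what a hash-based tool would remember

# Steps 3-4: the rebase rewrites header.h with identical content,
# simulated here by changing only its timestamp.
os.utime(header, ns=(30 * SECOND, 30 * SECOND))

mtime_says_rebuild = stale_by_mtime(header, obj)
hash_says_rebuild = sha(header) != built_from
```

After this runs, `mtime_says_rebuild` is `True` and `hash_says_rebuild` is `False`: the timestamp rule wants a full rebuild, while the content comparison sees nothing to do.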
W̶h̶i̶c̶h̶ ̶b̶u̶i̶l̶d̶ ̶s̶y̶s̶t̶e̶m̶s̶ ̶a̶c̶t̶u̶a̶l̶l̶y̶ ̶d̶o̶ ̶t̶h̶i̶s̶?̶ [answered by sibling comment]
For the record I definitely understand what the author is trying to point out, but I think they could be much more convincing.
I'd think the far more usual case is each dev has a copy of the code on his/her own machine, and another copy is on a build/test machine. There won't be any timestamp issues there.
Make is like a lot of tools, it's simple and works well, up to a point. If your project complexity exceeds its abilities, you likely can afford something more sophisticated, since you probably have dozens or hundreds of developers anyway.
For the most part though, you're right!
Stop inventing build systems.
While I agree with this sentiment, it's hard to know whether it's more or less evil than munging imperative and declarative build instructions in an XML file ...
Has anyone ever thought about just replacing the tabs with something else and leaving it at that?
My own experiences with Make show there's a bunch of other things that make it difficult to manage and maintain beyond a certain point.
Tool automation with dependency handling and parallel execution is just an inherently hard problem. No one solves it well. But make solves it cleanly. And that counts for something.