Why Not Make? (hackernoon.com)
27 points by entelechy on Nov 8, 2018 | 51 comments

I don't think Make is ideal, but:

1) It's simple, standard, everywhere, and everyone knows it

2) I haven't seen anything better. Most of the other systems are disastrously complex.

I certainly, never, in a million years would use an untested, untrusted, proprietary build system from a little startup which might go away tomorrow.

I do think there is room for Make 2.0. Not Make reinvented; just cleaned up, better for distributed builds, etc.

Make may be simple, but most real world uses of Make I've seen are NOT simple. The primary benefit of CMake is it imposes a modicum more consistency on the build system. I'll admit CMake still has this problem, though, due to how flexible it is.

I would certainly use a mostly-compatible Make extension/reimplementation. The issues I have with Make are very minor: the syntax clashes between indentation and condition blocks, better built-in functions for string parsing, correct handling for filenames with spaces, a cleanup of built-in targets, etc.

it's not standard at all. what parts of make are the GNU extensions? What parts are you building with bash that aren't portable to windows, or even other *nix machines that only have posix sh?

BuildInfer is not a build-system. It is a tool to migrate between build-systems.

I certainly, never, in a million years would use a proprietary build system.

I just had to deal with a large, handwritten Makefile when trying to package and cross compile something for buildroot.

Turns out it was easier to learn to use cmake and rewrite the relevant parts of the Makefile than fix everything in the Makefile that didn't work in a cross compilation scenario.

If I had a nickel for every time a developer said “it was just easier to rewrite...” I’d be a rich man. It always seems easier to start from a clean slate and do a rewrite than to try to understand someone else’s code, but in the end you just wind up with a different set of hacks and complexities.

I often find Make easier to debug than CMake because it’s more transparent about which commands are being used and there are fewer files to inspect in the process.

I feel like every single point against make in this article is considered a feature of make by many other people. But I guess more than why not, I’d like to hear more about what the alternatives are. I see this is an ad, and it implies but doesn’t say explicitly whether this product fixes all these issues, or what it actually does at all. People have been trying to redo make for decades, and despite all its warts (and it does have warts) so far it seems like nobody has been able to supplant it.

No sandboxing? This complaint amounts to “it isn’t pure”. There are reasons to allow reading a file in a build rule that isn’t a dependency. Allowing temp files in a multi-step rule without complicating or obfuscating the dependency graph is one of them. You can do anything you want in a build rule. How often is this flexibility actually a problem? If someone is reading from a file dependency without using make to depend on it, they already know they’ll get the expected behavior: failure to rebuild when the dependency changes.

Unportable caches is asking make to solve a problem that make wasn’t built to solve. And one could argue this isn’t technically a build system problem at all. Wanting network caching for multiple hardware configurations is reasonable, but isn’t what make is for.

No language abstraction is something I like about make. I use make for automating analytics and image resizing and all kinds of things that aren’t C++ specific. Adding language abstractions would bloat and complicate make.

Timestamps are faster to compute than hashes. Occasionally this matters a lot.

The sandboxing issue can be solved by extending cmake with a directive specifying which paths are accessible up front.

I might be willing to do a PR soon

Yes. This isn’t quite what the author was talking about though. They used the term “sandboxing” to mean that dependency rules in the build system can act on things that they don’t depend on. You’re talking about a higher level sandboxed view of the file system.

The kind of sandboxing you’re talking about can be mostly solved for any build system with some wrapper shell scripts, or an environment shell script. That’s how most people are doing it.

At my work, there’s a crazy sandboxing system for Linux written as a C program that actually creates an environment with a modified file system, and access to certain paths is strictly enforced... no process launched from inside this environment can access anything but the whitelisted paths, by any means. This is a very heavy hammer, and I suspect over the top and unnecessary. OTOH, this definitely prevents mistakes, and my company is large. I don’t know the history here, but the existence of a system like this leads me to assume there must have been some large and real pains that led to the development of a heavy-handed sandbox.

While make, for me, fell out of day-to-day use once C/C++ projects waned, it's still a great and powerful tool. I really appreciate its consistency with the Unix philosophy, where it provides the essential feature of target building through dependencies but leaves so much of the heavy lifting to good ole /bin/sh.

Make is trash, but some in house tool made 2 weeks ago is guaranteed to be a worse choice. Just generate Makefiles with CMake like everyone else.

Some alternatives to consider: https://buckbuild.com/ https://bazel.build/ https://please.build/ These have been deployed for years in major corporations.

> These have been deployed for years in major corporations.

CMake is the 6th fastest-growing language on GitHub: https://octoverse.github.com/projects.html

Boost is switching to it, Qt may be switching to it for Qt 6, it's the most used in the jetbrains survey (https://www.jetbrains.com/research/devecosystem-2018/cpp/)...

for the sake of everything holy, this is the chance for the C++ community to standardize on a build tool. Once this is done and settled, it will be much easier to improve CMake incrementally since there will mechanically be more people interested in improving it, and it will also be much easier to switch from CMake to next-gen-buildsystem-which-solves-all-cmake-problems in 10 years than from the current fragmented situation.

TFA makes a bunch of claims about the functionality of make being bad or dangerous but doesn't really provide any backing scenarios to demonstrate why. E.g. where timestamps can bite you, how you can miss compiler flag changes, etc.

Without this, it reads as FUD.

The scenarios can be the following: timestamps are a problem as soon as a drive is mapped from two different OSes without synced clocks (happened to me with a virtualbox share). Taking care of compiler flags is important as you don't want to mix a debug build with a performance build, for binary compatibility reasons.

Compiler flag changes might happen due to debug/release configurations.

Compiler (and standard library) might be changed to do testing with multiple compilers (very common use-case).

Often Makefiles are not written with these two cases in mind, forcing a "clean" between changes.
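One way to picture the flag problem: if the rebuild decision is keyed on the full command line as well as the inputs, switching between debug and release (or between compilers) invalidates the old object on its own. A minimal sketch of that idea in Python; `action_key`, the file name and the flags are all made up for illustration, and this is not how make itself decides.

```python
import hashlib

def action_key(command, input_digests):
    """Digest of the full command line plus the digests of all inputs."""
    h = hashlib.sha256()
    for arg in command:
        h.update(arg.encode("utf-8"))
        h.update(b"\0")  # separator so ["ab", "c"] and ["a", "bc"] differ
    for name in sorted(input_digests):
        h.update(name.encode("utf-8"))
        h.update(b"\0")
        h.update(input_digests[name].encode("utf-8"))
        h.update(b"\0")
    return h.hexdigest()

# Hypothetical single-file project: same source, two configurations.
inputs = {"main.c": hashlib.sha256(b"int main(void) { return 0; }\n").hexdigest()}
debug_key = action_key(["cc", "-O0", "-g", "-c", "main.c"], inputs)
release_key = action_key(["cc", "-O2", "-c", "main.c"], inputs)

# Different flags or a different compiler give a different key, so a stale
# object from the other configuration is never reused by accident.
print(debug_key != release_key)
```

With a key like this, the objects for each configuration can also be kept side by side instead of forcing a "clean".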

Not sure if you read the whole thing, but it seemed to me like it made pretty good points.

From the article: "In determining when to rebuild, Make uses the timestamps of build inputs rather than their hashes. This results in many unnecessary rebuilds when a file has been touched, but has not actually changed. If this happens deep in the dependency tree, then the rebuild will take a long time!"

That's kinda clear to me -- why would this read as FUD?

> why would this read as FUD?

From article:

> Since You’re Here

> We recently announced BuildInfer, a new tool to optimize your C/C++ build scripts. Take a look!

Make compares timestamp of object-file with source-file amiright? This is nice and cheap and stateless. Do modern build-systems maintain a separate tree of hashes? I haven't seen this in my day-to-day but I don't tend to work on the bleeding edge ... I'd have thought the cost of calculating hashes for trivial source-files would be maybe only an order of magnitude cheaper than compiling one? Over a large source-tree I'd expect these costs to mount - I think I'd prefer to take my chance with timestamps!

Modern build systems (Buck, Bazel, Pants, etc) will compute a tree of hashes and manage it for you. They will also do this in the background (e.g. using Watchman). This is how Google, Facebook, Uber, Amazon etc. build their code, so the approach scales well.
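A rough idea of what a "tree of hashes" means, as a toy Python sketch and not a description of how Buck, Bazel or Pants actually implement it: each node's digest covers its own bytes plus the digests of everything it depends on, so a change propagates upwards and nowhere else. The file names and the `tree_digest` helper are hypothetical, and the graph is assumed to be acyclic.

```python
import hashlib

def tree_digest(name, contents, deps, memo=None):
    """Digest of a node: its own bytes plus the digests of its dependencies."""
    if memo is None:
        memo = {}
    if name not in memo:
        h = hashlib.sha256(contents[name])
        for dep in sorted(deps.get(name, [])):
            h.update(tree_digest(dep, contents, deps, memo).encode("ascii"))
        memo[name] = h.hexdigest()
    return memo[name]

# Hypothetical project: only geometry.cpp includes point.h.
contents = {
    "point.h": b"struct point { float x, y, z; };\n",
    "geometry.cpp": b'#include "point.h"\n',
    "logging.cpp": b"// no includes\n",
}
deps = {"geometry.cpp": ["point.h"]}

before = {n: tree_digest(n, contents, deps) for n in contents}
contents["point.h"] += b"// real edit\n"
after = {n: tree_digest(n, contents, deps) for n in contents}

# Only the header and the file that depends on it need rebuilding.
dirty = sorted(n for n in contents if before[n] != after[n])
print(dirty)
```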

So what I guess is that this is for building at scale?

In my experience this kind of incremental build is a matter of developer convenience so that small changes can be made and checked quickly, but builds for test and release are always built clean.

I guess the codebases we're talking about here are at such a scale that the incremental approach is actually used for releases too?

It still seems to me as though this is something that should be solved architecturally though, with an appropriate composition of modules ...

It's for your day-to-day builds, so when you switch between build types while working you don't have to suffer a full rebuild.

by build-types you mean release to test vs. build for my own work? All my code gets released off a build server, and I'd consider it bad practice for a developer to be releasing their own code from their own workspace ...

Same build, sure, but you don't release it - you just run it on your own computer, as opposed to the development build you might ordinarily run, to make sure you didn't break anything.

I still can't help but feel this could be obviated by a more top-down approach, like decomposing into modules.

Even with a well-specified dependency tree there are areas where an incremental build can still bite you in the ass. Things like inlining and transitive dependencies ... I think you still need to build from scratch to be sure.

If your choices are:

- A: Builds well for small projects, scales poorly for large projects

- B: Builds well for small projects and scales well for large projects

Why not choose B?

I'd always choose B given a green field, but it's when you already have A that you have to justify the effort to migrate ...

EDIT my proposition is that the described build system actually scales poorly for large projects, because you've to maintain an entire tree of hashes! Notwithstanding of course having a helper process in the background to do this for me ...

Ah, I see your point now. However, in practice this is not the case. These hash-based systems scale very well in large organizations. Some more info from Google about Bazel: https://qconsf.com/sf2010/dl/qcon-sanfran-2010/slides/Ashish...

Very interesting ...

Two bullet points jumped out at me straightaway:

- Single monolithic code tree with mixed language code

- Development on head; all releases from source

Just one other concern I have now.

Something like Make, or most other build-systems run pretty much straight out of the box.

Something like Bazel, let's say, is apparently an order of magnitude more complicated. There are a few more moving parts.

My only remaining qualm relates to the additional cognitive load of this stuff. At scale, as in Google this isn't so much a problem because these costs would diminish at the point where you can actually assign people to look after it.

I really like the idea though! Would be nice to work in a place like Google some day ...

> Over a large source-tree I'd expect these costs to mount - I think I'd prefer to take my chance with timestamps!

timestamps really suck when, e.g., you change header $foo which is used by a lot of your software (say, a basic data structure, or the definition of a 3d point), build (which rebuilds everything because of timestamps), and then do a git pull --rebase. Git undoes your changes, applies the remote commits, and redoes your change, so you have to rebuild everything again even though the header that is used everywhere has not actually changed.

I'd actually expect and desire a full rebuild in the case that a widely used header is changed.

After a rebase it makes sense too, but I see what you mean about where no actual changes have been made though ... dare I suggest this sounds more like a bug in git though ...

> I'd actually expect and desire a full rebuild in the case that a widely used header is changed.

yes, me too, but in this case the rebuild happens twice while the header changes only once (from the point of view of the programmer of course, from the point of view of the filesystem it changes twice). If hashes were used this would not be a problem at all.

I don't understand your point, how does this happen twi ... ohhh because dependent files get changed as well? Surely this is just a matter of how you specify your dependencies?

imagine your project structure is the following :

you do the following :

1. change header.h

2. rebuild. this rebuilds all your .cpp because they all use the header - this is fine and expected.

3. git pull --rebase because your coworker fixed a bug in src1.cpp which landed in the master branch

4. here when you rebuild with make, even though only src1.cpp has changed from your point of view, everything actually gets rebuilt because git stashes your change, applies the remote commits and unstashes your change, so header.h's timestamp gets changed again.
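The four steps above can be reproduced in miniature. This Python sketch stands in for the rebase by rewriting header.h with identical bytes and a newer timestamp; the file names and the two-second bump are assumptions chosen for the demo. The timestamp comparison mirrors make's "prerequisite newer than target" rule, and the hash comparison is what a content-based system would do instead.

```python
import hashlib
import os
import tempfile

def content_digest(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

HEADER_TEXT = "struct point { float x, y, z; };\n"

with tempfile.TemporaryDirectory() as tmp:
    header = os.path.join(tmp, "header.h")
    obj = os.path.join(tmp, "src2.o")

    # Steps 1 and 2: edit the header, then build src2.o from it.
    with open(header, "w") as f:
        f.write(HEADER_TEXT)
    with open(obj, "w") as f:
        f.write("object code placeholder\n")
    digest_at_build = content_digest(header)

    # Step 3 stand-in: the header is rewritten with the same bytes and ends
    # up with a timestamp newer than the object file.
    with open(header, "w") as f:
        f.write(HEADER_TEXT)
    obj_mtime = os.stat(obj).st_mtime_ns
    os.utime(header, ns=(obj_mtime + 2 * 10**9, obj_mtime + 2 * 10**9))

    # Step 4: the two strategies disagree about whether src2.o is stale.
    stale_by_mtime = os.stat(header).st_mtime_ns > os.stat(obj).st_mtime_ns
    stale_by_hash = content_digest(header) != digest_at_build

print(stale_by_mtime, stale_by_hash)
```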

oh right yeah - this is the thing I was saying could probably be considered an issue with git.

The typical optimisation in such cases is to check the file hash only when the file's timestamp or size changes. If the hash is the same after a timestamp change, the file was not modified.

Okay that's half the problem. Where do I keep my database of hashes though?
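One possible answer, sketched in Python: keep the database as a small JSON file next to the build outputs, and only hash a file when its timestamp or size differs from what was recorded. The file name `.build_hashes.json` and the helpers below are invented for the example; real tools pick their own storage format.

```python
import hashlib
import json
import os
import tempfile

def file_digest(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def load_db(db_path):
    try:
        with open(db_path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

def save_db(db_path, db):
    with open(db_path, "w") as f:
        json.dump(db, f)

def has_changed(path, db):
    """True if the content differs from what db recorded; updates db."""
    st = os.stat(path)
    stamp = [st.st_mtime_ns, st.st_size]
    entry = db.get(path)
    if entry is not None and entry["stamp"] == stamp:
        return False  # cheap path: same timestamp and size, no hashing
    digest = file_digest(path)
    changed = entry is None or entry["digest"] != digest
    db[path] = {"stamp": stamp, "digest": digest}
    return changed

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "header.h")
    db_path = os.path.join(tmp, ".build_hashes.json")  # hypothetical location
    with open(src, "w") as f:
        f.write("struct point { float x, y, z; };\n")

    db = load_db(db_path)
    first_seen = has_changed(src, db)  # never recorded before
    save_db(db_path, db)

    # Touch only: newer timestamp, identical bytes.
    st = os.stat(src)
    os.utime(src, ns=(st.st_atime_ns, st.st_mtime_ns + 2 * 10**9))
    db = load_db(db_path)
    after_touch = has_changed(src, db)

    # Real edit: the content changes.
    with open(src, "a") as f:
        f.write("// edit\n")
    after_edit = has_changed(src, db)

print(first_seen, after_touch, after_edit)
```

The trade-off is that the build is no longer stateless: the database has to live somewhere and be invalidated when it goes missing, which is exactly the cost timestamps avoid.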

W̶h̶i̶c̶h̶ ̶b̶u̶i̶l̶d̶ ̶s̶y̶s̶t̶e̶m̶s̶ ̶a̶c̶t̶u̶a̶l̶l̶y̶ ̶d̶o̶ ̶t̶h̶i̶s̶?̶ [answered by sibling comment]

Your sibling poster points out a great scenario that can trigger this: mounting shared/externally modified filesystems. Just saying "timestamps are bad and can cause inefficient builds" isn't compelling in itself.

For the record I definitely understand what the author is trying to point out, but I think they could be much more convincing.

Since the wide adoption of distributed version control systems (e.g. git), how many shops are still working on code on a shared filesystem being edited by multiple developers?

I'd think the far more usual case is each dev has a copy of the code on his/her own machine, and another copy is on a build/test machine. There won't be any timestamp issues there.

Make is like a lot of tools, it's simple and works well, up to a point. If your project complexity exceeds its abilities, you likely can afford something more sophisticated, since you probably have dozens or hundreds of developers anyway.

I had a shared drive on virtualbox where I had the problem of the clocks not being in sync. There is also the fact that some other tools aren't necessarily well behaved and will dirty files, causing spurious rebuilds.

For the most part though, you're right!

What if the objections in the article were solved directly?

> Writing a Large, Correct Makefile is Really Hard

> No Sandboxing

Would it be possible to track dependencies on a file level, simply declaring everything that gets read as a dependency? In the end, it is very likely that the generated artifact is different if any of the files touched to build it changes.

So, why not track open() syscalls made by make and its sub-processes? This obviously has some requirements towards the kernel to allow such tracking. It would also report helper files used by the toolchain and maybe even the code (executables and .so files) used by the toolchain, but one may argue that they are actually dependencies -- the artifact must be regenerated if the toolchain gets updated.
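On Linux, one way to approximate this without touching make is to record the calls with strace and read the log afterwards. The Python sketch below only parses such a log; it assumes the log came from something like `strace -f -e trace=open,openat -o trace.log make`, and the sample lines are hand-written in that style rather than captured from a real run. It ignores calls that strace splits into "unfinished"/"resumed" pairs and paths containing escaped characters.

```python
import re

# Matches lines such as:
#   101 openat(AT_FDCWD, "header.h", O_RDONLY|O_NOCTTY) = 3
OPEN_RE = re.compile(
    r'^\d+\s+open(?:at)?\('
    r'(?:AT_FDCWD, )?'
    r'"(?P<path>[^"]*)", '
    r'(?P<flags>[A-Z_|]+)'
    r'[^)]*\)\s+= (?P<ret>-?\d+)'
)

def read_dependencies(trace_lines):
    """Paths that were successfully opened read-only, sorted and de-duplicated."""
    found = set()
    for line in trace_lines:
        m = OPEN_RE.match(line)
        if m is None:
            continue
        if int(m.group("ret")) < 0:
            continue  # the open failed, e.g. ENOENT
        if "O_RDONLY" not in m.group("flags").split("|"):
            continue  # opened for writing: an output, not an input
        found.add(m.group("path"))
    return sorted(found)

# Illustrative log lines (assumed format, not real output).
SAMPLE_TRACE = [
    '101 openat(AT_FDCWD, "src1.cpp", O_RDONLY|O_NOCTTY) = 3',
    '101 openat(AT_FDCWD, "header.h", O_RDONLY|O_NOCTTY) = 4',
    '101 openat(AT_FDCWD, "missing.h", O_RDONLY|O_NOCTTY) = -1 ENOENT (No such file or directory)',
    '102 openat(AT_FDCWD, "src1.o", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3',
]

print(read_dependencies(SAMPLE_TRACE))
```

A real trace would also list the compiler's own shared libraries and config files, which matches the point above that the toolchain is arguably a dependency too.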

> Caches are Not Portable

It should be considered first if rebuilding on each node is really that bad. Next, caches are non-portable by nature if they use different formats, or if dependencies that were used to build the artifacts are actually different on the various nodes. So this sets limits on cache sharing in principle -- but this has nothing to do with make.

> No Language Abstractions are Provided

Not going to argue against that.

> Timestamps, Not Hashes

Is there anything fundamental to make that requires the use of timestamps? Otherwise this sounds more like "nobody has bothered yet to change make to use hashes".

I've recently started to build my own static website generator (I know, I know, there are already millions of them). I really just wanted something simple that would allow me to generate a few haml pages together and deploy on my DO droplet.

I thought about Make because I remembered reading a nice article about how great it is for this kind of stuff. During my research, I also looked at Rake, which is Make in Ruby. Since I was doing my project in Ruby, I felt it would be nicer.

I really enjoyed using Rake. It's very nice and I will definitely reuse it in the future. Martin Fowler has a great article[1] about Rake and how it compares to Make.

[1] https://martinfowler.com/articles/rake.html

I haven't used make for years - but I miss the simplicity. It just does what it's supposed to do, in plain text.

GNU make has many obscure features. Don't know about other implementations.

Make is fine: https://agottem.com/weld

Stop inventing build systems.

"Makefiles are inherently evil" [0]

While I agree with this sentiment, it's hard to know whether it's more or less evil than munging imperative and declarative build instructions in an XML file ...

Has anyone ever just thought about just replacing the tabs with something else and leaving it at that?

My own experiences with Make show there's a bunch of other things that make it difficult to manage and maintain beyond a certain point.

[0] https://ant.apache.org/manual/intro.html

Exactly. The proof of the basic utility of the system is that it persists in the face of a decades-long firehose of "better" build systems being thrown around. Anyone remember Imake? Scons? (Automake is still around, but everyone hates it.) Oh sure, the fashion these days is something different and those tools are "bad", but by 2024 it'll be something different still.

Tool automation with dependency handling and parallel execution is just an inherently hard problem. No one solves it well. But make solves it cleanly. And that counts for something.

your examples are overly complicated when comparing with cmake and do not work on windows, don't produce android APKs or mac bundles, etc etc.
