
Non-Recursive Make Considered Harmful: Build Systems at Scale (2016) [pdf] - setra
https://ndmitchell.com/downloads/paper-non_recursive_make_considered_harmful-22_sep_2016.pdf
======
sitkack
One should start with "Recursive Make Considered Harmful" [0] to know where
this is coming from.

The best-architected use of make is in the FreeBSD build system [1,2]. If you
want to experience "a system", please give FreeBSD a try.

[0] [http://aegis.sourceforge.net/auug97.pdf](http://aegis.sourceforge.net/auug97.pdf)

[1] [https://www.freebsd.org/doc/handbook/makeworld.html](https://www.freebsd.org/doc/handbook/makeworld.html)

[2] [https://www.freebsd.org/doc/en/books/porters-handbook/porting-samplem.html](https://www.freebsd.org/doc/en/books/porters-handbook/porting-samplem.html)

------
g___
Another interesting paper is "Build Systems à la Carte"
[https://www.microsoft.com/en-us/research/uploads/prod/2018/03/build-systems-5ab0f42d0f937.pdf](https://www.microsoft.com/en-us/research/uploads/prod/2018/03/build-systems-5ab0f42d0f937.pdf) which
explains characteristics of some build systems "static vs dynamic
dependencies; local vs cloud; deterministic vs non-deterministic build rules;
support for early cutoff; self-tracking build systems; and the type of
persistent build information. ... We show that we can instantiate our
abstractions to describe the essence of a variety of different real-life build
systems, including Make, Shake, Bazel, and Excel, each in a dozen lines of
code or so"
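
For a flavour of what "a dozen lines of code" means: the paper's central
abstraction is, roughly (a sketch from memory; see the paper for the precise
definitions), a Task that fetches its dependencies through a callback whose
effects are constrained by a type-class parameter:

    {-# LANGUAGE RankNTypes, ConstraintKinds #-}

    -- Roughly the paper's core idea: a Task builds one value, fetching the
    -- values of its dependencies through a callback constrained by c
    -- (Applicative for Make-style static dependencies, Monad for
    -- Shake/Excel-style dynamic ones).
    newtype Task c k v = Task { run :: forall f. c f => (k -> f v) -> f v }

    -- A build description maps keys to tasks; Nothing marks a plain input.
    type Tasks c k v = k -> Maybe (Task c k v)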

~~~
tome
... which is by three of the same authors.

------
pubby
Is it really worth blaming the tool? Has reimplementing make for the 50th time
really improved things?

The fact is, building software that requires 500 dependencies and 500 sub-
steps and 500 configuration options is going to be complicated. It's
complicated in the same way that implementing an operating system is
complicated. There's no way around it. The complexity is there because it's
inherent in the problem.

But it doesn't have to be. Instead of spending 300 hours implementing Shake,
or Rake, or Bake, or Cake, or Jake, or Take, why not spend those hours cutting
down the complexity at the source? Trim your dependencies. Stop putting so
many sub-steps and configurations into your build systems. Because it's the
build systems with the 500 dependencies, the 500 sub-steps, and the 500
configuration options that are harmful, not the tools.

~~~
aseipp
> Has reimplementing make for the 50th time really improved things?

Considering the authors did so and found it to be an improvement in terms of
maintainability and usability... I'm going to say "yes". Do you think you know
more about the project than they do?

I used to work on GHC. The build system is complex. Hadrian is quite an
improvement in power and expressiveness (and is now capable of doing things we
wouldn't have been able to implement easily with Make, since extending the
prior system was too hard).

> The fact is, building software that requires 500 dependencies and 500 sub-
> steps and 500 configuration options is going to be complicated. It's
> complicated in the same way that implementing an operating system is
> complicated. There's no way around it. The complexity is there because it's
> inherent in the problem.

I get the feeling you're going to use this random truism as a springboard to
make suggestions despite the fact you've never been involved in the project?

> But it doesn't have to be. Instead of spending 300 hours implementing Shake,
> or Rake, or Bake, or Cake, or Jake, or Take, why not spend those hours
> cutting down the complexity at the source? Trim your dependencies. Stop
> putting so many sub-steps and configurations into your build systems. Is
> that the sane way to do things?

That would be nice if everyone had endless time and everything was always done
exactly perfectly up front. It would also be nice if you could work completely
on your own and never have to interact with any other software in the world.

Binary tarballs, source distributions, upstream library dependencies, cross
compilation, thousands of tests, tracking all dependencies _correctly_ (this
one alone is ridiculously hard), autogeneration tools (to avoid errors in
tricky parts). Feature detection at compile _and_ runtime (because your users
work on some old CentOS machine and no, `pthread_setname` is not available),
profiling builds, running documentation generators, handling out-of-source
builds, handling relocatable builds. I can just keep listing things, honestly.
All of these -- more or less -- come back to your build system.
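
To make the feature-detection point concrete, here is a rough sketch of what
the compile-time half looks like as a Shake action -- the probe file name and
the exact check are hypothetical, and this is not GHC's actual build code:

    import Development.Shake
    import System.Exit (ExitCode(..))

    -- Hypothetical autoconf-style probe: try to compile and link a tiny
    -- program that calls pthread_setname_np, and report whether it worked.
    hasPthreadSetname :: Action Bool
    hasPthreadSetname = do
        writeFile' "probe_setname.c" $ unlines
            [ "#define _GNU_SOURCE"
            , "#include <pthread.h>"
            , "int main(void) { pthread_setname_np(pthread_self(), \"x\"); return 0; }"
            ]
        Exit code <- cmd "cc probe_setname.c -pthread -o probe_setname"
        return (code == ExitSuccess)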

In fact, GHC goes _quite_ out of its way to expressly use as few non-Haskell
dependencies as possible. Why? Because the ones it already has are often
burdensome and complex, and we have to pick up the slack for them for every
user. Nobody using your project cares if Sphinx or their rube-goldberg Python
installation (spread over 20 places in /usr) was the reason doc building
failed; your build failed, that's all that matters. You've still got to figure
out what's wrong, though, for your user. And not wanting new dependencies has
been a common reason to reject things -- I myself have rejected proposals and
"features" to GHC on this basis alone, more or less. ("Just use libuv!" was a
common one that sounded good on paper and never addressed any actual issues we
had that it claimed to 'solve'.)

As a side note, it really amazes me how many people see any amount of
non-trivial work in some project and immediately ask
"well, why don't you just do <random thing that is completely out of context
and has no basis in the project's reality>". Seriously, any time you think of
this stuff, please -- just give it like, 10 more seconds of thought? You'd be
surprised at what you might think up, what you might think is possible. It's
not the worst thing in the job, but being an OSS maintainer and having to deal
with analyses that are, more or less, quite divorced from the reality of the
project is... irritating.

~~~
geezerjay
> Considering the authors did so and find it to be an improvement in terms of
> maintainability and usability... I'm going to say "yes". Do you think you
> know more about the project than they do?

Every single self-described make replacement project makes the exact same
claim, verbatim. Yet, when these projects start to see some use in the real
world... cue all the design shortcomings and maintainability and usability
problems.

We're about 4 decades into this game. Perhaps this time everything is
different. Who knows. Odds aren't good, though.

~~~
scott_s
I think that being the build system for the Glasgow Haskell Compiler - which
is the most commonly used compiler for Haskell - counts as "some use in the
real world." I downloaded the source, did `git ls-files | xargs wc -l >
wc.out`, then `grep "total" wc.out`, summed the totals, and it comes to
1051451. That's an overestimate in lines of code, as there's certainly
documentation in there, but there's about 620,000 lines of Haskell.

~~~
geezerjay
Great, someone managed to get a build system to work for a project. That's
nice. I'm sure there are a bunch of cases where even hand-written makefiles
are being used to the same effect. Does that mean that any of those tools are
free from any design issue or maintainability problem?

~~~
scott_s
That's a good question! I think a good way to figure that out is to publish a
paper about what they did, how it solved their non-trivial problem, and then
invite others to try to use their tool and techniques to solve their problem.

In other words, you're asking a question that criticizes the mechanism that
would answer that question. It's a legitimate question, but a poor reason to
disregard what they did.

------
ioquatix
Functional build systems are a great idea. The main problem I had implementing
such a system was the file system. It's tough to capture all inputs and
outputs, even with the best of intentions.

The other problem is integration. You can't expect all 3rd party projects to
suddenly adopt your build system, so eventually you have to invoke
`configure`, `make`, `cmake`, `pkg-config`, `xcode`, and so on. While you can
satisfactorily capture most of these inputs and outputs, it's non-trivial to
do it completely; at some point you have something that works and it's good
enough.
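
For example, a minimal Shake sketch (paths hypothetical) of that "good enough"
approach: declare the coarse inputs you know about and treat the third-party
build as a black box:

    import Development.Shake

    -- Hypothetical sketch: wrap a vendored autotools package, declaring only
    -- the inputs we know about and shelling out for the rest.
    main :: IO ()
    main = shakeArgs shakeOptions $ do
        want ["vendor/zlib/libz.a"]
        "vendor/zlib/libz.a" %> \_out -> do
            need ["vendor/zlib/configure", "vendor/zlib/Makefile.in"]
            cmd_ (Cwd "vendor/zlib") Shell "./configure && make libz.a"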

~~~
tathougies
I mean... Nix and Guix build entire systems off of the idea of capturing every
single dependency in the build system, and they work in production.

~~~
IshKebab
Do they even capture things like the C++ headers that your compiler uses? I
think that's what he meant. When you compile a C++ file the compiler goes off
and reads all kinds of files that are totally unknown to your build system.

Simpler languages like Go don't really have this problem, but they also have
sane build systems so I don't know why you'd need CMake or whatever.

~~~
klodolph
This is a long-ago solved problem, and the solutions are only getting better.
GCC has the -M family of options which are explicitly designed to feed the
header dependencies into the build system, and these flags have been around
since the dark ages. More modern systems like Bazel will sandbox the build so
if you try to #include a file that's not listed as a dependency you'll just
get a compiler error. If you want, you can specify the compiler itself as a
dependency.
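
For comparison, the Shake side of that -M handshake is short; a sketch along
the lines of the Shake manual (paths are hypothetical):

    import Development.Shake
    import Development.Shake.FilePath
    import Development.Shake.Util

    -- Ask gcc to emit a dependency file next to the object, then feed the
    -- discovered headers back into the build graph.
    main :: IO ()
    main = shakeArgs shakeOptions $
        "_build//*.o" %> \out -> do
            let src = dropDirectory1 (out -<.> "c")
                dep = out -<.> "m"
            cmd_ "gcc -c" [src] "-o" [out] "-MMD -MF" [dep]
            needMakefileDependencies dep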

Hermetic builds are the way to go for a ton of reasons.

~~~
JdeBP
It is a solved problem, but your compiler does not in fact implement an
adequate solution with that mechanism.

* [https://news.ycombinator.com/item?id=15044438](https://news.ycombinator.com/item?id=15044438)

* [https://news.ycombinator.com/item?id=15060146](https://news.ycombinator.com/item?id=15060146)

------
taeric

        While we have demonstrated that our approach works, 
        we have not yet implemented all features of the build
        system, and hope to do so over the next few months
    

This is a pretty major caveat. Almost damning in its significance, honestly.
There are plenty of "works, but haven't quite implemented all of the old
features" projects littering the world. I love that there are learnings here,
and those should be seen as the most important artifact of any project. I do
wish there were paths to get those learnings back to the old systems, though.
:(

~~~
mkesper
Table 1 in the paper looks promising, though. Let's hope their estimation is
correct:

 _We implemented a new build system for GHC from scratch using Shake and our
build abstractions from §5. The new build system does not yet implement the
full functionality of the old build system, but we are currently addressing
remaining limitations; nothing presents any new challenges or requires changes
to the build infrastructure._

~~~
mitchty
Well, it's in place now for GHC 8.6; ref:
[https://ghc.haskell.org/trac/ghc/wiki/Status/Apr18](https://ghc.haskell.org/trac/ghc/wiki/Status/Apr18)

[https://github.com/ghc/hadrian](https://github.com/ghc/hadrian)

It's mostly complete, and as someone who has used GHC's make system and likes
make, this build system is miles better and isn't a Cthulhian horror.

------
gbacon
_To validate our claims, we have completely re-implemented GHC’s build system,
for the fifth and final time._

Famous last words.

~~~
mannykannot
Build, version control and package management: three problems perennially in
search of a definitive solution.

For those who are sure we already have a definitive solution to one or more of
these, the problem is in persuading everyone else, especially those who think
something else is it.

~~~
hedora
Here are multiple examples of “definitive” build systems, by your definition.
Many are over a decade old, and haven’t been standing still. These days, they
generalize to many, many languages, and are working towards byte-for-byte
reproducibility of the build artifacts:

[https://reproducible-builds.org/who/](https://reproducible-builds.org/who/)

~~~
jpfed
So this is a dumb question, but isn't the question of reproducible builds made
trivial by vendoring dependencies? Why is it treated like this holy grail when
it should be dirt simple? I must be misunderstanding something.

~~~
heavenlyhash
It's not a dumb question, but unfortunately, we live in a dumb^H^H^H^H
interesting world.

Check out all the links under this heading: [https://reproducible-builds.org/docs/#achieve-deterministic-builds](https://reproducible-builds.org/docs/#achieve-deterministic-builds)

Many legacy systems capture all sorts of nondeterministic values -- from build
date (it might be a desire to be "helpful", but breaks reproducibility) to
accidentally depending on the order of inodes on your file system!

All of these problems are solvable. It _should_ be dirt simple. It's just a
"small matter of programming" :)

------
flossball
Meh.

Linux and BSD build systems deal with most of these issues, usually with wide
support for a variety of recursive makes. Though RPM and DEB honestly suck and
never really tried to solve issues automatically. It still drives me nuts that
packages are tainted by the 'gold' systems they are built on. The complexity of
build systems means very few minds are up for it, and most solutions are naive
and end up with tons of patchy exceptions and workarounds.

ROCK Linux supported cross-compiler capabilities, auto-detection of build
parameters, and dependency library tracking. (I was working on automated
dependency ordering and QEMU-based full cross builds before I got a real
job.) It was very robust, and outside of package developers breaking their own
builds, it worked solidly. No idea what cool things T2 Linux got up to after
ROCK, but maintaining a fresh build system is hard. Build systems are always
going to be fragile systems with complexity. The paper seems to be a survey of
what they learned vs. definitely having any solution.

------
cfv
I'd like to issue a blanket ban on the "considered harmful" thing, as well as
sharing print-optimized PDFs on the web for people on computers to read.

Both things are archaisms that can easily be avoided, and in the particular
case of this article, the second part of the title works just as well on its
own.

~~~
mannykannot
I am not generally in favor of bans, but if it came to that, I would first
like to see a ban on complaints about the style and format of interesting
material made freely available by the people who have already put considerable
effort into creating it.

~~~
cfv
Style and format are important, though. Would you like your docs written in
haiku, or as a single three-hour light opera video?

Embracing an archaic grandstanding pose and an academic look when much more
easily grokkable formats are available just doesn't help get your message
across that well. And when there's one of these roughly monthly, I'd personally
love to at least see fewer of them.

------
sleepychu
Dropping in with my make horror story. I once had the pleasure of interacting
with a 20k-line makefile (it builds any one of 15 or so projects for any
platform, because "splitting up the makefile would lead to code duplication
since a lot of the makefiles would be similar"). I'm told that makefile is over
40k lines today, only a few years later.

~~~
coldtea
> _" splitting up the makefile would lead to code duplication since a lot of
> the makefiles would be similar"_

So, they've never heard of templating?

------
ixxie
Sorta reinventing the wheel?

[https://nixos.org/nix/](https://nixos.org/nix/)

~~~
na85
One could say the same thing about nearly the entire nodejs ecosystem.
Reinventing the wheel, and usually with lower quality to boot.

Sometimes it's okay to make something with redundant functionality.

------
foxhop
My colleague @ejholmes wrote a cool tool that borrows heavily from `make`
called `walk`
([https://github.com/ejholmes/walk](https://github.com/ejholmes/walk)). It's
written in Go and uses a graph for dependencies, so that tasks at the same
horizontal level in the graph may be run at the same time. We use this at
remind.com to significantly speed up our multi-layer AMI build times.
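
(Not walk's actual code, but a rough sketch of the idea in Haskell: split the
dependency graph into levels, then run each level's tasks concurrently.)

    import Control.Concurrent.Async (mapConcurrently_)
    import qualified Data.Map as M
    import qualified Data.Set as S

    -- Task name mapped to the set of task names it depends on.
    type Graph = M.Map String (S.Set String)

    -- Group tasks into levels: a task is ready once all of its deps are done.
    levels :: Graph -> [[String]]
    levels g = go S.empty (M.keys g)
      where
        go _    []      = []
        go done pending
          | null ready  = error "dependency cycle"
          | otherwise   = ready : go (S.union done (S.fromList ready)) rest
          where
            ready = [t | t <- pending, (g M.! t) `S.isSubsetOf` done]
            rest  = filter (`notElem` ready) pending

    -- Run the action for every task, level by level, each level in parallel.
    runGraph :: Graph -> (String -> IO ()) -> IO ()
    runGraph g act = mapM_ (mapConcurrently_ act) (levels g)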

------
sqldba
I've only used make a very little bit... but pretty much anyone can understand
it for basic use.

This looks like you're writing source code in another language that follows a
kind of template, which you then compile and run, and which then does stuff and
is extremely complicated.

Seems like a failure to me? Shouldn't something "better" be equally simple or
simpler?

~~~
fpoling
Example from the paper of the Makefile code used by GHC:

    $1/$2/build/%.$$($3_osuf) : \
            $1/$4/%.hs $$(LAX_DEPS_FOLLOW) \
            $$$$($1_$2_HC_DEP) $$($1_$2_PKGDATA_DEP)
        $$(call cmd,$1_$2_HC) $$($1_$2_$3_ALL_HC_OPTS) \
            -c $$< -o $$@ \
            $$(if $$(findstring YES,$$($1_$2_DYNAMIC_TOO)), \
                -dyno $$(addsuffix .$$(dyn_osuf),$$(basename $$@)) )
        $$(call ohi-sanity-check,$1,$2,$3,$1/$2/build/$$*)

I suspect it's easier to master Haskell than this.

~~~
taeric
This just feels like a bloody straw man, though. Poorly written file in X not
as good as well written file in Y. News at 11. :)

Substitute "Clever" for "Poorly", if you'd prefer.

~~~
tome
Would you like to show us a well written version of that rule?

~~~
taeric
First, I probably couldn't even do a well written version of the rule in any
language. :) Not exactly my strength.

Second, sometimes the best trick you can do to make a build system cleaner is
to change the system it is building.

Though, as I stated elsewhere in this chain, I don't necessarily mean to
dismiss this effort. Just showing me the gymnastics required to do something
that most people just don't care to do doesn't persuade me that the context is
ill-suited. (Fun metaphor, actually. The fact that my house isn't designed to
allow easy gymnastics practice in the living room is not a criticism of the
house or of gymnastics. Showing me that a cartwheel will get you hurt there is
not really showing me anything relevant.)

------
hedora
It is interesting that none of their examples of why make is considered
harmful involve writing an idiomatic makefile.

In fact, all of their examples of how bad make is involve the use of non-
standard makefile generators, or non-standard extensions to make itself.

It is sort of like arguing that JavaScript is harmful, and then showing
snippets of asm.js code to talk about how “tragic” the language syntax is.

Alternatively, I could argue that no one should use C because C++ template
metaprogramming is too opaque, and, as further proof, my use of a non-standard
preprocessor I implemented leads to tens of thousands of lines of deeply nested
macro invocations.

(There are all sorts of problems with make, but I’m not convinced the authors
actually understand them in enough detail to improve on it.)

------
jschwartzi
This looks like a good stab at avoiding all the pitfalls of Make while still
providing the same or better capability.

------
Ceezy
It looks great, but most make-like tools are not functional. So yeah,
functional everything is better, but nothing is functional right now.

~~~
colanderman
Make itself is largely a functional/declarative language. Besides the recipes
(which aren't Make but actually shell, though you can use Haskell if you
want), the only non-functional feature I can think of is the global variables,
which most makefiles generally don't modify in a non-lexical manner anyway.

~~~
Ceezy
I was talking about tools like Rake in Ruby and SCons in Python. They are not
declarative at all. But most important is that if what you are building needs
you to set env variables etc., I don't see how it could really be done in a
functional way (without side effects).

~~~
smaddox
Stick the ENV in a state Monad, and run the process with that ENV as the
context?
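
Something like this minimal sketch (library choices are mine, nothing Rake- or
Shake-specific): carry the environment in a StateT and hand it to each spawned
process explicitly, so the global environment is never mutated.

    import qualified Data.Map as M
    import Control.Monad.State
    import System.Process (createProcess, proc, waitForProcess, CreateProcess(..))

    type Env    = M.Map String String
    type BuildM = StateT Env IO

    -- "Setting" a variable is a pure update of the carried environment.
    setVar :: String -> String -> BuildM ()
    setVar k v = modify (M.insert k v)

    -- Spawn a tool with exactly the carried environment.
    runTool :: FilePath -> [String] -> BuildM ()
    runTool exe args = do
        e <- get
        (_, _, _, ph) <- liftIO $
            createProcess (proc exe args) { env = Just (M.toList e) }
        _ <- liftIO $ waitForProcess ph
        pure ()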

~~~
Ceezy
There are no real monads in Ruby. That's why I feel like the article is
missing the point.

