
Falsehoods programmers believe about build systems (2012) - jnxx
https://pozorvlak.dreamwidth.org/174323.html
======
com2kid
Since 2012 we have, as a field, learned better.

You see, we all realized that just building code wasn't enough! What your 100
line JavaScript microservice really needs to be accompanied by is a build file
that determines how to create an entire Linux VM. Also, because scale, your
project also needs to know how to scale that VM out across multiple clusters!

The only answer of course is more build systems!

Oh but JavaScript sucks, so make that Typescript, so shove a transpiler in
front of your scripting language. Because somehow we've gone down the path of
needing build systems for interpreted languages.

I use all the above tech and get why it is there, but... seriously as a field
this is the best we can do? :/

(I legit like Babel and TS, but you have to admit it is /weird/ that we're
going through the complexity of having a build system but are reaping none of
the performance benefits!)

~~~
baddox
From my perspective, the main reason for most browser JS projects having a
build system is to essentially concatenate modules and to transpile newer JS
features so they work on older browsers. That’s really just not as crazy as
you make it sound. I don’t know about the Linux VM stuff being part of a
browser JS project, maybe I’m misunderstanding that part but it does sound
crazy.

~~~
ori_b
> I don’t know about the Linux VM stuff being part of a browser JS project,
> maybe I’m misunderstanding that part but it does sound crazy.

A bunch of stuff I see comes with a dockerfile these days.

While I get building in docker for reproducibility, kind of -- shipping with
one is a pretty big red flag. It means that it's likely to be very painful to
port the code out of docker and integrate it into some other build. Which
means that it'll be painful to integrate it with other code. Leave the build
environment set up to the users. Embrace build diversity, it makes your code
less fragile.

~~~
core-questions
> It means that it's likely to be very painful to port the code out of docker
> and integrate it into some other build.

No it wouldn't. Dockerfiles are stupid simple. A bunch of ADD/COPY commands to
add files, and a bunch of RUN commands that are basically wrapping shell
scripts.

Plus, find me a modern build system (CI/CD system, rather) that doesn't
natively support Docker. It's become a lingua franca at this point.

I just wish it was faster on its own without me having to do all the legwork
of maintaining lower layer containers etc. to avoid lengthy package install
times.

~~~
com2kid
I had a problem yesterday where I had so many docker images lying around from
builds that starting a new container pegged my CPU at 100% and slowed
everything in a docker container to a crawl.

A colleague predicted the problem and I learned about one more bit of random
fragility that exists in today's ecosystem.

~~~
rumanator
> I had a problem yesterday where I had so many docker images lying around
> from builds

$ docker system prune

It makes as much sense to complain about Docker because you let images
accumulate in your system as it would make to criticize file systems because
you never emptied your system's trash bin.

~~~
com2kid
My file system doesn't peg my CPU when it gets 10% full.

Why the hell should the very existence of unused docker images cause the
single image I am using to peg multiple cores?

And no warning was given. Heck if my hard drive gets close enough to full that
it's gonna be a problem my OS tells me.

------
pwm
Anyone curious about the core abstractions and capabilities of build systems
should read the amazing Build Systems à la Carte paper:

[https://www.microsoft.com/en-us/research/uploads/prod/2018/03/build-systems.pdf](https://www.microsoft.com/en-us/research/uploads/prod/2018/03/build-systems.pdf)

Here is a teaser, the 1st table:

    
    
      Build system | Persistent build information | Scheduler   | Dependencies | Minimal | Cutoff | Cloud
      Make         | File modification times      | Topological | Static       | Yes     | No     | No
      Excel        | Dirty cells, calc chain      | Restarting  | Dynamic      | No      | No     | No
      Shake        | Previous dependency graph    | Suspending  | Dynamic      | Yes     | Yes    | No
      Bazel        | Cloud cache, command history | Restarting  | Dynamic(∗)   | No      | Yes    | Yes
    

which shows that Excel can be viewed as a dynamic build system!
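
One rough way to read that last claim, as a sketch and not the paper's actual formulation: a spreadsheet cell is a task that asks for other cells while it runs, so its dependencies are only known during evaluation. The names `build`, `fetch`, and the cell layout below are made up for illustration, and the sketch does no cycle detection.

```python
from typing import Callable, Dict, Union

Value = int
Fetch = Callable[[str], Value]
# A cell is either a constant input or a formula that asks for other cells via fetch.
Cell = Union[Value, Callable[[Fetch], Value]]


def build(cells: Dict[str, Cell], target: str) -> Dict[str, Value]:
    """Evaluate `target`, computing only the cells it actually asks for."""
    done: Dict[str, Value] = {}

    def fetch(name: str) -> Value:
        if name not in done:
            cell = cells[name]
            # Formulas discover their dependencies dynamically by calling fetch.
            done[name] = cell(fetch) if callable(cell) else cell
        return done[name]

    fetch(target)
    return done


sheet: Dict[str, Cell] = {
    "A1": 10,
    "A2": 20,
    "C1": 1,
    # B1 depends on A1 or A2 depending on the *value* of C1.
    "B1": lambda fetch: fetch("A1") if fetch("C1") == 1 else fetch("A2"),
    "B2": lambda fetch: fetch("B1") * 2,
}

result = build(sheet, "B2")
```

Here `A2` is never evaluated, because with `C1 == 1` nothing asks for it. A static dependency list would have had to include it anyway.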

------
sdenton4
> "Programmers don't want a system for writing build scripts; they want a
> system for writing systems that write build scripts."

It is a truth universally acknowledged that a shitty scripting language in
possession of a large user base will be in search of a new scripting language
that automatically generates the shitty scripting language.

But then you have two problems.

~~~
marcosdumay
That's a good observation.

Any deficiency in a complex, widely used and difficult to change development
tool will be fixed by adding an extra layer over it.

If the deficiency is something like complexity or lack of reliability that can
not be fixed by adding extra layers, read this comment again.

~~~
hoskdoug
Great example is Fastlane abstracting over the impenetrable iOS command line
tools.

------
klodolph
> Build graphs are acyclic.

While yes, this is a falsehood... in practice, you generally want to figure
out a way to break cycles and remove them from the build system. In general,
make the user responsible for breaking cycles in the build system.

The only major use case for cycles that I can think of is self-hosting
compilers.
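
A minimal sketch of "make the user responsible for breaking cycles", assuming Python 3.9+ for the standard-library `graphlib`. The function name `build_order` and the target names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def build_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return targets in an order where every dependency comes first.

    Raises ValueError naming the cycle, leaving it to the user to break it.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as err:
        # err.args[1] is the list of nodes that form the cycle.
        cycle = " -> ".join(err.args[1])
        raise ValueError(f"dependency cycle, please break it: {cycle}") from err


# Each target maps to the set of things it depends on.
acyclic = {"app": {"lib.o", "main.o"}, "lib.o": {"lib.c"}, "main.o": {"main.c"}}
order = build_order(acyclic)

# A self-hosting compiler: the binary is built from the compiler, which needs the binary.
cyclic = {"compiler": {"compiler.src", "compiler-bin"}, "compiler-bin": {"compiler"}}
```

Calling `build_order(cyclic)` raises instead of guessing where to cut the loop. For a compiler, the usual way out is to add a prebuilt bootstrap binary as an explicit input.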

I would add some additional falsehoods:

\- It is always the right thing to rebuild files when the inputs have changed.

\- The host and target systems are the same (you are never cross-compiling).

\- The host and target systems are different (you are always cross-compiling).

\- You can reliably figure out the version number of whatever you are
building.

\- The build steps always produce the correct outputs if they do not fail.

\- You always want to build third-party dependencies from source.

\- You never want to build third-party dependencies from source.

\- The input files do not change while the build is running.

\- All of the inputs are files (and not, say, the current timestamp).

\- The entire build should use the same build settings (optimization / debug),
you don't want to do a partial-debug, partial-optimized build.

\- You don't want to rebuild when e.g. files in /usr/include change.

\- You always want to rebuild when e.g. files in /usr/include change.

\- Installing additional software will not break the build.

\- Running the same build twice will result in the same output.

\- Build products do not contain absolute paths.

\- You can move an executable from one path to another and it will still work.

\- You can delete object files and everything will still work.
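
Two of these interact: if the current timestamp is an input, running the same build twice will not give the same output. A minimal sketch, assuming a hypothetical build step `stamp_artifact` that embeds a build time. `SOURCE_DATE_EPOCH` is the environment-variable convention from the Reproducible Builds project; every other name here is made up.

```python
import hashlib
import os
import time


def build_time() -> int:
    """Use the pinned SOURCE_DATE_EPOCH if set, else the wall clock (a hidden input)."""
    pinned = os.environ.get("SOURCE_DATE_EPOCH")
    return int(pinned) if pinned is not None else int(time.time())


def stamp_artifact(source: bytes) -> bytes:
    """Hypothetical build step: prepend a build-time header to the source bytes."""
    header = f"built-at: {build_time()}\n".encode()
    return header + source


def digest(artifact: bytes) -> str:
    """SHA-256 of the artifact, used to compare two build outputs."""
    return hashlib.sha256(artifact).hexdigest()


# With the timestamp pinned, two builds of the same source are byte-identical.
os.environ["SOURCE_DATE_EPOCH"] = "1595108276"
first = digest(stamp_artifact(b"int main(void) { return 0; }\n"))
second = digest(stamp_artifact(b"int main(void) { return 0; }\n"))
```

Without the pinned variable, the two digests differ whenever the builds run in different seconds.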

~~~
jnxx
> > Build graphs are acyclic.

Normal C code, not that I can think of. But it is not difficult to give a
real-world example:

A LaTeX document with a table of contents and an index has a cyclic
dependency: The final document, including its pagination, depends on the table
of contents, and the table of contents depends on the final document.

~~~
amelius
> A LaTeX document with a table of contents and an index has a cyclic
> dependency

One of the uglier problems of LaTeX, which people of the likes of Leslie
Lamport and Donald Knuth should have been able to solve, one might think.

~~~
CJefferson
I think this is an impossible problem to fix -- inserting page numbers into a
document can change formatting enough to cause things to move to a different
page.

There are LaTeX documents which never reach a fixed point (I've made one by
accident exactly once), and need slight adjustment.
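
A contrived toy model of that situation, not real LaTeX behaviour: rerun a formatting pass until its output stops changing, and give up after a fixed number of passes. All names (`run_until_stable`, `typeset`, `typeset_arabic`) are hypothetical, and the page rule is invented so that roman page numbers oscillate.

```python
from typing import Callable, Tuple, TypeVar

S = TypeVar("S")

ROMAN = ["i", "ii", "iii", "iv", "v", "vi", "vii", "viii", "ix", "x"]


def run_until_stable(step: Callable[[S], S], state: S, max_passes: int = 10) -> Tuple[S, bool]:
    """Rerun `step` until its output equals its input, or give up after max_passes."""
    for _ in range(max_passes):
        new_state = step(state)
        if new_state == state:
            return state, True
        state = new_state
    return state, False


def typeset(ref_page: int) -> int:
    """Toy formatter with roman page numbers: a wide reference ("viii") pushes the
    labelled text onto page 9, a narrow one ("ix") lets it fall back onto page 8."""
    return 9 if len(ROMAN[ref_page - 1]) >= 4 else 8


def typeset_arabic(ref_page: int) -> int:
    """Same toy with arabic numbers: a two-digit reference pushes the text to page 10."""
    return 10 if len(str(ref_page)) >= 2 else 9


oscillating = run_until_stable(typeset, 1)
converging = run_until_stable(typeset_arabic, 1)
```

The roman version bounces between pages 8 and 9 forever, so the loop reports failure. The arabic version settles on page 9 after two passes.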

~~~
amelius
Then "iteratively computing a fixed point by simply rerunning the formatter
from scratch" is not a solution to the problem.

Actually I think you can do it iteratively by never decreasing the space
occupied by an element between iterations. Of course this might yield ugly
results but at least it halts with a solution.
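
A sketch of that idea on a made-up toy, assuming a single page reference whose printed width decides which page the labelled text lands on. The names `layout` and `settle` and the page rule are hypothetical. The only point is that the reserved width can grow but never shrink, so the loop must stop.

```python
from typing import Tuple

ROMAN = ["i", "ii", "iii", "iv", "v", "vi", "vii", "viii", "ix", "x"]


def layout(reserved_width: int) -> int:
    """Toy formatter: a reference box 4+ characters wide pushes the label to page 9."""
    return 9 if reserved_width >= 4 else 8


def settle() -> Tuple[int, str, int]:
    """Iterate, but never shrink the box reserved for the page reference."""
    width = 0
    while True:
        page = layout(width)
        needed = len(ROMAN[page - 1])
        new_width = max(width, needed)  # monotone: space may grow, never shrink
        if new_width == width:
            return page, ROMAN[page - 1], width
        width = new_width


page, printed_ref, box_width = settle()
```

The result prints "ix" inside a box four characters wide. It is uglier than necessary, but it is consistent and the loop halts, because the width is a non-decreasing integer bounded by the longest numeral.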

------
yongjik
IMHO the greatest falsehood I believed about build systems was:

\- If build system X is used, people will read its documentation and write
build rules according to the documentation.

No, people will just copy what's already there. And if it pulls in 35 external
dependencies, compiles 500 unnecessary files, and fails 25% of the time because
of race conditions between external dependencies, people will add "Note: if
build fails, run again."

Sigh.

~~~
dcow
And then there's `make`. I've seen a philosophically correct Makefile maybe 3
times in my life. Most of the time I see people making every target .PHONY and
using patterns to save keystrokes rather than match multiple files.

You'd tend to think people learn over time but no. It's gotten worse. Right
when you thought the industry was on the cusp of a build revolution the go
community decides to push an old tool that most people can't utilize even 3%
of when paired with `go build`. I kinda blame the go ecosystem for pushing
Makefile adoption when they already have a caching build tool just so the
maintainers didn't have to give the compiler a polished UI. Simplicity at the
cost of functionality even if it's simply broken amirite?

Sigh.

~~~
ancarda
I've never seen Go projects use Make? When did this change happen?

~~~
dcow
The golang project itself is a perfect example:
[https://github.com/golang/go/blob/master/src/Make.dist](https://github.com/golang/go/blob/master/src/Make.dist)

------
ex_amazon_sde
In 2020 we can add additional falsehoods based on my experience at Amazon:

\- It's OK to embed/vendor dependencies

\- It's OK to hardcode VCS URLs for your dependencies

\- you can pull random stuff from the Internet at compile time and large
companies will accept it

\- It's OK to release libraries without version numbering or using git
commitish

\- Everybody uses containers

\- Nobody cares about reproducible builds

\- Nobody cares about security, license compliance and being able to rebuild
things reliably

\- All FAANGS will use your project even if it cannot be packaged into OS
packages

~~~
Apofis
This happens everywhere... some platforms became so complicated to do initial
setup with that Docker became a necessity instead of an install script.

It now takes me weeks to set up a fresh machine to do local dev on. I guess I
should make a docker image of my dev box.

    
    
        cd devbox
        docker-compose up
    

...would be nice, now that I think about it.

~~~
ex_amazon_sde
I suspect you missed the point about _falsehoods_.

------
eddieh
If the build system isn't first a DSL for specifying a DAG, then it fails at
being at least as good as Make. Maybe "build graphs are acyclic" is a
falsehood, but I would never want to use a build system that isn't
fundamentally constructing a DAG. There are plenty of imperative build systems
(or build systems that people use imperatively) and they're all horrible.

------
morelisp
This "falsehoods programmers believe" is fundamentally different from the more
classic ones (e.g. "about names") because its topic is also wholly controlled
by programmers. That means in the end it doesn't share much with those
documents, but can instead also be characterized as "just" a list of bugs in
other programs.

If a program doesn't handle my name, that's a problem with the program. My
name is my name; the program doesn't decide that. Depending on the context
some government might; the program also generally isn't supposed to decide for
the government. Often in practice the program _controls_ both of us; this is
the core complaint of "falsehoods" articles, that a program artifact is
overstepping its ethical/legal/otherwise extra-computational authority.

But there's no moral demand that e.g. a program exit with 0 status even when
it's failed, or that build trees should be cyclic. These are tools _by
programmers and for programmers concerning programs_ and when we run into
issues it's usually because we chose a poor modeling tool (e.g. a build system
should not be ideal to calculate the fixed point of a data processing
pipeline, independent of any cycles) or because of a flat-out mistake (there's
no reason a compiler can't be given some output directive other than inventor
oversight).

Other "falsehoods" are criticism of the software encoding the falsehood, but I
read probably 90% of the complaints here as equal or greater failings on the
tools invoked. Not only for the sake of my build system do I want DAG
pipelines, updated timestamps, or meaningful exit codes - if anything these
are the demands I wish to place on other software for the sake of my brain!
More like "falsehoods compilers believe about their primacy in program
generation."

Or as one of the comments suggests,

> build systems are a symptom of software languages that are not designed to
> build software systems

------
bigdict
> It is accepted by all decent people that Make sucks and needs to die

falsehood #1

~~~
fmajid
Amen.

The worst are crackpot systems like ant. Whoever thought a half-baked DSL with
a XML concrete syntax was a good idea?

~~~
raverbashing
> ant.

> Whoever thought a half-baked DSL with a XML concrete syntax was a good idea?

You kinda answered your own question there no?

------
saurik
> stat is slow on modern filesystems.

Is this not true?!? While working on my build times I benchmarked that
something like a quarter of my time in clang was due to my having large
numbers of sparse -I folders as the stat calls were so expensive (this was a
12" MacBook running macOS 10.14 on, I assume, APFS).

------
rainbow-road
Most of the listed assumptions are bad, but some of them, to me, seem
essential. In particular, dealing with cyclic build graphs sounds like an
absolute _nightmare_ , and I doubt that any build system that allows for such
a thing can be easy-to-use.

------
haolez
I tend to use whichever build system is mainstream in the environment I'm in.

However, if I'm dealing with something special, or if the build system is
unstable, I always fall back to Make. Such a beautiful tool!

Tup is also worth mentioning: [http://gittup.org/tup/](http://gittup.org/tup/)

------
dang
If curious see also

2018
[https://news.ycombinator.com/item?id=16196899](https://news.ycombinator.com/item?id=16196899)

------
sabujp
this is from 2012 and some of the points haven't aged well (see blaze/bazel)

~~~
carlmr
Which ones and why? Is blaze/bazel free of all these falsehoods? Did it manage
to make new mistakes?

~~~
sabujp
I'm no blaze/bazel expert (hopefully someone will add their own thoughts),
but:

    
    
        1. build graphs are trees/acyclic : bazel enforces this, no cyclic dependencies allowed
        2. Compilers will always modify the timestamps on every file they are expected to output : if a write happens why not update mtime? nanosecond precision, what year is it?
        3. It's possible to tell the compiler which file to write its output to : makes no sense to me, we expect our compilers to be writing to the correct files in the correct directories.
        4. It's possible to predict in advance which files the compiler will update : yes it is, this is how caching works again see (1)
        5. It's possible to determine the dependencies of a target without building it : yes yes it's completely possible given (1)
        6. Detecting changes via file hashes is never the right thing : Bazel i think uses a combination of hashes and stamps, but why would using hashes not work 99.99999999999999998% of the time?
        7. Non-experts can reliably write portable shell script : lol, experts can't reliably write portable shell scripts either
        8. Your build tool is a great opportunity to invent a whole new language: or just coopt pieces of one (python: starlark, skylark)
        9. Said language does not need to be a full-featured programming language : yes, it doesn't (again .bzl files)
       10. Said language should be based on textual expansion : not sure what the alternative to textual expansion (parsing?) is
       11. Adding an Nth layer of textual expansion will fix the problems of the preceding N-1 layers : again how is this different from parsing

~~~
tzs
> 2\. Compilers will always modify the timestamps on every file they are
> expected to output : if a write happens why not update mtime? nanosecond
> precision, what year is it?

It has nanosecond precision, but on most systems it does not have nanosecond
resolution.

    
    
      $ touch a1; sleep .0001; touch a2; stat a1 a2 | grep Modify:
      Modify: 2020-07-18 21:37:56.421487075 +0000
      Modify: 2020-07-18 21:37:56.421487075 +0000
    
      $ touch a1; sleep .0001; touch a2; stat a1 a2 | grep Modify:
      Modify: 2020-07-18 21:37:57.397502229 +0000
      Modify: 2020-07-18 21:37:57.397502229 +0000
    
      $ touch a1; sleep .0001; touch a2; stat a1 a2 | grep Modify:
      Modify: 2020-07-18 21:37:58.273515840 +0000
      Modify: 2020-07-18 21:37:58.277515902 +0000
    
      $ touch a1; sleep .0001; touch a2; stat a1 a2 | grep Modify:
      Modify: 2020-07-18 21:41:00.736348218 +0000
      Modify: 2020-07-18 21:41:00.736348218 +0000
    
      $ touch a1; sleep .0001; touch a2; stat a1 a2 | grep Modify:
      Modify: 2020-07-18 21:41:01.708363288 +0000
      Modify: 2020-07-18 21:41:01.708363288 +0000
    
      $ touch a1; sleep .0001; touch a2; stat a1 a2 | grep Modify:
      Modify: 2020-07-18 21:41:02.504375653 +0000
      Modify: 2020-07-18 21:41:02.508375715 +0000
    

Note that the difference is either 0 or 0.004000062. This is on a Debian 9
system running on an Amazon Lightsail instance with an ext4 filesystem.

Here's a quick and dirty script to play with this:

    
    
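      # Usage: bash t.sh SLEEP_SECONDS
      # Touch two files SLEEP_SECONDS apart and print the mtime difference
      # whenever the two timestamps differ; stop after 100 differing pairs.
      C=0   # pairs whose mtimes differed
      R=0   # total runs
      while :
      do
        R=$((R+1))
        touch a1; sleep "$1"; touch a2
        # Keep only the seconds field (with its fractional part) of the Modify line.
        T1=$(stat a1 | grep Modify | sed -e 's/.*://' | sed -e 's/ .*//')
        T2=$(stat a2 | grep Modify | sed -e 's/.*://' | sed -e 's/ .*//')
        if [ "$T1" != "$T2" ]; then
            dc <<HERE
      20k $T2 $T1 - p
      HERE
            C=$((C+1))
            if [ "$C" -eq 100 ]; then
                echo "$R runs"
                exit 0
            fi
        fi
      done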
    

Output on the above system:

    
    
      $ bash t.sh 0.001 | sort | uniq -c
            4 .004000061
           83 .004000062
           13 .004000063
            1 222 runs
    

and on a VMWare Fusion VM running Debian 8 on my Mac:

    
    
      $ bash t.sh 0.001 | sort | uniq -c
           29 .003999599
           44 .003999600
           27 .003999601
            1 323 runs
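
For anyone who would rather poke at this from Python, here is a rough equivalent of the script above. It is a sketch, not a benchmark: the function name `mtime_deltas` is made up, it reads `st_mtime_ns` instead of parsing `stat` output, and it measures whatever filesystem the temporary directory happens to live on, which may not be the one you build on.

```python
import tempfile
import time
from collections import Counter
from pathlib import Path
from typing import List


def mtime_deltas(pairs: int = 100, gap_seconds: float = 0.001) -> List[int]:
    """Touch two files `gap_seconds` apart and return the mtime differences in ns."""
    deltas: List[int] = []
    with tempfile.TemporaryDirectory() as tmp:
        a1, a2 = Path(tmp, "a1"), Path(tmp, "a2")
        for _ in range(pairs):
            a1.touch()
            time.sleep(gap_seconds)
            a2.touch()
            deltas.append(a2.stat().st_mtime_ns - a1.stat().st_mtime_ns)
    return deltas


if __name__ == "__main__":
    # Histogram of observed differences; a coarse clock shows up as many zeros
    # plus a few jumps of one timer tick.
    for delta, count in sorted(Counter(mtime_deltas()).items()):
        print(f"{count:5d}  {delta} ns")
```

If the timestamps had true nanosecond resolution, every difference would be at least as large as the sleep.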

~~~
sabujp
ahh yea thanks for the reminder, i think bazel also hashes all inputs and
outputs

------
jpollock
Build systems are tools just the same as an editor. In the same way that
editors are opinionated about keymappings, build systems are opinionated about
things like timestamps vs hashing.

If you're using a tool, you really should know how it works and where it will
break. You should also choose the tool based on the problem you're trying to
solve. If you've got a build system that rebuilds when /usr/include changes,
and you're trying to do cross-compiles, choose another build system.
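
To make the timestamps vs hashing trade-off concrete, here is a minimal sketch of the two staleness checks. It does not describe how any particular build system implements them; `stale_by_mtime` and `stale_by_hash` are hypothetical names, and the mtimes are set by hand with `os.utime` so the result does not depend on clock resolution.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def stale_by_mtime(source: Path, target: Path) -> bool:
    """Make-style rule: rebuild if the target is missing or older than the source."""
    return not target.exists() or target.stat().st_mtime_ns < source.stat().st_mtime_ns


def stale_by_hash(source: Path, recorded_digest: str) -> bool:
    """Hash-style rule: rebuild only if the source bytes differ from the last build."""
    return hashlib.sha256(source.read_bytes()).hexdigest() != recorded_digest


def set_mtime(path: Path, seconds: int) -> None:
    """Pin atime and mtime to an explicit time, in whole seconds since the epoch."""
    os.utime(path, ns=(seconds * 10**9, seconds * 10**9))


BASE = 1_600_000_000  # arbitrary fixed point in time

with tempfile.TemporaryDirectory() as tmp:
    src, out = Path(tmp, "a.c"), Path(tmp, "a.o")
    src.write_text("int x;\n")
    out.write_text("object code\n")
    digest = hashlib.sha256(src.read_bytes()).hexdigest()
    set_mtime(src, BASE)
    set_mtime(out, BASE + 10)  # target newer than source: up to date

    # Touch without changing content: mtime says rebuild, hash says no.
    set_mtime(src, BASE + 20)
    touched = (stale_by_mtime(src, out), stale_by_hash(src, digest))

    # Change content but restore an old mtime: mtime misses it, hash catches it.
    src.write_text("int y;\n")
    set_mtime(src, BASE)
    edited = (stale_by_mtime(src, out), stale_by_hash(src, digest))
```

Each check has a case it gets wrong from the other's point of view. Which of the two matters more depends on the problem, which is the point about choosing the tool to fit it.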

~~~
raverbashing
Timestamps vs hashes shouldn't be even a thing. Why does this even matter?

Sure ok maybe you touched a file and the timestamp wasn't updated (it should,
and your tool is making things harder on you). Or if you're going by hash,
same hash, same file. Period. (Unless you're doing something funny with file
metadata, and then again, you're shooting yourself in the foot)

Same with some comments on the original article saying "tool doesn't return
correct status codes for success/failure" well, maybe you need tools that work
then

Most of the advice in the article is sound though

------
ancarda
>I want to see an end to Make in my lifetime

Why? What's wrong with Make?

~~~
asveikau
Make is actually pretty amazing for its simplicity, longevity, how it gets so
much of the job done without doing very much. Yes there are pitfalls and
caveats, good uses and bad, but the balance reached between simplicity and
ability to be productive and get you most of the way is amazing.

I feel that when I hear somebody say they don't like make, usually they
misunderstand this, or they think somebody's bad experiences with makefiles are
somehow representative of the capability of the tool.

One person created make in a weekend in 1976. We are still talking about it.
Most humans and most software will never achieve this.

~~~
eddieh
Exactly this! Make is the worst build system, except for all the others.

------
somewhereoutth
Source ---[Big Red Button]---> Artifacts

How did this go wrong??

~~~
marcosdumay
I think it started with developers not wanting to write all of the source and
artifact files into the makefile.

Then there were some portability issues that should have been handled by the
compiler, but by that time there were already metabuilders, so nobody
bothered.

------
scoot_718
Some of those aren't believed assumptions, but asserted ones. For instance, I
would think a build system that asserts that builds are acyclic is good.
People who want a cyclic build are unimportant and can go elsewhere, while I
enjoy my simpler build system because of it.

Simplifying assumptions are good.

------
trishankkarthik
YAML IS NOT AN EFFING PROGRAMMING LANGUAGE, EVEN IN 2020

~~~
dang
_Please don't use uppercase for emphasis. If you want to emphasize a word or
phrase, put asterisks around it and it will get italicized._

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html).

~~~
tdons
yaml deserves all caps

~~~
dang
If we take the union of all allcaps-deservers across the set of all users with
such feelings, half the site will run through an upcase function.

As in any mutually assured destruction scenario, we must all restrain
ourselves.

~~~
trishankkarthik
dang, I was clearly joking...

~~~
dang
scott_s wrote the canonical comment about that in 2014:

[https://news.ycombinator.com/item?id=7609289](https://news.ycombinator.com/item?id=7609289)

[https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...](https://hn.algolia.com/?dateRange=all&page=0&prefix=true&query=7609289&sort=byDate&type=comment)

~~~
trishankkarthik
If you guys can't even take a "bad" joke, gods help you all. Feel free to ban
me.

