1. You can only specify that a single recipe creates multiple output files (for instance, an output file and a separate index file) if the rule uses wildcards (pattern matching).
2. Temporary file handling is completely broken. You can declare a file to be temporary, so that make deletes it after all the jobs that use it have finished. However, make also deletes such files at other, seemingly arbitrary times (for instance, if a command fails), and at yet other times fails to delete them at all.
3. There is a complete inability to specify resource handling - for instance, I want to mark that this recipe is single-threaded, but that one uses all available CPU cores, and have make schedule an appropriate number of jobs.
4. If you want to have crash-recovery, then you need to make your recipes generate the output files under a different name and then do an atomic move-into-place afterwards. Manually. On every single recipe.
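To make that concrete, every recipe ends up looking something like this (a sketch; the tool and file names are invented):

result.bam: reads.fastq
    align-tool reads.fastq > result.bam.tmp   # long-running command writes to a temporary name
    mv result.bam.tmp result.bam              # rename is atomic on the same filesystem, so a crash never leaves a half-written result.bam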
These reasons (and others) are why I gave up on make for bioinformatics processing and wrote a replacement. I'll release and publish it at some point.
I know make can seem a little baroque, but this is just wrong.
1. Multiple targets for a single recipe:
file.a file.b file.c: dep.foo dep.bar
...
This says that the recipe makes all of file.a, file.b and file.c in one go.
2. Make definitely doesn't randomly delete files. It deletes intermediate files.
Make by default knows how to build a lot of standard things like object files for C programs, yacc and bison output, etc. These are handled by implicit rules, and files produced along the way by such rules are treated as intermediate files to be deleted. You can override the defaults or add your own pattern rules like this:
%.foo: %.bar
...
If you want to use pattern matching for non-implicit targets so they don't get deleted, you can do that too:
a.foo b.foo c.foo: %.foo: %.bar
...
The list before the first colon says which targets the pattern-matched rule applies to, and it shouldn't contain wildcards. These targets won't get deleted. (A fuller sketch follows after this list.)
3. This seems like a misunderstanding of make's basic role. Make just spawns shells when running a recipe; like bash, it shouldn't need to know how many threads you're using to run an arbitrary command. If you want make to build targets in parallel whenever possible, look at the `-j` option. If you want a certain build recipe to run multi-threaded, use the proper tool for the recipe.
4. Not sure what you mean by crash recovery, but considering the above, I'm pretty sure you might just be fighting make unnecessarily.
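To make item 2 concrete, filled-in versions of those two forms might look like this (a sketch; `bar2foo` is a made-up converter, and `$<` / `$@` are make's automatic variables for the first prerequisite and the target):

# pattern rule: any .foo can be built from the matching .bar;
# files built this way as links in a chain of rules count as intermediates
%.foo: %.bar
    bar2foo $< > $@

# static pattern rule: applies only to the listed targets,
# which are explicit targets and therefore never deleted as intermediates
a.foo b.foo c.foo: %.foo: %.bar
    bar2foo $< > $@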
Honestly, try reading the info manual. It's kind of massive and daunting, but the intro material is really accessible, and taken in pieces, you can easily learn to become friends with this venerable tool.
1. That doesn't do what you think it does. From the manual: "A rule with multiple targets is equivalent to writing many rules, each with one target, and all identical aside from that." It does not mean one rule that creates multiple targets. To achieve that, you need to use wildcards. For some reason, when using wildcards, the syntax is interpreted differently. (See the sketch after this list.)
2. Say I have a rule to create an intermediate file "b" from original file "a", and another rule to create file "c" (which I want to keep) from "b". If there is an error running the command that creates "c", make will happily delete the intermediate "b" (which in my case took 27 hours to create), even though it knows the final "c" wasn't created properly. This means that when I rerun make (having fixed the problem), that 27-hour process has to run again, which is a waste of my time.
3. I want to say "make -j 64" on my 64-thread server, and not have 64 64-thread processes start. But I also do want 64 single-threaded processes to run when possible.
4. By crash recovery, I mean that by default a process will start creating the target file. If someone yanks the power, that target file will be present, with a recent modified time, but incomplete. Make will assume the file was created fine, so when I rerun make it will try to perform the next step, which may take 10 hours to fail. I want make to notice that the command did not successfully complete, and restart it from scratch.
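(For reference on point 1: the wildcard form that GNU make does treat as one recipe producing several outputs is a pattern rule with more than one target pattern, roughly like the sketch below; the tool name is invented.)

%.aligned %.index: %.fastq
    align-and-index $*.fastq   # one invocation is expected to write both foo.aligned and foo.index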
For #2, .PRECIOUS doesn't help me. From the make manual: "Also, if the target is an intermediate file, it will not be deleted after it is no longer needed, as is normally done." This means that my intermediate files will never be deleted by make, even when everything that is built from them has been completed.
For #3, no I think I know how to read ps. I don't want 64 64-thread processes running on my 64-thread server, because that is hell for an OS scheduler, and makes things run slower, not faster.
For #2, you could always make a dependency that removes your intermediates for you after your final use. You can't be mad at make because it deletes intermediates and because it doesn't delete intermediates. Make isn't psychic.
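A sketch of that suggestion, reusing the a/b/c names from above (it assumes c's own rule handles building it from b):

.PHONY: finished
finished: c
    rm -f b    # drop the intermediate only after the final target c has been built

Running "make finished" builds c through the normal rules and then removes b.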
For #3, I didn't mean to come across as pedantic. I haven't encountered what you're describing, but I have personally been surprised by how Linux does process accounting, so I apologize; I just figured you were being bitten by the same thing.
I like make a lot, but I don't use it for everything, because sometimes there simply are better tools for the task, and I hope you were able to figure out a solution.
I agree with all your criticisms, except I'm a bit confused about #3. Are you saying you're using Make with multi-threaded build actions?
As far as I know, most compilers are single-threaded, so this isn't much of an issue in practice. But I'm curious where you've encountered this problem.
No, I was using make to process large files for bioinformatics. So, think 60GB (compressed) of sequencing data from a whole genome sequencing run, which comes as a set of ~800,000,000 individually sequenced short stretches of DNA in two files. A multi-threaded process converts that into a file containing the sequences and where they align in the human genome, and takes about a day. Once that job has been finished, other jobs can be kicked off to use the produced data. Overall, the build process is a DAG with several hundred individual jobs, and performing that in a make-like system helps it to be managed effectively. Just not make itself.
All his criticisms are correct except maybe #3 which I don't understand.
Another problem I've found is that Make doesn't consider the absence of a prerequisite to mean the target is out of date. So if foo.html depends on foo.intermediate, and you delete foo.intermediate and then run "make foo.html", foo.html will be considered up to date. I guess this is part of the odd feature where Make deletes intermediates, but even if you have .SECONDARY on, which I do, it still behaves this way.
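A minimal setup that reproduces it (the gen-* commands are placeholders):

foo.intermediate: foo.source
    gen-intermediate foo.source > foo.intermediate

foo.html: foo.intermediate
    gen-html foo.intermediate > foo.html

.SECONDARY:

# build once, then: rm foo.intermediate; make foo.html
# make reports foo.html is up to date instead of rebuilding foo.intermediate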
The bottom line is that it's extraordinarily easy to write incorrect Makefiles -- either underspecifying or overspecifying dependencies -- and it's very difficult to debug those problems. My Makefile is still full of bugs, so I "make clean" when something goes wrong.
One thing that would go a long way is if it had a shorthand for referring to the prerequisites in commands, like $P1 $P2 $P3, and if it actually enforced that you use those in the command lines! I don't want to create variables for every single file, and when I rename files, rules can grow invisible bugs easily.
The biggest "weakness" is that make can seem confusing at first. I strongly recommend reading the make info page. It's pretty huge with a lot of material, but the intro stuff is really accessible.
I would avoid learning make hodge-podge from StackOverflow as that will just frustrate you. If you take the info page in pieces and are a little methodical about it, you will probably end up liking make!
I was thinking of the multiple build outputs issue. The fact that someone gave the "obvious but wrong" [1] solution as an answer only underscores this problem.
Make is full of cases where the obvious thing is wrong. That is not a good UI!
As a conceptual summary, I would say that the problems stem from a couple underlying causes:
1) The execution model of Make is confused. It is sort of "functional" but sort of not. To debug it sometimes you have to "step through" the procedural logic of Make, rather than reasoning about inputs and outputs like a functional program. I mentioned this here [2].
2) You want to specify the correct build graph, and Make offers you virtually no help in doing so. An incorrect graph is one where you underspecify or overspecify your dependencies. Underspecifying means you do "make clean" all the time because the build might be wrong. Overspecifying means your builds are slow because things rebuild that shouldn't rebuild.
In practice, Makefiles are full of bugs like this. In fact I should have mentioned that my Oil makefile is FULL OF bugs. Making it truly correct is hard because some dependencies are dynamic and hard to express (i.e. the gcc -M problem). But I just "make clean" for now.
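(For the record, the usual partial fix for the gcc -M problem is to have the compiler emit dependency fragments as a side effect and include them on the next run, something like the sketch below; it only covers header dependencies.)

%.o: %.c
    gcc -MMD -MP -c $< -o $@   # also writes a .d file listing the headers $@ depends on

-include $(wildcard *.d)       # pull in whatever .d files exist; don't complain if none do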
The Google build system Bazel [3] is very principled about these things, but I don't think it makes sense for most open source projects because it's pretty heavy and makes a lot of assumptions. It works well within Google though.
It does some simple things like check that your build action actually produces the things it said it would! Make does not do this! It can run build actions in a sandbox, to prevent them from using prerequisites that aren't declared. And it has better concepts of build variants, better caching, etc.
All these things are really helpful for specifying a correct build graph (and actually trivial to implement).
3) Another thing I thought of: Make works on timestamps of file system entries, but timestamps in Unix mean totally different things for files and directories! You can depend on a directory and that has no coherent meaning that I can think of. Conversely it's hard to depend on a directory tree of files whose names aren't known in advance.
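(The usual workaround for the directory-tree case is a stamp file: the recipe populates the directory and then touches a marker whose timestamp stands in for the whole tree, e.g. the sketch below with a made-up populate command.)

out/.stamp: inputs.txt
    mkdir -p out
    populate-out out inputs.txt   # writes an unknown set of files into out/
    touch $@

report.html: out/.stamp
    ...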
4) Both Make and Bazel essentially assume the build graph is static, when it is often dynamic. (gcc -M again, but I also encountered it with Oil's Python dependencies) The "Shake" build system apparently does something clever here.