
Mtime comparison considered harmful - panic
https://apenwarr.ca/log/20181113
======
evmar
In Ninja I sorta stumbled through some of the same issues described here. I
eventually realized that the interesting question is "does this output file
reflect the state of all the inputs" and not anything in particular about
mtimes, and that "inputs" includes not only the contents of the input files,
but also the executables and command lines used to produce the output.

If you squint, mtime/inode etc. behave like a weak content signature of the
input. And once you have that perspective, you say "if mtime != mtime I had
last time, rebuild", without caring about their relative values, and that
sidesteps a lot of clock skew related issues. It does "the wrong thing" if
someone intentionally pushes timestamps to a point in the past (e.g. when
switching branches to an older branch) as an attempt to game such a system,
but playing games with mtime is not the right approach for such a thing,
totally hermetic builds are.
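
A minimal sketch of that "compare against what I saw last time" test in C
(hypothetical code, not lifted from Ninja): the recorded stat metadata is
treated as an opaque signature, and any change triggers a rebuild.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

struct sig {                  /* recorded after the previous build */
    ino_t ino;
    off_t size;
    struct timespec mtim;
};

bool needs_rebuild(const char *path, const struct sig *last)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return true;          /* input is missing: rebuild */
    return st.st_ino != last->ino ||
           st.st_size != last->size ||
           st.st_mtim.tv_sec  != last->mtim.tv_sec ||
           st.st_mtim.tv_nsec != last->mtim.tv_nsec;   /* !=, never > */
}
```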

One nice trick is that you can even capture all the "inputs" with a single
checksum that combines all the files/command lines/etc., and that easily
transitions between truly looking at file content or just file metadata. The
one downside is that when the build system decides to rebuild something, it's
hard to tell the user why -- you end up just saying "some input changed
somewhere".

~~~
chubot
Is there a description of Ninja's algorithm anywhere? I looked at the manual
[1] and didn't quite see it.

Does Ninja use a database like sqlite? It seems like it has to if it does
something better than Make's use of mtimes. (e.g. the command line, which Make
doesn't consider.)

I looked at redo (linked in the article) and it uses sqlite to store the extra
metadata.

[1] [https://ninja-build.org/manual.html](https://ninja-build.org/manual.html)

~~~
evmar
No, sorry. And I also mixed what Ninja actually does with some random
observations in that comment.

Ninja does use some database-like things, but they are just in a simple
text/binary format. It's actually been long enough that I have forgotten the
details.

[https://ninja-build.org/manual.html#ref_log](https://ninja-build.org/manual.html#ref_log)
(contains a hash of the commands used) /
[https://ninja-build.org/manual.html#_deps](https://ninja-build.org/manual.html#_deps)
(a database-like thing with some mtimes, see
[https://github.com/ninja-build/ninja/blob/master/src/deps_log.h#L29](https://github.com/ninja-build/ninja/blob/master/src/deps_log.h#L29))

~~~
apenwarr
This reminds me, I should study ninja's binary format and maybe borrow it :)

The sqlite3 database used in redo was just something I threw together in the
first few minutes. sqlite was always massive overkill for the problem space,
but because it never caused any problems, it's been hard to justify working on
it. I'd like to port redo from python to C, though, and then the relative size
of depending on a whole database will matter a lot more.

~~~
evmar
Someone rewrote ninja from scratch in C at one point and it's shockingly tiny.
(No tests = no extra abstractions to make testing possible.)

~~~
chubot
I found samurai and it is indeed tiny! ~3,400 lines of *.[ch] is less than I
was expecting!

[https://github.com/michaelforney/samurai](https://github.com/michaelforney/samurai)

Ninja isn't too big though. It looks like about 13K lines of non-test code,
which is great for a mature and popular project. Punting build logic to a
higher layer seems to have been a big win :)

------
chungy
The "popular misconceptions" section seems to have a couple of the author's
own misconceptions.

* On precision, he notes "almost no filesystems provide that kind of precision" (nanoseconds), but I would honestly say the exact opposite: ext4, xfs, btrfs, and ZFS are some very common file systems that support it. He cites that his ext4 system has only 10ms granularity, which is most certainly not the default, but likely a result of upgrading from ext2/ext3 to ext4 (a small probe for checking what your own system records is sketched after this list). As an aside, NTFS has a granularity of 100ns.

* It is unclear what he means by "If your system clock jumps from one time to another...". If this is talking about NTP, it's probably accurate. My first reading was "daylight saving" or time zone changes, in which case, everyone uses UTC internally and such changes don't affect the actual mtimes. (You might get strange cases where a file listing regards a file modified at 01:45 to be older than a file modified at 01:20, but if you display in UTC, you can see it's just DST nonsense)
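
For anyone curious which side of this disagreement their own machine falls
on, here is a hypothetical little probe (not from the article): create a
file and print the nanoseconds field of its mtime, a few times.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    const char *path = "granularity-probe.tmp";
    struct stat st;
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);

    if (fd < 0 || write(fd, "x", 1) != 1 || close(fd) != 0)
        return 1;
    if (stat(path, &st) != 0)
        return 1;
    /* Consistently zero across runs suggests 1s timestamps; values
     * quantized to multiples of 1000000 suggest ~1ms; and so on. */
    printf("st_mtim.tv_nsec = %ld\n", (long)st.st_mtim.tv_nsec);
    unlink(path);
    return 0;
}
```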

~~~
delroth
You're looking at the 90th percentile; he's looking at the 99th. Claiming
things like "everyone uses UTC internally" is obviously wrong when many
people just a few years ago were still setting up systems in localtime when
dual-booting with Windows. Upgrading to ext4 is also not a rare edge case.

~~~
chungy
Upgrading to ext4 is not extremely rare, but the recommended procedure
involves mkfs and copying files over anyway:
[https://ext4.wiki.kernel.org/index.php/UpgradeToExt4](https://ext4.wiki.kernel.org/index.php/UpgradeToExt4)

In-place upgrades do have the potential to leave some non-default options on
the final ext4 file system, such as 128-byte inodes instead of the 256-byte
ones, which is where limitations like reduced timestamp granularity come in.

~~~
apenwarr
In any case, my system was installed from scratch quite recently using native
ext4. As others have pointed out and as I linked in the article, it’s likely a
kernel issue. I assume many people have the same thing.

------
peff
> the .git/index file, which uses mmap, is synced incorrectly by file sync
> tools relying on mtime

This part implies that the index file is written via mmap, but that's not
true. It is fully rewritten to a new tempfile/lockfile, and then atomically
renamed into place.

Git does not ever mmap with anything but PROT_READ, because not all supported
platforms can do writes (in particular, the compat fallback just pread()s into
a heap buffer).
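
For readers unfamiliar with the pattern, here is a rough sketch of
tempfile-plus-atomic-rename (illustrative only; git's real lockfile code
handles far more). Readers always see either the complete old file or the
complete new one, never a partial write.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int write_atomically(const char *path, const char *data, size_t len)
{
    char tmp[4096];
    int fd;

    snprintf(tmp, sizeof tmp, "%s.lock", path);   /* temp/lock file */
    fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, data, len) != (ssize_t)len ||   /* full new content */
        fsync(fd) != 0) {                         /* flush before rename */
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0 || rename(tmp, path) != 0) {  /* atomic replace */
        unlink(tmp);
        return -1;
    }
    return 0;
}
```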

------
qznc
From this article I learned that build systems don't face a fundamental
either/or choice between mtime and checksum. Instead, a better solution is
mtime plus a bunch of other things. The article explains the faults of both
mtime and checksums clearly.

This insight makes me want to try redo.

One thing I dislike about redo is that it probably does not work well on
Windows. Has anybody ever tried?

Redo also makes me wonder: is a build directory just a habit from using Make,
or is it a flaw of redo not to support one well? By "build directory" I mean
the concept where the build process generates files in a separate directory
which can simply be deleted, so nothing pollutes the source directory.

~~~
thristian
> _One thing I dislike about redo is that it probably does not work well on
> Windows. Has anybody ever tried?_

Yes, although only for a toy project. I'm sure it'd work fine with Cygwin or
WSL, and it might work with MSYS, but I've definitely had it working with
busybox-w32[1].

> _Is a build directory just a habit from using Make or is it a flaw of redo
> to not support that well?_

You can use redo in a Make-like fashion by putting a single `default.do` in
the root of your project that decides what to do by examining the filename
it's been asked to build. That does give up some of the benefits of redo,
however (since a single file builds everything, when you edit that file redo
wants to rebuild everything).

Having a separate build directory that can be easily wiped is a good idea, but
I'm a lot less worried about it (or things like 'make clean') now that I have
'git clean -dxf'.

[1]: [https://frippery.org/busybox/](https://frippery.org/busybox/)

------
contras1970
[http://jdebp.eu/FGA/introduction-to-redo.html](http://jdebp.eu/FGA/introduction-to-redo.html)

Hoping jdebp will chime in...

~~~
qznc
I have never before seen this idea of putting CXXFLAGS into a file and
treating it as a file dependency. That would also work with Make. Clever idea.

Nevertheless, a build system which does this implicitly is better, imho.

~~~
apenwarr
Perhaps unsurprisingly, djb seems to have pioneered this with his Makefiles,
which generally produce, then depend on and run, a 'compile' script that
contains the flags.

------
oever
The article suggests writing explicit rules to check for changes in the
toolchain.

These dependencies can be recorded automatically with LD_PRELOAD, which can
redefine functions such as fopen and let you record which files are read
when running e.g. cc.

This makes it feasible to record the entire relevant state of a system, files
and environment variables, at build time.
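
A bare-bones sketch of such an interposer (illustrative only; a real tracing
tool does far more, and this one catches nothing but fopen, which is exactly
the weakness raised in the reply below). Build with `cc -shared -fPIC -o
deptrace.so deptrace.c -ldl` and run with `LD_PRELOAD=./deptrace.so cc -c foo.c`.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

/* Interpose fopen(3): log the path to stderr, then forward the call
 * to the real libc implementation found via RTLD_NEXT. */
FILE *fopen(const char *path, const char *mode)
{
    static FILE *(*real_fopen)(const char *, const char *);
    if (!real_fopen)
        real_fopen = (FILE *(*)(const char *, const char *))
                     dlsym(RTLD_NEXT, "fopen");
    fprintf(stderr, "dep: %s\n", path);
    return real_fopen(path, mode);
}
```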

An argument in favor of checksums is the use of build caches. Switching
between branches on large codebases triggers a lot of rebuilds. With a build
cache, that can be avoided. SCons is a build system that uses such a cache.

~~~
jake_the_third
Depending on LD_PRELOAD is extremely fragile and finicky.

Not only can a process sidestep libc entirely by making the `open`(2) syscall
directly, but there are often many ways of combining function calls to
achieve the same outcome. This method will also fail completely on systems
that have new, previously unknown functions that are not monitored by the
LD_PRELOAD solution.

Worst of all, an LD_PRELOAD solution would not cover operations that are done
on behalf of the target program by external programs via IPC (think system
daemons and dbus), at least not without intercepting and interpreting all
I/O that the target does.

In short, it doesn't scale.

------
saagarjha
> Random side note: on MacOS, the kernel does know all the filenames of a
> hardlink, because hardlinks are secretly implemented as fancy symlink-like
> data structures. You normally don't see any symptoms of this except that
> hardlinks are suspiciously slow on MacOS. But in exchange for the slowness,
> the kernel actually can look up all filenames of a hardlink if it wants. I
> think this has something to do with Aliases and finding .app files even if
> they move around, or something.

I don’t think this is true anymore with APFS.

------
gumby
The checksum issue could be addressed by having the compiler generate the
checksums in a "sidecar" file and having the build system depend on _those_.

Obviously this faces the "not every toolchain will support this" problem, but
you could have a switch to use checksums and continue to use the older
approach by default.

You would have to call cksum instead of touch in your Makefile.
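
A hypothetical sketch of the sidecar idea (any real checksum would do in
place of FNV-1a; it just keeps the sketch dependency-free): only rewrite the
sidecar when the checksum actually differs, so even a consumer that watches
the sidecar's mtime fires only on real content changes.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint64_t checksum_file(const char *path)
{
    uint64_t h = 14695981039346656037ULL;     /* FNV-1a */
    FILE *f = fopen(path, "rb");
    int c;

    if (!f)
        return 0;
    while ((c = fgetc(f)) != EOF)
        h = (h ^ (unsigned char)c) * 1099511628211ULL;
    fclose(f);
    return h;
}

/* Rewrite "<path>.sum" only when the checksum changed. */
int update_sidecar(const char *path)
{
    char side[4096], prev[32] = "", cur[32];
    FILE *f;

    snprintf(side, sizeof side, "%s.sum", path);
    snprintf(cur, sizeof cur, "%016llx",
             (unsigned long long)checksum_file(path));

    f = fopen(side, "r");
    if (f) {
        if (!fgets(prev, sizeof prev, f))
            prev[0] = '\0';
        fclose(f);
    }
    if (strncmp(prev, cur, 16) == 0)
        return 0;                 /* unchanged: leave the sidecar alone */

    f = fopen(side, "w");
    if (!f)
        return -1;
    fprintf(f, "%s\n", cur);
    fclose(f);
    return 0;
}
```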

With a modules system (like C++20 is trying to embrace) you could in theory
generate a vector of checksums (with subranges in the file) for each file and
only recompile portions of the file that needed it. It's rather an accident of
history that we use the granularity of a file at all.

------
esotericn
Why not just use a complete checksum of the file?

The average project is what, a few MB? Less? Most of which is going to be
cached after the first compile anyway.

Even on an enormous codebase, as long as you have an SSD or a nontrivial
amount of RAM I can't see this being an issue.

You don't care if a file is newer - you care if it's different!

~~~
ams6110
This is discussed near the end of the article. It works best when the file
system itself stores a checksum in its metadata, so it doesn't have to be
recalculated for every file on every build. It's not appropriate when your
build includes dependencies based on side effects other than file content.
For example, sometimes you depend on the timestamp of an empty file, or on
the success or failure of another step (signaled by e.g. a log message), to
trigger other actions.

~~~
esotericn
Is it common to build over NFS? I can't immediately see a use case -
collaborative editing or something? Even in that case, wouldn't it be easier
to build on the box?

The other case seems valid in a sort of "if your build intentionally makes
use of mtime, you'll need to look at mtime" way. It seems like an odd thing
to do in the first place - I guess Makefile as deployment rather than for
building?

A 470K LoC project I have here with >1000 files takes 0.04 seconds to do a
full sha256sum traversal on my box from the cache. That's single-threaded.

If I drop caches, it takes approximately 1 second (from spinning rust, not
SSD).

~~~
erik_seaberg
NFS is getting so rare that some systems aren't even organized to accommodate
"mount -o ro /usr" anymore.

~~~
ams6110
NFS is widely used in HPC to mount user home directories on compute nodes.

------
zzo38computer
Those look like some good ideas, because currently I just use mtime-based
checking (for programs with multiple files; many of my programs are only one
file and so don't need to deal with stuff like that).

------
eecc
Ugh, these "* considered harmful" blog posts...

the only message I read is that someone wants to believe they're as smart as
Dijkstra. /s

