

Memoize – a replacement for make relying on strace - aartur
https://github.com/kgaughan/memoize.py

======
pbiggar
I played around with this a few years back when looking at build tools. strace
is ungodly slow, and I wouldn't recommend using this for anything real. When I
looked into why, it's basically the time it takes to print and read from
strace - printing to stdout every time there is a syscall is obviously very
slow.

Instead you can use tup, though someone mentioned it now requires Fuse, which
obviously isn't good. So instead you can use an LD_PRELOAD to record all the
file accesses, which is what I think tup used to do.

There is also some filesystem thing added to git that lets you introspect file
access, though I don't recall the details.

That said, I think memoize is the first use of this technique (record actual
file accesses dynamically for dependency tracking), which is utterly genius. I
worked with Bill McCloskey who wrote it, and he is an incredibly smart dude.
He did a lot of work on making Firefox's GC better over the last few years.

~~~
majke
Isn't it sufficient to just track open() and stat(). Maybe access()? That
shouldn't be too slow.
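
A rough sketch of what that narrower trace could look like, in Python. Everything here is illustrative rather than how memoize.py actually does it: the helper names `strace_command` and `accessed_paths` are made up, the syscall list is just the one suggested above (some architectures lack `open`, `stat` or `access`, so it may need adjusting), and the parser only handles simple one-line strace records.

```python
import re

# Syscalls to watch; this list is an assumption taken from the comment above.
TRACED = ("open", "openat", "stat", "access")

LINE_RE = re.compile(
    r'^(?:\d+\s+)?'                 # optional pid prefix written by `strace -f -o`
    r'(?P<call>\w+)\('              # syscall name
    r'(?:[^",]+,\s*)?'              # optional dirfd argument such as AT_FDCWD
    r'"(?P<path>(?:[^"\\]|\\.)*)"'  # first quoted string argument (left escaped)
    r'.*\)\s*=\s*(?P<ret>-?\d+)'    # return value after the closing paren
)


def strace_command(cmd, trace_file):
    """Build an strace invocation that logs only the traced syscalls to a file."""
    return ["strace", "-f", "-e", "trace=" + ",".join(TRACED),
            "-o", trace_file, *cmd]


def accessed_paths(lines):
    """Split traced paths into those that were found and those that were missing.

    Records split by strace into "<unfinished ...>" / "resumed" pairs are
    ignored by this sketch.
    """
    found, missing = set(), set()
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group("call") not in TRACED:
            continue
        target = missing if m.group("ret").startswith("-") else found
        target.add(m.group("path"))
    return found, missing
```

Writing the trace to a file with `-o` and filtering with `-e trace=` reduces how much strace has to print, though the process is still stopped at every syscall, so it may not remove the slowdown described above.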

~~~
thechao
Rename tracking is what undoes a lot of the more naive variants.

~~~
emelski
Hard links as well, which are nastier because they introduce aliases for the
things you want to track.
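
To make the aliasing problem concrete, here is a minimal Python sketch that groups paths by the file they actually refer to, using the device and inode numbers from `os.stat`. The function names are hypothetical, and this is only one possible way a tracker might normalize aliases, not something any of the tools mentioned here is known to do.

```python
import os


def file_identity(path):
    """Return (device, inode), which is the same for every hard link to a file."""
    st = os.stat(path)  # follows symlinks, so symlinked names alias too
    return (st.st_dev, st.st_ino)


def group_aliases(paths):
    """Return sorted groups of paths that refer to the same underlying file."""
    groups = {}
    for p in paths:
        groups.setdefault(file_identity(p), []).append(p)
    return [sorted(g) for g in groups.values() if len(g) > 1]
```

A tracker that keys its dependency records on the path string alone would treat two such names as unrelated files.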

------
comex
See also fabricate, a script inspired by memoize that works on Windows:

[https://code.google.com/p/fabricate/](https://code.google.com/p/fabricate/)

I tried to contribute OS X support years ago but was too lazy to spend the
time to polish it - sorry!

Neither tool supports any sort of parallelism, which caused me to eventually
give up on them. (It is possible in theory! Though in the worst case it
requires killing compiler invocations and rerunning them.)

Edit: Apparently fabricate does support parallel execution now, which is neat,
but only with explicit markers, which I expect is somewhat detrimental to the
magic feel of these tools... maybe I should try it. In the previous
parenthetical I meant automatically detecting what can be run in parallel.

~~~
yesitworks
Any serious build tool can support such a thing if needed, here for example in
Waf: [http://waf-devel.blogspot.se/2015/02/using-strace-to-obtain-...](http://waf-devel.blogspot.se/2015/02/using-strace-to-obtain-build.html)

------
dudus
This has nothing to do with memoization [1], which Python already has out of
the box through `functools.lru_cache` [2].

This is just a very bad naming choice for this library that seems to be an OK
alternative to Make.

[1]
[https://en.wikipedia.org/wiki/Memoization](https://en.wikipedia.org/wiki/Memoization)

[2]
[https://docs.python.org/3/library/functools.html#functools.l...](https://docs.python.org/3/library/functools.html#functools.lru_cache)

~~~
ianbicking
If you think of a build step as a function, taking some files as input (the
source) and returning a result (the compiled artifact), then this tool is
memoizing that function.
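
A small Python sketch of that analogy, under loud assumptions: the dependency list is passed in by hand (the real tool discovers it with strace), the cache is a plain dict, inputs are compared by SHA-256 of their contents, and `digest`, `is_up_to_date` and `run_memoized` are names invented for this example.

```python
import hashlib
import json
import subprocess


def digest(path):
    """Content hash of a file, or None if it cannot be read."""
    try:
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()
    except OSError:
        return None


def is_up_to_date(cmd, cache):
    """True if cmd ran before and none of its recorded inputs have changed."""
    entry = cache.get(json.dumps(cmd))
    if entry is None:
        return False
    return all(digest(p) == h for p, h in entry.items())


def run_memoized(cmd, deps, cache, run=subprocess.check_call):
    """Run cmd unless its inputs are unchanged; return True if it actually ran."""
    if is_up_to_date(cmd, cache):
        return False
    run(cmd)
    # Record the inputs this run depended on (assumed known here).
    cache[json.dumps(cmd)] = {p: digest(p) for p in deps}
    return True
```

The command plays the role of the function, the input files are its arguments, and the cache lookup is what lets a second call with unchanged inputs skip the work.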

~~~
scott_s
Agreed - from the name and the mention of strace, I immediately got how it
worked.

------
pdq
I believe tup [1] uses a similar technique for compile dependency tracking.

[1]
[http://gittup.org/tup/ex_dependencies.html](http://gittup.org/tup/ex_dependencies.html)

------
rlpb
If you're interested in a make replacement, check out redo (designed by djb):

[http://cr.yp.to/redo.html](http://cr.yp.to/redo.html)

[https://github.com/apenwarr/redo](https://github.com/apenwarr/redo)

------
amelius
Nice, but a problem with this approach is that strace is not "re-entrant".
That is, you can't strace a program that itself uses strace (or, more
precisely, the ptrace syscall).

------
malkia
I believe MSBuild is doing something similar (but on Windows) -
[https://github.com/Microsoft/msbuild/tree/master/src/Utiliti...](https://github.com/Microsoft/msbuild/tree/master/src/Utilities/TrackedDependencies)

~~~
garenp
MSBuild uses Tracker.exe (or Tracker.dll) which depends on Microsoft's Detours
IAT hooking library, which works much differently. Sadly it has to be
purchased separately.

~~~
malkia
Oh, is it hooking certain critical functions related to file I/O? Thanks for
the info!

------
efaref
Every few years someone tries to reinvent make, and they usually end up with
some unholy mess that complicates build systems or causes maintenance
nightmares, and is never quite as good.

It turns out that make's syntax is actually quite simple and appropriate if
you take a day or so to learn it properly. It's such a fantastic tool that if
its syntax was really that problematic, someone would have created a new
syntax front-end to GNU make by now. The fact that they haven't speaks
volumes: by the time you learn enough to be able to do it, you realise you
can't really come up with anything better.

------
imglorp
See also these tools, though memoize.py looks more flexible since it can build
anything:

    ccache - https://ccache.samba.org
    fastbuild - http://fastbuild.org

Also, for the besodden remaining ClearCase users, it has a nice feature which
seems rarely spoken of these days: clearmake and wink-ins.

~~~
garenp
ClearCase has a clearaudit command that can be used to kick off any other
arbitrary command and get dependency information (clearaudit /c <foo>). This
is much better than using clearmake, because you otherwise depend on a user's
ability to (correctly) write makefiles upfront.

