
Redo, buildroot, and serializing parallel logs - dankohn1
https://apenwarr.ca/log/20181106
======
emelski
If you're looking to serialize parallel build logs _without_ changing to an
entirely new build tool:

1\. Electric Make, a high-performance reimplementation of GNU make and ninja,
has properly serialized build output logs since it was introduced in 2002
([https://electric-cloud.com/products/electricaccelerator](https://electric-cloud.com/products/electricaccelerator))
( _disclaimer: I'm the chief architect for ElectricAccelerator, of which
Electric Make is a component_).

2\. In 2009 I published a technique on CM Crossroads that you could use with
GNU make 3.81 to descramble parallel build logs. The article has moved around
since then, but these days it seems to live at
[https://www.cmcrossroads.com/article/descrambling-parallel-build-logs](https://www.cmcrossroads.com/article/descrambling-parallel-build-logs).

3\. The maintainers of GNU make took the concept described in that article and
baked it into GNU make itself in 2013 for version 4.0
([http://git.savannah.gnu.org/cgit/make.git/tree/NEWS?h=4.0](http://git.savannah.gnu.org/cgit/make.git/tree/NEWS?h=4.0)).
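
For readers who want the gist of "descrambling" without reading the article: the core idea is to capture each job's output in full and emit it as one uninterrupted block. (GNU make 4.0 exposes its version of this as `--output-sync`, or `-O`.) Below is a minimal Python sketch of the idea, not the article's or GNU make's actual implementation; `run_buffered`, `run_all`, and the `--- name ---` header format are hypothetical names chosen for illustration.

```python
import subprocess
import sys
import threading
from concurrent.futures import ThreadPoolExecutor

# One lock shared by all jobs so that two blocks never interleave.
_print_lock = threading.Lock()


def run_buffered(name, argv, out=sys.stdout):
    """Run one job, capture all of its output, then emit it as a single block."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep stderr in sequence with stdout
        text=True,
    )
    with _print_lock:
        out.write(f"--- {name} (exit {proc.returncode}) ---\n")
        out.write(proc.stdout)
        if proc.stdout and not proc.stdout.endswith("\n"):
            out.write("\n")
        out.flush()
    return proc.returncode


def run_all(jobs, out=sys.stdout, workers=4):
    """jobs is a list of (name, argv) pairs; returns exit codes in job order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_buffered, name, argv, out) for name, argv in jobs]
        return [f.result() for f in futures]
```

Note that blocks still appear in whatever order the jobs happen to finish, and nothing is shown until a job is done.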

~~~
apenwarr
That is a good article, but it doesn’t seem to address several of the concerns
raised in my redo article. In particular, it looks like the final output is
still in a nondeterministic order, it doesn’t print output from a given target
incrementally while it runs, and you can’t query the log retroactively to look
for clues in a very big run.

It’s a big advantage to not have to change tools, of course. That said, redo
happily interoperates with makefiles (which is how the buildroot patch works),
so there’s no need to convert everything to get most of the advantages.

~~~
emelski
Correct, both the technique described in that article and the feature that
eventually wound up in GNU make only disentangle output from concurrently
executing build processes. With significantly more work GNU make could
probably be made to enforce a deterministic order.

However, Electric Make _does_ emit the output in a deterministic order, in
fact in exactly the same order as the build would have produced had it run
serially. In practice the delayed output is not such a big deal -- people just
don't really seem to care that much, when the build overall finishes 20x - 30x
faster than it used to. Electric Make also generates an _annotated_ build log,
essentially an XML-marked-up version of the log, which contains a tremendous
amount of additional information about the build and vastly simplifies
debugging, actually.
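
To make "deterministic order" concrete, here is a minimal Python sketch of the general idea: run jobs in parallel, but release each job's captured log only after every earlier job's log has been released. This is an illustration under simplified assumptions (jobs are plain callables that return their log text), not a description of how Electric Make is implemented; `build_in_serial_order` and `make_job` are hypothetical names.

```python
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def build_in_serial_order(jobs, out=sys.stdout, workers=4):
    """Run jobs concurrently, but write their logs in submission order.

    jobs is a list of zero-argument callables, each returning its log text.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(job) for job in jobs]
        # Walking the futures in submission order blocks on the earliest
        # unfinished job, so later logs wait even if they are already done.
        for future in futures:
            out.write(future.result())
            out.flush()


def make_job(name, seconds):
    """Hypothetical stand-in for a build step that takes some time."""
    def job():
        time.sleep(seconds)
        return f"built {name}\n"
    return job
```

With jobs `a`, `b`, `c` where `a` is the slowest, the log still reads a, b, c, at the cost of `b` and `c` staying hidden until `a` finishes.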

------
rhymenoceros
> I'm sure almost everyone reading this thinks I'm hopelessly pedantic to care
> so much about the sequence of lines of output in my build logs.

Not at all! In fact there are other tools in other domains that could benefit
from careful thought about how to interleave parallel outputs in a way that
favors interactivity/immediate output but preserves (effective) ordering
[1][2]. This is a tough problem that a lot of tools just give up on, but the
status quo of "buffer everything until it is completely done" grates against
one's senses.

[1]
[https://github.com/saltstack/salt/issues/22733](https://github.com/saltstack/salt/issues/22733)

[2]
[https://github.com/ansible/ansible/issues/3887](https://github.com/ansible/ansible/issues/3887)
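
One way to get both properties, sketched here in Python as an illustration only (this is not how Salt, Ansible, or redo do it): stream the output of the earliest unfinished job live, buffer everything from later jobs, and flush those buffers as each job reaches the front. `OrderedLiveLog` and its methods are hypothetical names.

```python
import sys
import threading


class OrderedLiveLog:
    """Streams the front job's lines immediately; buffers lines from later jobs."""

    def __init__(self, n_jobs, out=sys.stdout):
        self._lock = threading.Lock()
        self._out = out
        self._front = 0  # index of the job whose output goes out live
        self._pending = [[] for _ in range(n_jobs)]
        self._done = [False] * n_jobs

    def write(self, job, line):
        """Called by job number `job` for each line it produces."""
        with self._lock:
            if job == self._front:
                self._out.write(line)
            else:
                self._pending[job].append(line)

    def finish(self, job):
        """Called once when a job ends; promotes the next job(s) to the front."""
        with self._lock:
            self._done[job] = True
            while self._front < len(self._done) and self._done[self._front]:
                self._front += 1
                if self._front < len(self._done):
                    # The new front job may already have buffered output.
                    self._out.writelines(self._pending[self._front])
                    self._pending[self._front].clear()
```

Each worker calls `write(i, line)` as output arrives and `finish(i)` when it exits. The result reads as if the jobs ran one after another, while the job at the front is never delayed.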

~~~
apenwarr
Wow, those are quite fascinating use cases I hadn't considered. I'm actually
curious about things like ansible; what exactly does it do that you couldn't
just do with a bunch of redo scripts? Or alternatively, what if you ignored
ansible's parallelism and just ran ansible once per host, from redo?

~~~
rhymenoceros
A lot of devops tools, including Ansible and Salt, have higher-level concepts
of ordering across hosts. In some cases this is implicit in the way the tool
operates and people just take advantage of it, but in others (like Salt's
orchestrate runner) the ordering of operations across hosts is explicitly
defined by users. That can be useful for scripting blue/green deployment
cutovers, pushing out new load balancer rules, etc.

> I'm actually curious about things like ansible; what exactly does it do that
> you couldn't just do with a bunch of redo scripts?

That might be more of a philosophical question :-) Most of those tools _could_
be replaced with a sufficiently intelligent set of scripts. I think they get
their popularity from providing more convenient syntax, building in
parallelism for applying changes across hosts, offering cross-platform ways of
doing common admin actions like creating accounts, etc.

But if you look at them the right way, they definitely share a lot in common
with build systems, particularly when it comes to managing dependent actions.

~~~
JdeBP
Several things can be operated like build systems, and this has come up on
Hacker News several times. I, for example, maintain my Debian package
repository and my GOPHER site using redo, and use redo to import external
software configuration into my service management.

* [https://news.ycombinator.com/item?id=14837740](https://news.ycombinator.com/item?id=14837740)

* [https://news.ycombinator.com/item?id=14928216](https://news.ycombinator.com/item?id=14928216)

* [https://news.ycombinator.com/item?id=14836340](https://news.ycombinator.com/item?id=14836340)

* gopher://jdebp.info/h/Softwares/nosh/guide/external-formats.html

* [http://jdebp.eu./Softwares/nosh/guide/external-formats.html](http://jdebp.eu./Softwares/nosh/guide/external-formats.html)

------
JdeBP
My implementation re-uses the MAKELEVEL environment variable to track
recursion, making it simpler in the cases where one has GNU make calling redo,
or indeed redo calling GNU make. (BSD make unfortunately uses a different and
undocumented variable.)
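
For anyone unfamiliar with the mechanism: GNU make exports `MAKELEVEL` to its children as a recursion depth counter, so any tool that reads and increments it fits into the same chain. A minimal Python sketch of that handoff follows; it is an illustration, not code from my redo, and `current_depth` and `child_environment` are hypothetical helper names.

```python
import os


def current_depth(env=None):
    """Return the recursion depth recorded in MAKELEVEL (0 if unset or invalid)."""
    env = os.environ if env is None else env
    try:
        return int(env.get("MAKELEVEL", "0"))
    except ValueError:
        return 0


def child_environment(env=None):
    """Return a copy of the environment with MAKELEVEL incremented for a child."""
    child = dict(os.environ if env is None else env)
    child["MAKELEVEL"] = str(current_depth(child) + 1)
    return child
```

A wrapper would pass `child_environment()` as the `env` argument when spawning the next make or redo process, so that depth-based indentation or log nesting stays consistent whichever tool is the parent.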

------
carapace
Redo is one of the very few things that ever made me think it might replace
make. This implementation is pretty awesome.

