
An optimization to help code compilation times on big CPUs - eaguyhn
https://www.phoronix.com/scan.php?page=news_item&px=Linux-Pipe-Parallel-Job-Opt
======
boshomi
the git commit: pipe: use exclusive waits when reading or writing[1]

[1]
[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0ddad21d3e99c743a3aa473121dc5561679e26bb)

~~~
georgyo
I am always amazed at the quality of linux kernel commit messages.

~~~
kees99
Indeed. Those two lines in the test case are particularly excellent:

      /* 64 processes */
      fork(); fork(); fork(); fork(); fork(); fork();

I can totally see most people, myself included, writing a dozen lines
instead, with a for loop, checking for parent/child/error return codes, etc.
That might be preferable for code somewhere deep inside the app logic
that will potentially run on a resource-constrained machine. But for
explanation code meant to be read by a human, this one-liner is such a good choice!

------
Aardwolf
Now just make "-j" the default and all is good :) Why _shouldn't_ make use
the available resources by default?

~~~
swiley
I really hate it when people do that and don’t provide a way to disable it.

There are a number of python libraries I can't install on my laptop easily
because the native code runs out of physical (4GB!) memory spawning compiler
jobs and everything starts thrashing. It's just dumb to go forking without
asking the user first. People who really want it can give make the -j flag.

~~~
alfalfasprout
Bazel is the worst at this. When you try to, e.g., build tensorflow from
source, it maxes out all the cores, uses up as much memory as it can, and then
crashes when the OOM killer eventually kills it. This happens even on a 64GB,
64-core machine.

Manually specifying resources doesn't work well either because it uses 2x+ the
resources you allocate it.

------
ridiculous_fish
Does this affect select() or only read()? Today if N procs select() on a
single pipe, all will be woken up if the pipe becomes readable. Will this now
mean that only one will be awoken?

------
kzrdude
Could the fix have any potential negative effects on other workloads?

~~~
the8472
Some other build systems use the pipe token approach too, rust's cargo for
example.

------
teddyh
GNU Make has been fixed, but I wonder if there are other programs which would
also have similar bugs in their code? I am reminded of when systemd exposed
bugs in system startup scripts, and everybody instead started hating systemd
for breaking their systems.

~~~
sumanthvepa
It's not GNU make that got fixed. It's the pipe system call that Linus
optimized. Make uses pipes as a synchronization mechanism, so all programs
that use pipes this way will benefit.

~~~
mmastrac
What the parent comment is referring to is a pipe handling bug in early
versions of GNU make that prevented kernel optimizations from happening
earlier on.

------
chris_wot
Suddenly LibreOffice build times increase substantially.

~~~
FartyMcFarter
What do you mean?

~~~
chris_wot
LibreOffice relies extensively on parallel builds with GNU make.

Edit: crap, that autocorrected on my iPhone - that should have been decrease!

------
boris
Good example of how we develop software these days: instead of fixing GNU make
by ripping out that antiquated pipe-based jobserver and replacing it with
proper multi-threaded parallelism[1], we optimize the operating system
to make the antiquated stuff work better.

[1] Having hacked on GNU make I know this won't be easy (it's quite a mess).
In fact, it would probably be easier to replace it entirely and fix some of
its other deficiencies in the process.

~~~
temac
I would not call avoiding a classic thundering-herd problem a bad choice,
especially when the top maintainer of Linux does it himself in Linux, which is
arguably more probable than him rewriting GNU Make (well, we all know he
_could_ do it, but this would still be less probable...)

Plus it will not only optimize GNU Make workloads, but potentially other
programs that do the same thing. And to be honest, playing with pipes for
signaling (and more rarely token counting) is STILL a must-have way to do IPC
portably, esp. if you want to integrate with polling on fds. Various OSes also
have various new things, but they are non-portable -- maybe someone should try
to get eventfd&friends into POSIX?

Anyway, if that's a "good example of how we develop software these days", I
would say we finally managed to develop software the right way, making basic
stuff work well before succumbing to the temptation to rewrite everything all
the time just to play with shiny new tech, getting fresh bugs in the
process.

Also, you know what has evolved in the last few years, motivating this perf
fix? Availability of HIGHLY parallel computers at not-too-expensive prices.
Yes, I continue to find it highly refreshing that work is done to make all the
boring historical software work better in that context. Maybe you think we can
afford rewriting all the things all the time to follow supposedly radical
technological changes (and using less well designed OSes?), but if so, this
thought is not shared widely, or at least not followed with action.
Rightly so, because economically the outcome would be doubtful: just look at
the patch, it is way smaller than a hypothetical shiny new rewrite of GNU
Make...

