
Parallel GCC: a research project aiming to parallelize a real-world compiler - matt_d
https://gcc.gnu.org/wiki/ParallelGcc
======
modeless
There are so many tools to parallelize compilation, but comparatively little
effort has gone into reducing the vast amounts of redundant work that
compilers do. A staggeringly large percentage of compilation time is simply
wasted duplicating work that was already done hundreds of times over.

Tools like ccache are really primitive compared to what is possible. Change
one bit in a header file and you have to recompile your whole project, even if
the output ends up being bit-identical.

Zapcc[1] is more advanced, but still far from ideal. What I really want is a
compiler that can incrementally update the binary as I type code, and can
start from a snapshot that I download from someone else so that nobody ever
needs to do a full recompile. It would require a radically different compiler
infrastructure, but I don't see any reason why it would be impossible.

[1] [https://github.com/yrnkrn/zapcc](https://github.com/yrnkrn/zapcc)
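
To make the bit-identical point concrete, here is a minimal sketch (the file
names are made up): touch a header in a way that no translation unit actually
uses, and every including TU still gets rebuilt.

```cpp
// util.h -- a header included by many translation units
inline int clamp(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}
// Adding one unused declaration here, e.g.
//   int clamp_to_byte(int v);
// changes the preprocessed text of every TU that includes util.h, so make
// (and a cache that hashes preprocessor output, like ccache) redoes all of
// them, even though each resulting object file comes out bit-identical.

// main.cpp
#include "util.h"
int main() { return clamp(300, 0, 255); }
```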

~~~
kachurovskiy
The first time I spend a day on a bug that only shows up with partial
compiles, I switch it off and it's good old full recompilation from that
point :)

~~~
slacka
Why not just do a full, clean build when you hit a strange bug? If your
workflow is anything like mine, partial compiles and tools like ccache have
saved me months of time I would otherwise have wasted waiting for the
compiler.

~~~
NullPrefix
>when you hit strange bug

Any bug is strange until you figure it out.

------
ndesaulniers
I look forward to this. One thing that will be important for reproducible
builds is having tests for non-determinism. Having non-deterministic code gen
in a compiler is a source of frustration and despair, and it sucks to debug.

~~~
earenndil
My understanding from the article is that the code gen will still be
deterministic -- independent operations will be performed in parallel instead
of in sequence, but data dependencies will still be respected.
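
A toy sketch of how that works in general (not GCC's actual scheme): hand
independent units to worker threads, but emit the results in their original
order, so the output is byte-for-byte identical regardless of how the threads
get scheduled.

```cpp
#include <future>
#include <iostream>
#include <string>
#include <vector>

// Stand-in for real code generation of one independent function.
std::string compile_one(const std::string& name) {
    return "code for " + name + "\n";
}

int main() {
    std::vector<std::string> funcs = {"f", "g", "h"};  // no data dependencies
    std::vector<std::future<std::string>> jobs;
    for (const auto& name : funcs)
        jobs.push_back(std::async(std::launch::async, compile_one, name));
    for (auto& job : jobs)   // collect in original order, not finish order
        std::cout << job.get();
    return 0;
}
```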

------
pjc50
The old-school approach to this was "distcc". At the company where we used it
for C++ we had a small compile farm and usually did "make -j 50".

~~~
pstrateman
distcc can be useful, but requires that the local build environment matches
the remote 100%.

~~~
IcePic
No it doesn't, it sends preprocessed files just to not depend on remote env.
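
To make "preprocessed" concrete (hypothetical files; the macro value is made
up): what gets shipped is the output of the local preprocessor run, with every
#include and macro already expanded, so the remote box only needs a compiler,
not your headers.

```cpp
// foo.cpp as written locally:
//
//   #include "config.h"   // defines BUF_SIZE
//   char buf[BUF_SIZE];
//
// foo.i, the preprocessed file distcc actually sends to the remote host:
char buf[4096];
```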

~~~
cat199
> it sends preprocessed files just to not depend on remote env.

to then be compiled by the host compiler ..

~~~
IcePic
Sure, one compiler can't be egcs 2.95 and the other clang 8 (at least not in
most cases), but the target machine doesn't even need to have the include
files, libs or libfoo-devel packages installed, or it can have them but from
old versions. As a proof of concept I once had a Linux box with gcc 4.2.1
produce C -> .o files that later ran on OpenBSD. Would I trust that resulting
binary? Sure not.

But in cases like "let's spin up 10 compile boxes with <same ubuntu within a
week of updating>", or something along the lines of Xcode, which let you and
your colleagues help each other out with small work units to make a single
compile run quite a bit faster, it is a definite possibility.

For that final release build, you might consider running it on one of those 10
VMs in the example above and having 'only' the 9 others help out with the
sub-parts, in order to get some kind of... guarantee.

If it takes a minute for a full build on one box, getting a hint after 10
seconds that you misspelled something and that it won't ever link decently is
worth something too if you value your time as a developer.

Not saying it's perfect, only chipped in on the "must be 100% or it can't ever
work", hopefully without needing to go into the "coach says we need to give
110% this game, and 120% if it's the finals" nitpicking.

~~~
boring_twenties
This doesn't work for C++, as there is no standard ABI and thus no safe way to
link object files from different compiler versions into the same binary.
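
A concrete illustration (hypothetical file names; GCC's pre-5 vs C++11
std::string ABI is one real-world example of such a mismatch):

```cpp
// lib.cpp, built with compiler/standard-library A
#include <string>
std::string greet() { return "hi"; }

// main.cpp, built with compiler/standard-library B and linked against lib.o
#include <string>
std::string greet();
int main() { return static_cast<int>(greet().size()); }
// If A and B disagree on std::string's layout or name mangling, this either
// fails to link or silently misbehaves at run time, even though both
// translation units are perfectly valid C++.
```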

------
andyayers
FWIW, MSVC has had this ability for a few years now. It helps with both normal
compilation unit compiles and with link-time code generation. See for instance
Bruce Dawson's notes:
[https://randomascii.wordpress.com/2014/03/22/make-vc-compiles-fast-through-parallel-compilation/](https://randomascii.wordpress.com/2014/03/22/make-vc-compiles-fast-through-parallel-compilation/).

------
nwmcsween
OT, but why isn't there a compiler that simply does absolutely basic passes to
get the IR into a 'normalized' form and then applies optimizations from a
'database' of super-optimized 'chunks'? For parts with no match in the
database, run the superoptimizer.
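
A toy sketch of the lookup side of that idea (names like `normalize` and
`superoptimize` are placeholders, and chunks are plain strings here):

```cpp
#include <string>
#include <unordered_map>

// Placeholder for the cheap canonicalization passes.
std::string normalize(const std::string& chunk) { return chunk; }

// Placeholder for the expensive superoptimizer search.
std::string superoptimize(const std::string& chunk) { return chunk; }

// Look up a normalized chunk in the database; only superoptimize on a miss.
std::string optimize(const std::string& chunk,
                     std::unordered_map<std::string, std::string>& db) {
    std::string key = normalize(chunk);
    auto it = db.find(key);
    if (it != db.end())
        return it->second;          // reuse previously optimized code
    std::string best = superoptimize(key);
    db.emplace(key, best);
    return best;
}

int main() {
    std::unordered_map<std::string, std::string> db;
    optimize("add r1, r2", db);   // miss: runs the superoptimizer, fills db
    optimize("add r1, r2", db);   // hit: served straight from the database
    return 0;
}
```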

~~~
exikyut
So basically a code lowering engine that works a bit like
[https://en.wikipedia.org/wiki/Hashlife](https://en.wikipedia.org/wiki/Hashlife).
Interesting!

(NB. Googling "lowering" turned up the seemingly random
[https://news.ycombinator.com/item?id=14422944](https://news.ycombinator.com/item?id=14422944);
that page isn't such a bad set of starting links, so I figured I'd include
it.)

~~~
Eug894
Please read this idea regarding the use of superoptimizers:
[https://drive.google.com/file/d/1GSv89tiQmPDcnFEu4n4CqfaJcUJxVmL5KrSCJ047g4o/edit](https://drive.google.com/file/d/1GSv89tiQmPDcnFEu4n4CqfaJcUJxVmL5KrSCJ047g4o/edit)

------
NilsIRL
The use case is really niche.

The only time it comes in handy is when:

1. You have enormous files AND 2. You have a small number of files.

~~~
ris
I'm not sure about that. The traditional way of exploiting parallelism for
large builds is to work on a per-file basis. Want more parallelism? Fork more
jobs. But this can fall apart in a couple of ways:

1. It can get extremely expensive in memory use. Your parallelism limit can
be dictated by how much memory your machine has. Spreading your available
cores over fewer simultaneous compilation units could also have benefits
arising from data locality, even in situations where a machine's memory
headroom isn't a concern.

2. Many large build processes have serialized steps which are currently
unable to benefit from any form of parallelism. This is becoming even more the
case with the rise of IPO & LTO.

But as ever, we'll probably have to wait to find out how large the practical
benefits end up being from this work.

------
loeg
Huh, I thought compilation was already quite parallel even on many core
machines. I guess that's only true for larger software projects (or ones with
many small compilation units, at least). The step that could use some TLC and
increased parallelism is _linking_.

On the other hand, maybe this project will help improve whole-program LTO link
time. I see that is mentioned as future work for this effort:

> Parallelize IPA part. This can also improve the time during LTO compilations

Big kudos to Giuliano Belinassi, who seems to be the one driving this effort.

------
ape4
A tricky job to say the least

------
The_rationalist
LLVM already has one process per compilation unit, I believe.

~~~
ndesaulniers
Which is not parallelization within the compiler. There's always the question
of whether I should parallelize my compilation at the compiler level or at the
build level (i.e. multiple translation units in flight). I think there's room
for both, so you can hopefully get faster incremental compiles of a small
number of TUs, but still have the old data-level parallelization across
multiple TUs.

~~~
blattimwind
From my observations, multi-threading the linker would be more useful, since
for small changes in bigger projects the linking step usually takes a lot
longer than compiling a couple of units.

~~~
tacostakohashi
There exists a multi-threaded linker called gold.

~~~
ndesaulniers
LLD (LLVM's linker) is also threaded, and much faster than gold.

------
coldtea
> _This page introduces the Parallel GCC -- a research project aiming to
> parallelize a real-world compiler. This can be useful in many-core machines
> where GNU Make itself can not provide enough parallelism_

Huh? One wants a fast compiler even for a single compilation, not just for the
multi-file builds that parallel make can already handle, as implied here...

