
Zapcc – A faster C++ compiler - ingve
http://baptiste-wicht.com/posts/2016/11/zapcc-a-faster-c%2B%2B-compiler.html
======
david-given
This is based on clang --- it's not a new compiler.

<tangent>

Are there any working open source C++ compilers which _aren't_ based on gcc or
clang? I know of TenDRA, which appears to have ceased to exist, was always
kind of incomprehensible and non-working, and apparently never got as far as
STL support; there's Path64, whose GitHub page no longer contains the compiler
repo; and there's Open64, whose website no longer exists (although there does
seem to be a daughter project, OpenUH, which did a release last year)...

Is there anything else?

~~~
cmrdporcupine
Not at all open source, but Intel's ICC is free.

~~~
int_19h
Ditto Digital Mars C++ (née Symantec C++, née Zortech C++). But as I recall,
like most other "alternative" C++ compilers, it's also stuck in C++03 land.

~~~
WalterBright
That's because I work on D now :-)

~~~
PeCaN
Which happens to have really great compile times, in spite of using even more
metaprogramming than C++. Thank you for your work!

I was pretty skeptical of D for a while, mainly because of the GC, but I've
started to take a liking to it. It's a very useful tool.

~~~
WalterBright
I've been slowly converting my older C/C++ code still in use into D. Next up
is the dmd back end!

------
mhd
> In conclusion, we can see that zapcc is always faster than both gcc and
> clang.

For testing his template-heavy library. A very small data set, and while
templates are one of the problems when it comes to C++ compilation speed,
they're certainly not the only one.

Let's see the differences when compiling Firefox or the whole KDE suite.

~~~
olegkikin
On the main site they have all kinds of examples.

[http://www.zapcc.com/benchmarks/](http://www.zapcc.com/benchmarks/)

~~~
santaclaus
Weird that the run-time benchmarks are on different codebases than the
compilation benchmarks.

------
DannyBee
So, while neat, pretty much all of this would be solved by C++ modules and
precompiled modules.

(and in fact, is, based on what i've seen. But i still hope these guys get to
market and make some money before that takes over the world, because i know
how hard it is to do what they are doing :P)

~~~
ndesaulniers
Is google3 converted yet? ;)

~~~
DannyBee
Not fully, but it's definitely well on the way :)

------
jdright
I prefer external caching/distributed solutions like ccache/sccache/sndbs to
these private forks of clang. Less risky and more up-to-date.

~~~
thechao
One thing that always confused me about ccache was that it doesn't cache lib
generation and executable generation. I know that the authors have insisted
(until they were blue in the face) that supporting lib/exe caching would
require rewriting ccache... I just don't understand _why_. Once you know about
-frandom-seed=0, and you've removed all aspects of non-determinism (__TIME__
& co.), then all that's left is the moral equivalent of `dwarfdump -u <exe>`
for each compiler, and you're good-to-go for deterministic caching.

~~~
JoshTriplett
The architecture of ccache maps preprocessed sources to object files; it uses
the compiler to preprocess the source, and it hashes the result, knowing that
nothing other than the compiler, command line, and preprocessed source
determines the output.
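
A minimal sketch of that scheme, as a hypothetical wrapper (the `cachecc`
name, the `/tmp/minicache` location, and the use of std::hash are all invented
for illustration; real ccache hashes more state, such as the compiler binary
itself, and handles many more edge cases):

    // cachecc: hypothetical ccache-style wrapper (POSIX; compile with
    // -std=c++17). Cache key = compiler version + flags + preprocessed source.
    #include <cstdio>      // popen/pclose (POSIX)
    #include <cstdlib>     // std::system
    #include <filesystem>
    #include <functional>  // std::hash
    #include <string>

    namespace fs = std::filesystem;

    // Run a command and capture its stdout.
    static std::string run(const std::string& cmd) {
        std::string out;
        if (FILE* p = popen(cmd.c_str(), "r")) {
            char buf[4096];
            size_t n;
            while ((n = fread(buf, 1, sizeof buf, p)) > 0) out.append(buf, n);
            pclose(p);
        }
        return out;
    }

    // Usage: cachecc <source.cpp> <output.o> [flags...]
    int main(int argc, char** argv) {
        if (argc < 3) return 1;
        std::string src = argv[1], obj = argv[2], flags;
        for (int i = 3; i < argc; ++i) flags += std::string(" ") + argv[i];

        // Nothing but these three inputs determines the object file.
        std::string preprocessed = run("c++ -E" + flags + " " + src);
        std::string version      = run("c++ --version");
        size_t key = std::hash<std::string>{}(version + flags + preprocessed);

        fs::path cached = fs::path("/tmp/minicache") / std::to_string(key);
        fs::create_directories(cached.parent_path());

        if (fs::exists(cached)) {  // hit: reuse the stored object file
            fs::copy_file(cached, obj, fs::copy_options::overwrite_existing);
            return 0;
        }
        // Miss: compile for real, then populate the cache.
        if (std::system(("c++ -c" + flags + " " + src + " -o " + obj).c_str()))
            return 1;
        fs::copy_file(obj, cached, fs::copy_options::overwrite_existing);
        return 0;
    }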

Linking involves far more complexity, with the input files harder to
determine. There's no equivalent of -E or -fdirectives-only for linking. A
ccache for linking would have to identify and hash all library, object file,
and linker script inputs, including those pulled in indirectly by linker
scripts, in addition to the toolchain, the command line, and any linker
plugins.

It's absolutely possible, and I'd love to see someone do so, but it seems
significantly harder than caching compilation.

You'd also want to time the result, and figure out how long the reading and
hashing takes compared to linking. ccache misses take only slightly longer
than a normal compilation; link-cache misses may take _much_ longer than
normal.

On top of that, unlike a compilation cache that seems very likely to hit on
the 99% of files not changed in a build, a linker cache would only hit when
_absolutely nothing_ has changed in the entire build. It might help for a
project that links numerous tiny libraries or binaries (which seems relatively
uncommon), but for a project that primarily builds a single library or binary,
it'd only help if you rebuild entirely identical sources twice.

(It might, however, speed up Linux kernel builds if you've only changed the
code for a couple of modules and not anything in the core kernel.)

------
jeremiep
I would love to see a comparison of the performance of compiled programs.

If zapcc creates slower executables but spits them out faster, it could be
used during development to speed up iteration. And if the executables are
faster, I'm very curious as to how they achieved both faster compilation and
faster runtimes.

~~~
userbinator
I think a good example of this is tcc, one of the fastest C compilers I've
seen --- because it doesn't do much optimisation at all and is single-pass, it
can generate code as it parses, but the output is dismally inefficient.

Another example of ultrafast compilation is Delphi, but once again the
generated code looks more like a dumb line-by-line translation with plenty of
redundant and unnecessary instructions (making _de_ compiling interesting in
that it easily produces something quite close to the original source.)

~~~
greglindahl
You might want to compare tcc to just about any compiler with -O0 -- they're a
lot faster if they are allowed to generate slow code. It's also super-
straightforward to find compiler bugs, if you're lucky enough that it's an O0
bug!

------
the_duke
Mhm. I'm confused.

The value of a caching compiler should really become apparent in incremental
builds, as in rebuilding after changing a single file. Yet the author talks
about "not seeing any improvements".

Like he said, he might be doing something wrong.

The speedup that was observed anyway might come from the compilers otherwise
having to rebuild/re-instantiate templated code every time it's included.

~~~
DannyBee
"The value of a caching compiler should really become apparent in incremental
builds, as in rebuilding after changing a single file. Yet the author talks
about "not seeing any improvements"."

There are millions of reasons this may not be true in C++. For starters, the
use of time and date macros, etc.

Without precise dependency tracking of which source lines depend on which
macros (which is super hard, and i don't think they do it; dependency tracking
is usually much more coarse-grained), you may not see an improvement.
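
As a trivial illustration of the macro problem, consider a source file like
this (made up for the example):

    #include <cstdio>

    // __DATE__ and __TIME__ expand to the wall-clock time of compilation, so
    // the output of this file legitimately changes on every rebuild. A cache
    // keyed only on the source text would serve a stale timestamp; handling
    // this correctly means knowing exactly which lines expand such macros.
    int main() {
        std::printf("built on %s at %s\n", __DATE__, __TIME__);
    }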

VisualAge C++ was one of the best incremental C++ implementations i ever saw,
and even it did not get to this level.

~~~
pjmlp
> VisualAge C++ was one of the best incremental C++ implementations i ever
> saw, and even it did not get to this level.

How did it compare with Energize C++?

I only know both from magazines of that era, although someone did upload an
Energize video to YouTube.

As for VisualAge C++, I think only those of us who were active back then can
remember anything about it. Besides the magazines I had with the product
review, I've never seen much information posted on the Internet.

~~~
DannyBee
I never used Energize C++.

I remember looking at the code to VisualAge C++ when i was at IBM about 12
years ago.

It was fairly impressive (it built a database of the program with fairly fine
grained dependency tracking), at least to younger me.

I don't know if people today would have thought it was a mess or not :)

------
hokkos
From here:

[https://www.zapcc.com/faq/](https://www.zapcc.com/faq/)

How will you license Zapcc?

Zapcc will be available under a commercial license from Ceemple Software Ltd.

------
santaclaus
Hell, even if this only gives a speedup with template-heavy code, I'm on
board. We use a number of header-only, template-metaprogrammed-to-death
libraries (RapidJSON, Eigen, ViennaCL), and compilation speed improvements
would be a huge productivity boost.

------
mrich
I would love to see a C++ compiler implement multi-core optimization and code
generation for template instantiations. I feel this could be a big win for
cases where you instantiate a template n times and the instantiations are
basically all independent of each other.
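
A sketch of that situation (the `sum` template is invented for the example):
each explicit instantiation below is an independent unit of codegen work that
a compiler could, in principle, hand to a separate thread once the template
itself has been parsed:

    #include <vector>

    template <typename T>
    T sum(const std::vector<T>& v) {
        T total{};
        for (const T& x : v) total += x;
        return total;
    }

    // Three independent instantiations; today's compilers generate code for
    // them serially within a single translation unit.
    template int    sum<int>(const std::vector<int>&);
    template long   sum<long>(const std::vector<long>&);
    template double sum<double>(const std::vector<double>&);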

~~~
DannyBee
clang/llvm are working on it. For codegen anyway.

It's not clear that it will help with template instantiation, because it's
fairly hard to parallelize.

Certainly possible, but very hard to do "optimally" (i.e. by sharing work
instead of duplicating it). Given that any initial implementation is likely to
duplicate work per thread, this usually cuts into your speedup quite a lot.

Codegen, on the other hand, is pretty much fully parallelizable. This is the
whole reason ThinLTO exists.

------
halayli
Why not use clang's PCH (precompiled headers) feature?

[http://clang.llvm.org/docs/UsersManual.html#precompiled-
head...](http://clang.llvm.org/docs/UsersManual.html#precompiled-headers)

~~~
wichtounet
Unfortunately, precompiled headers are quite inconvenient to use; there are a
lot of limitations in each compiler. For instance, only the first header
included by a source file can be precompiled, and other things like that.
Zapcc is a more general approach. But PCH can bring a really good speedup too,
and it's free ;)
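
For reference, basic clang PCH usage looks like this; the flags are from the
manual linked above, while the header name and its contents are made up:

    // big.h -- hypothetical expensive header; precompile it once with:
    //
    //   clang++ -x c++-header big.h -o big.h.pch
    //
    // then compile sources against it with:
    //
    //   clang++ -include-pch big.h.pch main.cpp -o main
    //
    // The PCH is injected before everything else in the translation unit,
    // which is exactly why only the "first include" of a file can benefit.
    #pragma once
    #include <string>
    #include <vector>

    inline std::string join(const std::vector<std::string>& parts) {
        std::string out;
        for (const auto& p : parts) out += p;
        return out;
    }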

------
midnightclubbed
Curious as to how much faster this makes standard workflows where you are re-
compiling only a few files and the linker is typically the bottleneck. I use
Incredibuild in my day job; it's a great help when doing full re-compiles or
changing a pervasive header, but it offers no help on smaller builds where it
doesn't parallelize the link. Zapcc doesn't look to do anything with the
linker either.

~~~
mrich
You can try gold with parallel and/or incremental linking (although
incremental linking didn't work on a large library where I would've needed it
most).

------
trymas
Not open source, based on clang, but faster?

In a perfect world, I would like everyone who uses clang to freely benefit
from this improvement. Should some big and benevolent corporation just buy
out those guys?

~~~
greglindahl
The entire point of using a BSD/MIT-style license is that it gives "freedom"
to a different set of people than the GPL does. In this case, why is it not
perfect for someone to invest money building a commercial product on top of
clang? Isn't it just fine that they want to make a lot of money?

------
dman
Applied for the beta program the day before yesterday; hope I get in.

------
faragon
TL;DR: not a new compiler, but one based on clang plus some
tuning/optimizations.

------
winter_blue
If Clang had used a GPL-like license, Zapcc would have been forced to share
all their modifications to Clang with the whole world, and we would've all
benefited from it -- and maybe the optimizations would even have been merged
back into the mainline of Clang.

But as it stands now, this is a closed-source product that you have to buy:
[https://www.zapcc.com/buy-zapcc/](https://www.zapcc.com/buy-zapcc/)

~~~
haberman
Why do you think the Zapcc developers would have worked for free on this? It
seems pretty clear that they developed the software because they thought they
could make some money off it.

And thanks to the fact that they did this, we now know it is possible.
Competition will hopefully motivate the Clang developers to develop similar
performance improvements in the mainline of Clang. Everyone will benefit.

A Clang user is certainly no worse off than they were yesterday.

~~~
bonzini
Hmm, what about contracting? Exactly what Cygnus did with GCC.

~~~
hayd
or even GNUPro? \s

