Zapcc – A caching C++ compiler based on clang (github.com/yrnkrn)
102 points by turrini on June 17, 2018 | 26 comments

Could someone give us more of the story around zapcc? I see that it is a C++ compiler based on clang, and that it only works well on Linux. Could someone (perhaps turrini) share some comparison tables of the relative speed, size, and disk usage of this compiler compared to clang and gcc?

This post by the developer on the CFE-dev mailing list is a good start - http://lists.llvm.org/pipermail/cfe-dev/2015-May/043155.html

An explanation of why a fork of clang was required (as opposed to just improving clang) would also be interesting.

I evaluated zapcc at one point. It's a fork of clang because the changes are very intrusive to clang and probably would have had a hard time getting accepted into trunk without a ton of debate. I believe it actually daemonizes into a long-running server process (zapccs) that then holds the instantiated templates in memory, communicating with the zapcc compiler process over IPC. I think that most clang maintainers would have found that radical a rearchitecting kind of controversial, especially from an outsider to the project.

> especially from an outsider to the project.

That should really not be a criterion.

Sure it should. It's harder to work with people you haven't worked with much before, because you haven't had-then-dealt-with as many miscommunications and such.

I mean sure, it shouldn't change the validity of the technical design, but when it comes to implementation (i.e. merging the changes/redesign and modifying the roadmap and such accordingly), it has every reason to.

How else will the project know that the major changes will be maintained? Having code in a project without an owner will just lead to bugs and bit rot.

So now not only projects have to be maintained, we also have to maintain changes? I'm seriously getting sick at the notion that nothing can ever be finished.

Bugs? Those will be the ones introduced by the change, nothing more. If there are any bugs left, that's only because nobody has discovered them yet. Having a maintainer won't change that; only usage reveals bugs. (Unless of course the "maintainer" is instead tasked with finding bugs in the first place.)

Bit rot? That's only an issue when the environment around the code changes. Clang is a compiler; there's not much it cares about in its environment. And if you're talking about changes within Clang that could have an effect, those are well within the maintainer's control. We shouldn't need a maintainer for every patch.

This change in particular doesn't seem to introduce any new feature. I'm guessing it only allows Clang to run faster. The new code path replaces the old, so it shouldn't require much more maintenance.


More generally, we should do away with the notion that everything should be maintained, forever. We should be able to code correct programs. We should have environments stable enough to preserve that correctness. And we should stop believing we need so much code in the first place. Let's not kid ourselves, programs "require maintenance" (euphemism for "aren't finished"), mainly because they're so damn big. We can do better, really.




> An explanation of why a fork of clang was required (as opposed to just improving clang) would also be interesting.

Well, they wouldn't be able to sell it if they had just been pushing patches to clang.

It appears to be a C++ compiler that attempts to keep the parse tree from templated code in memory between translation units.

That said, I am not involved and don't use it.

Does zapcc provide a libclang.so that uses this caching?

I noticed that parsing heavily templated files with clang-based tools is also very slow, which probably means that some kind of template instantiation (or other processing step) is being performed. These tools could greatly benefit from any speedup.

While the reported 2x average speed-up may not be big enough for me to consider zapcc for offline compilation, halving the time to get a list of completions in a Clang-based IDE is something I would be very happy to get!

> in-memory compilation cache

I'd like to see a persistent cache. In-memory doesn't work well for the "occasionally recompile Firefox, LLVM/clang, WebKitGTK, etc." desktop use case...

Bazel is exactly this (and amazingly good at it): a cache of your entire build tree and all its dependencies. But it is a full-blown build system that you need to buy into. You can’t bolt it onto whatever you are using, the way you can with ccache.

This new zapcc thing seems to be more granular than ccache, it caches individual template instantiations. Bazel can't do that.

Bazel supports persistent compilers with its worker protocol [0]. It's already supported for javac. It may be worthwhile to investigate how well zapcc can integrate with Bazel.

[0] https://blog.bazel.build/2015/12/10/java-workers.html

Isn't that what ccache provides? Admittedly it's not perfect but it sounds like it might help your use case. https://ccache.samba.org

> This new zapcc thing seems to be more granular than ccache, it caches individual template instantiations

No, it is close to what LTO, parse caches and precompiled headers provide. No need for daemons to get the same result.

How is zapcc better than ccache?

ccache works at the translation unit level, caching object files. Zapcc caches (among other things) template instantiations between compilations.

If you are in a codebase that has some very complex header files with lots of (potentially nested) template instantiations, and then a large number of executables that all instantiate these templates, zapcc can make a huge difference in compilation times. ccache doesn't really buy you anything in this scenario.
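To make the scenario above concrete, here is a toy Python model (not how zapcc or ccache are actually implemented) that just counts template instantiations. The file names and template names are made up for illustration, and "instantiating" is reduced to counting strings.

```python
# Toy model: each translation unit is represented only by the list of
# template instantiations it needs. All names below are hypothetical.
UNITS = {
    "tool_a.cpp": ["Matrix<double,4>", "Solver<Matrix<double,4>>", "Log<A>"],
    "tool_b.cpp": ["Matrix<double,4>", "Solver<Matrix<double,4>>", "Log<B>"],
    "tool_c.cpp": ["Matrix<double,4>", "Solver<Matrix<double,4>>", "Log<C>"],
}


def instantiations_per_tu(units):
    """Each translation unit is compiled in isolation (as with a cold
    object-file cache), so every unit redoes every instantiation it uses."""
    return sum(len(templates) for templates in units.values())


def instantiations_shared(units):
    """Assumed behaviour of an instantiation-level cache that survives
    across units: each distinct instantiation is performed only once."""
    seen = set()
    for templates in units.values():
        seen.update(templates)
    return len(seen)


if __name__ == "__main__":
    print("isolated units:", instantiations_per_tu(UNITS))
    print("shared cache:  ", instantiations_shared(UNITS))
```

In this model the gap grows with the number of executables that share the heavy header, which is the case described above. Real savings depend on how expensive each instantiation is, which the sketch ignores.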

ccache does not speed up a full build; zapcc does.

What do you mean by "full build"?

zapcc speeds up the build even if you changed a #define across the whole project, or added a line to a header that you use everywhere, forcing all of your .cpp files to be recompiled.
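A rough sketch of why a translation-unit-level cache misses in that case, assuming the cache key is a hash of the preprocessed text (ccache's real key also covers compiler flags and more; the `preprocess` function here is a stand-in that just pastes the header in front of the source, and the file contents are invented):

```python
import hashlib

# Hypothetical sources that both include the same project-wide header.
SOURCES = {
    "a.cpp": "int a() { return LIMIT; }",
    "b.cpp": "int b() { return LIMIT + 1; }",
}


def preprocess(source, header):
    """Stand-in for the preprocessor: prepend the header text."""
    return header + "\n" + source


def cache_key(source, header):
    """Toy cache key: SHA-256 of the 'preprocessed' translation unit."""
    return hashlib.sha256(preprocess(source, header).encode()).hexdigest()


def invalidated(old_header, new_header):
    """Return the files whose cache key changes when the header changes."""
    return sorted(
        name
        for name, src in SOURCES.items()
        if cache_key(src, old_header) != cache_key(src, new_header)
    )


if __name__ == "__main__":
    print(invalidated("#define LIMIT 10", "#define LIMIT 20"))
```

Every file that includes the edited header gets a new key, so the object-file cache has nothing to offer and each unit is compiled from scratch. A cache that works below the translation-unit level can still reuse whatever the edit did not touch.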

Thanks for the open source release! I tested their commercial version before, and it was great for C++ projects.

How does this differ from PCH?

Disk I/O. A PCH is written to disk and read back for each compilation; this stays in memory between compilations.
