Unity builds lurked into the Firefox Build System (serge-sans-paille.github.io)
65 points by sylvestre on May 6, 2023 | 61 comments



Note that this is not referring to the Unity game engine. It's just referring to #including .cpp files.
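
Roughly, a unity build file is just this kind of thing (a minimal sketch; the file names are made up):

    // unity.cpp -- the only file handed to the compiler; the individual
    // .cpp files below are never compiled as separate translation units
    #include "lexer.cpp"
    #include "parser.cpp"
    #include "codegen.cpp"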


Indeed, the title almost makes no sense grammatically in its current form, a consequence of the word "how" being removed. It would have been obvious it was not the game engine if the word "unity" had appeared as the second, lowercase word.


With the advent of LTO, unity builds are mostly a band-aid for poor management of header files. The Linux kernel project was able to net a ~40% reduction in compilation CPU-time just by pruning the contents of some key header files [1].

It really boils down to two rules:

1. Don't declare anything in header files that is only used in one compilation unit. Internal structs and functions should be declared and defined in source files, and internal linkage used wherever possible. gcc and clang's -fvisibility=hidden is useful here (see the sketch below).

2. The more frequently a header file is included (whether transitively or directly), the more it should be split up. If a "common" or "utility" header file is included in 10000 source files, then any struct, function, etc. that you add to that file will have to be parsed 10000 times by the compiler every time you build from scratch, even if only 10 source files actually use the struct/function that you added. gcc and clang's -H flag is useful here.

[1] https://lore.kernel.org/lkml/YdIfz+LMewetSaEB@gmail.com/
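
To illustrate rule 1, a minimal sketch (the names are made up): the helper that only widget.cpp uses never appears in the header, and gets internal linkage via an anonymous namespace.

    // widget.h -- only what other translation units actually need
    #pragma once
    int widget_count();

    // widget.cpp
    #include "widget.h"

    namespace {                 // internal linkage: invisible outside this TU
        int live_widgets = 0;
        int clamp(int n) { return n < 0 ? 0 : n; }
    }

    int widget_count() { return clamp(live_widgets); }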


I think "just" is perhaps not the right word for something that took a senior dev over a year and more than 2000 commits just to get to an RFC patchset that doesn't compile for all architectures... Tremendous work, but it clearly wasn't easy or a matter of "follow these simple rules".


> unity builds are mostly a band-aid for poor management of header files

That's what it's always been about (improving build times); better optimization is just a welcome side effect. But header hygiene is hard, because the problem will creep back into the code base over time.

> The Linux kernel project was able to net a ~40% reduction in compilation CPU-time

Linux is a C codebase. Header hygiene is much easier in C, because C headers usually only contain interface declarations (usually at most a few hundred lines of function prototypes and struct declarations), while C++ headers often need to include implementation code inside template functions, or are plastered with inline functions (which in turn means more dependencies to include in the header). And even if the user headers are reasonably 'clean', they still often need to include C++ stdlib headers which then indirectly introduce the same problem.

For instance your point (2) only makes sense if this header doesn't need to include any of the C++ stdlib headers, which will add tens of thousands of lines of code to each compilation unit. For such cases you might actually make the problem worse by splitting big headers into many smaller ones.

PS: the most effective, but also most radical and controversial solution is also a very simple one: don't include headers in headers.
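
In practice that usually means forward declarations plus references/pointers in the header, with the include pushed down into the source file. A sketch (names made up):

    // renderer.h -- no #include "scene.h" needed here
    #pragma once
    class Scene;                      // forward declaration is enough
    void render(const Scene& scene);  // Scene is only used by reference

    // renderer.cpp
    #include "renderer.h"
    #include "scene.h"                // full definition only where it is used

    void render(const Scene& scene) { /* ... */ }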


> This generally leads to faster compilation time in part because it aggregates the cost of parsing the same headers over and over.

But this also reduces the opportunity to parallelize compilation across multiple files because they have been concatenated into fewer build units, and each unit now requires more memory to deal with the non-header parts. For some build systems and repositories, this actually increases build time.


> But this also reduces the opportunity to parallelize compilation across multiple files because they have been concatenated into fewer build units (...)

Irrelevant. There is always significant overhead in handling multiple translation units, and unity builds simply eliminate that overhead.

> and each unit now requires more memory to deal with the non-header parts.

And that's perfectly ok. You can control how large unity builds are at the component level.

> For some build systems and repositories, this actually increases build time.

You're creating hypothetical problems where there are none.

In the meantime, you're completely missing the main risk of unity builds: the problems they introduce around internal linkage.


Unity builds do often have worse performance than separate compilation for incremental rebuilds during development. It all depends on how the code is split up and how big a factor linking is.

As in the article, it's best to support both.


You also need to consider that (at least in C++), your own code is just a very small snippet dangling off the end of a very large included stdlib code block, and that's for each source file which needs to include any C++ stdlib header.

For instance, just including <vector> in a C++ source file adds nearly 20kloc of code to the compilation unit:

https://www.godbolt.org/z/56ncqEqYs

If your project has 100 source files, each with 100 lines of code but each file includes the <vector> header (assuming this resolves to 20kloc), you will compile around 2mloc overall (100 * 20100 = 2010000).

If the same project code is in a single 10kloc source file which includes <vector>, you're only compiling 30kloc overall (100 * 100 + 20000 = 30000).

In such a case (which isn't even all that theoretical), you are just wasting a lot of energy keeping all your CPU cores busy compiling <vector> a hundred times over, versus compiling <vector> once on a single core ;)


With very large projects you can always cut them into several libraries and compile those on different cores. Quite easy to do in practice.


I believe Firefox builds only unify files within the same directory, with a maximum of about a dozen .cpp files per unit. So there is still plenty of build parallelism across directories.


Not necessarily - I've been prototyping a fork of tcc that does both. It's multi-threaded rather than multiprocess.


I've used unity builds for my projects basically forever; at some point I discovered the practice had a name and some debates around it.

It is a simple thing to do, and the gains are substantial: builds are faster and simpler, with less maintenance, especially across different platforms.

For big projects I simply cut them into several libraries.

I've seen some incredulous reactions, mostly from young coders, and I know that makefiles should be faster, but in practice I never found that to be true.


"Lurked into"?

You can lurk but surely you can't lurk into something?


I assume the author is not a native speaker, based on some of the odd grammar and phrasing in the post. It doesn't really detract from the work.

Edit: They appear to be french: http://serge.liyun.free.fr/serge/


The word "lurk" doesn't exactly exist in French. "se rôder" fits in some cases, but another translation, "se cacher" (to hide), fits others. I'd write "X crept into Y" as "X s'est glissé dans Y", but that has more of a connotation of an accidental short-term mistake. I don't know how I'd express the idea concisely in French. It's also hard to tell exactly how he wanted to convey it: something between "crept", "were hidden", and "were lurking", probably? As I've discovered the hard way, there is not always an analogous term for the same fundamental idea/concept between two given languages; mastering the nuances of these differences is important for proper fluency. I probably make errors like this frequently when writing/speaking French.



This used to be mandatory for nvcc/CUDA: if you had multiple source files (not just headers), you had to #include all of them in your main file. It made me very uncomfortable.


I've been out of C/C++ development for a long time, but I seem to remember that precompiled headers were a thing back in the day. That approach didn't have the namespace issues pointed out here. Why are precompiled headers not used anymore?


As far as I am aware, they never worked that great with UNIX compilers, as no big effort was ever spent improving them.

About 20 years ago, on UNIX workloads we used to speed up compilation via ClearMake, a kind of distributed code cache that would plug into the compilers; however, it was part of the ClearCase SCM product.

On Windows, with Microsoft and Borland (nowadays Embarcadero), they work quite alright.

Also, modules will fix that: as per VC++ reports, importing the whole standard library (import std, as per C++23) takes a fraction of the time of just including <iostream>.
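
For reference, that C++23 form is just the following (a sketch, assuming a toolchain that already ships the standard library module, which is still not universal):

    // main.cpp -- e.g. recent VC++ with /std:c++latest
    import std;                        // the whole standard library, one import

    int main() {
        std::println("hello from modules");
    }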


They are still used in some places. But they have some downsides:

Precompiled headers don't play nicely with distributed compilation or shared build caches (which are perhaps the fastest way to build large C++ codebases). So while they can work well for local builds, they exclude the use of (IMO) better build-time optimisations.

They also require maintenance over time: if you precompile a bad set of headers, it can make your compile times worse.


They're very much alive and well on MSVC. Our work projects use both unity builds _and_ precompiled headers.


In a project that already has good header hygiene, precompiled headers don't help much to speed up builds. They're just a bandaid when the situation is already completely out of control.


Interesting. I've been aware of this technique for years because of the SQLite Amalgamation, but that was always sold as a way to simplify distribution and perhaps improve performance of the binary. I hadn't considered it as a build speed optimization, though that seems somewhat obvious in hindsight.


> I hadn't considered it as a build speed optimization, though that seems somewhat obvious in hindsight.

Some build systems like cmake already support unity builds, as this is a popular strategy to speed up builds.
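
For example, with CMake 3.16+ it is just a target property (the target name mylib is made up):

    # enable unity builds for one target; the batch size controls how many
    # .cpp files get merged into each generated unity source
    set_target_properties(mylib PROPERTIES
        UNITY_BUILD ON
        UNITY_BUILD_BATCH_SIZE 8)

    # or globally for the whole build tree:
    #   cmake -DCMAKE_UNITY_BUILD=ON ..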

Nevertheless, if speed is the main concern then it's preferable to just use a build cache like ccache, and modularize a project appropriately.


Why not both?

Also, does ccache work with MSVC?


> Also, does ccache work with MSVC?

Technically it works, but it requires some work. You need to pass off ccache's executable as the target compiler, and you need to configure the settings in all vsproj files to allow calls to the compiler to be cacheable, like disabling compilation batching.

Using cmake to generate make/ninja projects and using compilers other than MSVC is far simpler and more straightforward: set two cmake vars and you're done.


Unity builds mean that you can no longer use internal linkage safely, and that's not something I like to give up. They force the codebase to follow a certain style that I don't like. Hopefully modules will give the advantages of unity builds without this downside.
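
The failure mode is easy to show (file names made up): two file-local helpers that are fine on their own collide once the files end up in the same unit.

    // a.cpp
    static int helper() { return 1; }    // fine: internal linkage
    int a_value() { return helper(); }

    // b.cpp
    static int helper() { return 2; }    // also fine on its own...
    int b_value() { return helper(); }

    // unity.cpp
    #include "a.cpp"
    #include "b.cpp"                     // error: redefinition of 'helper'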


Headers and C style macros are probably the most unfortunate aspects of C (and by extension, C++).

So many hacks in compilers to try to work around this. A shame there is no language level fix for this nonsense.

Really wish there could be a C++-- that would improve on C in areas like this, and avoid all the incredible nonsense of C++. And no, not Rust or Go.


Have you tried Zig? I think it fits those criteria, and is known for its good build system, although AIUI it is quite a large language compared to C


> Headers and C style macros are probably the most unfortunate aspects of C (and by extension, C++).

Headers only became a massive problem in C++ because of templates and the introduction of the inline keyword (which then unfortunately also slipped into C99, truly the biggest blunder of the C committee next to VLAs).

Typical C headers (including the C stdlib headers) are at most a few hundred lines of function prototypes and struct declarations.

Typical C++ headers, on the other hand (including the C++ stdlib headers), contain a mix of declarations and implementation code in template and inline functions, and pull tens of thousands of lines of code into each compilation unit.

This is also the reason why typical C projects compile orders of magnitude faster than typical C++ projects with a comparable line count and number of source files.


Headers (in new code) will hopefully become optional due to modules. That would be such a big boost to the language.

C macros are pretty much considered a code smell in C++, right?


I think Hare (https://harelang.org/) might fit the bill: it retains the minimalism and simplicity of C, but fixes issues like this (and others). Unfortunately I don't think it's ready for real use yet, but I am keeping an eye on it.


We leverage many third party C++ libraries with complex templates, concepts, and constexpr expressions that seem to require lots of CPU to compile. We've found unity builds to be almost 3X faster, so we make it the default for both developer and CI jobs.

But we keep a separate CI job that checks the non-unity build, so developers have to add the right #include statements and can't accidentally reference file-scoped functions from other files. While working on a given library or project, developers often disable the unity build for just that project to reduce incremental build times. It seems to offer the benefits of both approaches.

Precompiled headers don't give nearly the same speedup. We're excited for C++ modules of course, but we're trying to temper any expectations that modules will improve build speed.


The compilation-unit-per-file model, and in fact the whole concept of linking, are a legacy incremental-build solution for C which somehow metastasized into fundamental requirements of building software on current OSes. It is an atrocity and should be disavowed by all developers.


I always use unity builds for all my projects now. That combined with using tcc as compiler (for C code) makes builds really fast. Another nice feature of unity builds is that I don't need to declare functions twice and keep the declarations synced. It's also nice to only have one place to find information about a function; people often put comments in header files that you can miss if you go to the definition.

All of those things combined make C programming more enjoyable.


> Another nice feature of unity builds is that I don't need to declare functions twice and keep the declarations synced.

What exactly leads you to have multiple declarations, and thus creates the need to "keep [multiple] declarations synced"?


I mean that if you use multiple translation units and header files, you need a copy of the function declaration in the header file to be able to call the function from other translation units.


Why can’t static analyzers analyze the main cpp that #include-s the actual code? I don’t understand that point.

And what were the resulting effects on build times?


They can. In addition to that, they get confused by a .cpp that isn't the top level file of a compilation unit.


Lots of game studios use unity builds like this. It saves a massive amount of time. Last I heard it also improves the performance of Incredibuild, another popular tool for decreasing build times.


Another benefit not mentioned is optimization. The compiler may be able to inline more function calls when function definitions and callers are in the same unified compilation unit.



The post was renamed, the URL changed accordingly: https://serge-sans-paille.github.io/pythran-stories/how-unit...

(HTTP 301 on the old URL would have been appreciated)


To avoid some of these issues, it can be helpful in a project to require that all files, including header files, must be compilable on their own. That doesn't get rid of all the problems (you can still depend on transitive includes without explicitly including them) but it enforces a minimal amount of code hygiene.
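
One cheap way to enforce that is a tiny translation unit per header that includes nothing else, so each header has to pull in its own dependencies (a sketch; the names are made up):

    // widget.h
    #pragma once
    #include <string>                    // the header brings what it needs
    std::string widget_name(int id);

    // check_widget_h.cpp -- one per header, generated or hand-written;
    // it only compiles if widget.h is self-contained
    #include "widget.h"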


Tried Unity builds recently for ClickHouse, but without success: https://github.com/ClickHouse/ClickHouse/pull/18952#issuecom...


It's another reminder of how <expletives omitted> C++ is, but for a long time nothing better existed.


Been using unity builds forever. The trick is to also have a standard build and try to compile it every week or so, to catch anything that might have been missed, like a source file that is missing a header and won't compile alone because it got the header through the unity build ordering.


Compilation units are a relic of a time when computers only had a few KB of memory. At this point computers are fast enough, and have enough memory, to compile the whole thing in one go faster than whatever change detection and linking would save.


This is just deeply untrue. Do you really think everybody working on compilers and linkers is deeply ignorant? I can easily saturate my 64 GB RAM home setup during a compile.


While everyone else is (rightfully) correcting you, I am curious what sort of codebases you’re working with?

Are you working on large compiled software? Any game, rendering engine or large application benefits from compilation units in my experience.

Some of my libraries that I work with take upwards of an hour for a fresh compile. Having sane compilation units cuts down subsequent iteration to minutes instead.


Yeah, no. To this day Firefox developers building Gecko need a beefy desktop machine to be able to do it in a reasonable amount of time. I could do a clean build in 6 minutes with a ThreadRipper whose cores were all pegged, but forget doing the same in under an hour on a laptop.

And that was with unified builds enabled.


And more importantly, on that same machine the build would take more than ten minutes without unified builds.


Hahaha tell that to my Unreal Engine build times.

A brand new 64-core AMD Epyc machine will take over an hour to compile it. Good times.


I'd really like to see a comparison someday between Epic's weird C#-based build system and something like CMake + Ninja.

I suspect there are compilation optimizations to be made, but I don't think they would save more than 30% here and there.


> I suspect there’s compilation optimizations to be made

There definitely are. I've spent a lot of time with UBT, and a "reasonable" amount of time with cmake and friends. UBT isn't quite the same as CMake + Ninja. UBT does "adaptive" unity builds, globbing, and a couple of other things.

> but I don’t think it would save more than 30% here and there.

Agreed. The clean build with UBT is painfully slow compared to CMake + Ninja, but the full builds themselves are pretty good, and I'd bet that there's probably less low-hanging fruit there.

I did a good chunk of work on improving compile times in Unreal, and there is definitely low-hanging fruit in the engine for improving compile times. Some changes to how UHT works around forward declares would also make a significant difference.


The big issue I had with UBT, in addition to speed, was how difficult it was to debug when it did the wrong thing. Often this came up when adopting new Xcode versions, where CMake gave a lot of escape hatches to adapt, whereas UBT required spelunking.

At some points, there are multiple layers of historic cruft that just seem arcane.

Last year, Epic released a video where an engineer went through it, and even they hit points where they said: "I have no idea what this area of code does".


No disagreements there. Spelunking is a great word for it, but spelunking is a requirement for most "deep" Unreal Engine development. On the other hand, it's incredibly empowering to switch your IDE to build UnrealBuildTool, put "My project Development Win64" as the arguments, and be able to debug the build tool then and there to see what it's actually doing.


That’s true. I should give it a go again now that Rider is available. It’s been a huge QoL improvement in the rest of my Unreal/Unity development work.


I would as well! It's honestly a bit beyond me; the Unreal build tools run deep, so I imagine it would take some effort.


Why do clean builds of my code take like 30m then?



