
> This generally leads to faster compilation time in part because it aggregates the cost of parsing the same headers over and over.

But this also reduces the opportunity to parallelize compilation across multiple files because they have been concatenated into fewer build units, and each unit now requires more memory to deal with the non-header parts. For some build systems and repositories, this actually increases build time.
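
For context, a unity build unit is typically just a generated source file that #includes the real ones (the file names here are invented for illustration):

    // unity_widgets.cpp -- a hypothetical generated aggregation unit.
    // Headers pulled in by these files are parsed once, but the
    // resulting translation unit is larger, needs more memory, and
    // can only be compiled on a single core.
    #include "widget.cpp"
    #include "widget_painter.cpp"
    #include "widget_layout.cpp"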




> But this also reduces the opportunity to parallelize compilation across multiple files because they have been concatenated into fewer build units (...)

Irrelevant. There is always significant overhead in handling multiple translation units, and unity builds simply eliminate that overhead.

> and each unit now requires more memory to deal with the non-header parts.

And that's perfectly ok. You can control how large unity builds are at the component level.
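
As a sketch (component and file names invented), a component's sources can be split into several capped unity units rather than one giant one; CMake's UNITY_BUILD_BATCH_SIZE target property does essentially this for you:

    // unity_net_0.cpp -- hypothetical generated unit, batch capped at 2 files
    #include "socket.cpp"
    #include "resolver.cpp"

    // unity_net_1.cpp -- the rest of the same component, in a second unit
    #include "http_client.cpp"
    #include "http_server.cpp"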

> For some build systems and repositories, this actually increases build time.

You're creating hypothetical problems where there are none.

In the meantime, you're completely missing the main risk of unity builds: introducing problems associated with internal linkage.
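
A contrived sketch of that hazard (file names invented): two files that compile fine as separate translation units break once they are concatenated into one unity unit:

    // log.cpp
    static int counter = 0;            // internal linkage, private to this TU
    namespace { const char* tag = "log"; }

    // net.cpp -- fine on its own, same names with internal linkage
    static int counter = 0;
    namespace { const char* tag = "net"; }

    // unity_0.cpp (generated) -- now both live in one TU
    #include "log.cpp"
    #include "net.cpp"                 // error: redefinition of 'counter' and 'tag'

The quieter variant is when nothing collides loudly and a static, macro, or using-directive from one file silently leaks into the next.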


Unity builds do often have worse performance than separate compilation for incremental rebuilds during development. It all depends on how the code is split up and how much of a factor linking is.

As in the article, it's best to support both.


You also need to consider that (at least in C++) your own code is just a very small snippet dangling off the end of a very large block of included stdlib code, and that's the case for every source file that includes any C++ stdlib header.

For instance, just including <vector> in a C++ source file adds nearly 20kloc of code to the compilation unit:

https://www.godbolt.org/z/56ncqEqYs

If your project has 100 source files, each with 100 lines of code, and each file includes the <vector> header (assuming this resolves to 20kloc), you will compile around 2mloc overall (100 * 20100 = 2,010,000).

If the same project code is in a single 10kloc source file which includes <vector>, you're only compiling 30kloc overall (100 * 100 + 20000 = 30000).

In such a case (which isn't even all that theoretical), you are just wasting a lot of energy keeping all your CPU cores busy compiling <vector> a hundred times over, versus compiling <vector> once on a single core ;)


Very large projects can always be cut into several libraries and compiled on different cores. Quite easy to do in practice.


I believe Firefox builds only unify files within the same directory, with a maximum of roughly a dozen cpp files per unit. So there is still plenty of build parallelism across directories.


Not necessarily - I've been prototyping a fork of tcc that does both. It's multi-threaded rather than multiprocess.



