
There is another scenario where this is an issue: if this code ends up in a header which is included in a lot of places. You might say "that's dumb, don't do that", but there is a real tendency in C++ for things to migrate into headers (because they're templates, because you want them to be aggressively inlined, for convenience, whatever), and then headers get included into other headers, then without knowing it you suddenly have disastrous compile times.

Like, for this particular example, you might start out with a header that looks like:

    SomeData get_data_from_json(std::string_view json);
with nothing else in it, everything else in a .cpp file.

Then somebody comes around and says "we'd like to reuse the parsing logic to get SomeOtherData as well" and your nice, one-line header becomes

    template<typename Ret>
    Ret get_data_from_json(std::string_view json) {
        // .. a gazillion lines of template-heavy code
    }
which ends up, without anyone noticing, in "CommonUtils.hpp", and now your compiler wants to curl up in a ball and cry every time you build.

It takes more discipline than you'd think for a team to prevent this from happening, mostly because a lot of people don't take "this takes too long to compile" as a serious complaint if fixing it involves any other kind of trade-off.




> There is another scenario where this is an issue: if this code ends up in a header which is included in a lot of places.

This is in itself a sign that your project is fundamentally broken, but it is already covered by scenario b), incremental builds.

Even if for some reason you resist the urge to follow best practices and end up creating your own problems, there is a myriad of techniques to limit the impact of touching a single file on your builds. Using a facade class to make the serialized JSON an implementation detail is perhaps the lowest-effort one, but the textbook example would be something like a PIMPL.
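
For illustration, a minimal PIMPL sketch (the class, its members, and the choice of nlohmann/json are my assumptions, not anything from upthread):

    // json_parser.hpp -- cheap to include: no JSON library header in sight
    #include <cstddef>
    #include <memory>
    #include <string_view>

    class JsonParser {
    public:
        explicit JsonParser(std::string_view json);
        ~JsonParser();                // defined in the .cpp, where Impl is complete
        std::size_t field_count() const;
    private:
        struct Impl;                  // forward declaration only
        std::unique_ptr<Impl> impl_;
    };

    // json_parser.cpp -- the heavy JSON header is now an implementation detail
    #include "json_parser.hpp"
    #include <nlohmann/json.hpp>

    struct JsonParser::Impl {
        nlohmann::json doc;
    };

    JsonParser::JsonParser(std::string_view json)
        : impl_(std::make_unique<Impl>()) {
        impl_->doc = nlohmann::json::parse(json);
    }

    JsonParser::~JsonParser() = default;

    std::size_t JsonParser::field_count() const {
        return impl_->doc.size();
    }

Touching the JSON library or the parsing logic now only recompiles json_parser.cpp; everything that merely includes json_parser.hpp is left alone.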

The main problem with the build times of C++ projects is not the build times per se but clueless developers who, oblivious to the problem domain, fumble basic things and end up creating their own problems. Once one of them stops to ask why the project takes so long to build, more often than not you find yourself a few commits away from dropping build times to a fraction of the cost. Even onboarding something like ccache requires no more than setting an environment variable.


Fundamentally broken, or waiting for modules to become a thing? I tried to use https://github.com/mjspncr/lzz3 ᵃ for a few years, but it became impractical for me to keep fiddling with the tooling.

a: You don't have a source file and a header file; you put everything in one file and lzz sorts it out during the build.



> CommonUtils.hpp

That's the root cause of the slow build. That file is likely to be depended on by way too many other files, triggering massive rebuilds when unrelated code is modified. The headers should be as granular as possible. Breaking up the generic utils file into many specific files will help contain the damage whenever any given file is changed.
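
As a sketch of what "granular" might look like (the file names and the trim() helper are invented; SomeData is borrowed from the example upthread):

    // Before: strings, JSON, time, ... all piled into CommonUtils.hpp,
    // so editing any one of them recompiles every file that includes it.
    // After: one focused header per concern.

    // string_utils.hpp
    #include <string>
    std::string trim(const std::string& s);

    // json_utils.hpp
    #include <string_view>
    struct SomeData;    // an incomplete type is fine in a pure declaration
    SomeData get_data_from_json(std::string_view json);

A file that only needs trim() includes only string_utils.hpp and is untouched when json_utils.hpp changes.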

I wish it were possible to track source code dependencies at the function level.


It's not just that. If ALL that was in the header was the function prototype, that adds basically nothing to the compile time. The problem is when you have significant codegen and parsing in headers, like you do with templates and class definitions and stuff like that.

Like, most C projects include enormous headers with gazillions of function prototypes without a particularly measurable impact on compile times (compile times in C are quite good, in fact!)
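
One way to get the cheap-prototype behavior back in the templated example upthread is explicit instantiation: only the declaration lives in the header, and the heavy body is compiled exactly once. A sketch, assuming the SomeData/SomeOtherData types from that example and an invented some_data.hpp:

    // get_data.hpp -- declaration only, cheap to include anywhere
    #include <string_view>

    template<typename Ret>
    Ret get_data_from_json(std::string_view json);

    // get_data.cpp -- the gazillion lines live here, compiled exactly once
    #include "get_data.hpp"
    #include "some_data.hpp"    // assumed header defining SomeData and SomeOtherData

    template<typename Ret>
    Ret get_data_from_json(std::string_view json) {
        Ret result{};
        // ... the template-heavy parsing logic goes here ...
        return result;
    }

    // The only instantiations callers can link against:
    template SomeData get_data_from_json<SomeData>(std::string_view);
    template SomeOtherData get_data_from_json<SomeOtherData>(std::string_view);

The trade-off is that supporting a new return type means touching get_data.cpp, but then only that one file ever recompiles.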


Right. For a second I forgot this was a C++ discussion.

Breaking up the headers into granular files should still help reduce the amount of instantiation that's going on at compile time, provided there isn't much overlap in the headers included by the source files.


GCC, Clang, and MSVC all support precompiled headers. How often is a unique and minimal set of #includes worth the extra cost?


That helps reduce the cost of parsing the headers but doesn't eliminate the issue. Changing a header triggers a rebuild of everything that includes it. If the header is ubiquitous, nearly everything gets rebuilt.

We want to reduce the set of rebuilt files to a minimum. That means separate headers, so that files that use A don't need to be recompiled just because B changed while A and B happen to be defined in the same header.

Taking this logic to the extreme would lead to one file per type or function. I've read a lot of code that's structured this way and it works well. Editing lots of small files is a little annoying though. In the end it's a tradeoff.



