Everything You Never Wanted to Know About CMake (izzys.casa)
138 points by pmoriarty on Feb 3, 2019 | 87 comments

Turns out I'm not alone in spending considerable time fighting CMake configuration issues. I ended up patching it to support a basic debug-server protocol, so you can step through the CMakeLists.txt files in a debugger. In case anyone's interested, here's the CMake fork with debug support [0], and there's a detailed tutorial [1].

[0] https://github.com/sysprogs/cmake

[1] https://visualgdb.com/tutorials/cmake/debugger/

Have you talked to the CLion team about this, whether to integrate it into the product or make a plugin for it (if such a thing is possible)?

Given how CMake centric CLion is, and how abusively dumb the CMake DSL is, your project sounds like a great thing to teach CLion about.

CMake is awful, and the fact that the C++ community seems to be settling around it will be a massive disadvantage in the long run. Why do you think so many WASM examples use Rust? It's because Cargo is a sane build system that makes cross-compilation easy.

Issues with CMake:

- The DSL is not very good. It needs proper functions, for one.

- Non-hermetic builds (also mentioned in this thread)

- No ability to easily query the build DAG

- Headers are not modelled properly (they should be a dictionary of paths -> paths, not a list of include directories)

- Build folders cannot reliably be reused across configurations, leading to confusion and cache misses

- Everything is convention driven. It does not model the build graph properly.

- Globs do not work properly (maybe that has changed recently?)

- The cache is not portable across a network, or even between folders on the same machine

The C++ community deserves better!

Others in this thread have mentioned some modern alternatives:

- Buck Build (Facebook, Uber, AirBnB)

- Bazel (Google)

- Pants (Twitter)

- Please (Thought Machine)

It doesn't actually matter which of these succeeds. They all model the builds in such a way that you can easily transpile between them.

Buck, Bazel, Pants, Please, Meson, Gn, SCons. I haven't even finished reading comments in this HN entry and I'm already lost

My only experience using Bazel involved discovering that build instructions from August have already been broken by deprecations. It was not a good first impression.

- Headers are not modelled properly (they should be a dictionary of paths -> paths, not a list of include directories)

What do you mean by this exactly? Perhaps you are thinking of the old include_directories() function rather than the now-recommended target_include_directories()? This attaches one or more include directories to each target rather than having a single global list of all include directories. You can indeed have a target for each directory of your source files (this is also recommended practice).

This even works for imported targets. These are targets that represent existing prebuilt libraries on your system, and they are increasingly returned by find_package in place of the old pair of ${FOO_LIBRARIES} and ${FOO_INCLUDE_DIRS}. Internally they call target_include_directories() with the PUBLIC (or INTERFACE) modifier, which means those directories will be inherited by other targets that link against this one.

The idea (only implemented in Buck AFAIK) is to model not a list of include directories, but a mapping from the path specified in your "#include" to the actual file on disk.

So your code might look like this:

    #include <foo/bar.hpp>
The project folder structure might look like this:

    └── src
        └── include
            └── foo
                └── bar.hpp

And the mapping might look like this:

      "foo/bar.hpp": "./src/include/foo/bar.hpp"

This enables the build system to know exactly which headers are meant to be exposed by a library to its dependees. The build system can then tell you:

(1) Exactly where a header comes from

(2) Exactly which headers a library exports

(3) If two libraries will collide in terms of headers

(4) If someone is trying to use a header that is not explicitly exported by a library.

(5) If someone is accessing a header in the wrong way (e.g. abusing the layout of source files rather than using the correct include path)
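As a sketch of what an explicit header map buys you, here is a toy checker over such a mapping (my own illustration, not Buck's actual code; the `check_include` helper and the example dictionaries are hypothetical) covering points (1) through (4) above:

```python
# Toy model: each library maps include paths (as written in #include)
# to the files on disk that back them.

def check_include(include_path, libraries):
    """Resolve an #include against each library's exported-header map."""
    owners = [name for name, exported in libraries.items()
              if include_path in exported]
    if not owners:
        # Point (4): the header is not explicitly exported by anything.
        return f"error: {include_path!r} is not exported by any dependency"
    if len(owners) > 1:
        # Point (3): two libraries collide on the same include path.
        return f"error: {include_path!r} collides between {owners}"
    name = owners[0]
    # Points (1) and (2): exact origin, exact export.
    return f"{include_path!r} -> {libraries[name][include_path]} (from {name})"

libraries = {
    "foo": {"foo/bar.hpp": "./src/include/foo/bar.hpp"},
    "baz": {"baz/util.hpp": "./baz/include/baz/util.hpp"},
}

print(check_include("foo/bar.hpp", libraries))    # resolves to one file
print(check_include("foo/secret.hpp", libraries)) # not exported -> error
```

With a plain `-I` include-directory list, none of these checks are possible, because the build system never knows which files an include path is supposed to resolve to.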

This approach scales very well. Constructing the mapping can be done via globs (globs work properly in Buck), so most projects just do:

    exported_headers = subdir_glob([
      ('include', '**/*.hpp'),
    ])
Additionally, the RHS of the mapping might be another build rule, thus supporting generated headers in hermetic builds.
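To make the semantics concrete, here is a rough emulation of what a `subdir_glob`-style helper does (an approximation for illustration, not Buck's actual implementation): glob under a prefix directory, then strip the prefix so the keys are the paths consumers write in `#include`:

```python
import pathlib
import tempfile

def subdir_glob(root, patterns):
    """Build an include-path -> on-disk-file mapping from (prefix, glob) pairs."""
    mapping = {}
    for prefix, pattern in patterns:
        base = pathlib.Path(root) / prefix
        for path in sorted(base.glob(pattern)):
            # Key: the path a consumer writes in #include.
            # Value: the actual file on disk.
            mapping[path.relative_to(base).as_posix()] = str(path)
    return mapping

# Demo on a throwaway tree mirroring the layout above.
root = pathlib.Path(tempfile.mkdtemp())
(root / "src/include/foo").mkdir(parents=True)
(root / "src/include/foo/bar.hpp").write_text("// header\n")

exported = subdir_glob(root, [("src/include", "**/*.hpp")])
print(exported)  # {'foo/bar.hpp': '<tmpdir>/src/include/foo/bar.hpp'}
```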

The ability to specify headers at a finer granularity than directories sounds wildly overcomplicated, to be honest. I'm not aware of any other language that works at that level either (maybe Python's import mechanism lets you tinker at that level, but it would be extremely niche). I'm also a bit confused about how it can work even in C/C++; surely most compilers don't have the capability of specifying header file locations individually, as opposed to the classic -I switch (or /I for Visual Studio).

I've used generated header files (protobuf) with CMake and they work fine. Although I prefer to create them at configure time rather than build time, even though that's philosophically incorrect, just because that way I can see them in the generated IDE projects and they don't break autocomplete.

I have used Buck for many projects, and in practice it is not complicated at all... in fact I think it might be simpler. The reason is that you define the includes relative to a single path, and the layout is just a Python (well, Skylark) dictionary in the build system. You never have issues with includes coming from the wrong paths, or the build breaking because files have moved around and someone wasn't including things the correct way. Proper globs are also amazing from a build-maintenance perspective.

You can also wire up Protobuf headers in the map.

Cargo (and Rust tooling in general) still has an annoying design flaw in my mind: they all pretend that system package managers don't exist.

Admittedly, with Rust not having a stable ABI, you'd still have to rebuild all the packages whenever the compiler was updated, but I'd like to see rustup support toolchains installed to the system.

This is a rustup issue, not a Cargo issue. You can use a system-installed Cargo/rustc just fine.

rustup link can point to a system installed rust, so should be fine!

I wouldn't recommend bazel to anybody. I've had an absolutely terrible experience with it. First of all, it's slow. Secondly, it keeps breaking builds with every single release. Thirdly, the language is weird and not at all obvious and it makes monkey patching hard (this is important for distro packagers).

Many C/C++ projects are switching to Meson these days. Mesa for example.

My experience with Meson is that it is much better designed than CMake but very slow.

Meson itself, though, is just a meta-build system; it's used in combination with Ninja, which does the actual build. Ninja is quite fast. But they claim, at least, that they are working on making Meson itself fast too.

This idea of having a "front-end" build system and a "back-end" build system is a bit of a throw-back. It is better to integrate the two for better optimization and configuration management.

I don't know if that's true. Don't you gain all the usual advantages of decoupling systems (clarifying responsibilities, documentation, independent growth and optimization, interoperability with other programs, easier testing) if you are forced to define a clear interface between the two?

It certainly seems like specifying a build process and performing a build process are two distinct things.

They are different stages of the build, yes (although in a hermetic system you could in theory run the build process in parallel with the configure step for different targets). The challenge is integration. Things are much simpler when you ship the two programs together since they need to be compatible in terms of folder structures etc. It is also nice to keep things in memory sometimes. In other words, I think they should be different modules of one build system, rather than completely different programs.

To your point, integration will allow the two to be iterated upon faster, since changes can be more easily released together.

Better integrated, maybe, but not merged into one. Super-optimized build systems also increase the complexity of their usage. So the idea is to have a user-friendly front end which produces complex input for a very fast back end.

The Ninja authors describe their design as "assembly-like". While assembly surely allows you to produce very fast results, it's not easy to use at all if you are using it directly.

> Since super optimized build systems also increase complexity of their usage.

That's not correct. Complexity arises from the inability to express what you want and a lack of abstractions over platform details.

Ninja, for instance, is too primitive to be productive with directly, and it does not have any understanding of C++.

Build systems like Buck allow you to express things in a declarative manner, e.g.:

    cxx_library(
      name = 'foo',
      srcs = glob(['src/**/*.c']),
      exported_headers = glob(['*.h']),
    )

    cxx_binary(
      name = 'app',
      srcs = ['main.c'],
      deps = [':foo', ':bar'],
    )

Buck implements sophisticated (disk & network) caching, optimization, and scheduling strategies to make things fast.

Furthermore, it will use services like Watchman (if available) to precompute what needs rebuilding when you change files, enabling fast incremental builds.

Lastly it will strip and sort symbols in your binary to make sure the hash of your binary is always the same for the same set of inputs.

All this complexity is handled for you and all you need to do to run your executable is

    buck run :app

Ninja is by design low-level; it doesn't need to know about C++. That's the point of having a front end that knows about libraries and such.

So I'm not sure what your argument is. It's like saying that assembly has no high-level abstractions. Sure, it doesn't. That in itself is not an argument against splitting the build into several passes.

That said, Buck looks like an interesting build system.

In general, to have proper handling of libraries and the like, the language itself should support the notion of modules, like Rust does. Then you can implement sane tools (cargo). C++ is still crippled in this regard, though there are some ideas for how to improve it.


I've used Meson to build Mesa, as complex a project as you are going to find, and, entirely as expected, from invocation to generated Ninja build files it takes... maybe a second?

Safe to say, if you are building any C++ specifically, that is entirely and utterly negligible.

Mesa surely builds a lot faster with Meson+ninja than with autotools.

There's just literally nothing good that can come of calling your product "pants". I don't care if it's amazing! - because I already know that, actually, it isn't, simply on the basis that its authors' judgement is clearly unsound.

CMake suffers from a lack of conventions. If you want to follow some set of conventions (one directory for your headers, another for your sources, another for your tests, and so on) you still end up having to write a hundred lines or more of error-prone CMake.

As ugly as things like Maven and Gradle are, if you follow all of their conventions they get out of your way.

In addition:

- no cross-platform way to require a compiler version or way to set compiler flags

- linkers are not abstracted from you (good luck trying to get cross-platform support for relocatable binaries; cmake really bungles up rpath etc)

- also good luck trying to link both static and dynamic libraries and targets

- third-party libraries aren't really a first-class thing

- no simple way to specify "build all these things into the target directory with a conventional file layout"

80% of projects need to deal with these and not a lot else. Why is there not a list of "follow these conventions and you don't really need to write or look at a CMake file"?

(To be fair: compilers, linkers, and OSes share some blame in not being more consistent but the goal of a competent build-system is to handle these things for the 80% case.)

It doesn't really matter how terrible the syntax and internal implementation are if most reasonable use-cases don't need to look at it.

They recently added a convention: target-based CMake!

Now, because they are switching to what every sane build system already did 5 years ago, but over the course of dozens of 3.x versions, your CMake build files are about to explode in complexity and, frankly, stupidity. Lots of "does this target exist already, because I need to support some older version alongside the bleeding edge", and usually, if not, just copying that stuff over from their HEAD.

Everything they touch turns terrible. We just need to stop at, like, version 3.10, pretend CMake is dead, and slap a DEPRECATED warning at the beginning. It's the only way out.

Author of the post here.

The compiler version thing is definitely an issue. That said, if you're worried about having to litter `if/endif` calls all over the place, I recommend you look into CMake's generator expressions.

Relative rpath support was added very recently and will be in the CMake 3.14 release. It's a shame it took this long to make it in.

Third-party libraries can be imported via add_library and then setting the imported location. This allows, I should note, linking against both static and dynamic libraries.

IXM is being developed to make it easier to do the layout system as well as make the 80% you've mentioned here basically a non-issue, and comes with a fairly decent (in my opinion) default layout. Because it's not even in an alpha state I've not had time to document it.

(Also, I am working on a build system replacement for CMake, which includes a compiler frontend translator so you can solve the cross platform compiler flag issue. CMake is simply being used to "brute"strap the project until it is self hosting)

I want to upvote this for solid info, but I want to flag it so fewer people have to know about CMake. It blows my mind that experienced software developers would develop yet another pseudo-language DSL instead of creating a portable library (Python, Java, Go, whatever) with a sensible interface.

I think a build system should always allow converting its build description into a bash file containing a more or less linear account of all the actions needed (without support for incremental builds). If every library were distributed with such a linear script, the user would be guaranteed not to get stuck on obscure build problems. And of course, they can first try the official incremental route (CMake in this case) if they like.

Assuming that linear account works.

Most of the build issues a user will face come from their system not matching the developer's.

We can have pretty complicated logic (arbitrary logic, in most build systems) for probing the system, and much of this logic will inevitably get lost when compiling a linear script.

Once you put all of that complexity into the compiled script, I don't think you've gained anything over "make clean; make" (or equivalent).

Maybe, but bash is hardly less weird than CMake, especially for Windows devs.

Creating your own build tools leaves them unmaintainable by anyone else. People do it all the time, and pay the price when they have to change their builds. There is a reason CMake is used by a lot of projects; it's not simply due to everyone's incompetence.

If the build system were a library called from a popular scripting language, it would only take a few minutes of reading the API docs before the build could be modified. Also, there is a niche for new build systems: one-off projects that "don't matter." If you want to introduce a new build system, your first target market could be throwaway point-A-to-point-B projects, and then you could expand outwards from there.

That's basically Gradle, and if you think CMake is bad you haven't seen some of the monstrosities that can be created when you open up a whole programming language.

From what I've seen, Cargo gets most of the parts right. You can have a programmatic pre-build script, but it can't take over the whole build process.

Yes, with Gradle you can create true monstrosities, and even my own projects aren't free of those, sadly, because sometimes there's just no alternative.

Here’s a few examples:

Dynamically using git describe to determine the version and build number (which is actually just the number of commits that ever occurred, to allow reproducible builds):

    versionCode = cmd("git", "rev-list", "--count", "HEAD")?.toIntOrNull() ?: 1
    versionName = cmd("git", "describe", "--always", "--tags", "HEAD") ?: "1.0.0"
Putting git info into certain classes:

    buildConfigField("String", "GIT_HEAD", "\"${cmd("git", "rev-parse", "HEAD") ?: ""}\"")
    buildConfigField("String", "FANCY_VERSION_NAME", "\"${fancyVersionName() ?: ""}\"")
    buildConfigField("long", "GIT_COMMIT_DATE", "${cmd("git", "show", "-s", "--format=%ct") ?: 0}L")
Using string interpolation to change the output file name to include the version:

    setProperty("archivesBaseName", "Quasseldroid-$versionName")
But the worst part is that often on StackOverflow you only find ugly hacks: adding custom tasks, removing tasks, modifying them, etc.

And even worse, sometimes there’s just no good alternative at all.

Which is why it took Gradle so long to clean up some parts of their API, or to introduce parallel builds.

Buck and Bazel work this way. The language is Python-like (Skylark, a restricted dialect). However, the build is fast because it is only used to describe the build graph.

SCons in particular is along the lines of the latter, in that it's a build system driven from ordinary Python scripts. But it never really caught on, IMO.

Probably the biggest feature of a configure/build/packaging tool is ubiquity. Ideally C/C++ would have had a prescribed-but-not-required tool like cargo/setuptools/go get/etc. IMO, CMake is probably the least bad offering these days.

I read about scons, loved it, tried to use it, gave up. Tried cmake, found it workable but super confusing. Went back to make.

To be fair, CMake's language was first created somewhere back in 1996. It's quite old at this point and is basically in the "sunk cost fallacy" range of existence.

No, a good (not turing complete, easy to use, with types) DSL is a good thing. Meson does it right.

I'm not sure what you mean. If for whatever reason I need to set up a big C/C++ application so that it can compile on diverse systems, CMake is one of my best bets for getting that to happen. If anyone would like to suggest a better option I would gladly switch, but as it stands I think CMake is pretty useful to know about.

I'm not arguing that it doesn't do what it was designed to do. I'm arguing that it is a grotesque DSL with bizarre design choices. CMake would have been better off as a cross-platform library that didn't reinvent concepts other programming languages have already gotten right, and instead focused on making the build system itself cross-platform.

I've mentioned the same thing before, and ironically, CMake was developed in the course of building medical imaging software [0] that was already using Tcl, a flexible, embeddable language, but they decided to roll their own ad hoc language instead. They left a LOT on the table with that decision, I think.

[0] https://en.wikipedia.org/wiki/VTK

I think the op's point is that it was not necessary to invent a new programming language for this. You could write a library for existing languages (probably with a custom entry point) that does the same thing, but without a new and confusing syntax.

Bazel works very well for C++ applications. Cross-platform builds are well supported.

Bazel can't even build static libraries.


False. The bug above represents a nuanced case.


Scons came to prominence around the same time as CMake and took that approach, but seems to have been less popular.

CMake is very powerful, but unlike most things there's no underlying principle or theory such that, once you grasp it, everything just clicks and you can transfer the knowledge to other languages. It's just a matter of memorising lots of tricks and special cases.

This is mostly true about the language but not about the way build artifacts and their dependencies work. That part is pretty solid.

The syntax of CMake is not very elegant, and it is error-prone. It also does not scale to large projects; configure times go up quickly. The documentation also leaves much to be desired.

That cmake is in the place it is nowadays is a reflection of the state of cross-platform build systems 10 years ago, and possibly some marketing.

Interesting alternatives worth checking out: Meson and gn (the latter can be used for building llvm)

I'd like to add Buck and its siblings Bazel & Pants to this list. Buck is used by large companies like Dropbox, Facebook, and AirBnB.

Additionally, I want to mention that my company also released a package manager that uses Buck as a packaging format: https://github.com/LoopPerfect/buckaroo

And so far over 320 libraries have been ported to buck and are maintained by our bots: https://github.com/buckaroo-pm

I don't get the hate for CMake. It has several useful advantages over plain makefiles:

* It's meant to be cross-platform, so a well structured CMake file will work in Windows.

* CMake modules allow you to include source-based libraries without a lot of drama. This is especially useful for cross-compiling or embedded use-cases.

* Out-of-source builds are supported with no additional work on my part. This is great for CI generating debug, release, and other build variants from a single cloned repository.

Comparing anything against plain Makefiles is a really low bar. Makefiles have been around since 1976. I think the hate for CMake comes from other directions entirely. The CMake scripting language is especially bad. Out-of-source builds were never really that hard in the first place. I think people hate CMake either because they were doing something slightly more unusual than CMake tolerated, or because they hated CMake's language (which is very easy to hate).

CMake has simply sucked less, overall, than the competitors of its time, for most common build tasks. That era is over, now that the new generation of build systems is here (Bazel, Buck, Pants, Please).

I wonder how Makefiles got such a bad reputation. After decades of reading how bad Makefiles are, I recently tried writing one for a moderately complex build, just for fun. I was pleasantly surprised: make is very fast, it's well-documented, and the execution model is so simple that I found it very easy to figure out how to script the things I needed to do. I'm not saying it's great, but there are a lot worse build systems around that people still actively advocate (cough msbuild).

Over the past 30 years or so, build tools for C and C++ programs have converged to what you see today.

For example, with GCC you can use the -M options and then -include the result in your makefile. This was not always possible: you had to manually specify the .h files for every .c, and if you omitted one, you could get a successful build but a broken program!
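The `-M` output is just a Make rule of the form `target: prerequisites` with backslash continuations. As an illustration of the mechanism, here is a toy parser for it (a deliberately simplified sketch; it ignores corner cases like escaped spaces, drive-letter colons, and multiple targets):

```python
def parse_depfile(text):
    """Parse a gcc -M style dependency line into (target, prerequisites)."""
    # Join backslash-continued lines, then split at "target: prerequisites".
    joined = text.replace("\\\n", " ")
    target, _, deps = joined.partition(":")
    return target.strip(), deps.split()

# What `gcc -M main.c` might emit (two physical lines, joined by "\").
depfile = "main.o: main.c foo.h \\\n bar.h\n"

target, deps = parse_depfile(depfile)
print(target, deps)  # main.o ['main.c', 'foo.h', 'bar.h']
```

Make's `-include` does exactly this parse internally, which is why a stale or missing `.d` file silently degrades to "no extra prerequisites" rather than failing the build.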

Then consider the process of building a shared library, which is different on different platforms, but if you restrict yourself to GCC on Linux with GNU Binutils it’s damn easy.

There are a few other, minor failings of Make. But you’re right, it’s very comprehensible and straightforward. I still say that it’s a low bar, though, by modern standards.

How about significant whitespace with a difference between tabs and spaces, for starters?

* Poorly designed DSL.

* "Everyone" knows GNU make.

The lead at my old job was obsessed with CMake. It was an improvement over Boost.Build (barf...), and to the degree it replaces something like autotools I'm all for it, but it was yet another DSL in a company filled with DSLs. It took us 3 months to get new devs up to speed because of all the stuff they had to learn. I can take a software dev (embedded, because that was the role) off the street and there's a 90% chance they're conversant in GNU make; about 90% have no idea how to write a CMake file. When I looked at the CMake files, I noticed source and target rules were spread out over the file, which seemed like a really bad idea to me.

Yes, CMake lets you do great things, but holy moly is writing CMake files painful.

> It's meant to be cross-platform, so a well structured CMake file will work in Windows.

A well-structured, more conventional makefile can also run on Windows. The tree I'm working on builds for Unix and Windows with the same makefile. (No Cygwin or WSL either, just GNU make running on Win32. A few ifdefs and strategically placed variables.)

On Windows, do you edit and debug your code in Visual Studio, or some other editor and just use the MSVC command line tools for building? Because one of the (very few) things I actually like about CMake is that it produces project files that Visual Studio can open directly.

I debug with windbg on Windows. I like it much more than the VS debugger. Habits I picked up from when I worked at MS.

I feel dirty whenever I modify a CMake file but you, madam/sir, have opened the gates of hell and had me peer in. shiver

I'm always willing to help others pierce the veil of reality from time to time. Hope you enjoyed the post :)

I used Makefiles in university, and that was simple enough until you needed to do something more complicated, at which point it became unwieldy fast. Back then GNU was popular, and automake/autoconf were attached to every project. If you were building on a Linux system it worked, but being a Mac user, it always came with gotchas. It was quite a pain to use, honestly, and required a very deep understanding of its .m4 design if you wanted to do anything clever.

CMake seemed to work well, meaning I could just open the GitHub project and generate an Xcode project file and bam, it just worked. However, agreeing with everyone here: holy hell, trying to start a new project with it, or migrate another project over to it, is quite a futile experience.

It has several escape hatches in the weirdest places. Need to pass a conditional compiler flag? Good luck with that exercise. A compiler flag was added by a previous macro and there doesn't seem to be a way to keep it from doing that. I don't know; the whole system seems nice when you don't have to touch the build file, it just works, but I can only imagine the hair-pulling effort it took to get that build file into a workable state.

The thing I want most from a build tool is not a nice DSL or portability. What I want most is hermetic builds. There are only a few tools out there that do it. CMake is not one of them.

What do you mean "hermetic" in this context, and why does CMake fail?

Also, for my purposes (using both Windows and GNU/Linux at work), portability is a tremendous benefit.

In this context, "hermetic" means that for every step in the build, all inputs and outputs are known by the build system. Hermetic builds are reproducible, and can be more easily done by a distributed build cluster. Hermetic build rules can also be run in a sandbox, so you can have assurances that your build scripts are correct.

With non-hermetic builds, it's much more difficult to verify the correctness of your build rules, which means that you might get non-reproducible builds or you might get incorrect incremental builds. With hermetic builds, it's always safe to do an incremental build, no matter what state the repository is in.
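A minimal sketch of why this matters for caching (my own illustration, not any particular tool's actual scheme): when the cache key covers every input to a step — file contents, command line, tool version — the key is a sound cache address, and any undeclared input is exactly what breaks it:

```python
import hashlib

def cache_key(command, input_blobs, tool_version):
    """Hash *all* declared inputs of a build step into one cache key."""
    h = hashlib.sha256()
    h.update(tool_version.encode())
    for arg in command:
        h.update(arg.encode() + b"\0")
    for name in sorted(input_blobs):            # sorted: order-independent
        h.update(name.encode() + b"\0" + input_blobs[name])
    return h.hexdigest()

inputs = {"main.c": b"int main(void){return 0;}\n"}
k1 = cache_key(["cc", "-c", "main.c"], inputs, "cc-9.2")
k2 = cache_key(["cc", "-c", "main.c"], inputs, "cc-9.2")
print(k1 == k2)  # identical inputs -> identical key -> safe cache hit

inputs["main.c"] = b"int main(void){return 1;}\n"
k3 = cache_key(["cc", "-c", "main.c"], inputs, "cc-9.2")
print(k1 != k3)  # any input change -> new key -> rebuild
```

In a non-hermetic build, a step can read a file that never enters the key (say, a header found via an ambient include path), so two runs with "equal" keys can produce different outputs, which is precisely the broken-incremental-build failure mode.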

In my experience, this is usually handled with a combination of source control and containers (Docker, but more commonly just a chroot image).

I'm not sure how source control is connected. Normally, I want hermetic builds whether or not the files I am building are checked into source control yet.

Well, if you want repeatability, I'm assuming that means over some long timeframe, which means you need a way to get the code as it was at some point.

You then need to build that code using a consistent set of tools and libraries, which is where the chroot comes in.

Repeatability definitely does not mean some long time frame; it means any time frame. Short time frames have the largest impact, because they allow you to use a shared cache for build products.

One big chroot around your whole build system isn't enough, you can run into nondeterminism problems due to relative ordering of different build steps during execution. This is why it's nice that the new build systems make each individual step hermetic, because you can make a separate chroot for each individual step (Bazel executes each build step in a separate sandbox).

This allows you to cache all intermediate results, share the cache, and get the same results both for clean and incremental builds, repeatably. It's also easier to remove nondeterminism from individual build steps rather than looking at the whole build.
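A toy version of the per-step sandbox idea (an illustration only; real tools like Bazel use namespaces or chroots rather than file copies, and `run_step`/`fake_compile` here are invented names): each step runs in a fresh directory containing only its declared inputs, so an undeclared dependency fails loudly as a missing file instead of causing silent nondeterminism:

```python
import pathlib
import shutil
import tempfile

def run_step(declared_inputs, step, outputs):
    """Run one build step in a fresh sandbox seeded only with declared inputs."""
    sandbox = pathlib.Path(tempfile.mkdtemp())
    for name, src in declared_inputs.items():          # stage inputs
        dst = sandbox / name
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(src, dst)
    step(sandbox)                                      # run the action
    return {name: (sandbox / name).read_bytes() for name in outputs}

# A fake "compile" step that derives its output from its input file.
def fake_compile(root):
    (root / "main.o").write_bytes(b"OBJ:" + (root / "main.c").read_bytes())

src = pathlib.Path(tempfile.mkdtemp()) / "main.c"
src.write_text("int main(void){return 0;}\n")
result = run_step({"main.c": src}, fake_compile, ["main.o"])
print(result["main.o"][:4])  # b'OBJ:'
```

Because each step only ever sees its declared inputs, relative ordering of concurrent steps cannot leak state between them.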

Cool, didn't know that. I like the idea of breaking things up into steps.

So you're saying this takes care of race conditions in the build where things may happen out of order even if you have a fixed environment?

Yes. Since the different steps are executing in different sandboxes, you can be confident that there are no race conditions between them.

Which tools?

All of the newest generation of build tools.

- Bazel https://bazel.build/

- Buck https://buckbuild.com/

- Pants https://www.pantsbuild.org/index.html

- Please https://please.build/

I use CMake because most of the dependencies I consume use CMake, and tooling for CMake is more widely available.

For those who just want to get something building in cmake, just start with a template project and modify it.


I tried to learn CMake 4-5 years ago and just gave up. Cumbersome, weird...

I have always been using makefile, not perfect, but good/solid enough for all my use cases.

Maybe just use Python or the like to write a better wrapper for Makefiles? There are so many build systems to choose from these days.

Is there a good way to debug CMake build rules (especially in the presence of add_custom_command and friends)? I've got a case which builds differently in Make vs. Visual Studio: in VS, dependencies get messed up and some files don't even get built at all (and then the build fails later because those files are missing). I can't for the life of me figure out how to debug this; it's absolutely infuriating.

I've got by with some combination of the following:

- message(STATUS ...) or message(FATAL_ERROR ...) for printing stuff out at configure time, e.g.

    message(STATUS "libcurl found: ${HAVE_LIBCURL}")
- cmake -E echo for printing stuff out at build time, e.g.

    add_custom_command(TARGET lorenz POST_BUILD
        COMMAND ${CMAKE_COMMAND} -E echo "lorenz command line:")
- diagnostic-level MSBuild output for dependency issues. You can configure this in Visual Studio under Tools > Options > Projects and Solutions, I think (it's around there somewhere). You might not expect much from MSBuild debug output, considering how annoying the rest of it is, but it's actually extremely comprehensive, and I've found it useful for figuring out even rather weird stuff. There's quite a lot of it, though, so get a cup of tea.

A passing familiarity with the MSBuild syntax might be helpful, but I've managed to do mostly without.

cmake --trace and --trace-expand, maybe. Because a typical build of a sizable project runs on the order of 10k lines of CMake script, you can just dump it all and search. This worked pretty well for me a couple of times.

It always pisses me off when I have to build a GitHub project or something with cmake. Just use a Makefile and autoconf, dammit.

Every day I bitterly regret that they decided not to use Tcl for CMake in the end.
