
A meta-analysis of C++ projects - Topolomancer
http://bastian.rieck.me/blog/posts/2018/cxx_meta_analysis/
======
gmfawcett
Let me play the pedant: technically, this is an analysis, not a meta-analysis.
A meta-analysis is an analysis of other analyses (typically, analyses
completed by other people, and discovered through a literature review). The
author has completed a single (interesting!) analysis of multiple projects.

[https://en.wikipedia.org/wiki/Meta-analysis](https://en.wikipedia.org/wiki/Meta-analysis)

~~~
Topolomancer
I am the author of the article. Thanks for the kind words! You are right in
pointing out this incorrect terminology. I should know better and have changed
the title :)

~~~
gmfawcett
Oh, that's great! Thank you for taking my comment constructively.

------
saagarjha
A lot of the headers in the "tail" are included by the other headers, or made
available in some other way, which reduces their usage. For example,
std::tuple is defined in both <tuple> and <utility>, which would help explain
why <tuple> is lightly used. Most of <new> is implicitly defined by the
compiler. <initializer_list> is "built-in". <iterator> is generally subsumed
by "auto" and type inference.

~~~
smitherfield
_> <initializer_list> is "built-in"_

You still need to include the header for any of the following:

1\. Constructors (or functions) with `initializer_list` parameters.

    
    
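      #include <initializer_list>
      #include <vector>

      template<typename T> struct S {
          std::vector<T> vec;

          // The parameter type is what makes the header necessary.
          S(std::initializer_list<T> il)
              : vec(il) {}
      };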

2\. `initializer_list` literals in range-for loops.[1]

    
    
      for (int prime : {2, 3, 5, 7, 11})
          // ...

3\. Named `initializer_list` variables.

    
    
      const auto primes = {2, 3, 5, 7, 11};
      const std::initializer_list<int> primes2 =
          {13, 17, 19, 23, 29};

_> <iterator> is generally subsumed by "auto" and type inference._

The iterator adapters[2] are still pretty useful and (I thought?) commonly
used.

Also, the "range access" and (new for C++17) "container access" free functions
are very useful for writing template code that works with both C arrays and
STL-style containers, but they're defined in many other headers.
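To make the range-access point concrete, here is a minimal sketch of template code that accepts both C arrays and STL-style containers. The helper names (`sum_of`, `count_of`, `max_of`) are made up for illustration, and `std::size` assumes a C++17 compiler.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>  // not needed by the helpers; only for trying them on a std::vector

// Hypothetical helpers for illustration; none of these names come from the thread.

// Sums any range that std::begin/std::end accept (C arrays, std::vector, ...).
template <typename Range>
long sum_of(const Range& r) {
    return std::accumulate(std::begin(r), std::end(r), 0L);
}

// Element count via std::size (C++17), which also works on C arrays.
template <typename Range>
std::size_t count_of(const Range& r) {
    return std::size(r);
}

// Largest element; the range must be non-empty.
template <typename Range>
auto max_of(const Range& r) {
    assert(std::begin(r) != std::end(r) && "range must be non-empty");
    return *std::max_element(std::begin(r), std::end(r));
}
```

The same three functions can be called on `int raw[] = {2, 3, 5};` and on a `std::vector<int>` without any overloads, which is the appeal of the free-function form over member `.begin()`/`.size()`.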

[1] Useful with variadic templates:

    
    
      [](auto&&... args) {
          for (auto &&arg : {args...})
              // ...
      }

[2]
[http://en.cppreference.com/w/cpp/iterator#Iterator_adaptors](http://en.cppreference.com/w/cpp/iterator#Iterator_adaptors)

~~~
saagarjha
By "built-in", I was mostly talking about the fact that the literal syntax
doesn't require a header. Case in point, here's some code I took straight from
a toy AVL implementation I wrote a while back:

    
    
      auto children = {&grandparent->left, &grandparent->right};
      auto child = elvt::find_if(children, [parent](auto &child) {
          return *child == parent;
      });
    

Yes, it's an abuse of initializer_list as a convenient "array-like"
structure, but it does show that not every use of initializer_list _needs_ to
be one of the cases you've mentioned. (In fact, I probably use
initializer_lists more this way than in their "intended" way. I'm sure someone
is going to yell at me some day for it, but nobody's done it yet.)

~~~
smitherfield
That requires including the header. It's probably included in one of the other
headers you included, but that's not required by the standard.
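A small sketch of what spelling the include out looks like, assuming nothing else in the file pulls it in. The function names are hypothetical and only there to give the deduced list something to do.

```cpp
#include <cassert>
#include <initializer_list>  // spelled out rather than relied on transitively

// Hypothetical helper: returns the smallest listed value; the list must be non-empty.
inline int smallest(std::initializer_list<int> values) {
    assert(values.size() > 0 && "list must be non-empty");
    int best = *values.begin();
    for (int v : values) {
        if (v < best) best = v;
    }
    return best;
}

// `auto` with a braced list deduces std::initializer_list<int>,
// which is only well-formed when <initializer_list> has been included.
inline int smallest_prime_example() {
    const auto primes = {7, 3, 11, 2, 5};
    return smallest(primes);
}
```

In practice most standard library headers drag `<initializer_list>` in, so code like this often compiles without the explicit include; the include just removes the dependence on that.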

------
electricslpnsld
> most prominently, shared_ptr

Is the 'prominent' conclusion part of the analysis (I couldn't find anything
in the post), or just conjecture? Anecdotally I see unique_ptr far more often
than shared_ptr, but I would be interested in hard stats on which is more
commonly used.

------
tlb
In all my code, #include <vector> only appears once. That's because I have a
file std_headers.h, which includes <stdlib>, <algorithm>, <string>, <vector>,
and dozens more, with several #ifdefs for Linux, FreeBSD, Darwin, and some
embedded systems. Every other C++ file includes it. If you're curious:
[https://github.com/tlbtlbtlb/tlbcore/blob/master/common/std_...](https://github.com/tlbtlbtlb/tlbcore/blob/master/common/std_headers.h)

I hate having a bunch of boilerplate at the top of each file.

~~~
arximboldi
Not a great idea if you care about compile times.

~~~
giomasce
Unless they use precompiled headers.

~~~
tlb
I've tried a few times to make PCH work reliably, but it's hard to get the
Makefile right. My current project compiles both a regular executable and a
Node.js module, which have different build processes.

Is there an open-source project with a reliable and portable PCH build
process? Searching GitHub didn't find me anything to copy from.

~~~
ecuzzillo
I just went through the process of making a reliable and portable PCH build
process, and it's a lot of code to get right, because clang, gcc, and cl.exe
all have quite different models for how a pch should work.

cl.exe in particular has an insane set of requirements:

\- you can't compile the pch by itself; you have to have a cpp that #includes
it and compile that;

\- then you have to link the resulting .obj file into the final binary;

\- if you use /Zi and you compile different versions of the pch for different
headers, you have to have exactly one separate .pdb file for each compiled
version of the pch;

\- if you have more, it will complain that some (pch, cpp) tuples do not share
a .pdb;

\- if you have fewer, it will stochastically fail to open the global pdb file
(because the pch writing process expects not to have contention for the pdb,
even though it's using mspdbsrv and /FS for contention).

------
nthompson
Very cool analysis.

I would add: a lot of regex work is still being done with boost.regex; maybe
most projects still haven't made the switch? For example, I think boost.regex
is still used in Chrome.

Just ran your code on boost, looks like "shared_mutex" is the least used
header there.

I use <future> all the time (in boost, no less) so it might be selection bias.

~~~
smitherfield
Allegedly, the quality of STL <regex> implementations leaves much to be
desired.

~~~
alexhutcheson
I default to using RE2[1] in most cases. If I really need backtracking, I
would probably just use PCRE through the C++ wrapper[2], but that hasn't been
a common need in my experience.

[1] [https://github.com/google/re2/](https://github.com/google/re2/)

[2]
[https://www.pcre.org/original/doc/html/pcrecpp.html](https://www.pcre.org/original/doc/html/pcrecpp.html)

------
akrasuski1
It would be nice to divide the co-occurrence count by the counts of the two
respective headers, to get a sort of correlation. Right now it's quite hard to
see anything beyond "most popular things are paired with most popular things",
which is obvious.
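One possible reading of that normalization, as a rough sketch: divide the pair count by the geometric mean of the two header counts, which keeps the score between 0 and 1. The square root is my assumption (dividing by the plain product is another option), and the function name is made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not taken from the article's repo.
// pair_count: number of files that include both headers.
// count_a, count_b: number of files that include each header on its own.
// Returns pair_count / sqrt(count_a * count_b), a cosine-style score in [0, 1].
inline double normalized_cooccurrence(long pair_count, long count_a, long count_b) {
    assert(count_a > 0 && count_b > 0);
    assert(pair_count >= 0 && pair_count <= count_a && pair_count <= count_b);
    const double scale =
        std::sqrt(static_cast<double>(count_a) * static_cast<double>(count_b));
    return static_cast<double>(pair_count) / scale;
}
```

With this scaling, a rare pair that appears together in 9 of 10 files scores the same as a popular pair that appears together in 900 of 1000, so popularity alone no longer dominates the picture.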

~~~
Topolomancer
Good suggestion! I will add this and update the repo/post.

------
crb002
C++ needs a standard "contrib" set of headers like Ruby. Relatively stable,
but only anointed if moved into the STL.

~~~
coldacid
Kind of like Boost?

------
nerdymanchild
I love C++; I just wish there were some way to make it memory safe. At the
same time, the Java stack is too heavyweight for the things I use C++ for. If
there were some sort of memory-safe language with the same use cases as C++,
that would be game changing.

~~~
electricslpnsld
Swift, Go, Rust, etc, all fit your description.

edit: De-Germaning

~~~
twic
What's usw?

There are use cases for C++ that Swift and Go don't fit, due to having
dynamically managed memory.

Fans of D would also put D on this list. And Ada is not dead yet!

~~~
woodson
“usw” is a common German abbreviation that means “etc.”

