
Include-what-you-use: Clang tool to analyze includes in C and C++ source files - ingve
http://include-what-you-use.org/
======
rsp1984
I would love it if imperfect #includes (which I understand this tool is
supposed to point out) simply became a compile warning and part of the
toolchain. I would use such a feature a lot.

However wrt this project I just looked at the "Instructions for Users" and it
tells me how to build it and that I need to observe this and that and a lot of
other things for it not to go wrong. Sorry, I just don't have the time for
that. Why not just provide binaries for the most important platforms?

~~~
gregorburger
[http://include-what-you-use.org/downloads/](http://include-what-you-use.org/downloads/)

------
fizixer
I'm trying to get into refactoring (and even semantic analysis, and
code generation) for large C++ codebases using libclang, libtooling, and their
ilk.

Any guides to get started, and/or best practices?

Does it increase productivity by an order of magnitude? (I'd like to think so.
I think it's madness not to use any AST tool when working on large codebases).

Also I'm assuming "white-box" C/C++ tools like libclang/libtooling are much
more powerful than black-box ones (like eclipse CDT), because of
extensibility/programmability. Any comments, experiences?

What I'm trying to say is that a non-IDE but programmable editor (vim/emacs)
combined with programmable libclang/libtooling would trump a prepackaged IDE
(like Visual Studio, Eclipse+CDT) by a wide margin. (Of course, there would be
quite a bit of programming involved.)

~~~
jjuhl
I think you'll enjoy this talk: CppCon 2015: Atila Neves "Emacs as a C++ IDE" -
[https://m.youtube.com/watch?v=5FQwQ0QWBTU](https://m.youtube.com/watch?v=5FQwQ0QWBTU)

~~~
25cf
The main feature that I miss from CLion is the ability to refactor. In CLion
you can rename a class, and all of its usages, declarations, etc. will be
renamed.

The most useful thing CLion can do is change the signature of a function, and
its corresponding implementation and header declarations will be changed. So,
you can rename a parameter and its declaration and implementation will be
renamed for you, or you can delete a parameter, or change the type of a
parameter, and CLion will change all its usages/definitions/declarations for
you. All of this takes effect across all files in your project, so if you're in a
.cpp file and you change the function signature, its declaration in the
corresponding .h file will change as well.

(I believe you can do all of this in Eclipse as well)

Sadly I have yet to see a vim/emacs configuration that gets anywhere near the
level of Eclipse/CLion.

~~~
moonchrome
>and all of its usages, declarations, etc. will be renamed.

Along with header include statements in any file that referenced the class -
so say I refactored class Foo in include/mylib/foo.hpp and it's included in
src/mylib/foo.cpp as <mylib/foo.hpp> - CLion happily "refactors" that to
#include "foo.hpp" - I spent the same amount of time fixing this stuff as I
would have manually renaming usages...

------
d00r
I've been using _deheader_
([http://www.catb.org/esr/deheader/](http://www.catb.org/esr/deheader/)) for
this.

It detects unnecessary inclusions and also warns about missing headers
required for cross-platform compatibility.

~~~
kazinator
I wrote a tool to do this in 10 minutes just several days ago. It iterates
over a file, and removes one #include at a time (while keeping all the other
headers in place), and each time builds a file. For any successful build, it
reports that header.

I used this to prune unnecessary includes throughout the TXR project.

There are some obvious false positives, like includes wrapped in #if/#ifdef
and also certain system headers which look unnecessary on one system, but are
actually required on others (due to the mess that is called POSIX).

After I applied some of the header removals, I pushed the changes to git and
went through all the supported platforms to validate the changes. Almost every
platform had some problem which resulted in some header having to be put back:
MinGW, Cygwin, Solaris, Mac OS X. For instance, Mac OS is a stickler for
needing <signal.h> to declare the kill() function. On other platforms, it also
shows up elsewhere, perhaps unistd. On glibc, you get va_list if you include
<stdio.h>. Some platforms guard against this; they use some internal typedef
name instead of va_list for vfprintf, so you must include <stdarg.h> for
va_list. Various problems of this type. So just because a tool tells you, "hey
this compiles without <stdarg.h> just fine", that of course means "well, on
this system".

    
    
       #!/usr/local/bin/txr
       @(next :args)
       @file.c
       @(next `@file.c`)
       @(collect)
       @  (some)
       @lines
       @  (and)
       @    (line linenum)
       #include @header
       @  (end)
       @(end)
       @(require (boundp 'linenum))
       @(do (rename-path `@file.c` `@file.c.bak`))
       @(try)
       @  (do (each ((rem-line linenum)
                     (rem-hdr header))
                (with-stream (s (open-file `@file.c` "w"))
                  (tprint (partition* lines (pred rem-line)) s))
                (ignerr (remove-path `opt/@file.o`))
                (when (zerop (sh `make opt/@file.o > /dev/null 2>&1`))
                  (put-line `@file.c:@{rem-line}: can remove @{rem-hdr}`))))
       @(finally)
       @  (do (rename-path `@file.c.bak` `@file.c`))
       @(end)
    

The output is in a form that looks like compiler diagnostics. I can stick it
in errors.err and run "vim -q".

This simple tool works in conjunction with the project rule that headers
don't include other headers. The .c files include everything they need in the
right order. But the inclusions are added by hand by copy and paste, which can
pull in something that isn't actually needed.

You have to iterate the tool. If you remove some header which isn't needed, a
second pass can then determine that yet another header isn't needed, because
it was only needed by that which was just removed. The iteration isn't worth
building into the code.
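
For readers who don't know TXR, the same remove-one-include-and-rebuild idea, plus the iteration described above, could be sketched in Python roughly like this. This is not a port of the script above: `removable_includes`, `prune_until_stable`, and the `builds_ok` callback are hypothetical names, and the build step is deliberately left to the caller so the logic can be tried without a real compiler.

```python
import re
from typing import Callable, List, Tuple

# Matches lines such as: #include <stdio.h>   or   #include "foo.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+([<"][^>"]+[>"])')


def removable_includes(
    lines: List[str],
    builds_ok: Callable[[List[str]], bool],
) -> List[Tuple[int, str]]:
    """Return (1-based line number, header) for every #include whose removal,
    with all other lines kept in place, still lets builds_ok() succeed."""
    found = []
    for idx, line in enumerate(lines):
        match = INCLUDE_RE.match(line)
        if not match:
            continue
        candidate = lines[:idx] + lines[idx + 1:]  # drop just this one include
        if builds_ok(candidate):
            found.append((idx + 1, match.group(1)))
    return found


def prune_until_stable(
    lines: List[str],
    builds_ok: Callable[[List[str]], bool],
) -> List[str]:
    """Repeat the pass, deleting one reported include per round, until no
    further include can be dropped."""
    lines = list(lines)
    while True:
        hits = removable_includes(lines, builds_ok)
        if not hits:
            return lines
        del lines[hits[0][0] - 1]
```

In real use, `builds_ok` would have to write the candidate lines to the source file, run the build, and restore the original afterwards. The caveats above still apply: a header reported as removable is only known to be removable on the system where the build ran.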

------
anarazel
Personally I don't like this. Often enough there are includes in headers that
are an implementation detail of the functionality in the originally included
header. IWYU will just include that, increasing coupling between components.

~~~
bnegreve
I don't get it, can you give an example?

~~~
revelation
Big libraries like Boost and OpenCV usually have meta-headers for features and
components that just include all the other headers for that particular part
you want to use.

(e.g.
[https://github.com/Itseez/opencv/blob/master/include/opencv2...](https://github.com/Itseez/opencv/blob/master/include/opencv2/opencv.hpp))

A tool like this would see right through that and walk the tree all the way to
the final leaf headers that actually implement the parts you use. This not
only means one single include will balloon to many smaller ones, but when the
library moves things around internally your code will break.

(Unclear if this is what this tool in particular will do, but I've certainly
seen this behavior with CLion, for example, when it recommends what header to
include.)

------
nightcracker
I have made a similar tool that is a lot more hacky, but does not require
compilation or preprocessing of the source (and is thus not reliant on any
single compiler):
[https://github.com/orlp/iwyu](https://github.com/orlp/iwyu).

It also does nothing smart, it is user trained. Every symbol prefixed with a
namespace you are interested in will be presented to you, and you must tell it
where to find the symbol. Any other symbols will be ignored.
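
A rough Python sketch of that "user trained" idea, to make it concrete. This is not the code from the linked repository: the function name `classify_symbols`, the regex, and the shape of the `known` mapping are all assumptions made for illustration.

```python
import re
from typing import Dict, Iterable, Set, Tuple


def classify_symbols(
    source: str,
    namespaces: Iterable[str],
    known: Dict[str, str],
) -> Tuple[Set[str], Set[str]]:
    """Scan source text for `ns::name` where ns is a namespace of interest.

    Returns (headers needed according to the trained mapping,
             symbols the user still has to be asked about).
    Symbols in any other namespace are ignored.
    """
    namespaces = list(namespaces)
    if not namespaces:
        return set(), set()
    alternation = "|".join(re.escape(ns) for ns in namespaces)
    pattern = re.compile(r"\b(" + alternation + r")::(\w+)")
    needed: Set[str] = set()
    unknown: Set[str] = set()
    for match in pattern.finditer(source):
        symbol = match.group(1) + "::" + match.group(2)
        if symbol in known:
            needed.add(known[symbol])   # already trained: header is known
        else:
            unknown.add(symbol)         # must be presented to the user
    return needed, unknown
```

Because this is plain text matching with no preprocessing, it will also pick up symbols inside comments and string literals, which is part of the "hacky" trade-off.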

------
maxlybbert
In related news, Stroustrup's and dos Reis's tool (
[http://stroustrup.com/gdr-bs-macis09.pdf](http://stroustrup.com/gdr-bs-macis09.pdf) ,
[http://stroustrup.com/sofsem10.pdf](http://stroustrup.com/sofsem10.pdf) ,
[http://stroustrup.com/icsm-2012-demacro.pdf](http://stroustrup.com/icsm-2012-demacro.pdf)
) finally showed up on Github (
[https://github.com/GabrielDosReis/ipr](https://github.com/GabrielDosReis/ipr)
).

------
vbezhenar
Actually it would be nice to see that functionality as a compiler warning.

~~~
santaclaus
With CMake it is possible to run the tool as part of the normal build process
[1]. Not as nice as first class compiler support, but still quite helpful as
part of day to day development, I find.

[1] [http://stackoverflow.com/questions/30951492/how-to-use-the-t...](http://stackoverflow.com/questions/30951492/how-to-use-the-tool-include-what-you-use-together-with-cmake-to-detect-unused-he)

------
makecheck
I find that a strict coding style can make this easier to implement, and more
likely to restrict warnings to the code that you directly control.

For example, I used a module prefix consistently in my code so that any
occurrence of an "Xyz_" prefix in a file would pretty much _guarantee_ that
"Xyz.h" is required to compile. I also put all local #include references in
the same part of the file under a predictable comment header. That way, I
didn't need a C++ parser; I just needed to know where to start searching for
the list of headers, infer prefixes from their names, and complain about
prefixes that were not found in the rest of the file. This has worked
surprisingly well, and I can tune the script to skip modules for any reason.
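
A minimal Python sketch of that kind of prefix check, assuming the convention that "Xyz.h" only declares names starting with "Xyz_". The function name `unused_local_includes` is made up for illustration, and instead of looking for a comment header to locate the includes, this version simply scans every line for local #include directives.

```python
import os
import re
from typing import List

# Only local includes (#include "Xyz.h") are checked; <system> headers are skipped.
LOCAL_INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"')


def unused_local_includes(source: str) -> List[str]:
    """Report local headers whose inferred prefix never appears in the file."""
    headers = []
    body = []
    for line in source.splitlines():
        match = LOCAL_INCLUDE_RE.match(line)
        if match:
            headers.append(match.group(1))
        else:
            body.append(line)
    body_text = "\n".join(body)

    unused = []
    for header in headers:
        # "modules/Xyz.h" -> "Xyz", so the expected prefix is "Xyz_"
        stem = os.path.splitext(os.path.basename(header))[0]
        if not re.search(r"\b" + re.escape(stem) + r"_", body_text):
            unused.append(header)
    return unused
```

A skip list for modules that should never be reported could be added as an extra parameter.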

This is also important for the reverse case. I don't really _need_ an IDE or
ctags to figure out what file is needed for a #include in most cases because I
have the name of the function/type/constant/etc. to guide me. Similarly, I
know exactly what file to open in my editor without using magic tricks.
(Although, code completion and jump-to-definition are still quite helpful,
especially for things like OS calls that may follow no particular convention.)

------
wbsun
Hmm, I happen to know a colorful-logo company that has an internal
unrecommended tool that has the same name and does the same thing :) Are they
making this public?

~~~
DannyBee
I'm pretty sure we open sourced it in 2011 actually ....

(I don't think this version has been kept up to date with the internal one,
but I could be wrong)

------
scott_s
The first sentence reads: "'Include what you use' means this: for every symbol
(type, function variable, or macro) that you use in foo.cc, either foo.cc or
foo.h should #include a .h file that exports the declaration of that symbol."

For clarity and correctness, I think it should be changed to: "'Include what
you use' means this: for every symbol (type, function variable, or macro) that
you use in foo.cc _that is not declared in foo.cc_ , either foo.cc or foo.h
should #include a .h file that exports the declaration of that symbol."

Not having that phrase tripped me up for a few minutes. I guess everyone else
thinks it's implied, but for some reason that wasn't clear to me until I read
through more of their documentation.

------
kazinator
There is also: "don't include anything except for one header, and let a tool
manage what is in it based on analyzing your code":

Makeheaders:

[http://www.hwaci.com/sw/mkhdr/](http://www.hwaci.com/sw/mkhdr/)

------
_b
I've used this for years. It takes a bit of work to get it working well on a
codebase, but I find it worthwhile.

Refactoring often involves moving functionality between header files, which
can mean fixing up the includes in a large number of other files. IWYU makes
this a lot easier. IWYU does require the code to build before it works, but
this can usually be easily gotten after a refactor by making one header file
temporarily include the other.

I have, a couple times, modified templated code so it wouldn't confuse IWYU,
as that is easier than maintaining pragmas correcting IWYU in the files that
call the templated code.

------
haosdent
This tool doesn't work well.
[https://issues.apache.org/jira/browse/MESOS-1583](https://issues.apache.org/jira/browse/MESOS-1583)

------
echochar
How about a tool to "exclude what you don't use"?

I have several personal hacks for doing this I have written over the years but
I've yet to find anyone else who tried to automate it.

For example, one task is to determine the functions in a library that are not
actually used in your program and exclude those from linking, instead of
blindly linking libraries full of unused functions (that sometimes cause name
conflicts).

------
nice__two
The LibreOffice project has been using this a lot to refactor a large, ancient
codebase. They've also developed a lot of Clang plugins, which can be found
here:
[http://cgit.freedesktop.org/libreoffice/core/tree/compilerpl...](http://cgit.freedesktop.org/libreoffice/core/tree/compilerplugins)

------
rijoja
This seems like an awesome tool. I was thinking about going through a code base
for a hobby project a while ago. But this tool could make it a much easier
task. I haven't tried following the instructions yet. But it seems to be
correct. Good work!

------
ndesaulniers
yay C/C++ tooling:
[http://nickdesaulniers.github.io/blog/2015/07/23/additional-...](http://nickdesaulniers.github.io/blog/2015/07/23/additional-c-slash-c-plus-plus-tooling/)

