That seems interesting. However, when I tried the FreeType example, there seemed to be a preprocessing issue: some function definitions are conditionally excluded even though they are called later. I didn't have time to find out whether this was an issue in the original code or whether the amalgamation script introduced it.
In any case, such single-file C programs are very useful for quickly testing tools, so having more of them would be great.
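For illustration only, here is a minimal sketch of the kind of mismatch I mean. This is not FreeType code; `HAVE_FANCY_PATH`, `fancy_scale` and `scale` are hypothetical names. If the definition is guarded but the call is not, a build without the macro fails with an implicit-declaration or undefined-reference error. Putting both under the same guard avoids that:

```c
/* Hypothetical feature macro; pass -DHAVE_FANCY_PATH to enable it. */
#ifdef HAVE_FANCY_PATH
/* Only compiled when the feature is enabled. */
static int fancy_scale(int x) { return x * 3; }
#endif

int scale(int x)
{
#ifdef HAVE_FANCY_PATH
    /* The call sits under the same guard as the definition. */
    return fancy_scale(x);
#else
    /* Fallback so the file still builds when the feature is off. */
    return x * 2;
#endif
}
```

An amalgamation script that reorders files or drops a `#define` could plausibly break this pairing, but that is a guess, not something I verified.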
I'm not a C programmer, but I have heard of amalgamation, and I wonder why a standard workflow to create a single compilation unit from multiple source files isn't more straightforward.
I've been working on a project that auto-generates C programs, sometimes up to 1.5 million lines of code, in a single file (actually two files, but the second is only 35 lines).
Not open source but happy to share benchmarks if that would be useful.
There is quite a lot of IP in the generated programs so probably not possible to share sadly.
I wasn't aware of Csmith, so thanks for highlighting it. My C code doesn't really exercise many features of the compiler, so I suspect it is mainly of interest for seeing how the compiler handles a really large single file.
I like to code this way. You just include "foo.c" instead of "foo.h", which does not exist at all. The compilation is really simple, and there are half as many files!
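A rough sketch of that layout, assuming two hypothetical files, `counter.c` and `app.c`. It is shown here as the single translation unit the compiler sees once `#include "counter.c"` has been expanded, so the block compiles on its own:

```c
/* ---- counter.c (hypothetical): definitions only, there is no counter.h ---- */
static int counter_value = 0;   /* static: private to the one translation unit */

static int counter_next(void)
{
    return ++counter_value;
}

/* ---- app.c (hypothetical): would begin with  #include "counter.c"  ---- */
int app_run(void)
{
    counter_next();             /* first call bumps the counter to 1 */
    return counter_next();      /* second call returns 2 */
}
```

With real files, only `app.c` goes on the compiler command line. If the functions in `counter.c` were not `static`, also compiling `counter.c` separately would produce duplicate-definition errors at link time.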
I know it's half serious, but it's simply not true, in the same way that grepping for "Stallman" in the leaked Windows source code was misleading (nobody actually mentioned RMS there; those were false positives). In this case, some headers contain multiple occurrences of "GNU". Then there are several #ifdefs on macros like "__GNU_LIBRARY__" or "__GNUC__", and e-mail addresses of people in the gnu.org domain.
In practice, it doesn't matter at all, as the preprocessor replaces each comment, license headers included, with a single space before the compiler proper has the chance to look at it.
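One way to observe this from inside C, as a small sketch (the `STR` macro and the `comment_became_space` function are names made up for the example): comments are replaced by one space in an early translation phase, before macros are expanded, so stringifying two tokens separated only by a comment yields the tokens separated by one space.

```c
#include <string.h>

/* Stringify the argument as the preprocessor sees it, without expanding it. */
#define STR(x) #x

/* Returns 1 if the comment between the two tokens was turned into one space. */
int comment_became_space(void)
{
    const char *s = STR(a/* pretend this is a long license header */b);
    return strcmp(s, "a b") == 0;
}
```

However long the comment is, the resulting string is the same, which is why the size of a license header makes no difference to what the compiler parses.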
Clang, like GCC, has -C and -CC flags to preserve comments during preprocessing. However, these are really flags for the underlying preprocessor. Your parent might be thinking of some application of the Clang frontend that does not run preprocessing. For example, clang-format will probably not want to preprocess the code nor strip out comments.
https://www.sqlite.org/amalgamation.html