This is not the first time I've heard people complaining about GCC. Could someone who knows compilers intimately explain why so many people dislike GCC? Are there any FOSS alternatives? Also, can LLVM be a drop-in replacement for GCC sometime in the future?
(The complaints I've heard so far range from GCC having a crappy register allocator to it generating code that is downright wrong.)
There are effectively no FOSS alternatives. A few groups have toyed with making their own C compiler; some in the BSDs were toying with, I think, lcc, and I've seen references to tcc. The thing is, GCC is very mature and quite good at what it does, and neither lcc nor tcc is much more than a toy. They won't make this mistake, but that's because they do next to no optimization anyway. GCC makes the occasional error--all compilers do, and I don't expect this bug to survive the next point release.
This hasn't always been the case--GCC did have a crappy register allocator once upon a time, and it didn't do a variety of SSA-based optimizations once upon a time, and a lot of these complaints ultimately date back to those days. Pre-egcs there was a lot more talk about replacing GCC, but now that it has actually been actively maintained and developed that seems to have largely dissipated.
LLVM is no real answer. It currently uses a GCC frontend. Clang, which I think is what you were referring to, is not intended to replace GCC's optimization layer, which is what's causing this problem.
Actually, clang has nothing to do with gcc. LLVM's gcc frontend is called llvm-gcc ( http://llvm.org/cmds/llvmgcc.html ) which was dropped for being too high maintenance or something of the sort.
As far as I know llvm-gcc has not been dropped. On Mac OS X it is even installed alongside gcc when you install the developer tools.
Clang is making good progress. From http://clang.llvm.org/index.html:
Clang is considered to be a production quality C and Objective-C compiler when targetting X86-32 and X86-64
See DarkShikari's comment for details on what goes wrong when a compiler tries to be all things to all architectures.
--------
I should clarify. I agree with what you said below, these days GCC is an excellent compiler for x86. It's just that it's near to impossible to be so excellent for all archs.
But isn't this the whole point of LLVM? Separating the compile process into two steps, one compiling a given programming language into code for an intermediate virtual machine and another compiling that virtual machine code into a binary for the target architecture. Then each module of LLVM can be worked on and optimized independently with little duplication of labor.
So the promise of LLVM is not just being able to compile to every architecture, but to compile every language to every architecture.
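The split described above can be sketched as a toy in C. This is only an illustration of the idea, not LLVM's actual IR or API: the `ir_insn` type, the two "front ends" and the two "back ends" are all hypothetical names invented for this example. The point is that each front end only has to produce the shared representation and each back end only has to consume it, so N languages and M targets need N + M components rather than N x M.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical three-address IR: dst = lhs OP rhs (not LLVM's real IR). */
typedef enum { IR_ADD, IR_MUL } ir_op;
typedef struct { ir_op op; int dst, lhs, rhs; } ir_insn;

/* Toy front end 1: a language that writes operators as '+' and '*'. */
ir_insn frontend_infix(char op, int dst, int lhs, int rhs)
{
    ir_insn in = { op == '+' ? IR_ADD : IR_MUL, dst, lhs, rhs };
    return in;
}

/* Toy front end 2: a language that spells operators as words. */
ir_insn frontend_words(const char *op, int dst, int lhs, int rhs)
{
    ir_insn in = { strcmp(op, "plus") == 0 ? IR_ADD : IR_MUL, dst, lhs, rhs };
    return in;
}

/* Toy back end A: knows only the IR, emits a three-operand mnemonic. */
int emit_target_a(const ir_insn *in, char *out, size_t n)
{
    return snprintf(out, n, "%s r%d, r%d, r%d",
                    in->op == IR_ADD ? "add" : "mul",
                    in->dst, in->lhs, in->rhs);
}

/* Toy back end B: knows only the IR, emits an assignment-style syntax. */
int emit_target_b(const ir_insn *in, char *out, size_t n)
{
    return snprintf(out, n, "r%d = r%d %c r%d",
                    in->dst, in->lhs,
                    in->op == IR_ADD ? '+' : '*', in->rhs);
}
```

Either front end can be paired with either back end without the two knowing about each other, which is also where a shared optimizer would sit in a real design.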
tcc isn't a toy, but it's optimized for a very different scenario than GCC, one where compilation speed and small compiler size are paramount, winning out over things like error reporting and output code quality.
IIRC lcc was a production-quality, if simple, C compiler when it came out in 1995.
There's the kencc compiler/linker suite, written by Ken Thompson. It's almost ANSI compatible, with a preprocessor that is limited on purpose and an extension for easier access to nested structure members.
Can you elaborate on the other complaints? There are precious few first-tier C compilers in the world. In my experience gcc has the best track record of quality, frankly. In 15 years of more or less full time professional use, I've been bitten by precisely one compiler bug.
Many complaints revolve around GCC's monolithic architecture. See the clang comparison page (http://clang.llvm.org/comparison.html) for a decent overview.
clang should open the door for intelligent refactoring and analysis tools which aren't possible using the GCC codebase.
...and the fact that GCC's monolithic architecture was designed explicitly to make it difficult for separate processes to interact with it, access intermediary representations, or incrementally compile. Probably the best example of GPL wankery there is.
If the front end and back end can be separated cleanly, then a company could put their proprietary compiler back end behind the GCC front end without violating the GPL. Stallman has always worried about things like this; his objections to virtual machines have a similar basis.
That sounds like the same strategies used by proprietary software companies who go through all sorts of loops to discourage uses that don't ideologically fit with them. Almost like Free's version of DRM.
Wow, and this is stated explicitly? Seems like cutting off the nose to spite the face. I guess ideology is more important than code quality to the FSF. They have every right to do it, of course, but as an outsider it seems counter-productive.
I've heard this in quite a few places. RMS once said that the reason we have an Objective-C compiler is that companies like Apple had to open source it, because the architecture of GCC wouldn't let it be kept separate.
GCC 4.5 adds support for plugins that allow this sort of thing. Mozilla have been making use of it for quite a while already: https://developer.mozilla.org/en/Dehydra
Mans covers a lot of incredible retardation here, particularly on ARM and PPC platforms, such as cases where GCC generates 2-3 times the number of instructions necessary for no good reason.
Until extremely recently, GCC's handling of multiplications of non-native types has been hilariously bad, up to and including loading the value zero into a register and multiplying by it.
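To make the "multiplying by zero" complaint concrete, here is a minimal sketch in C, assuming a 32-bit target where `uint64_t` does not fit in one register. The function names are made up for illustration and this is not output from any particular GCC version; `naive_mul` just spells out in C what a compiler does if it zero-extends both operands and then runs a generic 64x64 multiply.

```c
#include <stdint.h>

/* 32x32 -> 64-bit multiply. On a 32-bit target a good compiler can emit
 * a single widening multiply here (for example UMULL on ARM). */
uint64_t widening_mul(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}

/* What a naive lowering effectively computes: both operands are
 * zero-extended to 64 bits, then a full 64x64 multiply is done on the
 * 32-bit halves. The high halves are known to be zero, so the two
 * cross products are wasted work. */
uint64_t naive_mul(uint32_t a, uint32_t b)
{
    uint32_t a_hi = 0, b_hi = 0;           /* zero-extended high words */
    uint64_t lo    = (uint64_t)a * b;      /* the only useful product  */
    uint32_t cross = a * b_hi + a_hi * b;  /* always 0                 */
    return lo + ((uint64_t)cross << 32);
}
```

Both functions return the same value for every input; the difference is only in how much work the generated code does if the compiler fails to notice that the high words are constant zero.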
2. General bugginess all over the place, particularly with non-x86 platforms. The number of failed unit tests for ffmpeg with gcc 4.4 on PPC64 was staggering.
Of course, commercial compilers are buggy as well, so this is not really exclusive to GCC; the latest ICC miscompiles large chunks of ffmpeg and Mans has been quite busy submitting bug reports for armcc.
3. Tendency to get worse with every new release; this trend was finally reversed with 4.3 and 4.4, the latter of which was the first compiler to beat 2.95 in performance (in ffmpeg, at least):
4. Tendency to declare obvious bugs to be not bugs; for example, failures in the register allocator for inline assembly in which it couldn't allocate 7 registers for the inline assembly despite 7 registers being available (but only inconsistently, on some platforms and some versions, for no apparent reason). Don't have links to the gcc bugtracker on me for these.
5. Incredibly bad optimization on non-x86 architectures. ARM is a particularly horrendous example, where we found the following happening when compiling x264:
a) In the most important DSP function in the program, SATD, GCC unrolled the loop completely. This is related to the fact that there are no loop unrolling heuristics for ARM--and instead of just disabling loop unrolling, the decision is binary ("fully unrolled" vs "not unrolled"), weighted far too heavily in favor of the former.
b) GCC then completely failed to allocate registers correctly, resulting in an enormous number of unnecessary loads/stores to the stack.
c) The function's speed was cut in half due to this.
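For readers who haven't seen one, here is a rough sketch of what a 4x4 SATD (sum of absolute transformed differences) looks like in plain C. This is not x264's actual code, which is more heavily optimized and normalizes the result; the function name and signature are assumptions made for this example. It does show why the function is sensitive to unrolling and register allocation: the loops have tiny fixed trip counts, and sixteen intermediate values stay live between the two passes.

```c
#include <stdint.h>
#include <stdlib.h>

/* Unnormalized 4x4 SATD: take the pixel differences, apply a 4-point
 * Hadamard transform to each row and then to each column, and sum the
 * absolute values of the results. */
int satd_4x4(const uint8_t *a, int stride_a, const uint8_t *b, int stride_b)
{
    int d[4][4];
    int sum = 0;

    /* Horizontal pass over the differences. */
    for (int y = 0; y < 4; y++) {
        int d0 = a[y * stride_a + 0] - b[y * stride_b + 0];
        int d1 = a[y * stride_a + 1] - b[y * stride_b + 1];
        int d2 = a[y * stride_a + 2] - b[y * stride_b + 2];
        int d3 = a[y * stride_a + 3] - b[y * stride_b + 3];
        int s01 = d0 + d1, t01 = d0 - d1;
        int s23 = d2 + d3, t23 = d2 - d3;
        d[y][0] = s01 + s23;
        d[y][1] = s01 - s23;
        d[y][2] = t01 + t23;
        d[y][3] = t01 - t23;
    }

    /* Vertical pass, accumulating absolute values as we go. */
    for (int x = 0; x < 4; x++) {
        int s01 = d[0][x] + d[1][x], t01 = d[0][x] - d[1][x];
        int s23 = d[2][x] + d[3][x], t23 = d[2][x] - d[3][x];
        sum += abs(s01 + s23) + abs(s01 - s23)
             + abs(t01 + t23) + abs(t01 - t23);
    }
    return sum;
}
```

If the compiler fully unrolls both loops and then can't keep the intermediates in registers, every one of those values turns into a store and a later load from the stack, which is the kind of slowdown described above.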
Other issues include GCC's inability to consistently use the ARM's "free shift," up to and including putting shifts on register X right next to another op on register X despite the ability of even a simple peephole optimizer to merge the two instructions.
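As a small illustration of the "free shift" (a sketch only; the function name is made up, and what a given GCC version actually emits will vary): on ARM, the second operand of most data-processing instructions can be shifted as part of the same instruction, so the two C statements below can be a single `ADD r0, r0, r1, LSL #2`.

```c
#include <stdint.h>

/* base + i * 4, written as an explicit shift followed by an add.
 * On ARM the shift can be folded into the add as a shifted operand,
 * so no separate shift instruction is needed. */
uint32_t scaled_add(uint32_t base, uint32_t i)
{
    uint32_t shifted = i << 2;   /* candidate for folding into the add */
    return base + shifted;
}
```

The complaint is about cases where the compiler emits the shift as its own instruction right next to the add, even though merging them is a simple peephole pattern.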
Also, GCC seems generally bad at three-operand architectures; one will find many redundant moves between registers on ARM, which generally should almost never exist in a three-operand instruction set. Here's an example similar to something I've seen (I don't recall the exact instructions):
The post appears to me to be a bug report, not a "complaint". I feel you have mischaracterised it; these are not the same thing.
LLVM stands for Low Level Virtual Machine. http://llvm.org/ As I understand it, it provides an abstraction layer which could sit below a C compiler for the C code to be compiled down into. If so, LLVM will never fully replace GCC or any other C compiler.
I think the OpenBSD project is working on PCC http://pcc.ludd.ltu.se/ as a possible long term replacement for GCC.
There was a time when icc was really ahead of gcc on Intel processors.
When I worked in the chemistry department of my university, we did some compiler benchmarks on Pentium 4s, AMD Athlons and even some Alphas (gcc vs. the Compaq compiler in this case). We did this at a time when icc was being touted as better than sliced bread, but found out that while it mostly outperformed gcc on the Pentiums, it did so by a small margin. On the Athlons gcc was faster across the board. On the Alphas, gcc was also better (except on Fortran code, where the Compaq compiler was way better).
And this was gcc 3.x. Since then gcc has evolved quite a bit, and I wouldn't be surprised to see it take the performance crown everywhere if I repeated these benchmarks today.
At my last job our (numeric) C and C++ code was in some cases twice as fast compiled with SPARCworks (or whatever it's called this week) compared to GCC.
To be fair tho', $vendor can concentrate 100% on $processor, GCC at least tries to be cross-platform.
Given issues like this, and how talented the kernel guys are ... why haven't they written a C compiler? (Not meant to rant; serious question -- we wrote silly C compilers for class; seems like programmers with real talent should be able to write real, industrial-strength C compilers, especially if it only needs to support C, and not C++/Java/ObjC/...)
I think (but this is just a guess) that a production multi-platform C compiler is as hard as a kernel to get right (or at least of the same order of magnitude) and fast enough to be useful, so I think it's more of a "there is no need for NIH" attitude.
C++'s Standard Template Library? I bet you're talking about how it complains about your out-of-spec code. Lots of not recognizing c-string functions like strlen, strcmp etc? 4.x doesn't include <string.h> along with <string> anymore, you have to add it explicitly.
EDIT: Spelling.