
Can anybody comment on the love for Clang vs. the declining importance of GCC? As it seems to me:

GCC supports more platforms
GCC creates more performant code

Is it a license thing or a technically driven issue?




I guess both. GCC is very good at compiling C/C++ to machine code. But if you want to write a new language and reuse an existing compiler pipeline (Rust, Swift, ...) or write a new tool based on a C++ parser (clang-format, clang-tidy, clangd, ...), then LLVM is the way to go.

LLVM-IR has enabled this nice separation between the language front-end and the compilation back-end. GCC could have done that too; they didn't, for political reasons: someone could have used GCC's C++ front-end to feed a proprietary compilation pipeline. Today this is nearly impossible to change in GCC.

There are also some other advantages of LLVM I like: it supports multiple architectures in a single binary. With GCC I need to build the compiler for either x64 or ARM, but not both. GCC is still great, but it's not surprising that LLVM is so successful.
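
For example, a sketch (the target triples are illustrative, and a real cross build still needs a matching sysroot and headers for each target):

  $ clang --target=x86_64-linux-gnu  -c hello.c -o hello-x64.o
  $ clang --target=aarch64-linux-gnu -c hello.c -o hello-arm64.o

The same clang binary contains both backends; with GCC each of those would be a separately built cross-compiler.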


> LLVM-IR has enabled this nice separation between the language front-end and the compilation back-end.

You're implying that GCC doesn't have a separation between language front-end and compilation back-end, but that's not true.

GCC has a consistent internal representation known as GENERIC where the front-ends and back-ends meet just like LLVM.


What I mean: as a compiler writer you can't just create GIMPLE IR and pass it to GCC. You always need to go through GENERIC, which is really intended to be the AST of your language. Of course you could argue that you treat GENERIC not as an AST but as something similar to LLVM IR, and keep your language's AST outside of GCC. But does anyone do that? The reason is that the IR doesn't contain ALL the information; there is also some global state stored in global variables, etc. And since the IR is only internal, it is not as well-defined as LLVM's IR.

This missing separation is even a problem for writing tests in GCC: often it would be quite easy to write a small program in GIMPLE IR (or RTL) and test your optimization pass against it. But you can't write GIMPLE IR directly (everything needs to go through GENERIC), so you have to write a C file that is then translated into the IR you want to test against. That's sometimes not so easy, with all the different optimization passes that run before yours, and you have to hope that doesn't change in future releases either... There was some work on this (I don't know the current status), but almost all of GCC's tests are still integration tests against C files. (I think they addressed this by extending the C front-end to parse functions marked with __GIMPLE as GIMPLE IR.) Also, GIMPLE didn't start as a separate IR; it started as a more restricted tree structure. Even today operands are still trees, and optimization passes are called tree-<something>.c.
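
If I remember the GIMPLE front-end correctly, such a test looks roughly like this (syntax from memory, so details may vary by GCC version; it has to be compiled with gcc -fgimple):

  int __GIMPLE ()
  square (int x)
  {
    int t;
    t = x * x;    /* the body must already be valid (three-address) GIMPLE */
    return t;
  }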

With LLVM I can write a toy compiler that outputs LLVM IR with printf and then just pipe it into llc to compile it. That's pretty cool. I didn't want to imply that GCC has NO separation between IRs, but the whole story is much messier compared to LLVM. Which is actually fine; LLVM is much younger than GCC and could learn a lot from it. And GCC is still pretty great for the use cases it was intended for.
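
To make the llc part concrete, here is a minimal sketch of such a toy "compiler" (the emitted function is made up for the example): a C program that printfs textual LLVM IR, piped straight into llc:

  /* toy.c -- emit textual LLVM IR on stdout.
     Build: cc toy.c -o toy
     Use:   ./toy | llc -o out.s   (llc reads IR from stdin and emits assembly) */
  #include <stdio.h>

  int main(void)
  {
      /* A function adding two 32-bit ints, in LLVM IR. */
      printf("define i32 @sum(i32 %%a, i32 %%b) {\n");
      printf("entry:\n");
      printf("  %%r = add i32 %%a, %%b\n");
      printf("  ret i32 %%r\n");
      printf("}\n");
      return 0;
  }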


I think GCC has become much more willing to open up the compiler recently (almost exclusively because of the competition from clang). For GIMPLE there is this:

https://gcc.gnu.org/wiki/GimpleFrontEnd

I'm not sure how complete and usable it is though.


And GIMPLE... and RTL...

And there are three separate forms of GIMPLE while you're at it.

It's way easier to work with LLVM IR.


Unfortunately, this IR level is not readily accessible to third parties, for political reasons -- RMS does not want to make it easy for third-party code to integrate with GCC at this level, for fear that some of the code that uses that interface might be proprietary. See, e.g.,

https://lwn.net/Articles/582697/


You're of course right, except the OP asked about clang and not llvm. The clang front-end could generate postprocessed C and pass it into gcc, for whatever that's worth.


The reverse has actually been done: https://dragonegg.llvm.org


I would not be so sure about the declining importance of GCC.

GCC has caught up a lot with clang in the last few years, from the quality of error messages, to standard compliance, compilation times, tooling, sanitizers, etc.

IME GCC tends to produce faster code in practice, even if clang is often able to out-smart GCC in many areas (e.g., IIRC heap elision was first introduced in clang). GCC has a huge amount of inertia behind it, it's still the only compiler officially supported for compiling the Linux kernel, and, as long as the GCC developers don't stop improving it, it will be difficult for clang to take its role as "default" compiler, at least in Linux environments.
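
To make the heap elision point concrete, a hedged sketch (exactly which cases are handled depends on compiler version and flags): given code like the following, clang at -O2 can remove the new/delete pair entirely and compile the function down to returning 42.

  int answer() {
      int *p = new int(42);   // allocation can be proven unobservable...
      int v = *p;
      delete p;               // ...and paired with its delete
      return v;
  }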

I use them both daily and I am really happy that we have the possibility of freely choosing between two such amazing projects. It would be a step backwards if one of them ended up fading away.


Maybe decline is the wrong term; llvm/clang did dent GCC's near monopoly. And GCC reacted promptly, which is good.


The near-monopoly of GCC was leading to stagnation. Clang's competition seems to have reinvigorated the GCC project, which is a fantastic outcome for users: two great compiler teams riffing off one another's improvements, and we all benefit from their work.


I don't know how much of it is just impression, but from my side GCC felt like a crumbling mammoth. clang forced them to improve, and at the same time it showed that gcc is not near death at all.


Compilation times? I thought that compiling with GCC has always been faster than with clang, at least in the general case.


For C++ in my experience clang is much faster, especially for plain (unoptimized) builds.


That has reversed in the last few releases of clang for my codebase.


Because gcc got faster or because clang got slower?


Because clang got a LOT slower and GCC got a bit faster. clang used to be incredibly fast and use very little memory, but recent versions have regressed in both areas. On the other hand clang does a lot more now than it used to, so it's somewhat justified.

My own experience with a substantial codebase (1+ million lines of code) is that MSVC uses the least RAM, followed by clang and then GCC. For compile/link performance it's GCC and clang roughly tied, then MSVC, from fastest to slowest.


MSVC would be really great if MSFT could solve their small-file I/O overhead and add a mode where I could disable file modification notifications completely. Right now any compilation makes the antivirus go completely crazy.


Add the workspace folder as an exclusion to your AV.


On corporate machines you don't know how many things are subscribing to file notifications.


It would be nice to have choice on which compiler to use for the kernel. We're actively working on it.

https://github.com/ClangBuiltLinux/linux/issues


Be wary of the optical illusion of the hype cycle: if you went by HN's front page you'd think that Rust is more important than C++. In my experience GCC is still the default C and C++ compiler on Linux in most of the industry. If it ain't broke, why change it?

Even though I still use GCC to build my C code, I'm very happy that LLVM exists: it brought some new blood and much-needed competition to the FLOSS compiler scene. It also gave a better architecture to build on if you want to create a new language; it's great that a language like Rust, for instance, can generate such good machine code by building on top of the LLVM backend. In the GCC-only days the easiest way to create a new compiled language was to emit C as an intermediate target...


Depends on what your industry is. If you're compiling C++ for mobile, clang is pretty much your only choice for iOS and the default choice for Android...


Correction, the only choice on Android as of NDK 18.


And the default on OSX.


... and indeed on several of the BSDs.

It has been the default compiler in base on FreeBSD since version 10, and FreeBSD is now built with it. There was also a push back around 2014 or so to get much of the ports tree built with it, too.

DragonFly BSD has been buildable with clang since around 2014, albeit using clang from packages/ports. In 2017 (release 5.0) the DragonFly people started work on pulling it into base.

OpenBSD switched building on x86/amd64 to clang in 2017 and building on armv7 to clang in 2018.

NetBSD has included clang in base but as far as I know does not (yet) build with it by default on any architecture.


I only mentioned Linux because I knew that my argument wouldn't work for BSDs! But in my defense the situation, at least originally, was more political than technical (at least as far as FreeBSD was concerned, I don't follow the other BSDs too closely). GCC switching to the GPLv3 was a big no-no for many people in the BSD world. As a result they were at the forefront to port the system to clang since they were stuck with an obsolete GPLv2 GCC version.


GCC is still important, but yeah Clang is definitely more popular and gaining popularity.

I don't think the performance difference between them is significant, but Clang is much nicer if you want to use it as a library, add a new backend, etc. Basically if you want to do anything other than run it from the command line.

Oh also, Clang does cross-compilation in a not totally insane way.

I wouldn't be surprised if GCC is abandoned eventually, but it will be at least 10 years.


> I wouldn't be surprised if GCC is abandoned eventually, but it will be at least 10 years.

Let's hope not. The healthy competition between the two projects has been a huge boon for the OSS community.


> I wouldn't be surprised if GCC is abandoned eventually, but it will be at least 10 years.

It will not be, because LLVM does not align with the goals of the FSF and GNU projects. A free compiler under a free license is a pretty central project for them. (Besides that, making predictions 10 years out into the future of software seems meaningless. Saying "this thing might still be around in like 2 decades, or not" is... not particularly noteworthy, nor damning)

Plus, I simply see no reason to believe this, even aside from that. If anything, GCC development has become far more active and improved greatly in the past few years, with consistently high-quality releases, each bringing major features, adopting work from Clang, and contributing unique work of its own. It would seem each of these projects has livened up quite a lot over time and is offering higher quality than ever before.

What reason do you have to believe it will be completely abandoned? Just because there's an alternative available? That never happened with Linux, either...


License mostly.

Given the number of BSD deployments, I bet that if Linux had never happened I would still be using Solaris, AIX, HP-UX, ..., as I don't imagine the UNIX vendors would have been so generous with the *BSDs.


GCC progress was pretty slow before the rise of clang. Clang introduced many features that weren't really on GCC's radar, like better formatting of error messages, tight integration of tools like the thread and address sanitizers, and in some areas even more and better warnings.

Meanwhile, at least for a short period, gcc was on this optimization fad where they'd take any occurrence of UB as a reason to silently eliminate chunks of code, rendering the whole program useless in some cases. With 4.8 they even broke SPEC 2006. The GCC folks always argued that doing so is legal per the standard, but eventually there was enough backlash that the worst instances of such aggressive optimizations were removed.
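
The pattern was roughly this shape (a made-up illustration, not the actual SPEC source):

  int d[16];

  int contains(int x)
  {
      /* Off-by-one: the last iteration reads d[16], which is undefined
         behaviour.  A compiler that assumes UB never happens may conclude
         the loop can only be left via "return 1" and quietly drop the
         exit test, turning this into an infinite loop when x is absent. */
      for (int i = 0; i <= 16; i++)
          if (d[i] == x)
              return 1;
      return 0;
  }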


The case I know of with SPEC was non-conforming code, not undefined behaviour. You really can't hold gcc responsible for bad code, though there's often a workaround like -fno-aggressive-loop-optimizations/-Wno-aggressive-loop-optimizations in that case. I'm glad GCC optimizes well and takes advantage of language definitions (specifically Fortran's) while being more reliable than ifort which gets lionized for optimization.


That's not the point. Of course you should write proper code and not do goofy things with pointers, etc. But what I don't think can be excused is that the compiler did this without warnings. And again the GCC folks' reply was "well, you can crank up the warnings with -Wwhatever-the-hell". But that's just plain stupid. Any time the compiler removes code because of UB, non-conformance, or a specific interpretation of ambiguous parts of the C (or whatever) standard, a warning should be emitted by default, because it's pretty much guaranteed that the coder either made a blatant mistake or tried to do something that's simply wrong, again either out of unawareness or expecting the compiler to still emit some code that will at least work on their specific platform.


I always advise turning on all warnings and maybe disabling them selectively, and using any static or dynamic analysis available (and understanding the warnings). Compilation time won't catch a lot of things anyhow, and most people just ignore warnings. It's a matter of judgement whether to issue specific warnings by default, but I don't want the noise from things like intentionally elided code from compile-time constants. In the case of aggressive-loop-optimizations I'd have thought the argument would be whether it should be on by default, or at something like -O2. [On my first workhorse systems, reading past the end of an array, as in the SPEC case, gave a hardware error if you didn't disable that.]

Incidentally, ifort turns on (or used to) generally-invalid optimizations without warning, which GCC doesn't, and the first question to ask about user complaints of SEGVs on the systems I support is "ifort?", if only because of an optimization which is valid but hits typical resource limits.


> I always advise turning on all warnings and maybe disabling them selectively, and using any static or dynamic analysis available

I absolutely agree. I actually frequently compile with both GCC and clang as they tend to find different issues at times. Same goes for valgrind and address sanitizer.

> but I don't want the noise from things like intentionally elided code from compile-time constants

Not that I'm in any way familiar with GCC internals, but shouldn't it be possible to only print warnings for problematic cases, and not for things like constant folding, unreachable code, redundant range checks, etc.?

> On my first workhorse systems, reading past the end of an array, as in the SPEC case, gave a hardware error if you didn't disable that.

That sounds like a better outcome than a program that runs but computes something entirely different because of a missing loop. Depending on the application, that could be a subtle bug that goes undetected for a while. Yes, the same could be true for just generating the corresponding unsafe stream of instructions (depending on platform, memory layout, moon phase), but I'm not even suggesting the compiler should just generate that broken code. In the past compilers did, because they weren't that elaborate yet in general. What I can't wrap my head around is this: any such "optimization" basically breaks the program already at compile time, guaranteed. When can this possibly ever be wanted? I'm not saying it's worse than just emitting code that might or might not crash the program because it reads or writes past the end of an array. But it's not better either, because the compiler already has the knowledge that your code sucks, and the GCC devs decided that instead of aborting compilation (my preference, obviously with an override switch) or printing a warning (by default), they'll just render the program useless in a different way. I feel like it's just not a good choice for handling that situation.

Or maybe look at it this way: in the past, code like the SPEC example almost always compiled to something that still worked as intended. This isn't great and shouldn't be encouraged, but that's just how it was. Now, with the new compiler and the same compile flags, suddenly everything compiles and runs seemingly fine but produces the wrong result. And the compiler knew perfectly well this would happen.

As for ifort, I've never even used it, but it sounds rather user hostile.

In general, I still like and use GCC. I think it's good they revised some of those more aggressive optimizations, and I really like how they also improved the readability of the compiler warnings a lot, following clang.


As someone who used to write C++, these are the reasons clang was great for development:

  - compiles faster
  - better error messages
  - tooling (clang-format) is great
When it came time to cut a release binary, you could always also build with gcc.


This is what I do. I use clang for everything except the final release. I get clang's speed and superior error messages while I'm editing code, and I get the very slightly but measurably faster code generated by GCC.


I've always used gcc on all platforms for C and C++, but since last year I also use clang-format.

On the other hand, the effort of learning clang options, error messages and paths in order to have two compilers available can wait indefinitely until I really need a second opinion about some compilation error; so far it hasn't happened.


You could build with bazel, which ships compiler configs for clang and gcc on linux (among other combinations) in its "CROSSTOOL"[1] file. Then it's trivial to switch compilers with one argument. This feature is a result of Google's years-long support of both GCC and clang for its own build, test, and release.

  $ bazel build --compiler=my_favorite_compiler //my/fine:program
1: https://github.com/bazelbuild/bazel/wiki/Building-with-a-cus...


Sorry, I lost you at "bazel". Do you believe constructing clang command lines in addition to gcc command lines is harder than setting up a particularly complex build tool like Bazel? Not to mention that it doesn't appear to support GCC and Clang on Windows.


Bazel comes with an installer package that you click and run. I didn't find it that complicated. It comes out of the box with CROSSTOOL support for GCC, clang, and MSVC on Windows, among dozens of other toolchains.

https://github.com/bazelbuild/bazel/blob/master/tools/cpp/CR...


The official documentation at https://docs.bazel.build/versions/master/windows.html says:

Build C++

To build C++ targets, you need:

•The Visual C++ compiler.

Whether the developers gave up on the Mingw64 GCC and Clang toolchains or it's a documentation mistake, this statement places Bazel support for Windows in the "don't touch with a 10' pole" category.

But my point was that adopting Bazel is obviously far more complex than adapting existing build scripts and build script generators for Make, Cmake, Ninja etc to use Clang.


Options are mostly the same, and you can easily switch between the two compilers in most build systems:

  Make:     make CC=clang CXX=clang++
  CMake:    cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++
  Autoconf: CC=clang CXX=clang++ ./configure


This seems like it would be a nightmare to deal with if things ever went wrong. Is this actually a viable option?


Just run CI + tests using the release compiler instead of the dev one. Not a big deal.

Also it’s trivial to change the dev compiler when debugging something that might be a compiler error (it never is though).


I had a similar workflow, except icc for release, which still outperforms gcc on certain types of codebases.


As a compiler hacker, I'll posit that LLVM is far easier to hack on. And of course the licensing makes a huge difference.

So in terms of mindshare among people who are doing compilers, it's kind of a no-brainer.


Among other answers, clang supports -Weverything, which I was using just last night.
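
Typical usage looks something like this (the muted warnings are just illustrative; -Weverything enables every warning clang knows about, so in practice you opt back out of a few):

  $ clang++ -Weverything -Wno-c++98-compat -Wno-padded main.cpp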


As I understand it, one benefit of LLVM is that it has a cleaner backend interface, so it is easier to create a new compiler by using LLVM as your backend rather than GCC's backend.

(In case it is not clear, just to be explicit: LLVM corresponds to only the backend part of GCC. Clang+LLVM correspond to the whole GCC C++ compiler.)


I think it really does boil down to the license: Apple absolutely hates copyleft, so they poured money into Clang.

It's sad, really — look how old bash & emacs are in macOS these days. And I really don't think that copyleft would hurt Apple: what compiler technology do they want to keep proprietary?


Apple started heavily investing in clang and friends because they wanted a compiler suite that was more modular than gcc and that could be used as a library in other things.

They wanted to do things like take chains of filters in Core Image and JIT compile them into a single optimized filter that could target the CPU or the specific GPU in that Mac. They wanted to use the compiler's parser as a library in Xcode to handle things like syntax highlighting.

Gcc could not do these things as it was at the time. The different parts were tightly coupled in ways that made it hard to separate things out. People had proposed making it more modular, but the gcc devs rejected those proposals, with RMS making it clear that the design was the way it was in order to make it harder for proprietary programs to use gcc's parts.

Apple could have forked gcc and changed it to be more modular, but that would have required a lot of changes and a lot of work, and since it would not be accepted upstream they would probably have to do a lot of work every time there was a significant gcc update.

It made much more sense in every way for them to go with clang and LLVM, because those were designed to allow the kind of modular, library use that Apple wanted.


More performant, I'm not sure. You'd have to see specific benchmarks, and the generated assembly, comparing the newest gcc and the newest clang.

I vastly prefer clang for three main reasons.

Firstly, clang and llvm are written in C++, so it's much easier for me to work with. Secondly, it has proper documentation on how to, for example, add new optimization passes or add a new instruction, including tons of recorded talks and presentations walking through it. Thirdly, the tooling for it is great: clangd, llc, lld, clang-tidy, etc.

Gcc seems to have none of this. Oh, and I almost forgot: most importantly llvm + clang is actually open source (more open) via a MIT license instead of gpl.


> most importantly llvm + clang is actually open source (more open) via a MIT license instead of gpl.

Being allowed to do more things isn't necessarily freedom. For example, is a society where I can enslave people a freer society? I don't think so.

The GPL protects the freedom of software /users/. In the same way that prohibiting slavery protects potential slaves, at the expense of limiting the actions of slavers.

For example, when OS X used GCC all OS X users could get the full and exact source of the compiler their system used. Now that OS X uses LLVM that isn't the case. You can get something close but Apple includes proprietary changes.


This is an absolutely standard issue with developers looking at licenses from their perspective rather than the users's.

It's tiresome to have to keep reminding people: the GPL tries to ensure user freedoms that might not align with what developers would want. This does not make it less free; it makes it less free for developers.


I was with you until the end of your comment:

> llvm + clang is actually open source (more open) via a MIT license instead of gpl

How is a license that allows someone or some corporation to take some code, add to it, and make those additions proprietary "more open" than a license that specifically disallows that type of scenario? The GPL ensures that derived code stays open while MIT does not. In my eyes, that's "actually open source". I can see the argument that licenses like MIT allow consumers of some piece of code more freedom to do as they please, but that hardly makes it more open.


The GPL's aim is to keep openness for the end user as well as developers. BSD/MIT, etc. are aimed squarely at the benefit of developers. I far and away prefer GPL3 over anything else. I'm somewhat disappointed the Linux kernel and important toolchain programs are still under GPL2. I'm not an RMS fanboy by any stretch, but in the end his views on the overarching concerns of GPL licensing and free/libre software are usually borne out as true.


> How is a license that allows someone or some corporation to take some code, add to it, and make those additions proprietary "more open" than a license that specifically disallows that type of scenario?

This gets argued all the time. What the corporation did was freedom for them not the code. GPL protects the code, MIT/BSD protects the people. Depending on your definition of "freedom" you will prefer one over the other.


GPL protects the freedom of the users of the software.


It protects the code which happens to protect the users, but it doesn't let people do whatever they want with the code. Which you prefer is up to you. Using the word freedom is just problematic when there are restrictions.


> clang is actually open source (more open) via a MIT license instead of gpl

Here's the obligatory retort that open source != free software: GPL (et al) is about guaranteeing end users' freedom to tinker, as opposed to MIT (et al), which are about allowing developers to use someone else's work in their product without restrictions. GPL enforces a share-and-share-alike ethos, while MIT asserts do-what-thou-wilt. They're not the same end goals.


GCC is written in C++ nowadays.


> Firstly, clang and llvm are written in c++ so it's much easier for me to work with.

Why is the implementation language important to you as a user?


Their interest seems to be primarily in altering the toolchain itself.

To make this a multi-use comment, I'll also reply to GP and say that in every single case where I've benchmarked time sensitive code, GCC has generated faster, tighter code. I still primarily use clang for other reasons, but saying that GCC is more performant seems to be entirely valid.


The answer is in the paragraph you quoted from. "work with" includes extending the implementation.


Two words: nested functions. They should have been in the C standard a long time ago; solving the same problems without them is a major pain. And even though Clang supports most of the rest of the GCC extensions, they stubbornly refuse to touch nested functions. That, and the code of conduct social justice warrior bullshit they've been pulling lately, keeps me away from Clang these days.
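
For reference, the GCC extension in question looks like this (a minimal sketch; clang rejects it):

  /* GNU C nested functions: add() can refer to variables of the
     enclosing function, here the accumulator "total". */
  int sum(const int *a, int n)
  {
      int total = 0;
      void add(int x) { total += x; }

      for (int i = 0; i < n; i++)
          add(a[i]);
      return total;
  }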



