> I'm concerned about the gradual move from GCC to LLVM. The lack of copyleft protections on LLVM means that it's much more dependent on corporate sponsorship, and means that there's a risk that major improvements to LLVM compilers will only become available as proprietary products.
As someone who works within LLVM professionally, I don't think this is particularly likely -- compilers are massive and complicated beasts, and the "moat" for proprietary improvements is small: they're either niche and therefore not of interest to the majority of programmers, or they're sufficiently general and easiest to maintain by releasing back upstream (where Apple, Google, and Microsoft will pay a small army of compiler engineers to keep them working).
Your concern is the one that kept GCC from stabilizing its various intermediate representations for decades, which is why virtually all program analysis research happens in LLVM these days.
Edit: To elaborate on the above: neither Apple, nor Google, nor Microsoft wants to individually maintain LLVM. Microsoft appears (to this outsider) to be actively looking to replace (parts of) MSVC/cl with the LLVM ecosystem, because they're tired of maintaining their own optimizing compiler.
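To make the stable-IR point above concrete, here is a minimal sketch of the kind of out-of-tree analysis that LLVM's new pass manager plugin interface makes cheap to write; the pass and plugin names (CountInsts, count-insts) are made up for illustration, and exact build flags vary by LLVM version.

```cpp
// CountInsts.cpp -- a toy out-of-tree analysis pass for LLVM's new pass
// manager. It only counts instructions per function, but it shows why
// program analysis work gravitates to LLVM: you can build against a
// documented, reasonably stable IR and pass API without forking the compiler.
#include "llvm/IR/Function.h"
#include "llvm/IR/PassManager.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Passes/PassPlugin.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

namespace {
struct CountInsts : PassInfoMixin<CountInsts> {
  PreservedAnalyses run(Function &F, FunctionAnalysisManager &) {
    unsigned N = 0;
    for (BasicBlock &BB : F)
      N += BB.size();                 // number of instructions in this block
    errs() << F.getName() << ": " << N << " instructions\n";
    return PreservedAnalyses::all();  // pure analysis, the IR is untouched
  }
};
} // namespace

// Register the pass so `opt -passes=count-insts` can find it once the
// shared object is loaded with -load-pass-plugin.
extern "C" LLVM_ATTRIBUTE_WEAK PassPluginLibraryInfo llvmGetPassPluginInfo() {
  return {LLVM_PLUGIN_API_VERSION, "CountInsts", "0.1",
          [](PassBuilder &PB) {
            PB.registerPipelineParsingCallback(
                [](StringRef Name, FunctionPassManager &FPM,
                   ArrayRef<PassBuilder::PipelineElement>) {
                  if (Name == "count-insts") {
                    FPM.addPass(CountInsts());
                    return true;
                  }
                  return false;
                });
          }};
}
```

With an LLVM development install, something along the lines of `clang++ -shared -fPIC $(llvm-config --cxxflags) CountInsts.cpp -o CountInsts.so` followed by `opt -load-pass-plugin=./CountInsts.so -passes=count-insts -disable-output input.ll` runs it over a .ll file; doing the equivalent against GCC's internals has historically meant patching GCC itself.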
That is a weird way to say "your concerns were entirely correct". Every single proprietary mobile graphics driver uses LLVM for its shader compiler and the lack of copyleft has severely curtailed progress on open-source drivers.
Of course none of it goes back to LLVM, because updating your production copy of LLVM is just as messy and broken as the non-stable GCC IR you complain about, and because OSS development is still fundamentally incompatible with one-year industry release schedules that shift all the time.
> Every single proprietary mobile graphics driver uses LLVM for its shader compiler and the lack of copyleft has severely curtailed progress on open-source drivers.
And you think if they didn't have the option of using LLVM they would have released an open source driver instead? That makes no sense to me.
Yes. Developing a state of the art optimizing compiler is hard. If their only choices are developing their own compiler or basing off of gcc and releasing their work, they'd choose gcc.
> If their only choices are developing their own compiler or basing off of gcc and releasing their work, they'd choose gcc.
But that's the exact choice Apple faced in 2005 and they did not choose gcc. They paid Chris Lattner and his team to develop an alternative compiler. Excerpt from wikipedia[1]:
>"Finally, GCC is licensed under the terms of GNU General Public License (GPL) version 3, which requires developers who distribute extensions for, or modified versions of, GCC to make their source code available, whereas LLVM has a BSD-like license which does not have such a requirement.
>Apple chose to develop a new compiler front end from scratch, supporting C, Objective-C and C++. This "clang" project was open-sourced in July 2007."
For some reason, the enthusiastic focus on the benefits of GPL principles seems to ignore the actual game-theoretic behavior of real-world actors, which is to not choose the GPL at all.
Sure, if you have the resources of Apple, you can make other choices. I'd be willing to bet that Apple has spent more than $1B on clang, llvm, tooling, testing, integration, et cetera.
That's the wrong question. How many resources does someone have who decides to write an entirely new compiler when a working one already exists?
Of course the answer then reveals that this was only possible because there was comfy GCC to fall back on all along. They started in 2005; Clang became the default in Xcode with the 4.2 release in October 2011.
Ask yourself if you can sell your manager on 6 years of effort for no functional difference, likely even an inferior result.
Nah, they license a proprietary one. There are a whole bunch of them (ICC, Digital Mars, PGI before they were acquired by Nvidia...). They generally aren't as good as GCC, but they keep your code private and copyleft licenses away from private codebases.
I don't consider the improvements that ARM, Apple, Sony, Codeplay, NVidia, AMD, among others, don't upstream due to IP considerations or the risk of revealing hardware secrets to be niche.
But if they have IP considerations or trade secrets to protect, they wouldn't contribute them to any other open source project either. This way, they can at least upstream everything, which doesn't fall under these restrictions.
> They don't upstream everything, while reducing their development costs, that is the point.
_ph_ meant to write "This way, they can at least upstream everything which doesn't fall under these restrictions." (note the deleted comma before "which"), i.e., they can at least upstream something instead of nothing.
They don't upstream what they never would have upstreamed anyway. But they probably upstream a lot that they wouldn't have if they couldn't use the open source tool at all. Also, a lot of companies reduce their development costs by using open source software, GPL or otherwise, without ever upstreaming. So "saving development cost" doesn't sound like an argument to me.
With GPL there is a legal tool to force contribution, though.
As mentioned, I don't care; after university my main UNIX platforms were HP-UX, AIX, and Solaris with their respective system compilers anyway.
Linux is already getting replacement candidates in the IoT space via Zephyr, NuttX, Mbed OS, and Azure RTOS, and if Fuchsia eventually gets out of the lab, so be it.
> With GPL there is a legal tool to force contribution, though.
No, for practical purposes, there is not. You can only enforce delivery of the source when a product based on GPL software gets delivered. But of course companies that create software know about this. Any use of GPL software in delivered products only happens after the decision to publish the created software has been made. When in doubt, companies tend not to use GPL software as part of deliveries.
In practice, companies are more likely to upstream non-GPL code, not less. With the GPL they either decide to use it without making any changes, or they don't use it at all. With more liberal licenses they upstream anything that isn't core value, because the cost of maintaining a fork grows the more they diverge.
I would happily be using HP-UX, Solaris, and AIX to this day, so don't mistake my philosophical questions for GPL advocacy; just don't be surprised at how the IT landscape will look when GCC and Linux are no longer around.
In the age of cloud-based runtimes and OS-agnostic containers, Linux is an implementation detail of the server.
On Android, Linux is an implementation detail too; most of it isn't exposed to userspace, not even in the NDK, as Linux APIs aren't part of the stable interface specification.
Upstreaming non-GPL code is much easier for these companies than upstreaming GPL code, because a GPL upstream can "accidentally" leak all of a company's IP by requiring it to publish all of its software.
So the current options are: these companies don't open-source anything (which is what you get with GPL), or they open-source something (which is what you get with BSD, and in practice they open source a lot).
The claim that with the GPL they would open source _everything_ isn't true; they would just not open source anything instead. I also don't see any current legal framework that would allow them to open source everything without losing a significant competitive advantage, and I am very skeptical that such a framework can be conceived.
> On the contrary, the claim is that without the GPL, commercial UNIXes would still be around.
I'm not sure this is true. See LLVM as an example. GCC existing and being GPL only meant that, e.g., Apple couldn't properly use it. Apple could have bought LLVM and kept it private, or developed their own proprietary solution and not made it open source, or forked the open source project into a private project and never contributed anything back, etc. There were many options.
Open source software has always existed; if anything, the GPL demonstrated that a particular open source model does not work well for a big part of the industry, while it works well for other parts and for non-industrial usage.
Linux being successful seems incidental to it being GPL'd, at least to me. Other open source OSes, like BSD 4.x, have also been quite successful (powering the whole macOS and iOS ecosystems, after a significant Frankenstein transform into Mach). Maybe Linux would have been even more successful with a BSD license, or maybe it would be dead.
Exactly because I see LLVM as an example, given that not everyone using it contributes 100% back upstream.
I wouldn't consider the BSD layer from NeXTSTEP an example of success for BSD's market adoption; given that not everything goes upstream, by now it hardly represents the current state of affairs.
If anything, it represents what would have happened without Linux: all major UNIX vendors would continue to take pieces of BSD and not necessarily contribute anything back.
> I wouldn't consider the BSD layer from NeXTSTEP an example of success for BSD's market adoption; given that not everything goes upstream, by now it hardly represents the current state of affairs.
It's essentially what happens if (1) only one company wants to use the open source product, and (2) the open source product has a tiny community where no development happens.
In that scenario, there is no benefit for anybody who forks the open source project (a private company or an individual) in upstreaming anything. It just costs time and adds no value for them.
Linux and GCC were never like this (not even in the early days), and I think this is independent of their license. LLVM was never like this either.
In fact, many private companies maintain a GCC fork, like ARM, and due to the GPL they need to provide the sources along with a copy, which they do. But they are not required to reintegrate anything upstream, which they very often don't, and ARM support in ARM's gcc-arm fork is much better than in upstream GCC. A volunteer can't merge (review, rebase, ...) a 200k LOC patch in their free time, so once these forks diverge, it's essentially game over. You'd need a team of volunteers equal in manpower to what the private company provides.
So while the license affects which companies can use an open source project in practice, and what they can or cannot contribute, the GPLv2 and GPLv3 licenses are not a good tool for actually letting everybody benefit from those contributions.
Maybe a GPLv4 could require companies to upstream and get their contributions accepted, but IMO that would just get even fewer companies to use those projects.
1/3 of US internet traffic, not counting Juniper; then add all the PS2/3/4 consoles, macOS machines, FreeNAS boxes, EMC SANs,
the biggest German online seller, Sony Japan, Check Point, and so on... search for yourself.
BTW: market share means nothing; you know Android is NOT GNU/Linux.
EDIT: Maybe you're too young... but do you remember that SCO/Microsoft thing with Linux?
They did not... the opposite is true. They shat their pants because shortly before they had said "yes" to Linux; that's why SCO came: it was the first time you had really big money (IBM) behind Linux. Please don't change the timeline... it's kind of important.
>And as information, during the early days, the Internet ran on commercial UNIXes.
What are you trying to say with that? In the early days of smartphones, they ran on commercial OSes??
What if those big corps don't merge some improvements, to keep an edge over their competitors? And what if every one of those big corps does the same? How would that impact the project in the future?
They pay the costs of maintaining their fork. Thus there will be constant internal chatter about whether the non-merged stuff is really valuable enough considering the cost of maintenance. Sometimes yes, but sometimes no. So overall that is a long-term win.
> or they're sufficiently general and easiest to maintain by releasing back upstream
Exactly; maintaining a fork is a big pain. The ongoing cost of keeping it up-to-date is a pretty big incentive to merge it, even if there's not a legal requirement.
But Intel put proprietary work into ICC and their math libraries, so there is something on the periphery that's going to keep the worry alive even if it doesn't have a solid basis.
Well, to be clear, both ICC and MKL are fully proprietary. They do indeed justify the worry, but they're not in the same "open source but may be susceptible to proprietary creep" category.
> they're sufficiently general and easiest to maintain by releasing back upstream (where Apple, Google, and Microsoft will pay a small army of compiler engineers to keep them working).
Are you under the impression that C and C++ code within Apple, Microsoft, and Google is fundamentally different from C and C++ code elsewhere? Because it isn’t. Google’s engineers maintain LLVM’s ASan for their purposes, but those purposes happen to be everybody else’s as well.
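As a concrete (and deliberately trivial) illustration of that point, here's a minimal sketch; the file name is made up, and it assumes a Clang build with the sanitizer runtimes available.

```cpp
// overflow.cpp -- a deliberately buggy program that AddressSanitizer flags.
// There is nothing Google-specific about the bug: it's the everyday
// out-of-bounds write that ASan was built to catch for everyone.
int main() {
  int *a = new int[8];
  a[8] = 42;        // one element past the end: heap-buffer-overflow
  delete[] a;
  return 0;
}
```

Building with `clang++ -fsanitize=address -g overflow.cpp && ./a.out` aborts with a heap-buffer-overflow report pointing at the offending write, which is exactly the behavior everybody else benefits from.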
It may not be fundamentally different, but it may only use a subset of features (e.g., no exceptions in C++ code) and/or target a subset of platforms. Without pointing fingers, I am sitting on (what looks to be) a codegen bug in Clang that is not a high priority, I am pretty sure due to the above reasons.
This is not to say that I am not appreciative of all the work Google/Apple/etc engineers do in LLVM (I will be eternally grateful for the Clang targeting MSVC work).
Right, the reality is that if you want to be a profitable tech company today, you have to leverage open source. "Their use cases" includes, like, everything in a Linux distro. (And yes, this applies to Apple and Microsoft as well as Google.)
The necessity of open source for industry has both negative and positive implications for those of us who care about open source / free software as an ideal and not simply a tool of capitalism.

The negative one (which GCC's leadership failed to really internalize) is that the number of engineer-hours at the command of for-profit companies is much higher than the number of engineer-hours at the command of community-driven projects. If you deliberately build a worse product to prevent corporations from using it for profit, given enough time, the corporations will replace it.

The positive one, though, is that those engineer-hours are generally more valuable when pointed at some common cross-company codebase unless it is the specific thing that makes you money, and FOSS as an ideal provides a well-accepted model under which they can organize cross-company work (doubly so when they employ idealists like us as implementors). A compiler makes very few people money directly. It's generally a tool that you want to work well, and it's helpful to have other people run into the majority of problems and fix them before they cause trouble for the things that do make money.
So it's not surprising that LLVM is catching up with GCC, nor is it surprising that LLVM is and remains open source.

If you are concerned about the LLVM monoculture, build a competitor such that it is cheaper / more profitable for companies to work on your competitor than to either work on LLVM or build their own compiler. Figure out what will make companies want to contribute and encourage it. (GCC did not do this, but it is perhaps slowly relaxing here.)

If you are concerned about LLVM becoming proprietary, make it so that it is cheaper / more profitable for companies to release their changes instead of holding onto them; that is, figure out what will make companies feel like they must contribute and encourage it. (One common strategy, used by Linux itself, is to maintain a high level of internal API churn coupled with genuinely good improvements in that churn and a policy that people must update in-tree callers; at that point, the more API surface you use in your private fork, the farther behind you get, and you'll watch competing companies outpace you.)
> One common strategy, used by Linux itself, is to maintain a high level of internal API churn coupled with genuinely good improvements in that churn
Interesting angle that I had never thought of as deliberate. As someone who works on a (bespoke) integration/embedding of Chromium, I could say exactly the same thing about it too.
It's not churn for churn's sake. But having a policy of having drivers in-tree and explicitly not caring about out-of-tree stuff allows them a relatively free hand to improve the internals. Which is then seen as churn by out-of-tree code.
Meaning, internal things in Chromium change so fast that you'd slightly wish you could have your custom changes merged upstream (i.e., into Chromium)?
So that the people at Google would keep your code working, and you didn't need to spend time resolving git merge conflicts and compilation errors?
But you cannot, because some of those bespoke changes are "secret" and are what you make money from? (And maybe some changes are off-topic for Google.)
I wonder how much time it takes to merge a new Chromium version into your repo? Like, hours? Days? Weeks?
Also, the fact that Chromium is a high profile security critical software with occasional emergency security updates for bugs exploited in the wild, doesn't help at all when you want to maintain your own fork.
Personally, I see this as a good counterargument to "just fork it." Imagine if Chrome decided to crank up churn with the intention of exhausting forks. Maybe all the forks would join together to fight Google, or more likely they would just fall behind.