
Extracting the abstract syntax tree from GCC - mkesper
http://lwn.net/SubscriberLink/629259/6fc0d45326725ba5/
======
bedatadriven
This is precisely what we do in Renjin in order to translate C/Fortran to
Java:

[https://github.com/bedatadriven/renjin/blob/master/tools/gcc...](https://github.com/bedatadriven/renjin/blob/master/tools/gcc-bridge-plugin/src/main/c/plugin.c)

This little plugin simply dumps GCC's gimple to JSON, yielding for example:

[https://github.com/bedatadriven/renjin/blob/master/tools/gcc...](https://github.com/bedatadriven/renjin/blob/master/tools/gcc-bridge/src/test/resources/org/renjin/gcc/dqrdc2.f.json)

LLVM's design looks very attractive, but for scientific computing Fortran is
really important, and AFAICT, LLVM doesn't have an intermediate representation
comparable to Gimple: LLIR seems to be at the level of the registers rather
than the nice abstract Gimple, but perhaps I just haven't looked deep enough.

~~~
buttproblem
> LLVM doesn't have an intermediate representation comparable to Gimple: LLIR
> seems to be at the level of the registers rather than the nice abstract
> Gimple

LLVM variables are sometimes called registers but they are machine
independent. The bit width is arbitrary (e.g., you can create a 129-bit
integer). The term register is a misleading analogy; it stuck because LLVM IR
is often compared to assembly language.

I do not know GIMPLE and couldn't find a good description of the IR
instructions. But, it seems that LLVM IR is somewhat similar to low GIMPLE.

~~~
bedatadriven
Will have to take another look, but it still sounds to me like LLVM IR is more
comparable to RTL
([https://gcc.gnu.org/onlinedocs/gccint/RTL.html#RTL](https://gcc.gnu.org/onlinedocs/gccint/RTL.html#RTL)).

Gimple variables have types comparable with C types - you get pointers,
arrays.

For example:

    
    
      int sum10(int values[10]) {
        int i;
        int sum = 0;
        for(i=0;i<10;++i) { sum += values[i]; }
        values[0] = 342;
        return sum;
      }
    

is compiled down to:

    
    
      sum10 (int * values)
      {
        long unsigned int D.1598;
        long unsigned int D.1599;
        int * D.1600;
        int D.1601;
        int D.1602;
        int i;
        int sum;
    
        sum = 0;
        i = 0;
        goto <D.1595>;
        <D.1594>:
        D.1598 = (long unsigned int) i;
        D.1599 = D.1598 * 4;
        D.1600 = values + D.1599;
        D.1601 = *D.1600;
        sum = D.1601 + sum;
        i = i + 1;
        <D.1595>:
        if (i <= 9) goto <D.1594>; else goto <D.1596>;
        <D.1596>:
        *values = 342;
        D.1602 = sum;
        return D.1602;
      }
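
To make the dump easier to read, here is a rough hand transliteration of it back into plain C, next to the original function. This is only a sketch: the names `sum10_lowered` and `t1`..`t5` are mine (standing in for `D.1598`..`D.1602`), and the labels are renamed. The one non-obvious step is the pointer addition: the dump multiplies the index by 4 before adding it to `values`, so the offset is in bytes, which the cast through `char *` reproduces.

```c
/* Original function, as in the comment above. */
int sum10(int values[10]) {
    int i;
    int sum = 0;
    for (i = 0; i < 10; ++i) { sum += values[i]; }
    values[0] = 342;
    return sum;
}

/* Hand transliteration of the GIMPLE dump (hypothetical names). */
int sum10_lowered(int *values) {
    unsigned long t1;   /* D.1598 */
    unsigned long t2;   /* D.1599 */
    int *t3;            /* D.1600 */
    int t4;             /* D.1601 */
    int t5;             /* D.1602 */
    int i;
    int sum;

    sum = 0;
    i = 0;
    goto cond;
body:
    t1 = (unsigned long) i;
    t2 = t1 * sizeof(int);               /* the dump shows the constant 4 */
    t3 = (int *) ((char *) values + t2); /* byte offset, as in the dump */
    t4 = *t3;
    sum = t4 + sum;
    i = i + 1;
cond:
    if (i <= 9) goto body; else goto done;
done:
    *values = 342;
    t5 = sum;
    return t5;
}
```

Calling both on an array holding 1 through 10 should return the same sum and leave 342 in element 0, which is the sense in which the lowered form keeps C-level types (pointers, ints) while making every intermediate step explicit.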

~~~
rwallace
As far as I know, LLVM does exactly the same thing; the LLVM representation of
that code would have a different syntax of course, but semantically would be
exactly equivalent to the Gimple version you've presented.

------
nanolith
Typically, it's not a good idea to argue with RMS. The pragmatic approach
would be to just write the plugin for GCC, and keep it separate. GNU can
certainly control what is mainlined, but the GPL is not violated simply by
creating a GPL plugin that provides a useful output target.

If Emacs were enhanced to support a standardized file format which contains
the relevant information, then it could accept this format from any tool. The
GCC plugin could output this format, as could Clang/LLVM. Because the format
is universal, it likely would be mainlined.

Then, it's up to the user. If the user wants to integrate the two, the user
can apply the appropriate patch to GCC to incorporate this independent plugin.
Likely, many downstream package maintainers would add this as a standard
option if demand were high enough. This is the very freedom that RMS has
advocated for. It can be done without poking the bear.

~~~
hga
You're ignoring that a lot more work needs to be done in GNU Emacs, and that's
another program under RMS's stewardship, and one where he could be much more
obstructive.

The only outstanding threat to fork that I know of is coming from a GNU Emacs
maintainer, to I believe fork GNU Emacs. Again. Oh boy, again to support C++
development! How could I have forgotten that?!?!?!!!

But you're right, arguing with RMS on this is likely to be completely
unproductive. But it is morally necessary before potentially disastrous
forking of GNU Emacs and maybe GCC.

~~~
nanolith
Actually, I didn't ignore that in my original comment. I mentioned that the
GNU Emacs work should make use of a file format which is independent of the
plugin work. There is no reason why GNU would refuse to mainline such an
enhancement if it had merits outside of the GCC dependency.

RMS wishes to restrict possible proprietary use of the GCC front-end. That's
his prerogative. However, he can't prevent the existence of such a tool. He
can only prevent it from being mainlined.

EDIT: Also, it may be easier to make a case for mainlining if the
functionality is demonstrable. As I've learned throughout my career, it's
often better to ask forgiveness than to ask permission. RMS excels in the
idealistic and theoretical. If he has to consider the practical -- such as
whether to mainline already existing functionality -- then he may find ways of
suggesting changes to the plugin rather than ways of saying "no."

~~~
hga
" _There is no reason why GNU would refuse to mainline such an enhancement if
it had merits outside of the GCC dependency_ "

Except everyone who knows RMS well enough knows he'd be extremely
obstructionist and indeed GNU would refuse to maintain/mainline such a GNU
Emacs enhancement. That's one of the reasons I believe the GNU Emacs
maintainer threatened to fork it; he obviously knows RMS well enough.

(So did I, in the 1979 to late '80s period, and he's not changed in this sort of
thing. And he was reliably reported to be like this before I showed up on the
scene, by among others one of the two initial ITS TECO EMACS beta-testers. And
others I knew less well.)

------
nemoniac
According to GNU, the second of the four essential freedoms is the right to
study how the program works (among other things).

One could argue that Stallman is not acting within the spirit of this right by
being deliberately obstructive and obfuscatory.

------
sesutton
It's hard for me to come up with a way in which this is not RMS advocating
restricting the freedom of users of GCC.

If he is so opposed to proprietary software using GCC output why not use a
license that prohibits that? It's not even too late for new versions since the
FSF owns the copyright to all GCC code.

~~~
overgard
That's the great irony of the copyleft movement -- it's about restricting what
people can do under the guise of "freedom". RMS doesn't want people to
"misuse" his software, but his version of misuse is "use it for a purpose that
doesn't advance GNU." He's willing to cripple his software so that some
theoretical entity that doesn't actually exist and does not agree with his
politics can't misuse it.

On a practical level, if people wanted to execute that particular "misuse"
(i.e., use it in non-GPL software, which I would just call a "use"), they
would obviously use clang. Clang is basically better at this point anyway.

Basically what I'm trying to say is that RMS is a nutcase.

~~~
cbd1984
> That's the great irony of the copyleft movement -- it's about restricting
> what people can do under the guise of "freedom".

This is extremely misleading: Copyleft is about making sure everyone has the
same rights.

> Basically what I'm trying to say is that RMS is a nutcase.

And here you prove you have no serious argument, by devolving to insults.

~~~
overgard
> This is extremely misleading: Copyleft is about making sure everyone has the
> same rights.

In theory... except in this case, the point is ensuring that people who don't
share his political leanings can't use the product in a way he doesn't like
(to generate an AST). It's not even about modifying gcc in this context, he
doesn't want people to "misuse" it. It's entirely about telling people what
they can and can't do with their software. Where is the freedom in that?

> And here you prove you have no serious argument, by devolving to insults.

I'm calling him a nutcase because he's being paranoid by crippling his own
software to prevent a thing that people can already easily do with a
competitor (clang). It's not like people are like "oh no gcc can't emit an AST
I guess I'll give up and forget the entire venture" they just go "well I guess
I'll go use clang instead"

------
zeograd
As much as I love GNU tools "universe", I feel that this ideological position
of RMS regarding GCC future is hurting more libre software than it does good.

That's a bit saddening.

~~~
astrodust
The GPL principle of "all your code belongs to everyone" is infuriating to say
the least, and the efforts of those trying to force this on people are even
more damaging to the open-source community.

Free, open-source software has helped _considerably_ in the last twenty years.
That's why I think licenses like MIT or BSD help keep software free and truly
open: Use it for what you see fit without any obligation on your part.

That's free.

------
listic

        "misuse of GCC front ends"
    

While Stallman's intent is the opposite, this sounds awfully like the kind
of thinking of big companies that are so easy to hate, where one can "misuse"
a binary by reverse-engineering it.

------
overgard
This is why I prefer the BSD license for everything I do. In Stallman's
universe, "freedom" only exists as long as it furthers his cause. What a
waste. As far as I'm concerned, if I write something and release the source,
I'm more than happy with people using it for whatever they want.

------
userbinator
Isn't GCC itself GPL with all the source available? IANAL, but AFAIK that
means as long as you release your modifications under GPL too, it doesn't
violate any license. Or are they saying that GCC doesn't actually use an AST
somehow, which doesn't match at all with my (admittedly limited) knowledge of
it?

I'm someone who has experience with what can be done in terms of
extending/modifying software even _without_ any source code, so my perspective
on things like this is more like... "you have the source _and_ an explicit
license to modify it, and no intention of keeping your changes proprietary, so
just do it!"

Either that or move to clang/LLVM, which I think would be better in the long
term.

~~~
TheDong
The issue is not that the plugin wouldn't be GPL; it would. The concern is
that the AST dumped is not code, but an artifact, and thus can be used by
proprietary software and make proprietary software better.

I could write a program that compiles your gcc-produced AST for a new
architecture, and not GPL it because the AST itself probably isn't GPL
protected. That's the fear.

~~~
laichzeit0
The fear that someone else might be "better" than you so you're intentionally
going to make it impossible for them to even compete is a really dumb
argument. Imagine if this was done in sports.

And that seems to have been the whole intention behind this. Make sure no
non-GPL backend can be used because god forbid it's better than ours. The shame!

~~~
jordigh
> The fear that someone else might be "better" than you so you're
> intentionally going to make it impossible for them to even compete is a
> really dumb argument. Imagine if this was done in sports.

It _is_ done in sports. Most sports forbid you from taking drugs that enhance
your performance, or are segregated by sex in order to account for the
"natural" (?) unequal distribution of muscular mass across sexes, because we
think these would be unfair advantages.

The idea here is that the big proprietary software companies are not playing
on a level field, since they are not disclosing their source code. The GPL, as
I understand rms's intentions, is designed to inhibit that advantage and force
everyone to play fairly: if you take code, you must give back code.

I don't understand much about the particulars of this case, and public opinion
seems to be heavily slanted against rms. Nobody seems to agree with him here,
but he's been considered a fool in the past when he was more of a Cassandra.
His positions usually come from past experiences he's had.

Someone else mentioned elsewhere in this thread that historically the fear was
of Intel using gcc as a frontend for icc. Seeing how Intel still contributes
code to gcc and how they still maintain icc and how they still love publishing
non-free software, perhaps this is still a valid fear. Or perhaps everyone is
just using LLVM already for this purpose and it's not likely they would use
gcc instead. I don't know.

I don't want to cast an opinion on this case beyond what I've already said.
Maybe rms is wrong or maybe he's right.

~~~
dalke
"if you take code, you must give back code."

That's not the intention. It's a forward mechanism, not a backwards one. If
you distribute code to person/company/entity X then you must also distribute
the source code, or provide a way for X to get the source code.

If you make changes but never distribute the code then you don't need to
distribute your modified code to anyone. If you only provide a service then
you don't need to distribute the code to anyone. Even if you modify gcc, sell me
a copy, and distribute the changed source code to me, that doesn't mean you or
I need to give the changes back to the gcc project.

The "fear", as I understand it, is that the Linux kernel and gcc are the two
biggest GPL-based projects. If gcc is diminished, then fewer people will be
aware of, much less agree with or advance, the ideals of software freedom. The
technical advantages of gcc are used as marketing for software freedom.

Hiding GNU projects behind commercial and proprietary solutions front-ends
reduces the marketing. Hence also the issue about "GNU/Linux" as "Linux"
diminishes the marketing of the GNU project.

~~~
falcolas
While incidentally true and pursuant to his real goal, I don't believe this is
his ultimate goal. His goal is, and always has been, to make sure that you as
the user of the software built upon his work, are always capable of opening it
up and looking under the hood, and making adjustments as you see fit.

Throw a proprietary module on the end which relies on his work but gets around
the GPL, and you've lost that right. That's not acceptable to him.

~~~
dalke
I believe you've just restated what I wrote.

Quoting from the Free Software Definition page at
[https://www.gnu.org/philosophy/free-sw.html](https://www.gnu.org/philosophy/free-sw.html) :

> You should also have the freedom to make modifications and use them
> privately in your own work or play, without even mentioning that they exist.
> If you do publish your changes, you should not be required to notify anyone
> in particular, or in any particular way.

There is nothing that obligates giving back code. Rather, such an obligation
would run against the four freedoms of free software. Hence, "if you take code,
you must give back code" is not, as jordigh understood, one of rms's intentions.

~~~
falcolas
True! Sorry, I was responding mostly to this phrase:

> If gcc is diminished, then fewer people will be aware of, much less agree
> with or advance, the ideals of software freedom. The technical advantages of
> gcc are used as marketing for software freedom.

------
insaneirish
> It took many years before the GNU Compiler Collection (GCC) changed its
> runtime library exemption in a way that allowed for GCC plugins, largely
> because of fears that companies might distribute proprietary, closed-source
> plugins.

Am I the only one who finds it ironic that purported "free as in freedom"
software is/was being held back because of fears of what someone can do with
that freedom?

Though once useful, the GPL and especially the AGPL are harmful, restrictive
licenses that have outlived their usefulness and in no way represent
"freedom".

~~~
rayiner
The most free world is not the one where there are the fewest restrictions.
Restricting peoples' natural freedom to kill each other or steal from each
other, for example, arguably makes everyone more free. The GPL is based on the
same principle. Restricting certain antisocial behavior increases net freedom.

~~~
hga
" _natural freedom to kill each other_ "

Although that's not a binary thing, I have the natural and legal right to kill
in legitimate self-defense. In rare situations, stealing is allowed to prevent
a greater loss.

In this case, most of us, even those of us who don't like the GPL in practice
like me, are not primarily arguing the GPL, but RMS's stewardship of GCC and
GNU Emacs about something that's allowed in I believe every FOSS license.

Just not in these two projects under the aegis of the FSF. Unless and until
they fork, or RMS is somehow convinced, which history says is not the way to
bet.

~~~
leoc
> In this case, most of us, even those of us who don't like the GPL in
> practice like me, are not primarily arguing the GPL

However, people _are_ making the argument that this proves that GPL<BSD, and
so it needs to be rebutted until they stop making it. Speaking of which, it's
_particularly_ strange for people to take the position that RMS's behaviour
here is anti-freedom and thus obnoxious, and therefore GPL sucks. It shouldn't
be necessary to point out that he could pull the same stunt just as easily if
GCC were under a BSD license. In fact he could achieve the same effect even
more easily, because he wouldn't have to rely solely on constantly changing
the AST interface, or the threat of doing that: he could put future official
versions of GCC under a new license which tried to put up legal barriers to
AST access, or threaten to do so. More broadly, it's hard to imagine any
consistent position under which RMS attempting to exert some _de facto_
proprietary control over GCC is tyrannical, but licenses which would have
given him the authority to make future official GCC versions _de jure_
completely or semi-proprietary are upholding _freedom_! Of course, you _can_
argue that RMS' behaviour here proves that he is being hypocritical when he
denounces others for exercising proprietary control of software, through
proprietary works derived from FOSS software or by other means. But just as
obviously, catching someone in hypocrisy doesn't necessarily mean that they
were wrong.

Moreover, it needs to be addressed not only because people are drawing
incorrect conclusions about GPL vs. BSD, but because it shows that they don't
understand what's going on here. The problem here isn't licensing terms, it's
Too Big To Fork
[https://news.ycombinator.com/item?id=6810259](https://news.ycombinator.com/item?id=6810259)
. "Freedom of the press is guaranteed only to those who own one." Speaking of
which, it's strange. Here RMS is using his position to exert some _de facto_
proprietary control over GCC, a big, hard-to-replace hairball of code with
lots of clients locked in to its interface, and widespread anger is (quite
appropriately) the result. Yet the Web browser vendors have honed this kind of
behaviour into a routine, regularly exerting their _de facto_ power over the
Web, and sometimes in nakedly self-serving and destructive ways, and people
have been conditioned to hail and venerate this as the Open Web.

~~~
hga
My only quibble is that since RMS has wrapped himself and this issue in the
flag of the GPL and "the [foundation] for which it stands", in terms of the
instant controversy, and e.g. conflicts with OpenBSD, they are indivisible.

But I hope I've made it very clear that I indeed view this solely as a
stewardship issue that could be true under any FOSS licence, and I repeat that
the GPL is a tolerable licence for something like the GCC (well, it might doom
serious Ada to the proprietary world, but, eh).

------
glenjamin
The summary from this seems to be that RMS is against having a non-free tool
consume the output of a free tool? Is that an accurate summary?

If so this seems like a very strange position to take to me.

~~~
anon1385
You have to understand the perspective of the FSF. They view non-free software
as fundamentally morally evil. In the same way that most people today view
slavery as fundamentally wrong. The comparison to slavery is one RMS and the
FSF often make themselves.

To extend the analogy: if picking cotton was only economically viable if done
with slave labour, then most people would agree we are better off not picking
cotton at all. Similarly the FSF position is that if solving a problem using
computers is only possible in a way that uses or enables non-free software,
then we are better off not using computers to solve that problem at all.

~~~
hga
In his inimitable style, Theo de Raadt, extremely annoyed by a 2007 effort to
GPL a driver that OpenBSD reverse engineered with much sweat, notes that RMS
has been perfectly fine with GCC having code to run on, and built binaries
for, closed platforms....

This GCC stuff is very much a shades of grey situation in practice.

------
eggnet
This seems analogous to the stable kernel ABI difference between Linux and
FreeBSD. But I haven't seen any particularly deep articles comparing them, and
our treatment of the respective project leaders.

------
willvarfar
At Symbian we used a GCC backend called GCCXML
[http://gccxml.github.io/HTML/Index.html](http://gccxml.github.io/HTML/Index.html)
to parse our c++ code to drive various scripts and tools.

It never did function bodies, but then we never needed it to.

This was over 10 years ago now; frustrating that GCC has ever made clang
necessary :(

------
jesuslop
Excuse me, but what's so wrong with the existence of a proprietary gcc
backend?

~~~
rwmj
History. RMS was worried that "someone" (basically, Intel) would be able to
take the gcc frontend and attach the icc (very fast, proprietary) backend to
it, thus creating a compiler which could compile the whole GNU stack,
producing very fast code. This would entice people to use a non-free compiler.

Since then LLVM came along and made the whole point moot, since it has a
modular front end which can parse all the gcc extensions.

------
wging
It was my understanding that people have already extended emacs in the usual
way to do lots, using llvm or clang, and that this argument is only about
trying to do things that can make it into emacs proper. Is that not right?

------
politician
Stallman appears to make himself out to be a Capitalist with positions like
this -- he simply takes profits in the furtherance of his cause rather than in
dollars, and uses procedural controls like the GPL rather than technological
controls like obfuscated binaries to do so.

That said, personally, I think the GPL has both helped and harmed the
evolution of computing.

~~~
nitrogen
The entire point of Copyleft was to exploit copyright against itself.

------
gre
Code terrorism at its worst.

------
tacos
He played this game with precompiled headers a decade ago and there's STILL
barely any support for it in the GNU tools. When a debate around implementing
precompiled headers ten years after everyone else gets them requires lawyers
to get involved, maybe freedom isn't ringing as clearly as we were promised.

There's something fundamentally weird about the whole dev ecosystem wrt
developer productivity. Next time you're sitting in front of an 8 CPU box with
an SSD trying to solve a CPAN or library dependency, watching one C file
compile once every three seconds, perhaps think about why things always wind
up like this.

I super respect Stallman's position but at some point you have to look around
(and beyond!) and say "why are we working with crippled tools?" With few
exceptions all the core projects are uncontributable at this point. And RMS
isn't even the worst; he's just the most consistent and predictable.

~~~
falcolas
Because RMS' stance has nothing to do with your productivity, and has
everything to do with giving you the right to look at every single detail of
GCC and make it work better for yourself.

If he did open it up for a proprietary component to make its way into GCC
(while avoiding the licensing terms), and it becomes mainstream, you have lost
that right to view and modify the software that runs on your box, and that is
something he does not view as acceptable.

Ultimately, it's his decision; it's based on his work, and the contributors
agreed to his terms. That he makes it freely available with the condition that
it remains freely available is a bonus, not a right.

~~~
kps
> the right to look at every single detail of GCC and make it work better for
> yourself.

And yet, the referenced discussion breaks down into RMS on one side and
_people who want to make gcc work better for themselves_ on the other.

~~~
falcolas
> And yet, the referenced discussion breaks down into RMS on one side and
> people who want to make gcc work better for themselves on the other.

At the risk of someone building a closed backend off an artifact of the GCC
frontend; at the risk of someone profiting off RMS' work without following his
conditions for making that work free.

It's important to remember that he isn't against Emacs getting the information
it needs to do its work. In fact, this mirrors a common discussion in computer
security, and system administration in general: you give applications and
users only the permission they need, to protect against bad actors. If Emacs
doesn't need the full AST, why would you give it to them?

~~~
tree_of_item
Emacs does need the full AST. The whole point of Emacs is to allow users to
write applications that were UNANTICIPATED by its developers. You can't
predict exactly what sort of information users will need, so you can't make
some "safe" amount of information available.

It's mind boggling that this even needs to be explained. I get the sense that
rms doesn't really grok the Emacs philosophy anymore.

~~~
hga
Urk! You might well be right, I've been assuming he doesn't grok C++, which he
doesn't use.

This would be terrible, for as far as I'm concerned, his single greatest
contribution to computing is the EMACS philosophy that he developed out of a
primitive set of TECO macros. (I believe everything else would have happened
more or less, e.g. without Linux, BSD would be ruling now after a 2 year pause
for that AT&T lawsuit.)

------
spork1
comparing emacs to xcode/idea made me lol

~~~
gumby
> comparing emacs to xcode/idea made me lol

Why did it? Emacs is a great development environment. It allows you to focus
on the code and supports debugging right next to your code, same as an ide. It
is an ide.

Emacs lacks some features of xcode/eclipse et al but in exchange it has power
they lack. Why shouldn't it gain some of their features?

(I am assuming you loled because you considered emacs weaker. My apologies if
I got the sense backwards).

~~~
GFK_of_xmaspast
I've been programming in emacs since 1992, 2014 was the year I finally gave up
and started to move into IDEs (having to target android was the primary
motivation for using something else, but I'm rapidly getting spoiled, and am
going to shell out for clion once it's done).

There's stuff I miss about emacs, and I'll keep using it for text files and r
code and other stuff (it's still my primary python editor until I get around
to trying pycharm), but by and large I think I'm done with it for c/c++.

------
jimrandomh
Just let GCC die already. Not only is LLVM free from this issue, it's
technically superior across the board. The general replacement of GCC with
LLVM in most applications appears sufficiently inevitable that investing more
effort into GCC is wasteful.

~~~
frozenport
llvm can't build a working Linux kernel.

~~~
sanxiyn
It can. Patches are here:
[http://llvm.linuxfoundation.org/index.php/Main_Page](http://llvm.linuxfoundation.org/index.php/Main_Page)

~~~
cesarb
It can't if the phrasing is tightened:

"llvm can't build a working Linux kernel from unmodified sources, using the
.config from a major distribution"

It's getting closer; I believe it won't be long until clang/llvm can build a
Linux kernel from unmodified sources (getting Linus to accept the remaining
patches is a valid tactic).

