The Architecture of Open Source Applications: LLVM (aosabook.org)
230 points by ryannielsen on May 30, 2011 | 29 comments


I've been hearing about LLVM for quite some time, but never fully grasped what it is. This article tackles everything you should know about LLVM. I'm finally starting to get it.

Perfect.


I've never built a compiler, but how well does having a common optimizer for all platforms work in practice? (11.5.1) Does it cause a "lowest common denominator" effect, making the code take less advantage of the target architecture than an architecture/vendor-specific compiler would?

I'd love someone with the experience of building compilers to chime in.


If the compiler takes a full machine description into account for each of the target architectures (AFAIK, LLVM does that), it won't be based around a "lowest common denominator". The machine description is a list of instructions and what they do, along with things like timing and scheduling constraints. The task of the compiler is to find the most optimized instruction sequence that does a certain thing for the architecture it is compiling for. This principle is the same for every architecture, so you don't need a specific optimizer per platform.
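To make that concrete, here is a hypothetical sketch in plain C++ (not LLVM's actual TableGen-based machine descriptions; every name here is made up for illustration): the selection logic is generic, and only the per-target cost numbers differ, so no target is reduced to a lowest common denominator.

  // Hypothetical sketch, not LLVM's real representation: one generic chooser,
  // parameterized by a per-target "machine description" of instruction costs.
  #include <iostream>
  #include <map>
  #include <string>

  struct MachineDesc {
      // Cost (say, cycles) of each generic operation on this target; a real
      // description also covers operand constraints, scheduling, legality, etc.
      std::map<std::string, int> cost;
  };

  // Generic logic: pick between two equivalent sequences using only the
  // target's own numbers; nothing target-specific is hard-coded here.
  std::string lowerMultiplyByEight(const MachineDesc &md) {
      return md.cost.at("shl") < md.cost.at("mul") ? "shl x, 3" : "mul x, 8";
  }

  int main() {
      MachineDesc fastMultiplier{{{"shl", 1}, {"mul", 1}}};  // hypothetical target A
      MachineDesc slowMultiplier{{{"shl", 1}, {"mul", 4}}};  // hypothetical target B
      std::cout << lowerMultiplyByEight(fastMultiplier) << "\n";  // prints: mul x, 8
      std::cout << lowerMultiplyByEight(slowMultiplier) << "\n";  // prints: shl x, 3
      return 0;
  }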


That's the theory, but you do run into problems if the entire parameterization misses some architectural differences that didn't exist when it was first developed (or weren't important). That's one problem gcc has, that its architecture description is aged and doesn't have a good way of specifying some things that are very important for 2010s performance, so has had to undergo some major surgery (still in progress). LLVM currently avoids that partly by being newer. It might also be better designed; we'll see how it ages over the next 20 years. =]


> That's one problem gcc has, that its architecture description is aged and doesn't have a good way of specifying some things that are very important for 2010s performance, so has had to undergo some major surgery (still in progress). LLVM currently avoids that partly by being newer.

LLVM also currently avoids that by not generating code that's as good for those 2010s architectures.

I would also be interested in hearing you expound upon those things that are "very important for 2010s performance".


> I would also be interested in hearing you expound upon those things that are "very important for 2010s performance".

I'm not particularly an expert in this area, but my understanding from articles/mailing-lists/etc. was that there were inadequacies in the previous machine-description and IR frameworks that led to the revamping with Gimple and MD-RTL over the past few years. Some of that was just maintainability, but I thought there were also optimization-related issues with the old internals. Is that not the case?


It's hard to say without reference to specifics; certainly the MD files have been around since time immemorial, even if they've been augmented with new features over the years. And the gimple bits were primarily aimed at making the in-memory representation more efficient, a goal which they accomplished.

In any event, as you say, things have been revamped, so whatever issues there were with the old internals no longer apply, right? So there's no point in dragging out old chestnuts like "GCC can't support modern architectures or modern optimization techniques", because that's not true anymore, right?

(I apologize if this comes across as harsh; I'm just tired of seeing LLVM articles where commenters appear to be drinking the LLVM Kool-Aid without having any idea what the LLVM folks are grousing about. Sometimes the LLVM folks have a point, sometimes they're just asserting their engineering decisions are superior, which is debatable, and sometimes they're just grousing because they don't seem to like GCC. It's hard to say exactly what's in view from commenters and from the LLVM folks themselves.

GCC currently supports 8-bit microcontrollers, 32/64-bit desktop chips, a few "nonstandard" VLIW and DSP architectures, and lots of other chips in between. I, for one, am impressed with how much Clang and LLVM have done, but I'll also be more impressed if, in a decade and a half, LLVM's architecture hasn't acquired some warts and it seriously supports more than two architectures. After all, GCC was, in many ways, state-of-the-art when it first came out too...)


A child will grow old, but that does not change the fact that it is young _now_.

LLVM _is_ the new kid on the block, and gcc _is_ old. GCC is also wise, but I think, given a) that we have learned a lot about compiler writing since gcc first appeared and b) the support that LLVM has in both business and academia, the writing is on the wall for gcc as the favorite compiler, first for x86 and ARM, later for other architectures.

Of course that may change; gcc can evolve. However, I doubt that will happen fast enough. The FSF does not like allowing proprietary compiler plugins and (I guess) has too little manpower to work on gcc.


> the FSF does not like allowing proprietary compiler plugins and (I guess) has too little manpower to work on gcc.

Fortunately, the FSF is not the entity driving development of GCC. In fact, I can't remember a commit in the last five years (there have been ~60k commits, so there'd be plenty to choose from) made by an FSF employee. So there are plenty of people and companies focused on moving GCC forward.


Ah yeah, that's fair. I'd also be impressed if LLVM doesn't have warts in a decade or two, which is sort of what I was trying to get at with the end of my comment: that to the extent that LLVM has a cleaner architecture representation (if it does), it's mostly a function of it being a newer "clean 1.0" that hasn't yet had to adapt as architectures and optimization techniques change over time.


Ah, I see I missed that part of your comment. I think we are in agreement on that point, then.


As LLVM is used for everything from x86 to GPU code generation, I have some trust in it being able to hold up. But indeed, only time will tell.


LLVM itself knows almost nothing about GPU-specific architectural features. Emitting PTX from LLVM, for example, is mostly just a translation from one compiler IR to another compiler IR.


A lot of what gets taken out or changed by the optimizer comes from earlier parts of the compiler. Producing IR code to match the AST is much easier when you just focus on the tree node you're currently at and don't look around at the rest of the tree too much. Consider the recursive structure of the source code. A while loop is made up of a condition expression and a loop body. The condition expression can be made up of smaller expressions, which themselves may have more expressions as subcomponents. When producing IR code for the loop, you'd do something like

  label for top of loop
  [code to check condition]
  branch to end of loop based on condition check
  [code for loop body]
  jump to top of loop
  label for end of loop
Then you recursively generate the code for the loop's condition and body. When generating code for the condition check, you may have some subexpressions appear multiple times within the expression, even as simple as using the same variable twice. Neither of the AST nodes for referencing that variable can have the other as its parent (a simple variable reference is a leaf), so you'd have to do something extra in the IR code generator to remember having recently loaded that variable into a register. It's simpler to let the optimizer determine that a variable has already been loaded and does not need to be loaded again.

Even a simple optimization technique (local value numbering, essentially giving an ID number to every value you put in a register; a minimal sketch follows below) is enough to determine that the variable was loaded earlier in the computation of this expression (this technique can actually detect arbitrarily large common subexpressions given a well-behaved front-end). Global control flow analysis can detect redundant computation across multiple expressions -- it checks whether some expression will have already been evaluated[1] and where its result would be kept. The optimizer can also detect when the result of an expression never actually gets used[1].

Realistically, the optimizer is not the only phase that would be aware of optimization issues, but detailed, architecture-specific stuff can be handled by the back-end.

[1] These are actually undecidable problems. The compiler cannot catch all cases of them without also getting false positives, so the general approach is to look for situations which are guaranteed to be true positives (e.g. all control flow paths leading into this expression calculated it earlier).
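Here is that minimal sketch of local value numbering, in illustrative C++ only (not LLVM code; the instruction format is made up): each (op, operand value numbers) tuple gets an ID, and a repeated tuple is a redundant computation whose earlier result can be reused.

  #include <iostream>
  #include <map>
  #include <string>
  #include <tuple>
  #include <vector>

  // One three-address instruction: dest = lhs op rhs.
  struct Instr { std::string op, lhs, rhs, dest; };

  int main() {
      // Naively generated IR for "a*b + a*b": the second mul recomputes the first.
      std::vector<Instr> block = {
          {"mul", "a", "b", "t1"},
          {"mul", "a", "b", "t2"},
          {"add", "t1", "t2", "t3"},
      };

      std::map<std::string, int> valueOf;                          // name -> value number
      std::map<std::tuple<std::string, int, int>, int> numbering;  // (op, vn, vn) -> value number
      int next = 0;

      for (const auto &ins : block) {
          // Plain variables get a fresh value number on first use.
          if (!valueOf.count(ins.lhs)) valueOf[ins.lhs] = next++;
          if (!valueOf.count(ins.rhs)) valueOf[ins.rhs] = next++;

          auto key = std::make_tuple(ins.op, valueOf[ins.lhs], valueOf[ins.rhs]);
          auto it = numbering.find(key);
          if (it != numbering.end()) {
              valueOf[ins.dest] = it->second;  // redundant: reuse the earlier result
              std::cout << ins.dest << " recomputes an earlier value\n";
          } else {
              valueOf[ins.dest] = next++;
              numbering[key] = valueOf[ins.dest];
          }
      }
      return 0;
  }
This is the same table-driven idea that lets the optimizer catch the repeated variable load in the example above without the front-end doing anything clever.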


Personally, I would expect the opposite - the optimizer doing a lot of work on optimization passes that have minimal effect on one particular architecture.

Remember that the optimizer is working on a generic intermediate form - as long as that form is a reasonable analogue of the target architecture (e.g. you're not targeting a registerless stack machine), then any "optimization" applicable to the target architecture is also (in theory) capable of being applied to the intermediate representation.

This is, of course, assuming a well-implemented backend that efficiently transforms the IR to the target machine code, one that can use optimal instructions even when there's no direct IR analogue to them.
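To illustrate that last point with a hypothetical sketch (drastically simpler than a real LLVM backend, and all names are made up): a backend can fold several generic IR operations into one target instruction even when the IR has no single operation for it, e.g. a separate multiply and add into a fused multiply-add when the target offers one.

  #include <iostream>
  #include <string>
  #include <vector>

  // Hypothetical, drastically simplified IR: just opcode names.
  struct IROp { std::string op; };

  // Lower generic IR to target instructions: fold fmul followed by fadd into
  // one fma when the target has it, otherwise emit the operations one for one.
  std::vector<std::string> lower(const std::vector<IROp> &ir, bool targetHasFMA) {
      std::vector<std::string> out;
      for (std::size_t i = 0; i < ir.size(); ++i) {
          if (targetHasFMA && i + 1 < ir.size() &&
              ir[i].op == "fmul" && ir[i + 1].op == "fadd") {
              out.push_back("fma");  // one target instruction for two IR operations
              ++i;
          } else {
              out.push_back(ir[i].op);
          }
      }
      return out;
  }

  int main() {
      std::vector<IROp> ir = {{"fmul"}, {"fadd"}};
      for (const auto &s : lower(ir, true)) std::cout << s << " ";   // prints: fma
      std::cout << "\n";
      for (const auto &s : lower(ir, false)) std::cout << s << " ";  // prints: fmul fadd
      std::cout << "\n";
      return 0;
  }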


It depends: it works very well on C-like static languages, but it's basically useless for a dynamic language.


I think that's a little bit too broad a generalization. Rubinius and MacRuby (probably the biggest Ruby implementations after MRI and JRuby) are both based on LLVM.


Wow

Very good article. I started liking LLVM some time ago, and now I will use it in my programs; it seems really easy to do. I agree so much about the gcc design flaws (they forced me to create my own parsers and code generators).


As this essay discusses, GCC has serious technical flaws. LLVM is one of the best examples I've ever seen of "the best way to complain is to make things." That and the iPhone.


It is said that these "flaws" are deliberate in GCC, to prevent it from being reused as a component (e.g. as a syntax highlighting engine) in someone else's IDE that hides that it's using GCC under the hood.


Interesting. Is there any source for this speculation?



Wow, so they're actively preventing progress for political reasons? Why don't they try to solve their ideological licensing problems with, I don't know, licensing solutions? Crippling the underlying technology doesn't seem like a very hacker thing to do...



To anybody curious about how LLVM looks on the inside, I recommend reading their excellent tutorial on building a front-end, which you can find here: http://llvm.org/docs/tutorial/


Looking at the big picture raised one question for me: how does LLVM deal with dependencies between optimization passes?

It is well-known, for example, that whether you do register allocation before or after instruction scheduling can have huge implications for performance. Some programs prefer scheduling done earlier, and some prefer it later. gcc's optimization flags switch phase orderings among other things.

Does LLVM have any abstractions for managing phase orderings and intelligently picking between them? Or must the optimization classes be manually instantiated in the right order?


LLVM leaves that problem for users of the library. Users instantiate optimization passes and specify in what order to call them. For example, 'opt' has (http://llvm.org/cmds/opt.html):

  -{passname}
opt provides the ability to run any of LLVM's optimization or analysis passes in any order. The -help option lists all the passes available. The order in which the options occur on the command line are the order in which they are executed (within pass constraints).

  -std-compile-opts
This is short hand for a standard list of compile time optimization passes. This is typically used to optimize the output from the llvm-gcc front end. It might be useful for other front end compilers as well. To discover the full set of options available, use the following command:

  llvm-as < /dev/null | opt -std-compile-opts -disable-output -debug-pass=Arguments
I do not know whether the compilers 'included' with LLVM (clang, llvm-gcc) have support for tweaking the order. If they do not, you can always let them emit unoptimized IR and pipe the output through opt.
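For code driving LLVM as a library rather than through opt, here is a sketch of choosing a pass order in C++, along the lines of the LLVM Kaleidoscope tutorial; header locations and pass factory functions have shifted between LLVM versions, so treat the exact names as indicative rather than definitive.

  #include "llvm/Module.h"
  #include "llvm/PassManager.h"
  #include "llvm/Transforms/Scalar.h"

  // Run a hand-picked pipeline over a module; the passes execute in exactly
  // the order they are added, just like the -passname flags to opt.
  void optimize(llvm::Module &M) {
      llvm::PassManager PM;
      PM.add(llvm::createInstructionCombiningPass());  // peephole and algebraic simplifications
      PM.add(llvm::createReassociatePass());           // reassociate expressions to expose redundancy
      PM.add(llvm::createGVNPass());                   // global value numbering / CSE
      PM.add(llvm::createCFGSimplificationPass());     // clean up the control-flow graph
      PM.run(M);
  }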


Does anyone understand their purchasing table? It seems like some money goes unaccounted for: http://www.aosabook.org/en/index.html#purchase


Receiving money costs money.



