

LLVM IR is a compiler IR - pcwalton
http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-October/043719.html

======
mdda
LLVM IR = "Low Level Virtual Machine" (originally written to be a replacement
for the existing code generator in the GCC stack) "Intermediate
Representation" (virtual machine language, prior to being crunched into actual
machine code).

~~~
reduxredacted
I hope I'm not the only one here to upvote your comment simply because you
decompressed IR into Intermediate Representation.

I like to read about LLVM and some of the things that are a bit (ok, in a lot
of cases, _way_ ) above what I do day-to-day, except that it's sometimes taken
for granted that the readers know the acronym/reference. I read the whole
article and I understood what it was referring to, but for the life of me I
couldn't land on the two words because IR in my world is always infrared.

~~~
mdda
Somewhat infuriatingly, my simple IR->"Intermediate Representation" was my
highest ever scoring comment, and the one that pushed me over 500...

Lesson learned : Make something people want.

------
ot
These are exactly the reasons why I am very skeptical about pNaCl. I've done
some work with LLVM and I've been stuck on some of these points: the IR is
target-dependent, the function call conventions are basically those of C, the
optimization passes and the code generator are extremely slow when compared to
an ad-hoc JIT (this would mean terrible startup time for web applications).
And, furthermore, the size of the bitcode is still enormous, comparable to the
generated machine code or even worse (think about the size of C++ binaries).

I still think that LLVM is an incredibly great project; it brought compiler
infrastructure into the 2010s. But it is designed to be a static compiler
framework, and it hardly fits other purposes (think Unladen Swallow; I don't
know about Rubinius).

~~~
erichocean
LLVM is being used as a JIT in Open Shading Language, a DSL for advanced 3D
renderers. The JIT + resulting code was so fast, the OSL team dropped their
batch rendering API altogether. (The resulting shaders are 25% faster than the
hand-coded C language shaders they replaced.)

The LLVM JIT-powered OSL is now responsible for 100% of the shading
duties at Sony Pictures Imageworks. I'd say that's a resounding success for
LLVM as a JIT.

~~~
ajross
I'm not sure how that refutes the point. OSL is a shading language, that's an
environment dominated by runtime execution time, not startup latency. So it
might be a "JIT" in that there's no stored binary, but its performance
criteria are much (much!) closer to those of gcc or clang than to, say, a JVM
or a script interpreter.

And, like the post said: LLVM IR is a compiler IR. It's a great fit here.

~~~
erichocean
> I'm not sure how that refutes the point.

>> [LLVM] is designed to be a static compiler framework, it hardly fits other
purposes [...]

I presented a ready example where LLVM was being used _not_ as a "static
compiler framework", and was being used for "other purposes". If that's not an
outright refutation, it's at least a useful data point.

~~~
ajross
Right, and my point was that that ready example was as close to a "static
compiler framework" as you can get without actually being one. It's an
exception-that-proves-the-rule case.

~~~
erichocean
OSL shaders are specialized dynamically at runtime and then JIT'd using LLVM.
That's about as far from a static compiler framework like GCC or Clang as you
can get, from my perspective.

In addition, fast JIT speed was a very high priority/need and the OSL team
spent a few months carefully tuning the LLVM passes to give fast code at a low
JIT cost. (All of this was done and discussed publicly; feel free to grep the
mailing list.)

Would you mind explaining how OSL is as close to something like Clang or GCC
as you can get? I just don't see it.

~~~
exDM69
> Would you mind explaining how OSL is as close to something like Clang or GCC
> as you can get? I just don't see it.

It seems that Open Shading Language is not really Just-in-Time (JIT) compiled.
It's compiled from LLVM IR to native machine code Ahead-of-Time (AOT) but at
runtime and then executed. This is actually not very far from a static
compiler, the only difference is that the target architecture is known at
runtime and the LLVM IR is compiled to machine code using that information.

The difference between JIT and AOT here is that in a JIT situation, there
is some form of an interpreter executing byte code of some kind
(probably not LLVM IR). When this interpreter reaches a loop of some kind, it
will attempt to compile it (from bytecode to LLVM IR and finally from LLVM IR
to native code). The interpreter then calls the compiled code, which will run
for as long as possible and finally return control to the interpreter, which
will continue interpreting until it finds another opportunity for JIT'ing. A
JIT compiler is typically employed when the original source language cannot be
compiled statically to machine code, because of e.g. dynamic typing.

So from what I can tell, OSL is just a static ahead-of-time compiler, with the
final stage of compilation taking place at runtime.
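The AOT-at-runtime vs. hot-loop-JIT distinction described above can be
sketched in miniature (a toy illustration with nothing to do with the real
LLVM or OSL APIs; the tiny "bytecode" format and all names here are invented):

```python
# Toy sketch: AOT-at-runtime (OSL-like) vs. hot-loop JIT (v8-like).
# The "bytecode" is a list of (op, operand) pairs; "compiling" just
# means building a callable up front instead of interpreting each time.

def aot_compile(source):
    """AOT style: translate the whole program once, before execution."""
    def compiled(x):
        for op, n in source:
            if op == "add":
                x += n
            elif op == "mul":
                x *= n
        return x
    return compiled

class HotLoopJIT:
    """JIT style: interpret, count executions, compile once hot."""
    HOT_THRESHOLD = 10

    def __init__(self, source):
        self.source = source
        self.counts = 0
        self.compiled = None

    def run(self, x):
        self.counts += 1
        if self.compiled is None and self.counts >= self.HOT_THRESHOLD:
            self.compiled = aot_compile(self.source)  # compile the hot path
        if self.compiled is not None:
            return self.compiled(x)  # fast path: compiled code
        # slow path: plain interpretation
        for op, n in self.source:
            if op == "add":
                x += n
            elif op == "mul":
                x *= n
        return x

prog = [("add", 2), ("mul", 3)]
aot = aot_compile(prog)   # compiled once, ahead of any execution
jit = HotLoopJIT(prog)    # compiles only after the loop runs hot
```

Both end up running compiled code; the difference is only *when* the
compilation happens and what triggers it.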

Please tell me if some of my background facts were incorrect (in particular
about OSL).

~~~
erichocean
I think this is correct. OSL is using LLVM in an AOT fashion, though a lot of
runtime code generation and specialization is being done just prior to running
the AOT LLVM JIT.

As opposed to something like, say, Google v8, which is using runtime feedback
to make hot code paths fast (and to remove dynamism when it can be shown to be
safe).

I guess I wasn't aware that people weren't including dynamic code generation
and AOT compilation in the "JIT" category. To me JIT meant generating machine
code "at runtime", and both OSL and v8 would be at opposite ends of the JIT
"at runtime" code generation/compilation spectrum -- OSL on the AOT side and
v8 on the keep-running-the-compiler side.

------
protomyth
quick link to Chris Lattner's response:
[http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-October/0437...](http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-October/043730.html)

~~~
icefox
On Oct 4, 2011, at 11:53 AM, Dan Gohman wrote:

> In this email, I argue that LLVM IR is a poor system for building a
> Platform, by which I mean any system where LLVM IR would be a format in
> which programs are stored or transmitted for subsequent use on multiple
> underlying architectures.

Hi Dan,

I agree with almost all of the points you make, but not your conclusion. Many
of the issues that you point out as problems are actually "features" that a VM
like Java doesn't provide. For example, Java doesn't have uninitialized
variables on the stack, and LLVM does. LLVM is capable of expressing the
implicit zero initialization of variables that is implicit in Java, it just
leaves the choice to the frontend.
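That frontend choice can be illustrated with a toy "frontend" sketch (the
emitted strings are simplified, LLVM-IR-flavored pseudocode, and the function
is invented for this illustration):

```python
# Toy sketch: the *frontend* decides whether a stack slot gets the
# implicit zero initialization (Java-style) or is left uninitialized
# (C-style). LLVM can express the zero store but does not impose it.

def emit_local_i32(name, frontend):
    lines = [f"%{name} = alloca i32"]  # stack slot; contents undefined
    if frontend == "java":
        # A Java-like frontend emits the zero store itself.
        lines.append(f"store i32 0, i32* %{name}")
    return lines

print("\n".join(emit_local_i32("x", "c")))
print("\n".join(emit_local_i32("x", "java")))
```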

Many of the other issues that you raise are true, but irrelevant when compared
to other VMs. For example, LLVM allows a frontend to produce code that is ABI
compatible with native C ABIs. It does this by requiring the frontend to know
a lot about the native C ABI. Java doesn't permit this at all, and so LLVM
having "this feature" seems like a feature over-and-above what high-level VMs
provide. Similarly, the "conditionally" supported features like large and
obscurely sized integers simply don't exist in these VMs.

The one key feature that LLVM doesn't have that Java does, and which cannot be
added to LLVM "through a small matter of implementation", is verifiable
safety. Java-style bytecode verification is not something that LLVM IR
permits; you can't really do it in LLVM (without resorting to techniques like
SFI).

With all that said, I do think that we have a real issue here. The real issue
is that we have people struggling to do things that are "hard" and see LLVM as
the problem. For example:

1. The native client folks trying to use LLVM IR as a portable representation
that abstracts arbitrary C calling conventions. This doesn't work because the
frontend has to know the C calling conventions of the target.

2. The OpenCL folks trying to turn LLVM into a portable abstraction language
by introducing endianness abstractions. This is hard because C is inherently a
non-portable language, and this is only scratching the surface of the issues.
To really fix this, OpenCL would have to be subset substantially, like the EFI
C dialect.

> LLVM isn't actually a virtual machine. It's widely acknowledged that the
> name "LLVM" is a historical artifact which doesn't reliably connote what
> LLVM actually grew to be. LLVM IR is a compiler IR.

It sounds like you're picking a very specific definition of what a VM is. LLVM
certainly isn't a high level virtual machine like Java, but that's exactly the
feature that makes it a practical target for C-family languages. It isn't
LLVM's fault that people want LLVM to magically solve all of C's portability
problems.

-Chris

~~~
przemoc
Why copy-paste? (Well, I think I know why...) Should we do the same with all
messages from the ML now? Just use a normal ML UI, like the gmane link I
provided earlier. If you need a direct one to Chris' mail, it would be:

[http://thread.gmane.org/gmane.comp.compilers.llvm.devel/4376...](http://thread.gmane.org/gmane.comp.compilers.llvm.devel/43769/focus=43780)

Actually I'm not sure why this particular response was pointed out, as it is
not a perfect rebuttal. It's almost always better to even just skim the whole
thread than to be picky about which mail to read carefully, as you'll always
miss something otherwise.

~~~
wtallis
I'd guess Chris Lattner's response was singled out because Chris is the lead
developer for LLVM, so even if it isn't the best response, it's probably still
the most important.

------
przemoc
The whole thread in a much more readable UI, without the problems of terrible
'90s ML web stuff.

[http://thread.gmane.org/gmane.comp.compilers.llvm.devel/4376...](http://thread.gmane.org/gmane.comp.compilers.llvm.devel/43769)

------
nknight
I was already getting a bad feeling about LLVM recently, and this thread kind
of cements that.

Everybody seems to have a different (sometimes radically different) idea of
what LLVM is for. That can't be good for making progress, and it's definitely
not good for guys like me wondering if LLVM is a sane choice for a project.

~~~
eliben
Don't take it too far. For the purposes it was designed to serve, LLVM is
great, best in its class. If your project is something that could benefit from
LLVM, then by all means use it. If it isn't, then don't. As simple as that.

Many corporations (most of all Apple) bet millions of $$$s in resources on
projects that depend on LLVM, so don't worry too much.

~~~
nknight
And what purpose was it designed to serve? That's exactly the problem I'm
alluding to -- the LLVM developers don't even agree amongst themselves, so how
am I supposed to know?

~~~
chc
It's a compiler-building framework and toolchain.

------
m0wfo
Just as a naive consumer of its facilities, I find LLVM produces binaries from
C and ObjC code which are smaller and faster than their GCC equivalents. I'm
attempting to write my own front-end as well, which is easier than I had
expected. I don't care what the black box does, cf. "The user doesn't
care"(TM).

~~~
ori_b
GCC is still marginally smaller and faster:
[http://www.phoronix.com/scan.php?page=article&item=gcc_4...](http://www.phoronix.com/scan.php?page=article&item=gcc_46_llvm29&num=2)

~~~
m0wfo
IMHO, YMMV, etc. Oh man it's so easy to wind up HN kids with anecdotes.

~~~
oopsdude
That link has 14 benchmarks across 5 platforms. What's your anecdote/evidence
threshold?

~~~
nspragmatic
The parent was referring to their own anecdote, not the link's benchmarks.

