
Jitted LLVM IR on the JVM [pdf] - ianopolous
http://llvm.org/devmtg/2016-01/slides/Sulong.pdf
======
crudbug
"Graal is a dynamic compiler written in Java that integrates with the HotSpot
JVM. It has a focus on high performance and extensibility. In addition, it
provides optimized performance for Truffle based languages running on the
JVM."

"Truffle is a framework for implementing languages as simple interpreters.
Together with the Graal compiler, Truffle interpreters are automatically just-
in-time compiled and programs running on top of them can reach performance of
normal Java."

~~~
haddr
This is something I'm looking forward to in Java 9. There is already a big
effort to bring the R language to the JVM (google for project FastR on
Bitbucket) and it might be just the beginning. It could finally bring
interoperability with other, more specific languages and let them take
advantage of the vast number of libraries available for Java and other JVM
languages.

Edit: link to FastR:
[https://bitbucket.org/allr/fastr](https://bitbucket.org/allr/fastr) It's open
source, and they encourage people to participate.

~~~
azinman2
What's happening in 9?

~~~
the8472
So far Graal has been an experimental/research fork of the HotSpot JVM. JDK 9
builds include JVMCI [1] as an experimental feature, which can be used to run
Graal without JVM modifications.

[1] [http://openjdk.java.net/jeps/243](http://openjdk.java.net/jeps/243)
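For anyone who wants to experiment, the JEP describes experimental flags
roughly like the following for substituting a JVMCI compiler such as Graal
(hypothetical invocation; `MyApp` and the availability of a Graal build on
the JVMCI path depend on your JDK):

```shell
# Unlock the experimental JVMCI interface and use a JVMCI compiler
# (e.g. Graal, if the build ships one) in place of C2.
java -XX:+UnlockExperimentalVMOptions \
     -XX:+EnableJVMCI \
     -XX:+UseJVMCICompiler \
     MyApp
```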

------
wcrichton
Does the memory created by the LLVM IR get managed by the JVM's garbage
collector, or is it manually managed à la C++? If the latter, I feel there's
an awesome opportunity here for creating a hybrid memory-managed/unmanaged
language like Terra ([http://terralang.org/](http://terralang.org/)) that
takes advantage of the massive Java ecosystem while letting people write low-
level code when they need performance and full control over memory.

~~~
ianopolous
My understanding is they can either allocate in the JVM heap, or use the
Graal foreign function interface to call native malloc, which is useful for
interoperating with native libraries outside the JVM.
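As a rough illustration of those two strategies (a hypothetical demo, not
Sulong's actual mechanism; it uses `sun.misc.Unsafe` for the native side,
whereas Sulong goes through Graal's interface):

```java
import java.lang.reflect.Field;

import sun.misc.Unsafe;

public class AllocDemo {
    public static void main(String[] args) throws Exception {
        // Managed: the GC owns this array; nothing to free.
        long[] managed = new long[4];
        managed[0] = 42;

        // Unmanaged: raw native memory, freed by hand, as in C.
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);
        long addr = unsafe.allocateMemory(4 * 8); // like malloc(4 * sizeof(long))
        unsafe.putLong(addr, 42);
        System.out.println(managed[0] == unsafe.getLong(addr));
        unsafe.freeMemory(addr); // like free(addr)
    }
}
```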

------
pron
GitHub repo:
[https://github.com/graalvm/sulong](https://github.com/graalvm/sulong)

------
pcwalton
I assume the type speculation stuff is for dealing with bitcasts in a strongly
typed environment like the JVM? (And, if so, presumably the bailout
reconstructs the C heap and performs the bitcast manually?)

Also, interesting use of PICs for function pointers. What are the advantages
of that approach over just using Java interfaces and letting HotSpot write the
ICs?

~~~
mike_hearn
Graal/Truffle do not have to obey the vast bulk of the Java type system.

Together they are fairly mind-bending projects, so it's important to
understand precisely how they work. I've been studying them for about six
months now, so hopefully this explanation isn't too garbled.

The typical code flow in a JVM looks like this:

1. Bytecode loading and verification. This is where the type system is
enforced.

2. Interpreting the bytecode.

3. The C1 compiler (very fast, low-quality output) compiles bytecode to
native code and inserts it into the code cache.

4. Very hot code gets recompiled using C2 (slower, high-quality output).
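A quick way to watch those tiers (a hypothetical demo, not from the slides):
run a hot loop with `-XX:+PrintCompilation` and the log typically shows the
method compiled by C1 first and later recompiled by C2.

```java
// Run as: java -XX:+PrintCompilation Hot
// The log typically shows Hot::square at tier 3 (C1), then tier 4 (C2).
public class Hot {
    static long square(long x) {
        return x * x;
    }

    public static void main(String[] args) {
        long sum = 0;
        // Hot enough to cross both compilation thresholds.
        for (long i = 0; i < 1_000_000; i++) {
            sum += square(i);
        }
        System.out.println(sum);
    }
}
```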

Graal inserts itself into and replaces step 4. Once bytecode verification is
done, the JVM itself imposes very little in the way of typing. You can
actually disable verification with a command line flag, and then bad bytecode
will segfault the JVM. The core VM sees code as being composed of methods
which are contained in classes, and it wants pointers to be distinct from
integers, but otherwise the JIT compilers can produce code that hardly
matches the Java worldview at all.

Graal is a Java JIT compiler that is, itself, written in Java. Thus it is a
module which is passed some sort of input, which can be verified bytecode but
can also be something else (e.g. textual source code, or LLVM bytecode), and
the result of that is Graal inserting compiled code and generated metadata
into the code cache. The core HotSpot engine then takes care of swapping it
into the program when it's safe to do so.

So whilst Graal can compile Java bytecode (and do cutting-edge optimisations
whilst it does), it isn't required to do so. Thus execution of LLVM bytecode,
as in this example, doesn't work by translation to Java bytecode; it just
bypasses the bytecode layers of the VM entirely.

And that's where Truffle comes in. Because Graal is written in Java, it's
quite easy to expose a clean OO API, and it does so. So you can write
programs that manipulate or generate Graal's IR (which is a graph-based IR)
and thus compile code to use the rest of the HotSpot runtime services.

However, a graph IR is a very low-level way of thinking about a program. So
there is this higher level, Truffle, which allows you to write an AST
interpreter in Java. It comes with what they call the Truffle domain-specific
language, which isn't a language at all but rather a set of annotations, and
you can annotate your interpreter to indicate how it should be optimised.

Then Graal takes the interpreter and the interpreter's input together and
does some incredibly aggressive optimisations on them, exploiting the fact
that the compiler knows what the input to the program is. And like magic the
interpreter gets loop-unrolled and the overhead is boiled away until you have
a JIT-compiled program.
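To make that concrete, here's a toy AST interpreter in plain Java. None of
these names are Truffle's (the real API has you extend its Node classes and
use annotations like @Specialization), but it's the general shape of the
interpreter you hand to Truffle:

```java
// Toy AST interpreter: each node knows how to execute itself.
interface Expr {
    long execute();
}

final class Lit implements Expr {
    final long value;
    Lit(long value) { this.value = value; }
    public long execute() { return value; }
}

final class Add implements Expr {
    final Expr left, right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    // Graal/Truffle would inline and constant-fold a stable tree like
    // this away; here it's just a virtual call per node.
    public long execute() { return left.execute() + right.execute(); }
}

public class ToyInterpreter {
    public static void main(String[] args) {
        // Represents the program "2 + (3 + 4)".
        Expr program = new Add(new Lit(2), new Add(new Lit(3), new Lit(4)));
        System.out.println(program.execute()); // prints 9
    }
}
```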

Thus, for the low, low price of an AST interpreter, you get a full-blown
optimising JIT compiler, a high-end garbage collector, a language interop
framework, etc., all 'for free'. It's really quite radical, and the trivial
effort of writing AST interpreters means that they now have Graal/Truffle
engines for Ruby, Python 3, R, JavaScript, and C in both raw/unsafe and
managed varieties; of course, treating LLVM bytecode as a "language" is the
next step.

So to answer your questions: no, type speculation is not required to work
around the Java type system. It's just that profile-guided speculative
optimisations can often improve all kinds of programs, and Graal is an
aggressively speculating compiler; speculation is just fundamental to its
design. And your last question doesn't really make sense, because Graal
integrates so tightly that it isn't so much _using_ HotSpot; rather, it
actually _becomes_ HotSpot. And then of course it can write its PICs however
it likes.
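For what it's worth, the idea behind a PIC for function pointers can be
sketched in plain Java (hypothetical one-entry cache; a real PIC is machine
code emitted by the JIT, usually with several entries):

```java
import java.util.function.LongUnaryOperator;

public class PicDemo {
    // The target most recently observed at this call site.
    static LongUnaryOperator cachedTarget;

    static long callSite(LongUnaryOperator target, long arg) {
        if (target == cachedTarget) {
            // Fast path: the cache check passed, so a real JIT could have
            // inlined the target's body right here.
            return cachedTarget.applyAsLong(arg);
        }
        // Miss: record the new target and take the generic indirect call.
        cachedTarget = target;
        return target.applyAsLong(arg);
    }

    public static void main(String[] args) {
        LongUnaryOperator inc = x -> x + 1;
        long sum = 0;
        for (int i = 0; i < 10; i++) {
            sum += callSite(inc, i); // first call misses, the rest hit
        }
        System.out.println(sum); // 1 + 2 + ... + 10 = 55
    }
}
```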

~~~
pron
I would emphasize that in general Graal isn't a bytecode compiler at all, but
a general purpose compiler (somewhat like an LLVM backend) that compiles IR to
machine code, and that Graal has two frontends: Java bytecode and Truffle.

Also, within HotSpot, Graal isn't necessarily inserted at stage 4; it could
also be used at stage 3 (or even stage 2, instead of interpreting). It's just
not recommended, because Graal is a slow compiler.

Finally, Graal doesn't need to run in HotSpot at all. Another option is to
run it on SubstrateVM, which takes your language's Truffle interpreter, Graal
itself, and additional Java code (we'll get to that), and AOT-compiles it all
to a native binary (which then serves as a JITting VM for your language). If
your language requires a GC, that additional code will contain a GC written
in Java. I believe Substrate includes a simple GC, but you can write and use
your own.

------
edko
I wonder if this has the potential to make Swift another alternative for JVM
development?

~~~
mike_hearn
It does, yes.

At the rate new Truffle languages and frameworks are being developed, it'll
eventually be possible to run almost every language on the JVM, albeit with
varying levels of interop.

