
Ruby+OMR JIT Compiler: What’s Next? - magaudet
https://developer.ibm.com/open/2017/03/01/ruby-omr-jit-compiler-whats-next/
======
rurban
> In Evan’s keynote, he proposed a really interesting and ambitious solution
> to the problem he called “lifting the core.” It involved shipping Ruby with
> LLVM intermediate representation of the CRuby functions to allow the LLVM
> JIT technology to look inside the CRuby functions and dramatically increase
> the optimization horizon. As far as I know, this hasn’t been attempted yet —
> although, if it has, I really want to see it!

Actually unladen_swallow and cperl are doing this. Compile the whole runtime
to a lib<rt>.bc (trivial make rule), load this bitcode file (a single call),
and add the expensive optimizations, esp. the inliner and IPO in llvm. Just
the type-checks need to be done in the jit. This inliner does a lot of good
magic esp. for the small RT functions. >10x faster.

The compiler is expensive though (i.e. very slow), and LLVM has changed its jit
API 3 times already: <=3.4 jit, then mcjit and >= 3.6 orcjit, all 3 of them
still having major quirks.

With mcjit you cannot selectively add jit code to a module, so every single
body needs to be a new module, and module != package/class/namespace. That
makes the jitcache a bit complicated. With the latest orcjit you get problems
finding the symbols in the bc. The C API doesn't support the name resolver at
all, and it has major quirks in C++ as well. LLVM's C API is really behind, but
at least they don't change it as often as the C++ API.

unladen_swallow has a huge JIT overhead library, which nobody really needs.

I've spent just 4 days on the cperl jit so far, and it doesn't even link yet.
I used the C API, not the better C++ API. No C++ for cperl. But it's very simple.
[https://github.com/perl11/cperl/blob/feature/gh220-llvmjit/j...](https://github.com/perl11/cperl/blob/feature/gh220-llvmjit/jit.c)

~~~
chrisseaton
Wow I had no idea anyone was shipping bitcode of their runtime for dynamic
compilation. Are there any blog posts, papers, etc about it? Does it work
well?

~~~
wrmsr
impala does.

~~~
rurban
The impala bitcode reader/resolver is here:
[https://github.com/cloudera/Impala/blob/cdh5-trunk/be/src/co...](https://github.com/cloudera/Impala/blob/cdh5-trunk/be/src/codegen/llvm-codegen.cc)

It also needs a special mcjit resolver:
[https://github.com/cloudera/Impala/blob/cdh5-trunk/be/src/co...](https://github.com/cloudera/Impala/blob/cdh5-trunk/be/src/codegen/mcjit-mem-mgr.h)

------
holydude
Ruby is my favorite language and I am very excited about the performance
improvements from projects like this.

~~~
preordained
Seconded and seconded. Nice to see Ruby continuing to grow/improve, and not
always necessarily with a rails-centric motivation or focus.

------
tiffanyh
I really wish someone would step forward and do for Ruby what Mike Pall did
for Lua with LuaJIT. He achieved phenomenal performance gains with Lua; it
essentially runs at C-equivalent performance.

And no, just because Ruby is highly dynamic - that's not an excuse for it
being slow.

To quote Mike:

" tl;dr: The reason why X is slow, is because X's implementation is slow,
unoptimized or untuned. Language design just influences how hard it is to make
up for it. There are no excuses. "

[1]
[https://www.reddit.com/r/programming/comments/19gv4c/why_pyt...](https://www.reddit.com/r/programming/comments/19gv4c/why_python_ruby_and_js_are_slow/c8nyejd)

~~~
chrisseaton
I believe that Lua is significantly simpler than Ruby. Not in the language,
but in the size and scope of the core library. To make a Ruby application fast you
also have to make a large number of core library routines fast. A real Ruby
program is not much more than a chain of core library calls.
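
To make that concrete, here is a small made-up snippet (not from any real application). Nearly every operation in it is a call into the core library, and in CRuby those methods are implemented in C, so a JIT that only sees the Ruby-level code sees a string of opaque calls:

```ruby
# Illustrative example only: the "program" has almost no logic of its own.
# Each step is a core library method: String#split, Array#map, String#length,
# Array#select, Integer#odd?, Array#sort, Array#uniq, Array#sum.
text = "the quick brown fox jumps over the lazy dog"

lengths = text.split            # break the sentence into words
              .map(&:length)    # word lengths
              .select(&:odd?)   # keep the odd ones
              .sort
              .uniq

total = lengths.sum
# lengths == [3, 5] and total == 8
```

Making this fast means making every method in that chain fast and, ideally, letting the compiler see through the boundaries between them.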

I'm working on techniques to optimise through the core library in Ruby, and
it's requiring novel research techniques that clearly weren't needed in
LuaJIT, as it was done without them, so there must be something extra in Ruby.

~~~
pjmlp
Does Ruby actually do more dynamic behaviours than what is possible in
Smalltalk, Lisp and Dylan?

Knowing Ruby only superficially I fail to see why the former three managed to
have AOT and JIT compilers, while Ruby still requires novel research
techniques as you say.

Thinking of Smalltalk, _become:_ is probably one of the best ways to kill
anything a JIT knows about an object.

~~~
chrisseaton
I think Ruby probably does use more dynamic behaviour in practice.

I use a Ruby library called psd.rb as an example. It's for handling Photoshop
files. It represents pixels as a hash (map) of r, g and b values, and it
parameterises the kind of filter in your image by making reflective method
calls with dynamically created strings instead of using a function object of
some kind.

So to compile that to tight machine code that looks like something a C
compiler would output, you are going to have to completely optimise away the
pixel hash and the reflective method call.
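
A rough sketch of the style I mean. This is not code taken from psd.rb; the module name, method names and blend formula are made up for illustration:

```ruby
# Hypothetical sketch of the idiom; names and formulas are illustrative,
# not psd.rb's actual API.
module Compose
  module_function

  # Each pixel is a Hash with :r, :g and :b channels in 0..255.
  def normal_blend(fg, _bg)
    fg.dup
  end

  def multiply_blend(fg, bg)
    { r: fg[:r] * bg[:r] / 255,
      g: fg[:g] * bg[:g] / 255,
      b: fg[:b] * bg[:b] / 255 }
  end

  # The blend mode arrives as a string, so the method name is built at
  # run time and dispatched reflectively with send.
  def blend(mode, fg, bg)
    send("#{mode}_blend", fg, bg)
  end
end

fg = { r: 200, g: 100, b: 50 }
bg = { r: 255, g: 128, b: 0 }
result = Compose.blend("multiply", fg, bg)
# result == { r: 200, g: 50, b: 0 }
```

To reduce the call to `blend` to three integer multiplications and divisions, a compiler has to see through the string interpolation, resolve the `send` to `multiply_blend`, and avoid allocating the hashes altogether.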

You could probably write that kind of code in the languages you mentioned, but
people usually don't. In Ruby, that's how a lot of the code is written,
starting with the standard library itself.

Maybe I could restate it by saying that optimising typical idiomatic Ruby code
is more difficult than optimising typical idiomatic Smalltalk, Lisp, Lua etc.

More here:
[http://chrisseaton.com/rubytruffle/pushing-pixels/](http://chrisseaton.com/rubytruffle/pushing-pixels/).
Look half-way down at 'Acid Test'. My implementation of Ruby can compile that
whole program to a constant.

~~~
pjmlp
Thanks for the overview, I was missing that.

I remember you mentioning that example in some Graal talks I've watched.

------
edelsohn
Repurposing a JIT built for a statically-typed language is limiting. This is
why PyPy, v8, LuaJIT and HHVM all have custom JITs written specifically for
their languages.

[https://pdfs.semanticscholar.org/d1fc/e50f5476088671adc3910d...](https://pdfs.semanticscholar.org/d1fc/e50f5476088671adc3910d333082df937920.pdf)

~~~
amaranth
I'm pretty sure that paper is talking about an earlier version of this same
work targeting Python instead of Ruby.

~~~
magaudet
While the 'Pitfalls' paper was implemented on top of the Testarossa compiler
technology (that underlies the OMR compiler technology), the OMR project has
spent a lot of engineering effort to make the technology more modular and
easier to interact with.

While it doesn't invalidate the reasoning behind the 'Pitfalls' paper, we
think that we're addressing the problem in a fairly different manner: Instead
of providing a fully featured JIT compiler that you need to wedge the language
into, we've instead chosen to create a more modular system that lets the JIT
compiler be more customized to the target language and VM.

------
crudbug
Eclipse OMR is interesting, it provides good platform abstraction with support
for threading, monitoring, etc.

Can you create new native languages using Eclipse OMR?

~~~
magaudet
When you say "native languages", do you mean a statically compiled language?

It's been done before with the technology underlying the OMR compiler
component, however there is some work that would need to be done to support
this.

In principle though, there's no reason it couldn't be done.

~~~
crudbug
Yes, I meant "statically typed", similar to Swift.

I was thinking of OMR providing the native platform semantics, which would
enable multiple language syntaxes on top.

------
lobo_tuerto
What does OMR stand for? Couldn't easily find a definition for it.

~~~
jwmittag
OMR is the Open Managed Runtime. Basically, IBM took their Java implementation
(the J9 JDK), or more precisely, J9's Java Virtual Machine implementation,
separated all the language-specific and VM-specific parts from the language-
agnostic and VM-agnostic parts, broke it up into independently re-usable
modules (memory manager, JIT compiler, garbage collector, profiler, debugger)
and released it as Open Source Software under the umbrella of the Eclipse
Foundation.

Ruby+OMR is the proof-of-concept project that aims to show that OMR is
_indeed_ language-agnostic and VM-agnostic and can be used as drop-in parts
not only for newly-designed VMs but also easily integrated into existing VMs.
The Java 8 version of IBM's J9 JDK is already built on top of OMR, but you
could consider that cheating, since that's the very VM OMR came out of;
Ruby+OMR shows that you can also do that with languages other than Java and
VMs that have a different design than J9.

I believe there is also a Python+OMR project which does similar things with
the CPython VM, but that project is not (yet) public, AFAIK.

------
thatmiddleway
Could this make eclim useful for ruby developers?

~~~
autoreleasepool
Eclim is already fairly useful for Ruby developers. I use it for the "stellar
omnifuncs" it provides YouCompleteMe as described in the official docs [0].

[0] [https://github.com/Valloric/YouCompleteMe#semantic-completio...](https://github.com/Valloric/YouCompleteMe#semantic-completion-for-other-languages)

------
iagooar
Ruby 3x3 is here already. It's called Elixir ;)

No, seriously, if your heart beats for Ruby, you will fall in love with
Elixir. Give it a try and you won't look back.

~~~
deedubaya
I really like Elixir... but to say it's Ruby 3x3 isn't quite fair. While the
syntax is Ruby-like, even the simplest Ruby program isn't directly portable to
Elixir. It's definitely something different, so saying things like this is
misleading.

It'd be closer to say: > Ruby 3x3 is here already. It's called Crystal ;)

Definitely check out Elixir and Crystal though if you need more performance.

