
Opal: Ruby to JavaScript Compiler - napolux
http://opalrb.org/
======
munificent
I'm going to use this as a little example of why compiling to JS is hard. I
work on the Dart team, and people often ask what the big deal about JS
compilation is. Opal's compiler is a good example of why it can be hard.

I want to stress, though, that I'm not singling Opal out here. I think Opal is
a really cool project, and I hope it works well for lots of people. It's just
a good example language since it's on HN right now.

Let's compile this Ruby code to JS with Opal:

    
    
        a = 0
        10000000.times do
          a = a + 1
        end
        puts a
    

Opal gives us:

    
    
        /* Generated by Opal 0.6.2 */
        (function($opal) {
          var $a, $b, TMP_1, self = $opal.top, $scope = $opal, nil = $opal.nil, $breaker = $opal.breaker, $slice = $opal.slice, a = nil;
    
          $opal.add_stubs(['$times', '$+', '$puts']);
          a = 0;
          ($a = ($b = (10000000)).$times, $a._p = (TMP_1 = function(){var self = TMP_1._s || this;
    
          return a = a['$+'](1)}, TMP_1._s = self, TMP_1), $a).call($b);
          return self.$puts(a);
        })(Opal);
    

We'll compare it to some vanilla JS:

    
    
        var a = 0;
        for (var i = 0; i < 10000000; i++) {
            a = a + 1;
        }
    

I don't care at all that the generated code is a little funny looking. That's
fine. Opal's code is actually pretty readable to me. What is a problem for
some (many?) users is the _performance_.

You'll note that Opal did _not_ compile "a + 1" to "a + 1". Instead it
generated "a['$+'](1)". That's because Ruby's arithmetic semantics are
different from JavaScript's. To implement those semantics correctly, it needs
to use a method call instead of using the built-in arithmetic.
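
To make the semantic gap concrete, here is a minimal sketch of one place where the two languages disagree. `rubyStylePlus` is a hypothetical stand-in for a Ruby-style numeric `+`, not Opal's actual runtime code; the point is only that the built-in JS operator cannot be trusted to behave the way Ruby's does.

```javascript
// Hypothetical stand-in for a Ruby-style numeric "+"; not Opal's runtime code.
function rubyStylePlus(self, other) {
  if (typeof other !== 'number') {
    // Ruby raises TypeError for 1 + "2"; JS would silently concatenate instead.
    throw new TypeError(typeof other + " can't be coerced into a number");
  }
  return self + other;
}

// Built-in JS operator: the number is coerced to a string and concatenated.
const jsResult = 1 + "2";

// Method-call version: rejects the string operand, as Ruby would.
let rubyError = null;
try {
  rubyStylePlus(1, "2");
} catch (e) {
  rubyError = e;
}

// With two numbers, both approaches agree.
const rubySum = rubyStylePlus(1, 2);
console.log(jsResult, rubySum, rubyError instanceof TypeError);
```

Routing every `+` through a call like this is what preserves the source language's behavior, and it is also where the cost comes from.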

We can profile the two using this fiddle:
[http://jsfiddle.net/3UtNf/1/](http://jsfiddle.net/3UtNf/1/)

On my laptop, the Opal code is 264 _times_ slower than the raw JS code. In
other words, it runs at roughly 0.4% of the speed of the JS code. Now imagine
sacrificing that much perf on a mobile device. That's enough to make the
language unsuitable for many real-world use cases.
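
If you want to try the comparison outside the fiddle, something like the following rough harness may help. It is only a sketch: the `'$+'` method installed on `Number.prototype` is a hypothetical imitation of the shape of the compiled output, it leaves out the block call that `times` makes on every iteration, and the timings will vary a lot by engine, so don't expect it to reproduce the number above.

```javascript
// Hypothetical imitation of a compiled Ruby "+": a method found by name on the receiver.
// This mimics the shape of the generated code only; it is not Opal's runtime.
Number.prototype['$+'] = function (other) { return this + other; };

const N = 1000000;

// Runs fn once and reports its result and elapsed wall-clock time in milliseconds.
function timeIt(fn) {
  const start = Date.now();
  const result = fn();
  return { result: result, ms: Date.now() - start };
}

// Addition through a named method call on every iteration.
const viaMethod = timeIt(function () {
  let a = 0;
  for (let i = 0; i < N; i++) a = a['$+'](1);
  return a;
});

// Addition through the built-in operator.
const viaOperator = timeIt(function () {
  let a = 0;
  for (let i = 0; i < N; i++) a = a + 1;
  return a;
});

console.log('method call:', viaMethod.ms, 'ms; operator:', viaOperator.ms, 'ms');
```

Both loops compute the same value; only the way the addition is dispatched differs.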

This isn't intractable, though. You just need to compile math down to real JS
arithmetic operators _when JS's semantics line up with your language's_
(which typically means when you're sure you've got numbers and not some
other type with a user-defined operator).

Determining where you can do that is the hard part. It requires type analysis.
Doing that well in a language that doesn't have a sound static type system
requires whole-program analysis. It's extremely complex, monolithic, and leads
to very strange output code.
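
As a rough illustration of the decision the compiler has to make at each `+`, here is a toy code generator. Everything in it is hypothetical: `emitAdd` and the `types` table are made up for this sketch, and producing that table reliably is exactly the hard whole-program analysis described above.

```javascript
// Hypothetical helper: pick the JS to emit for `lhs + rhs` from inferred types.
// `types` maps variable names to 'number' or 'unknown'; it is assumed to come
// from some earlier analysis pass that this sketch does not implement.
function emitAdd(lhs, rhs, types) {
  const isNumber = function (expr) {
    // Numeric literals are trivially numbers; variables need a proven type.
    return /^\d+(\.\d+)?$/.test(expr) || types[expr] === 'number';
  };
  if (isNumber(lhs) && isNumber(rhs)) {
    return lhs + ' + ' + rhs;            // safe to use the native operator
  }
  return lhs + "['$+'](" + rhs + ')';    // fall back to a dynamic method call
}

const fast = emitAdd('a', '1', { a: 'number' });
const slow = emitAdd('a', '1', { a: 'unknown' });
console.log(fast);
console.log(slow);
```

The emit step itself is trivial. All of the difficulty is in being able to say `a` is a number with confidence.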

This is why, for example, Dart's dart2js compiler is so complex and
heavyweight. It _does_ do this kind of analysis. It compiles this Dart
program:

    
    
        main() {
          var a = 0;
          for (var i = 0; i < 10000000; i++) {
            a = a + 1;
          }
          print(a);
        }
    

to this JS:

    
    
        function() {
          var a, i, line;
          for (a = 0, i = 0; i < 10000000; ++i)
            ++a;
          line = "" + a;
          H.printString(line);
        }
    

That has the same performance as the JS code. It can do this because it
knows "a" is a number. If we change the Dart code to:

    
    
        class Foo {
          operator +(other) => this;
        }
    
        main() {
          var a = new Foo();
          for (var i = 0; i < 10000000; i++) {
            a = a + 1;
          }
          print(a);
        }
    

Then the generated JS changes completely:

    
    
        function() {
          var i, line;
          for (i = 0; i < 10000000; ++i)
            ;
          line = H.S(new Q.Foo());
          H.printString(line);
        }
    

Interestingly, here you can see the compiler understood that the "+" operator
on Foo always returns the same object and was able to inline the call to it
and then hoist it out of the loop completely.

This is the kind of stuff you need to do if you want to have a language that
compiles to JS and (unlike, say, CoffeeScript and TypeScript) has semantics
that aren't very, very similar to JS.

~~~
azakai
Not necessarily. You can compile a Ruby VM from C to JS, as people have done.
That should give you somewhere around 2 times slower performance, or better,
not 264 times slower (and with very little effort).

Things get more complicated if your VM has a JIT, but there are interesting
results even there, see pypy.js.

This approach lets you have arbitrary semantics, even ones that differ hugely
from JavaScript, with decent performance.

I've suggested this in the past on HN - I think that approach could work for
Dart as well. Would be happy to help investigate it.

~~~
munificent
> You can compile a Ruby VM from C to JS, as people have done.

That sounds to me like it would just kill your startup perf. Users would have
to download an entire Ruby VM every time they hit your site, wouldn't they?

Maybe I'm just a luddite, but spending network resources downloading a garbage
collector written in JS only so that I can run it in JS... which natively
supports GC just seems really gross to me.

Don't get me wrong, I think Emscripten is very very cool. It just feels like a
strange fit for applications written in a language whose semantics aren't
_that_ far from JS.

~~~
azakai
I see your point, but don't think it is quite as bad as that. For one thing,
it would be cached etc., so it is about as bad as every site on the web using
jQuery (that is, not great, but not horrible either).

Yes, it seems ironic to download a GC written in JS, when JS can do GC. But JS
can only do SOME types of GC. For example, it lacks destructor callbacks,
which things like Lua require. Some other language might need weakrefs which
JS also lacks. So it is not quite that unreasonable to download a GC, as you
can get the right semantics you want.

But I do agree it gets less clear when the semantics are very close to JS. I'm
not sure if Ruby is close enough, though (CoffeeScript certainly is).

~~~
rapind
While convoluted, I believe this results in some pretty fast js. mruby -> llvm
-> emscripten -> asm.js

See: [http://vimeo.com/70673036](http://vimeo.com/70673036)

------
Yoric
The name is a bit awkward, given the existence of Opalang, which is another
$SomeLanguage to JavaScript compiler.

~~~
Argorak
Given that Opal.rb was started in 2010 and Opalang (back then called opages)
in 2010 as well, I would file that under "unfortunate".

[https://github.com/opal/opal/commits/master?page=107](https://github.com/opal/opal/commits/master?page=107)
[https://github.com/MLstate/opalang/commits/master?page=126](https://github.com/MLstate/opalang/commits/master?page=126)

~~~
Yoric
I don't remember when Opalang was started, but I joined the project in 2008 :)
Fwiw, back then the name was Opa. Opages was the name of the CMS written in
Opa/Opalang.

Funnily, Opa and Ur/Web were two projects started around the same time and
sharing very similar ideas, and they also collided linguistically – in German,
Opa == grand-dad, while Ur == ancestor.

Anyway, have fun with Opal.rb :)

~~~
Argorak
Thanks a lot for clearing that up (also, my post misses a "?" after opapages)!

------
shocks
What is a good use case for this? Interesting project otherwise! :)

~~~
lukasm
Say you have a function, written in Ruby, that validates a file. Now you want
to have client-side validation in the browser for good UX. The spec for
validation may be complicated, or you are not sure about the implementation
(no docs, the person who wrote it just left). This is an internal format with
no good parsing library. I'd try it out.

~~~
dragonwriter
Or you're completely sure about the implementation, but it may change in the
future (because requirements evolve) and you don't want to have to maintain
two different versions.

------
seanewest
How hard is it to use an existing javascript library with Opal?

------
berdario
Since the Opal developers are reading:

I'm curious: do you plan to support Encodings and proper Ruby strings, or is
it not worth the effort, so you'll keep using plain JavaScript strings?

(I tried to look for this detail on the website but came back empty-handed)

------
vinhboy
How do people know that one lang would compile down to another? Is there
something about the lang that makes this possible?

Also, is it possible that something won't translate from ruby --> js and cause
a bug?

~~~
peter-fogg
Ruby and JS are both Turing-complete languages (as with just about any other
programming language you'll ever use), which means that one can simulate the
other. The simulation might be messy, complicated, or slow, but it will still
work. In this case, Ruby and JS are close enough that it can be done without
too much trouble (see the compiled JS).

~~~
seanewest
But does "anything written in language A can be simulated using language B"
imply "A can compile down to B"?

~~~
afarrell
yes. If you can write a body of code X in language B which performs the same
operations as any given body of code Y in language A, then all an A->B
compiler needs to do is find X given Y.

~~~
seanewest
Ok, but is finding X given Y tractable?

To me, it seems like compiling is translation of source (usually higher level
source to lower level source), whereas simulation is more like translation of
behavior.

Edit: and the ability to translate behavior does not seem to imply the ability
to translate source

~~~
Solarsail
Couldn't you translate source given the ability to translate behaviour? If
some behaviour in language X can be simulated by a state machine in Y, then
you could just emit said state machine as the translation of the behaviour,
give its output to the next state machine, etc. And gradually build up the
behaviour of a program in X, but using a bunch of generated code in Y instead
of just an interpreter. In other terms, if an interpreter for / simulation of
X may be expressed in Y, then for each bytecode / AST node in X you could just
inline the bytecode handling you would use in Y into the compiled output.
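
A toy version of that last idea, as a sketch only: the three-instruction stack machine below is made up for illustration. The interpreter dispatches on each instruction at run time, while the compiler emits the same handler bodies as straight-line JS.

```javascript
// Hypothetical three-instruction stack machine, invented for this example.
const program = [['push', 2], ['push', 3], ['add'], ['push', 4], ['mul']];

// Interpreter: decides what to do for each instruction while running.
function interpret(code) {
  const stack = [];
  for (const [op, arg] of code) {
    switch (op) {
      case 'push': stack.push(arg); break;
      case 'add': stack.push(stack.pop() + stack.pop()); break;
      case 'mul': stack.push(stack.pop() * stack.pop()); break;
      default: throw new Error('unknown op ' + op);
    }
  }
  return stack.pop();
}

// Compiler: pastes the handler body for each instruction into the output,
// so the dispatch loop no longer exists in the generated code.
function compile(code) {
  const handlers = {
    push: function (arg) { return 'stack.push(' + JSON.stringify(arg) + ');'; },
    add: function () { return 'stack.push(stack.pop() + stack.pop());'; },
    mul: function () { return 'stack.push(stack.pop() * stack.pop());'; }
  };
  const body = code.map(function (instr) {
    const op = instr[0];
    if (!Object.prototype.hasOwnProperty.call(handlers, op)) {
      throw new Error('unknown op ' + op);
    }
    return handlers[op](instr[1]);
  });
  const source = 'const stack = [];\n' + body.join('\n') + '\nreturn stack.pop();';
  return { source: source, run: new Function(source) };
}

const compiled = compile(program);
console.log(compiled.source);
console.log(interpret(program), compiled.run());
```

The output is still a program in the target language rather than a recording of one run, which is why nothing needs to be enumerated per input.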

~~~
seanewest
I think that's a very clever algorithm.

But I think your first articulation amounts to converting a turing machine to
a huge generated state machine, which would be impossible. The behavior of the
X program given _one_ input could be modeled as a sequence of states if it
terminates. But the process of generation that you described would have to be
repeated for all possible inputs into the simulated X program, which would be
impossible.

Maybe a single "behavior" could be modeled by a state machine, but it would
also seem impossible to me to decompose a program into individual units of
behavior.

But as for bytecode -- one could always compile the X code into bytecode, and
then decompile that bytecode into Y code. If you can always find a common
bytecode language for two languages X and Y, then I think that would be a
generic algorithm for X->Y.

Edit: I guess that common bytecode could just be in a language that describes
a turing machine.

------
moger777
If performance is an issue, you can drop down back to JS using %x{} or ``.
Good idea to do this for mathematical operations.

------
adrianlmm
I pasted the Opal-generated JS code into the Chrome console, but I get that
"Opal" is not defined.

~~~
adrianlmm
Never mind, solved.

I see potential in this.

Thank you.

------
steven_yue
I'd rather learn JS when I need to. Auto generation is a waste of time in
this case.

------
ankurpatel
How would Opal handle meta programming like method_missing and define_method?

~~~
DouweM
Opal supports `method_missing`:
[http://opalrb.org/docs/method_missing/](http://opalrb.org/docs/method_missing/)

`define_method` as well:
[https://github.com/opal/opal/blob/f958d6b2468e57acf7f324e10a...](https://github.com/opal/opal/blob/f958d6b2468e57acf7f324e10a06b8a8fe475f55/opal/corelib/module.rb#L256)

~~~
ankurpatel
This is not a pure way of doing meta programming. It looks at the code to
figure out what methods will be called in the future.

Opal.add_stubs(["first", "second", "to_sym"]);

But let's say I am dynamically generating a method name from a string passed
in from user input or the server; then this would fail.

~~~
elia
I had the same fear at first, but actually if you dynamically generate a
method name you need to use `#__send__` (and friends) to call it.

Of course `#__send__` supports method_missing:
[https://github.com/opal/opal/blob/0-6-stable/opal/corelib/ba...](https://github.com/opal/opal/blob/0-6-stable/opal/corelib/basic_object.rb#L13-L31)
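
Roughly, the idea can be sketched like this. It is a simplified model under my own assumptions, not the real runtime: `BasicObject`, `addStubs`, and `Ghost` here are hypothetical stand-ins written for this example.

```javascript
// Hypothetical mini object model, invented for illustration; not Opal's runtime.
function BasicObject() {}

// Default method_missing raises, similar in spirit to Ruby's NoMethodError.
BasicObject.prototype.$method_missing = function (name) {
  throw new Error('undefined method `' + name + "'");
};

// For each method name seen in the compiled source, install a fallback that
// forwards to $method_missing. Real methods defined lower down shadow it.
function addStubs(names) {
  names.forEach(function (jsName) {
    BasicObject.prototype[jsName] = function (...args) {
      return this.$method_missing(jsName.slice(1), ...args);
    };
  });
}

// Dispatch by string: use the method if present, otherwise call
// $method_missing directly, so names with no stub still work.
BasicObject.prototype.$__send__ = function (name, ...args) {
  const fn = this['$' + name];
  return fn ? fn.apply(this, args) : this.$method_missing(name, ...args);
};

// A class that answers every unknown message.
function Ghost() {}
Ghost.prototype = Object.create(BasicObject.prototype);
Ghost.prototype.$method_missing = function (name, ...args) {
  return 'ghost:' + name + '(' + args.join(',') + ')';
};

addStubs(['$first']);
const ghost = new Ghost();
const viaStub = ghost.$first(1, 2);            // name appeared in source, stub exists
const viaSend = ghost.$__send__('made_up', 3); // name built at run time, no stub
console.log(viaStub, viaSend);
```

A literal call site always has a name the compiler has seen, so a stub can be installed for it. A name built from a string has to go through the send path, which performs the lookup itself.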

------
rikkus
Great, I love Ruby. But what do I do for debugging?

~~~
adambeynon
(Opal developer here).

When it comes to debugging, I just use the standard chrome dev tools with
source maps. Variables and properties on objects all compile using their ruby
names, so viewing local vars and ivars in the debugger is easy. I only ever
debug the generated JavaScript when I am debugging a bug with Opal's
runtime/internals/compiled code.

------
sheerun
So compile a bad language to worse :) (Ruby developer here)

I think it makes sense to transcompile type-checked languages to JS, though.
Elm is pretty neat.

------
aikah
afaik, it's not really a Ruby to JavaScript compiler (the way CoffeeScript
compiles down to JS).

You can't write xhr = XMLHttpRequest.new and expect it to compile to JS
without importing some libs.

~~~
dragonwriter
It is really a ruby to JS compiler.

Ruby, however, is not just a thin layer over JS the way CoffeeScript is. The
thing you are referring to is a sign of CoffeeScript being a thin layer over
JS, not CoffeeScript's compiler being a "real compiler".

~~~
aikah
My point was that it doesn't transpile Ruby to JavaScript.

~~~
dragonwriter
It does, though; it just doesn't expose things that are exposed in the
underlying JS environment to Ruby code without jumping through certain hoops.
Which makes sense, because unlike CS, which is tied to the underlying JS
environment, Ruby isn't, and exposing the underlying JS environment would make
it harder to port general-purpose Ruby code as you'd be more likely to run
into collisions with the JS environment that you wouldn't see on other Ruby
platforms.

~~~
aikah
My use case was an online IDE where, instead of using JavaScript to code
webapps, users would use Ruby to do the same. But since Opal doesn't expose
the window object, it's pretty useless for me. Maybe you have a solution for
that, but I just didn't find one.

~~~
dragonwriter
Exposing the window object is a different issue than being a transpiler and,
in any case, you can access the underlying JS environment in Opal, including
fairly simple access to things like the window object.
[http://opalrb.org/docs/interacting_with_javascript/](http://opalrb.org/docs/interacting_with_javascript/)

