
While WASM makes for a convenient compilation target for languages like C/C++/Rust/Zig, isn't JS a better target for languages that can benefit from a JIT and/or a GC? You'd get better performance for less effort and a smaller deliverable. Why compile to WASM when JS seems, at least to me, an obviously superior target? What great advantage does WASM have that it makes it worthwhile to lose on three important fronts?



Wait what?

WASM is a vm and JS is a language.

Are you asking about the pros and cons of a transpiler versus a VM solution?

For one thing BEAM is the Erlang VM which Elixir and many other languages run on. Why would anybody want to implement it in Javascript? BEAM does a lot of crazy things, including a preemptive scheduler. Also BEAM GC isn't some generic GC, there's tons of crazy work going on with it to work with the Actor concurrency model.


> WASM is a vm and JS is a language.

I'm talking about compiling to WASM vs. compiling to JS. Calling something a VM or a language is mostly a matter of convention. x86 machine code can also be viewed as a language, and compilers certainly don't care. A compiler translates code in one language into code in another (there is no precise definition for "transpile", which is why compiler people hate that word; colloquially "transpile" just means that the compilation's target language is considered to be at a similar abstraction level to the source).

> Why would anybody want to implement it in Javascript?

Why would anybody want to implement it in WASM? If it's so that it could run in the browser, then compiling BEAM bytecode to JS would achieve the same goal, be easier to write and would likely result in a smaller and faster deliverable.

> Also BEAM GC isn't some generic GC, there's tons of crazy work going on with it to work with the Actor concurrency model.

There's nothing crazy going on in BEAM. As far as VMs go, it's near the bottom in terms of sophistication.


Exactly true! C, for example, is a VM!


> Also BEAM GC isn't some generic GC, there's tons of crazy work going on with it to work with the Actor concurrency model.

BEAM GC isn't very crazy; it (like a lot of other parts of BEAM) is delightfully simple. Each process gets its own stack and heap, and because of immutability, references always point to older things, so BEAM GC is based on copying collectors: copy everything that's reachable from the current stack, adjusting references as you go; when you're done, throw away the old heap/stack. It's a smidge more complicated because BEAM uses two generations for collection, and has reference-counted binaries that are shared among all processes and require cleanup; but all of that is pretty straightforward too. Because of the language design, there's no need for an advanced GC like in other languages; but if you transpiled to other languages, their GC would work fine.
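To make the copying idea concrete, here's a toy semispace collector over a heap of cons cells. This is an illustration of the general technique, not BEAM's actual implementation; the heap representation is invented.

```javascript
// Toy semispace copy collector over a heap of cons cells. Each cell is
// { head, tail } where tail is an index into the heap array or null.
function collect(heap, roots) {
  const newHeap = [];
  const forwarded = new Map(); // old index -> new index

  function copy(ix) {
    if (ix === null) return null;
    if (forwarded.has(ix)) return forwarded.get(ix); // already moved
    const newIx = newHeap.length;
    newHeap.push({ ...heap[ix] });
    forwarded.set(ix, newIx);
    // Adjust the reference after copying; immutability guarantees that
    // references only point to older cells, so this always terminates.
    newHeap[newIx].tail = copy(heap[ix].tail);
    return newIx;
  }

  const newRoots = roots.map(copy);
  return { heap: newHeap, roots: newRoots }; // the old heap is simply dropped
}
```

Anything unreachable from the roots is never copied, so "collecting garbage" is just abandoning the old heap.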


Right, but can we stop saying "transpile"? :)


Sorry, language compression demands it.

If you want to push towards transcode, that's probably the most likely alternative. It's shorter too.


Or compile.


The browser's JIT engine is tuned for idiomatic javascript. It doesn't work quite as well when JS is used as a target language.

The GC aspect is a real issue though.


Well, it would be interesting to see some benchmarks, but it would need to work pretty terribly to lose to AOT compilation of an untyped bytecode to WASM, unless it's a really good compiler with some excellent type inference.


The longer explanation of what I was trying to say is that a JIT engine isn't a magic sauce that you can sprinkle on top of a language implementation to make it go very fast. As of today, making a good "cross language" JIT is still very much an open research question.

For example, suppose we want to make an implementation of Elixir that runs in the browser. The most straightforward way to do this in a semantics-preserving way is to implement a BEAM virtual machine in Javascript, and use it to run untyped BEAM bytecode. The problem is that if we do this, the Javascript JIT is only able to see the control flow and the variables of the BEAM interpreter loop itself, and it isn't able to peek into the "meta level" to reason about the control flow and the variables of the Elixir program. As a result, we should expect that the Javascript JIT will be no faster than an AOT-compiled bytecode interpreter. To produce good code in this case we would need a JIT engine that understands and expects the code pattern of an interpreter loop, such as the one used in the PyPy JIT for Python.
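A minimal sketch of why: from the JS engine's point of view there is one hot function with one switch, and the guest program is just data. The instruction set and encoding below are invented for illustration, not real BEAM opcodes.

```javascript
// Minimal bytecode interpreter sketch. The JIT profiles *this* loop, not the
// loop encoded inside `code`, so guest-level type feedback never reaches it.
function run(code, regs) {
  let pc = 0;
  while (true) {
    const [op, a, b, c] = code[pc];
    switch (op) {
      case "add": regs[a] = regs[b] + regs[c]; pc++; break;
      case "jlt": pc = regs[a] < regs[b] ? c : pc + 1; break;
      case "ret": return regs[a];
      default: throw new Error("bad opcode " + op);
    }
  }
}

// A guest program summing 0..limit-1, invisible as a loop to the JS JIT:
// r0 = acc, r1 = i, r2 = limit, r3 = 1
const program = [
  ["add", 0, 0, 1], // acc += i
  ["add", 1, 1, 3], // i += 1
  ["jlt", 1, 2, 0], // if i < limit goto 0
  ["ret", 0],
];
```

However hot `program`'s loop gets, the engine only ever sees the dispatch `switch` above.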

The other approach you could take would be to compile the Elixir code more directly into Javascript, so that Elixir functions become Javascript functions, Elixir variables become Javascript variables, Elixir loops become Javascript loops, and Elixir objects become Javascript objects. This way, the Javascript JIT has a much better chance of generating good code. But the problem you run into now is how to do this compilation while preserving the semantics of the original language. If the original language isn't similar to Javascript then performing such a translation can be quite challenging, and the end result might be some weird-looking Javascript that won't necessarily run very fast either.
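To make the contrast concrete, here is what a direct translation of a small Elixir function might look like. The calling convention, list representation, and names are all invented for illustration, not output from any actual compiler.

```javascript
// Hypothetical direct translation of:
//
//   # Elixir source
//   def len([]), do: 0
//   def len([_ | t]), do: 1 + len(t)
//
// Clause dispatch becomes conditionals, and Elixir lists become plain JS
// objects the JIT can specialize on.
function len(list) {
  if (list === null) return 0;                 // clause: len([])
  if (typeof list === "object" && "tail" in list)
    return 1 + len(list.tail);                 // clause: len([_ | t])
  throw new Error("FunctionClauseError");      // no clause matched
}

// Elixir's [1, 2, 3] represented as cons cells:
const l = { head: 1, tail: { head: 2, tail: { head: 3, tail: null } } };
```

Note that even this trivial case already loses Elixir's guaranteed tail calls: a sufficiently deep list would overflow the JS call stack.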


You don't need for it to be good, just to beat AOT compilation to WASM, which would likely not be very efficient. I don't see why JS is not similar enough to BEAM bytecode to achieve that.


If the solution is based around BEAM bytecode then WASM should beat Javascript. A WASM version of the bytecode interpreter is going to be at least as fast as a JS version of the bytecode interpreter, and the AOT version will have faster startup than the JIT version too.

The caveat is that this reasoning is only thinking in terms of raw speed of executing the bytecodes. Both the WASM and JS versions will still run the elixir bytecodes in "dynamically typed speed", as the JS JIT is not able to optimize at the "elixir level". Things get a bit more complicated if you add the garbage collector to the mix though because the WASM version can't reuse the highly-efficient Javascript GC.


Right, but I'm not talking about an interpreter but about compiling BEAM bytecode to JS.


Any proof of this? Are you talking about specific JIT engines, or all of them?


In my mind this is a multifaceted issue.

One major factor is the special semantics of execution in Erlang VM compared to most other languages. Of note are:

* Guaranteed tail calls. There is no way of doing a loop in Erlang other than recursing, so this point is really important. Although I can certainly imagine making this work with a JS target, we would have to perform certain code transformations that would most likely make the JS JIT a lot less effective.

* Preemption of Processes. Processes are the name of the lightweight actors used in the Erlang execution model. Each process is only allowed to run for a certain timeslice before execution has to be preempted. Although, again, I could imagine implementing this in a similar way as I would implement guaranteed tail calls, this would fully break up the control flow of a function, further disadvantaging the JS JIT.

* The semantics of the language term system. While this is probably one of the smallest issues for a JS target, it would still require us to implement and frequently call into relatively complex term operations that are implemented on top of the JS object model. I can't imagine this interacting well with the above two points in terms of performance.

* Concurrency/message passing model. One of our goals is exploring writing more fully featured applications in Erlang and then running them in the browser. As such, having the same concurrency model as the BEAM is critical. Using web workers and shared memory through SharedArrayBuffer (re-enabled in Chrome; let's hope Firefox gets somewhere with this soon too), we believe it is possible to implement things in a way that behaves very similarly to the BEAM.
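The two transformation-heavy points above, guaranteed tail calls and preemption, are often handled together with a trampoline driven by a reduction budget. The sketch below is one possible shape for that (all names invented; not necessarily this project's approach):

```javascript
// Trampoline sketch: guaranteed tail calls plus BEAM-style preemption via a
// reduction budget. A "process" is a thunk returning either
// { done: true, value } or { next: <thunk> }.
const REDUCTIONS = 2000; // budget per timeslice, like BEAM's reduction count

function runProcess(step) {
  return new Promise((resolve) => {
    function slice() {
      let budget = REDUCTIONS;
      let s = step;
      while (budget-- > 0) {
        const r = s();
        if (r.done) return resolve(r.value);
        s = r.next;
      }
      step = s;             // budget exhausted: remember where we stopped
      setTimeout(slice, 0); // yield to other "processes" on the event loop
    }
    slice();
  });
}

// A tail-recursive loop written against the trampoline: sums 1..n.
// Each "call" is a returned thunk, so the JS stack never grows.
function countdown(n, acc) {
  return () => (n === 0 ? { done: true, value: acc }
                        : { next: countdown(n - 1, acc + n) });
}
```

As the parent comment says, this fully breaks up the function's control flow: every tail call becomes an allocation and an indirect call, which is exactly the pattern a JS JIT handles worse than straight-line code.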

We want to invest heavily in the compiler in order to generate the best possible WASM code, which, given we play our cards right in codegen, should be further optimizable by the WASM engine. This involves inferring types and generating specialized code. Since most of Erlang's BIFs (built-in functions) accept a much narrower set of types than what a lot of other dynamic languages do, we can draw relatively good type information from them.
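As a toy illustration of drawing type facts from BIF signatures (the table, representation, and names here are invented): if a variable flows into `erlang:length/1`, it must be a list, and downstream code can skip runtime type checks for it.

```javascript
// Invented signature table: argument types each BIF accepts.
const BIF_ARG_TYPES = {
  "erlang:length/1": ["list"],
  "erlang:abs/1": ["number"],
};

// Infer variable types from the BIF calls they appear in.
function inferArgTypes(calls) {
  const types = new Map();
  for (const { bif, args } of calls) {
    const sig = BIF_ARG_TYPES[bif];
    if (!sig) continue;
    args.forEach((v, i) => types.set(v, sig[i]));
  }
  return types;
}
```

A real compiler would propagate these facts through the dataflow graph; this just shows where the seed information comes from.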

This above is all an answer specific to the WASM/JS question, but doesn't even touch on the further goals of the project. While WASM is our primary focus at the moment, we also support native targets.

As for compilation to JS, ElixirScript (https://github.com/elixirscript/elixirscript) is prior art here. I am not involved in this project, so I can't really answer to what design decisions they made, but it is interesting to compare with regardless.

Again, time will obviously tell how well all of this will work out, but from what we have seen thus far, our approach is fairly solid.


With the possible exception of tail calls (which would require a bit of work), I don't see how any of those issues won't be better served by JS than by WASM, even if you did work against the JIT, and for less effort and with more features. It's just really hard to beat a decent JIT with AOT compilation of an untyped language -- and V8's JIT is more than just decent -- even when the JIT is having a bad day; not without some very powerful whole-program analysis, and that costs a lot of time and money. Besides, V8 is especially well optimized for "broken", or "preempted", control flow, as that's just how JS is written, with or without async/await.

Native targets are another matter, of course.

Anyway, it would be interesting to see results. Beating BEAM's performance is not hard.

Thank you for your answer.


> Beating BEAM's performance is not hard.

Interesting observation. The hype I've heard around Elixir and Phoenix makes it sound like BEAM's performance is amazing. But maybe that's only in comparison to Ruby (specifically the C Ruby interpreter).


BEAM's performance is amazing if you are interested in context-switching between micro-processes.


As long as you're not actually doing anything in those processes ;)


My feeling exactly !

The JIT enhancements and new language features that shipped in various JS engines in the last 3 years are commendable.



