

Simple Language Implementation Techniques for the 21st Century - smarr
http://stefan-marr.de/papers/ieee-soft-marr-et-al-are-we-there-yet/

======
andrewchambers
Sometimes I wonder if rpython is a good meta language for this tracing JIT
technique. I don't know whether there would have been any benefit if the meta
language had been designed specifically for implementing dynamic interpreters
with the meta-JIT magic.

Adding explicit static typing to the meta language (i.e. replacing rpython)
could, at the very least, eliminate the long type-analysis and translation
times while providing better error messages.

~~~
smarr
RPython is certainly not perfect, and the PyPy/RPython community is also very
much aware of RPython being something that developed over time. But, RPython
was from the start 'designed' as a language for implementing dynamic
languages. The problem is that it took multiple iterations to identify meta-
tracing as a foundation that works. And those iterations left some cruft here
and there.

Compared to the Java-based experiments I did, sometimes I miss the ability to
be explicit about types, and have to use RPython's assertions instead, which
is a clumsy way of telling RPython about types. But overall, RPython isn't the
worst toolchain I have ever worked with. And, typically, if the error messages
are really bad, the community is very helpful and eager to improve them.
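The assert-based typing mentioned above can be sketched in plain Python. This is an illustration of the pattern, not actual PyPy code: the class names `W_Root`/`W_Integer` are hypothetical stand-ins in the style of wrapped-object models, and under CPython the asserts are just ordinary runtime checks, whereas RPython's annotator uses them to narrow types statically.

```python
# Sketch of the assert-as-type-annotation pattern used in RPython code.
# W_Root / W_Integer are hypothetical wrapped-object classes; under plain
# Python the asserts simply run at call time.

class W_Root(object):
    """Base class for all wrapped interpreter-level objects."""
    pass

class W_Integer(W_Root):
    def __init__(self, value):
        self.value = value

def add_ints(w_a, w_b):
    # In RPython, these asserts tell the type annotator that w_a and w_b
    # must be W_Integer here, so accessing .value type-checks; without
    # them the annotator only knows the arguments as W_Root.
    assert isinstance(w_a, W_Integer)
    assert isinstance(w_b, W_Integer)
    return W_Integer(w_a.value + w_b.value)

result = add_ints(W_Integer(2), W_Integer(3))
print(result.value)  # 5
```

In a statically typed meta language this would just be a type annotation on the parameters; in RPython it has to be smuggled in through control flow, which is the clumsiness being described.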

------
acqq
My take from the data presented in the article is entirely different: both of
the "acceptably fast" techniques use a JIT to make the executed code fast, and
the author avoids considering what the JIT actually does. A JIT means that, in
the end, the JIT engine produces _native code_, the same code your real
compiler would produce. All the processing before that phase exists only to
let the clumsy, "too dynamic" languages balance between "do at least
something" and "do something fast enough" (the latter happens only after the
tracing machinery detects that something is repeated often enough, and the
code that actually matters has to run as often as it can).
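The "detect something repeated often enough" step is usually implemented as a per-loop-header counter. This toy sketch illustrates the general idea only; the threshold, the counter placement, and the "compilation" are all made up for illustration and do not correspond to any particular VM:

```python
# Toy sketch of hot-loop detection in a tracing JIT: count how often each
# backward-jump target is reached; past a threshold, a real VM would start
# recording a trace and compile it to native code. Here "compiling" just
# records the loop in a dict.

HOT_THRESHOLD = 2   # real VMs use much larger, tuned thresholds
counters = {}       # loop header pc -> times reached
compiled = {}       # loop header pc -> "compiled" artifact

def on_loop_header(pc, loop_body):
    counters[pc] = counters.get(pc, 0) + 1
    if counters[pc] >= HOT_THRESHOLD and pc not in compiled:
        # A real tracing JIT would record the executed operations and emit
        # native code here; we only mark the loop as compiled.
        compiled[pc] = loop_body

# Simulate an interpreter hitting the same loop header repeatedly.
for _ in range(5):
    on_loop_header(pc=10, loop_body="loop-at-10")

print(compiled)      # {10: 'loop-at-10'}
print(counters[10])  # 5
```

Everything before the threshold is pure interpretation overhead, which is the cost being argued about here.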

But if you write code that is less "dynamic" from the start, you can get
quality code without all the engines in between and all the overheads they
incur.

"asmjs" achieves its speed because it is a protocol that allows the JIT to
simply compile a whole big chunk of the program to native code at once,
without any tracing. And to have such code in the first place, you need a
less dynamic language, like C, as the source.

On the other hand, programmers don't like having to be "too specific," but
recently more of the languages gaining popularity demonstrate that types can
often be inferred at compile time, and that such code can be convenient
enough for a lot of developers yet fast too.

So, no, I don't think the future is in PyPy. It exists only because the
Python language "specification" (and the reference implementation) is too
clumsy to allow cleaner type inference. In my opinion, the future is in type
inference and real compilation, not in clumsy, high-overhead virtual
machines. The developments that actually move in the direction I like are
asmjs, Swift, and Rust. Disclaimer: I know I'm biased in preferring, whenever
possible, compiled code to interpreted or JIT-ed code. YMMV.

~~~
smarr
The one issue you seem to treat as irrelevant is the effort it takes to
implement a 'fast enough' language. With RPython, as well as with
Truffle+Graal, you can implement a language in less than 10k lines of code
and get performance within reach of state-of-the-art VMs.
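For a sense of why the effort is so small: the core of an RPython interpreter is roughly the shape below. This is a sketch based on RPython's public `JitDriver` API (as used in the well-known RPython interpreter tutorials); the bytecode format and the `ImportError` fallback stub, which lets the file run under plain CPython where the hints are no-ops, are my own illustration.

```python
# Minimal interpreter skeleton with RPython meta-tracing hints.
# Under the RPython toolchain, JitDriver marks the interpreter's dispatch
# loop so that a tracing JIT is generated automatically; under plain
# CPython (no rpython package installed) we stub it out with no-ops.
try:
    from rpython.rlib.jit import JitDriver
except ImportError:
    class JitDriver(object):
        def __init__(self, greens=None, reds=None):
            pass
        def jit_merge_point(self, **kwargs):
            pass

# greens: values identifying a position in the program (constant per trace);
# reds: the mutable interpreter state.
driver = JitDriver(greens=['pc', 'program'], reds=['acc'])

def interpret(program):
    """program: list of ('add', n) and ('jump_if_lt', limit, target) ops."""
    pc = 0
    acc = 0
    while pc < len(program):
        driver.jit_merge_point(pc=pc, program=program, acc=acc)
        op = program[pc]
        if op[0] == 'add':
            acc += op[1]
            pc += 1
        elif op[0] == 'jump_if_lt':
            pc = op[2] if acc < op[1] else pc + 1
        else:
            raise ValueError('unknown op')
    return acc

# A small counting loop: keep adding 1 until acc reaches 10.
print(interpret([('add', 1), ('jump_if_lt', 10, 0)]))  # 10
```

The language implementer only writes the plain interpreter plus a few hints like this; the tracing, optimization, and native-code generation come from the toolchain, which is where the 10k-LOC figure comes from.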

Sure, if you prefer more static languages, you can probably move the point of
optimization from runtime to compile time. However, I would like to see
someone achieve the same degree of performance by using LLVM for a small
language, also in the range of 5k-10k LOC. I am not aware of any similar
experiments in that field. Rust, Go, and Julia all seem to be larger and more
complex languages, so a direct comparison isn't really fair. I would be
interested to hear if there is something out there I have missed so far.

~~~
ihnorton
This paper was an interesting read! And thank you for the exceptionally well-
selected references: I think I will enjoy reading several of those.

I'm not sure about the others, but as far as Julia goes, the main parts of
the language implementation are: parser and lowering in about 5,000 lines of
Scheme; type inference in about 3,000 lines of Julia; and codegen to LLVM in
about 6,000 SLOC of C++ (and then there are a few tens of thousands of LOC of
runtime and library code in C and Julia). I suspect it would be possible to
implement a nice, smallish, LLVM-backed DSL using the OCaml bindings in well
under 10k lines, but I am likewise unaware of such an experiment. On the
other hand, implementing Julia in Truffle or RPython would be a neat project.

~~~
acqq
As far as I understand, Julia is the language that will certainly be a worthy
replacement for Python once it gains enough library functionality (I admit I
haven't actually followed how far it has come; I'd appreciate it if somebody
could describe the current state). It's really nice that it was designed from
the start to be fast.

~~~
ihnorton
It really depends what libraries you need. You might be interested in:
[http://pkg.julialang.org/pulse.html](http://pkg.julialang.org/pulse.html)
(also a searchable package list).

------
jiyinyiyong
Inspiring even for a novice like me.

