
Why Is No One Writing Language Runtimes? - nkurz
https://acorwin.com/2016/05/07/why-is-no-one-writing-language-runtimes/
======
pcwalton
A few points:

1\. There is no such thing as a universal runtime. All runtimes are coupled to
some language's semantics. The idea of a universal runtime has been tried
several times and has always failed, or in the best case been limited to
languages that are basically skins on top of the same underlying semantics
(e.g. C# and VB.NET).

2\. The IRs in the Go compiler are _especially_ strongly coupled to Go
semantics. Your language would have to replicate many idiosyncrasies of Go to
run on its runtime. For example, you would be unable to benefit from
performant lexically scoped cleanup the way e.g. Java (or basically any other
language) can, because the Go language "defer" construct has extremely dynamic
semantics and there is no support in the compiler for C++-style unwind tables
(at least, I don't see why there would be, given that nothing in Go needs
them).

3\. Go is not a good candidate for this universal runtime in 2016 due to its lack
of a mature optimization pipeline. Maybe in a few years, but in the meantime
you'd be better off using HotSpot or .NET for this purpose.
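
On point 2, a rough way to see what "extremely dynamic" means: Go runs deferred calls when the enclosing function returns, in last-in-first-out order, and a defer inside a loop registers one call per iteration, so the number of cleanups is not fixed by the source text. The sketch below is only an analogy in Python, not Go and not anything taken from the Go compiler: `with` stands in for lexically scoped cleanup, and `contextlib.ExitStack` stands in for defer-style registration. The names `resource`, `lexical` and `defer_like` are made up for illustration.

```python
from contextlib import ExitStack, contextmanager


@contextmanager
def resource(name, log):
    """A stand-in resource that records when it is opened and closed."""
    log.append(f"open {name}")
    try:
        yield name
    finally:
        log.append(f"close {name}")


def lexical(log):
    # Lexically scoped cleanup: exactly one cleanup, and the place where it
    # runs (the end of the block) is visible in the source text.
    with resource("a", log):
        log.append("use a")
    log.append("after block")  # "a" is already closed here


def defer_like(names, log):
    # Defer-style cleanup: the number of registered cleanups depends on
    # runtime data, and they all run at exit in last-in-first-out order.
    with ExitStack() as deferred:
        for name in names:
            log.append(f"open {name}")
            deferred.callback(log.append, f"close {name}")
        log.append("body done")


lexical_log = []
lexical(lexical_log)
print(lexical_log)

deferred_log = []
defer_like(["x", "y"], deferred_log)
print(deferred_log)
```

In the second function nothing in the source says how many cleanups will run, which is roughly the property that makes a static, table-driven cleanup scheme harder to apply.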

~~~
voidlogic
>3\. Go is not a good candidate for this universal runtime in 2016 due its
lack of a mature optimization pipeline. Maybe in a few years, but in the
meantime you'd be better off using HotSpot or .NET for this purpose.

Go 1.7 on amd64 is looking to be pretty competitive with (and in some cases
better than) HotSpot or .NET in micro-benchmarks. Most of this is due to the new
SSA compiler back-end. "Years" might be a pessimistic appraisal.

~~~
pcwalton
Given how long it took LLVM to become competitive with GCC (and LLVM has been
SSA _from the start_ ), even with a world-class team of optimization experts
behind it, I think history has shown that it takes a long time to compete with
mature optimizing compiler pipelines. There's no substitute for the long
engineering slog of tuning a generational GC and implementing algebraic
simplifications, alias analysis, devirtualization, SROA, SCCP, instruction
selection, instruction scheduling, autovectorization, LICM at multiple IR
levels, etc. No silver bullet substitute for the hard engineering work exists
(other than using an off-the-shelf backend, of course).

Microbenchmarks are one thing; breadth and depth are another. It doesn't take
much to compete with GCC in a loop summing integers (unless GCC vectorizes
that loop or precomputes the value, of course!) but competing on SPECINT is
quite another.

------
tiles
The JVM is making huge strides toward becoming a generic, static and dynamic
language runtime with the GraalVM prototype:
[http://www.oracle.com/technetwork/oracle-labs/program-langua...](http://www.oracle.com/technetwork/oracle-labs/program-languages/overview/index.html)

------
Russtopia
A related point which has always bugged me... the html script tag has a type
attribute. The only valid value, to this day, is "text/javascript".

Why oh why has no one written a runtime for the major browsers for other
languages, say "text/python", "text/golang", ... anything other than JavaScript?

Once an alternative engine were in the major browsers they would hopefully
take off as the world's hackers enhanced them.. and we could finally be freed
from the shackles of a language designed in a weekend and bandaid-ed ever
since.

~~~
petercooper
People have. _type="text/vbscript"_ probably got the most use, but it was
only ever in IE. The W3C was gung-ho on Tcl at one point and text/tcl was used
there; notably, the HTML 4.01 spec gives an example of setting text/tcl as
the _default_ with _<META http-equiv="Content-Script-Type"
content="text/tcl">_

The problem is getting multiple browsers to agree, and no real standard has
emerged there. Even if one did, you're either relying on browser creators to
include a runtime, or you have to patch one in, Flash-style. I think Google
was trying to do this with Dart but failed.

A modern workaround I've seen a few times is to write/compile an interpreter
in JavaScript that looks for script tags with different types and runs them,
but that leads to yet more JS bloat when loading pages.
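
The shape of that workaround, sketched in Python rather than the JavaScript it would really be written in, so this only illustrates the "find script tags by type and dispatch" idea and is not something a browser would run. The `ScriptCollector` class, the `HANDLERS` table and the sample page are all hypothetical.

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect (type, source) pairs for every <script> element in a page."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._type = None   # type of the <script> currently open, if any
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            # A missing type attribute is treated as JavaScript here.
            self._type = dict(attrs).get("type") or "text/javascript"
            self._chunks = []

    def handle_data(self, data):
        if self._type is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._type is not None:
            self.scripts.append((self._type, "".join(self._chunks)))
            self._type = None


def run_python(source):
    """Run the script body in a fresh namespace and return that namespace."""
    namespace = {}
    exec(source, namespace)
    return namespace


# One handler per non-JavaScript script type; only Python is wired up here.
HANDLERS = {"text/python": run_python}

PAGE = (
    '<script type="text/javascript">var ignored = 1;</script>'
    '<script type="text/python">answer = 6 * 7</script>'
)

collector = ScriptCollector()
collector.feed(PAGE)
collector.close()

results = [
    HANDLERS[kind](source)
    for kind, source in collector.scripts
    if kind in HANDLERS
]
print(results[0]["answer"])  # 42
```

The bloat mentioned above comes from the fact that, in the real version, the equivalent of `run_python` is an entire language implementation shipped to the page as JavaScript.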

~~~
derefr
> Even if one did, you're either relying on browser creators to include a
> runtime, or you have to patch one in, Flash-style.

This would actually be the perfect use-case for Google's Native Client, if
other browsers would bother to support it.

The arguments against webapp authors writing their app in NaCl are probably
valid. But those arguments don't apply to the case where the "apps" are
runtimes, there are only a few common ones floating around, and people pull
them in by URL in order to script against them.

Interestingly, this would solve one additional pain-point: every script would
have the particular _ABI version_ of the runtime it was coded against "locked"
into it, in the same sense as a Ruby/Node dependency-lock file. So the
runtimes themselves could be in very heavy flux version-to-version, as long as
they kept old ABI-lines around and patched with security-updates.

------
thristian
> _Rust programs, however, won’t leak memory because the compiler makes sure
> you didn’t write code that could leak memory, and then writes down the parts
> about freeing memory into the final executable;_

A nitpick: Rust doesn't guarantee you won't leak memory, although it makes it
much less likely than in C. Leaking memory is actually officially Safe (in the
sense of Rust's safe/unsafe division) since it can't cause a program to access
not-yet-allocated memory or already-freed memory.

~~~
vvanders
Yup, there's nothing keeping you from having a few Rcs around that create a
cycle.

That said the language does guide you away from things like that much more
than C/C++.

------
panic
It's worth noting that Python uses bytecode just like Java does. Even the
original Ruby implementation created an abstract syntax tree instead of
interpreting the text directly.
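
This is easy to check in CPython, where the standard `dis` module exposes the compiled form. A minimal sketch, assuming CPython; the `add` function is just a placeholder, and the instruction names are printed rather than assumed because they differ between versions.

```python
import dis


def add(a, b):
    return a + b


# The compiled form lives on the function object as a bytes string.
raw_bytecode = add.__code__.co_code

# Instruction names vary between CPython versions, so none are hard-coded.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]

print(type(raw_bytecode).__name__, len(raw_bytecode))
print(opnames)

# Source text can also be compiled to a code object first and run later.
code_object = compile("6 * 7", "<example>", "eval")
value = eval(code_object)
print(value)  # 42
```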

To answer the question in the title, maybe the reason nobody's writing
runtimes is that the JVM is just too good? What impact does all the "baggage"
have in practice?

~~~
gwright
Just a quick clarification regarding Ruby. All the major Ruby implementations
(MRI, JRuby, Rubinius) use virtual machines in their current incarnations.

~~~
panic
Thanks, I clearly haven't used Ruby for a while. I edited the post to be
clearer.

~~~
steveklabnik
This was one of the largest differences between 1.8 and 1.9, to give context
for when this change happened in MRI. During the split, the original
interpreter was "MRI" and the bytecode VM was "YARV", but once 1.9 shipped,
YARV became MRI, even though it was basically rewritten. Yay naming!

------
mangeletti
I think what we're really talking about is, "why aren't there more lower level
abstraction layers that are simple and 'open minded'?", and I like the idea of
such a thing.

The problem is that there isn't really any immediate gain for the organization
doing this. The benefit is for the community as a whole. So, we end up in a
Tragedy of the Commons-like scenario with disparate pieces that disregard each
other.

Going down a few turtles, my understanding of RISC is such that it does
something similar (providing a much simpler set of possibilities, leaving more
to the next level above), but at the machine level. What would 1 or 2 steps up
from RISC look like, if it were built with some of the same principles?

------
rurban
> There is a project called Parrot VM that was an attempt to do just this,
> provide a generic IR format for executing dynamic languages, but it never
> took off, probably because it was originally tied to the tragic development
> of Perl 6, and thus didn’t gain traction outside of that community.

The parrot story was a bit different. The tragic element was new parrot
maintainership who destroyed it, and not perl6. perl6 was always fine and
never tragic. And perl6 always kept the lead on board, in contrast to perl5
and parrot, where they left.

To the point: A lot of people are writing language runtimes. And some of them
even target multiple languages. But no one really cares about pluggable
bytecode, as parrot did. rpython is the only comparable one, standing against the
jvm, clr and llvm. Most people care about an efficient compiler and run-time
for their language. And that's what is being done. Even more so in the last
decade. Esp. with llvm getting more and more usable for those.

Nowadays Lua influences a lot of the smaller runtimes, by the way. It fills the
spot that Scheme or your tiny Lisp used to occupy in previous decades.

------
BWStearns
Can someone wiser than myself explain why there seems to be not a lot of
support for languages targeting multiple runtimes? I see Clojure has a CLR
version that seems to have died on the vine (maybe it didn't and I just
haven't investigated enough) but I can't recall seeing more examples. Given
the explanation offered in the article, it would seem especially worthwhile
for a language's portability to target the LLVM since it's fairly generic. The
most compelling reason I can see to not target multiple runtimes is to avoid
ecosystem fracture, but it's really not that much overhead to understand a
runtime as just another facet of the environment that has to be accounted for.

~~~
daxfohl
Actually there's a _lot_ of support for that. Just not official support. But
seems like if you need X to run on Y then somebody somewhere has a hack to
make it happen. Some obviously more substantial than others.

That, and almost _everything_ has support compiling to its original target
_and_ JavaScript these days.

Nonetheless I agree with the sentiment, perhaps because I'd love to see Scala
on .NET and F# on JVM in particular. I don't know why, of all the things that
_have_ happened, neither of these ever did. Maybe too much overlap to be worth
it. I'd also have to imagine that strongly-typed languages are a bit more
difficult since you have to match the underlying runtime.

Curious if there were any languages and targets in particular you had in mind.

FWIW ClojureCLR is still going strong; seems to track the JVM version within a
couple weeks:
[https://github.com/clojure/clojure-clr/commits/master](https://github.com/clojure/clojure-clr/commits/master)
It does have some warts, but I've played with it and it works fine.

~~~
daxfohl
Also FWIW the ClojureCLR blog has lots of interesting tidbits about some of
the intricacies that directly address your question:
[http://clojureclr.blogspot.com/](http://clojureclr.blogspot.com/)

------
stonemetal
Is this supposed to be like the joke "Nobody drives in New York, the streets
are too crowded"? If no one is writing them, how can you rattle off a half
dozen runtimes that have achieved various levels of success? If anything, you
might ask why there are so many, and why people keep starting over instead of
getting behind one open-source runtime and pushing it farther.

~~~
jsli
I think the author is talking about a generic runtime, as opposed to each
language creating its own.

------
DonaldFisk
I have written and use my own Lisp, Emblem. It's syntactically similar to
Common Lisp, but like Scheme is a Lisp 1.

Functions are incrementally compiled into byte code instructions for a virtual
stack machine I designed specifically to run Lisp. Compiled functions are
added to the Lisp image, instead of being run directly in a Unix shell. The
virtual machine is written in C++. Using the read function as the parser and
Lisp-specific byte code as the target simplifies the compiler, which is only
1000 lines long. It also reduces the size of the object code. This does come
at the expense of speed, but it's fast enough for my requirements.
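
For readers who haven't seen this approach, here is a toy sketch in Python of its general shape: nested tuples stand in for what a Lisp reader would return, and they are compiled to instructions for a small stack machine. This is not Emblem's instruction set, compiler or VM; the opcodes and function names are invented for illustration.

```python
# Hypothetical opcodes for a toy stack machine.
PUSH, ADD, MUL = "PUSH", "ADD", "MUL"

OPCODES = {"+": ADD, "*": MUL}


def compile_expr(expr):
    """Compile a nested tuple such as ("+", 1, ("*", 2, 3)) to a flat
    list of (opcode, argument) instructions in postfix order."""
    if isinstance(expr, (int, float)):
        return [(PUSH, expr)]
    operator, left, right = expr
    return compile_expr(left) + compile_expr(right) + [(OPCODES[operator], None)]


def run(program):
    """Execute the instruction list on an operand stack."""
    stack = []
    for opcode, argument in program:
        if opcode == PUSH:
            stack.append(argument)
        elif opcode == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


program = compile_expr(("+", 1, ("*", 2, 3)))
result = run(program)
print(program)
print(result)  # 7
```

Because the input is already a tree and the target is a stack machine, the compiler is little more than a post-order walk, which is one reason a compiler of this kind can stay small.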

~~~
Nzen
It looks like you're talking about this.
[http://web.onetel.com/~hibou/Emblem.html](http://web.onetel.com/~hibou/Emblem.html)

And you started a visual dataflow language on the same runtime
[http://web.onetel.com/~hibou/Full%20Metal%20Jacket.html](http://web.onetel.com/~hibou/Full%20Metal%20Jacket.html)

~~~
DonaldFisk
Yes. That's an old page on Emblem, but the virtual machine is much the same as
it was back then, except for improved X11 event handling, and improved 3d
graphics.

Full Metal Jacket is implemented in Emblem.

------
icebraining
PyPy is an interesting alternative, since it's not a common language runtime,
but a toolkit for building runtimes. Besides Python, other languages for which
runtimes have been built using PyPy components (RPython, GC and JIT) are Ruby,
PHP, Pixie (a Clojure-like Lisp), Squeak, Prolog and BEAM bytecode.

------
rectang
The Apache Clownfish symbiotic object system is an alternative to the
"universal runtime" approach. It is a library rather than a platform.

Instead of trying to provide a _superset_ of semantics which encompasses all
languages, it provides a single runtime which supports a _subset_ of features,
which then lives inside the "host" language environment.

[https://github.com/apache/lucy-clownfish](https://github.com/apache/lucy-clownfish)

~~~
iamcreasy
Can you explain it in simpler terms: what is a 'symbiotic object system' and
how is it an alternative to a universal intermediate code?

------
dukoid
Well... what about WebAssembly?

------
justinlardinois
To answer the question posed in the title: because it's hard and the people
with those skills are already getting paid to do it. It's also the answer to
why there's no free software that's as good as Photoshop.

As others have said, the JVM has been a popular runtime target for the last
few years. But it's been in constant development for a quarter century, and it
powers countless enterprise systems, so the amount of time and resources that
have gone into it are enormous. It's hard to get that level of investment in
_any_ software, period.

------
carsongross
It's a great question. I suppose it's a fairly solved problem at this point
and very hard to do right, but when I look at what even "simple" bytecode for
the JVM looks like... yikes.

It would be so nice to have a universal VM that stripped out all the
unnecessary non-64 bit data types, cleaned out all the crazy method-
overloading related overhead, dropped the security model and added a nice
runtime reflection model and, dare I dream, a sane isolation mechanism between
modules...

------
Null-Set
You can call methods by name in C on POSIX systems. dlopen can give you a
handle to your own main program from which you can look up symbols.
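
The same mechanism can be tried from Python through `ctypes`, where `ctypes.CDLL(None)` requests a handle to the running program itself. A minimal sketch, assuming a POSIX system (it does not work on Windows) and using libc's `abs` as a symbol that is already loaded into the process; the variable names are arbitrary.

```python
import ctypes

# CDLL(None) asks the dynamic loader for a handle to the running program
# (the equivalent of dlopen(NULL, ...) on POSIX systems).
handle = ctypes.CDLL(None)

# Look the function up by its name as a string, much as dlsym() would.
function_name = "abs"
c_abs = getattr(handle, function_name)

# Declare the C signature: int abs(int).
c_abs.argtypes = [ctypes.c_int]
c_abs.restype = ctypes.c_int

result = c_abs(-5)
print(result)  # 5
```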

~~~
thristian
That depends on executables (such as your own main program) and shared
libraries (what dlopen() reads) having the same file format. That's a feature
of the ELF executable format commonly used on Linux systems, but isn't
necessarily true of other executable formats, like Mach-O (used on OS X).

------
bogomipz
The author states:

" The JVM is very powerful, but it has its own potential issues (type erasure
and garbage collection come to mind)."

Can someone explain why type erasures would be an issue? GC is a loaded
subject I know but don't type erasures allow Java to have generics? What is
the negative here?

~~~
iamcreasy
Java doesn't have true (reified) generics. It uses erasure to emulate this
feature. During compilation, type parameters are removed and replaced with their
bound (usually Object), so the type arguments are no longer available at
runtime. I think that's why it's an issue.

~~~
bogomipz
How are true generics implemented then in other compiled languages?

~~~
zastrowm
In the CLR/C# for example, List<int> is a distinct type from List<bool>. At
runtime, when you create List<int>, the CLR will create a new type based on
List<T> that is specialized for int. For List<bool>, a new type based on
List<T> is specialized for bool. Then at runtime, you can test a variable (if
(untypedObj is List<int>)) because there are distinct types for each closed
generic type.

In java, the compiler would instead (I'm simplifying here), replace List<int>
and List<bool> with List<object>, and at runtime, you wouldn't be able to tell
a List<int> from List<bool> because their types were "erased" and became
List<object>.

~~~
bogomipz
I see, thanks!

------
bitmapbrother
>The JVM is very powerful, but it has its own potential issues (type erasure
and garbage collection come to mind).

Isn't type erasure one of the very reasons dynamic languages are so plentiful
and abundant on the JVM?

~~~
steveklabnik
The addition of invokedynamic helped a lot too.

------
qaq
Because it requires very substantial investment and provides little benefit?

------
dschiptsov
Because a runtime is hard to understand, let alone sell.

Notably LuaJIT, Golang (and before Plan9 and Inferno), Erlang, Haskell, Swift,
libc++, etc are writing runtimes. The problem is that it becomes a mess very
quickly, unless you are Mike Pall.

Also, the idea of a VM isolated from an OS is a marketing meme (how could a
mere OS process be isolated from an OS?) So the smallest, thin-layer-over-an-OS
runtime, the approach pioneered by Inferno, is a much saner and more efficient one.

Compilation to machine code + thin layer runtime + native ABI based FFI is
still the hardest and still the best way to make a runtime since times of
Common Lisp.

I am not writing runtimes because I am not Mike Pall, not a Google employee
and no one pays me to do so.))

BTW, marrying, say, Arc (or femtolisp, or a sane subset of scheme) to LuaJIT
runtime could be a nice project, to have a _really_ powerful scripting
language.

~~~
bogomipz
Could you elaborate on this sentence, specifically - the native ABI based FFI
part? When I think of the ABI I think of the runtime linker of the OS,
wouldn't this always be native? What would the non-native ABI be?

"Compilation to machine code + thin layer runtime + native ABI based FFI is
still the hardest and still the best way to make a runtime since times of
Common Lisp."

Also what is the thin layer, in this context? I don't know much about Lisp
runtimes, I'm sure it's fascinating as it's both compiled and interpreted.
Thanks.

~~~
dschiptsov
To make FFI lightweight, one follows underlying OS ABI, so there is no need to
do any conversions.

Compare Golang or LuaJIT FFI with what Java does. Java uses libffi, but it
does unnecessary copying and type and encoding conversions. Golang and LuaJIT
have zero-copy FFI due to reusing OS types (for UNIX-like systems it is C) -
just dlload-and-call.
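
A small illustration of what reusing C types buys, sketched in Python with `ctypes` on a POSIX system. It is not how Go or LuaJIT implement their FFI (and `ctypes` is itself built on libffi); it only shows that when data is already laid out the way C expects, a C function such as libc's `qsort` can work on it in place with no conversion step. The variable names are arbitrary.

```python
import ctypes

# Handle to the running process, which already has libc loaded (POSIX only).
libc = ctypes.CDLL(None)

# An array with the same memory layout as a C "int[5]".
values = (ctypes.c_int * 5)(5, 1, 7, 33, 99)

# C signature of the comparison callback:
# int (*)(const int *, const int *)
CMPFUNC = ctypes.CFUNCTYPE(
    ctypes.c_int,
    ctypes.POINTER(ctypes.c_int),
    ctypes.POINTER(ctypes.c_int),
)


def compare(a, b):
    # a and b point directly into the array being sorted.
    return a[0] - b[0]


# void qsort(void *base, size_t nmemb, size_t size, int (*compar)(...))
libc.qsort.argtypes = [
    ctypes.POINTER(ctypes.c_int),
    ctypes.c_size_t,
    ctypes.c_size_t,
    CMPFUNC,
]
libc.qsort.restype = None

# qsort rearranges the elements of the array in place.
libc.qsort(values, len(values), ctypes.sizeof(ctypes.c_int), CMPFUNC(compare))

sorted_values = list(values)
print(sorted_values)  # [1, 5, 7, 33, 99]
```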

Thin layer (of abstraction) is how Golang's runtime is organized - delegating
to an OS instead of reimplementing inside a VM.

~~~
bogomipz
Thanks. Makes sense.

