
The Inevitable Death of VMs: A Progress Report [pdf] - ingve
https://www.cs.kent.ac.uk/people/staff/srk21/research/papers/kell18inevitable-preprint.pdf
======
paulddraper
Very interesting. Had to point out the irony, though:

> A key hypothesis of the liballocs design is that existing VMs may be
> retrofitted onto it at fairly modest effort, rather than being thrown away
> or substantially rewritten. A previous partial retrofitting (of V8) exists,
> but is challenging to maintain; therefore, input is sought on alternative
> candidate VMs for use as retrofitting targets.

~~~
kevingadd
In all fairness to the author, casually maintaining any major changes to V8 or
Chrome for more than a month is borderline impossible. The whole codebase
undergoes massive churn on a release-to-release basis. There are projects like
LibCEF that have the rug pulled out from under them every release or two
because some major subsystem got completely overhauled.

So, the difficulty of maintaining the V8 changes is kind of meaningless. The
fact that it's possible at all is the interesting part!

~~~
stephenrkell
Thanks! Indeed I should have said "no more difficult to maintain than V8 in
general... i.e. well beyond a researcher's means".

~~~
paulddraper
Ah, that makes sense.

"V8 changes such that alterations in general are difficult to maintain."

------
Nzen
tl;dr this is a two page description of Stephen Kell's liballocs library [0].
Basically, it's wasteful to install jvm, python interpreter, and electron.
He's advocating that new languages use (u)nix infrastructure (processes,
files, etc) as much as possible. His library is supposed to assist with this
by providing process-level GC. It provides a hierarchical view such that a
caller allocates objects, rather than flat memory.

I don't know enough to have an opinion on this. As an alternative view,
consider listening to Cliff Click describe the challenge the jvm faced when it
had a more permissive ffi [1].

[0] https://github.com/stephenrkell/liballocs

[1] https://www.youtube.com/watch?v=LoyBTqkSkZk

~~~
imtringued
I personally disagree. The differences between languages are large
enough that a fully general VM is impossible without also incurring all the
downsides of a language-specific VM. Python and Java each have a distinct
standard library. The end result will be that you have to install
'supervm-java-std' and 'supervm-python-std' instead of openjdk and cpython. JVMs tend
to be very memory heavy in exchange for higher performance for java code,
whereas the python interpreter is slow but more memory efficient because it
relies on C code to accelerate CPU and memory intensive applications.

~~~
stephenrkell
Not sure whether you're disagreeing with the paper or the commenter's summary
of it. As you could probably guess from the title of the paper, the goal is
_not_ a new "fully general" VM.

An analogy I sometimes use is pre-IP internetworking. If you wanted a new
cross-network application, then of course you could in principle build
application-layer gateways, but the economics simply didn't work. It took a
carefully engineered "hourglass waist" to fix the economics. The goal is to
create the equivalent for language implementations. And the whole point is
that "one super-VM" is not the recipe.

------
mappu
GitHub repository: https://github.com/stephenrkell/liballocs

I read the paper imagining a kind of COM implementation
(apartments/marshalling/...)? But the GitHub readme makes it seem more like a
kind of tagged malloc wrapper.

~~~
stephenrkell
Ouch. :-) There's a lot more to it than that, though wrapping malloc is
certainly one part of it. (Reminds me I should finish my blog post on why
wrapping malloc reliably is way harder than it should be.)

But the key idea is to avoid introducing new abstractions -- anything that
liballocs formalises should be commonly "lurking" in there already. So types
and allocators are OK, but apartments would not be. It's not a new programming
model... the newness should be at the meta-level only, i.e. ways of
_describing_ what existing code already does.

------
bboreham
Interesting that the author makes no mention of CLR, Microsoft’s attempt at
the same thing which has been around for 20 years.

~~~
stephenrkell
True that the CLR is not mentioned by name, but it is covered. I invite you to
read the text again, and especially the following bit.

"Specifically, we should aspire to package language implementations in a way
that renounces ‘one true VM’, instead allowing first-class interoperability
with the host environment (perhaps at modest drop in performance), the same
interoperability with other VMs past and present, and tool support which ‘sees
across’ these boundaries."

The CLR simply doesn't do these things, as witnessed by the debacle of
"Managed C++" and the usual FFI wrapper tedium of "explicit P/Invoke". It is a
classic "one true VM", albeit more language-inclusive than a single-language
VM.

~~~
bboreham
Surely it was an _attempt_ at all these things, it just didn’t work out. For
instance, I remember controversy that new Windows APIs would only be
accessible via .Net.

------
dcsommer
I'd love to see the topics of soft-realtime and lightweight threads addressed.

------
mroche
I expected this to be about system virtualization, not language VMs.
Interesting though, but not my area of expertise.

~~~
msla
> I expected this to be about system virtualization, not language VMs.
> Interesting though, but not my area of expertise.

The blatant reuse of abbreviations and terms is one of my pet peeves.

For example, VM can mean Virtual Machine, Virtual Machine, or Virtual Memory.
Yep, three completely distinct things, two of which are helpfully referred to
under the same expanded name as well. Do you know all three? mroche has
helpfully disambiguated between Virtual Machine and Virtual Machine already,
but in case you need help:

Virtual Machine means a computer split up into many different fake systems,
each running a guest OS. Xen is an example of a Virtual Machine hypervisor.

Virtual Machine means a completely fake CPU packaged with libraries and used
to run a high-level language, to facilitate things like garbage collection and
language-level security. Java runs on a Virtual Machine, quite imaginatively
called the Java Virtual Machine. Now, what would we call a Java execution
environment which could run as a Xen guest? (Don't strain yourself... )

Virtual Memory means lying to applications about how memory works, to present
a completely flat address space without such annoyances as caching or other
programs or the operating system disturbing application programmers, who are
quite disturbed enough.

Now, let's take all of those concepts, refer to all three of them with the
same abbreviation and two of them with the same name entirely. I suppose I
should feel lucky to live in this time: Soon, we'll be calling them all Bruce,
to cut down on confusion.

~~~
theamk
Your first two "virtual machines" are actually not that different.

Sure, xen simulates the same CPU type as host, and talks to host via block
devices and raw sockets, and each guest includes the full OS and filesystem;
while Java VM simulates a completely different CPU, and uses host's OS for
filesystem and TCP/IP access. They seem pretty distinct.

But you are just looking at the ends of the spectrum, and there are plenty of
things in the middle.

- With Xen, you can boot directly into a user app, without any OS, filesystems
or separate libraries. Is it still a "fake system" if the thing you are booting
has no chance of running on real hardware?

- Qemu/kvm, which is normally used to emulate processors, supports "virt", "a
platform which doesn't correspond to any real hardware and is designed for use
in virtual machines." Does this start to sound like the JVM to you?

- There are actual, physical chips which execute Java bytecode directly
(like picoJava). Does this put the Java VM into the "hardware emulation" category?

- Oh, and there is the UML (User Mode Linux) project -- it emulates a virtual
machine with its own fixed memory pool (like xen), and can use the host's block
device (like xen), and raw networking (like xen); but it can also use the host's
filesystem (like JVM), and it uses the host's kernel for thread scheduling (like
JVM). Where does it go?

- Oh, and there is Smalltalk. The older versions had a garbage-collecting VM
(like JVM), but also their own device drivers (like xen) and filesystem (like
xen).

There is a reason we call all of them "virtual machines" -- they have lots of
things in common.

~~~
zamfi
I'd also add that I believe the usage in the title here is a little obscure --
I think almost any generalist programmer (i.e., not someone who exclusively
works in programming language VMs or in system VMs) would assume "VM" refers to a
system-level virtual machine.

At one point I interacted with the JVM and its quirks on a daily basis, and I
never started referring to it as "the VM", it was always "the JVM".

I suspect it's in large part _because_ they're not that different from a systems
perspective that I assume "VM" to mean the more generic of the two!

~~~
atq2119
The usage has changed over time. Go back 15 years, and I'd expect people would
more commonly have associated "VM" with the language VM. Of course, that historical
development fits well with the high-level claims made in the paper.

~~~
zamfi
Agreed. There was a time when the system-level virtualizer was "the
hypervisor", but I was still in school then. ;)

