
ZeroVM: Smaller, Lighter, Faster - bretpiatt
http://www.rackspace.com/blog/zerovm-smaller-lighter-faster/
======
psycr
I like this alternative to titling.

The original headline is preserved and clarified by the editorial note in
square brackets. It would be great for HN to adopt this as a solution to the
modified headline problem, with the proviso that editorial comments must only
be used for the purposes of clarification.

~~~
benologist
It _was_ nice, until it was removed because fuck context.

ZeroVM: Smaller, Lighter, Faster [RackSpace acquires LiteStack]

------
VanL
Van Lindberg here from Rackspace. If you have any questions, I am around to
answer.

~~~
jmillikin
* The ZeroVM site makes a big deal about application execution being completely deterministic. How does this interact with applications that require random numbers, such as crypto?

* Is ZeroVM capable of running unmodified Linux binaries? If not, what compiler toolchain is required to get it working? The main advantage of other lightweight virtualization solutions (OpenVZ, LXC) is that it's very easy to take regular binaries (e.g. postgresql) and drop them in a sandbox with minimal fuss.

~~~
VanL
\- It is deterministic based on the inputs. You would need to pass in a seed
or read from an external source of randomness to get different values out of a
PRNG.

\- Binaries need to be recompiled. There are two toolchains, a GCC-based one
and an LLVM-based one. We can also compile within the ZeroVM container itself.

We expect that a lot of people will use existing language runtimes (Python,
Lua, JS) to avoid compilation.

Over the long term, though, a lot of the power comes from composability. Think
Unix pipes, in parallel, across the cloud.
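
To illustrate the first point, here is a minimal sketch in plain Python (not ZeroVM-specific) of a computation whose "random" output depends only on a seed passed in as input. The function name `sample_tokens` is made up for this example, and Python's `random` module is not suitable for cryptography; crypto code would need to read real entropy from an external source, as described above.

```python
import random

def sample_tokens(seed, count):
    """Return `count` pseudo-random integers derived only from `seed`."""
    # A private generator: no global state and no OS entropy is consulted.
    rng = random.Random(seed)
    return [rng.randrange(1_000_000) for _ in range(count)]

# Same seed in, same values out; a different seed gives a different sequence.
first = sample_tokens(seed=42, count=5)
second = sample_tokens(seed=42, count=5)
other = sample_tokens(seed=43, count=5)
```

Re-running with the same seed replays the same sequence, which is what makes the whole run reproducible.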

~~~
codex
Does the hypervisor use multiple communicating CPUs? If so, how do the races
inherent in concurrency not destroy the determinism? Is this a single
CPU/thread/fiber hypervisor?

~~~
VanL
Each container has a single process. Each individual part is deterministic, so
the entire system is composable deterministically.

~~~
codex
That's not true if the processes communicate, or contain communicating
threads. In one run, an input queue looks like {A, B}. In another, {B, A}. The
source of the non-determinism is just entropy bubbling up from the hardware.

EDIT: using completely synchronous I/O (mentioned below) is a very clever
solution, but it requires a process to know its inputs ahead of time. This may
also cause cluster scalability issues, as now each "round" of inputs is gated
by the slowest of the source processes.

~~~
pkconstantine
All reads from other sessions are blocking. There is no input queue, zerovm
processes read and write directly to each other. This way determinism can be
preserved even for clusters.
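
A rough sketch of why this works, using Python threads and `queue.Queue` as stand-ins for ZeroVM processes and channels (this is an assumption for illustration, not the ZeroVM API). Each producer has its own channel, and the consumer does blocking reads in a fixed order, so the result does not depend on which producer happens to run first.

```python
import queue
import random
import threading
import time

def producer(channel, values):
    for v in values:
        time.sleep(random.uniform(0, 0.01))  # simulate unpredictable timing
        channel.put(v)

def consumer(chan_a, chan_b, rounds):
    out = []
    for _ in range(rounds):
        out.append(chan_a.get())  # blocks until A has written
        out.append(chan_b.get())  # then blocks until B has written
    return out

def run_once():
    # One dedicated channel per producer; there is no shared input queue.
    chan_a, chan_b = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(chan_a, ["a1", "a2", "a3"])),
        threading.Thread(target=producer, args=(chan_b, ["b1", "b2", "b3"])),
    ]
    for t in threads:
        t.start()
    result = consumer(chan_a, chan_b, rounds=3)
    for t in threads:
        t.join()
    return result
```

The trade-off raised in the parent comment is visible here too: each round takes as long as the slower of the two producers.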

------
eaurouge
Perhaps a bit off-topic, but I would like to get to the point where I can make
intelligent comparisons between technologies like CoreOS and ZeroVM, and in
general better understand containerization, virtualization etc. Can someone
suggest a list of books that can get me started on that path?

~~~
krakensden
CoreOS and ZeroVM are so young that there's not really much literature about
them yet. That said, VMs and containers have been around for decades.

Lots of the recent activity is more about packaging and usability
improvements, rather than theoretical improvements.

A quick overview: CoreOS is a super-minimal Linux distribution designed to be
used as a base for applications. It's essentially equivalent to the JEOS
buzzword from five years ago. It would run inside of Xen or KVM or VMWare.

Xen, KVM, VMWare, Virtualbox, you probably already know about- they provide a
virtual machine, the operating system running inside of it (theoretically)
can't tell it's not on its own hardware. Xen uses a 'hypervisor', which is
essentially a very tiny, custom kernel. KVM uses the Linux kernel as the
hypervisor, which makes a lot of sense- you don't have to reimplement all the
years of hardware support and scheduling work they've done. VMWare and
Virtualbox run as applications on whatever OS you provide. You lose out on
some opportunities for clever performance hacks this way, but there are other
advantages. VMWare ESX[i] is more like Xen & KVM, but I don't really know that
much about it.

Containers (BSD jails, Solaris zones, LXC, and of course, HN's lovechild
Docker) let you provide VM-like isolation and resource management between
processes or groups of processes, but you only run one kernel. This means much
less duplicated effort and memory, and Docker's AUFS lets you deduplicate your
storage too. There are slightly more security concerns about this approach
than full VMs, the Linux kernel (and others, but let's be honest about the
target audience) has a long and ugly history of local privilege escalations.

ZeroVM is based on Google Chrome's NaCl, and that's about all I know about it.
I would expect VM-like security (it validates machine code), and an
environment that requires serious porting from POSIX. That said, if you use
Python, Ruby, Mono-compatible .NET, or Go, the heavy lifting has already been
done for you.

~~~
Patrick_Devine
ESX and ESXi are bare metal hypervisors, unlike KVM which is kind of like a
quasi-bare metal hypervisor which happens to sit in the Linux kernel. ESX has
been EOL'd, however it relied on something called the "Console OS" to
bootstrap itself until the vmkernel would take over and start scheduling
tasks. The Console OS was actually a modified Red Hat Advanced Server (later
Enterprise Server) instance which once the system was booted would act as a
kind of "privileged guest". You could log in to it and do sysadmin-y tasks
like add users, install RPMs etc.

ESXi, on the other hand, was written to do away with the Console OS entirely,
but it still has a fairly rudimentary shell. Many of the utilities are based
on busybox and the idea is it should be stripped down with only really minimal
functionality. It also sported something called the Direct Connect UI (DCUI)
which is a curses based interface for doing things like setting up an admin
password, reviewing logs and changing security settings.

~~~
krakensden
Thanks for the clarification.

------
anttiok
Based on "Why ZeroVM?"
([http://zerovm.org/wiki/Why_ZeroVM](http://zerovm.org/wiki/Why_ZeroVM)), a
large part of the motivation for ZeroVM is based on the premise that regular
VMs require a full OS and are therefore unacceptably fat. However, there are
multiple platforms for running unmodified applications directly on VMs without
requiring a traditional OS, e.g. the work I've been involved with:
[https://github.com/anttikantee/rumpuser-xen/](https://github.com/anttikantee/rumpuser-xen/)

Determinism, OTOH, sounds interesting at least on paper. Is there any
experience from tests with real applications in real world scenarios?

------
saryant
Big congrats to Cam and the whole team! I work for one of the other companies
from their TechStars class, they were a blast to have in San Antonio.

------
zobzu
ZeroVM is LXC but with NaCl.

~~~
SvenDowideit
almost,

LXC starts as a general purpose Linux container with everything built in, and
adds more isolation as development continues

ZeroVM starts with no general purpose Linux, and will add support as
development continues

ie, LXC will work with what you have now, ZeroVM will eventually work with
what you have now, but shims will have to be developed for everything, either
in your code, or in ZeroVM's

IMO the future endpoint will have similar functionality in both projects, but
LXC will see more testing and use /now/

------
yesimahuman
Huge congrats to the LiteStack team. I was in their TechStars class and those
guys are super smart.

------
felixgallo
I've read the architecture doc
([http://zerovm.org/wiki/Architecture](http://zerovm.org/wiki/Architecture))
and I loved it.

But, when you say tantalizing things like 'erlang-on-c', you raise the
question: what does the clustering control plane look like?

One of the great things about erlang is that the cluster's got supervisors
that receive execution-level messages (e.g. 'EXIT') and can then take whatever
action they feel like. Is that control plane level exposed to ordinary
containers?

And the other great thing about erlang is that the messaging model is either
synchronous if you care (with return receipts) or asynchronous if you don't
(fire and forget) -- and that richness turns out to have a bunch of good use
cases. What's the ZeroVM story there?

And the other great thing about erlang is being able to trace out messages,
especially when your synchronous architecture just took a dump on the sheets
and is staring at you belligerently. Does ZeroVM have introspection figured
out yet?

~~~
VanL
The control plane is on top of ZeroMQ, to allow for various arrangements of
components as well as the ability to observe the flow of inputs/outputs.

~~~
felixgallo
Thanks for the response. It'd be great if the wiki were fleshed out with an
overview of how that works and for the other questions to be addressed as
well, for those of us examining it from other backgrounds.

------
codex
How does this solution compare to Google App Engine?

~~~
codex
Why the downvote? This solution requires an app recompile, no?

------
davidw
So... how do I use this with my Rails app, and to what end?

It looks like interesting technology, but I need a more concrete example.

~~~
vidarh
I think for stuff like your Rails app, you'd want to wait until there is
support lower level in your stack.

But imagine that you're writing a commenting system, and want to sanitise
stuff received from a user. Sanitising data is error prone, so you isolate the
code in a new zerovm. If someone finds a way to exploit anything in your
sanitising code, they might be able to write broken sanitised HTML out, but
they won't be able to e.g. send queries to your database, or write to your
disk, because the zerovm simply doesn't have permission.

And imagine the web server spawning a new zerovm for every request, that only
has permission to talk to the inbound network connection and pass messages
between that and a Rails zerovm for that request. If there's an exploit in the
HTTP parser, that vm could be exploited, but it'd die at the end of the
request, and would have no permissions to talk to the database server or write
to disk.

And imagine the Rails zerovm similarly being split into pieces: Request
handling might be done in one; authentication might be done in one.

The lower the startup costs, the more you can afford to chop the app into
pieces _and_ the more you can leverage that to benefit in terms of security
(by reducing the privileges of each individual component) and scalability (by
allowing distribution of the VMs across CPUs and across servers)
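
The shape of the sanitiser example can be sketched with an ordinary child process that only sees stdin and stdout. To be clear about the assumptions: a plain `subprocess` call gives none of ZeroVM's confinement (the child can still open files and sockets), and `sanitize_isolated` and `SANITIZER_SOURCE` are names invented for this illustration. It only shows the narrow interface the parent relies on.

```python
import subprocess
import sys

# Hypothetical sanitiser: reads raw text on stdin, writes escaped HTML to stdout.
SANITIZER_SOURCE = (
    "import html, sys\n"
    "sys.stdout.write(html.escape(sys.stdin.read()))\n"
)

def sanitize_isolated(untrusted):
    """Run the sanitiser in a short-lived child process and return its output."""
    proc = subprocess.run(
        [sys.executable, "-c", SANITIZER_SOURCE],
        input=untrusted,       # the only data handed to the child
        capture_output=True,   # the only data taken back from it
        text=True,
        timeout=10,
        check=True,            # raise if the child exits with an error
    )
    return proc.stdout

clean = sanitize_isolated("<b>hi</b>")
```

The parent never shares a database handle or file descriptor with the child; under a real sandbox, that missing access would be enforced rather than merely conventional.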

~~~
mjs
How would ZeroVM instances talk to any server with persistent storage (e.g. a
key/value store) in a deterministic way? get('top_stories') will change over
time.

------
epynonymous
why not just use docker.io? what are the differences between zerovm, docker,
and warden?

~~~
wmf
I found the architecture page helpful in understanding what this thing really
is: [http://zerovm.org/wiki/Architecture](http://zerovm.org/wiki/Architecture)

~~~
amalag
What does an instance look like? These are very lightweight instances of what?

~~~
pkconstantine
It looks like a hardened *nix process. It has no access to anything it's not
permitted to access. And it has no notion of network, time or machine it's
running on, although it can communicate with other instances (even on remote
machines) via ipc. It can be suspended, resumed, relocated and so on without
it ever noticing.

~~~
amalag
Thank you. When you instantiate a zerovm instance you give it the associated
code as well? And which IPC method can it use? Is zerovm the library you use,
is there such a thing as a separate zerovm instance, or is it just the way we
are used to talking about virtualization?

~~~
krakensden
You give it an x86 (or ARM) binary to execute. NaCl is also working on an LLVM
version, that would get compiled to the specific machine at runtime.

IPC is super limited:
[https://github.com/zerovm/zerovm/blob/master/doc/api.txt](https://github.com/zerovm/zerovm/blob/master/doc/api.txt)

You get nothing but /dev/stdin, /dev/stdout, and /dev/stderr by default. You
can optionally make other resources (network, files) available, through a
similar api.
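
As a rough picture of what code written for that default environment looks like, here is a small Python filter that touches nothing but the three streams it is handed. The function name `run` and the word-count task are assumptions for illustration and are not taken from the ZeroVM API document.

```python
import io

def run(stdin, stdout, stderr):
    """Count words per line, using only the three streams passed in."""
    total = 0
    for line_no, line in enumerate(stdin, start=1):
        words = len(line.split())
        total += words
        stdout.write(f"{line_no}\t{words}\n")
    stderr.write(f"total words: {total}\n")
    return 0

# In-memory streams stand in for /dev/stdin, /dev/stdout and /dev/stderr.
demo_out, demo_err = io.StringIO(), io.StringIO()
exit_code = run(io.StringIO("hello world\nzero vm is small\n"), demo_out, demo_err)
```

Calling `run(sys.stdin, sys.stdout, sys.stderr)` would turn the same function into a command-line filter.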

------
nine_k
Am I the only one here who finds this approach somehow similar to Plan 9?

~~~
4ad
Yes, because after many years of using Plan 9 daily I see no similarity.

------
passfree
I cannot see what this has to do with security. At the end of the day, it is
the data that attackers are after and the app needs to be able to access it
whether it is virtualised or not.

~~~
vidarh
Each part of your application needs access to some sub-part of your data, but
if you isolate your app at the OS level and run a whole app server inside a
VM, every part of your application can at least in theory access all your
data.

If you sub-divide your app in separate zerovms, whether per-request, or split
it up further into functional responsibilities, then you substantially reduce
the attack surface by ensuring that an exploit against any one part of your
application can only exploit the specific subsets of data it is allowed to
work on.

You can do this without zerovm too, but the more you reduce the cost and
difficulty of spawning a new vm or container, the more finely grained you can
subdivide your application, and hence the fewer privileges each subset of your
app will have.

~~~
passfree
This is wishful thinking at the moment. I understand perfectly well what that
means but data is data. An application typically has access to all data and
the fact that you run it through a VM doesn't change anything.

I can find this technology useful only in areas where you want untrusted
3rd-party code to run without worrying about what it will do.

~~~
vidarh
It is only "wishful thinking" in as much as people are usually lazy because
the effort required to sandbox small pieces of code is prohibitive in most
current platforms. But larger systems are already often layered in ways that
layer access to data anyway (though often not intentionally for security).

~~~
passfree
Good luck with this approach. I am not saying you should throw away this -
awesome technology btw - but rather impractical in many ways. You will find
very few use cases where you can apply this with direct benefit. In most
cases this won't change a thing - only perhaps in highly specialised software.

------
nwmcsween
why not a different libc, such as musl?

~~~
krakensden
I've never understood people's fascination with replacing glibc. There's
almost never anything to gain- and glibc has the advantage of being really,
really well tested.

~~~
krasin
"I've never understood people's fascination about SpaceX. There's almost
nothing to gain- and Russian Proton has the advantage of being really, really
well tested."

On a serious note: the primary disadvantage of glibc is that it's really,
really hard to change (and build times are slow). While it's already here,
sometimes you want to port it to a new platform or a new ABI, and the
adventure begins.

~~~
krakensden
As someone who tried to port code in the early days of Bionic, I am filled
with grumbles.

And glibc hasn't been slow for a year or two, since they defenestrated Ulrich
Drepper.

