
Cervus: A WebAssembly subsystem for Linux - dmmalam
https://github.com/cervus-v/cervus
======
kccqzy
The next step of evolution predicted by Gary Bernhardt.
[https://www.destroyallsoftware.com/talks/the-birth-and-death...](https://www.destroyallsoftware.com/talks/the-birth-and-death-of-javascript)

~~~
brian_herman
The end is neigh and it is the death of javascript.

~~~
emmelaich
Don't horse around. The word is `nigh`

------
chatmasta
From the readme:

> I'm busy with my College Entrance Examination until ~June 10, 2018

This dude is in high school?! When I was your age I thought I was smart for
writing a youtube scraper in PHP...

Awesome work, really creative solution. Good luck to you.

~~~
steveklabnik
I believe the other person working on a wasm kernel in Rust is also in high
school. They take two different approaches, I can’t wait to see how all of
this turns out!

~~~
zamber
A real High School drama, not that fake stuff on the telly ;).

------
erlend_sh
Highly related:
[https://github.com/nebulet/nebulet](https://github.com/nebulet/nebulet)

“(Going to be) A microkernel that implements a WebAssembly "usermode" that
runs in Ring 0.”

It’s inspired by the Microsoft experiment Singularity OS.

~~~
davidgrenier
I've been wondering for a while as to why nobody was trying to run an entire
VM at ring 0, the benefits would be significant. I was just not aware that's
what Singularity/Midori were doing.

I'm glad more people are picking up on it.

~~~
masklinn
Doesn't Ling/ErlangOnXen run in Ring0?

~~~
qop
Ling has the concept of hypercalls, as if you were running plain Linux on xen.
So, it's less an actual unikernel and more like a skin that molds itself to
xen, and looks like a unikernel.

So yes. But no.

------
jchw
Having web assembly be a native subsystem... That's brilliant. Why has this
not been attempted for Java or anything else for that matter? I guess you
could count Microsoft's .NET implementation.

In any case, you could extend this beyond just user mode. Currently the domain
of safe ring0 execution is eBPF as far as I know, but this would be way more
approachable and other operating systems could implement it.

I've got no idea what the future is for this project but I really hope it
doesn't stay in the realm of "fascinating but not practical."

~~~
masklinn
> Having web assembly be a native subsystem... That's brilliant. Why has this
> not been attempted for Java or anything else for that matter?

There have been CPUs which could execute Java bytecode directly…

And not just actual Java processors, ARM has/had an extension for that:
[https://en.wikipedia.org/wiki/Jazelle](https://en.wikipedia.org/wiki/Jazelle)

~~~
jchw
I was aware of the Java CPUs, but somehow that seems less interesting. Maybe
because at that point, it's not so different from any other CPU architecture.

------
zaarn
WA in the kernel could lead to drivers being established as WA code with
special interfaces. With safe interfaces to stuff like PCIe devices, the linux
kernel could transform into a more hybrid kernel, similar to NT. Many funs to
be had!

I imagine it could also be useful to run user programs without having to
switch rings all the time...

~~~
sime2009
> I imagine it could also be useful to run user programs without having to
> switch rings all the time

That is pretty much exactly what this project is aiming at:

"Cervus implements a WebAssembly "usermode" on top of the Linux kernel (which
tries to follows the CommonWA specification), enabling wasm applications to
run directly in ring 0, while still ensuring safety and security."

~~~
jacobush
Reminds me of how Forth was supposed to be the language for drivers and such,
write once, run everywhere drivers.

------
qznc
Another guy evaluating similar stuff:
[https://idea.popcount.org/2017-03-28-sandboxing-landscape/](https://idea.popcount.org/2017-03-28-sandboxing-landscape/)

------
titzer
Do not run untrusted code at ring 0, regardless of software sandboxing
technology. It's just too risky!

Otherwise neat.

~~~
geofft
Why do you claim this?

1\. Hardware sandboxing (i.e., ring not-0) isn't much better, as seen by
Meltdown.

2\. Most production-ready UNIXish kernels have had support for running
untrusted code (namely BPF bytecode) in the kernel for decades.

3\. Do you _really_ trust all the code currently running in ring 0 on your
computer? In particular, do you trust the executable loader, which handles
complex untrusted input? What's the line between "code" and "not code"?

4\. On most desktop machines, there's a single user, and malware being unable
to get to ring 0 isn't particularly stymied; it can still exfiltrate your
files, stream your webcam, log into your bank, etc. Why is ring 0 more of a
concern than untrusted code elsewhere? (In fact, on most Linux desktops,
malware can wait until the user runs sudo, inject itself in, and then run
insmod and get to ring 0 directly....)

5\. These same machines make a practice of running untrusted, JITted
JavaScript and WebAssembly all the time inside the same sandbox you think is too
dangerous, and the sandbox works. Not perfectly, of course, but also certainly
much _better_ than, say, the Linux kernel protects itself from local privilege
escalations. Why is the same software sandbox too dangerous for use in
kernelspace?
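
Point 2 leans on the idea that bytecode can be checked before it is ever run. Here is a toy sketch of that validate-then-run pattern, assuming a made-up four-opcode instruction set (`push`, `add`, `jmp`, `ret`) and a hypothetical `verify` function. It is not how any real kernel verifier is implemented. It borrows only the classic BPF rule that jumps may go forward but never backward, so an accepted program cannot loop forever.

```python
# Hypothetical instruction set, for illustration only.
ALLOWED = {"push", "add", "jmp", "ret"}


def verify(program):
    """Return a list of problems; an empty list means the program is accepted."""
    problems = []
    for pc, (op, *args) in enumerate(program):
        if op not in ALLOWED:
            problems.append(f"{pc}: unknown opcode {op!r}")
        elif op == "jmp":
            if len(args) != 1 or not isinstance(args[0], int):
                problems.append(f"{pc}: malformed jump")
                continue
            offset = args[0]
            target = pc + 1 + offset  # offsets are relative to the next instruction
            if offset < 0:
                # Forward-only jumps rule out loops, so every run terminates.
                problems.append(f"{pc}: backward jump")
            elif target >= len(program):
                problems.append(f"{pc}: jump out of range")
    if not program or program[-1][0] != "ret":
        problems.append("program does not end with ret")
    return problems


good = [("push", 1), ("push", 2), ("add",), ("ret",)]
bad = [("push", 1), ("jmp", -2), ("ret",)]
print(verify(good))  # []
print(verify(bad))   # ['1: backward jump']
```

A check like this only narrows what the program can express. Whether the checker and the runtime behind it are themselves bug-free is the question being argued here.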

~~~
titzer
> 1\. Hardware sandboxing (i.e., ring not-0) isn't much better, as seen by
> Meltdown.

Meltdown was a single Intel bug, it did not occur on other CPU architectures
or on AMD chips. It was a result of asynchronous permission checking and it is
a side-channel disclosure (a non-write bug). It is objectively not as bad as
the tens of thousands of buffer overruns and memory write vulnerabilities in
software.

> 2\. Most production-ready UNIXish kernels have had support for running
> untrusted code (namely BPF bytecode) in the kernel for decades.

Actually it is a security vulnerability as well. In fact, the Project Zero
proof of concept for Variant 1 of Spectre was an attack on the BPF
_interpreter_, not even a _JIT_; it's even worse with a JIT.

> 3\. Do you really trust all the code currently running in ring 0 on your
> computer? In particular, do you trust the executable loader, which handles
> complex untrusted input? What's the line between "code" and "not code"?

There are levels of trust, of course. I trust the Linux kernel a heck of a lot
more than, e.g. V8. And I work on V8. On the WebAssembly implementation. I
didn't want to mention it, but yeah, no, I would not put my own code into the
kernel.

> 4\. On most desktop machines, there's a single user

Again, levels of trust. I would not, e.g. run most userspace software in the
kernel, just because it's so broken it will probably bring down the system.
All the other things you mention are made easier, not harder by running in the
kernel.

> 5\. These same machines make a practice of running untrusted, JITted
> JavaScript and WebAssembly all the time inside the same sandbox you think is
> too dangerous, and the sandbox works.

I don't want to scare you, but please don't labor under the assumption that
web browsers are 100% secure. We have tons of bugs. I mentioned above that the
WebAssembly implementation in Chrome is a lot of my work. We've had security
vulnerabilities.

> Not perfectly, of course, but also certainly much better than, say, the
> Linux kernel protects itself from local privilege escalations. Why is the
> same software sandbox too dangerous for use in kernelspace?

Objectively, no, it isn't better than the Linux kernel. And yes, it is too
dangerous. This is based on the hundreds of security bugs that I've been
involved with while working on Chrome, and the hundreds more that I wasn't
involved with, and the probably hundreds more that are hiding in there. Yes,
we take security seriously, and we are very sober about this.

~~~
geofft
Thank you for working on V8 and the WebAssembly implementation. :)

I still think that this is a case of knowing how the sausage is made. I've
operated public-facing Linux systems for many years (but I am neither a
kernel nor a V8 developer, so you know what you're talking about more than I do)
and the Linux kernel is ... not good. You take security seriously, and find
hundreds of bugs, and it scares you, which is great. The Linux kernel does
_not_ (remember the whole "security bugs are just normal bugs" thing, plus the
resistance to architectural improvements that kill bug classes, which as far
as I can tell you folks seem to be very excited about).

I would hope that you think that V8 + the associated Chrome sandbox (which, to
be fair, I think does not have an equivalent in this project) is secure
_enough_ to be exposed to random JavaScript / WebAssembly from malicious
parties on the internet running and updating 24/7, and keep things reasonably
safe, because a billion people do exactly that. I'm not saying it's perfect or
unbreakable - I'm just saying I definitely don't trust Linux to be secure
against random userspace from malicious parties on the internet running and
updating 24/7.

------
vitno
This is a cool idea I've seen increasingly talked about, but in practice? I
hope it never happens. Single Address space computing is a bad idea.

[https://www.cs.princeton.edu/~appel/papers/memerr.pdf](https://www.cs.princeton.edu/~appel/papers/memerr.pdf)

------
Asiasweatworker
So NaCl and PNaCl are not good enough to be comparable with WASM?

~~~
monocasa
NaCl relied on a lot of segment register tricks that are only available to
ring 3 AFAIK.

------
tree_of_item
Can someone explain what this is actually useful for, if you're not a kernel
developer? Would people in userland care about anything like this?

~~~
traverseda
Gary Bernhardt explains it better in this talk:
[https://www.destroyallsoftware.com/talks/the-birth-and-death...](https://www.destroyallsoftware.com/talks/the-birth-and-death-of-javascript)

But to summarize, jumps between kernel-space and user-space are expensive.
Instead of doing that, we can run a well-vetted interpreter in kernel-space,
and run "userspace" programs in kernel-space, in the interpreter.

This actually isn't slower (or so it is claimed), because a JITed interpreter
can be native speed on hot-code paths, and the inefficiencies for most
workloads are more than made up for by not having expensive syscalls.

So what you end up with is something that is about as fast as normal compiled
code for cpu-intensive workloads (maybe faster sometimes), much faster for
workloads involving a lot of syscalls, and interpreted languages like
python/javascript end up much faster as well, presuming they can take
advantage of the efficient JIT implementation.

Personally, what most excites me about this technology path is that it should
reduce the cost of interprocess communication to near zero. Combined with a
shared object model, and a capabilities system, it could be pretty awesome.
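
As a rough illustration of the summary above, here is a minimal sketch of software-enforced isolation, assuming a made-up stack machine (the `Sandbox` class, its opcodes, and the `double` host call are all hypothetical and have nothing to do with Cervus's actual code). The guest can only reach its own linear memory because every load and store is bounds-checked, and a "syscall" is just an ordinary function call into the host, with no ring transition involved.

```python
class Trap(Exception):
    """Raised when guest code breaks the sandbox rules."""


class Sandbox:
    """Toy stack machine with a private, bounds-checked linear memory."""

    def __init__(self, memory_size, host_calls):
        self.memory = bytearray(memory_size)  # the only memory the guest can reach
        self.host_calls = host_calls          # name -> callable, standing in for syscalls
        self.stack = []

    def _pop(self):
        if not self.stack:
            raise Trap("stack underflow")
        return self.stack.pop()

    def _check(self, addr):
        # Isolation comes from this software check, not from the MMU.
        if not 0 <= addr < len(self.memory):
            raise Trap(f"out-of-bounds access at {addr}")

    def run(self, program):
        for op, *args in program:
            if op == "push":
                self.stack.append(args[0])
            elif op == "add":
                b, a = self._pop(), self._pop()
                self.stack.append(a + b)
            elif op == "store":               # pops the value, then the address
                value, addr = self._pop(), self._pop()
                self._check(addr)
                self.memory[addr] = value & 0xFF
            elif op == "load":
                addr = self._pop()
                self._check(addr)
                self.stack.append(self.memory[addr])
            elif op == "call":                # host call: a plain function call
                fn = self.host_calls.get(args[0])
                if fn is None:
                    raise Trap(f"unknown host call {args[0]!r}")
                self.stack.append(fn(self._pop()))
            else:
                raise Trap(f"unknown opcode {op!r}")
        return self.stack[-1] if self.stack else None


program = [
    ("push", 3), ("push", 40), ("store",),  # memory[3] = 40
    ("push", 3), ("load",),                 # read it back
    ("push", 2), ("add",),                  # 40 + 2
    ("call", "double"),                     # hand the result to the host
]
vm = Sandbox(memory_size=16, host_calls={"double": lambda v: v * 2})
print(vm.run(program))  # 84
```

A program such as `[("push", 99), ("load",)]` raises `Trap` instead of reading outside the 16-byte memory. A real system would JIT-compile the guest code rather than interpret it like this, which is where the performance claims come from.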

