
Nebulet: A microkernel that runs WebAssembly in Ring 0 - lachlan-sneff
https://github.com/nebulet/nebulet
======
kibwen
The author has previously done AMAs on /r/rust:
[https://www.reddit.com/r/rust/comments/8j7y1f/i_am_lachlansn...](https://www.reddit.com/r/rust/comments/8j7y1f/i_am_lachlansneff_creator_of_nebulet_a_rust/)
and
[https://www.reddit.com/r/rust/comments/8bzex4/a_microkernel_...](https://www.reddit.com/r/rust/comments/8bzex4/a_microkernel_that_implements_a_webassembly/)

And yes, for those comparing this to The Birth And Death Of JavaScript: _"
After watching it for the first time, I realized that the joke idea that Gary
proposed could actually work, and work well! I'd say that that talk is
probably why I started writing Nebulet."_

~~~
ttflee
It should be renamed METAL sometime in the future, and we should start
implementing an open source shim named Cacao, before the nuclear war breaks
out.

~~~
lachlan-sneff
An earlier version of nebulet was named "Metal OS".

------
mrybczyn
I love it. God speed.

The idea of a multi-user time-sharing virtual memory operating system is
seriously outdated.

Today, the server OS could be based on a one user, multi computer abstraction,
with multiple gigs of memory, solid state storage, and reliable gigE+ network
as assumed essentials.

Entire sub-systems of the current linux kernel can be omitted.

Most of the VFS layer, with so many optimizations for spinning rust. Right
out.

Paging, swapping, shared objects, address randomization and PIC. Right out.

User access controls, file access controls, network access controls /
firewalls. Right out.

Of the 50,000 device drivers in the linux kernel, probably 50 deserve to be
supported.

For a workstation with graphics / GPU and console support, and every pluggable
external device support, maybe some sort of legacy emulation layer would work.
Basically run a linux VM for backwards compatibility with those 50,000
devices. Would be less work than implementing those drivers...

Remember, Linus was once a 19 year old with a dream, and a small repo of
prototype kernel demo code, 25 years ago.

~~~
nerdponx
As part of the lightly educated masses, this sounds good to me. Can anyone
more knowledgeable than me comment on the feasibility of something like this?

~~~
FooHentai
To the extent the OP took it, as a universal way that 'servers' could go, not
very feasible. Partly because we already HAVE what's being described, in the
form of hypervisors.

We then layer within it all that other stuff that's chopped away to bring us
back to a system that has a bunch of stuff we want, but most importantly that
supports a wide range of applications we want to deploy.

The big problem with stripping back to the absolute bare essentials is that
you optimize towards a local maximum, and severely limit your flexibility. This
is certainly the way you want to go if you have deep pockets coupled with a
need for bare-bones speed that can't be sharded in an effective manner. But
that's not the majority of workloads.

~~~
detuur
Honestly, the overhead of a modern OS, especially one like Linux that has been
fine-tuned for these sorts of workloads, is negligible. More importantly, any
gains you make by stripping out parts of the OS you don't need you immediately
throw away by going for a virtualised, sandboxed ISA.

If you want to squeeze more performance per watt than you can get from a
modern server, the only way forward is to code your application in Verilog and
to run it on an FPGA or ASIC.

~~~
vthriller
It's not only about performance per something though, see e.g. this paper
about boot time optimizations:
[http://oirase.annexia.org/tmp/paper.pdf](http://oirase.annexia.org/tmp/paper.pdf)

------
mhandley
So in light of Spectre, the Chrome developers don't believe it's safe to have
any sensitive data in the same memory space as V8, but WebAssembly is safe in
ring 0? What am I missing here?

[https://chromium.googlesource.com/chromium/src/+/master/docs...](https://chromium.googlesource.com/chromium/src/+/master/docs/security/side-channel-threat-model.md)

~~~
netsharc
"Normally, this would be super dangerous, but WebAssembly is designed to run
safely on remote computers, so it can be securely sandboxed without losing
performance."

I'm staring at this sentence hoping the author is being supremely sarcastic...

~~~
AgentME
I think they're saying that WebAssembly code doesn't lose any performance when
you sandbox it, _not_ that WebAssembly has equivalent performance to native
code.

------
xenadu02
He mentions Singularity as inspiration which is exactly what came to my mind.

This isn’t an unproven space. Singularity proved you can use a single global
address space (given 64-bit) and software to isolate processes - something MS
Research called Software Isolated Processes.

This requires a verifiable bytecode/VM system so the kernel can verify the
instruction stream at load time. In a way, WebAssembly is even easier to
verify than C#.

It’s obviously a research toy but that isn’t a bad thing.

~~~
kllrnohj
That was before spectre happened. Software isolated processes are now
basically impossible to do. All the CPU microcode updates & workarounds are
purely about fixing isolation between processes & rings that the CPU is aware
of.

~~~
kibwen
AFAIK, all the CPU microcode updates were for fixing Meltdown, not Spectre.
Unless and until CPU manufacturers find a way to fix speculative execution for
good, software-based mitigations would seem to be mandatory.

~~~
ori_b
The problem is that "fixing speculative execution" means "no longer making
things faster in any measurable way". Given that the caches are there to make
things 2 orders of magnitude faster, this is rather crippling.

The hardware based protections are the only boundary where you can "fix"
speculative execution without slowing down your system by 100x.

The hardware boundaries effectively act as hints that say to the hardware
"slowing down things by 100x here won't kill performance too badly -- we
should turn off speculation for safety". Without those hints, everything needs
to get that much slower.

~~~
andreiw
Considering the hardware mitigations aren’t some magic bullet, but are a “big
hammer” that cripples CPU OoO behavior in certain ways _for everything_,
software mitigations don’t seem that unreasonable. Of course, “software
mitigations” use CPU facilities as well, and can allow for making critical
sections safe. In practice, it is well understood what code needs to be
written in a spectre-safe way. The difficult question is how to write code to
anticipate the next 10 varieties of speculation attacks...

------
willsinclair
> Normally, this would be super dangerous, but WebAssembly is designed to run
> safely on remote computers, so it can be securely sandboxed without losing
> performance.

This seems like a pretty strong claim. I hope that it's true, but I'm not
going to be running WASM modules in ring 0 any time soon.

~~~
baybal2
WASM is a stack machine, which does not add to security. You get the sense
that no real VM specialist, nor anyone in the field of security, had a hand in
its design.

When I first read the specs, it screamed "VM design 101" to me. It feels like
someone's master's thesis more than a piece of production software. Just as the
original Netscape Javascript 1.0 was.

It will have its fair share of "typeof null"-style bugs to come.

~~~
bennofs
Why is a stack machine bad for security? The JVM also had sandboxed execution
as a goal and also uses a stack machine. But perhaps the stack machine was
chosen because it tends to produce smaller binaries (which is important for
things you send over the net) and not for security reasons?

~~~
baybal2
>But perhaps the stack machine was chosen because it tends to produce smaller
binaries (which is important for things you send over the net) and not for
security reasons?

Who knows what was in their heads, but stack-level attacks are as easy as
exploiting unsafe type casting in anything that amounts to a stack pointer.

My guess why they chose to do it that way is simply that there is more
literature available for mid-tier coders in the style of "VMs for dummies",
and they wanted the option of not doing extensive research on every small
matter, and to just copy the JVM's behaviour.

~~~
legulere
The stack in a stack-based VM does not refer to the real stack that contains
return pointers that can be manipulated. You don’t have access to that from
WebAssembly.

The security problems of Java are not related to it being a stack-based VM at
all. The problems are that the API lets applets do things they shouldn’t be
able to, and arbitrary code execution during serialisation.
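To make the distinction concrete, here is a deliberately toy stack-machine interpreter in Python (a sketch only, nothing like the actual Wasm spec): the operand stack is ordinary data managed by the interpreter, so the bytecode never touches the host's return addresses.

```python
# Toy stack-machine sketch: the VM's operand stack is a plain list.
# The host's call stack (with its return addresses) lives elsewhere
# entirely, so a program on this VM can't smash it the way native
# code can smash the hardware stack.

def run(program):
    """Interpret a tiny stack-machine program given as (op, arg) tuples."""
    operands = []  # the VM's operand stack -- data only, no code pointers
    for op, arg in program:
        if op == "push":
            operands.append(arg)
        elif op == "add":
            b, a = operands.pop(), operands.pop()
            operands.append(a + b)
        elif op == "mul":
            b, a = operands.pop(), operands.pop()
            operands.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return operands.pop()

# (2 + 3) * 4
print(run([("push", 2), ("push", 3), ("add", None),
           ("push", 4), ("mul", None)]))  # -> 20
```

Even a type-confusion bug in such an interpreter corrupts only the `operands` list, not a return address, which is the point being made about stack-based VMs above.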

------
Sir_Cmpwn
Obligatory:

[https://www.destroyallsoftware.com/talks/the-birth-and-death-of-javascript](https://www.destroyallsoftware.com/talks/the-birth-and-death-of-javascript)

~~~
carry_bit
If it's actually prophecy coming true, people might want to leave SF before it
becomes part of the exclusion zone.

~~~
microcolonel
It's already happened, it's just that the SF/MV overmind migrated to another
host and has resumed normal operation.

------
pcwalton
Linux already allows untrusted user code to be run in a ring 0 VM, via
BPF/eBPF. Honestly, given the choice between Web Assembly and eBPF to run in
kernel mode, I'd go with Web Assembly. The fewer secure VMs that have to be
audited, the better.

~~~
monocasa
To be fair, BPF/eBPF is far more designed for simple and obviously correct
verification. eBPF is a 10-register, two-address, load/store design, so on
both CISC and RISC platforms there's generally a 1-to-1 mapping between eBPF
asm and JITed asm. The in-kernel JIT doesn't even need a register allocator.
The control flow graph is constrained to be a DAG, so you can solve the
halting problem for these programs. Etc.

You're looking at at least a couple of orders of magnitude more work to get to
the same level of correctness for a WebAssembly runtime.
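The DAG constraint is easy to illustrate (a toy Python check, not the real in-kernel eBPF verifier): if the verifier rejects any control-flow graph that contains a cycle, every program it accepts trivially terminates, sidestepping the halting problem entirely.

```python
# Toy sketch of a DAG-only verifier: reject any control-flow graph
# with a cycle. A CFG is a dict mapping each basic block to its list
# of successor blocks. Iterative DFS with three-color marking; a
# back edge (reaching a GRAY node) means a loop, so we reject.

def is_dag(cfg):
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in cfg}
    for start in cfg:
        if color[start] != WHITE:
            continue
        color[start] = GRAY
        stack = [(start, iter(cfg[start]))]
        while stack:
            node, succs = stack[-1]
            for succ in succs:
                if color[succ] == GRAY:   # back edge -> cycle -> reject
                    return False
                if color[succ] == WHITE:
                    color[succ] = GRAY
                    stack.append((succ, iter(cfg[succ])))
                    break
            else:  # all successors done
                color[node] = BLACK
                stack.pop()
    return True

straight_line = {"entry": ["body"], "body": ["exit"], "exit": []}
with_loop     = {"entry": ["loop"], "loop": ["loop", "exit"], "exit": []}
print(is_dag(straight_line), is_dag(with_loop))  # True False
```

A Wasm verifier can't take this shortcut, since WebAssembly programs are allowed arbitrary loops, which is part of why the correctness bar is so much more work to reach.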

~~~
titzer
There has been a WebAssembly interpreter implemented in Isabelle [1] and it's
been proved correct. Not fast, but still pretty cool.

[1] [https://www.isa-afp.org/entries/WebAssembly.html](https://www.isa-afp.org/entries/WebAssembly.html)

~~~
monocasa
> Not fast

That's the kicker. BPF is simpler, _and_ within spitting distance of native
code perf. And if you're doing something crazy like injecting user code into
interrupts, you care about perf.

That being said, you're totally right that it's possible to get parity, just
with orders of magnitude more work.

------
kodablah
More context:

[http://lsneff.me/nebulet-booting-up/](http://lsneff.me/nebulet-booting-up/)

[http://lsneff.me/why-nebulet/](http://lsneff.me/why-nebulet/)

~~~
kowdermeister
"Nebulet is just a way to try new ideas in operating system design. It’s a
playground for interesting and exotic techniques that could improve
performance, usability, and security."

That's it. I hope he gets the support he needs.

~~~
empath75
> I hope he gets the support he needs.

That's what you say when someone checks into rehab, not releases a software
project :)

~~~
xena
Is there a difference?

------
IncRnd
> Normally, this would be super dangerous, but WebAssembly is designed to run
> safely on remote computers, so it can be securely sandboxed without losing
> performance.

This throws away the very important security property of defense in depth. A
system design should include interlocking levels of security, so even if there
is a vulnerability in one place, extra work may be required to exploit it.

~~~
pcwalton
It's no worse than what Linux already allows via eBPF.

~~~
IncRnd
So? Removing hw ring protections for all programs is not the answer.

You are actually making my point.

~~~
_wmd
Hardware protection rings are a holdover from a time when we didn't have
program formats that could be statically transformed and verified to do
exactly what the operator desires. They have a nonzero performance overhead
even when implemented in hardware, and in an ideal design should not be
required at all.

Re: your followup about defense in depth, this is a common and frankly boring
fallback argument. At some point computers had much less reliable internals,
and, for example, even the result of strlen() could vary across runs. Should
we also perpetually account for the presence of unreliable registers or memory
too?

~~~
kllrnohj
> when we didn't have program formats that could be statically transformed and
> verified to do exactly what the operator desires.

Which we still don't. Rowhammer, spectre/meltdown, etc... proved that even if
the code doesn't violate any sandbox constraints that doesn't mean it didn't
violate everything the sandbox was attempting to protect.

Hardware isolation is still very important and very necessary, now more than
ever.

~~~
_wmd
JFC. The reason the bugs you refer to are so important is because they
/spanned/ hardware protection domains.

~~~
kllrnohj
Yes, which means they also spanned software protection domains. Meaning your
software protection didn't work, regardless of how "verifiable" the bytecode
is.

And a few of those so far have no known software-domain fix, relying instead
on hardware domains (e.g. Spectre, which is why Chrome is pushing site
isolation so hard: they can't fix the software protection and are relying on
hardware protection instead).

------
exabrial
Ok I don't understand the reasoning behind why this is safe... The explanation
is pretty sparse.

~~~
jchw
The language is memory safe, so it is theoretically limited to the surface
area that the non-memory-safe OS libraries allow. Of course, that is assuming
there are no exploits in the WebAssembly compiler, and no exploits in the OS
libraries.

Access to syscalls would not be unusual in any case. This is the normal attack
surface for an OS. The new attack surface is the WebAssembly compiler and
checker.

To be honest though, given the processor vulnerabilities that have come about,
I don't know if I really feel so bad about software protections like this
anymore. Nothing is a panacea, even magical processor protection rings.

~~~
gnode
I think in light of recent processor bugs, there's also an argument that
software protection is easier to patch than hardware protection.

------
bryanlarsen
discussion on the similar project Cervus from earlier today:
[https://news.ycombinator.com/item?id=17184410](https://news.ycombinator.com/item?id=17184410)

------
kartan
This is what the Java Virtual Machine could have been but never was. Most
applications are so lightweight for today's computer capabilities that running
them over a virtualization layer is not a problem at all. As a developer I
want my applications to run everywhere without having to take into account
different OSs, architectures, etc.

You will still have native applications for a lot of situations where
performance is critical enough to justify the higher cost of creating,
testing, and distributing for different OSs. That is a skill that should never
die, like creating the hardware itself.

HTML5 is a good attempt at this, but it has problems with not-well-defined
behaviours, and the fact that not all applications, e.g. games, fit the
hypertext approach.

~~~
umanwizard
What problem exactly do you have with Java?

~~~
kartan
The Java virtual machine could have had better integration into OSs and
browsers, but its licensing model made it impossible. I worked with applets in
the browser, and they had a lot of potential, but it was probably too soon for
them, as computers were slower and Java was not so well optimized back then.

WebAssembly doesn't have this history, and it can be used in ways that Java
has been discarded from for historical reasons.

------
2bitencryption
just when i was beginning to feel adequate, i discover this is all done by a
highschooler :(

congrats to that kid, he's got a damn good head on his shoulders.

~~~
pgeorgi
highschoolers have one wonderful resource in abundance (compared to the
working population) and that's time.

~~~
lachlan-sneff
Yeah, no, we don't. I just find time for the things I love.

~~~
danaliv
Trust me - compared to working adults, you do. :)

~~~
jzoch
As someone who was in high school just a few years ago and am now working I
would argue that is definitely not true. High school can quite easily take a
lot more time than work, especially if you are an honors student or do any
extra-curricular activities. I would wake up at 6 am, be at school at 6:45am,
and leave school at 5pm most days of the week due to sports and class. Then I
would have 2-5 hours of homework to do. Now, working for a large company is
40-50 hours a week, 9am-6pm with little work after hours.

~~~
JacobGoodson
Depends on the job, some of us do a lot more than 40.

Getting married and having children will make you reassess your statements.
Heck, just getting married will. Of course, I am assuming that you are single.

------
haglin
Java Applets were supposed to be safe too.

------
phkahler
What's safer, running untrusted code at ring 0, or running it at a lower
privilege level? The answer should be obvious to anyone that's not dreaming of
some ideal world.

Ideals are great to strive for BTW, just don't lie to yourself about already
having gotten there.

~~~
roblabla
The point isn't to be safe, the point is to be both safe and fast.

~~~
phkahler
Lower privilege levels don't really convey performance benefits. I suppose
syscall overhead may impact some applications, but one should not be running
untrusted code where the performance bottleneck is accessing the system.

------
matachuan
This is an interesting idea, putting some userspace stuff back into the
kernel. But I wonder how much overhead you can actually save, as the
supporting argument is just one sentence.

~~~
vectorEQ
context switches / syscalls cost a lot of cycles to save and reload CPU
context. You basically need to push everything related to the context to
memory -- registers, flags, stack pointers, etc. -- then load the ones for
kernel space, perform some call, and switch back. So a lot of overhead is
saved, especially for small functions.

[https://wiki.osdev.org/Context_Switching](https://wiki.osdev.org/Context_Switching)
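A rough way to see the cost being described (a hedged micro-benchmark in Python; absolute numbers vary wildly by machine and by which meltdown/spectre mitigations are enabled) is to compare a loop of real syscalls against a loop of plain user-space function calls:

```python
# Crude comparison of kernel-crossing vs. user-space-only calls.
# os.getpid() enters the kernel on each call (glibc no longer caches
# the pid), while a trivial Python function stays in user space.
# This measures the whole round trip, not just the ring transition.

import os
import timeit

def user_space_noop():
    return 42

N = 100_000
syscall_time = timeit.timeit(os.getpid, number=N)
python_time = timeit.timeit(user_space_noop, number=N)

print(f"{N} getpid() syscalls:  {syscall_time:.4f}s")
print(f"{N} plain Python calls: {python_time:.4f}s")
```

Interpreter overhead dominates both loops here, so this understates the gap; a C version of the same comparison would show the syscall penalty far more starkly.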

~~~
matachuan
lol I'm well informed on this topic, but thanks anyway. Maybe I didn't make
myself clear in my previous post. I was questioning the motivation -- is there
any existing profiling work that shows a significant amount of time being
spent on context switches in these _particular_ workloads, and how much can
you save by adopting this approach?

~~~
eptcyka
Well, do you not recall the massive amounts of moaning that took place because
everyone's I/O-bound stuff (most webservers/databases) now works far slower
than it used to, after the spectre/meltdown patches were applied? If we could
get rid of the cache flush that occurs when a userspace process makes a
syscall, there'd be a lot less heat produced at your local data center.

~~~
matachuan
I agree with what you said, because everyone knows it's _massive_. All I
wonder is _how_ massive, quantitatively.

------
ixtli
I love this because I'm closer to being right about my claim that eventually a
restricted JavaScript will replace arch-specific assemblies >:D

------
vectorEQ
interesting project. it's not safe, but I definitely like this kind of
application of system software.

It might be interesting to try and run the webassembly in a VM ring 0 where
the ring -1 can do some security routines and checks on it, but that might be
beyond the scope and intent of this project.

------
alexandernst
Because what could possibly go wrong?

------
bigbluedots
This does not sound like a good thing for anyone to actually use in production
...

------
moonbug
Just reading that title makes me feel a little bit sick

------
childintime
Well, congratulations are in order, if this is the first step towards throwing
out most of the OS.

Face it. Linux has had its time; its conception of the world reflects a
developer's wet dream from the mainframe era, with many users, few resources,
and the soon-to-be-extinct dictator, err, universally hated system
administrator.

Who needs elaborate permissions when you're the only user on the system? Which
user still shares data without sending it, manipulating permissions? Who likes
managing installs and incompatible dependencies? Hell, just copy the data,
that's what we all do. The file-system dedups it anyways if necessary.

The list of unused and unwanted features goes on, but developers just keep
reincarnating this same old fantasy of an anachronistic OS. It seems to me
nobody but Linus can make them see again..

Linus, there's so much pain. What's the use case these days? Do you see Linux
going over its own horizon, how? I suspect your answer involves a server OS.

PS, this idea of course also reminds of Microsoft's Singularity OS.

~~~
umanwizard
Virtually all production servers in the world today are running GNU/Linux. The
substantial majority of smartphones run a Linux kernel. I wouldn't say Linux
has "had its time".

You may be right that the traditional permissions model is outdated, but it
represents only a very small part of what Linux does, and indeed it seems
totally unrelated to this project, so I'm not sure what point you're making.

To use the Android example again, it's proof that you can layer a very
different permissions model (single-human-user app sandboxing) on top of the
Linux kernel.

~~~
filleduchaos
Android uses the Unix permission model heavily - sandboxing is achieved by
creating a new user per app, and filesystem, network, etc permissions are
managed via user and group IDs just like with Unix.

So how is it "very different" (in implementation, not surface appearance)?

~~~
umanwizard
Surface appearance was my point -- it's possible to create a model that's very
different _from the user's perspective_ without changing the underlying model
of the kernel.

