Wasmjit: Kernel Mode WebAssembly Runtime for Linux (github.com)
214 points by varenc 4 months ago | 133 comments



The actual problem with this, after you have satisfied yourself about the numerous non-problems, is that practically any real app in wasm has calls to library code, implemented in C or C++ and compiled to real asm to implement the extended JavaScript runtime, that you would not want running in ring 0. To those are added all the kernel-mode functions that must be used correctly. (If the app can be provoked to use them wrong, boom.)

So, the plausible use is an implicitly trusted app not exposed to untrusted input, that is system-call bound, and that you want to distribute cross-platform, without recompiling, to targets that have this module installed in a known kernel version. It seems easier just to distribute source for a kernel module; or kernel modules built for the (2? 3 tops) architectures required.

[edit: spelling error.]


I'm not sure why linking third-party code is a concern; it all must live inside the sandbox provided by the runtime. This reduces the trust requirements from depending on a (software | hardware) solution, where the hardware portion cannot be audited and by 2018 can no longer be considered trustworthy by default, to depending only on access and bounds checks implemented by a single body of code visible to everyone.

The 'system call interface' available to those buggy libraries need be no more capable than the one that existed previously; the fact that system calls are serviced without switching protection mode need not extend the potency of any possible attacks.

A future implementation of this style of design will have much better security and auditability properties than anything that has ever popularly existed. It may have originated with the web, but I welcome wasm and all the market punch it promises to bring to bear on ancient and long-unquestioned corners of our software stack like this. (Naturally I was also a Microsoft Singularity fan.)


How does WebAssembly fix hardware issues? For example, take Rowhammer. I don't see how running your code in a VM fixes Rowhammer; surely there would still be some mapping of data in wasm to RAM, and I don't see why a sufficiently motivated individual couldn't determine and subsequently exploit that mapping.

Or what does WASM do about the concerns that major CPU vendors embed a second co-processor running untrusted, unauditable code with ring -1 access alongside the main CPU?¹

¹which may or may not be active, depending on CPU model and configuration. I've never really fully understood when it is, or isn't, but I think it's somewhat moot.

Further, this assumes the runtime (which is not just the WASM runtime: system calls and other functionality will inevitably need access to the actual, underlying hardware in order to do their job) is free of bugs, and since the code is now in ring 0, you get all the consequences of that. (Full, unrestricted access to everything. At least previously, you'd need to find some root exploit to get that.)

(Also, as another poster hints below, since everything is running inside the same protection layer from the CPU's point of view, what prevents you from just running the same attack, in spirit, as Spectre? Only now, there is no protection — everything is ring 0 — so the CPU is "correct" to speculate a read. Sure, the VM will deny it to the code inside the VM, but that didn't matter in the case of Spectre?)


> since the code is now in ring 0, you get all the consequences of that

No you don't, because WASM does not provide many of the facilities that real assembly provides. So, for instance, there is no way to stack-smash using WASM instructions, regardless of any security problems in the code itself, or even outright malicious code. It just can't do it.

More generally for VMs, there are secure VMs which provide a mathematical proof that the code will, for instance, observe memory safety. Such a proof is much better than ring-level protection, because:

1) You can verify the proof. Good luck verifying the hardware implementation of ring-level security in processors

2) It doesn't take any resources at runtime

3) The proof can be verified at code load time, so insecure code (accidentally insecure or otherwise) just doesn't start executing, ever

4) It doesn't allow for manufacturers to hide "secure coprocessors" or any other bullshit like that


RowHammer most certainly works in a VM:

https://github.com/IAIK/rowhammerjs


> I'm not sure why linking third party code is a concern, it all must live inside the sandbox provided by the runtime.

Some runtime stuff implements operations not expressible in JS, and not permitted in the sandbox. You may not want that code in the kernel. Anyway, if random kernel functions can be called from the sandbox, it would be easy to misuse them. Untrusted input might trick code into misusing them from inside the sandbox, via nominally valid operations.

Does wasm protect against integer overflow, or unchecked unsigned wraparound? Not all exploitable bugs are pointer violations.
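
(For illustration, a minimal C sketch of the kind of bug that stays invisible to the sandbox; the names are hypothetical. The wraparound and the resulting overwrite happen entirely inside wasm linear memory, so no bounds trap fires until, at best, the copy runs off the end of the module's own pages.)

    #include <string.h>

    /* Hypothetical parser: the length check is defeated by unsigned
       wraparound. With len = 0xFFFFFFFF, len + 1 wraps to 0 and passes
       the check, and memcpy then smashes module data far past dst,
       all within the module's own linear memory. */
    void copy_packet(unsigned char *dst, const unsigned char *src,
                     unsigned int len) {
        if (len + 1 > 64)   /* meant to enforce len <= 63 */
            return;
        memcpy(dst, src, len);
    }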


> "it all must live inside the sandbox provided by the runtime."

Processes with a strong user model are already one of the most effective isolation mechanisms in an operating system. Continually hardening a VM/runtime that was ported into the kernel, in my opinion, results in either building a microkernel or recreating the userspace/kernel separation that already exists.


The parent comment refers to running buggy code in ring 0; however, in a design like wasmjit, that code would be compiled first to wasm, and the actual code running in-kernel is a derivation that includes bounds checking on at least memory operations. The original code never "runs" in ring 0; what runs is a derivation that includes a software equivalent of the hardware isolation we're used to and have relied on for the past 20 years.

While I haven't studied wasmjit, the most obvious implementation is to run system calls naturally, with a global "struct task" existing as before that defines the semantics of the current context, including details like the current UID and capability mask. In other words, without effort, read() and open() could be made to behave identically to before; it's just that the caller now lives in a software sandbox rather than a hardware sandbox, and most or all expensive hardware reconfiguration is avoided.


I read through it and this is indeed how it works. Wasm as a language provides the restrictions that the MMU provides for machine code. You can't read from/write to arbitrary pointers in wasm.


WASM does not do bounds checking on pointer data.


It does


It does not. Something like

    #include <stdio.h>

    char buff[100];

    char func(int idx) {
      char *ptr = buff;
      return ptr[idx];
    }

    int main(void) {
      printf("%c", func(200));
      return 0;
    }
Compiles nicely to something like this

    (module
     (type $FUNCSIG$ii (func (param i32) (result i32)))
     (import "env" "putchar" (func $putchar (param i32) (result i32)))
     (table 0 anyfunc)
     (memory $0 1)
     (export "memory" (memory $0))
     (export "func" (func $func))
     (export "main" (func $main))
     (func $func (; 1 ;) (param $0 i32) (result i32)
      (i32.load8_s
       (i32.add
        (get_local $0)
        (i32.const 16)
       )
      )
     )
     (func $main (; 2 ;) (result i32)
      (drop
       (call $putchar
        (i32.load8_s offset=216
         (i32.const 0)
        )
       )
      )
      (i32.const 0)
     )
    )
Which will gladly blow up, or not, when i32.load8_s gets called.

https://webassembly.studio/


The load8_s instruction will check that the computed offset into the memory of size 64KB (1 wasm page) does not index past the bounds of 64KB. If it does, the program will trap.
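
(To make that concrete against the example above, assuming the default single 64 KiB page, and recalling that the wat places buff at offset 16:)

    /* With one 64 KiB wasm page: */
    func(200);     /* offset 216: inside the page, reads garbage, no trap */
    func(70000);   /* offset 70016: past 65536, i32.load8_s traps         */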


Which will not work on the provided example, thus leading to memory corruption.

Where is the WebAssembly implementation that traps on my example?


WebAssembly protects the host from memory corruption by the user module. To do this it does a bounds check before executing the load. It does not protect a user module from itself. It does not change the semantics of C.

Relevant documentation http://webassembly.github.io/spec/core/syntax/instructions.h...


> It does not protect a user module from itself. It does not change the semantics of C.

Which is my whole point: WebAssembly does not protect against memory corruption inside the module code, which allows for security exploits anyway.

In my sample code, if I expose func() to the host and it gets called with 200 as the parameter for a buffer of 100 bytes, no trap will occur.

In a real use case that call might induce internal memory corruption that will, for example, change the behavior of other functions exposed to the host.

If you wish I can provide an example of how to do that, which you can try out in your favorite spec-compliant WebAssembly implementation.


The current WASM spec is an MVP. There will be other ways to expose functionality to the host that should be safer. See the future proposals. Your points are valid though.


How is this relevant to wasmjit? User space programs written in C can already corrupt themselves. As far as I can tell there is no new inherent risk to kernel stability by running wasm code in kernel space as long as the wasm spec is followed. Just like wasm programs aren't able to corrupt the browser sandbox in which they run.


The sandbox gives a false sense of security, because it opens a new attack vector.

Apparently you fail to understand how security exploits work.

For example, let's say I have an authentication module provided as WebAssembly, written in C.

The browser makes use of said WebAssembly module to authenticate the user.

Now we mount a cross-site scripting attack that calls the WebAssembly functions in a sequence that triggers memory corruption inside the module, thus influencing how the authentication functions work.

Afterwards, the JavaScript functions that call those WebAssembly ones might authenticate a bad user that would otherwise be denied access.

A contrived scenario that can be easily programmed in https://webassembly.studio/ .
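
(Roughly, a sketch of that contrived module in C; the names are hypothetical, and it assumes the compiler lays the two globals out adjacently in linear memory.)

    /* Adjacent module globals in wasm linear memory (layout assumed). */
    static char username[16];
    static int  authenticated = 0;   /* sits right after username */

    /* Exported to JS: no bounds check on n, so n > 16 spills into
       'authenticated' -- in-module corruption, no wasm trap. */
    void set_username(const char *src, unsigned int n) {
        for (unsigned int i = 0; i < n; i++)
            username[i] = src[i];
    }

    /* Exported to JS: the host trusts this answer. */
    int is_authenticated(void) {
        return authenticated != 0;
    }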


I'm still not sure how this is relevant to wasmjit?

Your criticism is for another layer. If you don't like C, use Rust or Go. They also compile to wasm.


This is a moot point: the entire kernel is written in C. Since all code running on your computer involves kernel code running in ring 0, by your logic that would imply something is wrong, but it's not. System calls verify all input before executing.


And given the Linux Kernel Self Protection Project's last report, there is still lots of room for improvement.


I have seen at least three projects to run WebAssembly in the kernel, but not a single strong attempt at a good userland binary format for WebAssembly. I don't want to run WebAssembly in my kernel; I just want an architecture-agnostic binary format for Linux executables.


This is a shortcoming in general. There should be a binary format targeted for JIT-able intermediate representations. Or just a specification on how to put IR code inside ELF files.

Now we have (at least) three different IR formats that are very similar but have slight differences:

1. LLVM IR. Created for LLVM internal use, may only work with specific LLVM version(s) (correct me if I'm wrong here), and is not portable across HW architectures.

2. WebAssembly. Made for web browsers.

3. SPIR-V. Made for OpenCL and Vulkan shaders/kernels.

All of the above have a specification, a binary representation, a text representation and associated tooling. Some (or most?) of the tools seem to operate by doing a source level translation from WASM or SPIR-V into LLVM, then doing optimizations or JIT and possibly converting the resulting LLVM IR back to WASM or SPIR-V. In my experience, the tooling with WASM and SPIR-V is inferior to LLVM IR tools or native code tools (binutils etc).

There's a lot of duplicate work going on here, but I understand that it's difficult to get the stakeholders of Web browsers, GPU APIs and compiler infrastructure to even discuss what could/should be done.

Compiler IRs suitable for AOT compilation (i.e. all the code is compiled before it's executed, although that might happen at runtime) are a game changer in how we deal with programming languages and runtimes, and a huge improvement on Java-style bytecode+JIT, which never quite delivered what it promised.


A game changer that started in 1961.


Actually, you kinda already have it via QEMU, which is capable of doing syscall translation. Just compile your code for Linux on any of the supported target archs and use QEMU User Mode Emulation to launch it.


> I just want an architecture agnostic binary format for Linux executables.

What would you use that for, if you had it?


I guess, like IBM i, IBM z and Unisys ClearPath: have a computer system where you can change whatever you feel like at the CPU level and have only the kernel take care of the underlying changes.

Universities should teach more about mainframe architectures.


So you don't have to make different binaries for different architectures? Pretty obvious I would have thought.


I don't even think ELF can be architecture agnostic without doing some hideous hacks (aka 'fat' binaries).


It can because you'd just add Wasm as a new machine type. That would be intrinsically architecture-independent because... it is.


Emscripten seems like a good start, the tooling already exists.


Increasingly I’m becoming convinced that something like this may become the backbone of future computing.

Imagine truly cross platform code that performs well, and can be written in whatever language you care for.

The web illustrated certain advantages to developing software that can be easily delivered and run on other systems. However, the learnings of the web were always coupled to the browser.

WebAssembly, through projects like this, may soon be able to empower cross platform code that delivers small, well performing programs on whatever platform a user desires to use.


Well, the JVM was supposed to be this, but turned out to be too closely tied to the Java language, as well as too heavyweight for a lot of webpages (getting replaced by HTML5 instead). It's interesting to see us coming full circle with WebAssembly, albeit with a much more language neutral, multi-vendor framework.


I was going to bring up the JVM because it definitely was an attempt at the same goal. I think the obstacles this time around will be very similar for WASM, but it stands a far better chance due to its language and vendor neutrality, as you mentioned.

For user-facing applications there remains the question of what sort of cross-platform UI framework could accompany WASM. Or should WASM just remain coupled to the Web? Java's UI solutions were often received very poorly and contributed to Java's bad reputation in consumer software.

Analyzing the desires and goals of the big players (Google, Microsoft, Apple) with regards to how Wasm might develop is interesting.


> Well, the JVM was supposed to be this, but turned out to be too closely tied to the Java language, as well as too heavyweight for a lot of webpages

Agreed, but I want to add, regarding being tied to Java the language: the stdlib it carries around is a large burden too. WASM will need a stdlib one day or suffer portability problems, so here's hoping whichever one is adopted is small and simple (no, POSIX, which Emscripten and this library work with, is not good enough).


As we've seen with JS, an anemic standard library is also a big problem. Hopefully npm-isms don't infect WebAssembly, but I don't hold out much hope.


Why does WebAssembly need its own standard library? No one is going to be writing code directly in it, any more than they would write Java bytecode by hand.


I can't even write a function accepting a string because there is no notion of string. Every piece of WASM written carries the runtime of the source language, making sharing code much more difficult and bloating binaries with duplicate logic.

It doesn't need one stdlib as part of the WASM project, but sooner or later people are going to want to share, e.g. a socket. Even without a full blown stdlib, known data types need to be shared beyond just the structs that are coming with the GC proposal.
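
(Today the working convention is to pass a (pointer, length) pair into linear memory and let each side agree on the encoding; a minimal sketch with hypothetical names:)

    #include <stddef.h>

    /* Exported wasm function: the host hands over a "string" as an
       offset into linear memory plus a byte length, because wasm
       itself has no string type. Each side must agree on the
       encoding (UTF-8 here, by convention only). */
    int count_spaces(const char *buf, size_t len) {
        int n = 0;
        for (size_t i = 0; i < len; i++)
            if (buf[i] == ' ')
                n++;
        return n;
    }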


That's fair; even C has chars and char arrays. I don't know about strings, since every language has its own idea of what a string is. Even something as simple as an "array" has different behaviors between languages... some languages only have maps that also act like arrays, some index from 1 rather than 0.

If Webassembly wants to remain as agnostic as possible, I don't think there's much more complexity they could add.


Being agnostic is a good goal, but being useful is more important.

Hopefully they get it right, because the JS ecosystem is a bloated mess that we've been stuck with because there's no other game in town.


I believe we would miss it dearly if wasm gains significant traction.

I wouldn't like running closed binaries with monolithic architectures through my browser. Maybe for a few selected high performance applications.

Not because of the security aspect, but because I think it will be detrimental to an open web. Maybe not, we'll see.

edit: I don't really understand people who are not horrified by the thought of replacing general JavaScript code with code written in C/C++... Because string operations are awesome in those languages, right?!


The point of webassembly isn't to replace general javascript code but to be a better compile target for C/C++ etc than javascript, and to optimize performance critical parts of javascript when necessary.

Plenty of people want it to be a complete replacement for javascript, but I don't see that happening. I certainly don't think the doomsday scenarios that sometimes get mentioned whereby the entire web becomes nothing but closed, DRM ridden WASM binaries is likely. Large corporate owned sites and streaming services will almost certainly leverage it as much as possible, though.

"closed binaries in the browser" have existed since the days of Java applets and Flash. And no one is stopping anyone from providing or distributing the source for their webassembly modules, so I don't see webassembly as being any less free in that regard than any other language that compiles to a binary. Arguably, webassembly is more free because it isn't owned by a company or restricted to a single language.


> have existed since the days of Java applets and Flash.

That is what I was thinking about, and it is something I wouldn't want to go back to. And having JavaScript as a compile target would net you more readable and therefore more open code. Painful process, sure.

But yeah, for specialized applications WebAssembly could very well be worth it.

Javascript and C for example are like completely distinct circles of friends and you don't necessarily want them to meet each other.


The JVM as a kernel module would be interesting

I'm not sure where the hate for this comment is coming from, though, given the submission is also talking about adding a runtime into kernel space.


Sun did actually implement a Solaris kernel JVM.

"Writing Solaris Device Drivers in Java"

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92....


Both Burroughs/Unisys and IBM had the same idea about the future of computing in the late 70's and went on to build the AS/400 and the Burroughs large systems, both based on kernel-space JIT-compiled "bytecode" (with significant differences in the granularity of what gets compiled on first use and in how the compilation results are cached; for IBM i5/OS it essentially amounts to an on-demand AOT compiler), which is the reason these systems currently run on commodity hardware (i.e. Power and Xeon).


And the AS/400 used to be a huge success.


We are going in circles here; this is how mainframes started in the early 60's: bytecode with microcoded CPUs.

Notable examples: Burroughs B5500, Xerox PARC workstations, ETHZ Oberon/Modula-2 workstations, UCSD Pascal, ..., iOS bitcode, Android DEX and UWP MSIL.


Operating systems don't work like that. You could program something high-level, but in the non-web world you need to ensure data is written to storage, and syscalls are not portable even between architectures.


For WebAssembly programs to work syscalls and storage stuff would have to be wrapped so that each platform has its own implementation with the same interface.

This isn't that strange of a concept, it's how stuff like Java has worked, but for it to truly feel native the companies that control the platforms would have to coordinate on it and integrate it. Unfortunately I don't see that happening quickly, but perhaps eventually.


It would never work: how do you emulate real capabilities on, say, Linux, or IOCP, etc.?


Genuinely ignorant question, what's the difference with C from this perspective? I thought the whole point of C was to be a common language.


C has to be recompiled per platform.

The advantage to something like using WebAssembly is you could write a program in C, compile it to WebAssembly, and then it could be run on anything that supports WebAssembly.

This could enable a future where you write one program that can run in a web browser, as a phone app, as a desktop program. All with only tweaking the UI.

Additionally the divide between a “web app” and a desktop app from a user perspective increasingly comes down to: do I need this to run fast (as a native program), or do I want it to be convenient to use (as a web app)? WebAssembly may very well become fast enough that it is fast and convenient for all but the absolute most demanding of applications.


WebAssembly's closest analog would be LLVM's IR, which many compilers like clang (for C/C++) or rustc output so that they can use LLVM's existing codegen. In WebAsm's case, however, the backend would be the different browsers and their sandboxes.

The C ABI describes a common calling convention and a few other details to allow different compilers to use each other's binaries, but C is hardly a common language in the way WebAssembly is meant to be.


Binary compatibility across architectures and the absence of undefined behaviour are the primary ones.


Yes, of course; here is an example of well-defined, no-UB == overloads.

https://slikts.github.io/js-equality-game/

It is all bound to make JS kernel programming a Pillar of OS Stability.


You're confusing JS with wasm and undefined behavior with confusing behavior. JS and wasm are strongly defined and the specifications consider undefined behavior a bug.


None of that applies to web assembly, as I'm sure you know.


Does that apply to web assembly?


Nasal demons


I agree 120%.

It's weird nobody seems to really complain about platform lock-in, which is the main problem created by software companies and the one that creates so much pain for developers.

Even if one looks at the software market, I really doubt that lock-in works at all. Developers quickly figure out ways to make things run multi-platform.

It's no surprise JavaScript grew to be so popular, because it was just 100% multi-platform. It's quite sad to realize JS has so many drawbacks.


Attempts at lock-in just seem to delay the inevitable and make software worse in the meantime.

For companies like Microsoft and Apple those delays have been enormously profitable, but long term there may be a hefty cost from their avoidance of truly great cross-platform software development tools.


I stumbled across WasmJIT about a year ago and actually used many of the concepts in the project for my own kernelspace implementation of the portable native client (pNaCl). Projects like these are extremely interesting to me because they offer the possibility of a completely architecture-neutral userspace that can be used across multiple devices without recompilation.


This goes a step further and could offer an architecture-neutral kernel space. If part of loading the module includes validation and external memory access is checked as in browser implementations this could even lead to safer/less exploitable kernel modules with more well-defined failure modes.


The problem I see with this is that there are still no 8-bit or 16-bit data types (or bitfields, for that matter) in the wasm spec, which are kind of useful for interfacing with hardware and for protocols (Ethernet, TCP/IP, etc). Sure, you can pack and unpack them with masks and shifts (which I assume clang, LLVM, GCC, etc. do), but it seems awfully inefficient when the resulting code ends up running on x86 or other architectures with native 8/16-bit instructions.
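
(The pack-and-unpack in question looks something like this in C; a compiler lowering it to 32-bit-only operations emits the shifts and masks directly:)

    #include <stdint.h>

    /* Read and write the third byte of a 32-bit header word using
       only 32-bit operations -- the shift/mask pattern that sub-word
       fields lower to when no 8/16-bit types exist. */
    static inline uint32_t get_byte2(uint32_t word) {
        return (word >> 16) & 0xFFu;
    }

    static inline uint32_t set_byte2(uint32_t word, uint32_t b) {
        return (word & ~(0xFFu << 16)) | ((b & 0xFFu) << 16);
    }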


Unless I'm missing something, this should be pretty easy for the WebAssembly working group to fix. At a guess, they probably just need some clear decisions around how to handle endianness issues. But right now they're just focusing on the browser use case.

If kernel module developers become a vocal part of the wasm ecosystem, I'd expect 8- and 16- bit data types to get added before long.


> they probably just need some clear decisions around how to handle endian issues

They are handling endian issues by simply requiring little-endian.


I wonder if the pattern is common enough and simple enough that wasmjit could substitute the load-shift-and/shift-or-store sequences with direct 8-, 16-, and 32-bit loads and stores.


Wasm has efficient 16/8-bit load/store instructions. 16/8-bit data types are irrelevant on x86 since it internally uses 32-bit registers anyway.


Huh? x86/x86_64 has overlapping sub-registers that can be used as 64-bit (rax), 32-bit (eax), 16-bit (ax), and the upper and lower 8 bits of the latter (ah, al). Not to mention I/O ports, which use 16-bit addresses and allow for 8-, 16-, and 32-bit accesses, and various internal processor data structures that have 8- or 16-bit wide fields, among other uses.


Performing operations using the sub-registers is no faster than using the 32-bit analogues. Benchmark an add of eax,edx vs al,dl in a loop.


Projects like these are extremely interesting to me because they offer the possibility of a completely architecture-neutral userspace that can be used across multiple devices without recompilation.

It makes one wonder what would have happened if the Micro/370 had gone forward and the architecture-neutral userspace had evolved in the PC space.

http://www.cpushack.com/2013/03/22/cpu-of-the-day-ibm-micro-...


My mind boggles at the meta-ness of using this software to (direct from the docs) speed up the operation of a webserver compiled to webassembly.

The paranoid security engineer in me screams "Ring0 is what I always wanted my browser to have access to".

Interesting project!


Do not worry, it will be secure. Signed, with verifiable checksums, and GDPR-friendly.

Just, erm, load these binaries provided by FB, Google, Microsoft and Netflix. :P


My first thought was mental yelling about keeping kids out of my kernel space. It's been a couple minutes and that has settled down to a grumbling about people sacrificing security in the name of performance again.


Upon inspection of the code, it seems like this is something that could be used by browsers as well...


This neuters a bunch of the new Spectre and Meltdown mitigations, which are required if you want ring 0 to be a security boundary.


How so? WebAssembly can't address memory outside of its sandbox


The whole point of Spectre is that you can indirectly access memory outside of your "sandbox" even if you don't have permissions to those pages, using various side channels.


Aren't those side channels dependent on addressing kernel memory that you don't have permissions for? You can't address kernel memory in WebAssembly, therefore you can't provoke the speculator into caching those pages.


I believe Spectre is driven by processor speculation into a bounds-checked array access, even for when the condition fails (that is, the index is out of bounds). One can get to arbitrary memory in this way using the right index on any array.


Why would you need traditional bounds checking with Wasm?

Just use MMU hardware to insert a 2GB [0] no man's land below and above Wasm memory and only allow signed 32-bit indexing (-2^31 to 2^31-1).

This way the attacker can only read sandbox memory (and the useless 2 GB no man's land, mapped to pages full of zeroes or whatever).

When there's no speculation involved or the data is simply out of "speculative range", Spectre is toothless.

[0]: 2GB is just a basic example. More may be required if for example something like x86 SIB (Scale Index Base) is used for multiplying the index by 2, 4 or 8.
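
(A rough sketch of that reservation trick on 64-bit Linux, with illustrative sizes and a hypothetical function name; everything is virtual address space, and the guards never receive physical pages:)

    #include <stdint.h>
    #include <sys/mman.h>

    #define GUARD    (2ULL << 30)   /* 2 GiB no man's land on each side */
    #define WASM_MAX (4ULL << 30)   /* full wasm32 index range */

    /* Reserve guard + memory + guard with no access rights, then open
       up only the initial wasm pages in the middle. Any 32-bit index,
       signed or not, speculated or not, lands inside this reservation. */
    void *alloc_wasm_memory(size_t initial) {
        uint8_t *base = mmap(NULL, GUARD + WASM_MAX + GUARD, PROT_NONE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED)
            return NULL;
        if (mprotect(base + GUARD, initial, PROT_READ | PROT_WRITE) != 0)
            return NULL;
        return base + GUARD;        /* this is wasm "address 0" */
    }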


"Just" is never a good word in a technical discussion, especially around security vulnerabilities like Spectre.

That's a good idea, but there's also several reasons that may not be appropriate:

- WASM explicitly says that it may be extended to 64-bit indexing (more than 4GB of addressable memory is definitely useful for some things)

- Spending 4GB of (hopefully, virtual) memory on every WASM instance may be undesirable or impossible (e.g. 32-bit processor)

That said, it's very reasonable to impose restrictions on things running in ring-0, and wasmjit could well require a 64-bit machine with 32-bit WASM indices (which I imagine would be okay assumptions for things one would do with it anyway).


> - WASM explicitly says that it may be extended to 64-bit indexing (more than 4GB of addressable memory is definitely useful for some things)

In that case, just fall back to bitwise AND index clamping. A small performance penalty, but nothing major.

> - Spending 4GB of (hopefully, virtual) memory on every WASM instance may be undesirable or impossible (e.g. 32-bit processor)

Just page table entries. Wasting physical memory for that would be pointless. If the entries need to be mapped, on x86-64 it'd incur 4 kB, 2 MB or 1 GB total "wasted" memory, depending on which page size granularity you want to use. Of course, you could also simultaneously use this "wasted" memory for any non-sensitive data.

Well, mapping 2x 2GB memory using 4kB pages does take up hmm... 8 MB of RAM for the PTEs. So perhaps 2 MB pages would be optimal.


> In that case, just fall back to bitwise AND index clamping. A small performance penalty, but nothing major.

Masking the index will break code that is actually using the larger address space: running true 64-bit WASM code (as in, using >4GB of space) won't work, which is what I was referring to.

> page table entries

Indeed, hence the reference to virtual memory. In any case, because both x86-64 and ARM64 only have 48 bits of actually addressable space, that 4GB of overhead (plus, up to 4GB of actual addressable memory) only allows for 65536 (or half that) WASM instances. That's definitely a large number, but not one that is out of reach.


> Masking the index will break code that is actually using the larger address space: running true 64-bit WASM code (as in, using >4GB of space) won't work, which is what I was referring to.

You can also clamp at, for example, 33-37 bits, giving an 8-128 GB array range.


You can do masking the same way Linux does it. It prevents "bounds check bypass" without using an explicit size:

    cmp %bound, %ptr
    jae bad_ptr
    sbb %mask, %mask
    and %mask, %ptr
Just two extra instructions. No need to memory map or hard code the size of bounds.

See `array_index_mask_nospec` in https://github.com/torvalds/linux/blob/master/arch/x86/inclu...
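
(The kernel also carries a portable C fallback of the same trick; roughly, modeled on its generic version, with a hypothetical name here:)

    #include <limits.h>

    /* Modeled on the kernel's generic array_index_mask_nospec():
       returns all-ones when index < size, all-zeroes otherwise,
       computed without a branch the CPU could speculate past. */
    static inline unsigned long index_mask_nospec(unsigned long index,
                                                  unsigned long size) {
        return ~(long)(index | (size - 1UL - index))
               >> (sizeof(long) * CHAR_BIT - 1);
    }

    /* Usage: on the misspeculated path idx becomes 0, a safe index. */
    /*   idx &= index_mask_nospec(idx, table_size);                  */
    /*   value = table[idx];                                         */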


> Just two extra instructions. No need to memory map or hard code the size of bounds.

Pretty neat idea! [Although the (register) dependency chain looks a bit nasty. 'and' will need 'sbb' to commit, and 'sbb' will need to wait for 'cmp' to commit (flags register). But I guess the few/rare cases where this latency is really an issue can be dealt with on a one-by-one basis.]

> No need to memory map

Well, using MMU can have performance benefits. Less repetitive bounds checking code and better performance in most scenarios. Both solutions have their strengths and issues, there are no silver bullets.


Good point on the MMU performance advantage and the trade-offs involved. When everyone's heads were on fire, it made sense to indiscriminately mask off user-controlled pointers. Now that the dust has settled a bit, I imagine we'll see more usage of memory-mapping tricks in performance-critical sections.


And now you're limited to 2048 WASM instances in a single address space, purely because of virtual memory overhead. To be clear, I think the idea is very neat, but, like most things, comes with a variety of trade-offs that should be reasoned about rather than papered over.


Oh my sweet summer child.

You don’t need to use one exploit if two that interact are common enough.

How about a nice pointer arithmetic bug in any reasonably common version of the webasm runtime?


Let's assume that the runtime is bugless. What security concerns would still be there, and why?


There is no bounds checking inside WebAssembly itself, as it lacks the concept of fat pointers.

So you can still trigger memory corruption if the WebAssembly was generated from languages like C and C++.


Again, not in any way that can cause instability to the kernel differently from running the same program in user space.


As a community we should strive for more secure software instead of hand-waving away security issues.


I never argued that it shouldn't, just that the issues you're raising aren't relevant to the security of running wasm in kernel space.


I don't know why you are being modded down.

It's indeed possible to avoid ever generating speculative fetches to sensitive data by simply ensuring the indexed access can't ever reach outside the sandbox in the first place, not even in any possible speculated case.

Just don't use conditionals and branches to do that, but something else, like using the MMU to move data out of range, or bitwise AND clamping.


Related to this, the wasm generated by Go is currently tied to being executed in the browser.

There doesn't yet seem to be an implementable standard defining what a non-browser execution environment looks like. The beginnings of a common spec are here though, which some places have started working with:

https://github.com/CommonWA/cwa-spec

That'll likely need to make its way into the official specs:

* https://github.com/WebAssembly/design

* https://github.com/WebAssembly/spec

At which point implementations will have something to focus on. :)

For Go, this is the (recent) matching issue:

https://github.com/golang/go/issues/27766

Kind of guessing it'll turn into a tracking issue to get it done.



Except, WebAssembly isn't even close to Javascript. They're completely different languages. WebAssembly is closer to C than to Javascript.


On the surface level, sure. However, it's mostly just a lower abstraction way of accessing largely the same JIT. I'm pretty sure browsers supporting WebAssembly are doing so by reusing most of what they already have. And if you dig deeper, this was almost certainly inspired by tools like Emscripten and the Asm.js concept. After all, Asm.js accomplished a similar goal to WebAssembly, at the end of the day; Wasm is a cleaner, higher performance, less backwards compatible way of doing largely the same thing.

JS already came unhinged from the browser pretty thoroughly. I think when it comes to Wasm it's almost as much about what it doesn't have as what it does have. The lack of DOM bindings and a GC makes it much more suitable for hosting in more environments, like the kernel.


WebAssembly is basically just a binary encoding of asmjs, which is the subset of javascript discussed in the talk.


While asm.js was basically just a textual encoding of C in JavaScript... round and round we go! :)


I'd say it's more a textual encoding of LLVM IR. Which makes the s-expression text format of WebAssembly a text encoding of a binary encoding of a JavaScript encoding of a compiler intermediate representation of your program. Round and round indeed.


WebAssembly is a statically-typed language which passes values around using a stack. It’s very different from asm.js.


Yeah, but it doesn't have "Javascript" in the name, which makes it automatically better by way of bypassing everyone's irrational hatred of JS


Not really, it was what everyone was hoping would happen for ages. Remember PNaCl?


>>> results in a significant performance increase for syscall-bound programs like web servers

Reminds me of "Why we use the Linux kernel's TCP stack" and the eternal debate over the performance benefits of "kernel bypass" and "zero-copy" technologies such as DPDK versus full userspace TCP implementations like OpenOnload.

https://blog.cloudflare.com/why-we-use-the-linux-kernels-tcp...

The conclusion is that the untapped fruit in kernel TCP performance comes from tuning the network stack itself: optimizing the overhead a packet incurs on its path through the stack. A very active area of development.

Netdev 0x13, THE Technical Conference on Linux Networking

https://netdevconf.org/0x13/


The recent work on AF_XDP seems like it could become an established middle-ground sweet spot, almost as efficient as kernel bypass and userspace networking.


"Lua support in the NetBSD kernel" (2013) https://news.ycombinator.com/item?id=6562611


Long ago someone stuffed a Scheme(-ish?) interpreter into Linux, but I don't think it went anywhere. I forget the details, but it may have used the SIOD implementation. I'd prefer it in a microkernel's userspace, though...


Another step towards Metal[0] becoming a reality.

[0] A (half) joke from Gary Bernhardt's excellent The Birth and Death of JavaScript (pronounced "yavascript"): https://www.destroyallsoftware.com/talks/the-birth-and-death...


nebulet is an implementation of a similar idea, but on a different scale.

https://github.com/nebulet/nebulet


How has the work on Nebulet's IPC layer progressed? In theory, a kernel-mode wasm interpreter could make IPC as fast as calling a function in a shared library through a vtable. Of course, this has to be mediated properly to avoid letting processes mess with each others' memory!


Slowly. I've been quite busy with school. What little time I have spent on nebulet has been on the fledgling network stack.


I see. I'm excited about nebulet and would be interested in participating in the design process for the IPC layer -- is there a place that discussion is taking place?


Yeah, totally! The gitter is where most conversations about this take place: https://gitter.im/nebulet/nebulet


Does anybody know if that kernel module can execute WASM generated from Go or Rust too?


Unless they’re doing something non-standard, wasm doesn’t care what language it came from.

That said, this claims it relies on emscripten, I think? You can do that with Rust, though it's not the preferred way. I'm not sure about Go.


From looking at high_level.h it can instantiate any wasm file, but it seems to need the host code specially integrated. There is a function to instantiate the emscripten runtime, probably code can be added to instantiate the go or rust runtimes.


Rust doesn’t have a runtime. Well, at least not in the sense of other languages; it’s comparable to C :)

Emscripten provides a libc; I wondered if it required it. I guess not.


> it’s comparable to C

I think you may want to reassess your definition of what a runtime is ;)

> "crt" stands for "C runtime", and the zero stands for "the very beginning".

[ https://en.wikipedia.org/wiki/Crt0 | https://en.wikipedia.org/wiki/Runtime_library ]

Yup. C definitely has a runtime!


That’s why I said that Rust’s is comparable to C. Both have very, very small ones. Which is what most people mean when they say “no runtime”, since every non-assembly language has some amount of runtime.


Fascinating concept! Wouldn’t this pause/block the kernel while the module runs? Can’t find anything in the readme. I always assumed kernel modules had to behave politely in terms of returning and not running indefinitely.


I’ll keep all my apps if possible in user space please. Call me paranoid.


Serious question: what's the advantage of something like this over eBPF? As I understand it, you can generate eBPF bytecode from C and Linux already has a JIT compiler for it.


eBPF is a limited, non-Turing-complete language, in particular not having loops. Because of this, it's feasible to allow users to upload eBPF filters into the kernel, because execution is safe and the time to run the filter can be upper-bounded.

Wasm, on the other hand, is also memory-safe but, unlike eBPF, is Turing-complete, so it can run arbitrary programs.
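
(For a concrete sense of the gap, a trivial C function like the following compiles to wasm and runs fine, but the eBPF verifier rejects it because it can't bound the loop:)

    /* Data-dependent loop: legal wasm (and possibly non-terminating),
       but unprovable to the eBPF verifier, which must bound runtime. */
    int sum_until_zero(const int *buf) {
        int total = 0;
        while (*buf != 0)      /* trip count unknown at verify time */
            total += *buf++;
        return total;
    }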


> in particular not having loops

Well, almost.

> The main object of this patch series is to support verification of eBPF programs with bounded loops. Only bounds derived from JLT (unsigned <) with the taken branch in the loop are supported, but it should be clear how others could be added.

https://lwn.net/Articles/748032/


Many other modern languages such as Rust and Go are aiming to officially support WASM. I don't know much about eBPF, but it looks like it only supports C for now.


A company just announced some cool eBPF and Rust stuff today: https://blog.redsift.com/labs/ebpf-ingrained-in-rust/


eBPF is generated from LLVM IR, so any language that compiles to IR works, so long as the code doesn't translate to IR operations not permitted in eBPF.


Finally, full stack engineers live up to their title.



