I have been genuinely intrigued by WASM and have been hacking on my own VM in my spare time. One thing that disappoints me is the seemingly arbitrary invention of a new binary format that solves no apparent problem.
It would be better to use CBOR, or at least ELF. The former, especially, is not too far from WASM's format (both favor compact integer representations), and it resembles JSON, which resembles JS, which resembles WASM. It just makes sense. But WASM didn't take that direction, so we don't benefit from a highly extensible, simple and elegant format with widespread support.
I'm not sure ELF makes sense. ELF is all about the verbatim in-memory representation, but at least for the executable segments, WASM heavily abstracts that away. And describing alternate address spaces is always a hack in ELF; just look at the craziness in AVR ELF files for an example.
How does this do anything to alleviate Spectre and friends? Software isolation doesn't work (https://arxiv.org/abs/1902.05178), so the only protection you get is from process isolation, and I assume this doesn't change the OS mechanisms that enforce that.
It doesn't do anything directly about this. It makes it easier for you to choose other CPUs in the future. I guess Dagger as it is may _technically_ be invulnerable to Spectre given that it currently has no time-reading support at all, but honestly part of the security here comes from the WebAssembly VM being slow enough that the timing spookiness doesn't happen as easily.
I think a lot of this is limited by existing OS mechanisms. I've been digging into seL4 to create a platform for WebAssembly code, but moving internationally eats up all your time. :(
> We might consider adjusting the precision of timers or removing them altogether as an attempt to cripple the program’s ability to read timing side-channels.
> Unfortunately we now know that this mitigation is not comprehensive. Previous research [30] has shown three problems with timer mitigations: (1) certain techniques to reduce timer resolution are vulnerable to resolution recovery, (2) timers are more pervasive than previously thought and (3) a high resolution timer can be constructed from concurrent shared memory. The Amplification Lemma from Section 2.4 is the final nail in this coffin, as it shows (4) gadgets themselves can be amplified to increase the timing differences to arbitrary levels.
I have been looking at doing things like arbitrarily limiting the WebAssembly execution engine to only run one instruction per microsecond. This would make full execution speed something programs have to be configured for rather than something they get by default. I still don't know, though; this stuff gets tricky.
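Something like this hypothetical Go sketch is roughly what I mean (not Dagger's real code; `VM`, `Instruction`, and `RunThrottled` are made up, and real OS timers rarely deliver true microsecond ticks, so a production throttle would probably execute a small batch of instructions per tick):

```go
package main

import "time"

// Instruction and VM are hypothetical stand-ins for an interpreter's
// internals; they are not the types of any real WebAssembly engine.
type Instruction struct {
	// opcode, operands, etc. elided
}

type VM struct {
	program []Instruction
	pc      int
}

// step decodes and executes one instruction, returning false once the
// program has finished. The actual execution is elided.
func (vm *VM) step() bool {
	if vm.pc >= len(vm.program) {
		return false
	}
	// ... decode and execute vm.program[vm.pc] ...
	vm.pc++
	return true
}

// RunThrottled caps the interpreter at roughly one instruction per tick,
// so full execution speed becomes something a program has to be
// configured for rather than something it gets by default.
func (vm *VM) RunThrottled(perInstruction time.Duration) {
	ticker := time.NewTicker(perInstruction)
	defer ticker.Stop()
	for range ticker.C {
		if !vm.step() {
			return
		}
	}
}

func main() {
	vm := &VM{program: make([]Instruction, 1000)}
	vm.RunThrottled(time.Microsecond) // ~1 instruction per microsecond
}
```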
I think the ultimate goal for my implementation of this stuff is to remove anything finer than one-second time resolution unless the program actually demonstrates a need for it. I'd like JavaScript and the browser processes to be in separate machine processes, with JavaScript artificially allowed to use at most 10% of the CPU time. Honestly, I think that letting everything run at full speed on the CPU is probably a mistake.
> remove anything finer than one-second time resolution unless the program actually demonstrates a need for it
As someone working with Web Audio, I wonder if it's even possible to tell whether a program "legitimately" needs milli/microsecond timing precision? Typically it'd be running on its own worker/AudioWorklet thread, but I imagine it could be exploited for some nefarious purpose.
Edit: I realized the talk is about WASM on the server, but, who knows, maybe in the future it could also involve audio that needs high-precision timers.
Yeah, my thought there is to make timer resolution a dependent capability. My ultimate goal is to let users be able to say "no, $APP doesn't need more than one-second timer resolution", and if the user is wrong, the app just has to deal with it.
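To make that concrete, here's a minimal Go sketch of what a capability-scoped clock could look like. The names (`TimerCapability`, `Clock`) are made up and this isn't any existing WASI API; note also that the research quoted above points out that coarsening timers alone is not a comprehensive mitigation.

```go
package main

import (
	"fmt"
	"time"
)

// TimerCapability is a hypothetical grant that the user, not the app,
// decides on.
type TimerCapability struct {
	MaxResolution time.Duration // coarsest tick the app is allowed to observe
}

// Clock is the only time source handed to a guest module.
type Clock struct {
	cap TimerCapability
}

// Now truncates the real time to the granted resolution, so an app that
// was only granted one-second resolution just has to deal with coarse time.
// (Truncation alone is known to be recoverable; this is a capability model
// sketch, not a complete side-channel mitigation.)
func (c Clock) Now() time.Time {
	return time.Now().Truncate(c.cap.MaxResolution)
}

func main() {
	// Default: coarse, one-second resolution.
	coarse := Clock{cap: TimerCapability{MaxResolution: time.Second}}

	// Elevated: the user explicitly granted millisecond resolution,
	// e.g. for an audio workload.
	fine := Clock{cap: TimerCapability{MaxResolution: time.Millisecond}}

	fmt.Println("coarse:", coarse.Now())
	fmt.Println("fine:  ", fine.Now())
}
```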
That makes sense. In browsers there are already restrictions around audio/video autoplay, as an example, and an application needs logic around waiting for user permission. So I can imagine something similar, where the default timer could be coarse, and high-resolution timing would require elevated privileges.
Anyway, thank you for the notes/slides about WebAssembly on the server, fascinating stuff with a bright future!
The headline "Why" argument is hardware-independence, but aren't nearly all competitors on the backend just as hardware-independent?
Ruby, Python, JVM languages (Kotlin/Clojure/Java/Scala etc), Node platform languages (TypeScript, ClojureScript, ES5 etc)... Go comes to mind as the only exception, delivering native binaries, but it supports cross compilation really well.
They compile from source code elegantly, sure. But you don't always have the source code available. Removing assumptions about the underlying platform makes it easier to change out hardware in the future. It's so that once the next Intel bug comes out, you can just shed Intel CPUs in favor of ARM, RISC-V, or even big-endian POWER.
Go also doesn't cross-compile that elegantly once cgo is involved. Having the platform run the output of C compilers means that you can remove that from your runtime assumptions as well.
But most of the examples in 'fulafel's comment already don't require recompiling for a new architecture. Node, ruby, Python, JVM languages... You just need a runtime for the desired architecture, which is still true for WASM.
(As a sidenote, POWER has supported either endianness for a while, and ~everyone runs Linux in little-endian mode.)
Given that the point of cgo is more or less to enable calling into C code, don't you still need to do a lot to make that work in WASM? Like cross-compile all your C dependencies to WASM, then use cgo to interface between the Go things and the C things within the WASM runtime?
Pretty interesting ideas and I'm interested to see how it plays out, but tbh I think I'm too old school for all this newfangled stuff and would just use qemu.
I think you're right. We can also add .NET (C#, etc.) to that list, as a popular hardware-independent language on the server (in enterprises at least). So, yes, most server languages are already hardware independent I believe.
Wasm does provide that property for C++ & Rust. I'm unsure how important that is in the server space.
It's one of the things I have on the back burner. I personally believe that WASI doesn't go far enough; for one, it limits filesystem calls to the actual filesystem. I think it's better to have a homogeneous view of things, so that the only difference between local and remote resources is what the platform does behind the scenes.
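Roughly, what I have in mind is something like this hypothetical Go sketch (not WASI and not my actual implementation): the guest asks for everything by URL, and the host decides whether that resolves to a local file or a network fetch.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
)

// Open is a made-up host call: the guest names a resource by URL and the
// platform decides how to satisfy it (local file, remote fetch, etc.).
func Open(rawurl string) (io.ReadCloser, error) {
	u, err := url.Parse(rawurl)
	if err != nil {
		return nil, err
	}
	switch u.Scheme {
	case "file", "":
		return os.Open(u.Path)
	case "http", "https":
		resp, err := http.Get(rawurl)
		if err != nil {
			return nil, err
		}
		return resp.Body, nil
	default:
		return nil, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
}

func main() {
	for _, res := range []string{"file:///etc/hostname", "https://example.com/"} {
		rc, err := Open(res)
		if err != nil {
			fmt.Println(res, "->", err)
			continue
		}
		data, _ := io.ReadAll(rc)
		rc.Close()
		fmt.Println(res, "->", len(data), "bytes")
	}
}
```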
WASI is at a very early stage right now. We already have the start of a proposal to unify reading/writing from file and network streams. Unifying filesystem paths with URLs is something that's certainly in scope beyond that.
I like to think of it more as JavaScript/WebAssembly finally accomplishing what Java spent decades trying to do: be the completely ubiquitous hardware-independent code platform.
Javascript has truly become the "Write Once, Run Anywhere" language.
Spectre came along and ruined the awesome conclusion of that talk.
The idea was that the cost of using WASM would be entirely offset by the speedup of removing the cost of hardware memory protection. We could do that if everything ran in one big VM because the VM would enforce the memory protection.
Unfortunately, now we can't rely on a VM for memory protection anymore. We have to assume that any code running in the same address space can access any memory in the same address space.
So we need the hardware memory protection after all. You can say goodbye to your WASM-only future.
I assume that new chips will address this vulnerability, correct? Couldn't the VM detect whether the hardware is secure and decide whether to use hardware memory protection or not?
It doesn't seem likely. The chipmakers will fix the vulnerabilities that break isolation between processes and between user-kernel, but the within-process issues will probably stick around.
Well, Spectre is a largely theoretical class of vulnerabilities that doesn't even apply to chips that don't do speculation in hardware, and that is purely about information disclosure via side channels. It might be a bit of a concern for some users, but it's not the end of the world - for instance, the designers of the Mill architecture have a whole talk discussing how Spectre as such doesn't really apply given the architectural choices they make. And if running stuff in different address spaces is enough to mitigate it effectively, that still provides quite a bit of efficiency compared to an entirely conventional OS.
> doesn't even apply to chips that don't do speculation in hardware
This is an interesting way to put it. I would have said "applies to pretty much every CPU manufactured in the last decades." Your statement would make sense if speculation in hardware was some niche thing, but I think you would be hard-pressed to find an invulnerable CPU that is used in situations where people care about both performance and security.
That's great for the Mill, but isn't relevant to the world outside of Mill Computing.
This is part of why I want to make a custom OS where each WebAssembly process can be in its own hardware protected memory space. I'm looking at building on top of seL4.
How do you figure, when certain features of javascript are supported on some browsers and not others? You've just swapped OS dependence for runtime dependence. JS's solution to this problem? Another layer of abstraction to make the JS cross-browser.
WASM already has this problem, what with 5 or 6 different incompatible runtimes already in existence.
It would be hard because traditionally most of our hardware relies on the register machine model. Implementing a hardware stack machine can be done, but there are very few silicon CPUs like that, so a lot of the research that has gone into getting register machines up to speed in terms of caching and speculation could be difficult to apply. Then there's the fact that WebAssembly doesn't use branch labels; it uses blocks, which AFAIK have never actually been used on traditional CPU hardware.
Maybe, but remember that you would then need to buy new hardware to use new WASM features.
Also, WASM isn't really ideal for direct interpretation, which could make implementing the CPU harder (though I have no clue about implementing CPUs, so this is just a guess).
What would be the advantage? Performance? Probably not much after JIT compiling WASM to native machine instructions. If there is an actual problem there, I guess it would be better to just add new native instructions that support WASM semantics. The JIT can then use these instructions if available.
Right now WASM can't do much without a runtime, so I think a WASM-only CPU is probably infeasible for some time.
Practically this is a blessing and a curse. I really wish there were a better story for GC'd apps on WebAssembly, because a GC is super expensive at the moment.
I vouched for this comment because I didn't go into much detail in the talk. The main idea is to remove a lot of the kernel and related overhead from application code, kind of like an exokernel. I am working on using seL4 as the base for an OS that only runs tasks in WebAssembly. Having things implemented such that the OS is not a fundamental linking requirement means that I can easily port applications to the browser. I don't have a working sample at the moment, though; I somehow broke my last ones and haven't had time to fix them yet.
> It would be better to use CBOR, or at least ELF. [...] But WASM didn't take that direction, so we don't benefit from a highly extensible, simple and elegant format with widespread support.
I hope WASM fixes this soon.