"If a story has not had significant attention in the last year or so, a small number of reposts is ok. Otherwise we bury reposts as duplicates."
Designers of the WebAssembly spec explicitly chose to scope its goal of a 'secure sandbox' to only include securing the host's memory from the WebAssembly VM. They have succeeded in this goal (modulo implementation deficiencies). However, they left open the question of how a developer might ensure the correctness of their program running in the sandbox. This paper explores the consequences of that original design goal for such a developer, and concludes that all the regular binary exploit mitigations may still be useful for ensuring program correctness.
I made a comment at the time:
> ... Notably, they don't appear to even try to break the WA-host memory barrier, which I actually find to be a validation of the core design goal of WebAssembly: isolate the damage a vulnerable program can inflict to the memory space (and thus also output) of that program. Protect the host from the program, but not the program from itself. Also, maybe don't dump WA output you can't validate directly into DOM.
To which the paper's author replied:
> ... I do agree with you, however, that our findings do not invalidate the overall design of WebAssembly. ... I also think "host security" (which we are not really looking into) is solid, with two qualifications: ...
You are right, some of the issues highlighted in the paper could be solved by compilers targeting WebAssembly. One such mitigation that is (currently) missing is stack canaries. In contrast, stack canaries are typically employed by compilers when targeting native architectures. They also cost performance there (typically single-digit percentages), but evidently compiler authors have decided that this cost is worth the added security benefit, since fixing "old C issues" in all legacy code in existence is not realistic.
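To make the mechanism concrete, here is a minimal hand-rolled sketch of the canary idea in C. Real canaries are inserted by the compiler (e.g., via -fstack-protector in GCC/Clang) with a random per-process value and compiler-controlled frame layout; the fixed constant and variable placement here are purely illustrative:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative only: real canaries use a random per-process secret. */
static const unsigned long CANARY = 0xdeadc0deUL;

void copy_input(const char *input) {
    unsigned long canary = CANARY; /* guard value near the buffer */
    char buf[16];

    strcpy(buf, input);            /* vulnerable: no bounds check */

    /* A linear overflow past buf has likely trampled the guard value.
     * Check it before returning instead of running on corrupted state. */
    if (canary != CANARY) {
        fprintf(stderr, "stack smashing detected\n");
        abort();
    }
}

int main(int argc, char **argv) {
    if (argc > 1) copy_input(argv[1]);
    return 0;
}
```

The compiler is free to reorder locals, so this sketch cannot guarantee the guard actually sits between the buffer and the return data; that control over frame layout is exactly why canaries are a compiler feature rather than a library one.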
Note, however, that other security issues highlighted in the paper _are_ characteristics of the language, notably linear memory without page protection flags. One consequence of this design is that there is no way of having "truly constant" memory -- everything is always writable. I do think that this is surprising and certainly weaker than virtual memory in native programs, where you _cannot_ overwrite the value of string literals at runtime.
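A tiny C program makes that difference concrete. This is a sketch: the write below is undefined behavior in C, and the exact native outcome varies by platform, but the contrast is the point:

```c
#include <stdio.h>

int main(void) {
    const char *greeting = "hello";

    /* Undefined behavior: writing to a string literal through a cast.
     * Compiled natively, the literal typically lives in a read-only
     * page (.rodata), so this write usually crashes with a
     * segmentation fault. Compiled to WebAssembly, the literal sits
     * in ordinary writable linear memory with no page protections,
     * so the write silently succeeds. */
    ((char *)greeting)[0] = 'H';

    printf("%s\n", greeting); /* under wasm: prints "Hello" */
    return 0;
}
```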
That said, we always knew a day would come when WebAssembly would get the ability to set more page protections. For non-web embeddings, this is easier, but as of yet, none of the engines I know of have an API for this or are really prepared for its implications. I am still bullish on eventually getting that capability, though.
Thanks for your paper.
As you said, I just hope page protections can still be added later (somebody needs to specify it, embedders need to be able to implement them, toolchains need to pick it up, etc.).
Maybe memory vulnerabilities inside WebAssembly programs can also be mitigated in other ways that do not require such pervasive changes, e.g., by keeping modules small and compiling each "protection domain" (e.g., library) into its own module, or giving each its own memory. I am not sure about the performance cost of such an approach, though.
That is not true. As the paper discusses in depth, linear memory, lack of ASLR, and the lack of read-only memory are problematic regardless. The latter is the most important imo.
I agree it makes Rust more attractive than C for WebAssembly.
But the original demos for WebAssembly were all in C, not Rust (games like Doom IIRC). The thing that made it attractive in the first place is that you could run existing code, e.g. image processing or machine learning in C/C++.
That's because the way you're using "secure" is likely misleading to most people; two WASM programs will be more strongly isolated from one another than two normal programs. The host has better protection from a WASM program than from a C program. Those protections are akin to a VM's, but likely better.
The protections that are missing aren't irrelevant, but likely less relevant than in an OS context because WASM's I/O is so heavily constrained. Malicious data is still a possibility, but not quite as trivially triggerable as in any other networked program. Similarly, once hacked, those same limitations make the scope for mischief much, much more limited.
Sure, there are attacks conceivable against a WASM program that couldn't work when protected by the various defenses in depth available to a "normal" OS program, but the reverse is true too: the scope of bugs is much, much smaller in WASM, but within that limited scope they'll be more easily triggerable. However, since the scope is so limited, I have a hard time believing the net effect is negative in practice.
This is wholly incorrect. Two WASM programs will be at most as isolated as two normal programs, but in practice substantially less so. Spectre being the obvious one that sort of "fully breaks" in-process sandboxing, but lack of hardware enforcement also means you're now more vulnerable to host bugs, too.
Regardless, it's a question of what is the juicier attack surface. WASM only cares about protecting the host from the embedded code, but it does that essentially by sacrificing the embedded code's own security from the outside world. That is, if A is the WASM host of B, and B can be called by C, then WASM is about protecting A from B at the cost of making B more vulnerable to C. If B is the thing that actually has the juicy data that C wants in the first place, this is a really bad tradeoff. If A is the juicy target, though, then this is a fine tradeoff.
So in a browser context where the host is almost always the juicy target, WASM's tradeoffs work. If, however, you're talking about things like using WASM for web server sandboxing instead of docker or a VM, well the web server itself is likely the interesting attack surface rather than the "hypervisor." So that's probably not a good tradeoff to make there.
By contrast, normal OS programs have all kinds of I/O, including inter-process communication, shared locks, high-res timers, shared memory, files, networking, etc. Even if those programs are "fairly isolated" by being inside a VM, they still have access to all kinds of low-level tools that WASM won't have, tools that might leak enough entropy to make them less well isolated. And that's kind of necessary and by design, because a VM or container is designed to run largely unmodified binaries, and thus cannot simply omit capabilities legacy programs have - but WASM can, and does. Then there's simply the sheer complexity difference; it's a pretty daunting task to have all that be entirely safe.
In essence: there's a reason why it's much more common to see e.g. privilege escalation vulnerabilities than WASM sandbox escapes.
If indeed WebAssembly adds features that add attack surface, I sure hope (and fully expect) them to take things like Spectre into consideration.
As to SELinux: the issue with securing existing programs is that those use all kinds of existing features; it's quite hard to take away functionality that programs depend on. SELinux simply has a much harder and trickier task than WASM, which never supported most of what SELinux seeks to control in the first place. It may well be that it's possible to have SELinux enforce WASM-or-greater restrictions and that it would then be "safer" due to other features, but that's a pretty hypothetical scenario - what software would even run in those circumstances?
The comparison seems moot to me; apples to oranges. One is trying to KISS and be safe by simply not having complex features; the other is trying to deal with real, existing programs by restricting stuff they could access but don't need. Those are different use cases.
You seem to be continually focusing exclusively on the current existing extremely limited WASM-in-a-browser usage.
WASM's current limitations aren't intentional; they exist because it's not finished and those features weren't necessary for the MVP. In fact, it already has some of the things you are claiming it doesn't have - both threads & shared memory are available already in Chrome & Firefox's stable releases ( https://github.com/WebAssembly/threads/blob/master/proposals... & https://webassembly.org/roadmap/ ). Which means it also has accurate timers.
WASM very intentionally allows the host to expose whatever feature set it wants. That's a very core design element. You can't just ignore that and focus on the ~nothing that the "spec" itself delivers & pretend that's by-design. WASM isn't a standalone runtime and isn't trying to be.
You state WASM's limitations aren't intentional; they just weren't finished - but being able to use those securely is part of being finished, and they weren't enabled in browsers until they were.
Obviously other hosts may do whatever they want. That's a valid hypothetical, but just as valid a hypothetical is that WASM implements relevant defense in depth features before such poorly sandboxed hosts become commonplace; it's not like they're not aware of the issue (https://webassembly.org/docs/security/). Hypothetically it might be "safer" than SELinux; but... today none of this matters, and the missing features don't appear to be intrinsic. It's not like native programs have had ASLR since day 1 either, right? Or maybe WASM stays in the "highly isolated" niche which is fine too, and would still allow various usages outside the browser too (e.g. for most apps that don't interact with other apps outside of a few carefully controlled primitives).
If people make poor choices about future development, things may end up less secure than they are today: sure.
So, on the one hand I can't dispute what you're saying - I just think it's not the best thing to be focused on, because a trivial summary would make it seem like there's a real risk in having WASM as it is today, and that's not the case. We should be very careful in the future, yes, but let's not throw out the baby with the bathwater; wasm's portability and sandboxing in practice today are likely quite useful and more secure (in a browser context) than a native app (outside of a browser context).
Innocent and naive as I am, I feel like this shouldn't be true in a post-Spectre world? Surely intra-process isolation is recognized as, at minimum, a difficult problem to solve today? The attack surface of your host, given arbitrary compute powers (which wasm has), is quite significant.
It's not to say that intra-process isolation isn't a worthwhile goal; I'd rather see new systems designed with that idea in mind if they can be. I'm just saying I don't know that "unsandboxed but has lots of mitigations" vs. "sandboxed but trivially exploitable" is so clear-cut.
But even supposing you pull all that off - Spectre needs not just some entropy leakage, but also a way to put that entropy in context. Knowing a few random bits in some other process's memory space won't do you much good; you'll need to know something about how that process is working to interpret those bits and target the slow leakage, which is again pretty situational.
Finally, you need some way to check for indirect behavior changes, and typically that means timers. Those aren't directly available in WASM, and aren't as easy to replicate using other means as in a normal process; e.g., wasm doesn't even have fully capable shared-memory threads for exactly this reason (there's a proposal and a Chrome feature flag, so you can play with it, but that's it), and other tricks to get reliable timers, like via SharedArrayBuffer, have been tweaked to reduce attack surface. But note: that's the case for WASM, but not for a classic OS program, even one running in a VM.
I'm not sure how much of a threat Spectre is in practice, but clearly WASM is aware of it and avoiding features that would increase exposure, even very valuable stuff like threads.
If it's the first one, I don't think I really care.
But worth adding that Wasm by design lets you statically inspect everything the Wasm module can call. If the Wasm receives access to networking, or eval(), etc., then there is a real risk. However, often the Wasm imports and exports are extremely limited and easy to verify for safety.
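For a sense of what that inspection looks like outside the browser, here is a sketch in C against the standard wasm-c-api header (wasm.h, as implemented by engines such as Wasmtime); the module file name is just a placeholder:

```c
#include <stdio.h>
#include <stdlib.h>
#include "wasm.h" /* standard wasm-c-api header */

int main(void) {
    /* Read the module bytes ("module.wasm" is a placeholder name). */
    FILE *f = fopen("module.wasm", "rb");
    if (!f) return 1;
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    fseek(f, 0, SEEK_SET);

    wasm_byte_vec_t bytes;
    wasm_byte_vec_new_uninitialized(&bytes, (size_t)size);
    if (fread(bytes.data, 1, (size_t)size, f) != (size_t)size) return 1;
    fclose(f);

    wasm_engine_t *engine = wasm_engine_new();
    wasm_store_t *store = wasm_store_new(engine);
    wasm_module_t *module = wasm_module_new(store, &bytes);
    wasm_byte_vec_delete(&bytes);
    if (!module) return 1;

    /* Every host function the module can call is declared up front
     * in its import section; nothing else is reachable. */
    wasm_importtype_vec_t imports;
    wasm_module_imports(module, &imports);
    for (size_t i = 0; i < imports.size; i++) {
        const wasm_name_t *mod = wasm_importtype_module(imports.data[i]);
        const wasm_name_t *name = wasm_importtype_name(imports.data[i]);
        printf("import: %.*s.%.*s\n",
               (int)mod->size, mod->data,
               (int)name->size, name->data);
    }

    wasm_importtype_vec_delete(&imports);
    wasm_module_delete(module);
    wasm_store_delete(store);
    wasm_engine_delete(engine);
    return 0;
}
```

If that list contains nothing that can reach eval-like, DOM, or network functionality, then even a fully compromised module is limited to corrupting its own linear memory.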
Someone who controls the image data can perform an XSS, i.e. steal your credentials for the website, or your credit card info if that's stored server-side. That's not as valuable a target as controlling your computer, but it's not nothing, and can be chained.
This is completely realistic, as Figma is a full-blown image editor written in C++ and compiled to WebAssembly :)
If this is the case, it means there is already a hacker or virus running outside the sandbox, which means I already have bigger problems to worry about :).
That said, it does highlight that memory-safe source languages and other program exploit mitigations are still important even in a WebAssembly sandbox. Personally I wasn't confused about this fact, but it does clarify this point for anyone who might think that WA == no more security problems in any context.
By modifying memory you could potentially cause the existing code in a wasm binary to do unexpected and potentially malicious things inside its sandbox. Which can be bad. But no more than that, unless you can chain it with other types of vulnerabilities. And using safe languages for your wasm code helps. The paper also points out that uses of wasm outside the browser, e.g. in node, may not be securely sandboxed.
I hope that wasm people are looking at adding mitigations for some of these attacks. Stack overflow in particular seems like something that can be protected against.
However, I would draw a bit more attention to the consequences when memory vulnerabilities in a WebAssembly binary are exploited:
(1) Not every WebAssembly binary is running in a browser or in a sandboxed environment. The language is small and cool, so more people are trying to use it outside of those "safe" environments, e.g., "serverless" cloud computing, smart contract systems (Ethereum 2.0, EOS.IO), Node.js, standalone VMs, and even Wasm Linux kernel modules. With different hosts and sandboxing models, memory vulnerabilities inside a WebAssembly binary can become dangerous.
(2) Even if an attacker can "only" call every function imported into the binary, how powerful the attack can be depends on those imported functions. Currently, yes, most WebAssembly modules are relatively small and import little "dangerous" functionality from the host. I believe this will change once people start using it, e.g., for frontend development -- then you have DOM access in WebAssembly, potentially causing XSS. Or access to network functions. Or when the binary is a multi-megabyte program, containing sensitive data inside its memory.
Sure, the warning is early, but I'd rather fix these issues before they become a common problem in the wild.
Needing to chain isn't as bad as a full breakout, but in practice, attackers can and do chain exploits. (pwn2own has some really impressive chains, but those are about breaking out of the sandbox)
Also remember that WASM doesn't actually improve peak performance over JS. The asm.js subset of JS has the same performance.
WASM does two things: 1) it makes performance predictable by ensuring you don't fall off a JIT performance cliff; 2) it improves parsing time, because it's loading a binary format rather than parsing JS.
e.g. notice that the headline here is about load time, not execution time, which would be more important: https://www.figma.com/blog/webassembly-cut-figmas-load-time-...
5.1 is XSS due to using document.write with a user provided string. That's not taking over the browser and is very much possible from JS (it's basically the prototypical XSS example from well before there was wasm).
5.2 is more serious but in node, not the browser, and is barely a chain (two exploits?). It's a way to get a string to eval() if eval() is available in one of the indirect functions. I think the authors also overstate the benefits of ASLR in that case. If that same code was running natively in a server and exposed to direct user input like that, it's very possible an RCE could be made to work.
These examples are simple by design, since they're just demonstrations that attacks are possible; more sophisticated attacks are certainly conceivable, but we also shouldn't over-extrapolate from them.
tl;dr A bad case is if something like Figma stores your credit card info on the server. If they have a buffer overflow + document.write() of HTML derived from wasm code, the person who controls the image data could steal your credit card info.
This is basically example 5.1; Figma is a huge chunk of C++ compiled to wasm, which may process images from other users.
"oh but you can chain exploits" in this context is like if a php server takes POST data, passes it through an arbitrary but guaranteed to terminate "safe" function, and calls eval() on the result -- and then attributes potential vulnerabilities to the function. This seems silly, obviously the problem is the eval() at the end.
And as you say, the examples given are really poor; they're barely exploits - that's more like a really unwise feature. It's hard to imagine situations like this arising in practice by accident without being more easily exploitable without even using WASM in the first place. I mean, if you're processing data in some way and your interface with that processing code includes stuff like eval, that's just a disaster waiting to happen, as web exploits over the decades have demonstrated again and again.
If you don't feel like reading the whole paper, there is also a short (~10 min) video on the conference website, where I explain the high-level bits.
Whether that exploit is more or less concerning than the browser and Node.js examples, I think is hard to answer in general without additional qualifications. If the standalone VM uses fine-grained capabilities (e.g., libpreopen) or is sandboxed, then changing the file that is being written to might be possible inside WebAssembly memory but access could be blocked by the VM.
He didn't have it all. The 4th quadrant: unknown knowns
There is a fog to applying stuff we already know but have forgotten, and it causes problems.
The paper is indeed correct - we did this 20 years ago - same architecture except with JVM instructions.
I'm entirely neutral on the merits of this tech supercycle - there are some great things being built, there are also old potholes getting uncovered that will end with some lessons re-learned.
I guess that's a concern, but I'm not sure it's that big a concern relative to normal XSS.
So for many use cases Wasm does allow what you mention, running unsafe code quickly on normal hardware, and it is being used in production for that purpose.
There's very little that can be done about rowhammer with formal methods because it breaks the assumptions of any formal model of computation (that values stored to RAM are the same values that will be eventually read from RAM).
However, what we look at in the paper is "binary security", i.e., whether _vulnerable_ WebAssembly binaries can be exploited by malicious _inputs_ themselves. Our paper says: Yes, and in some cases those write primitives are more easily obtainable and more powerful than in native programs. (Example: stack-based buffer overflows can overwrite into the heap; string literals are not truly constant, but can be overwritten.)
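To illustrate the first of those two examples, here is a hedged sketch of the kind of bug meant. Whether the overflow actually reaches heap data depends on the toolchain's linear-memory layout (placement of the unmanaged stack, absence of guard pages), so treat this as an illustration rather than a working exploit:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv) {
    /* Heap allocation: lands somewhere in the same linear memory. */
    char *config = malloc(32);
    if (!config) return 1;
    strcpy(config, "safe-default");

    /* buf lives on the unmanaged ("shadow") stack, which is also just
     * a region of linear memory. Natively, guard pages and canaries
     * stand between a stack overflow and other data; in linear memory
     * there are no such barriers, so with an unfavorable layout a long
     * enough overflow can run straight into heap data like config. */
    char buf[32];
    if (argc > 1)
        strcpy(buf, argv[1]); /* unbounded copy: the overflow */

    printf("config = %s\n", config); /* may now print attacker data */
    free(config);
    return 0;
}
```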
You say that like the Java Virtual Machine designers didn't do the same thing.
That has a very familiar sound to it. ;-)
> However, what we look at in the paper is "binary security", i.e., whether _vulnerable_ WebAssembly binaries can be exploited by malicious _inputs_ themselves. Our paper says: Yes, and in some cases those write primitives are more easily obtainable and more powerful than in native programs. (Example: stack-based buffer overflows can overwrite into the heap; string literals are not truly constant, but can be overwritten.)
I think you've highlighted an important subtlety here for sure. Would you say that Java applets were similarly vulnerable?
This paper is only about messing up the WASM heap, which might cause the WASM application to crash, but this doesn't cause damage outside the sandbox.
The design of the Java applet security model did not include being able to escape the sandbox. That was something Microsoft did that deliberately broke the security model. The closest thing Java applets had to "escape the sandbox" was popping up a new window, and even that involved presenting the user with a warning that the window belonged to the applet.