
Everything old is new again: binary security of WebAssembly - hermanradtke
https://www.usenix.org/conference/usenixsecurity20/presentation/lehmann
======
infogulch
It seems they aimed to answer the question: What are the consequences of
limiting the WA spec to only try to sandbox the WA binary from outside memory,
and not try to prevent it from exploiting itself? The answer they got was that
yes, a WA binary may be capable of exploiting itself. I think this is an
interesting and valuable result, but I don't think the result is surprising,
or that it invalidates the design of WA. The outputs from WA are not
guaranteed to be what you expect. Or put another way: C is still C. I wonder
if such a vulnerability could be found in a program written in safe Rust.

Notably, they don't appear to even try to break the WA-host memory barrier,
which I actually find to be a validation of the core design goal of
WebAssembly: isolate the damage a vulnerable program can inflict to the memory
space (and thus also output) of that program. Protect the host from the
program, but not the program from itself. Also, maybe don't dump WA output you
can't validate directly into DOM.

~~~
angrygoat
WASM and the surrounding Javascript model offer a lot of opportunities to
minimise the damage if something gets exploited.

Say you are running OCR via some C library you have compiled to WASM: wrap
that functionality up with an interface, compile that to WASM, and only pass
it the image data you're happy for it to see. If someone exploits that binary
with a malicious image, there's nothing inside the container except the code
and the image itself, so hopefully the damage is contained.

If folks start compiling large amounts of their functionality into one single
chunk of WASM, which has a complex interface with the DOM/JS, then I can see
how these exploits could become an issue.

~~~
rini17
Even then. If the input images come from an untrusted source, you need to
scrupulously treat the OCR output as untrusted too, and you can't even assume
it is actually OCRing rather than, say, mining crypto (using an image library
exploit plus OCR math library functions), etc. etc.

~~~
angrygoat
True, but there's not much point mining for crypto if the WASM module's
gateway to the DOM/JS/... doesn't let you export the result.

------
munificent
The whole point of a sandbox is that it's only got toys in it so you don't
care when the cat shits in it. But, of course, if the sandbox only contains
toys, it's not very _useful_. You can't sit in it and order a pizza if all
you've got is a toy phone.

So you leave your wallet, phone, and other critical devices in there. But now
it has ceased to be a worry-free safe play space. It's no longer acceptable
for the cat to shit in it.

This is a fundamental tension and we'll probably keep rediscovering this
security problem any time a new container model is developed.

~~~
dane-pgp
Also known as the Sandboxing Cycle:

[https://xkcd.com/2044/](https://xkcd.com/2044/)

~~~
ineedasername
Yep, this about sums it up: "All I want is a secure system where it's easy to
do anything I want. Is that so much to ask?"

------
rubber_duck
Like they say, at worst WASM can make a mess of its own data.

The selling point of WASM outside of browser to me is native modules for other
languages where you can whitelist the exposed APIs (and provide sandboxed
versions) while having cross-platform binaries.

So for example in the future node can ship with WASM module support and JS can
load a C WASM module binary which I can deploy on Windows/Linux/Mac, and I can
review what that module has access to via some module manifest, and node WASM
exposes wrapped POSIX API based on manifest configuration.

------
daniellehmann
Hi everyone, Daniel here (one of the authors, I am the PhD student in the
video). Great to see the paper submitted and discussed :-)

Sorry for being a bit late to the discussion, I will try to answer the
questions in detail that were asked below.

I want to clarify one misunderstanding that seems to come up several times,
namely the distinction between "host security" and "security of the
WebAssembly program itself". WebAssembly _does_ have measures and a good
design for host security. E.g., in the browser, WebAssembly programs are run
in a sandbox (just as JavaScript is), and writes inside WebAssembly's linear
memory should never affect values outside of linear memory (e.g., VMs insert
bounds checks when reading/writing from/to WebAssembly pointers). Those
techniques protect against _malicious_ WebAssembly binaries, which is of
course important in the Web.

Here, we look at a different side of WebAssembly's security story: What if the
WebAssembly binary is _vulnerable_ and gets fed malicious input? In this
attacker model, we can at most do what the host environment allows us to do.
But especially for large WebAssembly programs with lots of imports, or
WebAssembly binaries for standalone VMs (outside of the browser, without a
tried-and-tested sandbox), this can still be a lot of attacker capability! And
when we look into the protections _inside_ WebAssembly's linear memory (not
between linear memory and host memory), we find that there are very few: all
linear memory is always writable, and there are no stack canaries, no guard
pages, no ASLR, no safe unlinking in smaller allocators, etc. This is worrying as more
code gets linked together into a single WebAssembly binary.

If you have any further questions, I am more than happy to answer, here and
also via email. Thanks again for the interest!

~~~
afiori
To my understanding, one of the design constraints behind WebAssembly's
semantics was keeping the implementation burden low for browsers.

In particular, my understanding is that MVP WebAssembly is almost a subset of
asm.js in terms of expressivity.

Do some of the safety measures you consider translate well in this model?

~~~
daniellehmann
> Do some of the safety measures you consider translate well in this model?

Yes, I think several mitigations could be deployed without requiring a change
to the language, only by changing compilers and runtime libraries:

* Stack canaries on the unmanaged stack: For storing the reference canary value, you need some "safe" location that cannot be overwritten by regular memory writes. I believe on x86, thread-local storage (TLS) / the fs segment register is used. In WebAssembly, one could use a global scalar variable, which lives outside linear memory and can only be written by an explicit global.set instruction (which an attacker with a memory-write primitive cannot issue). Then, before returning from a function (or really at any point you want to check the integrity of linear memory), you could compare the canary value against this global, as you would on native architectures.

* Allocator hardening, e.g., safe unlinking doesn't require any language support, the allocator just needs to do it (and live with the slightly increased code size cost).

Other mitigations would require language extensions. E.g., for finer-grained
control-flow integrity (taking source types into account, not just WebAssembly
primitive types), one would require multiple tables (part of the reference
types proposal). Then, only functions with compatible types would be stored
in the same indirect call table.

ASLR, unmapped pages, and constants in linear memory are harder, I believe.
For ASLR, WebAssembly's 32-bit pointers will most likely provide too little
entropy. Unmapped pages and "true constants" would require some host API to
make portions of the memory non-writable/readable, but I don't know of any
proposal to do so.

> wanting to keep implementation burden low for browsers

> WebAssembly almost a subset of asm.js

Yes, that is also my understanding as to why we are in the current situation.
(But I did not take part in WebAssembly's design, so I cannot claim any
authority.)

------
justinclift
This doesn't really seem to offer anything useful, or even contain new
information. eg:

* no corruption of, nor access to host memory.

* no corruption of, nor access to memory or other processes.

It looks like what they did is create a wasm file that generates a JS "Alert
!!!" string, then blindly runs that string in the browser without any kind of
validation?

It's hard to tell from looking at their code though:

[https://github.com/sola-st/wasm-binary-security/blob/master/...](https://github.com/sola-st/wasm-binary-security/blob/master/end-to-end-exploits/browser-libpng-xss/02-compile-pnm2png-wasm/out/main.html)

The html page there seems to be missing the "main.js" to see what's happening.
:(

\---

Hmmm, it might be this main.js, which looks (without checking) like it came
from Emscripten:

[https://github.com/sola-st/wasm-binary-security/blob/master/...](https://github.com/sola-st/wasm-binary-security/blob/master/attack-primitives/stack-buffer-overflow/out/main.js)

~~~
daniellehmann
> This doesn't really seem to offer anything useful, or even contain new
> information.

You are right in that we do not try to attack a concrete host implementation
or aim to break out of the sandbox.

I disagree, however, that security inside a WebAssembly binary is irrelevant.
If an attacker can arbitrarily read/write in linear memory, what can be done
depends on the imported host functions. The larger WebAssembly applications
are (and we believe they will become larger in the future, and incorporate DOM
functions, as in our PoC application), the higher the risk that host APIs can
be abused. This is troubling especially where there is no sandbox (on some
standalone VMs), but even in browsers (where it opens the door for a new type
of cross-site scripting).

For example, even if you only call JS eval() with a constant string in your C
code, because WebAssembly has no truly constant memory, this can give XSS to
an attacker with a memory write primitive inside linear memory.

> The html page there seems to be missing the "main.js" to see what's
> happening. :(

Oops, good point. They are now pushed, sorry about that.

> main.js, which looks (without checking) like it came from Emscripten

That's correct, you can also check the build script: [https://github.com/sola-st/wasm-binary-security/blob/master/...](https://github.com/sola-st/wasm-binary-security/blob/master/end-to-end-exploits/browser-libpng-xss/02-compile-pnm2png-wasm/build.sh)

~~~
justinclift
Hmmm, I'm kind of in two minds about this.

On one hand, the problem doesn't seem to be any different from what happens
with standard JS. The exploit was possible only because the wasm (literally)
did no input validation. And "validate all input" is the first thing web
programmers learn. (or very close to first ;>)

On the other hand, it _is_ a vector. And we've seen plenty of cases where
vectors are chained together in novel ways to enable unexpected attacks. So
there's that. ;)

> Oops, good point. They are now pushed, sorry about that.

No worries, thanks for getting that done. :)

------
Nokinside
If you want good security, you can't download new untested code. Sandboxing is
important, but the most important part is to not allow arbitrary code to run.

WASM seems like a very good solution if you want to load random code and still
retain as much safety and flexibility as possible. The burden is now moved to
the code that interacts with potentially misbehaving code: check and verify
all outputs, limit resource usage. Using a WASM module is not so different
from calling a remote server in the cloud. Anything can happen during
execution, but only the interaction with the host system can cause harm
outside the system.

~~~
justinclift
Wonder what the potential for side channel attacks (eg timing analysis, etc)
will turn out to be with wasm?

~~~
Nokinside
Many side channel attacks available for native binaries are impossible in
wasm, because code can't be handcrafted. It always goes through a compiler.

When they exist, potential WASM side channel attacks are system/compiler
specific and can be fixed with software updates. Compilers can be designed to
remove the possibility if it becomes a real issue.

~~~
justinclift
> ... are impossible in wasm, because code can't be handcrafted. It always
> goes through a compiler.

That's _extremely_ wrong. People have been hand-writing WASM files for ages.

------
lstamour
I’ve only watched the video, but to me this as much highlights how WebAssembly
was designed to run in restricted browser sandboxes with CSP and other
mitigations readily available, and that using WebAssembly outside of such
relatively hardened environments is, as shown, riskier. Aside from the
confusion that a WebAssembly binary might be tricked into misbehaving, though,
the rest feels like standard security measures are enough. I’m not yet
convinced we need more mitigations at the WebAssembly layer vs. better
security linting to try and catch the vulnerabilities, and perhaps not
trusting any WebAssembly outputs without validating them first? I might be
missing something here. I’ll probably need to read the paper next :)

~~~
foota
I think the missing mitigations are things like memory protection, stack
canaries, and such, which make attacks harder in practice for binary targets,
even if there is an exploitable bug in the code.

------
xfer
Yes, and people are trying to run this outside the browser too (like a JVM
alternative). Not sure what the craze about wasm outside browsers is,
considering all the risks.

~~~
rvz
Exactly. There's been so much "WASM without the browser" hype for some time
now, and this analysis puts the risks into perspective.

It has been aggressively marketed as a JVM alternative (not equivalent), but
it comes with the same old native bugs and even makes XSS attacks possible.
Stunning.

Maybe that's why all the big media content companies went behind it. Out with
Flash and Java and in with 'compile and run WASM programs in every browser' to
support this 'open web™'.

What they really mean is: 'We can't wait for WASM DRM!'

~~~
infogulch
The core WA principle of "protect the host from the program, but not the
program from itself" still holds; based on the video, this paper does not even
attempt to break the WASM-host memory barrier. That seems more like a success
than anything else.

Guess what: developers are still responsible for avoiding security bugs in
their code. But at least when they mess up, it still won't damage more than
what their dumb web app could have done already.

