Securing Firefox with WebAssembly (hacks.mozilla.org)
244 points by edmorley on Feb 25, 2020 | 75 comments



It's really great to see this work. Cutting down the amount of memory-unsafe code in the browser is an important ongoing goal.

I'd love to do the same for our contenteditable implementation. The hard part here is to implement a DOM abstraction, because that code currently pokes at DOM objects directly and you can't safely do that across a wasm barrier. Once we've done that, though, Firefox's contenteditable implementation would become a cross-browser library, which opens up some interesting opportunities. We could use it in Servo, and authors could ship it on their Web pages if they want to ensure a consistent contenteditable experience across browsers.


How would this work? Isn't the contenteditable attribute magic as far as web code is concerned? You can't currently implement contenteditable in userspace, can you?

I guess maybe you could make an attribute which just enables a flashing cursor, and then everything else could be done with DOM apis...


We might have to add some magic hooks into currently-internal parts of the browser, so the fidelity may not be 100% if contenteditable is shipped as part of content. It'd be an interesting experiment and might help to highlight gaps in the Web platform.


Well Google Docs editing is fully custom, including the selection and caret. So it must be possible.


I believe Google Docs implements its own text layout entirely from scratch. Sure, you can do that, but that's a lot of work, and it doesn't perform particularly well.


I've implemented text editing on top of contenteditable, and directly via key events etc. The latter is far less work. contenteditable gets you 60% of the way there but is so buggy and inconsistent that you have to override all its behavior anyway.


As a user I bet I’d prefer the contenteditable approach. I find that absolutely consistently the cleverer a web rich text editing widget tries to be, the worse the experience is, and the more it breaks my model of how such things work on the native platform. Dropbox Paper’s is atrocious (I’ve reported several super-annoying bugs, and never had even a response). Slack’s new widget is abysmal (I haven’t even bothered filing bugs with them, because everyone else has already complained about the same things—most of all, its caret handling is quite insane). Google Docs I haven’t edited with for many years, so I can’t fairly remark on it. I can think vaguely of a couple of tech demos of not using contenteditable at all from a few years back, and at that time they were awful on desktop and entirely unusable on mobile. I don’t know if the situation has improved at all from that, but I’m sceptical.

Do you happen to have either or both implementations on the public internet? If so, I could see just how many seconds it takes me to be infuriated by them, especially by the non-contenteditable text editor. If I fail, I will gladly eat my metaphorical hat, but presently I would assign odds of well under 0.1% of that happening.

Sure, contenteditable has many bugs and inconsistencies, and Chrome’s implementation especially is a toy as regards functionality (e.g. poor table and image handling, difficulty with distinguishing a caret inside the end of one node from one after the end of that node), but much of the 60% that it gets you is stuff you can’t get at all any other way, especially insofar as it varies by platform, mostly deliberately.

pcwalton’s proposal interests me because it would actually concretely examine what can and can’t be done, and definitely identify the shortcomings. It would have a chance of actually being good.


> I find that absolutely consistently the cleverer a web rich text editing widget tries to be, the worse the experience is

I don't know about this. Google Docs isn't great (I haven't used Paper), but CodeMirror (https://codemirror.net/) is excellent. Performance is flawless. All the platform shortcuts + extra niceties like multiple cursors.


I should have clarified that I’m specifically speaking of rich text editors. CodeMirror is a plain text editor, which removes most of the difficult-to-implement-without-contenteditable stuff. It uses a textarea, not any contenteditable or contenteditable-like thing.

CodeMirror is rather good as such things go. Last time I tried it, it had serious issues with some keyboards on Android (regardless of browser, I believe), but that looks to be working well now. Well, either that or the new Firefox for Android has worked around the input problems; it could be either, and I don't have a URL that I know was broken and hasn't been updated recently.

But even so, it still has problems, some niggling and some major. Navigation keys don’t behave natively (e.g. on Windows, select a range and then press Up or Down and it should go up or down one line from where the caret is, but instead Up takes you to the start of the selection and Down to the end). I can’t get a caret thumb on Firefox on Windows, and it seems very hit-and-miss on Android, and the Samsung keyboard on Android is going crazy with its suggestions as you move the caret around the document—and that’ll be basically because of my next point.

Perhaps most seriously, it’s completely unusable from the perspective of accessibility tech. And most damningly, guess what the solution to that is? They’re working on CodeMirror 6, and concerning accessibility https://codemirror.net/6/ says:

> This version leaves more to the browser, instead of “faking” the editing process in JavaScript. This makes it more transparent to screen readers and other accessibility tools.

You know what “leaving more to the browser” means?

contenteditable.

Yep.

It still interferes with various native user agent functionality (e.g. navigation keys are still wrong and it’s still interfering with caret thumb and long press on at least Windows), but it’s distinctly better because it reduces the amount of the important type of cleverness.


And that's why having contenteditable split out from the browser engine would be great. Instead of reimplementing parts to work around issues, you could just fix the issues directly.


> contenteditable gets you 60% of the way there but is so buggy and inconsistent that you have to override all its behavior anyway.

Believe me, I know. I did my own once upon a time (back when IE6 was still a thing, and supporting IE-style ranges was required!). How do you deal with things like the caret if you're rolling your own? Just fake it with a div?


I think there was a text input element that you actually typed into, which showed the cursor and handled IME. After composing a character, that character would be rendered into a div and the input element moved to the next position. I've also written versions with a blinking div, or a canvas element, but I think the input element is the better approach to leverage browser IME.


> The latter is far less work

Only if you ignore inconvenient but important things like internationalization and accessibility.


And it makes mistakes, even with pure Latin script!


Indeed, that’s an area of code that has had a history of strange per-browser-version differences, though better today than in the past.


> but we’re performing the wasm to native code translation ahead of time, when Firefox itself is built.

This might be an advantage for startup performance, but it's also a disadvantage performance-wise, as then you can't use all the available features of the CPU. That's one of the advantages that wasm gives you.

> we were often asked, “why do you even need this step? You could distribute the wasm code and compile it on-the-fly on the user’s machine when Firefox starts.” We could have done that, but that method requires the wasm code to be freshly compiled for every sandbox instance. Per-sandbox compiled code is unnecessary duplication in a world where every origin resides in a separate process.

Android keeps an on-disk cache, performing native compilation at installation time, or at first startup after an OS update. One could have a similar cache for Firefox as well.

Anyway, this can also be improved in the future. None of that changes the fact that it's an amazing idea for improving the safety of the browser. I love it!


> Android keeps an on-disk cache, performing native compilation at installation time, or at first startup after an OS update.

If you’re describing what I think you’re describing, that was a Dalvik thing (Android 4.4 and earlier), which ART (Android 5 onwards) doesn’t do. I also found it a pain, because from time to time even when there hadn’t been an OS update or any other such change my phone would take three or four minutes longer to boot up as it decided it had to optimise all its packages. No idea whether that was a typical experience.


The compile target of ART's dex2oat tool is actually native ELF code, while Dalvik's dexopt only created odex files, so the on-disk cache still exists. I've found sources saying this is still done on app installation. I couldn't find sources on whether it's still done on OS updates, but I guess your experience means OS updates no longer trigger recompilation. Thanks for the pointer!


It still happens, except now the recompilation is done before the reboot for the upgrade, so that it doesn't create a massive downtime for upgrades. But if you manually upgrade, you can still see one (very long) step called "Optimizing Apps".


Here is the complete workflow post Android 7.

https://source.android.com/devices/tech/dalvik/jit-compiler


Ditto on the love!

This is such a good idea, and almost an "oh, duh" moment. They can significantly decrease the attack surface simply by aggressively sandboxing everything.

The fact that it may even be faster, thanks to the possibility of JITing in platform-specific optimizations, is really gravy.


That is the same approach used by the language environments on mainframes, Windows Store (MSIL cloud compiler), watchOS and a couple of other platforms, I would guess.


The Firefox bytecode cache is pretty similar. I think it's regenerated after updates.


If parts of the web browser start being shipped as wasm code, we will eventually reach the point where the browser shipped to the user is only a wasm VM, and everything else is shipped as optional libraries or even downloaded on the fly: even the HTML engine, the CSS engine, and the JavaScript engine. In that world, using the messy web standards that evolved over time would be optional. The browser would then become the universal virtual machine the world seems to want it to be, instead of a browser. The web would be the app distribution system. One could, for example, decide to write their site using Tcl/Tk.

Implementing the wasm VM and its basic APIs would be simpler in a new operating system. Because the way it is now, the web browser itself is more complex than a simple operating system. That hinders innovation in the operating system space.


There's a HUGE way to go to get there, and I don't believe we ever really will. OSes support every language, hundreds of input method editors, and all the issues of right-to-left text and rendering scripts far more complex than English. Asking every webpage to provide all of that, and to keep it up to date, would be a huge loss for the web and app development in general.


> Consequently, there are now around 40 high-level programming languages that support WebAssembly, including C and C++, Python, Go, Rust, Java, and PHP. Wasm is not a new language, but a portable, pre-compiled, cross-platform binary instruction set for a virtual machine that runs in the browser.

https://blog.stackpath.com/webassembly/

It doesn't seem terribly unlikely. I don't think it's the goal, but it seems like a fairly likely outcome.


You should check this talk about JS and ASM :) https://www.destroyallsoftware.com/talks/the-birth-and-death...


All I can see is people wanting to use web-technologies to develop applications. So I would place my bet on the DOM/CSS evolving further and swallowing everything. JavaScript might get some contenders.


50 years later the mainframes have won.

Language environments and containers.


They have been "winning" since they started losing. Our PCs are much more similar to 80's mainframes than to 80's PCs.


But ... in a lot of ways worse.


Good point. That's probably the case for the majority of the generation that entered the programming world in the last 20 years, which is quite a lot of people. But old farts like me may disagree.


> One could for example, decide to write their site using tcl/tk.

Please, please don't do that. Tk is completely inaccessible to blind people via screen readers, and probably people with some other disabilities as well, on all platforms. Most toolkits written by people who decide to throw out those messy web standards would probably have the same problem.


Exactly my thoughts when I first read of WASI: conceptually you could now have a WASI runtime ("OS") and an HTML.wasm potentially independently developed and swappable. And since WASI is a lot smaller surface, it is easier to audit, test, reimplement, etc.


Looks like eventually Wasm will become JVM and Firefox will become HotJava[0] :-P

[0] https://en.wikipedia.org/wiki/HotJava


Every single day I think about how the JVM could actually be a platform worth using in many situations. That day may come, but certainly not within the next decade. No matter how well you optimize your code, the JVM will ruin it, making it slower and consuming more memory than necessary. Of course, none of this matters in a world where your internal dashboard with a single-digit user count is given 4GB of RAM just "to be sure".


> No matter how well you optimize your code the JVM will ruin it and make it slower and consume more memory than necessary.

Maybe you are really good at this.

But for a good chunk of software engineers, AFAIK, compiling Java to bytecode and running it on the JVM easily outperforms their hand-optimized code in most cases.

JVM and JDK writers (and the same people on the Dotnet side) aren't dimwits and their efforts over the last two decades are being applied every time we compile and run software on their platforms.


> No matter how well you optimize your code the JVM will ruin it and make it slower and consume more memory than necessary.

On the contrary, I for one enjoy the benefits that the JVM brings to my poorly-optimized bytecode produced by javac.


However given the progress rate of the proposals, I guess we will have to wait about 5 more years at very least.


This is kind of neat but it's not clear how to think about what WebAssembly is doing, since the output is native code. I guess it's essentially a safe compiler for C++ code?

For a language whose compiler already outputs safe binaries, this step would be redundant.

It's not that easy to write a safe compiler though. Would a compile toolchain that uses WebAssembly for all compilation be useful?


It's not a safe compiler for C/C++; the compiled wasm code can still be compromised. It just can't touch the rest of the process except indirectly, via returned values.


And, to be clear, people could still do bad stuff with compromised WASM.

The big difference is that the WASM sandbox significantly reduces the surface area of what bad stuff can be done.

Today, a compromise in the browser means the attacker can do whatever the browser can do (which is usually a LOT). With the sandbox, a compromise can only really affect what the sandbox has available to it. That means, if your sandbox only exposes a single method which takes in a string and returns a string, the worst thing an attacker can do is return a malformed string.

Of course, if you mishandle that returned string then bad stuff will happen but it's a far cry from the input string being able to potentially cause arbitrary code execution which installs a virus on your machine.

To really do something evil you have to not only compromise the code running in WASM, you have to find a way to break out of WASM. That's a lot harder to do.


WebAssembly still suffers from possible internal memory corruption, so it is only safe to the extent that the C++ standard library is used with bounds checking enabled.

Alternatively WASM could eventually support memory tagging like SPARC and ARMv8.


No, the goal is to constrain "internal memory corruption" to the WASM sandbox itself and values returned from it.


If internal memory corruption occurs, the values returned by the public functions a WASM module exposes can be anything.

For example, a WASM-based security module could just start logging everyone in as admin because the user metadata got corrupted.


The claimed advantage is not that wasm-compiled modules are more resilient to bugs or exploits, but that those exploits are easier to contain.

Nobody is surprised that an exploit in the authentication module can be used to log in as admin. It is different if an exploit in the font rendering module lets you log in as admin.

They are just two equally important, but orthogonal, facets of security.


Maybe the question I should have asked is what sort of protection does it provide beyond a normal compile and how useful is it?

It seems like there is some kind of cross-module memory protection? Maybe Erlang would be an interesting comparison?


It prevents a bug in your text rendering module from logging you in as admin, which is usually not the case inside a single process.


Is there any reason to continue using process-level sandboxing if this approach is available? It sounds like WASI adds an extra layer of security even beyond process-level sandboxing by being able to limit access to system resources. Or is access to system resources really not the concern here / can you do that just as easily with process-level sandboxing?

And you can pass callback functions to sandboxes, which IDK if you can do with separate processes. Does process-level sandboxing provide any advantages WASI doesn't?


The docs list WASI as currently experimental[1] and note missing features[2]. I understand how the sandboxing approach can add security, but if you use experimental technology to enable it, doesn't that potentially open up more new holes than it closes? I'd love to hear more detail about exactly which parts of WASI are used here, and what happens if you compile C code that accidentally hits a "rough edge" or "missing feature" of WASI.

[1] https://github.com/bytecodealliance/wasmtime/blob/master/doc...

[2] https://github.com/bytecodealliance/wasmtime/blob/master/doc...


External experimental tools are different from internal experimental tools. It would not be a good idea to do the same with a tool they have little control over, but in this case they have full control over Cranelift (to the point of feeling secure using a different version).


The sandbox architecture model makes me think of cell membranes.

https://plsyssec.github.io/rlbox_sandboxing_api/sphinx/


It's very impressive that they can do this properly with existing libraries.

Personally, if I had to design a solution for this problem, I would use the io_uring model - which would allow every part to reside in its own process and memory space.


This kind of technique could also be applied to the parts of Firefox built with Rust, correct?


It could but Rust already solves the issues this is aiming to solve but with fewer steps. I don't know what Mozilla's policy is on adding Rust vs C++ for new features.


I somewhat disagree.

Safe rust solves the memory safety problem. However, there is still the possibility of a logic bug causing rust to touch something it shouldn't.

The sandbox approach adds a second layer for malicious code to bypass: you not only need to find a way past the code itself, you also need a way out of the sandbox.

It's a little like running your apps on a server with highly restricted permissions. You do that so the app compromise limits what is exposed.


Rust is memory safe. It does not need this treatment.


It would still be useful in that case as defense in depth. Specifically it could guard against compiler bugs, use of "unsafe", etc.


And conversely, if a C/C++ component is put into a wasm module, work that's needed to clearly specify the API can be re-used for any possible Rust rewrite in the future.


The wasm compiler can have its own bugs.


That is why you should write it in Rust, obviously. ;)


It's a smaller surface area that gets a high amount of scrutiny, particularly because it is already exposed to arbitrary code execution.


I wonder how this would compare to compiling the sandboxed code with ASan and UBSan on


AFAIK those make for a 2x performance hit. And they don't necessarily protect against buffer overflows: if you overflow out of data addressable from one pointer but into data that's legally addressable from another pointer, that's still a bug, but it won't be caught.


But WASM has the same buffer overflow problem, and at least that bad of a perf hit.


WASM has less of a perf hit.


Address sanitizer and undefined behavior sanitizer are not intended to be security mitigations; it's possible to get around them and now you have bad things happening in your process again.


I'm still waiting for better toolchains.

WASM is really the greatest thing, but it's lacking proper support from compilers. Binaryen seems like a beast to use. I don't really understand why WASM isn't just supported natively by clang or gcc.


Wasm is already supported natively by clang.

Binaryen is an optional optimizer that you can run on the emitted wasm, to make it smaller (either manually, or a toolchain may integrate it for you, like emscripten or wasm-pack).


You could try Rust. It seems pretty simple to compile to wasm: https://developer.mozilla.org/en-US/docs/WebAssembly/Rust_to...


Isn't it already a supported backend in LLVM? https://github.com/llvm/llvm-project/tree/master/llvm/lib/Ta...

I mean it is still under development but I think it is usable. Here's a quick intro I found some time ago: https://dassur.ma/things/c-to-webassembly/


The WASI tutorial for compiling C/Rust to wasm and running it has some useful information on this front: https://github.com/bytecodealliance/wasmtime/blob/master/doc...

We wouldn't have been able to put wasm in Firefox like this if there wasn't decent compiler/runtime support via the clang ecosystem.


That C code has a bug where the inner write loop writes the same portion of the buffer after a short write.

The Rust code, by contrast, slurps up the entire file contents into a buffer before writing it out. If someone is going to write code that way why even bother with a low-level language? (This "memory is infinite" mentality is something I've noticed in other Rust projects, even major ones like mio.)


Good catch on that bug! I'll fix that.

Beyond that, that file is just a simple example for showing how to work with the toolchain and the sandbox.


TinyGo looks pretty interesting

https://tinygo.org/



