The Bytecode Alliance: Building a secure, composable future for WebAssembly (hacks.mozilla.org)
407 points by markdog12 on Nov 12, 2019 | 255 comments

I'm one of the folks working with the Alliance, and I'm incredibly excited about WebAssembly outside the browser. Happy to answer questions.

Imagine extensions for applications or databases, written in any language you want, with no ability to exfiltrate data. Imagine supporting a safe plugin API that isn't just for C and languages that FFI to C, but works natively with safe datatypes.

Today, if you want to be extensible, you typically expose a C API for people to link to, or you embed a specific language like Lua or Python. Imagine embedding a WebAssembly runtime like wasmtime, and telling people they can write their extensions in any language they want.

The post mentions:

> tiny IoT device

But, I've looked at this in the past and concluded wasm was a pretty poor fit for small devices: 1) it only addresses memory in 64KB pages (perhaps more SRAM than you might have) and 2) requires both single and double float support.

(1) might be possible to work around by backing the memory space with smaller page allocations, at the cost of an indirection on every memory access. (2) seems pretty insurmountable: it'll just need a chunk of extra C library.

Are you aware of any projects working on this for smallish microcontrollers?
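
To make workaround (1) concrete, here is a minimal sketch, not taken from any real runtime: the module still sees a full 64 KiB page, but the host only allocates small chunks once they are written to. The class name, method names, and the 256-byte chunk size are all assumptions for illustration.

```python
WASM_PAGE_SIZE = 64 * 1024  # WebAssembly addresses memory in 64 KiB pages


class SparseLinearMemory:
    """Linear memory backed by small chunks that are allocated on first write."""

    def __init__(self, wasm_pages, chunk_size=256):
        self.size = wasm_pages * WASM_PAGE_SIZE  # logical size the module sees
        self.chunk_size = chunk_size             # hypothetical SRAM-friendly size
        self.chunks = {}                         # chunk index -> bytearray

    def _locate(self, addr):
        # Every access pays for this bounds check plus a table lookup.
        if not 0 <= addr < self.size:
            raise IndexError("out-of-bounds memory access")
        return divmod(addr, self.chunk_size)

    def load8(self, addr):
        index, offset = self._locate(addr)
        chunk = self.chunks.get(index)
        return chunk[offset] if chunk is not None else 0  # untouched memory reads as zero

    def store8(self, addr, value):
        index, offset = self._locate(addr)
        if index not in self.chunks:
            self.chunks[index] = bytearray(self.chunk_size)
        self.chunks[index][offset] = value & 0xFF

    def resident_bytes(self):
        return len(self.chunks) * self.chunk_size


mem = SparseLinearMemory(wasm_pages=1)
mem.store8(0, 42)
mem.store8(WASM_PAGE_SIZE - 1, 7)
print(mem.load8(0), mem.load8(WASM_PAGE_SIZE - 1), mem.load8(1000))
print(mem.resident_bytes(), "bytes resident out of", mem.size)
```

The lookup in `_locate` is the per-access indirection mentioned above; whether that cost is acceptable on a given microcontroller is something this sketch cannot answer.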

You don't have to support floats; you can, for instance, support loading and validating WebAssembly modules only if they don't use any floating-point operations.
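
A rough sketch of that idea, assuming the module has already been decoded into a list of text-format instruction mnemonics (the function names are made up for illustration). A real validator would also have to look at the value types used in function signatures, locals, and globals.

```python
def uses_floating_point(instructions):
    """True if any mnemonic touches f32 or f64, including conversions
    such as i32.trunc_f32_s that only mention a float type in the suffix."""
    return any("f32" in op or "f64" in op for op in instructions)


def validate_integer_only(instructions):
    """Reject a decoded function body that uses floating-point operations."""
    if uses_floating_point(instructions):
        raise ValueError("module rejected: floating point is not supported here")
    return True


integer_body = ["local.get", "local.get", "i32.add", "i32.const", "i32.mul"]
float_body = ["local.get", "f64.const", "f64.mul", "i32.trunc_f64_s"]

print(validate_integer_only(integer_body))
print(uses_floating_point(float_body))
```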

Much like C, which also requires floating point support. WebAssembly implementations for tiny IoT environments can emulate floating point, relying on people not using it heavily, just as you'd expect them not to run Node.js. But at least the developer has the option, which can have significant value, such as during exploration and experimentation phases.

Similarly, compilers like GCC will synthesize 64-bit arithmetic inline for i386 and other targets lacking 64-bit support. These days few people think twice about using 64-bit data types even though 32-bit processors are still in heavy use. (Though, AFAIU emulation has largely moved from compiler to the instruction decoder or microcode.)
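
For anyone who hasn't seen what that synthesis looks like, here is a small sketch of a 64-bit add built from 32-bit halves with an explicit carry. It illustrates the general technique only; it is not GCC's actual output, and the helper names are made up.

```python
MASK32 = 0xFFFFFFFF


def add64(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values given as (low, high) 32-bit halves."""
    lo = (a_lo + b_lo) & MASK32
    carry = 1 if lo < a_lo else 0        # the low word wrapped around
    hi = (a_hi + b_hi + carry) & MASK32  # the high word wraps like a 64-bit register
    return lo, hi


def split(value):
    """Split a 64-bit integer into (low, high) 32-bit halves."""
    return value & MASK32, (value >> 32) & MASK32


def join(lo, hi):
    """Recombine (low, high) halves into one integer."""
    return (hi << 32) | lo


a, b = 0x00000001FFFFFFFF, 0x0000000000000001
print(hex(join(*add64(*split(a), *split(b)))))
```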

> it only addresses memory in 64KB pages

Is this required by the standard, or is it a property of implementations?

This isn't entirely accurate. Exploitations in whatever code is converted to wasm instructions still work inside the wasm memory space.

As in, if you write your program in C, Python, whatever, and put it in wasm, then have a wasm extension system of some sort: if that C or Python is exploitable, then whatever external access that wasm instance has is now usable. And whatever memory is in the memory space can be modified.

This is an important distinction. Wasm should be thought of as a tool to containerise execution and memory, if the implementation is proven to do so.

The point is that you can more easily and clearly sandbox the wasm plugin. If the Python implementation tries to send HTTP requests but the wasm sandbox has no network access, you are safer in wasm than in Python.

Unless you choose to sandbox your Python. Obviously the language itself doesn't support this, but most OSes have functionality for this that can be applied to any application or code, not just wasm.

Does WASM let you run memory as code?

No. Marking memory as executable is a native, machine-level capability (as in x86_64 machine instructions). Wasm is a stack machine that only executes validated Wasm code, and that code lives apart from the linear memory a module can read and write. The binary format is essentially just a compressed/minimized form of the bytecode, which used to essentially be an AST of the original source code; I believe they have since modified it to be more analogous to hypothetical machine code, though still abstracted. The Wasm engine does not understand whatever x86 or ARM machine code is fed to it, and rightly so. If that were possible, anyone could just send a buffer over to the browser to execute whatever they want.
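
A toy illustration of that separation, under heavy simplifying assumptions: a four-instruction stack machine whose code is an immutable tuple and whose data lives in a separate byte array. The mnemonics are borrowed from the wasm text format, but this is not how any real engine is implemented.

```python
def run(code, memory):
    """Execute (opcode, argument) pairs against a separate data memory."""
    stack = []

    def check(addr):
        if not 0 <= addr < len(memory):
            raise IndexError("out-of-bounds memory access")
        return addr

    for op, arg in code:
        if op == "i32.const":
            stack.append(arg)
        elif op == "i32.add":
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & 0xFFFFFFFF)
        elif op == "i32.store8":
            value, addr = stack.pop(), stack.pop()  # value is on top of the stack
            memory[check(addr)] = value & 0xFF
        elif op == "i32.load8_u":
            stack.append(memory[check(stack.pop())])
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack


program = (
    ("i32.const", 0),      # address
    ("i32.const", 0x90),   # a byte that happens to be an x86 NOP
    ("i32.store8", None),
    ("i32.const", 0),
    ("i32.load8_u", None),
    ("i32.const", 1),
    ("i32.add", None),
)
memory = bytearray(16)
print(run(program, memory), memory[0])
```

The program can write any bytes it likes into `memory`, but nothing in the instruction set turns those bytes into instructions; they only ever come back as data.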

That's what I thought. Then that's one whole category of exploits that no longer applies, no?

Yes, assuming there are no flaws in the interpreter. But they're all moving full steam ahead in the browsers right now, so I imagine it's deemed as safe as JavaScript. The current push is to divvy up the permissions for certain resources such as sockets and the filesystem (browser code doesn't have filesystem access) and make an API to ensure that modules don't overstep their granted permissions. It'll probably look a lot like node and npm, where you could add a module to your source and know that it can't access things it shouldn't. Pretty exciting stuff. It's a true merger of all languages together under one runtime.

That's true, but that's a class of exploits that is mostly impossible anyway (except in an embedded environment). Most systems that have an MMU ensure that no page is both writeable and executable, meaning you can't inject code. This forces attackers to use ROP.

> Imagine embedding a WebAssembly runtime like wasmtime, and telling people they can write their extensions in any language they want.

Is that desirable? Won't it lead to a bloated mess with every extension dragging in a different language runtime with it?

Web developers don't seem to have any concern about using gigantic quantities of memory. Don't hire a web developer to develop for games. :)

It's almost as if spending time optimizing memory usage isn't worth it because server RAM is cheap!

What does server RAM have to do with modern web development, where the vast majority of the work is done on the frontend?

With several programs using wasm, the runtimes can be shared, the way shared libraries work in modern OSes, or shared assemblies in .NET.

Yeah, but sometimes we are willing to run an entire nano Linux in a WASM module if it helps launch a few days earlier!

> no ability to exfiltrate data

As we've learned with things like rowhammer and Spectre, this is a very high bar. How does this initiative plan to deal with side-channel attacks?

This is discussed in the blog post; see the discussion of "time protection".

If you're just reading HN comments:

"There is unfortunately one less straightforward way that attackers can access another module’s memory—side-channel attacks like Spectre. OS processes attempt to provide something called time protection, and this helps protect against these attacks. Unfortunately, CPUs and OSes don’t offer finer-than-process granularity time protection features."

"[snip] Making a shift to nanoprocesses here would take careful analysis."

"But there are lots of situations where timing protection isn’t needed, or where people aren’t using processes at all right now. These are good candidates for nanoprocesses."

"Plus, as CPUs evolve to provide cheaper time protection features, WebAssembly nanoprocesses will be in a good position to quickly take advantage of these features, at which point you won’t need to use OS processes for this."

I'm particularly interested in a world where wasm binaries replace (or supplement) containers (or even VMs?). I'm specifically imagining Kubernetes or AWS Lambda but with direct wasm execution (instead of wasm-in-container or wasm-in-VM). I'd be curious if anyone has put more thought into this and what possibilities they envision.

> and I'm incredibly excited about WebAssembly outside the browser.

What's your take on the browser, instead? Curious to hear your point of view.

p.s. Thanks for working with the Alliance, I think it's great news overall.

I am very interested in the inter-"process" communication (or inter-"module" communication, to be precise). On the surface, this runtime and module system look pretty similar to the OSGi runtime plus OSGi security. Not that I want to scare folks here, as OSGi has a bad reputation for being bloated :-)

So, like, Java?

> Imagine extensions for applications or databases, written in any language you want, with no ability to exfiltrate data

How does Java preclude plugins exfiltrating data?

Downvoters: I’m not trying to be sacrilegious; I genuinely don’t know the answer.

You can use a security manager and define permissions for what the loaded Java code can do.

Unfortunately, it's fraught with danger because of the confused lieutenant issue (you'd need to give permission to parts of the app but not others, and doing so isn't trivial).

Just to help others searching for "confused lieutenant," I believe it's usually known as the "confused deputy problem," perhaps as a reference to Barney Fife. I do like the image of a confused lieutenant though.

Fantastic! Since I made quite a few Silverlight RIAs, WASM is like Silverlight on steroids. I am surprised that Uno Platform is not a founding member of the Alliance. AFAIK, they are the most devoted group for WASM (I hope that doesn't offend you). I use Uno Platform to make WASM apps with ease and pleasure. The current major issue is the long startup time. I hope there will be a significant improvement in this area soon.

Is there a plan to bring together a bunch of these extensions under one roof with a package/dependency manager similar to cargo or npm, or is it intended to be glued together on the developer side? To me this looks massive, like to be able to pull down modules knowing what permissions they need, and to be able to have some kind of assurances that the code can't misbehave.

Will this work on platforms that do not allow JIT compilation?

Lucet supports ahead-of-time compilation, and WAMR provides an interpreter.

And Wasmtime will also support both of those. Support for environments in which JITting is not an option is of course really important to this!

I was talking with a colleague today who mentioned that he had looked at WASM for a particular use-case (file-verification IIRC) and had concluded that for now the overhead of copying memory made it run worse than well-written JavaScript. It is also my experience that the overhead of memory copying can really put a damper on performance improvements.

Now, I get that sharing memory is a huge safety issue - it kind of inherently breaks the sandbox, but when I see the "nanoprocesses" bit in the article I worry about death by a thousand paper cuts (lots of tiny WASM modules spending more time copying data than processing it). Are there ways/plans to minimize memory copies that don't conflict with the safety concerns?

There will be ways to memory-map files, share memory between modules, and otherwise avoid copying memory unnecessarily.

Shared memory doesn't break the sandbox. Sharing all memory by default would, but controlled sharing of specific memory doesn't. Think of it like inter-process shared memory, rather than threads sharing an entire address space.

Reading through this, this sounds like mostly a capability based architecture, with process isolation replaced with a statically typed bytecode in which you can verify that each module can only use capabilities passed to it.

I was wondering why the focus on copying memory between the processes. It should be possible to make capabilities which represent a pointer and a length (and maybe an access mode), which could be used to give a process direct access to shared memory safely. I don't know if that could be done with low enough overhead for small objects, it's possible that there would need to be a threshold below which you would just copy values to be efficient, but it seems like exploring fat pointer capabilities would be worthwhile.

Memory mapping files or inter-process shared memory is the very coarse-grained version of this, using hardware protection. I feel like it should be possible to do something more efficient and finer grained with pointer capabilities and the static type checking that is done to verify WASM during compilation, but it may be a significant research project of its own.
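
To sketch what I mean by a pointer-plus-length capability: the class below is purely illustrative (hypothetical names, dynamic checks only). Python cannot actually stop code from reaching the underlying buffer, so treat it as a model of the rules a verifier would have to enforce, not as an enforcement mechanism.

```python
class MemoryCapability:
    """A (base, length, access mode) view of a shared buffer, checked on every access."""

    def __init__(self, memory, base, length, writable=False):
        if base < 0 or length < 0 or base + length > len(memory):
            raise ValueError("capability exceeds the underlying memory")
        self._memory, self._base, self._length = memory, base, length
        self._writable = writable

    def _address(self, index):
        if not 0 <= index < self._length:
            raise IndexError("access outside the capability's bounds")
        return self._base + index

    def read(self, index):
        return self._memory[self._address(index)]

    def write(self, index, value):
        if not self._writable:
            raise PermissionError("capability is read-only")
        self._memory[self._address(index)] = value & 0xFF

    def narrow(self, offset, length, writable=False):
        # A derived capability may only shrink the range and drop rights.
        if offset < 0 or length < 0 or offset + length > self._length:
            raise ValueError("cannot widen a capability")
        return MemoryCapability(self._memory, self._base + offset, length,
                                writable and self._writable)


memory = bytearray(b"secret|public data|secret")
public = MemoryCapability(memory, 7, 11)  # covers only b"public data"
print(bytes(public.read(i) for i in range(11)))
```

The open question from the comment remains: each `read` here costs a bounds check, which is exactly the overhead that might make copying cheaper for small objects.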

What you're describing will indeed be introduced with the WebAssembly GC proposal: https://github.com/WebAssembly/gc

For languages that can express unforgeable pointers as first-class concept, that is indeed a very attractive, fine-grained approach. Unfortunately bringing that to languages like C/C++/Rust is a different matter altogether.

Since we want to support those languages as first-class citizens, we can't require GC support as a base concept, so we have to treat a nanoprocess as the unit of isolation from the outside.

Once we have GC support, nothing will prevent languages that can use it from expressing finer-grained capabilities even within a nanoprocess, and that seems highly desirable indeed.

(full disclosure: I'm a Mozilla employee and one of the people who set up the Bytecode Alliance.)

That future possibility reminds me of https://en.wikipedia.org/wiki/Singularity_(operating_system) - where process/address-space isolation was replaced with fine-grained static verification of high-level code (presumably not the first experiment in this area).

Indeed: that and many other things are prior art in this space. And there is a lot of prior art for what we're working on—this is not meant as an academic research project! :)

Yes, one of the answers I want to give any time someone asks "why will WASM succeed when the JVM didn't" is that there is 25 years more experience and research to draw upon.

And yet bounds checking access validation was left out of the design, something that most of previous research projects took care to taint as unsafe packages when present.

> For languages that can express unforgeable pointers as first-class concept, that is indeed a very attractive, fine-grained approach. Unfortunately bringing that to languages like C/C++/Rust is a different matter altogether.

The semantics of these languages aren’t incompatible with unforgeable references, though: it generally works in practice, but it’s technically undefined to create pointers out of thin air. Why can’t we take advantage of the standard here to disallow illegally created references? (Which, as I understand it, many other vendors are already beginning to do with e.g. pointer authentication and memory tagging.)

It's slow. You should read up on the challenges of implementing memcpy in C emulated on Java. Basically you have to manually implement paging.

What would allow other languages to represent unforgeable pointers as a first class concept and not C/C++/Rust?

Forging a pointer is UB in all of these languages as far as I know.

It seems like you should be able to have opaque types that represent these unforgeable pointers which you can't do arithmetic on or cast to raw pointers, but can access values in type safe ways, or provide a view to a byte slice which does bounds check on access.

Is there a good place for discussion of this design? I seem to be having this conversation with you and Josh both here and on Reddit, and it seems like a lot of the discussion is spread out in a lot of places.

In unsafe rust you can arbitrarily increase the length of a vector/string by modifying the stored length. You do not need to forge the pointer itself to break the pointer's invariant.

You would need to do either static or dynamic bounds checking when accessing memory via these capabilities. You obviously can't just give arbitrary code a pointer and let it read however far it wants past the end of it.

Given that most code in Rust is safe code and includes bounds checks before access, you should be able to have the verifier rely on those when they exist, and add in bounds checks in cases in which the access is not protected by a bounds check.

Maybe that would be intractable, or too inefficient to be worth it with all of the extra bounds checks. I'm not sure. I'm asking because it's something that I feel should be possible, but I haven't been involved in the research or development, so I'm wondering if those who have been more involved have references to discussion about the topic.

> Since we want to support those languages as first-class citizens, we can't require GC support as a base concept

I feel like you're overthinking it. Can't you just have a table that holds GCable objects and only hand out indexes to C and co?

That is how we support references in the Rust toolchain right now, via wasm-bindgen, and it's an important part of making unforgeable references work for languages that rely on linear memory.

It doesn't help with making capabilities more fine-grained, though: we have to treat all code that has access to that table as having the same level of trust.
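
For readers following along, the table-of-handles idea can be sketched roughly like this. It is a generic illustration with made-up names, not wasm-bindgen's actual implementation: the host keeps the real objects, and linear-memory code only ever holds small integers.

```python
class HandleTable:
    """Host-side table: guest code sees integer handles, never the objects."""

    _EMPTY = object()  # marks a slot whose handle has been released

    def __init__(self):
        self._slots = []
        self._free = []  # released indexes available for reuse

    def insert(self, obj):
        if self._free:
            handle = self._free.pop()
            self._slots[handle] = obj
        else:
            handle = len(self._slots)
            self._slots.append(obj)
        return handle

    def get(self, handle):
        if not 0 <= handle < len(self._slots) or self._slots[handle] is self._EMPTY:
            raise KeyError(f"invalid handle: {handle}")
        return self._slots[handle]

    def release(self, handle):
        self.get(handle)  # raises if the handle is not currently valid
        self._slots[handle] = self._EMPTY
        self._free.append(handle)


table = HandleTable()
doc = table.insert({"title": "hello"})
print(doc, table.get(doc))
table.release(doc)
```

Note how this matches the limitation described above: any code that can call `get` can guess small integers, so everything sharing one table shares one level of trust.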

Right, I was being sloppy - I guess I was really thinking "memory addresses", not "memory".

Anyway, that's great to hear! Any suggestions for discussions and such that one could follow to see how it develops?

Yes. We know that WebAssembly in its present form doesn't yet have all the functionality we'll want to make all this efficient, so we're active in many areas in the WebAssembly standards process helping move it forward.

WASM has some pretty zany ideas about computing that make the “ASM” part feel a little strange.

Web binary lisp sans tail call optimization is a little more fitting IMO.

In my opinion this is a fascinating approach, and it may end up transforming our industry.

But the main question I've had is how big the overhead is, specifically since modules don't share their wasm Memory. That means data will be constantly copied between them. Compared to regular static or even dynamic linking, that may be a noticeable slowdown.

> specifically since modules don't share their wasm Memory. That means data will be constantly copied between them.

We're working on that. We want to make it possible for modules to share memory in a controlled way, without giving access to their entire address space.

Shared memory can certainly do this. Even separate processes can use shared memory to communicate lock free using atomics.

Other than that, I wonder if nested page tables that are normally used for virtualization can be used to separate address spaces within a process.

Interesting! Is that written up somewhere public?

Some of it; see https://hacks.mozilla.org/2019/08/webassembly-interface-type... for the starting point that this will be based on.

Oh, where? I don't think I see anything about controlled memory sharing there, but I guess I missed it. Which section is it in?

It isn't yet, that work is still in progress. WIT is just the starting point that this would be based on.

(Sorry for the lack of more concrete information, documentation is still in progress.)

I see, thanks. Curious to learn more when it is public.

> specifically since modules don't share their wasm Memory. That means data will be constantly copied between them.

It is possible today to have modules import memory from the host, which means the host can provide the same imported memory to multiple modules. Then modules can pass pointers to each other without needing any memcpys.

It does require the modules to not tread on each other though, such as by requiring them to have distinct static offsets (poor man's linking), or by requiring them to be PIC / compiled with relocation info so that they can be relinked by the host loader.
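
Roughly, the "poor man's linking" variant looks like the sketch below. The two functions stand in for two wasm modules that import the same host-provided memory, and the region layout is a hypothetical convention that nothing here enforces.

```python
PAGE = 64 * 1024
shared_memory = bytearray(PAGE)  # one memory created by the host, imported by both modules

# Hypothetical static layout agreed on at link time: each module owns one region.
PRODUCER_BASE, CONSUMER_BASE, REGION_SIZE = 0, 32 * 1024, 32 * 1024


def producer_write(memory, text):
    """Producer writes into its own region and returns a (pointer, length) pair."""
    data = text.encode("utf-8")
    if len(data) > REGION_SIZE:
        raise ValueError("message does not fit in the producer's region")
    memory[PRODUCER_BASE:PRODUCER_BASE + len(data)] = data
    return PRODUCER_BASE, len(data)


def consumer_checksum(memory, ptr, length):
    """Consumer reads the producer's bytes in place, without copying them into its own region."""
    view = memoryview(memory)[ptr:ptr + length]
    return sum(view) & 0xFFFFFFFF


ptr, length = producer_write(shared_memory, "hello")
print(ptr, length, consumer_checksum(shared_memory, ptr, length))
```

The consumer could just as easily scribble over the producer's region, which is why this arrangement only works between modules that trust each other.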

It doesn't seem to me from reading TFA that this is going away, merely that they are planning to add alternatives to make it more fine-grained.

It is possible to share memory in wasm today, yes (we use that to optimize wasm/JS interaction, and wasm/wasm dynamic linking) - and that won't go away, it's a core feature of wasm, you're right. But the specific approach in this article disallows that.

The article does mention a possible future extension of multiple wasm modules in a single nanoprocess (the section with "allowing native-style dynamic linking"), and JoshTriplett mentions in another comment some future ideas of sharing parts of memory but not all. Those things will compromise the strict initial requirement of no shared memory, and improve performance.

Those are the interesting questions for me - is not sharing memory too much overhead, and if it is, is there a way to relax that which preserves enough security with enough performance.

To the question of "how would dynamic linking work in the future", as already stated in the article, a nanoprocess doesn't need to correspond 1:1 to a wasm module/instance. Rather, N wasm instances can share a single wasm memory to collectively represent a single nanoprocess. From an external point of view, whether a nanoprocess is implemented as 1 or N instances wouldn't be visible -- it's an impl detail -- and the memory is (still) encapsulated by the nanoprocess. Thus, there is a two-level hierarchy of: a graph of shared-nothing-linked nanoprocesses each containing a graph of shared-everything wasm instances.

There's still a lot more details to figure out, of course, but that's true for wasm dynamic (shared-everything) linking in general.

I’d rather start with something provably secure and slower, and then optimize its performance and features over future iterations and versions.

Starting with speed and trying to patch in security doesn’t seem to make things actually secure.

Last I checked WebAssembly still didn't have a way of actually freeing memory it has allocated either.

You give it a heap. It doesn't allocate memory. It may allocate within its own heap, using whatever allocator/GC strategy it inherits from its host language.

Which means it can't free. But that's also not accurate, anyway; there's a brk()-like operation (memory.grow) to grow the heap. Which is such a bad starting point for a new runtime.

Would it be impossible to implement mmap and sbrk in the runtime?

mmap is arguably the only thing it should have had in the first place. brk/sbrk is fairly dead.

  $ strace /bin/ls 2>&1 | grep brk
  brk(NULL)                               = 0x55fb032b3000
  brk(NULL)                               = 0x55fb032b3000
  brk(0x55fb032d4000)                     = 0x55fb032d4000

  $ ktrace /bin/ls >/dev/null
  $ kdump | grep brk | wc -l
  $ kdump | grep mmap | wc -l

glibc, musl, and jemalloc got the memo, they just couldn't be bothered to stop using brk by default. They're all capable of using mmap exclusively, however, because whether or not brk is actually dead, there's long been consensus that mmap is the better abstraction on balance, particularly given the benefit of ASLR. So there's little reason to support brk instead of mmap for compatibility. Presumably it was chosen for the convenience of WebAssembly implementations--contiguous, flat memory is a great simplifier of VM, JIT, and even traditional AoT architectures.

Right, and while you can grow the heap you can't give any of it back, so you can't really dynamically contract your memory without restarting the process.
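
A small model of that behaviour, for illustration only: the class is hypothetical, but the `grow` contract follows wasm's `memory.grow`, which returns the previous size in pages or -1 on failure, and there is deliberately no operation that shrinks.

```python
PAGE_SIZE = 64 * 1024


class GrowOnlyMemory:
    """Models a wasm linear memory: it can grow in whole pages but never shrink."""

    def __init__(self, initial_pages, max_pages):
        self.data = bytearray(initial_pages * PAGE_SIZE)
        self.max_pages = max_pages

    def size(self):
        return len(self.data) // PAGE_SIZE

    def grow(self, delta_pages):
        """Return the previous size in pages, or -1 if the request cannot be met."""
        old = self.size()
        if delta_pages < 0 or old + delta_pages > self.max_pages:
            return -1
        self.data.extend(bytes(delta_pages * PAGE_SIZE))  # new pages are zero-filled
        return old


heap = GrowOnlyMemory(initial_pages=1, max_pages=4)
print(heap.grow(2), heap.size())
print(heap.grow(5), heap.size())
```

An allocator inside the module can reuse freed blocks within `data`, but the pages themselves stay with the instance until it is torn down.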

I thought that was the host's work. You import a function that does it. If we're talking about shared objects. Your own module's memory is managed by you (or your language's runtime)

This is great. I've been waiting for a module-level permissions system for a while now; it definitely seems like the best approach to mitigate supply chain attacks.

Hopefully once WASM has demonstrated the principle other languages will follow. This seems like it'd be especially useful in the JS ecosystem, where many modules are already small enough that they'd likely be able to run with no permissions at all.

The JVM was supposed to be this. Gosling said publicly that the JVM is more important than Java.

Many things went wrong. Microsoft was actively sabotaging the JVM. They implemented a very fast JVM for Explorer and their operating system that intentionally broke the JVM 1.1 standard. See Sun vs. Microsoft, 1997. Microsoft lost and paid damages. .NET was created to do more damage.

I like the JVM and have written lots of code for it, but the Java download and install, setting up PATH, the non-native GUI look, the lack of exe files, the "should I get SE/EE/JRE/SDK" question and so on must be a big contributing factor to its lack of popularity (client side). A lot of the experience has been clumsy, ugly and unintuitive from the start.

The WebAssembly developer experience is also not the most friendly one, with its mix of toolchains, especially if one is on Windows.

Oh, and debugging is still at printf level style and reading raw bytecodes.

> debugging is still at printf level style and reading raw bytecodes

Wasmtime has had support for source-level debugging via lldb or gdb for a few months https://hacks.mozilla.org/2019/09/debugging-webassembly-outs...

That doesn't look production ready, nor is it integrated into browser developer tools.

Which is also why that method for delivering applications has been deprecated for a couple of years.

Now, the preferred way to ship a Java application is to bundle the runtime and the application into a single installer similar to, for example, Electron apps.

JVM bytecode semantics were all about Java - it forces the Java object model on you, for example. It was never designed to allow a diverse ecosystem of languages.

Yet we still see a lot more new languages on the Java platform than on, say, .NET, which was designed to be polyglot.

If you look at those languages closely, they are usually high-level enough that they can be mapped reasonably well to the JVM object model, or else the language itself is intentionally designed around its limitations (e.g. Scala). But something like C++ doesn't really compile to Java bytecode well.

.NET CIL is much better in that regard precisely because it was designed to be low-level enough to compile C to it - it has stuff like raw pointers and pointer arithmetic, stack-allocated arrays, unions etc. The reason why not many languages bothered is longstanding lack of cross-platform support - Mono was around for a while, sure, but it was not the official implementation, and nobody knew whether it'd still be there next year. Today that's not really an issue anymore, but by now wasm is a better choice.

You can blaze a trail that others literally follow, or you can blaze a trail that others figuratively follow.

The JVM is definitely more important than Java. But it might not be the most important VM, let alone the most important thing in software.

And other languages/platforms emulating that plan doesn't diminish the JVM, except numerically.

Yeah, Microsoft actively sabotaged JVM by developing CLR. Those monsters!

There should not be a 'browser wasm' and a 'non browser' wasm.

The wasm committee made many mistakes by treating it as an idea on paper, and not writing the actual implementing software.

This has resulted in significant fragmentation of implementations, each less trustworthy than the last.

If any software is going to advertise safety, it must prove it. That's done through the feedback cycle and careful development. The only ones that have this are in browsers.

Yet browsers have their own dubious implementation, integrated with their JavaScript environment.

Unless wasm decides on a set of standard runtimes that become trustworthy there will not be a wasm outside the browser.

For example, when I search for python, I get python. There's still other pythons, but there's the python. This is true of all other software used. Wasm is failing to do that.

Try searching for C, C++, or even SQL or Javascript. You will quickly see that you don’t have “the official implementation”, because that’s not how things work in practice when considering well standardized and widely used software.

You're mistaking a written standard for an implementation standard. You only need a standard in practice.

All of those things you've listed have only a few big players, with enough use and a feedback cycle that there can be multiple different implementations.

Those different implementations are very similar. There is not a different C for each compiler; they have their own quirks and benefits, and target different systems.

JavaScript is the least selective of any language; you will only find runtimes for its modern versions in browsers, typically V8 or Safari. V8 has become the effective standard.

SQL has multiple big implementations that deviate completely. This has a lot to do with marketing, and the fact data storage/retrieval is a significantly diverse section of computer science.

In each of these known technologies there is a well known, well backed, named software project whose name people use synonymously with its supposed standard.

Wasm doesn't have this, and that's because the implementations took place in browsers, which afaik did not implement a unique engine for it. They just integrate wasm instructions into their JS instructions.

Which then led to there being dozens of different self-described 'safe' implementations. None of these have the users, time, and open source community necessary to know this. Which will inevitably lead to exploits.

Given that situation, the only way wasm is going to live up to its goal of safety is backing a well made runtime.

The problem of safety is what makes this very different from other software. Your own code with bugs is an annoyance, and maybe you can't do what you wanted. But code that is meant to be safe that instead allows for damage to you, your company - that deserves a much higher level of scrutiny.

Super cynical, I realize, but whenever I see the description “secure by default” for something computer related, and they don’t mean, “we unplugged it,” I assume what they really mean is, “we made the code so complex we can’t find the problems.”

Instead, they mean something sensible: that permission has been linked with designation, as per capability security. Adding feature extensions is thus linked with the authorities required to implement those features, and the modules implementing those features can't do anything behind the scenes.

So you would no longer be able to amplify a string value into a file descriptor as you can in systems with ambient authority (which is nearly every system in widespread use today).

So "secure by default" here means, "programs conform to least privilege".
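
A tiny sketch of the difference, with made-up function names. Python itself has ambient authority everywhere, so the second function only models the discipline; in a capability system the first style simply would not be expressible.

```python
import io


def count_lines_ambient(path):
    # Ambient authority: any string the caller (or an attacker) supplies is
    # turned into an open file using the whole process's permissions.
    with open(path, encoding="utf-8") as stream:
        return sum(1 for _ in stream)


def count_lines_capability(stream):
    # Capability style: the caller hands over an already opened stream, so
    # designating the file and granting access to it are the same act.
    return sum(1 for _ in stream)


print(count_lines_capability(io.StringIO("a\nb\nc\n")))
```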

I suggest reading the WebAssembly paper:

Bringing the web up to speed with WebAssembly


Well, you should really read it as "eliminating certain class(es) of vulnerabilities".

Don't take this the wrong way, but it looks like the Java launch a (long) while ago. What makes wasm better than Java?

The toolchains for building WebAssembly from numerous languages, for one thing. The existence of an LLVM WebAssembly backend helps. (While eventually there were other languages that targeted the JVM, for a long time if you wanted the JVM sandbox you had to write Java.)

WebAssembly also provides a fine-grained API surface area; you can run a WebAssembly sandbox with no external functions provided, or just a few.

WebAssembly's sandboxing isn't tied to the web; we're keeping all the same security properties when running code locally, and we're protecting modules from each other too.

Also, the WebAssembly bytecode format is designed from the beginning to support many different kinds of languages, including languages that directly store types in memory, rather than keeping everything as garbage-collected or reference-counted objects on the heap.

> WebAssembly also provides a fine-grained API surface area

Are there concerns about compatibility when browsers are inevitably forced to reduce that surface area because of a security flaw?

And how are we going to keep all browsers on the same page with regard to what functionality they provide? It will suck if wasm turns into a cross browser compatibility nightmare.

WASM modules by themselves can do pretty much nothing besides allocating memory. If you want to use some API, you have to explicitly expose it, as a user.
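
As a rough model of that (plain Python standing in for a runtime, with hypothetical module and host-function names): instantiation fails unless the embedder grants every import the module declares, and the running code only receives the functions it was given.

```python
def instantiate(module, granted):
    """Link a module against exactly the host functions it was granted."""
    missing = sorted(set(module["imports"]) - set(granted))
    if missing:
        raise PermissionError(f"imports not granted: {missing}")
    # The instance only ever sees the functions it declared and was given.
    imports = {name: granted[name] for name in module["imports"]}
    return lambda *args: module["run"](imports, *args)


log_lines = []
host_functions = {"log": log_lines.append}  # hypothetical host API: no "connect", no "open"

greeter = {
    "imports": ["log"],
    "run": lambda imports, name: imports["log"](f"hello, {name}"),
}
uploader = {
    "imports": ["log", "connect"],
    "run": lambda imports, data: imports["connect"]("example.invalid"),
}

instantiate(greeter, host_functions)("wasm")
print(log_lines)

try:
    instantiate(uploader, host_functions)
except PermissionError as error:
    print(error)
```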

Just like MSIL, Xerox PARC microcoded bytecode, IBM mainframes language environments and plenty of other examples.

Yes, the concept, as well as implementations, has existed for decades. I think WASM is hyped because it comes from the Web/JS community and they have good communication, but technically there is nothing new under the sun.

I would say that being a success instead of a failure is something new. The JVM was successful but not for plugins, nor for embedded safety.

Also the fact that many vendors have reached a consensus on both an MVP and an update process for future features looks like a win. If Sun, Microsoft, Apple, and Google had all been on board with the (supposedly free software) JVM the story would have been very different.

Ironically WebAssembly advocacy keeps being defensive against JVM, while cleverly forgetting the dozens of other formats that also offered similar capabilities, including executing C derived languages.

> The existence of an LLVM WebAssembly backend helps.

So why aren't people just shipping straight up LLVM intermediate language VMs and instead go through wasm?

Incidentally, would you consider wasm as a destination language for virtual machines for obfuscation? I.e. is it reasonable enough to implement it all in about a week? Plus I fear the decompilation/disassembly tooling might be there too soon for it to be a real viable option, but maybe nobody's been working on that yet.

LLVM IR is an unstable format, changes a lot and only has one implementation. So it's not well suited for a standard. Google actually did use it for PNaCl but fortunately Mozilla prevailed when wasm was created. "Do what LLVM does" is not a valid strategy. On a related note, I'm glad that Firefox doesn't support the "Do what sqlite does" WebSQL api either, but Chrome (sadly) does.

Plus it’s not really portable.

It is intentionally not portable.

> So why aren't people just shipping straight up LLVM intermediate language VMs and instead go through wasm?

This is answered in the WebAssembly FAQ: https://webassembly.org/docs/faq/#why-not-just-use-llvm-bitc...

Thanks. That one's on me doing incomplete research.

That still leaves open the question of wasm as a VM for obfuscation (RE tooling, ease of implementation from scratch) though.

Do you have a typo in your second line? I can’t figure out what your question is.

Java failed to deliver a safe sandbox. Browsers finally booted it because of numerous vulnerabilities. For me wasm looks exactly like another JVM attempt, but it's a good thing, because the idea is good, we just need a better implementation.

Now if I don't need a sandbox, it's going to be a tougher sell. But who knows, maybe it'll outperform the JVM on bare metal some day.

Technically I agree, but nowadays with 5G+Edge computing, where low latency application requirements are essential, there will be other performance constraints not tackled by Java that need new solutions. Previous JVM attempts mostly focused on "write once, run anywhere". But now, "anywhere" means "anywhere and quickly" :)

With a proper implementation, the JVM could be kept running and loading an applet would be extremely fast.

WebAssembly also has its security issues.

Lack of bounds checking for multiple data accesses mapped to the same linear memory block.

Right, but this isn’t a sandboxing issue.

How would you enforce bounds checking while supporting unsafe languages?

That is exactly the point, don't advertise WebAssembly as safe bytecode, if unsafe languages are part of the picture without any kind of control.

Secondly, neither ISO C nor ISO C++ forbids implementations that do bounds checking by default. In fact that is what most modern compilers do by default in debug mode.

Finally look at memory tagging in Solaris SPARC ADI, Apple iOS or the upcoming ARM extensions support on Android for how bounds checking is enforced at hardware level while supporting unsafe languages.

The point of the byte code being safe is that no operations performed by WASM code could cause a memory unsafety bug from the perspective outside the sandbox. If your code violates its own memory rules then it will only mess up the logical state of its VM memory chunk, and probably produce an incorrect result. The same thing can happen in any safe language if you access the wrong indexes in an array because of a logical error in your code.

Those incorrect results might lead to security exploits, the same way that browsers now get exploited by taking advantage of how their VMs work.

So hand waving such security issues is rather strange, when it should be the top concern when selling an infrastructure to run code from unknown sources.

> the same way that browsers now get exploited by taking advantage of how their VMs work

The difference is whether the exploit is run by the browser or by the VM. If the VM has a logic error and gives back the wrong result, and the browser decides to trust that result and is exploited, then it is a browser bug, not a sandbox issue.

The other situation is when a browser spins up a VM to add 2 and 2 but then the VM starts downloading malicious files from the internet.

No kind of safe language can avoid the first class, wasm avoids the second.

> In fact that is what most modern compilers do by default in debug mode.

Assuming that “debug mode” is -g or equivalent, then I have not seen a modern compiler that does this.

Then try Visual C++, which does bounds checking on collection types (array, string, vector, ...) and dumps memory leaks at exit.

Or Solaris SPARC and iOS compilers that make use of hardware memory tagging.

Clang and GCC do need extra flags to enable FORTIFY mode though.

However on Android FORTIFY is now a requirement and future versions will make use of memory tagging on ARM hardware.

So ironically something like Android does have a better sandboxing model than WebAssembly.

> iOS compilers that make use of hardware memory tagging

No iOS devices ship with ARMv8.5, so while I think compilers are implementing this today I am not sure if Xcode ships with this or if it's functional.

> So ironically something like Android does have a better sandboxing model than WebAssembly.

…you're not understanding what sandboxing means.

Apple has their own CPUs,


I do understand what sandboxing means, and how WebAssembly advocates keep overselling its security capabilities, by ignoring issues that other bytecode formats have taken more seriously, as far back as the mid-60's.

I believe that wasm is not particularly focused on functional security as much as embeddable security.

Essentially the promise here is that you can download a random wasm from anywhere, run it with little to no privileges and be sure nothing bad can happen.

There was an article here many months ago detailing how wasm on the server makes it harder to mitigate attacks due to lack of wasm-inspecting tooling compared to system utilities for processes/native binaries.

But in part that is because the attack model of wasm is "literally executing malicious code".

Pointer authentication≠memory tagging. The former requires ARMv8.3 and is in the A12 processor, and the latter is not in any hardware that Apple is currently shipping.

Yep, fighting windmills here.

He’s probably talking about instrumentation (valgrind, ASAN etc.)

That’s part of why running instrumented code feels like running python.

No, just plain bounds checking on C++ standard library and integration with hardware memory tagging when available.

Can you link to an example of this exploit? Does javascript suffer from the same problem?

Any kind of typical C memory corruption that you can think of that fails to validate the buffer sizes that you give as parameters.

WebAssembly memory access bytecodes only do bounds checking of the linear memory block that gets allocated to the module.

You then do your own memory management taking subsets from that memory block and assigning it to the respective internal heap allocations or global memory blocks.

So you just need to have a couple of structs or C string/arrays, living alongside each other and overwrite one of them by writing too much data due to miscalculations of the memory segment holding the data.
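To make the scenario above concrete, here is a minimal sketch in Python rather than real WebAssembly. `LinearMemory` is a hypothetical stand-in for a module's linear memory, where the only bounds check is against the edge of the whole block. The layout (an 8-byte name buffer followed by a 1-byte flag) is an assumption made up for illustration.

```python
class LinearMemory:
    """Toy stand-in for a wasm module's linear memory (hypothetical, not a real runtime)."""

    def __init__(self, size):
        self.data = bytearray(size)

    def store(self, addr, payload):
        # The only check: the access must stay inside the whole block.
        if addr < 0 or addr + len(payload) > len(self.data):
            raise MemoryError("trap: out-of-bounds memory access")
        self.data[addr:addr + len(payload)] = payload

    def load(self, addr, length):
        if addr < 0 or addr + length > len(self.data):
            raise MemoryError("trap: out-of-bounds memory access")
        return bytes(self.data[addr:addr + length])


# The module's own allocator places two objects side by side (assumed layout).
NAME_ADDR, NAME_SIZE = 0, 8          # 8-byte name buffer
FLAG_ADDR = NAME_ADDR + NAME_SIZE    # 1-byte flag right after it

mem = LinearMemory(64)
mem.store(FLAG_ADDR, b"\x00")

# A strcpy-style copy that never checks NAME_SIZE: 9 bytes into an 8-byte buffer.
mem.store(NAME_ADDR, b"AAAAAAAA\x01")
flag_after_overflow = mem.load(FLAG_ADDR, 1)  # the neighbouring flag is now overwritten

# Only a write past the end of the whole block is rejected.
try:
    mem.store(60, b"too long")
    trapped = False
except MemoryError:
    trapped = True
```

The overflow never leaves the block, which is the sandboxing argument made elsewhere in this thread, but the module's own state has still been changed by the input.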

Wouldn't that just corrupt your own program? If there is a security flaw you could demonstrate it.

Except that you load code from multiple sources and one could eventually have a piece of JavaScript code that makes use of such behaviour to have access to some feature that by default is not accessible.

I'd rather let WebAssembly turn out to be the next Flash, when black hats start looking at it with the same care they had before, no need to waste cycles myself, as it is a lost battle against WebAssembly advocacy.

> Except that you load code from multiple sources and one could eventually have a piece of JavaScript code that makes use of such behaviour to have access to some feature that by default is not accessible.

With all due respect, that doesn't make WebAssembly unsafe in any way.

By the same logic, any program that takes any kind of user input is unsafe because the program could trust data it should not trust and then execute incorrectly.

If a program does not validate (untrusted) input then it is the program's fault. Not the input's fault or the input-producing-method's fault.

I agree with where you're coming from though. People are going to make mistakes and if the average developer has to interpret blobs of bits as meaningful data structures just to get things done, then we are going to see a lot of these types of problems. However, there are already projects in the works that are automating the $LANGUAGE to Wasm glue code which should completely mitigate this issue.

I don't understand what you mean by "a piece of JavaScript code that makes use of such behaviour". Webasm is accessed by javascript, not the other way around.

This really sounds like you have an axe to grind with webasm for some reason, the things you're saying seem like grasping at straws. It already works in browsers, so if there is something to exploit you could demonstrate it with a few files on github.

Yep, overselling security.

I guess everyone knows how to corrupt memory with C code, no need for github files.

Are you saying webasm will crash or that it is insecure? These are two different things and you seem to be conflating them. In your posts you have said that it is insecure but when you talk about specifics it just seems to be about crashing. Then when pressed you avoid any examples.

I keep giving the examples: memory corruption, typical C exploits.

Apparently the force is strong with WebAssembly advocacy.

I meant working examples. Saying "it's insecure!" and then calling your own words evidence doesn't count.

If there are actual exploits or security flaws, then demonstrate them with a working implementation. You seem to be trying to turn a technical discussion into an emotional one.

If you want code examples, just go to the CVE database and compile any memory corruption related to C into a WebAssembly module.

Why should I need to provide new examples when we have plenty to choose from?

Internal memory corruption is different from external memory corruption.

It still gets a CVE award in the end.

But they are addressed differently; in particular internal memory corruption is hard to address from a framework perspective, while external memory corruption can be almost entirely eliminated.

Java was wildly successful. It was kicked out of browsers because of its huge attack surface. WASM implementations seem to be going with a sandbox approach out of the gate.

A sandbox mostly used by cryptominers already.

This is just snark. Confining malware to the sandbox is a win. Malware authors would love to load something more damaging and more valuable, and they fall back to cryptomining as a last resort to get value out of your machine.

A non-technical advantage might be that it's not owned by one company.

Also, things don't get adopted because of their technical merits but because of random chance[1], or we wouldn't have Javascript today. So having "just another" go at trying to establish a sensible standard is good enough for me.

[1] From a technical standpoint, I consider politics, hype, parasitic corporate behavior, etc, "random chance."

On the web? If wasm follows its roadmap, it will become a 1st class web object, interacting with everything else on the same level as JS, HTML, and CSS. That means it will be part of your page, instead of funny images and blurry text with bad UI inside a little box that takes a while to load.

Being built into the browser. Look how many years and versions of Java made and then failed at that promise.

And a key point: "Our founding members are Mozilla, Fastly, Intel, and Red Hat"

Where are Google, Microsoft and the other big names you would expect?

We launched the Alliance to formalize collaboration on pre-existing projects we were all already collaborating on. We'd love to collaborate with others as well, and we're having a lot of additional conversations at the moment.

> Where are Google, Microsoft and the other big names you would expect?

Those folks are the incumbents. What do they have to gain from joining? Google especially has developed a colossal amount of tooling in-house that gives it an edge over others; new tech may make the lives of competition easier. Ditto to an extent for MS, though there, I suspect it's more bureaucracy and not seeing anything to gain.

After all MSIL already supported multiple languages, including C++, back in 2001.

Google and Microsoft probably don't have much use for WebAssembly outside the browser, whereas companies like Fastly and Cloudflare run customers' untrusted WebAssembly on their edge server networks.

Other than on the committee that designed WebAssembly, you mean?

I'm sure they'll join sooner or later. Don't forget Apple.

First of all, congrats on forming the alliance!

> Imagine extensions for applications or databases, written in any language you want, with no ability to exfiltrate data

That's what we are working towards on Wasmer, the server side WebAssembly runtime - https://github.com/wasmerio/wasmer

In fact, we already have a lot of different language integrations (maintained by us and the community) and our software is the pioneer in the space.

Is there any reason why you think it's a good idea to do a side-alliance instead of collaborating with us and the community so users and developers can be the ultimate beneficiaries? (It's good, it just seems it's not an alliance made for the users, which is a bit weird from my perspective.)

Disclaimer: I'm from the Enarx project (https://enarx.io) which is associated with the BytecodeAlliance.

I think a good analogy is: wasmer is to qemu as BytecodeAlliance is to rust-vmm.

Wasmer is a general purpose WASM runtime. But BytecodeAlliance is a place to build tooling for the construction of runtimes (WASM or otherwise), including special purpose runtimes. I think there is a lot of space for both in the growing non-browser WASM market.

Agreed! (regarding the non-browser WASM market)

Not sure if I got the analogy, but all I can say is that I'm very excited about the Enarx initiative :)

How does Enarx differ from MesaTEE?

> First of all, congrats on forming the alliance!

No Google on board, it means no Chrome support

What does Chrome have to do with a plugin for my image editor or a command line tool on my server?

Obviously it does, since it controls the biggest OTHER use case for WebAssembly (the v8 runtime used in both the browser and Node) and thus can very much influence WebAssembly towards what they like (or slow it down adoption-wise, etc.), even in areas not related with the web directly...

Does this mean we'll also have a mechanism like pledge(2)¹ to assert that the root nanoprocess or any privileged brokers only need access to certain APIs, and permanently close them?

¹ - https://man.openbsd.org/pledge.2

The WASI and Bytecode Alliance approach is based on capability-based security, granting only the access needed for a module to do its job.

To expand on this, capabilities allow us to go further than pledge(2): they enable selective forwarding of capabilities to other nanoprocesses, such as only forwarding a handle to a single file out of a directory, or a read-only handle from a read-write one, etc...
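A rough sketch of that kind of forwarding, in plain Python. `DirCap` and `FileCap` are hypothetical names invented for this example and are not part of WASI or wasmtime. Python does not enforce the privacy of the underscore fields, so this shows only the shape of the idea; a real runtime would enforce it at the sandbox boundary.

```python
class FileCap:
    """Hypothetical capability: holding the object is the permission."""

    def __init__(self, store, path, writable):
        self._store = store        # dict standing in for a file system
        self._path = path
        self._writable = writable

    def read(self):
        return self._store[self._path]

    def write(self, data):
        if not self._writable:
            raise PermissionError("read-only capability")
        self._store[self._path] = data

    def read_only(self):
        # Attenuation: derive a weaker capability from a stronger one.
        return FileCap(self._store, self._path, writable=False)


class DirCap:
    """Hypothetical read-write capability for a whole directory."""

    def __init__(self, store):
        self._store = store

    def open(self, name):
        if name not in self._store:
            raise FileNotFoundError(name)
        return FileCap(self._store, name, writable=True)


files = {"config.toml": b"debug = false", "notes.txt": b"private"}
root = DirCap(files)

# Forward a read-only handle to one file, not the directory itself.
cap = root.open("config.toml").read_only()


def untrusted_plugin(config):
    seen = config.read()
    try:
        config.write(b"debug = true")
        wrote = True
    except PermissionError:
        wrote = False
    return seen, wrote


seen, wrote = untrusted_plugin(cap)
```

The plugin never receives `root`, so in this model it has no way to name `notes.txt` at all.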

Capabilities are also more complicated.

I fear that at the end of the day, capabilities will have the same fate as other sandboxing mechanisms: nobody will use them. And, just so that their application works and to avoid support burden, developers will tell people to use a setup that enables access to everything.

pledge(2) and unveil(2) learned from the past and are way simpler. I really wish WebAssembly had adopted similar mechanisms.

Agreed, and there are a lot of UX questions to sort out. Many security concepts took many attempts to figure out in full (or to the extent that they have been figured out :))

One important aspect here is that this doesn't just target whole apps. It also targets developers using dependencies: while it's desirable to restrict an application's capabilities, there's a lot of value in developers only giving packages they depend on very limited sets of capabilities. And that seems much more tractable, given that kitchen-sink packages aren't what most people want to use anyway.

This is getting complicated. OS process management, threads and lightweight processes, green field process control, virtual machines, containers, sandboxing in n browsers with m different technologies, now this WASM stuff... and orchestrating this all across the cloud and the global internet, ending in homes and corporate machine rooms.

Everywhere you have to think: who can load/run a module/process and from where, how to authenticate and authorize, which API to give to it, etc...

A historical note:

Bell Labs Plan 9 had a universal OS level solution, that Linux has somewhat adopted, but could not make general enough, partly due to the higher level ecosystem being stuck in old ways:

- per process name spaces with mountable/inheritable/stackable union directories and optionally shareable memory (Linux light-weight process, LWP, comes close, it was also historically copied from Plan 9)

- Almost all APIs (even "system calls") as synthetic file systems (Where do you think /proc came from?)

- which you could mount and access (efficiently) locally or through a secure unified network protocol (9P)

On Plan 9 you could just run different parts of the browser (JavaScript engine, WASM or anything) in a tailored limited LWP with limited mounts as synthetic file system APIs...

Note that Docker kind of retro-fits Plan 9 ideas in Linux kernel to embrace and extend the original ideas of Plan 9...

> Where do you think /proc came from?

UNIX 8th Edition (http://lucasvr.gobolinux.org/etc/Killian84-Procfs-USENIX.pdf) and SVR4 (https://www.usenix.org/sites/default/files/usenix_winter91_f...)

A better analogy would be /dev, but that was already part of Unix from the beginning. Plan 9 is really about per process, user mountable namespaces implemented by 9P-speaking user processes; basically what you said, sans the origin of synthetic file systems and file-like objects.

The primary use case for WebAssembly is malware.[1] We're probably going to regret letting WebAssembly into the browser. Because vendors won't let it be locked down so much that it can't be used for ads and tracking. Which means it has to allow malware.

[1] https://www.tu-braunschweig.de/Medien-DB/ias/pubs/2019-dimva...

Thanks for the reference.

Just so it's clear, the definition of "malware" used here is in-browser crypto mining and obfuscation. Only 1 in 600 of the top 1 million sites use WebAssembly. WebAssembly doesn't actually provide a new vector for malware.

WebAssembly is still very immature. It is unsurprising that it has such low adoption right now.

I think the real problem with webassembly would be if it becomes too popular and starts to become a JS competitor rather than a complement to JS.

WebAssembly is the revenge of Flash/Java/ActiveX, but this time everything will turn out perfect, as per WebAssembly advocacy.

Honestly, given the lack of tooling they have now, it is pretty much perfect from a user's perspective. You are only going to use WASM if you absolutely need the performance. It's just too painful otherwise.

Well, better take care which sites you visit.


Your link basically says that less than 0.01 _percent_ of the top one million websites have webasm cryptocurrency mining. There is no mention of any security flaws. A webasm miner would just eat up a single core while the page is open.

This doesn't seem like much of a red flag to me. If one out of every ten thousand unique sites I visit uses one hypercore while it is opened that isn't going to keep anyone up at night.

On the other hand full video editors, image editors, CAD, 3D content creation programs, silky smooth 3D games, custom video codecs and more have already been made possible due to webasm. Not bad huh?

Not bad at all, for something that has been possible in Java, Flash, ActiveX, PNaCl before.

Thanks to service workers, the miner won't go away when you close the browser, as with default settings (which normal users don't even know exist) service workers run in their own processes.

You say possible, but where were they? Actionscript never ran at native speeds or even close to them. These things were never seen in Java applets either.

Webasm + webgl is a potent combination. C and C++ libraries can be used directly instead of trying to squeeze fast matrix and vector math out of a Java JIT that needs special wizardry just to avoid heap allocating everything.

And let's not forget that you implied that someone had to be careful of what sites they visited when it is actually a case of 1 in 10,000 sites carrying mild consequences and not security exploits.

I get that you love Java and don't know modern C++, but denying reality doesn't change reality.

Unreal for Flash demo done in 2011.



I am quite up to date with modern C++, in fact more than many regular HNers, thank you very much.

Yes, and it was still unsafe as well.

As the security exercise that everyone keeps asking me about, just compile Heartbleed to WebAssembly.

What does Heartbleed have to do with webasm security? Webasm can't communicate without going through javascript.

Hey don't forget Shockwave. That was the thing that totally took over the web that everybody hated before Flash.

One of my next projects is to create a Tcl package for webassembly that will let other extension authors compile their packages targeting WebAssembly and be able to use those compiled binaries on any platform.

That sounds great! I'd love to hear more about that. Are you looking to support extension of Tcl with WebAssembly, or using Tcl inside WebAssembly?

I think wasmtime would likely be a good fit for your use case; you could either use the wasmtime C API, or use Rust to bind to Tcl and to the wasmtime-api Rust crate.

I've found it quite easy to embed wasmtime and run a simple WebAssembly module.

See https://github.com/bytecodealliance/wasmtime-demos for some samples of how to do so.

This is going to be a Tcl extension named "webassembly" that then allows you to load other Tcl extensions that are compiled to WebAssembly.

Right now Tcl loads binary extensions using the "load" command, this will load a shared object (.so, .dylib, .shlib, .dll, .a, whatever your OS supports for runtime loading) and call <ExtensionName>_Init.

My plan is to create a new command named "webassembly::load" that will open a WebAssembly module and provide ... some way ... to call <ExtensionName>_Init as well as some way for the extension to make calls back to the ~200 Tcl API functions and some selected other APIs per platform (Solaris, AIX, FreeBSD, Linux, Windows, macOS, HP-UX, etc).

Additionally, a mechanism (probably in the form of an SDK) for Tcl Extension maintainers to compile their extension targeting being loaded by "webassembly::load" with the APIs mentioned above being available. Most Tcl extensions right now are written in C or C++.

Makes perfect sense. I look forward to seeing your work when it's ready! Please feel free to ask if there's anything we can do to help, or if you run into any areas where the APIs could be easier to use.

You may find the witx project (https://github.com/WebAssembly/WASI/tree/master/tools/witx) useful when writing the bindings from WebAssembly to the Tcl APIs.

Real worried when I see phrases like "secure by default" that this will involve some sort of security certificate or formal verification process which a government or other malicious actor can use as a weapon against its enemies.

Is the security limited to sandboxing of the code itself or is there some sort of verification process involved?

The security here is based on sandboxing code and providing limited capabilities. If you're embedding wasm, you choose what capabilities to give the sandbox. For instance, if a game wants to support mods via wasm, it could give the mods APIs to the game world but not to the network or filesystem. A database plugin might have access to interpret a database object handed to it but not exfiltrate data over the network.

We're providing mechanisms here, not identity-based policies.
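As a loose illustration of the game-mod case, the sketch below uses plain Python functions in place of wasm modules. Python is not a sandbox, so this models only the import table: `run_mod`, `healing_mod`, and the function names in `game_imports` are all hypothetical. A real wasm runtime would typically refuse to instantiate a module whose imports are not provided, rather than fail at call time as this toy does.

```python
def run_mod(mod, imports):
    """Hypothetical embedder: the mod can reach only the functions in `imports`."""

    def call(name, *args):
        if name not in imports:
            raise LookupError(f"import not provided: {name}")
        return imports[name](*args)

    return mod(call)


world = {"player_hp": 10}


def get_hp():
    return world["player_hp"]


def set_hp(value):
    world["player_hp"] = value


# The game exposes game-world functions only: no network, no filesystem.
game_imports = {"get_hp": get_hp, "set_hp": set_hp}


def healing_mod(call):
    # Allowed: the host handed us these two functions.
    call("set_hp", call("get_hp") + 5)
    # Not allowed: the host never provided anything network-related.
    try:
        call("net_send", "example.invalid", b"save data")
        return "sent"
    except LookupError:
        return "no network"


result = run_mod(healing_mod, game_imports)
```

The policy lives entirely in which entries the embedder puts in `game_imports`; nothing about the mod's identity or origin is consulted.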

> For instance, if a game wants to support mods via wasm, it could give the mods APIs to the game world but not to the network or filesystem.

This is, IMO, pretty huge. I'm building a game right now that supports clientside NodeJS mods, and figuring out sandboxing has been a huge pain. Similarly, I've been trying to figure out how to sandbox some of our dependencies at work and in personal projects.

I want to be able to let someone mod my games in any language, and distribute them however they want, while still providing guarantees to my users that the worst a mod can possibly do is maybe freeze your computer or something.

So many of the problems the OP describes ring true to me; it's a very exciting project.

What about fixing the lack of bounds checking when multiple data elements are mapped into the same linear memory block?

This leaves the door open for trying to influence behaviour of C and C++ generated WebAssembly modules, by corrupting their internal state via invalid data.

If you give a sandbox a capability and then there’s a bug in it, there’s always a chance that it will maliciously access those privileged resources. The only way I can see of protecting against logic bugs like these is better tooling.

Yeah, but then one should acknowledge those issues, and not advocate WebAssembly as if there weren't hundreds of other attempts since the late 50's.

They don't mention them because their focus is on other aspects of safety.

Either one is actually serious about security across the whole stack, or not.

It's not like what you propose hasn't been tried before. The main practical issue that I don't see this post address, is the combinatorial explosion that stems from fine-grained sandboxing of any complex application. There are bound to be executable paths through the state space that both pass initial muster and can be used by an attacker to craft a sandbox bypass.

In other words, fine-grained sandboxing does not solve the problem. It may be an improvement on the current -dismal- state of affairs as far as ecosystems like pypi or NPM are concerned, but I don't see how it addresses the main issues in any sort of practical, real-world environment.

Something that definitely works is that which security-conscious orgs/teams/persons currently do: ownership and curation.

Ownership implies minimization of 3rd party dependencies.

Curation implies strict quality (incl security) reviews and relentless culling of code that fails them.

The distributed engineering model that you advocate for where code is being pulled-in from hundreds of disparate sources outside of one's control is _fundamentally broken_.

You seem to be arguing that sandboxing is not a security benefit. On the contrary, sandboxing is maybe the security success story of the past decade.

You missed my point, which is not sandboxing.

Webassembly through fine-grained sandboxing promotes software decoherence by amplifying the number of dependencies (since the major downside to working in this fashion is now advertised to be reined in).

When the number of dependencies goes up, combinatorial explosion ensures that the state-space is full of possible attacks. Fine-grained sandboxing does not solve this anti-pattern but can in fact make it a lot worse. You can examine each and every dependency and make sure that its sandbox is kosher but that does not guarantee anything about the interactions and transitive relationships between dependencies. The metasystem is now an amplified (by sheer number of dependencies) state-space that attackers can seek to manipulate.

Since security is a systemic rather than an isolated affair, the model that the OP advocates for is broken.

You might have to give specific examples with wasm in mind instead of talking about 'combinatorial explosions in the state space of the amplified meta system'.

System sandboxing (virtualization) yes. "OS" sandboxing (containers) yes. Process sandboxing yes. None of those need or benefit from webasm.

In-process sandboxing, where wasm competes, is, if anything, the security failure of the past decade. JS in browsers has been a constant, never ending battle. And it just hard-failed thanks to spectre.

The idea of everyone rolling their own, hardened syscall interfaces is a straight up terrible idea if security is your goal.

It's not specifically referring to certification or process, it's more about software architecture. It's referring to how the core WebAssembly instruction set has no I/O instructions, so it can't do anything other than what the APIs given to it allow. And with WASI APIs, the goal is for the APIs to have a similar property, where access to external resources is represented by handles, and WASI-level API functions won't let you do anything without being passed a handle.
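One way to picture the handle property is the toy model below. `HandleTable` and `host_write` are names made up for this sketch and are not the WASI API. It assumes one table per module instance and a host interface with no "open by name" call.

```python
class HandleTable:
    """Hypothetical host-side table: the module only ever sees small integers."""

    def __init__(self):
        self._resources = {}
        self._next = 0

    def grant(self, resource):
        handle = self._next
        self._next += 1
        self._resources[handle] = resource
        return handle

    def lookup(self, handle):
        if handle not in self._resources:
            raise PermissionError(f"bad handle: {handle}")
        return self._resources[handle]


table = HandleTable()   # one table per module instance (assumed)
log = []                # a host resource, here just a list


def host_write(handle, text):
    # Every host function takes a handle; a resource cannot be named any other way.
    table.lookup(handle).append(text)


log_handle = table.grant(log)
host_write(log_handle, "hello")

try:
    host_write(log_handle + 1, "guess")   # a handle this module was never given
    forged = True
except PermissionError:
    forged = False
```

Because the table is per instance, guessing an integer only ever reaches resources the host already granted to that same module.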

>So how can you protect your users against these threats in today’s software ecosystem?

>You could subscribe to a monitoring service that alerts you when a vulnerability is found in one of your dependencies. But this only works for those that have been found. And even once a vulnerability has been found, there’s a good chance the maintainer won’t be able to fix it quickly. For example, Snyk found in the npm ecosystem that for the top 6 packages, the median time-to-fix (measured starting at the vulnerability’s inclusion) was 2.5 years.

Sometimes it looks like writers think that the average reader is a complete idiot. How is that supposed to be an example? First they say that it takes a long time to fix once the bug is found, and as an illustration they give a period starting with the introduction of the vulnerability?

We can't even get software developers to use sane metrics. Good luck getting Masters of Fine Arts people to use them properly to describe software. Especially when we've provided such a stellar example.

I saw a comment about this on HN before, forgot who it was by. But it was interesting, someone mentioned the spec for WebAssembly is generic enough to apply outside of the web. I'm suspecting we'll see languages converting on node.js a la web assembly for back-end logic in your preferred language, but in any runtime this includes NodeJS but also excludes it as we see future runtimes. What's your view on this? Also, aside from WebAssembly being in every modern browser what do you think will be the next killer feature for WebAssembly?

It would also be interesting to have an embedded WebAssembly plugin runtime, much like Lua is used all over now that you mention all those examples.

> someone mentioned the spec for WebAssembly is generic enough to apply outside of the web

Absolutely. The spec provides a set of instructions and their semantics. Browsers provide a set of common runtime APIs. Non-browser environments can provide the sandbox with any API surface area they want.

> in any runtime this includes NodeJS but also excludes it as we see future runtimes

node.js is working on WASI support, and I'd also expect to see versions of JavaScript that run inside the WebAssembly sandbox. When we say "any language", that includes people who want to run JavaScript.

> Aside from WebAssembly being in every modern browser what do you think will be the next killer feature for WebAssembly?

Shared-nothing linking; libraries that don't have to trust each other with their entire address space. I see WebAssembly as the future plugin interface for any software that wants to be extensible.

> It would also be interesting to have an embedded WebAssembly plugin runtime, much like Lua is used all over now that you mention all those examples.

wasmtime is easy to embed; it only takes a handful of lines to load a WebAssembly file, hand it a few functions of your choice, and run it.

See https://github.com/bytecodealliance/wasmtime-demos for various demos of how to embed wasmtime.

Following up from your comment, if you want to start running WASI modules on Node.js (or in the Browser) today, you can use this npm package! (same API as future Node WASI integration)


> It would also be interesting to have an embedded WebAssembly plugin runtime, much like Lua is used all over now that you mention all those examples.

Completely! Here are some examples on how to embed Wasm in different languages:

* Python - https://github.com/wasmerio/python-ext-wasm

* PHP - https://github.com/wasmerio/php-ext-wasm

* Go - https://github.com/wasmerio/go-ext-wasm

* .Net - https://github.com/migueldeicaza/WasmerSharp

...and many more! (just check the Wasmer repo)

Darn, they are missing Java :(

This whole thing is a shot across the bow of Java. Everyone is finally piling on to kill it once and for all.

Wasmtime is being developed to be usable as just such an embedded runtime. We have some demos of this here: https://github.com/bytecodealliance/wasmtime-demos

How can I learn more about how nanoprocesses and the wasmtime sandbox work under the hood? Searching the repo on github for common keywords doesn't turn up much. Are nanoprocesses like Windows picoprocesses somehow, or are multiple "processes" running in the same address space? If so, you can probably exfiltrate data between nanoprocesses with spectre. Additionally, if you get RCE in the wasm JIT (this happens all the time in javascript JITs), there's nothing to stop you from ropping to gadgets to open your own sockets without going through any in-process checks.

> or are multiple "processes" running in the same address space


> If so, you can probably exfiltrate data between nanoprocesses with spectre.

Right, this is mentioned in the article. (TL;DR: if this is a concern for you, don’t use the sandbox, at least not until someone’s figured out how to implement timing protection.)

Been working with web tech since 2000... something about WASM rubs me the wrong way, at least for use in the web. It's probably the lack of human readable source code (I'm not a big fan of minified code for the same reason).

If it wasn't so impossible to work with W3C, I think it would probably make more sense for the web to work towards something like more strict, compilable typescript. Then sites could download the source, compile, and cache.

Stupid question from a non specialist. I see Intel in the list of parties. Does that mean hardware acceleration for wasm?

Among other things, yes; we're contributing to the SIMD support, for instance.

Anything beyond just the 128-bit SIMD yet? The last thing I saw on that was "Long SIMD" from here: https://webassembly.org/docs/future-features/

And that was years ago.

It seems like an awful waste to have many millions of CPUs with AVX, for instance, sitting unused.

It is meant to be compiled to native instructions, I don't think there is significant room for hardware acceleration.

I am talking from a position of ignorance, but intuitively I am sure the runtime does or will do lots of things, between the sandboxing, the linking, potential garbage collection, etc.

It isn't run through a run time, it is meant to be just in time compiled to native instructions. There isn't any garbage collection.

This is historically the kind of thing that people think is very cool and important, but has turned out to not matter at all. "Languages matter" must be the biggest enduring fallacy of computer programming. It has an obvious corollary in believing processor architectures are important.

The cross-language interoperability of WebAssembly seems practically like a happy side-effect of its design; I don't think it would look much different at present if it had originally been built to target only C (in a sandboxable way). Most of the article still applies even if you ignore other language support. The first priority for WebAssembly seems to be making things strongly sandboxed.

Actually it fails to properly sandbox C derived languages versus what hardware memory tagging like SPARC ADI and ARM are capable of, because it doesn't do bounds checks inside linear memory blocks.

WebAssembly strongly sandboxes the module from affecting the world outside of it, not from affecting itself. Isn't that the usual use of the word sandbox? The sandbox imposes a boundary between the inside and outside, but it doesn't directly change how things work on the inside.

It might be nice to have features for enforcing memory bounds within a module, but I wouldn't call those sandboxing, or call the lack of those features a deficit of the sandbox.
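To make the distinction in this sub-thread concrete, here is a toy Python model of a linear memory. It is only a sketch: the `LinearMemory` class, the 64-byte size, and the buffer layout are made up for illustration and are not how any real runtime is implemented. The single bounds check sits at the edge of the memory, so a write that overruns one object into its neighbour succeeds, while a write that would leave the memory traps.

```python
class Trap(Exception):
    """Raised when an access falls outside the linear memory."""


class LinearMemory:
    """Toy model: one flat byte array, checked only at its outer edge."""

    def __init__(self, size):
        self.data = bytearray(size)

    def store(self, addr, payload):
        # The only check is against the end of linear memory,
        # not against the bounds of whatever object lives at addr.
        if addr < 0 or addr + len(payload) > len(self.data):
            raise Trap(f"out-of-bounds store at {addr}")
        self.data[addr:addr + len(payload)] = payload

    def load(self, addr, length):
        if addr < 0 or addr + length > len(self.data):
            raise Trap(f"out-of-bounds load at {addr}")
        return bytes(self.data[addr:addr + length])


mem = LinearMemory(64)
NAME_ADDR = 0   # hypothetical 8-byte name buffer at offsets 0..7
FLAG_ADDR = 8   # hypothetical 1-byte flag stored right after it

mem.store(FLAG_ADDR, b"\x00")

# A 9-byte write into the 8-byte buffer: it stays inside linear memory,
# so nothing traps, but the adjacent flag is silently overwritten.
mem.store(NAME_ADDR, b"AAAAAAAA\x01")
flag_after_overflow = mem.load(FLAG_ADDR, 1)

# A write that would cross the edge of linear memory is stopped.
try:
    mem.store(60, b"12345678")
    escaped = True
except Trap:
    escaped = False
```

In this model the module can corrupt its own data (the flag flips), but it cannot touch anything outside the 64 bytes it was given, which is the inside/outside split both sides of this argument are describing.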

It is a security deficiency, hence why MSIL that contains certain C++ features is tainted as unsafe by the security verifier.

Some other bytecodes like Unisys ClearPath mainframes follow the same approach when using unchecked bounds access.

WebAssembly folks just hand wave it as not an issue.

To do anything useful across the boundaries, parsing will be involved. Parsing without bounds checking is insecure and will be exploited, period.

Hardware memory tagging is not a requirement for sandboxing. Actually, hardware memory tagging doesn’t even fully restrict out-of-bounds accesses because its granularity is usually larger than a byte.
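A rough Python sketch of the granularity point, assuming a 16-byte tag granule (the size ARM MTE uses; other schemes differ). The `TaggedMemory` class and its methods are hypothetical, and real hardware keeps the tag in unused pointer bits rather than in a tuple.

```python
GRANULE = 16  # bytes covered by one tag (assumption: MTE-style granule)


class TagFault(Exception):
    """Raised when a pointer's tag does not match the memory's tag."""


class TaggedMemory:
    """Toy model of tagged memory: one tag per GRANULE bytes."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.tags = [0] * (size // GRANULE)

    def allocate(self, addr, length, tag):
        # Tag every granule touched by [addr, addr + length).
        first = addr // GRANULE
        last = (addr + length - 1) // GRANULE
        for granule in range(first, last + 1):
            self.tags[granule] = tag
        return (addr, tag)  # a "pointer" that carries its tag

    def store_byte(self, pointer, offset, value):
        addr, tag = pointer
        if self.tags[(addr + offset) // GRANULE] != tag:
            raise TagFault(f"tag mismatch at {addr + offset}")
        self.data[addr + offset] = value


mem = TaggedMemory(64)
buf = mem.allocate(0, 10, tag=1)     # 10-byte object inside granule 0
other = mem.allocate(16, 16, tag=2)  # neighbouring object in granule 1

# Two bytes past the end of buf, but still in granule 0: not detected.
mem.store_byte(buf, 12, 0xFF)
undetected = mem.data[12] == 0xFF

# Crossing into the next granule hits a different tag and faults.
try:
    mem.store_byte(buf, 16, 0xFF)
    caught = False
except TagFault:
    caught = True
```

The overflow inside the granule slips through and only the one that crosses a granule boundary is caught, so tagging narrows the window without closing it.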

No one is saying it is a requirement, rather that WebAssembly design oversells security by not doing those checks.

Even if the checks were only at word level, that would already be an improvement over the current design.

It matters a whole lot. There's lots of code written in various languages and it's often prohibitive to use it in your own project because making the languages speak with each other (especially if you have performance concerns) is a lot of work. I don't expect WASM to make this trivial, but to say that it doesn't matter is patently wrong.

I think the point is that your laundry list of features in your language of choice does not really matter when it comes to adoption.

For instance, memory safety does not really matter so much that people would stop using unsafe languages.

Why should high-level languages be able to easily speak with each other? Wouldn't this defeat the purpose of having more than one language, since they would be effectively identical?

Wait... I thought the CLR was supposed to do this? Or was it the JVM?

What's old is new again!

Or UNCOL, or Xerox Pilot and Dorado microcoded CPUs, or IBM z/OS and OS/400 language environments, or VMS multi-language backends, or PNaCl, or ....

You forgot the UCSD P-System.

It is included in the .... :)

...That reminds me of a cartoon I saw in an old Apple ][ magazine:

(One kid talking to another, with dad in the background hunched over an Apple ][.)

"Daddy's playing UCSD Pascal. That's the game where you try and see how many dots you can get before it beeps and you say nasty words."


UCSD p-System Program Development (p. 2-4)

While the compiler is running, it displays a report of its progress on the screen in this manner:

    Pascal compiler - release level VERSION
    <   0> ...................
    <  19> .......................................
    <  61> .......................................
    < 111> .......
    < 119> ......................................

    237 lines compiled
    MYPROG ..

During the first pass, the compiler displays the name of each routine. In this example, INITIALIZE, AROUTINE, and MYPROG are the routines. The numbers enclosed within angle brackets, < >, are the current line numbers and each dot on the screen represents one source line compiled.

Can you use wasm instead of regular js/html? I'm thinking for rich GUIs it would be great, even just for company internal sites. I can't find any good examples though.

I have a recent note that says (from HN id=21495338), try makepad.github.io/makepad . My recollection is that this is such an example.

Nice! Though confusingly, it's an editor that runs JavaScript pages. https://makepad.github.io/makepad.html

Yes, of course, you can draw anything on a regular HTML canvas and all you need is a thin layer of JS to do some interfacing.

However, I'd discourage you from building complex GUIs in WebAssembly. If you do, you have to reimplement everything the browser already does and that your end users are accustomed to.

Check out Microsoft Blazor.

Actually this looks great.

Rust has libraries for this.

One pretty annoying thing with WASM is that it is pretty hard to generate because it requires structured control flow, so anyone generating it must implement a relooper or stackifier. Is there any work on solving this issue?
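For context, there is a naive fallback that needs neither algorithm: keep the current block in a label variable and wrap every basic block in one loop with a dispatch on that label. In WebAssembly terms that would be a `loop` around a `br_table` on a local. The Python sketch below only shows the shape of the idea; `run_cfg`, the block functions, and the example graph are all made up for illustration. It is not Relooper or Stackifier, which exist precisely because this fallback produces slow code.

```python
def run_cfg(blocks, entry, state):
    """Run an arbitrary control-flow graph with one loop and one dispatch."""
    label = entry
    while label is not None:          # the single structured loop
        label = blocks[label](state)  # each block returns its successor
    return state


# Hypothetical CFG whose cycle can be entered at two points (irreducible):
#   entry -> A or B;  A -> B;  B -> A or exit
def entry(s):
    return "A" if s["n"] % 2 == 0 else "B"


def block_a(s):
    s["trace"].append("A")
    s["n"] -= 1
    return "B"


def block_b(s):
    s["trace"].append("B")
    s["n"] -= 1
    return "A" if s["n"] > 0 else None  # None means "exit"


blocks = {"entry": entry, "A": block_a, "B": block_b}

even_start = run_cfg(blocks, "entry", {"n": 4, "trace": []})
odd_start = run_cfg(blocks, "entry", {"n": 3, "trace": []})
```

Every jump becomes "set the label, go round the loop again", so it works for any graph, including ones with no direct structured equivalent, at the cost of an indirect dispatch on every block transition.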


Radare2[1] supports WebAssembly disassembling if anyone is curious about compiled code analysis.

[1] https://github.com/radareorg/radare2

Are those drawings supposed to be clickable? When I do so, it just opens and immediately closes a new tab or window. Firefox 70 on Windows if it makes a difference.

Works for me. Opens a new tab with an image, e.g.:


FF 70 on Arch Linux.

That's weird. In fact it does the same with your link, though I can open it in the same tab. If I try to open it in a new tab or window, it just closes immediately.

Bad adblock filter, maybe?

Yup, that was it. Disabling the ad blocker fixed this.

WebAssembly keeps trying to be the next UNCOL, it seems.

What's the main difference between this and the Java Virtual Machine?

I'm not in the mood to dive deep into technical details, but forgive me for being skeptical. This "secure by design" execution mantra makes me somewhat sick after hearing it so many times over the decades.


... this project prevents others from executing arbitrary malicious code on your computer.

If you want to fiddle with your own computer, you can write your own code to do exactly that

When you use WebAssembly, you're in control of the sandbox. You can pass things into the sandbox and let the module use it. There's no signing/certificate/appstore system tied to WebAssembly.

How is this different from current web browsers for running JavaScript apps? Don't you think there is lockdown now?

>How is this different from current web browsers for running JavaScript apps?

WebAssembly is a thing that web page JavaScript can choose to use, and the JavaScript is in control of what it exposes to WebAssembly. WebAssembly doesn't grant any privileges to the web page or somehow take any away. (Ignoring performance, it would be possible to implement WebAssembly inside of a JavaScript library as an interpreter.)
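To illustrate the "interpreter in a library" point, here is a toy stack machine in Python. It is nowhere near a WebAssembly implementation: the tuple-based instruction format, the `run` function, and the `imports` table are invented for this sketch, and only the opcode names are borrowed from WebAssembly. The point is that guest code can only reach what the host passes in.

```python
MASK32 = 0xFFFFFFFF  # keep values in 32-bit range, like i32 arithmetic


def run(code, imports):
    """Interpret a flat list of (opcode, operand) pairs on a value stack."""
    stack = []
    for op, arg in code:
        if op == "i32.const":
            stack.append(arg & MASK32)
        elif op == "i32.add":
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & MASK32)
        elif op == "i32.mul":
            b, a = stack.pop(), stack.pop()
            stack.append((a * b) & MASK32)
        elif op == "call":
            # The guest can only reach functions the host chose to expose.
            if arg not in imports:
                raise LookupError(f"unknown import: {arg}")
            func, arity = imports[arg]
            args = [stack.pop() for _ in range(arity)][::-1]
            result = func(*args)
            if result is not None:
                stack.append(result & MASK32)
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return stack


log = []
imports = {"log": (log.append, 1)}  # the host exposes exactly one function

program = [
    ("i32.const", 6), ("i32.const", 7), ("i32.mul", None),
    ("call", "log"),                          # allowed: host provided it
    ("i32.const", 1), ("call", "read_file"),  # not provided by the host
]

try:
    run(program, imports)
    denied = False
except LookupError:
    denied = True
```

One simplification worth flagging: a real engine rejects a module with an unresolved import when it is instantiated, not when the call executes. Either way, the guest gets no more capability than the host hands it.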

>Don't you think there is lockdown now?

No? Anyone can author and host a website, anyone can view html+js saved on their own machine, etc. Open source web browsers exist, are fully-featured, and are popular.

Spot on. Websites will turn from bloated to extremely bloated. Shipping 5 MB of bytecode will help with:

* Obfuscating dark patterns e.g. aggressive fingerprinting based on hardware, OS, cache contents, timing

* Breaking ad blockers

* Breaking tracking blockers like Privacy Badger

* Preventing uBlock Origin from removing annoying elements

* Preventing scraping

* Offering different prices for goods and services based on the hardware and software (as has happened before)

More downvotes?
