Hacker News
Extism makes WebAssembly easy (dylibso.com)
136 points by wikiwong on Oct 4, 2023 | 98 comments



The thing I want to achieve with WebAssembly is still proving a lot harder than I had anticipated.

I want to be able to take strings of untrusted code provided by users and execute them in a safe sandbox.

I have all sorts of things I want this for - think custom templates for a web application, custom workflow automation scripts (Zapier-style), running transformations against JSON data.

When you're dealing with untrusted code you need a really robust sandbox. WebAssembly really should be that sandbox.

I'd like to support Python, JavaScript and maybe other languages too. I want to take a user-provided string of code in one of those languages and execute that in a sandbox with a strict limit on both memory usage and time taken (so I can't be crashed by a "while True" loop). If the memory or time limit is exceeded, I want to get an exception which I can catch and return an error message to the user.
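
To make the shape of that interface concrete, here is a minimal sketch under loud assumptions: it targets a POSIX host and uses a child Python process with an address-space rlimit and a wall-clock timeout as a stand-in for the isolation a Wasm runtime would provide. A plain subprocess is NOT a safe sandbox for untrusted code; the names run_limited and SandboxError are hypothetical and only illustrate "run a string, get output or a catchable exception".

```python
import resource
import subprocess
import sys


class SandboxError(Exception):
    """Raised when the child times out or exits with an error (hypothetical name)."""


def run_limited(code: str, seconds: float = 5.0,
                max_bytes: int = 512 * 1024 * 1024) -> str:
    """Run a string of Python in a child process with time and memory caps."""

    def set_limits() -> None:
        # Runs in the child just before exec: cap its virtual address space.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=seconds,        # the child is killed when this expires
            preexec_fn=set_limits,  # POSIX only
        )
    except subprocess.TimeoutExpired:
        raise SandboxError(f"time limit of {seconds}s exceeded") from None

    if proc.returncode != 0:
        # A nonzero exit covers MemoryError under the rlimit as well as
        # ordinary uncaught exceptions in the user's code.
        raise SandboxError(proc.stderr.strip()[-200:] or "child failed")
    return proc.stdout
```

The caller catches SandboxError and turns it into an error message for the user; swapping the subprocess for a real Wasm runtime would keep the same calling convention while replacing the isolation mechanism.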

I've been exploring options for this for quite a while now. The furthest I've got was running Python in wasmtime: https://til.simonwillison.net/webassembly/python-in-a-wasm-s... and running Pyodide inside of Deno: https://til.simonwillison.net/deno/pyodide-sandbox

Surprisingly I've not found a good pattern for running a JavaScript interpreter in a WASM sandbox yet. https://github.com/justjake/quickjs-emscripten looks promising but I've not found the right recipe to call it from server-side Python or Deno yet.

Can Extism help with this? I'm confident I'm not the only person who's looking for a solution here!


The problem you want solved, perfect sandboxing for untrusted code, is only just THE single most important problem in operating system security. If you can solve that then you have the basis of a perfectly secure, unhackable operating system. Anybody claiming to solve that problem at speed in any other software domain can trivially use those same techniques to create a perfectly secure operating system runtime.

So, you have to wonder to yourself, if they can do that why do they not just go and write an unhackable operating system. It is only like one of the single greatest problems of all the commonly used commercial operating systems in what is viewed as one of the most hardcore of software disciplines where solving it would instantly establish you as a supreme software guru. Basically, if you can solve that problem you should make and advertise an unhackable operating system; anything else is selling gold bricks as ballast.

To channel Theo de Raadt of OpenBSD: You are absolutely deluded, if not stupid, if you think that a worldwide collection of software engineers who can't write operating systems or applications without security holes can then turn around and suddenly write browser sandboxes (originally virtualization layers) without security holes.


The browser has offered this kind of sandboxing for JavaScript for decades at this point.

The reason I'm so excited about WebAssembly for this is that it's not even new technology: it's been supported by widely deployed browsers since 2017.


Browser sandbox escapes from untrusted JavaScript are discovered and exploited regularly. JavaScript is much more constrained than the full force of a low level language like WebAssembly, and they can not even get the JavaScript sandbox safe to run truly untrusted or malicious code. Why would something harder to do work when they can not even do the easier thing?

Unless you are just talking about something meant to handle accidentally, not intentionally, malicious code. Then sure, it is probably okay for that. But if you are actually worried about malicious code then, no, browsers (and commercial operating systems) do not provide that. And anybody suggesting they can do that is almost certainly lying unless they also claim to have developed an unhackable operating system/virtual machine as well.


I know that it's hard, but I'm not ready to agree that this isn't worth seeking answers to.

AWS run untrusted code on Lambda all the time.

Browsers seem to be handling this pretty well in the face of the most untrustworthy computing environment our species has yet developed. Zero days in browsers are big news, and don't happen very often.


If you can sandbox arbitrary malicious code, then you can make an unhackable operating system/runtime. Such a feat is frequently viewed as literally impossible in many software circles and would constitute an extraordinary claim that demands impeccable, extraordinary evidence to support it such as, minimally, mathematical proofs of the entire code base. Nothing less should overcome the sheer ideological inertia behind the common-sense view that everything is easily hacked as has been continuously demonstrated on basically everybody all the time.

So, unless you want to claim Amazon has invented an unhackable operating system to run AWS, has the mathematical proofs of correctness to support such an extraordinary claim, and has just not bothered to tell anyone, claiming AWS can actually securely run untrusted code is pure unsupported bluster. In fact, I bet exactly zero people at Amazon would back up such a claim if pressed, and if even the people doing it think it is impossible then there is no way they are actually doing it. The same goes for browsers.

As to zero days in browsers being big news, they are really not. Zerodium only pays 500 K$ for a Chrome RCE+LPE [1]. That is pocket change. Ransomware attacks ask for millions of dollars per attack these days. They can literally afford to burn multiple Chrome RCEs per attack (if needed) and still come out profitable. The cost of sandbox escape needs to be somewhere around 20-100x higher for it to be viewed as "secure" against the common threats seen every day.

[1] https://zerodium.com/program.html


> AWS run untrusted code on Lambda all the time.

AWS uses virtualization (Firecracker) to provide isolation for Lambda.

WebAssembly vs browser/JavaScript isolation is a little like virtualization vs operating system level isolation. WebAssembly and virtualization offer far smaller attack surfaces, which means they are far more likely to remain secure in the long term.

Browsers and operating systems are highly complex abstractions and they only remain secure (if you keep them patched) through the large ongoing investment in them.


WebAssembly is far more constrained in the browser than JavaScript. Exploits are flaws in the implementation that can be fixed, but what is being asked for is an environment that has fewer privileges by design.


Fuchsia solves that.


Yeah, you are going to need to support a claim of solving the biggest problem in operating system security with more than a random assertion.

How about you start by finding a quote by any member of the development team who is willing to support the claim that Fuchsia establishes unhackable separation/sandboxes? If nobody making the software is willing to say that then we can be sure that they did not achieve it.

You can then follow it up by pointing to a mathematical proof that the code enforces separation kernel security properties where the proof has been verified by a competent third party. I will not demand you present or link the proof, all you need to show is that one exists and a credible party assents.

Only then do you have the bare minimum of evidence needed to support an assertion like that.


To run a JavaScript interpreter (spidermonkey, in this case) in Wasm, as well as running that same wasm in a JS engine, you want to look at `jco` https://github.com/bytecodealliance/jco

The component model tooling is getting very close to maturity and will solve many of these problems.


That's really useful. This page in particular: https://github.com/bytecodealliance/jco/blob/main/EXAMPLE.md

Being able to run "jco wit cowsay.wasm" to see what interfaces that .wasm file provides solves a problem I've run into a bunch of times in the past.


you should also check out modsurfer[0]

"modsurfer generate -p cowsay.wasm -o mod.yaml"

Especially for non-component core modules that won't have wit definitions

[0]: https://github.com/dylibso/modsurfer
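
For a core module, the interface information in question lives in the binary's export section. As a rough illustration only (this is not how jco or modsurfer are implemented, and list_exports is a hypothetical helper), a stdlib-only sketch that walks the sections of a version-1 core module and prints its exports might look like this. The hand-assembled TINY_MODULE is an assumption made up for the example.

```python
def read_uleb128(data: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


EXPORT_KINDS = {0: "func", 1: "table", 2: "memory", 3: "global"}


def list_exports(module: bytes) -> list[tuple[str, str]]:
    """Return (name, kind) pairs from the export section of a core module."""
    if module[:4] != b"\0asm":
        raise ValueError("not a WebAssembly binary")
    if module[4:8] != b"\x01\x00\x00\x00":
        raise ValueError("not a version-1 core module")
    pos, exports = 8, []
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_uleb128(module, pos + 1)
        end = pos + size
        if section_id == 7:  # export section
            count, pos = read_uleb128(module, pos)
            for _ in range(count):
                name_len, pos = read_uleb128(module, pos)
                name = module[pos:pos + name_len].decode("utf-8")
                pos += name_len
                kind = module[pos]
                _index, pos = read_uleb128(module, pos + 1)
                exports.append((name, EXPORT_KINDS.get(kind, "unknown")))
        pos = end  # skip sections we do not interpret
    return exports


# Hand-assembled module exporting one function, "answer", of type () -> i32.
TINY_MODULE = bytes.fromhex(
    "0061736d01000000"          # magic "\0asm" + version 1
    "0105016000017f"            # type section: one type, () -> i32
    "03020100"                  # function section: one function of type 0
    "070a0106616e737765720000"  # export section: "answer" -> func 0
    "0a0601040041 2a0b"         # code section: i32.const 42; end
)

print(list_exports(TINY_MODULE))
```

This only recovers names and kinds; matching each exported function to its parameter and result types would additionally need the type and function sections.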


Hey Simon, I've done some similar experimentation using Extism to sandbox LLM generated code. This uses our JavaScript PDK (which uses quickjs) and can be embedded in any host language we support https://extism.org/blog/sandboxing-llm-generated-code

I'm also working on a Python-PDK right now and hope to have a beta ready in the next month. That would allow us to do something similar with python.


Sandboxing code generated by LLMs is another of my use-cases for this, that's a fantastic link, thanks!


I think I referenced one of your blog posts in that article too. So thank you!


As a robust sandbox, have you considered using a micro-VM? Firecracker [1] comes to mind, the VM behind AWS Lambda. It's designed to be lightweight to launch, suitable for running ephemeral code.

While I agree that it'd be nice to be able to use WASM for this purpose, it seems like a microVM might provide a more convenient interface: you can "just" run any existing programming language inside it (without needing any specific support for e.g. WASM). Indeed, you could run multiple processes built with different programming languages together and allow them to communicate in standard ways.

Additionally, VMs offer a number of advantages from a security perspective. Hypervisor VMs take advantage of hardware support, and their surface area is arguably well-hardened and smaller than alternatives (hence why VMs are used for cloud computing).

> I've not found a good pattern for running a JavaScript interpreter in a WASM sandbox yet

Is there a good reason to do this? I thought WASM typically used the V8 JavaScript interpreter as its sandbox and to execute code. If you could launch WASM, couldn't you equivalently launch an instance of V8 with the JavaScript code running inside directly? I do think this is a good question, and it raises further questions like: what if I want to run JavaScript and WASM side-by-side, so that they can communicate with each other and/or with native code.

[1] https://firecracker-microvm.github.io/


Firecracker is a fine technology, but serverless companies have started taking advantage of Wasm's faster start-up and invocation times for use cases of running Wasm on the server (https://www.youtube.com/watch?v=yqgCxhPAao0). The deny-by-default security policy makes Wasm a popular choice to run code in isolation, particularly for maximizing hardware resources in the multi-tenant environments these serverless companies operate.

> Is there a good reason to do this?

One use case to run JS inside a Wasm VM is Shopify Functions. Shopify allows their customers to customize things like checkout flow by writing code compiled to Wasm which gets executed during the checkout process. They want their customers to be able to write JS as well as other languages. https://github.com/Shopify/function-runner

> I thought WASM typically used the V8 JavaScript interpreter as its sandbox and to execute code.

V8 is popular for running Wasm on the web and for some serverless companies, but there are a bunch of serverless, blockchain, and iot projects that use other Wasm runtimes (Wasmtime, WAMR, WasmEdge, and Wasmer to name a few) - https://github.com/appcypher/awesome-wasm-runtimes


I'm working on this problem as well and would be happy to sling you some thoughts and notes. Check my website https://runno.dev and send an email to the address on that website!


I'm building something that solves this exact problem, and I'll have an alpha version ready soon. Drop me a mail at marc@ my username .net, if you're interested. I'd also really like to have a chat about your use cases if you're up for it.

Cheers, Marc


I created https://github.com/dicej/component-sandbox-demo when you asked about this on the Bytecode Alliance Zulip. Curious if you have any feedback on it.


Oh wow, I hadn't checked in on this. Looks like a much more complete method than the one I explored in https://til.simonwillison.net/webassembly/python-in-a-wasm-s...

Thanks! I'll give this a shot.


>When you're dealing with untrusted code you need a really robust sandbox. WebAssembly really should be that sandbox.

We shouldn't have to trust any code. WASM is a stepping stone out of the insanity that is ambient authority.


As another reply says, there's no such thing as perfect sandboxing unfortunately, especially not with the string of CPU side channels that have been uncovered, and continue to be revealed.


> I want to be able to take strings of untrusted code provided by users and execute them in a safe sandbox.

Isn't this the entire pitch of the browser as an app platform rather than a document viewer? Why target it if it's not sandboxed?


The idea of Wasm as a universal plugin system is very promising. But string passing is maybe not the best example to highlight, considering that Wasm is introducing stringref to enable zero-copy string sharing between the Wasm runtime and host language.

https://github.com/WebAssembly/stringref/blob/main/proposals...


I'm working on a WASM project[0] with Andy, the stringref champion, and unfortunately the future for stringref is uncertain. There is now a new proposed alternative called "JS string builtins"[1]. The way we intend to ship support for strings currently is to just use JS string builtins, because it is just a set of imported functions, not new instructions. Those imports can either be provided by efficient, native compiled functions (currently behind a V8 feature flag, not sure if other JS engines support it) or from a JS polyfill. It's not an ideal situation but this will allow for shipping strings that work in all browsers with default settings.

[0] https://gitlab.com/spritely/guile-hoot

[1] https://github.com/WebAssembly/js-string-builtins/blob/main/...


While it'd be a nice addition, I wouldn't expect it any time soon.

It's currently still a stage 1 proposal, while we've been waiting for years for other proposals to be merged. The last time a proposal was actually finished was over 2 years ago.

https://github.com/WebAssembly/proposals

https://github.com/WebAssembly/proposals/blob/main/finished-...


Indeed, webassembly is moving extremely slowly. I started a project years ago expecting https://github.com/WebAssembly/memory-control/blob/main/prop... and https://github.com/WebAssembly/memory64 to be fixed at some point. Neither are yet, and the project still suffers from it to this day.

I think wasm is still great without these fixes, but I have lost confidence in the idea that wasm will reach its full potential any time soon.


https://github.com/WebAssembly/memory64/blob/main/proposals/...

For your second one, it looks like it is already implemented in Chromium and Firefox but not Safari. Sadly, it's not new for Apple to be dragging their feet on moving web standards forward.

It took what seems like a decade to get proper WebRTC support in Safari.


Eventually people will realise its potential is being yet another bytecode format, with VC trying to capitalise products on top of it.


Assembly is not bytecode! V8 already has bytecode for the web. WASM is meant to run faster and more efficient computations.


Dude first learn what you are talking about.

> WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable compilation target for programming languages, enabling deployment on the web for client and server applications.

https://webassembly.org/
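
To make "binary instruction format for a stack-based virtual machine" concrete, here is a toy sketch that interprets four real Wasm opcodes (i32.const 0x41, i32.add 0x6A, i32.mul 0x6C, end 0x0B) over a Python list used as the operand stack. It is an illustration under assumptions, not a Wasm runtime: there is no validation, no functions or memory, and run_i32 is a hypothetical name.

```python
def read_sleb128(code: bytes, pos: int) -> tuple[int, int]:
    """Decode a signed LEB128 integer; return (value, next position)."""
    result = shift = 0
    while True:
        byte = code[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80 == 0:
            if byte & 0x40:  # sign bit of the final group
                result -= 1 << shift
            return result, pos


MASK32 = 0xFFFFFFFF


def run_i32(code: bytes) -> int:
    """Execute a straight-line i32 expression and return the top of stack."""
    stack: list[int] = []
    pos = 0
    while True:
        op = code[pos]
        pos += 1
        if op == 0x41:    # i32.const <sleb128 immediate>
            value, pos = read_sleb128(code, pos)
            stack.append(value & MASK32)
        elif op == 0x6A:  # i32.add: pop two operands, push the wrapped sum
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & MASK32)
        elif op == 0x6C:  # i32.mul: pop two operands, push the wrapped product
            b, a = stack.pop(), stack.pop()
            stack.append((a * b) & MASK32)
        elif op == 0x0B:  # end: the result is whatever is on top
            return stack.pop()
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")


# (2 + 3) * 4 written as stack code: const 2, const 3, add, const 4, mul, end
print(run_i32(bytes.fromhex("4102 4103 6a 4104 6c 0b")))
```

Real engines typically compile these instructions to machine code rather than interpreting them, but the operand-stack model is what the binary format itself encodes.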


I was going to say that everything around interoperability in Wasm has been stalled for years, but hey, looks like garbage collection has reached phase 4! That's a pretty big one!

The component model (aka interface types aka snowman types aka...) is still stuck in phase 1, though. After almost four years, that's not encouraging.


Thanks for sharing!

This is the first time I saw a mention of WTF-8 [1] and WTF-16. From the spec description it seems strange to use this interoperability "hack" (using the word from the spec) as the foundation of the string proposal. I wonder if they could use UTF-8 instead and keep WTF-16 for interoperability with JavaScript.

> WTF-8 [...] is a superset of UTF-8 that encodes surrogate code points if they are not in a pair. It represents, in a way compatible with UTF-8, text from systems such as JavaScript and Windows that use UTF-16 internally but don’t enforce the well-formedness invariant that surrogates must be paired. WTF-8 is a hack intended to be used internally in self-contained systems with components that need to support potentially ill-formed UTF-16 for legacy reasons.

[1] https://simonsapin.github.io/wtf-8/
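
A small sketch of why plain UTF-8 is not enough for JavaScript strings: an unpaired surrogate has no valid UTF-8 encoding, while WTF-8 gives it a three-byte form. Python has no WTF-8 codec, so this leans on the "surrogatepass" error handler as an approximation (an assumption of this example, not something the stringref proposal prescribes); the helper names are hypothetical.

```python
LONE_HIGH = "\ud83d"  # a high surrogate with no low surrogate after it


def strict_utf8_accepts(text: str) -> bool:
    """True if the text can be encoded as well-formed UTF-8."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


def utf16le_to_wtf8_like(raw: bytes) -> bytes:
    """Convert possibly ill-formed UTF-16LE to WTF-8-style bytes.

    Proper surrogate pairs are combined into one code point by the decoder;
    unpaired surrogates are let through and encoded as three bytes each.
    """
    text = raw.decode("utf-16-le", "surrogatepass")
    return text.encode("utf-8", "surrogatepass")


print(strict_utf8_accepts("\U0001F600"))             # a paired emoji
print(strict_utf8_accepts(LONE_HIGH))                # an unpaired surrogate
print(LONE_HIGH.encode("utf-8", "surrogatepass").hex())
```

The last line shows the three-byte sequence ed a0 bd, which a strict UTF-8 decoder rejects; that incompatibility is the trade-off the spec's "hack" wording is pointing at.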



This and its relatives feel like such a weird positioning choice for a company. If you have resolved to use some client library to do your heavy lifting, you want one focused and hardened on the specific language you are using, not one library of a dozen libraries spuriously maintained by a startup which will most likely eventually deprecate the library if it's not used with their Edge hosting service or whatever.

The huge list of languages does not inspire confidence. Like how going to a restaurant where the menu says they have amazing pizza and pho and ice cream makes one think that all three of those items are pretty bad.


this makes no sense at all


A lot of people ask why one would use Extism and how exactly it helps, so I wrote a blog post to explain it in detail. I hope it is informative!


If you want to hear more about extism we did an interview with them https://www.devtools.fm/episode/58


thanks for having us on!


If the 5x reduction in LOC to execute a wasm function isn't enough, then hopefully the logo just carries us to wasm nirvana :)


This blog article is great, I had a hell of a time integrating Wasm into Ruby, so much so to the point that I gave up.

I was absolutely going to call a function, pass it some HTML to parse (as a string), and return a number, but what I wound up doing is pass a directory with some HTML file in it, return a number as a string, and parse it. Because after struggling with wasmtime and wasmer until I could see it was not going to perform acceptably with a Ruby interpreter packed into the Wasm module, I found that further, it was seemingly impossible to call a function from within Ruby with any meaningful parameters without going through the system interface to the Wasm.

https://youtu.be/EsAuJmHYWgI?list=PLbzoR-pLrL6prBc8UnTQ9wI3B...

and

https://youtu.be/EsAuJmHYWgI?list=PLbzoR-pLrL6prBc8UnTQ9wI3B...

The article does a much better job of explaining these issues concretely, but if you're totally lost, you can hear my fever-dream version of the same ideas here. (I am a Wasm beginner.)

But if you don't have time to watch, tl;dr: I gave up, my Wasm in the end was called as a WASI, which I passed a filesystem context into rather than attempt to call any function at all, and I parsed the output from the system interface. Not too different than what it looks like the Extism example is doing. I will definitely be going through these docs as it seems likely to help me understand better what I've missed, and moreover that it has Ruby support right on the front page chef's kiss


I've been working on a new Ruby SDK for Extism 1.0 if you want to check it out: https://github.com/extism/ruby-sdk

I added host function support and cleaned up the API a bit.


Great talk, thanks for sharing! Would love to hear if Extism allows you to accomplish this, and if not, what prevented you. We're always down to chat on Discord, but an issue on the Ruby SDK repo will do just fine as well - https://github.com/extism/ruby-sdk/


This is great so far btw! Going to watch this whole talk today.


There's a lightning talk version from GitOpsCon (at GitOpsCon/CDCon Vancouver - co hosted with OSS Summit later in the week)

But they are pretty much the same talk, except at CDCon, I hadn't written the Kubernetes operator so that it actually ran the Wasm module yet. At the end of the talk at OSS Summit, I show it running and I'm so glad you enjoy it! I will definitely check out your new ruby SDK :tada:


Come join our Discord if you get the chance and find me there in the ruby-sdk channel. I've been doing some experiments with Wasm and ruby (particularly around rails and higher level application abstractions) and would love to have another rubyist to bounce ideas off of. Also I had trouble tracking down your email, but if you'd prefer you can reach me at ben at dylibso dot com


really enjoyed the talk btw!


Thanks so much!


Another open source project locking their community discussions away on Discord where it can't be easily discovered or searched.

At least consider mirroring to the web - or at this stage inform people you will one day make posts available on the open web. Acquiring that consent afterwards is painful.


I prefer to read documentation via Minecraft chat. That way I can craft a diamond pickaxe to smash my head in.


You can run examples in redstone


I've been wanting a bot that would scrape discord conversations into a database and then render them as HTML to be served statically, bash.org-style.


https://www.answeroverflow.com/ roughly does this and can be self-hosted


I really miss forums tbh, what happened to them?

It feels like Discord has replaced the good old forum but it's worse in so many ways.


> I really miss forums tbh, what happened to them?

Spammers and similar bad actors. :(


Most forums were awful phpbb nonsense where it was impossible to find the information you wanted. Search always required logging in and the only interface for very long threads was paging through them 10 badly laid out posts at a time.

No thank you.

Of course modern forums are better. Disqus is ok, and D's forum software is arguably the best thing to come out of the D project.

In any case Discord is not a replacement for forums; it's a replacement for IRC.


Unfortunately, it tries to also be a replacement for forums, with the worst of both worlds (the lack of real time, AND the lack of discoverability).


None of the phpBB-like forums that I am aware of (phpBB, SMF, vbulletin) require login to search.


I recall running into the odd one or two like that - I have vague memories that it was because searching was _slow_ and it didn't take more than a few people using it to bring the site to its knees. Restricting it to logged in users helped, and I think I saw some sites restrict searching to only post bodies for a period too, for similar reasons.

I haven't run into it recently, but I have at least vague memories of it happening in the past.


What makes the D-language forums so good? Never used them. (https://forum.dlang.org/)


Just have a browse. It's so fast! Also very clean, not stupidly sparse layout. Proper markdown support. I believe it also acts as a mailing list/newsgroup somehow but I'm not sure on the details of that.

Mainly it's just so fast and clean.


> Search always required logging in

I always used a search engine to search. Site search is usually worse than Google (is/was)


hear ya loud and clear -- and we use multiple forms of communication.. GitHub issues and discussions are actively suggested when we find that a conversation is turning into something that would benefit others and move from Discord.

Discord is really advantageous for us to have real-time conversations with people though.


Note to other maintainers: GitHub does have a nice Q+A feature. Use that, not Discord!

https://github.com/features/discussions


Not really a real-time chat replacement thing, so it's not a suitable replacement.


That's a feature. Promote discussions, not chit chat.


we also like Discord _for_ the “chit chat” though. getting to know the community that’s growing around your project is one of the most enjoyable aspects of open source IMO!


My point is that people WILL use your Discord as a forum and it WILL accumulate valuable knowledge.

This knowledge should be mirrored to the web so it's discoverable and archivable. And to do that you need consent - you should notify users now even if you don't implement it now.


Is there some service available for mirroring Discord chats to the web?



Looks excellent, thanks!


Zulip has a read only view available for the web too.


Oh cool, thanks!


I used https://www.answeroverflow.com/ which can be self-hosted.


Very useful, thank you!


Is there any problem with that? Discord is nice to use, free and most people already have it.


1. Discord search is very poor

2. Discord will go away one day and so will all the accumulated knowledge

3. People who aren't directly looking for your project won't find it. It will have no search footprint. You have to actively seek out the Discord to discover that there's activity there.

"Most people" don't already have it. Only a fairly specific handful of demographics.


> "Most people" don't already have it. Only a fairly specific handful of demographics.

This just shows how in-a-bubble people can be. I'm trying to think of a single person outside of my hardcore gamer friends that I know who would have Discord - drawing a blank.


I was talking about "most people" in the IT bubble as well


The problem is that information in Discord is locked away and is completely inaccessible.

As an example. An Elixir library I'm using, Ash, had its support forum on Discord. Just recently they moved it to a proper forum. Here's the immediate result [1]:

--- start quote ---

Moving to ElixirForum just benefited me. I googled elixir ash registry to understand what it is and if I still need it (I ended up with it from an example somewhere). The top result was this excellent question and answer

--- end quote ---

A forum is always nicer, and is accessible to more people.

[1] https://elixirforum.com/t/ash-community-updates/58515/9


There are several problems with Discord; see https://mastodon.derg.nz/@anthropy/110922638711307077


can't view without a phone number at times


Discord doesn't seem to need a phone number.

Are you thinking of Telegram, which does?


it depends. if your account gets flagged by some ML algo then they can ask for a number for verification.

it's similar to meta where some flagged accounts have to add an id


This is an ad. Just use WASI.


"Just" use WASI is not useful advice. I've been trying to do that for my own purposes for over a year. The learning curve on that (as a Python programmer who wants to use WebAssembly for sandboxing) is practically a vertical wall.


I should have said "Just use any WASI compliant runtime". There's a ton of them with these features, and none of the commercial angle.


They have all proven extremely difficult to use for my sandbox case - see other messages in this thread.


Can you link to a similar example using WASI?


Take your pick of runtimes. Wasmtime is the most popular right now.

https://github.com/bytecodealliance/wasmtime/blob/main/docs/...


Thanks. I'm not sure how this maps to the original example, though. This is about compiling a standalone program which translates POSIX apis, it seems? How about calling it from Rust with string args etc?


Exactly. That's what Extism is trying to solve. That WASI post doesn't show at all how to use the WASM code from C, or vice versa, because it just compiles the entire C program to WASM which uses the POSIX-based WASI API. If you want an alternative to Extism, you need something like wasmer.io, not just wasmtime (Extism actually uses wasmtime as mentioned in the post).


Right. The point being that you can take your pick of WASI compliant runtimes which have those features. Extism isn't something special compared to any other open source project in this domain, yet it smells of a commercial venture. Count me out.


>Extism isn't something special compared to any other open source project in this domain

Well, ok, can you give me an example of some other project doing this sort of interop without manual memory mapping?
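
For anyone wondering what "manual memory mapping" means here: core Wasm exports only take numbers, so passing a string means copying bytes into the guest's linear memory and handing over a pointer and a length. The sketch below simulates that dance in plain Python with a bytearray standing in for linear memory. ToyGuest, alloc, count_vowels and call_with_string are all hypothetical names for illustration; no real runtime or Extism API is being shown.

```python
class ToyGuest:
    """Stand-in for a Wasm instance: flat memory plus integer-only exports."""

    PAGE_SIZE = 65536  # Wasm linear memory grows in 64 KiB pages

    def __init__(self, pages: int = 1) -> None:
        self.memory = bytearray(pages * self.PAGE_SIZE)
        self._next_free = 0

    def alloc(self, size: int) -> int:
        """Bump allocator the guest would export so the host can reserve space."""
        ptr = self._next_free
        if ptr + size > len(self.memory):
            raise MemoryError("guest memory exhausted")
        self._next_free += size
        return ptr

    def count_vowels(self, ptr: int, length: int) -> int:
        """An 'export' that sees only (pointer, length), never a host string."""
        data = self.memory[ptr:ptr + length]
        return sum(data.count(v) for v in b"aeiouAEIOU")


def call_with_string(guest: ToyGuest, text: str) -> int:
    """Host-side glue: encode, allocate, copy, then call with integers."""
    raw = text.encode("utf-8")
    ptr = guest.alloc(len(raw))
    guest.memory[ptr:ptr + len(raw)] = raw
    return guest.count_vowels(ptr, len(raw))


print(call_with_string(ToyGuest(), "Hello, World!"))
```

Every host language has to repeat the glue in call_with_string (plus freeing memory and reading results back out), which is the boilerplate the question above is asking other projects to eliminate.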


WASI is great and obviously Extism supports it as a superset of functionality - but you don’t always want to give your guest code access to system resources even in a limited environment.

Extism also offers a bunch of other features that you don’t get with WASI. But use what’s best suited for your needs!


Claiming to make a thing that is already easy easy is a whole business model.



