This article is correct: Wasm has a code size problem. This is a problem in browsers because all that code has to be downloaded before the site can start. It's also a problem for serverless architectures, where code is often loaded from cold storage to a specific server on-demand while a client waits.
Tree-shaking might help, but I feel like it's only an incremental optimization. Fundamentally the reason Wasm programs are so bloated is because they have to bring their whole language runtimes and standard libraries with them. In contrast, with JavaScript, the implementation and basic libraries are provided by the browser. But obviously the browser can't be expected to have language runtimes for every language pre-loaded...
... or... could it?
I think we need to consider another approach as well: Shared libraries and dynamic linking.
WebAssembly supports dynamic linking. Multiple Wasm modules can be loaded at the same time and call each other. However, many Wasm toolchains do not attempt to support it. Instead, they are often designed to statically link an entire program (plus language runtime) into a single gargantuan module.
Pyodide (CPython on Wasm) is a counter-example. It is designed for dynamic linking today. This is precisely why Cloudflare Workers (a serverless platform) was recently able to add first-class support for Python[0]. (I'm the tech lead for the overall Workers platform.) A single compiled copy of the Pyodide runtime is shared by all Workers running on the same machine, so it doesn't have to be separately loaded for each one.
If dynamic linking were more widely supported, then we could start thinking about an architecture where browsers have various popular language runtimes (and perhaps even popular libraries) preloaded, so that all web pages requiring that runtime can share the same (read-only) copy of that code. These runtimes would still run inside the sandbox, so there's no need for the browser to trust them, just make them available. This way we can actually have browsers that have "built-in" support for languages beyond JavaScript -- without the browser maintainers having to fully vet or think about those language implementations.
> browsers have various popular language runtimes (and perhaps even popular libraries) preloaded, so that all web pages requiring that runtime can share the same (read-only) copy of that code.
That sounds a lot like the idea from some years past that commonly used JavaScript frameworks would be served from a few common CDNs, and would be widely used enough to almost always be in the browser's cache, so they wouldn't need to actually be downloaded for most pages (hence, the size of the JS frameworks shouldn't matter so much).
I'm no expert but from what I understand, that didn't really work out very well. A combination of too many different versions of these libraries (so each individual version is actually not that widely used), and later privacy concerns that moved browsers toward partitioning cache by site or origin. Maybe other reasons too.
Of course, you didn't mention caching and perhaps that's not what you had in mind, but I think it's a tricky problem (a social problem more than a technical one): do you add baseline browser support for increasing numbers of language runtimes? That raises the bar for new browsers even further and anyway you'll never support all the libraries and runtimes people want. Do you let people bring their own and rely on caching? Then how do you avoid the problems previously encountered with caching JS libs?
These are good questions and I think there's more than one answer that's worth exploring.
I think that the privacy problems caused by shared caches could be solved, without simply prohibiting them altogether. Like, what if you only use the shared cache after N different web sites have requested the same module?
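The "promote after N sites" idea can be sketched in a few lines. This is purely illustrative logic, not a real browser API; all names here (`SharedCachePolicy`, `request`) are made up.

```javascript
// Sketch of a "promote after N distinct sites" shared-cache policy.
// A module (keyed by content hash) is served from the shared cache only
// once some threshold of distinct top-level origins has requested it;
// before that, each origin gets its own partitioned copy.
class SharedCachePolicy {
  constructor(threshold) {
    this.threshold = threshold;
    this.requesters = new Map(); // moduleHash -> Set of requesting origins
  }

  // Record a request and report whether the shared copy may be used.
  request(moduleHash, origin) {
    if (!this.requesters.has(moduleHash)) {
      this.requesters.set(moduleHash, new Set());
    }
    const origins = this.requesters.get(moduleHash);
    origins.add(origin);
    return origins.size >= this.threshold;
  }
}

const policy = new SharedCachePolicy(3);
policy.request("sha256-abc...", "https://a.example"); // false: 1 origin, partitioned
policy.request("sha256-abc...", "https://b.example"); // false: 2 origins
policy.request("sha256-abc...", "https://c.example"); // true: threshold reached
```

The weak point, of course, is what counts as a "distinct" origin, since nothing stops one operator from registering N domains.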
But if we really can't get around that problem, then I think another approach worth exploring is for there to be some sort of curated repository somewhere of Wasm modules that are popular enough that browsers should pre-download them. Then the existence of the module in a user's browser doesn't say anything about what sites they have been to.
Versioning is a problem, yes. If every incremental minor release of a language runtime is considered a separate version then it may be rare for any two web sites to share the same version. The way the browser solves this for JavaScript is to run all sites on the latest version of the JS runtime, and fully commit to backwards compatibility. If particular language runtimes could also commit to backwards compatibility at the ABI level, then you only need to pre-download one runtime per language. I realize this may be a big cultural change for some of them. It may be more palatable to say that a language is allowed to do occasional major releases with breaking changes, but is expected to keep minor releases backwards-compatible, so that only a couple of different runtime versions are needed. And once a version gets too old, it falls out of the preload set -- websites which can't be bothered to stay up to date get slower, but that's on them.
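The resolution policy described above can be sketched as a small resolver. This is hypothetical: the preload set, the compatibility rule (newer releases within the same major satisfy older requests), and the function names are all invented for illustration.

```javascript
// Hypothetical preload set shipped with the browser: a couple of
// versions per language runtime, newest last.
const preloaded = {
  python: ["3.11.8", "3.12.2"],
  go: ["1.22.0"],
};

// Compare dotted version strings numerically, part by part.
function cmp(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// Resolve a site's requested runtime version against the preload set,
// assuming releases within a major version are backwards-compatible.
// Returns null when nothing matches: the site must ship its own copy.
function resolveRuntime(lang, requested) {
  const major = Number(requested.split(".")[0]);
  const ok = (preloaded[lang] || [])
    .filter((v) => Number(v.split(".")[0]) === major && cmp(v, requested) >= 0)
    .sort(cmp);
  return ok.length ? ok[ok.length - 1] : null;
}

resolveRuntime("python", "3.11.0"); // "3.12.2": newest compatible preload
resolveRuntime("python", "3.13.0"); // null: too new, site ships its own
```

A version that falls out of the preload set simply stops resolving, which is the "get slower, but that's on them" outcome.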
This is definitely the kind of thing where there's no answer that is technically ideal and people are going to argue a lot about it. But I think if we want to have the web platform really support more than just JavaScript, we need to figure this out.
I think a better model would be for the site itself to provide the modules, but the browser will hash and cache them for the next site that may want to use the same module.
This way, there's no central authority that determines what is common enough.
This model does not allow for versioning: the cache can only match exact bytes. Anything looser would be risky (one website could provide a malicious module that infects the next site you visit).
> Like, what if you only use the shared cache after N different web sites have requested the same module?
That would still let websites perform timing attacks to deanonymise people. There's no way to verify that "N different websites" isn't just the same website with N different names.
Though, we could promote certain domains as CDNs, exempt from the no-shared-cache rules: so long as we added artificial delay when it "would have" been downloaded, that'd be just as safe. We're already doing this with domains (HSTS preload list), so why not CDNs?
Web browser developers seem to labour under the assumption that anyone will use the HTML5 features they've so lovingly hand-crafted. Who wants something as complicated as:
<details>
<summary>Eat me</summary>
<p>Lorem ipsum and so on and so forth…</p>
</details>
when they could have this instead?

<Accordion>
<AccordionSummary id="panel-header" aria-controls="panel-content">
Eat me
</AccordionSummary>
<AccordionDetails>
Lorem ipsum and so on and so forth…
</AccordionDetails>
</Accordion>
Maybe the problem isn't the libraries. Maybe the problem is us.
The problem is the libraries. Browsers are still mostly incapable of delivering usable, workable building blocks, especially in the realm of UI. https://open-ui.org/ is a good start, but it will be a while before we see major payoffs.
Another reason is that the DOM is horrendously bad at building anything UI-related. Laying out static text and images? Sure, barely. Providing actual building blocks for a UI? Emphatically no.
And that's the reason why devs keep reinventing controls. Because while details/summary is good, it's extremely limited, does not provide all the needed features, and is impossible to properly extend.
It seems like limited dynamic linking support could go a long way.
For example, there could be a Go shared library that includes the runtime and core parts of the standard library that many programs use. It would decrease the size of all Go programs, without needing to have dynamic library support within an app. The language runtime might not need heavy optimization for space. It’s already loaded, and as long as any program uses a function, it’s not wasted space.
It changes the cost model for optimizing programs in that language for space. Since included standard library functions are free (if you’re using the language at all), you might as well use them.
Though, the problem reoccurs with commonly used libraries and frameworks. You’d also want Cloudflare’s standard library for Go to be shared when running on Cloudflare.
One problem with this model is that languages don’t evolve in lockstep with the runtime. Either there would be limited support for different versions of a language, or the shared libraries available would pile up over time, resulting in limited sharing between apps. JavaScript has the “you don’t get a choice” versioning model, which requires strong backward compatibility and sometimes polyfills. It might not be as suitable for other languages.
If a platform really wants to cut down on space, it can do so by limiting plugin diversity. Though there are complaints, "you must use JavaScript" worked out pretty well for browsers.
Maybe we don't need a lot of different WebAssembly-based languages? It's a Tower of Babel situation. Diversity has costs.
Could it be possible to do "profile guided tree-shaking" to build a small module with all the code that's necessary for the application and pull-in less used functionality on-demand using dynamic linking?
If tree-shaking was done based on production information it may be possible to prune a lot of dead/almost-dead code without having to implement sophisticated static analysis algorithms.
A lazy chunked delivery strategy like the one used in the k8s stargz-snapshotter[0] project could be effective here, where chunks are only pulled as needed, but it would probably require Wasm platform changes.
There is a substantial risk there unless you can hit all the edge cases and error conditions when profiling. Even a good fuzzer can miss a very rare state. Then when you hit it in real use there's no code to handle it!
Profile-based optimization and JITting is plausible because the corner cases are still there, just not optimized.
I completely agree, that's why in that case you could download the missing code from the server and load it using dynamic linking.
The server would then mark it as reachable so it's delivered as part of the main bundle next time.
I would expect the bundle to converge quickly to the set of functions that are actually reachable.
Additionally, it's very likely that the sets of reachable code of two versions of the same app have significant overlap, so the information collected for version N could be used as a starting point for N+1, and so on.
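The feedback loop being described can be simulated in a few lines. Everything here is illustrative: the class, the profile format, and the "download on demand" step are stand-ins for a real toolchain and server.

```javascript
// Simulation of profile-guided tree-shaking with an on-demand fallback:
// ship only functions observed in production profiles; when an
// unprofiled function is hit at runtime, load it dynamically and mark it
// reachable so the next build includes it.
class ProfileGuidedBundler {
  constructor(allFunctions) {
    this.allFunctions = allFunctions; // full static program: name -> fn
    this.reachable = new Set();       // names observed in profiles
  }

  // Feed in names of functions seen called in production.
  recordProfile(calledNames) {
    for (const name of calledNames) this.reachable.add(name);
  }

  // The "main bundle": only functions believed reachable.
  buildBundle() {
    return [...this.allFunctions.keys()].filter((n) => this.reachable.has(n));
  }

  // Runtime fallback: a cold function is "downloaded" via dynamic
  // linking and recorded as reachable for the next build.
  call(name) {
    if (!this.reachable.has(name)) this.reachable.add(name);
    return this.allFunctions.get(name)();
  }
}

const bundler = new ProfileGuidedBundler(new Map([
  ["main", () => "ok"],
  ["rareErrorPath", () => "handled"],
]));
bundler.recordProfile(["main"]);  // profiles never saw the error path
bundler.buildBundle();            // ["main"] -- small initial bundle
bundler.call("rareErrorPath");    // cold hit: loaded on demand, still correct
bundler.buildBundle();            // ["main", "rareErrorPath"] next time
```

This captures why the scheme is safe against missed corner cases: a cold hit costs a round trip, not a crash, and the bundle converges toward the truly reachable set.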
> we could start thinking about an architecture where browsers have various popular language runtimes (and perhaps even popular libraries) preloaded
that could potentially lead to hundreds of versions of runtimes downloaded in the browser, filling up the cache with binaries that might be used by 1 site each
I think I agree overall, just want to point out that with Wasm you still end up using a fair bit of the built-into-the-browser JS to accomplish anything that isn't purely computational. Especially in this context with Hoot [1], where things like appendChild are external functions you call from inside the Scheme code. One could theoretically do this for much of the JS standard library in any kind of Wasm context.
Indeed, I/O APIs (anything that talks to the outside world) are another sore point for WebAssembly, as browsers do not currently expose any particular APIs directly to Wasm, only to JavaScript. So Wasm has to make calls to a JavaScript middleman layer to use those APIs.
But browsers are understandably hesitant to create a whole parallel API surface designed specifically for Wasm callers. That's a lot of work.
I am not totally convinced that this is a real problem, vs. just something that makes people feel bad. Like, if you are coding Rust, the idea that all your "system calls" are calling into a layer of JavaScript feels disgusting. But is it a real problem? Most of these calls are probably not so performance sensitive that this FFI layer matters that much.
If it is a real problem, I'd guess the answer is for browsers to come up with a more efficient way to expose WebIDL-defined APIs to Wasm, but without reinventing any individual APIs. Being derived from WebIDL, they are still going to have JS idioms in their design, but maybe we can at least skip invoking actual JavaScript.
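The "JavaScript middleman" is concrete enough to demonstrate: a Wasm module cannot reach any browser API directly; everything external arrives through the import object at instantiation. The byte array below is the standard binary encoding of a tiny module that imports `env.log(i32)` and exports `run()`, which calls `log(42)`; in a real page the glue function could wrap a DOM call like appendChild.

```javascript
// Hand-assembled Wasm module: imports env.log, exports run() -> log(42).
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00,       // type 0: (i32) -> ()
  0x60, 0x00, 0x00,                               // type 1: () -> ()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76,       // import "env"
  0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00,             //   "log": func of type 0
  0x03, 0x02, 0x01, 0x01,                         // one func of type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run"
  0x0a, 0x08, 0x01, 0x06, 0x00,                   // code section
  0x41, 0x2a, 0x10, 0x00, 0x0b,                   //   i32.const 42; call log; end
]);

const received = [];
// The middleman layer: plain JS functions handed to the module.
const glue = { env: { log: (x) => received.push(x) } };

const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes), glue);
instance.exports.run();
// received is now [42]
```

Every such crossing is a JS call, which is the FFI overhead in question; a WebIDL-derived direct path would remove the JS frame but keep the same shape of import.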
There's a lot I don't know about how browsers are shipped, but it seems to me like browsers could easily get away with packing in a few languages and their standard libraries as part of their default installs. Python is what, 25MB? Would another couple hundred megs of disk space be such a big deal?
Possibly – if you can find a single version of Python that everybody will be happy with, forever.
Being able to cache runtimes and libraries like that across sites would be nice, though (but probably enables fingerprinting, so one Python runtime per origin it is).
The way Emscripten does it, IIRC, doesn't require any special browser support. The toolchain generates glue code in JavaScript to support calls between dynamically linked Wasm modules.
Do you happen to know where can I check out the cutoff version for each browser? https://caniuse.com/?search=wasm doesn't have it (or other things like WasmGC for that matter)
I believe dynamic linking has been a core feature of WebAssembly from the beginning. You have always been able to load multiple Wasm modules in the same isolate and make them call each other.
(But, language toolchains have to actually be designed to use this feature. Most aren't.)
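As a minimal demonstration of this, here are two hand-assembled Wasm modules linked at runtime with no special browser support: module A exports `add(a, b)`; module B imports it (as `"lib"."add"`) and exports `double(x) = add(x, x)`. The byte arrays are the standard binary encoding of those two tiny modules.

```javascript
// Module A: exports add(a, b) = a + b.
const bytesA = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32,i32)->i32
  0x03, 0x02, 0x01, 0x00,                               // one func of that type
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   //   local.get 0/1; i32.add
]);

// Module B: imports lib.add, exports double(x) = add(x, x).
const bytesB = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
  0x01, 0x0c, 0x02, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type 0: (i32,i32)->i32
  0x60, 0x01, 0x7f, 0x01, 0x7f,                         // type 1: (i32)->i32
  0x02, 0x0b, 0x01, 0x03, 0x6c, 0x69, 0x62,             // import "lib"
  0x03, 0x61, 0x64, 0x64, 0x00, 0x00,                   //   "add": func of type 0
  0x03, 0x02, 0x01, 0x01,                               // one func of type 1
  0x07, 0x0a, 0x01, 0x06, 0x64, 0x6f, 0x75, 0x62,       // export "double"
  0x6c, 0x65, 0x00, 0x01,
  0x0a, 0x0a, 0x01, 0x08, 0x00,                         // code section
  0x20, 0x00, 0x20, 0x00, 0x10, 0x00, 0x0b,             //   call add(x, x)
]);

const libA = new WebAssembly.Instance(new WebAssembly.Module(bytesA));
const libB = new WebAssembly.Instance(new WebAssembly.Module(bytesB), {
  lib: { add: libA.exports.add }, // dynamic linking: wire B's import to A's export
});
// libA.exports.add(2, 3) === 5; libB.exports.double(21) === 42
```

A real toolchain would generate this wiring (and the shared-memory setup a language runtime needs) rather than hand-writing bytes, which is exactly the toolchain work most languages haven't done.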
[0] https://blog.cloudflare.com/python-workers