Firefox's low-latency WebAssembly compiler (wingolog.org)
374 points by robin_reala 10 days ago | 99 comments

What's the latest with WebAssembly? I must admit, I'm not fully sure I understand what its goals are and what we should expect its usecases to be.

For a time there seemed to be a lot of hype around writing SPAs in whatever language you like, not just JavaScript, but that seems to have cooled off somewhat - at least that's my impression.

Last time I checked in it seemed to be actually quite the opposite - good for low level stuff, data processing, graphics, things in other threads essentially, but if you want to touch the DOM at all just use JS seemed to still be the standard advice. I haven't looked again since. Have things progressed on that front, and if not was that ever really a goal of WebAssembly anyway?

One cool application that cropped up in the Rust ecosystem is to use WASM for procedural macros. That way not only can they be distributed as precompiled (arch-agnostic) blobs from crates.io, but you also get all the benefits of the sandbox environment to guarantee freedom from malicious side effects. The W in "WASM" doesn't have to limit possible applications to just the web browser.


And once there is a GC, it will be a good target for scripting languages on the OS, as wasm compilers are optimized for fast startup.

> but if you want to touch the DOM at all just use JS seemed to still be the standard advice

This is still the current state of WebAssembly. There are several proposals out there that, if implemented, could change the situation. The primary one that I'm aware of is the interface types proposal [1].

[1] https://github.com/WebAssembly/interface-types/blob/master/p...

And garbage collection. Without it, you can't touch the DOM, IIRC.

The goal: Allow a uniform execution context (compile target) in the browser via a standard byte code. That means executing an application written in any language, provided a corresponding compiler, as an island in a web page.

In practical terms the language of the browser is JavaScript, but what if JavaScript doesn't do what you need, or it isn't fast enough, or you simply don't like JavaScript? Say you want to write some application in Rust instead (just using that as an example). The language of the web is still JavaScript and not Rust, so your new application will not run directly in a web page. That's unfortunate, because the web is a fantastic distribution platform: things that run in web pages do not require a separate installation or download step, since page assets are downloaded automatically as the page requests them. If you compile your Rust application to WASM, it can run in a web browser.

To be clear, though, WASM isn't replacing JavaScript or even competing with it in any way, although that is certainly what it can sound like. WASM is a separate and isolated execution instance in the browser, isolated even from the page that contains it. JavaScript, on the other hand, executes directly in the context of the page that requests it.

You might hear some discussion of wanting WASM to replace JavaScript, but that is clearly stated by the WASM project as not a design goal. You might also hear that if only the containing page's DOM were available to WASM it would replace JavaScript, but that isn't a design goal either. That part about the DOM is the confluence of a misunderstanding about the DOM and about WASM's security goals of a separate execution context. There is work, however, on providing some Web API capabilities to WASM so that a WASM instance may have some awareness of its execution context or to allow some manner of external interaction of the WASM instance from the surrounding page. I don't know the status of that work though.

* WASM project site - https://webassembly.org/

* Web APIs, a list of de facto standards for interacting with web browsers from JavaScript - https://developer.mozilla.org/en-US/docs/Web/API

* DOM specification, the DOM is a language agnostic dynamic tree model that represents markup and containing content - https://dom.spec.whatwg.org/

> You might hear some discussion of wanting WASM to replace JavaScript, but that is clearly stated by the WASM project as not a design goal.

Unfortunately, that appears to be a huge miss. There are a lot of programmers in other (non-JS) languages that would very much like to create front end pieces for their web applications.

Hopefully when the interface types proposal (or whatever) happens, direct access to the DOM does become a thing and non-JS programmers really can do their front end work without JS.

Nobody from the WASM working group is working on that though, so the probability of direct access to the page’s DOM is close to zero. Programmers have been hoping for this for years, but WASM has been very clear that it isn’t in the project’s design path. There are intentional design objectives limiting that work officially, but likely the largest hurdle is overcoming the various security and privacy concerns already in place due to the isolation of WASM instances from their containing page.

Far more likely, and this already exists in various unofficial examples, is that WASM instances ship with a DOM for accessing markup provided by or requested into the WASM instance.

> ... WASM instances ship with a DOM for accessing markup provided by or requested into the WASM instance.

Sounds like it would achieve the same end goal? eg direct access to the page elements, without needing to go through JS

Or is that my not really understanding the problem space? :)

No, a DOM that is provided by a WASM instance will provide access to markup contents available to that WASM instance, such as markup requested or shipping with the WASM code, but not outside the WASM instance such as the containing page.

Interesting. What's the intended use case for that?

Trying to think of something, but nothing is coming to mind.

If your WASM application makes use of any kind of XML or HTML content you need to parse it or it’s just a string.

No worries, thanks. :)

Check Qt, Blazor or Uno.

Thanks, but Qt isn't really suitable for web apps. :(

eg PgAdmin, which went from a good desktop application (v3) to unusable in non-trivial cases on the desktop when they switched to Qt (web). :(

Blazor / Uno seem to be C#? Not (personally) going to touch that with a long stick. ;)

While replacing JavaScript might not be a design goal, the experience on Android and iOS shows what happens when there is a native layer for other eco-systems to plug into the platform.

For me, WebAssembly will be the revenge of browser plugins.

We already have Flash, Java applets and Silverlight being made available on WebAssembly, even if not at the same feature level as their former selves.

Without the security holes, hopefully.

Well, given that WebAssembly has decided that doing bounds checking inside a linear memory segment, or memory tagging wasn't worth the effort...

Meanwhile real hardware is adopting memory tagging (with the exception of x86/x64).

I’m well aware of hardware support for enforcing memory safety, but that’s not quite what I’m talking about here: browser plugins have historically been poorly sandboxed, while WebAssembly should be an improvement in this area.

Which I doubt, given that it depends on what is being compiled into WebAssembly anyway.

It is really nothing new here, just the politics to have a bytecode execution platform that everyone agrees on.

For example, if I can replicate a Flash security flaw with CheerpX, should I blame Flash, CheerpX for being a good implementation, or WebAssembly for allowing Leaning Technologies to faithfully re-implement Flash?

Memory tagging or inner bounds checks have no effect on sandboxing. The wasm guarantee is that a module cannot escape its sandbox, not that it is logically correct.

The security model of the web is based on executing arbitrary code, so you would need to assume that the wasm you receive tagged its memory in the most malicious way possible anyway.

The way I see it, real hardware needs these kinds of checks more because it has worse sandboxing.

Also, Flash vulnerabilities were about escaping the execution sandbox of the web, so in your case you should blame a bad implementation/specification of wasm.
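To make the distinction in this exchange concrete, here is a toy Python model of a linear memory. It is only a sketch of the idea, not how any engine is implemented: `LinearMemory`, `Trap`, and the buffer/flag layout are all hypothetical names invented for illustration. It shows that an overflow between two objects inside the memory goes unnoticed, while an access past the edge of the memory traps.

```python
class Trap(Exception):
    """Raised when an access falls outside linear memory (modelling a wasm trap)."""


class LinearMemory:
    """Toy stand-in for a wasm linear memory: one flat, bounds-checked byte array."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)

    def store(self, addr: int, payload: bytes) -> None:
        # The only check is against the edge of the whole memory,
        # not against the bounds of whatever object lives at `addr`.
        if addr < 0 or addr + len(payload) > len(self.data):
            raise Trap(f"out-of-bounds store at {addr}")
        self.data[addr:addr + len(payload)] = payload

    def load(self, addr: int, length: int) -> bytes:
        if addr < 0 or addr + length > len(self.data):
            raise Trap(f"out-of-bounds load at {addr}")
        return bytes(self.data[addr:addr + length])


mem = LinearMemory(32)
NAME_ADDR, NAME_CAP = 0, 8   # hypothetical 8-byte name buffer
FLAG_ADDR = 8                # hypothetical 1-byte flag stored right after it

mem.store(FLAG_ADDR, b"\x00")
# A 9-byte write into the 8-byte buffer stays inside linear memory,
# so it succeeds and silently clobbers the neighbouring flag.
mem.store(NAME_ADDR, b"A" * NAME_CAP + b"\x01")
flag_after_overflow = mem.load(FLAG_ADDR, 1)

# A write past the end of linear memory is the case that actually traps.
try:
    mem.store(30, b"toolong")
    escaped = True
except Trap:
    escaped = False
```

In this model the module can corrupt its own data (the logical-correctness concern), but it cannot reach anything outside its memory (the sandboxing guarantee).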

AFAIK WebAssembly can't do more harm than JS since it lives in the same sandbox, so that's not a concern? You're basically running Flash in a small VM.

I think the ultimate goal was getting GIMP to run in the browser.

Javascripters aren't going anywhere anytime soon

It would be better if GIMP ran nowhere tbkh

May I ask why

AFAIU the main point of WebAssembly is to provide a performant way to use existing C/C++ libraries for functionality that won't run fast enough in JavaScript. Current JavaScript engines are very tricky when it comes to optimizing JavaScript, while WebAssembly is expected to execute in a more predictable way with regard to performance.

So if you want to implement cryptography or a codec, WebAssembly is a good tool. If you just want to write some browser code, JavaScript is enough.

Another use of WebAssembly probably is to serve a better compilation target. While you can compile anything to JavaScript, it makes more sense to compile to WebAssembly if you don't need GC. So with enough time there will be more languages available for web application development. That said, without GC support in WebAssembly, for most interesting languages it's still not usable.

Even with GC it won't be a great target for pre-existing languages. Whatever WebAssembly's future GC semantics, I can guarantee you they won't be the same GC semantics as for Go, Java, Lua, Perl, Python, or Ruby because they all have different semantics--semantics which interleave with other design choices in those languages. Bridging those semantics will either require similar workarounds as currently required, or modifying the behavior of the language.

WebAssembly with a deep GC model will at best be a slightly better environment for writing JavaScript extensions and at worst a way to write really obtuse JavaScript code.

What do you mean by GC semantics? Like when finalizers are run, what's a value type vs a reference type, or? I would imagine any code which relies on the semantics of a GC implementation is pretty broken.

In the Ruby community we discourage anyone from doing crazy stuff like using ObjectSpace._id2ref

The most obvious semantic is persistence or lack thereof. Java and JavaScript can use a copying collector because addresses aren't exposed. JNI/JNA is the exception, though the JVM has a particular model for pinning memory that is in design chummy with JVM's GC model. All the others, AFAIK, use persistent memory, usually for the FFI semantics. Presumably WebAssembly would end up with a persistent model, as well, if the GC model went deep enough. But right off the bat you're losing a unique option available to some very high performance VMs--compaction and defragmentation. (Notwithstanding the Go paper from a few years ago that shows fragmentation isn't a serious issue these days, or the fact that WASM's singular, linear heap and 32-bit addressing is far more limiting--which on second thought actually might make compaction even more desirable.)

Less obvious (though as you pointed out) are things like finalizers. In Lua finalizers are guaranteed to run in the reverse order of allocation, and that's an important guarantee when writing C modules. Finalizers in Lua can also resurrect dead objects, which isn't always the case in other languages, and indeed some languages don't even support finalizers at all, which simplifies their GC model. Lua also has an optimization/simplification constraint that finalizers must be attached at object construction (specifically, metatable assignment). If WASM copied something like this, it might disadvantage other languages; if it didn't have such a constraint then the GC implementation might be unnecessarily complex relative to Lua's needs.

Another example would be support for ephemerons, which are more powerful than weak references. Only a few languages support this, such as Lua, Smalltalk, and more recently OCaml and JavaScript. AFAIU ephemeron support bleeds into many aspects of the GC. Supporting that in WASM might be tricky and, I'd bet, not something likely to happen--you can't just hoist v8's or SpiderMonkey's GC engine into WASM wholesale, even though I suspect the prospect of doing that motivates some of the interest in adding deep GC support to WASM.
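A small Python sketch may help show why ephemerons are more powerful than plain weak references. It relies on CPython's `weakref.WeakKeyDictionary`, which holds keys weakly but values strongly, so it is not an ephemeron table. `Node` and `Metadata` are hypothetical classes made up for the demonstration.

```python
import gc
import weakref


class Node:
    """Hypothetical key object."""


class Metadata:
    """Hypothetical value that may point back at its key."""

    def __init__(self, owner=None):
        self.owner = owner


table = weakref.WeakKeyDictionary()

# Case 1: the value does not reference the key, so the entry dies with the key.
k1 = Node()
table[k1] = Metadata()
del k1
gc.collect()
entries_without_backref = len(table)

# Case 2: the value references its key. The table holds the value strongly and
# the value holds the key strongly, so the weak key never becomes unreachable.
k2 = Node()
table[k2] = Metadata(owner=k2)
del k2
gc.collect()
entries_with_backref = len(table)
```

With true ephemeron semantics, the entry in case 2 would also be collectable, because a value's reference to its own key would not count toward keeping that key alive. Getting that behaviour requires cooperation from the collector's marking phase, which is why support for it reaches into the GC design.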

If and how WebAssembly's GC model supports these constructs matters greatly to whether other languages can run as first-class citizens in the WASM VM. As an example of the headaches caused when support is lacking, think about exceptions. The standard C API setjmp/longjmp is ridiculously slow in WASM, yet it's how Perl and Lua (and possibly other languages) implement exceptions. Exceptions are on the roadmap for WASM, but they look to be more tailored to Java/C++ exceptions, which iteratively unwind the stack; Perl and Lua don't need to unwind the stack because they carefully maintain and preserve internal state consistency and can jump across deeply nested function invocations. That's hardly a blocker, but in practice it's incongruent and I would expect Perl and Lua to either suffer for doing things differently, end up having to rearchitect their VMs to suit WASM, or more likely motivate the emergence of a WASM-specific VM (e.g. IronPython on .NET), which all but guarantees second-class status, either for the new or old project.

None of these are deal breakers. You can synthesize behaviors and hack your way around all of this. My point is merely that the prospect of a glorious execution model that unifies the diverse world of languages is naive. Languages are designed and implemented around a core set of primitives and concepts; and many of these primitives and concepts either won't exist in WASM, or will behave so differently as to be effectively useless, especially when cross-compiling existing implementations.

> Presumably WebAssembly would end up with a persistent model, as well, if the GC model went deep enough.

That's not going to happen. The primary point of WebAssembly GC support is for integration with the host, with GC for the guest program secondary. And the biggest WebAssembly host is the browser, which as you note uses a copying collector. A design for WebAssembly GC that doesn't allow that is a non-starter.

Because of this prioritization, extensions like finalizers or weak references aren't even under consideration for the initial version: https://github.com/WebAssembly/gc/blob/master/proposals/gc/O...

> My point is merely that the prospect of a glorious execution model that unifies the diverse world of languages is naive.

Hardware already provides this; the whole idea behind WebAssembly and the reason it can support C so well compared to most other VMs is that it exposes a more hardware-like execution model. Implementing your own GC if WebAssembly's isn't suitable should not be considered a hack, just the usual bundling of the runtime that happens on hardware.

CLR supports C and C++ quite well.

The CLR is not "most other VMs." You might say it's the exception that proves the rule. ;)

Further, as I've pointed out to you before, the CLR does not really support unmodified C and C++ as well as WebAssembly does. It supports C++/CLI instead, which is not a superset of standard C or C++. And worse, the main (only??) compiler for C++/CLI has no plans for keeping newer features of C++ working on the CLR.

In the context of this discussion about integrating GC into a VM that supports C and C++, the CLR is an odd choice of example anyway. The CLR doesn't aim to support existing languages unmodified, every port I've seen has been similar to C++/CLI in diverging from its original definition, specifically because of its memory and object models. And those are precisely what WebAssembly does away with.

WebAssembly also does not support unmodified C and C++.

What do you call emscripten and the language extensions it makes use of, then?

One of the major features for .NET Core 3.0 was C++/CLI support. It is a major .NET language in Windows desktop applications, as it is more convenient than writing P/Invoke annotations.

Also CLR is just the example more people nowadays can relate to, I can give other C and C++ bytecode examples.

You don't have to use emscripten or its language extensions to build C or C++ for WebAssembly. And in the other direction, emscripten's raison d'etre is unmodified C and C++, by providing the platform APIs those unmodified programs use.

.NET Core 3.0's C++/CLI support, again, is not standard C++ and never will be! It's just the same C++/CLI on a more up-to-date CLR.

So then help me here, I am kind of lost, in which ISO C page are EMSCRIPTEN_BINDINGS, EXPORTED_FUNCTIONS described, or WebIDL for that matter?

As I originally pointed out, the difference between C++ and C++/CLI is that C++/CLI is not just a superset of standard C++. Things are changed or missing, not just added. You can't just take a pre-existing standard C++ program and drop it in a C++/CLI compiler because you'll hit all those differences.

(Things may have been better in the past, but the C++/CLI spec has not kept up with the C++ standard at all and so the main implementation, when asked for "C++17 and C++/CLI," just kind of shrugs and makes a mess.)

This is not true of emscripten. It does add some JavaScript binding functionality, but that's no different from e.g. Linux or Windows offering their own platform APIs beyond the C++ standard library. You can just take a lot of pre-existing standard C++ programs, often including those that use platform-specific APIs, and drop them into emscripten!

CLR and wasm share many design goals

Yeah, except that usually WebAssembly is placed as being innovative in supporting C and C++.

Bytecode for C and C++ is even older than the CLR, and before the browsers, there were a couple of J2ME-competing stacks that used C and C++ instead.

Again, the innovative part is the embeddability of the bytecode/VM. The CLR is not designed to run malicious code and still be safe (It had very interesting work on that with Midori, still).

The innovative thing with wasm is half political (that is it is socially free to implement and use) and half the lightweight reliable sandboxing.

It is perfectly possible that there is a subset of the CLR that would have been a better Web-Assembly than this WebAssembly, but the CLR chose different trade-offs so it isn't.

One of WebAssembly's main distinguishing features is that it can compete with eBPF in terms of safe embeddability.

WebAssembly is indeed innovative, especially in the field of security. Another great concept is the normalized control flow graph of WebAssembly bytecode: it significantly simplifies the job of interpreters and JIT compilers at run time.

But WebAssembly (wasm32) code density is not the world's best. Compared to good old x64 it comes in at a 1.42:1 ratio, which is still tolerable. I hope someday we'll get a denser instruction set like ARM's Thumb.

So what does it offer versus CLR, IBM i and z/OS security models?

> My point is merely that the prospect of a glorious execution model that unifies the diverse world of languages is naive.

Well, the native machine does that…

> So if you want to implement cryptography or codec, WebAssembly is a good tool.

For cryptography specifically, the security considerations are the same as for JS: all of your security promises die if your threat model assumes that you cannot trust the security of TLS.

WASM can be MITM’ed just as well as JS.

I've been messing around a lot with Blazor (an implementation of wasm support for .NET) and it was possible to write decent CRUD stuff with it, but tooling support isn't great yet.

It's tricky to develop a "standard" frontend when hot reload isn't supported by default, the DLLs are far from small, and you have to use workarounds to manipulate the DOM.

I think it needs to mature, both wasm and Blazor, because all of those issues are supposedly going to be addressed.

Here is the simple answer: as W3C stated, it is a new "language" for the web [1]. So you get alternative language options other than JS. That's it.

Admittedly, the current state of WASM is somewhat limited. For example, it cannot touch the DOM directly without JS right now. One of the goals of WASM is to make it a more browser-native module (just like ES6 modules) to remove that JS dependency. This goal will take years, I guess.

Of course, things make people confused as many guys are experimenting with WASM outside of the browser. But this is the initial goal in the web browser context.

[1]: https://www.w3.org/2019/12/pressrelease-wasm-rec.html.en

One potential future use could be as a better llvm bitcode. Bitcode isn’t portable but wasm is, and wasm has plans to be a much better platform for a variety of languages (eg with gc support and delimited continuations) whereas making gc compatible llvm code is basically impossible because you need to write the gc yourself but can’t control what’s in the registers or on the stack. Perhaps an optimised wasm compiler/runtime could be used to get a good compiler for a new language without doing as much work.

Or perhaps a general vm will never work as well as a custom one and the trade off won’t be worth it.

>> For a time there seemed to be a lot of hype around writing SPAs in whatever language you like, not just JavaScript, but that seems to have cooled off somewhat - at least that's my impression.

I'm building exactly that. A SPA using Go compiled to WASM. Some issues seem to be the startup time and the payload size.

Similar kind of thing for myself.


Standard Go:

The resulting .wasm sizes aren't practical. 2MB+ for "hello world".

That being said, it was only ~2.5MB for an initial mock 3d on 2d canvas test thing: https://justinclift.github.io/wasmGraph1/ (~550k when gzip compressed, as is commonly done when sending over a network)



TinyGo:

Much better sized .wasm files. ~500 bytes for "hello world".

~65k for a mock 3d on 2d canvas test: https://justinclift.github.io/tinygo_canvas2 (~26k compressed)



Rust:

~110k for a port of the same mock 3d on 2d canvas test: https://justinclift.github.io/rust_canvas_2d/ (~44k compressed)

Note that this is done with lousy Rust code, as I'm still in the early stages of getting the hang of it.


Rust (so far) seems to be the best trade off, at least for now. Although TinyGo's binary sizes are much better than it and standard Go, TinyGo breaks far too often, and reported blocker bugs aren't addressed for months. :(
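For anyone wanting to reproduce the raw versus gzip-compressed comparison quoted above, a minimal Python sketch follows. The payload here is synthetic (a wasm header followed by a repeated byte pattern), so the numbers it produces say nothing about real modules. `transfer_sizes` is a hypothetical helper name.

```python
import gzip


def transfer_sizes(payload: bytes) -> tuple[int, int]:
    """Return (raw_size, gzip_size) in bytes for a payload."""
    return len(payload), len(gzip.compress(payload))


# Synthetic stand-in for a .wasm binary: the 8-byte wasm header followed by a
# repeated filler pattern. A real check would read the compiler's output,
# e.g. open("app.wasm", "rb").read(), where app.wasm is a hypothetical path.
fake_module = b"\x00asm\x01\x00\x00\x00" + b"\x20\x00\x41\x01\x6a" * 2000
raw_size, gz_size = transfer_sizes(fake_module)
```

Real binaries compress far less dramatically than this repetitive filler, so the ratios measured on actual build output are the ones that matter.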

The resulting .wasm size of my app is about 70MB so at this time I believe you can only develop enterprise apps or very simple stuff.

I would love an expert's thoughts on whether something like MathJax/KaTeX can take advantage of WebAssembly for faster rendering. Seems like a perfect use case to me.

MathJax will probably not benefit much from WebAssembly. It doesn't actually render math itself. It just creates HTML + CSS (or MathML), and the browser renders it using the normal, highly-optimized rendering paths.

But it does parse and compile latex expressions right?

It's more of a macro system stapled to a layout engine, which emits a DOM tree.

WebAssembly sucks at DOM. It sucks huge. WebAssembly can't hold a reference to a DOM object and can't even create strings. You have to do that all in JavaScript, and then import that JavaScript code into your WebAssembly module. You'll end up doing something weird like sticking your DOM objects in an array and then passing indexes into the array to your WASM code. Strings have to be encoded/decoded to get them in/out of WASM.
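The array-of-objects workaround and the string encoding step can be sketched as follows. This is a Python simulation of the pattern, not real browser glue: the `bytearray` stands in for the module's linear memory, the dict stands in for a DOM element, and `HandleTable`, `write_string`, `read_string`, and `host_set_text` are hypothetical names.

```python
class HandleTable:
    """Glue-side table that maps small integers to host objects (e.g. DOM nodes)."""

    def __init__(self):
        self._slots = []

    def put(self, obj) -> int:
        self._slots.append(obj)
        return len(self._slots) - 1   # the index is all the module ever sees

    def get(self, handle: int):
        return self._slots[handle]


def write_string(memory: bytearray, addr: int, text: str) -> int:
    """Encode `text` as UTF-8 into linear memory and return its byte length."""
    raw = text.encode("utf-8")
    memory[addr:addr + len(raw)] = raw
    return len(raw)


def read_string(memory: bytearray, addr: int, length: int) -> str:
    """Decode `length` UTF-8 bytes starting at `addr` back into a string."""
    return bytes(memory[addr:addr + length]).decode("utf-8")


# --- host ("JavaScript") side ---
memory = bytearray(64)                  # stand-in for the module's linear memory
handles = HandleTable()
fake_node = {"tag": "div", "text": ""}  # stand-in for a DOM element
node_handle = handles.put(fake_node)


def host_set_text(handle: int, addr: int, length: int) -> None:
    """Import the module would call with (handle, pointer, length), all integers."""
    handles.get(handle)["text"] = read_string(memory, addr, length)


# --- module ("WASM") side: only integers cross the boundary ---
n = write_string(memory, 16, "héllo")
host_set_text(node_handle, 16, n)
```

Every DOM operation pays for a table lookup plus an encode and decode of any string arguments, which is the overhead the comment is complaining about.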

MathJax is written in TypeScript. If you want WASM, that means rewriting large parts of MathJax, keeping other parts in JS, bridging them, and dealing with all of the resource dependencies (i.e. dealing with webpack).

And anyway, JavaScript is decent at string processing.

Good points! It doesn't seem like it'd help unless there's a really compute expensive portion. I doubt any parsing would be a limiting factor.

Most people don’t think parsing is the limiting factor—which is why we’re always surprised when we measure and it ends up being the slowest part of the process (not common, but it does happen)

It would be interesting to know what the slowest parts of MathJax are.

A better solution would be to improve MathML support in browsers.

There’s good news on that front: https://mathml-refresh.github.io/mathml-core/

> I must admit, I'm not fully sure I understand what its goals are and what we should expect its usecases to be.

I think the project itself lost its track. It started as an admirable goal of being a common low-level-ish compilation target for the browser. They managed to ship an MVP targeting C semantics.

But now it looks like people are more interested in writing yet another compiler to WebAssembly (which isn't even in a beta phase) than in developing WebAssembly further, and in pushing WebAssembly (which, once again, isn't even in beta) into all sorts of weird stuff like server-side apps. And this is occurring even in the orgs responsible for developing WebAssembly, like Mozilla.

WebAssembly was announced in 2015. In 2017 they had a Minimum Viable Product that was supported in all major browsers. We are a third of the way into 2020, and there's hardly anything beyond the MVP.

Interestingly, I've been downvoted, and yet there's no answer in the comments as to what's the latest with WebAssembly or what is the state of the project.

AFAIK Figma uses a fair bit of WebAssembly for perf. Their editing core is C++ compiled to WebAssembly, which is then bound to a canvas. I am pretty impressed at how silky smooth their UI is. (I could be wrong.)

If anyone from Figma is reading this, would love if you could shed more light on how you use webassembly and your learnings.

I'm still wondering why WebAssembly is not used more. I haven't found hello-world tutorials showing how to call JS from WASM and vice versa... I sense WASM is still a little tedious to work with?

I would love to have Python released as a Python WASM blob; there's Pyodide, but it's not just Python.

I thought I would see more libraries compiled to WASM but it doesn't seem to be the case. I want to guess the c/c++ toolchains are lacking support and usage, or maybe C/C++ devs don't care about WASM? I know I do.

WASM lets non-web-developers run their code in a browser, but web developers don't really care since they prefer writing JS. Maybe that's why WASM is not taking off. That's a shame.

> I'm still wondering why webassembly is not used more

Largely because the majority of the problems that I have as a web developer are not solved by manipulating 32/64-bit ints/floats at native speeds.

While I'm sure there are many exceptional uses for wasm that are related to computational pipelines, machine learning, and canvas rendering, none of that really helps me add a text node to a DOM element that lets my user know they entered an invalid username/password combination.

>> I'm still wondering why webassembly is not used more.

Because it doesn't support direct DOM access (and thus is slower than JS and more memory hungry) and lacks GC, so many languages (e.g. PHP, Ruby, Python) don't run in the browser, either at all or just not efficiently.

There aren't a lot of good reasons to use WebAssembly for most apps. You need code that is performance critical with light IO/DOM access, and can be written in the handful of languages that can be practically compiled to WASM, i.e. not just porting the runtime to WASM. Fact of the matter is that JS has evolved to be the best language for most front end development tasks. It's scriptable, somewhat performant due to JITs, quite good for single threaded async reactive programming and actually can be written in a nice pseudo-functional manner. And the libraries are quite good, even if NPM gets some hate.

WASM doesn’t have access to the DOM which really limits use cases.

I believe that it's actually possible to run Python in WASM, see https://wapm.io/package/python#shell

> In SpiderMonkey there is no mechanism currently to tier down; if you need to debug WebAssembly code, you need to refresh the page, causing the wasm code to be compiled in debugging mode.

Is there ever a reason to "tier down" WebAssembly besides debugging? In JavaScript functions will often get deoptimized back to the baseline interpreter if they side exit too often, but is this a thing that needs to be done in WebAssembly?

As far as I know, not really: the initial "downgraded" code is normally worse in more or less every respect. It's just emitted quickly.

The only reason I could come up with is to downgrade and then upgrade again to differently optimized code. But then WebAssembly doesn't need to do speculative optimizations like JS does. So I guess it's not really relevant.

In theory, you could generate better code after having observed the actual arguments to various functions and recompile while treating them as constants. Basically profile-guided (re)optimization. But the observation would slow down the initial code, it would be a large implementation effort, and the gains would be dubious imho. Especially given how much information was already lost via the initial AOT (ahead of time) optimizing compile; you might need to keep the original source around or something.

Unlikely to happen.


I considered this scenario as llvm does somewhat support profile-guided optimizations but it would be a fundamental change where we go from "unoptimized" to "optimized-with-tracing" instead of optimized, so that step could still have optimization upgrade hooks, so we still wouldn't need a downgrade to "base-line". I guess.

I agree that it's anyway unlikely to happen, at least in the browser.

I could imagine that some WebAssembly based cloud provider (or similar) would be potentially interested in generating profiles and then use that for better optimizations in other instances (so still no down-/upgrade ).

> you could generate better code after having observed the actual arguments to various functions and recompile while treating them as constants

This transformation is valid for pure functions; therefore you can mark it constexpr before compiling to WASM. There's the `constexpr-everything` project (among others) that uses libclang to apply constexpr to all legal functions/expressions in a codebase.

That's for C++ - Rust is working on `const fn`.

> This transformation is valid for pure functions; therefore you can mark it constexpr before compiling to WASM.

It's also valid for non-pure functions at runtime.

JS functions go back and forth because the engine is trying to provide an optimized representation of each function based on guesses it makes about the code. WASM plays an entirely different game. WASM bytecode is generally already heavily optimized (compiled with compiler -O flags, etc.); the optimizations that a browser applies are more about the machine the WASM is running on (register allocation, SIMD instructions, etc.). There is no guessing or opportunistic optimization like there is in JS.

There's still profile-guided optimization.

Isn't it because the compiled code makes some assumptions about the types of the arguments? So if the types of the arguments change, it would need to downgrade to interpreted code?

WebAssembly is statically typed.

I'm trying to remember why, but the JVM backtracks for a couple of different scenarios.

wasm can still lazily load code, right? "Static" in that situation is a little fuzzy. You can't make wholesale changes to the types already loaded, no, but the behavior out at the leaf-nodes can be altered by a new leaf showing up.

Some assumption or inlined conditional may be wrong or simply ill-advised once new code arrives.

IIRC the JVM optimizes very aggressively based on runtime profile, and just adds traps to the optimized-out paths to collect new runtime profile.

Imagine code like this, where at runtime foo is always null:

  if (foo != null) {
    // ... statements ...
  }

If foo is always null, the compiler will just remove all of the statements inside and place a trap there. If foo happens to be non-null (say, due to external environment changes), then it hits the trap, and the whole method is deoptimized and goes through the profiler again.

The optimizations in the JVM are quite fascinating - https://advancedweb.hu/profile-based-optimization-techniques... I imagine JavaScript engines employ similar techniques.
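
The trap-and-deoptimize cycle described above can be mimicked in a few lines of Python. This is a toy model, not how HotSpot is implemented: `Method`, `Deoptimize` and `HOT_THRESHOLD` are invented for the illustration, and "compiling" is reduced to switching between two hand-written bodies.

```python
HOT_THRESHOLD = 3  # hypothetical: calls before the method counts as "hot"


class Deoptimize(Exception):
    """Raised when a speculative assumption turns out to be wrong."""


class Method:
    def __init__(self):
        self.optimized = False
        self.calls = 0
        self.saw_non_null = False  # the "profile"
        self.deopts = 0

    def _generic(self, foo):
        # Full version: keeps the branch and records what it observes.
        if foo is not None:
            self.saw_non_null = True
            return len(foo)
        return 0

    def _speculative(self, foo):
        # "Optimized" version: the branch body is gone, only a trap remains.
        if foo is not None:
            raise Deoptimize
        return 0

    def __call__(self, foo):
        if self.optimized:
            try:
                return self._speculative(foo)
            except Deoptimize:
                # Trap hit: throw the optimized code away, profile again.
                self.optimized = False
                self.deopts += 1
                self.calls = 0
        result = self._generic(foo)
        self.calls += 1
        if self.calls >= HOT_THRESHOLD and not self.saw_non_null:
            self.optimized = True
        return result


method = Method()
for _ in range(HOT_THRESHOLD):
    method(None)            # profile says: foo is always null
print(method.optimized)     # speculative version is now installed
print(method("abc"))        # hits the trap, falls back to generic code
print(method.optimized, method.deopts)
```

In this toy the profile remembers that a non-null value was seen, so the same assumption is not made a second time.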

Another example: if MyClass is the only class that implements MyInterface, HotSpot can generate code that takes advantage of that assumption, removing vtable indirection in method-calls. If, later in the program's execution, the Java classloader loads another class which implements MyInterface, HotSpot will detect that the assumption it made earlier no longer holds. Having determined that the code it generated earlier is no longer safe, it will discard that code.

I recall reading that in earlier versions of HotSpot, it was sometimes possible to improve performance by making your class 'final', enabling the JVM to remove indirection for method-calls. This is not true of the later versions of HotSpot, where it can use the technique described above to safely make optimisations based on tentative assumptions, giving you the same enhanced performance either way.
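
A rough sketch of the single-implementer assumption, again as a Python toy rather than anything resembling HotSpot's internals. `Runtime`, `load_class`, `call_describe`, `Circle` and `Square` are all hypothetical names: the runtime caches a direct call target while exactly one implementing class is loaded, and throws that cache away as soon as a second one appears.

```python
class Circle:
    def describe(self):
        return "circle"


class Square:
    def describe(self):
        return "square"


class Runtime:
    def __init__(self):
        self.implementers = []
        self.direct_target = None  # cached "devirtualized" call target
        self.invalidations = 0

    def load_class(self, cls):
        self.implementers.append(cls)
        # A second implementer breaks the assumption behind the cache.
        if len(self.implementers) > 1 and self.direct_target is not None:
            self.direct_target = None
            self.invalidations += 1

    def call_describe(self, obj):
        if self.direct_target is None and len(self.implementers) == 1:
            # Only one possible receiver type: bind the call directly.
            self.direct_target = self.implementers[0].describe
        if self.direct_target is not None:
            return self.direct_target(obj)
        # Fallback: ordinary dynamic dispatch on the receiver's type.
        return type(obj).describe(obj)


runtime = Runtime()
runtime.load_class(Circle)
print(runtime.call_describe(Circle()))  # direct call, no lookup
runtime.load_class(Square)              # assumption no longer holds
print(runtime.call_describe(Square()))  # back to dynamic dispatch
```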

Even older than that, HotSpot would inline the common case for method dispatch in situations where there is one implementation or where one implementation dominates. (Kind of wonder if Spring would perform at all without it.)

The runtime optimisations are remarkable but it does look like runtime performance can change unpredictably. Has this been your experience?

A fair question. In principle this can happen, and HotSpot can respond to changes in the program's execution patterns, but to my knowledge it's not a problem in practice with real-world Java servers. Other than a 'warming up' phase, real-world performance doesn't tend to vary wildly as time progresses.

(Disclaimer: I've not done a great deal of real-world Java work.)

Yes, the JVM tends to speed up over time as it applies these optimizations.

Real question.

Most of the wasm instructions are very low level (eg, the test and branch are different, similar to assembly).

But then when you get to the control flow constructs, they went in this weird direction. There is no goto, some of the instructions (br) are context-dependent, and if the wasm JIT tries to inline anything it has to muck with the br operand (not even the worst of it). The looping constructs are just odd too.

When I last checked you couldn't even do basic blocks with a control flow graph like most other compilers (so that also means no jump threading optimizations, and without goto you would have a hard time optimizing at the wasm generation level).

It just seems like this part of the instruction set was developed by somebody who wanted some weird higher-level asm, while others wanted an asm close to the instruction set. It just seems like a really bad decision.

Is there a reason for this or has this been cleaned up at all?


"There is one thing that may have been a major factor to the decision not to adopt arbitrary CFGs for WebAssembly control flow. I believe that V8 is an exception to most compilers in that it doesn’t represent code as a CFG at all - it maintains roughly JS-compatible control flow all the way from front-end to codegen. In order to support this V8 would have to be converted to use CFGs, they’d have to implement something like Relooper internally, or they’d have to write a new WebAssembly runtime from scratch. Google was and is a major roadblock to getting this implemented as they have multiple people on the committee in charge of WebAssembly who have veto power."

That was done entirely on purpose, to make validation simpler. It does make things harder for compilers that want to support irreducible control flow, but the workaround is a single "relooper" algorithm that doesn't really fundamentally change how those compilers work.

It is possible that in the future this will be relaxed (without giving up on the benefits for validation) via something like this: https://github.com/WebAssembly/funclets
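
For anyone wondering what the workaround looks like for irreducible control flow: the sketch below is not the Relooper algorithm itself, only the simplest fallback such a tool can resort to, namely one structured loop plus a label variable that picks the next block. The blocks `entry`, `B`, `C`, `exit` and the function `run` are made up; B and C form a loop that can be entered at either block, which is the shape that plain nested loops cannot express directly.

```python
def run(enter_at_b, limit):
    """Execute a goto-style CFG using one loop and a label variable."""
    label = "entry"
    n = 0
    trace = []
    while True:  # the single structured loop
        trace.append(label)
        if label == "entry":
            # Two different entries into the B <-> C cycle.
            label = "B" if enter_at_b else "C"
        elif label == "B":
            n += 1
            label = "C"          # "goto C"
        elif label == "C":
            n += 10
            label = "B" if n < limit else "exit"
        else:  # "exit"
            return n, trace


print(run(True, 20))
print(run(False, 20))
```

The cost is an extra dispatch on every block transition, which is part of why people in this thread would prefer a native goto.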

It's not just validation. A lot of things compilers have to do becomes dramatically simpler with structured control flow and simpler means faster. When you operate on an unstructured CFG you first have to analyze the entire graph to find the dominators, loops etc and then you have to propagate the dataflow information in that order. All these complications go away.

There's been a few other works that realized this, among them the CORTL IR, suggested by Carl McConnell in his Thesis proposal "Tree Based Optimizations" and the IR used in the Oberon compiler by Michael Franz.
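
To give a feel for the analysis an unstructured CFG requires, here is a minimal Python sketch of the textbook iterative dominator computation. The graph and the name `dominators` are assumptions for the example; real compilers typically use faster algorithms, but the need for a whole-graph fixed point is the same.

```python
def dominators(cfg, entry):
    """Map each node to the set of nodes that dominate it.

    cfg maps every node to a list of its successors; every node must
    appear as a key.
    """
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for src, succs in cfg.items():
        for dst in succs:
            preds[dst].add(src)

    # Start from "everything dominates everything" and shrink.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            common = set.intersection(*incoming) if incoming else set()
            new = {n} | common
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# A diamond whose join point also jumps back into one arm.
cfg = {
    "entry": ["a", "b"],
    "a": ["join"],
    "b": ["join"],
    "join": ["a", "exit"],
    "exit": [],
}
for node, doms in sorted(dominators(cfg, "entry").items()):
    print(node, sorted(doms))
```

With structured control flow the nesting of blocks and loops already encodes this information, so no fixed-point iteration is needed.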

1. You can have both. Jumps (conditional or unconditional) allow for more complex control flow, and for control flow different from what was originally anticipated.

2. Only having structured control flow makes it a difficult target for HLLs, which is what wasm is supposed to be.

3. A few years ago I was working on an array language and was trying to use tree-based optimization and code generation, but I put it on hold until I understood more. Can you recommend any links or papers?

4. Wasm is an IR, both target and source. Why not add both a goto and structured primitives? There are clearly some things that cannot be compiled to anything remotely performant.

No "goto" means no JUMP instruction, right? I mean I guess that's still Turing-complete, but it seems like that would make things a lot more difficult than necessary. I would love to know the answer to your question

Also no conditional jumps (equivalent to jnz, jeq, etc.); instead it uses this weird br_if that can only go to places on the control flow stack.
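
For readers who have not looked at the encoding: a wasm branch names its target by relative depth into the enclosing constructs (0 is the innermost), and where it lands depends on the kind of construct. A branch to a `block` continues after its end, while a branch to a `loop` goes back to its start. The Python helper below only models that lookup; `resolve_branch` and the tuple representation of the control stack are invented for the illustration.

```python
def resolve_branch(control_stack, depth):
    """Resolve `br depth` against a stack of enclosing constructs.

    control_stack is ordered outermost first; each entry is a
    (kind, name) pair where kind is "block" or "loop".
    """
    if not 0 <= depth < len(control_stack):
        raise ValueError("branch depth does not name an enclosing label")
    kind, name = control_stack[-1 - depth]
    # Loops are the only construct whose label points backwards.
    where = "start" if kind == "loop" else "end"
    return name, where


# Position: inside $inner, which is inside $again, which is inside $outer.
stack = [("block", "$outer"), ("loop", "$again"), ("block", "$inner")]
for depth in range(len(stack)):
    print(depth, resolve_branch(stack, depth))
```

Because the operand is relative to the current nesting, the same numeric depth means a different target once code is moved under an extra block, which is the "context dependent" complaint above.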

what's the use case for WebAssembly? Why are people working on it and why should I use it?

edit: why am i being downvoted? Is this a bad question?

As I understand it: (1) calling an optimized library written in another language for performance or size -- say you have a physics engine component in your game, or a graph traversal algorithm, or whatever; (2) calling an existing library that is available in another language with no JS equivalent; and (3) running a full existing application of some sort that is written in another language and giving it a Web interface. Oh, and (4) sandboxing untrusted code, and coming up or already present (I'm not keeping close track) (5) portability, where you can either run it on the Web or on a truly cross-platform runtime (WasmTime I think?).

(And the downvotes are probably because your question could be interpreted as implying that Wasm is useless and people are wasting their time. Even though you're asking a totally fair question, it's easy to read the tone as negative and dismissive whether you meant it that way or not.)

I have only played around with it, but to me the possibility of using a language other than JS to write web apps is really exciting. I don't hate JavaScript by any means, I've used it professionally off and on for years, but it still frustrates me that for webapps you're basically forced to use it.

I brought it into openEtG to speed up the engine's RNG: https://github.com/serprex/openEtG/blob/master/src/rng.wat

RAD tools like Delphi for WebAssembly or Microsoft .NET WinForms is what springs to mind.

Basically plugins for the browser that everyone can agree on having, instead of complaining that they are proprietary blobs.

it's been more than 5 years, and nothing useful so far

you can do "cool" stuff like running DOOM in your browser... but why?
