
WebAssembly Is Not a Stack Machine (2019) - arto
http://troubles.md/wasm-is-not-a-stack-machine/
======
dang
Discussed at the time (with comments from one of the designers of WASM):
[https://news.ycombinator.com/item?id=19069587](https://news.ycombinator.com/item?id=19069587)

~~~
saagarjha
And at least one other WASM implementer as well. A lot of good comments there.

~~~
gridlockd
I think that's actually a rather poor discussion, revolving around what is or
isn't "ad-hominem", or which unimportant details of the history of WASM are
inaccurate, or what the name of things should be.

All of the technical concerns raised by the post are either left unaddressed,
or deferred to future extensions of WASM. In other words, the author must be
correct in his analysis and the standard has made questionable trade-offs from
the outset.

I can think of at least one reason why you might want a register machine: not
every platform is going to use or allow a JIT, and a register machine would
perform better there, e.g. like Dalvik in the early days of Android.
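
To make the interpreter argument concrete, here is a minimal sketch that
counts dispatched instructions for `a + b * c` on two toy interpreters. Both
instruction sets are hypothetical, invented for illustration; neither is wasm
or Dalvik bytecode, and dispatch count is only one rough proxy for
interpreter cost.

```python
# Hypothetical mini-ISAs for illustration only (not wasm, not Dalvik).

def run_stack(code, local_vals):
    """Interpret stack code; return (result, dispatched instruction count)."""
    stack, dispatched = [], 0
    for op, *args in code:
        dispatched += 1
        if op == "get":                      # push a local onto the stack
            stack.append(local_vals[args[0]])
        elif op == "add":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op == "mul":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs * rhs)
        else:
            raise ValueError(op)
    return stack.pop(), dispatched


def run_register(code, regs):
    """Interpret three-address code; return (regs[0], dispatched count)."""
    regs, dispatched = list(regs), 0
    for op, dst, a, b in code:
        dispatched += 1
        if op == "add":
            regs[dst] = regs[a] + regs[b]
        elif op == "mul":
            regs[dst] = regs[a] * regs[b]
        else:
            raise ValueError(op)
    return regs[0], dispatched


# Compute a + b * c with a=2, b=3, c=4 held in locals/registers 0..2.
stack_code = [("get", 0), ("get", 1), ("get", 2), ("mul",), ("add",)]
register_code = [("mul", 3, 1, 2), ("add", 0, 0, 3)]

stack_result = run_stack(stack_code, [2, 3, 4])
register_result = run_register(register_code, [2, 3, 4, 0])
```

In this toy case the stack form dispatches five instructions and the register
form two, at the price of wider instructions.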

~~~
saagarjha
Read past the first thread ;)

------
pizlonator
It’s misleading to say that register machines carry no liveness or that they
defeat liveness. That’s a bit much. Computing liveness on a register machine
with dense integer numbering of locals is not that expensive. It’s certainly
cheaper than running a good backend. And a good backend will certainly modify
the code in a way that requires liveness to be recomputed.
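
As a rough sketch of why this is cheap: with locals numbered densely from 0,
the live set fits in a bitmask and one backward pass handles straight-line
code. The instruction encoding below is hypothetical, and real code with
branches would need this pass iterated over the control-flow graph until
nothing changes.

```python
def liveness(instrs, live_out=0):
    """Backward liveness over straight-line register code.

    instrs: list of (dst, srcs), where dst is a local index or None and
            srcs is a tuple of local indices the instruction reads.
    Returns a list of bitmasks: bit i of live_before[k] is set when
    local i is live immediately before instruction k.
    """
    live = live_out
    live_before = [0] * len(instrs)
    for k in range(len(instrs) - 1, -1, -1):
        dst, srcs = instrs[k]
        if dst is not None:
            live &= ~(1 << dst)      # a write kills the previous value
        for s in srcs:
            live |= 1 << s           # a read makes the local live
        live_before[k] = live
    return live_before


# r2 = r0 + r1; r3 = r2 * r2; r0 = r3 + r1; return r0
program = [(2, (0, 1)), (3, (2, 2)), (0, (3, 1)), (None, (0,))]
result = liveness(program)
```

Each instruction costs a handful of bit operations, so the pass is linear in
code size for this straight-line case.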

It’s also misleading to say that register machines defeat SSA. It’s not hard
to convert from a register machine to SSA - the algorithm is almost linear.
Powerful backends (like WK’s B3) recompute SSA after some transformations
anyway.

I think that wasm combines elements of a stack machine and a register machine
in a way that leads to a compact format and multiple reasonable paths to
converting to the sort of IR you’d want for converting a platform agnostic
form like wasm into any contemporary instruction set. I’m no wasm fanboy but
as far as binary IRs for transporting code into optimizing compilers go, this
one is pretty slick.

------
IshKebab
Looks like the proposal to fix this was merged into the standard in April.

[https://github.com/WebAssembly/multi-value](https://github.com/WebAssembly/multi-value)

------
lebuffon
The changes proposed for the future (loop counters as arguments, multiple
return values) sound a bit like reinventing the Forth VM, which already does
things this way. It might not hurt to review some papers from that sphere
that may have covered this ground before.

[https://www.researchgate.net/publication/2414672_A_Prelimina...](https://www.researchgate.net/publication/2414672_A_Preliminary_Exploration_of_Optimized_Stack_Code_Generation)

------
kevingadd
I'm not sure the timeline described ("only at the last minute did it switch to
stack-based encoding for the operators") is accurate, but it is the case that
for a while we were working towards more of a register-oriented encoding
instead of the stack-oriented one that shipped. The representation of trees
and operands was also different. I think what ultimately shipped was probably
right, but the semantics described by the article for blocks are incredibly
gross and if I had known about them I would've blocked them. The author's
conclusion that this is due to wasm's asm.js-derived heritage is accurate
(also, arguably the 'lots of locals' model was unavoidable since everyone was
compiling wasm using JS runtimes anyway.)

Incidentally, this claim is false: "No streaming compiler had yet been built,
hell, no compiler had yet been built." Early in development we had at least
two different compilers used to generate test cases: one compiler for a
homegrown imperative language written by Nick Bray, and another compiler for
a subset of C# that I wrote [1]. Having those two compilers generating code
early on was useful, given that neither emscripten nor LLVM was capable of
compiling real apps, so we would have been flying blind without them.
Development of LLVM integration also started _very_ early; the problem is
just that it took a long time until it was usable.

As for whether the lessons from those compilers were actually paid attention
to or acted upon, well...

P.S. I still don't understand the reasoning behind "blocks have return
values". Does any popular programming language out there do this except maybe
some of the ML-derived ones? I've never run into it in production software.
It's certainly not something a typical compiler would generate unless the
source language had it as a primitive.

1:
[https://github.com/kg/ilwasm/blob/master/third_party/tests/R...](https://github.com/kg/ilwasm/blob/master/third_party/tests/Raytracer.cs)

~~~
comex
> P.S. I still don't understand the reasoning behind "blocks have return
> values". Does any popular programming language out there do this except
> maybe some of the ML-derived ones?

Most languages have something like `cond ? expr1 : expr2` which would
naturally compile to an if-else block with a return value.
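
A minimal sketch of that lowering, emitting flat wasm text instructions from
a tiny expression tree. The tuple-based AST and the `lower` helper are
hypothetical and assume every value is an i32; only the emitted
`if (result i32) ... else ... end` shape is wasm's own.

```python
def lower(expr):
    """Lower a tiny expression AST to a flat list of wasm text instructions.

    expr is one of (hypothetical AST shapes, all values i32):
      ("const", n)             an i32 constant
      ("local", i)             a read of local i
      ("ternary", c, a, b)     the source-level  c ? a : b
    """
    kind = expr[0]
    if kind == "const":
        return [f"i32.const {expr[1]}"]
    if kind == "local":
        return [f"local.get {expr[1]}"]
    if kind == "ternary":
        _, cond, then_e, else_e = expr
        return (
            lower(cond)                  # condition is left on the stack
            + ["if (result i32)"]        # the if itself yields one i32
            + lower(then_e)
            + ["else"]
            + lower(else_e)
            + ["end"]
        )
    raise ValueError(kind)


wat = lower(("ternary", ("local", 0), ("const", 10), ("const", 20)))
```

Whichever arm runs leaves its value on the stack, and that value becomes the
result of the whole `if`.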

~~~
kevingadd
'if' is a separate instruction from 'block' in wasm, though. The conditional
having a return value makes sense, given that there are two paths it can take
and a value might want to flow out. The logic behind applying that to all
basic blocks is confusing to me.

------
trumpeta
It seems to me that this is only a problem when you try to write wasm
directly. If you compile from Rust, then this analysis is already done for
you at a higher level. Or what am I missing?

~~~
francasso
The issue is that all the analysis the Rust compiler did is neither present
in nor inferable from the wasm artifact. Since the compiler that has to
translate wasm to machine code cannot make more assumptions than the wasm
specification allows, it has to redo the analysis for certain things. That is
time-consuming for an optimizing compiler (how much, I don't really know),
and I think impossible for a streaming compiler.

------
transfire
The block argument and multiple return value proposal is a very good proposal.
[https://github.com/WebAssembly/multi-value/blob/master/propo...](https://github.com/WebAssembly/multi-value/blob/master/proposals/multi-value/Overview.md)

Any idea how likely this is to make it into the spec?

------
gridlockd
Always write an implementation first - and make it a good one. Then derive a
standard from it.

"Oh, but the implementation details will leak through!"

So what? This is an ivory-tower concern. When you are designing a standard,
you _must_ have your mind on possible implementations, which is far more
difficult without having created an actual implementation. You can't design in
a total vacuum, otherwise your standard can't be implemented properly at all.

~~~
jacquesm
That's how we got the open version of MS Word. For small values of 'open',
because 'do it like Word '95 did it' is not a very good way of describing a
standard.

~~~
comex
I just checked and was somewhat surprised to learn that AutoSpaceLikeWord95’s
behavior is actually pretty well specified:

[https://docs.microsoft.com/en-us/dotnet/api/documentformat.o...](https://docs.microsoft.com/en-us/dotnet/api/documentformat.openxml.wordprocessing.autospacelikeword95?view=openxml-2.8.1)

I’m sure there are still gaps in the specification overall; I don’t actually
know much about it, but I believe competing implementations have trouble
reproducing the exact layout of Word documents, which should be possible with
a good specification, and is mostly possible with HTML.

But I don’t see anything wrong with that particular attribute. Backwards
compatibility is important.

