
Why does WebAssembly need the relooper algorithm, instead of gotos? (2019) - tbodt
http://troubles.md/posts/why-do-we-need-the-relooper-algorithm-again/
======
int_19h
"There is one thing that may have been a major factor to the decision not to
adopt arbitrary CFGs for WebAssembly control flow. I believe that V8 is an
exception to most compilers in that it doesn’t represent code as a CFG at all
- it maintains roughly JS-compatible control flow all the way from front-end
to codegen. In order to support this V8 would have to be converted to use
CFGs, they’d have to implement something like Relooper internally, or they’d
have to write a new WebAssembly runtime from scratch. Google was and is a
major roadblock to getting this implemented as they have multiple people on
the committee in charge of WebAssembly who have veto power."

~~~
est31
> they have multiple people on the committee in charge of WebAssembly who have
> veto power

Note that even if Google did not have people on that committee, they'd just
not implement the feature in their browser, and the feature would be dead in
the water. Outside of niche cases where browser choice is controlled, no web
dev would adopt features that > 70% of users can't use. So the committee only
reflects power dynamics that exist in the greater browser sphere. In fact,
non-Google browsers probably have more seats than they have influence by raw
market share.

~~~
leoc
> So the committee only reflects power dynamics that exist in the greater
> browser sphere.

It doesn't _only_ reflect them, but also amplifies them. Quietly nobbling
things in committee is surely a lot easier than having to field questions
about why you aren't implementing Web standards (insert quotation marks as
desired). There's also the danger of being embarrassed when someone writes a
library which runs slowly on your browser and fast on others.

------
saagarjha
WebAssembly makes a number of strange choices that seemingly fly in the face
of convention; this is one of them. It would be really nice if someone took
the time to write a FAQ for the website answering questions like “why doesn’t
WebAssembly do x like everything else”, because otherwise it seems that
they’re doing things differently just because they can…

~~~
cwzwarich
A lot of those bad WebAssembly decisions are carried over from
Emscripten/Asm.js and don't have any real rationale behind them beyond that.

Obviously, since JS doesn't support arbitrary control flow, Emscripten/Asm.js
would need to convert to some simplified form of control flow (although not
necessarily fully structured). It seems that this solution was carried over to
WebAssembly without much deep thought given.

There does seem to be some backwards association with Java's bytecode
verification complexity issues (which are the result of a bad design, not
unstructured control flow). In the big GitHub Issue
([https://github.com/WebAssembly/design/issues/796](https://github.com/WebAssembly/design/issues/796))
on the topic, one of the core V8 developers even claims that the Java
experience justifies the WebAssembly decision. However, the funclets proposal
for WebAssembly
([https://github.com/WebAssembly/funclets/blob/master/proposal...](https://github.com/WebAssembly/funclets/blob/master/proposals/funclets/Overview.md))
shows that you can support arbitrary control flow with linear time
verification if you choose a different design than Java.

~~~
brabel
WASM is a typed bytecode format where the stack state must be validatable at
compile-time, guaranteeing that no validated program can mess up the stack at
runtime. Having unstructured control flow would make it pretty much impossible
to do that, wouldn't it?

~~~
cwzwarich
To get linear time validation, you need conditions at each control flow
transfer that can be checked locally but together imply that stack usage in
the function is well-formed.

The funclet approach (in more ordinary terminology) is to ensure that the
stack difference from function invocation upon entry to a basic block is the
same across entries to that basic block, and that those additional stack
entries all have the same (statically specified) type in each invocation. This
is very natural when generating code from an SSA optimizer.
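
A minimal sketch of that kind of local check, assuming a made-up mini-IR (this
is not the funclets proposal's actual encoding; `validate`, the block tuples
and the `push`/`pop` ops are hypothetical names for illustration). Each block
declares the operand types it expects on entry, and every branch is checked
against the target's declaration, so each block and each edge is visited once:

```python
from typing import Dict, List, Tuple

# Hypothetical mini-IR: (declared entry stack, instructions, branch targets).
Block = Tuple[List[str], List[Tuple[str, str]], List[str]]


def validate(blocks: Dict[str, Block]) -> bool:
    """Check every block in isolation; no iteration to a fixpoint is needed."""
    for entry_types, instrs, targets in blocks.values():
        # Start from the statically declared entry stack, not from a predecessor.
        stack = list(entry_types)
        for op, ty in instrs:
            if op == "push":
                stack.append(ty)
            elif op == "pop":
                if not stack or stack[-1] != ty:
                    return False  # underflow or type mismatch
                stack.pop()
            else:
                return False  # unknown instruction in this toy IR
        # Local edge condition: the stack at the branch must equal the
        # target's declared entry stack.
        for target in targets:
            if target not in blocks or blocks[target][0] != stack:
                return False
    return True


# "a" and "b" form a loop with two entry points (an irreducible loop),
# yet the per-edge check still validates it in a single pass.
ok = {
    "entry": ([], [("push", "i32")], ["a", "b"]),
    "a": (["i32"], [], ["b"]),
    "b": (["i32"], [], ["a", "exit"]),
    "exit": (["i32"], [("pop", "i32")], []),
}

# Same shape, but "a" leaves an extra value on the stack before branching.
bad = dict(ok)
bad["a"] = (["i32"], [("push", "i32")], ["b"])
```

Here `validate(ok)` accepts the program and `validate(bad)` rejects it, because
the edge from `a` to `b` arrives with a stack that differs from what `b`
declares.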

Java bytecode traditionally didn't contain any of this branch target or type
information, and required the bytecode verifier to do an iterative combined
control/data-flow analysis. Even now that they added some of that type
information with stack maps, it's still more complicated than it needs to be
due to the historical legacy of this approach.

------
emmanueloga_
META: I don't understand why some authors go out of the way to obscure
important pieces of information:

* Who wrote this article?

* When was this written?

Was lucky to find these pieces of information on GitHub [1].

1:
[https://github.com/Vurich/troubles.md/blob/master/content/po...](https://github.com/Vurich/troubles.md/blob/master/content/posts/why-do-we-need-the-relooper-algorithm-again.md)

~~~
rkangel
It's a common problem with news articles too - you get linked something and
you don't know whether it relates to a current event or something interesting
3 years ago.

------
RodgerTheGreat
Something this article doesn't appear to spell out is that the structured
format WASM is constrained to means that program CFGs will never contain
irreducible loops. As noted, this approach can make it harder for the AoT
compilers (they have to untangle the knots to generate valid WASM bytecode),
but it makes it much easier for the JIT compilers to perform analysis and
transformations at runtime. In that context, it seems like a sensible
engineering decision.
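
For readers unfamiliar with the term, here is a small sketch of one textbook
way to test whether a CFG is reducible: repeatedly remove self-loops (T1) and
merge any node with a single predecessor into that predecessor (T2). The
function name and the dict-of-sets graph representation are assumptions for
illustration, and the sketch assumes every node is reachable from the entry
and appears as a key:

```python
from typing import Dict, Set


def is_reducible(cfg: Dict[str, Set[str]], entry: str) -> bool:
    """Return True if T1/T2 transformations collapse the CFG to one node."""
    succ = {node: set(targets) for node, targets in cfg.items()}
    changed = True
    while changed and len(succ) > 1:
        changed = False
        # T1: delete self-loops.
        for node in succ:
            if node in succ[node]:
                succ[node].discard(node)
                changed = True
        # Recompute predecessors for the current graph.
        preds: Dict[str, Set[str]] = {node: set() for node in succ}
        for node, targets in succ.items():
            for target in targets:
                preds[target].add(node)
        # T2: merge one node that has exactly one predecessor into it.
        for node in list(succ):
            if node != entry and len(preds[node]) == 1:
                (pred,) = preds[node]
                succ[pred].discard(node)
                succ[pred] |= succ.pop(node)
                changed = True
                break
    return len(succ) == 1


# A single-entry loop: e -> a, a <-> b, a -> x.
natural_loop = {"e": {"a"}, "a": {"b", "x"}, "b": {"a"}, "x": set()}

# A loop that can be entered at either a or b.
two_entry_loop = {"e": {"a", "b"}, "a": {"b"}, "b": {"a"}}
```

`is_reducible(natural_loop, "e")` holds, while `two_entry_loop` gets stuck
with three nodes because neither `a` nor `b` has a unique predecessor; that
second shape is what structured control flow cannot express directly.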

------
yuri91
The article says about the Stackifier algorithm: "no link for the latter, it’s
only mentioned by this name in LLVM internal discussions".

Since the release of this article, I wrote a blog post explaining how
Stackifier works, for those who are interested:

[https://medium.com/leaningtech/solving-the-structured-contro...](https://medium.com/leaningtech/solving-the-structured-control-flow-problem-once-and-for-all-5123117b1ee2)

~~~
jeltz
Thanks! I used your article to successfully implement Stackifier.

~~~
yuri91
I am glad it helped somebody :)

------
titzer
_sigh_

I am partly responsible for why WebAssembly is this way. You can thank/blame
me for the if-else bytecodes. They are, indeed, a form of compression, as they
don't add expressive power. I measured carefully and they make a big
difference in code size. That's why they're there!

The structured control flow requirement is to benefit _all consumers_. It is
only a burden on producers that come from CFGs, not from ASTs and other tree-
like IRs. If you have a dominator tree, then you can generate structured
control flow in linear time. LLVM does this a particular way, but there are
straightforward algorithms.

No, this wasn't Google throwing around its veto power or something like that.
There is a good reason why control flow is structured, as hinted in comments
here.

1. Structured control flow rules out irreducible loops. Irreducible loops
cause problems for _all_ JIT compilers in all browser engines, not just V8,
and even in JVMs. Things get really complicated, particularly in register
allocation. [1]

2. Structured control flow guarantees a stack discipline for the use of
labels, mirroring the stack discipline for values. This is not only a nice
symmetry, it means that a consumer that needs to allocate space per label can
reuse that space as soon as a control construct is closed. That is essentially
optimal for use of consumer space resources.
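
A rough sketch of what that stack discipline buys a consumer, using a toy
instruction list rather than real Wasm bytecode (the tuple encoding and the
name `peak_label_slots` are assumptions for illustration; the implicit label
of the function body is ignored). As in Wasm, a `br` depth is relative, with
0 naming the innermost enclosing construct:

```python
from typing import List, Tuple


def peak_label_slots(ops: List[Tuple]) -> int:
    """One pass over structured ops; returns the most label slots live at once."""
    labels: List[str] = []  # one slot per currently open construct
    peak = 0
    for op in ops:
        kind = op[0]
        if kind in ("block", "loop"):
            labels.append(kind)  # allocate a slot for the new label
            peak = max(peak, len(labels))
        elif kind == "end":
            if not labels:
                raise ValueError("end without an open construct")
            labels.pop()  # the slot can be reused immediately
        elif kind == "br":
            depth = op[1]
            if depth >= len(labels):
                raise ValueError("branch to a label that is not in scope")
        else:
            raise ValueError(f"unknown op {kind!r}")
    if labels:
        raise ValueError("unclosed construct")
    return peak


# Three labels in sequence only ever need one slot.
sequential = [("block",), ("br", 0), ("end",)] * 3

# Two nested labels need two slots while both are open.
nested = [("block",), ("loop",), ("br", 1), ("end",), ("end",)]
```

The space needed tracks nesting depth rather than the total number of labels
in the function: `sequential` peaks at one slot, `nested` at two.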

[1] No kidding. If you have an irreducible loop in Java bytecode, which is
possible, you will never be JITed and will get stuck running 100x slower in
the interpreter. We thought this through very carefully in V8. If you allow
irreducible loops in Wasm, you force all engines to either stick to their
lowest execution tier and run 2-100x slower, do relooping themselves, or
handle the general case of irreducible loops, spending multiple person-years
complicating their optimizing tiers' backends for a case that is _incredibly
rare_ (and probably introducing lots of bugs). In V8 we would have probably
gone for the relooper option because the other two options are bad. So that's
a lose, because now the engine is more complicated, doing a rewrite of the
code that could as well be done better and more efficiently offline by a
producer. And there is no benefit because the engine's code would be no better
than what the producer would have come up with. So we'd choose the lesser of
the complexity options, but get no performance benefit, in order to avoid the
absurdly bad performance hit of not being able to use the optimizing tier. Bad
tradeoff no matter how you slice it, IMHO.

I am fully convinced we made the right choice here.

We should have communicated better, and the relooper algorithm and tools should
be textbook, off-the-shelf stuff.

~~~
kevingadd
It's bizarre to me that there are complaints about the if-else bytecodes. You
can call them "weird" but they're very easy to reason about, make it easy to
write toy examples, are easier to generate from a compiler, and are easier to
read and understand in disassembly. At the point where the author started
talking about customized compression to recover the size gains they should
have realized why the bytecodes exist! Early on in the spec process it was
very, very useful to have them.

Anyone questioning whether the lack of goto was the result of an invocation of
veto power can look at the design repo and see that lots of control flow
consideration and discussion happened in the open before the group finally
agreed upon a solution. Many of the players involved were not Google employees
at the time of the decision (I don't know if they are now):

[https://github.com/WebAssembly/design/issues/33](https://github.com/WebAssembly/design/issues/33)

[https://github.com/WebAssembly/design/issues/44](https://github.com/WebAssembly/design/issues/44)

I definitely recall that people had disagreements about how control flow
should work and had different goals or priorities but it was a pretty detailed
and drawn-out decision-making process. I don't think it was really possible
for everyone to walk away from the table happy.

~~~
sunfish
As context, at the time of those issues, the control-flow restrictions were a
compromise to help get the "MVP" off the ground, with the understanding that
"more expressive control flow" was expected to be added later:

[https://github.com/WebAssembly/design/blob/master/FutureFeat...](https://github.com/WebAssembly/design/blob/master/FutureFeatures.md#more-expressive-control-flow)

------
albertzeyer
Interestingly, I came up with an algorithm similar to Relooper when I
translated C to Python code. First I translate all C code to mostly equivalent
Python code but including some gotos, and then I get rid of those gotos by
introducing some more loops. The current solution is simple but the resulting
function could be slow.

[https://github.com/albertz/PyCParser/blob/080a7b0472444fd366...](https://github.com/albertz/PyCParser/blob/080a7b0472444fd366006c44fc15e388122989bd/goto.py#L102)
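
To illustrate the general idea of trading gotos for a loop (a generic sketch,
not the code in the linked `goto.py`), a C function with two labels can be
emulated in Python with a label variable and a dispatch loop. The function
name and label strings are made up for the example:

```python
def sum_to(n: int) -> int:
    """Emulates this C body with gotos:

        int s = 0;
      loop: if (n == 0) goto done;
        s += n; n--; goto loop;
      done: return s;
    """
    s = 0
    label = "loop"  # which labelled section runs next
    while True:
        if label == "loop":
            if n == 0:
                label = "done"  # goto done
                continue
            s += n
            n -= 1
            label = "loop"  # goto loop
            continue
        if label == "done":
            return s
```

Every jump costs a trip through the dispatch chain, which is one reason code
generated this way can be slow.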

------
6510
Could someone reduce the GOTO complaints to the arguments for why it is harmful?
(Preferably without the considerations?)

------
heretoo
The article cites Dijkstra's "considered harmful" essay, which asserts that
"goto" should be removed from "higher level" languages, not "machine
language".

Without reading the remainder of the article, this would seem to undermine any
further assertions.

UPDATE: after reading the article, it seems the opposite is being shown. my
bad.

~~~
saagarjha
It’s brought up to prove the opposite. I would suggest reading it through,
it’s quite good.

~~~
heretoo
Yeah, I figured that might be happening, once I started reading further. Blame
the reader.

------
The_rationalist
Unpopular opinion: GraalVM is what should have been webassembly.

Reasons:

* True polyglotism

* Existing, feature-complete implementation

* Compatibility with the existing JVM platform and, through polyglotism,
compatibility with almost any platform/library.

~~~
c-cube
How is Graal more polyglot than wasm?! Does it support low level languages as
well? Besides, it's only one implementation (controlled by Oracle) and will
probably remain so forever — who would want to risk a lawsuit by
reimplementing it, I wonder.

wasm is already in all major browsers. It's succeeding where no JVM-based
approach ever could.

~~~
wbl
Graal is licensed under the GPLv2. It's free software!

~~~
toast0
GPLv2 isn't free enough to be used in a commercial browser.

~~~
Ace17
Nitpick: I assume you meant "GPLv2 isn't free enough to be used in a
_proprietary_ browser".

