Responding to the OP, since there is no comment section on the site.
First off, this rant gets the history of Wasm wrong and the facts of Wasm wrong. I wouldn't generally unload on a random person on the internet, but I would like to point out a sentence like this one:
> Not only that, but for the most part the WebAssembly specification team were flying blind.
It's an ad hominem. This really just impugns people and invites an argument. It might be cathartic, but generally it doesn't advance the conversation to cast aspersions like this.
And it's not true. I can tell you from first-hand experience that a baseline compiler was absolutely on our minds, and Mozilla already had a baseline compiler in development throughout the design process. The Liftoff design that V8 shipped didn't look too different from the picture in our collective heads at the time. And all of us had considerable experience with JIT designs of all kinds.
As for the history. The history is wrong. The first iteration of Wasm was in fact a pre-order encoded AST. No stack. The second iteration was a post-order encoded AST, which, we found through microbenchmarks, actually decoded considerably faster. The rub was how to support multi-value returns of function calls, since multi-value local constructs can be flattened by a producer. We considered a number of alternatives that preserved the AST-like structure before settling on a structured stack machine as the best design solution, since it allowed the straightforward extension to multi-values that is there now (and will ship by default when we reach the two-engine implementation status).
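To make the pre-order/post-order distinction concrete, here is a toy sketch (the opcode names and tuple encoding are invented for illustration, not the actual Wasm binary encodings) of the expression `(1 + 2) * 3` in both orders. The key observation is that a post-order stream *is* a stack-machine program: it can be decoded and evaluated in a single forward pass with no lookahead or recursion.

```python
CONST, ADD, MUL = "const", "add", "mul"

# Pre-order: operator first, then operands. Decoding this requires
# recursive descent (you must know each operator's arity up front).
pre_order = [MUL, ADD, (CONST, 1), (CONST, 2), (CONST, 3)]

# Post-order: operands first, then operator. This reads exactly like
# stack bytecode, which is why a post-order AST decodes so fast.
post_order = [(CONST, 1), (CONST, 2), ADD, (CONST, 3), MUL]

def eval_post_order(code):
    """Evaluate a post-order stream in one pass with an operand stack."""
    stack = []
    for op in code:
        if isinstance(op, tuple):          # (CONST, n): push a literal
            stack.append(op[1])
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()

result = eval_post_order(post_order)   # (1 + 2) * 3 = 9
```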
As for the present. Wasm blocks and loops absolutely can take parameters; it's part of the multi-value extension which V8 implemented already a year ago. Block and loop parameters subsume SSA form and make locals wholly unnecessary (if that's your thing). Locals make no difference to an optimizing compiler like TurboFan or IonMonkey. And SSA form as an intermediate representation is not as compact as the stack machine with block and loop parameters which is the current design, as those extra moves take space and add an additional verification burden.
A final point. Calling Wasm "not a stack machine" is just a misunderstanding. All operators that work on values operate on the implicit operand stack. This is the very definition of a stack machine. The fact that there is additional mutable local storage doesn't make it not a stack machine. Similarly, the JVM has mutable typed locals and yet is a stack machine as well. The JVM (prior to 6) allowed completely unstructured control flow and use of the stack, leading to a number of problems, including a potentially cubic verification time. We fixed that.
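A toy illustration of that point (this models the general shape of Wasm/JVM-style execution, not the real semantics): every value operator below works through an implicit operand stack, while a separate array of mutable locals sits alongside it. The locals are just extra storage; computation still flows through the stack.

```python
def run(code, n_locals=0):
    """Interpret a tiny stack-machine program with mutable locals.
    Opcodes are invented for illustration."""
    stack, locals_ = [], [0] * n_locals
    for op, *args in code:
        if op == "const":
            stack.append(args[0])
        elif op == "local.get":          # read a local onto the stack
            stack.append(locals_[args[0]])
        elif op == "local.set":          # pop the stack into a local
            locals_[args[0]] = stack.pop()
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack.pop()

# t = 4 + 5; then compute t + t
program = [
    ("const", 4), ("const", 5), ("add",),
    ("local.set", 0),
    ("local.get", 0), ("local.get", 0), ("add",),
]
result = run(program, n_locals=1)   # 18
```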
All that said, there might be a design mistake in Wasm bytecode. Personally, I think we should have implicitly loaded arguments to a function onto the operand stack, which would have made inlining even more like syntactic substitution and further shortened the bodies of very tiny functions. But this is a small thing and we didn't think about it at the time.
[edit: Perhaps "ad hominem" is a bit strong. It feels different to be on the receiving end of a comment like "flying blind"--it doesn't mean the same thing to the sender and receiver--especially when this was really not the case, as I state here.]
Ignoring any factual incorrectness, I cannot see how the author could have made his point in a more respectful way. He clearly has great enthusiasm for WASM and respect for its authors; I am struggling to see how anyone could have interpreted it as cathartic...
The paragraph in which your excerpt originated makes this pretty clear:
> The developers of the WebAssembly spec aren’t dumb. For the most part it’s an extremely well-designed specification [...] I considered WebAssembly’s design to be utterly rock-solid, and in fact I still strongly believe that most of the decisions made were the right ones. Although it has problems, it’s incredible how much the WebAssembly working group got right considering it was such relatively unknown territory at the time of the specification’s writing.
I understand that the comment author has other concerns besides the phrasing, but I'm only focusing on the phrasing right now.
Maybe in the future don't accuse engineers of 'flying blind' if you aren't inviting return fire.
From context there was a lot of conjecture going on, but the big challenge with building something new is what order to build the bits in to give you the most useful information fastest. As the number of people goes above 2 the odds that everyone agrees or that 'everyone' is right drop rapidly toward zero. You do the best you can, and hope it's good enough that you still have time to react to the worst of the decisions you made earlier. But it's not 'flying blind'.
Maybe you would have a point if it were a Linus-style "only a fucking idiot would" rant. But responding to a sincere attempt to defend a design decision as if it were an insult is some prima donna behavior.
It is not an ad hominem in any sense! For one thing, this part isn't even an attack - here the author is trying to explain and essentially forgive why the (allegedly) sub-optimal design was chosen: that there wasn't enough information at the time to make a fully informed decision! He's saying "it wasn't their fault they designed it like so, at the time it probably seemed like the best decision".
Second, even if it were some form of attack - it wouldn't be an ad hominem because it is not an _irrelevant_ personal attack. It is directly relevant whether some group made decisions based on sufficient existing information etc. The author might be totally wrong about the facts, but at least he believes and offers evidence regarding the situation at the outset of development.
It hurts to have your work criticized, and I can't comment on the factual accuracy of the timeline and other claims, but the piece does not come off at all badly-intentioned, personal or otherwise unreasonable: it comes off mostly as purely technical criticism.
An example of an ad hominem: saying that someone must have cheated on their taxes, because they have some depraved sexual kink. Whether or not the assertion (“X is a fetishist”) is true, it is obviously irrelevant to ascertaining the truth of whether X cheats on their taxes.
An example of a not-ad-hominem: saying that someone is more likely to have cheated on their taxes, because they are an old rich white man. This might be stereotyping (i.e. inductive reasoning), it might be a “personal attack”, and it might be disallowed in a debate for any number of other reasons, but it’s not an ad hominem: being in the relevant class really does have some correlation (however small) with cheating on one’s taxes (mostly because all groups other than the relevant group consist of people with less access to the resources that would allow them to get away with cheating on their taxes.) Therefore, the truth-value of the assertion is not entirely irrelevant to the syllogism—so it’s not an ad hominem.
To be clear that particular phrase wasn't direct technical criticism - but it was embedded in an article that was largely technical and it was in direct support of the technical arguments (essentially "it ended up like that because no compilers, streaming or not, existed yet").
I think you should look carefully at the definition of ad hominem. You say that the author couldn't have known what he was asserting. Maybe, sure! That doesn't relate to it being an ad hominem though (it would make it simply false). You say the words are deprecating. I don't really agree, but even if they were, that wouldn't by itself make it an ad hominem.
Ad hominem needs all three factors: _personal_, _irrelevant_ and _an attack_.
I think you can make very good arguments that it was not an attack and certainly that it was not irrelevant. I would even argue it's not personal, since it is not about a personal characteristic of any person, but simply an observation about what point in time an event took place. Like if I said "you were FLYING BLIND because you had to decide whether to take your umbrella before knowing the weather at your destination", it is not even a personal thing: just observing that you had to decide before you had all the information.
>It's an ad hominem.
I didn't read that as a criticism at all. He was just saying that the Wasm team didn't have all the information that they would ideally have wanted to have. No idea if that's true or not, but I think you're misreading it if you take it as some kind of ad hominem attack.
It's generally used to indicate operating without information that's really required, but historically it's used when that information is missing because someone else should have provided it and didn't do so, leaving those doing the work without information they need. Without that context it sounds like someone is choosing to make a poor choice and work without knowledge they should have. The responsibility for the problem in those interpretations falls on different people, which can make the phrase tricky to use without ruffling some feathers, as it seems to have done here.
If that's an extension, and it's only implemented in V8 but not in some of the other main WebAssembly platforms, then I guess it's fair to call it out as not being part of WebAssembly.
Phase 4 requires a second implementation, and then we will ship it.
The "flying blind" idiom just means that you were operating on intuitions developed from experience, but without much guidance from directly relevant factors (post mentions a working compiler with which you could run experiments to see what would and would not work).
I don't see how it could be interpreted as a dig in this context.
> It's an ad hominem.
Er, no, it's not. It's not an attack on the designers as bad people, nor does it serve the role in an argument it would need to serve to be part of an ad hominem argument even if it was.
It may be inaccurate, misleading, deceptive, uninformed, or a million other kinds of wrong, but it's not ad hominem.
I'm sorry that you felt attacked by that line, I really tried my hardest to phrase it in a way that didn't assign any blame. I wasn't trying to imply that the team wasn't thinking about these issues, just that real-world implementations of this kind of VM didn't exist yet and so many of the practical issues were difficult to see in advance.
Many of your other issues I directly address in the article itself, for example that optimising compilers can recalculate the information lost when using locals (tl;dr: why recalculate this information when you could include it in the format) and that Wasm started as an AST machine. The JVM works similarly to Wasm, true, but it is generally considered a hybrid stack/register machine. I'd define Wasm as a similar hybrid.
As for the multi-value extension, although that improves codegen for streaming compilers it doesn't reduce complexity unless locals are also deprecated. Something that I don't talk about in the article but that seems like it may be a problem going forward is that Wasm seems to have no mechanism in the format for major version bumps/breaking changes. Unnecessary things like locals and structured control flow (see the second article in the series) cannot be removed even when they are subsumed by more-general features.
Agree with all you wrote. With my own implementation, I'm hesitant to include these extensions until they are "standardized" (it's a one-man toy project; bleeding edge is unreasonable). I understand, and have watched, how going through the phases seems like quite a slow process (not that it's a bad thing). I think, to avoid fragmentation, it is reasonable for someone targeting WASM at this point to assume the multi-value extension is not available. I know there are proposals about runtime capabilities and the like, but with so few of these extensions reaching full spec adoption yet, I don't think they're viable features to leverage even if implemented in the most popular runtimes.
I.e. just change the one word “WebAssembly” to “asm.js” in the sentence you quoted. Consider it a typo. Does the history now read correctly?
To me, it broke down as:
(1) SSA form would allow for substantially simpler compilers. Adding block/loop parameters while maintaining support for locals as an extension doesn't address that, as compilers would then still have to implement full support for both modes of operation.
(2) The magnitude of the performance impact of this design decision.
I have some limited experience in compiler design but not enough to really have an understanding of the implications.
Inlining is likely important for performance. Any chance this will be corrected?
I think that this article overstates the impact of all of this.
It might, however, make sense to have another standard "SSA WebAssembly" program representation. There could then be standard tooling to compile vanilla WebAssembly to the SSA form, frontends could choose which variant they want to emit, and backends preferring SSA as input could still be made happy.
I'm obsessed with the quality of streaming-compiler-emitted code for a few reasons. Firstly, I'm working on an optimising streaming compiler. Secondly, I work for a blockchain company and we can realistically only allow linear-time compilation; this doesn't necessarily mean streaming compilation, but we might as well make it both (I explain why we need linear-time compilation in a different article: http://troubles.md/posts/why-wasm/). Thirdly, anything that gives streaming compilers more information also means that non-streaming compilers have to reconstruct less information. And lastly, in this particular case there is no reason (except for backwards-compatibility constraints) why we can't preserve more of the information from the front-end and have streaming compilers emit better code.
I've seen a single baseline compiler go through a metamorphosis from essentially streaming (HotSpot client V1) to full SSA-based with register allocation (HotSpot client today).
In other words, prepare for change. A single tier is probably not going to be your final design.
Anyway. I could tell you a lot more about how to design compilers but I have to take my kid to school.
So, I don’t see the point of sending SSA over the wire.
I'm not advocating for an SSA register machine like LLVM, I'm just advocating for a format that makes it trivial to reconstruct SSA form on-the-fly. A pure stack machine with a statically-determinable stack depth and type at any given place in the program would give you the same information as SSA form in a more-compact way.
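To unpack "statically-determinable stack depth": a validator can compute the exact stack height at every point in a straight-line program from per-opcode pop/push counts alone, without executing anything. A minimal sketch (the opcode table here is invented for illustration, not the real Wasm effects, and it ignores control flow):

```python
EFFECTS = {          # opcode -> (values popped, values pushed)
    "const": (0, 1),
    "add":   (2, 1),
    "drop":  (1, 0),
}

def stack_heights(code):
    """Return the stack height after each instruction, or raise on underflow.
    This is the kind of single-pass check a streaming validator can do."""
    heights, h = [], 0
    for op in code:
        pops, pushes = EFFECTS[op]
        if h < pops:
            raise ValueError(f"stack underflow at {op!r}")
        h = h - pops + pushes
        heights.append(h)
    return heights

heights = stack_heights(["const", "const", "add", "const", "add"])
# heights == [1, 2, 1, 2, 1]: fully known before the program ever runs
```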
That said, the key reason for my push back is the suitability of SSA for fast backends. If you can afford to run some coalescing then SSA is at least perf-neutral. If you can’t then compiling from SSA will result in crappy code. As in, probably worse than block local RA.
So your best bet is to somehow avoid having to coalesce. But that probably means using SSA only for extracting liveness and then running the world’s dumbest linear scan. Even that may not be as good as block local RA.
Overall, I'm pretty excited for WASM and the implications of it, but it does feel like the web has regressed in the ability to deliver games.
Undelimited "asymmetric" coroutines, like Lua's, could be an interesting addition. That still seems to me to be too high level a feature for a "portable assembly language" specification though.
WASM is recapitulating the same cycle. Which is understandable because time is limited and you can't make the perfect the enemy of the good, but you still have to recognize it for what it is--a vicious cycle of short sightedness. If WASM doesn't provide multiple stacks as a primitive resource, then things like stackful coroutines, fibers, etc, will have to be emulated (at incredible cost, given WASM's other constraints regarding control flow). And if they have to be emulated they'll be slow, which means languages will continue avoiding them.
However, it doesn't seem to be _entirely_ an implementation detail. Some developers just don't seem to like the semantics of called functions being able to cooperatively yield without the caller explicitly opting into it with a keyword like `await`. I disagree with them, but it's a legitimate complaint I've heard a few times.
It reminds me of arguments in the Lisp community about delimited continuations and undelimited, i.e. Common Lisp and Scheme. A lot of the arguments there are really about semantics and not implementation details, and come to the same point: should cooperative scheduling require explicit notation at each level of the call stack?
My view on this is that systems languages like C and Rust should require explicit notation for it whereas application languages should not. This seems to be a point in favour of Go and Erlang over Java and C#.
However WASM, similar to C or Rust, seems to target a level in the tech stack at which it should concern itself only with abstractions that have relatively direct translations to the instruction set of the underlying hardware. Support for multiple stacks doesn't fit into this from what I can see. (A similar argument can be made for WASM not supporting garbage collection too, although it looks like that'll be added at some point to make interoperability with JS smoother.)
With the JVM supposedly adding fibres soon, it poses a question for WASM: is it trying to be a portable assembly language, a portable high-level language runtime, or something in between?
>"This means that you have overhead associated with compilation - knowing the liveness of variables is extremely important for generating efficient assembly, but instead of the liveness being calculated when creating the IR and stored as a part of it you have to recalculate this data every time."
Can someone say what is involved in calculating "liveliness"? What is the procedure for doing so?
As an aside, liveness is also useful for some other things. For example, a variable that is live at the start of a function is one that may be used without being initialized, and the compiler can emit a warning for it.
Edit: BTW, it really is "liveness", not "liveliness".
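To answer the question directly: liveness is classically computed as a backward dataflow analysis over the control-flow graph, iterated to a fixed point: live_out(B) is the union of live_in over B's successors, and live_in(B) = use(B) ∪ (live_out(B) − def(B)). A small Python sketch (the three-block CFG and its use/def sets are invented for illustration):

```python
def liveness(blocks, succs):
    """blocks: {name: (use_set, def_set)}; succs: {name: [successor names]}.
    Returns (live_in, live_out) dicts, computed by fixed-point iteration."""
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b, (use, defs) in blocks.items():
            # live_out = union of live_in of all successors
            out = set().union(*(live_in[s] for s in succs[b])) if succs[b] else set()
            # live_in = uses, plus anything live-out that we don't define
            inn = use | (out - defs)
            if out != live_out[b] or inn != live_in[b]:
                live_out[b], live_in[b] = out, inn
                changed = True
    return live_in, live_out

# entry: defines x; body: uses x, defines y; exit: uses y
blocks = {
    "entry": (set(), {"x"}),
    "body":  ({"x"}, {"y"}),
    "exit":  ({"y"}, set()),
}
succs = {"entry": ["body"], "body": ["exit"], "exit": []}
live_in, live_out = liveness(blocks, succs)
# x is live into "body"; y is live into "exit"
```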
In Part 2, when discussing `goto`, he says, "WebAssembly is a stack machine."
Part 2 contains no explanation about the contradiction with Part 1.
Yes, what the author wants just requires more work to be done. But it's perfectly doable, although it adds complexity. Like everything in the world of compilers, there are no free lunches.
"In March 2017, the design of the minimum viable product was declared to be finished and the preview phase ended."
It was only announced in 2015.
I'm not taking sides here, but either it's well designed, or it's getting weighed down by legacy after 2 (or 4) years.
WebAssembly itself is relatively new, but it wasn't a completely blank sheet of paper that they were starting with when they designed it.
"and only at the last minute did it switch to stack-based encoding for the operators"
Which kind of counts against it being well designed.
To me, "weighed down" by legacy suggests some deep problem that shouldn't be manifesting in something so young. You could argue that 2 years is a long time in tech, I wouldn't say it's a long time in language development though.
Maybe I'm just arguing semantics here? Is a library for a particular language weighed down by legacy because it's designed to run on one particular language?