WebAssembly from Scratch: From FizzBuzz to DooM (github.com/diekmann)
527 points by popitter 65 days ago | 89 comments



I found this video https://www.youtube.com/watch?v=r-A78RgMhZU "A Talk Near the Future of Python (a.k.a., Dave live-codes a WebAssembly Interpreter)" to be a brilliant introduction to WASM as well as writing interpreters in general. I'm a relative novice in the subject and it was pitched right at my level.


Dave Beazley is a truly amazing presenter. Another favorite of mine is his epic tale of how he ended up demolishing an opposing side's case in a civil lawsuit... by sheer luck of having a Python interpreter available. For your viewing pleasure:

   https://www.youtube.com/watch?v=RZ4Sn-Y7AP8


I learned about this author from SWIG. I have always liked that project. https://en.wikipedia.org/wiki/SWIG


Very nice. I like these tutorials showing the nuts and bolts of wasm and C without just throwing it at emscripten toolchain.

I'm curious if there are perf differences between canvas and webgl canvas. This project uses just canvas, but iirc passing frames to be rendered by webgl is faster. Perhaps I'm wrong in this context.

I also don't see threading in here. Makes sense for a demo, but if this were to be used performantly you'd have to throw it all in a webworker so it doesn't block the main thread. This is one point of contention with wasm because it's not straightforward to render to a canvas/webgl on the main thread from a worker thread. OffscreenCanvas is one workaround but not supported by FF or safari.


Seems to me the original rendering pipeline wasn't GPU based. The author is just dumping whatever the game renders in a canvas element, there is no need for webgl for that.

Also Canvas is getting some GPU acceleration in some browsers:

https://developers.google.com/web/updates/2012/07/Taking-adv...


Usually canvas hardware acceleration is not taken advantage of unless you do CSS tricks like setting an element with another depth order.

https://www.programmersought.com/article/58016098343/

In any case, given how browsers black list drivers and GPU models, even WebGL isn't guaranteed to be accelerated.

With these constraints the Web will never be as fast as native for graphics programming.

They took Flash away from us, but forgot to ensure the same capabilities were kept.


GP was asking whether it would be more performant to do via webgl though. (It's already known that canvas is currently used)

My bet is that webgl probably would be, but I'd also be curious to hear from someone with more info.

Seems to basically come down to whether canvas or webgl does a more efficient version of some kind of memory copy onto the GPU side.


There's also the problem of getting keyboard input in and out of the web worker in a performant manner.

I tried this a few years ago with a Gameboy emulator I had ported from Go to webassembly and used web workers to run the emulator in.

Getting the keyboard input in, in a performant way was a real struggle using postMessage, although I'll admit I'm not the best at web programming so someone more skilled might have been able to do it better


Do you mean that keyboard input into a web worker is too slow, or it uses too much memory or cpu or power or ???

How is keyboard input different from any other sort of data round trip to/from a web worker?


Why this arch over having the UI run in the main thread and sending events into the wasm worker?


I'd ported the emulator to WASM from one I'd half written a few years previously, and engineered myself into a corner (+ like I said, not a web programmer)

The emulator ran in the worker, and the inputs were handled on the main thread.

The output (i.e. display) was pushed from the worker -> main thread

I think if I was to do it from scratch I'd do something similar to the DooM approach here and use requestAnimationFrame etc

If you're interested I wrote about it a few years ago but haven't touched it since https://djharper.dev/post/2018/09/21/i-ported-my-gameboy-col...


The web worker that wasm runs in would be the one that handles the whole game, and the game has to be told the inputs. Inputs can only be grabbed on the main thread, and rendering is only done on the main thread. If you run the game in the main thread too, you can end up with your browser tab becoming unresponsive, or have delays in the input and rendering that affect the game's performance.

It's the same as when you make a python UI all in one thread, you can't receive user input and also do some long task at the same time.


You could pass it via a SharedArrayBuffer
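
A rough sketch of what that could look like, assuming a made-up layout of one byte per key code. The names `setKey` and `isKeyDown` are hypothetical and not taken from the emulator or the DooM port. Both "threads" are simulated in one script here so it runs standalone; in a real page the buffer would be sent to the worker once via postMessage.

```typescript
// Hypothetical layout: one byte per key code, 1 = pressed, 0 = released.
const KEY_COUNT = 256;
const shared = new SharedArrayBuffer(KEY_COUNT);

// Each thread builds its own view over the same underlying memory.
const mainView = new Uint8Array(shared);   // main-thread side
const workerView = new Uint8Array(shared); // worker side (simulated here)

// Main thread: would be called from keydown/keyup handlers.
function setKey(view: Uint8Array, code: number, pressed: boolean): void {
  Atomics.store(view, code, pressed ? 1 : 0);
}

// Worker: would poll this once per emulated frame, no message per key event.
function isKeyDown(view: Uint8Array, code: number): boolean {
  return Atomics.load(view, code) === 1;
}

setKey(mainView, 37, true);                   // 37: left arrow in the legacy keyCode scheme
const leftWhilePressed = isKeyDown(workerView, 37);
setKey(mainView, 37, false);
const leftAfterRelease = isKeyDown(workerView, 37);
```

Note that browsers only expose SharedArrayBuffer on cross-origin isolated pages, so the server has to send the COOP/COEP headers.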


From what I found on MDN "a side effect to the block in one agent will eventually become visible in the other agent", what does the word eventually mean there, what's going on under the hood?


Should just be nanoseconds. I think they're mostly making considerations for unfortunate thread scheduling where one gets stuck for a while.


Pity that WebAssembly Studio development seems to have stalled.

https://webassembly.studio/

It is the easiest way to get into WASM.

Threading requires sending custom headers by the way.


Any thoughts on how you could revive it?


By having someone pay me a proper salary. I don't do charity.


Given that

> Doom has a global variable screens[0] which is a byte array of SCREENWIDTH*SCREENHEIGHT, i.e. 320x200 with the current screen contents.

It would seem to me that the right approach would be to hoist out Doom's main loop so you just have a renderFrame() function, then put something on the main browser thread to "blit" the image into the canvas itself.
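
The "blit" step could look roughly like this. It is only a sketch: it assumes the frame holds one palette index per pixel and that the palette is a flat array of 256 RGB triples, and `indexedToRgba` is a made-up name, not a function from the port.

```typescript
const SCREENWIDTH = 320;
const SCREENHEIGHT = 200;

// Expand an 8-bit indexed frame into RGBA bytes, the layout ImageData expects.
function indexedToRgba(frame: Uint8Array, palette: Uint8Array): Uint8ClampedArray {
  const rgba = new Uint8ClampedArray(frame.length * 4);
  for (let i = 0; i < frame.length; i++) {
    const p = frame[i] * 3;          // start of this colour's RGB triple
    rgba[i * 4 + 0] = palette[p + 0];
    rgba[i * 4 + 1] = palette[p + 1];
    rgba[i * 4 + 2] = palette[p + 2];
    rgba[i * 4 + 3] = 255;           // fully opaque
  }
  return rgba;
}

// Demo data (assumption): a grayscale palette and a frame filled with index 7.
const palette = new Uint8Array(256 * 3);
for (let c = 0; c < 256; c++) palette.fill(c, c * 3, c * 3 + 3);
const frame = new Uint8Array(SCREENWIDTH * SCREENHEIGHT).fill(7);
const rgba = indexedToRgba(frame, palette);

// In a browser the result would be drawn inside a requestAnimationFrame callback:
//   ctx.putImageData(new ImageData(rgba, SCREENWIDTH, SCREENHEIGHT), 0, 0);
```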


As you wrote "the right approach would be", I'll add: That's exactly what the code does.


Just a point for the first chapters: you are not required to run your own local server (even if things push in that direction)

You can include the wasm as an ArrayBuffer or as a base64 encoded string and hardcode it in the javascript. Now it will run even in a static html.
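
For illustration, a minimal sketch of the base64 variant. The embedded module is a tiny hand-assembled one that exports `add(i32, i32) -> i32`; it is an example of mine, not something from the tutorial, and `base64ToBytes` is a hypothetical helper name.

```typescript
// A 41-byte module exporting add(i32, i32) -> i32, base64 encoded.
const wasmBase64 = "AGFzbQEAAAABBwFgAn9/AX8DAgEABwcBA2FkZAAACgkBBwAgACABags=";

// Decode base64 into the raw bytes the WebAssembly API expects.
function base64ToBytes(b64: string) {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

// Synchronous compilation is fine for a module this small;
// larger ones would normally go through the async WebAssembly.instantiate().
const wasmModule = new WebAssembly.Module(base64ToBytes(wasmBase64));
const instance = new WebAssembly.Instance(wasmModule, {});
const add = instance.exports.add as (a: number, b: number) => number;
const sum = add(2, 3);
```

Since nothing is fetched, this works from a file:// URL as well.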


It's incredible how far the web has come. I remember the first time I saw a browser GameBoy emulator and I was amazed. Maybe I should port my GB emulator to WASM...


> It's incredible how far the web has come.

I agree and disagree. It seems like no one is questioning why we need to use legacy web browsers in between all the code we're executing locally.

It's like a new iteration of old tech like lisp machines, which started out as specific purpose only to grow into complete environments (afaik).

In this regard, we haven't come far, it's just the syntax that has changed.


For one thing, I’m unlikely to download a native copy of Doom to run on my own machine from a strange website. The ability to run cross-platform code that uses my GPU in a secure sandbox is pretty neat to me.


prboom-plus is open source, FFS.

Also, you have FreeDoom.


In the case of WASM, even the syntax hasn't changed, it has just come full-circle back to S-expressions.


This is exactly the kind of tutorial I've been waiting for for years. The way blocks and breaks work is especially non-intuitive if you are used to either assembly or regular languages, and you START with it. Good work, really loving this tutorial!


If you'd like to try (multiplayer) Doom in WASM there's https://silentspacemarine.com/


If anybody wants to play some Deathmatch: https://silentspacemarine.com/dbee695faaa29aefe9a14d5798689e...


I tried this, how do you open doors?


I can't tell if the lack of strings and DOM API interop in web assembly is on purpose or not.

If it is on purpose, what an absolutely diabolical way to ensure javascript language dominance in the browser: give people a way to port their language to the browser, but make it incredibly difficult to do anything.


Afaik for the WebAssembly MVP, the goal was to have a simple, efficient compile target - therefore only integers and floats. To make wasm more useful & easier to integrate, the plan calls for interface types[0], which allow both accessing complex (JS) objects and calling browser APIs.

[0] https://github.com/WebAssembly/interface-types/blob/master/p...


interesting, thank you


thank you


What type of strings though? Exposing Javascript string objects in WASM doesn't make much sense if the code is expecting C strings for instance. Same for other languages, those all have their own incompatible internal representations for strings. The only somewhat interop-friendly string type is a zero-terminated bag of bytes, usually UTF-8 encoded (aka C strings), but that's a different string representation than Javascript uses.

The Emscripten SDK offers helper functions to marshal high level data types like Javascript strings to UTF-8 encoded C strings on the WASM heap and back to JS, so it's not that bad.
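
To give an idea of what such marshalling involves, here is a hand-rolled sketch using only TextEncoder/TextDecoder and a bare WebAssembly.Memory. `writeCString` and `readCString` are made-up names, not the Emscripten helpers, and the fixed offset 16 stands in for a pointer that would normally come from the module's allocator.

```typescript
// Stand-in for a module's linear memory: one 64 KiB page, zero-initialised.
const memory = new WebAssembly.Memory({ initial: 1 });

// Copy a JS string into linear memory as a NUL-terminated UTF-8 C string.
// Returns the number of bytes written, including the terminator.
function writeCString(mem: WebAssembly.Memory, ptr: number, text: string): number {
  const utf8 = new TextEncoder().encode(text);
  const heap = new Uint8Array(mem.buffer);
  heap.set(utf8, ptr);
  heap[ptr + utf8.length] = 0;
  return utf8.length + 1;
}

// Read a NUL-terminated UTF-8 C string back into a JS string.
function readCString(mem: WebAssembly.Memory, ptr: number): string {
  const heap = new Uint8Array(mem.buffer);
  let end = ptr;
  while (heap[end] !== 0) end++;     // scan for the terminator
  return new TextDecoder().decode(heap.subarray(ptr, end));
}

const written = writeCString(memory, 16, "héllo"); // "é" takes two bytes in UTF-8
const roundTrip = readCString(memory, 16);
```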

DOM access can be achieved with helper libraries which call out into JS. And since any sort of DOM manipulation is extremely slow anyway there's not much of a performance difference even with the overhead of calling out from WASM into JS (which actually is quite fast nowadays).


>What type of strings though?

The good one!

UTF-8, NOT null terminated, Pascal-like.

i.e., what Rust has:

https://doc.rust-lang.org/std/string/struct.String.html

REPEATING the mistakes of C (and considering the security angle! in a browser!) must be a big no.


This would still require conversion from and to Javascript strings, and doesn't help with any language compiled to WASM that isn't Rust. And it probably wouldn't even help Rust because such a native WASM string type would presumably live outside the WASM heap (because if the string data would be on the WASM heap, there's no need for a native string type).


Any complex structure past ints/floats requires conversion. Heck, even floats and ints do (for example: OCaml and anybody with more/fewer bits than JS).

So, given this is a fact, the best course of action is to choose the safest alternative.

And for everyone else? Well, an array of bits, and let the host/callers, who are the only ones that know their own stuff, deal with it.

INCLUDING Js.


I don't know what you're arguing about. The interface types proposal defines a string like this:

    string ≡ (list char)

char is defined to be a Unicode scalar value (i.e., a non-surrogate code point).

Basically, this is "the most safe alternative"??!


I was arguing against the idea of "strings are implemented differently by different languages, so WHICH one to choose?".

A safe one. Your sample is that (except I think it is better if it is a UTF-8 string, but this one works for me too). What would be worrisome is if it were made to be like in C.


> UTF-8, NOT null terminated, Pascal-like.
>
> i.e., what Rust has:
>
> https://doc.rust-lang.org/std/string/struct.String.html

Rust (or C++) strings are not Pascal strings. In Pascal strings, the "string buffer" also contains the length information, and historically it was all bytes with a length byte at the start, which was why your strings started at index 1 and were limited to 255 bytes.

It's possible to modernise this style of strings to be less crummy (that is essentially what sds does), but C++/Rust strings are a third take where the length (and capacity) are stored separately from the string buffer, and that buffer is always on the other side of a pointer (ignoring SSO, which Rust sadly doesn't have due to the original interface definition).
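
To make the two layouts concrete, a small sketch with a plain byte array standing in for memory. `toPascalString`, `toFatString` and the `FatString` shape are names invented for this illustration; they only mirror the layouts described above, not any real runtime's representation.

```typescript
// Classic Pascal layout: the length lives in the buffer itself, in byte 0.
function toPascalString(text: string) {
  const body = new TextEncoder().encode(text);
  if (body.length > 255) throw new RangeError("a one-byte length caps the string at 255 bytes");
  const buf = new Uint8Array(1 + body.length);
  buf[0] = body.length;   // length prefix
  buf.set(body, 1);       // character data starts at index 1
  return buf;
}

// Rust/C++-style layout: a small header that points at a separate buffer.
interface FatString {
  ptr: number;      // offset of the first byte in some heap
  len: number;      // bytes in use
  capacity: number; // bytes reserved for this string
}

function toFatString(heap: Uint8Array, ptr: number, text: string, capacity: number): FatString {
  const body = new TextEncoder().encode(text);
  if (body.length > capacity) throw new RangeError("text does not fit in capacity");
  heap.set(body, ptr);    // the buffer holds only the bytes, no length
  return { ptr, len: body.length, capacity };
}

const pascal = toPascalString("doom");
const heap = new Uint8Array(64);
const fat = toFatString(heap, 8, "doom", 16);
```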


Unfortunately that is not what WASM designers decided when they went without memory tagging for linear memory segments.

So you get all the fun to corrupt linear memory C style.


I think if you're afraid of memory corruption inside the WASM heap, it's better to use Rust instead of C or C++. WASM's job is to prevent code inside the sandbox from escaping the sandbox, not to prevent memory corruption inside the sandbox.


> I think if you're afraid of memory corruption inside the WASM heap, it's better to use Rust instead of C or C++.

Agreed, but as a consumer of WASM modules that isn't your choice to make.

> WASM's job is to prevent code inside the sandbox from escaping the sandbox, not to prevent memory corruption inside the sandbox.

That is not better than a typical OS process, it just happens to be randomly downloaded onto my computer.


It is a lot better: there is a sandbox with minimal surface area compared to no sandbox at all (except for memory isolation)


When they are finished with the WebAssembly roadmap there will be the same sandbox as a typical OS process, no different from running an ART executable on Android, bitcode on watchOS, MSIL on Windows, or TIMI on IBM i.


Of course, that is the plan, but even then it will still be possible to run WebAssembly modules with no permissions or limited permissions, as the sandbox was always there.

On the other hand I need to admit that I would not have foreseen some of the more recent use cases for WebAssembly

https://bytecodealliance.org/articles/making-javascript-run-...

which can be reminiscent of the "docker daemon running as root" issue.


"DOM manipulation is slow" is pretty much a busted misconception today:

https://svelte.dev/blog/virtual-dom-is-pure-overhead


It is "slow" relative to the overhead of calling from WASM into JS to manipulate the DOM from JS.

To be fair, I don't know how many clock cycles creating, destroying or modifying a DOM node costs on average, but most likely "a lot" compared to the overhead of a WASM to JS call because a lot more machinery is involved.


You are correct, today. But the overhead from WASM to JS is disappearing with host bindings, also known as the WebAssembly Interface Types proposal.

https://www.chromestatus.com/feature/6219189974990848


The difficulty is inherent; C, C++ and so on live in a very different world to JavaScript. Whether or not WebAssembly had direct interaction with JavaScript objects at launch, writing bridging code would still be tedious.

But there's no reason you must write this yourself. Others have done the hard work for you and written libraries.


WebAssembly's purpose was never to replace JavaScript but only to speed up certain parts of a website/app.


That was the original message used to sell WebAssembly, however when the real goal is to replace ActiveX, Flash, Silverlight and PNaCl it was obvious that it would grow beyond that.


really? i thought its purpose was to take us back to the good old days of sellable proprietary binary blobs instead of the more open HTML/JS/CSS stack.


that's not how I remember it:

https://brendaneich.com/2015/06/from-asm-js-to-webassembly/

edit: from the linked article, in case it isn't clear, an HN comment by eich:

"Sure, in userland many languages compile to assembly. Hmm, where have I heard that word lately?"[1]

[1] - https://news.ycombinator.com/item?id=9554914


The lack of strings makes sense, as many different languages and standard libraries have their own implementations of it, that can behave slightly differently. It now puts the implementation of the string to the compilers/linkers, as is generally the case for assembly as well.

The lack of a DOM API is something I sorely miss as well. It's currently possible (and not that hard, you can just interact with JS), but comes with such performance overhead that you lose the entire benefit of WASM.


WASM is supposed to be 'assembly' level a little bit like java bytecodes. So it's lower level than 'strings'.

But as you have pointed out, the missing layer on top i.e. the 'thing we can practically use' is a big gaping hole and it's a little bit diabolical.

The fact that JS has gotten so much faster and the lack of both higher-level abstractions and notably a really good 'bridge' to JS means it's lagged in terms of material applicability.


It's also still missing proper garbage collection, meaning languages like C# have to include basically the entire runtime if you compile to WebAssembly. This is a major part of why Blazor apps in .NET 5 are ~2MB for a simple "Hello World" (closer to 8MB if you use the AOT compilation options in the .NET 6 preview).


Why would WASM have garbage collection? It's an assembly target, not a runtime. What if languages would want different memory management strategies?

I know it's an existing proposal for WASM, but it feels so massively out of scope. If the issue is having to include runtimes in the WASM binary it might be more useful to think about how we could serve runtimes in a more efficient way.


I feel the same way. I find it very odd that GC is something that WASM ever intends to think about. If shipping your entire runtime sucks, find a smaller runtime?


Think of it more of "integration with a host environment's runtime" than a "adding a runtime to wasm directly."

(At least, that's what it used to be; I haven't been involved in WebAssembly for a long time.)


The problem is that each runtime has different GC requirements, so at best it will mean WASM GC semantics will be the underlying JS GC semantics, probably not what you want for a D or .NET GC, for example.


No one has ever articulated the details of what they mean by these runtimes having different GC requirements. JavaScript garbage collection has no "semantics"--it is entirely invisible to applications. Even WeakMap and WeakSet do not expose garbage collection details because they are not iterable.

The memory profile of JavaScript applications tends to look a lot like the memory profile of typical Java applications. It tends to be a law of large numbers.

Now if you want to talk about details of how we implement runtimes that do have observable GC details, like weak callbacks, Java's zoo of reference types, etc, then let's do that, because Wasm GC will eventually need to have low-level mechanisms to support those.

But if we're talking about a Wasm engine GC's ability to allocate, trace, move (or not!) little blocks of memory around, then I don't see any fundamental stumbling blocks to making that mechanism efficient and universal.


For example, the existing JS GC doesn't need to expose to the developer the ability to stop the GC, run it on demand, support value types, pin memory, use interior pointers, control GC regions, or marshal to native code, whereas a .NET or D GC does.

So if a future WASM GC doesn't offer APIs for such capabilities, it is useless from those runtimes' point of view.


Thanks for getting down to brass tacks. Of the things that you mentioned, I think that interior pointers are the only thing that is relevant.

Java has an API for executing the GC on demand, and VM engineers I have talked to over the years think it's a knob that apps shouldn't have.

Wasm already supports multiple return values, so you don't need to box value types on any boundary--they can be flattened wherever they occur.

Pinning memory has to do with interfacing native code that could potentially do unsafe things. That doesn't fit into wasm's model, and would only be necessary for interacting with platform APIs, which are being designed not to need that. Same for "marshalling to native code".

I don't understand what you mean by GC regions. Realtime Java had GC regions and a complex system for trying to allow threads to run without touching the heap. It really didn't go well. I think if regions are useful for a GC, the engine should do inference of them, because adding regions to the type system infects everything.


I would assume that there’s Microsoft folks involved to make sure it works out satisfactorily given their investment in Blazor, but yes, it’s always a possibility that an API is bad. I don’t know what their level of interest in embedding wasm inside C# is.


That was just an example; there is a plethora of GC algorithms to choose from, each of which needs to be fine-tuned for the specific runtime it is applied to, if performance is of any concern to the language implementers.


I have worked on a number of runtimes and it is not generally the case that a GC needs to be "tuned" for a runtime, rather that a GC co-evolves with a runtime and features or misfeatures of the runtime determine the path of least resistance for developing more advanced GC algorithms. The interplay tends to involve a lot of technical debt if the separation is poor from the outset. But regardless, it's rare that a runtime develops more than a couple GC algorithms unless it has a very long lifetime or is explicitly designed to allow swappable GCs, like Jikes RVM with Mmtk.

GC performance depends more on the program than the language.

But regardless, the hardest parts of getting to advanced GCs, such as concurrent and parallel algorithms are usually very deep assumptions of single-threadedness and uninterruptibility that are debt in the runtime. It usually doesn't help that most runtimes are written in C/C++ and suffer that environment's complete uncooperativeness[1] in finding and manipulating roots.

[1] To the point of seeming hostility. It's been how many years and LLVM still fights against supporting stack maps?


Yeah, could be interesting. I guess it works for the JVM? I feel like people will still want to use their own runtimes but idk


My Intel CPU also doesn't have a GC, so that is how it is.


I love seeing this kind of tutorial, that isn't just a step-by-step guide, but also an exploration of the thought process and trial-and-error that goes on in crafting each step, so thanks for sharing.

Looks like a lot of the work on the Doom port (https://github.com/diekmann/wasm-fizzbuzz/tree/main/doom) is about getting common functions from the C standard library to work in WASM. Surely this seems like a good opportunity for a new Free Software initiative - something optimized, properly licensed/credited and easy for everybody to use?


There's already wasi-libc: https://github.com/WebAssembly/wasi-libc


Interesting that Firefox did not render the fizzbuzz demo correctly by default. I had to click the canvas icon next to the address bar and allow canvas usage. It did not show a prompt either; it just looked broken.

Screenshot from Firefox vs Chrome https://i.imgur.com/Af8nTim.png


...Good! The "prompt" permission model is fundamentally broken, because all it does is train you to click through the prompt.

The "click the blocking button and turn it off" model is much better. It still trains you to turn off blocking when something is broken. However, crucially, that's only when it's broken. When it's not broken, you just use the site, instead of habitually clicking through the permission prompt that's just harvesting data, not actually needed to function.

And yes, malicious sites can of course display themselves as falsely broken until you grant the permissions. But this makes them more annoying to use, granting a UX edge to the honest sites which don't request unnecessary permissions. In other words, the incentives of sites and users are more aligned.


> all it does is train you to click through the prompt.

no, if I want extra functionality, I click it, otherwise I ignore it. (random page wants my location? nah)

modal blocking prompts were broken, optional prompts are fine


You probably have resist fingerprinting turned on


Most likely. I don't remember all the settings that I have turned on at some point. :)


FF 89 worked fine here (mac), either your FF version is old or it's some sort of FF/linux issue ?


"Then, I threw out everything which is either not needed or looks complicated. We only need the string formatting functions anyway, let's remove everything else. The result is a crossover of musl 1.2.2 and arch from emscripten for musl 1.1.15. YOLO!"

Lmao


Could I buy the author some headphones, I'd really like to see how they would port audio :)

note on controls: 'ctrl' is a bad choice because ctrl + up/down on mac maps to window management shortcuts, making the game unplayable.


My first question I thought of before reading this was how to actually display characters out of it.

Quite a mess, IMHO. (Not that I'm blaming the author).


This is common for lower-level API though. Displaying a single cube in modern OpenGL or Vulkan is also surprisingly a mess.


Jumping from FizzBuzz to Doom is quite the leap!

It reminded me of the meme "how to draw an owl": 1. Draw 2 circles 2. Draw the rest of the f**ing owl


An old port for PNaCl is available here, https://doom.pdox.net.


sorry if the answer is to read the whole series (i only read part 4), but is there a comparison of this hand-optimized route vs what emscripten outputs (in terms of binary size and browser perf)?

i assume a proper emscripten comparison would also need to strip networking & audio output.


While I think this is nice, the tutorial is almost entirely unreadable on mobile.


Nice read! I ported DOOM to TempleOS about a week ago.

https://git.checksum.fail/alec/chocolate-doom


Doing God's work, or the devil's work. I'm a little confused here.


another tutorial added to my bookmarks I will struggle to get around to :)



