
Compiling C to WebAssembly Without Emscripten - ingve
https://dassur.ma/things/c-to-webassembly/
======
_nhynes
> C without the standard library (called “libc”) is pretty rough

If you don't want to pull in musl or a traditional libc, there's a more Wasm-y
solution known as the WebAssembly System Interface (WASI) [0] that delegates
the libc functionality to the runtime.

A WASI Wasm module can be compiled using clang, as in the article. The only
difference is to use the WASI sysroot [1].
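
As a rough illustration of what the sysroot buys you, here is a small C helper
that leans on libc calls (`malloc`, `strlen`, `snprintf`) that a freestanding
Wasm build would not have. The function name `make_greeting` is made up for
this sketch, the `main` that would call it is left out, and the build command
in the comment uses placeholder paths; the target triple is the one wasi-sdk
documented at the time, so check your SDK's README before relying on it.

```c
/* greet.c: relies on libc, which the WASI sysroot provides.
 *
 * Assumed build command (paths are placeholders; add your own main()):
 *   clang --target=wasm32-wasi --sysroot=/path/to/wasi-sysroot \
 *         -O2 greet.c -o greet.wasm
 */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated "Hello, <name>!" string.
 * Falls back to "world" when name is NULL or empty. Caller must free(). */
char *make_greeting(const char *name) {
    const char *who = (name != NULL && name[0] != '\0') ? name : "world";
    size_t len = strlen("Hello, !") + strlen(who) + 1; /* +1 for the NUL */
    char *out = malloc(len);
    if (out == NULL) {
        return NULL;
    }
    int written = snprintf(out, len, "Hello, %s!", who);
    /* The buffer was sized exactly, so the output must not be truncated. */
    assert(written >= 0 && (size_t)written < len);
    return out;
}
```

The same file also compiles natively with any C compiler, which is a handy way
to check the logic before targeting Wasm.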

> optimization

LTO and -O3 are great! I've also found the Twiggy [2] tool useful for more
"manual" optimization.

[0] [https://wasi.dev](https://wasi.dev)
[1] [https://github.com/CraneStation/wasi-sdk/releases](https://github.com/CraneStation/wasi-sdk/releases)
[2] [https://rustwasm.github.io/twiggy/](https://rustwasm.github.io/twiggy/)

~~~
dassurma
Author here :) I am very excited about WASI, but as I mentioned in a comment
below, wanted to keep it to the fundamentals so you can appreciate what WASI
does for you.

And definitely agree with the shout-out to Twiggy (which I mention in the
previous post in the series)!

~~~
ngcc_hk
Totally agreed.

Maybe the next step is to do what Rust does with core and std, so that only a
minimal subset gets pulled in. But even without that option, it's still
excellent.

------
azakai
I've been thinking of writing a blogpost comparing the advantages and
disadvantages of emscripten, wasi, and plain llvm (which is what is discussed
here), and also how those interact with web vs server. This space has
definitely gotten more interesting recently!

~~~
pspeter3
I would love to read that post. I hope you can post it soon!

~~~
tomcam
I'm all over that idea too.

------
justinclift
One of the pain points of using Wasm in the real world is the lack of decent
debugging.

eg no ability to run code in a debugger, set breakpoints, etc.

That being said, it's an area being worked on.

Wasm generated by LLVM can already have debugging info stored in it using Wasm
"custom sections" (they're a thing) in DWARF format. eg .debug_info,
.debug_str, (etc)
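
To make the "custom sections" point concrete, here is a minimal sketch of
scanning a Wasm binary for custom sections whose names start with a given
prefix such as `.debug_`. It assumes the standard binary layout (8-byte header,
then sections encoded as a one-byte id, a LEB128 size, and the contents, with
custom sections using id 0 and starting with a length-prefixed name). The
function names are invented for this example and it does not decode the DWARF
data itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode an unsigned LEB128 u32 starting at *pos.
 * Returns 0 on success, -1 if the input ends early or the value is too long. */
static int read_u32_leb(const uint8_t *buf, size_t len, size_t *pos,
                        uint32_t *out) {
    uint32_t result = 0;
    for (unsigned shift = 0; shift < 35; shift += 7) {
        if (*pos >= len) {
            return -1;
        }
        uint8_t byte = buf[(*pos)++];
        result |= (uint32_t)(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0) {
            *out = result;
            return 0;
        }
    }
    return -1;
}

/* Count custom sections (id 0) whose name starts with `prefix`.
 * Returns -1 if the module header or section framing is malformed. */
int count_custom_sections(const uint8_t *wasm, size_t len, const char *prefix) {
    static const uint8_t header[8] = {0x00, 'a', 's', 'm', 0x01, 0x00, 0x00, 0x00};
    assert(wasm != NULL && prefix != NULL);
    if (len < sizeof header || memcmp(wasm, header, sizeof header) != 0) {
        return -1;
    }

    size_t prefix_len = strlen(prefix);
    size_t pos = sizeof header;
    int matches = 0;
    while (pos < len) {
        uint8_t id = wasm[pos++];
        uint32_t size;
        if (read_u32_leb(wasm, len, &pos, &size) != 0 || size > len - pos) {
            return -1;
        }
        size_t end = pos + size;
        if (id == 0) {
            /* Custom section: contents begin with a length-prefixed name. */
            size_t p = pos;
            uint32_t name_len;
            if (read_u32_leb(wasm, end, &p, &name_len) != 0 ||
                name_len > end - p) {
                return -1;
            }
            if (name_len >= prefix_len &&
                memcmp(wasm + p, prefix, prefix_len) == 0) {
                matches++;
            }
        }
        pos = end; /* skip the rest of the section, whatever its type */
    }
    return matches;
}
```

Calling `count_custom_sections(bytes, n, ".debug_")` on a module built with
debug info would be one quick way to check whether any DWARF sections made it
into the output.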

So, debuggers are at least possible.

Unfortunately, the way Wasm does variables doesn't map to the way DWARF
currently does them. So they can't be encoded correctly. A major problem. :(

Yury Delendik is working through a spec for fixing that (officially):

[https://yurydelendik.github.io/webassembly-dwarf/](https://yurydelendik.github.io/webassembly-dwarf/)

His initial implementation, with patches on an older LLVM so that it generates
correct debug info according to the in-development spec, is here:

[https://github.com/yurydelendik/llvm-project/tree/frame-poin...](https://github.com/yurydelendik/llvm-project/tree/frame-pointer)

Testing and feedback by a wider audience would be useful. :)

A more recent fork of LLVM (based on 8.0.1 dev ~3 days ago), with Yury's
patches applied is here:

[https://github.com/justinclift/llvm/commits/release_80-wasm_...](https://github.com/justinclift/llvm/commits/release_80-wasm_v2)

Personally, I'm still trying to get my head around generating DWARF debugging
info. Hopefully work it out in a few days. :)

~~~
saagarjha
Does WebAssembly have a debug trap?

~~~
justinclift
Not specifically:

[https://webassembly.github.io/spec/core/syntax/instructions....](https://webassembly.github.io/spec/core/syntax/instructions.html)

There is an "unreachable" which may be possible to use with some creativity.

Haven't really thought that through, as I'm taking a different approach.

Since the Wasm VM executes instructions virtually, it should be feasible to
pass the VM a list of addresses to break on.

eg have the VM listen on a socket, and pass it break point info (etc) out of
band.

The main Go debugger - Delve - does this with non-Wasm targets.

As a Wasm VM executes, it just needs to check if the current instruction
matches the current break point list or any other trigger conditions.
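
A minimal sketch of that idea, using a toy bytecode rather than real Wasm: the
interpreter loop checks the current instruction offset against a breakpoint
list before executing it. The `ToyVm` type, the two opcodes, and the function
names are all invented for illustration, and the out-of-band socket that would
deliver breakpoints is left out; `vm_add_breakpoint` stands in for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BREAKPOINTS 16

/* Toy opcodes, not real Wasm instructions. */
enum { OP_INC = 0x01, OP_DEC = 0x02 };

typedef enum { VM_HALTED, VM_BREAKPOINT } VmStop;

typedef struct {
    const uint8_t *code;                 /* bytecode being executed */
    size_t code_len;
    size_t pc;                           /* offset of the next instruction */
    int32_t acc;                         /* single accumulator register */
    size_t breakpoints[MAX_BREAKPOINTS]; /* instruction offsets to stop at */
    size_t breakpoint_count;
} ToyVm;

/* Register a breakpoint; returns false if the list is full. */
bool vm_add_breakpoint(ToyVm *vm, size_t offset) {
    assert(vm != NULL);
    if (vm->breakpoint_count == MAX_BREAKPOINTS) {
        return false;
    }
    vm->breakpoints[vm->breakpoint_count++] = offset;
    return true;
}

static bool is_breakpoint(const ToyVm *vm, size_t pc) {
    for (size_t i = 0; i < vm->breakpoint_count; i++) {
        if (vm->breakpoints[i] == pc) {
            return true;
        }
    }
    return false;
}

/* Run until the code ends or a breakpoint is reached. Pass resuming=true
 * when continuing from a breakpoint so the VM steps past it instead of
 * stopping on the same instruction again. */
VmStop vm_run(ToyVm *vm, bool resuming) {
    assert(vm != NULL);
    while (vm->pc < vm->code_len) {
        if (!resuming && is_breakpoint(vm, vm->pc)) {
            return VM_BREAKPOINT;
        }
        resuming = false;
        switch (vm->code[vm->pc]) {
            case OP_INC: vm->acc++; break;
            case OP_DEC: vm->acc--; break;
            default: break; /* unknown opcodes are no-ops in this toy */
        }
        vm->pc++;
    }
    return VM_HALTED;
}
```

A linear scan of the breakpoint list on every instruction is slow; a real VM
would more likely use a bitmap or patch the instruction stream, but the control
flow is the same.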

Thus, trying to figure out debug info decoding. At least the location in
memory of variables, for displaying them when a breakpoint is hit.

There's not much use in a debugger that can't show the value of variables. ;)

~~~
saagarjha
> There's not much use in a debugger that can't show the value of variables.
> ;)

Well, that depends :)

~~~
justinclift
Heh Heh Heh. While writing that, figured someone might have a valid use case.
;)

What sprang to mind for you?

~~~
saagarjha
Nothing specific, just pointing out that debug symbols are not necessary to
find value from a debugger. A disassembly and registers (or stack, in this
case?) view, along with a place to run debugger commands, is very useful in
and of itself.

------
DonHopkins
Anybody used or have opinions about AssemblyScript?

[https://github.com/AssemblyScript/assemblyscript](https://github.com/AssemblyScript/assemblyscript)

>AssemblyScript compiles strictly typed TypeScript (basically JavaScript with
types) to WebAssembly using Binaryen. It generates lean and mean WebAssembly
modules while being just an npm install away.

[https://dev.to/jtenner/an-assemblyscript-primer-for-typescri...](https://dev.to/jtenner/an-assemblyscript-primer-for-typescript-developers-lf1)

Here's a great example of a project that uses it:

[https://github.com/torch2424/wasmboy](https://github.com/torch2424/wasmboy)

>Gameboy Emulator Library written in Web Assembly using AssemblyScript,
Debugger/Shell in Preact

Here's an excellent talk about wasmboy by the author, Aaron Turner -- he's
done some really outstanding work:

[https://www.youtube.com/watch?v=ZlL1nduatZQ](https://www.youtube.com/watch?v=ZlL1nduatZQ)

Since you can compile AssemblyScript into JavaScript with the TypeScript
compiler as well as into WebAssembly, you can compare the speed of JavaScript
-vs- WebAssembly on the same source code. Aaron did some interesting
benchmarks using wasmboy, in the great tradition of using GameBoy emulators to
benchmark JavaScript engines:

[https://medium.com/@torch2424/webassembly-is-fast-a-real-wor...](https://medium.com/@torch2424/webassembly-is-fast-a-real-world-benchmark-of-webassembly-vs-es6-d85a23f8e193)

~~~
torch2424
Hello!

Aaron Turner here, thank you for all the kind words! :) Yes I did all those
things, and stoked to see people excited about it!

Definitely feel free to reach out anytime about AssemblyScript or WasmBoy.
Would love to chat with you / anyone interested.

We also have a slack channel you can reach out and get invited to (see the
wiki sidebar):
[https://github.com/AssemblyScript/assemblyscript/wiki](https://github.com/AssemblyScript/assemblyscript/wiki)

Thanks again!

~~~
DonHopkins
Hi Aaron! I got to the AssemblyScript Slack sign-in page here --
[https://assemblyscript.slack.com/](https://assemblyscript.slack.com/) -- but
there's no obvious way to get an invitation. Where should I click or send a
request to? Or if you could please send one to don@donhopkins.com, I'd
appreciate that. Thank you!

------
lxe
This is a great article that takes you pretty far with very little. I think
it's much easier to tinker and experiment when the boilerplate and tooling are
reduced to a minimum -- you get a much deeper understanding of what's actually
going on behind the scenes.

~~~
brighteyes
Yes, much easier to tinker with a more barebones system. The downside of
powerful toolchains is often their complexity. So there's a difference between
one being better for shipping code and one better for learning.

------
phickey
Nice comprehensive writeup. The next step, beyond the basic allocator
provided, would be to use wasi-sdk
([https://github.com/cranestation/wasi-sdk](https://github.com/cranestation/wasi-sdk))
which provides a full musl-based libc, targeting the WASI interfaces. With
this, you can invoke a C program with arguments, environment variables, and
filesystem access.

~~~
dassurma
Agreed! WASI is the logical next step. It’s the “universal glue code” I wanted
to have for the longest time.

------
blackhole
The inNative WebAssembly Runtime ran into this problem as well, and includes
wasm_malloc.c, which can be linked against your application to provide a
simple malloc() implementation without having to write one yourself or depend
on WASI.

[https://github.com/innative-sdk/innative/wiki/Compile-C---wi...](https://github.com/innative-sdk/innative/wiki/Compile-C---with-clang)

Of course, it'll be a lot easier to simply depend on WASI instead and
re-implement standard libraries on top of it.
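
For a sense of how small a "simple malloc()" can be, here is a bump-allocator
sketch. It is not inNative's wasm_malloc.c; the names `bump_alloc` and
`bump_reset`, the fixed 4 KiB arena, and the 16-byte alignment are assumptions
made for this example. On a real Wasm target the arena would more typically
start at the linker-provided `__heap_base` symbol instead of a static array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_SIZE 4096
#define ARENA_ALIGN 16

/* Fixed backing store standing in for the module's heap region. */
static _Alignas(ARENA_ALIGN) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* Hand out the next `size` bytes, rounded up to the alignment.
 * Returns NULL for zero-size requests or when the arena is exhausted.
 * There is no free(): memory is only reclaimed by bump_reset(). */
void *bump_alloc(size_t size) {
    assert(arena_used % ARENA_ALIGN == 0);
    if (size == 0 || size > ARENA_SIZE) {
        return NULL;
    }
    size_t rounded = (size + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);
    if (rounded > ARENA_SIZE - arena_used) {
        return NULL;
    }
    void *ptr = &arena[arena_used];
    arena_used += rounded;
    return ptr;
}

/* Release everything at once by rewinding the bump pointer. */
void bump_reset(void) {
    arena_used = 0;
}
```

The trade-off is that individual allocations can never be freed, which is fine
for short-lived modules and not much else.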

------
jedisct1
It doesn't have to be so complicated:
[https://00f.net/2019/04/07/compiling-to-webassembly-with-llv...](https://00f.net/2019/04/07/compiling-to-webassembly-with-llvm-and-clang/)

~~~
dassurma
Author here :D WASI is great and has me all kinds of excited, but I wanted to
cover the fundamentals so people can appreciate what WASI really gives you.

------
wa1987
Off-topic: the visual aesthetic of this blog really has its own unique charm.
It's truly something else and nonetheless seems to do the trick pretty well.

~~~
dassurma
I consider myself aesthetically handicapped, so this comment made my day.
Thank you very much.

------
cryptonector
What a nice blog post! Lots of detail, and just the right level of detail.

------
shakna
Though I'm certain the security concerns were integral to the design of WASM,
the ability to pass a pointer from JavaScript downstream into WASM terrifies
the hell out of me.

I know that theoretically every WASM module is supposed to have a fully
isolated memory block, but I can't help but wonder about the day where a bug
allows WASM to deliver malware payloads to read other web browser tabs. Let's
hope that WASM doesn't become commonplace in advert networks.

~~~
maxmcd
It is a wasm pointer, not an OS pointer. You wouldn't actually get an address
in OS memory, just an offset into the addressable memory within the wasm VM.
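
A sketch of what that means from the host's side, written as plain C: the
"pointer" a module hands around is a 32-bit offset into one buffer, and every
access is bounds-checked against that buffer's size. The `LinearMemory` struct
and `wasm_load_u32` are hypothetical names for this illustration and do not
correspond to any particular engine's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WASM_PAGE_SIZE 65536u /* Wasm memory grows in 64 KiB pages */

typedef struct {
    uint8_t *bytes; /* host allocation backing the module's linear memory */
    uint32_t size;  /* current size of that memory in bytes */
} LinearMemory;

/* Read a little-endian u32 at wasm address `addr`.
 * Returns false (where a real engine would trap) if any of the four bytes
 * would fall outside the module's own memory. */
bool wasm_load_u32(const LinearMemory *mem, uint32_t addr, uint32_t *out) {
    assert(mem != NULL && out != NULL);
    if (mem->size < 4 || addr > mem->size - 4) {
        return false;
    }
    const uint8_t *p = mem->bytes + addr;
    *out = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return true;
}
```

Whatever value the module computes for `addr`, the host only ever adds it to
the base of this one buffer after the check, so it cannot name memory that
belongs to anything else.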

Getting isolation between tabs does seem to be an ongoing concern, but that's
happening regardless of the presence of wasm:
[https://v8.dev/blog/spectre#site-isolation](https://v8.dev/blog/spectre#site-isolation)

