
Oxidizing Source Maps with Rust and WebAssembly - mnemonik
https://hacks.mozilla.org/2018/01/oxidizing-source-maps-with-rust-and-webassembly/
======
MrBingley
> Tom Tromey and I have replaced the most performance-sensitive portions of
> the source-map JavaScript Library’s source map parser with Rust code that is
> compiled to WebAssembly. The WebAssembly is up to 5.89 times faster than the
> JavaScript implementation on realistic benchmarks operating on real world
> source maps.

Wow. Some people have been wondering if Rust has a "killer app", but this
could be it. Right now WebAssembly only supports non-GC languages (so C, C++,
and Rust), and of those three Rust is the easiest to get started with. It
looks really appealing.

~~~
jjtheblunt
Why do you say Rust is easier to start with than C or C++?

[Disclaimer : i'm old enough to dream in K&R C and later ANSI C, and then C++,
but am more than enthused about Rust, so a genuine question]

~~~
seba_dos1
I would guess that a language where the compiler holds your hand so you
don't shoot yourself in the foot is probably by definition a bit easier than
one that hides traps for you behind a corner every now and then :)

~~~
carlmr
While I generally agree with that sentiment, Rust's compiler is so hard to
please that a lot of C/C++ developers struggle to satisfy it. I think for a
beginner it's even harder.

C is so small a language that you can learn it much quicker. The problems
surface later on.

~~~
neikos
What do you mean by 'hard to please'? The joke about the compiler being some
beast you need to sacrifice a goat to seems to put the blame on the wrong end
of the computer imo.

Unless what you are doing is not fit for what guarantees Rust gives you, the
compiler should just be a crutch in case you missed a step.

~~~
carlmr
I thought the same until I actually tried Rust. The compiler will complain in
some places which are ok if you know what's happening. It was probably
easier to implement the borrow checker that way. At the same time a lot of the
error messages are very cryptic if you haven't seen them before. It is in that
sense hard to please. A lot of this might become better with coming iterations
though.

~~~
steveklabnik
Now, this is true in many senses, as it's inherently how static analysis
works, but I've also had many experiences where someone joins one of our IRC
channels, shows some code and says "hey the borrow checker won't let me do
this thing that's totally safe" and then I or someone else replies "well, what
about this?" to which the answer is "...... oh. yeah." This is virtually
always from C++ programmers.

It's hard to escape the mindset of languages we're used to!

> At the same time a lot of the error messages are very cryptic

You should file bugs! We care deeply about the legibility of error messages,
and the whole --explain system is there to try and go above and beyond.

~~~
carlmr
> You should file bugs! We care deeply about the legibility of error messages,
> and the whole --explain system is there to try and go above and beyond.

Next time I do some Rust I will. I also need to check out that --explain
system. I didn't see that when learning Rust.

~~~
steveklabnik
Take a program that doesn't work, like this one:
[https://play.rust-lang.org/?gist=2c7b672c6a211e3ab2f3995e37d...](https://play.rust-lang.org/?gist=2c7b672c6a211e3ab2f3995e37d5a52a&version=stable)

> error[E0384]: cannot assign twice to immutable variable `x`

That E0384 is a link in the browser. Click it and you go to
[https://doc.rust-lang.org/error-index.html#E0384](https://doc.rust-lang.org/error-index.html#E0384)
which has a longer version of this error.

On the command line, you can run `rustc --explain E0384` and it'll print out
the same text to the terminal.
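
The contents of the linked gist are not reproduced here. As a minimal sketch, assuming a program of the usual shape that triggers E0384 (the function name `pick` is made up for illustration), the error and its fix look roughly like this:

```rust
/// Returns 10 when `double` is true, otherwise 5.
pub fn pick(double: bool) -> i32 {
    // Writing `let x = 5;` here (without `mut`) makes the assignment
    // below fail to compile with:
    //   error[E0384]: cannot assign twice to immutable variable `x`
    // Declaring the binding as mutable is the usual fix.
    let mut x = 5;
    if double {
        x = 10;
    }
    x
}
```

Running `rustc --explain E0384` describes the same fix in more detail.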

~~~
carlmr
Nice, thank you! I always Googled it until now.

------
kbenson
Wow, I had assumed a lot more of the Rust runtime/stdlib would need to come
along with anything, but the wasm-unknown-unknown target and aggressive
culling of unused code looks to make this really competitive. That's awesome,
and I'm really happy to have my predictions and assumptions proven wrong. :)

Edit: Some of the numbers:

Original JS: just under 30,000 bytes.

Closure compiler minified JS: 8,365 bytes.

New combined JS and WASM: 20,996 bytes

Roughly half of the combined output is JS and half is WASM, since not all
components in the original JS were replaced, just a specific subset. Some
functions are duplicated because both the remaining JS and the new WASM code
use them. Rust diagnostic and error messages appear to still be present in
the data section although unusable, so they could be cleared out with better
tooling.

~~~
steveklabnik
The smallest file we've gotten so far is 116 bytes. I went over it here
[https://www.reddit.com/r/programming/comments/7fn87w/rust_wa...](https://www.reddit.com/r/programming/comments/7fn87w/rust_wasm_target_landed_in_nightly_without_extra/dqeoz5n/?context=4)

------
ridiculous_fish
VLQ decoding is sort of subtle, and the one shown in the article under "Base
64 Variable Length Quantities" looks broken.

1. `shift` can grow large, and an overlong left shift will panic in Rust. I
expect an input like "gggggggC" to crash.

2. `accum >>= 1` is wrong because it rounds down. It needs to be a round-
towards-zero: `accum /= 2`.

3. `accum` is a 32 bit int which is too small. It needs to be larger so it
can represent -2^31 in sign-magnitude form.
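
For illustration, here is a minimal sketch of a Base64 VLQ decoder that avoids all three problems. It is not the code from the article or from the `vlq` crate; the names `VlqError`, `base64_digit`, and `decode_vlq` are assumptions made up for this example. It accumulates into an unsigned 64-bit integer, so the final shift cannot round the wrong way and -2^31 fits, and it rejects overlong input before shifting.

```rust
/// Errors this sketch can report (hypothetical names).
#[derive(Debug, PartialEq)]
pub enum VlqError {
    UnexpectedEof,
    InvalidBase64(u8),
    Overflow,
}

/// Map one Base64 character to its 6-bit value.
fn base64_digit(b: u8) -> Option<u64> {
    match b {
        b'A'..=b'Z' => Some((b - b'A') as u64),
        b'a'..=b'z' => Some((b - b'a') as u64 + 26),
        b'0'..=b'9' => Some((b - b'0') as u64 + 52),
        b'+' => Some(62),
        b'/' => Some(63),
        _ => None,
    }
}

/// Decode one Base64 VLQ value from the front of `input`.
pub fn decode_vlq<I: Iterator<Item = u8>>(input: &mut I) -> Result<i64, VlqError> {
    // Unsigned 64-bit accumulator: sign bit plus a 2^31 magnitude fits easily.
    let mut accum: u64 = 0;
    let mut shift: u32 = 0;
    loop {
        let byte = input.next().ok_or(VlqError::UnexpectedEof)?;
        let digit = base64_digit(byte).ok_or(VlqError::InvalidBase64(byte))?;
        // Allow at most 7 digits (35 bits); reject longer input instead of
        // letting the shift amount grow without bound.
        if shift > 30 {
            return Err(VlqError::Overflow);
        }
        // Low 5 bits carry data, bit 5 is the continuation flag.
        accum |= (digit & 0x1f) << shift;
        shift += 5;
        if digit & 0x20 == 0 {
            break;
        }
    }
    // Sign-magnitude: bit 0 is the sign, the remaining bits are the magnitude.
    let magnitude = (accum >> 1) as i64;
    Ok(if accum & 1 == 1 { -magnitude } else { magnitude })
}
```

With this sketch, `decode_vlq(&mut "gggggggC".bytes())` returns `Err(VlqError::Overflow)` rather than panicking.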

~~~
mnemonik
Thanks for the report!

We've already fixed some of these issues[0] but I didn't update the code
snippets in the article -- woops!

[0]
[https://github.com/tromey/vlq/commit/3b41a2b6c778ce476eaaa28...](https://github.com/tromey/vlq/commit/3b41a2b6c778ce476eaaa2851a5d32e4df63a81c)

------
js2
FYI, the Sentry folks have also written a Rust-based sourcemap decoder:

[https://github.com/getsentry/rust-sourcemap](https://github.com/getsentry/rust-sourcemap)

~~~
steveklabnik
The authors of it and the authors of this were talking on Twitter; the one in
the article is pretty specialized, so that's why they didn't use this. They
recommend Sentry's library for every non-wasm use case, basically.

------
couchand
This is brilliant. Very excited to see more applications like this making
great use of the possibilities of Rust+WASM.

One thing that stuck out to me was the exploded-struct callback mechanism for
reporting Mappings back to JS. I've also been struggling to handle the low-
bandwidth interface between JS and WASM. That wasn't a strategy I'd
considered, but it's pretty neat.

It's simple enough and will work in this case, but unfortunately doesn't
generalize very well. I've been exploring using well-defined binary encoding
for this purpose (specifically Cap'n'Proto, but Protobuf or another binary
encoding would work, too).

See an example I put together:
[https://github.com/couchand/rust-wasm-capnproto-example](https://github.com/couchand/rust-wasm-capnproto-example).
I'm definitely going to go back and clean that up with some of the FFI
patterns from this article.
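
To make the exploded-struct idea concrete, here is a rough sketch. The `Mapping` fields and the `report_mappings` function are hypothetical, not the article's actual types. In a real WASM build the callback would be an imported `extern` function supplied by JS; a Rust closure stands in for it here so the example runs natively. Each struct is flattened into scalar arguments so nothing but numbers and flags crosses the boundary.

```rust
/// A simplified, hypothetical mapping record.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Mapping {
    pub generated_line: u32,
    pub generated_column: u32,
    /// Original (line, column), absent when the mapping has no source.
    pub original: Option<(u32, u32)>,
}

/// Report each mapping by passing its fields as separate scalar arguments.
pub fn report_mappings<F>(mappings: &[Mapping], mut callback: F)
where
    F: FnMut(u32, u32, bool, u32, u32),
{
    for m in mappings {
        // Flatten the Option into a flag plus placeholder values.
        let (has_original, line, column) = match m.original {
            Some((l, c)) => (true, l, c),
            None => (false, 0, 0),
        };
        callback(m.generated_line, m.generated_column, has_original, line, column);
    }
}
```

The receiving side rebuilds an object from the arguments, which is why the approach gets awkward once the structs grow nested or variable-length fields.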

------
eridius
parse_mappings is incorrect. It has the following line:

    Vec::<usize>::from_raw_parts(capacity_ptr, size, capacity);

But size is wrong. size here is the number of bytes that JavaScript instructed
Rust to allocate. It's not the size of the Vec::<usize>.

This should be harmless as the resulting Vec is immediately deallocated, so
the only thing that size is used for is deinitializing values, and since the
values are usize there's nothing to deinitialize. But it's still technically
incorrect.
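
A minimal sketch of the distinction, using made-up helper names (`alloc_bytes`, `free_bytes`) rather than the library's real functions: the length and capacity passed to `Vec::from_raw_parts` are counted in elements, not bytes, and a buffer that is only being freed can be rebuilt with a length of zero.

```rust
use std::mem;

/// Allocate at least `size_in_bytes` bytes, backed by a `Vec<usize>`.
/// Returns the pointer and the capacity in words (elements), not bytes.
pub fn alloc_bytes(size_in_bytes: usize) -> (*mut usize, usize) {
    let word = mem::size_of::<usize>();
    // Round the byte count up to a whole number of words.
    let words = (size_in_bytes + word - 1) / word;
    let mut v: Vec<usize> = Vec::with_capacity(words);
    let ptr = v.as_mut_ptr();
    let capacity = v.capacity();
    // Leak the Vec so the buffer outlives this function.
    mem::forget(v);
    (ptr, capacity)
}

/// Free a buffer returned by `alloc_bytes`.
///
/// # Safety
/// `ptr` and `capacity` must come from a single `alloc_bytes` call,
/// and the buffer must not be used or freed again afterwards.
pub unsafe fn free_bytes(ptr: *mut usize, capacity: usize) {
    // Length 0: no elements are treated as initialized. Both the length
    // and the capacity are element counts, so a byte size is wrong here.
    drop(unsafe { Vec::<usize>::from_raw_parts(ptr, 0, capacity) });
}
```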

~~~
mnemonik
Great catch -- thanks for reporting this!

Fix is over in
[https://github.com/fitzgen/source-map-mappings/pull/15](https://github.com/fitzgen/source-map-mappings/pull/15)

------
eridius
The get_last_error() snippet shows LAST_ERROR as being a static mut (which
IIRC requires unsafe access as it's not thread-safe). But the parse_mappings()
snippet shows safe access to LAST_ERROR using an API that clearly indicates
it's not just an Option. I assume that at this point it's actually a
std::thread::LocalKey.

~~~
mnemonik
Thanks for the heads up. It used to be a `thread_local!`, but switching to a
`static mut` resulted in smaller code size. I just forgot to update the code
snippet in the article.

> which IIRC requires unsafe access as it's not thread-safe

Yes, and also has no guarantees against mutably aliasing and re-entry.

Note that wasm currently has no shared memory threading (just message passing
via FFI through JS and workers), so thread-safety isn't an issue to be wary of
here, just re-entry.
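
For reference, the `thread_local!` shape mentioned above could look something like the sketch below. This is the safe variant, not the `static mut` version that was chosen for code size, and the `Error` variants and function names are assumptions made for illustration. Storing the value in a `Cell` means no borrow is held across calls, so re-entry cannot alias a mutable reference.

```rust
use std::cell::Cell;

/// Hypothetical error codes; 0 is reserved for "no error".
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u32)]
pub enum Error {
    UnexpectedEof = 1,
    BadBase64 = 2,
}

thread_local! {
    // One slot per thread; `Cell` allows updates without `unsafe`.
    static LAST_ERROR: Cell<Option<Error>> = Cell::new(None);
}

/// Record the most recent error.
pub fn set_last_error(e: Error) {
    LAST_ERROR.with(|slot| slot.set(Some(e)));
}

/// Return the most recent error as a plain integer, or 0 if none was set.
pub fn get_last_error() -> u32 {
    LAST_ERROR.with(|slot| slot.get().map_or(0, |e| e as u32))
}
```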

