
Asm.js: a strict subset of js for compilers – working draft - shardling
http://asmjs.org/spec/latest/
======
tachyonbeam
Sitting about 10 feet away from Luke Wagner right now. He has told me that
asm.js is Mozilla's response to NaCl. You compile code with Clang (or another
compiler) into the asm.js subset of JavaScript, which they know how to
optimize well, and their JIT compiler will offer you performance very close to
that of native C (they claim a factor of at most 2x). They use special tricks,
like (a+b)|0, to force results to be in the int32 range, and avoid overflow
checks. The heap views use multiple typed arrays pointing to the same data to
give asm.js code a typed heap of aligned values, avoiding the JS garbage
collector (you can manage this heap as you wish).
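
A rough sketch of those two tricks in plain JavaScript, which should run in any engine. The names `buffer`, `HEAP32`, `HEAPU8` and `addInt` are made up for illustration and are not taken from the spec; the 64 KiB heap size is arbitrary.

```javascript
// One raw block of memory; the typed arrays below are views onto the same bytes.
var buffer = new ArrayBuffer(0x10000); // 64 KiB "heap" (size chosen arbitrarily)
var HEAP32 = new Int32Array(buffer);   // view the bytes as aligned 32-bit ints
var HEAPU8 = new Uint8Array(buffer);   // view the same bytes one at a time

// (a + b) | 0 truncates the sum to a signed 32-bit integer,
// so no overflow check or fallback to a double is needed.
function addInt(a, b) {
  a = a | 0;
  b = b | 0;
  return (a + b) | 0;
}

// A write through one view is visible through the other view.
HEAP32[0] = 0x01020304;
var byteSum = HEAPU8[0] + HEAPU8[1] + HEAPU8[2] + HEAPU8[3]; // 1 + 2 + 3 + 4

// 2147483647 + 1 wraps around instead of becoming 2147483648.
var wrapped = addInt(2147483647, 1); // -2147483648
```

Nothing here is garbage collected per value: the program only reads and writes numbers inside `buffer`.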

They already have sample apps, including some using libraries like Qt (I
believe OpenGL as well), compiling to asm.js. I believe it has potential, so
long as they have good support for standard and common C libraries (i.e.:
porting to asm.js is almost effortless).

~~~
duaneb
> They use special tricks, like (a+b)|0, to force results to be in the int32
> range, and avoid overflow checks.

Why is this not seen as the worst idea ever? Why this great resistance to
moving away from javascript?

~~~
mccr8
> Why this great resistance to moving away from javascript?

To what? What scenario do you envision where Apple, Google, Microsoft and
Mozilla are all willing to implement a single new language for the web?

~~~
duaneb
> To what? What scenario do you envision where Apple, Google, Microsoft and
> Mozilla are all willing to implement a single new language for the web?

Well hopefully we wouldn't make the same mistake again by going with a
language and we would just implement a VM on which JS is implemented. It would
be much easier to validate the correctness of a VM anyway, achieve better
speed than was possible before, and not tie people down to a (painful)
language. On a personal note I would actually start to view the web browser as
a platform instead of a bunch of tangled strings, cans, and tape.

EDIT: I didn't answer your question. But, as a developer, I will only move to
the web for my default programming language when I have bytecode or assembly
to look at, not some convoluted scripting language. Hopefully, Asm.js is a
minor, minor bump on the way towards the browser as a usable platform for
arbitrary development.

~~~
TheZenPsycho
Oh. Right. A VM. On the web. Because that worked reallllly well when Java did
it.

~~~
wwweston
Don't forget Flash, which did actually work relatively well, and pretty much
powered the development of the web as a video and gaming platform... but which
we've eventually ditched for HTML/CSS/JavaScript + native web APIs anyway.

~~~
TheZenPsycho
That is fair enough. Though that wasn't really Flash's original ambition. And
to be further fair, the fact that it wasn't their explicit ambition is
possibly what saved it, while Java suffered a doomy fate from Microsoft's
embrace and extend strategy.

Flash was fine as a format for vector images. (like SVG). Then it was fine for
animations. Then simple interactive graphics. Then it gets a scripting
language. And as it gets more and more sophisticated, more used as an
application platform-- something flash was not originally designed to do, it
becomes more and more like Java in its shortcomings. Binary blobs. Long load
times. Serious security holes. Slow as molasses.

But flash always had one good thing going for it: Anti-aliasing.

------
cromwellian
This seems very much targeted at emscripten and not to cross-compilers that
start with GC'ed languages like GWT, Dart, ClojureScript, et al. If you are
cross-compiling Java or C# to asm.js, you don't really want to manage memory
manually. I work on GWT, and asm.js as an output target is very interesting to
me (I've worked on a number of performance Web games using it, GwtQuake,
AngryBirds, etc), but the starting assumption is GC, so I want to leverage
non-boxed numerics, and all of the other nice stuff, but don't want to stuff
everything into a TypedArray.

It's also unclear to me how this solves the problem of startup time on mobile.
A giant glob of Javascript takes a non-trivial amount of time to load even on
today's fastest smartphones. The spec seems to argue that browser VMs could
recognize asm.js and, if I read between the lines, snapshot the app and cache
it for later quick startup?

In all likelihood, the majority of asm.js outputs would actually be
non-human-readable output of optimizing cross-compilers, so there isn't much
benefit from having a human-readable syntax, so what's the
real justification for using JS as an intermediate representation over say, a
syntax specifically designed for minimum network overhead and maximum startup
speed? Seems like it might be worthwhile for Mozilla to also work on efficient
representations of asm.js that minimize this overhead. The usual response is
minify + gzip, but it's not a panacea.

~~~
dherman
We have plans!

First of all, we do care very much about supporting compilers for managed
languages like Java and C#, but we're starting with this first version that
only supports zero GC and atomic primitive types. We have plans to grow
outwards from there to support structured binary data, based on the ES6 binary
data API, and controlled interaction with the GC. Luke has ideas about how to
do this without losing the predictable performance for lower-level compilers
like Emscripten.

We do have plans for startup time. I hope to pitch a very small, simple API
for a kind of opaque compiled Function. Internally we've been calling it
FunctionBlob (we'll bikeshed the name later). The idea is that `new
FunctionBlob(src)` is almost identical to `new Function(src)` except the
object is (a) an opaque wrapper that can be converted to a function via
`fblob.toFunction()` and (b) entirely stateless and therefore compatible for
transfer between Workers as well as offline storage. This would essentially
make it possible to do things like background compilation of asm.js on a
Worker thread, and caching of the results of compilation in offline storage.
That way next time you start up you don't have to download or optimize the
source. (This could work especially well with the app experience where you
could perform these download/optimize steps as part of the installation
process.)
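
Purely to illustrate the shape of the API described above, here is a mock in today's JavaScript. It is a hypothetical stand-in, not a real or shipped browser API: this `FunctionBlob` just holds on to the source text, whereas the proposal is about holding compiled code.

```javascript
// Hypothetical mock of the proposed API; no browser ships FunctionBlob.
function FunctionBlob(src) {
  // A real implementation would keep optimized compiled code here.
  // This mock only stores the source string, which is stateless too.
  this._src = String(src);
}

// Convert the opaque wrapper into something callable.
FunctionBlob.prototype.toFunction = function () {
  return new Function(this._src);
};

var fblob = new FunctionBlob("return 6 * 7;");
var f = fblob.toFunction();
var answer = f(); // 42
```

Because the wrapper carries no closures or other state, the idea is that it could be posted to a Worker or written to offline storage, which an ordinary `Function` cannot.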

As for the use of JS, this is purely pragmatic. The code works today in
browsers, so people can start using it and it works -- and even quite fast;
Emscripten is already shockingly performant in Firefox and Chrome -- but over
time it'll see better and better optimization.

~~~
cromwellian
Presumably, if it was a blob, it could also be stored in local
storage/IndexedDB/filesystem API? Is the internal format supposed to be
architecture neutral, or dependent? I can see arguments for either, but if it
were neutral, then the blob conversion could be done offline/statically on the
server, and downloaded by the client dynamically (e.g. XHR to fetch function
blobs). If it were architecture dependent, then I could see advantages as
well, letting the browser vendor choose the optimal form of the blob. This
would potentially yield better performance, but you wouldn't be able to host
blobs on the server.

Anyway, cool idea.

~~~
dherman
I was only thinking that it would be internal. Your server point is good, but
I think way, way harder, and kind of starts the whole project back at the
beginning: how to design a standard, efficient, optimized bytecode format. So
I think it's probably not really feasible.

Not the same, but an additional optimization you can do is incremental
compilation. Because you have JavaScript's eval, you can download the code a
bit at a time and optimize (and cache via FunctionBlob) each piece.

~~~
ianb
Storing it and transferring it on the server is one thing; serializing it
locally in the browser itself might be a more reasonable goal? That is, it
wouldn't be expected to be portable to anything but that very same browser,
but it would allow you to cache the compiled result. (I would expect the
serialized string to be signed by the browser itself, to prove that it was
created by the browser – and for the deserialization to fail on some browser
upgrades).

~~~
dherman
Right, that's the idea of FunctionBlob. It wraps a browser-internal
representation of the optimized compiled code. The web code can instruct the
browser to store that offline, without exposing its implementation-specific
details to the web code. The web code can then, in a later session, retrieve
that optimized code from storage as another FunctionBlob, which it can then
convert into a Function. This is no different from just storing the asm.js
source code in offline storage, except it avoids redoing the work of compiling
and optimizing the source. (It'd still have to be stored in position-
independent format and there might be some back-patching necessary when
reloading it.)

------
evilpie
<https://bugzilla.mozilla.org/show_bug.cgi?id=840282> has some measurements on
how fast the implementation of asm.js (called OdinMonkey) already is. "sm" is
SpiderMonkey, that is the normal JS engine. v8 is Chrome's JavaScript engine.

Current results on large Emscripten codebases are as follows, reported as
factor slowdown compared to gcc -O2 (so 1.0 would mean "same speed")

    
    
                   odin (now)  odin (next)  sm      v8  
      skinning     2.80        2.46         12.90   59.35  
      zlib         2.02        1.61         5.15    5.95  
      bullet       2.16        1.79         12.31   9.30

~~~
shardling
I kinda hope that someone at Mozilla keeps a count of just how many monkeys
have gone into the code base. Off the top of my head, there's SpiderMonkey,
TraceMonkey, JagerMonkey, IonMonkey, ScreamingMonkey, IronMonkey, and perhaps
you should count Tamarin. It would be neat to see mascot-like versions of them
all... :)

~~~
nnethercote
I've never heard of ScreamingMonkey or IronMonkey, and I'm a member of
Mozilla's JavaScript team...

~~~
shardling
IronMonkey was apparently for "mapping IronPython and IronRuby (and maybe
IronPHP) to Tamarin". I just vaguely recalled seeing the name pop up in a
Mozilla related context -- I certainly couldn't have told you what it was
without looking it up. :)

<https://wiki.mozilla.org/Tamarin:IronMonkey>

------
willvarfar
The cool thing is that those of us who have small performance-critical
javascript routines e.g. game engines have a whole new cheatsheet of 'optimal'
javascript. I can't wait for box2d, octrees and matrix libraries to adopt it;
a whole new generation of hand-optimised assembler!

~~~
shardling
Last I read up on it, the version of box2d compiled with emscripten performed
a lot better than any of the "hand tuned" js versions! :)

------
shanselman
Madness! ;)
[http://www.hanselman.com/blog/JavaScriptIsAssemblyLanguageFo...](http://www.hanselman.com/blog/JavaScriptIsAssemblyLanguageForTheWebSematicMarkupIsDeadCleanVsMachinecodedHTML.aspx)

~~~
dherman
Don't think I haven't explicitly referenced you in my talks already! :D

<http://vimeo.com/43380479>

------
mhd
Reminds me a bit of efforts like C--, which sought a simpler pseudo-assembly
to use as some kind of intermediate language for compilers. But those
efforts never gained much traction, whereas I think that some transpilers
actually exploited a few features _beyond_ even normal, full-fledged C --
namely GCC extensions -- to make some features easier and/or faster (it's been
a while, but it was probably some trampolining optimization).

Let's see how it turns out.

As for something completely different, I've always wondered how it would be to
program a webapp in a rather different language than JS -- most transpiled
languages aren't that fundamentally different from JS. And Emscripten seems
mostly used to port some code that "runs in a box". Wonder how far one could
come doing some stereotypically Web 2.0 things in e.g. Pascal.

~~~
aaronblohowiak
Elm and Roy are quite different in that they abandon normal js semantics.. as
far as going to a more blub language, please do report your experimental
results!

~~~
swannodette
ClojureScript is also quite different from JavaScript -
<http://himera.herokuapp.com/synonym.html>

------
adamnemecek
I've become increasingly convinced that a standardized VM in the browser that
other languages could target would be the best idea. And we could forget that
the whole JS thing ever happened.

~~~
jws
LLVM byte code in an OS secured jail/sandbox? The JIT is already there and BSD
licensed so all the players can use it. You'd really have to trust your
sandbox though.

The API for what you can do out of your sandbox would be the hard
part. Every capability you add to the API is also a lurking attack vector in
each implementation.

~~~
adamnemecek
Microkernels in the browser to the rescue :-). But yeah, I had something like
that in mind. I do realize that getting security right would be tricky, but
I'm not sure if it would be that much trickier than say the security of any
given JS engine. Since the semantics of said VM would be probably simpler, I
would make the argument that getting the security right would be easier to do
than the security of said JS engine.

------
onassar
Any thoughts on where this could be useful? The context and purpose of it goes
a little over my head.

~~~
dherman
Compilers like Emscripten and Mandreel, which already generate code similar to
asm.js, can be modified (we already have it implemented for Emscripten, it's
not a big change) to generate valid asm.js. Then engines that recognize it can
optimize the code even further than existing optimizations. Some of the
technical benefits:

* ahead-of-time compilation and optimization instead of heuristic JIT compilation

* heap access can be made extremely efficient, close to native pointer accesses on most (all? still experimenting) major platforms

* integers and doubles are represented completely unboxed -- no dynamic type guards

* absolutely no GC interaction or pauses

* no JIT guards, bailouts, deoptimizations, etc. necessary

But the bottom line here is: asm.js can be implemented massively faster than
anything existing JS engines can do, and it's closing the gap to native more
than ever. Not only that, it's _significantly_ easier to implement in an
existing JS engine than from-scratch technologies like NaCl/PNaCl. Luke Wagner
has implemented our optimizing asm.js engine entirely by himself in a matter
of a few months.

As the site says, the spec is a work in progress but it's nearly done. Our
prototype implementation in Firefox is almost done and will hopefully land in
the coming few months. Alon Zakai is presenting some of the ideas at
<http://mloc-js.com> tomorrow, including an overview of the ideas and some
preliminary benchmarks. We'll post his slides afterwards.

~~~
espadrine
Are there any thoughts on debugging this kind of code? How do we know that the
code was correctly treated as asm.js, and if not, can we know why?

~~~
dherman
The code has to be explicitly marked as asm.js, using a pragma similar to ES5
strict mode. This way if the code fails to validate, the errors can be
reported to the browser's developer tools.
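
A minimal sketch of what such a marked module might look like. The `"use asm"` pragma and the `(stdlib, foreign, heap)` module signature follow the draft spec; the names `TwiceModule` and `twice` and the heap size are invented for illustration.

```javascript
function TwiceModule(stdlib, foreign, heap) {
  "use asm"; // the pragma: asks the engine to validate this function as asm.js

  function twice(x) {
    x = x | 0;            // parameter annotation: x is an int
    return (x + x) | 0;   // result annotation: returns an int
  }

  return { twice: twice };
}

// Engines that ignore the pragma just run this as ordinary JavaScript.
var mod = TwiceModule(globalThis, {}, new ArrayBuffer(0x10000));
var result = mod.twice(21); // 42
```

If validation fails, the function still runs as plain JavaScript; only the ahead-of-time optimization is lost.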

As for debugging, the story is the same as Emscripten. I believe there's
plenty of work to do to make it better, but it's no different than the
existing state of affairs. All we're doing is formalizing existing practice so
that engines can capitalize on it and optimize better.

------
JoshTriplett
I like the idea of this, but it bugs me that it still uses doubles and only
simulates integers via optimizations in the JavaScript compiler. Why has no
JavaScript extension arisen to supply real integers?

One notable side effect of this: asm.js only seems to support 32-bit integers,
not 64-bit or larger integers.

~~~
kevingadd
JS has 'real integers', they're not being simulated. If you put the
appropriate hints in your JS the JITted output will never use a float
anywhere. Your complaint is more that all the operators (with the exception of
the bitwise ones) operate on floats, and yes, that is kind of a problem.

64 bit integer support is being worked on for JS elsewhere; asm.js probably
doesn't offer it yet since you can't use 64 bit ints in any browser yet.

~~~
dherman
That's true, but JoshTriplett has a reasonable point. In point of fact, we
_are_ discussing custom value types like int32 and uint32, as well as compound
value types like immutable records, for the future of ECMAScript:

[http://wiki.ecmascript.org/doku.php?id=strawman:value_object...](http://wiki.ecmascript.org/doku.php?id=strawman:value_objects)

But standardization takes time, and we wanted to get asm.js working now.

In the future if ECMAScript gains these other features we'll happily
incorporate whichever ones make sense. For example, if having more
straightforward syntax can help decrease code size that's a clear win. (Though
gzipping source tends to mask a multitude of sins.)

------
msvan
This seems more realistic than Google's grand plans of displacing JavaScript
with Dart. Let's hope it gains traction!

~~~
spankalee
This would probably help the Dart effort a lot by providing a sane compilation
target.

------
edtechdev
It's still a bit surprising that types were never (and still haven't been)
added to javascript, as proposed for javascript 2.0 back in 1999:
[http://web.archive.org/web/20000817085058/http://www.mozilla...](http://web.archive.org/web/20000817085058/http://www.mozilla.org/js/language/js20.html)

Now we have Google's Dart & Closure compilers, Microsoft's TypeScript, and
Mozilla's asm, all of which essentially add types back to javascript, not to
mention about two dozen other statically typed javascript compilers:
[https://github.com/jashkenas/coffee-script/wiki/List-of-lang...](https://github.com/jashkenas/coffee-script/wiki/List-of-languages-that-compile-to-JS)

If types were approved 13 years ago, javascript apps could have been made to
run much faster (fast enough for games even), perhaps negating a need for
'native' mobile apps that we have today, and either hastening the demise or
spurring the optimization of competitors like Flash and Java and
.NET/Silverlight.

(I'm already aware of arguments against static typing, and against having a VM
in the browser or treating javascript like one.)

~~~
dherman
It's a _lot_ harder to add types to a general purpose programming language.
Your types have to match actual programming idioms, and if you care about them
being safe (which, to be fair, recent languages like Dart and TypeScript
don't), you have to consider every possible loophole that could lead to a
dynamic error -- and the legacy language fights you, because all of its
dynamism was designed back when nobody was thinking about respecting some not-
yet-existent type system.

The type system for asm.js is a far more restricted problem, which is why we
were able to come up with a solution so quickly (we only started this project
in late 2012). The type hierarchy is extremely spartan, and it's just designed
to map a low-level language (intended for _compilers_ to be writing in) to the
low-level machine model of modern computers.

------
rntz
The fact that you can't represent function pointers (not closures, just plain
old C-style function pointers) in asm.js severely limits its usability as a
target language for even C-like languages.

~~~
marijn
Can you explain the use case where you need function pointers but closure
pointers won't do?

~~~
rntz
Oh! Closure pointers would absolutely suffice. But as far as I could tell,
asm.js doesn't support closure pointers either. Obviously I could use a value
of "unknown" type to pass them in, but section 2.1.9 says "Since asm.js does
not allow general JavaScript values, the result must be immediately coerced to
an integer or double."

So I'm under the impression that asm.js can't really deal with closures or
functions as values. Am I reading the docs wrong? I _hope_ I am.
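
For context, a plain-JavaScript sketch of one common workaround: represent a C-style function pointer as a small integer index into an array of functions. This is an assumption about how a compiler could do it, not something the draft spec or this thread confirms; `inc`, `dec`, `table` and `callIndirect` are invented names.

```javascript
function inc(x) { x = x | 0; return (x + 1) | 0; }
function dec(x) { x = x | 0; return (x - 1) | 0; }

// The "address space" of functions: a pointer is just an index into this array.
// The length is a power of two so that an index can be masked into range.
var table = [inc, dec];

function callIndirect(ptr, arg) {
  ptr = ptr | 0;
  arg = arg | 0;
  // Masking with (length - 1) keeps any integer inside the table.
  return table[ptr & 1](arg) | 0;
}

var viaInc = callIndirect(0, 10); // calls inc -> 11
var viaDec = callIndirect(1, 10); // calls dec -> 9
```

The "pointer" is an ordinary int, so it can be stored in the typed-array heap like any other value.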

~~~
evincarofautumn
Looks like it’s in the works.

<https://github.com/dherman/asm.js/issues/4>

------
likeclockwork
Wow. The devs at Mozilla are really working it.

I mean, with Emscripten, Firefox OS, Firefox browser, and now asm.js...
they're about to force everyone onto their own terms.

This is clearly a big move and the beginning of a major victory for Mozilla
and all users and developers.

Serving native apps, in the browser, with JS. And everyone is going to have no
choice but to follow them, because all other browsers will fallback to their
regular JS interpreter/JIT if they don't optimize on asm.js.

We're talking games and applications that will run an order of magnitude
faster on Firefox than in other platforms out of the gate. But they'll still
run everywhere, just very slowly.

------
AshleysBrain
Any thoughts on whether there could be a JS -> Asm.JS compiler? Might be a
handy way to get rid of GC pauses - and maybe even improve performance - for
existing JS code.

Or even a JS -> Asm.JS compiler written in JS... so you can feature-detect and
enable on demand :)

~~~
tachyonbeam
Yes, you could compile JS to asm.js, but then, unless you change the JS
semantics, you'll have to implement a GC, and your GC might have pauses.

Note that an interesting possibility would be to be able to generate asm.js at
run-time for a domain-specific language. You could easily implement this as a
JS library.
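
A small sketch of that idea, assuming a made-up DSL: an array of integer coefficients `[c0, c1, c2, ...]` standing for the polynomial `c0 + c1*x + c2*x^2 + ...` in 32-bit integer arithmetic. `compilePolynomial` is a hypothetical helper; it builds asm.js-style source text at run time and instantiates it with `new Function`.

```javascript
// Hypothetical DSL compiler: coefficients in, specialized integer function out.
function compilePolynomial(coeffs) {
  // Build the expression with Horner's rule, highest coefficient first.
  var body = "0";
  for (var i = coeffs.length - 1; i >= 0; i--) {
    body = "(imul(" + body + ", x) + " + (coeffs[i] | 0) + ") | 0";
  }

  // Wrap the expression in an asm.js-style module.
  var src =
    '"use asm";\n' +
    "var imul = stdlib.Math.imul;\n" +
    "function run(x) { x = x | 0; return " + body + "; }\n" +
    "return { run: run };";

  // new Function compiles the generated text; engines without asm.js
  // support run it as ordinary JavaScript.
  var module = new Function("stdlib", "foreign", "heap", src);
  return module(globalThis).run;
}

var poly = compilePolynomial([1, 2, 3]); // 1 + 2x + 3x^2
var atTwo = poly(2);                      // 1 + 4 + 12 = 17
```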

------
btipling
Can it still be considered a 'subset' if it adds new traits like 'int' and
'intish'? If it both adds and removes things wouldn't that make it more of a
variant like Scheme is to Lisp?

~~~
dherman
Yes, the observable semantics is 100% identical to running the same code in an
existing JS engine. That's the genius behind Emscripten (i.e., the genius of
Alon Zakai) -- he figured out that you can effectively simulate the semantics
of machine integer operations by exploiting the various parts of the JS
semantics that internally do ToInt32/ToUint32 on their arguments.

What asm.js does is simply _formalize_ those tricks in order to _guarantee_ that
you're only using those parts of the JS semantics, so that it's sound for an
optimizing engine to directly implement them with machine arithmetic and
unboxed integers. But the behavior is absolutely identical to a standard JS
interpreter/JIT/etc.
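
A few of those coercions, as a sketch in ordinary JavaScript. The variable names are just for illustration; the operators are standard: `| 0` applies ToInt32 and `>>> 0` applies ToUint32.

```javascript
var big = 4294967295;       // 2^32 - 1, stored as a double

var asInt32 = big | 0;      // ToInt32: same bits read as signed -> -1
var asUint32 = -1 >>> 0;    // ToUint32: same bits read as unsigned -> 4294967295
var truncated = 3.7 | 0;    // fractional part dropped -> 3

// The same 32 bits give different quotients depending on signedness.
var signedDiv = ((-2 | 0) / 2) | 0;        // -1
var unsignedDiv = ((-2 >>> 0) / 2) >>> 0;  // 2147483647
```

An engine that knows only these forms are used can keep the values in machine registers; an engine that doesn't gets the same answers through doubles.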

------
evanprodromou
@dherman The document says, "extraordinarily optimizable". Do you have any
numbers on that?

I'd love to see benchmarks from running asm.js-compatible JavaScript on e.g.
SpiderMonkey versus an asm.js-optimizing SpiderMonkey.

Are we talking about incremental differences of 5%, 25%, even 50%, or orders-
of-magnitude improvement?

~~~
kibwen
evilpie posted some numbers below:
<http://news.ycombinator.com/item?id=5227841> and I'll reproduce the salient
points here:

    
    
                   odin (now)  odin (next)  sm      v8  
      skinning     2.80        2.46         12.90   59.35  
      zlib         2.02        1.61         5.15    5.95  
      bullet       2.16        1.79         12.31   9.30
    

<https://bugzilla.mozilla.org/show_bug.cgi?id=840282>
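
To put those slowdown factors in the terms of the question, a quick sketch that divides each engine's slowdown by odin (now) to get a relative speedup. The object layout and the `speedup` helper are made up for illustration; the numbers are copied from the table above.

```javascript
// Slowdown relative to gcc -O2, copied from the table (lower is better).
var slowdown = {
  skinning: { odinNow: 2.80, odinNext: 2.46, sm: 12.90, v8: 59.35 },
  zlib:     { odinNow: 2.02, odinNext: 1.61, sm: 5.15,  v8: 5.95 },
  bullet:   { odinNow: 2.16, odinNext: 1.79, sm: 12.31, v8: 9.30 }
};

// How many times faster odin (now) is than the given engine on a benchmark.
function speedup(benchmark, engine) {
  return slowdown[benchmark][engine] / slowdown[benchmark].odinNow;
}

Object.keys(slowdown).forEach(function (name) {
  console.log(
    name,
    "vs sm:", speedup(name, "sm").toFixed(1) + "x,",
    "vs v8:", speedup(name, "v8").toFixed(1) + "x"
  );
});
```

On these three benchmarks that works out to roughly 2.5x to 6x over SpiderMonkey, so well past incremental, with skinning on v8 the one case over an order of magnitude.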

------
jlebar
Have you guys thought about memory management in the ArrayBuffer "heap"? One
can decommit pages from a real heap, which can be a pretty important
optimization.

------
niggler
What happened with javascript in pypy? I thought they were working on the
restricted subset suitable for static analysis.

~~~
tachyonbeam
You mean this: [http://www.formal-methods.de/mediawiki/images/e/e5/Bachelor_...](http://www.formal-methods.de/mediawiki/images/e/e5/Bachelor_zalewski.pdf) ?

------
ksec
Would asm.js reduce JS memory usage?

Would non-computational scripts benefit from this? Things that are widely
used like jQuery.

------
pjmlp
Can we please replace JavaScript or use a proper VM instead of this nonsense?

Or better yet, just use native applications.

------
juiceandjuice
AKA RPython for Javascript

------
MatthewPhillips
Love it. When can we expect a JavaScript to asm.js compiler?

------
halacsy
first announcement of asm.js in Budapest on @mlocjs

------
frozenport
Will Javascript become Java?

------
dakimov
I've had similar ideas back when BlackBerry was popular with its Java-only SDK
in order to port our huge C++ app, and also when Windows Phone was C#-only.

Eventually it's worked itself out, as C++ has become available everywhere.

Nevertheless, if they succeed, the same thing can be easily made for Java and
C#, so we can make our C++ app ultra-portable.

It will be an absurd toolchain though.

People are so unable to communicate and adopt standards as if they were
retarded.

Sophisticated solutions atop idiotic problems.

It's actually the slogan of the overall field.

