
Running Lua in a browser via Emscripten - bzbarsky
http://mozakai.blogspot.com/2013/05/lua-in-javascript-running-vm-in-vm.html
======
marcosscriven
Emscripten and asm.js are frankly amazing. I ported OpenSCAD to Javascript
with it: <http://www.fabfabbers.com/openscad/>

With asm.js in Firefox nightly it's nearly native speed.

~~~
MHordecki
Curiously, even though compilation is indeed near real-time in Firefox, the
rendering of the model itself is really jaggy (when rotating). This is not the
case with Chrome, where it renders smoothly.

For reference, I'm using 2012 MBA with Firefox 23.0a2 (2013-05-30) (clean
profile, just to be sure) and Chrome 27.0.1453.93.

~~~
kevingadd
IIRC FF still doesn't use vsync for requestAnimationFrame so that could cause
rotating to look less smooth.

~~~
marcosscriven
Ah... that's interesting. Thanks for the info Kevin, I'll look into that (my
poor WebGL coding notwithstanding!)

------
mwcampbell
To my surprise, lua.vm.js gzipped is about the same size (within 2K) as my
Win32 Lua DLL build gzipped. The latter has the Microsoft C runtime statically
linked, making the comparison more fair IIUC, since code built with Emscripten
has to include a libc implementation.

This finding demonstrates that asm.js isn't significantly less efficient than
LLVM bitcode or native code, even though it's text.

~~~
takeoutweight
By "efficient" I assume you mean the object code is compact, filesize-wise?
It's fairly typical that byte-code or otherwise high-level object code is more
compact than machine code. For example, Java .class files will be smaller than
their equivalent native-compiled object code.

------
munificent
> It turns out that the entire compiled Lua VM fits in 200K when gzipped.
> That's too much for some use cases, but certainly acceptable for others
> (especially with proper caching).

Tell that to someone on a mobile device with poor connectivity.

> In particular, remember that the Lua VM is often significantly faster than
> other dynamic languages like Python and Ruby. These languages are useful in
> many cases even if they are not super-fast.

Something people often overlook is that performance is _highly_ dependent on
where the code runs. Ruby is fine for server-side programs because you can
always go fast by throwing more hardware at it.

Languages that run on end-user machines don't have that luxury. This is why
the playing field for client-side programming languages is much more
constrained and why C++ is still hugely popular there. It's also why so much
work has gone into optimizing JavaScript.

Being half as fast as JS could be tolerable for some apps, but that means your
app will likely be slow and stuttery in some cases. When it is, there's little
you can do about it. Is Lua that much of an improvement over JS to justify
that?

> There are however some tricky issues, for example we can't do cross-VM cycle
> collection - if a Lua object and a JavaScript object are both not referred
> to by anything, but do refer to each other, then to be able to free them we
> would need to be able to traverse the entire heap on both sides, and
> basically do our own garbage collection in place of the browser's - for
> normal JavaScript objects, not just a new type of objects like Lua ones.

This is a huge deal. It basically means if you use this, your app is very
likely to leak memory unless you are _very_ careful. The difficulty of being
appropriately careful is exactly why we moved to GC languages in the first
place.
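
To make the cross-VM cycle problem concrete, here is a toy model, not how lua.vm.js or any browser actually works. Every name in it (`ToyHeap`, `link`, `collect`) is hypothetical. Each heap has its own collector and must treat any reference coming from the other heap as a root, because it cannot see into that heap.

```typescript
// Toy model only: each heap sees references from the other heap as opaque pins.
class ToyHeap {
  // Live objects mapped to the number of references held by the other heap.
  private pinCount = new Map<string, number>();
  private roots = new Set<string>();
  // Cross-heap references going out of each object.
  private outgoing = new Map<string, Array<[ToyHeap, string]>>();

  alloc(name: string, rooted: boolean): void {
    this.pinCount.set(name, 0);
    this.outgoing.set(name, []);
    if (rooted) this.roots.add(name);
  }

  // Record that `from` (in this heap) references `to` (in the other heap).
  link(from: string, other: ToyHeap, to: string): void {
    this.outgoing.get(from)!.push([other, to]);
    other.pinCount.set(to, (other.pinCount.get(to) ?? 0) + 1);
  }

  // Free everything that is neither locally rooted nor pinned from the other side.
  collect(): string[] {
    const freed: string[] = [];
    for (const [name, pins] of Array.from(this.pinCount)) {
      if (pins === 0 && !this.roots.has(name)) {
        for (const [other, to] of this.outgoing.get(name) ?? []) {
          other.pinCount.set(to, (other.pinCount.get(to) ?? 1) - 1);
        }
        this.pinCount.delete(name);
        this.outgoing.delete(name);
        freed.push(name);
      }
    }
    return freed;
  }
}

const js = new ToyHeap();
const lua = new ToyHeap();

// One-way reference: both objects are eventually freed.
js.alloc("jsWrapper", false);
lua.alloc("luaTable", false);
js.link("jsWrapper", lua, "luaTable");
const firstPass = [...js.collect(), ...lua.collect()];

// Cross-heap cycle: each side pins the other, so neither collector frees anything.
js.alloc("jsCallback", false);
lua.alloc("luaClosure", false);
js.link("jsCallback", lua, "luaClosure");
lua.link("luaClosure", js, "jsCallback");
const cyclePass = [...js.collect(), ...lua.collect()];

console.log(firstPass, cyclePass);
```

In this sketch the unrooted cycle survives every collection, which is the leak being described; breaking it would need a collector that can trace both heaps at once.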

WebKit (now Blink on the Google side) actually has a similar problem already:
WebKit manages memory for the DOM separately from V8's garbage collector. This
adds a bunch of complexity to the browser to deal with those cycles and has, I
think, a significant performance cost.

It's enough of an issue that the Chrome team is starting a new project
("oilpan") to provide a unified GC shared by both V8 and the DOM.

Don't get me wrong, I think this is a very cool hack. But I don't think it
says much about the viability for using something like this for real apps, at
least not yet.

~~~
mwcampbell
> Is Lua that much of an improvement over JS to justify that?

This depends on the project, of course. For existing software that's already
written in Lua and now needs to run inside a browser, running the Lua VM in JS
may be the best way to go. However, I've decided that for new code, Lua
doesn't have enough advantages over JS to justify the problems with running a
VM in a VM.

Here are the things that I consider significant advantages of Lua over JS, and
my thoughts on each.

1\. Coroutines: JS is getting a form of these via generators, though I don't
know how soon that feature will become ubiquitous. In the meantime, a compiler
can turn code that uses coroutines into continuation-passing style.

2\. Weak references: It's unfortunate that JS doesn't have these. But as the
OP pointed out, running a VM within a VM introduces other memory management
problems, since the guest VM has its own GC. To avoid those issues, I can live
without weak references in the language.

3\. Metamethods: The most well-known metamethods are for overriding table
operations, so one can implement properties, proxies, or other dynamic
behavior. I've sometimes found these useful on previous Lua projects, but for
new code, I can do without them in order to avoid the problems of running a VM
in a VM.

4\. Non-string keys for tables/objects: There are ways around this. For
example, instead of using another object as a key, one can assign a unique ID
to the object, then use that as the key.
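
As a minimal sketch of the workaround in point 4, assuming a hypothetical `__uid` field and `idOf` helper (neither comes from any library), an object can be tagged with a numeric ID the first time it is used as a key:

```typescript
// Hypothetical helper names; the `__uid` field is an illustrative convention only.
type Keyable = { __uid?: number };

let nextId = 1;

// Return the object's ID, assigning a fresh one on first use.
function idOf(obj: Keyable): number {
  if (obj.__uid === undefined) {
    obj.__uid = nextId++;
  }
  return obj.__uid;
}

// A plain object used as a table, keyed by the numeric ID.
const scores: { [id: number]: number } = {};

const alice: Keyable & { name: string } = { name: "alice" };
const bob: Keyable & { name: string } = { name: "bob" };

scores[idOf(alice)] = 10;
scores[idOf(bob)] = 20;

console.log(scores[idOf(alice)], scores[idOf(bob)]);
```

The trade-off is that the ID has to be stored on the key object itself, and the table holds no reference to that object, only to its ID.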

~~~
gsnedders
ES6 will introduce Maps and WeakMaps, at least, which addresses two and four;
similarly, proxies will enable three.
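
A rough sketch of what that looks like with the standard ES6 `Map` and `Proxy` objects. The `withDefault` helper is a made-up name, and a `get` trap is only loosely comparable to Lua's `__index` metamethod:

```typescript
// Point 4: a Map accepts any value, including an object, as a key.
const owner = { name: "table-key" };
const byObject = new Map<object, string>();
byObject.set(owner, "stored under an object key");

// Point 3: a Proxy `get` trap can supply a fallback for missing properties.
// `withDefault` is a hypothetical helper written for this example.
function withDefault<T extends object>(
  target: T,
  fallback: (key: string) => unknown
): T {
  return new Proxy(target, {
    get(obj, prop, receiver) {
      if (typeof prop === "string" && !(prop in obj)) {
        return fallback(prop);
      }
      return Reflect.get(obj, prop, receiver);
    },
  });
}

const settings = withDefault<Record<string, unknown>>(
  { volume: 7 },
  (key) => `no ${key}`
);

console.log(byObject.get(owner), settings["volume"], settings["missing"]);
```

Note that `Map` compares object keys by identity, so a second object with the same contents is a different key.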

~~~
azakai
WeakMaps don't quite give you everything that weak references do in other
languages. Sometimes weak refs can be used to check if the object referred to
has gone away (the weak ref becomes null in that case); that is not possible
with WeakMaps. You also cannot create a list of weak references and iterate
over them with a WeakMap.
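
A small sketch of those limits, using only the standard `WeakMap` API (the variable names are illustrative):

```typescript
// A WeakMap can attach data to an object without keeping that object alive.
const metadataCache = new WeakMap<object, string>();
const widget = { id: 1 };
metadataCache.set(widget, "metadata");

// Lookup works only if you already hold the key object itself.
const found = metadataCache.get(widget);

// There is no enumeration API, so you cannot list or iterate the entries,
// and there is no way to ask "has this key been collected?".
const hasIteration =
  "forEach" in metadataCache || "keys" in metadataCache || "size" in metadataCache;

console.log(found, hasIteration);
```

Because holding the key is the only way to query the map, the holder necessarily keeps the key alive, so a `WeakMap` alone cannot observe an object going away.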

There has been some discussion about adding more powerful weak refs to
JavaScript, but I'm not sure where that discussion stands.

~~~
gsnedders
I wrote a parenthetical about weak references. I wonder why I deleted it. In
short: weakrefs are unlikely to appear any time soon in ES, as they open up
all kinds of fun around cross-origin objects (being able to detect their
approximate life-time, and that's a fairly major side-channel attack) in some
lovely edge-cases.

~~~
samth
If you have any more details about this, or even if you don't, please post to
es-discuss about it. Currently there is broad consensus on TC39 to add weak
references in JS in the future.

------
pekk
Lua is famously easy to embed. Some implementations are fast as hell. It's
comparatively a very rational and minimal language. It doesn't have to
immediately interact with Javascript objects in any interesting way.

Please just embed the damn thing in a browser already. It's been way too long
without any competition to Javascript being allowed.

~~~
Drakim
A lot of people want their language embedded in the browser, such as Lua, Dart,
Ruby, etc.

But the way things stand now, it's not gonna happen. JavaScript, for better
or worse, is what we are stuck with. No browser vendor will be able to
push through another language, and no standards committee would suggest a
_second_ scripting language for the web that runs alongside JavaScript.

If you absolutely cannot stand programming in JavaScript, which is the only
native option for the web, you have to make do with the "language written on
top of another language" solutions that are so popular these days, along with
their somewhat costly overhead.

~~~
vidarh
The only real option is presumably the Dart route: Work really hard to make it
compile to fast javascript and integrate well, and do a native VM as a
separate option that can at some point be slotted in as an "optimisation" if
the language gets enough traction. I don't think that alternative looks _that_
bad. I'd rather have that, than go back to plugin hell or have browsers
bloated with support for tons of languages.

------
nsomething
If this is the real deal, then why not ship common language VMs (in asm.js)
with the browser? Then you don't need to worry about 200k at all and all code
can run in the browser? I mean, if Google is shipping NaCl and Dart VMs, why
not? Updates could ship with browser updates, or you can still load your own
rolled version if you like.

~~~
azakai
Hmm, you could avoid some downloads by basically populating the browser cache
by default with some common libraries, like jQuery, and maybe a Lua VM if that
catches on, yeah. But this would be totally different from shipping a
nonstandard VM, since there is no violation of the standards process if you
are basically just optimizing away first downloads of common JS libraries. It
does have the cost of increasing the initial browser download though, which is
perhaps why this isn't already done (also things like jQuery have many
versions which makes things harder).

Note however that Google has not actually shipped NaCl or Dart. (NaCl
technically ships in Chrome, but is disabled on the web by default; it is only
usable in the Chrome Store.) So currently no browser is shipping a new
nonstandard VM - which is good, because if a major browser did that it could
fragment the web.

~~~
NinjaWarrior
FYI, current mobile browser caches are very weak and unreliable. For instance,
iOS clears all cached data just by exiting Safari. Android is in a
similar situation (unpredictable, and I can't explain why).

I guess it is caused by the storage limitations. We can't rely on the mobile
browser cache, at least for now.

------
stormbrew
I really feel like the right approach for this kind of thing is something
_like_ emscripten, but _not_ emscripten. It just doesn't make sense to
duplicate the garbage collection (never mind that interactions between
different GCs can be strange and unpredictable) and run a whole VM for it.

What we need is a similarly narrow target JS subset for dynamic languages
where the outer GC is actually functional and lookup/inline cache kinds of
optimizations can be performed on correctly generated code.

~~~
azakai
I like the idea, and it could work in many cases, but it's hard to make it
truly universal. For example languages like Lua, Java and C# have finalizers,
and JavaScript GCs do not, so you can't reuse the outer GC. Other possible
issues are weak refs, etc.

So even if we duplicate by compiling another GC, we are enabling new types of
GCing, it isn't an exact clone.

~~~
mzl
While Java does have finalizers, there is no guarantee that they will run. In
other words, a valid implementation could just skip running them at all.

------
ot
It would get _really_ interesting if someone modified LuaJIT to emit asm.js
code.

In this case, the Lua code could be even _faster_ than the equivalent JS code
on some applications where LuaJIT generates better code than V8/IonMonkey
(even considering the 2x slowdown of asm.js wrt native).

The relevant yo dawg joke would be "I heard you like JITs so we put LuaJIT on
your OdinMonkey so you can JIT while you JIT"

------
wiredfool
In the spirit of a vm in a vm, has anyone successfully compiled emacs in
emscripten yet? I ran a quick test yesterday and it didn't go cleanly.

------
iso8859-1
The benchmark[1] includes scimark, which measures FLOPS. I get 0.62 MFLOPS. In
scimark running in the JVM[2], I get 353.4 MFLOPS (units explained here[3]).
So in conclusion, the Oracle JVM on x86-32 is __570x faster__ than lua.vm.js
on V8 on the same platform.

[1]: <http://kripken.github.io/lua.vm.js/lua.vm.js.html>

[2]: <http://math.nist.gov/scimark2/run.html>

[3]: <http://math.nist.gov/scimark2/faq.html>

~~~
adlpz
I assume the JS code generated here makes heavy use of the asm.js subset, so it
can run faster on asm.js-capable browsers.

Therefore V8 is not the ideal platform for benchmarking this implementation.
You could try with a nightly Firefox.

__EDIT__: Indeed, 8 MFLOPS on FF Aurora, 1.2 MFLOPS in Chrome Canary.

~~~
iso8859-1
What do you get in the JVM?

------
mwcampbell
Did Lua's use of setjmp and longjmp for error handling pose any particular
challenge? I always thought that was an odd aspect of Lua's implementation.

~~~
azakai
setjmp/longjmp works in emscripten, though it does prevent some optimizations
in functions where it is called. Luckily, it doesn't seem to be
called in anything performance-sensitive in the benchmarks I've tested.

~~~
alayne
Not sure if this is related, but you often have to declare local variables
volatile when using setjmp/longjmp because they do not preserve registers.
It's one case where unoptimized C code will run fine, then -O will push
variables into registers and break it.

I wonder how difficult it would be to convert Lua to C++ with exceptions.

~~~
yoklov
This isn't an issue for Lua, because so much effort has gone into making it
extremely portable. Regardless, a port to C++ would be possible, though I'd
(without thinking about it much) guess it would take a performance hit due to
the increased code size from the EH tables causing icache misses.

