

The elusive universal Web bytecode - espadrine
http://mozakai.blogspot.com/2013/05/the-elusive-universal-web-bytecode.html

======
Lerc
> The main problem is that we just don't know how to create a perfect "one
> bytecode to rule them all"

It boggles me how prevalent the assumption is that there should be a single
bytecode. Not all tasks are the same. In real-world computing, an ATMega is
sometimes a better choice than an ARM or an x86.

When Notch released the DCPU16 specification there were multiple compatible
emulators running within 24 hours. After 3 days there were compilers.
Supporting a bytecode can be easy. Optimizing for speed makes for a much more
difficult task, but not all architectures must target the same goals.

The article lists the goals of Fast, Portable, and Safe. I would add more
goals to that list; Deterministic and Efficient are two that spring to mind. I
would advocate multiple bytecodes that favour some goals over others. The
8-bit AVR of the Arduino would be a good pick for a bytecode with a light
footprint, ideal for small tasks.

I wouldn't want a free-for-all with a massive proliferation of architectures,
but at least three would mean there is a much better chance of having the
right tool for the right job.

~~~
comex
An ATMega is sometimes a better choice because it's cheaper and uses less
power, not because it has a better instruction format. An 8-bit bytecode would
be ideal for very little on a 64-bit computer (you'd end up simulating larger
integer arithmetic that the machine can do in a single instruction, for no
benefit); more generally, since the details of the target machine are going to
vary anyway, there is no point in having multiple bytecodes with the same
underlying memory and execution model that can be easily compiled between.
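To make the overhead concrete, here is a hypothetical sketch (mine, written in JS for readability) of what a single 32-bit addition costs when an 8-bit bytecode has to emulate it byte by byte:

```javascript
// Emulating one 32-bit add with 8-bit operations: four byte-wide adds
// plus carry bookkeeping, versus a single native instruction.
function add32_via_8bit(a, b) {
  let result = 0;
  let carry = 0;
  for (let shift = 0; shift < 32; shift += 8) {
    const byteA = (a >>> shift) & 0xff;
    const byteB = (b >>> shift) & 0xff;
    const sum = byteA + byteB + carry; // one 8-bit add with carry-in
    carry = sum > 0xff ? 1 : 0;        // carry-out to the next byte
    result |= (sum & 0xff) << shift;
  }
  return result >>> 0; // normalize to unsigned 32-bit
}
```

A 64-bit machine does `a + b` in one instruction; the emulation wraps correctly (`add32_via_8bit(0xffffffff, 1)` is `0`) but spends many instructions doing it.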

Having a C-like bytecode and a Java-like bytecode and a Haskell-like bytecode
would be more reasonable, but it would also be a huge pain for browser makers.
One highly optimized VM is complex enough...

~~~
Lerc
Having written assembly for x86, Arm, 8 bit AVR and indeed DCPU16, I disagree.
Each performs particular functions better. More importantly, we should be able
to disagree. You don't have to use my AVR-like bytecode and I don't have to
use your Java-like bytecode.

Having a single "this works for me so it should work for everyone" bytecode is
dumb, like supporting PNG as your only image format.

------
yk
I am ranting slightly out of the scope of my expertise, but I think that 'web
bytecode' is putting lipstick on a pig. In fact, I think that the entire web
stack is upside down: it is intended to serve mostly static webpages, perhaps
with a counter or a mouse-over effect, and it does that well. But starting
with the browser, having a nice HTML parser, a DOM tree, etc. is nice to have,
sometimes. Similarly with HTTP: it is a stateless protocol, which again is
nice, except when you want state. And on the other side of the connection is
the webserver, which today is mostly a glorified front end for a database.

So every time I think about the web, my sense of software design rebels. It
should be constructed the other way around: a nice VM on the client, which
contains a browser when it is supposed to present structured hypertext and
communicates with a server over TCP/IP without reinventing TCP atop HTTP, and
a server that is actually tailored to whatever it is supposed to do. (Before
anybody accuses me of advocating Java, I want all of this nicely implemented.)
But unfortunately, it is probably a billion users too late to start again from
scratch.

~~~
wwweston
> Before anybody accuses me of advocating Java, I want all of this nicely
> implemented.

So... I take it that in your opinion the reasons why Java (and to a lesser
extent Flash) didn't supersede the web have to do with implementation flaws,
not the concept?

~~~
ssp
That would be my take anyway. A big issue with Java is that it required
installing a huge standard library; that should just be downloaded as
required. And Flash was never really intended for applications; it was always
about "rich content". And AIR (the application framework built on Flash) has
the problem that it's proprietary, so you have to trust Adobe.

------
jfb
I am reminded of "Worse Is Better" [1]:

> The good news is that in 1995 we will have a good operating system and
> programming language; the bad news is that they will be Unix and C++.

Javascript seems to exhibit much of Gabriel's "New Jersey approach".

[1] <http://www.jwz.org/doc/worse-is-better.html>

------
abecedarius
I like asm.js and have used it (<http://wry.me/hacking/Turing-Drawings/>).
But I understand the basic case for 'web bytecode' to be this: software fault
isolation and portable low-level distribution formats have both been
demonstrated with considerably less overhead than the roughly 2x of current
asm.js, going back to the 90s (e.g.
<http://www.eecs.harvard.edu/~greg/cs255sp2004/wahbe93efficient.pdf>
and
<http://en.wikipedia.org/wiki/Architecture_Neutral_Distribution_Format>).
asm.js will improve, and it has a great adoption path, but it hasn't yet
been shown to run as fast as that old work claimed to have done.

~~~
azakai
To be fair, the 2x figure was from a few months ago, and was from the very
first prototype of OdinMonkey (asm.js optimizations) in Firefox. Things have
improved since then and will continue to do so, see

<http://arewefastyet.com/#machine=12&view=breakdown&suite=asmjs-ubench>
<http://arewefastyet.com/#machine=12&view=breakdown&suite=asmjs-apps>

for more current numbers. Many are better than 2x slower than native.

The first prototype was basically a few months of work by 1 engineer
specifically on OdinMonkey, building on a few years of work on the more
general IonMonkey optimizing compiler. Those are far far smaller amounts of
time than have been spent on compilers like gcc and clang, so it is not
surprising there is a performance difference. But it will get much smaller.

------
LowKarmaAccount
> It turns out that C++ compiled to JavaScript can run at about half the speed
> of native code, which in some cases outperforms Java, and is expected to get
> better still. Those numbers are when using the asm.js subset of JavaScript,
> which basically structures the compiler output into something that is easier
> for a JS engine to optimize. It's still JavaScript, so it runs everywhere
> and has full backwards compatibility, but it can be run at near-native speed
> already today.
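For reference, the "structure" the quoted passage describes can be sketched as a minimal asm.js-style module (simplified; a real module would also make use of the stdlib and heap arguments):

```javascript
function AsmModule(stdlib, foreign, heap) {
  "use asm"; // directive telling the engine to validate and AOT-compile
  function add(x, y) {
    x = x | 0;          // parameter type annotation: int
    y = y | 0;
    return (x + y) | 0; // result coerced back to int
  }
  return { add: add };
}
```

Because the annotations are ordinary JS operators, the module also runs as plain JavaScript in engines that know nothing about asm.js, which is the backwards-compatibility point the quote makes.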

The problem that many people have with web browsers is that they have evolved
from retrieving text to become bloated, Frankensteinish wannabe operating
systems that perform redundant operations at slower speeds. _It is the job of
the underlying operating system to execute programs, not the web browser's_.
Constructing a redundant operating system and executing code at _half_ the
possible speed (at best!) is not progress.

~~~
azakai
I would say it is progress, if

1. The browser executes programs in a totally platform-independent way, which
the underlying OS cannot (a website should be able to run on all browsers and
on all OSes). So having the browser run code is useful too.

2. That "half speed" number was the status as of a few months ago, and is
still improving.

~~~
LowKarmaAccount
> The browser executes programs in a totally platform-independent way

JS has different scripting engines and different dialects, so it isn't really
platform independent in the sense that the same code will execute the same way
on every computer. In fact, it's possible for JS to execute in different ways
on the same computer when you run the code on browsers with different engines.
There are plenty of instances where a page will not render the same way in
different browsers (although this is usually because HTML use isn't
consistent).

> That "half speed" number was the status as of a few months ago, and is still
> improving.

It's still an effort to reinvent the wheel. When a 1:1 ratio is reached, you
(not you in particular) will still have ended up spending time implementing a
feature that already exists on your OS, instead of improving the original
program. For example, browsers have pdf readers baked in. Now you have a
dedicated pdf reader on the desktop, and a slower, buggier pdf reader on the
browser.

The problem is that programs on the desktop don't communicate well with each
other to get the same kind of integration you get on a browser, or the
underlying OS doesn't include programs to implement features that you want to
include on your web page. I understand that it may be simpler to turn a web
browser into a mini OS rather than get OS authors to change their OS (the
Chromebook is a misguided attempt to take this to its logical end), but it is
still an ugly and redundant solution.

~~~
azakai
JS, for the most part, does run identically on different browsers. With very
few cases of truly undefined behavior, any other difference is either DOM
stuff, or a bug.

I've ported lots of apps to JS, and generally they just work across browsers.
Even the Unreal Engine 3 demo runs on Chrome, JS was not what was holding it
back.

> It's still an effort to reinvent the wheel. When a 1:1 ratio is reached, you
> (not you in particular) will still have ended up spending time implementing
> a feature that already exists on your OS

The OS can run it, but native apps are not portable, as I said before. That is
what makes it worth reinventing some parts of OSes in browsers.

------
narrator
IMHO, the whole reason Javascript is the de-facto language of the web is that
Microsoft made the mistake of shipping a Turing-complete language runtime that
was not dependent on the Win32 API, and the language happened to be
Javascript.

IMHO, any kind of new web bytecode standard will be immediately broken by
Microsoft into an incompatible variant that we won't be able to nicely hack
around and will thus fragment the market in their favor. The code base and
runtime installed base of Javascript is so massive now that they can't easily
fragment it.

~~~
jerf
Netscape shipped Javascript. Microsoft shipped JScript in a desperate attempt
to peel away Netscape's near-monopoly market share, and they (correctly, IMHO)
judged that impossible if they were entirely incompatible with Navigator.

Microsoft is no longer in a position to unilaterally destroy the web, and I
see no high probability of them regaining that position in the foreseeable
future. (Too much stuff is going mobile, and they're still not strong enough
in that space to dictate... to put it lightly.)

~~~
nitrogen
Microsoft may still be able to hinder the web by using its influence in W3C to
push for detrimental standards like Encrypted Media Extensions, and by
refusing to implement things like WebGL.

~~~
jabr
They can't stall forever on new web standards if they prove useful. I'm not
sure WebGL will, but if it does, the "graceful degradation" culture in web
development means that IE users will get some sort of slow, limited
approximation of the intended experience. And enough of them will know and
start installing Chrome or Firefox in larger numbers again.

Microsoft can slow the adoption of web standards (and then only via the
desktop), but they don't have any real control now. And the only reason WebGL
isn't in IE 10 is that they didn't want to offend the DX/D3D team. Once IE
supports WebGL, D3D will die.

~~~
nitrogen
WebGL is useful for more than just games. For example, I'm using a fragment
shader to implement client-side demosaicing[0] in the web interface for my
Kinect-enabled home automation hardware (link in my profile if you're
curious). CSS Shaders/custom filters might serve as a partial substitute for
WebGL, but IE doesn't support those either.

On the bright side, I've seen rumors that IE11 will support WebGL, but that is
of little use to people stuck on XP, Vista, or W7 due to compatibility
requirements of other legacy apps. As you say, IE doesn't have the dominance
it once did, but for some reason it's still used in certain market segments.

[0] <https://en.wikipedia.org/wiki/Demosaicing>

------
n00b101
Please. JavaScript is a far cry from the ideal byte code / intermediate
representation for all future applications to be built in. It's bad enough
that people think every new app should run inside a web browser ... but saying
that it MUST be written in JavaScript (or something that can generate
JavaScript) is manifestly irrational and ridiculous.

~~~
wmf
Just calling JS "irrational and ridiculous" doesn't advance the conversation;
we need details.

~~~
lucian1900
"saying that it MUST be written in JavaScript ... is manifestly irrational and
ridiculous"

You misread.

------
stcredzero
Perhaps the most significant thing is what he glosses over in this
parenthetical paragraph:

 _(Of course there is one way to support all the things at maximal speed: Use
a native platform as your VM. x86 can run Java, LuaJIT and JS all at maximal
speed almost by definition. It can even be sandboxed in various ways. But it
has lost the third property of being platform-independent.)_

The argument is made that any bytecode is probably going to have most of the
flaws of Javascript anyhow. However, it's a long-standing truism that most
problems in Comp Sci can be solved with another level of indirection. The TAOS
VM was a virtual instruction set architecture which could be just-in-time
translated to a real instruction set architecture and run as fast as it could
be read off disk. Since it is not a real ISA, it would also be
platform-independent.

<http://www.uruk.org/emu/Taos.html>

Using such a VM would truly allow one to support any language on the web, at
something like 80% "native" speeds. NaCl, LLVM, or even asm.js could be grown
into such a VM.

Additional levels of indirection are apt to create problems as well as solve
them, however. Such an undertaking as an overarching standard would be very
complex -- so much so that it may never come to pass, so asm.js may still be
preferable, since it's much closer to actually existing.

~~~
ahomescu1
I couldn't figure out from the link, how is TAOS different from the JVM/CLR?
It seems to me that pNaCl and asm.js are already 99% of that, they're only
missing all the libraries.

~~~
stcredzero
The TAOS VM is like pNaCl, in that it's more of a target for implementing a VM
than a bytecode compiler target. And yes, asm.js is already most of this.

------
mwcampbell
I've basically resigned myself to using JavaScript for new client-side
development, since it's the native language of the browser, which is the most
restrictive of all platforms.

I considered C# a few months ago; I know that things like Script# and JSIL can
translate C# or CLI bytecode to JS. But I figured there would be subtle
semantic mismatches that would require anyone working with such code to know
both C# and JS anyway. So arguably JS requires less working knowledge.

The most obvious semantic mismatch between JS and other mainstream languages
is that the other languages usually (if not always) support multiple threads
with shared state. Consequently, the standard libraries of these other
languages tend to have blocking APIs. This seems to me like a good reason to
use JS for new client-side development, rather than trying to make another
language compile to JS. One can convincingly argue that not supporting
multiple threads with shared state is a good thing, anyway.
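To make the mismatch concrete, here is a minimal sketch (my own example, not from the comment above): a call that blocks in a threaded language must be rewritten in continuation-passing style in JS, which changes the shape of every caller up the stack.

```javascript
// Threaded pseudocode:
//   result = fetchValue();   // thread blocks until the value is ready
//   use(result);
//
// Browser JS cannot block, so the translation inverts control: everything
// after the "blocking" call moves into a callback.
function fetchValue(callback) {
  callback(42); // stands in for a value that would arrive asynchronously
}

let seen;
fetchValue((result) => {
  seen = result; // the continuation of the original function lives here
});
```

This rewrite is mechanical but infectious: once one call site becomes callback-shaped, every function that calls it must change shape too, which is exactly why translated blocking libraries feel foreign in JS.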

The only limitation in JS that really bothers me, if it still even exists, is
that AFAIK the largest possible integer is 2^53.
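That limit is easy to demonstrate: JS numbers are IEEE-754 doubles, so integers above 2^53 are no longer all exactly representable.

```javascript
// Above 2^53, adjacent integers collapse to the same double.
const max = Math.pow(2, 53);   // 9007199254740992
console.log(max === max + 1);  // true: max + 1 rounds back to max
console.log(max - 1 === max);  // false: below 2^53, integers are exact
```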

~~~
iso8859-1
JavaScript still has no integers. But you can use a bignum library. Hopefully
you'll get integer-level performance if the JavaScript VM sees that you're
only doing integer operations (that includes dividing and then coercing to an
"integer" by using "|0" or similar).
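The "|0" idiom mentioned here can be sketched as follows (my example): the bitwise OR coerces its operand to a 32-bit integer, which both truncates the value and hints to the VM that it can stay in an integer register.

```javascript
// |0 coerces to int32: ToInt32 truncates toward zero, and a JIT that
// sees the pattern can keep the value as a machine integer.
function intDiv(a, b) {
  return (a / b) | 0;
}
console.log(intDiv(7, 2));   // 3
console.log(intDiv(-7, 2));  // -3 (truncates toward zero, unlike floor)
```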

------
revelation
This is missing a discussion of _debugging_. I don't think any of the
emscripten/whatever magic has that down yet.

~~~
zeckalpha
Source maps. Soon.

~~~
skybrian
Source maps are a partial solution. We also need a way to translate the data
(fields and data structures) back to its source representation.

~~~
zeckalpha
Isn't that part of source maps?

~~~
skybrian
Nope. It's just about mapping JavaScript code back to source files. We'll need
to come up with another standard for data.

------
thedufer
Years of effort have gone into making javascript VMs faster than they have any
right to be. Even an ideal bytecode would likely be slower than javascript for
years. It seems unlikely that many people would be willing to give up
performance now for theoretical future performance, and without a reasonable
amount of use, there's no good reason for browsers to implement the new
bytecode.

For better or for worse, we're stuck with javascript for the foreseeable
future.

~~~
ahomescu1
_Even an ideal bytecode would likely be slower than javascript for years_ \--
no, a statically typed bytecode should be significantly faster than
dynamically typed JavaScript (both in execution and parsing time), IMHO.

~~~
azakai
I agree that makes intuitive sense, but the article mentions numbers showing
the opposite is true in some cases.

------
33a
What about size? asm.js is fast, but parsing it is slow and it is super
bloated. A real _binary_ bytecode is obviously a better choice, but there
don't even seem to be any good options for this on the horizon. I would take
asm.js if we get it, but it is far from the ideal solution.

~~~
pcwalton
It is actually not as bad as you would think. Much better than LLVM, in fact.

<http://mozakai.blogspot.com/2011/11/code-size-when-compiling-to-javascript.html>

------
pixelcort
I thought this was going to be a comparison of asm.js and PNaCl, but the
answer is, JS itself?

What we really need are toolchains that cross-compile (the same codebase) to
both asm.js and PNaCl. What's nice is that asm.js also works on browsers that
support neither.

~~~
iso8859-1
It is possible to have the same codebase on Emscripten and NaCl if you use
SDL. Emscripten implements a subset of SDL, and SDL (the actual library) is
already available for NaCl:
<https://developers.google.com/native-client/community/porting/SDLgames>

------
kryptiskt
Oh, it was already submitted. I hate these country-specific Blogspot
addresses.
Deleted mine.

------
spo81rty
We already have bytecode options: the Flash, Silverlight and Java plug-ins. If
mobile browsers all supported Flash and Silverlight...

At this point our hope is improved Javascript features in the next release...

------
arianvanp
> Be a convenient compiler target: First of all, the long list of languages
> from before shows that many people have successfully targeted JavaScript.

A counterexample to your argument: many people deploy products in PHP as
well, though we've all decided PHP is something we need to move away from.

------
jabr
We have an existing, widely deployed VM that is portable and sandboxed, fully
described by open standards, and has several competing but fully compatible
implementations -- most of which are already very fast -- and people want to
throw it out and replace it because they don't like the file format...

------
MostAwesomeDude
I would like some of what the author is smoking. In addition, I would like to
see them explain exactly why they think that a full language with flexible
syntax qualifies as bytecode. There are so many useless assertions and
falsities in this post that I'm not sure where to start, and I'm not sure that
a piece-by-piece refutation is worthwhile.

~~~
wslh
I think sooner or later a more agnostic VM in the browser will come. HTML was
for hypertext; now it is (almost entirely) for general UIs.

The problem is how to make a standard if organizations like Mozilla are not
part of it. I think it is not so difficult: don't have the VM? Emulate it with
asm.js. Do you have the VM? Run your application at full speed in more
advanced browsers. When Google Chrome came out with a faster Javascript
engine, others needed to catch up.

~~~
jabr
The problem with this idea is that a new, "better" VM doesn't offer any
substantial benefits over using Javascript-as-a-VM. The limitations of JSaaVM
can be addressed more easily by simply improving on the existing
infrastructure. There's not a good reason to throw out what we have when
incremental improvement gets us to pretty much the same point more quickly and
with less pain.

~~~
wslh
Native Client is faster than current solutions.

