
WebAssembly, an executable format for the web - arnoooooo
https://blog.octo.com/en/webassembly-an-executable-format-for-the-web/
======
dchest

        function readUtf8String(memory, pointer) {
           let s = "";
           for (let i = pointer; memory[i]!==0; i++) {
              s += String.fromCharCode(memory[i]);
           }
           return s;
        }
    

This doesn't read UTF-8, this reads bytes and converts them to UTF-16 char
codes, so it works only with ASCII or some other 0-255 encoding. To read UTF-8
into a JavaScript string (which is UTF-16), bytes should be properly decoded
from UTF-8 (e.g. with TextDecoder).
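
A minimal sketch of the TextDecoder approach, assuming `memory` is a
`Uint8Array` view over the module's linear memory and the string is
NUL-terminated, as in the function quoted above. The sample bytes are made up
for illustration.

```javascript
// Decode a NUL-terminated UTF-8 string starting at `pointer`.
// Assumption: `memory` is a Uint8Array over the wasm linear memory.
function readUtf8String(memory, pointer) {
  // Find the terminating zero byte (or stop at the end of the view).
  let end = pointer;
  while (end < memory.length && memory[end] !== 0) {
    end++;
  }
  // TextDecoder turns the UTF-8 byte sequence into a JavaScript string.
  return new TextDecoder("utf-8").decode(memory.subarray(pointer, end));
}

// Illustrative data: UTF-8 bytes for a non-ASCII string, plus a trailing 0.
const encoded = new TextEncoder().encode("héllo ✓");
const memory = new Uint8Array(encoded.length + 1);
memory.set(encoded);

console.log(readUtf8String(memory, 0));
```

Scanning for the terminator first and decoding the whole slice in one call
avoids building the string one character at a time.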

~~~
arnoooooo
Thanks! I've renamed the function accordingly.

~~~
chrismorgan
ASCII only covers values 0–127; this function allows 128–255 as well, which
makes it ISO-8859-1 (latin1), the first 256 code points of Unicode being the
entirety of ISO-8859-1.
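
To make the difference concrete, here is a small sketch that runs the same
byte-per-character loop over two encodings of the same word. The helper name
`readByteString` and the sample bytes are hypothetical, chosen only for
illustration.

```javascript
// Same idea as the quoted function: one byte becomes one UTF-16 code unit.
// `readByteString` is a hypothetical name used only for this illustration.
function readByteString(bytes) {
  let s = "";
  for (const b of bytes) {
    s += String.fromCharCode(b);
  }
  return s;
}

const latin1Bytes = [0x63, 0x61, 0x66, 0xe9];     // "café" in ISO-8859-1
const utf8Bytes = [0x63, 0x61, 0x66, 0xc3, 0xa9]; // "café" in UTF-8

console.log(readByteString(latin1Bytes)); // café
console.log(readByteString(utf8Bytes));   // cafÃ©
```

The ISO-8859-1 bytes round-trip correctly because each byte value equals its
Unicode code point; the two-byte UTF-8 sequence for "é" comes out as two
separate characters.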

------
kodablah
I'd like to think one day that the non-web side of WASM could become a more
deterministic and cross-platform-predictable LLVM IR. Right now it is too
limited of course but as GC and threading specs come along, I can hope one day
it will reach parity. Granted it isn't as low level, but that has benefits for
compiler backend implementers (they can choose how they target their arch) and
compiler frontend implementers (they don't have to do different things per
platform). Basically, imagine if Java bytecode was developed less around OO
semantics, wasn't lumped in with its stdlib by the community, didn't have
encumbered JCK and other tests, etc, etc.

Edit: Changed "LLVM" to "LLVM IR" to clarify as requested.

~~~
mike_hearn
I'm not sure your criticisms of Java bytecode are all that reasonable to be
honest.

You don't have to use the Java standard library to target bytecode. Scala,
Ceylon, Clojure provide their own stdlibs. Indeed _not_ doing this is one of
Kotlin's competitive advantages.

Although the JVM wants to think in terms of objects, this is only really
mandated as a unit of linkage and garbage collection. You could equally
criticise C runtimes for being too oriented around procedural programming
semantics because they see the world in terms of top level functions and
global variables. You can write non-OOP programs in Java bytecode and Eta is
an example of using the Haskell Spineless Tagged G-Machine representation on
top of the JVM. If your runtime _doesn't_ have any concept of an object-like
thing, then it gets rather hard to do garbage collection and certainly any
kind of moving GC.

Finally, the JCK is encumbered to protect the Java trademark. We can argue
about whether that's really been so useful given Android, but the principle
makes sense - it's about ensuring that if someone claims they run Java
bytecode, they actually can run it. If WebAssembly isn't trademarked and
protected in the same way then it'll be a repeat of HTML5, where a claim of
support is essentially meaningless and everyone has to constantly consult
giant feature/bug tables to figure out what subset of the spec works on your
particular implementation.

Now it may be that in the end everyone is OK with the word WebAssembly being
more of a statement of intent than a precise claim, just like we muddle
through with the word HTML5. And in reality we all know that only Mozilla,
Microsoft, Google and maybe Apple will implement wasm or at least only these
companies will have implementations anyone cares about. They will probably
comply with whatever test suite is produced without needing legal/trademark
incentives. Maybe.

~~~
titzer
> Now it may be that in the end everyone is OK with the word WebAssembly being
> more of a statement of intent than a precise claim, just like we muddle
> through with the word HTML5. And in reality we all know that only Mozilla,
> Microsoft, Google and maybe Apple will implement wasm or at least only these
> companies will have implementations anyone cares about. They will probably
> comply with whatever test suite is produced without needing legal/trademark
> incentives. Maybe.

There are already projects to implement WASM in other contexts beside the web
--some even in production. The specification, which has been entirely
mechanized with an implementation of a provably-correct interpreter in
Isabelle, is designed to be implemented easily. We welcome other
implementations of the core execution spec of WASM and other embeddings. I can
only speak for myself as a co-founder, but I believe most people involved in
the W3C groups hope the scenario laid out in this comment does not happen in
reality.

~~~
mike_hearn
Yes, but all those things are true of other platforms as well. The JVM's type
system has been proven correct. Everyone who writes specs designs them to be
easily implemented (in their view) and everyone hopes the spec won't be forked
by vendors or end up with widely deployed but buggy implementations.

The worst case scenario for wasm or any spec author is that someone comes
along and either implements it very poorly, or adds vendor extensions that
effectively become de-facto standards. E.g. Chrome might add new opcodes or
APIs that rely on other Chrome-only features. And then people start claiming
they've got a wasm module, but it doesn't work properly or well enough on
other implementations.

That's what happened to the web, to Java and others, and is one reason why the
TCK is "encumbered", as the OP put it. It's also why Android requires you to
pass an encumbered test suite to be able to use the trademark/logo/get access
to the Play Store etc. It's not like one path is inherently superior to
another. Protecting a trademark by licensing the test suite and mandating it
be passed is about risk management.

~~~
titzer
> The JVM's type system has been proven correct.

I'm not sure which proof you are referring to, but the Java and Scala
languages both have unsafe type systems:

[https://2016.splashcon.org/event/splash-2016-oopsla-java-and...](https://2016.splashcon.org/event/splash-2016-oopsla-java-and-scala-s-type-systems-are-unsound-the-existential-crisis-of-null-pointers)

(Admittedly this is due to parametric types which are erased).

The JVM bytecode verification algorithm is over 100 pages in the spec.

> E.g. Chrome might add new opcodes or APIs that rely on other Chrome-only
> features.

We aren't doing that currently. We plan to continue participating in the W3C
processes which have worked well over the past 2+ years.

~~~
mike_hearn
I'm talking about the JVM type system, not Java's. They're not quite the same
due to generics erasure and a few other things.

 _We aren't doing that currently_

All browser makers have long histories of forking the platform with
proprietary extensions more or less whenever they feel like it. So what does
this assurance mean? Indeed that's now how HTML5 is meant to evolve! Nobody
even uses the W3C anymore do they? It's all just throwing whatever extensions
get built by browser makers into HTML5.

I'm sure WebAssembly will have its share of compatibility matrices and vendor-
prefix equivalents soon enough.

------
stromgo
Those of you who tried WebAssembly, what kind of performance are you getting
compared to native? In my application, execution time is about 200% of native,
which is quite bad compared to the few benchmarks I could find:
[https://hacks.mozilla.org/2016/10/webassembly-browser-previe...](https://hacks.mozilla.org/2016/10/webassembly-browser-preview/)
[https://blog.acolyer.org/2017/09/18/bringing-the-web-up-to-s...](https://blog.acolyer.org/2017/09/18/bringing-the-web-up-to-speed-with-webassembly/)

It's a head-scratcher as my code is nothing special and I don't see what I
could be doing wrong. Then it occurred to me that these benchmarks look
suspicious, with many scores below 100%. Are there fair benchmark results
somewhere, measured by someone not trying to promote WebAssembly? What
performance ratios are HN readers getting currently, and what performance
ratios should we expect in the long term?

~~~
vanderZwan
Do you have your code online somewhere? What kind of tasks are you trying out?
How much time is spent sending data back and forth between JS and WASM?

------
tux1968
Am I being paranoid to think this is going to be heavily used to introduce
proprietary and locked down protocols and content? How long is it until this
is used to render web content onto a canvas without using individual DOM
elements that can be stripped by ad blockers, etc.?

We're all focused on the potential performance rather than what this will
actually be used to deliver -- so maybe I'm just missing what prevents this
technology from subverting the open web.

~~~
klodolph
"The open web" is a platform where everyone can publish and access content
because the standards are open. WebAssembly makes the web more open because it
puts nails in the coffin of Flash, Unity Player, and the like.

Back in the day we had Flash-only web sites. They died off because it turns
out that it's easier to just throw HTML, CSS, and JS at people, and since
these were open standards, there was a fair bit of competition between
different vendors to make better tools for consumption, authoring, and
delivery.

If you want to reimplement that experience in WebAssembly, the ROI is going to
be horrible and if you render everything to a canvas your site won't be
indexable by Google (goodbye traffic) or workable with screen readers (hello
lawsuits).

~~~
Someone
I see that change as soon as DOM bindings for WebAssembly are available.

Apple/Google/Microsoft would be stupid if they weren’t already working on a
Swift/Kotlin/C# to WebAssembly compiler, as each of them would give them for
their preferred language what JavaScript now has: the ability for developers
to work in a single language, and move code between browser and backend at
will.

Long-term, JavaScript is dead.

~~~
icebraining
That already happens, since you can already write compilers that target JS:
Google wrote GWT and Dart. Microsoft seems to have gone the other way, pulling
JS into native apps.

~~~
Someone
That is true, but it still is on JavaScript’s turf. With WebAssembly the
JavaScript will be as ‘native’ as any other language. I think the
psychological difference matters.

------
hughes
I'm admittedly inexperienced with WebAssembly, but I'm a little unclear on
going beyond the basic "hello world" implementations.

Is the idea to write an entire application in WebAssembly, or is it to break
out parts of the application that benefit from performance improvements and
potentially cross-platform code bases?

Also, how do you do things like make network requests or decode JSON in
webassembly? Do you need to implement these things yourself, or pass things
back to the browser to handle them, or find C libraries that do that for you?

~~~
kodablah
Case by case situation. WASM is nothing but a spec and impl for CPU tasks. If
you want to do something like networking, you call out to JS (today). If you
want to decode JSON, you could call out to JS, or populate some part of the
WASM-accessible mem (a byte array) and then ask WASM to operate on it. It
could be a C library compiled to WASM or hand-written or in some other
compile-to-WASM lang.
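
A rough sketch of both patterns, with everything named here being an
assumption for illustration: a tiny hand-assembled module imports a JS
function `env.log` (standing in for any host capability such as networking)
and a memory `env.mem`, and exports `run(ptr)`, which reads one byte from the
shared memory and hands it to the imported function.

```javascript
// Hand-assembled wasm binary (hypothetical module, written for this sketch):
//   (import "env" "log" (func (param i32)))
//   (import "env" "mem" (memory 1))
//   (func (export "run") (param i32)
//     local.get 0  i32.load8_u  call 0)
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x05, 0x01, 0x60, 0x01, 0x7f, 0x00,       // type: (i32) -> ()
  0x02, 0x16, 0x02,                               // import section, 2 entries
  0x03, 0x65, 0x6e, 0x76, 0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00, // env.log: func
  0x03, 0x65, 0x6e, 0x76, 0x03, 0x6d, 0x65, 0x6d, 0x02, 0x00, 0x01, // env.mem
  0x03, 0x02, 0x01, 0x00,                         // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run"
  0x0a, 0x0b, 0x01, 0x09, 0x00,                   // code section, one body
  0x20, 0x00, 0x2d, 0x00, 0x00, 0x10, 0x00, 0x0b, // get, load8_u, call, end
]);

// JS owns the memory and the "outside world"; wasm only sees what it imports.
const memory = new WebAssembly.Memory({ initial: 1 });
const received = [];
const imports = {
  env: { log: (value) => received.push(value), mem: memory },
};

const instance = new WebAssembly.Instance(
  new WebAssembly.Module(wasmBytes),
  imports
);

// Populate the wasm-accessible byte array from JS, then ask wasm to use it.
new Uint8Array(memory.buffer)[16] = 200;
instance.exports.run(16);

console.log(received); // [ 200 ]
```

In practice the binary would come from a compiler rather than a byte array,
and large modules are usually loaded with the asynchronous
`WebAssembly.instantiate` instead of the synchronous constructors used here.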

------
skocznymroczny
WebAssembly might be a Javascript killer for high-performance uses. Javascript
for uses such as game development is dragged down by the garbage collector,
forcing you to orient entire code around minimizing allocations. Without
garbage collection, you have to manage memory yourself, but allocations aren't
as scary as they are with GC languages and they are more explicit.
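
As a sketch of what "minimizing allocations" tends to look like in
JavaScript, the hypothetical particle update below allocates its typed arrays
once up front and mutates them in place, so the per-frame loop creates no new
objects. The names and sizes are made up for illustration.

```javascript
// Hypothetical particle state, allocated once: [x0, y0, x1, y1, ...].
const PARTICLE_COUNT = 4;
const positions = new Float32Array(PARTICLE_COUNT * 2);
const velocities = new Float32Array(PARTICLE_COUNT * 2).fill(1);

// Per-frame update: writes into the existing arrays, allocating nothing.
function step(dt) {
  for (let i = 0; i < positions.length; i++) {
    positions[i] += velocities[i] * dt;
  }
}

step(0.5);
step(0.5);
console.log(positions[0]); // 1
```

The alternative of returning a fresh `{x, y}` object per particle per frame
is what produces the garbage that the collector later has to clean up.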

~~~
vardump
Garbage collection contributes pretty little to Javascript's performance
issues. Unless you deal with some allocation happy data structures, such as
linked lists.

~~~
austincheney
GC is the primary performance killer for persistent CPU intense applications,
which is why there aren't many such applications written in JavaScript. The
thread is locked when GC happens.

In most other JavaScript applications GC is hard to notice, because it
typically runs once the application has finished executing.

~~~
vardump
CPU intensive applications don't usually allocate that much.

Excessive allocation also makes other languages, such as C++, very slow.

------
TekMol
Is WebAssembly related to asm.js?

I don't like to compile stuff. But improving performance by writing certain
functions directly in assembler or in a subset of js would be awesome.
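
For reference, asm.js is such a subset of JS, and WebAssembly is generally
described as its successor. A minimal sketch of an asm.js-style module
follows; the module name `FastMath` is hypothetical, and the code runs as
ordinary JavaScript even in engines that do not specially optimize asm.js.

```javascript
// asm.js-style module: the "use asm" directive plus `| 0` coercions act as
// type annotations declaring that a, b and the result are 32-bit integers.
// `FastMath` is a hypothetical name used only for this illustration.
function FastMath(stdlib, foreign, heap) {
  "use asm";
  function add(a, b) {
    a = a | 0;
    b = b | 0;
    return (a + b) | 0;
  }
  return { add: add };
}

const fast = FastMath(globalThis, {}, new ArrayBuffer(0x10000));
console.log(fast.add(2, 3)); // 5
```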

~~~
brian-armstrong
Compiling stuff is awesome. I never want to discover my bugs at runtime.

~~~
goatlover
Even when prototyping, exploring or doing one-offs?

I'd rather have the flexibility in those situations.

~~~
brian-armstrong
Absolutely, yes. It's preferable to trying to debug inexplicable glitches in
the prototype that the compiler could have caught. And with incremental
builds, compilation should be pretty lightweight anyway.

------
thephyber
Can someone explain to me why we would expect WebAssembly to be much different
than Java Applets?

There are inevitably defects in the design and/or implementation which will
open up browsers to gaping security vulnerabilities until they are found and
fixed.

~~~
nipplesurvey
Operating from the premise that most applet exploits worked by elevating a
sandboxed applet out of its sandbox (no idea if that's accurate): perhaps
because Java applets had an un-sandboxed mode built in, whereas wasm is always
sandboxed and has no equivalent to the applet's un-sandboxed mode.

~~~
mike_hearn
It's a bit tricky. I don't know if you can categorically say wasm is or will
be more secure than applets. Both of them rely on a runtime that is more
privileged than the code being executed, of course. Both of them can have
runtime bugs that allow privilege escalation of various forms. There was a
recent perma-root bug in ChromeOS and one of the exploits along the chain was
based on exploiting a WebAssembly runtime bug.

If you look at Java's security track record in recent years, there's been just
a couple of zero days in the past five years. That's probably a mix of
genuinely better security and less attention due to being kicked out of web
browsers. But if you go look at the security histories of other sandboxes like
browsers or kernels, it starts to look pretty good. A new paper that just came
out introduced a new Linux kernel fuzzer and discovered, I think, over 30 zero
day exploits across several different Android phones. So ordinary UNIX style
process isolation is pretty useless if you can reach device drivers from
inside the processes. Even the quite aggressive and resource intensive browser
sandboxes that browsers use routinely have escapes, often because they're
always adding large new attack surfaces like WebAssembly, WebGL etc (all
written in C++).

~~~
nipplesurvey
Very interesting. Did you read about the Linux kernel fuzzer and the ChromeOS
root here or elsewhere? Regardless, if you can provide a link, both sound like
very interesting topics.

------
LyndsySimon
This is the first time I've seen disassembled WASM. It's actually quite
readable!

------
danschumann
Is there a way to compile javascript in web-assembly, to protect the source
code?

I.e., you'd have to convert the javascript to C, and then compile the C into
web-assembly, right? Does V8 have a method to convert JS to C?

~~~
kuschku
> Is there a way to compile javascript in web-assembly, to protect the source
> code?

This kind of obfuscation is harmful to the web, don’t do it. Also, any compsci
student with a summer of free time can easily break it, so it’s useless, too.
You’re just wasting your own time and the bandwidth and performance of your
users.

In fact, if you tell me what site you want to obfuscate your code on with
WASM, I (an average compsci student) will spend my next summer reversing it,
deobfuscating it, and publishing the results publicly (just to demonstrate to you
how useless your obfuscation is).

~~~
danschumann
Is photoshop harmful to the web? Should they give away their source code?
Their source code is what's valuable in their company. So, art programs kind
of need source protection. I won't be making a web-version until this is the
case. If you can crack NW.js source protection, let me know
[http://docs.nwjs.io/en/latest/For%20Users/Advanced/Protect%2...](http://docs.nwjs.io/en/latest/For%20Users/Advanced/Protect%20JavaScript%20Source%20Code/)

~~~
kuschku
> Is photoshop harmful to the web?

Photoshop isn't on the web, but I oppose closed source in principle.

> Should they give away their source code?

Ideally, yes. Whether you do is still your decision, but you still should be
aware that if you introduce truly new concepts, someone will clean-room
reverse them, and publish them.

> So, art programs kind of need source protection.

The Krita project begs to differ.

> I won't be making a web-version until this is the case. If you can crack
> NW.js source protection, let me know

Then you shouldn't make any build where the business code runs on the client.
Even native binaries can be reversed, their functionality analyzed, and
copied. Which is how LibreOffice/OpenOffice originally discovered how to parse
the Microsoft Office formats.

~~~
danschumann
It's possible, but not very ideal to have all the operations done on the
server, and just serve interface stuff, maybe even as a streaming video, to
the client.

When people paint, they want it responsive, no stream lag, no server hiccups.
I must allow people to install the software on their computer. I would still
like to protect the work I've done, because I'm just one guy making software,
and I don't have marketing money. What would stop big companies from taking my
source and spending more on advertising? I need source protection.

~~~
kuschku
> What would stop big companies from taking my source and spending more on
> advertising? I need source protection.

If your threat model is a company with billions of dollars in revenue, then
source protection is useless.

NW.js protection can be cracked, too, requiring maybe one or two man-months
more. For a student reversing it, this is annoying, but not a problem.

For GoogleAppleAdobeZon? They don't even notice the cost difference. You can be
Snapchat, have unique features, and have it all obfuscated — and Facebook can
still copy it all in weeks.

The question isn't "will Adobe be able to copy my app" — the answer to that is
always yes.

~~~
danschumann
Sure it's possible, but I'm trying to make it as hard as possible. Currently I
think NW.js source protection is the hardest.

I think there are times when open source isn't the best. I never would have
spent so much time making art software if I didn't think I could make it
reasonably hard to crack. Eh, maybe I would have anyway, but I might have just
made another mindless web app that wasn't difficult to code.

I think there are definitely cases where closed source works, especially when
certain algorithms are easy to read but hard to contrive.

~~~
kuschku
Well, NW.js is the strongest on the web.

For native binaries, such as games, currently the strongest solution is
Denuvo, from a team that formerly implemented copy protection for Sony.
Decades of engineering went into it, at the risk of significantly slowing down
the game, using several layers of obfuscation (including VMProtect), all
wrapped around the existing, older obfuscation schemes, for a native binary.

The first version of it took a few hobbyists 6 months to break; by now games
protected with it usually get broken within a week or less by hobbyists.

That’s hobbyists.

If, as you truly say, your threat model is Adobe, or a similar corporation,
your obfuscation is basically useless. If determined individuals can break
some of the most advanced obfuscation schemes in days, how long do you think
will it take Adobe to break yours, if they actually see you as a competitor?

> I think there are definitely cases where closed source works, especially
> when certain algorithms are easy to read but hard to contrive.

They work short-term, but as soon as someone extracts the algorithm once,
everyone has it. Especially considering that in most places (including the EU)
algorithms are protected by neither copyright nor patents.

~~~
Fifer82
Denuvo... I can't remember the game (it was shite) but it got cracked 6 hours
before release. You have to laugh especially because it actually did create
low FPS so all the actual customers were pissed off.

Same story since the Amiga Days

------
jwilk
"#include " was meant to be "#include <webassembly.h>", but somebody forgot to
escape <> in HTML source.

~~~
arnoooooo
I thought I fixed that one :/ It's fixed for good this time. Thanks!

------
Thaxll
ActiveX is back. Can't wait to see all the exploits escaping the browser
sandbox.

~~~
cdancette
Isn't it the same with JavaScript?

~~~
chungy
Yes, and WebAssembly runs on top of the existing JavaScript VM engines.

~~~
Thaxll
I don't think it runs on the same V8 (Chrome) engine as the one that runs JS;
I remember two engines running.

~~~
DiThi
The wasm binary itself may be using a separate engine, but heavily sandboxed:
it can only communicate with JS, and nothing else. The (newly added) attack
surface is minimal when compared with Flash, Java, etc (each with its own
APIs). Almost any flaw must be in the JS side, and that's already battle-
proven for years.

ActiveX didn't even have a sandbox, so not even related.

~~~
chungy
The attack surface with classic NPAPI plugins (Flash, Java, etc) is
practically unlimited. It was up to the likes of Flash and Java to secure
themselves, not the browser. Various random plugins from other sources (eg,
the QuakeLive and Unity plugins) may as well have cared about security _even
less_ than the already abysmal track record of Flash and Java.

