WASM as a Platform for Abstraction (michaelfbryan.com)
86 points by MichaelFBryan on Dec 16, 2019 | 84 comments



WASM advocates keep rediscovering the benefits of bytecodes as a portable execution format, while presenting them as something great made possible only by WASM.

Reading a bit of mainframe history would do some good it seems.


This is almost word-for-word a Dilbert comic about the UNIX greybeard. It's also not useful, because your only objection is that "credit" isn't being given to mainframes, and this serves only as an opportunity to show that you know history that others don't, and would prefer to be dismissive rather than educational about it.

> we’ve encountered the dilemma where you want to make it easy for users to write their own application logic using the system but at the same time want to keep that logic decoupled from the implementation details of whatever platform the application is running on

A common problem with a whole range of solutions. A decade or two ago the "obvious" choice would have been Java bytecode instead. Even the little "BASIC Stamp" microcontroller system matches this description.


So what? A lot of people are also complaining that it resembles the JVM. Do you think that historical mainframe bytecodes would have been a good match for web applications?

I don't know; I know nothing about mainframes. I can't tell whether your criticism is that they are repeating mistakes that were already solved half a century ago (which is a good criticism, if true) or that they are not giving proper recognition.

The new shiny thing about WASM is that it fits into the constraints of the web platform. I suspect that even the best mainframe bytecodes had at least slightly different constraints.

I would really like a discussion and a comparison of them, seriously; I would read any such article I could find, as the topic is interesting. But comments like this feel a lot like a hipster hating on the mainstream.


Even when talking about the Web alone, WASM isn't innovative; PNaCL did it first.

Chrome's PNaCL SDK came with C, C++ and OCaml support, with an open source version available, which Mozilla refused to use, coming up with asm.js instead.


> Reading a bit of mainframe history would do some good it seems.

I wonder if anyone has a list of awesome mainframe features so that we know what modern computing is going to "invent" next? :P

(See also: hot-swappable parts, virtualisation, containers, etc. I never got to work with mainframes myself; it's just that each time I dig into some new "industry game-changer" tech, I learn that mainframes had it in the 60's.)


Someone was circulating the IBM 360 "principles of operation" recently, which sounds like a good way to start.

The trick is identifying which parts were abandoned by mistake, and which parts were abandoned because they were genuinely bad. Nobody wants 21st century JCL. https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.1.0/...


We already have a 21st century JCL: Kubernetes.


GNOSIS/KeyKOS: Capability security, Resource metering, Persistence mechanisms

http://www.cap-lore.com/Agorics/Library/KeyKos/


Yeah, but WASM is an open standard that big names in the industry agreed on, including FOSS advocates, with a design that accommodates many popular languages and the most popular platform in the world: the web.

So it actually has a chance of not being limited to one ecosystem or one company.

In fact, the simple fact that it's going to have a monopoly in the browser pretty much guarantees a broad attempt at adoption.

The JVM and mainframes could have been it, but they never were.

Let's see if we can make it work with a different name this time.


Ancient bytecode formats were mostly about supporting multiple languages, so here too WASM is hardly innovative.

And some of them also had multiple vendor offerings.

Chrome already has preview features available that other browsers might adopt, or not.


Sure it's been done before. The point is that WASM is a good, efficient, sandboxed, modern example of it that's usable now with a large software ecosystem around it.


What does modern mean in this context? I see this word being used as a positive adjective a lot recently to describe rewrites etc., but it is unclear to me what it actually implies.


From an outside perspective, modern here means that it fits nicely with the mentality and toolchains in vogue today.


The term is overused, yes, but one way it can be understood in wasm's context is that there is now a generic platform/target and we have consensus on it. It's hyped, the inertia is there, so all of it is going to happen. And in this particular case it's a good thing. :) Probably soon we can stop rewriting everything in javascript.


By that I mainly meant that it runs on today's systems and maintained languages and toolchains today can target it. Sometimes in discussions about WASM, I see people bring up ancient bytecode formats in a way that makes them sound like usable alternatives.


It's more a political thing. Sun, Microsoft, and other monopolies couldn't do it despite being powerful. The difference is that WASM is more neutral than other portable execution formats.

And one of the most successful portable execution formats nowadays, I believe, is JavaScript. WASM is just a sequel to JavaScript.


So we're getting mainframe-like tech into the mainstream? Isn't it good?


If this were a conscious process, that is, taking solutions from mainframes, analyzing them, deciding to use some of them, and also drawing on the knowledge, pitfalls, problems, and experiences of the past and incorporating them into the new tech, then this would be a good thing.

However, if this is done by simply reinventing the wheel, there is a high chance of hitting the same walls and repeating the same mistakes.

This is kind of similar to a "let's rewrite this code from scratch in a shiny new lang/lib/framework" with the expectation that the new code will be bug-free from the start. It's not always bad; it is sometimes necessary, or it may turn out to be more effective than trying to fix bloated code. But very often the new code has exactly the same problems as the previous code, and effectively the developers reinvent the wheel multiple times in the process of rewriting.


I am not sure if you're following the GitHub spec repos, but at least check them out. I really don't think these people are reinventing the wheel badly. The last 3 years have been full of discussion about every single detail, including the behavior of various prior art, host platforms, hosted platforms, etc. These discussions are public; why don't you join in if you can help? Even a short list of "look here" items would be a great help.


Not sure why there are so many replies about this being dismissive.

It is just pointing out that people like to reinvent technology every once in a while with 95% of the same thing. And it is not apparent whether those doing the "reinvention" knew what constraints, trade-offs, or limitations the previous technology had.

Sometimes this sort of full circle makes me wonder if software has really moved forward at all.


This is like the reaction people had to Dropbox on this very site [1]. Even if wasm were literally the same bytecode copied from an old mainframe, the possibilities that this new use allows are enough of a novelty and an invention in and of themselves.

I would find a comparison of wasm with older bytecodes very interesting, as for sure none of them are perfect. But this kind of reply is clearly dismissive. It contains no criticism and offers no insight into why some failed and others succeeded.

I essentially see it as lamenting that another similar technology was/is treated unfairly. Which is a fine comment, since many technologies meet an unjust or undeserved demise. It is still irrelevant, as it essentially boils down to "how dare you succeed where others have failed".

This is even worse than just saying that it is bound to fail miserably the same way Java applets did.

There are for sure negative sides to wasm; the fact that many parts are similar to past ideas is simply inconsequential on the merits.

[1] https://news.ycombinator.com/item?id=8863


WASM is not a startup; it's a very niche tech, and pushing it to other niches as a "platform" needs a lot of investment and buy-in from professionals and management working in those niches. Not that the idea itself of an intermediate representation is bad; it's just that WASM is not necessarily good, and the odds are definitely against it.

The HN top comment on Dropbox wasn't really wrong either. Triviality proved to be a valid point. Plenty of companies made similar software once they saw there was a need for it. I mean, who still uses Dropbox today?


I surely am missing something, because I have no idea why you think wasm is niche. It is niche today the same way JavaScript-rich interfaces were niche during the reign of PHP.

I don't know how to answer the rest of your comment... I don't think we have enough common ground to understand each other.


There are maybe a few hundred people on earth who are in a position to embed wasm into something; how is that not niche?


Here, embedding can mean just plugins. This very article is about a framework for robotics.

Another extremely plausible option is simply sandboxing mods for games. There are even projects for allowing userspace modules to run safely inside the kernel, similarly to eBPF.

This last one in particular is something entirely new.

Without even considering the obvious application of allowing photo editing in web interfaces with reasonable performance.

Or, on the other side of legality, many people will try to use the new performance to run better cryptomining botnets.

Just by considering plugins (with dynamic loading and safety) and number crunching in-browser, I would say that "a few hundred" becomes a significant understatement.


I'm not talking about potential applications of things that may or may not embed wasm and provide high-level interfaces, libraries, compilers, tools. It's irrelevant whether high-level stuff uses wasm underneath; application developers won't be touching it directly. I'm talking about actually doing all that work to enable application development. These are the people wasm targets, and there are very few such people on earth. It's very niche tech.

I was considering wasm for myself too: I read the spec and articles, played with their OCaml implementation, parsed a bit of bytecode, but concluded that it's not a good tech, too browser-specific to use elsewhere and not worth wasting time on.


If I understand correctly, you are talking about stuff like a debugger for an opaque wasm binary, the kind of tooling a system administrator would want for managing various wasm programs running on their servers.

I will assume so :)

There was a nice article detailing how, for many applications, wasm today is (in certain respects) less safe than x86, specifically due to the high quality of the tooling available for x86 and the fact that debugging/inspection support for wasm is still very limited.

Other than that, I would say that it is specific to the constraints you have in a browser. Two of these are the requirement to run on minimal implementations (as new optional features might be missing, and the code might be running in a variety of environments) and the ability to safely run arbitrary untrusted code.

If you do not care about these then wasm does not offer much to you.

On the other hand it is perfect as a compilation target for plugins.

> application developers won't be touching it directly.

On the web it is meant to replace asm.js, it is not meant to be written by hand (unless you are a compiler developer).

The whole point of the article in this case is that it is a perfect fit for plugins: a specific and limited interface that allows running untrusted and opaque binaries.

For this specific usage, only high-level languages were a viable choice up to now (two common choices were js and lua). wasm is comparatively closer to C than javascript/lua, but it is still safe to embed arbitrarily without using OS-level mechanisms.

It is a very specific use case, but also a relatively common one in applications.
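
For a feel of what such a limited interface can look like, here is a minimal sketch, assuming the wasmtime and anyhow crates (the "host"/"log" names and the inline WAT module are made up for illustration, and API details vary between wasmtime versions):

    use wasmtime::{Engine, Linker, Module, Store};

    fn main() -> anyhow::Result<()> {
        let engine = Engine::default();
        // An untrusted "plugin", written inline as WAT for brevity.
        // It can only call what the host explicitly exposes below.
        let module = Module::new(
            &engine,
            r#"(module
                 (import "host" "log" (func $log (param i32)))
                 (func (export "run") (call $log (i32.const 42))))"#,
        )?;

        let mut store = Store::new(&engine, ());
        let mut linker = Linker::new(&engine);
        // The entire surface the plugin gets: one logging function.
        linker.func_wrap("host", "log", |value: i32| println!("plugin says: {}", value))?;

        let instance = linker.instantiate(&mut store, &module)?;
        let run: wasmtime::TypedFunc<(), ()> = instance.get_typed_func(&mut store, "run")?;
        run.call(&mut store, ())?;
        Ok(())
    }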

One of the big sponsors (Fastly) is investing a lot in it due to its usefulness for lightweight sandboxing (for this reason they are also developing a debugger for wasm [1]).

On the other hand it is essentially useless if your application is something like a unix utility.

[1] I am not sure it is in this episode, but here the CTO of Fastly explains why it is useful for their use case: https://softwareengineeringdaily.com/2019/09/25/webassembly-...


> And it is not apparent whether those doing the "reinvention" knew what the constrain or trade offs or limitation the previous technology had.

I'm quite sure that very often the "reinvention" is literally a reinvention, that is, not based on the previous invention but an independent solution that just happens to be similar to something created earlier (without the reinventors' knowledge of the existing previous invention). With an abstraction level high enough, many concrete problems become one abstract problem with a few abstract solutions, and sometimes only very few "obvious"[0] solutions. This leads to a situation where many independent persons/teams come up with a concrete solution to their concrete problem that coincidentally happens to be very similar to other solutions. If one of those solutions was created in the past, and others were rediscovered later, we then retrospectively call those later solutions "reinventions" regardless of them being literal or just figurative reinventions.[1]

[0] in this context "obvious" defined as "most likely to be independently found/created by any sane person"

[1] "literal reinvention" being a solution that was consciously based on some previously existing invention and a "figurative" one when it just happens to be similar to something created in the past but the inventor did not (consciously) base their work on something already existing


While I do agree WASM makes a lot of sense for the Web, I personally have doubts about treating WASM as a general abstraction for native code as used in the post. For this case, it might suit the job better to have a bytecode that more closely resembles the underlying machine architecture, rather than a still highly abstracted model like WASM. Please don't get me wrong, I do agree WASM is already one step ahead of, say, JavaScript, but we can do better than that.

The problem with WASM here is that it really is a bloated model, like the JVM in its early days: a huge amount of work is needed to bring it close to native speed, which contradicts the original slogan. What's more, people are still planning to add tons of new features to it: https://webassembly.org/docs/future-features/. Before you tell me those are opt-in features, the question I want to raise is: for an abstraction of a general platform, you would definitely want a widely accepted standard so people know what features to expect; one example is that people know SSE will be available for 64-bit x86 code.

With all those opt-in features, I doubt we can have a proper layer that adapts well to different implementations with different supported features. We might end up in a situation like Rust's, where you can claim a secondary compiler could exist, but in practice everyone is using the same compiler/implementation.


And yet, it will probably be used for that and become popular.

Tech doesn't need to be perfect, the best, or even very good to win. It needs to have a killer feature and a low cost of adoption.

Wasm seems on the right track for that.


> need to have a killer feature and a low cost of adoption

This is a powerful statement actually. Could be applied to any tech startup product.


I totally agree that it's not always the best tech that wins; that's why I'm pointing it out. I really wish that 5 years from now we can rely on something that makes sense to be there, not just something that had a low cost of adoption.


It sounds like you're bringing up a register-based vs stack-based VM argument, and claiming that register-based VMs have better performance because their model is closer to the hardware.

My understanding is that this intuition is usually untrue, because a JIT benefits from the stack-based code preserving code flow and thus allowing more efficient code generation.


No, I'm not talking about register-based vs stack-based VMs; that's a totally different topic. I'm just saying WASM is still quite distant from real hardware, making it a non-trivial task to run the code performantly. In fact, if you look at asm.js, which was the original inspiration for WASM, it is a much closer model to real hardware.

And of course a JIT can make WASM fast, but if you look around, building a performant WASM JIT still remains terribly hard; some implementations even need LLVM to perform optimizations. I'd say if this is the case, we must have chosen the wrong model.


> some implementations even need LLVM to perform optimizations. I'd say if this is the case, we must have chosen the wrong model.

Why? Optimizing machine-independent code for a particular machine is part of the "core business" of LLVM, up to the point where a sufficiently capable bytecode/optimizer becomes comparable to LLVM.

OTOH, if the main argument here is the size/speed/other weight of LLVM, then of course the host machine that wants to run WASM only needs a tiny subset of LLVM: no frontend, single backend, only a subset of optimization passes... There is also a tradeoff in leaving out optimization passes that improve the code a bit but are too heavy for the host machine.


There's nothing wrong with LLVM itself; my point is we could have picked a lower-level model which doesn't need a complicated setup like LLVM. Or one where you can directly ship the optimized, compiled output of LLVM; that would be a much better world.


Doesn't ARM code emulated on x86 (and vice versa) perform even worse than WebAssembly? Isn't that essentially what you would get with a lower-level "optimized compiled result of LLVM"?


Does the fact that ARM is a bad choice disprove all other choices besides WASM? I'm not sure this is a good argument here. There are more lower-level bytecodes out there than just ARM.


I imagine it's more difficult to translate efficiently between two different low-level instruction sets (such as ARM, MIPS, x86, PowerPC, etc.) than to translate something slightly higher-level to the various low-level target instruction sets. Emulating ARM on x86 is usually slow (see the Android emulator) as is the reverse (see Windows 10 on ARM) and PowerPC on x86 (Apple's Rosetta) didn't seem particularly fast either.

Do you have an example in mind of a lower-level instruction set that can be efficiently translated to different real-world ISAs?


I believe you are looking at wasm with different priorities than intended. The two fundamental properties are that it must be fully portable and fully secure by default (as in, any insecurity needs to be explicitly and statically declared in the bytecode).

Performance comes only after those two. LLVM, as far as I know, has a completely different order of priorities.


Personally I don't see why we cannot get all three.


What major implementations are using LLVM? Firefox is using Cranelift, Chrome is using V8; neither of these should be using LLVM, AFAIK, or am I wrong?


wasmer [1] has an LLVM backend, and WAVM [2] uses LLVM as the backend. I could be wrong, but last time I checked, Cranelift was only meant to be the next-generation WASM engine used in Firefox; it is not yet in production.

And actually the argument is: all of V8, Firefox/Cranelift, and the LLVM backend used in wasmer require non-trivial work to make WASM fast, which shouldn't be needed given a different model.

[1] https://github.com/wasmerio/wasmer/tree/master/lib/llvm-back... [2] https://github.com/WAVM/WAVM


I highly doubt that there is another model that would not require non-trivial work to be fast, while also being reasonably portable to different architectures.

Sure, we could be faster by just sending x86 machine code, but that isn't really the point.


LLVM is more usable for cloud vendors and the like who use wasm outside of the browser. So we do have 3 quality implementations already.


Why is it hard? Isn't wasm designed so you can statically and quickly compile pieces of it or the whole thing to native code, rather than needing to do all the tricks dynamic language runtimes do?


That is their very nice slogan, while in reality WASM still has quite a way to go to compete with native code.

Some of the nonsense I see these days: when code speed is measured, people compare it with JS rather than native code; when portability is talked about, the comparison is then made against native code, not JS.


> The problem with WASM here, is that it really is a bloated model like JVM in its early days,

Isn't WASM (as of its MVP) a quite simple VM model compared to other VMs?

I agree with your concern about the new features though. It might introduce another segmentation hell.


> bytecode that resembles more of underlying machine architecture

In what way? This discussion lacks specifics. Doesn't this also risk tying you to a specific machine, which is the opposite of the intent?


That is not an issue with bytecode formats in general, given that on some platforms only the kernel does the final compilation to machine code, and they have been a common executable format since the early 60's.

However I do agree with the complaint about the "WASM everywhere" fashion.


Besides, doesn't LLVM already have an IR which serves as one such higher-level abstraction?


LLVM IR is unstable, way too big in scope, and not as machine-independent as people think. There have been several efforts to use it as a target-independent high-level bytecode, and they either are very platform-specific single-vendor affairs (Apple) or were retired in favour of a simpler new language that doesn't have its problems (SPIR became SPIR-V, PNaCl became WebAssembly).


No, LLVM IR is machine specific. Any "native" language is, since the ABI of a struct will depend, e.g., on the size of pointers for that platform.

E.g., consider in C: int foo[sizeof(void*)];
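
A rough Rust analogue of the same point, a minimal sketch showing how the target's pointer width gets baked into the compiled artifact (and into any IR emitted along the way):

    // The buffer length is a compile-time constant derived from the pointer
    // size, so the same source yields [u8; 8] on a 64-bit target and [u8; 4]
    // on a 32-bit one; the choice is frozen into the emitted IR.
    const PTR_SIZE: usize = std::mem::size_of::<*const u8>();

    fn main() {
        let foo = [0u8; PTR_SIZE];
        println!("pointer-sized buffer: {} bytes", foo.len());
    }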


Open-source LLVM IR is machine specific; it doesn't have to be, as proven by watchOS bitcode or PNaCL.


Both of the examples you gave were very carefully architected to make sure that was the case.


Which doesn't prevent someone from contributing back that kind of variant.


> it doesn't have to be, as proven by watchOS bitcode, or PNaCL.

Both of those have fixed 32-bit pointer sizes and are little-endian. When you compile for watchOS bitcode or PNaCL, you just target a single virtual machine and "system" ABI. LLVM IR or any related technique won't ever allow you to produce a 32/64-bit or ARM/x86 app that is able to leverage the whole feature set of the platform from the same bitcode.


Yet watchOS migrated from 32-bit to 64-bit.

It is possible; the LLVM project just needs to actually want to support such use cases in a portable way.


> Yet watchOS migrated from 32 bit to 64 bit.

That is what they said in the press release, but in practice they migrated from "classical" 32-bit ARM to ILP32 (akin to the x32 ABI on Linux), so the size of pointers and so on does not change from 32-bit. You get more registers and stuff like that, which is nice, but that is not moving to 64-bit, just having nicer 32-bit execution on 64-bit CPUs. If you want proper aarch64 support on watchOS you have to recompile.


Fair enough. Still, nothing prevents LLVM bitcode from being fully CPU-agnostic, if the LLVM folks were willing to keep such a variant around.


Google created pretty much this, called it PNaCl and shipped it in Chrome. It now has been retired in favor of WASM.



Because Mozilla went political and came up with asm.js as a counter-technology.


Or as some others (e.g. Google itself) would say:

> Because Mozilla went political and came up with asm.js as a better technology.


I am pretty sure that asm.js would not have happened if Chrome already had the market share it enjoys nowadays.

WASM is still catching up to PNaCL in performance, so hardly better.


I'm pretty sure it is not stable and not designed for such a use case.


Apple kind of disagrees with watchOS bitcode.


"not designed for that use case" != "isn't used for that use case in practice"

Apple has tight control over the bitcode version and the target platforms supported by Xcode. That's not the same thing as accepting arbitrary LLVM bitcode files.


Nothing prevents the LLVM project from adopting such a variant, other than unwillingness to do so.


The author has rediscovered the need for software fault isolation (SFI). Bytecodes or IRs like WASM can also provide SFI but are overkill because they provide more than just SFI.

If I were him, I'd have used this as an excuse to play with NativeClient.

--

The original paper on SFI was by Robert Wahbe and colleagues: https://cs155.stanford.edu/papers/sfi.pdf.

Google's NativeClient is a modern take on SFI for x86: https://static.googleusercontent.com/media/research.google.c....


Unfortunately, NativeClient has been abandoned.


Does SFI also cover sandboxing?


Yo dawg we herd you like abstractions so we put a VM in a container-runtime in a hypervisor in your CPU so you can use JIT compilation while you use other JIT compilation


The requirements

1. People need to be able to upload new code while the system is still running

2. This application will be interacting with the real world (think robots and automation), and we really don't want a crash in user-provided code to make the entire system stop responding

are suspiciously close to Erlang, except for the user-provided part.


Lots of wasm questions:

In what sense can a wasm program "crash?" What sort of backtraces are available in that event? How's the debugging story?

Is there a fast wasm runtime available for non-JIT platforms? Is there big-endian support (still hanging on)?


Wasm programs can only crash by triggering a "trap", which has the well-defined semantics of aborting the entire (wasm) function stack at that point. It depends on the embedding host how much backtrace or debugging support you get for this.

I'm not sure exactly what you mean by non-JIT platforms. As far as I know, most wasm hosts that generate native code just compile the entire wasm module at once, so it's less like a JIT runtime and more like a regular compiler.

If you mean not compiling to native code at all, then you just have the performance of a plain old stack-machine bytecode interpreter. I'm not sure how many there are currently or how well optimized they are, though.

About big-endian: afaik little-endian is just the spec for storing to wasm's linear memory; the actual representation of stack values can be arbitrary (since you cannot inspect their bytes directly).
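
To make the trap behaviour concrete, here is a minimal sketch of a host surviving a guest crash, assuming the wasmtime and anyhow crates (the inline WAT module is made up for illustration, and the exact error type returned for a trap differs between wasmtime versions):

    use wasmtime::{Engine, Instance, Module, Store};

    fn main() -> anyhow::Result<()> {
        let engine = Engine::default();
        // A guest whose exported function immediately hits `unreachable`,
        // the canonical way for wasm code to trap.
        let module = Module::new(
            &engine,
            r#"(module (func (export "boom") unreachable))"#,
        )?;

        let mut store = Store::new(&engine, ());
        let instance = Instance::new(&mut store, &module, &[])?;
        let boom: wasmtime::TypedFunc<(), ()> = instance.get_typed_func(&mut store, "boom")?;

        // The trap aborts the guest's call stack, but it surfaces to the host
        // as an ordinary error value: the embedding process keeps running.
        match boom.call(&mut store, ()) {
            Ok(()) => println!("guest finished normally"),
            Err(trap) => println!("guest trapped: {}", trap),
        }
        Ok(())
    }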


> In what sense can a wasm program "crash?"

I'm not an expert in wasm, but memory allocation may fail depending on the host environment. Also, runtimes of higher-level languages may define their own crash cases. As it is a stack machine, dumping a stack trace in case of a "crash" should be easy.

> Is there big-endian support?

Wasm still assumes little-endian byte ordering. Honestly, who really cares about big-endian?

https://github.com/WebAssembly/design/blob/master/Portabilit...
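
To illustrate the linear-memory point: the bytes a guest stores are defined to be little-endian, so a host on a big-endian machine just has to convert when it peeks into guest memory. A plain-Rust sketch (no runtime involved; `linear_memory` stands in for the byte slice a real embedding API would hand you):

    // Reads an i32 the guest wrote at `offset` in its linear memory.
    // from_le_bytes is correct on both little- and big-endian hosts.
    fn read_guest_i32(linear_memory: &[u8], offset: usize) -> i32 {
        let mut bytes = [0u8; 4];
        bytes.copy_from_slice(&linear_memory[offset..offset + 4]);
        i32::from_le_bytes(bytes)
    }

    fn main() {
        // Pretend the guest stored 0x11223344 at offset 0.
        let guest_memory = [0x44, 0x33, 0x22, 0x11];
        assert_eq!(read_guest_i32(&guest_memory, 0), 0x11223344);
    }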


> Is there a fast wasm runtime available for non-JIT platforms?

Yes, Wasmtime can do AOT. Many other runtimes can do AOT as well.
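
A rough sketch of what AOT looks like with wasmtime's Rust API, assuming the wasmtime and anyhow crates (the file names are made up; Module::deserialize is unsafe because the precompiled artifact has to come from a trusted source, and details vary by version):

    use wasmtime::{Engine, Module};

    fn main() -> anyhow::Result<()> {
        let engine = Engine::default();

        // Ahead of time (e.g. on a build machine): compile the wasm module
        // down to native code and save the artifact.
        let wasm = std::fs::read("plugin.wasm")?;
        let precompiled = engine.precompile_module(&wasm)?;
        std::fs::write("plugin.cwasm", &precompiled)?;

        // Later (e.g. on the target device): load the artifact without
        // running the compiler at all.
        let bytes = std::fs::read("plugin.cwasm")?;
        let module = unsafe { Module::deserialize(&engine, &bytes)? };
        println!("loaded a module with {} exports", module.exports().count());
        Ok(())
    }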


You could also bring back genetic algorithms/genetic programming on top of WASM ;-)


Strange. Something like node.js for running server side WASM programs, maybe. But hard real time? That's a strange application for this. Why add the additional layer?


The traditional approach is dynamically loaded native libraries, and that certainly worked well when I did it in C++. The dynamic linking situation in Rust is immature; there is nothing like a stable ABI. Static linking has received much more attention, and the core developers can't prioritize everything at once. However, it can be done, and I hope it can be done safely at some point.
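
For reference, a minimal sketch of that traditional approach using the libloading crate (the plugin path and the exported `run` symbol are made up for illustration). Note that, unlike a wasm sandbox, nothing here protects the host: a crash in the plugin takes the whole process down.

    use libloading::{Library, Symbol};

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        // Loading and calling foreign code is inherently unsafe.
        unsafe {
            let lib = Library::new("./plugin.so")?;
            let run: Symbol<unsafe extern "C" fn() -> i32> = lib.get(b"run\0")?;
            println!("plugin returned {}", run());
        }
        Ok(())
    }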


The first few paragraphs are the rationale.


Where does it say hard real time?


Where it says: "While this section will be fairly specific to my use case (creating some sort of programmable logic controller that people can upload code to), it should be fairly easy to adapt to suit your application."



