Asm-Dom – WebAssembly Virtual DOM (github.com)
158 points by mbasso 9 days ago | 51 comments





I was going to say that the API is almost identical to Snabbdom's and ask whether that was deliberate or an independent reinvention, but given how similar the "inline example" is[0][1] (down to the comments), I assume snabbdom was either used as a starting point or was very much a wilful inspiration?

[0] https://github.com/mbasso/asm-dom#inline-example

[1] https://github.com/snabbdom/snabbdom#inline-example


They both have the MIT license, which may be a so-called “permissive” licence, but it does require attribution. Did they base asm-dom on Snabbdom? That’s perfectly fine. Did they remove the copyright notice? That’s illegal.

Just committed a fix for that in license.md

https://github.com/mbasso/asm-dom/commit/1e1659ebc2357bf34be...

Is the example the usual place to put this?


This is the license of the example project; the license of asm-dom is in the root dir.

Shouldn't the license be in every file?

The only reason to put the license in every file is if the "project" is really a loose collection of independent files, or if you stay up at night fearing that a madman on the loose will randomly download and reuse single files from your projects.

MIT specifically calls out that the notice must be included in copies of "substantial portions" of the software, which makes no sense if the entirety of the notice is already part of every file.

> The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.


It is not mentioned in the "How to apply this license" section: https://choosealicense.com/licenses/mit/

If someone has more information or links about that, please let me know


a "fix" eh?

I expect the performance here to be blazingly fast...is that an accurate assumption?

Last I saw, the lack of a direct WASM to DOM integration meant manipulating the DOM from WASM was roughly 10x slower.

See: https://github.com/rust-webplatform/rust-webplatform/issues/... for an example.


I wouldn't assume so unless you're comparing against a fairly naive JS implementation.

Copying strings between JS and asm.js/wasm is slow, so there's additional overhead just calling back and forth.

Reading through the C++ code, it also uses std::string and std::map internally, which are, well, somewhat slow and cache-unfriendly. I'd expect the JS VM's types to generally beat these without too much effort (certainly the map; the string depends on how it's used and some other properties, e.g. whether it's too big to fit in the small-string buffer).

Also, JS VMs have a much faster allocator than most malloc implementations, like the one bundled with emscripten (and this code doesn't seem to make much effort to avoid mallocs). They can, and generally will, implement the nursery as an arena, so you get bump-pointer allocation (only a small number of cycles in the common case). You pay for this later via GC, but the GCs are also fairly fast (with caveats, but I doubt they apply here for 99% of users).

TL;DR: I wouldn't expect just using asm.js or wasm to translate to a performance improvement for something like this. Of course, I could be totally wrong, but, well, there are no benchmarks...


That's a pretty good assessment.

I would note that emscripten's SDK comes with a really great malloc. If you're programming in C++ with a great malloc then your memory allocation performance will be better than what you'd get on the JS side. Be wary of microbenchmarks that try to say otherwise, because the reason malloc beats GCs for real shit is that it scales better. For example, GCs get slower if the heap shape gets weird, while malloc doesn't care about that.

Finally, WebKit's GC doesn't implement the nursery as an arena. I don't think that's such a popular GC technique anymore, particularly now that using non-compacting generational GCs is becoming so common.


Their benchmarks section just says this:

At the moment we don't have benchmarks to show, but they'll come soon! Consider that benchmarking this library is not easy: we have to reproduce real-world situations with big vnode trees and frequent updates. Running a single patch, or a sequence of patches in a for-loop, might produce results that are not attributable to a real application.


"All interactions with the DOM are written in Javascript."

You can only be as fast as the slowest link.

Does it help for virtual dom reconciliation to be in WASM as opposed to JS that is optimized and JIT'ed by modern JS engines? I doubt it as the use case is very different from VR/AR/AI where WASM would presumably have some advantage.

Any experts to educate us?


Web browsers are pretty fast already, so if you only do those DOM updates that actually affect what you see on screen you can have your app update at 60 frames per second. That's where a virtual dom helps out.

Virtual dom systems effectively work in two stages:

1) figuring out how the DOM should be updated

2) applying these changes with as few (or cheapest) DOM operations as possible

I don't expect WebAssembly will do much for step (2), as the bottleneck there is in repeatedly crossing the boundary between the javascript world and the DOM, or in the DOM operations themselves.

Step (1) is essentially just crunching (tree based) data structures. Web Assembly should be a lot faster here, because (depending on implementation) you get denser data structures, better cache locality, no garbage collection, and so on. You'll also get much more consistent performance.

For complex web applications that keep a lot of view state around (1) might be the bottleneck, but in most real world applications I expect that (2) is the bottleneck, because the browser has to do so much work in terms of layout, CSS application, and so on every time the DOM gets updated.
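Step (1) can be sketched as a plain tree walk. This toy version assumes an invented vnode shape `{ tag, text, children }` and invented patch op names; it is only an illustration of the idea, not asm-dom's actual API:

```javascript
// Toy diff for step (1): walk two vnode trees and emit a patch list.
// The vnode shape { tag, text, children } and the op names are made
// up for this sketch; real libraries also track keys, props, etc.
function diff(oldNode, newNode, patches = [], path = []) {
  if (!oldNode) {
    patches.push({ op: 'create', path, node: newNode });
  } else if (!newNode) {
    patches.push({ op: 'remove', path });
  } else if (oldNode.tag !== newNode.tag) {
    patches.push({ op: 'replace', path, node: newNode });
  } else if (oldNode.text !== newNode.text) {
    patches.push({ op: 'setText', path, text: newNode.text });
  } else {
    const len = Math.max((oldNode.children || []).length,
                         (newNode.children || []).length);
    for (let i = 0; i < len; i++) {
      diff((oldNode.children || [])[i], (newNode.children || [])[i],
           patches, path.concat(i));
    }
  }
  return patches;
}

const a = { tag: 'div', children: [{ tag: 'span', text: 'hi' }] };
const b = { tag: 'div', children: [{ tag: 'span', text: 'bye' }] };
console.log(diff(a, b)); // one setText patch at path [0]
```

The patch list is then handed to step (2), which is the only part that touches the real DOM.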

If you really want to get a big performance improvement you have to do much more in WebAssembly. Use only absolute positioning for DOM nodes and create your own layout engine. Don't have any CSS except for the rules needed to display what is currently visible. Do all typesetting in WebAssembly too. This means recreating much of a web browser inside a web browser, and you can then optimize the WebAssembly code for the specific needs of your web application for big performance gains.

It may sound hugely impractical, but if WebAssembly is only 10% slower than C it's viable. It means you won't have to wait for standards organizations to add new FlexBox options, or a way to dynamically adjust font size to fit a fixed rectangle; you can just solve all these problems locally. If this works it could lead to a complete "user space" renaissance.


> Web Assembly should be a lot faster here, because (depending on implementation) you get denser data structures, better cache locality, no garbage collection, and so on.

You can get these advantages by using plain javascript with typed arrays.

https://github.com/vandenoever/baredom
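A rough sketch of that idea: keep the whole tree in one flat `Int32Array` of fixed-size records, so there are no per-node objects for the GC to chase. The field layout here is invented for illustration, not baredom's actual format:

```javascript
// Store a tree in one Int32Array. Each node is 3 ints:
// [tagId, firstChildIndex (-1 if none), nextSiblingIndex (-1 if none)].
// Dense, GC-free storage; the layout is invented for this sketch.
const NODE_SIZE = 3;
const tree = new Int32Array(16 * NODE_SIZE).fill(-1);

function setNode(i, tagId, firstChild, nextSibling) {
  const o = i * NODE_SIZE;
  tree[o] = tagId;
  tree[o + 1] = firstChild;
  tree[o + 2] = nextSibling;
}

// div(span, span): node 0 is the div, its children are nodes 1 and 2
setNode(0, /* div */ 1, 1, -1);
setNode(1, /* span */ 2, -1, 2);
setNode(2, /* span */ 2, -1, -1);

function countChildren(i) {
  let n = 0;
  for (let c = tree[i * NODE_SIZE + 1]; c !== -1;
       c = tree[c * NODE_SIZE + 2]) n++;
  return n;
}

console.log(countChildren(0)); // 2
```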


I would guess a lot of the overhead would be thunking back and forth between webasm and JS functions. So if you implemented the update in pure JS, where it can call directly into the DOM APIs and also read pointers, numbers, and strings directly out of the webasm data array, then you could avoid any thunking between the two worlds in the inner loop.

By thunking I mean doing this from C++ code:

    EM_ASM_({
        window['asmDomHelpers']['domApi']['appendChild']($0, $1);
    }, vnode->elm, createElm(vnode->children[i]));

So you propose to interface between C++ and JavaScript via a shared buffer and to eschew any explicit FFI calls.

If the data structure is lockless, that is doable.
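On the JS side that could look something like this sketch, which assumes the C++ code writes fixed-size [opcode, arg1, arg2] records into a command buffer in linear memory; the opcodes and layout are invented here:

```javascript
// Pretend `heap` is a view over wasm linear memory. The C++ side
// would append [opcode, arg1, arg2] triples and JS drains them in
// one pass, instead of paying one FFI call per DOM operation.
// Opcodes and record layout are made up for this sketch.
const OP_CREATE = 1, OP_APPEND = 2, OP_SET_TEXT = 3;
const heap = new Int32Array(64);

// Simulate the C++ side queuing two commands.
heap.set([OP_CREATE, /* nodeId */ 7, /* tagId */ 0,
          OP_APPEND, /* parentId */ 1, /* childId */ 7]);
let commandCount = 2;

function drainCommands(log) {
  for (let i = 0; i < commandCount; i++) {
    const o = i * 3;
    switch (heap[o]) {
      case OP_CREATE: log.push(`create #${heap[o + 1]}`); break;
      case OP_APPEND: log.push(`append #${heap[o + 2]} to #${heap[o + 1]}`); break;
      case OP_SET_TEXT: log.push(`setText #${heap[o + 1]}`); break;
    }
  }
  commandCount = 0;
  return log;
}

const ops = drainCommands([]);
console.log(ops); // ['create #7', 'append #7 to #1']
```

In a real build the `log.push` calls would instead be direct `document.createElement` / `appendChild` calls, keeping the inner loop entirely in JS.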


"If you really want to..."

This is exactly what we've done with our product, Elevate Web Builder, so I can definitely attest to the fact that it is 100% correct. Our product uses a compile-to-JS transpiler and has its own layout engine, uses dynamic CSS, has update cycles to allow for efficient DOM updates, etc. The only real downside is that text measurement must touch the DOM and is really slow, so the layout engine has to perform a lot of intelligent caching and "measurement-avoidance" techniques to ensure that it measures text as infrequently as possible.


If you do your own line wrapping, then I'm pretty sure you don't need to do any text measurements. To get the kerning information for a given font face and size you can cache the widths of all character pairs ("AA", "AB", "AC"). Last I checked just summing the width of character pairs in a string would get you a width estimate accurate to the pixel, even for reasonably long strings. You don't have cross-platform pixel-perfect text rendering anyway, so you don't need to be 100% accurate. Maybe you're already doing this, but if not it's worth exploring.

This might be a dumb question, but after figuring out the width of "AB" and "BC", how do you know the width of "ABC"? Do you just calculate the distances A–B and B–C as ab_distance = width("AB") - (width("A") + width("B")) ... and so on?

Yep. Kerning is just the amount of space in between two characters. So to calculate the total width of a string you take the (fixed) widths of the individual characters and you add the variable widths of the space between each character pair.
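The arithmetic described above can be sketched like this. The per-character and per-pair widths are made-up metrics for a hypothetical font; a real implementation would measure them once (e.g. with canvas `measureText`) and cache the results:

```javascript
// Estimate a string's width from cached single-char and pair widths.
// These numbers are invented metrics for a hypothetical font face;
// measure and cache real values once per font/size in practice.
const charWidth = { A: 10, V: 10, B: 11 };
const pairWidth = { AV: 17, VB: 21, AB: 21 }; // measured pair widths

function kern(a, b) {
  // kerning = measured pair width minus the two individual widths
  return pairWidth[a + b] - (charWidth[a] + charWidth[b]);
}

function estimateWidth(s) {
  let w = 0;
  for (let i = 0; i < s.length; i++) {
    w += charWidth[s[i]];
    if (i > 0) w += kern(s[i - 1], s[i]);
  }
  return w;
}

console.log(estimateWidth('AVB')); // 10 + 10 + 11 + (-3) + 0 = 28
```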

Thanks, that is very interesting. We do happen to rely on the DOM for line-wrapping, but that's just because we didn't think you could get reasonable results doing what you've suggested.

Okay, off to see what this produces... :-)


Google Docs does its own line-wrapping for proportional fonts, so that may be interesting to look at.

It's a little finicky to get it all to work consistently cross browser and cross platform, so good luck!


What do you do if you encounter a pair like "ペア"?

The speed advantage - if there is one - would come from the code surrounding the DOM, not the DOM code itself. Without this package you would write that surrounding code in JS, because that's what you need to do to even talk to the DOM. But with this package that code can be written in C++, and so it will be fast.

If you have enough of that surrounding non-DOM code that significantly benefits from not being JS, then you will see a speedup. You could get this speedup even if talking to the DOM through this package is slower than talking to the DOM directly in JS. It just depends on how much you rely on the speed of everything else, and how much of everything else is written in C++.


>You can only be as fast as the slowest link

Only if that "slowest link" takes up a large part of your processing time.

If you have a program that reads some data and passes it to a much faster foreign function (e.g. Python passing to a C extension) to do some calculation, then your "slowest link" (in this case, Python) won't matter much.

The program will still be many times faster than if the calculation had been written in pure Python.


Very cool, thanks for publishing this! I'm learning some interesting things by reading the code.

I'm curious about the three different coding styles used to define functions in domApi.js, if they have different behaviors, are meant to imply different meanings, or if they are just different styles for the same thing:

https://github.com/mbasso/asm-dom/blob/master/src/js/domApi....

  'setAttribute'(nodePtr, attr, value) {

  'parentNode': nodePtr => (

  'setTextContent': (nodePtr, text) => {
Why the single quotes around function names? Why are some normal functions and others fat arrow functions? Why omit the parens around the param of a one parameter fat arrow function -- does that define a get accessor?

> Why the single quotes around function names?

Syntactically the same as not having them, aside from the fiddly details of string vs property name access to an object.

  thisObject = {
    usualPropertyName: value1,
    'property name with spaces': value2
  }

  thisObject.usualPropertyName
  thisObject['usualPropertyName'] // this works too
  thisObject['property name with spaces'] // you can't use the dot access style with this property name
> Why are some normal functions and others fat arrow functions?

Fat arrow functions bind "this" to whatever the surrounding context is, and if the only statement is a single object or value, it implicitly returns that.

That is to say, this:

   whatever => value
is syntactically the same as this:

   whatever => (
     value
   )
is syntactically the same as this:

  whatever => {
    return value
  }
> Why omit the parens around the param of a one parameter fat arrow function -- does that define a get accessor?

If there's only a single argument, you don't need the parentheses. This is meant mostly for the convenience of things like this:

  someArray.map(item => item.propertyName)

> Fat arrow functions bind "this" to whatever the surrounding context is

They don't actually bind `this` at all; there is a slot on the function object ([[ThisMode]] in the spec) which describes how to resolve `this` references in the function's body. For arrow functions the mode is "lexical", and they simply grab the `this` reference from their lexical environment[0] as if it were any other non-local variable.

In fact, lexical resolution is the default, so what happens is during [[Call]][1] arrow functions will simply skip all the hard work in OrdinaryCallBindThis[2] by early-returning while the other two types (strict and global) have to resolve their thisValue then bind their local environment.

[0] https://www.ecma-international.org/ecma-262/6.0/#sec-lexical...

[1] https://www.ecma-international.org/ecma-262/6.0/#sec-ecmascr...

[2] https://www.ecma-international.org/ecma-262/6.0/#sec-ordinar...
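A small runnable illustration of that lexical behaviour (the object and names here are invented for the example):

```javascript
// An arrow function resolves `this` lexically, like any other outer
// variable, so the arrow passed to forEach sees the same `this` as
// incrementAll; a regular function would get its own `this` per call.
const counter = {
  count: 0,
  incrementAll(items) {
    items.forEach(() => { this.count++; });
  },
};

counter.incrementAll([1, 2, 3]);
console.log(counter.count); // 3
```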


Thank you for your answer. I just want to add a little thing to the first point.

> Why the single quotes around function names?

In this case I use quotes to prevent name mangling. I have to be sure that the uglify plugin in webpack does not mangle the names of that object because I'll call it from C++ in this way:

window['asmDomHelpers']['domApi']['createComment'](...)

Also, for the same reason, here I'm using this notation instead of this one:

window.asmDomHelpers.domApi.createComment(...)

to let the Closure compiler know that I don't want name mangling.

Without quotes these two different compile processes might cause some problems
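The quoted-key pattern looks roughly like this (a toy object for illustration, not asm-dom's actual domApi.js):

```javascript
// Quoted property names and bracket access signal to minifiers like
// UglifyJS and Closure that these names are part of a public
// interface and must not be renamed, because the C++ side will look
// them up by string at runtime.
const domApi = {
  'createComment'(text) { return `<!--${text}-->`; },
  'setTextContent': (node, text) => ({ ...node, text }),
};

// The C++ side effectively performs a string lookup like this:
console.log(domApi['createComment']('hi')); // <!--hi-->
```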


I do not get why they use Babel when both asm.js and WASM are only supported in first-class browsers.

Without a native interface to the DOM this project will be limited, but it is interesting anyway.


Babel does not require ASM or WASM. It is (usually) simply an ES6 to ES5 transpiler.

And even if Babel compiled to asm.js, the clever thing about asm.js was that it was valid JavaScript supported by almost all browsers; it just might not be a fast path.

That being said, asm.js in general goes out of its way not to allocate and invoke the garbage collector, so it's not the best target for other JavaScript anyway.


My point is that you do not need Babel if you plan to target modern browsers anyway.

Many ES6 features are currently slower than their ES5 counterparts. let and const are slower than var, for example.

> let, const are slower than var, for example.

They aren't slower anymore in V8. Most ES6 features are now about as fast as their ES5 counterparts.


That hasn't been true in WebKit for a while.

Huh. What's the reason for that?

> As we said before, h returns a memory address. This means that this memory has to be deleted manually, as we have to do in C++ for example. By default asm-dom automatically deletes the old vnode from memory when patch is called.

So may there be structural sharing across vdom trees? Or within a tree?


Yes! h always returns a memory address to a VNode. In C++ this address is reinterpreted as a VNode, which contains other VNodes as children. If you decide to manually manage the memory, you can implement some interesting mechanisms to diff vnodes and create vdom trees. To do this, I have to develop some APIs that allow you, for example, to replace a child with another and so on. This is certainly on the roadmap!

I don't like the way they write styles:

    style: 'font-weight: normal; font-style: italic'
instead of

    style: { fontWeight: "normal", fontStyle: "italic" }
I mean, it requires stringification and string-concatenation overhead.

If you're doing inline styles, the former is how the style is expressed to the style engine isn't it? i.e.

<div style="font-weight: normal; font-style: italic">


in react, styles can be just another variable in the component model/props

<div style={{ fontWeight: 'normal', fontStyle: 'italic' }}/>
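Either way, what reaches the DOM attribute is the string form; a library that accepts the object form just does a conversion roughly like this (a sketch, not React's actual implementation):

```javascript
// Convert a camelCase style object into the CSS string the style
// attribute expects; this is the "stringification overhead" the
// grandparent comment is referring to.
function styleToString(style) {
  return Object.entries(style)
    .map(([k, v]) => `${k.replace(/[A-Z]/g, c => '-' + c.toLowerCase())}: ${v}`)
    .join('; ');
}

console.log(styleToString({ fontWeight: 'normal', fontStyle: 'italic' }));
// font-weight: normal; font-style: italic
```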


Shouldn't it be called "wasm-dom" instead?

Eventually more parts of the DOM etc. will be directly manipulated from WASM somehow.. right?

DOM interactions are a future feature; I'll update asm-dom to pure WASM without JS when they're supported!

Yes it's on the roadmap

Is it possible to use C++ to develop the frontend?

Any examples?


Any benchmarks?


