I think the reason that the browser is so slow is that every time you mutate something (change an attribute, add or remove an element), the browser rerenders immediately. And this is indeed slow AF. If you batched everything into a DocumentFragment or similar before attaching it to the DOM then it'd be fast. I don't know how you'd do that ergonomically, though.
It's partially true. Layout and repaint are two separate rendering phases. Repaint happens asynchronously, as you point out. But layout (i.e. calculating the heights, widths, etc. of each box) is more complicated. If the browser can get away with it, it will batch potential layout changes until directly before the repaint - if you do ten DOM updates in a single tick, you'll get one layout calculation (followed by one repaint).
But if you mix updates and reads, the browser needs to recalculate the layout before the read occurs, otherwise the read may not be correct. For example, if you change the font size of an element and then read the element height, the browser will need to rerun layout calculation between those two points to make sure that the change in font size hasn't updated the element height in the meantime. If these reads and writes are all synchronous, then this forces the layout calculations to happen synchronously as well.
So if you do ten DOM updates interspersed with ten DOM reads in a single tick, you'll now get ten layout calculations (followed by one repaint).
This is called layout thrashing, and it's something that can typically be solved by using a modern framework, or by using a tool like fastdom, which helps with batching reads and writes so that all reads happen before all writes in a given tick.
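To make the read/write ordering concrete, here's a rough sketch of the difference (`boxes` is just an assumed NodeList, e.g. from querySelectorAll, not anything from a specific framework):

// Interleaved reads and writes: each offsetHeight read can force a fresh
// layout pass, because the write in the previous iteration invalidated layout.
boxes.forEach(box => {
  const h = box.offsetHeight;          // read
  box.style.height = (h * 2) + 'px';   // write
});

// Batched: all reads first, then all writes - at most one forced layout pass.
const heights = Array.from(boxes, box => box.offsetHeight);  // reads
boxes.forEach((box, i) => {
  box.style.height = (heights[i] * 2) + 'px';                // writes
});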
> I think the reason that the browser is so slow is that every time you mutate something, an attribute or add or remove an element, the browser rerenders immediately.
Is it really immediately? I thought that was a myth.
I thought that, given a top-level function `foo()` which calls `bar()` which calls `baz()` which makes 25 modifications to the DOM, the DOM is only rerendered once when `foo` returns, i.e. when control returns from user code.
I do know that making changes to the DOM and then immediately entering a while(1) loop doesn't show any visible change on the page.
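Roughly this sort of thing (a bounded version of that experiment, assuming some element `elem` on the page, so the tab doesn't hang forever):

elem.style.background = 'red';          // mutate the DOM...
const start = Date.now();
while (Date.now() - start < 3000) {}    // ...then hog the main thread for ~3 seconds
// The element doesn't actually turn red until this script finishes,
// because the browser only paints after control returns from user code.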
The browser will, as much as it can, batch DOM changes together and apply them all at once. So if `baz` looks like this:
for (let i = 0; i < 10; i++) {
  elem.style.fontSize = i + 20 + 'px';
}
Then the browser will only recalculate the size of `elem` once, as you point out.
But if we read the state of the DOM, then the browser still needs to do all the layout calculations before it can do that read, so we break that batching effect. This is the infamous layout thrashing problem. So this would be an example of bad code:
for (let i = 0; i < 10; i++) {
  elem.style.fontSize = i + 20 + 'px';
  console.log(elem.offsetHeight);
}
Now, every time we read `offsetHeight`, the browser sees that it has a scheduled DOM modification to apply, so it has to apply that first, before it can return a correct value.
This is the reason that libraries like fastdom (https://github.com/wilsonpage/fastdom) exist - they help ensure that, in a given tick, all the reads happen first, followed by all the writes.
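For what it's worth, a rough sketch of the loop above rewritten with fastdom (assuming the npm package's default export) might look something like this. Note that the reads now see the height from before the font-size changes, because fastdom runs all the measure callbacks before all the mutate callbacks on the next frame:

import fastdom from 'fastdom';

for (let i = 0; i < 10; i++) {
  fastdom.measure(() => {
    console.log(elem.offsetHeight);       // all reads run first...
  });
  fastdom.mutate(() => {
    elem.style.fontSize = i + 20 + 'px';  // ...then all writes, in one batch
  });
}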
That said, I suspect even if you add a write followed by a read to your `while(1)` experiment, it still won't actually render anything, because painting is a separate phase of the rendering process, which always happens asynchronously. But that might not be true, and I'm on mobile and can't test it myself.
> Now, every time we read `offsetHeight`, the browser sees that it has a scheduled DOM modification to apply, so it has to apply that first, before it can return a correct value.
That makes perfect sense, except that I don't understand how using a shadow DOM helps in this specific case (A DOM write followed immediately by a DOM read).
Won't the shadow DOM have to perform the same calculations if you modify it and then immediately use a calculated value for the next modification?
I'm trying to understand how exactly a shadow DOM can perform the calculations after modifications faster than the real DOM can.
The shadow DOM doesn't help at all here, that's mainly about scope and isolation. The (in fairness confusingly named) virtual DOM helps by splitting up writes and reads.
The goal when updating the DOM is to do all the reads in one batch, followed by all the writes in a second batch, so that they never interleave, and so that the browser can be as asynchronous as possible. A virtual DOM is just one way of batching those writes together.
It works in two phases: first, you work through the component tree, and freely read anything you want from the DOM, but rather than make any updates, you instead build a new data structure (the VDOM), which is just an internal representation of what you want the DOM to look like at some point in the future. Then, you reconcile this VDOM structure with the real DOM by looking to see which attributes need to be updated and updating them. By doing this in two phases, you ensure that all the reads happen before all the writes.
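As a deliberately tiny sketch of those two phases (nothing like a real framework's implementation; `container` is just some assumed element that already has a child we're updating):

// Phase 1: read freely from the real DOM, but only *describe* the desired
// output as plain data - no DOM writes happen here.
const width = container.offsetWidth;               // real DOM read
const vnode = {
  attrs: { class: width > 600 ? 'wide' : 'narrow' },
  text: 'Hello',
};

// Phase 2: reconcile the description against the real element, applying
// only the attributes that actually changed - all the writes happen here.
function reconcile(el, vnode) {
  for (const [name, value] of Object.entries(vnode.attrs)) {
    if (el.getAttribute(name) !== value) el.setAttribute(name, value);
  }
  if (el.textContent !== vnode.text) el.textContent = vnode.text;
}

reconcile(container.firstElementChild, vnode);

A real virtual DOM also diffs whole element trees (creating and removing nodes, keyed lists, and so on), but the read-then-write split is the part that matters for layout thrashing.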
There are other ways of doing this. SolidJS, for example, just applies all DOM mutations asynchronously (or at least, partially asynchronously, I think using microtasks), which avoids the need for a virtual DOM. I assume Svelte has some similar setup, but I'm less familiar with that framework. That's not to say that virtual DOM implementations aren't still useful, just that they are one solution with a specific set of tradeoffs - other solutions to layout thrashing exist. (And VDOMs have other benefits beyond just avoiding layout thrashing.)
So to answer your question: the virtual DOM helps because it separates reads and writes from each other. Reads happen on the real DOM, writes happen on the virtual DOM, and it's only at the end of a given tick that the virtual DOM is reconciled with the real DOM, and the real DOM is updated.
I'm gonna apologise in advance for being unusually obtuse this morning. I'm not trying to be contentious :-)
> So to answer your question: the virtual DOM helps because it separates reads and writes from each other. Reads happen on the real DOM, writes happen on the virtual DOM, and it's only at the end of a given tick that the virtual DOM is reconciled with the real DOM, and the real DOM is updated.
I still don't understand why this can't be done (or isn't currently done) by the browser engine on the real DOM.
I'm sticking to the example given: write $FOO to the DOM, causing $BAR (which is calculated from $FOO) to change to $BAZ.
Using a VDOM, if you're performing all the reads first, then the read gives you $BAR (the value prior to the change).
Doing it on the real DOM, the read will return $BAZ. Obviously $BAR is different from $BAZ, due to the writing of $FOO to the DOM.
If this is acceptable, then why can't the browser engine cache all the writes to the DOM and only perform them at the end of the given tick, while performing all the reads synchronously? You'll get the same result as using the VDOM anyway, but without the overhead.
No worries, I hope I'm not under/overexplaining something!
The answer here is the standard one though: if you write $FOO to DOM, then read $BAR, it has to return $BAZ because it always used to return $BAZ, and we can't have breaking changes. All of the APIs are designed around synchronously updating the DOM, because asynchronous execution wasn't really planned in at the beginning.
You could add new APIs that do asynchronous writes and synchronous reads, but I think in practice this isn't all that important for two reasons:
Firstly, it's already possible to separate reads from writes using microtasks and other existing APIs for forcing asynchronous execution. There's even a library (fastdom) that gives you a fairly easy API for separating reads and writes.
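A hand-rolled version of that first point (just a sketch, not fastdom's actual implementation) could be as simple as keeping reads synchronous and flushing queued writes in a microtask:

const pendingWrites = [];

function scheduleWrite(fn) {
  if (pendingWrites.length === 0) {
    queueMicrotask(() => {                        // flush at the end of the current tick
      pendingWrites.forEach(write => write());    // all writes run back to back
      pendingWrites.length = 0;
    });
  }
  pendingWrites.push(fn);
}

// Reads stay synchronous; the matching writes land together after the current tick.
const h = elem.offsetHeight;
scheduleWrite(() => { elem.style.height = (h + 10) + 'px'; });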
Secondly, there are other reasons to use a VDOM or some other DOM abstraction layer, and they usually have different tradeoffs. People will still use these abstractions, even if the layout thrashing issue were solved completely somehow. So practically, it's more useful to provide the low-level generic APIs (like microtasks) and let the different tools and frameworks use them in different ways. I think there's also not a big push for change here: the big frameworks are already handling this issue fine and don't need new APIs, and smaller sites or tools (including the micro-framework that was originally posted) are rarely so complicated that they need these sorts of solutions. So while this is a real footgun that people can run into, it's not possible to remove it without breaking existing websites, and it's fairly easy to avoid if you do run into it and it starts causing problems.