Google Docs will now use canvas based rendering (googleblog.com)
1167 points by lewisjoe 4 months ago | 930 comments



Speaking as one of the original three authors of Google Docs (Writely), but zero involvement in this project (I left Google in 2010): I'm seeing a lot of comments asking how JavaScript-on-Canvas could possibly outperform the highly optimized native code built into the browser engines. It's been a long time since I've really been involved in browser coding, but having written both Writely and, farther back, several native-app word processing engines, here are some thoughts.

Word processors have extremely specific requirements for layout, rendering, and incremental updates. I'll name just two examples. First, to highlight a text selection in mixed left-to-right / right-to-left text, it's necessary to obtain extremely specific information regarding text layout; information that the DOM may not be set up to provide. Second, to smoothly update as the user is typing text, it's often desirable to "cheat" the reflow process and focus on updating just the line of text containing the insertion point. (Obviously browser engines support text selections, but they probably don't expose the underlying primitives the way a word processor would need. Similarly, they support incremental layout + rendering, but probably not specifically optimized in the precise way a word processor would need.)
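To make the second example concrete, here is a minimal sketch of caching layout and redoing only the part an edit touches. It assumes fixed-width characters and works at paragraph granularity, which is coarser than the per-line trick described above. `Paragraph`, `wrap`, `layoutAll`, `insertText`, and `wrapCalls` are hypothetical names for illustration, not anything from Docs.

```typescript
// Hypothetical model: each paragraph caches its own wrapped lines.
interface Paragraph {
  text: string;
  lines: string[]; // cached result of wrapping `text`
}

let wrapCalls = 0; // counts layout passes, for illustration only

// Greedy word wrap, assuming every character is exactly one unit wide.
function wrap(text: string, width: number): string[] {
  const lines: string[] = [];
  let current = "";
  for (const word of text.split(" ")) {
    const candidate = current === "" ? word : current + " " + word;
    if (candidate.length <= width || current === "") {
      current = candidate; // fits, or is an over-long word on its own line
    } else {
      lines.push(current);
      current = word;
    }
  }
  lines.push(current);
  return lines;
}

// Full layout: every paragraph is wrapped once.
function layoutAll(texts: string[], width: number): Paragraph[] {
  return texts.map((text) => {
    wrapCalls++;
    return { text, lines: wrap(text, width) };
  });
}

// Insert `s` at `offset` in paragraph `index`; only that paragraph is re-wrapped.
function insertText(doc: Paragraph[], index: number, offset: number, s: string, width: number): void {
  const p = doc[index];
  if (p === undefined) {
    throw new RangeError("no such paragraph: " + index);
  }
  p.text = p.text.slice(0, offset) + s + p.text.slice(offset);
  wrapCalls++;
  p.lines = wrap(p.text, width);
}

const demoDoc = layoutAll(["the quick brown fox", "jumps over", "the lazy dog"], 10);
insertText(demoDoc, 1, 0, "really ", 10);
console.log(demoDoc[1].lines); // [ 'really', 'jumps over' ]
console.log(wrapCalls);        // 4: three for the initial layout, one for the edit
```

A real engine would also have to notice when an edit changes a block's height and shift everything below it, which this sketch ignores.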

Modern browser engines are amazing feats of engineering, but the feature set they provide, while enormous, is unlikely to exactly match the exacting requirements of a WYSIWYG word processor. As soon as your requirements differ even slightly from the feature set provided, you start tipping over into complex workarounds which impact performance and are hell on developer productivity and application stability / compatibility.

This is loosely analogous to CISC vs. RISC: browsers are amazing "CISCy" engines but if your use case doesn't precisely fit the expectations of the instruction set designer then you're better off with something lower-level, like Canvas and WASM. (I don't know whether Docs uses WASM but it would seem like a good fit for this Canvas project.)

Frameworks in general suffer from this problem. If you've ever had to fight with an app framework, or orchestration framework, or whatever sort of framework to accomplish something 5% outside of what the framework is set up to support, then you understand the concept.

Also, as noted in many comments here, browser engines have to solve a much more general problem than Docs, and thus have extra overhead.


I'd like to chime in here as someone who has worked on optimizing the execution of your code :) Google Docs specifically was one of the subjects of a particular performance push when I was working on SpiderMonkey within Firefox, and I got to see how it behaves under the hood pretty well.

The thing that stands out to me the most was the giant sparse array (a regular js-native array) being used to store layout information, presumably. It really messed with our internals because spidermonkey didn't expect those to be used in fastpaths, and it was really lazy about trying to optimize for them.

Anecdotes aside.. I wanted to endorse your entire comment :) I remember thinking to myself how terrible it was to have to piggyback a document layout engine on top of HTML layout and these awful JS abstractions, and how much better and more performant it would be to do a proper layout engine - either in JS or compile-to-wasm, and have it run its own rendering logic.

In particular for large documents where you were making changes to early parts of the document, a single keystroke could invoke this _cascade_ of sparse array fetches and mutations and DOM rearrangements and all sorts of fireworks.


This is why I love HN. :-)

However, I can't claim credit (or blame, but I would argue mostly credit) for that code. There have been three generations of the Docs editor that I know of:

1. The original, which I was involved in, was an unholy mess perched shakily atop contenteditable. As such, it contained no layout or rendering code (but did all sorts of horrid things under the hood to massage the HTML created by the various browser contenteditable engines and thus work around various problems, notably compatibility issues when users on different browsers are editing the same document). Originally launched in 2005.

2. In the early 2010s, an offshoot of the Google Sheets team launched a complete rewrite of the Docs engine which did its own editing, layout, and rendering using low-level DOM manipulation. This was more robust, supported layout features not available in contenteditable (e.g. pagination), and generally was a much better platform. My primary contribution to this effort was to incorrectly suggest that it was unlikely to pan out. (I was worried that the primitives available via the DOM would be insufficient; for instance, to deal with mixed-directional text.)

3. This canvas-based engine, which I learned about a few hours ago when this post popped up on HN.

I don't know whether #3 is an evolution of #2 or a complete rewrite; for all I know there was another generation in between. But I imagine you were looking at some iteration of #2.


You're right. This was a few years ago, so well after 2010.

And yes, I'd say credit as well for the layout code, not blame. I wasn't knocking the code - for that era sparse arrays + DOM stuff were pretty common approaches and there didn't exist better web tooling than that.

It's only been the last few years, I'd say, where the optimization quality (on the engine side) and API support have been good enough to justify this sort of approach of just plumbing your own graphics pipeline on top of the web.

That was a spidermonkey issue. I treat that experience more as a lesson in how obscure corner cases left as perf cliffs never stay obscure corner cases, and always get exercised, and you can't afford to ignore them for too long.


With a canvas-based engine, the editor is no longer relying on the contenteditable spec right?

For the majority of use cases, do you think contenteditable + view layer which precisely updates the HTML is still viable? More specifically, what do you think about open-source libraries like ProseMirror (https://prosemirror.net/) or Slate.js (https://github.com/ianstormtaylor/slate) which do that (ProseMirror uses its own view library on vanilla javascript, Slate uses React)?

I understand that if you have really long documents or spreadsheets (I imagine the latter is more frequent), you could maybe solve rendering performance problems with virtualization, which canvas gives more flexibility for?


> With a canvas-based engine, the editor is no longer relying on the contenteditable spec right?

Correct. In fact, contenteditable went out the window a decade ago when the "#2" engine (low-level DOM manipulation) was launched.

My experience with contenteditable is ~12 years stale at this point, so the only thing I'll try to say is that I expect it would work well up to a certain level of ambition, and no further. As I say above regarding frameworks: they're great so long as your requirements fit within the expectations of the framework, but you quickly hit a wall if you need to stray outside of that. For Docs, the desire for a paginated view/edit mode was an example; there was simply no sane way of squeezing pagination into a contenteditable-based engine.


My experience with modern contenteditable suggests that it does work pretty well, overall, though I've not been using it for something as layout-heavy as Docs -- I've worked on the VisualEditor for mediawiki, which has different requirements.

A canvas-based document editor with any sort of international ambitions has a fairly high bar to clear for reimplementing basic features. The browsers really do handle a lot of useful things for you in contenteditable, like the upthread-mentioned RTL issues, and complex IME input methods.

If you have a lot of HTML-rendering inherently required, strong internationalization requirements, and no need for something like page-based layout... contenteditable has advantages, particularly when comparing the up-front work required.


The sparse array is likely to be a protobuf. I ran into this issue with Firefox when working on Google Inbox: there was degenerate performance with sparse arrays, and it was one of the reasons why the Firefox version was delayed. (I'll note, various conspiracy theories on HN thought it was a deliberate attempt to hamper FF, when in reality it was an unintended consequence of usage of an old protobuf format which never caused a problem until protobufs with huge extension fields were used in a specific way in the codebase, so the problem was discovered late.)

Protobufs can be stored in array format. In that format, each field number is basically its index in the array. Extension fields in protobufs typically grab high numbered slots. So if you have a prototype with 1 field (id = 1), and one extension field (e.g. id = 10000000), you now have an array with [undefined, stuff, ... 999999 ..., stuff] and various array operators seem to reify this into a real array in older versions.
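A rough sketch of that layout, assuming a hypothetical `toArrayFormat` helper (this is not the actual GWT protobuf runtime, just the indexing scheme described above):

```typescript
// [field number, value] pairs; string values keep the sketch simple.
type Field = [number, string];

// Hypothetical encoder: field number N is stored at array index N.
function toArrayFormat(fields: Field[]): string[] {
  const arr: string[] = [];
  for (const [fieldNumber, value] of fields) {
    arr[fieldNumber] = value; // assigning past the end leaves holes behind
  }
  return arr;
}

const encoded = toArrayFormat([
  [1, "stuff"],
  [10000000, "extension stuff"],
]);

console.log(encoded.length); // 10000001
console.log(1 in encoded);   // true: a real slot
console.log(5 in encoded);   // false: a hole, not a stored undefined
```

The array reports a length of ten million and one while holding only two values, which is cheap as long as nothing treats it as dense.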


> various conspiracy theories on HN thought it was a deliberate attempt to hamper FF

I remember those being fairly rampant.

I wonder if a technical blog post about the issue would have silenced some of the conspiracy theories.

Regardless, there's a lesson in there somewhere. Never attribute to malice that which is adequately explained by degenerate performance of a browser pushed to its limits?


Yeah, it's primarily the fault of overzealous pushing of limits. At the time we were using WebWorkers/SharedWorkers, bleeding edge CSS "compositor" behavior to achieve smooth 30-60fps animations, and lots of other stuff I don't remember. It was very easy to get off the 'golden path' of performance. Small changes in CSS or DOM structure for example would destroy layout/paint performance and require days of debugging.

Add on top of that, that Inbox was developed using a shared codebase for 3 platforms (Web, Android, iOS), the non-UI code was written in Java, while the UI code was written in JS, Java, and Objective-C respectively.

The shared "business logic" layer was cross compiled, and it was the protobuf runtime for GWT inside that was causing trouble. We "optimized" it by making it run on top of pure arrays instead of OO objects. This was a feature of GWT called 'JSO's (JavaScriptObjects) that let you pretend that raw JS objects had methods on them that they didn't, like Extension Methods in other languages.

All was good until IIRC, a utility function was introduced that did Object.keys(some protobuf array). This returns a sparse array on V8, but a reified real dense array on SpiderMonkey at that time, and so if you were unlucky enough to have a high extension field in your protobuf, you'd end up creating an array with a billion entries in it.

It was hard to foresee this because Inbox was built out of so many interacting systems. Ideally, the GWT Protobuf Compiler runtime would have had integration tests for Firefox that exercised iteration over sparse arrays with high extension number methods, but it didn't, which means the problem languished until discovered in Inbox. GWT Protobuf was probably a 20% project of someone at the time, implementing the minimal features they needed.

Also, debugging it was a nightmare, because as soon as Object.keys(big sparse array) was encountered, the Firefox debugger would essentially freeze/die, and we couldn't get information out. Single-stepping through a huge ginormous bit of code after bisecting was how I tracked it down, because when I tried to console.log(Object.keys(big sparse array)) it would die.

I'm not blaming Firefox, I'm not sure the JS specification even says what the right thing to do with things like Object.keys(sparse array), maybe it was unspecified/vague behavior? I'm just pointing out that there was absolutely no malice, and no desire to block Inbox from running on FF, or IE10 or WebKit for that matter. It's always basically a matter of launch schedules, late discovered bugs, and triage.
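For reference, a small sketch of how this looks in current engines. As far as I know, today's ECMAScript specification has Object.keys return only an object's own enumerable string-keyed properties, and array holes are not properties, so nothing dense needs to be built. The array size here is an arbitrary choice for illustration, and the variable names are made up.

```typescript
const sparse: string[] = [];
sparse[1] = "id";
sparse[1000000] = "extension";

// A plain index loop touches every slot, holes included.
let indexVisits = 0;
for (let i = 0; i < sparse.length; i++) {
  indexVisits++;
}

// Object.keys lists only the indices that actually exist, as strings.
const sparseKeys = Object.keys(sparse);

// forEach skips holes as well.
let forEachVisits = 0;
sparse.forEach(() => {
  forEachVisits++;
});

console.log(indexVisits);   // 1000001
console.log(sparseKeys);    // [ '1', '1000000' ]
console.log(forEachVisits); // 2
```

This says nothing about what older SpiderMonkey or V8 versions did internally; it only shows the observable result a utility function would rely on.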


That's fun to know :)

SpiderMonkey's dictionary object representation leaves a lot of room for improvement. The issue you cite here isn't specifically related (sounds like it could have been fixed with a one or two line change) but I can describe one of my (still standing) pet peeves about the implementation of objects in SpiderMonkey:

Dictionary objects are what we call objects that have fallen off the happy path of tracked property-names, and become degenerate associative maps from keys to values. They use a representation where the key-mapping for the object's bound names is kept in a linked entry hashtable (a hashtable where the entries form a doubly linked list) structure that hangs off of the hidden type of the object. Every lookup for a property (including array indexes) involved first pulling this hashtable out, then looking up the property on the hashtable, to obtain a shape, which gives the _offset of the property on the original object_, and then using that offset to look up the value on the original object.

All said and done, there were about half a dozen to a dozen distinct heap accesses, and pollution of about 6-7 cache lines, just to retrieve a single property on an object that had gone into dictionary mode (which is what sparse arrays would become).
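The lookup chain can be sketched as a toy model. Everything here is a simplification for illustration: `Shape`, `HiddenType`, `DictObject`, and `getProperty` are hypothetical names, the linked-list part of the hashtable is omitted, and the `hops` counter only marks the indirections in this model, not real heap accesses or cache lines.

```typescript
// Toy model only; the real engine structures are far more involved.
interface Shape {
  slot: number; // offset of the value on the original object
}
interface HiddenType {
  table: { [key: string]: Shape }; // stands in for the linked entry hashtable
}
interface DictObject {
  type: HiddenType;
  slots: string[];
}

let hops = 0; // counts indirections in this model, for illustration only

function getProperty(obj: DictObject, key: string): string | undefined {
  const table = obj.type.table; // object -> hidden type -> hashtable
  hops += 2;
  const shape = table[key];     // hashtable lookup yields a shape
  hops += 1;
  if (shape === undefined) {
    return undefined;
  }
  const slot = shape.slot;      // shape -> offset
  hops += 1;
  hops += 1;                    // offset -> value back on the original object
  return obj.slots[slot];
}

const hiddenType: HiddenType = {
  table: { "1": { slot: 0 }, "10000000": { slot: 1 } },
};
const sparseLike: DictObject = { type: hiddenType, slots: ["id", "extension"] };

const found = getProperty(sparseLike, "10000000");
console.log(found); // extension
console.log(hops);  // 5
```

Even in this stripped-down form, a single property read bounces between the object, its type, the table, the shape, and back to the object.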

Fixing the object representation was on my long-term todo-list for a while. It is a very time-consuming task because all the JITs and other optimization layers were aware of it, so any changes to it would involve adjusting a ton of other code.

> I'm not blaming Firefox, I'm not sure the JS specification even says what the right thing to do with things like Object.keys(sparse array), maybe it was unspecified/vague behavior? I'm just pointing out that there was absolutely no malice, and no desire to block Inbox from running on FF, or IE10 or WebKit for that matter. It's always basically a matter of launch schedules, late discovered bugs, and triage.

One thing you learn working on any sort of a public facing project a lot of people use is that people, especially the most emotionally invested people, will assign motivations to you personally that have no external reference points except their interpretation of events.

I've encountered that working at Mozilla, but thankfully largely been sheltered from direct consequences. You've arguably worked on even more public projects.

There's no need to pollute your commentary with defences that aren't owed.


So I've experienced Docs getting, hmm, sad once the doc you're editing gets beyond something like 30-50 pages.

Does this change mean that I can look forward to being able to write hundreds or thousands of pages in a Google Doc without it getting periodically non-performant?


I do a lot of game design documents that use images heavily, and it chugs hard around ~20 pages with ~30 images total on them. Macbook 2019, i9


My M1 does not have performance issues like that with docs 3x the size. Breeze right through even with 20+ other tabs open.


Yes, but most of us aren't living in the future yet. We're stuck with Intel CPUs for the time being.


A majority of users don't have an M1 processor though.


I sat by Steve when Writely joined Google. [Hi Steve!] I sat by the Google Page Creator team when they were a thing. Regardless of performance, it's a miracle that a WYSIWYG editor can be written on top of the DOM at all, let alone a performant one. They had to work multiple miracles a day just to get bulleted lists to work somewhat reliably.

I have no doubt whatsoever that a Canvas-based editor can be faster and easier to maintain. I don't know how well it'll handle accessibility issues, though. I expect they'll have to do a lot of tedious work to get screen readers and the like to be happy.


I have nothing to add to the discussion, other than I’ve been using Writely since ~2005-2006 and wanted to say thanks for all the fish!

It was super handy before I had a laptop for regular use. I used it at public libraries for projects in my last year of high school. It helped me develop a habit of having a third-space workplace that was away from home and school.


Thanks!

The "floating workspace" aspect has always driven at least as much usage as the "collaboration" aspect. That came as a complete surprise to us, but it turned out to be very important to adoption. At some point I think we determined that the average document had something like 1.1 collaborators.


The name Writely reminds me of a side project I worked on around 2010. I was not satisfied with the performance of Google Docs and its competitors at the time like EditGrid and thought (naively, as it turned out) I'd be able to develop a faster alternative.

I had no idea what I was doing and thinking that using JavaScript to manipulate the DOM was going to be slow, I chose ActionScript and Flash as the language and runtime to develop the project in. I wrote a client-side expression parser and formula engine, and managed to develop a functioning spreadsheet UI with resizable rows and columns, copy and paste with Excel-like animations, cell references etc.

The problem that I ran into was text-rendering when there was a lot of text on the screen. The application would consume a lot of memory and the page would slow down to a crawl when scrolling. I couldn't really find a way to speed up the performance and stopped working on the application after some time. That's when I realized the incredible amount of work that went into Google Docs and other web-based spreadsheets. :)


Do you mean Google Sheets instead of Google Docs?


You are correct. I meant Google Sheets instead of Google Docs. I thought I'd tackle the word processor part once I got the spreadsheet to a usable state. I do think the Google Docs suite is used as an umbrella term to refer to all the Google collaboration tools like Docs, Sheets, Forms etc.


How would you recommend someone to get started learning about the architecture of Text editors / word processors?

I think Monaco from vscode is probably an interesting read but I’ve never looked at such a big open source code base before.

Is there something you can recommend to understand better how it works architecturally?


There is no substitute for building one yourself, but The Craft of Text Editing book has a lot of accumulated wisdom. It is Emacs-centric, but the basics are the same.

http://www.finseth.com/craft/


One of my first programming projects as a teenager back in terminal-type days was to write my own text editor for the Atari ST. I was super happy with it, and sold three (3) copies of it! That made me very happy at the time.

Of course there was that time that I messed with the save/load code and destroyed the text files of one of my customers. Not so happy with that! Saved it by writing a fix system, and that actually led to being hired at that guy's company for my first "real" job. ;)


Former ST user here: out of curiosity, what was that text editor and company?


Ahahaha! You said company. ;) It was just me, a teenage kid, selling to people who came through the store I worked at selling computers (the store sold the Atari line).

I called it DEdit, because every programmer wants to grab a single letter title.


I'd love to have a good answer for you, but I learned the basics all the way back in the '80s. Seems like I've seen references posted occasionally on HN, hopefully someone has a good link.


There were a few really interesting blog posts about the architecture choices for the xi editor a few years ago, ending with this

https://raphlinus.github.io/xi/2020/06/27/xi-retrospective.h...

But I personally never worked on this kind of problem, I just remember reading these over the years


Here’s a blog post detailing some vscode internals: https://code.visualstudio.com/blogs/2018/03/23/text-buffer-r...


I recommend reading about the internals of CodeMirror and ProseMirror: e.g. https://codemirror.net/1/story.html.


I don't think there's any better way than to build a text editor from scratch. You have to understand the exact problem before reading other people's solutions.


Might this be helpful? I recall seeing this on HN awhile back.


A good open-source example of this type of problem is CodeMirror (a code-editing widget for the web). To achieve syntax highlighting and everything else, it basically fakes every single aspect of text editing in a browser context - even the cursor and text highlighting - replacing the native versions with pixel-placed DOM elements driven by JS. It receives raw events from an invisible textarea and does everything else itself.

This is just about the worst possible use-case for the DOM: you get almost none of the benefits, and still get most of the costs.


Edit: To be clear, I'm not saying this to rag on CodeMirror. It's a marvel that they got it working as well as it does, and it's been around for a long time- possibly longer than the canvas API has been widely available. It's just that doing things this way requires a pile of hacks and I can see a strong argument for just cutting out the middle-man and doing 100% custom rendering.


VS Code and the like do the same, right? I'm pretty sure this will all soon be optimized, if it isn't already.


> If you've ever had to fight with an app framework

... it usually means it’s the wrong tool for the job.

I have to ask, why not a native app? Once you start bypassing every browser feature anyway, what’s the point of using a browser?


Distribution. It is so, so, so much harder to get most users to install an app, especially for casual purposes ("please add your notes to this doc") which is often how people first come to Docs.


That's a common belief but there are a lot of exceptions that mean we should question how true this really is.

1. Mobile-first, mobile-only apps.

2. Minecraft or really any game.

3. The proliferation of Electron apps that are basically downloadable versions of the website.

4. Apple's own suite of apps. Keynote is pretty darn popular.

In the case of a user who is really, really unmotivated to comment on a doc, sure. Then every click, every second counts because the user doesn't really have a fixed need to complete the task to begin with. For most other things, users are willing to download apps and may even prefer it.

It's also worth considering that Writely/Docs never really supplanted Word and is still rather feature poor even after a decade of continuous development, perhaps because they keep having to rewrite the rendering engine. If Docs was a downloadable app with a simple web-side static renderer + commenting engine, it might have obtained features that could offset any loss of casual users due to needing a download to collaborate. Especially if the download was fast, tight and transparent.


I think the major advantages for me are easy bookmark-ability and sharing via URLs with people who may not have the app. You can bodge workarounds for those things in an offline app, but now the bookmarks aren't in my browser or I have to copy them to it, or the people I share the doc to are just looking at a browser rendered viewer or something.

While Docs hits 95% of my needs there's still that 5% and I suspect most of those are held back by the current implementation architecture. Hopefully moving to a Canvas based system will enable them to more easily add complex features.


On the other hand, allowing for a native desktop app could have caused it to end up in the same state as Microsoft's apps on the web (e.g. the web version of PowerPoint is pretty awful), which would have undermined a key differentiator between Docs and Microsoft's suite.

I am not sure what has held gSuite back all these years, but the pandemic seems to have brought them out of their slumber.


What I don't understand is why google doesn't just do what they do all the time and add 20 new APIs for it and strongarm everyone else into having to implement them too


Out of curiosity, what have you moved on to now that you don't do browser coding?


After leaving Google I founded scalyr.com. To get an idea of the type of work we're doing, this old blog post holds up: https://www.scalyr.com/blog/searching-1tb-sec-systems-engine....


Kids these days usually don't understand exactly how fast hardware actually is. ;)

"I have 200 million entries in a table I need to compress. I'm gonna write a flume job! I estimate it will take 5 minutes to start the job and a half hour to run! Then I'll spend a few days figuring out how to shard it so it actually finishes."

"Sounds good. But I also have this bit of Java code here that does the same compression on my desktop in about 30 seconds. Would you like that instead? You could convert it to C++ if that would make you feel better."

yellowbrick.com did some neat stuff pushing the query algebra down into the flash storage firmware.


Does the challenge of word processors extend to Chromium-based IDEs like VS Code? I wonder if those could also be optimized by going to a non-DOM-based approach?


They are already doing it for some features, https://code.visualstudio.com/blogs/2017/10/03/terminal-rend...


I assume they have used CSS to apply the styles on each text span, and now they don't?


I wrote the terminal canvas renderers in VS Code that have been called out a few times here. Initially I implemented a canvas renderer using just a 2d context to draw many textures which sped things up "5 to 45 times"[1] over the older DOM renderer.

Since then I moved onto a WebGL renderer[2] which was mostly a personal project. It's basically the first canvas renderer but better in every way, since it works by organizing a typed array (very fast) and sending it to the GPU in one go, as opposed to piecemeal and having the browser do its best to optimize/reduce calls. This was measured to improve performance by up to 900% in some cases over the canvas renderer, but actually much more than that if for example the browser has GPU rendering disabled and tries to use the canvas renderer on the CPU.
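A minimal sketch of the "organize a typed array, send it in one go" idea. The per-cell layout, `Cell`, and `packCells` are assumptions made up for illustration and are not xterm.js's actual buffer format; the GPU upload itself is only described in a comment so the snippet runs outside a browser.

```typescript
// Hypothetical per-cell layout: [x, y, glyphIndex, colorIndex].
const FLOATS_PER_CELL = 4;

interface Cell {
  glyphIndex: number; // position of the glyph in a texture atlas
  colorIndex: number; // position of the color in a palette
}

// Pack every cell of the grid into one contiguous Float32Array.
function packCells(rows: Cell[][], cellWidth: number, cellHeight: number): Float32Array {
  let count = 0;
  for (const row of rows) {
    count += row.length;
  }
  const buffer = new Float32Array(count * FLOATS_PER_CELL);
  let i = 0;
  rows.forEach((row, y) => {
    row.forEach((cell, x) => {
      buffer[i++] = x * cellWidth;
      buffer[i++] = y * cellHeight;
      buffer[i++] = cell.glyphIndex;
      buffer[i++] = cell.colorIndex;
    });
  });
  return buffer;
}

const packed = packCells(
  [[{ glyphIndex: 7, colorIndex: 1 }, { glyphIndex: 8, colorIndex: 2 }]],
  9,
  18
);
console.log(packed.length); // 8

// In a browser, the whole buffer could then go to the GPU in a single upload,
// e.g. gl.bufferData(gl.ARRAY_BUFFER, packed, gl.DYNAMIC_DRAW), followed by
// one draw call for all cells instead of one canvas call per glyph.
```

The point of the layout is that per-frame work on the JS side becomes plain writes into a preallocated array, with the per-cell drawing left to the GPU.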

My opinion here is that canvas is a great technology, capable of speeding things up significantly and getting close to native app performance. It comes with very real trade offs though:

- Cost of implementation and maintenance is much higher with canvas. This is particularly the case with WebGL, there have been very few contributions to xterm.js (the terminal frontend component) in the WebGL renderer because of the knowledge required.

- Accessibility needs to be implemented from scratch using a parallel DOM structure that only gets exposed to the screen reader. Supporting screen readers will probably also negate the benefits of using canvas to begin with since you need to maintain the DOM structure anyway (the Accessibility Object Model DOM API should help here).

- Plugins/extensibility for webapps are still very possible but requires extra thinking and explicit APIs. For xterm.js we're hoping to allow decorating cells in the terminal by giving embedders DOM elements that are managed/positioned by the library[3].

More recently I built an extension for VS Code called Luna Paint[4] which is an image editor built on WebGL, taking the lessons I learned from working on the terminal canvas renderer to make a surprisingly capable image editor embedded in a VS Code webview.

[1]: https://code.visualstudio.com/blogs/2017/10/03/terminal-rend...

[2]: https://code.visualstudio.com/updates/v1_55#_webgl-renderer-...

[3]: https://github.com/xtermjs/xterm.js/issues/1852

[4]: https://marketplace.visualstudio.com/items?itemName=Tyriar.l...


Okay, if anyone else felt ashamed after reading "Accessibility needs to be implemented from scratch", raise your hands with me. SW engineers suffer from assuming everyone is like them and there are no corner cases. My Mom was recently sued for violating ADA with her real estate website not working with screen readers well enough.


I’ve never worked somewhere that treated a11y as a feature, with the commensurate resources put towards implementing it. It’s always ignored and then sometimes maybe worked on as an afterthought, during a hackathon or whatnot.

In other words, even when engineers are aware of it and inclined to do something about it, mgmt still has to care, and I’ve just never seen that once.


Do you think the large performance benefits can be achieved for any general web app (e.g. if I rewrite my Vue app's render functions to using a canvas instead of the DOM) or is the canvas' benefits mainly for niche workloads?


Definitely niche workloads or when the performance benefit from a UX perspective is worth the cost of implementation. Start out with virtualizing the DOM so only the visible parts are showing, if the framerate isn't acceptable after that then consider switching to canvas.
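As a sketch of that first step, here is the core calculation behind virtualizing a long list, assuming fixed-height rows. `visibleRows` and its `overscan` parameter are hypothetical names for illustration, not any particular library's API.

```typescript
interface VisibleRange {
  first: number; // inclusive row index
  last: number;  // inclusive row index
}

// Work out which rows need DOM nodes for the current scroll position.
// `overscan` renders a few extra rows above and below to hide pop-in while scrolling.
function visibleRows(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  rowCount: number,
  overscan: number = 2
): VisibleRange {
  const first = Math.max(0, Math.floor(scrollTop / rowHeight) - overscan);
  const last = Math.min(
    rowCount - 1,
    Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1 + overscan
  );
  return { first, last };
}

const visible = visibleRows(1000, 600, 20, 1000000);
console.log(visible); // { first: 48, last: 81 }
```

With a million rows, only 34 would have DOM nodes at this scroll position; the rest are represented by an empty spacer of the right height.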

Using the terminal as a case study, its DOM renderer needs to swap out many elements per row in the terminal every frame (the number depends on text styles and character widths) and we want to maintain 60fps. It's also not just a matter of maintaining a high frame rate, since more time rendering means less time processing incoming data because they share the main thread, which means commands will run slower.


In my experience no, at least on current iterations. For very specific things such as a text editor, where the DOM isn't really prepared to deal with the way the editor has to be structured, probably yes, if you know what you're doing. But for most things, not really: to have the same functionality you would need to implement a lot of things yourself (even if functionally it would work, it wouldn't have the same accessibility unless you did that yourself, and I'm not sure how much you can fully emulate it).


I remember Flipboard using canvas to render their UI before using react, which has the same idea, you can look at the repo and their post about it:

https://github.com/Flipboard/react-canvas


From what I understand DOM is pretty shit in terms of performance because it needs to support so much legacy crap (eg. float layouts) - so even using sandboxed WebGL (which adds overhead over native APIs which your browser would use to render) you can still be much faster.


> Supporting screen readers will probably also negate the benefits of using canvas to begin with since you need to maintain the DOM structure anyway (the Accessibility Object Model DOM API should help here)

The fact that the DOM elements are invisible (don't affect layout) should eliminate the majority of the performance cost, right?


They can't be display: none as that would mean the screen reader can't access them. To do this properly and help low vision people, you need to make sure the textarea is synced with the cursor position and that all the text is positioned roughly where the text on screen is. By doing this the screen reader will correctly outline the element being read.

There may also be additional costs like the string manipulation required to build the row text in the terminal, this is nothing that can't be optimized but then that's more memory and cache invalidation to worry about.


In 2009, I joined Mozilla and started working on the Bespin[1] project, which Ben Galbraith & Dion Almaer had brought to Moz. Bespin was built with a canvas-based renderer. Bespin was way faster than other browser-based code editors at the time.

Then the Ajax.org/Cloud9 folks came along with their Ace editor[2], which was DOM-based and still very fast. We ended up merging the projects. edit to add: and switching to DOM rendering

Rik Arends[3] was one of the Ajax.org folks and he's been working on a WebGL-based code environment called Makepad[4], which is entirely built in Rust and has its own UI toolkit. He's complained a lot about how difficult it is to make a performant JS-based editing environment.

My point in all of this is just that there are absolutely tradeoffs in performance, accessibility, ease-of-development, internationalization, and likely other aspects. If raw performance is what you're going for, it's hard to beat just drawing on a canvas or using WebGL. Google Docs needs to worry about all of that other stuff, too, so I'll be interested to see how this shapes up.

[1]: https://en.wikipedia.org/wiki/Mozilla_Skywriter

[2]: https://en.wikipedia.org/wiki/Ace_(editor)

[3]: https://twitter.com/rikarends

[4]: https://makepad.dev


It's funny — Google's approach here reminds me of the Netscape/Mozilla XUL tree[0] element.

For those unfamiliar, the XUL tree is a performant 1990s-era virtualized list that is able to render millions to tens of millions of rows of content without slowdown since it gives you the option of rendering internally in Firefox rather than through the DOM.
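
The core trick behind any virtualized list is small enough to sketch. This is not the XUL implementation, just the general windowing arithmetic, assuming fixed-height rows; `visibleWindow` and the `overscan` parameter are names I picked for illustration.

```typescript
interface VisibleWindow {
  first: number; // index of the first row to draw
  last: number;  // index of the last row to draw (inclusive); -1 if nothing to draw
}

// Given the scroll offset, work out which rows intersect the viewport.
// Only those rows are drawn, so the cost is independent of rowCount.
function visibleWindow(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  rowCount: number,
  overscan: number = 2, // extra rows above and below to hide scroll gaps
): VisibleWindow {
  if (rowCount <= 0 || rowHeight <= 0) return { first: 0, last: -1 };
  const firstVisible = Math.floor(scrollTop / rowHeight);
  const lastVisible = Math.floor((scrollTop + viewportHeight - 1) / rowHeight);
  return {
    first: Math.max(0, firstVisible - overscan),
    last: Math.min(rowCount - 1, lastVisible + overscan),
  };
}
```

With a 600px viewport and 20px rows this touches a few dozen rows per paint whether the list holds a thousand rows or ten million.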

I still don't completely understand why Mozilla is/was planning to axe[1][2] it since there's no web-based HTML5/JS replacement (the virtualized "tree" is implemented in C++, iirc) and it's still being actively used in places.{xul/xhtml}[3] and the Thunderbird/SeaMonkey[4][5] products.

It's interesting that both Google's canvas bet (Flutter, Docs, etc.) and the Mozilla XUL tree are basically trying to solve a nearly identical problem (DOM nodes are expensive and DOM manipulation is slow) ~20-25 years apart.

[0]: https://developer.mozilla.org/en-US/docs/Archive/Mozilla/XUL...

[1]: https://docs.google.com/document/d/1ORqed8SW_7fPnPdjfz42RoGf...

[2]: https://bugzilla.mozilla.org/show_bug.cgi?id=1446341

[3]: chrome://browser/content/places/places.xhtml

[4]: https://www.thunderbird.net/

[5]: https://www.seamonkey-project.org/


> I still don't completely understand why Mozilla is/was planning to axe[1][2] it since there's no web-based HTML5/JS replacement (the virtualized "tree" is implemented in C++, iirc) and it's still being actively used in places.{xul/xhtml}[3] and the Thunderbird/SeaMonkey[4][5] products.

XUL is a maintenance burden and exacts a development tax on new features (having to make the Servo CSS engine support XUL so that it could be uplifted to Firefox was extremely annoying). It's also full of security problems, as it's written in '90s C++ that nobody is around to maintain properly. Getting rid of it is an inevitability.


Thanks for the reply! I'm a big fan of your work. I can only imagine the nightmare of trying to implement two separate XUL <=> HTML/CSS flex/box models in Servo/Rust.

For readers that are unaware, there is also a great blog post breaking down some of these points in finer detail [0][1].

I guess my question is — are there replacements planned for any of the legacy yet performant XPCOM interfaces / XUL elements like nsITreeView/tree? My tl;dr understanding of XUL trees is that the DOM is and always has been too slow to render millions of scrollable rows in a performant manner (bookmarks, thunderbird, etc.). Would it not be possible to re-implement the XUL tree logic in Rust, for example? Is the goal to completely get rid of all non-standards compliant elements in the long-run?

It seems like there will always be some custom elements necessary for a native desktop interface which can never be integrated into HTML...

"I’ve talked about this before, but things like panel, browser, the menu elements (menu, menupopup, menucaption, menuitem, menulist) don’t have HTML equivalents and are important for our desktop browser experience. While we could invent a way to do this in chrome HTML, I don’t think the cost/benefit justifies doing that ahead of the rest of the things in our list." [2]

..., yet I don't see much discussion about this anywhere.

I'm particularly interested because I'm currently working on a XULRunner project where a <tree> is central to the user interface (millions of rows, image column, embedded data, must run on macOS/Windows/Linux/*BSD, etc.), and it's a little alarming that there is an open bugzilla ticket that did not initially mention either the performance or the ecosystem implications (essentially kill Thunderbird, kill SeaMonkey more than it already has been) of its removal.

I think the one part I have trouble with is that implementing a native looking/performant cross-platform desktop UI is still a nightmare and XUL could have potentially been a fantastic desktop-focused superset/companion of/to HTML.

[0]: https://yoric.github.io/post/why-did-mozilla-remove-xul-addo...

[1]: https://news.ycombinator.com/item?id=24231017

[2]: https://briangrinstead.com/blog/xbl-replacement-newsletter-2...


I mean, you probably won't like this answer, but I don't think you should be writing a XUL-based app in 2021 if you want it to be useful, as opposed to for fun. XUL is 25-year-old legacy technology, and using it is an exercise in retrocomputing.


Fair enough.

Sorry, I should have clarified a bit more — I'm writing a cross-platform desktop application that has a preact[0] frontend (+ a Go backend) using `firefox --app application.ini`.

I have been experimenting with performant lists (which is why I brought up the XUL tree — it's currently central to the interface, though not the final implementation for sure) — I'm currently only using the XUL window/menubar elements in order to populate the native macOS menubar.

I am a fullstack web dev in my day job, so my goal here is to write a fast, easily extendable UI that I can quickly iterate upon using modern html/js/css/etc.

I love gecko and used to write XUL add-ons many years ago, so I'm already familiar with JS code modules, XPCOM, XUL, the internal browser architecture etc.

Basically, I'm now using XULRunner (`firefox --app application.ini` as previously mentioned — will eventually be stubbed into a native macOS .app/OS program) as a replacement for Electron / Chromium Embedded Framework[1].

I'm basically doing the same thing as Positron[2]/qbrt[3].

[0]: https://preactjs.com/

[1]: https://en.wikipedia.org/wiki/Chromium_Embedded_Framework

[2]: https://github.com/mozilla/positron

[3]: https://github.com/mozilla/qbrt


While you're definitely correct, I enjoyed working with Komodo IDE which was (is?) built around XUL even up to a few years ago. A neat API for extending and hacking around on it, with quite nice discoverability. I'm sad to see it go, but it makes perfect sense as to why!


Probably still better than anything HTML.


I really doubt anyone is going to revive those old legacy widgets. That style of widget is flawed and predates MVC-style design; you'll find it becomes nearly impossible to present non-string data with that tree. Most applications now will want to show arbitrary widgets within the table, and for that they'll use the standard HTML/CSS. I would expect you can do something comparably fast by using a virtualized list in HTML, along with IndexedDB.


I'm not a XUL or JavaScript expert, but there is evidence that you can implement a virtualized list in regular HTML:

https://react-window.vercel.app/


I recently wrote a somewhat performant react-virtualized[0] list for a project at work, though it's definitely a bit trickier in plain HTML/JS.

As far as the XUL virtualized tree goes, a couple of Mozilla engineers wrote some examples using plain html/javascript + DOM node manipulation[1]. While promising, I can't imagine that this implementation could ever be as fast as the compiled C++ one[2].

[0]: https://github.com/bvaughn/react-virtualized

[1]: http://benbucksch.github.io/trex/fastlist-test.html

[2]: https://searchfox.org/mozilla-central/source/layout/xul/tree...

https://news.ycombinator.com/item?id=14158170


tangential but amusing — the old chrome://global/content/config.{xul/xhtml} used a XUL tree and rendered 0 DOM nodes to display its treechildren whereas the new about:config renders upwards of 4500 <tr> DOM nodes by default

ignoring the fact that you can no longer sort by specific columns (name, status, type, value, etc), you can really feel how slow the new implementation is if you click the "Show Only Modified Preferences" button — the DOM update feels incredibly sluggish whereas both searching and sorting columns in the old xul tree always felt snappy and instantaneous

https://imgur.com/a/abhYoW8


Yeah it's a real shame it's all moving to HTML/JS for the chrome & e.g. devtools, even if it's more "maintainable".


I use the modern Firefox HTML5/JS dev tools on a daily basis and love the featureset that they provide, though it is equally shocking to compare the feel to that of the old DOM Inspector[0] (and Venkman[1], the old JS debugger), which was a XUL add-on for DOM inspection that used to run in Firefox, Thunderbird, and SeaMonkey.

What feels snappy and instantaneous in DOM Inspector feels somewhat muddy and laggy in the modern devtools.

While I greatly appreciate the amount of features that Mozilla has integrated into the modern (post-firebug) devtools over the years, it is a little sad that the next generation will never get to experience just how fast some narrow aspects of web development used to be.

https://imgur.com/a/JZoSTXf

[0]: https://addons.thunderbird.net/en-us/firefox/addon/dom-inspe...

[1]: https://addons.thunderbird.net/en-us/firefox/addon/javascrip...


Thanks for the shoutout to the legacy DOM Inspector (which I used to maintain) and Venkman (which I have both admired and used in anger). When Mozilla decided to put together a devtools team for Firefox 4, I was disappointed when I realized that they weren't going to take any effort to make sure the fruits of their labor sidestepped any of the performance issues that Firebug had exhibited for most of its lifetime. I do want to quibble, though, about the suggestion that this is a matter of XUL+JS versus HTML+JS. I say this even as someone with strong feelings about what a joy XUL was in comparison, and a long-lasting bitterness over the decision (among many) that Mozilla made in mishandling its own future.

WebKit's Web Inspector has for a long time gone with HTML, and in all its incarnations I've ever tried out, it has always been snappier than either Firebug or the devtools that ship with Firefox.

When making comparisons like this, it's important to keep in mind that you're comparing/contrasting teams and their output, and it's not just a matter of the building blocks they're using. Some teams do better work than others.


Venkman!


It's all about quantity over quality, lowest-common-denominator crap. Quality and craftsmanship have gone down the drain with "modern" software.


Similarly, my understanding is that the TreeStyleTab extension was forced to migrate in the same manner and it can get very sluggish as well when you have a lot of tabs open.


Why can't XUL vs DOM just be the same data with a fast C++ API and a slow JS API?


> Why can't XUL vs DOM just be the same data with a fast C++ API and a slow JS API?

This is a great question! I'm not really qualified to answer it, but I'll give it a try.

My understanding is that the XUL tree is fast because it implements the XPCOM C++ nsITreeView[0][1][2] interface.

If you're writing a XULRunner program ...

(Firefox "is distributed as the combination of a Gecko XUL runtime — libxul, other shared libraries, and non-browser-specific resources like those in toolkit/ — plus a Firefox XUL application — mostly just the files in Contents/Resources/browser/, plus the 'firefox' stub executable that loads Gecko and points it at a XUL application", see [3])

..., XPCOM[4] allows you to invoke those implemented interface methods directly from JavaScript.

XPCOM is a technology that, since the removal of XUL/XPCOM addons, is inaccessible to everyone except for Mozilla devs and those who write XULRunner programs using `firefox --app /path/to/application.ini`.

So, some XUL elements (like <tree>) implement an XPCOM interface that invokes native C++ (or rust, python, java, etc.) code, which is statically compiled directly into the Gecko XUL runtime.

Modern HTML5 elements, in general, must utilize the native interpreted browser DOM/JavaScript and cannot choose to implement/satisfy an arbitrary internal XPCOM interface. While I'm sure that Mozilla has figured out a way to make these elements fast (C++, Rust, I have no idea), you are always bounded by the limitations of the DOM.

So, my understanding is that, because we are relying on standards-compliant HTML5 elements which mutate the DOM, we cannot specify and implement new XPCOM interfaces ("with a fast C++ API") that could theoretically bypass the DOM — we /must/ rely on the "slow JS API."
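
To illustrate the shape of the idea (not the real API), here is a sketch of a pull-style view: the widget asks a view object for cell text on demand instead of holding one node per row. `RowView` is loosely modeled on nsITreeView's `rowCount` / `getCellText` pair, but it is heavily simplified, and taking the column as a plain string is my assumption rather than the real signature. `SyntheticView` and `paintRows` are made up for the example.

```typescript
// Simplified, hypothetical view interface: the widget pulls data on demand.
interface RowView {
  readonly rowCount: number;
  getCellText(row: number, column: string): string;
}

// A view over ten million rows that stores nothing per row.
class SyntheticView implements RowView {
  readonly rowCount = 10000000;
  calls = 0; // counts how many cells the widget actually asked for

  getCellText(row: number, column: string): string {
    this.calls++;
    return `${column} #${row}`;
  }
}

// Stand-in for the widget's paint step: only rows in [first, last] are queried.
function paintRows(view: RowView, first: number, last: number, columns: string[]): string[] {
  const lines: string[] = [];
  const end = Math.min(last, view.rowCount - 1);
  for (let row = Math.max(0, first); row <= end; row++) {
    lines.push(columns.map((c) => view.getCellText(row, c)).join(" | "));
  }
  return lines;
}
```

Painting three visible rows with two columns costs six `getCellText` calls, no matter how large `rowCount` is. In the XUL case the part doing the asking is compiled code inside the runtime; here it is just a loop.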

[0]: https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/...

[1]: https://searchfox.org/mozilla-central/source/layout/xul/tree...

[2]: https://searchfox.org/mozilla-central/source/layout/xul/tree...

[3]: https://mykzilla.org/2017/03/08/positron-discontinued/#comme...

[4]: https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM


for future readers — ironically, there were actually WIP patches to re-implement the XUL tree (nsITreeView) using an HTML5 canvas years ago

https://bugzilla.mozilla.org/show_bug.cgi?id=441414#c172

---

  "What has been will be again,
  what has been done will be done again;
  there is nothing new under the sun."
—Ecclesiastes 1:9

"pining for a future that never arrived"

https://en.wikipedia.org/wiki/Mark_Fisher#Hauntology


Try holding down Alt on https://makepad.dev/

Such a cool feature that you can't really do with DOM based solutions (VSCode could never do this).


It may sound stupid, but this was the feature i tried to add to ACE and i couldn't. And i spent the last decade trying to invent a drawing API that would let me do this effect.


Would anyone mind explaining/screenshotting what it does, for the benefit of those of us on phones?


it basically folds (hides) the implementation code of every method in the file, giving an API-like view, but with a smooth animation shrinking the text instead of instantly disappearing it


Nnnnnnng, this is awesome, and annoying because it's so good. Imagine coupling it with the mouse wheel to zoom in/out.

I'd love to have this in MacVim/gvim somehow.


Can someone explain why https://makepad.dev/ is extremely slow and "unusable" on the Microsoft Edge browser but runs smoothly on Chrome? Is it because of bad WebGL perf or JavaScript perf in general?


Did you try using dev/canary Edge? It should pretty much be the same rendering engine and JS engine as Chrome. Definitely report this to the Edge team if you have the time (very easy to do from the dev version of Edge). In my experience, they are very responsive to bug reports and feature suggestions.


Back in the day I made it work on Edge on an Xbox. This was Microsoft's browser + JS engine. Nowadays it's just Chrome though. If it has problems, I'd be highly surprised.


Another possibility is code that acts differently depending on user-agent.


It most definitely does not.


Edge or Edgium?


For me, it's not very fast in chrome.


You might be getting the software rendering webgl implementation if there are GPU driver problems (https://blog.chromium.org/2016/06/universal-rendering-with-s...).

(Of course in principle you don't need hardware acceleration to make this kind of thing run fast...)


Can you give a bit more flavor to "not very fast"? I'm on a measly chromebook and scrolling, selecting text, expanding directories, everything is smooth and high framerate.

Is there a specific operation that is not fast? Opening it for the first time took a few seconds but afterwards it was pretty buttery.


Makepad kinda has a minimum GPU spec. It's aging-out for the people who don't have this, but some people still don't have gpu's that can bitblit their screen with a solid color.


I tried for it not to be; however, there is no specific technical reason, no.


> ... it's hard to beat just drawing on a canvas or using WebGL.

Both of these APIs perform quite poorly for what they're doing.

To compete with native, the web platform needs simple low-level APIs that do not have a lot of Javascript marshalling overhead and other performance cliffs. You can always build a more convenient library above low-level interfaces, but the opposite is not true.


They don't need to be better than native, they just need to be better than the DOM plus have the advantages of web-based distribution.


It seems like WebGPU is the next thing:

https://github.com/gpuweb/gpuweb/wiki/Implementation-Status


It is, but with luck version 1.0 will arrive at the end of the year and then there is the whole adoption rate.

Now given that WebGL support is still hit and miss, that the only way to debug is to rely on native GPGPU debuggers, and that you get the pleasure of differentiating between the browser's own rendering code and the application's, that shows how easy it is to do 3D on the Web.


>Both of these APIs perform quite poorly for what they're doing.

When it comes to Canvas, do you mean that it actually performs poorly when putting pixels on the screen using putImageData, or do you mean that it does that fine but it performs poorly when it comes to drawing vector graphics? In either case, do you know why it performs poorly?

Personally, I would be happy if Canvas just let you put raw pixel data on the screen and did that as well as possible. I have never felt any need for its vector graphics features. To me, they seem too high-level for what Canvas is supposed to be. But I guess things are different when it comes to using the graphics card, since from what I understand it is actually optimized for drawing polygons.
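
For anyone who hasn't done this, "raw pixel data" here means a flat RGBA byte array. A minimal sketch of the CPU side, with `createFrame` and `setPixel` being names I made up:

```typescript
// Allocate an RGBA framebuffer once: 4 bytes per pixel, row-major,
// which is the same layout ImageData uses.
function createFrame(width: number, height: number): Uint8ClampedArray {
  return new Uint8ClampedArray(width * height * 4);
}

// Write one pixel. Out-of-range channel values are clamped to 0..255
// by the Uint8ClampedArray itself.
function setPixel(
  frame: Uint8ClampedArray,
  width: number,
  x: number,
  y: number,
  r: number,
  g: number,
  b: number,
  a: number = 255,
): void {
  const i = (y * width + x) * 4;
  frame[i] = r;
  frame[i + 1] = g;
  frame[i + 2] = b;
  frame[i + 3] = a;
}
```

In a browser you would then wrap the buffer with `new ImageData(frame, width, height)` and hand it to `ctx.putImageData(imageData, 0, 0)` on a 2D context. That last step is left out of the block so it runs outside a browser too.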


It performs poorly in either case. For one, Javascript APIs have significant marshalling overhead.

Secondly, Canvas is often "hardware-accelerated", which can make some things faster, but also slower because this kind of immediate-mode drawing doesn't match the GPU interface well. It's particularly slow at vector graphics. Some effects would require pixel readback, which is slow for the same reason.

> Personally, I would be happy if Canvas just let you put raw pixel data on the screen and did that as well as possible.

Drawing cached Bitmaps is relatively fast in Canvas, if you don't need too many calls. Getting arbitrary data into such a Bitmap is slow, so if you want to do it every frame you may run into issues.


> Secondly, Canvas is often "hardware-accelerated", which can make some things faster, but also slower because this kind of immediate-mode drawing doesn't match the GPU interface well. It's particularly slow at vector graphics. Some effects would require pixel readback, which is slow for the same reason.

https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasE...

One would assume/hope that specifying "bitmaprenderer" for the context type would give you a regular immediate-mode CPU rasterizer. Is that not the case?

> Getting arbitrary data into such a Bitmap is slow, so if you want to do it every frame you may run into issues.

To expand on this, doing that ("putting raw pixel data on screen") anywhere is slow if it's modified regularly. There just doesn't exist a fast CPU-buffer-to-display pipeline anymore, that died out years & years ago. So that one at least isn't a JS/web limitation, it's more a modern graphics architecture one. You just can't bit-bang pixels yourself anymore, not reasonably efficiently anyway. In theory that'd be possible on unified memory architectures (read: mobile devices & integrated graphics), but GPUs don't like to publish their swizzled texture formats so you still don't get to even there.


> One would assume/hope that specifying "bitmaprenderer" for the context type would give you a regular immediate-mode CPU rasterizer. Is that not the case?

No, that's something else.

> To expand on this...

I almost wrote something like that, but then I considered that I haven't really benchmarked this. Streaming data from CPU onto the GPU is certainly possible and graphics APIs do have hints for such usage. You also don't need to convert to a texture to get arbitrary data on the screen, a trivial shader can do that for you.

If your data/transformations naturally live in RAM/CPU, that may well be the most efficient thing to do.


I think this is a problem of the mainstream conception seeing the future browser mainly as a monopolized, walled garden (compared to GNU/Linux, which nowadays would do everything one wanted from a computer and more) with the canvas being a kind of framebuffer.

Back when I first read about the canvas, iirc there was no fancy CSS, no fancy custom elements and making a simple doodle-element or the famous doodle-jump as a webapp was ... - well, I guess there was flash. So if you think of HTML and DOM as a GUI toolkit, it filled an important void (and continues to do so) but nowadays no one wants to use (standardized) HTML anymore, so...

If you look into tk (or nowadays tkinter) you basically see the same with the Canvas-class (I think you can't draw anything custom at all in tk easily)!


I don't build extensions or work on much front end web lately but this reads like Google wants more control over their stuff. The web is becoming less open.

> By moving away from HTML-based rendering to a canvas-based rendering, some Chrome extensions may not function as intended on docs.google.com and may need to be updated.

> If you are building your own integrations with Google Docs, we recommend using Google Workspace Add-ons framework, which uses the supported Workspace APIs and integration points. This will help ensure there will be less work in the future to support periodic UI implementation changes to Docs.

This is basically putting an API on top of an API as far as I'm concerned. The web renders markup and executes javascript to produce experience. Putting an API on top and using canvas to render your content creates a more closed system.

For those more deep in web technology, I'd like to know if there are reasons to move to canvas for strictly technical merits.


Performance of DOM-based rendering is very problematic and not unified across browser implementations. Canvas rendering will likely increase the performance of Google Docs, and make the UX more unified across platforms. Google Docs is really an application built on the web platform. HTML DOM rendering was never intended to give developers the control they need to build fully featured, high-performance applications; we just shoehorned things until they sort of worked. I think this is a positive thing: UX will be better and the integration APIs will become much cleaner and not depend on the structure of how they design the UI. Concerns will be separated and the end result is something much cleaner, more performant, and more supportable.


I think this is the real reason for the change as well. A few years ago Visual Studio Code underwent a similar change where rendering the terminal moved from using DOM to canvas. I never noticed a huge difference between the two methods but I imagine using canvas gave them a lot more flexibility in addition to being more performant.


Here's an article that talks about the switch to canvas for VS Code. The "5 to 45" times faster part really sticks out to me. Kind of surprised it took Google this long to do this with Docs.

https://code.visualstudio.com/blogs/2017/10/03/terminal-rend...


I personally didn't even bother checking out VS code because it was based on electron, and so I figured the performance just wouldn't be there because it's doing all of this awkward web stuff while trying to be an IDE.

I was completely wrong though. Using it, it really doesn't feel like a web app at all. It's really shocking and impressive. It feels like a text editor. Perhaps I should learn more about what they're doing.


The major problem I have with electron apps is that they eat memory for breakfast, lunch, and dinner.

This happens with jvm applications as well, but you can limit the max heap size and force the garbage collector to work more, trading off speed for the ability to run more apps side by side.

AFAIK, you can't limit the memory used in electron apps, and they don't respond by sharing heap with their child processes. With enough extensions to make it usable, vscode easily eats GB of memory.

I like lsp. But I don't need vscode to do the rendering.


The majority of the problems with "Electron" are actually just problems with the development style used by the types of people who publish and consume packages from NPM.

We've gone from a world where JS wasn't particularly fast, but it powered apps like Netscape, Firefox, and Thunderbird just fine (despite the fact that the machines of the era were nothing like what we have today) and most people didn't even know it, to a V8-era world where JS became crazy fast, to the world we're in now where people think that web-related tech is inherently slow, just because of how poorly most apps are implemented when they're written in JS.

If you want to write fast code, including for Electron, then the first step is to keep your wits as a programmer, and the second step is to ignore pretty much everything that anyone associated with NPM and contemporary Electron development is doing or that would lead you to think that you're supposed to be emulating.


I agree, and this is something I have also witnessed in my own Electron project, where great care was taken to write fast and memory-efficient code. It doesn't really use that much memory compared to native applications when running; I've done comparisons.

I also feel that the problem is more with the style of JavaScript development rampant these days, where not much care is taken to write memory-efficient, or even merely efficient, code.

This has to do a lot of course with the high rise in people studying to become (mostly) web-developers, without any deeper degree in CS or understanding of how computers really work.


This isn't entirely the fault of Electron though, but the convenient data types exposed in a web environment. Beyond the baseline memory of running Chromium, you could use various tricks to keep memory very low such as minimizing GC (eg. declare variable up front, not within loops), use array buffers extensively, shared array buffers to share memory with workers, etc.


> eg. declare variable up front, not within loops

I have trouble believing modern JS engines wouldn’t optimize this to the same thing.


Behaviorally they aren't the same thing, so that's not a straightforward mechanical translation. For example if the variable is an object instance then if escape analysis can prove the variable doesn't outlive the loop, then it can be put on the stack & then yes there wouldn't be any benefit to the suggested change. Although deoptimization makes stack allocation more complicated, so JS engines are more conservative here than say JVM runtimes.

But it's really easy for escape analysis to fail, it has to be conservative. So you can end up heap allocating a temporary object every loop iteration quite easily.
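
A toy version of the two patterns being compared, as a sketch only (`Vec2` and both function names are invented). Whether an engine actually heap-allocates in the first version depends on whether its escape analysis succeeds, so this shows the shape of the code and says nothing about measured speed.

```typescript
interface Vec2 {
  x: number;
  y: number;
}

// Creates a fresh temporary object on every iteration. If the engine cannot
// prove the object never escapes the loop, each one is a heap allocation
// that the garbage collector later has to clean up.
function sumLengthsAllocating(xs: Float64Array, ys: Float64Array): number {
  let total = 0;
  for (let i = 0; i < xs.length; i++) {
    const v: Vec2 = { x: xs[i], y: ys[i] };
    total += Math.hypot(v.x, v.y);
  }
  return total;
}

// Declares one scratch object up front and overwrites its fields, so the
// loop body allocates nothing regardless of what the optimizer manages.
function sumLengthsReusing(xs: Float64Array, ys: Float64Array): number {
  const scratch: Vec2 = { x: 0, y: 0 };
  let total = 0;
  for (let i = 0; i < xs.length; i++) {
    scratch.x = xs[i];
    scratch.y = ys[i];
    total += Math.hypot(scratch.x, scratch.y);
  }
  return total;
}
```

The behavioral difference mentioned above shows up as soon as something keeps a reference: push `v` into an array and you get distinct objects, push `scratch` and every entry aliases the same one. That is why an engine can't blindly rewrite one form into the other.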


Is there a good guide to these techniques in modern JS? How likely are they to remain viable long-term?


Not sure on any particular guide, but I learned a lot from the old #perfmatters push from Chrome, getting a deeper understanding of what the JS engine does when you create an object, where it lives, how it interacts with the garbage collector and so on would be a good thing to learn about. Also it's generally only worth considering optimization for things that store a lot of data like arrays/maps. I don't see why these techniques wouldn't be good in the long term.

I definitely agree that it's easier to make webapps that consume much more memory than it is using a lower-level language like C++, unless you're being careful.


I just in the past month upgraded my main work laptop from 16 GB to 40 GB (8 GB soldered + 32 GB SODIMM). So your point is granted, but on the other hand, DDR4 prices have collapsed ~50% from 2018 (I couldn't believe it either, given all of the other semiconductor issues).


nice, but you are lucky to have a laptop that has an extra RAM slot. not everyone can do that.


I specifically bought this laptop with that in mind.


Luck has nothing to do with it, the Macbook Pro users know what they're getting into.


Solution: 64GB RAM. Never look back.


This is a first-world solution.


> Kind of surprised

I'd assume a significant cost-benefit tradeoff. For all its flaws, the DOM rendering algorithm is at least "document-like," so there's a lot of wheel-reinventing to do going from just using the DOM to a custom document layout implementation underpinning a canvas-targeted rendering algorithm.


Look at the Google docs generated html markup some time. It’s not making nice neat <p>’s and <h1>’s.


Yes, at OrgPad, we are writing our own rich text editor and if you want to use the DOM approach, you don't have much choice than to do it like this. You can see the WIP demo here (in Czech but it is quite visual): https://www.youtube.com/watch?v=SkFJ1zcRjQY It is also written in ClojureScript. Some of the reasoning is here (in English, but 3 hours long): https://www.youtube.com/watch?v=4UoIfeb31UU


How? I thought they blocked access to the native underlying document format. Or do you mean the HTML export?


I assume your parent means the result of clicking inspect element in the (pre-canvas) editor.


> I imagine using canvas gave them a lot more flexibility in addition to being more performant.

I'm perplexed because I don't expect canvas rendering to be faster - or necessarily more flexible - because the web is document-first: HTML and CSS were/are all built around describing and styling textual content, and computer program source code files are invariably all textual content files. Browsers all have heavily-optimized fast-paths written in native code for rendering the DOM to the screen with the full flexibility of all of CSS's styling features, so applications switching to canvas rendering will first have to contend with needing to reimplement at least the subset of CSS that they're using for their editor, and it has to run as JavaScript (or WASM?). I just don't understand how that could possibly be faster than letting the DOM do its thing.

I appreciate that DOM+CSS rendering is not designed-around monospaced text editing or with specific support for typical text-editor and IDE features which do indeed throw a wrench into the works[1], but I think a much better approach would be to carve-out the cases where the current DOM and rendering model is insufficient or inappropriate for those specific applications' purposes and find a way to solve those problems without resorting to canvas rendering.

That said, is this change because Google wants to use Flutter for a single codebase for Google Docs that would work across iOS, Android, and the web? Flutter does have a HTML+DOM+CSS rendering mode, but it's horrible (literally thousands of empty <div> elements in their hello-world example...)

[1] e.g. a HTML/DOM document is strictly an unidirectional acyclic tree structure, and CSS selectors are also strictly forwards-only (e.g. you cannot have a HTML element that spans other elements, you cannot isolate individual text characters, you cannot select a descendant element to style based on its subsequent siblings, or ancestor's subsequent siblings), and how the render-state of a document is also strictly derived from the DOM and so does not allow for any feedback loops unless you start to use scripts, which means you can't select elements to style based on their computed styles (unlike, for example, WPF+XAML, where you can bind any property to another property - something I think XAML implements horribly...), and I appreciate this makes certain kinds of UI/UX work difficult (if not impossible in some cases), but in the use-case of an editor I just don't see these as being show-stopper issues.


>I'm perplexed because I don't expect canvas rendering to be faster

...yet it is. Really.

Even though DOM paths are heavily optimized, they are extremely flexible, and that flexibility creates a wall in possible performance optimizations. In a context like word processor, precision is more important than your regular website (and across browsers!) so you end up implementing little hacks everywhere, pushing half a pixel here and another 1.5 pixels there.

A purpose built engine that writes directly to the framebuffer of a canvas without dealing with legacy cruft has the potential to be a lot faster - if you know what you are doing. Google has no shortage of devs who know what they are doing so here we are.


They aren't that optimized, this small team changed Chromium's DOM to have better cache utilization and more coherent access patterns with data-oriented/SoA and got 6X speedup in some animation use cases:

https://meetingcpp.com/mcpp/slides/2018/Data-oriented%20desi...
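
To make the data-oriented/SoA idea concrete, here is a minimal sketch of the same animation update written against an array-of-structures and a structure-of-arrays layout. This is not Chromium's code; the names `Anim`, `tickAoS` and `AnimsSoA` are made up for illustration, and any speedup depends entirely on the engine and workload.

```typescript
// Array-of-structures: every animation is a separate heap object,
// so one update pass may hop around memory.
interface Anim {
  start: number;
  duration: number;
  value: number;
}

function tickAoS(anims: Anim[], now: number): void {
  for (const a of anims) {
    const t = (now - a.start) / a.duration;
    a.value = Math.min(1, Math.max(0, t)); // clamp progress to [0, 1]
  }
}

// Structure-of-arrays: one contiguous typed array per field,
// so the update loop walks memory linearly.
class AnimsSoA {
  readonly count: number;
  readonly start: Float64Array;
  readonly duration: Float64Array;
  readonly value: Float64Array;

  constructor(count: number) {
    this.count = count;
    this.start = new Float64Array(count);
    this.duration = new Float64Array(count);
    this.value = new Float64Array(count);
  }

  tick(now: number): void {
    for (let i = 0; i < this.count; i++) {
      const t = (now - this.start[i]) / this.duration[i];
      this.value[i] = Math.min(1, Math.max(0, t)); // clamp progress to [0, 1]
    }
  }
}
```

Both versions compute the same values; only the memory layout differs, which is the part the slides argue matters for cache utilization.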


> Google has no shortage of devs who know what they are doing so here we are.

They also have no shortage of devs who advance crazy ideas that somehow gain adoption... like starting a new general-purpose programming language in 2007 with neither generics nor a package manager.


The web is (or at least was) document-first, yes, but Google Docs is an extremely heavily-featured WYSIWYG word processing and desktop publishing application that happens to be distributed on the web (in addition to other platforms). The fact that you're (sometimes) using Google Docs to generate a simple document that could easily be represented with simple HTML does not imply that Google Docs itself is a natural candidate for being implemented with simple web APIs like DOM.

Now, I think if the contentEditable API were significantly more robust and consistent across browsers, it could have been viable to build extremely complex WYSIWYG editors using the DOM. Most of the popular rich text editor libraries for the web are essentially compatibility layers around the contentEditable API that attempt to normalize its behavior across browsers and present a more robust API to the developer. These libraries are popular and do work pretty well, but based on my experience with them it's no surprise that an app as popular and extensive as Google Docs would constantly bump into the limitation of this approach. (My impression is that Google Docs never used contentEditable and instead wrote their own layout and editing engine that manually rendered out DOM, and they're now changing that to render out to canvas.)


> My impression is that Google Docs never used contentEditable and instead wrote their own layout and editing engine that manually rendered out DOM, and they're now changing that to render out to canvas.

Back before Google owned Google Docs, it was a non-Google company and website called Writely, and their website was basically a document-hosting system tied to a fairly stock `contentEditable` editor.

This was around 2005 - back when every web-application development client would insist that users have WYSIWYG/rich-text editors - of course they had no idea how WYSIANLWYG (What you see is absolutely nothing like what you'll get) those WYSIWYG editors are.


> HTML and CSS were/are all built-around describing and styling textual content, and computer program source code files are invariably all textual content files.

HTML and CSS are fairly well optimized, but dynamic HTML and the DOM were an afterthought. If you could throw out a lot of the guarantees about DOM behavior, you could make a much faster browser, but you'd also break the web.


I could see how this could be faster.

At the end of the day, after the browser does all of its highly optimized processing of the dom, html, and CSS, it is issuing drawing commands that are the same as the ones you make on canvas. Canvas skips the in-between steps.

If you're in a situation where you know you want this text at this location on the page, it may be simpler to just draw what you want versus trying to arrange a DOM that will cause the browser to draw what you want. Especially if you're already doing pagination, at which point you're already doing the text breaking and layout anyway, and you're just trying to tell the browser in a high level language to give you the same low-level results that you already have in hand.
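
As a rough sketch of what "you already have the results in hand" could look like: a greedy line breaker that turns a paragraph into positioned draw commands. This is not how Google Docs does it; `DrawCommand` and `layoutParagraph` are hypothetical names, and the `measure` callback is injected so the sketch runs outside a browser (in a page it could be `(s) => ctx.measureText(s).width`).

```typescript
// One positioned run of text, ready to hand to a drawing API.
interface DrawCommand {
  text: string;
  x: number;
  y: number;
}

// Greedy line breaking: keep adding words to the current line until the
// measured width would exceed maxWidth, then start a new line.
function layoutParagraph(
  text: string,
  maxWidth: number,
  lineHeight: number,
  measure: (s: string) => number,
): DrawCommand[] {
  const commands: DrawCommand[] = [];
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  let line = "";
  let y = lineHeight; // baseline of the first line
  for (const word of words) {
    const candidate = line === "" ? word : line + " " + word;
    if (line !== "" && measure(candidate) > maxWidth) {
      commands.push({ text: line, x: 0, y });
      y += lineHeight;
      line = word; // an over-long single word still gets its own line
    } else {
      line = candidate;
    }
  }
  if (line !== "") commands.push({ text: line, x: 0, y });
  return commands;
}
```

In a browser, each command would then map onto a single `ctx.fillText(c.text, c.x, c.y)` call, with no DOM arrangement in between. A real word processor would also need hyphenation, bidi, kerning-aware measurement and so on, none of which this sketch attempts.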

It looks like they're just doing this for text within a page, BTW. I looked at the sample document and the page scroller is DOM, and the individual pages are canvas of text, overlaid with an SVG containing the images.

The big question I have is how they manage to deal with stuff like IME (input method editors) and how they manage to work with the keyboard on mobile (looks like they don't do mobile though).


> The big question I have is how they manage to deal with stuff like IME (input method editors) and how they manage to work with the keyboard on mobile (looks like they don't do mobile though).

A common technique used in other web-based editors for other content-types (like online video editing, online image editors, etc) is to create a hidden <textarea> or <input type="text"/> and give that element focus - and then update the manually-rendered content in response to normal DOM events like 'input', 'change', 'keydown' (if necessary - the 'input' event should be preferred, ofc). Because a "real" DOM element with native IME and soft-keyboard support is being used to process user input there's little to no degradation of the user-experience.

...though the user does lose the ability to do things like drag text-selection handles. Alternative approaches include instead making the textarea very visible and positioning it directly on-top of the manually-rendered content and using as much of the browser's built-in support for styling input elements and input text to match the manually-rendered content as closely as possible - but also hiding the manually-rendered content to avoid confusing the user. They may have a toggle to allow the user to choose between "simple-edit with live preview" (i.e. hidden textarea) and "edit mode". This technique isn't confined to just the web: lots of desktop software (especially in the days before WPF, JavaFX, etc) that needed to allow the user to precisely edit text within a design-surface would just instantiate a native textbox widget directly on-top of the text's location in the design-surface. It wasn't just 2D art software that did this; at least a few WYSIWYG-ish HTML editors (prior to contentEditable) did too. I actually wish this technique would come back (despite its clunkiness) simply because Markdown+Preview is far, far better than a WYSIWYG contentEditable widget where an inadvertent mouse-click or drag would create a `float` disaster - or bugs where elements wouldn't be closed correctly and so end up breaking the entire website layout...
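
A minimal sketch of the hidden-input variant described above, with the DOM element abstracted behind a small interface so the logic can run anywhere. `EditorModel`, `HiddenInput`, `insertAtCaret` and `attach` are all hypothetical names, and the sketch deliberately ignores `compositionstart`/`compositionend`, which an IME-aware editor would have to handle before clearing the element's value.

```typescript
// Minimal editor state: the document text plus a caret offset into it.
interface EditorModel {
  text: string;
  caret: number;
}

// Insert newly typed text at the caret and advance the caret past it.
function insertAtCaret(model: EditorModel, typed: string): EditorModel {
  return {
    text: model.text.slice(0, model.caret) + typed + model.text.slice(model.caret),
    caret: model.caret + typed.length,
  };
}

// The subset of a focused, hidden <textarea> this sketch relies on.
interface HiddenInput {
  value: string;
  addEventListener(type: "input", listener: () => void): void;
}

// On every 'input' event, drain what the hidden element received into the
// model, then ask the caller to repaint the manually-rendered content.
function attach(
  input: HiddenInput,
  initial: EditorModel,
  render: (model: EditorModel) => void,
): void {
  let model = initial;
  input.addEventListener("input", () => {
    model = insertAtCaret(model, input.value);
    input.value = ""; // reset so the next event carries only new text
    render(model);
  });
}
```

In a page, `input` would be the hidden element itself and `render` would redraw the canvas (or whatever surface is being painted by hand); deletion, selection and caret movement would need their own handlers.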


> because the web is document-first: HTML and CSS were/are all built-around describing and styling textual content

They were built to display static textual content. Moreover, they were built to display static textual content on 90s-era computers in a single rendering pass. IIRC two-pass rendering didn't appear until some improvements around tables in early 2000s.

For that, yes, they are quite fast. Anything else? Nope.


Document display and document editing are rather different tasks. The DOM was built for the display of static documents. Dynamism was slowly added over the years through JS, and eventually CSS (animations, transformations, etc). But the underlying purpose of the browser rendering engine has remained the same, which is to display static documents. It's not surprising that a client built from the ground up around the concept of displaying static documents doesn't do a good job of allowing users to edit documents in a WYSIWYG kind of way. That has never been its job!


I like the hacking mindset to make something work even if the odds are against it but the better approach would be to fix the DOM APIs and to do the necessary performance work instead of basically throwing all the responsibility on some library and the web developer.


I haven't noticed the change either, scrolling is somewhere less than 60fps on new Mac hardware.

It's the only thing about VS Code I would change: have it use native APIs for text rendering so we can get the same framerate as native apps.


Microsoft Word, Pages and Open Office don't seem to be bottlenecked by rendering performance like Google Docs. Perhaps the browser is the wrong platform for document editing.


I believe this 100%. After using google office for years (just because it's free and cloud-based), I recently tried MS Word and Excel at work. The difference was mind-blowing. I forgot just how functional and straightforward MS Office is compared to the clunky, barebones google options.

If I wanted a desktop-first, cloud-backed solution, what would be the most future-proof and durable? Can I use Open Office across OSes? What would be the best cloud backup service these days? (just a general question to readers)


I also prefer desktop-first, cloud-backed solutions, but I have quite the opposite experience. Working with MS Office has been a pain and I've been a happy Google Docs user for about 10 years. My wife who isn't an especially technical person also finds Google Docs quite a lot more intuitive and laments when she has to use MS Office products for work (she is a consultant for Microsoft including their 365 line of business and her whole firm makes pitch decks in Google Slides before converting them to MS Office to present at Microsoft meetings--IIRC for the Azure and other b2b lines of business they don't even bother with MS Office). Note that my wife and I (like most of our age group) grew up on MS office, so it's not a question of familiarity.

Google Docs just built a better product and MS Office still hasn't caught up. I wonder if this is because or in spite of the browser target?


Google Docs seems so bare-bones. I recently couldn't find a way to format a series of chunks of text within a Google Doc as code, and I'm pretty sure that it simply doesn't support styles for anything but headings and body text. It just doesn't seem to be the same kind of tool as Word.


What makes Google Docs a better product than MS Office? Can you provide some examples of features that are better in Google Docs?


Copy a few cells from a Google sheet and paste it in an email, then do the same with Excel. Collaborate on building out a document from scratch with 10 people in Google sheets vs Excel.

Excel is a monster, and much more powerful than Google sheets in many ways, but in my experience, Google docs apps are a little better for collaboration, and they integrate a little tighter with each other.


Google docs is their document editor. Sheets is a part of GSuite.

I've also never had trouble pasting a spreadsheet selection into a word document. Email is a nightmare in general though.

I'm not sold on collaboration personally. I've had to do it a bunch since the pandemic began and I've found it to be an anti pattern. One of the big inconsistencies is that cells in sheets don't update while being edited while collaborating, which is not great if you have a spreadsheet heavy workflow. Docs is impossible to replace that though, because its auto formatting is draconian and always seems to reset its preferences. When editing docs we spend more time formatting them than creating the content.


> I'm not sold on collaboration personally. I've had to do it a bunch since the pandemic began and I've found it to be an anti pattern.

How much of this is really related to technology? I do a lot of writing in both Word and Google Docs and see different sets of problems for both products. Having a group of people jump into either and expecting a good product (and experience getting there) is unrealistic.

With the pandemic, I think people have been trying lots of things without understanding what will be most effective. At least early on, there was a feeling that people had to be seen to be productive. It's nothing like real remote work.

For important docs, I still come back to having individuals write their content and only then does one person attempt to assemble it. The individuals often need their own independent reviews and consultation anyway before they have a decent draft. In some ways it improves visibility and helps with keeping folks on schedule too.


Google sheets is the specific example that I hate. In my experience, it's often laggy and clunky. You can't even scroll smoothly: the window MUST snap to row/column lines. When I realized that google sheets has such a laughable shortcoming, I knew I needed to get out of google office eventually.


i think copying some cells from excel into outlook, which i guess is the comparable transaction, works pretty well - what doesn't work for you? Maybe I am just missing out on some amazing functionality by not using google docs.


Personally, I like it better sometimes for having less features. MS Word has such a massive number of formatting features that interact in complex ways that there's plenty of ways for your document to end up formatted in a weird way and to be very difficult to figure out exactly where the switch is to make it not do something. I think one time I had a document where the entire doc was highlighted in yellow, and it took me over an hour of fiddling with various formatting boxes to figure out how to turn it off. Any word processor that doesn't have the capability to do that has some appeal to me.


I haven't seen a word processing document in a professional setting for many years now (didn't realize it until just now). Who uses a word processor these days? Writers certainly don't use that garbage.

I use text editors so I can think about the content and if it is going to get prettied up with fonts it goes into a target system that supports markdown (confluence, git, email, etc..). If you are flummoxing around in a word processor or sending around formatted docs that aren't PDF I fully expect people to be looking at you sideways.


> Writers certainly don’t use that garbage.

I hate to inform you that, yes, writers do indeed use “that garbage”. I’m married to an author who regularly uses Scrivener to write. But anytime she has to send anything to anyone she has to convert to a Word document and send that out. Everyone uses Word that she interacts with. (Though author friends of hers might also use Scrivener for their writing)

Writers who understand git, let alone Markdown, are going to be extremely rare. You’re in a bubble if you haven’t encountered how dependent the writing field is on Word documents.


Unfortunately I do agree with this. I think a lot of tech isn't a matter of "what's the best?" but instead "what's the least bad?". I don't think Office is perfect but I think it's a lot less bad than google. I don't think MacOS is great but it's a lot better than windows for certain things, and vice versa. IMO unless software puts the user first in allowing customization and control, the best we can ever get is good instead of great.


What makes Google Docs a better product than MS Office? Ignorance and Delusions.


> Can I use Open Office across OSes?

I would recommend Libreoffice over Openoffice, but yes (for both)

And you can of course back up to your cloud service of choice. The main benefit of google docs, o365, etc. is real-time collaboration. But there is no reason why a desktop app couldn't support realtime collaboration with a suitable backend service.


The only time I've ever seen real-time Google Docs collaboration has been during meetings which should have been an email. Total waste of everyone's time. Not to mention the horrible UX of people constantly moving their cursor around and moving text around. I'd suggest that pass-the-baton style collaboration would be a much better UX if you absolutely must collaborate real-time on creating a document. Which I find the premise to be incredibly dubious to begin with.


Even if actual realtime collaboration is rare, there are other collaboration features that are missing in most desktop equivalents, like getting notified of changes, being able to mention people in comments, etc. that I do see used quite a bit.

But my experience is that realtime collaboration is useful. In particular, immediately after emailing a doc to multiple people it is not at all unusual for more than one person to be actively looking at, commenting on, and maybe changing the document at the same time.


What do you prefer about Libreoffice? I've used both once or twice but not enough to really learn anything about them


LibreOffice is an actual active project; OpenOffice is a political ghost entity.


Very good to know, thanks!


There are lots of reasons.


There must be exactly zero reasons—not lots—why they can't, since some native applications do, in fact, support realtime collaboration.


I have had the exact opposite experience—I've used Google Docs for 10 years now, and in every way it manages to exceed Microsoft Office in usability. You're right that Google Docs can sometimes feel a little barebones, but it makes up for it by being very easy and straight-forward to use. In 10 years of using Google Docs, I can count on one hand—across probably tens of thousands of documents—the amount of times I've been missing something so critical to my work that I've needed to use an Office product.

(That said, I'm really excited about the recent changes Microsoft is making for Excel, with LET and LAMBDA, and I look forward to trying it out again in the future. Maybe this is the thing that finally gets me to switch! I've also enjoyed doing some more ~fancy~ graphic design in Pages on Mac, but overall the clunkiness was just so frustrating that I can't in good faith recommend it to anyone)


I prefer LibreOffice over Open Office, but I believe both are cross-platform (Linux, Windows, macOS). Then, I'd just use Dropbox or similar to save the files to for cloud storage. The only downside is no real-time collaboration. You can also look into Collabora, but I don't have any experience with it.

If you don't require Linux support or if the web is tolerable for Linux, I personally recommend the Microsoft Office suite. There's the obvious compatibility concern because nearly everyone uses those, they have real-time collaboration built in for both desktop and the web, comes with OneDrive storage, and will obviously be extremely future-proof. I cannot recall a single time any of the apps have crashed on me on both Windows and macOS, so I think it's pretty "durable".


> The only downside is no real-time collaboration.

This isn't a small thing for many users.


IMHO HTML documents backed by a versioning system (probably fossil or pijul rather than the overly complex git) are the way forward for documents where content is much more important than presentation.


While “text in a VCS” is a great option, it’s obviously far less usable than something like Google Docs, and you still don’t get real-time collaboration, which can be really nice.


Yeah... I'm wondering though, Fossil is based on SQLite - a database - and databases are designed to solve the issues arising when multiple users try to change the same data. (Also, fossil by default works in "autosync" mode.) So it should be "easy(er)" to make a real-time collaboration tool based on Fossil ?

P.S.: By researching this, I've stumbled on a (barebones) alternative to Google Docs : HackMD/CodiMD/HedgeDoc : https://demo.hedgedoc.org/


The best approach for a desktop first cloud-backed solution is possibly to have a VDI with Windows (on AWS for example), and use Microsoft Remote Desktop from your preferred physical computer to access it.

I have multiple desktop Macs in my various homes but I only use them for web browsing and RDP to the same Windows VDI.


Maybe a heresy around here: Microsoft Office with a SharePoint-backed server.


A free OneDrive account is enough, plus Office 2016+ autosave function, with the added bonus to have a cloud version of word to edit in collaboration your document on the go


It was indeed a very strong marketing move for... decades to convince people, like smart people, that document editing can be a web-based thing. Actually, now that the browser is so ubiquitous that GUIs sit on top of it (think Electron), it is time to ask the very obvious question - since everyone seems to agree that a universal GUI is needed (proof: the browser) then is the browser the right universal GUI?

Not being heavily biased by any vendor, but really, is there anything better than XAML to describe user interfaces, that is also cross-platform and does not have the burden of DOM? Please - share examples.


> then is the browser the right universal GUI?

Absolutely not; but the web has become the behemoth it is through an absurd amount of money and engineering work. Chrome (well, Chromium) has 34 million lines of code now[1].

If we assume any competing universal GUI platform will need a similar amount of engineering effort, there's a very small list of companies in the world who have the resources to fund an effort like that. And Apple, Microsoft and Facebook have very little strategic incentive to care. (React Native notwithstanding). Google is trying with Flutter - but we'll see.

I wonder if maybe the right direction is up. WASM is already supported by all major browser engines. I'd love to see a lower level layout & rendering API for the browser, exposed to wasm. We could do to the DOM what Vulkan did to OpenGL. And like OpenGL, if it was designed right, you should be able to reimplement the DOM on top in (native wasm) library code.

Then the universal GUI of the future could be the gutted out shell of a web browser (we'd just need wasm + the low level layout engine), running libraries for whatever UI framework you want to use, written in any language you like. A UI environment like that would be small, portable and fast.

[1] https://www.openhub.net/p/chrome/analyses/latest/languages_s...


I think you have just described in broad strokes what will happen in the next decade of GUI development.


That smells suspiciously like the Linux desktop environment. There was X. It was a minimal desktop environment. Then there were dozens of ones built on that… there was almost no way to have a consistent experience for a really, really long time.

I really don’t want to do that again.


Yeah, but the web isn’t very consistent already. The main set of common elements are buttons, links, form elements and scroll bars. Just about everything else is done custom on every webpage you visit.

I don’t think we should get rid of the common UI elements (if anything we need more of them & better APIs for them). But what Google docs, and flutter seem to really want is a simpler, more primitive way to create a layout out of those UI elements. Buttons and scrollbars are great. We need something more primitive than the DOM and CSS. Houdini is a solid start here.


I really like that scenario, but I don't think market forces are moving towards it.

Then again, we got wasm, and that feels like a miracle in itself.


Well, it’s clearly what the Google docs team wants. And it would yield higher performance for other similarly complex web apps (eg Figma). And allow native UI development in more languages (Blazor). It also looks to be the sort of thing the Flutter team want for web builds. And it could work well for the base system of chromeOS too.

For whatever reason, Google invests hundreds of millions each year into chrome, and trusts their engineers’ leadership on how to make it succeed. The question in my mind is if browser engineers themselves decide to push in this direction.


Chrome has been pushing Houdini [1] for years. It doesn't have special WASM integration right now AFAICT but it is basically a lower level layout & rendering API for the browser.

[1] https://ishoudinireadyyet.com/


I've looked at Houdini again and I'm not convinced.

First, because it's more like OpenGL 3 (add more powerful APIs) than Vulkan (clean room design).

Second, it seems mostly abandoned. The page you cited lists multiple sub-proposals that have "No signal" even from the Chrome team. All mentions of Houdini I can find on developers.google.com are from 2018. I can't find anything about Houdini integration with WebAssembly, which is what I'd expect if development was ongoing.

Overall, I'm seeing everything I would expect to see in the timeline where Mozilla has no intention of ever implementing Houdini, and Google has decided it's not worth pursuing beyond what's already implemented.


The killer feature of Google Docs is the real-time collaboration. People willingly gave up a lot of editing and layout functionality to get that. It was so much better than sending drafts of documents back and forth in email.


That and just being able to send one to anyone to collaborate on it quickly. That's the big thing that makes web apps so compelling.


I feel the need to argue that the browser is not the browser engine. An app sitting in a chrome tab is significantly different than an app built on electron, they just share some rendering code paths.

Electron apps have shown that you can use a browser's rendering engine to make high quality apps distributed on multiple platforms. They also have the benefit of persistence, filesystem access, hooks into native code should you need them (not WASM - mind you), you can implement true multithreading and explicit SIMD optimizations. You don't have memory limitations, and you don't have to worry about browser sandboxing, malicious or well intentioned extensions that break the experience, etc.

The browser is not the same platform as electron. I would guess that Google Docs would function much better in electron than on the web.


> An app sitting in a chrome tab is significantly different than an app built on electron, they just share some rendering code paths.

That isn't really true, Electron is basically a thin veneer over the Chrome browser, with NodeJS tacked on the side. Just take a look at the source code.

> Electron apps have shown that you can use a browser's rendering engine to make high quality apps distributed on multiple platforms.

Electron has shown that you can use a re-skinned browser and NodeJS to ship applications on all platforms capable of running Chrome. That ranges somewhere between "acceptable tradeoff" and "absolute overkill", depending on the application.

> You don't have memory limitations, and you don't have to worry about browser sandboxing, malicious or well intentioned extensions that break the experience, etc.

You still do have almost all of the limitations of a web browser in your rendering code, and you have none of the features of the web browser outside of it. The bridge between the two is inefficient.


Yeah, I'm wondering why Google isn't building a desktop version of their office apps in electron. I can practically hear the collective sigh of relief upon those landing in users' laps.


An app sitting in a chrome tab also shares its user interface, which is a problem when your app starts deviating from a simple HTML document.


> It was indeed a very strong marketing move for... decades to convince people, like smart people, that document editing can be a web-based thing.

I think this is overly reductive. There was a technical problem driving some of this; namely - document collaboration sucked (to some degree still does).

Moving documents online was a tradeoff - making the editor web based solves a bunch of problems but causes some other ones; desktop based cloud backed editing didn't exist (not that it's perfect now) at a time when you could get useful collaboration done with web based editors.

I'm not saying this was the only thing going on, but reducing it to just "marketing" misses the mark, I think.


The way that word processors are designed, essentially as very smart linked-lists of objects, would've actually allowed for the document collaboration very early on. We can perhaps speculate dozens of reasons why this did not happen, but I guess it was for strategic reasons. But it will and is happening.

It's about right to make the point that IMHO the desktop office processor is far from dead; actually I would imagine a comeback of desktop UIs because they are so much easier to get right, especially when you have complex forms (which all business software has) or custom GUIs (such as those in software like Blender, Photoshop, Lightroom, etc).

Question is did people really needed the collaboration feature so much, or as much as it was praised for decades... When it shows that source code (which IS one very important content) is being developed not collaboratively in real-time in the browser, but with the aid of various version control systems (CVS, SVN, GIT etc.) that are neither real-time, nor collaborative in the sense that Google Docs is.

So the whole collaboration thing is fun to have, great thing to demo, but perhaps not the killer feature.

Question is whether other features were more important and thus got implemented in the office packages. Such as enterprise integration capabilities and very powerful and well crafted WYSIWYG that is only possible with custom built engine.

Let's be honest - the most complex apps typically running on an average desktop OS are the browser and the word/spreadsheet processor. Back in the day the browser was not a VM and was not that complex. And as OpenOffice showed - this is not very easy to get right. As WPS Office (the Chinese office) showed - even if the presentation layer is fast/correct, it is not really that easy to (originally) come up with it nor integrate it with other enterprise services.

One may wonder whether MS Office was created to run best on Windows, or was it that Windows is made so to enable good run of MS Office and the integration of all this mandatory software that constitutes the modern enterprises... (again, trying to be as unbiased as possible)


> Question is did people really needed the collaboration feature so much, or as much as it was praised for decades... When it shows that source code (which IS one very important content) is being developed not collaboratively in real-time in the browser, but with the aid of various version control systems (CVS, SVN, GIT etc.)

This is a good point. I don't think realtime collaboration is so important, but multiple author collaboration is. And "track changes" is a sort-of good-enough solution, but painful.

I've had good luck collaborating on documents (research papers) using latex and source control, but that assumes (a) participants are comfortable with both and (b) the storage format is amenable to revision control. Most word processing doesn't work well like this because you can get the document into a broken state in ways that are hard to recover from, and many of the users have no mental workflow map for "source control"

TeX/LaTeX or orgmode/Markdown type approaches have an advantage here for complicated collaboration.

These days a lot of collaborative stuff is being done outside of spreadsheets and word processing docs, the lines are blurrier and the collaboration is broader. In the "old days" a wiki might have done the trick for this but people want richer environments too. Not sure what the answer really is.


Microsoft Word and Pages both also have web apps, for years, that are 'bottlenecked by rendering performance' (would put it as 'clearly would be improved by better rendering performance', as you're noting)


This mode of argument seems odd to me. Google is announcing a solution to the problems they were having with the platform. Wouldn't the criticism "Perhaps the browser is the wrong platform for document editing" only be appropriate if Google was complaining that they have been unable to fix the problems?

The fact that, while developing for a given platform, you can encounter problems and fix them, doesn't seem to imply that there's something wrong with your choice of platform.


Google Docs is worth it for the cooperation, but if you are writing for yourself, or writing anything serious, it is simply not good enough, but I don't think the performance is the issue.


The browser is the wrong platform for anything that isn't an HTML document, not only for performance reasons but, perhaps much more importantly, for interface reasons.

For instance: in your typical windowed program, when you press "Alt", it's supposed to show the menu, which you can then quickly navigate using keyboard shortcuts. You can't do that properly inside the browser because it's going to conflict with the browser's own Alt menu.


Based on inspecting the DOM of the read-only preview document they link to, my guess is that they will be using traditional DOM elements for much of the editing UI. There appear to be many empty DOM elements that are there to hold various toolbars and other UI elements. And for what it's worth, there seem to be empty DOM elements intended to be read by screen readers.


I hope so. Just one example, but when you use the API to export HTML, nested lists aren't actually nested; they just inject increasing padding on subsequent LI tags. This is ridiculous and causes big issues for me, but I'm sure they had to do it for formatting purposes. So hopefully they can give us semantic HTML now that it's not coupled to the editor.
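For what it's worth, a flat, padding-indented export like the one described above can be repaired in post-processing. A minimal sketch, assuming each exported LI has already been reduced to its text plus an indent level derived from its padding (the `FlatItem` and `NestedItem` shapes and the `nestListItems` name are made up for illustration; nothing here is a Google API):

```typescript
// Hypothetical shapes for illustration only.
interface FlatItem { text: string; level: number; }        // level 0 = top-level item
interface NestedItem { text: string; children: NestedItem[]; }

// Rebuild a tree from a flat list whose items carry only an indent level.
function nestListItems(items: FlatItem[]): NestedItem[] {
  const roots: NestedItem[] = [];
  // stack[i] holds the most recently seen item at depth i.
  const stack: NestedItem[] = [];
  for (const item of items) {
    const node: NestedItem = { text: item.text, children: [] };
    // Clamp the depth so a jump of more than one level still attaches
    // to the deepest currently open ancestor.
    const depth = Math.max(0, Math.min(item.level, stack.length));
    stack.length = depth; // close any deeper lists
    const parentItem = stack[depth - 1];
    if (parentItem) parentItem.children.push(node);
    else roots.push(node);
    stack.push(node);
  }
  return roots;
}
```

Serializing the resulting tree back to nested UL/LI markup is then a straightforward recursive walk.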


I get performance but for text based content, is canvas the best medium to render it?


Flutter renders everything on Canvas and performance and usability are terrible. Not even scrolling works properly.

So no, I doubt it’s inherently cleaner.


The canvas actually seems less performant when I compared it now. Scrolling is not as smooth. Click `Save as copy` and try it out.


> For those more deep in web technology, I'd like to know if there are reasons to move to canvas for strictly technical merits.

Performance. My company switched from DOM to canvas (and then to WebGL) for a document-centric app a long time ago because of performance reasons. Drawing to a canvas is much faster than updating the DOM. You also get better control over how it displays. With the DOM you have to worry a lot more about differences in how different browsers render the same DOM, although that is less of an issue than it used to be.

There are downsides too though. Besides making it much more difficult for extensions to modify things, you also have to build your own spell check, because there isn't a browser API for that. However, I think Google Docs was already using Google's own spellchecker.


In this case, I think the switch is most likely entirely based on technical merits, rather than some way of asserting more control.

So with that in mind, the fact that the team behind one of Google's most interactive pieces of software has to throw up their hands and say "DOM is too slow, we gotta roll our own" should be a wakeup call for everyone working on Chrome and other browsers, but mostly for Google itself.

When you escape the DOM, you're going to be doing pretty much everything yourself. And for someone like Google, that might be worth the absolutely insane amount of effort, but what about everyone else? You're Google, Chrome has 60%+ market share. Why isn't the plan here to systematically start improving DOM performance, or create APIs to more directly modify how elements are laid out and created? Why do all of this work to benefit only Google Docs?

We've had years (decades!) of articles and talk about how the DOM is slow (including a bunch from Google), so why not improve it? Why give up and waste all this time on a custom solution? Why not create something that is *actually* capable of handling the complexity of modern, highly interactive applications, including Google's own products?

You can say it's Flutter, but that's yet another effort to escape the DOM, rather than actually improve it.

Maybe this has been the plan behind the Google Docs team: to push people on the browser side and other Google teams to start seriously looking at what to do with the DOM. If so, I hope this actually has the intended effect. We all deserve a better, more performant web.


DOM performance has already improved by leaps and bounds after millions of dollars of engineering and countless hours of effort. Same with JavaScript. At some point you have to accept that the DOM has fundamental design flaws: its specification and requirements are the problem, yet they cannot be radically changed because of backward compatibility concerns. Browsers have spent decades optimizing everything possible; one should start by acknowledging that before tiredly trotting out magical performance improvements as the answer.


> Why isn't the plan here to systematically start improving DOM performance,

There are already many steps taken to improve DOM performance over these years. However, DOM is designed for documents. The performance can never be good enough when it is abused for non-document usage.

> or create APIs to more directly modify how elements are laid out and created? Why do all of this work to benefit only Google Docs?

Because other browser engines are unlikely to adopt these APIs just so that Google Docs can have better performance. Not to mention that these new APIs will take years to be present in every user's devices.


> There are already many steps taken to improve DOM performance over these years. However, DOM is designed for documents. The performance can never be good enough when it is abused for non-document usage.

JavaScript was originally designed for simple tweaks, but we've significantly expanded and improved the language over the years to adjust it for what it's used for *today*. I don't see why DOM is special. Sure it was designed to handle small, unchanging documents, but it's used for much more now, just the same as JavaScript. Also it's worth noting we're talking about Google Docs here, so it looks like DOM fails even at its intended use-case (I'm saying this *mostly* jokingly).

> Because other browser engines are unlikely to adopt these APIs just so that Google Docs can have better performance. Not to mention that these new APIs will take years to be present in every user's devices.

We wouldn't have fetch, canvas, async/await, PWAs, websockets, etc. if those things had to be available immediately and/or be guaranteed to be adopted. I'd rather it take years to get improvements, but eventually have them, than not doing anything and still be talking about how bad X, Y, or Z is 10 more years from now.

I'll take FLIP animations as a specific example. If I want to have a box animate from one part of the page to another (where its position in the DOM hierarchy changes), we're having to do all kinds of crazy gymnastics around when you read the DOM, when you write to it, how you update it, etc. And even still, you're unable to do this without using JavaScript animations (if your box contains content and it happens to change size, we'd have to do a reverse scale animation on the content).

This is stuff that's trivial in iOS and Android, and commonly used. In the web land, we're stuck doing this poorly both from a development point of view, and with bad performance, resulting in poor end user experience.

The FLIP hack has been talked about for 6+ years [1], and yet here we still are, unable to simply move and animate a box from one place in the tree to another. Want nice drag and drop interactions? Good luck. Limited animations, or slow, and often both.
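For anyone who hasn't run into FLIP (First, Last, Invert, Play), the "Invert" step is just arithmetic on two rectangles. A rough sketch, assuming the two boxes were measured before and after the DOM change (for example with `getBoundingClientRect()`) and that the element uses `transform-origin: 0 0`; the `Box`, `FlipInvert`, `flipInvert` and `toTransform` names are invented for this illustration:

```typescript
// Minimal rectangle shape; in a browser these numbers would come from
// element.getBoundingClientRect() before ("first") and after ("last") the change.
interface Box { left: number; top: number; width: number; height: number; }

interface FlipInvert {
  dx: number; dy: number;               // translation back to the old position
  sx: number; sy: number;               // scale back to the old size
  counterSx: number; counterSy: number; // inverse scale for the box's content
}

// "Invert": compute the transform that makes the element, which is already
// laid out in its final place, appear to still be in its first place.
function flipInvert(first: Box, last: Box): FlipInvert {
  const sx = first.width / last.width;
  const sy = first.height / last.height;
  return {
    dx: first.left - last.left,
    dy: first.top - last.top,
    sx,
    sy,
    counterSx: 1 / sx,
    counterSy: 1 / sy,
  };
}

// CSS transform for the first frame of the animation (assumes transform-origin: 0 0).
function toTransform(inv: FlipInvert): string {
  return `translate(${inv.dx}px, ${inv.dy}px) scale(${inv.sx}, ${inv.sy})`;
}
```

"Play" then animates from that transform back to none. The counter-scale is the part that hurts: the inverse of an interpolated scale is not itself a linear interpolation, which is why the content correction tends to end up being driven from JavaScript.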

Why are we getting articles from Google about how it's bad to change the size of something on the screen [2], instead of seeing improvements to the underlying APIs that cause it to be slow in the first place? If a hacky JavaScript based solution is able to make this performant, surely a native API would do better.

The DOM has to evolve to support interactive apps in a performant way, or risk being replaced by custom things like the canvas or WASM, that are not easy for machines to parse, that won't have nearly as much consideration for accessibility and extensibility, and that aren't as easy to enforce good usage of or share knowledge about. It should not be "DOM is slow, oh well", or "DOM is slow, let's drop it", or "DOM is slow, so let's build a JS scaffolding around it (VDOM)." It should be "DOM is slow. What are the contexts in which it matters most, and how can we improve the APIs such that it can natively do those things performantly and easier?" Be that better selection APIs, animation APIs, better ways to read/write to styles, the list is endless. The DOM is slow, but it does not have to be slow. We *choose* not to make significant improvements to it, and one can come up with plenty of reasons why or excuses for it.

My point is that we should choose to improve it, because the alternative will lead us down a worse path. Years of neglect have led us here, where Google, a browser vendor themselves, has to give up on DOM because it's bad. This is fundamentally messed up.

[1]: https://aerotwist.com/blog/flip-your-animations/

[2]: https://developers.google.com/web/updates/2017/03/performant...


I would not consider myself super deep in web technology, but rendering to canvas allows programmers to have pixel perfect control over the look of their applications across all devices. Currently, web developers need to "reset" lots of default rendering behaviors in every major browser to ensure that their applications look the same.

After building lots of specialized UI components within the HTML standard, a programmer may ask themself if they might as well write their own UI library. Specifically, lots of specialized applications have UI components which do not have a corresponding HTML standard. For example, in a spreadsheet, a cell may have a clickable triangle in its upper right corner that should display a comment bubble. Should a programmer create that in CSS or write a specialized library?


Do your users care about your app looking the same or do they care about their browser looking and acting like a browser?

Moving to canvas is sure to break many features such as text selection, adblocking, and accessibility. All in the name of more control over the pixels? Are you truly doing it for the users?


Very good point, users do not care that a padding is rounded up or down when they switch from their laptop to their desktop, as long as the application is usable, understandable and visually competent.

But mindspace is much more important for interaction, so if their laptop is a Mac, their brain will be in "Mac mode", and "Linux" or "Windows" mode when on their other device. Respecting the platform's conventions will allow them to keep their cognitive load due to "fiddling" to a minimum.


I don't think the GP was talking about the app looking consistent with the rest of the platform, but the exact opposite: the app looking the same whatever the platform. Using Flutter allows them to have the app look the same on the browser as on mobile.

This means that at best it will look "Mac mode" for everyone, including Windows users; at worst it will look foreign to everyone.


Yes, I understand and agree.

My second paragraph was maybe a bit unclear, I meant that the user would expect platform conventions to be respected over application conventions. (especially if we consider platforms with different primary interactions)


Having a program look the same on a big screen with mouse+keyboard input as well as on a small touchscreen is a recipe for having a bad user experience on both.


I might be wrong, but I'm pretty sure Google Docs is already using a completely custom implementation of word wrapping, text layout, text selection, cursor placement, etc. It's not like it's just a <textarea /> with some CSS styles. Likewise, they seem to already have completely separate DOM elements that are invisible to normal browsers but can be read by screen readers. Based on the DOM on the new read-only preview document they link to, it looks like they will continue to use traditional DOM elements for some of the editing UI (just not the actual WYSIWYG editing area) and for screen readers.
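To give a feel for what "a completely custom implementation of word wrapping" involves at its very simplest, here is a greedy line breaker. This is only a sketch with loud assumptions: it is not how Docs does it, it ignores hyphenation, bidi text and mixed fonts, and the `wrapWords` and `MeasureFn` names are invented. The measuring function is passed in so the logic runs without a browser; on a real canvas it could be `(s) => ctx.measureText(s).width`.

```typescript
// Width-measuring callback. In a browser this could wrap
// CanvasRenderingContext2D.measureText(text).width.
type MeasureFn = (text: string) => number;

// Greedy line breaking: keep adding words to the current line while it fits.
// Simplifications: splits on whitespace only, and a single word wider than
// maxWidth is placed on a line of its own rather than being broken.
function wrapWords(text: string, maxWidth: number, measure: MeasureFn): string[] {
  const lines: string[] = [];
  let current = "";
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  for (const word of words) {
    const candidate = current === "" ? word : current + " " + word;
    if (current !== "" && measure(candidate) > maxWidth) {
      lines.push(current); // the current line is full, start a new one
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current !== "") lines.push(current);
  return lines;
}
```

A real editor also needs caret positions, selection rectangles and incremental re-wrapping on every keystroke, which is where most of the work is.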


Canvas still sucks for rendering fonts. Getting accurate font metrics is still hacky[1] and more advanced methods are still experimental[2]. Though I'm sure with Google Docs moving to canvas, some of this will be expedited. But it's yet another one of those things about web tech that anyone with half a brain would tell you should have been there day one.

[1] https://stackoverflow.com/questions/1134586/how-can-you-find...

[2] https://developer.mozilla.org/en-US/docs/Web/API/TextMetrics


> [...] rendering to canvas allows programmers to have pixel perfect control over the look of their applications across all devices.

Canvas-based fingerprinting due to rendering differences is a thing, so using the canvas is not pixel-perfect either.

Creating a UI library atop that is a lot of work, though to be fair certainly manageable by Google. Remember, a UI library is not just about putting things on the screen, but sanely defining layouts, interactions, accessibility...

> Specifically, lots of specialized applications have UI components which do not have a corresponding HTML standard

I think that is the entire point behind Web Components [0], and if one really doesn't want more DOM elements for their visuals, then the CSS Paint API and in fact the whole Houdini initiative [1] should be pursued instead, at least at the Google scale.

Besides these technical points, interoperability should be considered. Web browsers do a lot of work to match user expectations in behaviour to their native operating systems, as well as web conventions. (The classic example infractions being: links that cannot be control-clicked or middle-clicked to open on new tabs because they are not actual links but elements with click handlers; or not being able to scroll with page up/down, arrows, or middle-click, because the page reimplements scrolling in an unsemantic way)

Making your own UI toolkit is bound to all those problems, and those are user-facing problems that will affect the often-ignored long tail of users with unconventional setups.

Look at Flutter for Web, for example [2], it definitely feels entirely different from a regular website, even if it were to look the same. The scrolling does not respect my system settings, interaction is limited to only the most-common method, image scaling is subtly different.

And as Google observed themselves, extensibility and the user-agent should be considered before making such a decision, but it appears they consider it a liability in detriment of the user.

[0]: https://developer.mozilla.org/en-US/docs/Web/Web_Components

[1]: https://developer.mozilla.org/en-US/docs/Web/Houdini

[2]: https://gallery.flutter.dev/


> Canvas-based fingerprinting due to rendering differences is a thing, so using the canvas is not pixel-perfect either

Luckily Firefox asks you if you want to use the HTML5 canvas API. You can specifically whitelist some pages to use the canvas API and stop it from running by default on all your other browsing. Also: you may want a dedicated browser just for Google's whole ecosystem so they can't track you across the web. I have a Chromebook just for that, which has its own unique canvas fingerprint completely separated from my main workstation PC's fingerprint.


> Luckily Firefox asks you if you want to use the HTML5 canvas API.

Really? Where? I know you can disable it in about:config, but not that it was asked of the user. I have used Nightly for years and never seen such a dialog.

Unless you did mean about:config, but I don't know how that works with a whitelist either.


> Really? Where?

https://www.ghacks.net/2017/10/28/firefox-58-warns-you-if-si...

It says 'HTML5 canvas image data' which is probably different than what I said (HTML5 canvas API). Sorry for any confusion.


Did you feel the same when Maps went from HTML to Canvas?

There is no factual basis for your claims. The simplest and most obvious answer is performance. Docs these days can take 5-10s to fully open, especially with comments and annotations.


I think the panic over this is a little unfounded. For Docs this makes perfect sense. My concern, I guess, is what happens if this becomes the normal way of doing development, and people build HTML replacements that work using canvas. What will the impacts be?

* How many browser functions will break?

* Will middle click still work?

* Will screen readers still work?

* Will search engines/ctrl+f still work?

One of the awesome things about web browsers is you get so many features for free on every website. Making the text bigger on a website mostly just works everywhere while on traditional apps it only does if the app has a specific setting for it.


I don't think it'll become the normal way. If anything, the opposite is true: people are making Electron apps to get access to the power of web development, instead of making native apps.

It's definitely not easy; it requires you to re-implement everything from scratch. It only makes sense in cases where performance is paramount. I could imagine applications being migrated to the web doing it though, like Photoshop and the like.

But I absolutely don't see normal web development migrating over, it's just way too much pain for little value. Development, debugging, testing, etc. Everything becomes much harder.


>it's just way too much pain for little value

I'm assuming that we will eventually have powerful HTML-like frameworks built on top of canvas. So for the end developer, it's just as simple as using GTK or HTML.


Oh, maybe that is why I can no longer use Google Maps. It used to perform OK, and then one day it became hopeless.

I was thinking it might be assumptions about 3D acceleration hardware.


No factual basis? Moving from DOM produced content to canvas produced content closes off your ability as a user to see what is happening.

As for Google Maps, it is not speedy or snappy by any stretch of the imagination. In fact, it is pretty abysmal in terms of page performance on the web.


It's not about the web being more or less open, it's about the browser playing the role of a distributed application run time. I'd argue that Google Docs is basically not a part of the web, it just incidentally happens to run in the same browser as the web does for logistic reasons.


Web apps have been around for well over a decade but some people are still struggling with the idea that a web browser can display hypertext documents and also run applications, plus a whole universe of hybrid things which lie in between these two extremes.

Not everything a browser displays has to fit in the "page" paradigm.


Web 2.0 enabling web apps was always a myth. There have been web "apps" since the 1990s with CGI. XMLHttpRequest merely allowed for the moving of some of that logic to the client side.

In that respect, we've been working around the "page" paradigm since the web was practically born. It was a flawed analogy because, even back then, screen sizes and display tech varied among users. Designers still approach web design as if they are designing for print. I've always maintained that if the web were based on a vector technology (think PostScript, but obviously not PostScript) we would be in a much better place both design-wise and accessibility-wise. Content would flow in a much more controlled manner with much less room for browser interpretation and second-guessing. But people were still clinging on to the write once run anywhere (ahem, Java) naivety of the day. And likewise they really thought that you could divorce presentation from semantics and... have something that just worked? I guess? Just sprinkle on some afterthought CSS tech crap and no one will ever notice that the entire thing is flawed at a fundamental level.


>Google wants more control over their stuff. The web is becoming less open.

That's because the old problem "web-document vs web-application" hasn't been solved properly. HTML was designed for documents. It wasn't designed for applications. No wonder as applications become more sophisticated they try to squeeze out HTML/DOM where possible.


The irony being that Docs is a web document editor


And a document editor usually has a very different feature set from a document viewer.


The part I don’t understand is how in the world is a renderer written in JavaScript better performing than their own Chrome C++ code? With Edge being a Chrome clone and Safari being also a performant browser, what are they worried about?


Their own renderer has to support all of HTML and CSS. Slapping some rectangles on a raster surface has a lot less to worry about.


Well... sort of.

For specific apps, or parts of apps, yes. Doing less is how you make things faster, highly agreed. And sometimes canvas allows you to do that, and then your app is much faster.

The problem is that in many cases, moving to canvas eventually turns into having a UI framework that renders to canvas, which turns into a layer of abstractions that handle keyboard and mouse events for you, including stuff like hover, which means suddenly you're tracking element position on your raster surface and thinking about z-indexes and event bubbling to parent elements...
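To make that concrete, this is roughly the kind of bookkeeping that the browser does for free with DOM elements and that a canvas UI has to rebuild. A simplified sketch (flat list of absolutely positioned rectangles, a single global z value, no transforms or clipping; `SceneNode`, `hitTest` and `bubblePath` are made-up names, not any framework's API):

```typescript
// Hypothetical scene-graph node for a canvas-drawn UI (illustration only).
interface SceneNode {
  id: string;
  x: number; y: number; w: number; h: number; // absolute position on the canvas
  z: number;                                   // higher z is painted on top
  parentId: string | null;
}

// Return the topmost node under the point, or null if nothing is hit.
function hitTest(nodes: SceneNode[], px: number, py: number): SceneNode | null {
  let hit: SceneNode | null = null;
  for (const n of nodes) {
    const inside = px >= n.x && px < n.x + n.w && py >= n.y && py < n.y + n.h;
    // ">=" lets later nodes win ties, matching paint order.
    if (inside && (hit === null || n.z >= hit.z)) hit = n;
  }
  return hit;
}

// Ids that would receive a bubbling event: the target first, then its ancestors.
// Assumes the parent links contain no cycles.
function bubblePath(nodes: SceneNode[], target: SceneNode): string[] {
  const byId: { [id: string]: SceneNode } = {};
  for (const n of nodes) byId[n.id] = n;
  const path: string[] = [];
  let cur: SceneNode | undefined = target;
  while (cur !== undefined) {
    path.push(cur.id);
    cur = cur.parentId === null ? undefined : byId[cur.parentId];
  }
  return path;
}
```

Hover is then "hit test on every mouse move and diff against the previous result", and it only grows from there (focus, capture, scrolling containers, and so on).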

I think this is part of the reason why individual apps that start using canvas and that can genuinely cut down on complexity by doing so tend to be able to get real speed improvements, but app frameworks like Flutter tend to perform so poorly. Eventually your cross-platform GUI toolkit like Flutter ends up being just another browser engine written in WASM. And in that scenario your approach becomes a strict downside.

One good example: the browser doesn't expose an accessibility engine other than the DOM. So what I see apps end up eventually doing is either writing their own accessibility engine that doesn't work with programs like JAWS, or rendering out to a hidden DOM. For something like a game, you can get away with that; maybe you don't even provide an accessibility layer at all. For a web component or a chart, a lot of your rendering might be unrelated to accessibility at all. But you get away with those kinds of shortcuts because it's a targeted, specific use. For a big UI toolkit, it's harder to do that, and then surprise, suddenly you have all the overhead of updating a DOM tree and the overhead of updating a canvas.

When people talk about getting raw access to the graphics layer, I think it's important to understand there's a difference between apps that are genuinely reducing complexity vs the theoretical canvas-backed "universal web framework" that people sometimes talk about as just around the corner.


Are you sure you meant Figma? It's an image editor, not a framework.


Eh, bleh, you're completely right, I meant Flutter. Good catch.

Figma is not only not a framework, it's also not completely canvas based, or at least wasn't last time I checked.


"Slapping some rectangles on a raster surface" can also be done in HTML/CSS, with the right options (set 'overflow' content to be clipped with no reflow).


No, it can't be done in HTML+CSS in a performant way.


VS Code feels performant enough, with its complex functionality IMO exceeding Google Docs, and yet I don't think it is using canvas. I believe it comes down to strategic design that avoids unnecessary layout and reflow events in the UI.

That said, the UI of VS Code (the desktop app) only needs to run in Chromium. And generally Google Docs could be a different enough beast that it can’t take advantage of the same tricks—hard to say from the outside.


> VS Code feels performant enough, with its complex functionality

VS Code has an entire dedicated team that only works on VS Code. They can spend resources on trying any trick in the book to make something performant. Whenever actual performance is required, well, they ditch DOM and go for canvas: https://code.visualstudio.com/blogs/2017/10/03/terminal-rend...

And while sufficiently complex, it actually displays significantly less complex information than required by a regular document that will have any number of fonts, layouts, inline images and tables, references to other documents, etc.

> I believe it comes down to strategic design that avoids unnecessary layout and reflow events in the UI.

Yup. And it's nearly impossible to do any amount of "strategic design" because if you as much as glance at a document, it will repaint and reflow: https://csstriggers.com
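The usual form this "strategic design" takes is separating DOM reads from DOM writes, because reading a layout-dependent property (such as `offsetWidth`) right after a style write forces a synchronous layout. The following is a toy model of that effect, not a browser API; `ToyLayout` and both helper functions are invented for illustration, and real engines are considerably more subtle:

```typescript
// Toy model of a layout engine: writes mark layout dirty, and a read while
// dirty forces a synchronous layout pass (counted in layoutPasses).
class ToyLayout {
  layoutPasses = 0;
  private dirty = false;
  private widths: number[] = [];

  constructor(count: number) {
    for (let i = 0; i < count; i++) this.widths.push(100);
  }

  // Stands in for reading something like offsetWidth.
  readWidth(i: number): number {
    if (this.dirty) {
      this.layoutPasses++;
      this.dirty = false;
    }
    return this.widths[i];
  }

  // Stands in for setting style.width: cheap on its own, but invalidates layout.
  writeWidth(i: number, w: number): void {
    this.widths[i] = w;
    this.dirty = true;
  }
}

// Interleaved read/write: every read that follows a write forces a layout pass.
function doubleInterleaved(layout: ToyLayout, n: number): void {
  for (let i = 0; i < n; i++) layout.writeWidth(i, layout.readWidth(i) * 2);
}

// Batched: all reads first, then all writes, so only one pass is needed afterwards.
function doubleBatched(layout: ToyLayout, n: number): void {
  const measured: number[] = [];
  for (let i = 0; i < n; i++) measured.push(layout.readWidth(i));
  for (let i = 0; i < n; i++) layout.writeWidth(i, measured[i] * 2);
}
```

In this model the interleaved version costs one layout pass per element while the batched version costs one in total. The catch is that the discipline has to hold across every component and library on the page.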


Some of VS Code has been canvas-based since 2017 - https://code.visualstudio.com/blogs/2017/10/03/terminal-rend...


I stand corrected, seems like they’re using canvas for at least the integrated terminal and maybe (?) the editor. Makes all the more sense for Google Docs to follow suit. I’m not against apps moving entirely to canvas by the way, as long as the regular non-webapp sites don’t start doing this just because they can.


The editor is all DOM based apart from the minimap. More pixels could definitely be pushed faster by re-writing it in canvas but it would be quite the undertaking when you consider accessibility, backwards compatibility, Monaco extensibility, etc. with the end result just being an improved scrolling experience.


But VS Code still only has to support monospaced code + some popups and sidebars, not a mix-and-match of fonts and all their variants in all complex layouts, line heights, paragraph spaces, column layouts, images including floats, etc.


JavaScript is actually quite fast these days, especially if one has a compiler in the flow to narrow it to the set of operations that are known to be high-performance.

And Google would be paying a lot of that cost anyway if the DOM is the render target, because what they gain in the render algorithm being precompiled assembly they lose in the JavaScript layer pushing the wrong abstraction around to trigger all that C++ code.


> JavaScript is actually quite fast these days

Do you think a browser written entirely in JavaScript would be competitive with Chrome written in C++?


Do I have Google's engineers to develop it and full control over the implementation of the JS engine?

Am I allowed to compile the JavaScript to assembly?

Because if yes to all of these, then in the abstract, as a thought experiment, I can create an implementation in the JavaScript language with machine code that is byte-for-byte compatible with Chrome written in C++. Step one is write a C++ compiler in JavaScript... ;)

... but more importantly, I don't know how the question is relevant to the question of whether a JavaScript implementation of render commands into a canvas might be faster than a JavaScript implementation of layout declarations that have to play a bunch of games to get desired results from a C++ renderer. The gains from C++ render performance start to get lost if the renderer is making a bunch of wrong guesses about what should be rendered and when.


That's kinda a weird question, since it would obviously depend on what is executing the JavaScript for the browser written entirely in JavaScript. Chrome doesn't ship C++ code to your machine, they compile the C++ to native code for your particular hardware and operating system (presumably with a great many differences and performance tweaks between each compilation target).


They will use WASM, and it will become big and clunky.


Judging from Google Docs' performance, they're already not really using the browser for much.


> With Edge being a Chrome clone

yes

> and Safari being also

yes

> performant browser,

no

Browser UI is so slow when compared to non-browser UI, it's infuriating.


In general, a lot of DOM nodes can reduce performance, and with canvas you have much more control over what gets rendered and how. I would assume this is also nothing new or "special"; AFAIK Google Spreadsheets has been using canvas under the hood for years, with some DOM for nicer UX.


"A lot of dom nodes" is not inherently problematic if you do not insist on twiddling them individually in a JS for-loop. Other than that, it's just HTML - and plain HTML/CSS rendering is blazing fast.


> I'd like to know if there are reasons to move to canvas for strictly technical merits.

In Google's announcements, they say the reason is performance.


That seems like a moderately weak argument. DOM tables are slow, but if you know what you are doing the DOM is otherwise an insanely fast interface. You can get sub-nanosecond response speed from DOM calls in Firefox (faster than a billion operations per second).

It would be interesting to see the performance differences in numbers. The performance impact of a canvas based approach can be approximated from measuring the performance of heavy SVG animations on GPU load.


Raw calls to DOM mean literally nothing when you have to layout those nodes.

And yes, tables are slow. "If you know what you're doing" routinely becomes "let's reinvent virtual lists on an interface that doesn't have a single API to make this pleasant or performant in any conceivable way".
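For reference, the core of a virtual list is small; the pain is everything around it. A bare-bones sketch of the windowing arithmetic, assuming fixed row heights (variable heights, which a document or table usually has, are where it gets unpleasant); `visibleRows` and `RowWindow` are names invented here:

```typescript
// Which rows need to exist in the DOM for the current scroll position.
// `end` is exclusive; `offsetY` is where the first rendered row should be placed.
interface RowWindow { start: number; end: number; offsetY: number; }

function visibleRows(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  rowCount: number,
  overscan: number // extra rows on each side to hide pop-in while scrolling
): RowWindow {
  const first = Math.floor(scrollTop / rowHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight); // exclusive
  const start = Math.max(0, first - overscan);
  const end = Math.min(rowCount, last + overscan);
  return { start, end, offsetY: start * rowHeight };
}
```

With 10,000 rows of 20px and a 300px viewport this keeps a couple of dozen rows alive instead of 10,000, but you then own scroll position, find-in-page, selection across unrendered rows, and so on.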


> For those more deep in web technology, I'd like to know if there are reasons to move to canvas for strictly technical merits.

We’ll see the results, but let’s be honest - the web is a bad fit for apps. It was never designed for them, and it has tons of hacks and layers to make them possible, which makes them messy and slow.


> The web is becoming less open.

Canvas and WebGL are both open standards, no? There are a million and one examples of losing openness, but I think this isn’t one of them.


I bet it's a feeling of, "I used to be able to right click -> view source on any web page and see source code I could understand and learn from. I can't do that anymore, which makes it less open."

Which has some truth to it, but it applies almost equally to the current HTML-rendered Google Docs I'd bet.


> technical merits

The DOM gives a developer very little control over when something should be repainted (or how repainting occurs) relative to the compositing options available when controlling one's own canvas. Moving the rendering engine to canvas allows the docs team more control over the optimizations of layout and rendering they can do (especially cross-platform; there are a hundred hundred mutually-incompatible bugs and quirks in Firefox, Chrome, IE, etc.'s layout and content rendering algorithms that make cross-platform high-performance very hard to guarantee at the DOM layer of abstraction).
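As an illustration of the kind of repaint control being described, a canvas renderer can track exactly which region changed and redraw only that. A minimal dirty-rectangle sketch (a single merged rectangle, which is the crudest possible policy; `DirtyRect`, `unionRect` and `DamageTracker` are hypothetical names, and this says nothing about what Docs actually does):

```typescript
interface DirtyRect { x: number; y: number; w: number; h: number; }

// Smallest rectangle that covers both inputs.
function unionRect(a: DirtyRect, b: DirtyRect): DirtyRect {
  const x = Math.min(a.x, b.x);
  const y = Math.min(a.y, b.y);
  const right = Math.max(a.x + a.w, b.x + b.w);
  const bottom = Math.max(a.y + a.h, b.y + b.h);
  return { x, y, w: right - x, h: bottom - y };
}

// Collects damaged regions between frames so the renderer can repaint only
// the merged region instead of the whole canvas.
class DamageTracker {
  private region: DirtyRect | null = null;

  invalidate(r: DirtyRect): void {
    this.region = this.region === null ? r : unionRect(this.region, r);
  }

  // Returns the region to repaint (null if nothing changed) and resets.
  flush(): DirtyRect | null {
    const r = this.region;
    this.region = null;
    return r;
  }
}
```

On each frame the app would call `flush()`, and if it returns a rectangle, clear and redraw just that area (for example with `ctx.clearRect(r.x, r.y, r.w, r.h)` followed by drawing whatever intersects it). Typing a character then costs one line's worth of pixels rather than whatever the layout engine decides to invalidate.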


It is not unreasonable to request that devs building on top of your platform use an API. They also aren't requiring it, they are simply suggesting it to avoid future breakages when they change the internals of how their application works.

Requiring that they maintain this compatibility would be like requiring the maintainer of an OS library to maintain the contract of a private method because my app relies on grepping their code base to parse the contents of the method.

When you don't have a defined set of public interactions with your app, every change is a breaking change.


I did a small project where I drew a large animated graph with SVG. Despite careful optimization, it ran at a mediocre frame rate and kept spinning up my fans. I know about another project (whiteboard with notes) that ran into the same issues.

PowerPoint, Miro and Figma all run in the Canvas. I don't blame Google for doing the same.


> this reads like Google wants more control over their stuff. The web is becoming less open.

Absolutely agreed. I'd also be surprised if they don't try to roll out the same for search results, ostensibly for the purpose of improving performance, but actually to thwart ad-blockers.


The only thing that's kept everyone from doing that, so far, is accessibility. If not for that, all the major ad platform companies (FB, Twitter, Google, and so on) would already be all-in on Canvas.

Accessibility concerns make it both expensive to develop a UI that renders to canvas, and ensures that the content can be processed & understood by a program (else how will assistive tools read it?), which opens up the door to ad blockers again, defeating the purpose of the whole exercise.

We literally have blind people to thank for the Web remaining as open as it has, for this long.


Do you have any evidence for this idea?


Only that it's a really obvious move for them, for a few reasons (including that making it the norm for stopping ad blockers and trackers forces all their would-be competitors to play catch-up on basic taken-for-granted stuff like displaying text on a screen) and accessibility is the only thing I'm aware of that attacks both the difficulty of the project and the feasibility of the desired outcome, sufficiently to explain why they've not at least given it a shot.


> This is basically putting an API on top of an API as far as I'm concerned.

Exposing a raw on-screen DOM tree as part of your API for extensions/plugins to use is terribly fragile and generally a bad idea.


The article says the reasons are,

> to improve performance and improve consistency


I worry about the same thing, but in Google's defense, Docs is a WYSIWYG document editor, not a webpage for displaying info. It's meant to help users create and edit documents. It has different needs than HTML.


Replying to myself though, haha. I use rikaikun, an extension that adds popup translations to Japanese words. It obviously won't work on Google Docs drawn in canvas, which sucks if someone posts a link to a Google Docs doc. It will also suck for braille support, etc. Maybe they'll need a "fallback to HTML" button.

I also wonder how they'll support CJK IMEs. Generally, when sites try to do their own input, the languages that need an IME get second-class support. You can see an example by looking at the Qt WASM examples, which draw the entire app in canvas using WebGL. They don't support anything but English.

https://www.qt.io/qt-examples-for-webassembly


I have a running theory - Google doesn't create a product unless it captures data in a unique manner compared to their other products. Maybe they have moved past this phase. Maybe I just don't see the collection happening here.


Docs' value proposition isn't really consumer use at all: it's businesses and (mostly) schools paying for G Suite licensing.

