I want to talk about WebGPU (cohost.org)
672 points by pjmlp 9 months ago | 238 comments

I suspect this article may even be underestimating the impact of WebGPU. I'll make two observations.

First, for AI and machine learning type workloads, the infrastructure situation is a big mess right now unless you buy into the Nvidia / CUDA ecosystem. If you're a researcher, you pretty much have to, but increasingly people will just want to run models that have already been trained. Fairly soon, WebGPU will be an alternative that more or less Just Works, although I do expect things to be rough in the early days. There's also a performance gap, but I can see it closing.

Second, for compute shaders in general (potentially accelerating a large variety of tasks), the barrier to entry falls dramatically. That's especially true on web deployments, where running your own compute shader costs somewhere around 100 lines of code. But it becomes practical on native too, especially Rust where you can just pull in a wgpu dependency.
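For a sense of scale, the shader itself can be tiny. Here's a sketch (names and bindings are just illustrative) of a minimal WGSL compute shader that doubles every element of a buffer:

```wgsl
// Bind a storage buffer at group 0, binding 0; read_write so the
// shader can update it in place.
@group(0) @binding(0) var<storage, read_write> data: array<f32>;

// Each workgroup covers 64 elements; the dispatch covers the rest.
@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) gid: vec3<u32>) {
    let i = gid.x;
    if (i < arrayLength(&data)) {
        data[i] = data[i] * 2.0;
    }
}
```

The remaining ~100 lines are host-side boilerplate: requesting an adapter and device, creating the buffer and bind group, and dispatching the pipeline.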

As for text being one of the missing pieces, I'm hoping Vello and supporting infrastructure will become one of the things people routinely reach for. That'll get you not just text but nice 2D vector graphics with fills, strokes, gradients, blend modes, and so on. It's not production-ready yet, but I'm excited about the roadmap.

[Note: very lightly adapted from a comment at cohost; one interesting response was by Tom Forsyth, suggesting I look into SYCL]

WebGPU has no equivalent to tensor cores to my understanding; are there plans to add something like this? Or would this be "implementation sees matmul-like code; replaces with tensor core instruction". For optimal performance, my understanding is that you need tight control of e.g. shared memory as well -- is that possible with WebGPU?

On NVIDIA GPUs, flops without tensor cores are ~1/10th of flops with tensor cores, so this is a pretty big deal for inference and definitely for training.

Shared memory, yes, with the goodies: atomics and barriers. We rely on that heavily in Vello, so we've pushed very hard on it. For example, WebGPU introduces the "workgroupUniformLoad" built-in, which lets you broadcast a value to all threads in the workgroup while not introducing potential unsafety.
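As a sketch of how that builtin is used (the variable names and the stub function here are just for illustration): one thread writes a value into workgroup storage, and workgroupUniformLoad both acts as a barrier and returns the value to every thread, guaranteed uniform across the workgroup:

```wgsl
var<workgroup> tile_base: u32;

// Placeholder for whatever per-workgroup value thread 0 computes.
fn compute_tile_base() -> u32 { return 42u; }

@compute @workgroup_size(64)
fn main(@builtin(local_invocation_id) lid: vec3<u32>) {
    if (lid.x == 0u) {
        tile_base = compute_tile_base();
    }
    // workgroupUniformLoad is a control barrier that broadcasts the
    // stored value to every thread, guaranteed uniform across the
    // workgroup -- no separate workgroupBarrier() needed, and the
    // result can safely feed uniformity-requiring operations.
    let base = workgroupUniformLoad(&tile_base);
    _ = base; // use `base` in real code
}
```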

Tensor cores: I can't say there are plans to add them, but it's certainly something I would like to see. You need subgroups in place first, and there's been quite a bit of discussion[1] on that as a likely extension post-1.0.

[1]: https://github.com/gpuweb/gpuweb/issues/3950

Yes, we expect to have a natural path towards explicit cooperative matrix multiply ops.

If you have a wishlist, we have an issue tracker! ;) https://github.com/gpuweb/gpuweb/issues

I haven't tried it myself, but it looks like several projects are already working on machine learning with WebGPU, and this is one of the stated goals of WebGPU. Some info I found:

* "WebGPU powered machine learning in the browser with Apache TVM" - https://octoml.ai/blog/webgpu-powered-machine-learning-in-th...

* "Fastest DNN Execution Framework on Web Browser" https://mil-tokyo.github.io/webdnn/

* "Google builds WebGPU into Chrome to speed up rendering and AI tasks" https://siliconangle.com/2023/04/07/google-builds-webgpu-chr...

Also, Tensorflow.js WebGPU backend has been in the works for quite some time: https://github.com/tensorflow/tfjs/tree/master/tfjs-backend-...

This is the discussion I hoped to find when clicking on the comments.

> Fairly soon, WebGPU will be an alternative...

So while the blog focused on the graphical utility of WebGPU, the current implementation of WebGPU is about letting websites/apps interface with the GPU in a more direct and advantageous way to render graphics.

But what you're suggesting is that in the future new functionality will likely be added to take advantage of your GPU in other ways, such as training ML models and then using them via an inference engine all powered by your local GPU?

Is the reason you can't accomplish that today because APIs haven't been created or opened up to allow such workloads? Are there not lower level APIs available/exposed today in WebGPU that would allow developers to begin the design of browser based ML frameworks/libraries?

Was it possible to interact with the GPU before WebGPU via Web Assembly?

Other than ML and graphics/games (and someone is probably going to mention crypto), are there any other potentially novel uses for WebGPU?

> ...in the future new functionality will likely be added to take advantage of your GPU in other ways, such as training ML models and then using them via an inference engine all powered by your local GPU?


> Is the reason you can't accomplish that today bc APIs haven't been created or opened up to allow such workloads? Are there not lower level APIs available/exposed today in WebGPU that would allow developers to begin the design of browser based ML frameworks/libraries?

That is correct: before WebGPU there was no way to access the compute capability of GPU hardware through the Web. There have been some hacks based on WebGL, but those are seriously limited. The fragmentation of the existing API space is a major reason we haven't seen more progress on this.

> Was it possible to interact with the GPU before WebGPU via Web Assembly?

Only in limited ways through WebGL - no access to workgroup shared memory, no ability to do random-access writes to storage buffers, etc.

> Other than ML and graphics/games (and someone is probably going to mention crypto), are there any other potentially novel uses for WebGPU?

Yes! There is research on doing parallel compilers on GPU (Aaron Hsu's co-dfns as well as Voetter's work[1]). There's quite a bit of work on implementing Fourier transforms at extremely high throughput. Obviously, physics simulations and other scientific workloads are a good fit. To me, it feels like things are wide open.

[1]: https://dl.acm.org/doi/pdf/10.1145/3528416.3530249

> Was it possible to interact with the GPU before WebGPU via Web Assembly?

Only with tons of restrictions by going through WebGL, or by writing your own or extending an existing WASM runtime outside the browser which connects to system-native GPU APIs like D3D12, Vulkan or Metal.

> are there any other potentially novel uses for WebGPU

In general, the only thing that WebGPU has over calling into D3D12, Vulkan or Metal directly is that it provides a common subset of those 3 APIs (so you don't need to write 2 or 3 implementations to cover all popular operating systems) and it's easier to use (at least compared to D3D12 and Vulkan).

WebGPU is an important step into the right direction, but not a 'game changer' per se (at least outside the browser).

> GPU in other ways, such as training ML models and then using them via an inference engine all powered by your local GPU?

Have a look at wonnx: https://github.com/webonnx/wonnx

A WebGPU-accelerated ONNX inference runtime written 100% in Rust, ready for native and the web.

Could someone explain what kinds of useful things will become possible with it?

I don't get it yet, but HN seems excited about it, so I'd like to understand it.

What I get so far - running models that fit on consumer-sized GPUs will become easier, because users won't need to download a desktop app to do so. This is limited for now by the lack of useful models that can be run on consumer GPUs, but we'll get smaller models in the future. And it'll be easier to make visualizations, games, and VR apps in the browser. Is that right? And what other popular use cases am I missing where people currently have to resort to WebGL or building a desktop app?

It's not so much that people want to run the models in the browser. They want to be able to write and publish one desktop app that e.g. runs a LLaMA quality model and runs decently across the hundreds of different GPUs that exist on their users' machines.

So it's more of a CUDA alternative for building desktop apps that are GPU powered?

Nowhere near CUDA. Maybe an OpenCL and Metal replacement, because nobody bothers to support those - so just a fallback option for AMD and ARM chips.

So then:

If your app needs CUDA, you'd need to write it in CUDA.

If you don't need CUDA, you'd write it for WebGPU instead?

If you benefit from CUDA but don't strictly require it, then you can write it in CUDA with a WebGPU fallback.

Is that right?

I think you only write WebGPU code if you are using JS or Rust for your hobby project

> because users won't need to download a desktop app to do so

Not only that, WebGPU is OS and hardware neutral. It'll work regardless of what machine you have. Currently it's Nvidia or get lost.

If this is something users actually care about, wouldn't Vulkan be a big deal for AI? It has way more features for AI than WebGPU will likely ever have, and it's widely available today.

The blog spends a lot of time pointing out that it is not, in fact, widely available today in any practical sense.

It spends a little time on the topic, and imho doesn't argue the point very well.

It points out that MoltenVK doesn't work very well, and I agree, although I think it's questionable whether the Molten-compatible subset is worse than WebGPU or not.

It says that Vulkan has driver issues on Linuxes, but provides no citations for that, and in my experience, Vulkan drivers and developer tools have been the best on Linux, overall (my experience is limited to Nvidia, so maybe that's why, Idk).

HN was similarly excited about Vulkan, but this is newer.

This fixes the main problem with Vulkan, which is that there were no big tech companies pushing it. WebGPU has Apple, Google and Microsoft all committing to support it in their browser/OS.

Not quite. The main problem with Vulkan is, as the blog post goes into, one of usability. Vulkan isn't designed for end developers to use, it's designed for middleware vendors to use. Vulkan is already being pushed by many major big tech companies including Google, Nvidia, AMD, etc... It's really only Apple that's a problem here.

WebGPU is basically a middleware for those that just want to use graphics, not all of unity, unreal, etc...

There is no Vulkan on game consoles, those little devices HN keeps forgetting about.

Yes, the Switch supports Vulkan alongside OpenGL 4.6, but if one wants the full power of the games console, the name of the game is NVN.

As for Apple, all relevant middleware has support for Metal.

The Vulkan situation on Apple is incredibly funny, especially once you look at the landscape of gaming. There's games out there (I believe Final Fantasy XIV is like this) where the official Mac client runs in wine, using a DirectX -> Vulkan translation layer, on top of a Vulkan -> Metal translation layer.

One can go that way, or use one of the middleware engines that has support for Metal for several years now.

I'm not clear this is accurate. More, I'm not sure these are the right companies to push?

Many companies have tried to unseat Nvidia, many of them big companies. There is no magic in the ones you named. Worse, I'm not convinced that Google will bring stability to the project. So I see little reason to be overly excited.

Happy to be proven wrong.

> unless you buy into the Nvidia / CUDA ecosystem

Coming at it from a graphics processing perspective, working on a lot of video editing, it's annoying that just as GPUs start to become affordable as people turn their back on cryptobro idiocy and stop chasing the Dunning-Krugerrand, they've started to get expensive again because people want hardware-accelerated Eliza chatbots.

Anyway your choices for GPU computing are OpenCL and CUDA.

If you write your project in CUDA, you'll wish you'd used OpenCL.

If you write your project in OpenCL, you'll wish you'd used CUDA.

I've never once regretted my decision to write GPU code in CUDA... I mean, I wish there were alternatives because being locked into Nvidia isn't fun, but CUDA is a great developer experience.

> If you write your project in CUDA, you'll wish you'd used OpenCL.

Only if one is stuck in a C for GPGPUs mindset.

>the Dunning-Krugerrand

Stolen. Genius. You made my day.

I stole it from someone else first, and I'm sorry to say I can't remember who.

Credit where it's due, though, it's an awesome term.

I think I first heard it from @cstross

SPIR-V seems to be a real option anywhere Vulkan is available

The thought also came to mind, but after listening to the work of Neural Magic on Practical AI [1], and how model quantization on CPU is advancing by leaps and bounds, I don't foresee our strong dependence on CUDA persisting, even in the near future.


Is Vello called that because 2-D graphics are very difficult, i.e., "hairy"?

> so reportedly Apple just got absolutely everything they asked for and WebGPU really looks a lot like Metal

...tbh, I wish WebGPU would look even more like Metal, because the few parts that are not inspired by Metal kinda suck (for instance the baked BindGroup objects - which require knowing upfront what resource combinations will be needed at draw time, or otherwise creating and discarding BindGroup objects - which are actual 'heavy weight' JavaScript objects - on the fly).

So much this. Metal is so elegant to use. I've tried reading through Vulkan docs and tutorials, and it's so confusing.

Also, this seems like some major revisionist history:

>This leads us to the other problem, the one Vulkan developed after the fact. The Apple problem. The theory on Vulkan was it would change the balance of power where Microsoft continually released a high-quality cutting-edge graphics API and OpenGL was the sloppy open-source catch up. Instead, the GPU vendors themselves would provide the API, and Vulkan would be the universal standard while DirectX would be reduced to a platform-specific oddity. But then Apple said no. Apple (who had already launched their own thing, Metal) announced not only would they never support Vulkan, they would not support OpenGL, anymore.

What I remember happening was that Apple was all-in on helping Khronos come up with what would eventually become Vulkan, but Khronos kept dragging their feet on getting something released. Apple finally got fed up and said, "We need something shipping and we need it now." So they just went off and did it themselves. DirectX 12 seemed like a similar response from Microsoft. It always seemed to me that Vulkan had nobody but themselves to blame for these other proprietary libraries being adopted.

> What I remember happening was that Apple was all-in on helping Khronos come up with what would eventually become Vulkan, but Khronos kept dragging their feet on getting something released. Apple finally got fed up and said, "We need something shipping and we need it now." So they just went off and did it themselves.

This is not really how it happened. AMD released Mantle back in 2013 based off their experience with game console specific APIs. From what I remember, AMD expressed some interest in Mantle becoming a cross-vendor standard, but were a bit wishy-washy early on. GDC 2014 then saw some AMD talks on Mantle, the announcement of DirectX 12 from Microsoft and the AZDO talk. Apple then announced Metal in June of that year. The "Next Generation OpenGL Initiative" kicked off around that same time with a public call for participation in August. Apple did join the working group at some point (they're one of the many companies listed in a slide from the announcement presentation), but I don't see any evidence that they were ever a major player in the standard.

Now obviously, Vulkan was not an option for Apple in 2014 when they announced Metal since the project hadn't really gotten started yet, but I don't see any evidence that they pushed Khronos to get started on a replacement for OpenGL earlier either. They were also lagging behind on OpenGL support for years before they announced Metal (they stopped at 4.1 which was released in 2010) and notably almost none of the techniques presented in the AZDO talk worked on Mac OS for this reason. I'm sure part of the reason they decided to go their own way with Metal is that you can move faster as a single company, but I think it would be naive to assume that making cross-platform development between iOS and Android harder wasn't a factor.

Additionally, given how Longs Peak went, had it not been for AMD giving Mantle to Khronos, they would still be arguing to this day about how OpenGL vNext was supposed to look.

I think you're doing your own revisionist history here. The rumors are all that Apple is the one that blocked Khronos' initial attempts with the "Longs Peak" proposal that was supposed to become OpenGL 3.

And while Metal was released in 2014, Apple had already stopped updating OpenGL way back in 2010.

Apple is also supposedly engaged in a legal dispute with Khronos, which is why they so vehemently rejected SPIR-V in WebGPU.

Mantle (which is what became Vulkan) also came out before Metal did (2013 vs. 2014)

> Direct X 12 seemed like a similar response from Microsoft

That seems like a stretch since nobody ever expected Microsoft to do anything aligned with Khronos. They'd been doing their own thing for a decade+, why would you think DX12 was anything different?

Caused by how Khronos mismanaged OpenCL.

> What I remember happening was that Apple was all-in on helping Khronos come up with what would eventually become Vulkan, but Khronos kept dragging their feet on getting something released. Apple finally got fed up and said, "We need something shipping and we need it now."

Is there any evidence for this?

> Is there any evidence for this?

Timeline? Metal first came out 2 years before Vulkan.

How is that evidence that Apple ever tried to engage with Khronos on a Metal-like API, though?

Seeing as Apple had already stopped updating OpenGL versions about 4 years before the release of Metal, it seems more likely that Apple never planned on working with Khronos on anything.

Well, knowing that Apple is member of the Khronos group, it would be very surprising that they did not get involved with Vulkan.

Do a search for Apple & Khronos and you'll find a lot more examples of spats between them over the last decade than anything else. Including an ongoing legal dispute.

You also won't find any hint of involvement from Apple in anything Vulkan-related. Or really anything else Khronos-related. They've even pulled out of OpenCL - the thing that only Apple ever cared about in the first place.

No idea why they are still a member of the group. Possibly they just haven't been kicked out yet, possibly they still want to retain voting input for something (like webgl).

For the same reason Sony, Nintendo and Microsoft are, they care about specific parts of Khronos, not all of it, not for all their products.

Sony cares about Khronos on Android phones, not so much on Playstation.

Nintendo supports OpenGL 4.6 and Vulkan on the Switch, with NVN as the main API. During the previous generations they started to support GL like APIs, not really 1:1 to the standards.

Microsoft was the main contributor to the initial set of glTF 2.0 improvements, BabylonJS is the first browser engine to drive most of the WebGPU efforts, and they care about Khronos APIs in the context of WSL.

Apple once cared about Quickdraw 3D, OpenGL was the path out of the tiny market they were in and what NeXTSTEP used anyway, nowadays all relevant game engines support Metal anyway.

Apple is a member of lots of groups. They were part of the Blu-ray consortium but never ever shipped a Blu-ray drive or Blu-ray authoring support

I know people below are disputing your retelling based on how they externally viewed development, but from talking to people who were in the development committees at the time, your retelling is the one that matches up best.

BindGroups should not be that heavyweight, and there are murmurs of a proposal to recycle BindGroups by updating resources in them after the fact.

> WebGPU goes live… today, actually. Chrome 113 shipped in the final minutes of me finishing this post

Note that WebGPU in Chrome is not yet available for Linux.

Firefox Nightly has some partial support for Linux so that the first two of these examples work for me: https://webkit.org/demos/webgpu/

Hi there, member of the Firefox WebGPU team here! We don't consider WebGPU on FF "ready" yet, but you're welcome to follow along with our progress on Bugzilla: https://bugzilla.mozilla.org/show_bug.cgi?id=webgpu-v1

It works with --enable-unsafe-webgpu --enable-features=Vulkan command line arguments. Not very stable though.

Sadly it will likely be years and years still until we get broad adoption, e.g. to where even old Androids can use it

Old Android phones will support WebGPU the day Chrome ships WebGPU support for Android. Dawn, Chrome's WebGPU implementation, works on Vulkan 1.1, which has been required since Android 9.

With varying levels of "works", and Android drivers don't get updates, so in those cases Chrome will most likely blacklist the drivers and offer no support.

Chrome is updated separately from the Android OS itself, unlike iOS and Safari.

I'm on Chrome on Windows and some of those examples work but most do not.

That's because these demos seem to be out of date (originally those were for WebKit's WebGPU prototype, and then probably updated along the way but not completely).

Try these instead: https://webgpu.github.io/webgpu-samples

The article has a humorous history of graphics APIs that I very much enjoyed. I did the Vulkan tutorial for kicks one month on Windows (with an Nvidia GPU) and it was no joke, super fiddly to do things in. I look forward to trying WebGPU in anything that isn't JavaScript (some language that compiles to WebASM or transpiles, I guess).

It was interesting...except that it omitted SVG entirely. So one should take it with a grain of salt.

How many game engines run on top of SVG?

How many games target SVG as their graphics pipeline?

How many applications, for that matter?

The comparison to PDF was better than you probably realize. PDF is Turing complete, and there have been ray-tracers implemented in it.

Not wanting to disregard your pint, but there are definitely applications in SVG. The user interface for the most famous kitchen “robot” (Thermomix) is done entirely in SVG, and they (through open source consultants Igalia) are one of the most important contributors to SVG and Canvas functionality in WebKit.


> Not wanting to disregard your pint

Please leave my drink alone. :-(

I think you're confusing PDF with PostScript. The whole point of PDF was to remove the Turing completeness from PostScript.

PostScript ray tracer:


> Don't send this one to a printer. It will take too long

Damn, and here I was going to send it to my LaserWriter IINT!

Ahhah, I sit corrected! Thank you.

Oh, the comparison to PDF was perfect. PDF and Metal have the same recent ancestry with PostScript.

What ugly prejudices so many people on this thread seem to have. And how trivial and nonsensical. All I said was she should not have omitted svg in her history of graphics apis since she's writing an article about browser graphics. To deny that svg is a part of that, or that people don't program to it... well, I was going to write "ridiculous" but the xkcd about people who don't know things comes to mind...is it possible you just don't know that so many people make things with SVG? In any event, it seems I've touched a nerve that I didn't want to, and I find it ugly, and this is my free time, so I'm leaving the convo.


> All I said was she should have not omitted SVG in her history of graphics APIs since she’s writing an article about browser graphics. To deny that SVG is a part of that

You are continuing to double down on a straw man, which is why you’re getting all the push-back. The ugliness is ugliness you’ve single-handedly created because this is not an article about the history of browser graphics. It would have included SVG if that were the case, but it’s not, and you’re making assertions that aren’t warranted or justified here. This was an article about GPU APIs. Canvas was only mentioned briefly off-hand in passing twice, and the history of Canvas was not discussed at all, nor was the history of browser development. Nobody denied that SVG is part of anything, you are projecting your off-topic wishes onto a discussion where SVG simply does not belong, it’s tangential and not relevant to the article. It’s not relevant here how many people make things with SVG.

SVG is great. I wish development for SVG2 hadn’t stalled out, I would like to see SVG2 become broadly supported, there are some new options for dynamic scaling I’ve wanted to use for ten years. That said, this is a completely separate topic from the article & thread. You could get your SVG fix by submitting an article to HN that’s actually discussing SVG rather than trying to hijack the comment thread on an article about GPUs.

> it seems I've touched a nerve that I didn't want to, and I find it ugly

Well, yes, but you can see how your original comment may have been read as dismissive of other people's work?

Not only that, his comments are factually incorrect, while he glibly dismisses a well written, factually correct article as untrustworthy, based on his incorrect assumption that his incorrect facts imply her true facts are false. "So one should take it with a grain of salt." Sheez. How rude.

In what sense is Metal descended from PostScript? The comparison is anything but perfect: they're practically polar opposites.

Metal isn't a text based programming language like PostScript, and it has a completely different graphics model.

PostScript isn't pixel oriented, or 3d oriented, and Metal doesn't have a stencil/paint graphics model or text rendering.

They're totally different things, by totally different people and organizations.

Metal uses SPIR-V binary byte code, which is nothing like PostScript, a stack-based, high-level, polymorphic cross between Lisp and Forth, and a descendant of Interpress, JaM, and the Design System.

As programming languages go, and as imaging models go, PostScript and Metal are at completely different ends of the spectrum.

SVG, like PDF, is the PostScript stencil/paint imaging model, without the Turing complete programming language, but expressed as XML, with some CSS thrown in. And some implementations of SVG even let you define event handlers in JavaScript, but Metal sure doesn't let you do that. And to top it off, SVG is "object oriented" (i.e. a DOM) while Metal is "immediate mode" (i.e. an API). They couldn't be more different.

Since your facts are wrong, and you're throwing around insults like "ugly prejudices" and "trivial and nonsensical" and "ridiculous" and "people who don't know things" and "I've touched a nerve" and "I find it ugly", then it's a good thing that you're "leaving the convo", because you haven't contributed anything but misinformation and insults.

More about PostScript and its history:


> Metal uses SPIR-V binary byte code

It certainly doesn't, except that they're both LLVM based. As the article says, Apple can't use anything from Khronos.

That's right, I was mistaken. The point I was trying to make is that compiled shader languages are very unlike dynamically interpreted PostScript, which is much more like Lisp or Smalltalk than C or C++.

She omitted it because it's not an API, it's a file format. Even if you want to be lax about what you call an API, SVG was never a competitor in this space. She also left out Canvas, DirectDraw, Direct2D, GDI, etc., because they're just not relevant to any of this.

> ...article about browser graphics

Despite the name, WebGPU isn't limited to browsers, I think the article sort-of implies that without explicitly pointing it out (until late into the article).

Instead WebGPU has a good chance to become the 'missing link' standard cross-platform 3D API that sits above D3D12, Metal and Vulkan and at the same time is relatively straightforward to use for mere mortals.

SVG is not considered a graphics API, just like PDF isn't.

You mean YOU don't consider SVG a graphics API. But it's right there in the name, dude.

If the article is going to bring in Canvas, it should bring in SVG. SVG shares more with opengl than Canvas does.

SVG is just a file format though, that I suppose you can manipulate in the DOM. By your standard glTF files would be an "API" also. Have you written D3D/GL/Vulkan/Metal? It's not even remotely similar.

> SVG shares more with opengl than Canvas does.

SVG is declarative.

Canvas and OpenGL/Vulkan are imperative.

These are all tools programmers use to draw to the screen. In the browser there are now 3 such tools: svg, canvas, and webgpu.

svg's primitives are shapes and you program it with js and dom.

canvas's primitive is a 2D array of pixels, and you program it with js and an api.

webgpu's primitives are pipelines (or whatever) and you program it with js and an api.

Maybe you're hung up on using the dom instead of an api. But the dom is an api, and it is further specialized by svg. Mutating the screen is exactly equivalent to mutating the dom within SVG. The mental model is that of a persisting scene that you nudge around with changes. This, versus the mental model of canvas where shape persistence is up to you. My understanding of low-level 3d graphics (limited to that of a hobbyist) is that driver commands set up a pipeline and then nudge around a scene graph in a similar fashion to svg. So it is doubly ironic that she would omit it.

I'm going to give you the same unwarranted snark that you gave the author:

"In the browser there are now 3 such tools: svg, canvas, and webgpu."

That's wildly incorrect, and anyone should take your comments with a huge grain of salt. There are far more than 3 drawing tools in the browser.

Why are you excluding CSS? You can draw with it, as well. https://codepen.io/matheuswf95/pen/wgRMwW

Why are you excluding Video HTML elements? It's possible to pipe directly to them from Javascript.

Why are you excluding drawing in the DOM? Do you hate ASCII and ANSI art? https://www.crummy.com/software/ansi2html/

Why are you excluding the JS Console? You can draw to that. https://dev.to/shriji/game-on-console-log-5cbk

Why are you excluding the HTML Title? Someone made a game that you play entirely in the browser's title bar. https://titlerun.xyz/

Why are you excluding GIF? I can't find it, but I saw someone made a remote desktop client that encoded the video as a never-ending GIF stream.

Why are you excluding fonts? You can create custom fonts and use them to draw all kinds of crazy things.

We draw the line somewhere. I thought the History of Graphics APIs in the article was great. Sorry it disappointed you. I felt your attack on the author's credibility, "[I]t omitted SVG entirely. So one should take it with a grain of salt" was pedantic and rude.

I had a lot of fun clicking through these links. One of my favorites to add here is drawing with the Checkbox. https://www.bryanbraun.com/checkboxland/

You can create an HTML table with 1 pixel cells, and use the DOM api to change individual cell background colors to make it work like a canvas.

Does that mean that HTML tables are a graphics API?

> svg's primitives are shapes and you program it with js and dom

Maybe you do and that's fine. I program SVG by dynamically generating it on the server in Lisp. No js required unless I need to change an element at runtime. SVG is much more centered on "paint a picture with vectors and then send it to the browser" than "call a function to draw this vector RIGHT NOW" which is why people here are saying it's not really an API the way Vulkan and WebGL and WebGPU are.
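To illustrate that model (the function, data, and attribute choices here are just for the example, not the commenter's actual Lisp code): server-side SVG generation can be pure string templating, with no drawing calls at all - you describe the picture once and the client's renderer paints it.

```javascript
// Build an SVG document as a string: declare the shapes, don't draw them.
function barChart(values, barWidth = 20, height = 100) {
  const bars = values
    .map((v, i) =>
      `<rect x="${i * barWidth}" y="${height - v}" ` +
      `width="${barWidth - 2}" height="${v}" fill="steelblue"/>`)
    .join("");
  const width = values.length * barWidth;
  return `<svg xmlns="http://www.w3.org/2000/svg" ` +
    `width="${width}" height="${height}">${bars}</svg>`;
}

// Serve the result as image/svg+xml (or inline it in HTML);
// nothing imperative runs until the browser paints it.
console.log(barChart([30, 80, 45]));
```

Contrast with canvas, where the equivalent would be a sequence of fillRect calls executing immediately against a pixel buffer.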

For SVG to be considered an API, HTML would have to be one too. For SVG to be relevant to TFA, it would also need to express 3D geometries, and SVG is focused on 2D.

It's already god knows how many words long.

How is SVG connected to OpenGL?

Thanks for writing this article! I am super excited about WebGPU. One not-so-fancy prospect worth commenting about on HN, though, is replacing Electron.

With WASM-focused initiatives to create hardware accelerated UI in the browser, we may soon see a toolchain that deploys to a WebGPU canvas and WASM in the browser, deploys native code linked to WGPU outside the browser, and gives the industry a roadmap to Electron-style app development without the Chromium overhead.

This is also my fear, but... I don't know, I think the potential outweighs the risks.

In some ways the problem with everyone trying to render to canvases and skip the DOM starts from education and a lack of native equivalents to the DOM that really genuinely showcase the strengths beyond having a similar API. I think developers come into the web and they have a particular mindset about graphics that pushes them away from "there should be a universal semantic layer that I talk to and also other things might talk to it", and instead the mindset is "I just want to put very specific pixels on the screen, and people shouldn't be using my application on weird screen configurations or with random extensions/customizations anyway."

And I vaguely think that's something that needs to be solved more by just educating developers. It'll be a problem until something happens and native platforms either get support for a universal semantic application layer that's accessible to the user and developers start seeing the benefits of that, or... I don't know. That's maybe a long conversation. But there has to be a shift, I don't think it's feasible to just hold off on GPU features. At some point native developers need to figure out why the DOM matters or we'll just keep on having this debate.

People wanting to write code that runs on both native and the web is good, it's a reasonable instinct. Electron-style app development isn't a bad goal. It's just how those apps get developed and what parts of the web get thrown out because developers aren't considering them to be important.

When it comes down to it, I think most of our issue is that the technical details still matter enough that we can't afford to provide a universal human interface layer...and when that falls on the app developer, either they compromise the app or they compromise the user, because nobody is capable of filling in every rough edge and also doing the thing they set out to do in that moment. If they actually set out to create a fix, that quickly turns into their career.

A lot of the "web frontend churn" meme is derived from the quest for a slightly better compromise - which has produced some results if what you are building resembles a web page. But over and over, native devs will say, "no, I want the hardware access. This layer still doesn't address my use case." That's been true of anything deeply I/O related on the Web - graphics, audio, input. Things where you can't make a simplifying assumption if you actually start covering all use cases.

> I think most of our issue is that the technical details still matter enough that we can't afford to provide a universal human interface layer

I don't think this is true for the majority of native apps I run. I think a lot of developers think they need low-level control over exactly what pixels on the screen turn what colors, but I don't think most apps are in that category. Devs that use the GPU a lot are just used to working that way.

There are cases where that kind of access is needed. I use WebGL right now, I will use WebGPU when it comes out. But how much of your desktop software genuinely, legitimately could not be described using a pure-text tree? Maybe Blender? Even an app like Krita really only needs precise control over pixels for the actual drawing part.

It's mostly games and very experimental apps and visual content creation tools. Everything else is at most "we need a buffer rendered via the GPU embedded in this interface for one thing".

My take on this is: if your app works with a screenreader, that app is probably representable via some kind of universal interface. And the advantages to having a layer like that on native platforms would be profound. People talk about how great the terminal is, imagine if you could do everything you can do on the terminal -- piping output between programs, scraping content, building wrappers around interfaces, easy remote access, etc -- imagine if you could do all of that for all of the GUI apps on your computer too.

What we would lose in developer flexibility to define an initial interface we would make up for ten-fold with the additional flexibility developers and users would have to build around existing interfaces and to extend them.

> A lot of the "web frontend churn" meme is derived from the quest for a slightly better compromise

I'm also not sure this is entirely true. Yes, performance is a thing people talk about, but systems like React came out of developers wanting a declarative model for building interfaces. In many ways it was going the opposite direction of canvas/WebGL code. A lot of other libraries are similar -- you look at something like d3, that's not really about how things get drawn on the screen, what makes it special is data bindings.

Debates about the virtual DOM are kind of a small subset of the churn that we see online, and my impression is that the churn is pushed more by people debating how to build interfaces in the first place and how to further abstract that process, not how this stuff should get rendered to the screen.

We can have a high diversity of tools for building interfaces; what makes the DOM special is that it's a universal render target. Whatever tool you use, the user gets the same semantics. And the DOM isn't necessarily a good universal render target. But it's kind of telling that even though the DOM has a ton of issues, the underlying philosophy of "you should have to describe what you want to put on the screen using text first and then we'll let you style it" is itself really empowering for end-users.

The only possible application I can imagine for that would be videogames, though.

Because HTML+CSS+JS provides a fantastic cross-platform UI toolkit that everybody knows how to use.

Videogames create their own UI in order to have lots of shiny effects and a crazy immersive audio-filled controller-driven experience... but non-videogames don't need or want that.

Heck, I'm actually expecting the opposite -- for the entire OS interface to become based on Chromium HTML+CSS+JS, and eventually Electron apps don't bundle a runtime, because they're just apps. My Synology DSM is an entire operating system whose user interface runs in the browser and it just... makes sense.

> Videogames create their own UI in order to have lots of shiny effects and a crazy immersive audio-filled controller-driven experience... but non-videogames don't need or want that.

Because UI is such a pain, it's actually not uncommon for browser-based games to use HTML/CSS to handle their menus and HUD elements.

HTML/CSS definitely made prototyping easier for the game I'm working on, although I'm not sure if I'll stick with it for the final interface.

My only critique in that area is that I can't put shaders on the DOM. CSS shaders were part of the CSS spec but they turned out to be pretty big privacy/security risks I guess, so they got deprecated before they ever really landed. I never figured out the full story.

There's post-processing that you can do on a canvas that you can't do on a DOM tree -- not because it wouldn't be possible to do, just because... for multiple reasons, some of them very good, the browser doesn't support that kind of thing. If you could do fragment/vector shaders on parts of the DOM, I would very likely be using it for basically every text element in my game's interface.

Those parts of the interface aren't really a performance concern either. The parts of my game where I feel like I really want GPU code are sprite rendering, the main gameplay window, etc... the menus are fine, I just don't like that I can't throw a fragment shader on top of them.

What is holding back a UI framework for WebGPU? It won't have HTML but couldn't you just simulate a scaffold inside the WebGPU canvas?

Why would you duplicate the massive amount of helpful functionality and styling HTML+CSS already provides? What's the point?

It's one thing to build panels and buttons for a video game. It's a totally different beast to build a text layout and editing and selection engine, accessibility and screen reader compatibility, clipboard support, date pickers and file dialogs, and ten thousand other things.

In this hypothetical 3D application, made possible by WebGPU, the UI and text elements are part of the "video game" and need physics, animations, and x/y/z spatial controls missing from HTML+CSS.

Flutter would seem to be a logical replacement (though admittedly last time I checked, there are big issues with Flutter Web on some of the points you mentioned, particularly around text selection and accessibility).

Even in the current HTML + CSS world, everyone writes their own date picker, file dialog, combo menu, etc. in React. It’s no longer acceptable to use built-in browser controls.

For most applications you'll want to do things like render text nicely, select text, copy and paste text, right-click to inspect, boost font sizes and change colors for accessibility, auto-translate to other languages.

A lot of it is about text, I guess!

Once you've re-implemented all that in a canvas, you've more or less built another web browser, so why not just use HTML?

Presumably the argument in this thread is that lugging around an Electron app just for HTML is overkill if you could put the same functionality in a much smaller package.

Also, we may be able to create those packages from parts of browsers, which wouldn't be a re-implementation, and would benefit from the rigor of the standards.

I agree that it would be kind of silly to then put this in the browser. Instead, I think it could be used to build the native target.

And just to clarify my earlier statement, I'm not a fan of phasing out the use of the DOM in the browser— it has graceful degradation and accessibility features that are really important. It's more that there are people with existing Qt or JavaFX or Cocoa dependent projects, or whatever, who are now able to deploy to the web by implementing the UI layers in WASM/WebGPU.

Alright guys, hear me out…

Let’s build a browser inside of virtual assembly code that runs in a browser that runs on an OS that runs on real assembly code (sort of). We could run that on a physical processor, or in a VM (hey, maybe we could run that VM inside a browser that runs on…)

CAD stuff could take advantage of this too.

At the end you pointed out the issue: the browser is good at displaying documents, but a program wants to run in the OS directly, and trying to make the browser an OS inside the OS only makes sense if you are Google and are trying to wrestle with Microsoft for control of the OS that the average user is exposed to.

> for the entire OS interface to become based on Chromium HTML+CSS+JS

I don't care about HTML and CSS and JS in specific, but yes, holy crud I want my native OS to be based on a user-inspectable XML-like tree. Yes please.

It doesn't have to be HTML if everyone hates HTML. But there is a lot of potential beyond just developer convenience to having native apps start representing their interfaces using semantic language. I don't know how to get across how much stuff that would enable on the computer. I need to sit down at some point and try to build proof-of-concepts because the idea is something I genuinely just randomly get excited about sometimes, even though I know it's probably not going to ever actually exist any time soon and nobody is working on it.

But just off the top of my head:

- I would like to be able to have extremely low-latency remote-desktop access by streaming only the "DOM" tree instead of the entire screen.

- I would like to be able to apply user-CSS styles to every single app on my computer. GTK/Gnome styling/etc... sort of works in this direction but... it's not as good. Let me right-click on a running app and inspect element. Let me style stuff in real time. This is something that was always a part of native development and still kind of is, but it feels like the native platforms have all gotten worse in this regard (but maybe that's just anecdotal).

- I would like to be able to pipe the interface of an app into another app and scrape its content.

- I would like to be able to script my native apps by running DOM queries rather than by learning some plugin language or simulating mouse movements.

- I would like to be able to stream a specific sub-tree of an app's interface to another device. I would like to be able to open up my phone and have it show specifically the color picker from Krita and have it pipe my input back into Krita and I don't want to learn a Krita-specific scripting language to do that, I just want to run a query and pull out that sub-tree and stream it. I would like to be able to have an app just pull out the "canvas" element from Krita and occasionally snapshot it or mirror it to another monitor.

- I would like to have multiple "views" of a single application by just passing the same application's "DOM" into two different renderers without needing to have two processes for the actual application.

- I would like to be able to do all of that on-the-fly without sitting down and writing some specialized program, the same way I can do stuff on a website by just right-clicking and going "inspect element."

And again, I don't really care if it's the DOM specifically, I just want all the stuff the DOM provides me as a user.

> I don't care about HTML and CSS and JS in specific, but yes, holy crud I want my native OS to be based on a user-inspectable XML-like tree. Yes please.

I mean, that's been the case for most GUI toolkits I know of, for decades before Electron was a thing, and it's part of why I've never been impressed by the value proposition of the DOM and the web in general as a GUI toolkit/rich application platform, as pitched by people with no prior exposure to proper GUI frameworks.

Qt has GammaRay, JavaFX has Scenic View, GTK has a built-in inspector (Ctrl+Shift+I), and Windows has tools like UISpy/Inspect.

> that's been the case for most GUI toolkits I know of

Yes and no? Or at least, yes and to the best of my knowledge, no.

Yes, GUI toolkits do have similarities. But none of these tools have ever really been expanded or utilized to the same degree that the DOM has been. If you look at the documentation for most of the tools you link, they're primarily billed as debugging aids. They're also not something that the OS itself treats as a first-class feature.

I do think in particular GTK has been moving more in this direction, but my (possibly incorrect) take on it is that it's basically copying the web (which to be clear, is a good thing). GTK outright uses CSS now.

I'll also add onto this that I don't think the DOM is the equivalent of a GUI toolkit in the first place, I think it's a render target. It's at its best when it's targeted as a user-facing feature, not as a developer aid. That's not really ever been my perception of native toolkits (although to be fair, system theming has been a thing forever, but that's really just scratching the surface of what's possible here).

There's stuff I can do with a browser console and extensions that I can't do as a user with a compiled GTK app -- and that's even acknowledging that browsers themselves don't leverage the DOM to its full potential. But they're still miles ahead of what I can do with native apps as a user.

Not to say that these inspectors couldn't go in that direction. And I hope they do. Like I said, I'm not married to HTML in specific, I don't really care about how we end up heading in this direction, I'm mostly just pointing out that there's really serious advantages to having an XML-based text interface that's inspectable and that is separate from styling and those advantages get lost if everything just gets spit out as a pixel blob. Qt in particular is fun to bring up here because their web targeting makes the exact same mistake, so I don't think GammaRay is proof that they actually understand the strength of this model, because otherwise I don't think they'd be making this mistake when targeting the web.

What I will say in praise of these tools is that the success of platforms like Qt and GTK should on some level be evidence that "the web is for documents, not apps" is kind of nonsense. Most apps are documents. Qt and GTK do just fine while forcing the developer to build a hierarchical UI of standardized components. They have no problem standardizing things like input handling. A pure-text tree-based UI with universal semantic components works for most apps, even on native. I mean, GTK literally uses XML for interfaces. So I don't buy the whole "document/application" distinction that people sometimes pull out; everyone is already building interactive documents on native and it's completely fine and completely sufficient for a ton of serious applications.

> But none of these tools have ever really been expanded out or utilized to the same degree that the DOM has been utilized.

I think you got the history backwards: the DOM was never designed to hold rich applications, controls and containers the way a GUI toolkit does. HTML got forms (later, with HTML4), then organically grew more diverse and specialized controls (without ever reaching parity with most GUI toolkits in this area), never quite figuring out containers or behaviour inheritance (unlike in object-oriented programming and the "Widget" model), while JS got steadily tamed and reshaped into a general-purpose language. I don't disagree with the assertion that "the DOM has been utilized in extreme ways", I just don't think that's a positive thing, simply because what we ended up with is something extremely ad hoc, inefficient and counter-productive (but as the saying goes, "when all you have is a hammer…"). What made the Web a successful platform is the "URL as deployment process", not that it's great tech (on either the producing or consuming side).

> If you look at the documentation for most of the tools you link, they're primarily billed as debugging aids. They're also not something that the OS itself treats as an first-class feature.

Which is totally different from the web browser's DOM appearing under "developer tools", or is it? Also, this ability to hook onto the DOM and play with things in the console disappears with packaged Electron apps, which often turn out to be less "introspectable" in practice than their native counterparts. Even on the "normal" web, this ability becomes a rarity as more and more pages get minified and bound to heavy SPA frameworks.

> my (possibly incorrect) take on it is that it's basically copying the web (which to be clear, is a good thing). GTK outright uses CSS now.

And it's not the only one: JavaFX and QML (declarative Qt) can be built from XML and styled using CSS-like rules. Microsoft has had many attempts at declarative UI based on XML (XAML, UWP, …). XUL was Netscape's solution for building applications with HTML at a time when HTML wasn't ready (which lasted well into recent history; 2018, IIRC, being when Mozilla could finally go without it). Though let's not conflate everything: the Web implies the DOM, but CSS/XML doesn't, and defining documents and UIs in SGML is much older than HTML itself.

> Also, this ability to hook onto the DOM and play with things in the console disappears with packaged electron apps, which turn-out to be often less "introspectable" in practice than their native counterparts.

To be clear, Electron is NOT what I am asking for. Electron is not fulfilling the spirit of having native DOM. I don't care what it's authored in, I care about what I can do with it as a user. Electron is in some cases a step backwards because I lose even the very limited styling options that the OS does provide.

It's not about what the developer writes, it's about what I as the user am presented with. The major strength of the DOM is that you have to describe your interface using text to the user, not to the compiler.

> Web implies DOM, but CSS/XML doesn't, and defining documents and UIs in a SGML is much older than HTML itself.

But this gets back to your original premise: the DOM was never designed to hold rich applications, and yet, it turns out that XML is perfectly fine for authoring rich applications. The whole document/application distinction hasn't held up (and arguably never held up), which is evidenced by basically every one of these toolkits pushing in that direction. I'm not sure how people can say that the DOM/CSS is insufficient for this when native toolkits are literally pulling CSS in now.

But again, I don't really care about HTML/CSS in specific. I care about the architecture. If the problem is just what specific components HTML provides, then use something else. But I want there to be something universal that (almost) everything on my OS uses. I want to get some broad agreement from application devs that they're not going to push pixels to the screen, they're going to give me as an end user a live interface in a shared XML format.

And I want to be really clear again: defining documents and UIs in SGML is not what I care about. I care about what gets presented to the user. The authorship is not the important part here.

I want to be able to open my calendar app's interface in a text editor.

You mention the browser's dev tools yourself, and it's a really good question that I think kind of demonstrates my point:

> Which is totally different than web browser's DOM appearing under "developer tools", or is it?

To be completely honest, yeah, I do think it's totally different. I can do things as a user in the browser dev tools that I can't do for any of these native toolkits: insert additional markup, add my own rules and event listeners and scripts, etc. And importantly, I can do that stuff on every single website, not using a specialized debugger for a small portion of the apps on my computer. I regularly open my dev tools and write little helper scripts for websites. I regularly alter CSS on the fly for websites I visit.

And sure, I don't actually think that browsers are doing a great job of leveraging this, I think they could go way further and could support even more powerful use cases. And some of the integration between the DOM and JS and the tendency towards div soup is actually quite bad on the web. But even so, the experience of working with the dev tools as an end user consuming somebody else's web app interface is miles ahead of the functionality on native platforms. In a lot of web apps that I use seriously I have extra CSS rules getting piped in through uBlock Origin. I don't have that on for any of my native Linux apps.

Yes it's still just called "developer tools", yes browsers could support more, but the entire philosophy around CSS as a user-extensible styling tool and HTML as something that can be queried and manipulated on the fly by multiple agents is something that's always been embraced more in browsers than on native platforms (limited OS styling options aside).

I wonder if application frameworks like Flutter will move to WebGPU? I imagine it shouldn't be that hard to get Skia running on a wgpu backend. The current web target generates a lot of markup that isn't really semantic or representative of a web app's structure with lots of canvases anyway, so I imagine moving to a uniform render target will make things smoother. They're already experimenting with Dart in WASM instead of transpiling to JS as well.

I'd like to see this too, but it's quite unlikely (it could have already happened with WebGL); the developer experience won't be much different from writing a native app on top of one of the native cross-platform libraries (like SDL2, GLFW or - shameless plug - the sokol headers).

You will always have that overhead; otherwise you need to check with every browser update whether something broke.

Same problem with WebView

To get an idea where WebGPU is heading a couple of projects worth looking at.

- The Bevy game engine, using WebGPU on the backend. https://bevyengine.org/

- Wonnx, A WebGPU inference engine for running AI compute on the server or in the browser. https://github.com/webonnx/wonnx

Related - this was published exactly 3 years ago on a similar topic:


Yes. Kvark pushed WGPU as a cross-platform graphics base for Rust, and that worked out quite well.

It's actually better in an application than in the browser. In an application, you get to use real threads and utilize the computer's full resources, both CPUs and GPUs. In browsers, the main thread is special, you usually can't have threads at different priorities, there's a lot of resource limiting, and the JavaScript callback mindset gets in the way.

Here's a video from my metaverse viewer, which uses WGPU.[1] This seems to be, much to my surprise, the most photo-realistic game-type 3D graphics application yet written in Rust.

The stack for that is Egui (for 2D menus), Rend3 (for a lightweight scene graph and memory management), WGPU (as discussed above), Winit (cross-platform window event management), and Vulkan. Runs on both Linux and Windows. Should work on macOS, but hasn't been debugged there. Android has a browser-like thread model, so, although WGPU supports those targets, this program won't work on Android or WASM. All this is in Rust.

It's been a painful two-year experience getting that stack to work. It suffers from the usual open source problem - everything is stuck at version 0.x, sort of working, and no longer exciting to work on. The APIs at each level are not stable, and so the versions of everything have to be carefully matched. When someone changes one level, the other levels have to adapt, which takes time. Here's a more detailed discussion of the problems.[2] The right stuff is there, but it does not Just Work yet. Which is why we're not seeing AAA game titles written in Rust. You can't bet a project with a deadline on this stack yet. As you can see from the video, it does do a nice job.

It's encouraging that WGPU is seen as a long-term solution, because it improves the odds of that work getting completed.

Still, I was chatting with one of the few people to ship a good 3D game (a yacht racing simulator) in Rust, and he admits that he simply used "good old DX11".

[1] https://video.hardlimit.com/w/tp9mLAQoHaFR32YAVKVDrz (That's on PeerTube. I'm curious to see how it copes with an HN-sized load. PeerTube is supposed to scale automatically.)

[2] https://www.reddit.com/r/rust_gamedev/comments/1302512/reall...

Is there some reason you need a different stack of all new UI libraries just because you decided to use a certain language? (Yes, I know everyone does it, but they should stop.)

Does it actually work with OS features like screenreaders and printing that aren't just "pixels go on screen"? Does it break when a user has an OS setting changed that only one Microsoft QA greybeard remembers exists?

Almost nothing integrates with accessibility, that's always a "do it later" task.

Yes, and they should stop doing that.

Protip: accessibility APIs are good for writing UI tests!

As often, the guessing about the historical reasoning could be all over the place. And not just about WebGPU (oh, that never-ending "why not SPIR-V?" discussion). Claiming that Vulkan and D3D12 were kicked off by a GDC talk about AZDO sounds ridiculous to me. These APIs are about explicit control; they let developers talk to the driver more directly, which is in a way the opposite of the AZDO approach.

Anyway, congratulations on the WebGPU release in stable Chrome on some of the platforms! Looking forward to seeing it widely available.

And I always thought AMD was to blame for Vulkan, because they couldn't catch up with NVIDIA's OpenGL driver performance ;)

AMD's Mantle is the direct precursor to Vulkan. AMD donated it to Khronos.

> The middleware developers were also in the room at the time, the Unity and Epic and Valve guys. They were beaming as the Khronos guy explained this. Their lives were about to get much, much easier.

Lol, I wonder what their opinion is now 8 years later (at least those who haven't been burned out by Vulkan).

One place I worked at had a guy who proudly wore a "Vulkan - Industry Forged" T-Shirt every day. I'll just assume he had 5-7 identical ones.

I had a great tie-dyed Microsoft DirectX t-shirt that a dude in the Microsoft booth at CGDC gave me in 1996. I loved wearing it to Linux and Mac trade shows, because it would make people's heads explode.

I did game engine development in the late aughts. It was embarrassing how good DirectX was relative to OpenGL at the time. Not merely for reasons of the API being newer: the mechanism for discovering the capabilities of a graphics card in OpenGL was actively hamstrung by a total lack of incentives for vendors to not just lie to the API about capabilities (such as declaring such-and-such shader feature was supported when in reality it was computed in software on the CPU and if you enabled it you got seconds-per-frame performance). In contrast, the DirectX logo was revocable; if a company put together a card that sucked, Microsoft could just decertify it (or refrain from certifying it).

There's a reason so many cross-platform games of the era had some kind of demo or animation that'd run near the beginning; it was secretly the way to test card capabilities and figure out whether the hardware could do what it said it could.

In a similar vein, there is a certain irony in attending FOSS conferences where the presenters keep showing off their Windows and macOS laptops.

This was the most well-written technical piece I've read in a long time. Kudos Andi. The conversational tone, imagery, well-sourced history - it was perfect.

It was interesting to read that Rust has a WebGPU implementation [0] outside of the browser. I wonder if this will become a more general standard.

Anyone know how WebGPU compares to CUDA in terms of performance and functionality?

[0] https://sotrh.github.io/learn-wgpu/#why-rust

> ¹² If I were a cynical, paranoid conspiracy theorist, I would float the theory here that Apple at some point decided they wanted to leave open the capability to sue the other video card developers on the Khronos board, so they are aggressively refusing to let their code touch anything that has touched the Vulkan patent pool to insulate themselves from counter-suits. Or that is what I would say if I were a cynical, paranoid conspiracy theorist. Hypothetically.

Such a shame that they lobbied against SPIR-V on the web. Textual formats are evil.

The entire web has been built on textual formats though, and quite successfully so.

Even without Apple in the way, SPIRV as it is wouldn't have been usable for WebGPU: http://kvark.github.io/spirv/2021/05/01/spirv-horrors.html

Eh, I think the game is given away when kvark admits in that post that he's never written a compiler or done parsing before.

Yes, SPIR-V is weird and confusing... if you've never worked on a compiler. A lot of it is relatively standard if you have. Some of the messy control flow stuff got pretty massively cleaned up by the POPL paper. [0]

The issue is that Naga wants to support SPIR-V, but doesn't want to grow to become a full compiler. e.g.

> When writing SPIR-V, you can’t have two integer types of the same width.

This makes a compiler much easier to write, as you can guarantee that every type is uniquely identified by its index. Just shove the type in an array, done.

> However, it can also declare an output “interface struct” with a bunch of special built-ins and an empty name… Perhaps, it’s a mechanism that helps with tessellation shaders. But for us in Naga land that caused more trouble than good.

No clue why you would care, unless you just blindly broadcast names when translating from SPIR-V. But also, don't do that? It's fine for things to have no names.

> Later I was pointed to the fact that all the existing production tools for generating SPIR-V (namely, glslang and dxc) only do so for a concrete entry point. So the multiple entry points is there in spirit, but discriminated in practice by the driver bugs, as well as this little validation rule.

Yeah, this is unfortunate. Multiple entry points are one of those things that make it seem easy to write a translator. Unfortunately, they're under-tested, not in the CTS, as glslang/DXC don't support them, and aren't how most shader pipelines tend to work.

Anyway, I don't want to enumerate all of this. SPIR-V has issues, sure, but it can be simultaneously true that SPIR-V isn't flawless, while also being true that it's the best version of this that anyone's seen yet.

[0] https://www.doc.ic.ac.uk/~afd/homepages/papers/pdfs/2023/POP...

WASM is very much binary-first, and quite popular.

Unfortunately WASM is not popular in the slightest outside of the HN community. Less than 1% of websites use it, and those that do often use it for malicious purposes.


Please source where on Wikipedia it says >1% of websites use WASM.

I don't really know what the most trusted / canonical sources would be on this topic, but the first few results from Google all suggest <1% of websites use it:




I wonder whether you could have used a slightly modified WebAssembly instead.

That wouldn't have been all that different from WGSL though, the most important thing is that whatever WebGPU uses for its shaders can be translated to and from SPRIV (e.g. via https://dawn.googlesource.com/tint and https://github.com/gfx-rs/naga).

Textual formats are great. You can build shaders by pasting strings together and using #define and #ifdef.

>a lot of the Linux devices I've been checking out don't support Vulkan even now after it's been out seven years. So really the only platform where Vulkan runs natively is Android

I got really curious about this. To my understanding, I have been using Vulkan on my Linux desktop computers for quite some time now. What Linux devices could the author mean?

In context, I don't think "many" means "most". You need a GPU from Nvidia or AMD, or an Intel platform at least as new as Ivybridge. That covers a lot of devices, but there are also a lot that it leaves out.

That would mean that she's checking out 11+ years old Intel platforms, that doesn't seem plausible to me. Also "the only platform where Vulkan runs natively is Android". No, Vulkan seemed to be supported out of the box on the random distros I have tried.

> It really seemed like DirectX (and the "X Box" standalone console it spawned)

Did the name "XBox" come from the fact that it ran DirectX? Sort of short for DirectXBox?


MS called all their multimedia/gaming APIs DirectSomething for a while, then decided to group it all together into DirectX. It was also a time when you had to put an X into everything because you were XTREME.

If I'm not mistaken, this would have been toward the end of the 90s. These were Xtreme times.

Super Soakers, Jolt Cola, the Teenage Mutant Ninja Turtles, Sonic the Hedgehog, the FBI would come after you for reproducing copyrighted films (according to the warning at the start of the film). Definitely seems like an era in which something would be named DirectX.

I blame it on the Gen Xers

They were all hopped up on lead and sitting too close to televisions.

Their gaming project codenames were also extreme, or more like, the theme was "being racist about the Japanese". DirectX and Xbox were named Manhattan and Midway because they were going to "kill Sony" and the DirectX logo was almost the radiation symbol.

Unsurprisingly, this didn't sell many Xboxes over there.

(Being racist is a continuing tradition with Western game journalists, and caused Japanese devs to genuinely think nobody liked them and try to Westernize their games for a generation or two. The director for the new Final Fantasy actually said he doesn't like it being called a "JRPG" because they think we use it as a slur.)

That was actually the originally proposed name, yes.

I believe the main use of WebGPU, as well as WebGL, will be fingerprinting, therefore they should be put behind a permission. It is ridiculous that browser developers do nothing to reduce the API surface usable for fingerprinting and instead just extend it.

People keep repeating "WebGPU will just be used for fingerprinting" here on HN as a meme, but it's clearly not the case. There are obvious use cases for WebGL that don't involve fingerprinting, and WebGPU has tons of conveniences that make it superior to WebGL for things that were nearly impossible otherwise (atomics, to name just one).

While I don't disagree with the essence of what you're concerned about, I imagine that every new step/functionality in the browser evolution introduces new minutiae that can be used in fingerprinting.

Also, once/if the WebGPU interface expands into more direct GPU functionality, I imagine there will be many who will try to use your GPU for crypto mining/etc.

Surely a permission option/prompt will eventually be introduced, though probably not by Google in Chrome. Anyone know the roadmap/timeline for WebGPU on Firefox and other browsers?

> I imagine that every new step/functionality in the browser evolution introduces new minutiae that can be used in fingerprinting.

No problem; just put such APIs behind a permission and they become unusable for fingerprinting.

If you do that too much, you just train users to enable all permissions all the time without thinking.

If you can query for permission without a prompt appearing that's another data point.

Kinda doubt it's that great for fingerprinting. Look at something like WebGL report and any semi-modern system will support everything. "Computer was made in the last 10 years" isn't that useful. Even knowing the exact GPU... we're talking maybe 5 major vendors? Seems like if you really care for anonymity you need a vpn and a special browser at this point anyway.

I love the level of detail in this post, really helpful for filling in a lot of the gaps in my knowledge about the history and motivations at play. This is exactly the kind of post I love to see on HN.

I haven't played with WebGPU much (coughLinux supportcough) but I'm looking forward to it. And I generally agree that even more than the API itself, having GPU code that's easily portable between the web and native languages is a pretty big deal to me.

Are there any compute/memory resource quotas?

It's quite easy to almost lock up a computer by doing high intensity GPU work.

It's the usual story with web APIs, the implementation probably has resource quotas to keep applications from accidentally or deliberately doing denial of service, but you the developer targeting that implementation aren't allowed to know what the quotas are. You just have to make an educated guess at how much you can get away with and hope for the best. WebAssembly has a similar issue where consuming too much memory will get your app killed, but there's no reliable way to know how much is "too much" until it's too late to recover.

There are some limits you can query from the adapter: https://developer.mozilla.org/en-US/docs/Web/API/GPUSupporte...

Those limits only tell you how big each individual GPU allocation can be, not how much you can allocate total before the browser puts a bullet in your context.

Based on previous experience I assume there are no quotas and the webpage can hang your system. I had Google Street View hang a whole Linux system by using too much memory and Firefox did nothing to prevent this.

What if you are using it in a native app rather than a browser/Electron ?

That's probably already possible with just a huge HTML page? At least on my system, if I create such a page and open it via a file:// URL, firefox will happily gobble up memory.

This is the case with AWS's Go SDK for EC2: https://docs.aws.amazon.com/sdk-for-go/api/service/ec2/

No matter what PC I use, I can't open that website...

I don't know who would make such a monstrous HTML page, but I'm pleasantly surprised that once it loaded, my phone can interact with it just fine.

Someone should probably tell Amazon that hyperlinks allow linking to other HTML documents, though, because that's a ridiculously long page.

Works nicely on my Firefox (Kubuntu 22.04), but Chromium is troubled. Doesn't scroll smoothly...

My PC hangs for some time as well

I thought FF should not allow that to happen?

Firefox on a garbage macbook: Site loads in a few seconds and causes no problems. It's just a bunch of text and hyperlinks, why would that be a problem?

Works perfectly fine on Firefox for Android. I'm able to scroll and view everything even before the styling has finished loading.

Firefox seems to fare okay on that website. Chrome usually struggles more with huge websites IME.

It opened (after a while) on mobile Firefox on a Galaxy S7...

holy crap this document is awful, crashed my browser

FF on Windows 11, no problem.

Uhhh... and at some point it segfaults if I try to load 5 GB of HTML?

[ 7527.750745] HTML5 Parser[14186]: segfault at 0 ip 0000556be8b79cef sp 00007fbdb1570420 error 6 in firefox[556be8b77000+9c000]

That's definitely a bug. Have you reported it?

I tried. I'd need a github account. I don't want to have a github account.

> I think it will replace Vulkan

I do not expect this to happen at all. WebGPU (including its shading language) is a subset of Vulkan. Furthermore, it is up to the runtime to expose vendor extensions to the code (as one example, Node supports ray tracing, but nothing else does). This means that WebGPU will be perpetually behind Vulkan.

That being said, if WebGPU does what you need then don't bother with Vulkan.

I bet it will replace Vulkan for most people who would otherwise have no other choice than using Vulkan on Linux or Android (these are the only operating systems where there's no alternative modern GPU API than Vulkan).

Yup. Don't even get me started on the lack of push constants. Last I checked WebGPU doesn't even expose raw GPU memory which means optimizations, like memory aliasing, are off the table.

Exactly. I assume that some things can't safely be exposed either, as they could be a risk to the sandbox.

> In fact it is so good I think it will replace Vulkan

WebGPU does not support bindless resources making it a non starter as a Vulkan or D3D12 replacement.

That the initial version of WebGPU doesn't support a specific feature doesn't mean it won't be supported in extensions or new versions down the road though.

(most current restrictions are enforced by the requirement to also work on mobile devices)

And it doesn’t support ray tracing either!

> In fact it is so good I think it will replace Vulkan as well as normal OpenGL, and become just the standard way to draw, in any kind of software, from any programming language.

I fully agree with that, for a lot of use cases WebGL has everything you need, means it has the potential to become the cross platform graphics API OpenGL dreamed to be. And as a bonus you have a realistic way to run whatever app you are writing in the browser with WASM+WebGL.

I just think for AAA games Vulkan, Metal and DirectX12 will probably still be the way to go. But GUI libraries? Less demanding games? There is just no point once you can use WebGL everywhere. And then if you want a browser demo you have a realistic chance of getting one.

> for a lot of use cases WebGL has everything you need

I think the original post refers to WebGPU

I can't tell if it's just a masterful troll implying web graphics history repeats itself.

it's a very embarrassing typo.

given that WebGL was known to use an "old" approach even before it was released, and we don't have that problem with WebGPU, I hope history doesn't repeat itself

History is repeating itself, there is no roadmap for mesh shaders or ray tracing.

So wait... do we now have a situation where the browser engine has converged with where the Java Virtual Machine was, providing a container to run write-once-run-anywhere desktop apps compiled to WASM?

All we need is the last mile -- progressive web apps -- to include better support for integration with desktop OSes and we have a way to take WASM apps and drag them to the desktop.

The up and coming languages for writing these WASM apps seem to be Go and Rust. Here's a Rust example:


WebGPU can be used outside of the browser through native libraries:



(these are the libraries that are used in Firefox and Chrome to provide the WebGPU implementation, don't know if Apple/WebKit also has open-sourced theirs)

Some non-browser WASM runtimes also have WebGPU support already, but unfortunately only for offscreen rendering or compute tasks (e.g. for writing cross-platform windowed apps as WASM blobs, there's a window system glue layer missing).

> don't know if Apple/WebKit also has open-sourced theirs


I don’t know about WebGPU, but WebGL is missing some key performance features from OpenGL ES, like client-side buffers, pixel buffer objects, and MSAA render-to-texture frame buffers.

Client-side buffers are not a great idea. Pixel buffer objects and MSAA framebuffers exist in WebGL 2.

WebGL MSAA framebuffers exist, but require extra memory and an intermediate step compared to EXT_multisampled_render_to_texture https://registry.khronos.org/OpenGL/extensions/EXT/EXT_multi...

PBO support in WebGL lacks glMapBufferRange for non-blocking readback.

Too bad that WebGPU builds on top of Vulkan, so the dependency (and incomplete support for some configurations) is still there.

WebGPU is designed to split the difference between Vulkan, DirectX 12 and Metal, it can be implemented on top of any of those.

In practice it'll only use Vulkan on Android and Linux, the main two implementations both default to DX12 on Windows even though they could technically use Vulkan there.

wgpu at least prefers Vulkan over DirectX 12 on Windows.

My mistake, they plan to use DX12 by default on Windows, but the DX12 backend isn't fully cooked yet so for now it defaults to Vulkan.


If the browser ever becomes a popular platform for mainstream games then I suspect the quality of web applications will also boom just because of all the talent pouring into the browser as a platform.

Open this site with uMatrix - blocking cookies, third-party content, and XHR requests - and the page simply goes blank after about 10 seconds.

Why it needs cookies/XHR to display a page of plain text, I don't know, but I left the site.

After disabling Javascript [0], everything is fine.


[0] The aptly named "Javascript Toggle On and Off" plugin for Firefox sets up a keybinding for toggling. It's very convenient and great for scaling many a paywall.

> In the browser I can already mix Rust and TypeScript, there's copious example code for that.

I’d love to see a production architecture and file structure for this setup if anyone has a pointer to a GH repo or something similar

Not a production example, but good "academic" examples are found in the Rust & WebAssembly Gitbook: https://rustwasm.github.io/docs/book/introduction.html

It's not Rust and TS, instead C and JS, but Emscripten has a very nice way of integrating C/C++ and JS (you can just embed snippets of Javascript inside C/C++ source files), e.g. starting at this line, there's a couple of embedded Javascript functions which can be called like C functions directly from the "C side":



Web based graphics editor where the engine is written in rust. I think it uses tauri

You can check out Koofr Vault. The engine is written in Rust and the web frontend is written in TypeScript and React.


I like Cloudflare's docs as a good starting place.

Unfortunately, I couldn't get it to run on Chrome 113 under Linux. Even after some fiddling and setting proper VK_ICD_FILENAMES for Nvidia (3090 RTX) and VK_LAYER_PATH to explicit.d, it borked out with "vkAllocateMemory failed with VK_ERROR_OUT_OF_DEVICE_MEMORY", which makes no sense. I thought Chrome, well Google, internally used Linux a lot. I guess not in this case. The state of this seems to be at least a few years out from any sort of (wide) adoption, then. I'll come back to it then.

Chrome 113 doesn't support WebGPU on Linux and Android yet, so that's kinda expected.

> Except Linux has a severe driver problem with Vulkan and a lot of the Linux devices I've been checking out don't support Vulkan even now after it's been out seven years.

What? Vulkan is supported on all relevant GPUs on Linux.

There is no "severe driver problem" either.

This was great for catching up!

I stopped doing graphics programming somewhere between OpenCL and Vulkan being released and always wondered what had happened - especially along with the sentiments in the industry since that's difficult to glean from an wikipedia article.

It was 6 years ago that we banned Flash. Today, there is still no replacement. I still think banning Flash was one of the worst ideas. Nothing today even gets close to what Flash had to offer.

Isn't that less about Flash the browser plugin and more about Flash the application/IDE? If there were a fantastic accessible authoring experience for Canvas/SVG content then it could be much like the glory days of Flash.

Out of (ignorant) curiosity: why? Just why do we need an extra "standard"? And why do browsers keep growing into full OSes, adding more bloat to the tech stack?

Zawinski's Law: "Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can."

Zawinski himself has stated:

"My point was not about copycats, it was about platformization. Apps that you "live in" all day have pressure to become everything and do everything. An app for editing text becomes an IDE, then an OS. An app for displaying hypertext documents becomes a mail reader, then an OS."

The trajectory we're on is for browsers to become consumer operating systems and consumer operating systems as currently conceived to, basically, vanish into the background. This is a good thing, you should want it to happen. All anybody ever wanted was a decent UI/UX that runs stuff, and that's just software so there's no real reason, besides historical mistakes, that it can't run the same on every machine.

The dream is for Explorer/Finder/etc to no longer exist and for the whole computing experience to be something you just download and customize to your heart's content. Just imagine! A day when you can no longer tell you're using Windows unless you're unfortunately saddled with the job of making something run on it under the hood. That's the only way Microsoft's negligent idiocy is ever going to be shut down, anyway. I honestly can't wait.

(... although, hopefully in this new world browsers can run a language that isn't based on Javascript and applications can be built in a language that isn't based on HTML.)

No I don't want that, I want an OS that's an OS and a browser that's a browser.

Thanks, I vomited a bit into my mouth.

Because browsers are by far the easiest, safest, and fastest way to distribute applications. Operating Systems still don't have any sort of meaningful sandboxing so downloading and executing binaries from any source is out of question. With web applications, you can do that. Instantly and uncomplicated. This is not going to change anymore, it's just way too useful.

> Apps that you "live in" all day have pressure to become everything and do everything.

Thank you. Somehow I had missed that expansion, but this makes the quote a lot more helpful.

Perhaps it should be called "Zawinski's Trap": to a programmer working on an app all day, the app becomes the operating system, leading them to justify expanding the feature set, which benefits them. For all other users, for whom the app is not the operating system, the app just becomes more bloated and complex.

3D apps in browsers can be useful, and WebGL is very limited.

With a standardized (and standalone) WebAssembly API to WebGPU, this has half a chance of becoming a truly cross-platform high-performance graphics solution.

The API definition exists:


The next missing piece is a standard window system glue API for WASI though.

So how does OpenVG fit into this picture?


It doesn't

does anyone know what tool might have been used to develop this image: https://staging.cohostcdn.org/attachment/45fea200-d670-4fab-...

It looks as if it was simply drawn in Inkscape or a similar program.

The author claims it could replace Vulkan in the very first paragraph. Decide for yourself if you want to read further or spend more time. I won't.

This article is so full of obvious false information; the most glaring is: "OpenGL ES 2.0 was the same as OpenGL 3.3, somehow."

VAO is the final GL feature.

There is no point to use anything other than OpenGL (ES) 3+ for eternity.

Reinventing the wheel over and over is the trap.

I disagree with almost everything said about Vulkan in that post.

Ok... care to explain why?

I also disagree with some statements about the Vulkan API. My two cents:

> The docs are written in a sort of alien English that fosters no understanding— but it's also written exactly the way a hardware implementor would want in order to remove all ambiguity about what a function call does. In short, Vulkan is not for you. It is a byzantine contract between hardware manufacturers and middleware providers, and people like… well, me, are just not part of the transaction.

Vulkan is a Standard. It uses standardese language to define things. It defines all the relevant concepts like instance, physical device, logical device, command pools and buffers, etc. (Vulkan is not that different in this regard from other standards like C Language Standard, or C++ Language Standard).

The Standard also explains how these different concepts relate to each other, how the GPU and host interoperate, how memory flows, and how synchronization works.

For example, it explains that there is a thing called instance, then this instance allows you to enumerate physical devices and obtain their properties, it explains that such devices have queues which can execute work. Then one or more physical devices can be used to create a logical device, which is a sort of lightweight context object. Then this logical device is used for allocating different resources, building pipelines, etc.

> Every Vulkan call involves passing in one or two huge structures which are themselves a forest of other huge structures, and every structure and sub-structure begins with a little protocol header explaining what it is and how big it is. Before you allocate memory you have to fill out a structure to get back a structure that tells you what structure you're supposed to structure your memory allocation request in....

Vulkan IS cumbersome to write using just the C API, yes, but when you write an actual program you usually wrap this functionality in a thin layer, and then suddenly everything becomes a lot less cumbersome. For example, you write the initialization code exactly once, tailored to your specific needs, then wrap it in a function.

Construction of all of the pipelines, render passes, command buffers, etc. can also be wrapped in much the same way.

For example, construction of buffers and textures requires 3 steps: 1. creation of the buffer/image object; 2. memory allocation; 3. binding the memory to the object.

This functionality can be trivially wrapped in a function, then used many times. Then there are some libs like VulkanMemoryAllocator which simplify writing Vulkan even more. Memory synchronization is explained in https://github.com/KhronosGroup/Vulkan-Docs/wiki/Synchroniza...

The thing is, Vulkan requires planning. So if you want to write some simple game which displays sprites on the screen, then you are better off with something OpenGL-like. But once you need an optimized rendering pipeline with HiZ culling, minimal CPU-GPU memory transfers, etc., then Vulkan is the way to go. You can use newer OpenGL versions for this, but the code becomes much like something you would write using Vulkan API anyway.

> Vulkan is a Standard. It uses standardese language to define things. It defines all the relevant concepts like instance, physical device, logical device, command pools and buffers, etc.

FWIW i also disagree with the article on that front. I wrote a simple demo of Vulkan[0] the day the specs became available and was able to learn it (or enough of it to write the demo) just by reading the specs themselves - i remember finding them very easy to read. However i already had graphics programming knowledge using OpenGL for many years before that - but it isn't like someone is going to learn graphics programming from OpenGL's spec either, there is/was the Red Book for that.

That said, i agree that Vulkan itself is cumbersome to write - and really why i never bothered with it after writing the demo and getting my feet wet with it. IMO...

> Vulkan IS cumbersome to write using just C API, yes, but when you write actual program you usually wrap this functionality in a thin layer, and then suddenly everything becomes a lot less cumbersome. [..] Construction of all of the pipelines, render passes, command buffers, etc. can also be wrapped in much the same way.

...having to paper over an API to hide its guts doesn't exactly sound good for that API's design. Personally at least i never felt i had to paper over OpenGL, or even the little (pre-12) Direct3D code i wrote.

[0] https://i.imgur.com/rd8Xk84.gif

> Vulkan is a Standard. It uses standardese language to define things.

As a counter example, the "D3D Engineering Specs" do the same job, but are much more readable:



> You can use newer OpenGL versions for this, but the code becomes much like something you would write using Vulkan API anyway.

Ironically, a lot of "modern" Vulkan might end up looking like old-school OpenGL again by using this new extension ;)


> As a counter example, the "D3D Engineering Specs" do the same job, but are much more readable:

DirectX is not a standard so I don't think that makes this comparable. Most likely fewer people will have to read and implement it.

The issue with Vulkan is that it's so cumbersome, I just don't want to use it at all. Tried to switch, but went back to OpenGL. There isn't anything in Vulkan that warrants that insane amount of overhead.

WebGPU is based on similar modern paradigms but it's an actually usable, sane, not overly verbose API. It's a pity that WebGPU is deliberately gutted down to a 5-year-old smartphone graphics API, because I'd prefer it over Vulkan any day.

I'm glad that Vulkan is taking some steps back on render passes, pipelines, etc. If they keep removing unnecessary entry barriers, I might eventually try the switch again in a few years.

> Vulkan IS cumbersome to write using just C API, yes, but when you write actual program you usually wrap this functionality in a thin layer, and then suddenly everything becomes a lot less cumbersome. For example, you write initialization code exactly once, tailored for you specific needs, then wrap it in a function.

It follows the tradition of Khronos APIs: each developer writes that thin layer as a rite of passage; one must earn their stripes to use Khronos APIs.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact