I think the parent is implying there are 1,001 SoCs out there with some form of embedded GPU that probably have issues actually implementing WebGPU. Like those in millions of Chinese tablets. Are they likely targets? Probably not now, but in 5 years? Mainstream desktop hardware? No problem.
I also created glyphon (https://github.com/grovesNL/glyphon) which renders 2D text using wgpu and cosmic-text. It uses a dynamic glyph texture atlas, which works fine in practice for most 2D use cases (I use it in production).
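For anyone curious what a "dynamic glyph texture atlas" means in practice, here's a rough sketch of the idea in Rust. This is simplified shelf packing with made-up names, not glyphon's actual code:

```rust
use std::collections::HashMap;

/// Hypothetical cache key: which glyph, at which pixel size.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct GlyphKey {
    glyph_id: u16,
    font_size_px: u32,
}

/// Where a rasterized glyph lives inside the shared atlas texture.
#[derive(Clone, Copy)]
struct AtlasRegion {
    x: u32,
    y: u32,
    width: u32,
    height: u32,
}

/// Simple shelf-packed atlas: rows fill left to right, and a new row
/// ("shelf") starts when the current one runs out of horizontal space.
struct GlyphAtlas {
    size: u32,
    cursor_x: u32,
    cursor_y: u32,
    row_height: u32,
    entries: HashMap<GlyphKey, AtlasRegion>,
}

impl GlyphAtlas {
    fn new(size: u32) -> Self {
        Self { size, cursor_x: 0, cursor_y: 0, row_height: 0, entries: HashMap::new() }
    }

    /// Returns the cached region, or reserves a new one. In a real renderer
    /// the caller would also upload the rasterized bitmap into the texture
    /// at the returned coordinates.
    fn get_or_insert(&mut self, key: GlyphKey, w: u32, h: u32) -> Option<AtlasRegion> {
        if let Some(region) = self.entries.get(&key) {
            return Some(*region);
        }
        if self.cursor_x + w > self.size {
            // Start a new shelf.
            self.cursor_x = 0;
            self.cursor_y += self.row_height;
            self.row_height = 0;
        }
        if self.cursor_y + h > self.size {
            // Atlas is full; a real implementation would grow or evict.
            return None;
        }
        let region = AtlasRegion { x: self.cursor_x, y: self.cursor_y, width: w, height: h };
        self.cursor_x += w;
        self.row_height = self.row_height.max(h);
        self.entries.insert(key, region);
        Some(region)
    }
}
```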
I did something similar with cosmic-text and glium, but it would be fun to have a vector rendering mode to do fancier stuff with glyph outlines and transforms for games and 3D stuff. And open source, of course.
I suppose vello is heading there but whenever I tried it the examples always broke in some way.
wgpu has some options to access backend-specific types and shader passthrough (i.e., you provide your own shader for a backend directly).
Generally wgpu is open to supporting any Metal extensions you need. There's usually an analogous extension in one of the other backends (e.g., Vulkan, DX12) anyway.
I don't think that's accurate. Creating a shading language is obviously a huge effort, but there were already years of effort put into WebGPU as well as implementations/games building on top of the work-in-progress specification before the shading language decision was made (implementations at the time accepted SPIR-V).
The PoC was made in 2016, the work started in 2017, but the first spec draft was released on 18 May 2021. [1] This first draft already contained references to WGSL. There is no reference to SPIR-V.
Why did it take this long to release the first draft? Compare it to the SDL_GPU timeline: start to finish in 6 months. Well, because the yak shaving on WGSL had already begun and was eating up all the time.
Sure, but that proves my point. They took so long to decide upon the shading language that implementations had to erect a separate scaffolding just to be able to test things out.
Scaffolding wasn’t a problem at all. Both used SPIRV-Cross for shader conversions at the time and focused on implementing the rest of the API. The shading language barely matters to the rest of the implementation. You can still use SPIR-V with wgpu on its Vulkan backend today for example.
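For example, something along these lines works (a rough sketch; it assumes a wgpu build exposing the SPIRV_SHADER_PASSTHROUGH feature, and exact names vary a bit between wgpu releases):

```rust
// Rough sketch: hand a precompiled SPIR-V module straight to the Vulkan
// backend instead of going through WGSL. The device must have been
// requested with wgpu::Features::SPIRV_SHADER_PASSTHROUGH.
fn load_precompiled_spirv(device: &wgpu::Device, spirv_bytes: &[u8]) -> wgpu::ShaderModule {
    // Passthrough skips the WGSL/naga translation entirely and gives the
    // SPIR-V directly to the driver, hence the unsafe block.
    unsafe {
        device.create_shader_module_spirv(&wgpu::ShaderModuleDescriptorSpirV {
            label: Some("precompiled-spirv"),
            source: wgpu::util::make_spirv_raw(spirv_bytes),
        })
    }
}
```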
> Only 10k non-overlapping triangles can bring my RTX GPU to its knees
Your benchmark doesn't match the experience of people building games and applications on top of WebGPU, so something else is probably going on there. If your benchmark is set up well, you should be limited by the fill rate of your GPU, at which point you should see roughly the same performance across all APIs.
> On my computer, the average Unity game with shadows, shaders 'n stuff takes 5% GPU and a simple WebGPU demo takes 7%.
GPU usage isn't a great metric for performance comparisons in general because it can actually imply the inverse depending on the test case. For example, if the scenes were exactly the same, a lower GPU usage could actually suggest that you're bottlenecked by the CPU, so you can't submit commands fast enough to the GPU and the GPU is sitting idle for longer while it waits.
So you test the difference between them with technically the same code.
(They can get 78k birds, which is way better than my triangles, because they batch 'em. I know 10k drawcalls doesn't seem good, but any 2024 computer can handle that load with ease.)
They're 10k triangles and they're not overlapping... There are no textures per se. No passes except the main one, with a 1080p render texture. No microtriangles. And I bet the shader is less than 0.25 ALU.
> at which point you should see roughly the same performance across all APIs.
Nah, ANGLE (OpenGL) does just fine. Unity as well.
> a lower GPU usage could actually suggest that you're bottlenecked by the CPU
No. I have yet to see a game on my computer that uses more than 0.5% of my CPU. Games are usually GPU bound.
I think a better comparison would be more representative of a real game scene, because modern graphics APIs are meant to optimize typical rendering loops and might even add more overhead to trivial test cases like bunnymark.
That said though, they're already comparable which seems great considering how little performance optimization WebGPU has received relative to WebGL (at the browser level). There are also some performance optimizations at the wasm binding level that might be noticeable for trivial benchmarks that haven't made it into Bevy yet, e.g., https://github.com/rustwasm/wasm-bindgen/issues/3468 (this applies much more to WebGPU than WebGL).
> They're 10k triangles and they're not overlapping... There are no textures per se. No passes except the main one, with a 1080p render texture. No microtriangles. And I bet the shader is less than 0.25 ALU.
I don't know your exact test case so I can't say for sure, but if there are writes happening per draw call or something then you might have problems like this. Either way your graphics driver should be receiving roughly the same commands as you would when you use Vulkan or DX12 natively or WebGL, so there might be something else going on if the performance is a lot worse than you'd expect.
There is some extra API call (draw, upload, pipeline switch, etc.) overhead because your browser executes graphics commands in a separate rendering process, so this might have a noticeable performance effect for large draw call counts. Batching would help a lot with that whether you're using WebGL or WebGPU.
> I think a better comparison would be more representative of a real game scene, because modern graphics APIs are meant to optimize typical rendering loops and might even add more overhead to trivial test cases like bunnymark.
I know, but that's the unique instance where I could find the same project compiled for both WebGL and WebGPU.
> Either way your graphics driver should be receiving roughly the same commands as you would when you use Vulkan or DX12 natively or WebGL, so there might be something else going
Yep, I know. I benchmarked my program with Nsight and the calls are indeed native, as you'd expect. I forced the DirectX 12 backend because the Vulkan and OpenGL ones are WAYYYY worse; they struggle even with 1000 triangles.
> That said though, they're already comparable which seems great considering how little performance optimization WebGPU has received relative to WebGL (at the browser level).
I agree. But the whole internet is marketing WebGPU as the faster thing right now, not in the future once it's optimized. The same happened with Vulkan but in reality it's a shitshow on mobile. :(
> There is some extra API call (draw, upload, pipeline switch, etc.) overhead because your browser executes graphics commands in a separate rendering process, so this might have a noticeable performance effect for large draw call counts. Batching would help a lot with that whether you're using WebGL or WebGPU.
Aha. That's kinda my point, though. It's "Slow" because it has more overhead, therefore, by default, I get less performance with more usage than I would with WebGL. Except this overhead seems to be in native WebGPU as well, not only in browsers. That's why I consider it way slower than, say, ANGLE, or a full game engine.
So, the problem after all is that by using WebGPU, I'm forced to optimize it to a point where I get less quality, more complexity and more GPU usage than if I were to use something else, due to the overhead itself. And chances are that the overhead is caused by the API itself being slow for some reason. In the future, that may change. But at the moment I ain't using it.
> It's "Slow" because it has more overhead, therefore, by default, I get less performance with more usage than I would with WebGL.
It really depends on how you're using it. If you're writing rendering code as if it's OpenGL (e.g., writes between draw calls) then the WebGPU performance might be comparable to WebGL or even slightly worse. If you render in a way that takes advantage of how modern graphics APIs are structured (or OpenGL AZDO-style if you're more familiar with that), then it should perform better than WebGL for typical use cases.
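As a rough illustration of the difference (not anyone's actual renderer; it assumes bytemuck and a pipeline built with an instance-rate vertex buffer in slot 1):

```rust
// Sketch: instead of a uniform write + draw call per object (the "OpenGL
// habit"), pack all per-object transforms into one instance buffer, write
// it once, and issue a single instanced draw.
fn draw_batched<'a>(
    queue: &wgpu::Queue,
    pass: &mut wgpu::RenderPass<'a>,
    pipeline: &'a wgpu::RenderPipeline,
    mesh_vertices: &'a wgpu::Buffer,
    vertex_count: u32,
    instance_buffer: &'a wgpu::Buffer,     // pre-allocated with VERTEX | COPY_DST
    per_instance_transforms: &[[f32; 16]], // one 4x4 matrix per object
) {
    // One upload for the whole batch instead of one per draw call.
    queue.write_buffer(instance_buffer, 0, bytemuck::cast_slice(per_instance_transforms));

    pass.set_pipeline(pipeline);
    pass.set_vertex_buffer(0, mesh_vertices.slice(..));
    pass.set_vertex_buffer(1, instance_buffer.slice(..));
    // One draw call covers every instance.
    pass.draw(0..vertex_count, 0..per_instance_transforms.len() as u32);
}
```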
The problem is that it's gonna be hard to use WebGPU in such cases, because when you go that "high" you usually require bindless resources, mesh shaders, raytracing, etc., and that would mean you're a game company, so you'd end up using platform-native APIs instead.
Meanwhile, for web, most web games are... uhhh, web games? Mobile-like? So, you usually aim for the best performance where every shader ALU, drawcall, vertex and driver overhead counts.
That said, I agree on your take. Things such as this (https://voxelchain.app/previewer/RayTracing.html) probably would run way worse in WebGL. So, I guess it's just a matter of what happens in the future and WebGPU is getting ready for that! I hope that in 10 years I can have at least PBR on mobiles without them burning.
Mobile is where WebGPU has the most extreme performance difference compared to WebGL / WebGL2.
I'm not convinced by any of these arguments about "knowing how to program in WebGPU". Graphics 101 benchmarks are the entire point of a GPU. Textures, 32-bit data buffers, vertices: it's all the same computational fundamentals and literally the same hardware.
> I'm not convinced by any of these arguments about "knowing how to program in WebGPU". Graphics 101 benchmarks are the entire point of a GPU.
You're totally right that it's the same hardware, but idiomatic use of the API can still affect performance pretty drastically.
Historically, OpenGL and DX11 drivers would try to detect certain patterns and fast-path them. Modern graphics APIs (WebGPU, Vulkan, DX12, Metal) make these concepts explicit to give developers finer-grained control without needing a lot of the fast-path heuristics. The downside is that it's easy to write a renderer targeting a modern graphics API that ends up being slower than the equivalent OpenGL/DX11 code, because it's up to the developer to make sure they're on the fast path instead of relying on driver shenanigans. This was the experience with many engines that ported from OpenGL to Vulkan or DX11 to DX12: performance was roughly the same or worse until they changed their architecture to better align with the new APIs.
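A toy sketch of what "the developer owns the fast path" looks like in practice (made-up types; wgpu appears only as the pipeline handle):

```rust
use std::collections::HashMap;

// Hypothetical material key; real renderers key pipelines on shader,
// vertex layout, blend/depth state, and so on.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum Material {
    Opaque,
    Transparent,
}

// With explicit APIs the application builds every expensive state object
// (here, render pipelines) once at startup and only looks them up per frame,
// instead of relying on the driver to detect and cache state changes.
struct PipelineCache {
    pipelines: HashMap<Material, wgpu::RenderPipeline>,
}

impl PipelineCache {
    fn get(&self, material: Material) -> &wgpu::RenderPipeline {
        // Creating a pipeline here, mid-frame, is exactly the kind of stall
        // that OpenGL/DX11 drivers used to hide behind heuristics.
        self.pipelines
            .get(&material)
            .expect("pipeline should have been built at startup")
    }
}
```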
Simple graphics benchmarks aren't a great indicator for relative performance of graphics APIs for real use cases. As an extreme example, rendering "hello triangle" for Vulkan vs. OpenGL isn't representative of a real use case, but I've seen plenty of people measure this.
Mostly because their drivers suck, and don't get updates.
Android 10 made Vulkan required, because between Android 7 and 10, most vendors didn't care, given its optional status.
Android 15 is moving to OpenGL on top of Vulkan, because yet again, most vendors don't care.
The only ones that care are Google with their Pixel phones (duh), and Samsung on their flagship phones.
There is also the issue that, because Vulkan is NDK-only and has no managed bindings available, only game engine developers care about it on Android.
Everyone else would still have better luck targeting OpenGL ES than Vulkan, given the APIs and driver quality, so it isn't a surprise that Google is now trying to push a WebGPU subset on top of OpenGL ES.
> I have yet to see a game on my computer that uses more than 0.5% of my CPU.
Just a nitpick here: you probably have a multicore CPU, while the render-dispatch code is gonna be single-threaded. So that 0.5% you're seeing is the percent of total CPU usage, but you probably want the % usage of a single core.
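For example (made-up numbers): on a 16-core/32-thread CPU, one fully saturated thread shows up as only about 100% / 32 ≈ 3% of total CPU, so a 0.5% total reading would already correspond to roughly 16% of a single core.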
Apparently "Bevy's rendering stack is often CPU-bound"[0], so that would make sense.
To be fair that quote is somewhat out of context, but it was an easy official source to quote and I've heard the same claim repeated elsewhere too. (I'm not a Bevy user but am using Rust for fulltime indie game dev, so discount this comment appropriately.)
> The WebGPU and WebGL APIs are pretty different, so I'm not sure you can call it "technically the same code".
Isn't Bevy using wgpu under the hood, and then they just compile it for both WebGL and WebGPU? That should be the same code Bevy-wise, and any overhead or difference should be caused by either the wgpu "compiler" or the browser's WebGPU.
Yes, but also no. WebGL lacks compute shaders and storage buffers, so Bevy has a different path on WebGL than on WebGPU. A lot of the code is shared, but a lot is also unique per platform.
---
This is also as good a place as any, so I'll just add that doing 1:1 graphics comparisons is really, _really_ hard. OS, GPU driver, API, rendering structure, GPU platform, etc. all lead to vastly different performance outcomes.
One example is that something might run at e.g. 100 FPS with a few objects, but 10 FPS with more than a thousand objects. A different renderer might run at 70 FPS with a few objects, but also 60 FPS with a few thousand objects.
Or, it might run well on RDNA2/Turing+ GPUs, but terribly on GCN/Pascal or older GPUs.
Or, maybe wgpu has a bug with the swapchain presentation setup or barrier recording on Vulkan, and you'll get much different results than the DirectX12 backend on AMD GPUs until it's fixed, but Nvidia is fine because the drivers are more permissive about bugs.
I don't trust most verbal comparisons between renderers. The only real way is to see if an engine is able to meet your FPS and quality requirements on X platforms out of the box or with Y amount of effort, and if not, run it through a profiler and see where the bottleneck is.
There's plenty of room for both approaches: a lot of projects can benefit from using a platform-agnostic API like WebGPU (web or native) directly, while others might want to use engines. Anecdotally, I use WebGPU (through wgpu) in a commercial application for a visualization, and I would never have bothered to use Vulkan or DX12 directly for that otherwise.
Documentation will keep improving with time. There have already been a number of high-quality tutorials and references created over the past few years, for example:
The vectors contained buy/sell orders and were sorted by price; the keys of the hashmap were the different securities. Buy orders and sell orders lived in separate vectors.