Seeing that the author of Blade (kvark) isn't exactly a 3D API newbie and also worked on WebGPU, I really wonder whether a switch to wgpu will actually have the desired long-term effect. A WebGPU implementation isn't exactly slim either, especially when all that's needed is a very small 3D API wrapper specialized for text rendering.
Cross-API graphics abstractions are almost always a bad idea even if it's just wrapping modern D3D12 and Vulkan, and they always are once Metal comes into the mix.
Kvark was leading the engineering effort for wgpu while he was at Mozilla.
But he was doing that on work time, collaborating with other Mozilla engineers, whereas AFAIK Blade has been more of a personal side project.
WebGPU has some surprising performance problems (although I only checked Google's Dawn library, not Rust's wgpu), and the amount of code that's pulled into the project is massive. A well-made Metal renderer which only implements the needed features will easily be 100x smaller (in terms of linecount) and most likely faster.
There is also the issue that it was designed with JavaScript and the browser sandbox in mind, and is thus at the wrong abstraction level for native graphics middleware.
I am still curious how much uptake WebGPU will end up having on Android, or if Java/Kotlin folks will keep targeting OpenGL ES.
For a text editor it's definitely good enough if not extreme overkill.
Other than that, the one big downside of WebGPU is the rigid binding model via baked BindGroup objects. This is both inflexible and slow when any sort of 'dynamism' is needed, because you end up creating and destroying BindGroup objects in the hot path.
The modern Vulkan binding model is relatively fine. Your entire program has a single descriptor set containing an array of images that you reference by index. Buffers are never bound and instead referenced by device address.
Python had already exploded in popularity in the early 2000s, and for all sorts of things (like cross-platform shell scripting or as scripting/plugin system for native applications).
> GPUs, from my understanding, have lost the majority of fixed-function units as they’ve become more programmable.
That would be nice but doesn't match reality unfortunately; new fixed-function units are even added from time to time (e.g. for raytracing).
Texture sampling units also seem to be critical for performance and probably won't go away for a while.
It should be possible to hide a lot of the fixed-function magic behind high level GPU instructions (e.g. for sampling a texture), but GPU vendors still don't agree about details like how the texture and sampler properties are managed on the GPU (see: https://www.gfxstrand.net/faith/blog/2022/08/descriptors-are...).
I.e. the problem isn't in the software but in the differing hardware designs, and GPU vendors don't seem to like the idea of harmonizing their GPU architectures; they're also not fans of creating a common ISA as a compatibility shim (the way it is common for CPUs). Instead the 3D API, driver and high-level shader bytecode (e.g. SPIRV) form this common interface, and that's how we landed in the current situation with all its downsides (most of the reasons are probably not even technical, but legal/strategic - patents and stuff).
Thanks for the link to the post. I also watched her talk posted elsewhere in these comments. We’re lucky to have people like her doing the hard work for free software.
> most of the reasons are probably not even technical, but legal/strategic - patents and stuff
I think fighting for specified interoperable interfaces is important, and we must be vigilant against forces that undermine this, either knowingly or through ignorance.
Wow, you should get NVIDIA, AMD and Intel on the phone ASAP! Really strange that they didn't come up with such a simple and straightforward idea in the last 3 decades ;)
DXGI+D3D11 via C is actually fine, and is close to or even below Metal v1 when it comes to 'lines of code needed to get a triangle on screen'. D3D12 is more boilerplate-heavy, but still not as bad as Vulkan.
Except it didn't. In the GL programming model it's trivial to accidentally leak the wrong granular render state into the next draw call, unless you always reconfigure all states anyway (and in that case PSOs are strictly better, they just include too much state).
The basic idea of immutable state group objects is a good one, Vulkan 1.0 and D3D12 just went too far (while the state group granularity of D3D11 and Metal is just about right).
> Similarly, WebGPU could have done without that static binding mess.
This I agree with, pre-baked BindGroup objects were just a terrible idea right from the start, and AFAIK they are not even strictly necessary when targeting Vulkan 1.0.
There should be a better abstraction than PSOs to solve the GL state leakage problem. We end up with a combinatorial explosion of PSOs when some of the states they bake in are essentially just toggling bits in a GPU register, in no way coupled with the rest of the pipeline state.
That abstraction exists in D3D11 and to a lesser extent in Metal via smaller state-group objects: for instance, D3D11 splits the render state into immutable objects for rasterizer state, depth-stencil state, blend state and (vertex) input-layout state (the latter not even needed anymore with vertex pulling).
Even if those state group objects don't match the underlying hardware directly they still reign in the combinatorial explosion dramatically and are more robust than the GL-style state soup.
AFAIK the main problem is state which needs to be compiled into the shader on some GPUs while other GPUs only have fixed-function hardware for the same state (for instance blend state).
> Except it didn't. In the GL programming model it's trivial to accidentially leak the wrong granular render state into the next draw call
This is where I think Vulkan and WebGPU are chasing the wrong goal: To make draw calls faster. What's even faster, however, is making fewer draw calls and that's something graphics devs can easily do when you provide them with tools like multi-draw. Preferably multi-draw that allows multiple different buffers. Doing so will naturally reduce costly state changes with little effort.