Real-Time Ray-Tracing in WebGPU (maierfelix.github.io)
117 points by Schampu 3 days ago | 31 comments





I read Practical Parallel Rendering (1st Edition: 2002) quite a long time ago, on someone's recommendation. There's quite a substantial section on how to build and manage effective job queues that's worth a read even if you don't do any CGI.

But there's also a thesis in there: given that scene descriptions grow in size much faster than screen resolution increases, there should be a tipping point where ray-tracing is more efficient than rasterization. I don't think they expected it to take quite this long though.


"Practical Parallel Rendering" is a great book for anyone. It really is "Parallel Work Distribution Options: The Book".

The argument that "Ray tracing is logarithmic in scene complexity while rasterization is linear. Therefore, ray tracing will win eventually!" ignores the fact that rasterizers also use hierarchical graphs to maintain logarithmic complexity just like ray tracers do. You could make the same argument if you compared a well-designed rasterization-based system vs. a naive ray tracer that brute-forces every triangle vs. every pixel.
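To make that point concrete, here is a minimal sketch (illustrative TypeScript; the node layout and the collectVisible/collectRayCandidates names are my own assumptions, not from the comment) of why both pipelines end up roughly logarithmic: a rasterizer can cull whole subtrees of a bounding-volume hierarchy against the view frustum, and a ray tracer prunes the same kind of tree against a ray.

    // Illustrative only: one bounding-volume hierarchy, two consumers.
    interface Aabb { min: [number, number, number]; max: [number, number, number]; }
    interface BvhNode { bounds: Aabb; children?: BvhNode[]; triangles?: number[]; }

    // Rasterizer side: frustum culling skips entire subtrees before any
    // triangle is submitted, which is what keeps the cost sub-linear.
    function collectVisible(node: BvhNode, frustumHits: (b: Aabb) => boolean, out: number[]): void {
      if (!frustumHits(node.bounds)) return;            // prune the whole subtree
      if (node.triangles) out.push(...node.triangles);  // leaf: submit for rasterization
      node.children?.forEach(c => collectVisible(c, frustumHits, out));
    }

    // Ray tracer side: the traversal has exactly the same shape,
    // just with a ray/box test instead of a frustum/box test.
    function collectRayCandidates(node: BvhNode, rayHits: (b: Aabb) => boolean, out: number[]): void {
      if (!rayHits(node.bounds)) return;
      if (node.triangles) out.push(...node.triangles);
      node.children?.forEach(c => collectRayCandidates(c, rayHits, out));
    }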

The difference is really a focus on local vs. non-local data. Rasterizers focus on preparing data ahead of time so that it can be directly indexed without searching. Ray tracers focus on making global searching as fast as possible. Rasterizers do more work at the start of a frame (rendering shadow, reflection, ambient occlusion maps). Ray tracers do more work in the middle of the frame (searching for shadowing/reflecting/occluding polygons).

It's wonderful that we finally have both accelerated in hardware. HW ray tracing still has a very long way to go: budgets are currently usually less than 1 ray per pixel of the final frame in real-time apps. Figure out how to use that effectively! :D But it still opens up many new possibilities.
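For a sense of scale (my own back-of-the-envelope arithmetic; the resolution and frame rate are assumed, not figures from the comment above), even a budget of exactly 1 ray per pixel is already a lot of rays per second:

    // Rough arithmetic only: 1 primary ray per pixel at 1080p / 60 fps,
    // before any shadow, reflection, or bounce rays are counted.
    const pixelsPerFrame = 1920 * 1080;        // 2,073,600 pixels
    const raysPerSecond = pixelsPerFrame * 60; // 124,416,000 rays/s
    console.log(`${raysPerSecond.toLocaleString()} rays/s at 1 ray per pixel`);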


I (lycium) wrote a bunch on this topic on a recent thread on reddit r/hardware, with many similar points: https://www.reddit.com/r/hardware/comments/enn41z/when_do_yo...

Isn’t < 1 ray per pixel totally OK because of the DNN denoising kernels? One could even train the denoiser on offline-computed frames representative of the specific scenes.

Yes, so the primary test, "Does this triangle cover this pixel" is the same between rasterization and ray-tracing. So the fundamental difference is the outer loop.

Rasterization goes: for each triangle, which pixels does it intersect? Whereas for ray-tracing it's: for each pixel, which triangles does it intersect?

Clearly, this allows us to build lookup data structures over the inner loop. So, as a back-of-the-envelope estimate, ray-tracing becomes more efficient when you have more triangles than pixels (give or take an order of magnitude or so due to algorithmic differences).
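A toy sketch of the two loop orders described above (plain TypeScript, nothing GPU-specific; the pixelsCoveredBy and trianglesHitThrough helpers are assumed callbacks for illustration):

    type Tri = { id: number };

    // Rasterization: outer loop over triangles, inner loop over covered pixels.
    function rasterize(
      triangles: Tri[],
      pixelsCoveredBy: (t: Tri) => number[],   // assumed coverage helper
      framebuffer: number[],
    ): void {
      for (const tri of triangles) {
        for (const px of pixelsCoveredBy(tri)) framebuffer[px] = tri.id;
      }
    }

    // Ray tracing: outer loop over pixels, inner *search* for triangles along
    // each ray; this is the loop the acceleration structure (e.g. a BVH) speeds up.
    function rayTrace(
      pixelCount: number,
      trianglesHitThrough: (px: number) => Tri[],  // assumed intersection helper
      framebuffer: number[],
    ): void {
      for (let px = 0; px < pixelCount; px++) {
        const hits = trianglesHitThrough(px);
        framebuffer[px] = hits.length > 0 ? hits[0].id : -1;
      }
    }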

So, the growth of scene complexity in real-time content has slowed for multiple reasons, not the least of which is that rasterization (especially modern GPU rasterization with quad-based shading) is poorly suited to scenes where polygons approach pixels in coverage. And more importantly, we've hit the polygon density where we appear to get better visual-quality gains by spending cycles on improved shading rather than more polygons.
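For intuition on the quad-based-shading point (a simplified worst-case model of my own, not the parent's numbers): GPUs shade fragments in 2x2 quads so they can compute derivatives, which means a triangle covering a single pixel can still pay for four fragment-shader invocations.

    // Simplified worst-case model of quad-shading overhead for tiny triangles.
    function worstCaseQuadInvocations(coveredPixels: number): number {
      // Assume every covered pixel lands in its own 2x2 quad.
      return coveredPixels * 4;
    }
    console.log(worstCaseQuadInvocations(1)); // 4 invocations to shade 1 visible pixel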

Then add in the transition to 4K and we get a much larger pile of pixels, changing the math all over again.

I don't know what this means for the future. I suspect we won't see much increase in scene complexity until we get to the cliff where ray-tracing is then viable, then I imagine scene complexity for real-time scenes will make a big jump.

Keep in mind, what you read is with offline CGI in mind. And we've already hit that tipping point for offline rendering. Even the last major rasterization hold-out (PRMAN) has switched to path-tracing even primary rays. It's just that real-time rendering is a bit of a different beast.


Rasterization has some niceties for things like cache coherency. Since you're rasterizing one triangle at a time, all with the same shaders, textures, and buffers, the SIMD techniques GPUs rely on are very effective. But with ray-tracing, any ray can hit any triangle. If you cast 64 rays and each one hits a different triangle, you lose your parallelism entirely. You may even have to give up on SIMD if the material models of the triangles differ.

For this reason, and because real-time ray budgets barely approach 1 ray per pixel at 1080p even on the highest-end cards, ray-tracing tends to mostly be used for specular effects at this time.

Even in non-real-time workloads, this was a massive time sink until ray sorting and batching entered common practice: collecting rays that hit a single coherent area so they can be processed together. And the current RTX model has no strong support for ray batching.
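A minimal sketch of the sorting/batching idea described above (illustrative TypeScript; the Hit shape and batchByMaterial name are my own, this is not the RTX API): group ray hits by the material they land on so each batch can be shaded coherently.

    interface Hit { rayId: number; materialId: number; }

    // Bucket divergent hits by material so each bucket runs one shader
    // over many rays, which is what SIMD hardware wants to see.
    function batchByMaterial(hits: Hit[]): Map<number, Hit[]> {
      const batches = new Map<number, Hit[]>();
      for (const h of hits) {
        const batch = batches.get(h.materialId) ?? [];
        batch.push(h);
        batches.set(h.materialId, batch);
      }
      return batches;
    }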


I would have thought that for physically based shaders, most polygons use the same shader with different material parameters; wouldn't that allow SIMD techniques to continue to work?

> there should be a tipping point where ray-tracing is more efficient than rasterization

In a previous life I worked for a company that was developing real-time ray tracing products. The founders had this magical algorithm but the catch was that dedicated ray tracing hardware was almost never successful because by the time the dedicated hardware made it to market general purpose processors had caught up.

However, what it seemed like to me was that the founders had developed a blazing fast algorithm that cut a ton of corners. Each time they'd fix an edge case the product got slower. Regardless they were moderately successful and might still be around.

And then there was the time I accidentally nuked all of our internal infrastructure in the middle of a product release demo.


> However, what it seemed like to me was that the founders had developed a blazing fast algorithm that cut a ton of corners.

No idea if it's your company, but this bit sort of reminds me of Euclideon demos: https://www.youtube.com/watch?v=DrBR_4FohSE


Nope that's a different company.

> there should be a tipping point where ray-tracing is more efficient than rasterization.

How so? Wouldn't both scale logarithmically with respective hierarchical acceleration structures? In what way does ray tracing scale better?


>>> Recently I began adapting an unofficial Ray-Tracing extension for Dawn, which is the WebGPU implementation for Chromium

Wow, very impressive! I believe this is only available in macOS Chrome Canary with the enable-unsafe-webgpu flag turned on. But we are starting to see more example code.

https://github.com/tsherif/webgpu-examples

This is the first RTX-specific target engine I've seen so far, though. It's starting to feel like the future, with full real-time hardware rendering capabilities in the browser ;)

Do you mind my asking what you plan to build with it?


Hey, thanks for your comment.

The Ray-Tracing Extension is currently only available for Windows and Linux.

My next plan is to implement the extension in Dawn's D3D12 backend, so I can build Chromium with my Dawn fork and have Ray-Tracing available directly in the browser (at least for myself) :)


Hey, I really appreciate your work on these bindings. I have done a lot of work with Metal on iOS, and found it frustrating to start over from scratch when trying to combine graphics and deep learning (CUDA on Linux). It would be awesome to see a future where you can write high-performance GPU-driven apps in a cross-platform way with js/webgpu/wasm and slap in platform-specific UI, without bundling all of Unreal or Unity. Running in a browser would be an optional convenience.

Anecdotally, I was trying to connect PyTorch's CUDA tensors to the GL textures that Electron/Chrome uses to render into a Canvas, without going through CPU memory, but couldn't figure out where to inject my code. Chromium's GPU code is quite a maze. Perhaps a smarter person will be able to accomplish that.


You may be interested in this Chromium fork that Intel is working on which adds a machine learning API, loosely modeled on Android's NNAPI: https://github.com/otcshare/chromium-src/commits/webml

It's not likely to be standardized as is, but the code demonstrates how to integrate something like this into Chromium. There's a Web ML community group that's working to figure out what could be standardized in this area. https://webmachinelearning.github.io/


Have you tried tensorflow.js?

Now if only Apple would get off their high horse and allow SPIR-V so that the WebGPU standard can go forward.

Totally! I share your frustration.

And I'm cautiously optimistic that the recent conversation the GPU Web working group had with the Khronos liaison will spur some SPIR-V progress.

https://docs.google.com/document/d/1F6ns6I3zs-2JL_dT9hOkX_25...

The meeting notes also reveal a clue as to why Apple might be pushing WSL so hard:

> MS: Apple is not comfortable working under Khronos IP framework, because of dispute between Apple Legal & Khronos which is private. Can’t talk about the substance of this dispute. Can’t make any statement for Apple to agree to Khronos IP framework. So we’re discussing, what if we don’t fork? We can’t say whether we’re (Apple) happy with that.

> NT: nobody is forced to come into Khronos’ IP framework.


We sincerely think a text-based language is better for the web. It’s honestly weird that anyone thinks a binary format is a good webby choice. Both our browser engine people and our GPU software people agree.

Khronos basically said in that meeting that it would be fine to fork SPIR-V, which would solve Apple’s and Microsoft’s issues with their IPR framework. We’ve also discussed using a textual form of the SPIR-V format. We’ve offered all sorts of compromises. It’s Google that isn’t willing to budge, even stating in a WebGPU meeting that they never even considered what compromises would be acceptable to them. Encourage Google to be open to meeting in the middle and maybe we will get somewhere.


Why would a shader byte code standard prevent text based shaders?

The only difference would be that (runtime) compilation happens in an optional JavaScript / WASM module instead of being baked into the standard and browsers. Apple could ship such a WHLSL compiler with Safari, so that startup is faster there for WebGPU code that uses WHLSL shaders, but please don't force it on everybody else.

And the "web people" are used now to compile stuff for quite some time (see TypeScript, WebPack, JS Minifiers, etc etc). Just one more compile step won't make much of a difference, but offer a lot of freedom.

(I guess in the end, byte code versus text representation isn't that different, but it is important that it's a good compilation target; let's not repeat the long journey from transpiled JS to asm.js to WASM.)
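A rough sketch of the optional, userland compile step described above. Both the compiler package (compileWHLSLToSpirv from "whlsl-compiler-wasm") and the shader-module call are hypothetical/simplified here; the real WebGPU shader-module API was still being negotiated at the time.

    // Hypothetical: a WHLSL-to-SPIR-V compiler shipped as a WASM package.
    import { compileWHLSLToSpirv } from "whlsl-compiler-wasm"; // assumed package name

    // `device` is typed loosely because the shader-module descriptor differed
    // between early proposals (SPIR-V words vs. text source).
    async function createModuleFromWHLSL(device: any, whlslSource: string) {
      const spirvWords: Uint32Array = await compileWHLSLToSpirv(whlslSource);
      return device.createShaderModule({ code: spirvWords }); // assumed SPIR-V-accepting descriptor
    }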


As someone who really wants to use this stuff, I just want mommy and daddy to stop fighting. I personally lean towards a SPIR-V approach for a number of reasons, including faster shader compilation and a more unified ecosystem. I also see the advantages of a high level text format, but feel like that could get polyfilled in 100k or so of wasm in the most important cases.

But it's clear that either approach can work. From what I'm seeing so far, flipping a coin and doing "disagree and commit" would be a healthier process than what's going on right now.


I do agree that a text-based format is better in many cases; a byte-code format would suck for debugging. People want to type their shaders directly into a browser window and have compilation happen in less than a frame, because they have things like colour pickers in their in-browser shader editors where you can drag a value around and it changes the text. Emulating this without browser support would require a lot of work.

But that text based format should be GLSL because that's what everybody's shaders are already written in for WebGL and obviously there will be a transition period where both WebGL and WebGPU will have to be supported (which is easy since most people use a library such as Babylon or Three).

Having a text-based language that is not GLSL is pointless IMO. You get the drawbacks of both a bytecode format (you need to ship a compiler with the page to compile GLSL into WSL) and a textual format.

As an outsider the most logical option is to support both GLSL and SPIR-V.


Would that be a textual format that is essentially already compiled, or do you mean a textual format that is just source (like glsl)? The latter historically has had a bunch of issues with compiler incompatibility across vendors, so I assume you mean the former. But what is then the benefit of having essentially textual bytecode compared to binary?

> We sincerely think a text-based language is better for the web.

I believe that you and the WebKit/Metal devs are sincere in that. Apple has a culture of human-focused design, and I can tell that a lot of work went into making WSL a good, human-friendly GPU shader language that would meet a lot of people's needs.

> It’s honestly weird that anyone thinks a binary format is a good webby choice.

Okay, so is it weird that common image formats like PNG or JPG are binary formats? Those are the best "webby choices" for images, because they are capable of representing the information needed to display high-quality images on many platforms, and can be generated by a whole range of tools that export them. It's not weird at all. You wouldn't argue that the most "webby" choice of image format is ASCII-art images just because someone can edit them with a text editor. PNG gives you access to a whole ecosystem of image-editing tools that do things a text editor won't ever be able to do, producing images that ASCII art will just never realistically duplicate, even if theoretically it somehow could.

¯\_(ツ)_/¯

What does SPIR-V give us that makes it a good "webby choice" even though it's a binary format? It's a portable, cross-platform Rosetta Stone of GPU functionality, in ways that a high level language like WSL or GLSL by definition can never be. They are just not the same.

The tooling that is already beginning to emerge around SPIR-V will enable possibilities that WSL just won't realistically ever be able to match. It's highly unlikely that people will develop transpilers for WSL that are as flexible and seamless as the compilers people are already making for SPIR-V. WSL is too opinionated; it's an inherently opinionated high-level language.

And once the GPU Web project gives the go-ahead, people are gonna be even more motivated to work on SPIR-V tooling. It will give shader and GPU compute developers freedom to choose from a whole range of high level constructs and preexisting tooling from whatever language best fits their specific application, so that they can focus energy on their actual problems instead of just wrestling with the tooling. On top of that, people will still be able to use and quickly re-purpose shaders written in HLSL, GLSL or WSL.

WSL looks like a fine language for doing linear algebra in low dimensions so that you can rotate some colorful geometry for your brand or whatever.

But it cannot and will not ever be what SPIR-V is. Telling developers that WSL is plenty fine because your specific team of devs have that opinion is like one artist telling another that they ought to be perfectly content with crayons and markers. Sure, you can do some nice stuff with crayons and markers, but there's a whole world of visual art mediums out there once you remove that constraint. What about ceramics? And I've talked to quite a few developers who are excited about SPIR-V. You all might not be that into SPIR-V but you're burying your heads in the sand if you can't see that a lot of people are really into it.

Let me share why I'm so invested personally that I'd sit down and write this unusually long letter:

WebGPU is critical to the direction of some of the work I'm most excited about doing while I'm here on this planet. My goal is to produce interactive multimedia art using Clifford algebra and other interesting mathematical structures, targeting the web browser as a distribution platform. I've been working on this stuff for 10 years now and feel like I have another 10 years to go before I'm really ready to illustrate the stories and concepts I want to express. Soon I'd like to have some real tools that can enable collaboration with other artists and storytellers. But so far I've mostly just been prototyping.

I've written code to visualize non-euclidean transformations and geometries using conformal geometric algebra, and to do computations in all kinds of other algebras using GLSL, Python (Sage/SymPy), Julia and C. I made a Blender modifier that does conformal inversion. I've written transpilation tools to convert SymPy expressions to GLSL, and used these to create generative art. I wrote an abstract algebra code generator that spits out algebraic glslify compilable modules from SymPy expressions.

But it's so cumbersome and I'm pretty tired. I bumped into the limits of OpenGL ES a while ago. I want to use features from newer versions of OpenGL on the web but it's not possible. WebGL is stalled. WebCL is a little better but it's even more stalled.

Plus, even if I could use the latest versions of OpenGL or OpenCL, generating code in these languages to do the math I want is really cumbersome. There are much better languages for doing math, and people have been writing all kinds of great math libraries in these languages for decades. I want to use those when I'm writing shaders and GPU compute programs. Realistically, I'm going to need to use existing tools if I want to complete the work I dream of, and SPIR-V is beginning to make it possible to use them in a cross platform way.

If WSL becomes the only possible shader language for WebGPU, I'll basically pivot away from the goal of targeting the web browser and move into cross platform desktop/mobile app development, which I'm just not nearly as excited about. It wouldn't be the end of my dreams, but it'd mean giving up on the web browser.

> We’ve offered all sorts of compromises. It’s Google that isn’t willing to budge.

On a personal level, I trust your sincerity and respect your passion, but having gone through a bunch of meeting minutes, I just didn't get that impression, and I'm left feeling pretty disappointed, all in all, with how Apple as a corporation has engaged in and publicly represented the process.

All that to say, I'm really optimistic about the progress you all have made in getting to the point of meeting with Khronos and hashing things out. Serious props on that.

And I hope you don't take my attitude toward WSL as being dismissive of it entirely. It honestly looks great for what it is. It just doesn't suit my needs.

https://drive.google.com/drive/folders/1tM5KAZ7S52cv3jnpPOZs...

---------

Relevant meeting minutes

Google/Mozilla people discussing WSL:

https://docs.google.com/document/d/1BOvJKklz-4PZN2StCU56nN4m....

Apple people making their case for WSL:

https://docs.google.com/document/d/1CmKo59tjZwmePVrFpHpIG0W5....

https://docs.google.com/document/d/1opv8MIK94DNIKU5qbgeqlkkT....

The GPU Web group prepares for the Khronos meeting:

https://docs.google.com/document/d/1RI5hgOFuOH0-v9MPxuoV6w2L....

Meeting with the Khronos laison:

https://docs.google.com/document/d/1F6ns6I3zs-2JL_dT9hOkX_25....


Are there any good WebGPU tutorials?

I've been wanting to play around with graphics programming for a while and the web is such a perfect platform due to cross platform compatibility and lower barrier for entry.


It's barely got any support ATM.

https://github.com/gpuweb/gpuweb/wiki/Implementation-Status

I doubt you'll see a lot of good tutorials for it until it starts landing in stable versions of browsers (or even nightly builds, tbh).

While it's starting to show its age, WebGL has a lot of tutorials and you can start working with it today.


I wonder whether people will use WebGPU directly or go through a higher-level framework/library such as Babylon. It looks like Babylon has WebGPU support, though not everything seems to be supported yet; I haven't personally played with it.

I'm of the opinion that WebGPU shouldn't try to be a friendly API on its own, but should instead practically mandate a framework on top. This is the philosophy the newer APIs like Vulkan/D3D12 adopt: you should have a higher-level renderer driving them that's able to reason about your entire scene graph. I've suggested as much to the WG before [0], and they roughly seem to agree.

[0] https://github.com/gpuweb/gpuweb/issues/171


I don't agree. The Vulkan and D3D12 APIs didn't have to be so programmer hostile, they're just badly designed APIs.

Thankfully, WebGPU took a lot of inspiration from the Metal API, and less from Vulkan and D3D12, and thus is usable without a "sanity layer".


That sanity layer ended up being yet another way to foster adoption of middleware engines.

I'm excited for it, but WebGPU is hardly close to ready yet. Right now they're still negotiating the shader language.

But if you want to learn some graphics programming on the web, I really recommend trying out regl. It's a great way to learn about graphics primitives without getting bogged down in the WebGL API directly.

https://github.com/regl-project/regl
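For anyone wanting to try it, this is roughly regl's canonical single-triangle example (lightly adapted; the exact import style depends on your bundler/TypeScript setup):

    import createREGL from "regl";

    const regl = createREGL(); // creates a fullscreen canvas + WebGL context

    const drawTriangle = regl({
      vert: `
        precision mediump float;
        attribute vec2 position;
        void main() { gl_Position = vec4(position, 0.0, 1.0); }`,
      frag: `
        precision mediump float;
        uniform vec4 color;
        void main() { gl_FragColor = color; }`,
      attributes: { position: [[-1, -1], [1, -1], [0, 1]] },
      uniforms: { color: [1.0, 0.4, 0.2, 1.0] },
      count: 3,
    });

    regl.frame(() => {
      regl.clear({ color: [0, 0, 0, 1], depth: 1 });
      drawTriangle();
    });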



