Ray Tracing Essentials Part 7: Denoising for Ray Tracing (nvidia.com)
118 points by ibobev on May 11, 2020 | 123 comments



This might be a stupid question, because I don't really know what I'm talking about... But is there a way this could be used to improve physics engines? My understanding is that a physics engine generally uses continuous collision detection. I don't know if noise plays a role in casting for collisions, but if you could cast fewer rays and fill in the holes in the same way... that seems like it could improve performance there too?


Aren't the collision models already pretty simple as is? Simple shapes like boxes, spheres and lines, for which the collisions are fast to calculate. Don't know how it works if you have full blown meshes colliding though...


I wouldn't be surprised to see something like that.

Remember that Krizhevsky's implementation of AlexNet, which made neural networks work on GPUs, wasn't something Nvidia initially intended either. But look how that worked out.


The problem is that you might get a lot of instability.

The most obvious case: stack 20 boxes one slightly above other and let them fall.

If you will have random noise in there the simulation can become unstable/explode.


Yes. Here's a video that describes an approach using neural networks to estimate physics instead of using a traditional simulation: https://www.youtube.com/watch?v=atcKO15YVD8


Yes. I think I’ve seen some preliminary papers with related ideas (because that seems like where ML is at) but I don’t remember anything substantial.

It could be an interesting area of research. Start branching out from ray tracing to other integration problems.


I don't know, it depends on the implementation. Collision detection doesn't involve any graphics and might not parallelise well, so it might still be easier on the CPU.


The tech is cool but I still can't shake the feeling that from a practical perspective, in the context of videogame graphics, that's a lot of work for a meager difference in the end result compared to good old rasterizer shading. Sure, you get more accurate shadows and reflections... But I'd gladly trade that for more detailed models and larger, more complex environments for instance.

Maybe it'll become the new standard for real-time 3D graphics in the future, but for the time being I file it next to HairWorks as an Nvidia gimmick whose main purpose is to make the competition look worse in benchmarks because they don't implement that API, when in practice the visual difference is fairly subtle.


In addition to the examples others have given where it improves the visuals, raytracing can make developing a game easier.

Currently, game developers spend huge amounts of time cooking up scene specific lighting hacks, things that make no physical sense but result in the visual appearance they are after.

Another hack is dynamically computing an environment map for a shiny object (that is, rendering the scene from the vantage point of the object) so it can be used as a texture to model first order reflections. It is infeasible to compute that dynamic environment map for every object, so the art department picks and chooses which assets get reflections and which don't. Or maybe an environment map is computed from the vantage point of the centroid of a car, which is used for all reflections (windshield, bumpers, roof, hood), and that works OK for objects which are far away from the car. But if there are any objects near the car, the reflection angles are wrong in different ways on different parts of the car.

With raytracing, a lot fewer hacks are needed. Yes, what nvidia is offering here is a hack in its own way (denoising, neural net upscaling too), but is one hack which applies to all cases, vs the old way of needing to create a different hack for each situation.
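
For a sense of how wrong the single-environment-map approximation gets, here is a small Python sketch with made-up positions (camera, a point on the hood, the car's centroid): it compares the true reflection direction at the hood with the direction a centroid-baked map would effectively need, and the angular error grows as the reflected object comes closer.

```python
import numpy as np

def reflect(d, n):
    """Reflect unit direction d about unit normal n."""
    return d - 2.0 * np.dot(d, n) * n

# Made-up scene: camera, a shading point on the car's hood, and the car's
# centroid (the point the environment map was rendered from).
camera   = np.array([0.0, 2.0, -5.0])
hood     = np.array([1.5, 1.0,  0.0])
centroid = np.array([0.0, 1.0,  0.0])
normal   = np.array([0.0, 1.0,  0.0])   # hood normal, pointing up

view = hood - camera
view /= np.linalg.norm(view)
r = reflect(view, normal)               # true reflection direction at the hood

# An environment map is indexed by direction only, so a lookup with `r`
# implicitly assumes the reflection starts at the centroid. The direction
# that would actually point at the reflected object from the centroid differs:
for dist in (100.0, 10.0, 1.0):         # reflected object far, medium, near
    target = hood + dist * r
    needed = target - centroid
    needed /= np.linalg.norm(needed)
    err = np.degrees(np.arccos(np.clip(np.dot(r, needed), -1.0, 1.0)))
    print(f"object at {dist:5.1f} units -> angular error {err:5.2f} degrees")
```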


> scene specific lighting hacks, things that make no physical sense but result in the visual appearance they are after.

The flip side of this is that physically realistic shadows aren't always what you want. Any amateur photographer who has noticed just how awful hard shadows are on faces on clear sunny days has learned this lesson the hard way.

To quote a graphics professor of mine (tongue in cheek), "Physically accurate rendering is a crutch for people who aren't smart enough to cheat their way to the look they want".


If you want more detailed models and more complicated scenes, you should really look into raytracing.

The problem with raytracing is that it mostly depends on your resolution, not on your scene. Right now, we're just on the edge of being able to do raytracing at practical resolutions in realtime. But once that's possible, the rest is basically free. Rendering 10M polygons is almost as fast as rendering 1K polygons.

And writing a path tracer isn't that hard - most of the math is quite trivial. You get all the magic shadow / reflection / caustic / lighting stuff for free. We're getting quite close to being able to do movie-level graphics on consumer machines in realtime, and all that without any of the tricks required for okay-ish looking rasterizer shading. Best of all, it's even easier to work with for artists as well!
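
As a small illustration of the "most of the math is quite trivial" point, here is a minimal diffuse-only path tracer sketch in Python. Everything in it is hypothetical (one sphere on a ground plane lit by a constant sky), and it is numpy-slow; the point is only that the core loop fits in a few dozen lines.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up scene: one diffuse sphere resting on a diffuse ground plane (y = 0),
# lit only by a constant sky dome. Tiny resolution, few samples: the result is
# noisy, which is exactly where denoising comes in.
CAM = np.array([0.0, 1.0, 0.0])
SPHERE_C, SPHERE_R = np.array([0.0, 1.0, 4.0]), 1.0
SKY = np.array([0.7, 0.8, 1.0])
ALBEDO_SPHERE = np.array([0.8, 0.3, 0.3])
ALBEDO_GROUND = np.array([0.5, 0.5, 0.5])

def hit(o, d):
    """Nearest intersection along unit ray (o, d): returns (t, normal, albedo) or None."""
    best = None
    oc = o - SPHERE_C
    b, c = np.dot(oc, d), np.dot(oc, oc) - SPHERE_R ** 2
    disc = b * b - c
    if disc > 0.0:
        t = -b - np.sqrt(disc)
        if t > 1e-3:
            p = o + t * d
            best = (t, (p - SPHERE_C) / SPHERE_R, ALBEDO_SPHERE)
    if d[1] < -1e-6:                      # ground plane y = 0
        t = -o[1] / d[1]
        if t > 1e-3 and (best is None or t < best[0]):
            best = (t, np.array([0.0, 1.0, 0.0]), ALBEDO_GROUND)
    return best

def cosine_sample(n):
    """Cosine-weighted direction on the hemisphere around unit normal n."""
    u1, u2 = rng.random(), rng.random()
    r, phi = np.sqrt(u1), 2.0 * np.pi * u2
    a = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    t = np.cross(n, a); t /= np.linalg.norm(t)
    b = np.cross(n, t)
    return r * np.cos(phi) * t + r * np.sin(phi) * b + np.sqrt(1.0 - u1) * n

def radiance(o, d, depth=0):
    """Monte Carlo radiance estimate along the ray (diffuse surfaces only)."""
    if depth >= 4:
        return np.zeros(3)
    h = hit(o, d)
    if h is None:
        return SKY                        # ray escaped to the sky
    t, n, albedo = h
    p = o + t * d
    # Cosine-weighted sampling: the cosine and pdf terms cancel,
    # leaving albedo * incoming radiance.
    return albedo * radiance(p + 1e-3 * n, cosine_sample(n), depth + 1)

W, H, SPP = 64, 48, 16
img = np.zeros((H, W, 3))
for y in range(H):
    for x in range(W):
        for _ in range(SPP):
            u = (x + rng.random()) / W * 2.0 - 1.0
            v = 1.0 - (y + rng.random()) / H * 2.0
            d = np.array([u * W / H, v, 1.5])
            img[y, x] += radiance(CAM, d / np.linalg.norm(d))
img /= SPP
print("mean pixel value:", img.mean(axis=(0, 1)))
```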


"Rendering 10M polygons is almost as fast as rendering 1K polygons"

That is correct for ray tracing.

But modeling and animating 10M polygons requires a Hollywood movie budget, while I can model and convincingly animate 1K polygons myself.


Of course, but at least now you have the ability to render it. "Black Ops 2" made $1B in 15 days, so some studios already have that kind of budget and should be able to make use of it.

As for smaller players? Personally, I'd be interested in procedural generation, or just modelling NURBS surfaces. And have you seen the demos of raytraced Minecraft / Doom? The highly advanced lighting already makes them look way more detailed, even with a low poly count. Even low-budget studios will be able to make good-looking games.


The Doom+Raytracing pictures look great, but I believe they are fake

https://www.dsogaming.com/news/doom-rtx-released-mod-bringin...

For Minecraft, the water is a bit nicer, but it still looks very unconvincing to me:

https://www.youtube.com/watch?v=nXoLY9lF-NI


Unless I'm missing something it's not a very fair comparison: you have low-res unshaded textures on one side and high-res bump-mapped textures on the other. That alone accounts for 99% of the difference between the two versions.

Meanwhile current "AAA" games that have RTX support don't look radically different with RTX on or off. Better and more accurate reflections of course but also a big hit on performance that's, IMO, not worth it.

Of course that may just be because the devs don't bother to use the tech to its full potential when only a tiny fraction of the playerbase will be able to use that mode anyway.


High quality models are created via sculpting (5-10 million is actually still on the low end for that) before being retopologized. For anything high-fidelity (that is, anything you might encounter in a non-hobbyist project) it's _more_ work to create a model with a low polycount. The human work amount for animation is also mostly dependent on the rig, so if you're fine with a simple rig, animating a high polycount mesh isn't really any harder.

What kills it though is that even though raytracing scales well, animation/simulation does not.


I'd guess that rigging a 10 million poly mesh will be pretty much impossible in 3ds Max / Maya / Blender. Their viewports are not built to handle such an insane polycount.

Plus, I wouldn't know of any UV-mapping tool that can handle a 10 million poly count.


Well, you could also use ZBrush for rigging. I think that multi res in Blender can work with up to 10 million polys (or maybe this is a work in progress?), but don't quote me on that.

I don't see why you wouldn't just go for vertex colors instead of UVs. Polypainting works fine even at 10 million polys in all the tools I've worked with (ZBrush, Substance Painter, 3DCoat). If there's some reason you'd want UVs, you could just bake the vertex colors to UVs afterward.


>But modeling and animating 10M polygons requires a Hollywood movie budget, while I can model and convincingly animate 1K polygons myself.

I can make 10M polygons just by opening blender and asking for a UV sphere with a bunch of sides... Artists using the sculpt tool can produce models with humongous numbers of polygons, and then they have to work really hard to optimize them. I would argue the exact opposite of your claim. Models with more polygons are easier to end up with than models with fewer.


I think the argument here is that creating quality assets with those levels of detail is expensive, not that you can create large and unwieldy meshes by not understanding how to model efficiently... that goes without saying.


"model efficiently" implies optimizing a benefit for a cost. In this case, I think the nature of the cost has completely changed. The definition of efficiency will probably have to change also.


Not really.

A 10 mio poly mesh with a typical XYZW+RGBA+UV+UV2 = 12 float stream format would require 12x4x10 mio = 480 MB of GPU memory per copy. Skinning needs weights (200 MB) and at least a second copy, so one single 10 mio poly mesh would eat up about 1.3 GB of GPU memory.

So even with ray tracing, optimizing models for a manageable polycount will remain crucially important.
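
For reference, the arithmetic above as a tiny script; the 12-float vertex layout and the 200 MB of skinning data are taken from the comment, not from any particular engine:

```python
# Back-of-the-envelope version of the estimate above (assumptions: 4-byte
# floats, XYZW+RGBA+UV+UV2 = 12 floats per vertex, ~5 floats per vertex of
# skinning data to match the 200 MB figure, and a second copy of the vertex
# stream for the skinned output).
verts = 10_000_000
fsize = 4                                       # bytes per float

vertex_stream = 12 * fsize * verts              # ~480 MB
skinning_data = 5 * fsize * verts               # ~200 MB
skinned_copy  = vertex_stream                   # second copy of the stream

total = vertex_stream + skinning_data + skinned_copy
print(f"{total / 1e9:.2f} GB")                  # ~1.2 GB, before index buffers etc.
```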


If you have static lights in a static world with pre-computed raytraced lights, yes it doesn't change that much. But once you have a dynamic world and dynamic lights, it's a game changer.

Minecraft with raytracing is a great example. It's so much better with raytracing compared to simple shaders.

I also heard that it will make game development a lot smoother and faster in the future once most hardware is powerful enough. No need to build lightmaps, no ambient occlusion hacks and the like, no reflections with approximate results.


Full RT will be amazing for some things like a DOOM game where shadows in corridors make you stay alert, but otherwise I don't think it is a game changer in terms of gameplay.

What is a given is that there will be a flood of games overusing it, as happened with bloom and other effects.


It's actually kind of funny, because that's exactly how the 2004 Doom 3 turned out - defining the game-play based upon lighting (and monster closets) to build tension.


The original DOOM game (1993) already used it for that, that's why I mentioned it!

It even had flickering lights, pitch-black corridors, traps using light as bait, monster ambushes...

The first FPS with all that!


Totally true - I forgot about the pitch black areas.


This[1] demo does pretty convincing dynamic lighting without the need for RTX, relying just on regular shaders. To me Nvidia's RTX looks like a gimmick with the sole purpose of enforcing vendor lock-in.

1.: https://www.youtube.com/watch?v=GtU2194C-D4


RTX is hardware acceleration for many ray tracing operations, how is that a gimmick?


That demo is a ray marching shader with hardly any geometry, and the geometry is procedural. You do understand this method can't be used to render video games or films?

> RTX looks like a gimmick with the sole purpose of enforcing vendor lock-in.

That's like saying a floating point unit, or a Google TPU, or even an Intel CPU is a gimmick to enforce vendor lock-in. If you want to do something faster in software, you can make hardware for it. Don't buy hardware if you don't need speed.


The most expensive step in ray marching is to find the distance which you will get for free after one rasterization pass via the z-buffer. With the z-buffer already present every effect used in this demo can be applied in the same way to purely rasterized graphics with even less cost.

E.g. Doom Eternal has very elaborate dynamic lighting effects but uses 100% forward rendering. RTX is completely pointless.
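
For context, the "find the distance" step under discussion is the sphere-tracing loop of an SDF ray marcher. A minimal sketch with a made-up one-sphere scene (not the shader from the linked demo):

```python
import numpy as np

def sdf(p):
    """Signed distance to a made-up scene: a unit sphere at (0, 0, 3)."""
    return np.linalg.norm(p - np.array([0.0, 0.0, 3.0])) - 1.0

def march(origin, direction, max_steps=128, eps=1e-4, t_max=100.0):
    """Sphere tracing: step along the ray by the distance bound until we hit."""
    t = 0.0
    for _ in range(max_steps):
        d = sdf(origin + t * direction)
        if d < eps:
            return t          # hit: this is the depth a z-buffer would give you
        t += d
        if t > t_max:
            break
    return None               # miss

print(march(np.zeros(3), np.array([0.0, 0.0, 1.0])))   # ~2.0
```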


> The most expensive step in ray marching is to find the distance which you will get for free after one rasterization pass via the z-buffer.

Not true, you can't make that claim. The most expensive step depends entirely on what you render. The material can easily be more expensive than the SDF.

> With the z-buffer already present every effect used in this demo can be applied in the same way to purely rasterized graphics with even less cost.

You're completely forgetting that shadows are a thing.

> RTX is completely pointless.

I interpret that to mean that either 1- you're mad at Nvidia for various reasons not particularly related to RTX, or 2- you don't know what RTX actually does.


> The material can easily be more expensive than the SDF.

But you still don't have to calculate the SDF in the first place, which means one less computation step before your material. Also, geometry with high object counts is not as slow as it is with SDFs.

> You're completely forgetting that shadows are a thing.

Those have been done via stencil buffers since Doom 3. And for non-point-like sources you have a bag of tricks available via shaders to make them soft. No RTX needed!

> you don't know what RTX actually does.

I know what RTX does. As long as the results are so subtle to the point where on some screenshot comparisons you can't even tell which one is with or without, RTX is completely pointless especially when it comes with a performance penalty.


Why are you arguing about SDFs? The demo you linked proves nothing about what either game does or what RTX does; in the context of this thread it's irrelevant whether z-buffers save you one iteration of ray marching.

Not needing a special shadow map or stencil volume or a big bag of tricky tricks for area lights is one of the reasons to use ray tracing & RTX.

The main point of RTX is faster ray tracing. You're complaining about the screenshot marketing of specific games, not really demonstrating an understanding of the tradeoffs of ray tracing. It's fine if you don't see any advantage to having ray tracing in Control or Battlefield, and/or don't like the idea that some people see value in better visuals or easier development. That means you don't like it. I see a point even if you don't.


It is already being used in movie studios for a better rendering experience.

https://home.otoy.com/render/octane-render/

Fun fact, at GTC 2020 Otoy revealed that they are moving away from Vulkan and adopting CUDA instead (via Optix 7) due to better compute power and tooling for ray tracing algorithms.

https://developer.nvidia.com/optix


Funny that you mention HairWorks, because that one causes a lot of issues in practice: the hair is rendered in post-processing. Therefore it is not present in the z-buffer, so alpha-blended transparent effects (e.g. smoke) may incorrectly overlap the hair. For an example of this issue, see the excellent Shadow of the Tomb Raider game.


A lot of Nvidia's new RTX tech may be marketed towards gaming but it's actually having an impact in 3D rendering. Cycles supports RTX out of the box now and I'm sure other renderers will/do as well. And the denoising tech is amazing; while not up to the challenge for final renders, it's perfect for getting a quick frame upstream.


If nothing else the tech allows 3D artists and level designers to get instant feedback while they are tweaking a scene, without having to wait for a full render/lightmap bake - having iteration time go from minutes to seconds is a game changer for creative work like that.


But sadly, it doesn't work like that.

Let's say I render a reflective vase in V-Ray and use NVIDIA OptiX denoising. The ray-tracer will cast a few specular rays so that some pixels will get super bright from having a direct reflection of the light source, while other pixels will not yet receive specular illumination. The OptiX denoising will then blur things out, so that the entire vase is equally brightened up.

But that quick preview looks so significantly different from the final rendering result, that it is effectively useless and, of course, highly misleading. In the final result, only the parts of the surface where the curvature is "just right" will receive specular. In my preview, everything will.


You’re attempting to draw a broad conclusion from a contrived corner case example. All you have to do is wait a little longer for more specular samples. The question is whether you wait for less time with denoising than without to get a pretty good preview, and the answer is yes. Many artists are really happy with denoising, and would rather have it than not, regardless of whether the processor is a GPU or CPU.


Well I had to pick an example to illustrate my grief with denoising, but the problems are not limited to my example.

In general, if you use an AI to replace your noisy image with a noise-free but plausible alternative, you should be aware that you're replacing content, and this can change what you see to the point where it is no longer representative of the fully converged image.


In general, it's still true that denoising a higher quality input will yield a higher quality (closer to resolved) output, so all you have to do is wait a little longer.

Denoising is typically applied in interactive environments where you don't get a static result, you get a new denoised result that includes more samples every frame. Undersampling is easy to spot because the denoised image jumps a lot every frame. Once it stops moving a lot, you can have a lot more confidence.

So the problem you're citing is not one that you're stuck with, and it's not a problem in practice.


I don't know if you noticed that one of the use cases I mentioned was level artists. 9 times out of 10, a level artist isn't trying to optimize some specular highlight in a vase, they are trying to see if this room is lit in a good way (aesthetically pleasing, the player can see the right things and so on). So I agree that there are use cases for which a fast denoised preview isn't useful, like the one you mention, but there certainly are ones for which it is quite sufficient, and infinitely better than waiting around for minutes for the lightmaps to bake GI.


Using GPUs for offline rendering to gain speed is a dangerous fallacy. Dangerous because the GPU has many restrictions which developers have to invest time in working around.

I feel in recent years makers of offline renderers have spent more time dealing with these restrictions to get their renderers to work well with GPUs than actually improving their renderers, i.e. reducing the level of noise, better algorithms for hair, volumes, etc.

The renderers produce way too much noise (or have render times that are prohibitive for anyone except big studios).

Solution? Denoising! Which is (mostly) using GPU-powered ML!

The irony of this seems lost on many freelancers who buy expensive GPU rigs to be able to deliver their jobs on the likes of OTOY or Redshift.

For this money you can buy a vastly superior CPU rig. If only your renderer could use it! Most people compare apples to oranges here. When I hear people rave about GPU offline renderers, I usually ask them:

- Have you made time-to-first-pixel part of your comparison? (Complex scenes can take dozens of minutes before they start rendering on GPU renderers.)

- Are you comparing images as they come out of renderer A before denoising to images of renderer B before denoising? Because anyone can pipe their shit through Intel Open Image Denoise after the fact. It's free even.

- Have you actually used a machine that has more than eight cores personally in production? If all your money goes into buying expensive graphics cards your CPU is usually shite.

A notable exception is 3Delight which doesn't even ship a denoiser. On simple scenes the renderer can't beat the GPUs but on anything a bit more complex -- the stuff you deal with in actual production -- it smokes the competition CPU & GPU alike.

Not least in the amount of noise it produces using comparable sampling settings. Without using any GPU compute.

While this GPU craze took over other vendors whose renderers were traditionally CPU-bound, over the last 5-6 years 3Delight's developers have spent all their time finessing performance and output quality without relying on GPUs. It shows.


The trend in renderers is extremely clear, nearly everybody is building a GPU version, and the makers and their users all report pretty big speedups with GPUs...

> Because anyone can pipe their shit through Intel Open Image Denoise after the fact. It’s free even.

The same is true of the OptiX denoiser. I’m not sure what point you’re making?

> Not least in the amount of noise it produces using comparable sampling settings. Without using any GPU compute.

The amount of noise has nothing to do with GPU vs CPU.


No. My scenes usually need 40+ GB of memory to render. Trying to put that onto a 11 GB RAM GPU will swap like hell and be excruciatingly slow.

It's rendering much faster on CPU.


Huh? No what? What are you referring to? Is this a reply to a different sub-thread?

> Trying to put that onto a 11 GB RAM GPU will swap like hell and be excruciatingly slow

The comment you replied to didn't mention memory, but what renderer are you currently using that will swap while using the GPU?

What are you referring to when you say "that", in "Trying to put that onto a 11 GB RAM GPU"?

Does the RAM limit of your GPU mean that mine will go slower too?


I am reading between the lines here, but I think he is saying that the 'trend' you are talking about is a figment of your imagination combined with that of some companies' marketing folks.

I.e. it has no substance from a user's pov a priori (his point) and it has no substance from the pov of someone looking at the numbers of this a posteriori (my point, elsewhere in this thread, which I am happy to actually back up any time).

I mean a Quadro 8k RTX can stash 48GB. Standard on 3D artist workstations is 128GB today. Even my freelancer friends have this in their boxes now at the very least.

Go figure what is standard RAM size on render rigs on farms these days based on that ...

And that's not even considering compute restrictions on these GPU rigs that make them simply unfit for certain scenes.


Please do, instead of claiming you can back it up, please just do it, what are you waiting for?

The list of offline renderers adding GPU ray tracing support is pretty long. If you think the trend isn't real, then are you saying you believe the list isn't growing? If you think it's imagination, maybe you could produce the list of serious commercial renderers that are not adding GPU support, and perhaps evidence they're not currently working on it.

RenderMan, Arnold, Blender, Vray, Modo, RedShift, Iray, Clarisse, KeyShot, Octane, VRED, FurryBall, Arion, Enscape, FluidRay, Indigo, Lumion, LuxRender, Maxwell, Thea, Substance Painter, Mantra... pretty sure there are whole bunch more... not to mention Unreal & Unity.

It's quite true that memory limits are a serious consideration. Which is why, currently, GPU renderers that swap aren't generally a thing. They will be in the future, but right now you get CPU fallback, not swap. So seeing the claim about swap in the comment makes it suspect. Memory limits will continue to be a factor for a while, despite the trend and various improvements. That doesn't change the trend. It means that preview is currently a bigger GPU workflow than final frame.


V-Ray on GPU will swap in the sense that it offloads textures out of the GPU and then re-uploads them later for another bucket while still rendering the same frame.

And you know, just because everyone is adding GPU support doesn't mean that professionals will turn their entire pipeline and render farms on their heads just to use it.

I acknowledge that they have GPU support and that some people like it, but I personally can usually not use it, so it is also not a purchase decision for me.

Plus, people already have large farms of high-memory high-CPU servers without GPUs, so switching would require lots of expensive hardware purchases.

And you usually render so many frames in parallel that it doesn't really matter if the single frame takes 5 minutes or 50 minutes. You just fire up 10x more servers and your total wait time remains the same.


> just because everyone is adding GPU support doesn't mean that professionals will switch their entire pipeline and render farms on their heads just to use it.

You’re right, it doesn’t. The fact is that it’s already happening with or without you. Widespread GPU support being added is a symptom of what productions are asking for, not the cause.


> The trend in renderers is extremely clear, nearly everybody is building a GPU version, and the makers and their users all report pretty big speedups with GPUs...

The 'trend' of the US government under president Trump is also extremely clear. Sorry, I couldn't resist. :)

TLDR; This 'trend' is not economically viable except for two parties. Makers of GPUs and companies renting out GPU rigs in the cloud.

Aka: it's just a trend. It's not that anyone sat down and really looked at the numbers. Because if they did, this trend wouldn't exist.

It's also history repeating itself for those that do not learn from it. It will not go anywhere. Mark my words.

I've been there, in 2005/2006, when NVIDIA tried to convince everyone that we should buy their Gelato GPU renderer. I can elaborate why that went nowhere and why it will go nowhere again. But it's a tad off topic.


Feel free to elaborate, I have no idea what your point is here or what you mean with your non-sequitur about the government or how that relates to developers of 3d renderers in any way. I don't know what you mean by "it's just a trend." The fact is that there's evidence for my argument, and you're attempting to dismiss it without any evidence.

Comparing Gelato to RTX seems bizarre, they're not related, other than that they're both Nvidia products. Are you trying to say you distrust Nvidia? RTX already is a commercial success, and there are already dozens of games & rendering softwares using RTX on the market and hundreds more building on it. RTX already went somewhere.


> Comparing Gelato to RTX seems bizarre, they're not related, [...]

I did not compare RTX with Gelato. Where do you find the word 'RTX' in any of my replies?

I compared Gelato as a GPU-based offline renderer with other contemporary GPU offline renderers.


I am not talking about games. I am talking about offline rendering only.

My point was that indeed bizarre things can become trends and it doesn't make them less bizarre.


Cycles does the same work regardless of whether it's running on the CPU or GPU. You can even run mixed where both the CPU and GPU do rendering. The amount of noise has nothing to do with it running on the GPU or not, that's simply how path-tracing works...


Maybe you misunderstood me. I didn't peg the amount of noise to CPU vs. GPU rendering.

I said that these renderers produce too much noise. All of them. And the solution is not denoising. The solutions are novel algorithms that produce faster convergence. There are other issues of course. Noise is just one.

What I implied was that the developers would better spend their time fixing these than trying to make stuff fit into a GPU.

This[1] is a comparison of light sampling in 3Delight, Autodesk's Arnold and Pixar's RenderMan (RMan).

TLDR; The bottom of the page has a 'Conclusion' section which is worth reading.

The page is from around 2017 if my memory serves me right and was a private page requiring a password at the time. Pixar asked 3Delight to not make it public until RMan 21 was out. After which 3Delight re-ran the tests and RMan came out even worse. They never published the results because they do not care about publicity.

Mind you, these were all pure CPU renderers at the time.

Few people know about this. What everyone is exposed to is just the marketing mumbo jumbo on the vendors websites.

As one can guess even skimming over this comparison, there are many things one must get right.

E.g. speed of convergence, i.e.: what is the threshold of samples I can get away with? Some renderers converge linearly: an image with twice as many samples will have half the noise. But others converge non-linearly in a good or a bad way (twice the samples give you more than twice the quality, or less than twice the quality).
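
A quick way to measure that empirically is to render (or here, estimate) many times at each sample count and look at the standard deviation. For an ideal Monte Carlo estimator the noise falls as 1/sqrt(N), i.e. four times the samples for half the noise; the toy integral below is only a stand-in for a renderer:

```python
import numpy as np

rng = np.random.default_rng(1)

def estimate(n):
    """Toy Monte Carlo estimate: integral of x^2 over [0, 1] with n samples."""
    x = rng.random(n)
    return np.mean(x * x)

for n in (64, 256, 1024, 4096):
    runs = np.array([estimate(n) for _ in range(2000)])
    print(f"n = {n:5d}   noise (std dev) = {runs.std():.5f}")
# For this ideal estimator each 4x increase in samples roughly halves the noise;
# correlated or badly weighted samples converge more slowly, while clever
# sampling of the right things can effectively do better.
```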

Another is bias: does the image look the same when it has fewer samples and you squint or is it darker/brighter than a reference that has 'infinite' samples?

Very important for artists doing look development: time to first pixel. The graphs on the website have not changed much with recent versions of Arnold & RMan, unfortunately.

Note that shaders wise 3Delight already used OSL (byte-code, run-time interpreted) whereas Arnold & RMan used C++ (at the time).

Basically, what I tried to say was: if I was Pixar or Autodesk (insert any other renderer vendor here that came from CPU land) and saw this page (which Autodesk & Pixar did then) ... considering that the competition is running on the same metal, a CPU – maybe I'd try to get to the same level of speed/quality before putting a bunch of new problems on my plate that come when you strive for making this all work on a GPU.

[1] https://www.3delight.com/documentation/display/3DSP/Geo+Ligh...


> And the solution is not denoising. The solutions are novel algorithms that produce faster convergence.

IMO, there are a few things you seem to be confused about.

1- There are no perfect sampling algorithms, the need for denoising will never go away. 3Delight's geo light sampling does not remove the need for denoising.

2- Denoising and better sampling are independent. You can do both, and everyone already is doing both.

3- 3Delight's geo light sampling is a pretty narrow use-case way to compare renderers. Games don't generally use geo lights at all, and films only rarely. What 3delight doesn't show you in that web page is how their sphere, quad & point light sampling algorithms compare to Pixar's & Arnold's, which is what everyone uses in practice. Spoiler alert: it's not better.

> Very important for artists doing look development: time to first pixel.

Your 3delight page doesn't say whether they're using binary or ascii .rib files. (How do I know it's being fair?) It doesn't compare realistic production scenes. It doesn't compare texturing quality, or how well the renderers perform under extreme memory usage. You are being sucked into some marketing, and missing the bigger picture.


1. Of course not. And I didn't say so.

2. I never contested that either. What I did say is that GPU porting of offline renderers is a waste of time better spent otherwise. And that denoising is solving a problem you would have less of if you didn't waste time with the former. And that 3Delight is proof of that.

3. I never said this was a broad use case. I was making a point in the context of ray-tracing which is the topic of this HNs discussion. But I do assure you: comparing these renderers by other means wouldn't make them look better.

> Your 3delight page doesn't say whether they're using binary or ascii .rib files.

This is completely unimportant. The resp. renderer plug-ins were used inside Maya and 'render' was pressed.

3Delight didn't need to generate RIB files. Neither did (P)RMan. The Ri API is a C binding that can talk directly to the renderer or spit out RIB. If Pixar uses this in their Maya plug-in, I dunno.

Every time I used RenderMan from inside Maya I was working at some facility that had their own exporter. Arnold: I have no idea if it writes .ass files or talks directly to the renderer.

In any case, if you checked the numbers on the page for building up the ray acceleration structure it would be obvious to you that any amount of file I/O with the example scene in question is negligible.

I know a bit about this. I was maintainer of the Liquid Maya rendering translator at Rising Sun Pictures and wrote the Affogato Softimage to RenderMan exporter there. I also wrote a direct in-memory export that made Nuke render with 3Delight using the Ri (called AtomKraft for Nuke) and was heavily involved in the Atomkraft for AfterEffects. So I can assure you RIB is not a factor in this comparison. Trust me. :)

Texturing quality is the same with the caveat that Pixar can only use power of two textures. Which can have adverse effects on memory use and all implications coming from that.

Meaning that if I have e.g. a 46k pixels per axis resolution texture I have to upsample that to 64k to get the same quality that I can get in 3Delight using ... well: a 46k texture.

Because the next closest power of two I could use with RMan, 32k, will mean I lose 14k of resolution (aka quality) per axis. And lastly, apart from that: if you compare texture cache efficiency between those two, I am happy to bet considerable money on who will come out on top. Guess. :]

> you are being sucked into some marketing, and missing the bigger picture.

What marketing? 3Delight doesn't have any and that page is not known and extremely hard to find even through google unless you enter exact search terms.

I do not need marketing to be a tad informed about these topics. I have been using Pixar's RenderMan for almost 20 years in production and 3Delight for a decade.


> What I did say is that GPU porting of offline renderers is a waste of time better spent otherwise.

Not true. You can get speedups of 10x-100x by moving to the GPU. There is no known alternative way to spend your time to achieve the same outcome with the same effort. Using ideal (unachievable) perfect importance sampling algorithms might give you 2x, in some cases, if you're lucky.

What is making you think that people aren't improving their rendering algos or renderers, or that GPUs are preventing other improvements?

> And that denoising is solving a problem you would have less off if you didn't waste time with the former. And that 3Delight is proof of that.

Not true. You can't get rid of the need for denoising when using Monte Carlo path tracing techniques, and 3Delight's geo light sampling does not in any way demonstrate that you can.


You get those 10x speedups only if stuff fits into GPU RAM. For movie cgi, that is usually not the case, meaning no GPU speedup.


That is currently changing with on-demand loading, streaming, and NVME technologies, larger GPU ram, and multi-GPU systems.


We are speculating here now. I thought this discussion was about the status quo, not what we imagine it will soon be.

Even if the bandwidth bottleneck gets solved in the future ... scenes from movie sets are usually big enough that transformations are stored as double-precision matrices to avoid objects starting to jitter when they are far away from the origin.

Have you checked double precision GFLOPs on your favorite GPUs lately? And then compared those and their prices to some Ryzen 3970X CPU specs and their prices?


You're wasting your own money if you buy a CPU based on fp64 flops and then start using any of the renderers you've cited so far.


Yeah, right. Numbers?


Which renderers use fp64?


All of them.

Just for example in the PRMan 3.9 release notes from 2001, it says under 'Miscellaneous Changes': "Some parts of the renderer now use double-precision floating point arithmetic, to avoid round-off error."[1]

Your model-to-camera matrix usually only needs to be single precision.

But your model-to-world matrix needs to be double precision to avoid jittering.

So to calculate the former you use double precision and then you can truncate the resulting matrix to single precision for use e.g. in shaders. Everything dandy by then, even for GPUs.

But first you need to get there somehow.

So if you have a gazillion instances in your scene, particles, blades of grass, leaves of trees, spaceships, whatever, you need a gazillion matrix multiplications with double precision to build your acceleration structure and to actually start generating pixels.

It's one of many reasons why GPU-based renderers' performance goes to shit, particularly on time to first pixel, when scenes of such complexity get thrown at them. Contemporary GPUs have comparatively shitty f64 performance.

Edit: added example PRMan changelog backing up the claim in the 1st sentence. This was most likely considering just xforms. For ray-tracing specific issues and f64 see e.g. the PBR book on solutions to ray intersection precision challenges.

[1] https://renderman.pixar.com/resources/RenderMan_20/rnotes-3....
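
A small numpy illustration of the double-then-truncate workflow described above, with made-up coordinates (a vertex with sub-millimetre detail on a model a million units from the world origin, and a camera a few units away from it):

```python
import numpy as np

def translation(x, y, z, dtype):
    """4x4 translation matrix."""
    m = np.eye(4, dtype=dtype)
    m[:3, 3] = [x, y, z]
    return m

# Made-up scene: model one million units from the world origin, camera nearby.
M2W = translation(1_000_000.0, 0.0, 0.0, np.float64)    # model -> world
W2C = translation(-1_000_003.0, 0.0, 5.0, np.float64)   # world -> camera
vertex = np.array([0.123456, 0.0, 0.0, 1.0])            # sub-millimetre detail

# Good: compose in double, then truncate the (small-valued) result to float.
M2C_f32 = (W2C @ M2W).astype(np.float32)
print("f64 compose, f32 shade: ", (M2C_f32 @ vertex.astype(np.float32))[0])

# Bad: go through world space in float32; the large coordinate eats the detail.
world_f32 = M2W.astype(np.float32) @ vertex.astype(np.float32)
print("all-f32 via world space:", (W2C.astype(np.float32) @ world_f32)[0])
# The second value is off by roughly a millimetre (if a unit is a metre),
# which shows up as jitter once the camera animates.
```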


> Not true. You can get speedups of 10x-100x by moving to the GPU. [...]

Ok, I'll take you at your word. Mind you, all the points I made were about offline rendering. I dare you to put your money where your mouth is.

Disney's Moana set[1] is a good, publicly available example. Show me how you get a 10× speedup using a GPU over 3Delight[2] with any commercial renderer out there. Really. I look forward to it. Note that I'm not holding you to your 100×; seeing a 10× speedup will be more than enough for me to accept your point.

Price of GPU and CPU rig need to be comparable. Electricity cost must be factored in too so if your render costs twice as much, rig plus running costs, you divide your results by that factor. And vice versa.

And as for your RIB comment earlier: don't worry -- we will add the scene parsing time to the finals; as we should.

Total time will be from render command issued to final image being available on disk.

And while this is an outdated example the scene complexity reflects what users of Houdini, even single artist types, are throwing at renderers nowadays. Or trying to.

> What is making you think that people aren't improving their rendering algos or renderers [...]

Uhhm, experience? I have been using offline renderers in production since 1994. And I see the rate of progress in commercial offline renderers drop in correlation with those people spending more effort on porting their stuff to GPUs. Sure, could be a coincidence and no causality at all. Maybe there are better explanations -- I'm all ears.

> Not true. You can't get rid of the need for denoising when using Monte Carlo path tracing techniques [...]

Says who? I would never make such a claim either way.

So before you mis-quote me again: I never said you can get rid of the need for denoising, and I never said you cannot get rid of the need for denoising for Monte Carlo path tracing. In fact I never even mentioned Monte Carlo path tracing.

Now going back to what I actually wrote: maybe re-read that. Notice the word I put in Italics so it was harder to miss?

Just for the record: you are mis-quoting me for the fourth time in this thread.

[1] https://www.technology.disneyanimation.com/islandscene

[2] Just an indication (white bars only) since this misses data on electricity costs and parsing time (I think, but I will verify with them): https://documentation.3delightcloud.com/display/3DLC/Cloud+R...


> I never said you cannot get rid of the need for denoising for Monte Carlo path tracing.

You claimed that investing in sampling algorithms would reduce the need for denoising. You claimed that 3Delight's sampling algorithm is proof of that. I disagree, I think denoising gives the same benefit to 3Delight as it does to Arnold & RenderMan, which is reduced time to a noise-free preview.

> In fact I never even mentioned Monte Carlo path tracing.

I don't even know what you mean by this, if we're not talking about Monte Carlo, why are you commenting on denoisers at all? You posted a link to 3delight's geo light sampling marketing comparison. You know that's a Monte Carlo method, right?

> Sure, could be a coincidence and no causality at all. Maybe there are better explanations -- I'm all ears.

Yes, that's right, it could be coincidence.

Do you follow Siggraph or rendering research, and have you noticed that the rate of progress in CPU rendering algorithms has slowed?

Do you have more examples of features that CPU-only renderers have that are surpassing everyone porting to GPU? I've seen only one example in this thread.

> you are mis-quoting me for the fourth time in this thread.

I don't believe that's true. I may be misunderstanding your points or summarizing you incorrectly or in a way you don't like. I am definitely disagreeing with some of your points and trying to explain why, but I don't believe I have mis-quoted you even once.

>> You can't get rid of the need for denoising when using Monte Carlo path tracing techniques [...]

> Says who? I would never make such a claim either way.

Wikipedia says so, and I'm happy to repeat it. Would you not make that claim because you haven't studied what Monte Carlo is, or because you know of advanced techniques that are noise free? There are no known noise free Monte Carlo rendering techniques except for on trivial and/or hypothetical scenes. The kind of scenes you're talking about always have randomness (noise), because the very name "Monte Carlo" is referring to random sampling.

I'm happy to explain more about why Monte Carlo comes with noise, in case it's not clear, or why denoisers will always be valuable in the context of Monte Carlo rendering methods...


> You claimed that investing in sampling algorithms would reduce the need for denoising. You claimed that 3Delight's sampling algorithm is proof of that.

One thing I can share is that 3Delight has found a way to do weighted filtering of pixel samples w/o introducing correlation. They do not publish any papers though.

Many other vendors of renderers have this issue and the noise this correlation produces is hard to remove.

But I'm sure you understand all that since you were offering to lecture me on Monte Carlo methods below. ;)

Because of this e.g. Intel Open Image Denoise needs uncorrelated pixels samples to work at all.

And the crazy thing resulting from this is that Intel ask you to turn off this sort of pixel filtering in your renderer. Which makes the image more noisy.

Aka: you are adding noise to be able to then denoise. More irony ... this time lost on advocates of denoising, I guess.

Quote from the IOID docs: "Weighted pixel sampling (sometimes called splatting) introduces correlation between neighboring pixels, which causes the denoising to fail (the noise will not be filtered), thus it is not supported."[1]

This is actually misinformation (or maybe trending fake news?). What they should have written is "We believe that weighted pixel sampling (sometimes called splatting) introduces correlation between neighboring pixels [in all renderers we know of] [...]"

To quote a 3Delight developer on this text: "That text from Intel, apparently the same as for RenderMan, is pure, unadulterated, pseudo-science. [...] 3Delight will work with [the] Intel denoiser [...]"[2]

Proof: you can feed a pixel filtered image from 3Delight into IOID and it will work as the samples are not correlated[3].

[1] https://openimagedenoise.github.io/documentation.html

[2] https://discordapp.com/channels/618404428776472581/618415516...

[3] https://cdn.discordapp.com/attachments/618415516641525791/70...
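
To make the "correlation between neighbouring pixels" point concrete, here is a toy numpy sketch: pure per-pixel noise stays uncorrelated under straight per-pixel averaging, while redistributing each pixel's estimate with an overlapping splat-style kernel couples neighbours. It says nothing about what any particular renderer's filter actually does, which is exactly the point in dispute here:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pure-noise "render": every sample is independent.
H, W, SPP = 128, 128, 8
box = rng.standard_normal((H, W, SPP)).mean(axis=2)   # one-sample-one-pixel filter

# Splat-style reconstruction: each pixel's estimate also bleeds into its
# neighbours with overlapping weights (wrap-around via roll, fine for a demo).
kernel = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float)
kernel /= kernel.sum()
splat = sum(kernel[dy + 1, dx + 1] * np.roll(box, (dy, dx), axis=(0, 1))
            for dy in (-1, 0, 1) for dx in (-1, 0, 1))

def neighbour_corr(img):
    """Correlation between horizontally adjacent pixels."""
    return np.corrcoef(img[:, :-1].ravel(), img[:, 1:].ravel())[0, 1]

print("per-pixel filter, neighbour correlation:", round(neighbour_corr(box), 3))
print("splat filter,     neighbour correlation:", round(neighbour_corr(splat), 3))
# A denoiser that models per-pixel noise as independent treats the second case
# as partly signal, which is the failure mode the Intel docs warn about.
```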


> You claimed that investing in sampling algorithms would reduce the need for denoising. You claimed that 3Delight's sampling algorithm is proof of that. I disagree,

Then we agree to disagree. I can back up all my claims with hard data. Can you?

3Delight's changelog gives you access to all versions of the renderer maybe back to 2006 or so. I think their path tracing core is only after 2016.

So we can download any 'sample', throw the same scene at it and calculate noise properties in the resulting image and see how they got better over time. Be my guest. I know the result and it will back up my claim. Cheers! :)

> You posted a link to 3delight's geo light sampling marketing comparison

Nice 2nd try but not biting. This is not marketing material. 3Delight doesn't employ a single marketing person and this was never published at the time when it was relevant (2017).

> I'm happy to explain more about why Monte Carlo comes with noise, in case it's not clear, or why denoisers will always be valuable in the context of Monte Carlo rendering methods...

Thank you for your kind offer, professor, no need. ;)

May I remind you that I said (analogously) that relying on denoisers for improvement is a bad idea not that denoisers are a bad idea per se?

But let's not try diverting this from you being put on the spot to prove your outlandish 10x+ speedup claim, shall we?

Or are you preparing an extra reply to my challenge?

1. GPUs are not faster than CPUs for offline rendering when you factor in costs.

1.1. They are not even faster in the general case when you just go by GFLOPS/US$. You have to look at specific cases to get a speed advantage (exactly my point: you spend R&D on making the general case work within the restrictions of the special case. Example: FP precision).

2. Time spent working around restrictions of GPUs (data transfer, memory size, precision) could be better spent at improving a CPU renderer. Proof: 3Delight.

3. GPUs are not a general alternative to CPU offline rendering because of aforementioned restrictions. That they work in specific cases means the only good reason to invest in R&D supporting them is if your target audience only ever needs to deal with those special cases. No such renderer (claiming to cater only for such audiences) exists. Thus there are no renderers out there which should focus their R&D on GPU or they are simply lying to their customers (and/or themselves).

> Do you follow Siggraph or rendering research, [...]

Yes.

> [...] and have you noticed that the rate of progress in CPU rendering algorithms has slowed?

No.

But I'm curious how you define a "CPU rendering algorithm". I'm all ears.


> I can back up all my claims with hard data.

That's where I think you're already off the rails, I don't think this particular point can be proven with hard data, it's about goals. Denoising helps whenever there's noise. 3Delight produces noise with low sample numbers, just like all other renderers. A denoiser will make that faster and enable quicker preview turnarounds.

Your argument is that 3Delight's algorithm makes final frame rendering faster, which I'm not arguing with. If it takes longer than 5 seconds to get to final frame, it doesn't matter whether it's faster than others, denoising will provide the same benefit.

> I'm curious how you define a "CPU rendering algorithm". I'm all ears.

I was really referring to just rendering algorithms in general. There were a lot of papers on how to speed up rendering algorithms in the 80s and 90s, and the speedups were large. A lot fewer in the 2000s and 2010s, and the speedups are no longer multiples, but more often measured in percents. Based on your claims, I'd think you'd have noticed this trend. It's directly related to the rate of progress of features and speed that commercial renderers are releasing.

Feel free to provide sources on flops/$ for fp32. (BTW, talking flops/$ ignores RTX.) First Google result disagrees with you https://aiimpacts.org/current-flops-prices/#Graphics_process...

Saying that GPUs aren't a general alternative is true but ignoring the trend. They used to be much less of an alternative. They are currently a fast alternative for some workflows, especially preview. They will in the future become more and more of a general alternative.


> Feel free to provide sources on flops/$ for fp32. [...]

Ok, early morning maths ... been coding for 10 hours but I try:

A Quadro RTX 8000 has about 510 f64 GFLOPS and 16,320 f32 GFLOPS according to Wikipedia[1]. It costs US$5,500.

A Ryzen 3970X has about 1,320 f64 GFLOPS according to this test[2] and 1,900 f32 GFLOPS. It costs US$2,000

Now a Super Micro board with two EPYC sockets is about US$700 according to Google.

So for US$4,700 I get a CPU compute rig with 2,640 f64 GFLOPS. Aka: more than five times as much as the Quadro 8k RTX rig -- at US$800 less.

Let's say, we are being unfair and we limit our scene complexity to one that only requires single precision (f32) floating point operations (see my reply above). That's btw one of those "GPU restrictions" renderer developers then try to work around that I talked about earlier.

I guess I am trying to say: without going single precision GPUs can not even beat CPUs -- they basically suck.

Now with single precision the Quadro 8k RTX is way ahead of the CPU. At 16,320 f32 GFLOPS it is a bit more than four times the compute power of the CPU rig, which clocks in at 2×1,900 = 3,800 GFLOPS of f32 precision (mind you, just four times; that's why I said if you can even prove a 10x speedup of GPU vs CPU in offline rendering, I'll take your point).

Now looking at actual rendering time ... from launching a render to getting an image on disk .. can the GPU we're looking at ever be four times as fast as the US$700 cheaper CPU rig with this constraint to f32?

Absolutely. Most definitely. I would never contest this.

Will these constraints satisfy the needs of people doing professional offline rendering today? Definitely not. Shoveling those scenes onto the GPU will eat up even a speed advantage as low as 4x.

Let's not even talk about electricity costs. The AMD CPU draws 35W under full load. So that's 70W for two. The Quadro 8k RTX draws 260W. And this is not considering the AC you need to buy, maintain and power to cool a farm of these beasts.

People who say GPUs will hit it big in offline rendering "soon" are missing important details.

And as someone who, once in his life already, had to go through convincing people of the opposite to prevent disaster later on a production, I mean this in a very caring way.

Lastly: at that time I just talked about, everyone was saying GPUs would "soon" play a bigger role. Then over ten years passed.

In 2016 two friends of mine, the lead developer of 3Delight and a famous CG researcher who works at NVIDIA were chatting over beers at my house in Berlin. And the latter guy asked the 3Delight dude why they are not investing in GPU research.

And the 3Delight dude said: "Because they asked us the same then. And ten years later we're still faster than any GPU renderer out there for the customers we target. And we don't see that changing."

Now it's 2020 and it's still spot on.

[1] https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_proces...

[2] https://www.pugetsystems.com/labs/hpc/AMD-Threadripper-3970x...
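
For what it's worth, the per-dollar version of that arithmetic as a tiny script; the GFLOPS and price figures are the ones quoted in this comment, not independently verified, and they age quickly:

```python
# Flops-per-dollar, using the figures quoted above (not independently verified).
parts = {
    "Quadro RTX 8000":       {"f32": 16_320, "f64": 510,   "usd": 5_500},
    "2x 3970X + dual board": {"f32": 3_800,  "f64": 2_640, "usd": 4_700},
}
for name, p in parts.items():
    print(f"{name:22s}  f32: {p['f32'] / p['usd']:.2f} GFLOPS/$   "
          f"f64: {p['f64'] / p['usd']:.3f} GFLOPS/$")
```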


The 3970x has a 280W TDP. It draws nowhere near 35W under full load. It's been measured in some benchmarks to reach 450W [0].

Luckily for blender we also have a great set of benchmarks with comparable results: https://opendata.blender.org/. While the 3990x tops the charts, even the 2070S beats the 3970x. That's more value, more performance, more efficiency for less cost.

> People who say GPUs will hit it big in offline rendering "soon" are missing important details.

At least for Cycles, it's already a hit. There are plenty of companies offering rendering services and all the ones I've found offer either both or exclusively GPU rendering.

[0] https://www.phoronix.com/scan.php?page=article&item=amd-linu...


I would put Blender clearly into "hobbyist" territory, so whether or not it works well on GPU won't really affect the next Pixar movie ;)


Ah yes, Blender the "hobbyist" tool that's also:

* Used by NASA for a lot of their 3d modelling

* Used in Spider Man 2 for storyboarding

* Used for some special effects in Red Dwarf

* Used in Captain America: The Winter Soldier for previsualization

* Rendered the entirety of the movie Next Gen

The list goes on and grows every year. No, it won't affect the next Pixar movie, but if that's your standard then 3Delight sounds like just as much of a hobbyist thing.


Are you just blindly arguing against anything and everything said in favor of GPUs? Why are you stooping to pick on Blender? It is being used in production, so call it what you want.

Pixar is working on a GPU renderer, they’ve been discussing it for a couple of years.


I asked about fp32. You picked one of the most expensive GPUs NVIDIA makes, and one not built for fp64, in order to compare fp64 flops/dollar, without explanation... to artificially justify using two CPUs instead of one? Your numbers here are pretty contrived. The RTX8000 is not the best flops per dollar for either fp32 or fp64 by a long way. Try again.


I picked a rig that freelancers using Redshift and the like buy to 'render faster'. I hang out on the resp. Discord servers a lot where these things get discussed. So from what I read there this comparison is spot on as far as the current market goes.


Rationalize it however you like, the conclusion you jumped to using your cherry picked numbers above is off by an order of magnitude.


I think the implication is that GPU rendering ends up using lower-precision floating point numbers than are available on modern CPUs, which leads to more rounding errors.


No, CPU renderers also use IEEE single precision floats for performance. Unless you do something really absurd[0], single precision has more than enough range and resolution to not lead to any noticable inaccuracies in the result.

[0] like trying to render an insect eye to scale while having everything shifted 1000km from the origin.


While that is in general correct, most GPUs do not fully adhere to IEEE or they do not calculate at full resolution.

It's a well-documented fact that the same floating point operations on a GPU will have less precision than on CPU, despite both using the same IEEE storage format.

https://developer.nvidia.com/sites/default/files/akamai/cuda...


That paper does not say what you claim it does, it is not showing that GPUs are computing at lower resolution, nor that CUDA is failing to adhere to the IEEE 754 standard.

In at least two cases, it says the opposite of what you just claimed:

FMA on the GPU was a way to achieve higher float precision than you could with CPU math, in 2011 when the paper was written. That is changing now with CPU adoption of fma, but that doesn't change the claims of the paper.

The GPU's strict adherence to IEEE 754 is mentioned multiple times.

"The same inputs will give the same results for individual IEEE 754 operations to a given precision on the CPU and GPU. As we have explained, there are many reasons why the same sequence of operations may not be performed on the CPU and GPU. The GPU has fused multiply-add while the CPU does not. Parallelizing algorithms may rearrange operations, yielding different numeric results. The CPU may be computing results in a precision higher than expected. Finally, many common mathematical functions are not required by the IEEE 754 standard to be correctly rounded so should not be expected to yield identical results between implementations."


As far as I can tell, Pixar only started using GPUs for previsualization around 2013 or 2014. It looks like RenderMan didn't have real GPU support until 2018.

They've never actually confirmed moving from CPUs to GPUs rendering for their final production renders.

So, yeah, I think you're on to something.


> I'd gladly trade that for more detailed models and larger, more complex environments for instance.

No doubt nVidia could make a card that enabled games to support models with double the polycount of today's games, but if companies refuse to ship games with them because the art effort is too great then it's not going to get players to update their PCs. I suspect the push towards raytracing is a way of selling more graphics cards without radically increasing the amount of investment needed by games companies in the way that higher fidelity models or more complex environments would.


> No doubt nVidia could make a card that enabled games to support models with double the polycount of today's games [...]

The solution is to use tessellation on the GPU, on demand, based on projected size of features.

In blockbuster VFX, heavy models are a recent phenomenon because micropolygon renderer cores were replaced with ray-tracing ones.

Models used to be much lighter: they were subdivided at render time, and displacement, both textured and resolution-unlimited procedural, added the final details.

The same offline rendering techniques that enabled the VFX of the 90's and early 2000's could be utilized in GPUs today to push geometric detail to the next level in realtime graphics.
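
A rough sketch of the "tessellate based on projected size" idea, with made-up camera parameters: pick a subdivision level so that patch edges end up around a pixel on screen:

```python
import math

def projected_size_px(world_size, distance, fov_y_deg, screen_height_px):
    """Approximate on-screen size (pixels) of a feature of world_size at distance."""
    frustum_h = 2.0 * distance * math.tan(math.radians(fov_y_deg) / 2.0)
    return world_size / frustum_h * screen_height_px

def subdivision_level(patch_size, distance, fov_y_deg=60.0, screen_h=1080,
                      target_px_per_edge=1.0, max_level=8):
    """Halve edge length (one subdivision level) until edges are ~target_px_per_edge."""
    px = projected_size_px(patch_size, distance, fov_y_deg, screen_h)
    level = max(0, math.ceil(math.log2(max(px / target_px_per_edge, 1.0))))
    return min(level, max_level)

for d in (1.0, 10.0, 100.0):
    print(f"distance {d:6.1f} -> subdivision level {subdivision_level(1.0, d)}")
```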


Micropolygons work beautifully for local-only shading, but that is a very limited model and a thing of the past. The mileage that you can get out of this model, which was designed for severely memory-constrained systems (high-resolution cinematic animations on computers with 4MB of RAM), is enormous.

But once you bounce rays randomly through a geometrically complex scene, you either re-tessellate all possibly intersected geometry dynamically, again and again, with considerable runtime overhead, or you pre-tessellate the entire scene based on camera position and film resolution. The latter approach is the basis for VFX at the moment, but some renderers like Renderman can still do dynamic on-demand tessellation where required, AFAIK.


Have a look at the game Control. I think this is a game where they've used the technology really well. Maybe the office-building location is a good fit for raytracing, as it's full of reflective surfaces, but the graphics feel substantially better with it switched on. Remedy also managed to maintain a reasonable framerate on lower-end RTX hardware. Other games use it less impressively, however, with only minor changes to the shadows. It obviously requires some thought before it can be used well.


Control has fantastic raytraced reflections. I don't want to go back to faked reflections based on screen space hacks and cube maps.

The game gave me a few accidental jump scares because I repeatedly registered movements in glass windows that were really just a vague reflection of something else in the scene, e.g. the face of the player character, whom you usually see only from behind.


Character animation is still a field that is lacking, compared to otherwise more and more realistic environments. So ray tracing may help make interiors and exteriors really impressive, but you'll still be stuck with characters that have the same looping animation when they run and move around.


A lot of computational work yes (with the currently available hardware at least), but many visual effects are much much easier to implement with ray tracing than with rasterization techniques. Dynamic off-screen reflections, soft shadows, ambient occlusion, dynamic global illumination, caustics etc. can all be implemented more easily and realistically with ray tracing than with previous techniques.


I think Minecraft RTX pretty much seals the deal: ray tracing will definitely be the future of lighting in real-time computer graphics.


I think the most important thing to do with RTX will be indirect illumination. Maybe RTX will be capable of computing relatively high-quality lightmaps on the fly, storing the results for as long as necessary, and recomputing them for new parts of the scene.
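One way that could look, as a hedged sketch: blend freshly traced indirect lighting into a persistent per-texel buffer so results are cached across frames and only drift when the lighting changes. The names and the blend-factor approach here are illustrative, not any particular engine's implementation:

    #include <cstddef>
    #include <vector>

    struct Color { float r = 0, g = 0, b = 0; };

    // Blend this frame's (noisy) traced indirect lighting into a persistent
    // lightmap, so results are reused across frames and only slowly refreshed.
    void accumulateLightmap(std::vector<Color>& lightmap,
                            const std::vector<Color>& tracedThisFrame,
                            float blend = 0.05f)   // small blend = slow, stable refresh
    {
        for (std::size_t i = 0; i < lightmap.size(); ++i) {
            lightmap[i].r += blend * (tracedThisFrame[i].r - lightmap[i].r);
            lightmap[i].g += blend * (tracedThisFrame[i].g - lightmap[i].g);
            lightmap[i].b += blend * (tracedThisFrame[i].b - lightmap[i].b);
        }
    }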


Just wondering: would it be possible to implement this in an FPGA for ioquake3?

https://researchspace.auckland.ac.nz/handle/2292/36394


Same. You can always spend those transistors/cycles/watts on something else that is more compelling (more frames, better AA, or higher render resolution). For most games, doing 4K/120 without raytracing would probably be a better experience than 4K/60 with raytracing.


Ray tracing hardware is not powerful enough yet, nor are games yet built from the ground up with ray tracing in mind. With the newer-generation consoles expected to arrive with ray tracing, that will change, and then we will probably get to see something different. Personally, I feel ray tracing will make the most difference in VR rather than in normal games.


What APIs are being used to perform ray tracing in these new games that support it?

Edit: Found a starting point of an answer, covering Nvidia’s hardware at least, at https://developer.nvidia.com/rtx:

> Ray tracing acceleration is leveraged by developers through NVIDIA OptiX, Microsoft DXR enhanced with NVIDIA ray tracing libraries, and the upcoming Vulkan ray tracing API.


I wonder if the same concept of running the output image through a fast neural network could be applied to typical rasterized game engines. For instance, applying antialiasing, anisotropic filtering, or shadow blurring?


This is more or less what DLSS is


I watched some videos comparing ray-traced and rasterized graphics, and in real life the benefit seems negligible. On top of that, rasterized graphics use all kinds of smart techniques to improve performance. Ray tracing the whole scene is a kind of brute-force, brick-and-mortar method. It should be used selectively and not on every frame, IMO.


The problem with comparison videos is that most of them are focused on comparing games with RTX on/off. But this means that the game is developed with raytracing as an afterthought: it cannot depend on it for core features, because the majority of consumers don't have it yet.

Imagine a shooter with, say, a subway-entrance level. Initially the indoor part is lit by overhead lamps. During a shootout, the defending team can destroy those to provide cover, while they can still see the backlit silhouettes of people walking in. The attackers might choose to park a semitruck at the entrance to provide some cover in shadow.

Or imagine the end of an escalator. How do you watch for people coming up? Obviously, looking directly into the shaft exposes you to anyone coming in, but perhaps you can use the slight reflectivity of the marble floor to watch it indirectly.

Should you use the east or west side of a valley to approach a target? Well, that depends on the position of the sun, of course; you want to hide in the shadow. But what if a day only lasts 5 minutes on your world? Shadows are constantly changing, and so are the hiding places.

Want to hide in a room? Turn off the lights to make the window into a one-way mirror. Be sure to turn off your laser pointer and flashlight, though!

And have you seen the mirror scene in John Wick (https://www.youtube.com/watch?v=7-TZCEyok_o) ? Try doing that with a rasterizer!

So no, using it selectively isn't the way forward, if you ask me. We'll only see the full magic of it if we're able to use it fully in every scenario. We'll only figure out how to use it when we see how gamers interact with the effects of it. All the things I described above aren't something the developer explicitly programmed, they're just properties inherent to raytracing left for the player to discover. Only after a few years of that will developers be able to fully make use of the benefits it provides.


From the fundamental point of view, the cost of raytracing grows proportional to the number of pixels rendered times log(scene size), whereas rasterization grows linearly with the scene size. So raytracing enables much more complex scenes at a fraction of the cost of rasterizing, which explains why it's being used more and more in real time rendering.


For others interested in why ray tracing scales O(log N), this is covered in an earlier video in the NVIDIA "Ray Tracing Essentials" series, specifically this one:

https://youtu.be/ynCxnR1i0QY?t=173

It's timestamped to the discussion of why this is true, but the whole video (like the series, IMO) is very informative, this one focused on "Rasterization vs Ray Tracing".


For someone who doesn't want to watch: the essence of it is that world space is divided into a tree-like structure, which makes traversing the scene as costly as traversing the tree, i.e. log(n) operations.
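For the curious, here's a minimal sketch (illustrative types, not a production BVH) of why the tree helps: whole subtrees whose bounding box the ray misses are never visited, so a reasonably balanced hierarchy over N primitives costs roughly O(log N) node visits per ray.

    #include <algorithm>

    struct Aabb { float lo[3], hi[3]; };
    struct Ray  { float o[3], d[3]; };

    // Standard slab test: does the ray enter the box for some t in [0, tMax]?
    bool hitBox(const Aabb& b, const Ray& r, float tMax) {
        float t0 = 0.0f, t1 = tMax;
        for (int i = 0; i < 3; ++i) {
            float inv = 1.0f / r.d[i];
            float tNear = (b.lo[i] - r.o[i]) * inv;
            float tFar  = (b.hi[i] - r.o[i]) * inv;
            if (tNear > tFar) std::swap(tNear, tFar);
            t0 = std::max(t0, tNear);
            t1 = std::min(t1, tFar);
            if (t1 < t0) return false;
        }
        return true;
    }

    struct Node {
        Aabb box;                                   // bounds of everything below this node
        const Node* child[2] = {nullptr, nullptr};  // both null for a leaf
        int primitive = -1;                         // leaf's primitive index
    };

    // Count the nodes a ray actually touches; subtrees whose box the ray misses
    // are pruned without ever being descended into.
    int nodesVisited(const Node* n, const Ray& r, float tMax) {
        if (n == nullptr || !hitBox(n->box, r, tMax)) return 0;
        if (n->child[0] == nullptr && n->child[1] == nullptr) return 1;  // leaf: intersect primitive here
        return 1 + nodesVisited(n->child[0], r, tMax) + nodesVisited(n->child[1], r, tMax);
    }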


This is only true from a naive approach to visibility. With rasterization you can rasterize geometric values like normals and positions to an image buffer and shade only what is visible. Rasterization in old versions of Renderman would do a lot of shading that was thrown away once scenes became more complex. Real-time games eventually followed a similar pattern and now do "deferred shading".

Rasterization can also be used to cull polygons that aren't visible when they are going to end up hidden by opaque objects.

In theory ray tracing the first hit visibility can scale better, but in practice that part isn't a big deal and rasterization will probably win anyway. Not only that, but the idea that more polygons will make something look better is another trap. High quality lighting and high resolution textures become more important once polygonal geometry has enough polygons to not have faceting artifacts.


You make some good points, but culling doesn't really work for complex on-screen geometry like foliage, hair, volumetrics, crowds, etc., and games are pushing in those directions.

Raster is currently cheaper for well tuned video game assets, that is true, but is not cheaper for film quality assets, and that’s where games want to go in the future.

Filtering is a very important issue, and you’re right that more polys is not automatically better from a quality standpoint. Still, ray tracing can enable rendering of geometric complexity that is not possible with raster in real time, as long as you have a way to filter. Games aren’t yet really pushing on the boundaries of instancing, but they will down the road.

The other appeal of ray tracing is being able to consolidate all the various tricks and algorithms into a single path-tracing framework. Ray tracing architectures don't need deferred shading (though they may choose a similar wavefront approach), and they don't need a collection of difficult, separate, ad-hoc methods for each effect they want (shadows, AO, reflections, indirect lighting, volumetrics, etc.). Ray tracing gives you a unified framework where new lighting effects are, generally speaking, easier to add.


> but culling doesn’t really work for complex on-screen geometry like foliage, hair, volumetics, crowds, etc, and games are pushing in those directions.

Just because something doesn't work in all situations doesn't mean it doesn't work in general. Games rendering polygonal leaves and crowds full of furry creatures would be a waste; it isn't going to happen without level of detail.

Rasterization is actually cheaper for film-quality assets too, but camera-visibility rays aren't where most of the time is spent. Still, architectures like REYES/Renderman decouple all the sampling required for motion blur and defocus, so the visibility samples don't have to be shaded.

> The other appeal with ray tracing is being able to consolidate all the various tricks and algorithms into a single path tracing framework.

My reply was about ray tracing not actually being faster than rasterization for first hit visibility in practice. I didn't say anything about ray tracing in general here.


The problem is that although it scales better than rasterization, the computations involved in tracing are much more expensive (on current hardware at least).

Another big drawback of ray tracing is that you can't use conventional occlusion and view frustum culling [1] with it. So significantly reducing the scene size, as you do in a rasterization renderer, is just not possible. If you have access to the GDC Vault I can recommend DICE's 2019 talk about reflections in Battlefield V. The slides are available for free [2].

[1] https://media.giphy.com/media/xUPGcgiYkD2EQ8jc5O/source.gif From the game Horizon Zero Dawn. Only geometry that might end up on the screen is actually sent through the rasterization pipeline.

[2] https://gdcvault.com/play/1026282/It-Just-Works-Ray-Traced Talk about culling starts on slide 56. Occlusion and Frustum culling aren't an option, so new techniques had to be developed.
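For context, here's a minimal sketch of the kind of view-frustum test a rasterizer relies on (illustrative names, a simple sphere-vs-plane check); with ray tracing, off-screen geometry can still show up in reflections and shadows, so this wholesale rejection is no longer safe:

    struct Plane  { float nx, ny, nz, d; };      // plane equation n·p + d = 0, normal pointing inward
    struct Sphere { float cx, cy, cz, radius; };  // bounding sphere of an object

    bool insideFrustum(const Sphere& s, const Plane (&frustum)[6]) {
        for (const Plane& p : frustum) {
            float dist = p.nx * s.cx + p.ny * s.cy + p.nz * s.cz + p.d;
            if (dist < -s.radius) return false;  // completely behind one plane: cull it
        }
        return true;                             // potentially visible: send it down the pipeline
    }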


The main benefit of raytracing is primarily for the CG artist pipeline/workflow. Complex, realistic lighting effects just drop out of the integrator, and changing scene lighting becomes much more akin to what a lighting director does on a movie set. Whereas with rasterization, a 3D artist has to know the correct hack (i.e. "smart technique") to use in each situation, along with the failure modes of that hack.


If we had enough power, though, we could do away with the performance hacks and just use one "correct" way to draw the scene. Combined with accurate representations of materials, this would make building scenes easier and result in more coherent worlds.


All kinds of smart techniques are emerging for ray tracing, too; you don't have to shoot the rays directly from the camera per se. Both rasterization and ray tracing will coexist to provide better graphics; it's just that ray tracing hasn't had the place it deserves on the chip until now.


While this works OK-ish for static images, it is almost unusable for animations. But for static images, blur+sharpen in Photoshop has worked OK-ish, too, for many years.

So the practical benefit of this is negligible.

What we need is a denoising technique where the denoising artifacts move convincingly with the features of the scene, so that you can use it for movies and games.
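That is roughly what temporal reprojection tries to do. A hedged, CPU-style sketch of the core step (the per-pixel motion vectors and the blend factor are illustrative, not any particular vendor's denoiser):

    #include <algorithm>
    #include <vector>

    struct Color  { float r = 0, g = 0, b = 0; };
    struct Motion { float dx = 0, dy = 0; };   // screen-space offset to last frame's position

    // Fetch where this surface was last frame and blend the history with the
    // new noisy sample, so the smoothed result follows the scene's motion.
    Color temporalDenoise(const std::vector<Color>& history,   // previous denoised frame
                          const std::vector<Motion>& motion,   // per-pixel motion vectors
                          Color noisySample, int x, int y,
                          int width, int height, float blend = 0.1f)
    {
        int px = std::clamp(x + static_cast<int>(motion[y * width + x].dx), 0, width - 1);
        int py = std::clamp(y + static_cast<int>(motion[y * width + x].dy), 0, height - 1);
        const Color& prev = history[py * width + px];
        return { prev.r + blend * (noisySample.r - prev.r),
                 prev.g + blend * (noisySample.g - prev.g),
                 prev.b + blend * (noisySample.b - prev.b) };
    }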

Oh, and for games, as long as this is NVIDIA-exclusive, developers have to treat it as an optional add-on. For multiplayer games, that implies that ray tracing may never show details (such as a reflection of an enemy) that would give a strategic advantage.

Plus, the real issue with contemporary game development is that consoles make the majority of the revenue (due to less piracy), but they choke when you have 50k+ polygons on an animated character. And you'll be limited to 2 GB of GPU RAM on 30% of your PC player base, because they use laptop GPUs.

In the end, then, you usually don't have enough detail to make ray-tracing look good. It looks amazing for high-poly curved surfaces, such as those used for offline-rendered cinema movies. But on a blocky realtime game model, ray-tracing may also highlight artifacts.

Here's a ray-traced low poly bunny: https://i.imgur.com/MGotRC7.png

Notice how clearly you can see that it is low poly. In a rasterization engine, one would "fix" this by blurring the edges with shaders and bending the corners with normal maps.

So in a sense, ray-tracing is too honest to work well with current video game models.


>Oh and for games, as long as this is NVIDIA-exclusive

It isn't. [0][1]

The APIs are intelligently laid out such that hardware-accelerated raytracing can be used through popular APIs regardless of GPU vendor, provided the vendor has implemented their drivers correctly.

0. https://devblogs.microsoft.com/directx/announcing-microsoft-...

1. https://www.khronos.org/blog/ray-tracing-in-vulkan


At the moment, there is no AMD GPU which supports the full ray-tracing spec. So effectively, it is NVIDIA-only.


There's no reason why ray-tracing can't work with bent normals. That's how Blender's "smooth shading" works, and Blender uses raytracing for rendering.
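A minimal sketch of what that looks like at a ray hit (illustrative code, not Blender's): the shading normal is interpolated from the triangle's vertex normals using the hit's barycentric coordinates, so a coarse mesh shades as if it were curved.

    #include <cmath>

    struct Vec3 { float x, y, z; };

    Vec3 normalize(Vec3 v) {
        float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
        return {v.x / len, v.y / len, v.z / len};
    }

    // n0, n1, n2: per-vertex normals; (u, v): barycentric coordinates of the hit
    // point, as returned by the ray-triangle intersection test.
    Vec3 shadingNormal(Vec3 n0, Vec3 n1, Vec3 n2, float u, float v) {
        float w = 1.0f - u - v;
        return normalize({w * n0.x + u * n1.x + v * n2.x,
                          w * n0.y + u * n1.y + v * n2.y,
                          w * n0.z + u * n1.z + v * n2.z});
    }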


That is correct for Cycles, but the kind of raytracing that is possible with hardware acceleration inside a GPU is severely restricted in comparison to a general-purpose CPU raytracer like Blender's Cycles.

I believe Eevee, Blender's real-time GPU renderer, also does not currently support having reflection rays collide with surfaces generated by normal/displacement operations; instead, it approximates the flight path using the unmodified geometry.


Cycles also runs on GPUs.


But without using the ray-tracing hardware advertised in the article.


You said, "the kind of raytracing that is possible with hardware-acceleration inside a GPU is severely restricted in comparison to a general-purpose CPU raytracer like Blender's Cycles." This is manifestly incorrect.


You're a decade behind on consoles; 50k animated triangles haven't been an issue for years. Even early PS4 games had more triangles per character, in some cases more triangles just for hair, let alone the rest of the character.


"Tom Clancy's Rainbow Six Siege" 40,000 Triangles + 3 2048 x 2048 Texture Maps https://www.artstation.com/artwork/0eKg8

"Fortnite" LOD 0: 23,041 Triangles https://i.imgur.com/y8fOip0.png

I'll give you that a purely GPU-skinned scene on PS4 can go up to 700k triangles in total, but that means 14 or fewer characters visible at a 50k budget.

So for a game like Assassin's Creed, the per-character and per-item poly counts need to be significantly lower, to make sure the combined sum of the scene is still manageable.


Just as an explanation, I was referring to "Denoising for Ray Tracing", because that's what the link points to.

I do not oppose ray-tracing in general, but merely the over-hyped denoiser.



