Breakdown of a Simple Ray Tracer (nyu.edu)
389 points by rhema on Feb 3, 2017 | 48 comments



This is pretty cool. I've written a couple of toy ray tracers, but always in a language that runs on a CPU, so it's interesting to see how you would do it on a GPU. And I noticed that this is from the Ken Perlin, which is cool - I'm mostly just used to seeing his name as the "Perlin" in Perlin noise.

There's one thing I'm curious about - does anyone know why he's taking the square root of the color to produce the final pixel color?


He's assuming lighting is in linear space and doing gamma correction at the end of the shader. Sqrt is a faster approximation of pow(c,1.0/2.2).

I first saw this trick in writing in this article (by the co-creator of shader toy). http://www.iquilezles.org/www/articles/outdoorslighting/outd....

For the reasons why linear space lighting is important, read GPU Gems 3, chapter 24 (http://http.developer.nvidia.com/GPUGems3/gpugems3_ch24.html). The gist of it is that gamma space lighting tends to look unnatural and blown out, and it becomes more of a problem the more math you do in the lighting (adding specular, multiple lights, etc.).
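
A quick numeric check of the sqrt approximation mentioned above, as a minimal Python sketch rather than anything taken from the tutorial. The function names are made up for illustration, and the plain 1/2.2 power curve is itself only an approximation of the real sRGB curve:

```python
import math


def gamma_22(x: float) -> float:
    """Encode a linear value in [0, 1] with a plain 1/2.2 power curve."""
    return x ** (1.0 / 2.2)


def sqrt_approx(x: float) -> float:
    """Encode with a square root, i.e. a gamma of exactly 2.0."""
    return math.sqrt(x)


# Compare the two curves on a coarse grid of linear intensities.
samples = [i / 100.0 for i in range(101)]
max_error = max(abs(sqrt_approx(x) - gamma_22(x)) for x in samples)

for x in (0.01, 0.1, 0.25, 0.5, 0.9):
    print(f"linear={x:.2f}  sqrt={sqrt_approx(x):.3f}  pow(1/2.2)={gamma_22(x):.3f}")
print(f"largest difference on the grid: {max_error:.3f}")
```

On this grid the two curves stay within a few hundredths of each other, with the gap largest in the darker tones.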


A small note: linear space lighting is important because the lighting equations are linear with respect to physical quantities (unless you specifically introduce materials with non-linear response, which are quite rare). So simply

  intensity(pixel)=intensity(pixel illuminated by source 1)+intensity(pixel illuminated by source 2)+...
It's perception that is non-linear, so you can separate it as a last non-linear step instead of always calculating

  perceived_intensity(pixel)=pow(pow(perceived_intensity(pixel illuminated by source 1),1/x)+pow(perceived_intensity(pixel illuminated by source 2),1/x)+...,x)
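
To make the difference concrete, here is a small Python sketch that adds two light contributions both ways. It assumes a plain power curve with gamma 2.2 as the encoding; `GAMMA`, `encode`, `decode` and the light values are illustrative choices, not something from the tutorial:

```python
GAMMA = 2.2  # assumed display gamma, for illustration only


def decode(encoded: float) -> float:
    """Gamma-encoded value -> linear intensity."""
    return encoded ** GAMMA


def encode(linear: float) -> float:
    """Linear intensity -> gamma-encoded value."""
    return linear ** (1.0 / GAMMA)


# Two light sources, each contributing 0.2 in linear units to one pixel.
light_1 = 0.2
light_2 = 0.2

# Sum in linear space, then encode once at the very end.
correct = encode(light_1 + light_2)

# Sum the already-encoded values instead: the result is far too bright.
naive = encode(light_1) + encode(light_2)

# If only encoded values are available, each one has to be decoded
# before the sum and the total re-encoded afterwards.
round_trip = encode(decode(encode(light_1)) + decode(encode(light_2)))

print(f"linear sum, encoded once: {correct:.3f}")
print(f"sum of encoded values:    {naive:.3f}")
```

The second number comes out much closer to 1.0, which is the "blown out" look described earlier in the thread.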


> It's perception that is non-linear,

Since perception happens when you look at the image on your screen, applying an explicit "perception" step in rendering is actually counterproductive, since those nonlinearities would compound.

Gamma correction is used to allow for better contrast resolution for the brightness levels where the eye is most sensitive, when you have to compress your color values down to 8 bits per channel. The display then inverts this to produce ordinary linear space intensities.
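
To put a rough number on "better contrast resolution", this Python sketch counts how many of the 256 possible 8-bit codes land in the darkest 5% of linear intensity, with and without a gamma curve. The 2.2 exponent and the 5% cutoff are arbitrary assumptions picked for illustration:

```python
GAMMA = 2.2        # assumed encoding gamma
DARK_LIMIT = 0.05  # "dark" means linear intensity <= 5%, an arbitrary cutoff


def codes_below(limit, decode):
    """Count 8-bit codes whose decoded linear intensity is <= limit."""
    return sum(1 for code in range(256) if decode(code / 255.0) <= limit)


# Codes stored as linear values: the code is the intensity.
linear_codes = codes_below(DARK_LIMIT, lambda v: v)

# Codes stored gamma-encoded: the display decodes them with v ** GAMMA.
gamma_codes = codes_below(DARK_LIMIT, lambda v: v ** GAMMA)

print(f"codes covering the darkest 5%, linear storage: {linear_codes}")
print(f"codes covering the darkest 5%, gamma storage:  {gamma_codes}")
```

Under these assumptions the gamma-encoded format spends several times as many codes on the dark range, which is where small steps are easiest to see.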


there's more to the story than that, even. there's also the part where old CRT monitors had nonlinearity of similar exponent (or maybe its inverse, I can't find the details of it quickly on mobile). newer display technologies emulate this nonlinearity so they can be compatible receiving the same signals / data. (someone correct me if I'm wrong, I feel like I am not being 100% exact here)

But the point that remains today, is not the bits (as shaders work with floats internally), nor the response curve of a CRT (because almost nobody uses those any more), but the fact that the physics calculations of light operate on linear quantities (proportional to an amount of photons), so you better do those in linear space.

Then at the end you must convert to perceptual space, which can be done in a number of ways. I'm not sure how much of a win replacing one pow(color, 2.2) with a sqrt is, at the end of a fragment shader. Especially when the pow(color, 2.2) is already a quick approximation by itself, there are much fancier curves to convert to perceptual space (that don't desaturate the darks as much, for instance).


Accurate. The 2.2 and various hacks compensate for the baked in hardware.

Although perception is nonlinear, the nonlinearity here specifically addresses hardware on standard sRGB displays. After all, as cited above, why would you apply a nonlinear correction for perception when it has that baked in?

What is missed by many, is that standard LCD technology, despite being inherently linear, has that low level hardware nonlinearity baked in. Typically, it is a flat 2.2 power function, although other displays or modes may use differing transfer functions.

The irony is that the nonlinear 2.2 function adopted into most imaging results in a closer-to-linear signal, that ends up further nonlinear based on tonemapping, SMPTE2084, or other such nonlinear adjustments for technical or aesthetic reasons. In the case of raytracing, a mere 2.2 adjustment is woefully worthless.

PS: The CIE1931 model, from which the RGB encoding model is effectively derived, used visual energy. That is, it doesn't model "reality" so much as the psychophysical byproduct that happens in the brain. Luckily, the base model, XYZ, operates on a linear basis with some extremely nuanced caveats. As a result, raytracing using RGB tristimulus models sort of works.


Why should anybody care about speed in this demo? That sqrt just serves to confuse. Which IMO is consistent with that terrible book Perlin wrote (co-wrote?) years ago.


Paste this into the editor to see the difference:

    varying vec3 vPos;
    
    float sRGB(float x) {
       if (x <= 0.0031308)
          return 12.92 * x;
       else
          return 1.055 * pow(x, 1./2.4) - 0.055;
    }
    
    void main() {
       float v = (vPos.y + 1.) / 2.;
       if (vPos.x < -0.5)
          gl_FragColor = vec4(pow(v, 1./2.2));   // 2.2 Gamma
       else if (vPos.x < 0.0)
          gl_FragColor = vec4(sRGB(v));          // sRGB 
       else if (vPos.x < 0.5)
          gl_FragColor = vec4(sqrt(v));          // sqrt approximation
       else
          gl_FragColor = vec4(v);                // unadjusted
    }
The sqrt() function adjusts for the output curve of your display (usually sRGB nowadays, but in the past various gamma values have been used). The strips on the left should have an approximately linear gradient between light and dark, whereas the rightmost gradient will be too dark.

The question of why your display doesn't just output linear colors is more interesting. Small differences in dark colors are more easily perceived than differences in lighter colors, so it's useful to spend more encoding space on the low end. With more bits, you could use linear colors throughout, but all color data would take up proportionally more memory. It ends up more efficient to just decode from sRGB, do your computation, and encode into sRGB again each time.

Modern graphics APIs can automate this sRGB coding for you by letting you specify an sRGB format for textures: each time you read from an sRGB texture, the system will decode the color into linear space, and all writes will automatically encode back into sRGB.
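
When no sRGB texture format is available, the decode, compute, encode cycle can be done by hand. Below is a minimal Python sketch using the standard piecewise sRGB curves; `shade_texel` and its `diffuse` parameter are hypothetical names for illustration, and a real shader would apply this per colour channel:

```python
def srgb_to_linear(v: float) -> float:
    """Decode one sRGB-encoded channel value in [0, 1] to linear."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def linear_to_srgb(l: float) -> float:
    """Encode one linear channel value in [0, 1] to sRGB."""
    if l <= 0.0031308:
        return 12.92 * l
    return 1.055 * l ** (1.0 / 2.4) - 0.055


def shade_texel(texel_srgb: float, diffuse: float) -> float:
    """Apply a diffuse factor to an sRGB texel: decode, shade, re-encode."""
    linear = srgb_to_linear(texel_srgb)  # 1. decode to linear light
    lit = linear * diffuse               # 2. do the lighting math in linear space
    return linear_to_srgb(lit)           # 3. encode for the display


# A mid-grey texel at half illumination.
shaded = shade_texel(0.5, 0.5)
print(f"shaded in linear space: {shaded:.3f}")
print(f"naive 0.5 * 0.5:        {0.5 * 0.5:.3f}")
```

Multiplying the encoded value directly gives a noticeably darker result than the linear-space version.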


WebGL + GLSL should already adjust this for you, giving a linear colorspace. Notice that 'unadjusted' actually looks correct (at least on my browser, Chrome @ Linux).


Should as in "it would be nice"? Maybe.

Should as in it does already and this sort of thing is unnecessary? Definitely not. Hence EXT_sRGB.


Should as in "I believe WebGL + GLSL already does this by default, it does at least on my machine".


This would be in violation of the standard (in a way that would cause a lot of things to render wrong -- anybody who was rendering correctly would now be wrong!), and also it isn't practical in most cases.

It's a common misconception that this is taken care of for you by <platform X>, but it basically never is. I'm positive this is the case for WebGL.


Linear is not a colourspace.

It is an attribute of a properly defined encoding model. Further, there are two forms of linear, display linear and scene linear, which are essential to grasp for rendering approaches.

See ISO 22028-1:2004 for a discussion of terms.


> Linear is not a colourspace.

I never said so. I said it gives >> a << linear colorspace. There are many such color spaces. Some are linear in different aspects (better at close or far away colors), and some only preserve some aspects (linear in brightness, or chroma, etc). I didn't intend to fully go into color theory.

I'm not even certain how well defined the color space is, and how properly linear it is. I do know that it accounts for gamma correction though (at least on my machine).


No.

Linear is not a colour space, as a properly defined colour space, as per the ISO specification, is:

- A well defined transfer function.

- Properly described primary light chromaticities.

- Properly defined white point colour.

Where you reference "there are many such color spaces" is the issue. Linear specifically relates to the first point, and even that doesn't properly describe whether one is speaking of display linear or scene linear.

The other points relate to perceptually uniform discussions, which would be a misuse of the term linear.

I agree that it is not a suitable forum for color theory. “Linear” however, is a much confused subject, worthy of explanation.
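
The three components listed above can be sketched as a small record type. This is only an illustration in Python: the `ColourSpace` class and its field names are hypothetical, while the chromaticities are the commonly published sRGB primaries and D65 white point. The sketch says nothing about the display linear versus scene linear distinction:

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Chromaticity = Tuple[float, float]  # CIE 1931 (x, y) coordinates


@dataclass(frozen=True)
class ColourSpace:
    name: str
    transfer_function: Callable[[float], float]  # linear -> encoded value
    primaries: Tuple[Chromaticity, Chromaticity, Chromaticity]  # R, G, B
    white_point: Chromaticity


def srgb_encode(l: float) -> float:
    """Piecewise sRGB transfer function for one channel in [0, 1]."""
    if l <= 0.0031308:
        return 12.92 * l
    return 1.055 * l ** (1.0 / 2.4) - 0.055


SRGB_PRIMARIES = ((0.64, 0.33), (0.30, 0.60), (0.15, 0.06))
D65 = (0.3127, 0.3290)

SRGB = ColourSpace("sRGB", srgb_encode, SRGB_PRIMARIES, D65)

# "Linear" changes only the transfer function; primaries and white point
# still have to be specified before this describes a colour space.
LINEAR_SRGB = ColourSpace("linear sRGB", lambda l: l, SRGB_PRIMARIES, D65)

print(SRGB.transfer_function(0.5), LINEAR_SRGB.transfer_function(0.5))
```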


Unadjusted looks correct for me in Chrome and Firefox in Fedora. I have an nvidia card so I'm not using Wayland, I'm not sure if that would make a difference.


Square rooting the color is a trick I have used for ages, but I have not frequently seen it elsewhere. It will do pretty much the opposite of multiplying the color by itself - i.e. decrease the contrast and produce a "softer" color space still in the 0-1 range.

In general, you can produce a variety of interesting color-space transformations with simple math on the current fragment, as opposed to more complex methods that rely on sampling output, like HDR processing.

Here is a page on more advanced color space manipulation: http://filmicworlds.com/blog/linear-space-lighting-i-e-gamma...


Gamma correction is achieved by raising the color to a power, the power being the inverse of the gamma value. So sqrt is the same as gamma correction with gamma = 2.0. On typical monitors gamma is 2.2, so this is close, and maybe faster if the GPU has optimizations for fast square root.

More info : http://http.developer.nvidia.com/GPUGems3/gpugems3_ch24.html


> does anyone know why he's taking the square root of the color to produce the final pixel color?

Probably to approximate gamma correction without getting too involved.


I thought about that, but some of the results I found on Google suggest that output from WebGL shaders is supposed to be in a linear color space [0] [1]. Then again, some of the comments here [2] suggest that you do have to do gamma correction manually.

Incidentally, I didn't know to do gamma correction until I came across [3], which explained why the output from my raytracers always looked a little off ;)

[0] http://stackoverflow.com/questions/10843321/should-webgl-sha...

[1] https://www.khronos.org/webgl/public-mailing-list/archives/1...

[2] https://www.shadertoy.com/view/Xdl3WM

[3] http://blog.johnnovak.net/2016/09/21/what-every-coder-should...


If you load a texture like from a JPEG and use that in a shader, the texture is converted to gamma space when loaded, so you need no correction. If you are generating colors in the shader, you will need to correct for gamma.

GL does have the capability to handle linear color space buffers correctly, but you have to enable SRGB, and initialize the framebuffer correctly with SRGB color space, and I'm not sure if GLES (WebGL) can do this.


What happens if you need to alter the color of a texture that's been loaded? For example, if you want to do diffuse shading on a sphere with a texture mapped to it. Do you first need to convert the texture pixel back to linear space, apply the shading, and then correct for gamma?


I believe that, in WebGL, if you sample the texture in a shader the resulting color is in a linear space. And output colors from a fragment shader are meant to be linear as well.

I realize now that I can't fully answer your question, because I'm unsure whether WebGL implementations are supposed to respect color correction information in the image file format. But I'm pretty sure that the color space in WebGL "shader land" is always linear.


I have absolutely no experience with WebGL, so I have no idea. Sqrt would diminish intensity in highlights, if nothing else. So, in that way it would seem to be an approximation of gamma correction. I'd have to see the result (can't on tablet).


Ken Perlin has an interesting collection of Java applet based graphics demos that is rapidly falling into obsolescence

https://cs.nyu.edu/~perlin/


It's not that those demos are obsolete, just that browsers and Oracle have screwed up Java so much that access to this kind of resource is being lost. :-(


Try it without `sqrt(c)`! It adds a bit of ambient lighting so you can see the sphere when the light is added in. Without it, the dark side of the sphere fades into the background.


If you're interested in GPU ray tracing, I played with nVidia's OptiX a few years ago. It was a fun (and pretty easy, IIRC) way to do it on a GPU (nVidia anyway).

https://developer.nvidia.com/optix


Square root of color sounds like gamma encoding. He had a linear color, and wants to store it as sRGB; raising it to 1/2.2 is "correct" but sqrt is close enough.

(Or it's just for tone mapping.)


I wonder if taking the square root of the color inverts the original color. I don't know why else he would be taking the square root.


Nah, you'd subtract your color from white in order to get a negative. This would diminish the original. Sqrt is there as a poor man's gamma.


I loved Ken Perlin's Java applets years ago. Sadly, the demise of Java for web apps means you now have to jump through a few hoops to get them running.

http://mrl.nyu.edu/~perlin/

Tangentially, there's always a bit of heartache when I use the text input in Steam's Big Picture mode, as I wish Perlin's innovative input modes saw wider usage (see the "pen input" section of the above link). Of course, I understand why not: it faces the same hurdle of teaching people a new input layout that dooms all non-traditional input methods.


That's really interesting, I hadn't seen his input method stuff before.

Reminds me a lot of MessagEase, which I use on every touch system that supports it.


Ahh that is too bad. Would be a cool student project to convert these to canvas, WebGL, or P5JS.


This is the sort of tutorial that doesn't make any sense if you are not already knowledgeable in math, shaders, and rendering basics. At first I was hopeful that it would finally walk me through the basics of a renderer and explain the concepts behind it, but I was let down :)

I have so many questions regardless :) The struggle of a rookie goes as:

SECTION A

- why on earth do you take the square root of c? Is there a sane reason to do this?

SECTION B

- why normalize the vector? I'd love to see a detailed explanation.

- why is vPos.xy the first two parameters?

- is vPos.xy a constant value or is it evaluating as the shader script is executed pixel by pixel?

SECTION D

- V -= S.xyz; // what's going on here? Why use the -= operator? Is this something to do with the way shaders operate?

- float B = 2. * dot(V, W); // why take the dot product?

- float C = dot(V, V) - S.w * S.w; // Why take it again and subtract a square?

- return B*B < 4.*C ? 0. : 1.; // Why 4.? No other value seems to work

SECTION E

- "Improve raytrace to return distances of near and far roots." OK I admit that I'm totally lost at this point. Asking line-by-line explanation doesn't really help :)

SECTION H

- YAY, sin and cos!!! Finally I can mess around with something :)
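
On the Section D and E questions: the lines quoted there look like the usual quadratic for a ray hitting a sphere. A point on the ray is V + t*W. Moving the origin to the sphere's centre (which is what V -= S.xyz would do) turns the hit condition into dot(V + t*W, V + t*W) = r*r, which expands to t*t*dot(W, W) + 2*t*dot(V, W) + dot(V, V) - r*r = 0. If W is normalized, dot(W, W) is 1, so the quadratic is t*t + B*t + C = 0 and its discriminant is B*B - 4*C. That is where the 4 comes from, and why normalizing matters. Here is a Python sketch of that reading; the function and variable names are my own, not the tutorial's:

```python
import math
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def normalize(a: Vec3) -> Vec3:
    length = math.sqrt(dot(a, a))
    return (a[0] / length, a[1] / length, a[2] / length)


def ray_sphere(origin: Vec3, direction: Vec3,
               center: Vec3, radius: float) -> Optional[Tuple[float, float]]:
    """Return (near, far) distances along the ray, or None on a miss."""
    w = normalize(direction)  # unit direction, so the t*t coefficient is 1
    # Shift the ray origin so the sphere sits at (0, 0, 0).
    v = (origin[0] - center[0], origin[1] - center[1], origin[2] - center[2])
    b = 2.0 * dot(v, w)
    c = dot(v, v) - radius * radius
    discriminant = b * b - 4.0 * c
    if discriminant < 0.0:
        return None  # no real roots: the ray misses the sphere
    root = math.sqrt(discriminant)
    return ((-b - root) / 2.0, (-b + root) / 2.0)


# A ray looking down -z at a unit sphere five units away.
hit = ray_sphere((0.0, 0.0, 0.0), (0.0, 0.0, -2.0), (0.0, 0.0, -5.0), 1.0)
# The same sphere seen from a ray pointing straight up.
miss = ray_sphere((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, -5.0), 1.0)
print(hit, miss)
```

The two roots are the near and far distances that Section E asks for: the ray enters the sphere at the first and leaves at the second.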


Very cool! This is my favorite toy raytracer example:

http://www.kevinbeason.com/smallpt/


We used to have a sign in the office: "Absolutely no chrome spheres over checkerboard planes."

(First day it went up, I wrote a ray tracer from scratch in C in two hours :-)


I've got some ray tracing links on GitHub:

https://github.com/melling/ComputerGraphics/blob/master/ray_...


Here's a more interesting one that runs on WebGL, has various materials and surfaces:

http://madebyevan.com/webgl-path-tracing/

and another, simpler one:

http://hoxxep.github.io/webgl-ray-tracing-demo/


It still blows my mind that people have to define Vec3 and Vec4 types. These should be native to modern programming languages, as should Vec2. Modern CPUs have vector registers that can support these sizes and GCC has intrinsics for them that can be passed by value (even returned from a function IIRC), added and subtracted or multiplied by a scalar, and yet no language that I'm aware of has them as built-in types.

Vectors are not just for parallel computations, these small sized ones deserve to be a proper data type.


You're not aware of OpenCL, GLSL, or CUDA then!


This is a really cool demonstration, love the code breakdowns. For a less technical, but broader scale description of ray tracing, Disney made an excellent video describing their use of ray tracing in movies. Check it out here:

https://youtu.be/frLwRLS_ZR0


Expected to see a simple ray tracer break down on some edge case. Saw a simple ray tracer, simply ray trace. Was quite disappointed. Why is this posted? What value does it add?


This explains and demonstrates, piece by piece, how a ray tracer works. It's a "breakdown" in the sense of "an explanatory analysis", not in the sense of "a failure."

Good for you that you already know how ray tracers work, and so possibly this adds no value for you. I already knew how they work and I thought it was a nice clear demonstration.


I see that it is a _nice_ demo, but don't see how this ends up as #1 here?


Other people found it interesting. It's pointless to complain about content that doesn't strike your fancy. Just upvote stuff that does...


Because it's by Ken Perlin, probably.


So essentially you're disappointed by your own misinterpretation of a title



