Games Look Bad: HDR and Tone Mapping (ventspace.wordpress.com)
374 points by megaman22 on Oct 23, 2017 | 215 comments

A lot of the oversaturation problem can be attributed to bad displays. Think about early Instagram. Everyone used terrible super saturated filters on their photos because the early smart phone screens were so bad, and got washed out so easily in the sunlight. By overcompensating with filters you got a photo that looked good on a bad screen, even in sunny conditions.

The same was true of early games. Gamers love the bright colors of World of Warcraft for instance because displays were bad, and it was easier to watch bright colors for hours on end while gaming. Even today my modern TV looks pretty bad in the sunlight. As much as I love the muted colors of Breath of the Wild I have to admit that without closing my curtains it's really hard to tell what's going on if a sunbeam is hitting the TV.

I think as gamers' screens get better we will start to see a transition away from exaggerated brightness and a trend toward more realism, just like Instagram has transitioned largely from a platform that overcompensated for bad smartphone screens to a #nofilter style.

I think this takeaway misses a significant point in the article: even a picture of that display will look better than it does on the display itself:

> [W]hy does the LDR photo of your TV look so much better than the LDR tone map image? There’s another tone map in this chain which nobody thought to examine: Nikon’s.

The author isn't merely saying that games are too colorful. He's saying the colors are wrong from both an artistic perspective and from a utilitarian perspective. He's saying this is because the people making decisions about how color should be rendered are either ignorant of how to make the decision (copy-paste code) or make the decision without real awareness of how it will affect either of those properties.

WoW was released the same year as LCD monitor sales first surpassed CRTs (2003). So in the early days of WoW, people actually got better colors than a few years later when most people had dumped CRTs for the relatively poor consumer LCD monitors of the day.

I was just a kid, but as far as I remember the transition to LCD brought worse colour, not better. To be clear, I'm not talking about early professional colourists' LCD displays that were £5000 or whatever. Surely everyone is familiar with those horrible TN panels in laptops... desktop LCDs used the same tech once.

That's the point - that in the first years while CRTs were still quite widespread, people had better displays than a few years later when many had switched to terrible early LCD screens.

On the other hand, I cannot COUNT the number of headaches I got as a kid from the TV and CRT monitors in our house.

It's physically painful to remember it.

I got a Compaq Portable (8088 clone) for a dollar years ago. It weighs a "tiny" ~35 lbs. ("portable"!) I got it to boot up after some problems with floppy drives. It was so cool to see little BASIC programs displaying silly, fancy, tech-demo graphics. But the whine... oh the whine... I'm half tempted to replace the built-in screen with a small LCD and just disconnect the blue and red wires to keep it solid green.

Sidenote: I still miss the warm glow of an amber screen though. There's something magical about it.

It was so sad in the mid 2000s trying to find a decent quality monitor for a decent price. You had to look very hard for a monitor that wasn't 6-bit dithered up to 8-bit.

I worked for an animation studio around that time. They stuck to their CRTs long after they were discontinued because they couldn't certify anything else--I think I remember hearing they paid something like $150 per monitor when they were still manufactured. The closest LCD was 10x as expensive.

They eventually switched to DreamColors when they were released in 2008.

I agree on your overall point. However in the case of Instagram, it was on iPhone years before Android, and iPhones generally have always had very good color accuracy. The early cameras, however, were crap, and filters helped immensely there. Garbage in, filtered garbage out.

Early iPhones and iPod Touches (for the first 3–4 years) had much smaller color gamut than the recent ones, not matching any standard. They also did no color management whatsoever, with the result that colors and color relationships of basically all content were always fairly dramatically wrong [at least to my eyes, as an amateur photographer and amateur graphic designer who cares a lot about precise color rendering].

Some app authors and content producers put in the effort to understand and specifically target the iPhone display, e.g. by gamut mapping their images beforehand. But I would guess that “some” to be significantly less than 1%. (Though I suppose you might consider the exaggerated-color Instagram filters we’re discussing here to be one sort of gamut mapping, targeting the relatively muted display.)

At some point Apple started making all of their devices very precisely hit the sRGB spec, so even though iOS still did no active color management, that made things a lot better, at the expense of making it impossible to accurately target colors at all iPhones for a while because different models had different gamuts. If you go look up the historical tech specs and display teardowns for different iPhone generations, I’m sure you can figure out exactly which version was the first to be close to sRGB. [More recently, I believe they now also do some color management, though I haven’t really investigated this in a few years.]

The first couple iPad mini versions also had a smaller-than-sRGB color gamut and no color management (I have one of these). I’m not sure about the earliest full-sized iPads.

Of course, this is all still better than Android where there is extreme fragmentation and every display looks slightly different.

Instagram launched several months after the iPhone 4, which had a pretty great display: http://www.displaymate.com/iPhone_4_ShootOut.htm

It wasn't perfect nor did it hit sRGB, but still pretty darn good for photos. As you say, they didn't color manage but they did a great job factory calibrating and selecting panels, and thus the lack of color management wasn't really an issue in practice, at least when designing apps. About 2 years after Instagram launched, the iPhone 5 came out which had sRGB.

The point still stands that the displays weren't the reason insta photos were filtered - it was driven by camera limitations and the culture / artistic sensibilities that Instagram imbued in the product.

That link of yours shows that the iPhone 4 had substantially less saturated reds and blues than sRGB. http://www.displaymate.com/iPhone_4_ShootOut_files/image009....

Earlier iPhones were an even smaller gamut. http://www.displaymate.com/iPhone_3GS_ShootOut_files/image00...

More than just desaturating images that aren’t gamut mapped, however, these iPhones substantially distort color relationships.

They were historically great displays for mobile devices, but they would have been a lot better with some color management involved.

That said, even when iOS didn’t do active color management (for performance reasons), there were some attempts to simulate it:

> Your content is matched to the sRGB color space on the authoring platform. This matching is not performed dynamically on the iOS device. Instead, it happens during authoring on your Mac OS X desktop.


> Targeted color management may also occur when you sync content to your mobile device. In fact, iTunes running on the desktop provides color management to the iOS targeted color space when you sync content from iPhoto to your iOS device.


(As you noted, there have also been more recent developments. Since last year there have been iOS devices which support the DCI-P3 gamut and use active color management; some of those devices also support “True Tone”, which adjusts the color mapping based on ambient light.)

Displays matter, but you may be missing the point. In short, low dynamic range means the display has fewer options to choose from. Better displays are very nice but they will always be limited by dynamic range and tone mapping. The problems of HDR are a pain in the ass, even for myself after working in film production for a decade and now doing CGI. But, when you do it all right (rarely can my projects afford to) it clicks. Games may never be able to in our lifetimes and the returns are diminishing. But, a great display with hacked dynamic range will hardly make a difference.

This article makes me appreciate the balance struck in GTA.

IMHO this type of scene is better than any of the examples shown in the article:


GTA's tone mapping is discussed at http://www.adriancourreges.com/blog/2015/11/02/gta-v-graphic... - including a couple of interesting "filmic tone mapping" links

I liked this enough to merit an addition to the write-up. It's not perfect but I'm seeing some great GTA 5 shots that are very nicely balanced overall. Doubly interesting given the substantial age of the game.

This picture is from a video game?!

Yes, there are mods out there that make GTA look like a photo-realistic adventure: https://www.youtube.com/watch?v=7cbL3HUqAJU

A more than four-year-old videogame, yes.

on last generation hardware too, though the screenshot in question might have been taken from one of the ports to more modern systems

It's from a computer, not a console.

Look at the white brick pillars in the background behind the car; they're exactly the same. Some other stuff like the soda bottle pops out as well. It does look nice though.

I don't know if there's supposed to be some depth-of-field effect going on here, or what, but I find it really jarring that the resolutions of the textures applied to objects in the scene are so different. The car and that one section of wall are super-sharp and high-resolution, then right next to them some of the trash has fuzzy, blurry lo-res textures.

In this case it seems it's some sort of DoF-Effect, since the lines between the objects seem to blur too.

well, yeah. do you have any idea how massive the world map is in GTA??? Of course they have to scale the textures.

> We would need 20 bits of just luminance to represent [the range 1,000,000:1]

I don't understand this statement. Why do you need a certain number of bits to represent a large range? You can encode a range using any number of bits you want, but if you use fewer bits the resolution within may be lower.

Representing a large luminance range isn't a problem of resolution though is it? It's a problem of presenting that contrast ratio on a screen that is not capable of a contrast ratio of 1,000,000:1. If I represented the same luminance range with a thousand bits it doesn't do anything to solve the core problem, does it? So why does the number of bits matter?

Not a graphics programmer, so genuinely asking, not saying the article is wrong!

I've spent years working on rendering algorithms and game engines, so I guess I am a graphics programmer.

You are right, the final output dynamic range is limited by your output device, you can't circumvent that.

The 1,000,000:1 is the contrast ratio you might see on a sunny day, and represents the state of the world that your eyes are seeing.

Your eyes aren't capable of 1,000,000:1 contrast either, at least not globally across your field of vision, so we can play tricks in rendering to simulate what you would perceive with your limited eyeballs on a 1M:1 contrast ratio day, and this is what HDR rendering does.

Internally, the rendering code handles luminance well in excess of its ability to display on the final output device. All the game rendering happens internally in HDR, preserving this high level of contrast. The final steps of producing a displayable image are doing something to convert non-displayable "real" contrast ratios into a post-processed simulation to fool you into perceiving high contrast scenes. This last part is subjective and artistic, since you are trying to figure out a way to show that high dynamic range scene. Common tricks played here are the light bloom from bright areas, the artificial amplification of dark areas, etc.

All these in-between rendering steps compose the result of multiple rendering passes into an image. Each pass which is blended with the results of a previous pass is subject to quantization error. The more passes you do (and modern games do lots), the more this quantization error accumulates, and leads to pretty ugly banding artifacts. You want to do your internal processing in the highest color resolution that you can, so that you only have to worry about the error of quantizing the 1M:1 scene into a 256:1 output device. You don't need the full 20 bits, and common practice is to use 16-bit S10E5 floats since hardware handles this quickly.
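
A rough sketch of why intermediate precision matters, assuming a toy two-pass pipeline (darken by half, then brighten by two) that is not taken from any real engine. The function names are made up for illustration. Storing the intermediate result as a truncated 8-bit integer throws away half of the gradient levels, while keeping it in floating point and quantizing once at the end keeps all of them:

```python
def two_pass_8bit(level: int) -> int:
    """Darken then brighten, truncating to an 8-bit integer after each pass."""
    darkened = int(level * 0.5)            # pass 1, stored in 8 bits
    return min(255, int(darkened * 2.0))   # pass 2, stored in 8 bits


def two_pass_float(level: int) -> int:
    """Same two passes, but the intermediate result stays in floating point."""
    darkened = level * 0.5                 # pass 1, kept as a float
    return min(255, int(darkened * 2.0))   # quantize only once, at the end


gradient = range(256)                      # a full black-to-white ramp
levels_8bit = len({two_pass_8bit(v) for v in gradient})
levels_float = len({two_pass_float(v) for v in gradient})

# Number of distinct output levels that survive each pipeline.
print(levels_8bit, levels_float)
```

Fewer surviving levels on a smooth ramp is what shows up on screen as banding. Real pipelines run many more passes, so the loss compounds.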

HDR televisions, while nicer than nothing, don't help you all that much. Displays have generally been 8-bit per channel, giving you 256 values distributed along a gamma curve which maps to a device with perhaps 3,000:1 contrast capability. HDR TVs give you 1024 values along a gamma curve that spans a higher range of contrast, but not all that much higher.

OK, but that doesn't explain this sentence from the original blog post:

   In the real world, the total contrast ratio between the brightest highlights
   and darkest shadows during a sunny day is on the order of 1,000,000:1.
   We would need 20 bits of just luminance to represent those illumination ranges

Theoretically you can have arbitrary contrast with 1 bit (white = looking at the sun, black = 1 photon per square centimeter per year, or whatever). The author probably has some specific amount of brightness discrimination in mind that he wants to adequately capture. For example, with brightness levels corresponding to just-noticeable differences in terms of human perception, after adapting to looking at that specific portion of the scene.

In practice modeling vision precisely gets really complicated because there are several types of spatial and temporal adaptation and contrast effects that the human visual system can undergo / be affected by, and something like a video game definitely doesn’t have the computation available to handle it perfectly (not to mention the programmers don’t want to implement anything that complex and finicky). There is also a complication that display black is never especially close to zero light, and depends a lot on the light hitting the display. Etc.

So for practical purposes you just have to come up with some kind of “good enough” heuristic.

> The author probably has some specific amount of brightness discrimination in mind that he wants to adequately capture.

The assumption seems to be a linear illumination scale with the darkest value at 1 (with true absence of illumination at zero); 1,000,000:1 needs about 20 bits for that.

   So for practical purposes you just have to come up with some kind of “good enough” heuristic
OK, I agree; that's what I'm saying to chrisseaton.

2^20 is just above 1M, so that's how many bits you need to capture the darkest and the brightest pixel without losing information (brightest/darkest = 1M). the assumption is that you tone map those 20 bits onto however many bits you have available (8 for standard display, 10 for HDR, 14 for DSLR sensor, etc.) - at least that's my understanding after consulting wikipedia.
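
The arithmetic behind the "20 bits" figure can be checked with a minimal sketch. It assumes the linear integer scale described above (darkest non-zero level is 1, brightest is the contrast ratio); the function name is hypothetical:

```python
def bits_for_linear_levels(contrast_ratio: int) -> int:
    """Bits needed to store every integer level from 0 up to contrast_ratio.

    Assumes a linear scale with a true zero, where the darkest non-zero
    level is 1 and the brightest level equals the contrast ratio.
    """
    # int.bit_length() is the number of bits needed to write the value in binary.
    return contrast_ratio.bit_length()


bits = bits_for_linear_levels(1_000_000)
print(bits, 2 ** 19, 2 ** 20)
```

Since 2^19 = 524,288 is below one million and 2^20 = 1,048,576 is above it, 20 is the smallest bit count that fits under this particular assumption. Change the assumption (a different step size, a non-linear scale) and the number changes with it.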

Either I'm crazy or everyone else is :) You still have lost information with 20 bits. There are an infinite number of levels of light between the darkest and the brightest. With 20 bits you can represent 1,000,000 steps between darkest and brightest. Why do you need to represent 1,000,000 steps? Why does it matter that the brightest is 1,000,000 times brighter than the darkest? 20 bits lets you represent one times brighter, two times brighter, etc. up to 1,000,000 times brighter, but what about one-and-a-half times brighter? You can't represent that in 20 bits. Why doesn't that matter?

The article says that the highest light level in a scene is 1,000,000 times brighter than the lowest. It does not say, or even imply, that you can see the difference between those levels of light to a one in a million, which is what 20 bits needs.

Can you see what I mean? The factor between the lowest and the highest is entirely separate from the question of how many levels between you can see, which is what directs how many bits of resolution you need.

I'm not sure this is a constructive comment, but I just wanted to say that you are completely correct, everyone else is crazy, and I am baffled at how much confusion there seems to be over this topic.

One thing to add: Wikipedia sayeth "the eye senses brightness approximately logarithmically over a moderate range". If we go with that, then you presumably want to encode brightness logarithmically, and the number of bits you have available will determine the ratio between your adjacent quantized levels. In that case I believe the ratio between adjacent levels would be exp(ln(max_range_ratio)/(2^bits)).
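
To put numbers on that formula, here is a small sketch that evaluates it as written in the comment. The function name is made up, and the choice of 8, 10 and 12 bits is only for illustration:

```python
import math


def log_step_ratio(max_range_ratio: float, bits: int) -> float:
    """Ratio between adjacent levels for a purely logarithmic encoding.

    Follows the formula from the comment: exp(ln(max_range_ratio) / 2**bits).
    """
    return math.exp(math.log(max_range_ratio) / (2 ** bits))


for bits in (8, 10, 12):
    ratio = log_step_ratio(1_000_000, bits)
    print(f"{bits} bits: each step is {(ratio - 1) * 100:.2f}% brighter than the last")
```

With a 1,000,000:1 range, 8 bits gives steps of roughly 5.5% each, and every extra two bits cuts the step size to about a quarter. Whether a given step size is visible depends on viewing conditions, which this sketch does not model.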

You don't want to encode brightness perceptually until you're showing it to the user. You need a linear space to actually do lighting calculations, which means physical luminance values. Within that space, you can use whatever scale you wish, with consequences to banding and quantization artifacts for decimating the bit depth. Tone mapping includes conversion to a log space via the gamma curve.

I explained it in another comment but basically the "20 bit" value is based on an idealized digital image sensor rather than a game rendering pipeline (which operates in floating point). It has admittedly proven somewhat confusing.

It's worth noting explicitly that floating point numbers let you have a linear scale but logarithmic-ish storage at the same time. A 12-bit float could represent a 1:1000000 range, let you perform normal linear math, and also be just as banding-free as a 20 bit integer.
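
A quick way to see the "logarithmic-ish storage" property, sketched with 16-bit half floats (the S10E5 format mentioned elsewhere in this thread) rather than a hypothetical 12-bit float, because Python's `struct` module supports halves directly through the `"e"` format. The helper names are made up:

```python
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (16-bit) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def relative_error(x: float) -> float:
    """Rounding error of the half-float representation, relative to x."""
    return abs(to_half(x) - x) / x


# Samples spanning a 1:1,000,000 range of linear luminance values.
samples = (0.001, 0.5, 1.0, 37.0, 1000.0)
worst = max(relative_error(x) for x in samples)
print(f"worst relative rounding error: {worst:.6f}")
```

The relative error stays below 2^-11 (about 0.05%) across the whole range, because the exponent bits move the precision to wherever the value is. A linear integer scale spends the same absolute step everywhere, so its relative error is worst in the shadows.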

That's why even the good old 8 bits per channel images are encoded logarithmically. Enter the gamma curve: https://en.wikipedia.org/wiki/Gamma_correction

no - the REASON why we use gamma is deeply historical and has essentially to do with trying to build TV receivers with the minimum number of vacuum tubes

Is it just a coincidence that the gamma curve also corresponds roughly to our eyes' response to light? Awfully lucky if it was.

To be honest I think everyone else is crazy and/or doing a calculation without thinking about what it means.

All we know is that the range is N to 1000000N, but we do not know in and of itself that N is the smallest delta perceivable or possible.

Well since at least one person thinks I'm not crazy I'll stop trying to argue now - thanks!

I think you're making sense.

I don't know anything about games but at least in cameras the relation you want between the bit depth and dynamic range is determined by what you want to measure rather than a formula. An eye tracker I have is a 10-bit IR camera because a normal 8 bits are insufficient to both note that IR LED reflections are way brighter than everything else, while also having sufficient detail in the lower values that the edges of the pupil are easy to detect. At least, without having the scale be logarithmic or discontinuous or otherwise compressed.

You are correct. Luminance is a real number, not an integer, so the more bits the better (unless you go all the way down to photons!). So, more bits allow us to increase the dynamic range and also allow more values in that dynamic range.

The real benefit of HDR is in the "more values" part since, as the author notes, our displays have a very limited dynamic range anyway.

Yes. I'm leaning towards a "didn't really think it through" oversight on the author's part, rather than fundamentally confusing signal-to-noise ratio (which is what more bits give you, if you treat em right) with numerical range.

Hell, I like to think I'm somewhat knowledgeable about these things and I read straight past that line thinking "that's a million, that's about twenty bits, okay".

But you are absolutely right, now that you pointed it out it's obvious.

And indeed you can get the same range using only one bit (per channel, that is) and if you had high enough (very high) resolution and proper dithering, you'd totally get away with it, too. In that case, the tonemapping goes just before the dithering.

The straight to the point answer is yes, you can use fewer bits to encode the same range, though for practical matters you'll get strong banding as a result. That is because the log curve applied to make the range viewable on a screen will stretch the lowest part of the range. Moreover, the multiple passes made in the linear (wide) range will produce artifacts viewable in screen space much more quickly.

The only negative result you're certain to get and can't avoid is a low signal-to-noise ratio. You only get banding (unwanted spatial artifacts) if you do it badly (without dither).
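
A toy illustration of that trade, assuming the simplest possible random dither (real renderers usually use ordered or blue-noise patterns, which this does not attempt). A flat 30% gray is quantized to a single bit per pixel, with and without dither; all names are hypothetical:

```python
import random


def quantize_1bit(value: float) -> int:
    """Plain threshold at mid-gray, no dither."""
    return 1 if value >= 0.5 else 0


def quantize_1bit_dithered(value: float, rng: random.Random) -> int:
    """Random dither: output 1 with probability equal to the input value."""
    return 1 if rng.random() < value else 0


rng = random.Random(0)   # fixed seed so the run is repeatable
gray = 0.3               # a flat 30% gray region
n = 100_000              # number of pixels in the region

plain_mean = sum(quantize_1bit(gray) for _ in range(n)) / n
dithered_mean = sum(quantize_1bit_dithered(gray, rng) for _ in range(n)) / n
print(plain_mean, dithered_mean)
```

Without dither the whole region collapses to black, which is the banding case. With dither each pixel is noisy, but the average over the region lands close to the original 0.3: the error has been turned into noise instead of a spatial artifact.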

To express a ratio of 1000000:1, you need at least 20 bits. I think that's all the article is saying: that's far more bits than the contrast ratios currently supported require, which essentially shows that we're far off from supporting actual contrast ratios.

I didn't interpret the article's premise as being that if you had 20 bits, you could 100% reproduce the light in a scene.

> To express a ratio of 1000000:1, you need at least 20 bits.

You only need 1 bit if you declare an encoding scheme where 1 == 1 million. You need 20 bits to define 1 million DIFFERENT luminances. Their ratio is subject to an arbitrary scaling.

Sorry I still don't get it :)

> To express a ratio of 1000000:1, you need at least 20 bits.

You mean to express all the integer factors in the range 1000000:1, you need at least 20 bits. With 20 bits you can represent 1 times brighter, 2 times brighter ... 1000000 times brighter.

But there's nothing special about those coefficients. 20 bits does not allow you to represent 1.5 times brighter. That's still in the range 1000000:1 but 20 bits isn't enough to represent it.

If we're happy to skip 1.5 times brighter, why can't we skip all the even integer times brighter values, and use 19 bits?

> If we're happy to skip 1.5 times brighter, why can't we skip all the even integer times brighter values, and use 19 bits?

Things get wonky if you don't have a linear scale with a true zero; in such a scale the low end of your N:1 contrast ratio (in the smallest representation) has a value of 1, and the high end has a value of N.

Good thing in real life there's no such thing as truly zero photons, then (which is a detail the article actually shortly touches upon).

Also, if that type of "wonky" throws off your rendering pipeline, you're bound to get something else wrong.

Such as ever having a linear scale with a small number of bits in your pipeline. The linear scaled brightness stays afloat all the way through (cause floats have this handy feature of being transparently sorta-logarithmic in the way they use their bits, even 16-bit floats beat 20-bit ints for that purpose), only at the very end you apply the tonemap+gamma function(s), then dither, then truncate to fixed (8) bit integer.
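
The order of operations described there can be sketched for a single pixel. This assumes a simple Reinhard curve (L / (1 + L)) as the tone map and a pure 2.2 power law as the gamma, neither of which comes from the thread; the function names are made up too:

```python
import random


def tonemap_reinhard(luminance: float) -> float:
    """Compress linear luminance in [0, inf) into [0, 1). Assumed operator."""
    return luminance / (1.0 + luminance)


def gamma_encode(value: float, gamma: float = 2.2) -> float:
    """Convert a linear display value to a gamma-encoded one. Assumed 2.2 power law."""
    return value ** (1.0 / gamma)


def dither_and_truncate(value: float, rng: random.Random) -> int:
    """Add up to one code value of random dither, then truncate to 8 bits."""
    return min(255, int(value * 255.0 + rng.random()))


def render_pixel(linear_luminance: float, rng: random.Random) -> int:
    """Float stays linear until the very end: tone map, gamma, dither, truncate."""
    display_linear = tonemap_reinhard(linear_luminance)
    encoded = gamma_encode(display_linear)
    return dither_and_truncate(encoded, rng)


rng = random.Random(0)
for lum in (0.0, 0.01, 1.0, 100.0, 1_000_000.0):
    print(lum, render_pixel(lum, rng))
```

All the lighting math would happen on `linear_luminance` before `render_pixel` is called; the 8-bit integer only exists after the last step.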

Hmm… but the actual display’s representation of 0 cannot be darker than the minimum brightness value it supports. If 1 really is 1x the minimum, then 0 and 1 have to be displayed identically. But then your scale isn’t even linear, nor does it have a true zero. How does that help?

edit: And aren’t output color spaces already highly nonlinear due to gamma correction?

0 is the lowest and x-1 is the highest for a ratio of 1:x. It’s 1 all the way to x; you just subtract one because computers start counting from 0.

But in that case, skipping "even integer times brighter values" wouldn't exclude 0 - because 0 is 1x brighter, an odd integer.

Is that the reason you can’t just chop off the two least significant bits when converting from 10-bit to 8-bit?

what he meant, technically:

2^19 = 524,288

2^20 = 1,048,576

Let's call that the writer's artistic license.

Well it's kinda accepted in photography / computer vision to say this.

Another argument for this formula might come from Shannon information entropy but I'm not 100% sure

Yeah. There are definitely some unstated assumptions in that number. I think he is implying a linear scale, with the ability to represent details similar in magnitude to the minimum brightness. Without some assumptions like those, all it takes to represent that contrast is a single bit to choose between 1 and 1,000,000.

EDIT: I might as well link to the author's clarification. I missed it in the noise: https://news.ycombinator.com/item?id=15538738

It is a little odd. I don't think a linear, fixed-precision scale for luminance is used much, though I could be quite mistaken. I'm not very familiar with HDR. Still, I might as well explain a bit further...

Light accumulation is linear. If you have two lights, the amount of light reflecting off the surface is twice as much. Twice as many photons.

Human perception, however, is logarithmic. Something that is twice as bright in terms of photons emitted actually only looks a little brighter to the human eye. To get the same perceived brightness increment again, you need to double the photon count again.

With that in mind, what's the best numeric representation to use? Well, you want a simple way to add up the total amount of light from all sources. The amount of light may vary by orders of magnitude, but it's manageable because your need for precision scales with the magnitude you're dealing with. Basically, you want floating point.

In practice, a float for each color channel means 3 floats (96 bits) per pixel, which is a lot. So, while that might be used during calculations, it's probably stored compressed. For example, RGBE is a 32 bit per pixel format. You use 8 bits per color channel and multiply them all by 2^(E-128), where E is a shared 8 bit exponent stored with a bias of 128. It's kind of inaccurate for the dim channels if luminance varies significantly between colors, but the brightest one is probably going to overwhelm the other channels anyways.
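
A minimal sketch of shared-exponent packing in the style of the Radiance .hdr format, where the exponent byte is stored with a bias of 128 and the three 8-bit mantissas are read as fractions of 256. It skips details a real codec needs (exponent range checks, the half-step rounding offset some decoders add), and the function names are made up:

```python
import math


def rgbe_encode(r: float, g: float, b: float) -> tuple:
    """Pack three linear floats into four bytes: 8-bit mantissas plus a shared exponent."""
    brightest = max(r, g, b)
    if brightest < 1e-32:
        return (0, 0, 0, 0)                       # treat as black
    # brightest == mantissa * 2**exponent with 0.5 <= mantissa < 1
    mantissa, exponent = math.frexp(brightest)
    scale = mantissa * 256.0 / brightest          # equals 256 / 2**exponent
    return (int(r * scale), int(g * scale), int(b * scale), exponent + 128)


def rgbe_decode(rm: int, gm: int, bm: int, e: int) -> tuple:
    """Unpack four bytes back into three linear floats."""
    if e == 0:
        return (0.0, 0.0, 0.0)
    factor = math.ldexp(1.0, e - 128 - 8)         # 2**(e - 128) / 256
    return (rm * factor, gm * factor, bm * factor)


print(rgbe_decode(*rgbe_encode(1.0, 0.5, 0.25)))
print(rgbe_decode(*rgbe_encode(1000.0, 0.001, 0.0)))
```

The second call shows the inaccuracy for dim channels: with red at 1000, the shared exponent is so large that a green value of 0.001 rounds down to a mantissa of 0 and is lost entirely.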

The larger the dynamic range, the more bits you need to represent the possible values in between.

Imagine a gradient that smoothly transitions from black to white. 8 bit color means only 256 shades of luminance, which would quickly show banding artifacts.

In fact, the entire reason we gamma encode images is because 256 is too small for even a low dynamic range monitor. Gamma encoding is a clever technique to give us more bits to represent shadows, which humans are more sensitive to than highlights.
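
One way to see the "more bits for shadows" effect is to count how many of the 256 codes land in the darkest 1% of linear luminance. This sketch assumes a pure 2.2 power law (real sRGB has a small linear segment near black that is ignored here), and the helper names are hypothetical:

```python
def decode_linear(code: int) -> float:
    """8-bit code read as a plain linear fraction of full brightness."""
    return code / 255.0


def decode_gamma(code: int, gamma: float = 2.2) -> float:
    """8-bit code read as gamma-encoded, converted back to linear light."""
    return (code / 255.0) ** gamma


def codes_below(threshold: float, decode) -> int:
    """How many of the 256 codes decode to a linear value at or below threshold."""
    return sum(1 for code in range(256) if decode(code) <= threshold)


shadow_limit = 0.01   # the darkest 1% of the linear range
linear_codes = codes_below(shadow_limit, decode_linear)
gamma_codes = codes_below(shadow_limit, decode_gamma)
print(linear_codes, gamma_codes)
```

Linear storage spends only 3 codes on that darkest 1%, while the gamma curve spends 32, which is why dark gradients band so much less when gamma encoded.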

Right, but the contrast range 1:1,000,000 - that's a real range isn't it? Not an integral one? So why does it ideally need to be represented with a particular quantisation of 1,000,000 different states? I mean there's not 1,000,000 different values between the real numbers 1 and 1,000,000 are there? There's an infinite number. There's only 1,000,000 values (20 bits) if you decide to quantise on each integer. Why do you need to do that? Why not more? Or less? What's special about the integral values on the number line?

Because 1 defines the lowest value you want to see. 20 bit is the minimum number of bits you need (in linear light) if you don't want noise in the darkest parts of your image

There are an infinite number of luminance levels between a contrast of 1 to 1,000,000, right? You will need 20 bits if a human can distinguish 1,000,000 levels between those two extremes. Is that the case? Maybe a human can distinguish 2,000,000 levels between 1 and 1,000,000, in which case you would need 21 bits.

You can't take a real range and give a number of bits needed to represent it without also stating what resolution you need. And I think it's unlikely that the resolution that a human can perceive is coincidentally the same as the factor of one extreme to the other.

This was certainly not meant to be a crucial piece of information, but sure, let's get into it.

Much of the post comes from a general assumption that the goal of computer graphics is primarily to replicate how a camera sees the world around it. Thus I think it's easiest to start from the idea of real world light entering a digital image sensor. Light in this setting is not continuous! Each subpixel in an image sensor acts as a photon counter. One photon hits the sensor, the count ticks up by one. There's no question of being able to perceive the values between 1 and 2 because they don't even exist. Either the sensor counted one photon or two. If you were going to literally create a digital camera that can process the entire world, you need 20 bits to count up to a million photons without losing any along the way. So if you were to build the hypothetical rendering pipeline that works on "real world" data about the scene, that 20 bit value would be the input.

As a practical matter, nearly all modern games store lighting levels internally in floating point, in arbitrary units chosen by the developers. Lighting pipelines are not integer based, but they're linear and not perceptual. The conversion to perceptual 8 bit (gamma curve) happens as part of the tone map stage. Doing things in floating point physical units is a better idea than the photon counter anyway, but the line you're confused about was really written with idealized cameras in mind.

(Technically an image sensor is an analog device and the voltage increases with each photon detection by an increment that is subject to noise of all sorts and pre-amplification before hitting ADC. Don't jump me on the photon counter thing.)

> Technically an image sensor is an analog device and the voltage increases with each photon detection by an increment that is subject to noise of all sorts and pre-amplification before hitting ADC. Don't jump me on the photon counter thing.

But it's rather important, and not even for those reasons. This is not meant to jump on you! (and I really loved your article, like you said it's not a crucial point at all)

The 1:1000000 or 1:2^20 contrast ratio only corresponds to exactly 20 bits if the 1 on the low end of this ratio corresponds to exactly one discrete unit of light (photon). If it's off by a factor of 0.5, 1.618 or whatever, that's what the whole argument is about.

First, the sensor counts not photons but a value relating to photons per second (because exposure time). If the 1 on the low end of the range corresponds to some exact minimum number of photons, it's going to be "one discrete unit of light per <exposure time>". Making the whole thing analog from the start.

Second, those sensors most probably aren't able to count individual photons anyway[0]. The human eye, after about 30 minutes to get optimally adjusted to darkness, can sort of perceive individual photons, or small bunches of maybe 2 or 3, kind of. Those barely-perceptible specks of light in the utter darkness aren't the sort of resolution issues we're worrying about in the dark end of these types of scenes. And, as soon as you make a light source that can be described as "emitting single photons" in a certain context, you get uncertainty effects and all that quantum jazz (show me a photon/path tracer renderer that gets the slit experiment correct[1] and you can have your integer photon counters :) ).

So the sensor output values can (and should), for all intents and purposes, be assumed to be an analogue value.

The amount of bits you represent it with just puts an upper bound on your signal-to-noise ratio (as per Shannon entropy). But since we're dealing with 2D images, the distribution of this noise over the spatial resolution (either as a result of sensor noise at the input or explicit noise shaping dithering at the end of the pipeline) also comes into play when considering the quality of the output.

If the signal-to-noise ratio of a sensor output happens to allow for 20 bits of precision, for a sensor that happens to have a 1:2^20 brightness range, that's coincidental. Sure it correlates because higher-end sensors tend to perform better in both range and SNR. But I don't believe that the 1 on the very low end of a discrete range represents precisely one photon per <power-of-two times minimum exposure time>.

[0] correct me if I'm wrong about this btw. There might be specialized scientific equipment that can, but I doubt even high-end cameras bother to go to the accuracy of single photons. But, I mostly know about digital signal processing, not about state of the art of camera hardware. Yet even if they are able to detect individual photons, that's going to be a probabilistic and per-unit-of-time measurement, so the rest of my argument holds.

[1] these probably exist, but aren't used for games or photorealistic rendering purposes

I think you are right about the "infinite scale" thing but also that the bit count is correct up to a multiplication coefficient, given a specific resolution you are trying to achieve.

To put it in mathematical terms, there is no reason to use the log base 2, but the formula is definitely a logarithm. So, again, 20 bits is a minimum bit count conveniently used to compare with other situations (e.g. "if you are not shooting in direct sunlight, you will only need 12 bits minimum, not 20, so this method of shooting is OK") and to make decisions about your color pipeline.

   You will need 20 bits if a human can distinguish 1,000,000 levels between those two extremes
It's not about what we can perceive but about how we store and manipulate amounts of light in our pipelines.

You still have quantization noise at 20 bits. It's just less than at 8.

Exactly - it's a real range! You always have quantisation noise. What I'm asking is why is 20 bits the correct minimum resolution? One extreme being a factor of 1,000,000 to the other seems irrelevant for determining the correct minimum resolution.

You can represent a range of 0 to 1,000,000 with a single bit. It's just that you can't do much with that representation :P

So, for display purposes, 8 bits (with a non-linear mapping) has been sufficient for the brightness range of TVs and monitors until recently. Now we have displays going High Dynamic Range. If you tried to display a 10x wider range of brightness using the old 8-bit format, you'd start to see a lot of ugly banding (posterization), much more than you have seen before, because each of those 256 flat steps would be associated with a much larger section of the brightness curve. You'll need at least a couple more bits to keep your brighter and darker image looking as smooth as the mid-range-only images did in the past.

For storage purposes, the 16-bit per channel floating point OpenEXR format has been used by the film industry as an intermediate format between image processing tools for quite some time now. I think that demonstrates that 16 bit floats are sufficient for storage.

For computational purposes, 16 bit floats can build up a lot of numerical error very quickly. It's OK for a small bit of math (moderate real-time shaders), but when things get complicated, it's necessary to go full 32-bit float.

> You can represent a range of 0 to 1,000,000 with a single bit. It's just that you can't do much with that representation :P

Tell that to D/A converters everywhere.

> If you tried to display a 10x wider range of brightness using the old 8-bit format, you'd start to see a lot of ugly banding (posterization) much more than you have seen before because--

Because you didn't dither, is why.

The only thing you lose with fewer bits is SNR, and thanks to the wonders of noise-shaping you can pretty much decide where to put this noise. And I'm just going to make a very broad statement that ever since we managed to climb out of 320x200 resolutions, spatial resolution is where you should put it, not in banding. Maybe even one or two bits in the time resolution if you want to get fancy about it.

Basic dithering can be as easy as just adding a uniform 0..1 random number just before truncation. Triangle noise (difference of two uniforms) can give a bit better visual accuracy for graphics, but has one or two snags that take attention to get right.
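
For anyone who wants to try the two dithers described above, here is a minimal sketch in plain Python. The function names, the 3-bit target and the test value 0.33 are all just assumptions for illustration. It only checks that the average of many dithered samples lands on the value that plain rounding throws away; it does no spatial noise shaping.

```python
import math
import random

def quantize(value, levels, dither="none", rng=random):
    """Quantize value in [0, 1] to an integer level in [0, levels - 1]."""
    scaled = value * (levels - 1)
    if dither == "uniform":
        # Uniform 0..1 noise, then truncate.
        scaled += rng.random()
    elif dither == "triangle":
        # Difference of two uniforms gives triangular noise in (-1, 1);
        # the extra 0.5 turns the truncation below into rounding.
        scaled += rng.random() - rng.random() + 0.5
    else:
        # No dither: plain rounding to the nearest level.
        scaled += 0.5
    return max(0, min(levels - 1, math.floor(scaled)))

def mean_level(value, levels, dither, samples=20000, seed=1):
    """Average quantized level over many samples (hypothetical helper)."""
    rng = random.Random(seed)
    total = sum(quantize(value, levels, dither, rng) for _ in range(samples))
    return total / samples

target = 0.33 * 7  # ideal, unquantized level for an 8-level output
for mode in ("none", "uniform", "triangle"):
    print(mode, round(mean_level(0.33, 8, mode), 3), "target", round(target, 3))
```

Without dither every sample lands on level 2; with either dither the average sits close to 2.31. The clamp at the ends of the range biases the triangle variant near pure black and pure white, which may be one of the snags mentioned above.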

> For computational purposes, 16 bit floats can build up a lot of numerical error very quickly. It's OK for a small bit of math (moderate real-time shaders), but when things get complicated, it's necessary to go full 32-bit float.

What kind of graphics calculation would build up enough error to matter? You don't exactly need to do A + 10000 - 10000 in a rendering pipeline.

If you're summing huge amounts of numbers in a raytracer it's appropriate. Where else?

You're completely right to question that assumption. It's very much possible to quantify the amount of error, or even just ballpark it. But numerical math is hard! :)

And .. even a raytracer doesn't sum that many numbers, maybe a few thousand[0]? So even then it's still a very reasonable question to ask.

[0] assuming a Monte Carlo pathtracer type of thing with 5-10 bounces or so. BTW, do you think you'd ever need more than five? Unless you're rendering close-to-parallel reflective surfaces, but the appearance of those is often as confusing in real life as it is when rendered correctly.
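
To ballpark the "summing a few thousand numbers" case, here is a small sketch that emulates half-precision accumulation using only the standard library (`struct` format `e` is IEEE 754 binary16). The helper names are made up, and summing 4096 ones is a deliberately bad case, not a measurement of any real renderer.

```python
import struct

def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def half_sum(values):
    """Sum sequentially, rounding the running total to half precision each step."""
    total = 0.0
    for v in values:
        total = to_half(total + to_half(v))
    return total

samples = [1.0] * 4096
print(half_sum(samples))  # 2048.0: above 2048 the gap between adjacent halfs is 2
print(sum(samples))       # 4096.0 in double precision
```

So a naive running sum in 16-bit floats can stall well before "a few thousand" samples. That says nothing about per-pixel shader math that only chains a handful of operations, which is the case the question above is really about.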

It's less than at 8, which makes previously too-dark parts of the image acceptable. But I agree with the "infinite scale" argument.

I think O.P. understands that, since he wrote "You can encode a range using any number of bits you want, but if you use fewer bits the resolution within may be lower." Applying a curve solves high dynamic range but lowers the resolution.

> Imagine a gradient that smoothly transitions from black to white. 8 bit color means only 256 shades of luminance, which would quickly show banding artifacts.

Only if you don't dither. Which is like free bits of precision given today's monitor resolutions.

I don't disagree with the rest of your point (although I do believe that representing a 20-bit dynamic range with 20 bits of precision, just because the range is 20 bits, is an arbitrary choice, and not even the best one, IMO).

Just want to emphasize that, a statement being made all over this thread, that "less bits = you gonna get banding", just means you're doing it wrong, thinking about it wrong, and might therefore even end up with banding if you had enough bits but wasted them at some point in the pipeline.

I'll try to explain it the way I understand it, which might be completely wrong.

Imagine you have a lightbulb (producing let's say max 1000 units of light/brightness [which are two subtly different things]) connected to a dimmer that has for example 8 discrete levels. What would you say the dynamic range of this setup would be? Obviously measuring the contrast ratio from when the dimmer is in the zero setting (=light off) to max setting would result in an infinite contrast ratio/dynamic range. That is not very useful though, so the next best thing is to measure the brightness difference from the lowest "on" setting ("1") to max, which actually gives you a reasonable number, in this scenario 125:1000 or 1:8. I hope this example was illuminating.

So I guess the idea is that bit depth fundamentally determines what is the minimum brightness representable without going to zero (where be dragons), and as such determines the max usable dynamic range.

That still doesn't make any sense.

You could have a dimmer that goes from one level of lightness, up to another, with 8 steps in between. And you could have another dimmer that goes between exactly the same two levels of lightness. The same range. but with 1,000,000 steps in between. The range of the two dimmer switches is the same. They both go from the same minimum to the same maximum. Just the number of steps in between is different. The number of bits represents the number of steps.

Why is 1,000,000 steps the required number? 1,000,000 is just the number of integral factors in the range in base 10. But why is an integral factor more important than a fractional one? They could both be perceptible to a human. The range is the same in both cases, and the given explanation was given that the range was so high.

> And you could have another dimmer that goes between exactly the same two levels of lightness. The same range. but with 1,000,000 steps in between. The range of the two dimmer switches is the same. They both go from the same minimum to the same maximum.

The thing is that you can't set the minimum value arbitrarily, because the scale is anchored to zero. So "1" is always one step above zero, and the step size is a function of bit depth.

So going with our example here, a dimmer with 1000000 steps would not have the same range as a dimmer with 8 steps, because the minimum brightness of one is 1/1000000th of max, and 1/8th of the max for the other (assuming linear scale for simplicity's sake, the principle applies to nonlinear scales too)

I suppose a more engineery explanation would be that we allocate "0" to represent everything below the noise floor and whatever maximum value to represent saturation, which leaves us with [1,max-1] as the useful range for our signal. And the dynamic range is in respect to that, instead of [0,max]

Another way to explain it is that for every two dimmer settings you can create, there is another luminance value between those two settings that you could have as an additional setting if you wanted. When do you stop adding new settings? Why stop when you have a million settings? Why is that a magic number?

At a certain point you'll hit the physical quantization of light, so that may be where the number comes from.

Assume a linear scale, and assume the 1,000,000:1 range is between the highest luminance (1,000,000) to the lowest luminance (1) that isn't pure black.

(Pure black might be zero, but that's useless for computing a ratio, so it doesn't give you a resolution.)

Then you need 20 bits, because log_2 1e6 (about 19.93) rounds up to 20 (equivalently, 2^19 < 1,000,000 <= 2^20).

If you used only 8 bits, the ratio between highest and lowest (non-zero) luminance would be only 256:1, or 1,000,000:3,906.
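
The arithmetic in this comment is easy to check. This is a sketch under the same assumptions the comment makes (linear scale, lowest non-zero code is one step); `bits_for_ratio` is a made-up helper name, and it follows the thread's convention of treating 8 bits as a 256:1 ratio.

```python
import math

def bits_for_ratio(ratio):
    """Bits needed so that unit steps on a linear scale span ratio:1."""
    return math.ceil(math.log2(ratio))

print(math.log2(1_000_000))       # about 19.93
print(bits_for_ratio(1_000_000))  # 20
print(1_000_000 / 2 ** 8)         # 3906.25: smallest non-zero step with 8 bits
```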

20 bits lets you represent more values between n and 1,000,000 times n than 8 bits does, but it still doesn't let you represent them all, because these are real numbers and there are an infinite number of values between n and 1,000,000 times n.

20 bits lets you represent all the values which are integral in base 10.... but so what? Why does the fact that a value is integral in a particular base make it the minimum resolution difference that you need to represent? Why do you need to quantise at that level? There is a luminance of 1.5 times n, isn't there? But 20 bits doesn't let me represent that. 21 bits would. See what I'm saying?

I understand you, I had the same thoughts as you in the past.

The fact is that if you want to have good highlights (beautiful clouds, candles, sun glares etc.) in your image when shooting with a camera, you have to "expose to the right", i.e. if the scene is more dynamic => reduce the amount of light coming into the camera => your shadows will be crushed down and so less precise => you need better color resolution if you want to "save" your shadows OR you need to use another specific colorspace (S log shaped)

    It's a problem of presenting that contrast ratio on a screen
    that is not capable of a contrast ratio of 1,000,000:1
If you have such a high contrast ratio, it also means you have discretized the amount of light into at least that many levels. Talking about the minimum number of bits needed does indeed assume that you are storing and manipulating light linearly (an assumption that does not hold for most 8 bpc RGB values).

It's a bit mindblowing, but I'm pretty sure that every single person who has responded to you doesn't understand what you're asking and doesn't understand how to answer it.

The short answer is: You're right that it's wrong to say that you need 20 bits to cover the range. Because clearly you can cover the range with 10 bits or 100 bits by subdividing differently. More bits is better, but there's no rule of required multiplication of bits for fabricated metrics like integral brightness multiples. 20 bits is just required to cover the range with integers without skipping any, which is an entirely arbitrary decision.

The long answer is:

The reason you want more bits to represent the larger range is exactly as you say because one doesn't actually care about _range_ but rather about the utilization of increments within the range. Now I'm going to describe something that you clearly already intuit for completeness...

Say, for example, that you have only 3 steps (let's call them 1, 2, and 3) in your range instead of a million, but you want to represent the full range of darkest dark to brightest bright with those 3 steps (for simplicity you could also be thinking in black and white instead of color). Now, you take a picture of a normal indoor environment with some gentle indirect light, and maybe all of the external values would be, on the million range, between 350k and 650k. If you linearly divide your range onto 3 steps, when you look at your resulting image you will have a flat solid block of 2s with no 1s or 3s at all. And yet your original setting had an absolutely massive 300k spread. But it didn't have significant brights or significant darks, because you were indoors with some indirect light and there was no fireball sun or deep cavern of despair in the image, so you didn't measure any differentiation in values.

So you decide to switch it up and go inside a dark cathedral with some wonderful stained glass windows (everyone loves cathedrals with stained glass when looking at HDR). This time you do the same thing, but instead of measuring all 2s, you measure all 1s with a chunk of all 3s for the sunlit windows, but no 2s at all.

So then you say, "What I should do is take some of those 3s and some of those 1s and pretend that they're actually 2s when differentiation is too high, and take some of those 2s and split them into 1s and 3s when differentiation is too low! That way I'll see more details, because there will be more efficient utilization of the segments of my available output spectrum!"

But _then_ you say, "Oh crap! I can't do that! I didn't actually capture the differences! I captured what I captured, and what I captured was the aforementioned terrible undifferentiated data!"

So the reason you want to be able to represent values on a _densely_ _segmented_ range is just so that you can differentiate things that are only slightly different in absolute overall term (think photon counts, I guess) because aesthetically those slight differences matter. It means you can do things like know when some of your 2s should become 1s and 3s, because the 2s aren't actually 2s anymore; they're a bunch of tiny fractional variations between 1.5 and 2.5, and that's data you can work with.

You can perform tone mapping with any number of bits. If you record 10 increments instead of only 3, you can obviously convert those 10 down to 3 in a way that gives a better image than just starting with 3. And the more bits you have, the more linearly you can record the environment without losing the differentiability between things that are in absolute terms on a large scale very similar. And then you get to do fun things like add a light bloom effect around something that is _very_ bright compared to its surroundings without losing details within the very bright thing and without losing details within the very dark surroundings.

Having more representational bits just means you can tone map scenes that have really bright parts, really dark parts, and really medium parts without losing any of the details inside those parts.
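
The 3-step thought experiment above can be run directly. This is only a sketch: the five sample values, the 1024-step "fine" capture and the min/max stretch standing in for a tone map are all assumptions chosen to mirror the indoor example, not a real tone mapping operator.

```python
def quantize_linear(value, lo, hi, levels):
    """Map value in [lo, hi] to an integer step in 1..levels (linear, clamped)."""
    t = (value - lo) / (hi - lo)
    step = int(t * levels) + 1
    return max(1, min(levels, step))

FULL_RANGE = (0, 1_000_000)
indoor = [350_000, 420_000, 500_000, 580_000, 650_000]

# Capture straight into 3 steps: everything collapses to the middle step.
coarse = [quantize_linear(v, *FULL_RANGE, 3) for v in indoor]

# Capture into 1024 steps first, then stretch what was captured onto 3 steps.
fine = [quantize_linear(v, *FULL_RANGE, 1024) for v in indoor]
remapped = [quantize_linear(s, min(fine), max(fine), 3) for s in fine]

print(coarse)    # [2, 2, 2, 2, 2]
print(remapped)  # [1, 1, 2, 3, 3]
```

The second result uses all three output steps only because the differences survived the capture; applying the same stretch to `coarse` has nothing left to work with.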

I feel like this post does nothing in terms of educating readers. It all sounds very interesting, but I didn't learn anything. It alternates between putting the blame on technical limitations and ignorance, so I would have liked to see a suggestion on how to address either of these.

It's a well-written and informative post, it deserves a little more praise than "does nothing in terms of educating readers".

What he is discussing is about aesthetics. There are two things at play here: the big studio business model relies on making blockbusters, so you pack together a few hundred devs and artists and try to pop out the biggest game you can. The other is that many devs are technically trained but not especially artistically educated. They have a limited understanding of lighting, color, framing a scene, etc.

Someone who has made some great criticisms of this as well as provided real solutions is Eskil Steenberg.

The article isn't very concise and is actually quite repetitive, it uses a bunch of jargon (without any description), and shifts blame around.

The topics are interesting and the opinions are there, but it certainly could be written better.

As a programmer who has done a small amount of work in graphics but hasn't worked in HDR rendering I have no idea what any of these are....

    film LUT

    color grading

    post process LUT color grade
...but I could write the linked-to shader. So who is he helping?

> It alternates between putting the blame on technical limitations and ignorance, so I would have liked to see a suggestion on how to address either of these.

I second this.

CLUT = color look-up table

color grading = adjusting the color in an image, as you might do in Photoshop or whatever

post process = any image adjustment that happens after the image is originally captured

This author is assuming a basic background in lighting / human vision / film technology.

Yes, I understand the acronyms, but they are being used as jargon, not in an instructive way.

If I were to talk about a web framework and say, "we've got our JSON files wrong and our package configs are cluttered, and don't get me started on the XML files," it's clear that it is metonymy [1].

Knowing that JSON is JavaScript Object Notation, the package.json is used for Node Package Manager, and that XML is a file format doesn't capture the feeling that a developer might have reading that sentence, and it doesn't instruct.

[1] https://en.wikipedia.org/wiki/Metonymy

From the article:

>>> It was Reinhard in years past, then it was Hable, now it’s ACES RRT. And it’s stop #1 on the train of Why does every game this year look exactly the goddamn same?

If the problem the author is facing is a gap between art skills and technical skills, then the post could be less angry and more informative about color theory or rendering technology, maybe bridging the needed skills.

If you were writing for an audience of people who work with web frameworks all the time, writing JSON and XML and whatever would be entirely reasonable.

Though if you look to the sidebar, you can see the author’s recent tweets:

> Tonight in "alarming discoveries" is that I went into my WP stats to find heavy traffic from Reddit. From r/pcgaming no less - blech.

> Practical consequence is that I'm going to have to write future entries from more basic principles, because this is getting out of hand.

So maybe you’ll get your wish.

The title of the blog is literally "vent[ing] space".

Sure. I feel like this discussion has drifted away from the original comment and reply. Great vent, mediocre article.

> adjusting the color in an image, as you might do in Photoshop or whatever

Photoshop is a very basic tool for this. Color grading tools used in the industry are far more advanced than the options offered in PS, take a peek at Resolve (free).

I’m just defining a term, hence the “or whatever”.

But personally I find Photoshop to be more flexible and more capable than any of these film tools I’ve looked at, none of which seem to be capable of the kind of (very non-standard) workflow I use in Photoshop. Photoshop tools (layers, masks, adjustment layers [especially curves, and more recently full color lookup tables], blend modes) can be combined in very sophisticated ways which can be set up via a macro action, whereas most of these other tools are a bit more prescriptive. For example I can build my own “hue vs. hue” tool (from the Resolve marketing video) out of more basic Photoshop building blocks, and set up an action to pop one up with a keystroke. To be fair Photoshop also has piles of highly constrained single-purpose tools, most of which are outdated and relatively useless.

But none of these tools was ever designed with a user in mind who would think abstractly about building blocks and means of combination. More abstract and generic functional-boxes + arrows kind of interface could probably do better from a flexibility perspective, but can be a pain in the butt to use.

Someone with a color science, user interface design, and art background and a lot of time to experiment who was trying to make a tool for professionals could also certainly do a heck of a lot better. Most of the individual user interface components in these tools [including photoshop] are pretty uncreative and inflexible. Maybe this will be me in a few years when I have some more time, we’ll see.

Admittedly I deal with still images, not video.

Resolve specifically and PS operate differently. Resolve's color correction is node-based; nodes stack by coefficients, not output (like layers generally do). The usual example to show this off is to drown the blacks in one node and pull them back up in the next, without any loss of detail, since Resolve combines the coefficients, not the outputs, of the transformations.
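
One way to read "combines the coefficients, not the outputs" is sketched below. This is an illustration of the general idea only, not Resolve's or Photoshop's actual implementation: each correction is assumed to be a simple (gain, offset) pair, and the "layered" path is assumed to store a clamped 8-bit result after every step.

```python
def apply_affine(pixel, gain, offset):
    """Apply one correction and store the result as a clamped 8-bit value."""
    return max(0, min(255, round(pixel * gain + offset)))

def compose(first, second):
    """Combine two (gain, offset) corrections into one equivalent correction."""
    g1, o1 = first
    g2, o2 = second
    return (g1 * g2, o1 * g2 + o2)

crush = (1.0, -100.0)  # drown the blacks
lift = (1.0, 100.0)    # pull them back up
shadows = [10, 40, 70, 100, 130]

# Output-by-output: the clamp after the first step destroys the shadow detail.
layered = [apply_affine(apply_affine(p, *crush), *lift) for p in shadows]

# Coefficient-by-coefficient: the two corrections cancel before touching pixels.
combined = [apply_affine(p, *compose(crush, lift)) for p in shadows]

print(layered)   # [100, 100, 100, 100, 130]
print(combined)  # [10, 40, 70, 100, 130]
```

A pipeline that keeps unclamped floating-point intermediates would get the second result without composing anything, so this sketch does not settle which mechanism a given tool uses.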

Resolve lets you arrange your layers in a graph instead of a linear stack. It also lets a later layer bring back detail that was removed in a previous layer, while Photoshop doesn't. Here's an example: https://youtu.be/YxUoW5_gMjQ

That video shows Resolve to be a pretty capable tool. (Can the nodes take custom user-specified logic in them, e.g. some GPU shader or the like?)

But everything shown in that video (other than motion tracking) can be done step for step equivalently in Photoshop, albeit some bits take some nonstandard combinations of tools. Some of it would probably be more convenient in PS to achieve via a different method.

I don’t exactly understand what the compositing order is of “parallel nodes”, but I’m quite convinced I could mimic the effect almost precisely given a technical description.

The nonlinear ordering you mention could be done using smart objects, but that can get annoying, so it’s usually easier in practice to just duplicate the layer (which breaks the link if you change something up the compositing chain later).

I think the audience is other games graphics programmers and technical artists who can be expected to understand all this terminology. As a game graphics programmer the article was written appropriately assuming I'm representative of his intended audience (which is implied by his use of 'we').

The article is intended for graphics engineers. All those terms are pretty fundamental and something I expect a graphics engineer working even ~6 months professionally to fully grasp.

Have you considered that you're not the target audience?

Link please to Eskil Steenberg's writings!

That entire presentation is amazing. I recommend it every chance I get.

Thank you! Eskil is such a wizard.

I thought it was a great post, but you're likely not the intended audience. At the very least, you probably need to understand HDR as a technique and the fundamentals of tone-mapping.

I thought there were a few very clear suggestions:

1. Pick your tone mapping early in the development process.

2. Look into the models used by the camera / film industry, because they're clearly getting better results.

It's not a complete discussion, but as a student in computer graphics, I found it very informative. The title says Part 1, so presumably there will be more. The links were also all very good, too.

There's a few links to other articles that go into greater detail about that, and from there a whole rabbit-hole of more things :)

Artstyle is far far more important than graphical capability.

A game like Team Fortress 2, released in 2007 (!) looks so much better than many modern games because there is a coherence and style to the art. It's not "HD" for the sake of "HD."

We're in a period where the graphical canvas is getting larger every year, and the temptation is to fill it with as much color and pop as possible. But some restraint really works wonders.

> We're in a period where the graphical canvas is getting larger every year, and the temptation is to fill it with as much color and pop as possible.

It's been a while, but last I looked, it seemed that the temptation was to use ever muddier and desaturated visuals, with as much glare and shiny surfaces as possible.

It comes and goes in phases according to the available technology. The first Quake was all muddy brown and gray. Quake 2 and Unreal both introduced color lighting and looked like a disco on Saturday night. Mid-2000s games added a lot of new lighting and post-processing effects but they had limitations causing harsh shadows and highlights, hence another round of gray/brown photorealism came through.

But in this decade things are finally feeling more evened out. Lighting models are sophisticated enough to allow for designs similar to a film set or photo shoot, and post-processing is getting past basic glows and filters and into a spectrum of quality/performance tradeoffs.

Of course, the games that don't aim for photorealism always age better. That's been the case since people started digitizing photographs for games.

What's funny is that it looks like Valve lowered the polycount and quality of TF2, likely to accommodate the user-added stuff like hats--which seems in the spirit of stuff this article is upset about.


I think this is a great point. It's why even cartoon-art games like Super Mario Brothers (1985) can age really well visually while games that put so much effort into graphics still age poorly.

If the main goal in a game is to make the graphics of yesterday look and feel obsolete, then that game will probably look and feel obsolete tomorrow...

TF2 has nice stylized models, but IMO the in game graphics are dated and not particularly special compared to newer genre examples like Overwatch, Splatoon, or Destiny.

I feel TF2 has a place in people's hearts so it keeps getting wheeled out in this debate way more times than it deserves these days (especially considering Valve compromised both the art direction and coherence in recent years).

All 3 examples here are excellent. I know a lot of people hate on Destiny, but the art direction, environmental art, and creature design are all magical in terms of tech and results. It honestly feels like concept art brought to life, and anyone dismissing it without even giving it a chance is missing out on some seriously impressive visuals resting perfectly between realistic and stylistic.

Everyone commenting here that they don't care about photorealism in their games is missing the point. The example games (and movie) given at the start of the article lack an appearance of photorealism not because they were going for something else, but because they either didn't pay enough attention to the HDR rendering, or else they didn't have the expertise to do it right. The article also gives examples of games that intentionally went with a non-photorealistic look to achieve a specific effect. So the point is not "these games look bad because they aren't photorealistic", it's "these games look bad because their HDR rendering is bad".

I don't know enough myself about HDR rendering to have an opinion on whether the author is right or wrong about any of this, but people arguing that the author is wrong because games shouldn't look photorealistic are attacking a straw man.

The author doesn't appear to have bothered to investigate or honestly ask WHY games are the way they are, though. They are just blindly attacking an entire industry on basically the claim that the entire industry is stupid.

But the industry isn't stupid, and some of the "over-contrasty garbage" is intentional to help people rapidly identify the things that they need to identify. Being able to quickly spot the things that you need to play the game is more important than sensible tone maps and contrast curves.

This is not a medium you observe like film & paintings are, it's a medium you interact with, and rapidly at that. Approaching the visuals with the same perspective as something you only observe is a fundamentally flawed approach.

This article is packed with baseless assertions, unfounded claims, and offhanded but not investigated technical limitations. Hell, right after saying "partly due to technical limitations" the author proceeds to throw out a "nobody in the game industry has an understanding of film chemistry" snark. The author doesn't even bother to elaborate on the technical limitations that they remarked on, opting instead to just hurl insults for no reason.

I recognize the author of this post just from the logo on his site. He worked in the industry and has spent years sharing his knowledge on gamedev.net. This is not an attack on the industry. He is speaking to his peers.

> [...] some of the "over-contrasty garbage" is intentional to help people rapidly identify the things that they need to identify. Being able to quickly spot the things that you need to play the game is more important than sensible tone maps and contrast curves.

Is this speculation or were you involved in these discussions? Or, maybe a GDC talk? For all I know, you could be the lead graphics programmer for Naughty Dog, in which case I'd totally take your word for it.

And, actually, I really would like to hear more specific criticisms. As much as I respect Promit, he absolutely could be wrong and a detailed breakdown from someone who knows better would be informative.

I was hoping that the author would go into how to do tone mapping correctly.

> ask WHY games are the way they are, though.

To add to your point, the main reason for "bad HDR" in FPS games is for simulating eye adaptation. Screens don't have a 1,000,000:1 contrast ratio and can't shine as bright as the sun (we wouldn't want them to). This means that games are forced to approximate this effect - inevitably resulting in shortcomings.

Furthermore, game development is a race against time. You have milliseconds to render the scene and apply tone mapping. It may very well be mathematically possible to create automated tone mapping that would make a cinematographer proud, but can you make an algorithm that completes in less than 1 millisecond? I doubt it.

To my point, there are mods on PC that replace the tone mapper (I think ReShade does this) and they absolutely destroy your framerate - however the results are absolutely gorgeous if you know what you're doing.

> To add to your point, the main reason for "bad HDR" in FPS games is for simulating eye adaptation...

How is that different from what photographs do?

> Furthermore, game development is a race against time. You have milliseconds to render the scene and apply tone mapping.

Maybe I know just enough to be ignorant of the topic, but LUTs are just lookup tables. People working in film apply LUTs to the footage they're watching in realtime on normal PC hardware. Fragment shaders have been common in games for years. Last year I was working on a VR project (targeting 11ms per frame) and I think the post-processing took between 0.25 and 1.5ms (any more and it would have been dropped). For a desktop game, having 16 or 30ms to render many fewer pixels is an eternity, comparatively. The only difference between a LUT and an ICC profile (which most OSes support out of the box) is that ICC profiles are meant to be calibrated against either an input or output; they're used everywhere in realtime.
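
To make "LUTs are just lookup tables" concrete, here is a minimal 1D sketch. The smoothstep curve is a hypothetical stand-in for a grade, not any shipped film LUT, and real color LUTs are typically 3D (RGB in, RGB out) and sampled on the GPU rather than in Python.

```python
def build_lut(curve, size=256):
    """Precompute curve(x) for `size` evenly spaced inputs in [0, 1]."""
    return [curve(i / (size - 1)) for i in range(size)]

def apply_lut(lut, x):
    """Look up x in [0, 1] with linear interpolation between neighboring entries."""
    pos = min(max(x, 0.0), 1.0) * (len(lut) - 1)
    i = int(pos)
    j = min(i + 1, len(lut) - 1)
    frac = pos - i
    return lut[i] * (1.0 - frac) + lut[j] * frac

def example_curve(x):
    """Hypothetical S-shaped contrast curve (smoothstep), for illustration only."""
    return x * x * (3.0 - 2.0 * x)

lut = build_lut(example_curve)
print(apply_lut(lut, 0.25), example_curve(0.25))
```

The per-pixel cost is two table reads and a blend no matter how expensive the curve was to compute, which is the point being made above about the time budget.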

I've never actually used ReShade, so I had to look it up. It does way more than color correcting. Depth of field, ambient occlusion, antialiasing are completely separate from tone mapping and notoriously computationally expensive (not just in video games, but in film as well). I also think since third-parties are writing them, they're not always tuned to be performant.

> How is that different from what photographs do?

A typically developed HDR photo is a single sample following the time at which the eye has adapted. There are [hopefully] no over- or under-exposed areas.

The dynamic range in HDR games is often used to simulate the entire process of eye adaptation. It's a function over time that intentionally creates over- or under-exposure some of the time.

I still don't see why animating the exposure explains why there's "bad HDR" in games (high contrast and oversaturation). I haven't implemented it or read any papers on the subject, but the games I've played don't look that much different than auto-exposure on camcorders in the 90's (or your phone today). For actual cameras it's more difficult because you need to guess and adjust the aperture or shutter instead of having an HDR image in a buffer and deciding how to generate an LDR image to present. Calculating the exposure (especially if you're animating it and only willing to go so far in one direction or the other) should be fairly quick.
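For reference, the animated exposure being argued about can be sketched in a few lines. This is an illustration under stated assumptions, not any engine's implementation: it meters the frame with a log-average luminance, picks the exposure that would place that average at middle gray (0.18), and eases toward it with an exponential time constant. `log_average`, `target_exposure`, `adapt_exposure`, the `tau` value, and the sample luminances are all hypothetical.

```python
import math


def log_average(luminances, eps=1e-4):
    """Geometric mean of luminance; eps avoids log(0) on black pixels."""
    total = sum(math.log(eps + lum) for lum in luminances)
    return math.exp(total / len(luminances))


def target_exposure(luminances, key=0.18):
    """Exposure scale that would map the metered average to middle gray."""
    return key / log_average(luminances)


def adapt_exposure(current, target, dt, tau=1.0):
    """Ease toward the target; tau is the adaptation time constant in seconds."""
    blend = 1.0 - math.exp(-dt / tau)
    return current + (target - current) * blend


# Walking from a bright exterior into a dark cave (made-up luminances).
outdoor = [2.0, 4.0, 8.0, 16.0]
cave = [0.01, 0.02, 0.04, 0.08]

exposure = target_exposure(outdoor)
goal = target_exposure(cave)
history = []
for _ in range(120):                 # two seconds at 60 fps
    exposure = adapt_exposure(exposure, goal, dt=1.0 / 60.0)
    history.append(exposure)
```

Metering and smoothing of this kind is a handful of arithmetic operations per frame once the average is known, which supports the point that the exposure calculation itself should be quick. A larger `tau` gives the slower, steadier adaptation some commenters here prefer.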

The game industry definitely follows inward looking trends. For a long time every game was grey or brown and lacked contrast. More recently they all seem to use the same look/LUTs.

I worked in animated feature films and had similar critiques about the lighting and look. I felt like they lit everything like a well exposed photograph and were afraid of letting things fall off into dark or light (part of that could have been poor simulation of the toe or head of the film curve).

Then, when Pixar did WALL-E (2008) they hired a well-known cinematographer, Roger Deakins, as a consultant. DreamWorks also hired him for How to Train Your Dragon (2010). I think that was a big change in how both studios approached the look of CG movies.

Whatever reason modern games have for looking like shit, it's not good enough. The end effect is that they look like shit.

Maybe a little snark is good. These big studios have been pumping out bigger and louder games for a decade, and each one is more soulless than the last. There's a reason Minecraft took off the way it did. It's because it was a return to the visually appealing age when pixel colors were in some sense manually selected, resulting in a pleasant palette. I can say without a hint of irony that I prefer the look and feel of the original Doom to any modern blockbuster game.

Are you here to tell me that Minecraft and Doom are paragons of bad gameplay?

Maybe I wasn't clear. I said, basically, gameplay trumps visuals. You are in agreement that gameplay trumps visuals. If gameplay and visuals are at odds, gameplay should win. Minecraft's contrast is boosted and its colors are not accurate, and this all adds to its charm. This is in dispute of the article, which is taking the stance that tone mapping and colors should be tuned to look like a painting or film, rather than aspects that can influence gameplay.

My response here is mostly that many of the visual choices made in the games I'm criticizing were made completely independently of the game design team. This is far from ideal but it's a consequence of how the production pipeline is set up in these blockbuster titles. I do vehemently disagree that the aesthetic traits I'm looking for are at odds to gameplay and design decisions, though.

I could go on for a while about how Zelda BOTW manages to integrate the design and aesthetic choices, but I'm going to get mugged if I try to use that game as an example again. Maybe I'll use it as part of a different write-up on visual design.

Zelda BOTW has climb-anywhere gameplay, though, so it doesn't need landscape contrast to support gameplay purposes. Contrast that to something like Horizon Zero Dawn where you need the harsh ledge contrast on rock faces to distinguish between walls you can climb on and those you cannot.

Similarly combat in BOTW happens up close, you don't actually need to distinguish things at range. This is not generally true, especially for the other titles you're comparing BOTW against. BOTW also has problematic cases such as this: https://static.gamespot.com/uploads/scale_super/1552/1552458... That rock face is horrible. Totally washed out in a highly unrealistic way, to say nothing of the ridiculous saturation of the torch light.

Or, you know, the developers were trying to balance far more factors than what has to happen with a movie. Multiple platforms, not only high FPS but maintaining that FPS with disparate scene complexity, old hardware vs new, support a variety of camera distances and fov, highlighting game elements so that objects don't blend together, handling in-game cutscenes as well as actual play, having an asset pipeline that doesn't destroy your budget, etc. Not to mention players that are random monkeys that like to try to do bizarre things within your game.

I guarantee game developers and game artists know a hell of a lot about HDR and color and lighting, and all kinds of things related to it, probably more than most people involved with movies. They just come at it from a different viewpoint - that of real-time rendering.

I can confirm everything except gamer audience concerns. I don’t game. I have worked in film production camera department, post-production and now CGI. Color grading and compositing in particular are not specialties of mine, but I know the systems well, and I can confirm the technical critique in the text. This author does an impressive job of cutting out the bull and getting to the bare issues.

As a gamer, I can also confirm that I tend to have a tough time with dark areas in a lot of mainstream games. If I can get the game to realize that I'm trying to look at them (usually by getting the bright sky out of my viewport so it does the "HDR adjust" fake thing) then the darks come out of the contrast floor and I can suddenly make out the detail.

The problem with the game doing fake-HDR is that while it looks great in trailers (or... at the very least, looks like the engine is doing something fancy with lighting, "great"-ness notwithstanding) it doesn't match what my real eyes would do in a situation. Unlike film, a video game doesn't have control over where I'm going to point the camera. I do. And sometimes, I need to see over that ridge, but the detail in the shadows is important too, and the game camera can't see what I'm actually looking at (with my eyes) and adjust accordingly.

Breath of the Wild is a great counter example, because it doesn't really try to do the HDR adjust thing at all. It doesn't really need to; sure its daylight scenes are a bit washed out, but real daylight is a bit washed out too. My eyes adjust, just like they would outside, and I can actually see what I'm doing. That's important! With games, tempting though it is to control the camera effects just like you would when shooting film, you can't control the player's actions. It's usually better for gameplay to mellow out the effects, so the player can focus on their actions without being distracted by the 'pretty' effects.

> The problem with the game doing fake-HDR is that while it looks great in trailers (or... at the very least, looks like the engine is doing something fancy with lighting, "great"-ness notwithstanding) it doesn't match what my real eyes would do in a situation.

An especially egregious offender in this category is Battlefield 1. When looking out of the driver's hatch of any vehicle, the game will randomly and chaotically fade its "HDR" simulation between the exterior being discernible and a white rectangle (the hatch) set on a solid black background (the vehicle interior). The fade depends on minuscule adjustments in the PoV of the player, which is fixed relative to the outside world, not the vehicle, so you have to constantly track the view port.

Battlefield 1 generally looks best in scenes featuring low (natural) contrast.

> Yeah, and we screwed that up too. We don’t have the technical capability to run real film industry LUTs in the correct color space

(The reason being that even fairly simple color correction pipelines in a real color grading tool keep a mid-range GPU pretty busy at 1920x1080 and higher resolutions).

> I tend to have a tough time with dark areas in a lot of mainstream games. If I can get the game to realize that I'm trying to look at them (usually by getting the bright sky out of my viewport so it does the "HDR adjust" fake thing) then the darks come out of the contrast floor and I can suddenly make out the detail.

Of course, this is also how eyeballs work...

This is a valid defense! The real issue though, primarily, is that video game cameras don't emulate this well enough. My eyes don't respond nearly as fast to light changes as a video game camera does, and they're not as sensitive to small variations in the overall light level. They're also quite a bit more sensitive, especially to darkness and shadow, even in bright daylight. I find that most video game lighting exaggerates the shadows way too much.

So, adjusting the curves, especially at a slow, steady rate? That might be a good idea, and should actually aid visibility if it's done right. Say, walking all the way into a dark cave should lighten things up a bit, since your real eyes (and most cameras) would do exactly that. On paper, this part of the HDR effect works, and if it's done well it enhances the gameplay rather than distracting from it.

Imitating a faster, more jarring camera transition when it fits the mood, especially during cutscenes? That's fine, and the article gives some great examples of this being done well. Scaling up the contrast so that any HDR changes cause you to either lose all the detail in all of the darks, or all the detail in all of the lights? That's the problem. The effect, if it's there at all, needs to be a lot more subtle. This is partly due to limitations with the final RGB space (only 256 values per channel to work with and weird curves to boot) but mostly due to the fake-HDR effect just being a poor imitation of nature.
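To illustrate the output-space limitation, here is a small sketch that counts how many 8-bit codes are available for a given slice of linear light. It assumes the "weird curves" are the standard sRGB transfer function, which the comment does not actually specify; `to_8bit` and `codes_in_range` are hypothetical helper names.

```python
def srgb_encode(linear):
    """Standard sRGB transfer function: linear light [0, 1] -> encoded [0, 1]."""
    linear = min(max(linear, 0.0), 1.0)
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1.0 / 2.4) - 0.055


def to_8bit(linear):
    """Quantize to one of the codes 0..255 of an 8-bit channel."""
    return round(srgb_encode(linear) * 255)


def codes_in_range(lo, hi):
    """Number of distinct 8-bit codes covering linear values from lo to hi."""
    return to_8bit(hi) - to_8bit(lo) + 1


shadow_codes = codes_in_range(0.0, 0.01)     # darkest 1% of the linear range
highlight_codes = codes_in_range(0.5, 1.0)   # brightest half of the linear range
```

The curve spends a disproportionate share of its codes on dark values and comparatively few on the bright half of the linear range, so a tone map that pushes a lot of scene detail into either extreme has only a limited number of steps to represent it.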

I am a colorist and I also want to add that the technical details check out - minus the Red One being better than film. It was most decidedly worse in every way - but was almost as good which was the important thing. I recently was going back and looking at One and MX footage and couldn't believe how terribly it has held up, while the Alexa sensor is still great today, as are the 16 and 35 scans I work with.

That aside, I found the discussion about the misuse of ACES particularly interesting. It's like they completely misunderstood the tool and how it's intended to be used (which isn't uncommon with the tech unfortunately).

Yes, the bit about what the RED One was trying to do was a bit cringy but I thought it skirted around claiming it was better than film. The comparisons between the Alexa and RED One were very good, however. Coincidentally, I was doing work for Oakley as the RED One was being tested. After two test shoots with it, the director declared it B-cam and let a PA operate it, only because we couldn’t get the color right. This was action sports and the frame rate would have been useful. Even now, I am not sure much has changed. I don’t go on sets or handle much live footage these days but my observations have it that much of the production industry who could use it still lacks understanding of ACES. I have not worked with Alexa on any projects where I interacted with both camera and footage but my understanding is ARRI remains the king of blacks.

I've always said (as a joke) that all game developers must have bad eyesight and dirty glasses.

Everything is blurry, you can't see far, you get dirt over your vision and there are glares and lens-flares all over.

What's worse is that this is done on purpose and in almost all games. They must think that this is what real life looks like.

I rarely play games but the most recent game I played was Alien: Isolation. It had been quite a while since I played a game before that so I was surprised to see a lot of these things on by default. Specifically, chromatic aberration. Why the hell would anyone add in a setting for that? I get that they were going for the Super8/VHS type of look but it's just really distracting. I forgot what the other settings were but I turned off a lot of those junk visual effects and then things looked beautiful.

I also can't stand chromatic aberration in games - especially because it's so freakishly exaggerated compared to actual CA you would experience with old video/photo gear.

A lens that completely separates color channels is defective. It was defective in 1950 and it's defective today, so unless your goal was to replicate the look of hopelessly broken hardware, it's in no way "vintage".

Which I think goes more generally into my issue with game graphics - the taking of interesting effects (depth of field, chromatic aberration, lens flares, etc) and cranking it up to 11 for its own sake.

Epic added chromatic aberration to the new Unreal Tournament at some point and the mob started sharpening their pitchforks. The reason given afterwards was that they think it’s important to break the “sterile” cg look by roughening the image slightly. I can see their point: games are at the point where they look “photoreal”, but not like a photo. Personally, I find the almost-but-not-quite realism jarring, and I usually even prefer playing older games because of it, so I can understand why devs try all kinds of tricks to alleviate it.

They do this in many action movies (especially Michael Bay) as well. They believe the audience prefers these over-the-top effects. Based off the revenue of entire film franchises with millions of people returning for sequels of more of the same, I think they nailed what consumers want?

In the past draw distances were much worse. But those restrictions spurred creativity. For example, the miserable draw distance of the PlayStation 1 led the developers of Silent Hill to hide faraway rendering with a thick fog, which soon became an iconic theme for all Silent Hill games, even as the hardware far surpassed the need to keep it.

Tons of games in that era had dense fog for that reason.

Yes, but in Silent Hill they gave a reason for it as opposed to it just being something the player was supposed to ignore. They took the limitation and used it in-world.

The first thing I turn off in most games is Bloom. I actually do have glasses and the bloom effect in most games just looks like my glasses are greasy, no thanks.

Also, AA is another thing I generally turn off. It gives quite a performance boost and the resolution these days is high enough that I'm not bothered by it.

From part 0, they make the claim and question:

> So why the heck do most games look so bad?

And then they go on to pick a game that people are specifically calling out for looking amazing: HZD.

In fact, the problem seems to be:

> But all of them feel videogamey and none of them would pass for a film or a photograph.

To which I would reply: because they are video games. A photograph doesn't pass as film, and film doesn't pass as a photograph (even though film is essentially a lot of photographs). A video game must serve a different purpose than a photograph or a film. The players interact with the game, and therefore the intent and purpose is different. Glares on the screen aren't just there to add lens flare, but to make it harder to see because that's part of the challenge. It is harder to see, and you have to deal with that to overcome the challenge. That can be offset by keeping the sun to your back. Same with grime on the screen, or other such things. You want to highlight the things players need to react to, and in such a way that allows them to react quickly.

There is so much more, and yet so far this article is only referencing the pure look and seems to ignore the goals of the medium. I'll reserve judgement until I read the next parts of the articles, but until then, I can't help but wonder if this really matters.

Most games attempt to look realistic, and they fail. They definitely try to look like film, so comparing them to film is perfectly fair: it is an impossible standard that masochist, mannerist or hopeless game art directors have freely chosen for their projects and by which they deserve and demand to be judged.

Failure at realistic game graphics is so common, so taken for granted, that the scale of artistic accomplishment shifts downwards: if HZD looks only slightly bad, not constantly, and without disgusting the player too much, it's considered to have "amazing" graphics.

> But all of them feel videogamey and none of them would pass for a film or a photograph.

Until graphics and animation are photorealistic I prefer my games to look videogamey. All those 4 screenshots are beautiful and I wouldn't want them any other way.

I'm a fairly avid gamer, but there is a reason why my top 10 games are all fairly mediocre in the graphics department. High-end AAA games with "great" graphics usually lack depth in their mechanics, and while this isn't always true, it happens enough to keep me from enjoying them.

There is a reason why Skyrim, Mount&Blade, GTAV, etc are still dominating playtime charts years after release. Good enough graphics + deep, flexible mechanics is a much better combination in my opinion.

I'll give you Mount and Blade... but Skyrim and especially GTAV were at the high end of graphics (and budget) for open world AAA games when they were released in 2011 and 2013. They also both had very long development cycles: GTAIV was released in 2008 and Oblivion in 2006. Assuming similar development cycles, the next games in each series will be due in 2018 or 2019.

The bar for "great" graphics is going to change as new hardware comes out, but it's a stretch to say those games were not pushing rendering on their target minspec systems (Xbox 360 and PS3) at the time of release. There aren't any examples of open world games that looked better than either of those games and ran on that hardware.

I know it's fun (although tired) to jump on any thread about game graphics and say "Graphics don't matter, games with great graphics don't have good gameplay!", but sharing knowledge about how to tune and implement postprocess to get better image quality isn't hurting gameplay quality and can lead to games both looking and playing better.

> There is a reason why Skyrim, Mount&Blade, GTAV, etc are still dominating playtime charts years after release. Good enough graphics + deep, flexible mechanics is a much better combination in my opinion.

Those are all sandbox games that you effectively never stop playing. People tell me that there's a story and an endgame in Skyrim, but I've played a few hundred hours without doing any main-line quests past the first city. Mount&Blade I've sunk even more into - there isn't even the pretense of a story there, so you're pretty much on your own if you want to role-play, or just take in the enjoyment of riding around, hacking at people, and looting their corpses.

Those were just 3 of the more mainstream games with semi-limited graphics I could think of, but you're right, they do share a sandbox style. Overwatch would be another example of a game where they put "good enough" graphics together with much more refined mechanics (depending on who you ask, this last patch...eh) to achieve success.

Anyway, the point I was going for is that the last thing we need is having them spend even more effort trying to make pretty looking but mediocre playing games.

Mount and Blade, and to a lesser extent, Skyrim maybe, but I don't think GTAV fits here. It pushed its target hardware (originally PS3/Xbox 360) pretty far, to the point I wondered if it was going too far and was thus lacking in other areas (e.g. pedestrian/vehicle density).

You may have already tried it, but from this comment I think you'd really enjoy The Witcher 3.

I've been playing The Witcher 3 recently, and it actually really annoyed me with its HDR.

Yes, that sunset is pretty, but now I can't see anything else. I do not want to reproduce my frustrating commute into the sunset in a game.

While on the subject of graphics in games...

Another pet peeve of mine: ambient occlusion (crude GI approximation) and especially SSAO (crude approximation of a crude approximation).

Corners don’t look like that: http://nothings.org/gamedev/ssao/

(Ambient occlusion proper and in moderation I think looks fine, but most games really overdo it.)

Latch on to that idea -- corners don't look like that -- and follow it to its logical conclusion: no one knows how to make any fully-simulated video look indistinguishable from reality.

That's a huge leap, but it has the benefit of being true.

(To be precise: no one has ever created fully-simulated video capable of fooling human observers anywhere close to 50% of the time. The video has to be reasonably long (>30sec) and complex. But in a double-blind test, almost any nature video will handily beat any synthetic video.)

> (To be precise: no one has ever created fully-simulated video capable of fooling human observers anywhere close to 50% of the time. The video has to be reasonably long (>30sec) and complex. But in a double-blind test, almost any nature video will handily beat any synthetic video.)

Out of curiosity, is this from something? It sounds like you're citing something, and for realtime I'd agree. I think people are fooled every day by raytraced images (movie CG etc).

> no one knows how to make any fully-simulated video look indistinguishable from reality.

I'd say we know how to do it, we don't know how to do it fast.

> I'd say we know how to do it, we don't know how to do it fast.

If all I do here today is hatch an egg of doubt in your head, I would be delighted. Someone needs to carry the torch. My life has turned to other interests.

It's the central problem. It's as hard as AGI, and it might be as impactful as the invention of the airplane.

Think of it. Fully-simulated video, indistinguishable from reality.

In many ways, it was my first love. The desire to be a gamedev drove me to learn programming. I ended up a graphics dev -- Carmack's old path. My job was to make a certain game engine look better. What an innocent problem... 12 years later, I still feel the pull, the need to blow off everything in my life and bend a computer to my will. To our will. The human mind has never once achieved this goal. And it was the perfect problem... I never cared much about fame or money. But to be the first. Think of it... How can you not want to spend the rest of your life on this? The solution is out there, taunting us. Everyone is pursuing physics, when all we need to do is pursue the fact that video cameras can already generate images that look identical to real life.

The ancients have made up stories about the sky and stars since long before civilization. Put yourself in their shoes, if they even had shoes.

Look up. It's the night sky, far brighter than anything we can see today. Imagine staring up at the infinite complexity, wondering, how does it work? Why do the stars go the way they go? We tell stories; could one of the stories be right?

A few millennia later, one person had an idea: What if we watch the stars very, very carefully, and collect the stories very carefully? We could compare the movements of the stars to the stories, so that the alternative theories might be distinguished from one another.

This was the key to modern science, and the root of wisdom. When you stop thinking about what everyone else is doing, you're free to hit on solutions that everyone else overlooked.

And think of the feeling you'd get when you finally solved it. Can you imagine? You'd get the same rush as the Wright brothers, or Ford when he made the assembly line, or McCarthy when he stumbled across Lisp.

If you or anyone else intends to take on this challenge, know this:

The fact that no one believes you when you say "No one has ever done this, and no one has any idea how to do it," is your biggest advantage.

It means you're free to spend the next five years figuring out that solution that everyone else missed because they were too busy chasing the pipe dream that if you throw enough physics+time at a computer, it will produce synthetic video that fools people into thinking it's real.

The moment people realize that it's probably not that hard, you'll lose your advantage, because every top tech company will start exploring your problem space. Like if in 1902 you'd hinted to a top university the gist of Einstein's thesis. No one would take you seriously. Lucky for you.

So, what's the secret technique? Well, if I knew that, I'd have fulfilled my 12-year dream. But I know a few things that will move you (12-N) years toward the goal.

There is one rule, and one rule alone. You have to force yourself to stay true to it, or else nothing else you do will matter. Here it is:

If you get a dozen people together, and show them a mix of 10 real videos and 10 simulated videos, and those videos are reasonably complex, like clips from a nature documentary, then 12 out of 12 people will effortlessly call out your fake videos as fake and your real videos as real. It's not even close. That's how far away we are from the goal of fully-simulated video indistinguishable from reality.

Maybe I've hooked you at this point. Maybe not. But if anyone comes up with a way to fool those 12 people so completely that their responses are no better than random chance, you win.

Let's call this the "Carmack criterion." If you tried administering the above test to a dozen clones of Carmack, here's how they'd sound: "That's a fake. That one's real. Fake. Fake. Real. Real." No matter how much ornament or showmanship you throw into the video, you can't fool Carmack. He'll report whatever his eyes are telling him.

And as of 2017, he'll be right 100% of the time. His eyes would shout: "None of those fakes were even close to real! Are you kidding? One of the real videos was of a lion taking down a gazelle. I know every artist in the gamedev industry. None of them have ever produced anything approaching that level of quality, even working together."

That, and that alone, is the game. Literally nothing else matters. If you can fool people until their responses are statistically identical to RNG, you've done it. You're world-famous. Yer a wizard.
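"Statistically identical to RNG" can be made concrete with an exact binomial test. The sketch below assumes each judgment is an independent real-or-fake call with a 50% chance of being right by guessing, and that the 12 viewers each judge all 20 clips; neither detail is spelled out in the comment. `p_value_at_least` is a hypothetical helper name.

```python
from math import comb


def p_value_at_least(correct, trials, chance=0.5):
    """Probability of getting >= `correct` answers right in `trials` pure guesses."""
    return sum(
        comb(trials, k) * chance ** k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


judgments = 12 * 20                  # 12 viewers x (10 real + 10 simulated) clips

# Today's situation as described: every single call is correct.
perfect = p_value_at_least(judgments, judgments)

# The goal: viewers do no better than a coin flip (half right).
coin_flip = p_value_at_least(judgments // 2, judgments)
```

A tiny p-value like `perfect` means the viewers are clearly not guessing; the criterion is met only when the observed score yields a p-value too large to reject guessing, as in the `coin_flip` case.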

Corollary: you can use the Carmack criterion like a compass for every decision you face. Should you research physically-based rendering, or try to apply machine learning? The latter seems unpromising. Yet Hollywood has been administering the Carmack test to millions of people, most recently with Avatar, which completely rules out physically-based rendering. So we know to spend zero time on it.

As you can see, that razor is so sharp that it will cut away every illusion you might try to cling to that humans are anywhere close to achieving the Carmack criterion. Or that some smart hacker somewhere has a pretty good idea of how to achieve it, or that it's just a matter of letting computers advance another few decades, or any other false reason that those around you like to tell themselves.

But if mainstream ideas are dead-ends, then what should we research?

I hesitate to give concrete suggestions, because the history of science demonstrates that progress isn't made like that. Whatever the real solution is, it's far beyond anything you or I can imagine today. People were forced by mathematics to believe that planets' orbits were elliptical. An ellipse is the only shape that makes the numbers come out right. Yet how many of our ancestors came up with that idea? Even by accident, it's probably too bizarre for anyone to seriously consider it. Not without mathematics.

Yet that's a positive statement: It meant that if someone were audacious enough to trust in mathematics alone, they could determine the right answer. The solution was always there, waiting for you to find it.

To make any progress at all, your ideas will need to seem shockingly different. The whole world has spent two decades going over every inch of physically-based rendering -- presumably hoping that if they put on a different pair of glasses, maybe they'll spot anything other than a mountain of evidence that it doesn't work.

So you have to let yourself consider every angle, no matter how strange.

720 frames of 720p video. That's all you need. That's 30 seconds of HD footage. Get a computer to conjure up those 720 frames. Summon DaVinci's ghost, and you win.

Whenever someone finally solves this, you'll think "Oh, right. That technique makes sense." But it only makes sense because you see it works. Till then, that correct answer will seem to be a complete waste of time.

Think of Airbnb, and how awful their idea sounded. Yet when someone spent a couple years exploring the problem space, shazam! Out popped a billion-dollar company.

Since there is ~zero chance these ideas are anywhere close to the right answer, here are the two avenues I left off with:

1. A video camera generates images that pass the Carmack criterion. Ask yourself: why do those videos look real? And why is it so important to judge video, not photos? (It's crucial.)

This is key: Are you absolutely certain you should be ignoring the fact that any old camcorder's videos look real? Whip out your phone. Take a video. That video looks real. Why? Quantify the difference vs footage of the latest game engine.

(Try to avoid using the latest movies as a basis of comparison, because movies mix real-life footage into their VFX. Our criterion of "fully-simulated video" is strict by design: it keeps us honest about our progress. Especially to ourselves.)

2. After meditating for a year on why crappy cellphone videos look real, you may start thinking along the lines of "how can I write a program to mimic the essence of that realism?" It looks real because the colors are exactly right. Think of evolution, and how long we've been evolving. That whole time, every single one of our ancestors was staring at images that they believed were real. Our brains are wired to notice even a hint of strangeness. ("When we notice there's something strange about a video, what exactly is going on there? What do we mean by that?" is another "fun" question.)

Now, wouldn't it be handy if someone knew how to write a program that can mimic real-life data? If only such a technique were possible... We even have an infinite stream of pre-classified data to feed it: phones and webcams.

Hmmm. :)

You've probably looked at this, but maybe it would help to avoid thinking "out of the box", and try it more incrementally: create an extremely simple cellphone video of an "easy" scene, like a teapot sitting on linoleum or something. Then try to recreate it -perfectly- using computer graphics. Get it to the level where you can literally compare the pixels for each frame.

Maybe that could bring you closer to understanding what the important factors are that the current graphics pipeline can't do. Why is it hard to get the pixels in the simulated video be the same color as the cell phone ones? Are the materials off? The shadows? If you can't even make the teapot look real, then you've zeroed in on something fundamental that's still going to bite you when you're busy trying to rig antelope skeletons.
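The "literally compare the pixels for each frame" step could be sketched like this. It is a minimal illustration that assumes the real and rendered clips are already aligned, the same size, and stored as rows of 8-bit (R, G, B) tuples; `frame_mse`, `psnr`, and `worst_frames` are hypothetical names, and mean squared error is just one possible metric, not necessarily one that tracks what viewers perceive as fake.

```python
import math


def frame_mse(frame_a, frame_b):
    """Mean squared error over every channel of two equally sized RGB frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for px_a, px_b in zip(row_a, row_b):
            for ch_a, ch_b in zip(px_a, px_b):
                total += (ch_a - ch_b) ** 2
                count += 1
    return total / count


def psnr(mse, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the frames are identical."""
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)


def worst_frames(real_video, fake_video, top=3):
    """Indices of the frames with the largest error, the ones to inspect first."""
    errors = [frame_mse(a, b) for a, b in zip(real_video, fake_video)]
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    return order[:top]
```

Ranking frames by error gives a starting point for asking whether the mismatch comes from materials, shadows, or something else in the scene.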

There's already one standard scene / benchmark like that, the Cornell Box: https://en.wikipedia.org/wiki/Cornell_box

Of course it's ridiculously simple, so you may want to increase the bar a little to impress GP :p

Thanks for sharing!

If it gives you any sense of hope, I was actually paraphrasing Carmack himself as part of the source of "we already know how". I believe from a Quakecon Keynote (I'm not sure what year unfortunately, the context was PBR and approximations).

The real thing I'm not certain of is do we want to live in a world where we have the ability to create in realtime video content entirely indistinguishable from reality.

The entertainment and simulation benefits would obviously be amazing. The ability to recreate phenomena we can observe in science but can't see, etc.

But there is also the possibility to weaponize that. Seeing is believing, and what do we do in a future where we can't trust anything we see?

Part of me is glad we don't have to ask ourselves these questions yet.

While not video, check out any IKEA-catalogue of the past couple of years, it's roughly 50/50 photo versus CGI and you can't tell which is which. They got their photo-crew CGI training and vice versa.

> no one knows how to make any fully-simulated video look indistinguishable from reality.

Does that need qualifying with "in a reasonable time frame and/or budget"? Or is it really just that bad?

For what it's worth, I spent about a decade trying to chase down that answer.

If you bet $3,000 that "we have no idea what we're doing" is an accurate assessment, you'd win.

How can that be? Because color science is very difficult. Your eyes are designed to fool you.

When you're born into a certain time period -- a random slice of human history -- the probability the dominant school of thought is mistaken is nearly 1. Wouldn't it be remarkable if we were the first generation who figured out all the truths?

The hardest part is admitting to yourself that it might be true. Could it be possible? Has the world collectively been using techniques that are nowhere close to the final answer?

I launched myself into that question with an open mind. As far as I can tell, the answer is yes.

From having worked in the industry, it's pretty accurate to say that most graphics programmers haven't read any books on color, or the human visual system. I nearly didn't. I was dragged into it because I kept getting strange answers when I tried to mix colors and quantify the diffs -- I was trying to do the same experiment that CIE 1931 did, but I got very different results. That led me to the Munsell color system, and to the history of color theory.

If you glance over the history, you'll notice that our understanding of color keeps changing. The models keep being updated; we can never quite figure out whether they're right. If CIE was perfectly accurate, we'd never have invented LAB space, because CIE would perfectly match nature. Right?

Munsell tried a different approach. Rather than coming up with a fine-sounding theory and curve-fitting it to the data, he built a model directly from the data. One of the most powerful techniques at our disposal is to use our own eyes as a null instrument. You have to have absolute confidence in your own judgement -- I cannot overstate how easy it is to fool yourself -- but if you are as methodical as a robot, you can come up with surprising answers. When those answers contradict the established science that everyone believes, you start to worry. Maybe you weren't careful enough, right? They must know what they're doing; this is what everyone believes, after all.

No... At the end of it, you discover that it really is that strange. Color science is one of the hardest to quantify. There are hard answers, but only when you strip away all the context your eyes rely on. When you look at something, you see literally a million clues that tell your brain it's a 3D shape and that X color is brighter than Y because of Z. Nature has spent a billion years evolving your brain to be able to process all of that instantly. It's impossible to be consciously aware of everything that's happening.

The only way to answer your question is to (a) come up with a methodical test, (b) conduct it meticulously, then (c) trust in yourself and the fact that you are competent and were extremely careful.

If you do all three of those things, you will be dragged kicking and screaming to the conclusion that not one person anywhere in the world has any idea how to generate 100% synthetic video. We don't even know where to begin. No one knows even roughly what the final techniques might look like.

Think about how integral a good artist is. Every rendering pipeline in the world is built for artist flexibility. When a talented team of artists feel empowered by the tools you write for them, they end up producing a different kind of movie altogether. It's not a matter of degree. The reason movies look incredible is because artists mastered the tools we make for them. That's their role, and this is ours. Both halves are crucial.

Yet what does that imply? Imagine we invent a program that produces perfectly real video. People think it looks like a nature documentary. Now think about everything an artist does in a modern pipeline: they decide which shaders to use. Which materials to apply, and to what. The base color of everything. The shape and the animation. They arbitrarily select the physics. When grass changes color from green to brown, it's because the atoms it's made of are changing -- everything that makes light bounce off grass in a way that looks real, those are the parameters that artists change "till it looks good." It's arbitrary. It makes no sense to say with a straight face that we've created a "physically based renderer" when the artists have complete authority to break every assumption and piece of data that those physics simulations were modeled from.

The fact that they have so much flexibility is a strong hint that we are very far from mastering this. If artists' jobs were mostly identical to a set designer's job -- placing lights, arranging the scene -- then our renderer must look so real that it may as well be reality, right? If it looked perfectly real, there would be no reason to change it, except as a stylistic choice (which is fine, but it's unrelated to our goals).

Now, people will immediately try to convince you that there are engines out there that work that way. Artists are mostly set designers, they say. But all you have to do is look. Take a clip from whatever movie they produced, and put it side by side with a nature video. Then put it next to the most visually cutting-edge movie you can think of. An honest assessment will show a striking difference.

Lack of flexibility kills the art. The flexibility is the only technique we have. The fact that you can get really talented artists together and give them highly advanced tools, and they end up spitting out stuff that looks real -- it's not inherently obvious that we should've been able to invent those techniques! The fact that it's possible at all is amazing. When graphics programmers believe in the ideology of physically-based rendering, they become slaves to hubris. They start thinking it's reasonable to take away the only tools that work.

Ask yourself this: When an artist is free to flex all the parameters until it looks real, what's going on there? What does that mean, in a fundamental sense?

It's a deep question, and I still haven't come up with a complete answer. But I think it's reasonable to say that artists attempt to make the output on the screen match the output that a video camera would have recorded, if the scene were real. Yes, they make a few stylistic tweaks, but all of it still looks awesome. That's why people pay to experience it. It's partly why Star Wars was such a hit. It was believable.

And that, my friend, is the real question. Asking "Do we know how to make something look real, if only we spent enough money or CPU power on it?" turns out not to make any sense. Counterintuitive, yet true. People care about making movies or manipulating images in photoshop or making games look awesome. They don't care about wasting time trying to coax the computer into generating video that can fool an audience -- they already have a thousand techniques for fooling them! Why generate it when you can mix in actual video from the real world?

As strange as it sounds, I think the full answer is: no one realizes we have no idea how to generate video of complex scenes indistinguishable from reality in a double-blind test because there's no money in it. Not yet. If you happen to invent it, your company might make a million dollars. But you're more likely to lose a million by trying to achieve that objective.

What about scientists? Surely some of them must have spent their lives trying to answer such a deep and fundamental mystery?

Yes and no. There is a lot of impressive work out there, but scientists are mainly concerned about getting published. Their careers are at stake. If you don't publish, you can't get funding, and your impact comes to an end. And the problem is ambiguous: what does it mean to publish a paper related to the idea that people don't know how to generate synthetic images that look real? Everyone already knows that! You can't write a paper on that. The best you can do is try to come up with a paper about an incremental improvement.

And that's exactly what we see. It's all we see. Negative results in science are mostly discarded -- much of the time, we simply don't hear about them. I am speculating, but I think this would be even worse in color science: it's not very prestigious work. When you run an experiment to validate the CIE model and end up with wildly different answers, what do you do? As a scientist with a deadline that may literally kill your career, how likely are you to chase down this mystery? Or to have the freedom to suddenly pivot, and to make the paper about that?

It felt strange to realize no one knows how to make 100% synthetic video look real. It's like sinking up to your neck in quicksand: an inescapable conclusion, and consequences people would rather not dwell on.

Focus on the data. That's the key. Not opinions, not what the professionals believe, but data: hard answers, obtained from careful experiment with a large sample size. Do that and you arrive at some very unexpected truths.

It's hard to know what to even do with the information. What do you even say? I would've dismissed this at 23. "What are the chances that everyone in the world is being sent to college to learn the wrong techniques? And what about all the published research? The guy who inspired me to become a graphics programmer worked so hard on his graphics engine. He spent years thinking about it. You're saying he has no idea what he's doing, and that all the techniques are fundamentally flawed? That they're not even kinda-sorta close?"

All I can say is, look at the data. Pretend you're piloting a plane in complete darkness. You either trust your instruments -- your carefully-designed experiments -- or you don't.

My main problem with it isn’t that it’s unrealistic per se, just that it looks bad on static geometry. But SSAO is typically sold as a “realism” feature.

Compelling article, but I can’t help but feel the author hasn’t appreciated some of these games on an HDR capable screen. They call out Battlefield 1 and Horizon Zero Dawn, both of which look stunning in HDR mode on an LG OLED screen. BF1 only got HDR mode implemented in a patch fairly recently, and before then was unplayable on an OLED with its deep blacks; it was a contrast fest with no luminance range whatsoever. After the patch however, that all changed.

I’ve never played HZD in anything but HDR mode (in fact, I bought the game specifically to test HDR) so admittedly I don’t know the difference there. All I know is in HDR mode, HZD is one of the prettiest games I’ve ever played, with fantastic colors and brilliant transitions in luminance.

HDR I feel is one of those things you just can’t appreciate in screenshots or comparison videos. You have to get a good screen (I can’t see myself coming back from OLED, it’s – no pun intended – a game changer) capable of HDR and just appreciate it for yourself, it’s really something.

I’m sure it’ll be even better as technology matures, as game developers become more familiar with it, and as the market penetration of HDR capable screens increases, but to say they don’t know what they’re doing now seems a bit of a stretch. Especially if all you have to go on are screenshots.

Maybe I just play too many video games, but I don't think any of those "terrible" screenshots look bad. If anything, the foggy image from Breath of the Wild looks worse than any of them. What am I missing? What is supposed to be bad about these?

Some of it is just one's personal taste, but I would say that a practical difference is that I find it difficult to pick out details in the images called out as "bad". My eyes feel overwhelmed by the huge shifts in contrast and it's hard to visually identify objects in the very dark areas and the very shiny, bright areas.

I also get a very strange sense of depth from those images -- it feels like everything is either very close or very far away, without having a sense of how far apart objects actually are.

In comparison, in the BOTW screenshot, I can clearly identify objects of interest (a shrine, a tower, a volcano, a river, a bridge) and have a sense of where they are compared to each other and myself.

That said, I feel like it's not a completely fair comparison -- the "bad" images all seem to have the camera in a dark area looking out and up into a bright area, while the BOTW camera is on top of an exposed mountaintop looking out into the sunset. The lighting is going to look a lot different regardless. However, the HZD image after _that_ is really quite garish and harder to justify.

That particular BotW screenshot isn't entirely representative, what is remarkable about BotW is that going for a non-photorealistic aesthetic makes the game more compelling as there are fewer details competing for your visual attention. I highly recommend playing or viewing it at 1080p on a very large screen.

The attempts at photorealism bring them barely into the uncanny valley, but overdone HDR pulls them firmly into the bottom of the valley. To think they look bad, you have to be the sort of person who didn't like Tarkin's CGI in Rogue One.

The first game to feature (pseudo) HDR was actually Shadow of the Colossus, and it's still a beautiful game despite its washed out colors, super high contrast, and "that dynamic exposure adjustment effect nobody likes". It may have been the template for most or all of the "bad" games in the article.

These effects are beautiful in Colossus because they were chosen for artistic reasons and they are artistically used. The lack of color saturation complements the game's lush, but somehow bleak and foreboding, natural environment. The extreme contrast highlights the differences between the dark and dank indoor environments of the various ruins and temples, and the raw untamed wilderness that lies outside them. And the dynamic exposure adjustment effect is, vaguely, supposed to parallel the main character's eyes adjusting as he moves from one kind of environment to another. (Either that or just "hey look, we can do this on a PS2!")

It's kinda like how the parallax line-scrolling effects in Shadow of the Beast were amazing in that particular game at that particular time, but take the same effect in a shitty game -- like, say, Bubsy -- and it just looks awful and tiresome.

Isn't this just a matter of taste? There are a lot of tone mapping algorithms everybody knows about. So it's just a matter of taste choosing the one you like.

Edit: As far as I know render engines use the complete dynamic range until the last step which converts it to a 16 bit output range. So maybe games can let the user decide which tone mapping algorithm to use.

*24 bit output range
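
As a rough illustration of that last step, here is a minimal sketch of one well-known global tone mapping operator (Reinhard's x / (1 + x)) followed by sRGB encoding and quantization to 8 bits per channel, i.e. 24 bits per pixel. The `exposure` parameter and the function names are assumptions for the example, not anything a particular engine exposes; swapping `reinhard` for another curve is exactly the kind of user choice suggested above.

```python
def reinhard(x):
    """Compress a linear HDR value in [0, inf) into [0, 1)."""
    return x / (1.0 + x)


def linear_to_srgb(c):
    """Encode a linear value in [0, 1] with the sRGB transfer function."""
    c = min(max(c, 0.0), 1.0)
    if c <= 0.0031308:
        return 12.92 * c
    return 1.055 * c ** (1.0 / 2.4) - 0.055


def tonemap_pixel(rgb_hdr, exposure=1.0):
    """Map one linear HDR pixel to an 8-bit-per-channel sRGB triple."""
    out = []
    for channel in rgb_hdr:
        scaled = max(channel, 0.0) * exposure  # negative light is clamped
        encoded = linear_to_srgb(reinhard(scaled))
        out.append(round(encoded * 255))
    return tuple(out)
```

Note that this works one pixel at a time, which is why it can live inside a shader; operators that adapt to the neighbourhood of each pixel need the whole image.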

The point of most video games is to be distinct, not to look like reality. Furthermore, they use high-contrast to help distinguish game elements. Games are not movies.

Came looking for this. Unnatural contrast is good in games because it can help players distinguish different set pieces more easily.

This is also why a lot of people prefer older FPS titles or games like TF2 that use stylized profiles of the characters that stand out. In some games, you want interpretation of the game scene to be a challenge, but in most games, you want players to easily tell what they are looking at. Using high contrast / distinct models / distinct animations / color differentiation all help to achieve that.

> Screenshot of Shovel Knight

I don't think it's fair to put Shovel Knight in your list. Yacht Club Games tried really hard to recreate the NES aesthetic and constraints (limited color palette, sprites, etc.) with careful deviations only where really necessary [1]. I don't think you'll find a more aesthetically pleasing game in the NES library.

[1] https://www.gamasutra.com/blogs/DavidDAngelo/20140625/219383...

Yeah, NES games looked pretty bad. I can find only a couple screenshots that don't make my eyes bleed:



Hence my mention of historical reasons that are holding game aesthetics back. I think making good art requires criticizing your childhood tastes, not following them.

TF2 is very pretty and it’s clear that a lot of thought went into the color palettes and visual effects. The examples of bad effects in this article are the result of developers thinking “this is how games look”. I don’t think they got as far as planning how to guide the player’s attention.

I hate this idea that it should be photo realistic.

    > But all of them feel videogamey and none of them would pass for a film or a photograph.
Thanks, if they didn't look at least "videogamey" I wouldn't have any interest. Stop this terrible trend of being a film!

The point is that the lack of photorealism in the given examples is not intentional, but happens because of a lack of proper attention to the HDR aspects of rendering. The article also gives examples of games that intentionally deviate from photorealism in order to achieve a specific effect.

HZD is intentional the way it is and it is perfectly good on my not-HDR tv. So he is missing the point.

Some pretty bold and controversial (and subjective) statements, but I think the author makes a good case! Very interesting article.

"ventspace" indeed, but I agree--it's an interesting perspective.

Honestly, I'm tired of the race for games to just 'look' better. Most of the time it seems as if it is a detriment to actual gameplay and frame rate.

I would definitely prefer games to look bad (or preferably stylized in a way that works for the game) over having a somewhat realistic looking game (only when you're standing still so the frame rate doesn't dip) with terrible mechanics and a bad story.

Far too much information and talk about making games look better and very little about how to make games actually better. Modern gaming to me has become the same as clicking the X on hundreds of popups from your windows 98 machine. Just with somewhat realistic looking ugly faces instead of X's.

Related to that, colors of videos on the internet (and YouTube) are always kinda ugly and (I think) not very faithful to what the color artists intended to show...

I'm not sure but I think one of the issues are from the 16 - 235 channel remapping
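
For reference, a minimal sketch of that remapping for 8-bit luma, where "limited" (video) range puts black at code 16 and white at 235. The function names are hypothetical. If a player expands the range twice, or never expands it, contrast ends up crushed or washed out, which would look like the problem described.

```python
def limited_to_full(code):
    """Expand limited-range luma (16..235) to full range (0..255)."""
    scaled = (code - 16) * 255.0 / 219.0
    return int(min(max(round(scaled), 0), 255))  # clamp out-of-range codes


def full_to_limited(value):
    """Compress full-range luma (0..255) into limited range (16..235)."""
    return int(round(16 + value * 219.0 / 255.0))
```

Chroma channels in the same 8-bit formats use 16..240 instead, so this sketch covers luma only.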

It's probably more likely due to the fact that codecs for videos on YouTube tend to store the color (chroma) channels at lower resolution than brightness, which leads to very bad blocky artifacting around bright red and bright blue areas.
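
A minimal sketch of that effect on a single 2x2 block, assuming 4:2:0-style subsampling (one shared chroma sample per 2x2 block, full-resolution luma) and the full-range BT.601 conversion used by JPEG. Real codecs add transforms and quantization on top; the function names here are made up for the example.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 RGB -> YCbCr (the JPEG/JFIF variant)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse conversion, rounded and clamped to 0..255."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return tuple(int(min(max(round(v), 0), 255)) for v in (r, g, b))


def subsample_block(pixels):
    """Keep luma per pixel but share one averaged chroma pair per block."""
    converted = [rgb_to_ycbcr(*p) for p in pixels]
    cb_avg = sum(c[1] for c in converted) / len(converted)
    cr_avg = sum(c[2] for c in converted) / len(converted)
    return [ycbcr_to_rgb(c[0], cb_avg, cr_avg) for c in converted]
```

Feeding it one saturated red pixel next to three black ones dims the red pixel and tints the black ones, which is the bleeding you see around bright red edges.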

I think everything else (Blu-Ray, DVD, cable TV) is using the same technique.

YouTube don't seem to handle colour profiles correctly

or there is some problem in the pipeline between uploading your video and it getting reproduced on screen by the player inside a web browser

videos exported from e.g. FCPX have colour profile attached and look right when played back in Quicktime desktop player, but the contrast always gets messed up when it comes back through YouTube

this is a separate issue from compression artefacts

It makes sense for artists to concentrate on the details of how games look, and this article provides some good insight into that world.

But using stark moral terms for aesthetic judgements seems rather insular at best. When I think about games like Minecraft, Factorio, RimWorld, and so on, I wonder if maybe imitating movie-quality graphics is really as important as the author thinks it is? It's only a small part of the experience.

Correct me if I am wrong, but I thought the author was protesting against the misuse of techniques for imitating movie quality graphics. He held up BOTW for not attempting such imitation.

The author makes a point of Nintendo opting out of HDR but does not realize that's due to hardware limits not allowing deferred shading. The HDR techniques he does not like are all made possible by the use of deferred shading. These techniques may sometimes be used to overly stylize the look, but it is undisputed they create more realistic images.

Promit Roy has been being knowledgeable about graphics since before I was in college. I kind of doubt he "doesn't realize" it. The dude is really, really good.

And HDR doesn't really do anything to materially aid what he's calling out as well done in the first place. It's ancillary. It's not important. What's important is actual artistic intent and structure. That's what's lacking, along with a misunderstanding of why certain techniques work in film and don't when they are uncritically applied to games.

I don't know what kind of HDR you refer to. My point was that for realistic or physically based rendering, whatever you might call it, a linear workflow (light calculation in linearized space) is crucial. It is very important for the artist too if you aim at realism. Of course it brings challenges such as exposure correction and managing dynamics. These challenges are not yet fully solved, but to say we ought to abandon HDR and go back to sRGB is at least questionable... we might as well go back to mode 13.
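
To show why the linear workflow matters, here is a minimal sketch that blends two 8-bit sRGB values two ways: directly on the encoded codes, and after decoding to linear light with the standard sRGB transfer function. The helper names are hypothetical.

```python
def srgb8_to_linear(code):
    """Decode an 8-bit sRGB code (0..255) to linear light in [0, 1]."""
    s = code / 255.0
    if s <= 0.04045:
        return s / 12.92
    return ((s + 0.055) / 1.055) ** 2.4


def linear_to_srgb8(value):
    """Encode linear light in [0, 1] back to an 8-bit sRGB code."""
    v = min(max(value, 0.0), 1.0)
    s = 12.92 * v if v <= 0.0031308 else 1.055 * v ** (1.0 / 2.4) - 0.055
    return round(s * 255)


def mix_naive(a, b):
    """Average the encoded codes directly (the gamma-space mistake)."""
    return round((a + b) / 2)


def mix_linear(a, b):
    """Average in linear light, then re-encode (the linear workflow)."""
    mean = (srgb8_to_linear(a) + srgb8_to_linear(b)) / 2
    return linear_to_srgb8(mean)
```

Mixing black and white this way lands noticeably brighter in the linear version than the naive mid-grey, and every lighting sum in a renderer has the same issue.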

The Switch is perfectly capable of both deferred shading and HDR, which are themselves independent unrelated techniques. You can make an argument about whether or not those are the best implementation trade-offs for that hardware, taking performance and desired visuals into account. But there's nothing inherently preventing you from doing it on the Switch and it's a fairly capable GPU in its own right.

P.S. Unreal Engine reintroduced forward (non-deferred) rendering as an option last year because it's more efficient in VR.

True, but deferred shading costs extra performance in most cases; that's why it is less used on lower-end systems or in VR, where you need 120fps. I did not claim HDR or deferred was impossible on the Nintendo Switch.

Deferred rendering is on the way out, in my opinion. With the advent of clustered forward rendering, we can get many, many lights without having to do multiple passes, cram everything in the gbuffer, use a single shading model for the whole scene, and be limited to screen-space anti-aliasing like FXAA and TSAA.

Please note there is a difference between deferred rendering and deferred shading. The latter is here to stay and all major game engines have made it their main technique (VR excluded).

This is false, HDR and deferred shading are two completely unrelated concepts.

A few months ago I hacked tonemapping straight into the ubershader of some foss Quake client, you don’t even need a separate buffer to do these things.

They are not that separate. A deferred shading pipeline would make little sense if it was operating purely in LDR. Tonemapping in forward shading can be done, but it makes a lot more sense to do it as a post step. If you ask a photographer about tone mapping he/she would tell you about a very different concept (but is more likely to turn away in disgust)... and that could not be done by color-mapping 1 fragment at a time.

What kind of esoteric hardware can't do deferred shading? I really doubt that the switch lacks the ability, and I know it doesn't lack the processing power.

Does anybody know what's the technical ground for some games to have dark areas turn brighter (typically, of a squashed, dark, green/blue) instead of darker?

I remember one game ("Through the woods") which had this problem in a very aggressive form. It was very visible while playing, but it's hard to find a screenshot.

Here, for example, the house on the left (but also the cabin in the center):


What's the exact issue? Is it a tone mapping problem? I'm curious, because I have a rather good monitor, and some dark games (eg. "Alien: Isolation") are equally dark, but don't look as bad as this.

Looks like environment lighting to me.

I'm not sure this is the case in your example, but a lot of render engines use an environment light when rays don't detect collision.

So the camera is looking at the wall of the house and bounces into oblivion. Therefore the environment values are used. And since the stones of the house are gray a lot of environment light bounces back to the camera. But when the camera is looking at the trees a lot of rays bounce to leaves and the ground returning almost no light back to the camera.
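
A toy sketch of what I mean, under heavy assumptions: a single diffuse bounce, a constant environment term, and made-up albedo values. It is not how any particular engine works; it only shows why a grey wall facing open sky comes back brighter than dark foliage whose bounce rays hit more foliage.

```python
ENVIRONMENT_RADIANCE = 0.2  # hypothetical constant sky/ambient term


def shade(albedo, bounce_hits_geometry, bounce_radiance=0.0):
    """Light returned to the camera after one diffuse bounce.

    If the bounce ray escapes the scene, the environment value is used;
    otherwise whatever radiance the hit surface sends back is used.
    """
    if bounce_hits_geometry:
        incoming = bounce_radiance
    else:
        incoming = ENVIRONMENT_RADIANCE
    return albedo * incoming


stone_wall = shade(albedo=0.5, bounce_hits_geometry=False)
tree_leaves = shade(albedo=0.1, bounce_hits_geometry=True, bounce_radiance=0.02)
```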

I would even say that your example looks realistic (although way too dark).

Half Life: The Lost Coast, this is all your fault!

I play World Of Tanks and tried every one of their mappings (forget what they call them). Even the one I can tolerate makes the tanks look like toys in a sandbox. Never feels like a real tank.

HDR is the new "next-gen Brown!"

As HDR panels used for TVs 'trickle down' to PC displays the HDR part of the problem will solve itself. Perhaps then people will focus more on color grading (although I suspect over-saturation will always be a staple in some genres).

This can really be attributed to a general rule of thumb in game development: aesthetics are more important than graphics.

I would guess this isn't a priority for the industry because gamers themselves don't care. Case in point: gaming monitors. Gaming monitors advertise high-refresh rates and low input lag but rarely color quality. What percentage of gamers use wide gamut monitors or even perform color calibration?

FWIW, I'm a colorist and have very expensive color critical monitors and calibration probes. For GUI monitors I've started recommending gaming panels for suites that don't want to invest in professional graphics displays. Once calibrated these monitors are surprisingly accurate and hold the calibration well (at least as good as professional displays many times more expensive - though still a far cry from color critical displays). The expanded refresh rates are a nice bonus for eye relief as well.

Some films are absolutely just as contrast-y as these HDR images. In particular, Fujifilm RDP-III slide film. See: https://www.flickr.com/groups/provia100f/pool/

But yes, Arri does have a beautiful soft look along with Fuji digital cameras, and plenty of film stocks do as well. It all depends on the tastes. Gamers aren't going for soft reality. A lot of hard-edged gamers are going to go for the high-contrast, desaturated ExTreMe ChaLLeNgE look.

The resident evil screenshot looks incredibly realistic. I thought that was a real photograph placed there to contrast reality with games. Until I read the paragraph below it.

It's amazing how many commenters seem to be embarrassing themselves by trying to be the first to wildly misunderstand the problems being discussed and reduce them to trite memes. You're not even disagreeing with the thrust of the article in your vehement objections.

Yes, this indicates the article was poorly written. I advise you move on rather than reflexively attack nothing.
