Also, butteraugli's XYB is based on similar ideas. It is slightly more expensive to calculate due to the biased logarithm in the compression function (instead of a cube root), but it possibly scales better for HDR (say, above 200 nits).
JPEG XL's XYB includes more red and less green in the S-receptor modeling (for the blue-yellow axis). When I look at the literature on LMS receptor spectra, it makes me wonder why there is so much green in Oklab. When I optimized something similar for XYB, the optimization would favor adding slightly more red than green for S.
S component in JPEG XL XYB before non-linearity:
0.24 * R + 0.20 * G + 0.56 * B
S component in Oklab before non-linearity:
0.05 * R + 0.26 * G + 0.63 * B
Given the similarity of Oklab and XYB, I suspect (but I'm not completely sure) that JPEG XL's format is powerful enough to model Oklab, too. Very, very likely it can perfectly model the M1 matrix and the cube root. I believe for M2 there may be some need for approximations. There, JPEG XL can have local variations of M2 from the chroma-from-luma fields, but luma likely needs to be slightly different from Oklab's.
This difference may be because Oklab is based on XYZ, which is based on 2-degree color samples. XYB is based on roughly 0.03-degree color samples. Perception seems to be different there -- to me it looks like S is not yet integrated into the luma experience at that resolution.
In butteraugli, color modeling is more complex: it is divided into high and low spatial frequencies. S is brought only into the low-spatial-frequency color transforms. (Frequency separation there is done with a Laplacian pyramid.)
I think this might be a case where the requirements for image editing and image compression are different.
For image editing, especially when working with HDR images, I think it is better to just have a simple power function, since this makes fewer assumptions about the exact viewing conditions. E.g., a user might want to adjust exposure while editing an image, and if the predicted hue changes when the exposure is altered, that would be confusing (which happens if more complex non-linearities are used). When compressing final images, though, that wouldn't be an issue in the same way.
Basically, this is the M1 matrix mapping linear sRGB [linearR, linearG, linearB, 1] to approximate cone responses:
(I think in this normalization 1.0 means 250 nits, but I'm not completely sure at this stage of optimization -- we changed the normalization recently.)
M1 = [
  [0.300, 0.622, 0.078, 0.0038],
  [0.240, 0.682, 0.078, 0.0038],
  [0.243, 0.205, 0.552, 0.0038],
  [0, 0, 0, 1]
]
then a non-linearity by cube root (decoding applies a cube); see: https://gitlab.com/wg1/jpeg-xl/-/blob/master/lib/jxl/dec_xyb...
The LMS values after cubic root are coded by this matrix M2:
M2 = [[1, -1, 0], [1, 1, 0], [0, 0, 1]]
In practice Y->X and Y->B correlations are decorrelated, so M2 looks more like this:
M2 = [[1+a, -1+a, 0], [1, 1, 0], [b, b, 1]]
after decorrelations a is often around zero and b is around -0.5.
The first dimension in this formulation is X (red-green), second Y (luma), third B (blueness-yellowness).
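To make the pipeline concrete, here is a rough C++ sketch of the forward transform as described above. Treat it as illustrative only: the constants are the ones quoted in this comment (and were said to still be in flux), and the decorrelation values a and cfl_b stand in for the signaled chroma-from-luma fields.

    #include <cmath>

    struct XYB { float x, y, b; };

    // Sketch: linear sRGB -> XYB, using the M1/M2 matrices quoted above.
    // After decorrelation, a is often around 0 and cfl_b around -0.5.
    XYB LinearSrgbToXyb(float r, float g, float b,
                        float a = 0.0f, float cfl_b = -0.5f) {
      const float bias = 0.0038f;  // homogeneous column of M1
      // M1: mix linear RGB into approximate cone responses, plus bias.
      float L = 0.300f * r + 0.622f * g + 0.078f * b + bias;
      float M = 0.240f * r + 0.682f * g + 0.078f * b + bias;
      float S = 0.243f * r + 0.205f * g + 0.552f * b + bias;
      // Compressive non-linearity: cube root (decoding applies a cube).
      L = std::cbrt(L); M = std::cbrt(M); S = std::cbrt(S);
      // M2 with decorrelation: [[1+a, -1+a, 0], [1, 1, 0], [cfl_b, cfl_b, 1]].
      return XYB{(1.0f + a) * L + (-1.0f + a) * M,  // X: red-green
                 L + M,                             // Y: luma
                 cfl_b * (L + M) + S};              // B: blue-yellow
    }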
For quantization, the X, Y and B channels are multiplied by constants representing their psychovisual strength. The X and B channels (the chromaticity channels) are less important when quantization is low, and the X channel in particular increases in strength when more quantization is done.
The cube is beautiful in the sense that it allows scaling the intensity without further considerations, but it is quite awful in near-black psychovisual performance. That is why sRGB added a linear ramp, and why I added biasing (a homogeneous transform instead of a 3x3 matrix).
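As a tiny illustration of the bias trick (using the 0.0038 value from the homogeneous column of M1 above; the shipped codec may differ in details):

    #include <cmath>

    // Plain cube root: the slope (1/3) * v^(-2/3) blows up at v = 0, so
    // tiny near-black differences map to huge coded differences.
    float plain_cbrt(float v) { return std::cbrt(v); }

    // Biased version: finite slope at 0, still ~cbrt(v) for large v.
    float biased_cbrt(float v) {
      const float bias = 0.0038f;
      return std::cbrt(v + bias) - std::cbrt(bias);
    }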
Regarding: “The cube is beautiful in the sense that it allows scaling the intensity without further considerations, but it is quite awful in near-black psychovisual performance.”
Yeah, that is the tradeoff, and the same goes for dealing with HDR values. The idea with Oklab is to avoid having to know what luminance the eye is adapted to, by basically treating all colors as if they are within the normal color vision range. That makes it simpler and more predictable to use, but it makes predictions at the extreme ends worse than they would be if knowledge of the viewing conditions were taken into account (given that you can do so accurately).
E.g., a linear ramp for near-black values would not be good if you are in a dark room, only viewing very dark values full screen on a monitor (so there isn't anything bright around to adapt to).
Certainly gives it a natural look.
With public domain C++ code! This is incredibly useful to me as a Unity developer. It can almost be plugged right into C# & Unity Colors, and could be used for interesting real-time color effects. One of my favorite simple things to do in HSV is to animate the hue. This could be used to do something similar, but across the more attractive Oklab gradient.
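A minimal sketch of that hue animation, assuming the Lab struct and conversion functions from the post's public-domain C++ listing (in Oklab, hue is just the angle of the (a, b) vector):

    #include <cmath>

    // Rotate hue by 'radians' while leaving perceived lightness (L) and
    // chroma (the length of the (a, b) vector) untouched.
    Lab rotate_hue(Lab c, float radians) {
      float cs = std::cos(radians), sn = std::sin(radians);
      return Lab{c.L, c.a * cs - c.b * sn, c.a * sn + c.b * cs};
    }

Animating the angle over time gives HSV-style hue cycling, but without the lightness jumps HSV produces.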
One trick I've used in the past is using the alpha channel to mask multiple hue shifts in a shader to give a lot of variations. E.g., 0 = shift A, 0.5 = don't shift, 1 = shift B.
That way you can make, e.g., leather armor with metal attachments, and tan or fade the leather and swap metal types independently.
It is pretty much the same cost as CIELAB. Just uses 2 full 3x3 matrices where CIELAB can be thought of as using matrices with a bunch of zeros in them.
Björn Ottosson not only did some mighty fine work, producing a simple equation that produces smooth colour gradients, but he "showed his work" too. Instead of just journal references, he littered this page with incredibly useful hyperlinks to difficult-to-find things such as the raw data for the Munsell colour chart and the Luo-Rigg data!
This should also be mandatory reading for people not so serious about computer colour, because heads up: If you do any kind of arithmetic on RGB bytes, you've screwed up much more than you think you have! It's one of the most common examples of the Dunning–Kruger effect.
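A concrete instance of the trap: averaging two sRGB-encoded values byte-wise gives a result that is too dark, because the encoding is non-linear. A minimal sketch of the correct way, using the standard sRGB transfer functions (values in 0..1):

    #include <cmath>

    // Standard sRGB transfer functions (piecewise, with the linear ramp
    // near black mentioned elsewhere in this thread).
    float srgb_to_linear(float v) {
      return v <= 0.04045f ? v / 12.92f
                           : std::pow((v + 0.055f) / 1.055f, 2.4f);
    }
    float linear_to_srgb(float v) {
      return v <= 0.0031308f ? v * 12.92f
                             : 1.055f * std::pow(v, 1.0f / 2.4f) - 0.055f;
    }

    // Wrong: 0.5f * (a + b) on the encoded values.
    // Right: average in linear light, then re-encode.
    float mix_srgb(float a, float b) {
      return linear_to_srgb(0.5f * (srgb_to_linear(a) + srgb_to_linear(b)));
    }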
Need a nice gradient of colours to represent something graphically, such as low-medium-high values? Go to this page first.
Need to blur a background picture? Go to this page first.
Resizing images? Read this first.
Making colour schemes for a web page theme? This page first.
Adding a "colour picker" control? Definitely read this first.
Wow, thanks a lot!
If anyone has any questions feel free to post here and I can try to answer.
I have another post that goes into more detail on how software often gets color wrong:
How big are your datasets? Would the parameters get better if they were bigger, or have they converged to some optimum?
A couple of typos I spotted: "asses", "he final".
The generated dataset consists of a few thousand colors. The hue dataset uses only 15 different hues. Some more data there could definitely be useful.
I think the biggest problem is that there isn't that much experimental data overall, especially for wide gamut colors. The hue data is from experiments with sRGB displays if I remember correctly, and CIECAM I think has mostly been derived based on surface paints, which makes it fairly limited.
Comprehensive experiments done using modern calibrated wide gamut displays would be fantastic.
Thanks, will have a look at the typos!
Do it soon before you have too many people creating independent implementations.
The comparisons with CIELAB and CIELUV are using LCh coordinates. That’s how hue and chroma predictions are made using those spaces.
Really nice work!!! How does it compare with HSLuv (https://www.hsluv.org/)? It seems that both schemes try to manage the perceptual color problem...
I'm sure you're aware that a quick and dirty method of fast image segmentation is just to compare image frames in varying color spaces. Would sRGB be the best "complement" to Oklab in such a pipeline?
Also wondering aloud here whether you might have stumbled upon an ideal color space for optical flow fields between pixels in moving images, in video prediction research for example? Great work!
Instead of a continuous gradient for the color, show patches of color in a randomized order. Display the colors over top of grayscale gradients of different sizes. This helps to deal with the trouble of human perception being affected by background and by solid angle. You might also want to consider showing the new color right over or around an image of the user's choice.
For example, on a small screen, you might have a 256x256 patch on a 512x512 grayscale gradient for the currently selected color, and numerous 16x16 patches on 32x32 gradients for the choices that the user can make.
With each click on a choice, the whole array of choices is redone.
Available choices are provided as variations of the current choice and any colors that might have been bookmarked. Take the current choice and vary the hue. Take the current choice and vary the saturation (including the negative extreme, the positive extreme, and grey). Take some properties (hue, saturation, etc.) from the current choice, and others from a recent bookmark. Probably also throw in a 3x3x3 sRGB grid to make dramatic changes simple. Be sure to include everything that is within 2 or 3 units of the current choice in sRGB. In a plane that slices through the current selection and the grey line, provide choices that run along lines from the current choice in 4 different directions: from black through the current selection until out of gamut, from white through the current selection until out of gamut, along the saturation axis in both directions, and along the L axis parallel to the grey line.
So with each click, you head in the direction you prefer.
Or you can buy an introductory color science book or two. Let me recommend Mark Fairchild's Color Appearance Models, https://www.amazon.com/dp/1119967031/ but here are a few others https://www.amazon.com/dp/1119367220 https://www.amazon.com/dp/0470024259 https://www.amazon.com/dp/1118173848/ https://www.amazon.com/dp/0470049049
I also think it is useful to focus on understanding the various experiments that have led to the different color models. The most important are the experiments that led to CIE XYZ. Lecture notes from universities seem like one of the best sources of info about the basics, such as this: https://www.cl.cam.ac.uk/teaching/1516/AdvGraph/02_Light_and...
Other experiments that are interesting, but a bit hard to find information about, are: the Munsell renotation effort in the 1940s, the experiments that led to OSA-UCS, and the MacAdam ellipses.
I also like this paper since it gives a fairly good overview and lots of new keywords to search for: https://www.osapublishing.org/viewmedia.cfm?uri=oe-25-13-151... (and it is freely available)
Landa & Fairchild 2005, "Charting Color from the Eye of the Beholder", http://markfairchild.org/PDFs/PAP21.pdf
* * *
Nickerson 1940 "History of the Munsell Color System and Its Scientific Application" https://doi.org/10.1364/JOSA.30.000575
1943 OSA Munsell renotations report: https://doi.org/10.1364/JOSA.33.000385
Nickerson 1976 "History of the Munsell Color System, Company, and Foundation" (3 parts) https://doi.org/10.1111/j.1520-6378.1976.tb00003.x https://doi.org/10.1111/j.1520-6378.1976.tb00017.x https://doi.org/10.1111/j.1520-6378.1976.tb00028.x
Nickerson 1983 obituary for Alex Munsell, https://munsell.com/color-blog/alexander-ector-orr-munsell/
Kuehni, "The early development of the Munsell system" https://doi.org/10.1002/col.10002
Another question: when going back from Oklab to the device color space, some numbers may at times fall out of range (e.g., negative). Is there a recommended way to bring those back into gamut while ending up at perceptually close colors?
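(For illustration, one simple and common approach, not necessarily the recommended one: hold L and hue fixed and binary-search the chroma down until the color is representable. A sketch, assuming the RGB/Lab structs and oklab_to_linear_srgb from the post's C++ listing, and assuming L itself is already within range:)

    // Shrink the (a, b) chroma vector toward neutral until in gamut.
    bool in_gamut(RGB c) {
      return c.r >= 0 && c.r <= 1 &&
             c.g >= 0 && c.g <= 1 &&
             c.b >= 0 && c.b <= 1;
    }

    RGB clip_to_gamut(Lab c) {
      float lo = 0.0f, hi = 1.0f;  // scale factor applied to (a, b)
      for (int i = 0; i < 20; ++i) {
        float t = 0.5f * (lo + hi);
        if (in_gamut(oklab_to_linear_srgb(Lab{c.L, t * c.a, t * c.b})))
          lo = t;
        else
          hi = t;
      }
      return oklab_to_linear_srgb(Lab{c.L, lo * c.a, lo * c.b});
    }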
However, it's not clear to me why this should be so. For a given wavelength, I would understand that mixing in physical space would be better, i.e., this may apply to lightness. But why should it not work better when applied more generically in a perceptual space (note: I have worked with displays with more than three primaries), even when pixels are not individually perceptible?
Another way to put this is that dithering works because of physical blending of photons due to an insufficiently sharp eye lens before perceptual mechanisms in the brain.
Further, the question still remains: why is it that mixing photons spatially, as you explained, works better for imperceptible pixels, and yet we need these non-linear color spaces when dealing with larger areas?
It goes without saying that the intensity for value 50 need not be the midpoint of the intensities for 0 and 100, given the gamma curve and the actual mapping of the value to intensity for the pixel.
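To put numbers on that: with the sRGB transfer function, code value 0.5 corresponds to only about 21% linear intensity, while a 50/50 spatial dither of 0 and 100% averages to 50% linear intensity, which re-encodes to a code value of roughly 0.735 (about 188/255).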
But if you zoom in enough, any smooth curve looks linear.
For most computer vision applications, that is the opposite of what you want. When you analyze a video stream, it is quite common for frames to have different brightness due to things like aliasing between the flicker of LEDs in the room and the camera shutter. That's why CV needs a color space where colors remain close to each other, no matter the frame brightness.
Also, LEDs don't typically flicker unless a dimmer is on the circuit and their PCB isn't made for that. You might be thinking of fluorescent lights that are pulsed with a ballast.
That really depends on the viewer’s adaptation state.
> So if users will view your rendered image surrounded by an rgb-white screen, if/when the resulting small red/yellow/green/white shift is problematic, you might consider rendering using a non-standard 58K whitepoint. Just saw someone burned by this the other day, with a "supposed to be white with a localized tint" object (the Sun), which D65 then confused.
Isn’t that what the ICC’s relative rendering intent is supposed to take care of, if I understand what you mean?
IIRC, both D65 and D50 are sufficiently extreme as to prevent full chromatic adaptation. Viewers will be aware a scene is lit "coldly" or "warmly".
> ICC’s relative rendering intent is supposed to take care of
Yes. Browser support used to be poor, but I've not been following it. Firefox seems to still require user config? A perhaps outdated Wikipedia article suggests Chrome's support of v4 profiles is OS-dependent? But yes, someday this will just work.
For the most part, when looking at a monitor, chromatic adaptation is pretty good. In other words, RGB #FFFFFF looks white, not light blue, and #808080 looks gray, not bluish-gray. It does start falling apart when you mix different lighting, for example holding a white piece of paper to the screen under normal room light.
I think the main reason D65 is chosen for the sRGB white point is that it is fairly accurate, in other words it's pretty close to the actual white point of the monitor I'm looking at now (Dell P2415Q).
I have yet to find a source on chromatic adaptation that I consider a really excellent tutorial. Certainly the Wikipedia page is cursory and abstract, which is a shame.
Illuminant A is a black body radiator that approximates an incandescent light bulb. Illuminants B and C are attempts to approximate daylight by putting a colored liquid filter in front of the black body radiator (or incandescent bulb). Illuminant B is a simulation of tropical noon daylight, while Illuminant C is a simulation of average daylight at a higher latitude (more contribution from the sky, less from direct sunlight).
Illuminant C was widely used in colorimetry, but it didn't match real measured daylight spectra especially closely, so the D illuminants are a replacement based on some physical measurements taken in Rochester (where Kodak was based) and London. Colorimetric applications that previously used illuminant C mostly switched to D65 instead.
(Note that real outdoor daylight spectra vary dramatically depending on place, time, and weather conditions.)
For details, see https://en.wikipedia.org/wiki/Standard_illuminant
> reason D65 is chosen for the sRGB white point is that it is fairly accurate, in other words it's pretty close to the actual white point of the monitor I'm looking at now (Dell P2415Q).
There is also influence in the other direction: monitors have a D65 white point to match the spec.
As the images we were trying to compress got more detail in them we had to redo the compression algorithm so that we could keep as much detail as possible. The key was to do the compression in a new color space and then undo the color space transform before we saved the LUTs.
I ended up creating an empirical color space that was the best for each tile we were compressing (think PCA), but now that I look at this, it might even do better since the perception is so consistent.
If you try to shift hues, keeping a constant luminance causes a problem. You can't transform a bright blue to a bright yellow, because the luminance of blue is considered much lower than that of yellow. The best you'll get is a very dark brown.
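To put numbers on it: the relative luminance of linear sRGB is Y = 0.2126 R + 0.7152 G + 0.0722 B, so pure blue (0, 0, 1) has Y ≈ 0.07 while pure yellow (1, 1, 0) has Y ≈ 0.93 -- more than an order of magnitude apart.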
If your goal is to otherwise preserve color relationships, then you should use a perceptually relevant color space.
But maybe you have some different goal...?
An HSV hue shift (or whatever similar thing you are thinking of) yields horrible results in this use case.
Photoshop is definitely designed RGB-first, and some of the tools get a bit clunky in CIELAB mode. It is still an improvement for most of what I want to do.
I have many ideas for better image color manipulation tools.
Computer graphics is mostly fairly simple math that needs to scale well and not have edge cases. Some papers have had their algorithms in MATLAB, but that is the exception in research, and it is basically never used in any sort of production sense, because you would just have to port it to C++ to use it.
This is a bit like the python/R dichotomy in data science.
As a result, YCbCr yields crappy results for any purpose other than image encoding. The further you get from neutral gray, the worse YCbCr gets at cleanly separating lightness from hue/chroma.
I haven’t really seen YCbCr used anywhere except in compression, so I didn’t think of including it (and neither do most papers related to perceptual color spaces).
YCbCr was definitely not designed for analog TV. That color space is for digital video and imaging.
I’ve been using a variant of JzAzBz for the last year or so. Perceptual color spaces make image generation much easier. Can you help me understand how JzAzBz falls down?
E.g., if you scale a color by some value before converting to JzAzBz, the predicted hue will be different than without scaling.
Oklab will predict the same hue regardless of scale (but that could also mean it does worse for extreme colors). So it is a trade-off, with Oklab being designed to be less complex.
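The scale invariance falls directly out of the structure: the cube root is the only non-linearity, so scaling linear RGB by a factor s scales the cone responses by s and their cube roots by s^(1/3). Since L, a and b are fixed linear combinations of those, the hue angle atan2(b, a) is unchanged, while lightness and chroma both simply scale by s^(1/3).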
[FWIW, thanks for proving the HN crowd has strictly no sense of humor whatsoever...]
a) Almost all of us have seen it before.
b) A comment with just an opaque link doesn't really add anything to the conversation.
c) That comic doesn't even apply in this case, as Oklab is not trying to be a universal standard covering all use cases. Its intended purposes are quite specific and different from the other colour spaces.
However, the data isn't 100% there if you work in the spaces, i.e. I'd want to understand more about how the CAM16 gradients were generated. (CAM16 has two correlates for lightness and another three for 'color': saturation, colorfulness, and hue. CAM16-UCS 'only' has colorfulness. How could the CAM16-UCS gradient get 'saturated' more quickly?)