Premultiplied alpha vs. not-premultiplied alpha blending and compositing (iquilezles.org)
86 points by Tomte on Oct 22, 2022 | 40 comments



Did you also see the Captain Disillusion video today? "CD / The Horrors of the Alpha Channel" https://www.youtube.com/watch?v=XobSAXZaKJ8


I clicked on this HN link specifically because I had just skimmed through that video and assumed it had to be related. Obv it could be a coincidence, but...

(Not the worst YT to reference anyway, in my view)


Just in case, to avoid confusion, I'm not complaining about the current submission.

* 50% curiosity about the coincidence

* 50% readers that liked this article may also like ...


For similar reasons, one should blend and scale in a linear color space. It doesn’t fundamentally matter which linear color space, but sRGB and most other color spaces used in image files are nonlinear, and that nonlinearity scales wrong, much like the (alpha * color) term is nonlinear and scales wrong.
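
To make that concrete, here's a minimal sketch in Python (the helper names are mine) of the sRGB transfer function and the difference it makes when averaging black and white:

    def srgb_to_linear(c):
        # inverse sRGB transfer function (decode), c in [0, 1]
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    def linear_to_srgb(c):
        # forward sRGB transfer function (encode), c in [0, 1]
        return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

    black, white = 0.0, 1.0
    naive = (black + white) / 2  # averaging the encoded values: 0.5
    correct = linear_to_srgb((srgb_to_linear(black) + srgb_to_linear(white)) / 2)
    print(naive, correct)  # 0.5 vs ~0.735 -- the naive mid-gray is too dark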


Many don't seem to realize that this also applies to text. If you rasterize anti-aliased text, the grayscale image really holds "coverage" percentages, not intensity values. If you want to use it as-is (on a white background), you need to apply gamma correction. If you want to use it on a non-white background, it gets more complicated. This may be one reason some text looks blurry.
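
For instance, a minimal Python sketch (helper names mine) of compositing a half-covered black glyph pixel onto a colored background, per channel, in linear light:

    def srgb_to_linear(c):
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    def linear_to_srgb(c):
        return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

    coverage = 0.5                # half the pixel covered by black ink
    ink = (0.0, 0.0, 0.0)         # sRGB
    paper = (1.0, 0.8, 0.6)       # a non-white background, sRGB
    correct = tuple(
        linear_to_srgb(coverage * srgb_to_linear(i)
                       + (1 - coverage) * srgb_to_linear(p))
        for i, p in zip(ink, paper)
    )
    print(correct)  # each channel ends up lighter than naive coverage blending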


This is wrong: sRGB is not a "nonlinear color space". In fact, the term itself makes no sense; all color spaces are defined in terms of linear intensity. What you mean is that sRGB is usually stored and transmitted with a transfer function that is nonlinear. Once you linearise, i.e. apply the inverse transfer function, you are left with linear sRGB, as opposed to some other color space like Adobe RGB or ACEScg.

But yes, you must do your blending in linear values, otherwise what you get is nonsense.


> In fact, the term itself makes no sense, all color spaces are defined in terms of linear intensity.

What do you mean, and why do you think this? The definition of the term “nonlinear color space” is one that has a nonlinear transfer function. What does it mean to say that all color spaces are defined in terms of linear intensity, exactly? (And in that case, what would a non-linear value be?)


Because a color space is a definition of the physical characteristics of the R, G, and B channels, usually given as points in XYZ space, along with the white point of the color space. Every color space is defined in terms of linear intensity from 0 to 1 for each of the color channels. Technically you can even have a color space with more than 3 channels, like what Weta are doing in their spectral renderer.

The point is that a "color space" is the definition of the physical characteristics of the individual channels. This is always linear. After that, many color spaces do define a transfer function for storing and transmitting the data, usually meant to be used for cases where not much precision is available, e.g. when you only have 8 bits per channel. So to say that you have to "switch to a linear color space" is incorrect, as you are not changing the color space, you are just transferring (applying the inverse transfer function) the values from their non-linear representation to linear.
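
To make that concrete, a rough sketch: linear sRGB relates to CIE XYZ by a plain 3x3 matrix (values from IEC 61966-2-1, D65 white point), and it's the linearity of the channels that makes the matrix form possible at all:

    # Linear sRGB -> CIE XYZ: the primaries and white point ARE the matrix.
    SRGB_TO_XYZ = [
        [0.4124, 0.3576, 0.1805],
        [0.2126, 0.7152, 0.0722],
        [0.0193, 0.1192, 0.9505],
    ]

    def srgb_linear_to_xyz(rgb):
        return [sum(m * c for m, c in zip(row, rgb)) for row in SRGB_TO_XYZ]

    print(srgb_linear_to_xyz([1.0, 1.0, 1.0]))  # ~(0.9505, 1.0, 1.089): D65 white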


I think you're nitpicking. The term "colourspace" is used to refer to the actual colourspace plus the nonlinear transfer function. The sRGB spec defines that transfer function.

You can nitpick and say that the "sRGB colourspace" is linear but it's still true that sRGB is nonlinear.


What do you mean by “physical characteristics”? sRGB is not a physical color space, which is part of why it’s non-linear. You seem to be confused about what people mean when they say “nonlinear color space”, and what makes a value linear.

A value is only “linear” with respect to something else. Not all values are linear. In the case of color, the words ‘linear’ and ‘non-linear’ are used with respect to physical measurements of light intensity such as watts per steradian or radiance.

sRGB values are not linear, because those values are not a linear transform away from physical measurements, it takes a non-linear function to map from physical measurement into sRGB.

The reason sRGB's non-linearity matters is that when you add or multiply two sRGB colors to simulate physics like emission or reflection, you get the wrong color compared to what happens in the real world. To get the right color, you must transfer your sRGB inputs to linear space, combine them there, and then transfer the result back to sRGB space.

> many color spaces do define a transfer function for storing and transmitting the data, usually meant to be used for cases where not much precision is available

This is mostly wrong. The transfer functions define the color space, and they’re almost always defined for the purpose of emulating either display devices with non-linear response or human perception (which has non-linear response). sRGB’s non-linearity is emulating the non-linear gamma of your monitor https://en.wikipedia.org/wiki/SRGB#Transfer_function_(%22gam...

Data space savings is almost entirely a byproduct of more closely matching human perception, and data savings is more or less only relevant to 8-bit color data or less (because 8 bits is not enough to represent linear color without getting visible color banding, but is almost enough when using a non-linear color space like sRGB, while 9 or 10 bits actually is enough to keep things in linear without banding). You still have to switch to linear color spaces to blend colors, even when using 16 or 32 bits per channel, so the compression side-benefits have very little to do with whether you need to use your transfer functions to switch between color spaces.
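
That banding claim is easy to check with a quick sketch (helper definition as above):

    def linear_to_srgb(c):
        return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

    # Width of the first 8-bit *linear* step, measured in (roughly perceptual)
    # sRGB code values:
    step = linear_to_srgb(1 / 255) - linear_to_srgb(0.0)
    print(step * 255)  # ~12.7 -- the darkest linear step spans ~13 sRGB codes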

> So to say that you have to “switch to a linear color space” is incorrect, as you are not changing the color space, you are just transferring (applying the inverse transfer function) the values from their non-linear representation to linear.

This is entirely wrong. Using a transfer function or inverse is always a change from one color space to another, by definition. That’s exactly what a transfer function does, and it’s the only thing a transfer function does: changes (transfers) values from one color space to another. There is no such thing as a ‘linear representation’ of sRGB, there is only sRGB space and linear space. You aren’t making sense with the “non-linear representation” part, you just said all color space values are linear (even though that’s not true).


sRGB and its transfer function are inseparable, by definition. IEC 61966-2-1:1999 defines sRGB as a color space with its particular non-linear transfer function.

When you linearize sRGB you create a new not-sRGB-anymore color space that happens to have a compatible white point and RGB primaries.


> a new not-sRGB-anymore color space that happens to have a compatible white point and RGB primaries.

Do you know if this new color space has an official name?

I work with it basically every day via the three.js renderer internals and mostly see it referred to as simply "linear", but if more specificity is needed usually I see "linear sRGB". I think that's common even if it's not correct.

However as HDR color spaces take over from sRGB I expect it'll become more important to have a correct name for this color space. I have searched for one previously but haven't been able to come up with anything better than "linear sRGB".


“Linear” is in practice the official name. The only additional specificity you can have is to give the scalar conversion factor from your linear colors into physical units like lumens or candela or watts per solid angle or surface area. What linear means is that you’re working in a scaled set of physical units. The scale factor is the only thing missing.

You might not ever need the physical scale factor, it doesn’t generally matter. You probably didn’t set your light or color intensities in three.js using physical units in the first place, and even if you did you might be simply scaling the values until things look right. We usually scale the inputs and outputs manually anyway, so the absolute scale of the intermediate linear color space in CG is frequently irrelevant.

HDR is mostly referring to whether your values are conceptually [0..1] or [0..infinity]. Sometimes people are also referring to how many bits get used per color channel. (But note that the R in HDR is referring to ‘range’ and not to ‘resolution’.) 8 bit color values that represent [0..1] are unambiguously LDR, while 32 bits per channel color values that represent [0..inf] are unambiguously HDR. If you use 16 bit colors and your range is [0..3] to allow for headroom and glare effects, it’s not really considered HDR, but it’s not exactly LDR either. It mostly only makes sense to work with HDR values in a linear color space, but it’s not strictly or by definition necessary.

“Linear sRGB” gets used sometimes, and what it means is that a color was converted from sRGB into linear. So the space is actually just linear. It’s useful to know if a color came from sRGB sometimes, because there’s less worry about going back into sRGB if you know it came from sRGB originally, and it helps you understand something about the scale and brightness of your linear values. If you have linear values and don’t know where they came from, you have no point of reference, and no idea if a value of 0.1 means invisible black, blinding white, or just middle gray. But unless I scale the linear colors, I do know a linear sRGB value of 0.1 is a dark gray that’s visible, because I know that the range of color is approximately [0..1] because it came from sRGB.


> You probably didn’t set your light or color intensities in three.js using physical units in the first place,

Nod... unless you're using a color space, like Jzazbz, with an absolute (not relative) transfer function. Application context then motivates the nit pick. So a Jzazbz HDR of [0..10k], might get a daylight 1k nits picked for diffuse (non-highlight) white. Thus I appreciate it when color libraries provide a distinct Absolute XYZ D65 data type.


This is a good point! Jzazbz is a good example of the (somewhat rare) application of perceptual response functions in high dynamic range. I’d be curious to hear what reasons you have to use Jzazbz. The color space seems best at scenarios where you need to compute HDR color differences. Most people don’t need that, but some advanced users like you do. Worth noting in this context that Jzazbz is non-linear, and even though it’s HDR, it still requires conversion to a linear space in order to blend colors physically.

I want to elaborate just slightly on what I meant. In CG, like in real photography, the photographer almost always sets some combination of their exposure, aperture size, and/or white balance values manually, regardless of how well they’re controlling inputs and color spaces. As a simple example, I can always dim the lights by a factor of 2 power, and also either expose for twice as long, or open my shutter by one f-stop. When I dim lights by half and expose by double, then nothing about the image changes. Even if you know the physical luminosity of your lights, the overall process (usually) still cancels scale factors and at the end reduces to an arbitrary scaling that essentially sets the output white and output black to the brightest and darkest colors I want to show, respectively.

With real film, there are additional arbitrary scaling values like the film development process, and lighting conditions when viewing. With digital, the display’s scaling is frequently unknown, and thus arbitrary; you might use 1k nits for your daylight diffuse, but that’s not what your monitor actually shows you.

There are legit use-cases for wanting absolute physical values in the intermediate stages, and using a “linear” color space for blending is, in a way, emulating exactly that desire but still with an arbitrary scale because the input brightness and/or output brightness are being tuned by hand.


> I’d be curious to hear what reasons you have to use Jzazbz

I needed a color space for teaching, so I'm using it to kludge wide-gamut hue linearity onto CAM16UCS. I wish I knew of something better to do.

Backstory is, I've long struggled to find a community of interest and discussion around the idea of transformatively improving science education content by collaboratively applying vastly greater-than-usual domain expertise. I thought to troll with a worked example, and chose "color". Even though "I'm building a hard thing solo, because I really don't want to build such things solo?!?"... sigh. Color is pervasively taught in early primary, but with such poor content and outcomes that foundational confusion persists even among first-tier physical-sciences graduate students. So: might color be taught more clearly by emphasizing spectra, in early primary? With web interactives? It seemed a question to explore.

So I needed a perceptual color space optimized for learning color. A rather different objective than usually drives color space design. Computability doesn't matter, as small lookup tables suffice. But noticeable features from eyeballing the space should be "yes, that's color perception", not "oh, no, that's an artifact of the model - try to ignore that aspect". The usual art/science education "one part careful, nine+ parts misleading bogosity" just doesn't end well. CAM16UCS is generally good at avoiding "that... doesn't look plausible". Better than other UCSs I skimmed. But... hue gets curvy near its edges, especially badly in blue. Jzazbz regrettably fails the "space looks plausible" test, but is wide-gamut, and has nice hue linearity. So I'm using it to kludge greater linearity onto CAM16UCS, between sRGB gamut and visible boundary. Which aids understandably coupling space to spectra.

Getting from daylight to banana to spectra and apparent spectra and color space provides a lot of opportunity to get things wrong. As does interactive draw-a-spectra. Swapping in Jzazbz for end-to-end HDR physical units provides the warm fuzzy of a sanity check.

Thus three.js intensities in nits, and an opportunity for Sunday punning.

It's just a silly niche use case, rather than advanced use, but modulo burn out, I'll be back at it tomorrow. Also... there's an order-of-magnitude luminance "why don't you see a sky full of stars" interactive somewhere on my infinite todo list, so getting perceptual luminance right had additional motivation. Though that will have to handle mesopic and scotopic perception... sigh. Appreciated your comments.


Hey that’s really cool and not silly or niche at all, I actually think it’s extra advanced to not just understand color so deeply but try to teach it at that level. Pretty hard core in my book, color is a surprisingly hard and deep subject.

I had a few good teachers along the way who taught me a ton about color and computer graphics. Peter Shirley is probably the top of my personal list, and he’s also very interested in improving science education, so maybe worth checking out his CG books. In his rendering class we bought a MacBeth color checker and had to take a photograph of it set inside a scene that we physically built out of whatever materials and light sources we wanted, and then we had to write our own renderers to match the photo. You really learn a lot about color trying to do that! :)

Pete and I have since designed a simple opponent-process color space aimed at ease of use that is perceptually uniform ‘enough’, but we haven’t published it yet. That could be a fun way to teach though: have students design their own color space! There also might be some interesting work going on in the area of digital color picker interfaces that could be useful for teaching. I’m forgetting who does this stuff, so can’t cough up any links at the moment, but I’ve definitely seen a couple of neat presentations recently that combine the science of perception and color spaces along with good design sensibilities to make interfaces that are demonstrably better for artists than the crappy RGB & HSV & other pickers we usually see today.


> color is a surprisingly hard and deep subject

Oh yeah. I was (hmm, am) hoping to dig out some limited cluster of concepts that gels as "oh, that's fun, and I'd never have thought you could teach that, or that way". Eg xkcd color name regions in a voxelized color space as a "learn color names" K-1 worksheet. Trying for a "light" (spectra) vs "color" (human perception) distinction, limiting scope on perception to early primary color topics.

Science education rarely does deep, integrated, or usable, so it's rather an open question what might be possible. And without a vision of that, there's little incentive for funding or exploring other than incremental improvement.

Even aside from creation pragmatics. "A big onboarding welcome to all our new science textbook author staff! Worry not that you are fresh liberal arts graduates, with no science background at all, for we've A Scientist on call!"(paraphrased)

> a simple opponent-process color space

Ah, neat. That could be fun.

Given my focus on spectra, I was also tempted by IGPGTG's "build on Gaussian spectra", but pruned.

> have students design their own color space!

Oh, there's an intriguing idea. I'd thought as far as "if I can't find a color space without misleading large artifacts, I'll have to offer a diverse swappable several, if students are to have any hope of distinguishing data from noise". But a direct manipulation "create your own color space"... hmmm. Perhaps evaluated with visual metrics? Like gradients for linearity. Or mixing for Euclidean-ness and hue order - reorder a hue circle, and get weirdness in the in-betweens? Neat approach - not "this is just how it is" but "hands on, you'll find this a sweet spot". Hmm, if say, "Pink is the most important color! So its variants should be the focus of my space!"... what might that look like? A "pink-set" of color names? Eg, not "green" but "anti-magenta"? Not "white" but "palest pink"? :) I so miss pre-covid brainstorming like this at MIT. :/

> a couple of neat presentations recently that combine the science of perception and color spaces along with good design sensibilities to make interfaces that are demonstrably better for artists

Very neat. If a link surfaces, I'd be interested. There's some work on simulating pigment spectra and interaction for "physically realistic" painting, and I wondered at their UI. And whether an MVP web app with such might serve as a hands-on antidote to the "subtractive rules are peers to additive" misconception.

> take a photograph of it set inside a scene that we physically built

I wondered how controlled lighting (filter gel on box, discrete LEDs, and tablets getting the narrow spectra of quantum-dot displays) might be leveraged. Eg, "sketch what you think this page/thing/scene will look like if lit like X, then try it and describe it, and take a picture"?

And sketched, sort of a point renderer - a block ui for playing with spectra. Eg, a Sun block, realistically pictured with red-tinted rim on white, with a spectra and white color, passed through an atmosphere block set at angle X, with this transmittance/reflectance/fluorescence strip, yielding this spectra and color, lighting this point on a multispectral image of fruit, with/yielding ditto. Seen by this camera to get rgb, shown by this display to get this spectra. Banana vs banana pixel. Filters, bounced sources, etc. And then perhaps challenges of "assemble blocks to get a spectra like this". Or to light this multispectral MacBeth or scene to match this snapshot. Or bantering, "your freehand or musically-keyboarded spectra looks like this strawberry icecream with blue M&Ms". A hands-on way to address the many misconceptions around lighting and color.

And then there's video shaders, filtering or fragmenting by color space and names, for interacting with color in the environment. And... the minor challenge of finding a coherent MV"P" in all of this. :)


When you take sRGB and apply an inverse gamma curve to it you end up in linear light space. Is that what you're referring to?

Which inverse gamma curve to use has no single answer; it depends on how you intend to map that linear light onto... whatever it is you're doing. The obvious choice is the nominal gamma of 2.2. That may or may not make sense depending on your display technology and what you're doing.


scRGB with the 16-bit encoding.


No, it's still sRGB. Usually it's referred to as "linear sRGB" or "linear Rec. 709" to make the distinction clear. The latter uses the same xy chromaticity coordinates for its R, G, and B primaries, but a slightly different transfer function.


You have misunderstood what some people mean when they say “linear sRGB”. That is a term that means linear color space with the gamut of what used to be an sRGB color. It is no longer an sRGB color, but it’s safer to transfer back into sRGB later, there are fewer worries about whether your colors will be clipped to the wrong bounds. The term “linear sRGB” helps you understand the provenance of your inputs, it does not mean your color values are in the sRGB color space.


For another nice discussion of this topic, see this older publication by Blinn (yes that Blinn): https://courses.cs.washington.edu/courses/cse576/03sp/readin...

In this older terminology, pre-multiplied alpha is called associated alpha. He goes through the math, but a key takeaway is: "... downsampling and, in fact, all filtering operations should operate on arrays of associated pixel colors ..."
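
Blinn's takeaway is easy to demonstrate with two pixels (a Python sketch, RGBA tuples, all names mine):

    def average(p, q):
        # the simplest possible downsampling filter
        return tuple((x + y) / 2 for x, y in zip(p, q))

    def premultiply(p):
        r, g, b, a = p
        return (r * a, g * a, b * a, a)

    red = (1.0, 0.0, 0.0, 1.0)    # opaque red
    empty = (0.0, 1.0, 0.0, 0.0)  # fully transparent; stored color is green junk

    # Straight (unassociated) alpha: the junk green bleeds into the result.
    print(average(red, empty))                            # (0.5, 0.5, 0.0, 0.5)

    # Associated (premultiplied) alpha: half-covered red, as it should be.
    print(average(premultiply(red), premultiply(empty)))  # (0.5, 0.0, 0.0, 0.5)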

I really enjoy Blinn's articles (and iq's :).


One reason “associated alpha” is a better name is that it allows doing useful things with images where the alpha isn't correlated to the colour. In the simplest example, you have zero alpha but you have non-zero (non-black) colour. That gives you a pure additive blending effect! But you can arbitrarily choose your alpha values, so you can choose how much occlusion you want and how much added light. Classic non-premultiplied alpha has none of this power.

I wish I could find the article I read that argued for this. It had a lovely demonstration of how you could use this to combine fire and smoke effects into a single texture (smoke occludes but fire effects tend not to).
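
The trick falls straight out of the premultiplied "over" operator; a small sketch (premultiplied RGBA tuples, names mine):

    def over(src, dst):
        # premultiplied "over": out = src + (1 - src_alpha) * dst
        sa = src[3]
        return tuple(s + (1 - sa) * d for s, d in zip(src, dst))

    background = (0.2, 0.2, 0.2, 1.0)
    fire = (1.0, 0.4, 0.0, 0.0)      # alpha 0, nonzero color: pure added light
    smoke = (0.05, 0.05, 0.05, 0.5)  # alpha 0.5: occludes half, adds some gray

    print(over(fire, background))   # (1.2, 0.6, 0.2, 1.0) -- additive glow
    print(over(smoke, background))  # (0.15, 0.15, 0.15, 1.0) -- partial occlusion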


Does anyone know of any good books that teach these concepts ground-up? I would think that this area of graphics would have dedicated books on the topic, but I have not found any. Color spaces, compositing, linear, srgb, all that good stuff.


Check out Andrew Glassner’s Principles of Digital Image Synthesis, it starts with an overview of color.

http://www.realtimerendering.com/Principles_of_Digital_Image...

Also these days there may be better and more focused resources online if you know where to look.

http://poynton.ca/notes/colour_and_gamma/ColorFAQ.html

http://www.cvrl.org/

https://www.handprint.com/HP/WCL/color2.html (https://www.handprint.com/CE/book.html)

https://www.realtimerendering.com/

http://paulbourke.net/miscellaneous/colourspace/ (Old but still useful)

The original compositing paper (which didn’t address issues of perception or non-linearity): https://keithp.com/~keithp/porterduff/p253-porter.pdf


Extraordinary list of resources, thank you!


The book you want is known within the graphics industry as "Foley Van Dam", after the original authors, but the actual title is "Computer Graphics: Principles and Practice". https://www.amazon.com/Computer-Graphics-Principles-Practice...

You may also be interested in ACM Transactions on Graphics, the Association for Computing Machinery's publication of computer graphics research papers. I suggest going to a university technical/research library, where you should be able to access the collection of issues from the 80's, where the original scan line, ray tracing, CSG, and pretty much every single advanced graphics technique (minus the deep learning) used today is documented by the original innovators.

At that same university research library they might have the collected set of course textbooks (mimeographs and photocopies) used for the 3-day long courses taught at SIGGRAPH every year.

These items are invaluable, and I reference them multiple times a year.


I only have the 2nd edition of Foley & van Dam’s CGPP, which predates sRGB; I assume the 3rd edition has been modernized? I wouldn’t have thought of this book as the first place to learn color & compositing specifically, but it’s probably decent. I haven’t referred to CGPP much lately, but maybe it has something to do with the time Andy van Dam loaned me one of his swimsuits and it was slightly too small for me. :P


This is incredibly useful, thank you so much!


Because I like being difficult, what about gamma-correction?


You need to go gamma -> linear before doing any linear algebra on your color values.

There is some debate over whether the alpha should be stored linearly or not in low-precision (8-bit) formats. I think the standard practice is linear, but there are good arguments it should be gamma encoded.

And, from there it’s important to point out that while pre-mul alpha is great for the math, having a color recorded pre-multiplied in an 8-bit-per-channel image doesn’t leave many possible values in the low range. So you can get a lot of banding, and dithering becomes important. You should probably store the image non-premultiplied and do the premul as part of the math at runtime. Unfortunately, almost all image editors and encoders make it impossible to control the RGB values of low- or zero-alpha pixels. They all assume those colors conceptually “don’t exist” and don’t matter. But they matter here, because the filtered transition between solid and transparent pixels is exactly the issue we are struggling with.
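
To see how little precision survives, a quick sketch of the quantization at one low alpha value:

    # At alpha = 16/255, a premultiplied 8-bit channel can only land on
    # round(c * 16 / 255) for c in 0..255 -- i.e. the integers 0..16.
    alpha = 16 / 255
    codes = {round(c * alpha) for c in range(256)}
    print(len(codes))  # 17 distinct values left, out of 256 -- hence the banding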


Do you have any recommended resources for self-learning a ground-up education in these concepts? Book, etc? I've struggled to find any good ones.


Not a book, but I had a similar question yesterday and ended up finding this article https://tomforsyth1000.github.io/blog.wiki.html#%5B%5BThe%20.... Goes into more detail than I thought I needed!


If you create a sRGB texture on a modern GPU, the R/G/B channels are stored in sRGB (functionally a lot like gamma correction in this case) while the alpha channel is always linear 0-255. So while the GPU's texture samplers will convert the R/G/B channels to linear before filtering or sending the result to a shader, nothing gets done to the alpha channel.
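
In other words (a Python sketch simulating the sampler behavior described above, not actual GPU code):

    def srgb_to_linear(c):
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    texel0 = (255, 0, 0, 255)  # 8-bit sRGB texel, opaque red
    texel1 = (0, 0, 0, 0)      # transparent black

    # 50/50 bilinear fetch: R/G/B are decoded to linear *before* filtering...
    rgb = [(srgb_to_linear(a / 255) + srgb_to_linear(b / 255)) / 2
           for a, b in zip(texel0[:3], texel1[:3])]
    # ...while alpha is just averaged as-is.
    alpha = (texel0[3] / 255 + texel1[3] / 255) / 2
    print(rgb, alpha)  # [0.5, 0.0, 0.0] 0.5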


Gamma should not have any effect on the alpha channel, which is linear by definition. The alpha channel represents the average coverage of (infinite) subpixels within the pixel. The color encoding of an object should not have any impact on the coverage within that pixel.

I recommend reading the technical memo "Alpha and the History of Digital Compositing" by Alvy Ray Smith [1] to get a better intuition on the matter.

[1] http://alvyray.com/Memos/CG/Microsoft/7_alpha.pdf


Gamma does however have an effect on blurring. And I suspect that because of gamma correction, applying a blur filter to a premultiplied image is (very subtly) incorrect.


Yes, I think the possibilities, in practice, for bad interactions between gamma-related and alpha-related operations are real.

"PS3 does sRGB conversion before alpha-blending, so the blending is done in gamma space, which is not quite right."

https://tomforsyth1000.github.io/blog.wiki.html


Applying a blur filter to a premultiplied image is (very subtly) _correct_, at least if the goal is to emulate what happens if you used a physical lens to blur the same image. Not only does postmultiplied alpha mess up the correct pixel values (as the original post shows), but even without alpha, you get a “halo” that is weird and unphysical.


I think corysama had the right of it: you need to do all the operations in linear colors, i.e.

original image -> convert to linear color space -> multiply alpha -> blur -> composite -> possibly convert back to sRGB if needed

The wrong way I meant was

original image -> multiply alpha -> gamma-correct-blur -> composite

where gamma-correct-blur = convert to linear color space -> blur filter -> convert to sRGB
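
Put together as a sketch on one row of (gray, alpha) pixels (Python, all names mine):

    def srgb_to_linear(c):
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    def linear_to_srgb(c):
        return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

    def box_blur(row):
        # 3-tap box filter with clamped edges, on per-pixel tuples
        n = len(row)
        return [tuple(sum(ch) / 3 for ch in
                      zip(*(row[max(0, min(n - 1, j))] for j in (i - 1, i, i + 1))))
                for i in range(n)]

    src = [(1.0, 1.0)] * 4 + [(0.0, 0.0)] * 4    # hard edge: opaque white | clear

    lin = [(srgb_to_linear(g), a) for g, a in src]       # 1. decode to linear
    pre = [(g * a, a) for g, a in lin]                   # 2. premultiply
    blurred = box_blur(pre)                              # 3. blur premultiplied
    backdrop = 0.5                                       # linear mid-gray backdrop
    comp = [g + (1 - a) * backdrop for g, a in blurred]  # 4. composite "over"
    print([round(linear_to_srgb(g), 3) for g in comp])   # 5. back to sRGB if needed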



