Hacker News | ProgramMax's comments

Yeah. I mentioned this elsewhere but repeating here:

I designed the article to be accessible and understandable for the average person. So I took some liberties like showing only HDR primaries and not deep diving into HDR transfer functions. People understand the primaries intuitively.

But you are right that a wide color image could also use those same primaries without being HDR.

My goal was to be as truthful as possible while still being digestible at a glance.

In the article, I linked to Chris Lilley's post which explains it more thoroughly for the technical people.


Author here. Hello everyone! Feel free to ask me anything. I'll go ahead and dispel some doubts I already see here:

- It isn't really a "new format". It's an update to the existing format.
- It is very backwards compatible.
-- Old programs will load new PNGs to the best of their capability. A user will still know "that is a picture of a red apple".

There also seems to be some confusion about how PNGs work internally. Short and sweet:
- There are chunks of data.
-- Chunks have a name, which says what data it contains. A program can skip a chunk it doesn't recognize.
- There is only one image stream.
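To make the chunk layout concrete, here is a minimal Python sketch that walks the chunks of a PNG byte string. It assumes the standard layout: an 8-byte signature, then chunks made of a 4-byte big-endian length, a 4-byte type, the data, and a CRC-32 over type plus data. The helper names and the private `deMo` chunk are made up for illustration; this is not how libpng is structured.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data) for every chunk, verifying each CRC."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("missing PNG signature")
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", png[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(chunk_type + data) != crc:
            raise ValueError(f"bad CRC in {chunk_type!r}")
        yield chunk_type, data
        pos += 12 + length


def build_demo_png() -> bytes:
    """Build a 1x1 red RGB PNG with one extra, made-up ancillary chunk."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)  # 1x1, 8-bit RGB
    idat = zlib.compress(b"\x00\xff\x00\x00")  # filter byte 0 + one red pixel
    return (
        PNG_SIGNATURE
        + make_chunk(b"IHDR", ihdr)
        + make_chunk(b"deMo", b"ignored by old decoders")  # hypothetical chunk
        + make_chunk(b"IDAT", idat)
        + make_chunk(b"IEND", b"")
    )


# An "old" decoder only knows these types and skips everything else.
KNOWN = {b"IHDR", b"IDAT", b"IEND"}
DEMO_PNG = build_demo_png()
seen = [chunk_type for chunk_type, _ in iter_chunks(DEMO_PNG)]
recognized = [chunk_type for chunk_type in seen if chunk_type in KNOWN]
```

The decoder never needs to understand `deMo`: the length field tells it how far to jump to reach the next chunk.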


Do you have any examples on hand of PNGs that use the new features of the spec? It would be cool to see a little demo page with animated or HDR images, especially to download to test if our programs support them yet.


Sure!

Chris Lilley--one of the original PNG co-authors--has a post with an example HDR image: https://svgees.us/blog/cICP.html It is about halfway down, with the birthday cake. Generally, we tech nerds have phones that are capable of displaying it well. So perhaps view the page on your phone.

What you should look for is the cake, the pink tips in her hair, and the background being more vivid. For me, the pink in the cake was the big give-away.

There are also the Web Platform Tests (WPT), which we use to validate browser support: https://wpt.fyi/results/png/cicp-chunk.html?label=master&lab...

Although, that image is just a boring teal. See it live in your browser here: https://wpt.live/png/cicp-chunk.html

For an example of APNG, you can use Wikipedia's images: https://en.wikipedia.org/wiki/APNG

But you have a bigger point: I should have live demonstrations of those things to help people understand.


Thank you for the examples. I tried the one with a pink cake. Turns out that on my machine only web browsers are capable of displaying the image properly. All viewers (IrfanView, XnView, Nomacs, Windows Photos) and editors (Paint .NET, GIMP) that I've tried only showed the "washed out" picture.


Yeah. We were able to get buy-in from some big players. We cannot contact every group, though. My hope is since big players have bought in, others will hear the message and update their programs.

Sooooo file some bugs :D

Also, be kind to them. This literally launched yesterday.


The creator of photopea.com is very responsive to user suggestions. I’d recommend contacting him if you haven’t already.


It's interesting that Paint.NET supports the vivid image if you screenshot the cake (Win+Shift+S) and paste it. But, opening the PNG opens up the washed out picture.


Huh, for some reason GIMP doesn't even show the usual color space conversion dialog.


> But you have a bigger point: I should have live demonstrations of those things to help people understand.

Pink can pose problems for individuals with red-green color blindness (or more exactly: color vision deficiency). So make sure that examples work for these people too. Otherwise the examples might not work for about 8% of your male viewers.


I never realized how limited sRGB is. I guess this is why people liked CRT TVs, and why you could never watch analog TV properly on a PC screen.


It's really not that limited. The problem only arises when you reinterpret a larger gamut as sRGB without doing the proper conversion, which is when things look washed out.


That's what I thought too, but the difference is big. You'd think you maybe lose some color lights, or very bright flowers, but no, colors outside sRGB are common.

There was nothing you could do about the TV, the screen couldn't show all the colors that you needed.


I can see a clear difference between the images in Firefox on macOS with my M1 MacBook. Very nice.


Thanks, I appreciate all of these links. :)


You’re awesome. Thanks for making things better.


> It isn't really a "new format". It's an update to the existing format. - It is very backwards compatible. -- Old programs will load new PNGs to the best of their capability. A user will still know "that is a picture of a red apple".

This is great but also has the issue that users might not notice that their setup is giving them a less than optimal result. Of course that is probably still better than not having backwards compatibility.

Edit: Seems the backwards compatibility isn't as great as it could be. Old programs show a washed out image instead which sucks. This should have been avoidable in the same way JPG gain maps work so that you only need updated programs to take advantage of the increased gamut on more-than-sRGB screens and not to correctly show colors that fit into sRGB.


PNG Fourth Edition, which we are working on now, is likely to add gain maps.

However, gain maps are extra data. So there is a trade off.

The reason gain maps didn't make it into Third Edition is that they aren't yet a formal standard. We have a bunch of the work ready to go once their standard launches.


If you are really going to do something new, I recommend building on existing work that is very good at this. For example, HALIC (High Availability Lossless Image Compression). It is extremely fast, has a very good compression ratio, and uses very little memory. It also already has very strong multithreading support. I think something like this would be great for the new PNG. Of course, we don't know what the author of HALIC thinks about this.


Does it have any advantage over Lossless encoding in JPEG XL?


Yes, lots.

The big one is adoption. I love JPEG XL and hope it becomes more widely adopted. It is very scary to add a format to a browser because you can never remove it. Photoshop and MSPaint no longer support ICO files, but browsers do. So it makes sense for browsers to add support last, after it is clearly universal. I think JPEG XL is well on its way using this approach. But it isn't there yet and PNG is.

There is also longevity and staying power. I can grab an ancient version of Photoshop off eBay and it'll support PNG. This also benefits archivists.

As a quick side note on that: I want people to think about their storage and bandwidth. Have they ever hit storage/bandwidth limits? If so, were PNGs the cause? Was their site slow to load because of PNGs? I think we battle on file size as an old habit from the '90s image compression wars. Back then, we wanted pixels on the screen quickly. The slow image loads were noticeable on dial-up. So file size was actually important then. But today?? We're being penny-wise and pound-foolish.


>we'll be researching compression updates for PNG Fifth Edition.

What sort of improvements might we expect? Is there a chance of it rivalling lossless WebP and JPEG XL?


Our first goal is to see what we can get for "free" right now. Most of the time, people save a PNG which is pretty far from optimally compressed. Then they compare that to another format which is closer to optimal and draw a poor comparison.

You can see this with PNG optimizers like OptiPNG and pngcrush.

So step 1 is to improve libpng. This requires no spec change.

Step 2 is to enable parallel encoding and decoding. We expect this to increase the file size, but aren't sure how much. It might be surprisingly small (a few hundred bytes?). It will be optional.

Step 3 is the major changes like zstd. This would prevent a new PNG from being viewable in old software, so there is a considerably higher bar for adoption. If we find step 1 got us within 1% of zstd, it might not be worth such a major change.

I don't yet know what results we'll find or if all the work will be done in time. So please don't take this as promises or something to expect. I'm just being open and honest about our intentions and goals.
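For what "free" means in step 1, here is a small Python sketch. It only shows that the same bytes can be encoded as different valid zlib streams of different sizes, all of which every existing decoder can read. The test data is a made-up stand-in for filtered scanlines, and the two compression levels are arbitrary; this is not how libpng, OptiPNG, or pngcrush choose their settings.

```python
import zlib

# Hypothetical stand-in for PNG scanline data: 256 rows, each a filter byte
# (0 = None) followed by a 256-pixel grayscale ramp.
raw = b"".join(b"\x00" + bytes(range(256)) for _ in range(256))

# Two encoders, same input, same output format, different effort.
fast = zlib.compress(raw, 1)
best = zlib.compress(raw, 9)

# An unmodified decoder recovers identical pixels from either stream.
roundtrip_fast = zlib.decompress(fast)
roundtrip_best = zlib.decompress(best)

print(f"raw={len(raw)} level1={len(fast)} level9={len(best)}")
```

Because the decoder side is untouched, any gains found this way need no spec change and no new viewer.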


Solutions such as OptiPNG and Pngcrush require extra processing power on top of the already slow PNG. But in most cases they are still behind.


So, I'm a big fan of metaformats with generalized tooling support. Think of e.g. Office Open XML or ePub — you don't need "an OOXML parser" / "an ePub parser" to parse these; they're both just zipped XML, so you just need a zipfile library and libxml.

For the lifetime of PNG so far, a PNG file has almost, but just barely not, been a valid Interchange File Format (IFF) file.

IFF is a great (simple to understand, simple to implement support for, easy to generate, easy to decode, memory-efficient, IO-efficient, relatively compact, highly compressible) metaformat, that more people should be aware of.

However, up to this point, the usage of IFF has consisted of:

• some old proprietary game-data and image formats from the 1980s that no modern person has heard of

• some popular-yet-proprietary AV formats [AIFF, RIFF] that nobody would write a decoder for by hand anyway (because they would need a DSP library to handle the resulting sample-stream data anyway, and that library may as well offer container-format support too)

• The object files of an open but uncommon language runtime (Erlang .beam files), where that runtime exposes only high-level domain-specific parsing tooling (`beam_lib`) rather than IFF-general decoding tooling

• An "open-source but corporate-steered" image format that people are wary of allowing to gain ecosystem traction (WebP — which is more-specifically a document in a RIFF container)

• And PNG... but non-conformantly, such that any generic IFF decoder that could decode the other things above, would choke on a PNG file.

IMHO, this is a major reason that there is no such thing as "generalized IFF tooling" today, despite the IFF metaformat having all the attributes required to make it the "JSON of the binary world". (Don't tell me about CBOR; ain't nobody hand-rolling a CBOR encoder out of template strings.)

If you can't guess by now, my wishlist item for PNGv3, is for PNG files to somehow become valid/conformant IFF files — such that the popularity of PNG could then serve as the bootstrap for a real IFF tooling ecosystem, and encourage awareness/use of IFF in new greenfield format-definition use-cases.

---

Now, I've written PNG parsers, and generic IFF parsers too. I've even tried this exact unification trick before (I wanted an Erlang library that could parse both .beam files and PNG files. $10 if you can guess the use-case for that!)

Because of this, I know that "making PNG valid per IFF" isn't really possible by modifying the PNG format, while ensuring that the resulting format is decodable by existing PNG decoders. If you want all the old [esp. hardware] PNG parsers to be compatible with PNGv3s, then y'all can't exactly do anything in PNGv3 like "move the 4-byte CRC inside the chunk as measured by the 4-byte chunk length" or "make the CRCs into their own chunks that reference the preceding record".

But I'm not proposing that. I'm actually proposing the opposite.

Much of what PNGv2 did in contravention of the IFF spec, is honestly a pretty good idea in general. It's all stuff that could be "upstreamed" — from the PNG level, to the IFF level.

I propose: formalizing "the variant of IFF used in PNG" as its own separate metaformat specification — breaking this metaformat out from the PNG spec itself into its own standards document.

This would then be the "Interchange File Format specification, version 2.0" (not that there was ever a formal IFFv1 spec; we all just kind of looked at what EA/Commodore had done, and copied it in our own code since it was so braindead-easy to implement.)

This IFF 2.0 spec would formalize, at least, a version or "profile" of IFF for which PNGv2 images are conformant files. It would have chunk CRCs; chunk attribute bits encoded for purposes of decoders + editors via meaningful chunk-name letter-casing; and an allowance for some number of garbage bytes before the first valid chunk begins (for PNG's leading file signature that is not itself a valid IFF chunk.)

This could be as far as the IFF 2.0 spec goes — leaving IFFv1 files non-decodable in IFFv2 parsers. But that'd be a shame.

I would suggest going further — formalizing a second IFFv2 "profile" against which IFFv1 documents (like AIFF or RIFF files) are conformant; and then specifying that "generic" IFFv2-conformant decoders (i.e. a hypothetical "libiff", not a format-specific libpng) MUST implement support for decoding both the IFFv1-conforming and the PNGv2-conforming profiles of IFF.

It could then be up to the IFF-decoding-tooling user (CLI command user, library caller) to determine which IFFv2 "profile" to apply to a given document... or the IFFv2 spec could also specify some heuristic algorithm for input-document "profile" detection. (I think it'd be pretty easy; find a single chunk, and if what follows its chunk-length is a CRC that validates that chunk, then you have the PNGv2-like profile. Whereas if it's not that, but is instead four bytes of chunk-name-valid character ranges, then you've got the IFFv1-like profile. [And if it's neither, then you've got a file with a corrupted first chunk.])
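A rough Python sketch of that detection heuristic, under my own assumptions: the PNG-like profile is taken to start with the 8-byte PNG signature followed by length, type, data, CRC; the IFFv1-like profile is taken to be ID, length, data, optional pad byte, with IDs limited to printable ASCII. The function and profile names are invented for illustration.

```python
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def is_iff_id(candidate: bytes) -> bool:
    """True if the 4 bytes fall in the printable-ASCII range used for IFF IDs."""
    return len(candidate) == 4 and all(0x20 <= c <= 0x7E for c in candidate)


def detect_profile(buf: bytes) -> str:
    """Guess which chunk layout a document uses by checking its first chunk."""
    # PNG-like: signature, then [length][type][data][CRC over type + data].
    if buf[:8] == PNG_SIGNATURE:
        body = buf[8:]
        if len(body) >= 12:
            length = int.from_bytes(body[0:4], "big")
            end = 8 + length
            if len(body) >= end + 4:
                stored = int.from_bytes(body[end:end + 4], "big")
                if zlib.crc32(body[4:end]) == stored:
                    return "png-like"
    # IFFv1-like: [id][length][data][pad to even], then EOF or another id.
    if len(buf) >= 8 and is_iff_id(buf[0:4]):
        length = int.from_bytes(buf[4:8], "big")
        nxt = 8 + length + (length & 1)
        if nxt == len(buf) or is_iff_id(buf[nxt:nxt + 4]):
            return "iffv1-like"
    return "unknown"


# Tiny hand-built samples: a signature plus an empty IEND chunk, and a FORM.
png_sample = (
    PNG_SIGNATURE
    + (0).to_bytes(4, "big")
    + b"IEND"
    + zlib.crc32(b"IEND").to_bytes(4, "big")
)
iff_sample = b"FORM" + (4).to_bytes(4, "big") + b"DEMO"
```

A real spec would have to pin down the edge cases (truncated input, a corrupted first chunk) that this sketch lumps into "unknown".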

---

And, if you want to go really far, you could then specify a third entirely-novel "profile", for use in greenfield IFF applications:

• A few bytes of space aren't so precious; we can hash things much faster these days, with hardware-accelerated hashing instructions; and those instructions are also for hashes that do much better than CRC to ensure integrity. So either replace the inline CRCs with CRC chunks, or with nested FORM-like container records (WCRC [len] [CRC4] [interior chunk]). Or just skip per-chunk CRCs and formalize a fHsh chunk for document-level integrity, embedding the output of an arbitrary hash algorithm specified by its registered https://github.com/multiformats/multihash "hash function code".

• Re-widen the chunk-name-valid character set to those valid in IFFv1 documents, to ensure those can be losslessly re-encoded into this profile. To allow chunks with non-letter characters to have a valid attribute decoding, specify a document-level per-chunk-name "attributes of all chunks of this type" chunk, that can either be included into a given concrete format's header-chunk specification, or allowed at various points in the chunk stream per a concrete format's encoding rules (where it would then be expected to apply to any successor + successor-descendant chunks within its containing chunk's "scope.") Note that the goal here is to keep the attribute bits in some way or another — they're very useful IMHO, and I would expect an IFF decoder lib to always be emitting these boolean chunk-attribute fields as part of each decoded chunk.

• Formalize the magic signature at the beginning into a valid chunk, that somehow encodes 1. that this is an IFF 2.0 "greenfield profile" document (bytes 0-3); 2. what the concrete format in use is (bytes 4-7). (You could just copy/generalize what RIFF does here [where a RIFF chunk has the semantics of a LIST chunk but with a leading 4-byte chunk-name type], such that the whole document is enclosed by a root chunk — though this is painful in that you need to buffer the entire document if you're going to calculate the root-chunk length.)

I'm just spitballing; the concrete details of such a greenfield profile don't matter here, just the design goal — having a profile into which both IFFv1 and PNGv2 documents could be losslessly transcoded. Ideally with as minimal change to the "wider and weirder/more brittle ecosystem" side [in this case that's IFFv1] as possible. (Compare/contrast: how HTML5 documents are a profile of HTML that supersedes both HTML4 and XHTML1.1 — supporting both unclosed tags and XML-namespaced element names — allowing HTML4 documents to parse "as" HTML5 without rewrites, and XHTML1.1 documents to be transcoded to HTML5 by just stripping some root-level xmlns declarations and changing the doctype.)


Strangely, I was familiar with AIFF and RIFF files but never made the connection that they're both IFF. I hadn't known about IFF before your post. Thank you :)

W3C requires that we do not break old, conformant specs. Meaning if the next PNG spec would invalidate prior specs, they won't approve it. By extension, an old, conformant program will not suddenly become non-conformant.

I could see a group of people formalizing IFFv2, and adapting PNG to it. But that would effectively be PNGIFF, not PNG. It would be a new spec. Because we cannot break the old one.

That might be fine. But it comes with a new set of problems, like adoption.

Soooo I like the idea but it would probably be a separate thing. FWIW, it would actually be nice to make a formal IFF spec. If there was no governing body that owns it, we can find an org and gather interest.

I doubt W3C would be the right org for it. ISO subgroup??


They pretty much say the same thing halfway through. Don't change PNG but adapt IFF to work with PNG's flavour of IFF.


Right. Sorry, that was supposed to be a "yes, and..." to provide some additional context.


We really shouldn't be making new standards with big endian byte order.

It's also questionable how much you actually benefit from common container formats like this since you need to know the application specific format contained anyway in order to do anything useful with it. It also causes problems where "smart" programs treat files in ways that make no sense, e.g. by offering to extract a .docx file just because it looks like a .zip


> you need to know the application specific format contained anyway in order to do anything useful with it

One neat thing about IFF is that all of its "container" chunk types (LIST, FORM, CAT) are part of the standard; the expectation is that domain-specific chunk types should [mostly] be leaf nodes. As such, IFF is at least "legible" in the same way that XML or JSON or Lisp is legible (and more than e.g. ELF is legible): you're meant to decompose an object graph into individual IFF chunks for each object in the graph. Which translates to IFF files being "browseable", rather than dead-ending in opaque tables that require some other standard to tell you how they're even row-delimited.
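A small Python sketch of what "browseable" means in practice, assuming the classic layout (4-byte ID, 4-byte big-endian length, data, pad byte to an even boundary) and that container chunks start with a 4-byte type followed by child chunks. The `DEMO`, `THMB`, and `XYZ1` names are made up; the walker knows nothing about them and still prints the whole tree.

```python
CONTAINERS = {b"FORM", b"LIST", b"CAT "}


def chunk(chunk_id: bytes, data: bytes) -> bytes:
    """Encode one chunk; odd-length data gets a pad byte not counted in length."""
    pad = b"\x00" if len(data) % 2 else b""
    return chunk_id + len(data).to_bytes(4, "big") + data + pad


def walk(buf: bytes, depth: int = 0):
    """Yield (depth, id, container_type_or_None) for every chunk in buf."""
    pos = 0
    while pos + 8 <= len(buf):
        chunk_id = buf[pos:pos + 4]
        length = int.from_bytes(buf[pos + 4:pos + 8], "big")
        data = buf[pos + 8:pos + 8 + length]
        if chunk_id in CONTAINERS:
            # Standard containers: 4-byte type, then nested chunks.
            yield depth, chunk_id, data[:4]
            yield from walk(data[4:], depth + 1)
        else:
            # Anything else is an opaque leaf we can list without decoding.
            yield depth, chunk_id, None
        pos += 8 + length + (length & 1)


# Hypothetical document: a FORM holding a name, a nested FORM, and a chunk
# type this code has never heard of.
doc = chunk(
    b"FORM",
    b"DEMO"
    + chunk(b"NAME", b"apple")
    + chunk(b"FORM", b"THMB" + chunk(b"BODY", b"\x01\x02"))
    + chunk(b"XYZ1", b"??"),
)

tree = list(walk(doc))
for depth, chunk_id, kind in tree:
    label = chunk_id.decode("ascii") + (f" ({kind.decode('ascii')})" if kind else "")
    print("  " * depth + label)
```

Note that the three-line `chunk` helper is also the entire encoder, which is part of the appeal.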

Another neat thing is that, like with namespaced XML element names, chunk names — at least the "public" ones — are meant to have globally-unique meanings, being registered in a global registry (https://wiki.amigaos.net/wiki/IFF_FORM_and_Chunk_Registry). This means that IFF tooling can "browse" an arbitrary unknown IFF document, find a chunk it does understand the meaning of, and usefully decode it (and maybe its descendants) for you.

Many more-complex IFF formats (e.g. the AV containers like RIFF) embed data of other media types as chunks of these registered types. Think "thumbnail in a video file" or "texture in a scene file." Your tooling doesn't need to know the semantics of the outer format, to be able to discover these registered inner chunks inside it, and browse/preview/extract them. (Or replace them one-for-one with another asset of the same type; or even, if they're inside a simple LIST chunk, add or remove instances of the asset from the list!)

Also, somewhat interestingly, given the way IFF is structured, there is no inherent difference between embedding a sub-resource "opaquely" vs embedding it "legibly" — i.e. if you embed a [headerless] IFF document as the value of a chunk in another IFF document, then that's exactly the same thing as nesting the root-level chunk(s) of that sub-document within the parent chunk. It's like how an SVG sub-document inside an XHTML document isn't a separate serialized blob that gets sucked out and parsed, but rather just additional tags in the XHTML document-string, around which a boundary of "this is a separate XML sub-document" gets drawn by some "DOM document builder" code downstream of the actual XML parser.

---

But besides the technical "it can be done" points, let me also speak more in terms of the motivation. Why would you want to?

Well, have you ever wanted to open up a complex file and pull its atomic-level assets out? Your first thought when hearing that was probably "that sounds like a nightmare" — and yes, today, it is.

But back in the 1980s, with the original growth of IFF-based formats, we temporarily lived in this wonderland where there were all these different browseable / explorable file formats, that could be cracked open with exactly the same tools.

Do you wonder how and why the game modding scene first came into existence? It was basically the result of games storing their asset packs in these simple-to-parse/generate file formats — where people could easily drop-in replace one of those assets with a new one with simple command-line tools, or even with a GUI, without worrying about matching asset sizes / binary offset patching / etc — let alone with any knowledge of how the container file format works.

Do you appreciate how macOS app bundles just have a browseable, hierarchical Resources directory inside them? Before app bundles, macOS applications held their resources in a "resource fork" — essentially a set of FourCC-tagged file extended-attributes (though actually, a single on-disk packfile that acted as a random-access key-value store of those xattrs). And both of these approaches (bundle Resources dirs, and resource forks) provided the same explorability / moddability as IFF files do. People would throw a macOS program into ResEdit and pull out its icons, its fonts, its strings, whatever — where those weren't program-domain-specific things, but rather were effectively items with standardized media types (their FourCC codes being effectively the predecessor of modern MIME types.)

For that matter, consider this quote from the IFF wiki page:

> There are standard chunks that could be present in any IFF file, such as AUTH (containing text with information about author of the file), ANNO (containing text with annotation, usually name of the program that created the file), NAME (containing text with name of the work in the file), VERS (containing file version), (c) (containing text with copyright information).

Now, remember that IFF decoders are almost always expected / coded to ignore chunks they don't understand. (Especially for IFF files encoded as a toplevel stream of heterogeneous chunk types.)

That means that not only can various format authors decide to use these standard chunks... but third-party editors can also just drop chunks like this into the things they edit! You know how Windows has that "name, author, version" etc info on the Properties sheet for some file types? That info could show up and be editable for any IFF-based file format — whether the particular format has an "allowance" for it or not.

(There's nothing special about IFF here, by the way. You could just as well drop "foreign-namespaced attributes" like this into an e.g. XML-based document format. The difference is a cultural one: the developers of XML-based document formats tend to have their XML decoders validate their documents for strict conformance to an XML schema; and XML schemas tend to be [but don't have to be!] designed as whitelists of the possible tags that can be used within any given parent nesting path. IFF, meanwhile, has never had anything like a schema-based document validation. Every document was best-effort parsed, like HTML4; and so every IFF-based format decoder is a best-effort decoder, like a web browser parsing HTML4. That very lack of schema-based validation, actually opens up a lot of use-cases for IFF.)


(Separate reply for space)

> We really shouldn't be making new standards with big endian byte order.

IFF isn't a wire protocol standard for efficient zero-copy; and nor is it intended for file formats amenable to being streaming-parsed.

And that's okay! Not every format needs to be suited to efficient, scalable, concurrent, [other lovely words] message passing!

IFF has two major use-cases:

1. documents that are "loaded" in some program, where "loading" is expected to occur against a random-access block device; where each chunk will be visited in turn, with either its contents being parsed into an in-memory representation; its contents' slicing bounds being stored to later stream or random-access within (or the part of the file within those bounds being mmap(2)ed — same thing); or that chunk discarded, thus allowing the load operation to skip issuing any read ops for it or its descendants entirely.

This is the PNG use-case.

(Though, interestingly enough, since PNG has only one large chunk — the image data — PNG can be made into an "effectively-streamed format" simply by keeping that big chunk at the end of the IFF document. Presuming the stream length of the PNG file is known [as in a regular HTTP fetch], the "skeleton load" process for PNG can terminate after just having parsed its way through all the other tiny chunks — perhaps with a few minimal buffer waits to skip over unknown chunks — but with no need to buffer the entire image data chunk. [It adds the image-data-chunk length to the file pointer, realizes there's no more room for chunks in the stream, and so doesn't bother to buffer+seek past that final chunk.] The IFF parser then returns to the caller, passing it the slicing bounds of [among other things] the (still not-yet-fully-received) image-data chunk. And the caller can then turn around, and hand the same FILE pointer and those slicing bounds to its streaming renderer, letting it go to town consuming the stream as needed.)

IFF in its skeleton-loading model, would also be ideal for something like e.g. a font file (which has lots of little tables, which are either eagerly parsed, or ignored, by any given renderer.)

2. simple "read-rarely" packfile documents, that act sort of like little databases, but without any sort of TOC header part; where, when you want to grab something from the packfile, you re-navigate down through it from the root, taking the IOPS hit from all the seeks to each nesting-parent chunk's preceding sibling chunks before hitting the descendant you want to navigate into.

This is the use-case of most IFFv1 file formats — most of them were made for use by programs that would grab this or that for the program's use either once at startup, or when the thing became relevant. (Think of the types of things a Windows executable embeds as "resources" — icons, translated strings, XAML declarative-MVC-view documents, etc.)

For a parallel, IFF here is to "using an entire archive-format library like tar or zip to store these assets for random access", as "spitting CSV/XML out using template strings" is to using a library to encode a table to a Parquet/ORC/etc. table.

The parallel is that in both cases, you're trading some performance and robustness, for massively reduced complexity and ease of implementation. Like with emitting CSV, you can slop together an IFF encoder right there inside your data-emitting logic — in any language that can write out binary files, and without even having access to the Internet, let alone adding a dependency on an encoder package in some package ecosystem. You can do it in C; you can do it in assembly; you can do it in a bash script; you can do it in BASIC; you can do it in a Windows batch file; you can do it in your single-file Python or Ruby or Perl script that lives in your repo. You can probably do it in a Makefile!

(Also, given how IFF parsing works [i.e. given that any given chunk's contents is in superposition of being either an opaque binary slice or a potential stream of child chunks, with a streaming event-based parser able to decide at each juncture whether to take that step of decoding the child chunks or to leave them as an undecoded binary for now], if you start to care about performance, you can just stick some memoization in front of your "fetch a key-path-lens KP from document D" function, and now you're building a just-in-time TOC. And obviously you can put TOC chunks in your IFF-based file formats if you want — though IMHO doing so kind of goes against the spirit of IFF.)

---

In neither of those use-cases does it really matter that lengths require reading four bytes one-at-a-time with left-shifts, rather than being able to just plop the four bytes into a register. These aren't cases where the parse overhead of the structural glue between the data will ever be non-trivial relative to the time it takes to consume the data itself.
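For reference, the "four bytes with left-shifts" read looks like this in Python, next to the two built-in ways of doing the same thing. The sample bytes are just an illustrative length field with the value 13.

```python
import struct


def read_u32_be(buf: bytes, offset: int = 0) -> int:
    """Assemble a big-endian 32-bit unsigned integer one byte at a time."""
    return (
        (buf[offset] << 24)
        | (buf[offset + 1] << 16)
        | (buf[offset + 2] << 8)
        | buf[offset + 3]
    )


length_field = b"\x00\x00\x00\x0d"  # most significant byte first

manual = read_u32_be(length_field)
builtin = int.from_bytes(length_field, "big")
(unpacked,) = struct.unpack(">I", length_field)
```

All three produce the same integer; the cost is a handful of operations per chunk header, against however many bytes of chunk data follow.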

And even if you did want to use IFF for something crazy, like as a substitute for Protobuf: did you know that most modern CPU ISAs have a byte-shuffle instruction that can transform big-endian into little-endian [among an unbounded number of other potential transformations] in a single cycle? Endian-ness did matter in protocol design for a while... but these days, unless you're e.g. a Google engineer designing a new SAN protocol, and optimizing it for message-handling overhead on your custom SDN L7 network-switch silicon that doesn't have a shuffle op... endian-ness is mostly irrelevant again!


It would be nice if PNG supported no compression. That is handy in many situations.


PNG previously supported ICC v2. That was updated to ICC v4. However, neither of these are capable of HDR.

Maybe iccMAX supports HDR. I'm not sure. In either case, that isn't what PNG supported.

So something new was required for HDR.


> However, neither of these are capable of HDR.

How so? As far as I can tell, the ICCv2 spec is very agnostic as to the gamut and dynamic range of the output medium. It doesn't say anything to the extent of "thou shalt not produce any colors outside the sRGB gamut, nor make the white point too bright".

Unless HDR support is supposed to be something other than just the primaries, white point, and transfer function. All the breathless blogspam about HDR doesn't make it very clear what it means in terms of colorspaces.


> How so? As far as I can tell, the ICCv2 spec is very agnostic as to the gamut and dynamic range of the output medium. It doesn't say anything to the extent of "thou shalt not produce any colors outside the sRGB gamut, nor make the white point too bright".

That’s precisely what makes it unsuitable for HDR. With PQ, (1, 1, 1) means 10 000 cd/m² – if you simply create an ICC profile with the PQ transfer function, an image that looks right on a hypothetical 10 000 cd/m² monitor will look way too dim when naïvely scaled down (as ICC-type colour management would have you do) to the 300 cd/m² of a typical monitor. HLG, meanwhile, has a transfer function that depends on the peak luminance, which is not possible to do with ICC (the profile would have to assume a specific peak luminance), and the reason that it does that is to preserve the subjective perception of the image.
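A minimal Python sketch of the PQ point, using the constants from SMPTE ST 2084 (also in BT.2100). `pq_eotf` maps a signal in [0, 1] to absolute luminance; `naive_relative` is my own illustration of the "just scale to the display's peak" approach described above, not anything an ICC engine literally calls.

```python
# PQ (SMPTE ST 2084) constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PQ_PEAK = 10000.0  # cd/m² encoded by a signal value of 1.0


def pq_eotf(signal: float) -> float:
    """Map a non-linear PQ signal in [0, 1] to absolute luminance in cd/m²."""
    p = signal ** (1 / M2)
    return PQ_PEAK * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)


def naive_relative(signal: float, display_peak: float) -> float:
    """Hypothetical relative scaling: treat PQ's 10 000 cd/m² as display white."""
    return pq_eotf(signal) * display_peak / PQ_PEAK


intended = pq_eotf(0.5)            # what the content asks for
shown = naive_relative(0.5, 300.0)  # what a 300 cd/m² display would emit

print(f"signal 0.5: intended {intended:.1f} cd/m², naively scaled {shown:.2f} cd/m²")
```

A mid-range signal of 0.5 asks for roughly 92 cd/m², which a 300 cd/m² monitor can show as-is; the relative scaling instead pushes it down to under 3 cd/m², which is the "way too dim" result.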

So, sure, you can prepare an HDR image so that it looks right on a monitor with a 1000 cd/m² peak luminance, describe the colorspace in relative terms using an ICC profile, and you will have “done HDR using ICC”, but that’s arguably a very low bar for “supporting HDR”.


IIRC (been a while), the reason was ICCv2/v4 still requires a gamma function. And PQ is not a gamma function. Maybe they can cover HLG, but if we want to represent any given HDR content, we needed something more than ICCv2/v4.


That doesn't sound quite right to me. ICCv2's 'curveType' gives the option of a full lookup table instead of a simple gamma function. Maybe it has to do with ICCv2 saying that the reference viewing condition has an illumination level of 500 lx for the perceptual intent? (But how does that apply to non-reflective media?)

I don't doubt that there are lots of problems in the chain from RGB samples to display output, but I'm finding this whole thing horribly confusing. Wikipedia tries to distinguish 'HDR' transfer functions like PQ [0] from 'SDR' transfer functions in terms of their absolute luminance, but the ICC specs are just filled with relative values all the way down.

(Not to mention how much these things get fiddled with in practice. Once, I had the idea of writing a JPEG decoder, so I looked into how exactly to convert between sRGB and Rec. 601 YCbCr coordinates. I thought, "I know, I'll just use the standard-defined XYZ conversions to bridge between them!" But psych, the ICC sRGB profile has its own black point scaling that the standards don't tell you about. I'm still not sure what the correct answer is for "these sRGB coordinates represent the exact same color as these Rec. 601 YCbCr coordinates".)

[0] https://en.wikipedia.org/wiki/Perceptual_quantizer


Agreed that it gets confusing. That's a piece of why I'm unable to give you a solid answer. This isn't my area of expertise.

Here is what I can tell you confidently: The original plan was to provide an ICC profile that approximates PQ as best as we could. But it wasn't enough. So the proposal was to force the profile name to be a special string. When a PNG decoder saw that name, it would ignore the ICC profile and do actual PQ.

Here is that original proposal: https://w3c.github.io/png-hdr-pq/
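
A rough sketch of what that name check could look like, using only the standard PNG chunk layout (length, type, data, CRC) and the iCCP layout (profile name, null byte, compression method, compressed profile). The function names are made up for illustration, and the magic string is my recollection of what the proposal used, so treat it as an assumption and check the link above. This is not how any real decoder is written.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Assumption: the special profile name from the proposal. Verify before relying on it.
MAGIC_PQ_PROFILE_NAME = "ITUR_2100_PQ_FULL"


def iter_chunks(png_bytes: bytes):
    """Yield (chunk_type, data) pairs. CRCs are skipped, not verified."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    offset = 8
    while offset + 8 <= len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[offset:offset + 4])
        chunk_type = png_bytes[offset + 4:offset + 8]
        data = png_bytes[offset + 8:offset + 8 + length]
        yield chunk_type, data
        offset += 12 + length  # 4 length + 4 type + data + 4 CRC
        if chunk_type == b"IEND":
            break


def iccp_profile_name(png_bytes: bytes):
    """Return the iCCP profile name, or None if there is no iCCP chunk."""
    for chunk_type, data in iter_chunks(png_bytes):
        if chunk_type == b"iCCP":
            name, _, _ = data.partition(b"\x00")
            return name.decode("latin-1")
    return None


def wants_real_pq(png_bytes: bytes, magic_name: str = MAGIC_PQ_PROFILE_NAME) -> bool:
    """True when the decoder should ignore the embedded profile and apply PQ itself."""
    return iccp_profile_name(png_bytes) == magic_name


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Helper for building test files: wrap data in a valid PNG chunk."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)
```

Keying behavior off a profile name is exactly the kind of hack that made a dedicated signal look more attractive.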

Possibly more context (I just found this) from Apple. I'm not sure of the date: https://www.color.org/hdr/02-Luke_Wallis.pdf Slide 29: "HDR parametric transfer functions not in ICC spec. Parametric 3D tone mapping functions not in ICC spec. Neither can be approximated by 1-D or 3-D LUTs."

I'm not sure why they cannot be approximated by a LUT. Maybe because of the inversion problem?


Thanks for that proposal link. The email thread starting at [0] seems to explain some of the challenges. My understanding:

- In ICC-land, all luminances are relative to the display's (or reflective medium's) black and white points. So for an HDR-capable display, all content, HDR or SDR, would be naturally displayed at the full 10k nits or whatever the actual number is. This is obviously not how things work in practice: OSes and/or displays really want a signal as to whether the full HDR luminance is actually desired. (This reminds me of an earlier HN thread where people complained about HDR video forcing up the brightness on Apple devices.)

- PQ (but not HLG) specifies everything in terms of absolute luminance, but this gets confusing when people want to adjust their display brightness and have everything work relatively in practice.

- Due to lack of support for "overrange" behavior [1], 1D LUTs + matrices are insufficient for representing PQ at all, so you need a 3D LUT just to approximate it. This needs ICCv4, since ICCv2 only supports 3D LUTs for non-display profiles.

- But 3D LUTs are big and fat, and can only give a few bits of accuracy across some parts of the full HDR range. (It seems like there's no form of delta compression?) Most people really hate this. iccMAX can allegedly use 3D parametric formulas, but literally no one implements it since it has a million bells and whistles.

- More importantly, GPUs especially hate big fat LUTs, and everyone uses GPU rendering. In the worst case, some implementations will do everything they can to ignore LUTs in ICC profiles, and instead try to guesstimate some simple-gamma or linear-gamma approximation, which won't end well.

So it does seem to be a combination of "the HDR stack is a mess and needs its own special signaling" and practical concerns about avoiding overly huge profiles.

[0] https://lists.w3.org/Archives/Public/public-colorweb/2017May...

[1] https://lists.w3.org/Archives/Public/public-colorweb/2017May...
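
For a sense of scale on the "big and fat" point, here is a back-of-the-envelope sketch. The grid sizes and 16-bit entries are common choices that I am assuming, not numbers from the email thread, and the arithmetic ignores tag headers and any container-level compression.

```python
def lut3d_bytes(grid_points: int, channels: int = 3, bytes_per_entry: int = 2) -> int:
    """Raw size of a 3D LUT with grid_points samples along each input axis."""
    return grid_points ** 3 * channels * bytes_per_entry


def lut1d_bytes(entries: int, channels: int = 3, bytes_per_entry: int = 2) -> int:
    """Raw size of one 1D curve per channel."""
    return entries * channels * bytes_per_entry


# Per-channel 1D curves stay tiny even with many entries.
curves_1024 = lut1d_bytes(1024)   # 6144 bytes

# 3D LUTs grow with the cube of the grid size.
cube_17 = lut3d_bytes(17)         # 29478 bytes
cube_33 = lut3d_bytes(33)         # 215622 bytes
cube_65 = lut3d_bytes(65)         # 1647750 bytes
```

Even the 65-point cube only samples each axis at 65 positions, so everything in between is interpolated, which is where the limited accuracy comes from.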


You....are wonderful. Thank you.


Even though I know about this, I still pronounce it as letters. :)


It is very backwards compatible. :) We worked hard to make sure it would be.


0xFF is 8-bit. PNG supports up to 16-bit. It always has. Plus, PNG now supports full HDR so the fireball won't look washed out.

I think your experience is with some tool that made bad PNGs. That is a problem with the tool, not the format.


EXR stores the color-space information differently, and you missed the point.

Have a look at a tutorial that dives into the basic details, and consider learning something:

https://www.youtube.com/watch?v=pLt1230dtYE

https://www.youtube.com/watch?v=mb0b83MML78

https://www.youtube.com/watch?v=egtnkhuUe_E

PNG has its use-cases, and some people do expect that baked color-space garbage look given it dominates a lot of low-end media. Have a great day =3


I'm trying to follow your point. But...there are problems with your claims. Yes, EXR stores color space differently than PNG, because EXR doesn't store color space at all.

In the first video, the person loads the image and manually chooses a gamma transfer function with 2.2. If that were then saved, it would produce the washed-out fireball you mentioned.

In the second video, the person loads the image and manually chooses rec.709, which is also a gamma transfer function and also produces a washed-out fireball. In fact, the EXR image he loads literally has a bright fireball and you see it get washed out.

If you want to make claims about EXR being better than PNG, you need to say why storing the values as floating point is better than integer. But the blown-out fireball example is just incorrect. As evidence, I'll point to HDR. ANYTHING you see in an HDR movie is now 100% losslessly reproducible in a PNG.


Everything new is optional. This is not a breaking change. Old PNGs and software continue to work just fine. And these new changes are backwards-compatible as much as they can be. So old software can display a new PNG and be mostly correct. By that I mean, the user will still say "it is a picture of a red apple". But if the software isn't HDR, they might not get the bright highlights and inky blacks of the HDR PNG.
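
A minimal sketch of why that works, assuming a decoder that follows the PNG chunk naming convention: a lowercase first letter (bit 5 of the first byte set) marks a chunk as ancillary, so a decoder that doesn't recognize it can skip it. The function names and the "old decoder" chunk set are made up for illustration.

```python
def is_ancillary(chunk_type: bytes) -> bool:
    """True when bit 5 of the first byte is set, i.e. the first letter is lowercase."""
    return bool(chunk_type[0] & 0x20)


def plan_decode(chunk_types, known):
    """Split chunks into those the decoder handles and those it safely skips.

    An unknown critical chunk (uppercase first letter) is an error, because
    the image cannot be shown correctly without it.
    """
    handled, skipped = [], []
    for chunk_type in chunk_types:
        if chunk_type in known:
            handled.append(chunk_type)
        elif is_ancillary(chunk_type):
            skipped.append(chunk_type)
        else:
            raise ValueError(f"unknown critical chunk {chunk_type!r}")
    return handled, skipped


# Hypothetical old decoder that has never heard of cICP.
OLD_DECODER_CHUNKS = {b"IHDR", b"PLTE", b"IDAT", b"IEND"}

handled, skipped = plan_decode([b"IHDR", b"cICP", b"IDAT", b"IEND"], OLD_DECODER_CHUNKS)
# handled == [b"IHDR", b"IDAT", b"IEND"], skipped == [b"cICP"]
```

The old decoder still gets the full image stream. It just doesn't know the pixels were meant to be interpreted as HDR.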


What is the remaining pertinent value of HDR now that we are moving towards the xrgb16161616 pixel format?


16-bit, yes. Arbitrary channel count, no. However, HDR is more than just bitcount.


PNGs have supported transparency since day 1 :)


In general, I support the "follow the money" idea. But I don't think it applies here.

I'm retired and making zero money here. (I'm actually losing money on it. Wish I had a company sponsoring me for the flights and hotels for meetups.)

All participants are required to not patent any piece of it. We work hard to make sure we only reference open standards. (This one is quite tricky. We have to convince other standard orgs to make their stuff free.)

I could see the argument for getting around a gate. But fwiw I don't think that's the case :)

