PNG Parser Differential (da.vidbuchanan.co.uk)
506 points by Retr0id on Dec 16, 2021 | hide | past | favorite | 53 comments


> on the first retina iPads, decoding PNGs was a huge portion of the total launch times for some apps, while one of the two cores sat completely idle.

Could they distribute decoding PNG files to a thread pool, instead of making multithreaded PNG files? Or would this fail for single large PNG files?

Well, one thing that does work for decoding regular old PNGs is separating the entropy decoding (in PNG, DEFLATE) from the prediction (what PNG calls filtering). You can create a pipeline where one thread is working on the entropy coding and handing off completed rows to a thread that performs the prediction decode. This doesn't get a 2x speedup because usually DEFLATE is slower than the prediction step, so it ends up bottlenecked on zlib, but it helps a good bit. (I implemented this in Rust at one point with the intention to ship in Firefox, but it never did.)
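A toy sketch of that two-stage pipeline in Python, using zlib as the entropy stage; the "unfilter" stage is a stub, since real PNG prediction decoding is out of scope here:

```python
import queue
import threading
import zlib

ROW = 1024                                    # bytes per "row" of the toy image
raw = bytes(range(256)) * 64                  # 16 KiB stand-in for filtered image data
compressed = zlib.compress(raw)

rows = queue.Queue(maxsize=8)

def entropy_decode():
    # Thread 1: inflate the DEFLATE stream incrementally,
    # handing off completed rows as they become available.
    d = zlib.decompressobj()
    buf = b""
    for i in range(0, len(compressed), 256):  # feed the stream in small chunks
        buf += d.decompress(compressed[i:i + 256])
        while len(buf) >= ROW:
            rows.put(buf[:ROW])
            buf = buf[ROW:]
    buf += d.flush()
    if buf:
        rows.put(buf)
    rows.put(None)                            # sentinel: stream finished

out = []
def unfilter():
    # Thread 2: a real decoder would undo the PNG predictor per row here.
    while (row := rows.get()) is not None:
        out.append(row)

t1 = threading.Thread(target=entropy_decode)
t2 = threading.Thread(target=unfilter)
t1.start(); t2.start(); t1.join(); t2.join()
assert b"".join(out) == raw
```

As described above, the speedup is bounded by the slower stage (usually the inflate side), not by the number of threads.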

Of course, this only scales to 2 CPUs. Beyond that you will need to do some sort of splitting of the input to achieve wins (since no matter how much you optimize the filtering, you're still bottlenecked on DEFLATE, which is inherently serial).

How I read nyanpasu64: I think the point was that a web page will often have multiple PNG files -- often more than you have cores -- so the gains from decoding a single specially formatted PNG might disappear in practice.

iDOT seems like overengineering to me: why not break up the PNG and tile the pieces in an SVG? This should be just as fast, about the same size, and you'd never run the risk of inventing your own format and implementing it badly[1]. Users would probably be happier with all their SVGs being faster than with having specially crafted PNGs accelerated (if given the choice).

[1]: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-1811

I’m a bit confused, are you suggesting they convert the PNGs into an SVG? How is that possible?

I'm not entirely sure how to begin to answer this question. Forgive me if I start too early.

An SVG is a list of compositing instructions for some (graphical) artefact. These instructions are given so that their dependencies (the things that must be composed before other things) are stated directly.

If you are familiar with Adobe's Photoshop, you might imagine an SVG as a set of nested layers: an SVG "renderer" will simply compose these layers together in order to have some pixels to display.

Now, to give a clear example of what I'm referring to, I am going to show you a simplified SVG that composes four png files together in tiles:

    <svg width="100" height="100">
      <image x="0" y="0" width="50" height="50" href="data:image/png,xxx" />
      <image x="50" y="0" width="50" height="50" href="data:image/png,xxx" />
      <image x="0" y="50" width="50" height="50" href="data:image/png,xxx" />
      <image x="50" y="50" width="50" height="50" href="data:image/png,xxx" />
    </svg>
That data:image/png,xxx stanza is a data URI embedding a PNG (a slight simplification: the data URI format actually belongs to a number of different standards that the SVG specification leverages).

That is to say, I'm not exactly suggesting converting a PNG to an SVG: I am also suggesting breaking apart the (large!) PNG into several component PNG files (tiles) so that they can be decoded independently. Note carefully in my example how none of the tiles overlap. A decoder can (trivially) determine these instructions are independent, and so process them independently.

Being able to decode parts of the resulting image independently is what the iDOT metadata makes possible: It is essentially a different encoding of the x/y/width/height information in the above, the difference is that SVG already existed.


I assume iOS may have a problem of the application doing:

    for each path {
        image = decode(path)
    }
and the OS can't parallelize that, because it learns about a new file only after decoding the previous one.
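A minimal Python sketch of the file-level alternative: if all paths are known up front, independent decodes can be offloaded to a pool (here `decode` is a hypothetical stand-in that just inflates a buffer, not a real PNG decoder):

```python
from concurrent.futures import ThreadPoolExecutor
import zlib

# Stand-in for a real PNG decode: inflating a compressed buffer.
def decode(blob):
    return zlib.decompress(blob)

blobs = [zlib.compress(bytes([i]) * 10_000) for i in range(8)]

# Serial version: each "file" is only looked at after the previous decode.
serial = [decode(b) for b in blobs]

# Pool version: all jobs are known up front, so decodes can overlap.
with ThreadPoolExecutor(max_workers=4) as pool:
    pooled = list(pool.map(decode, blobs))

assert serial == pooled
```

The point of the loop above is precisely that this up-front knowledge is missing: the next path only appears after the previous decode finishes.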

Decompressing zlib/DEFLATE streams, in the general case, is an inherently serial task. It cannot be meaningfully parallelised.

The "trick" used by Apple, and a few other encoders, is to flush the zlib state every so often (Z_FULL_FLUSH), which means that subsequent data is both byte aligned and does not make any backreferences to before the sync.

If you know where these sync points are (Apple encodes this information in their non-standard "iDOT" chunk) then you can start decompressing from that point, in an isolated thread.
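A small Python demonstration of that property, using zlib directly (this only illustrates the flush semantics, not Apple's actual encoder or the iDOT layout):

```python
import zlib

data = b"hello world " * 500

# Raw DEFLATE (wbits=-15), with a full flush between two halves.
c = zlib.compressobj(9, zlib.DEFLATED, -15)
part1 = c.compress(data) + c.flush(zlib.Z_FULL_FLUSH)
part2 = c.compress(data) + c.flush(zlib.Z_FINISH)

# The whole stream decodes normally...
whole = zlib.decompressobj(-15).decompress(part1 + part2)
assert whole == data + data

# ...but thanks to the full flush, the second half is byte-aligned and
# makes no backreferences across the sync point, so a separate thread
# could start decoding right here, in isolation.
second = zlib.decompressobj(-15).decompress(part2)
assert second == data
```

Without the Z_FULL_FLUSH, decoding `part2` on its own would fail, because its LZ77 backreferences would point into the (missing) window from `part1`.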

There are ongoing discussions as to how this metadata could be standardised: https://github.com/w3c/PNG-spec/issues/54

Technically speaking you can also speculatively decode Huffman-coded bitstreams, because the use of canonical Huffman trees means that the end-of-block symbol is most likely the longest code and has no 0 bits, so you can start decoding at a long string of 1 bits. Of course this doesn't solve the issue of the shared LZSS window across multiple blocks.

This is hilarious to me, but makes complete sense. Do you know if this has ever been done anywhere?

There is indeed a hilarious backstory. A while back, one of my friends was looking for a task for the Distributed Computing course, where you write a program for the Cell microprocessor, and I half-jokingly suggested this, not realizing that it wouldn't work at all because of the shared window. The friend eventually came to realize that issue and couldn't finish the project in time, but still got a good grade solely because of the novelty of the task (!). (According to that friend, virtually everyone else did the parallel JPEG stuff.)

This doesn't surprise me. The Cell architecture wrought more insane hacks than good designs.

I think that's what restart markers are for in JPEG, from the original spec. IIRC in JPEG you can have markers that are flagged as "crucial for proper decode", and programs should reject the image if they don't understand them. Maybe the latter was from PNG, I forget.

It would fail for single large splash-screen PNG files, and that's where they were looking to shave milliseconds.

You'd also have to be kicking off PNG decodes in a way that is compatible with offloading them to a threadpool, and lots of software likely just loads images synchronously and moves on with its day.

A tangentially related trick involves gamma correction



I am on MacOS. The miniature in the desktop icon of the file shows "Hello World", but opening the file shows "Hello Apple".

If you view it in Finder in gallery mode, the picture in the gallery says “Hello Apple”, while the preview in the sidebar says “Hello World”:


Interesting. On the iPad on Firefox in the link preview when long-pressing the link it shows "Hello World", but "Hello Apple" when actually opened. On the same iPad in Safari, it says "Hello Apple" both in the long-press preview and the full page :)

Is there a conjecture that says, for a field with a given degree of complexity, a given string or function can be encoded equivalently in a number of different ways that is proportional to its size and maybe its complexity class?

Between this and project zero's analysis (https://news.ycombinator.com/item?id=29568625) of NSO using compression encoding to create a virtual machine for calculating exploit offsets, while unrelated except at a very high level of abstraction, it reminds me conceptually of cryptographic hash collisions, where over a large enough search space or field / domain of complexity there are many equivalent encodings or homonyms/isomorphisms.

The issue with the NSO exploit was they found that the compression encoding for a font was Turing complete, and then wrote a virtual architecture in it, and then ran programs on it that did the calculations necessary for their exploit.

This PNG encoding issue is different, but if you abstract it upwards to find a general principle it may be the effect of, it's like there is a fast rule where, if you know the size or definition of the field of possibilities, and then have a definition of a given string in it, the function that describes or defines that string will also yield all strings whose evaluation is the same. It's like Kolmogorov complexity, but where instead of finding the smallest program to compute something, it's: given the number of instructions to define programs over a field of inputs of a given size, there are N programs beneath length L that are equivalent.

Sort of a showerthought, but it's interesting to think that our ideas of encodings and general isomorphisms may be instances of the same concept linked by a sort of "imaginary" function.

Wow, thank you. Totally forgot about FHE, this is the underlying principle of why FHE must work theoretically.

Mind = blown. Hello world on Chrome, Hello Apple on Safari.

If you save it to desktop, the icon says hello world and quicklook says hello apple.

Both are generated by /System/Library/QuickLook/Image.qlgenerator, one as thumbnail and one as preview. You can check yourself:

   # thumbnail
   qlmanage -c public.image -g /System/Library/QuickLook/Image.qlgenerator -t /path/to/a.png
   # preview
   qlmanage -c public.image -g /System/Library/QuickLook/Image.qlgenerator -p /path/to/a.png

I suppose it's not too surprising that thumbnailing is single-threaded, since it's usually done in the background (where wall time doesn't really matter) and/or in large batches (where file-level parallelism doesn't really help you).

I noticed that too. Even weirder, this is the case when saving from Firefox (which displays HELLO WORLD) and when saving from Safari.

I mean, of course it is, it's just saving the PNG; it's not going to decode it and then re-encode it or something!

Since the desktop icon is scaled down, it has to be parsed at some point and saved back out at icon size. The fact that it says HELLO WORLD on the icon even if you're saving it from Safari means that whatever code is shrinking it into a little icon is parsing it differently from what's being used to display it at full size.

That’s not surprising, some image formats come with separate thumbnail images to make them faster to decode.

AFAIK, PNG format can't contain portable little icons. It wouldn't make sense, either, since it would need lots of different sizes of those for different platforms and devices. Mac OS is what creates the icons for individual files on a Mac desktop. Type something into a text file and save it... the document's icon will actually show a tiny miniature of the text you typed, not just some generic text-document icon. The PNG miniature is generated by the OS, and says HELLO WORLD, even though the same OS parses it as HELLO APPLE when you open it with quicklook or preview.

Now that is a WTF

Anyone on Safari please post a screenshot for the rest of us.

It is clear that images are a mistake.

Computers, actually.

Computers are the children of humans which arguably are a mistake too... So really, blame the universe!

"In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move"

The apple pie was delicious, though. Totally worth it.

The universe is too huge to judge; let's blame the atom.

Atoms are not atomic. Blame number.

OMG this could be used to deanonymize people on nanochan.

One person reacts one way on the image and another reacts the other way.

iOS and macOS may be a small player in worldwide computer use, but they're not small enough that you can identify anyone with just this flag. Unless you're in North Korea, saying you're using Apple software isn't really that much of an identifying feature.

It can definitely be used for fingerprinting, though, especially if Apple ever fixes their PNG decoder.

Only if they do not use Tor Browser, which is based on Firefox. Firefox on Macs has its own PNG decoder and shows "Hello World".

I am on macOS. Safari shows "Hello Apple", while Brave shows "Hello World"...

Well it's late December but this might be the coolest hack this year for me!

I remember an old .png which renders differently in IE6, Chrome and Firefox. Anyone still got the file?

Is it this one? [0] IIRC it's the lack of handling the gAMA chunk properly.

[0] https://www.howtogeek.com/149223/why-do-chrome-and-internet-...

I've tried decoding the image with a png decoder at hand, and got 'hello world'. Guess it belongs to the "other software" category. :)


This is objectively very cool.


