Any lossless format can be made lossy by simply performing a lossy pre-pass prior to the lossless compression.
The real challenge is finding a good way to perform the pre-pass: a simple, naive method is easy to pick, but finding the optimal one likely takes exponential time or, if one is lucky, polynomial time with an intractably large constant (using a trellis).
The fact that PNG uses a dictionary-based compressor instead of a PPMD-like system likely makes the problem much harder.
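To make the "naive pre-pass" idea concrete, here is a minimal Python sketch. It assumes 8-bit samples and simply zeroes the low bits of each one before handing the data to zlib. The function names and the noisy-gradient test data are made up for illustration, and this is the crude end of the spectrum, nowhere near an optimal pre-pass.

```python
import random
import zlib


def quantize(samples: bytes, keep_bits: int = 5) -> bytes:
    """Naive lossy pre-pass: zero the low (8 - keep_bits) bits of each sample."""
    mask = (0xFF << (8 - keep_bits)) & 0xFF
    return bytes(b & mask for b in samples)


def deflate_size(data: bytes) -> int:
    """Size of the data after lossless deflate at zlib's highest level."""
    return len(zlib.compress(data, 9))


# Hypothetical test data: 64 rows of a 256-sample gradient with small noise.
rng = random.Random(0)
image = bytes(
    min(255, max(0, x + rng.randint(-3, 3)))
    for _ in range(64)
    for x in range(256)
)

print("lossless only:      ", deflate_size(image))
print("quantize + lossless:", deflate_size(quantize(image)))
```

The quantized data has far fewer distinct values, so the dictionary coder finds longer matches. A smarter pre-pass would choose which samples to alter based on what the compressor can actually exploit, and that is where the search gets expensive.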
What does PNGout do differently? I tried optipng with -o9, which is an exhaustive search, and it didn't make the file as small as the PNGout one.
I thought -o9 tried every possible method of encoding the PNG, but I guess it's missing something.
Edit: I found my answer: http://optipng.sourceforge.net/pngtech/optipng.html - apparently the png filter can be applied to each row independently. But I think optipng uses the same filter for the entire image, not each row.
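For anyone wondering what "a filter per row" looks like in practice, here is a rough Python sketch of the minimum-sum-of-absolute-differences heuristic described in the PNG specification. It assumes 8-bit samples and non-interlaced rows, the function names are made up, and I'm not claiming this is what OptiPNG or PNGout actually do internally.

```python
def paeth(a: int, b: int, c: int) -> int:
    """Paeth predictor: pick whichever of left/up/upper-left is nearest a + b - c."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c


def filter_row(ftype: int, row: bytes, prev: bytes, bpp: int) -> bytes:
    """Apply PNG filter 0..4 (None, Sub, Up, Average, Paeth) to one row."""
    out = bytearray(len(row))
    for i, x in enumerate(row):
        a = row[i - bpp] if i >= bpp else 0   # left neighbour
        b = prev[i]                           # neighbour above
        c = prev[i - bpp] if i >= bpp else 0  # upper-left neighbour
        if ftype == 0:
            pred = 0
        elif ftype == 1:
            pred = a
        elif ftype == 2:
            pred = b
        elif ftype == 3:
            pred = (a + b) // 2
        else:
            pred = paeth(a, b, c)
        out[i] = (x - pred) & 0xFF
    return bytes(out)


def cost(filtered: bytes) -> int:
    """Sum of absolute values, treating each byte as a signed difference."""
    return sum(v if v < 128 else 256 - v for v in filtered)


def filter_image(rows: list, bpp: int) -> bytes:
    """Pick the cheapest filter for each row independently."""
    prev = bytes(len(rows[0]))  # the row above the first row is all zeros
    out = bytearray()
    for row in rows:
        ftype = min((cost(filter_row(f, row, prev, bpp)), f) for f in range(5))[1]
        out.append(ftype)       # each row is prefixed with its filter type byte
        out += filter_row(ftype, row, prev, bpp)
        prev = row
    return bytes(out)
```

The heuristic is only a guess at what will deflate well. An exhaustive tool could instead compress each candidate and keep the smallest, at a much higher cost.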
- Compression improvements:
  - Use zlib's deflateTune().
  - Use 7zip's powerful deflation engine.
  (This is not possible with libpng, so a custom encoder is needed.)
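As a rough idea of what tuning the deflate stage means, here is a small Python sketch that brute-forces the zlib parameters Python exposes (compression level, memory level, strategy) and keeps the smallest result. Python's zlib module does not expose deflateTune(), so this only approximates the idea with the coarser knobs. The function name best_deflate is made up for illustration.

```python
import itertools
import zlib


def best_deflate(data: bytes):
    """Try every level/memLevel/strategy combination and keep the smallest stream."""
    strategies = {
        "default": zlib.Z_DEFAULT_STRATEGY,
        "filtered": zlib.Z_FILTERED,
        "huffman_only": zlib.Z_HUFFMAN_ONLY,
        "rle": zlib.Z_RLE,
    }
    best = None
    for level, mem_level, (name, strategy) in itertools.product(
        range(1, 10), range(1, 10), strategies.items()
    ):
        # 15 is the window size exponent (32 KiB window), zlib's maximum.
        comp = zlib.compressobj(level, zlib.DEFLATED, 15, mem_level, strategy)
        out = comp.compress(data) + comp.flush()
        if best is None or len(out) < len(best[0]):
            best = (out, level, mem_level, name)
    return best


if __name__ == "__main__":
    sample = bytes(range(256)) * 32
    out, level, mem_level, name = best_deflate(sample)
    print(len(out), level, mem_level, name)
```

All of these trials still use zlib's own match finder. Swapping in a different deflate implementation, such as 7zip's, is a separate step that this sketch does not cover.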
AdvanceCOMP uses the same deflate code as 7zip. PNGout uses a deflation engine that claims to be better than 7zip's.
There doesn't seem to be a Mac or Linux port for PNGOut; the link redirects to a forum about GTA. And isn't PunyPNG a web front-end for OptiPNG? That would explain the lack of any real differences between the two.
Also, it would be good to know which flags were used when compressing the images.
// edit: See @ccollins's post for a link to Mac/Linux/BSD ports of PNGOut.