Hacker News
JPEGrescan: unique way to losslessly shrink any JPEG file (pastebin.com)
54 points by DarkShikari on Sept 3, 2009 | 11 comments

Is there any other information on what he's doing in the code and why? I wouldn't mind implementing something like this in a project I'm working on.

He's basically taking advantage of a trick that's almost never exploited for compression purposes. Since essentially no encoder does this, it can be run after any JPEG encoder to improve compression further.

Specifically, it abuses the progressive scan system. It's well known that progressive scan in JPEG (which makes the image load at low detail first, then refine up to full quality) doesn't just improve usability for the viewer; it also slightly improves compression. Moreover, progressive mode lets you specify almost any splitting of the DCT coefficients into scans, and each scan gets its own Huffman table for compression.
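For reference, jpegtran (from libjpeg) accepts such a split as a "scan script" via its -scans option: each line names the components and the range of the 64 DCT coefficients for one scan, in the form components: Ss Se Ah Al; (Ah and Al are successive-approximation bit positions, zero here). A minimal illustrative script of my own, not one of the candidates JPEGrescan actually tries, might be:

```
0,1,2:  0  0 0 0;
0:      1  5 0 0;
0:      6 63 0 0;
1:      1 63 0 0;
2:      1 63 0 0;
```

The first scan sends the DC coefficient of all three components; the luma AC coefficients are split at coefficient 5 into two scans; and each chroma component's AC coefficients get their own scan. With -optimize, each of those five scans gets its own optimized Huffman table, which is where the extra savings come from.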

To make this script, Loren did an exhaustive search of all possible splitting options on a large collection of images. He then collected statistics from the results and used them to devise a fairly fast, simple search over the most common best-split configurations to maximize compression. The comments in the code give more specific details.

Extra note: Loren has been the primary maintainer of the x264 video encoder over the past 5 years and is also an ffmpeg developer.

So it's the JPEG version of PNGcrush et al.?

Basically. There are three ways to losslessly compress a JPEG:

1) Huffman table optimization

2) Ordinary progressive scan

3) Trying all sorts of split orders for progressive scan

Most image apps worth their salt hopefully do 1) and 2). This script does 1) if it wasn't done already, and 3), which is its main purpose.
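For anyone wanting to try 1) and 2) by hand, the standard jpegtran flags cover them. A sketch, assuming photo.jpg as a placeholder filename and jpegtran (from libjpeg or libjpeg-turbo) on your PATH:

```shell
# Guarded so it does nothing if the tool or the file is absent.
if command -v jpegtran >/dev/null && [ -f photo.jpg ]; then
    jpegtran -copy none -optimize    photo.jpg > photo-opt.jpg    # 1) Huffman table optimization only
    jpegtran -copy none -progressive photo.jpg > photo-prog.jpg   # 2) standard progressive scan
else
    echo "jpegtran or photo.jpg not available; skipping"
fi
```

Note that -copy none also drops metadata; use -copy all if you want to keep EXIF data.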

Any stats on likely percentage improvements over just doing 1)?

The GIMP takes advantage of this and lets you specify the coefficient (to an extent).

It only lets you pick the number of levels of progression if I recall correctly; it doesn't experiment with various different possible locations to split the DCT coefficients, only the number of splits total.

The way I read the code, it tries a number of scan scripts by passing them to jpegtran one by one and remembering the size of each output.

The best result is then used.
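That loop is simple enough to sketch in shell. This is my reconstruction of the idea, not the actual script; in.jpg and the scans/ directory of candidate scan scripts are placeholders (the real script builds its candidate splits internally rather than reading them from files):

```shell
# Try every candidate scan script; keep whichever output is smallest.
best_size=
for script in scans/*.txt; do
    [ -e "$script" ] || continue                    # no candidates present
    jpegtran -copy none -optimize -scans "$script" in.jpg > out.tmp || continue
    size=$(wc -c < out.tmp)
    if [ -z "$best_size" ] || [ "$size" -lt "$best_size" ]; then
        best_size=$size
        mv out.tmp best.jpg                         # current winner
    fi
done
echo "best: ${best_size:-no result} bytes"
```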

An equivalent to how pngcrush works, then.

Just ran this on a random set of about 801 JPG files...

Pre utility (du -d 1 -k): 39710 KB

Post utility (du -d 1 -k): 37074 KB

So about a 6-7% reduction. These are files uploaded via WordPress; I'm not sure whether it does any kind of optimization itself in the upload wizard, but I doubt it.
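To double-check that figure from the du totals:

```shell
# 39710 KB before vs 37074 KB after
awk 'BEGIN { printf "%.1f%% smaller\n", (39710 - 37074) * 100 / 39710 }'
# prints "6.6% smaller"
```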

Time to run the utility:

      279.84 real       176.08 user        72.19 sys

Hey! If the title is accurate, we can just keep applying it over and over, until every jpeg shrinks to 0 bytes!

