
Introducing the ‘mozjpeg’ Project - joshmoz
https://blog.mozilla.org/research/2014/03/05/introducing-the-mozjpeg-project/
======
pavlov
Bravo. I love JPEG. Amazing that it's been 23 years since its release and it
remains as useful as ever.

I remember what it was like to watch a 320*200 JPEG image slowly build up on a
386SX PC with a VGA card. Today, an HD frame compressed with JPEG can be
decoded in milliseconds. This highlights the secret to JPEG's success: it was
designed with enough foresight and a sufficiently well-bounded scope that it
keeps hitting a sweet spot between computing power and bandwidth.

Did you know that most browsers support JPEG video streaming using a plain old
<img> tag? It also works on iOS and Android, but unfortunately not in IE.

It's triggered by the "multipart/x-mixed-replace" content type header [0]. The
HTTP server leaves the connection open after sending the first image, and then
simply writes new images as they come in, as if it were a multipart file
download. A compliant browser will update the image element's contents in
place.

[0] [http://en.wikipedia.org/wiki/MIME#Mixed-Replace](http://en.wikipedia.org/wiki/MIME#Mixed-Replace)
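
For illustration, a minimal sketch of the server side (Python standard library
only; the boundary string, port, and file names are my own placeholders, not
from any particular project):

    # Minimal MJPEG-over-HTTP sketch: keep the connection open and keep
    # replacing the image. A real server would push freshly captured
    # frames instead of looping over two files.
    import time
    from http.server import BaseHTTPRequestHandler, HTTPServer

    FRAMES = ["frame0.jpg", "frame1.jpg"]   # hypothetical JPEG files
    BOUNDARY = "frameboundary"

    class MJPEGHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type",
                             "multipart/x-mixed-replace; boundary=" + BOUNDARY)
            self.end_headers()
            while True:
                for name in FRAMES:
                    with open(name, "rb") as f:
                        jpeg = f.read()
                    self.wfile.write(b"--" + BOUNDARY.encode() + b"\r\n")
                    self.wfile.write(b"Content-Type: image/jpeg\r\n")
                    self.wfile.write(("Content-Length: %d\r\n\r\n" % len(jpeg)).encode())
                    self.wfile.write(jpeg + b"\r\n")
                    time.sleep(0.1)         # ~10 fps

    HTTPServer(("", 8080), MJPEGHandler).serve_forever()

On the page, a plain <img src="http://localhost:8080/stream"> is all the client
needs.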

~~~
dmm
Unfortunately, Chrome recently removed support for "multipart/x-mixed-replace".
It's too bad too. It was a simple way to implement a webcam.

[https://code.google.com/p/chromium/issues/detail?id=249132](https://code.google.com/p/chromium/issues/detail?id=249132)
[http://blog.chromium.org/2013/07/chrome-29-beta-web-audio-and-webrtc-in.html](http://blog.chromium.org/2013/07/chrome-29-beta-web-audio-and-webrtc-in.html)

~~~
fzzzy
According to those links, they still support it for images, but removed
support for other types of resources.

It's too bad IE never supported multipart/x-mixed-replace or we might have
seen more live-updating websites earlier. Now that we have WebSockets and
well-understood long-polling approaches it doesn't matter any more, since
x-mixed-replace would keep the download spinner spinning forever and the newer
approaches don't have that problem.

~~~
kybernetikos
I used multipart replace in firefox for streaming data for years with no
download spinner. I couldn't use it in chrome because chrome always seemed to
have some weird bug where 'frames' (of data in my case) were delayed.

I was disappointed when they were unceremoniously ripped out, but yes,
websockets are better.

~~~
fzzzy
Ok, I believe you, it's been over 10 years since I tried it :-) Maybe I was
thinking of the forever iframe, where a regular application/javascript
document had content added to it incrementally over time.

~~~
IgorPartola
It is actually pretty bizarre. If you point Chrome at a .jpeg that does
multipart/replace it will stall, giving you roughly 0.2 fps. Point it at an
.html file that contains an <img> tag and it works fine. Found this out while
hacking on Hawkeye:
[https://igorpartola.com/projects/hawkeye/](https://igorpartola.com/projects/hawkeye/)

------
billyhoffman
This is very promising. Images by far dominate a web page, both in number of
requests and total number of bytes sent [1]. Optimizing image size by even
5-10% can have a real effect on bandwidth consumption and page load times.

JPEG optimization using open source tools is an area that really needs focus.

There are a number of lossless JPEG optimization tools, but most are focused
on stripping non-graphical data out of the file, or on converting the image to
a progressive JPEG (since progressive JPEGs rearrange the pixel data, you can
sometimes get better compression because there may be more redundancy in the
rearranged data). Short of exceptional cases where you can remove massive
amounts of metadata (Adobe products regularly embed thumbnails and the entire
"undo" history for an image), lossless optimization usually only reduces file
size by 5-15%.
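
For reference, a typical lossless pass can be sketched in a few lines (this
assumes the jpegtran binary from libjpeg/libjpeg-turbo is installed; file names
are examples):

    # Sketch of a common lossless pass: strip metadata, optimize Huffman
    # tables, and write a progressive JPEG.
    import subprocess

    def lossless_optimize(src, dst):
        subprocess.check_call([
            "jpegtran",
            "-copy", "none",      # drop EXIF, thumbnails, other markers
            "-optimize",          # recompute optimal Huffman tables
            "-progressive",       # rearranged scans often compress better
            "-outfile", dst,
            src,
        ])

    lossless_optimize("photo.jpg", "photo_opt.jpg")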

Lossy JPEG optimization has much more potential. Unfortunately, beyond
proprietary encoders, the most common lossy JPEG optimization is simply to
reduce the JPEG quality setting. This always felt like killing flies with a
tank, so advances in this area would be awesome.

I've written extensively about lossy optimization for JPEGs and PNGs, and spoke
about it at the Velocity conference. A post and my slides are available [2].

[1] - [http://httparchive.org/trends.php](http://httparchive.org/trends.php)

[2] - [http://zoompf.com/blog/2013/05/achieving-better-image-optimization-with-lossy-techniques](http://zoompf.com/blog/2013/05/achieving-better-image-optimization-with-lossy-techniques)

------
IvyMike
JPEG has shown amazingly good staying power. I would have assumed "JPEG is
woefully old and easy to beat" but Charles Bloom did a good series of blog
posts looking at it, and my (non-expert and probably hopelessly naive)
takeaway is that JPEG still holds its own for a 20+ year old format.

[http://cbloomrants.blogspot.com/2012/04/04-09-12-old-image-comparison-post.html](http://cbloomrants.blogspot.com/2012/04/04-09-12-old-image-comparison-post.html)

~~~
mistercow
In my opinion, the biggest drawback of JPEG is that its transform window is
non-overlapping. Ringing artifacts are generally not a big deal for natural
images at medium quality or higher, but blocking artifacts can be noticeable
even at relatively high quality settings.

There are a lot of post-processing techniques to try and mitigate this, but in
my experience they tend to do about as much damage as they fix. The proper
solution is to overlap the blocks using one of the myriad techniques DCT-based
audio codecs use.

It is bizarre to me that for all of the attempts to beat JPEG, nobody seems to
have tried simply overlapping the blocks by 2 pixels. You'd have an
implementation only marginally more complex than JPEG (in fact, you can even
implement it on top of an existing JPEG encoder/decoder) with a slowdown of
only 25%.
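
To make the idea concrete, here is a toy sketch of one reading of the
suggestion (10x10 blocks laid out on the usual 8-pixel grid, so neighbouring
blocks share a 2-pixel border; this is just an illustration, not any real
codec, and a real lapped transform would also need windowing and overlap-add
on reconstruction):

    # Overlapped block DCT: 10x10 blocks stepped every 8 pixels.
    import numpy as np
    from scipy.fft import dctn

    def overlapped_block_dct(img, size=10, step=8):
        h, w = img.shape
        coeffs = []
        for y in range(0, h - size + 1, step):
            for x in range(0, w - size + 1, step):
                block = img[y:y + size, x:x + size]
                coeffs.append(dctn(block, norm="ortho"))
        return coeffs

    img = np.random.rand(64, 64)          # stand-in for a grayscale image
    blocks = overlapped_block_dct(img)
    print(len(blocks), blocks[0].shape)   # 49 blocks of 10x10 coefficients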

~~~
brigade
JPEG-XR has (optional) lapping. But lapping has the problem that you can't do
spatial intra prediction, which is significantly more valuable. And no one
figured out how to make frequency-domain intra prediction as good until Daala.
Plus deblocking filters have gotten pretty good now that they're tuned based
on the quantizer used.

But a bigger problem is that no one is really interested in designing a new
still image codec that's better than JPEG, since JPEG can't be unseated. So
video codecs are where the practical development goes. And avoiding in-loop
deblocking filters there means OBMC, which is _extremely_ computationally
intensive.

------
csense
For improving general-purpose gzip / zlib compression, there is the Zopfli
project [1] [2]. It also has (alpha-quality) code for the PNG file format;
since this functionality wasn't originally included, there are also
third-party projects [3].

You might be able to shave a percent or so off the download size of compressed
assets.

[1]
[https://news.ycombinator.com/item?id=5316595](https://news.ycombinator.com/item?id=5316595)

[2]
[https://news.ycombinator.com/item?id=5301688](https://news.ycombinator.com/item?id=5301688)

[3] [https://github.com/subzey/zopfli-png](https://github.com/subzey/zopfli-png)

------
derefr
Now if only they'd do a mozpng.

(For context: libpng is a "purposefully-minimal reference implementation" that
avoids features such as, e.g., Animated PNG decoding. And yet libpng is the
library used by Firefox, Chrome, etc., because it's the one implementation
with a big standards body behind it. If Mozilla just forked libpng, their
version would instantly have way more developer eyes on it than the
original...)

~~~
pornel
There's already a "mozpng", it's called Zopfli. There's also AdvPNG which
compresses PNGs with 7-zip's deflate implementation.

And if you want much much smaller PNGs, then try
[http://pngquant.org](http://pngquant.org) or
[http://pngmini.com/lossypng.html](http://pngmini.com/lossypng.html)

~~~
sovok
Or try ImageOptim, which bundles these and other tools with a nice GUI:
[http://imageoptim.com/](http://imageoptim.com/)

~~~
nness
ImageOptim also bundles PNGOUT, which I find to have exceptional compression
when compared to the others (but it is a bit slower)

------
CookWithMe
We've been using [http://www.jpegmini.com/](http://www.jpegmini.com/) to
compress JPGs for our apps. It worked OK, although we didn't get the enormous
reductions they advertise. However, 5% - 10% does still make a difference.

We've been using the desktop version. We would love to use something similar
on a server, but jpegmini is overpriced for our scenario (I'm not going to
keep a dedicated AWS instance running just to compress images every other day
or so). Will definitely check out this project :)

~~~
rast-a
Have you tried [https://kraken.io](https://kraken.io)?

------
tenfingers
I noticed that optimizing JPEG images using jpegoptim
([http://www.kokkonen.net/tjko/projects.html](http://www.kokkonen.net/tjko/projects.html))
reduces the size by a similar factor, but at the expense of decoding speed.

In fact, on a JPEG-heavy site that I was testing with FF 26, there was such a
degradation in terms of responsiveness that transitions would stutter whenever
a new image was decoded in the background (while preloading).

That made the effort to save 2-4% in size pointless: it was traded for a worse
user experience.

~~~
gcp
Did you file a bug for this? This doesn't sound normal at all.

~~~
tenfingers
Honestly, no. libjpeg would show a similar slowdown (interestingly, PNG
decoding is slower than JPEG for the same size), and it makes sense anyway.

The problem is that even if the bug were fixed in recent FF versions, libjpeg
is used by basically all other browsers as well.

------
ilaksh
If my goal were to compress say 10,000 images and I could include a dictionary
or some sort of common database that the compressed data for each image would
reference, could I not use a large dictionary shared by the entire catalog and
therefore get much smaller file sizes?

Maybe images could be encoded with reference to a common database we share
that has the most repetitive data. So perhaps 10mb, 50mb or 100mb of common
bits that the compression algorithm could reference. You would build this
dictionary by analyzing many many images. Same type of approach could work for
video.

~~~
phillmv
Well, if we're shipping a common database and you're maximizing _transmission
efficiency_, then yes, of course.

In information theory, my understanding (which is quite limited) is that when
we talk of bits transmitted we can think of it as "uniquely identifying from
within the _set_ of total possible messages". So, if you shipped the entire
10,000-image catalog, "transmitting" an image from within that catalog would
take you a mere, uh, let me count my fingers, 14 bits (since 2^14 = 16,384 >
10,000).

We could go one step further and find some way of hashing all the data
together to remove any redundancies and so forth - but the problem alas is
about defining arbitrary images :).

What you described, though, is kind of what happens with all compression
algorithms, except at the "micro"/individual-image level. You may already be
aware, but check out Huffman coding:
[http://en.wikipedia.org/wiki/Huffman_coding](http://en.wikipedia.org/wiki/Huffman_coding)
for a simple intro.
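
Incidentally, the closest off-the-shelf analogue to a "shared common bits"
file is a preset dictionary. A small sketch with zlib (deflate is not an image
codec, so this only illustrates the sharing idea, not the gains an image-aware
design might get; the dictionary contents here are made up):

    # Shared-dictionary compression: both sides already hold the same
    # "common bits", so matches against it cost only a back-reference.
    import zlib

    # Hypothetical dictionary learned by analyzing many files.
    shared_dict = b"frequently repeated headers, palettes and patterns " * 100

    def compress_with_dict(data):
        c = zlib.compressobj(level=9, zdict=shared_dict)
        return c.compress(data) + c.flush()

    def decompress_with_dict(blob):
        d = zlib.decompressobj(zdict=shared_dict)
        return d.decompress(blob) + d.flush()

    sample = b"frequently repeated headers, palettes and patterns " * 3
    packed = compress_with_dict(sample)
    assert decompress_with_dict(packed) == sample
    print(len(sample), "->", len(packed))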

~~~
ilaksh
I am aware that compression algorithms kind of work on this general idea of
referencing common bits. Of course I am aware of that.

How do you read my question and interpret it simply as "let's send all of the
images in full, then give their index and call it compression"? What I suggest
is that we take a standard encoding technique like Huffman, or some
modification of it, but rather than creating a table based on the data in an
individual image, build the code table by analyzing many, many images.

I have read the Wikipedia article on Huffman coding before. However, the
details are not really important in regards to my point.

What I am suggesting is that rather than looking at just the bits in
individual images and using them to construct a Huffman table or some other
kind of reference, look at the bits on many, many images and create a larger
reference table. And then of course you may need a local table for things in
the image that don't quite correspond to the larger table.

Earlier compression techniques were much more constrained in terms of
processing power, RAM, network connectivity etc. and so distributing and using
a large table for compression was not practical. I am suggesting that someone
who has knowledge of compression engineer a system where 10MB, 50MB, or 100MB
of RAM is used and a large common bits file is transmitted, rather than
starting with the idea that almost all of the data or all of the data has to
be contained in one file. I am not suggesting that an existing compression
algorithm could be translated directly into this general concept. I am
suggesting an engineering effort starting with different constraints and
trade-offs.

~~~
pit
I thought about doing something like that for audio, but I think it gets
really inefficient with higher sample rates because you end up with so little
overlap.

Like this: imagine I've got an array of shorts representing audio data. If
I've got two files with similar segments, I've saved one short:

[1, 2, 3, 2, 1, 2, 5] [7, 2, 3, 6, 8, 9, 3]

So I can say that [2, 3] is represented by a new value (z), and can shave a
short off of both streams. Then what happens if a new stream comes along with
no similarities:

[8, 2, 7, 1, 7, 3, 7]

...you still have to send each value.

Maybe I've just demonstrated that I don't know anything about compression, but
I would be interested in working with you on this.
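
Here is a five-line sketch of that idea, using the streams above (assuming
both sides already agreed on a shared pair table; here just the single pair
[2, 3]):

    # Toy shared-pair substitution: the agreed-upon pair gets one code.
    def encode(stream, pair=(2, 3), code="z"):
        out, i = [], 0
        while i < len(stream):
            if tuple(stream[i:i + 2]) == pair:
                out.append(code)   # two shorts replaced by one symbol
                i += 2
            else:
                out.append(stream[i])
                i += 1
        return out

    print(encode([1, 2, 3, 2, 1, 2, 5]))  # [1, 'z', 2, 1, 2, 5] - one short saved
    print(encode([7, 2, 3, 6, 8, 9, 3]))  # [7, 'z', 6, 8, 9, 3]  - one short saved
    print(encode([8, 2, 7, 1, 7, 3, 7]))  # unchanged - no savings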

~~~
ilaksh
I think I would want to find an existing sparse autoencoder implementation,
ideally a project already setup for encoding audio, and start from there.
[http://www.stanford.edu/class/cs294a/sparseAutoencoder.pdf](http://www.stanford.edu/class/cs294a/sparseAutoencoder.pdf)

------
rwmj
Why don't they just contribute the jpgcrush-like C code back to libjpeg-turbo?

Edit: A good reason given in the reply by joshmoz below.

~~~
joshmoz
This was discussed with the author of libjpeg-turbo. His priorities are
different, it was agreed that a fork is best.

~~~
nly
Listen, you're doing it all wrong. You're supposed to fork it, tell no one, do
a year's worth of work in secrecy, release it, palm it off to another FOSS
community, then abandon it, then refork it, do another 2 years of work in
secrecy and then release it again under a new name. Got it?

You'll never get to play alongside big boys like Apple and Facebook with this
'talk to upstream' attitude of yours. That's just not how the game is played.

~~~
TheZenPsycho
It's telling that you left off Google.

------
drawkbox
Data compression and image compression are a great way to improve the overall
internet, bandwidth, and speed. They are maybe as important as new protocols
like SPDY, JS/CSS minification, and CDN hosting of common libraries.

As long as ISPs/telcos don't go back to the days of AOL-style network-wide
compression that reduces quality, I am for this at the service level, like
Facebook/Dropbox uploads. I hope this inspires more work in this area. Games
also get better with better textures in less space.

Still to this day, I am amazed at the small file sizes Macromedia (now Adobe)
was able to obtain with Flash/SWF/ASF; even high-quality PNGs would compress
well. So yes, we all have lots of bandwidth now, but crunching data down while
representing the same thing is a good thing. With cable company caps and other
artificial bandwidth shortages, that focus might resurge a bit.

~~~
userbinator
SWF has a very cleverly designed binary vector graphics format, which
naturally lends itself to small filesizes. Much better than SVG or (E)PS, I
think.

~~~
est
Can you share more on the "clever" part?

~~~
userbinator
To quote from the format spec, "SWF uses techniques such as bit-packing and
structures with optional fields to minimize file size." Many fields are
variable-width numbers of bits, using only as many bits as necessary to encode
the data. Coordinates are delta-encoded.
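
A toy illustration of those two tricks (delta-encode, then use only as many
bits per value as the largest delta needs; this is not the actual SWF
bitstream layout):

    # Delta-encode a coordinate list and pack each delta at minimal width.
    def pack_deltas(coords):
        start = coords[0]                                     # stored as-is
        deltas = [b - a for a, b in zip(coords, coords[1:])]
        nbits = max(abs(d).bit_length() for d in deltas) + 1  # +1 sign bit
        bits = "".join(format(d & ((1 << nbits) - 1), "0{}b".format(nbits))
                       for d in deltas)
        return start, nbits, bits

    start, nbits, bits = pack_deltas([1000, 1004, 1010, 1009, 1015])
    print(nbits, "bits per delta,", len(bits), "bits total")  # vs 16 bits each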

------
cjensen
JPEG-2000 exists, but decoding is still too slow to be useful.

[http://en.wikipedia.org/wiki/JPEG_2000](http://en.wikipedia.org/wiki/JPEG_2000)

~~~
jfb
Seen a movie in the theater recently? JP2k lives on in the DCI [1].

[1] [http://www.dcimovies.com](http://www.dcimovies.com)

~~~
cjensen
Yep. Theater projectors have roughly $1k in specialized decoding chips to make
that work.

Software encode/decode of JP2k is the hard part. That's why there's little
adoption of J2k outside of hardware solutions.

~~~
jfb
It's a really weird place to use JP2K -- they're functionally not space
limited, or compute limited, so what's the advantage of wavelets? You could
just gzip V210 or something.

~~~
sjwright
The spec allows for 4K (4096x2160 at 24 FPS) or 2K (2048x1080 at 24 or 48 FPS)
source material and projectors. The spec recognizes that 2K sources may be
played on 4K projectors (where it leaves the task of upscaling to the
implementer) and 4K sources on 2K projectors.

The advantage of the wavelet format is that they can implement progressive
resolution decoding, so a decoder only needs to read half of the data from a
4K source to decode a full quality 2K image.
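
A rough sketch of that property (using PyWavelets and the Haar filter for
brevity; JPEG 2000 uses the 9/7 or 5/3 filters, but the multiresolution idea
is the same):

    # The coarse (LL) subband of a 2D wavelet transform is already a
    # half-resolution picture, so a 2K decode of a 4K source can stop
    # after reading the low-frequency subbands.
    import numpy as np
    import pywt

    full = np.random.rand(2160, 4096)           # stand-in for a 4K luma plane
    ll, (lh, hl, hh) = pywt.dwt2(full, "haar")  # one decomposition level
    half_res = ll / 2.0                         # orthonormal Haar scales by 2
    print(full.shape, "->", half_res.shape)     # (2160, 4096) -> (1080, 2048)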

~~~
jfb
Interesting. I had known about the mosquito noise problem (as cjensen notes,
and I was joking about gzipping 10-bit 4:2:2), but I didn't realize the
progressive decoding was a part of the standard. I spent some time in the guts
of the DCP in a previous life, and software decoding of JP2K was always a
nightmare. So I'm a bit jaundiced.

But, as always, when you think an engineering decision is insane, you're
_probably_ missing the context in which the decision was made.

------
United857
What about WebP? Isn't that intended to be an eventual replacement for JPEG?

~~~
cbhl
Only if other browser vendors adopt it, although IIRC WebP has hard-coded
maximum file size limits that make it impractical for e.g. retina displays,
let alone anything we might see twenty years from now.

~~~
magicalist
Ah, that's interesting. I hadn't heard that before. The maximum size is
16383x16383 [1], so it's more than enough for retina displays, but I can see
the point about files in 20 years (for reference, JPEGs can be 4x that in each
dimension, 65535x65535).

I haven't heard that brought up as an objection to the format before, though.
If it were really a fundamental stumbling block, there are likely ways to
adapt the format around it.

[1]
[https://developers.google.com/speed/webp/faq#what_is_the_max...](https://developers.google.com/speed/webp/faq#what_is_the_maximum_size_a_webp_image_can_be)

------
1ris
I'm actually disappointed. I had hoped they would develop a still-image format
from Daala. Daala has significant improvements such as overlapping blocks,
differently sized blocks, and a predictor that works not only for luma or
chroma, but for both.

~~~
metajack
One of our listed intern projects is exactly that. It's not a high priority
project for the Daala team because we already have good royalty-free image
codecs, but not video codecs. We have finite time, so we choose to attack what
we see as the more important problem.

Developing a still image format from Daala is possible but not as trivial as
it looks (the same can be said for the WebP work from WebM).

------
Taek
I like that Mozilla is improving the existing accepted standard, but using
modern (mostly patented) codec techniques we could get lossy images to under
1/2 of the current size at the same quality and decode speed. Or at a much
higher quality for the same size.

The pace at which the modern web moves concerns me. The standards are not
moving forward. We still use HTML, CSS, Javascript, JPEG, GIF, and PNG. GIF
especially is a format where we could see similarly sized and similar-quality
moving images at 1/8th the file size if we supported algorithms similar to
those found in modern video codecs.

In all of these cases, they aren't "tried and true" so much as "we've had so
many problems with each that we've got a huge suite of half-hacked solutions
to pretty much everything you could want to do". We haven't moved forward
because we can't. WebP is a good example of a superior format that never stood
a chance because front-end web technology is not flexible.

~~~
pcwalton
I think the surprising thing is that JPEG is as good as it is. WebP certainly
isn't "1/2 of the current size at the same quality and decode speed". It's a
modest improvement if anything.

GIF, sure. But, IMO, nobody should really be using GIFs anymore. We've had
<video> and APNG for a while if you want animation.

------
TheZenPsycho
I have heard similar things about GIF (that there are optimisations that most
encoding software does not properly take advantage of). But I haven't seen any
efforts, or cutting edge software that actually follows through on that
promise. The closest I've seen is gifsicle, which is a bit disappointing.

What would be great is if there were some way for an animated GIF's frame
delays to opt in to being interpreted literally by the browser. That is, a
0-delay frame really would display with no delay, so optimisation strategies
involving splitting image data across multiple frames could be used: when read
by the browser, all frames would be overlaid instantly, modulo loading time.

What other things can be done to further optimise animated gif encoding?

~~~
Grue3
GIF should be killed. Animated PNG is where it's at.

~~~
iopq
Animated PNGs should be killed. Just embed a video with an alpha channel. It
compresses way better.

[http://simpl.info/videoalpha/](http://simpl.info/videoalpha/)

~~~
Grue3
Wrong use-case. Animated GIFs and animated PNGs were never intended for video.
The analogy to still image formats is video:JPEG = web-animation:PNG. You
don't save screenshots as JPGs and you don't save photos as PNGs. Same with
video and web-animation, two different use-cases require two different
formats.

~~~
iopq
Except that's a false dichotomy.

One format can support both lossy and lossless images and use better
compression between frames to achieve 10x better results for animations than
animated PNGs without visible loss of quality.

~~~
Grue3
Which format can support both lossy and lossless images?

I spent 2 seconds in GIMP to make the following GIF:
[http://i.cubeupload.com/kbrcfQ.gif](http://i.cubeupload.com/kbrcfQ.gif)
Please show me the video format that can replace it.

~~~
iopq
WebP supports animations, lossless and lossy images, etc. I already linked to
a video with an alpha channel that can match your GIF.

------
jmspring
It's not clear from the article: in their "comparison of 1500 JPEG images from
Wikipedia", did they just run the entropy coding portion again or did they
requantize? (I suspect they did just the entropy coding portion, but it's hard
to tell.)

Getting better encoding by changing the quantization method can't be judged
purely as a function of file size; traditionally, PSNR measurements as well as
visual quality come into play.
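
(For anyone following along, the basic PSNR calculation for 8-bit images is a
few lines of numpy; a sketch, not tied to any particular test harness:)

    # PSNR between an original and a re-encoded 8-bit image.
    import numpy as np

    def psnr(original, encoded):
        diff = original.astype(np.float64) - encoded.astype(np.float64)
        mse = np.mean(diff ** 2)
        if mse == 0:
            return float("inf")
        return 10.0 * np.log10(255.0 ** 2 / mse)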

Good to see some work in the area, I will need to check out what is new and
novel.

That said, a company I worked for many moons ago came up with a method whereby
reorganizing coefficients post-quantization could easily get you about a 20%
improvement in encoding efficiency, but the result was not JPEG-compatible.

There is a lot that can be played with.

------
transfire
If only JPEG supported transparency.

~~~
chrisdew
You can do a JS hack with JPEG for RGB and PNG for alpha, but it's not ideal.
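
The asset-preparation half of that hack is easy to sketch (Pillow here; file
names are examples, and the browser side still needs canvas/JS to recombine
the two):

    # Split an RGBA PNG into a JPEG (color) plus a grayscale PNG (alpha).
    from PIL import Image

    src = Image.open("sprite.png").convert("RGBA")
    src.convert("RGB").save("sprite_color.jpg", quality=85)
    src.getchannel("A").save("sprite_alpha.png")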

------
Matrixik
When I optimize JPGs or PNGs I usually use ScriptJPG and ScriptPNG from
[http://css-ig.net/tools/](http://css-ig.net/tools/)

They are shell scripts that run many different optimizers.

------
morganw
"support for progressive JPEGs is not universal"
[https://en.wikipedia.org/wiki/JPEG#JPEG_compression](https://en.wikipedia.org/wiki/JPEG#JPEG_compression)

e.g. the hardware decoder in the Raspberry Pi
[http://forum.stmlabs.com/showthread.php?tid=12102](http://forum.stmlabs.com/showthread.php?tid=12102)

------
kraken-io
Hey everyone, after some testing we have just deployed mozjpeg to our web
interface at: [https://kraken.io/web-interface](https://kraken.io/web-interface)

You can test it out by selecting the "lossless" option and uploading a jpeg.
Enjoy!

------
kllrnohj
So... version 1.0 is basically a shell script that calls libjpeg-turbo
followed by jpgcrush?

~~~
michaelmior
No, the jpegcrush functionality is implemented in C as an extension to
libjpeg-turbo. But yes, I suppose a shell script would achieve roughly the
same result.

~~~
kllrnohj
So why wasn't this upstreamed to libjpeg-turbo? Why fork at all? libjpeg-turbo
is still an actively maintained project after all...

~~~
syntheticnature
Maintainer has different priorities, see
[https://news.ycombinator.com/item?id=7349117](https://news.ycombinator.com/item?id=7349117)

------
sp332
Any chance of incorporating other psy improvements, instead of just targeting
SSIM?

------
Momentum
At first glance this seems wasteful. I do not think anyone has a problem using
JPEG. Then again, in many cases, before a thing is invented, who ever has a
problem using the old tools!

------
SimHacker
Has somebody translated the jpeg library to JavaScript? Besides encoding and
decoding jpeg, it has some useful modules that would be nice to have in the
web browser.

------
callesgg
A bit too soon to start announcing the project. But I like the initiative;
I hope the project manages to improve things.

~~~
joshmoz
We would like to develop in the open, and hopefully with community
participation.

~~~
brigade
BTW, please fix your lossy test methodology for when you do your tests on a
future version of this.

[https://blog.mozilla.org/research/2013/10/17/studying-lossy-image-compression-efficiency/](https://blog.mozilla.org/research/2013/10/17/studying-lossy-image-compression-efficiency/)
was rather flawed and you never acknowledged this.

(see
[https://news.ycombinator.com/item?id=6581827](https://news.ycombinator.com/item?id=6581827)
- only one of the four sets of test results _might_ have been valid)

------
davidgerard
What license are they doing this under? Hopefully they're aiming to upstream
this to libjpeg.

------
jimbones
This is so dumb. There are a million JPEG crushers in existence, but instead
of advocating the use of one of these, Mozilla writes their own? Why not
support WebP rather than dismissing it over compatibility and wasting time
redoing what has been done before?

