
Obscure Features of JPEG (2011) - userbinator
https://hodapple.com/blag/posts/2011-11-24-obscure-features-of-jpeg.html
======
londons_explore
It's worth remembering that if you're doing all these methods to optimize a
page for fast mobile display, most hardware is typically CPU limited more than
network limited during a good part of displaying most webpages.

Displaying a JPEG in multiple passes uses significantly more CPU. Rather than
decoding the JPEG once and putting it on the screen, you end up decoding it 5,
10, or even 20 times, and have to rescale it, render any overlapping text or
effects, composite it and put it on the display every time.

Some hardware has accelerated JPEG decoding, but it will usually still have to
render overlapping text, borders, or clip masks with the CPU. The frequent
back and forth between GPU and CPU ends up being a big overhead too.

Your optimized jpeg file might theoretically render decently with fewer bytes
downloaded, but your overall pageload might be delayed by a second or more due
to the extra rendering required.

~~~
pornel
> Displaying a JPEG in multiple passes uses significantly more CPU

No, it doesn't. It has become a meme mainly because libjpeg-turbo v1 didn't
have optimizations for progressive coding. That's been fixed in v2.
Progressive rendering keeps state and incrementally updates it. Browsers
throttle refresh speed to avoid expensive edge cases. There has been a lot of
investment in browsers to make compositing dirt cheap.

There are ways to make incomplete JPEG decoding even faster, e.g. decoding
just the DC passes (that's 1/64th of the memory and no IDCT), but the feedback
I've got from the maintainer of libjpeg-turbo and from browser vendors is that
it's at best "nice to have" territory and the progressive JPEG overhead is a
non-issue in practice.

~~~
londons_explore
My experience tuning the Chromium rendering pipeline doesn't match this. The
actual JPEG decode is fast, but all the rest of the drawing and compositing
required is sloooow. For one thing, the Skia graphics library has to reprocess
the draw list for everything in the layer, and the draw list (including the
elements outside the clipping rect of the image) needs to be serialised and
sent to the GPU process for every update. Perhaps I'm missing something,
though.

~~~
untog
It sounds like OP (no offence to them) is missing the context? Decoding a JPEG
itself can be super fast, but if laying out every element that overlaps that
JPEG is slow, then a progressive JPEG is still going to cause speed issues.

------
Mattwmaster58
I looked for a demo of progressive JPGs online and found this:
[https://pooyak.com/p/progjpeg/](https://pooyak.com/p/progjpeg/). Pretty cool

~~~
ganzuul
Haven't seen images load like that since the dial-up modem days.

I kludged together a Node.js application in 2013 to load JPEG SOS segments
separately to the browser. The idea was to tie it to depth in a VR
application, like level-of-detail maps in game engines, but 'online'. Turned
out no browser liked that much, so I dropped the project.

------
magicalhippo
An interesting article detailing the various parts of JPEG and the choices
behind them can be found here:

[https://www.spiedigitallibrary.org/journals/journal-of-
elect...](https://www.spiedigitallibrary.org/journals/journal-of-electronic-
imaging/volume-27/issue-04/040901/JPEG-1-standard-25-years--past-present-and-
future/10.1117/1.JEI.27.4.040901.full?SSO=1)

------
DougBTX
Some of these JPEG features are great for optimising JPEG resizing, which can
be significantly faster than fully decoding the image then resizing.

[https://github.com/mattes/epeg](https://github.com/mattes/epeg)
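
For a concrete feel of the technique, here is a minimal Python sketch using
Pillow's draft mode, which asks libjpeg to decode at a reduced 1/2, 1/4 or 1/8
scale straight from the DCT coefficients (the same kind of trick epeg and
libjpeg-turbo expose); the filename and target size are placeholders:

    from PIL import Image

    # Decode-time downscaling: libjpeg reconstructs each 8x8 block at a
    # reduced size, skipping most of the IDCT and colour-conversion work.
    img = Image.open("photo.jpg")
    img.draft("RGB", (400, 300))        # pick the nearest 1/1..1/8 decode scale
    img = img.resize((400, 300))        # fine-tune to the exact target size
    img.save("resized.jpg", quality=85)

Pillow's thumbnail() uses the same draft mechanism internally, so plain
thumbnail generation already gets much of this speedup for free.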

------
contingencies
The other day I uploaded an image to Wikimedia Commons at maximum JPEG file
format resolution.

[https://commons.wikimedia.org/wiki/File:Panorama_of_Sydney_f...](https://commons.wikimedia.org/wiki/File:Panorama_of_Sydney_from_Lavender_Bay_\(1875\).jpg)

This is likely to become a more frequent issue in future.

~~~
JadeNB
Probably-stupid question from a non-specialist: if you're digitising an
existing real-world picture and want really high fidelity, then why would you
convert it to a lossy format like JPEG?

~~~
contingencies
Fair question. The image was assembled way back in 2014 or earlier from a
large number of tiled JPEG source images, so the quality had already been
compromised. Also, JPEGs are essentially viewable on any device this side of
1990 and way smaller than lossless formats. These days I would consider webp
in preference to JPEG.

------
jfries
Could progressive mode be used to serve thumbnails by just truncating the
image at a suitable point, or does the spec (and so, decoders) expect the
whole image to eventually arrive?

~~~
ygra
Since progressive JPEGs are displayed while downloading and the connection
could just be closed at any moment anyway ... I don't think that'd be a
problem. Whether that's more efficient than an extra thumbnail is probably the
more interesting question.

~~~
TapamN
If you have time and space to pre-generate thumbnails, it's probably not a
significant win, but I think it could work well for displaying local
thumbnails of JPEGs, like from a camera.

If you're browsing a directory of hundreds of large (e.g. 10+ MB) JPEG
photographs, generating the thumbnails by fully decompressing all of them
would take a while. "Progressive thumbnails" that only decompress the first
~100 KB would be much faster.
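
A rough sketch of that in Python with Pillow (the filename and the 100 KB
cut-off are placeholders, and this assumes the file really is progressive and
its headers fit in the first chunk):

    from io import BytesIO
    from PIL import Image, ImageFile

    ImageFile.LOAD_TRUNCATED_IMAGES = True   # tolerate the missing tail of the file

    with open("photo.jpg", "rb") as f:
        head = f.read(100 * 1024)            # only the first ~100 KB is read from disk

    preview = Image.open(BytesIO(head))      # decodes whatever scans made it into the chunk
    preview.thumbnail((256, 256))
    preview.save("thumb.jpg")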

~~~
mkl
You can do that even with non-progressive JPEGs, as you can use just the low
frequency terms from the discrete cosine transform (the same data that comes
first with progressive ordering).

Epeg and libjpeg-turbo can do this.

~~~
TapamN
You would still have to read the entire JPEG in though, wouldn't you?

I'm not an expert on JPEG, but I think that if you want the macroblocks at
the bottom of the image, you still need to un-Huffman all the blocks before
them to find where each macroblock starts (since AFAIK there isn't a table
indicating where each block starts). That means you have to read the entire
JPEG from storage, only to throw away the vast majority of it.

Even if there was a way to magically predict where the low frequency values of
the image are stored, you'd still have to do tens of thousands of random reads
to just get to them. Reading the whole file would be faster.

So if you have 500 photos and you want to go through them and need some
thumbnails, for non-progressive thumbnail generation you have to read
10 MB x 500 images = 5 GB of data, but with progressive thumbnails you only
need the first 100 KB x 500 images = 50 MB of data.

~~~
mceachen
As an aside, if you're just wanting thumbnails, most digital cameras embed
small (120x160-ish) thumbnails in the EXIF header that can be quickly
extracted by exiftool.
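
If you'd rather stay in-process instead of shelling out to exiftool, something
like this with the piexif library (my choice of library here, not the
parent's) pulls the embedded preview out without decoding the main image:

    import piexif

    exif = piexif.load("photo.jpg")   # parses the EXIF segment only
    thumb = exif.get("thumbnail")     # raw JPEG bytes of the camera's preview, or None
    if thumb:
        with open("thumb.jpg", "wb") as f:
            f.write(thumb)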

------
tedd4u
It all depends on what the client is doing when it's decoding, so YMMV, but
progressive JPEG is not a panacea. In addition to the CPU concerns mentioned
above there can be memory implications as well. Many progressive JPEG decoders
require a full-sized destination buffer to decode, while baseline decoders
usually only need one row of blocks. If some downsampling is required most of
the time anyway, it can be done inline with baseline-encoded images, whereas a
progressive implementation may need a memory footprint on the order of the
full image area (n^2 for an n x n image). If you have many pages with many
images, all of which are going to require some client-side scaling, this could
add up to a lot of unnecessary alloc/dealloc too.
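
Back-of-the-envelope numbers for that difference (illustrative only; real
decoders differ in exactly what they buffer):

    # Hypothetical 4000x3000 RGB photo.
    width, height, components = 4000, 3000, 3

    # Baseline: a libjpeg-style decoder can hand back one row of MCUs at a
    # time, so the working set is on the order of 8-16 pixel rows.
    baseline = width * 16 * components                 # ~190 KB

    # Progressive: coefficients for the whole image are kept until every
    # scan has been applied (16 bits per coefficient).
    progressive = width * height * components * 2      # ~69 MB

    print(f"baseline ~{baseline // 1024} KiB, "
          f"progressive ~{progressive // 2**20} MiB")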

But I encourage you to try both ways and pick a winner based on your own
results.

------
zzo38computer
I should think that if the quantization matrix (which is stored in the JPEG
file) is known, then it should be possible to losslessly re-encode the image
back into a JPEG after it has been converted from JPEG into a lossless format,
so that the DCT coefficients have been discarded. However, I do not know if
any program actually does this, or how to write such a program. Another thing
I considered is whether a decoder could somehow try to improve the quality of
the output by producing a picture which could have been the input to the
encoder with the specified quantization matrix.
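
A toy version of the first idea in Python, using the standard Annex K
luminance table as the "known" matrix and looking at only the first 8x8 block;
whether this exactly reproduces the encoder's coefficients depends on the
decoder's IDCT being exact and on no clipping having occurred:

    import numpy as np
    from scipy.fft import dctn
    from PIL import Image

    # Standard JPEG luminance quantization table (Annex K), standing in for
    # the table actually stored in the file's DQT segment.
    QUANT = np.array([
        [16, 11, 10, 16,  24,  40,  51,  61],
        [12, 12, 14, 19,  26,  58,  60,  55],
        [14, 13, 16, 24,  40,  57,  69,  56],
        [14, 17, 22, 29,  51,  87,  80,  62],
        [18, 22, 37, 56,  68, 109, 103,  77],
        [24, 35, 55, 64,  81, 104, 113,  92],
        [49, 64, 78, 87, 103, 121, 120, 101],
        [72, 92, 95, 98, 112, 100, 103,  99],
    ])

    gray = np.asarray(Image.open("decoded.png").convert("L"), dtype=np.float64)
    block = gray[:8, :8] - 128.0        # first luma block, level-shifted as in JPEG

    coeffs = dctn(block, norm="ortho")  # forward 8x8 DCT-II, JPEG's normalization
    requantized = np.rint(coeffs / QUANT).astype(int)
    print(requantized)                  # candidate coefficients for a lossless re-encode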

(Of course, none of this is what the article is about (it is about progressive
and multi-scan JPEG), but these are my questions/comments anyway.)

------
mackman
> jpegtran can do some other things losslessly as well - flipping, cropping,
> rotating, transposing, converting to greyscale

OK, so those are only lossless in specific situations, when the image
dimensions are a multiple of the DCT block size. And of course grayscale and
cropping are lossy by definition. A better way to say it is that it can
perform some transformations without recomputing and recompressing the DCT,
although if you pass the -optimize option you are still redoing the Huffman
tables.

~~~
gugagore
I understand your point, but it's not necessarily a better way to say it. The
proper term is "generation loss", and it applies regardless of whether the
edit operation itself is destructive.

------
mmastrac
Also note that JPEG supports arithmetic coding for somewhat better
compression, but most encoders and decoders didn't support it (do they
currently?) because of patents, which only expired over the last few years.

~~~
userbinator
The majority of software still doesn't support it. It's a chicken-and-egg
problem --- almost no one uses arithmetic mode because no software supports
it, and almost no software supports it because almost no one uses arithmetic
mode.

------
mysterydip
Are there any image formats that have a signed hash or similar, to know
whether an image has been altered since creation?

~~~
morsch
JPEG and many other image formats support adding arbitrary metadata, so you
could easily sign the image data and add it as an Exif tag. I'm not sure if
anybody is doing it, though. It seems like a good idea; maybe we should start.

Here [1] is a paper evaluating the concept. The link to the code examples
seems dead, but I bet you could cobble something together in a couple of lines
of shell script. The only difficult part is signing just the image data as
opposed to the whole file (which you're going to modify by appending the
signature itself).
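
A bare-bones sketch of the "sign just the image data" part, using Ed25519 from
the cryptography package (key management and actually embedding the signature
in an Exif tag are left out; note also that decoded pixels aren't bit-identical
across every JPEG decoder, so hashing the entropy-coded data may be more
robust):

    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from PIL import Image

    key = Ed25519PrivateKey.generate()          # in practice, a long-lived published key

    pixels = Image.open("photo.jpg").tobytes()  # decoded pixel data, metadata excluded
    signature = key.sign(pixels)                # 64-byte signature to stash in an Exif tag

    # Verification with the matching public key raises InvalidSignature on any pixel edit.
    key.public_key().verify(signature, pixels)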

Obviously, signing the image like that only guarantees that the owner of the
key also generated the image. You're going to have to trust them and their
sources that they haven't modified it.

If the camera itself did the signing in a secure manner, that would be a much
stronger guarantee. You could rely on a JPEG or RAW file having been generated
by a Canon camera by validating Canon's signature. I think some cameras can do
the signing part, but not so much the secure-manner part[2].

In either case it's obviously trivial to strip the signature. So it doesn't
help photographers who want to prevent reproduction of their works without
attribution.

Finally, here[3] is a Stack Exchange post on the topic.

[1]
[https://www.sciencedirect.com/science/article/pii/S221083271...](https://www.sciencedirect.com/science/article/pii/S2210832717300753)
(2018)

[2] [https://petapixel.com/2010/12/01/russian-software-firm-
break...](https://petapixel.com/2010/12/01/russian-software-firm-breaks-
canons-authenticity-verification-big-time/)

[3] [https://photo.stackexchange.com/questions/15307/can-
digital-...](https://photo.stackexchange.com/questions/15307/can-digital-
cameras-sign-images-to-prove-authenticity)

~~~
mysterydip
Wow, thanks for the detailed reply. My initial thought was that you could make
sure evidence submitted in court hasn't been edited (e.g. photoshop someone at
a crime scene, or add someone to a pic for an alibi), but there could be a few
uses. The problem with EXIF data is, as you say, that it's trivial to strip,
but it might be a stepping stone towards another "secure" format (sjpeg?).

~~~
dfox
If you want a truly secure format for these kinds of applications, the best
thing you can do is to sign the data externally (or wrap the whole file in
some signed container, which seems to be the preferred solution for the EU's
eIDAS and related stuff).

On the other hand, I have seen a totally insecure but effective hack for
formats with embedded metadata: include some kind of value in there that is
usually prominently displayed by the OS and applications, but store it in a
somewhat broken way, such that applications trying to preserve the metadata
will break it even more and it becomes unreadable.

