
Building a blazing fast ETC2 compressor - ivank
https://medium.com/@duhroach/building-a-blazing-fast-etc2-compressor-307f3e9aad99
======
Scaevolus
> For a 1024x1024 image, that’s 256x256 blocks, where each one can
> independently be one of 10 separate formats. In total, that’s
> 10<sup>65536</sup> potential combinations for the encoder to search through
> in order to find the best-possible-quality image.

This math is misleading. ETC2 is a GPU texture compression format, so blocks
are accessed independently: each 4x4 chunk is entirely represented by 8 or 16
bytes of data. That means there are only 655,360 candidate encodings to
search through, assuming each chunk is encoded independently using the best
method. That works out to ~10ms per chunk for brute force, which is slow but
not _completely_ crazy.
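A quick sketch of the arithmetic behind that correction (numbers taken from the article's own figures; the 10 modes per block are the article's count, nothing here is tied to a real encoder API):

```python
# A 1024x1024 image splits into 4x4 blocks.
blocks = (1024 // 4) * (1024 // 4)   # 256 * 256 = 65,536 blocks
modes_per_block = 10                 # candidate encodings per block (per the article)

# If block choices interacted, the joint space really would be 10**65536.
# But ETC2 blocks are independent, so the encoder just evaluates every
# mode for every block and keeps the best one per block:
candidate_encodings = blocks * modes_per_block
print(candidate_encodings)  # 655360
```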

Here's a slide deck with a generic overview on what ETC2 provides:
[https://www.khronos.org/assets/uploads/developers/library/20...](https://www.khronos.org/assets/uploads/developers/library/2012-siggraph-opengl-es-bof/Ericsson-ETC2-SIGGRAPH_Aug12.pdf)

And here's one with detailed information on how ETC2 encodes blocks:
[http://www.graphicshardware.org/previous/www_2007/presentati...](http://www.graphicshardware.org/previous/www_2007/presentations/strom-etc2-gh07.pdf)

------
astrange
This is an image compressor (I guess), but it doesn't use any of the words I
would expect to see in an article about one: prediction, rate-distortion,
residuals. Guess they haven't talked to the ffmpeg/vp8 people down the hall.

I'd complain about PSNR - optimizing for that gets you blurry/banded images -
but the codec looks too simple for that to really matter.

~~~
tveita
"Prediction, rate-distortion and residuals" sounds suspiciously like you're
talking about a video format, not an image format.

And this isn't just any image format, it's a _texture format_. That comes with
some additional limitations that will be foreign to the ffmpeg/vp8 people down
the hall. Texture formats are made to be decodable in hardware, and are
usually fixed-rate and support random lookup. ETC2 divides the picture into
fixed-size blocks, and each block is represented with the same amount of bits.
You can't predict a block from its surrounding blocks, and you can't save
bits on easy-to-compress blocks to spend on the difficult ones.
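That fixed-rate property is exactly what makes O(1) random texel lookup possible in hardware: the block holding any texel sits at a directly computable offset. A minimal sketch, assuming 16 bytes per block (ETC2 uses 8 or 16 depending on the pixel format; 16 is picked here for illustration, and `block_offset` is a made-up name, not a real API):

```python
BLOCK_BYTES = 16   # assumed block size; ETC2 blocks are 8 or 16 bytes
BLOCK_DIM = 4      # ETC2 blocks are 4x4 texels

def block_offset(x, y, width):
    """Byte offset of the compressed block containing texel (x, y),
    for blocks stored in row-major block order."""
    blocks_per_row = width // BLOCK_DIM
    bx, by = x // BLOCK_DIM, y // BLOCK_DIM
    return (by * blocks_per_row + bx) * BLOCK_BYTES

# Any texel can be fetched by decoding exactly one block:
print(block_offset(5, 9, 1024))  # block (1, 2) -> (2*256 + 1) * 16 = 8208
```

A variable-rate format (like JPEG or PNG) can't offer this: you'd have to decode everything up to the texel you want.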

~~~
astrange
> "Prediction, rate-distortion and residuals" sounds suspiciously like you're
> talking about a video format, not an image format.

All video formats make better image formats than the image ones do, because
people expect video codecs to get replaced faster. They just call them
I-frames instead. H.265 is a better JPEG and FFV1 is a much better PNG.

> Texture formats are made to be decodable in hardware, and are usually fixed-
> rate and support random lookup.

Yeah, that's the major difference, and of course it's 2D random lookup so
linear pixels are a very bad choice too. I don't know much about this area,
but OS X WindowServer used to store all its bitmaps RLE-compressed and handle
them straight in the compositor…

> You can't predict the block based on surrounding blocks, and you can't save
> bits on easy to compress blocks and spend them on the difficult ones.

There's different kinds of prediction in codecs - inside a block (intra
prediction, he calls it "block type") and from neighboring blocks. I would've
figured the second one doesn't happen too, but then what does he mean by
"proto-block-chains"? Seems like he's trying to talk about dynamic programming
for intra prediction, but actually means intra prediction heuristics.

Rate-distortion doesn't matter if you can't control rate, yeah. But "residual"
exists - it's just the pixel values not guessed by prediction.
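To make the term concrete, a tiny hypothetical 1-D example of a residual (not ETC2-specific; the predictor values are made up for illustration):

```python
# The residual is whatever the predictor failed to guess:
# residual[i] = actual[i] - predicted[i]
actual    = [100, 102, 105, 110]
predicted = [100, 100, 104, 108]   # output of some hypothetical predictor
residual  = [a - p for a, p in zip(actual, predicted)]
print(residual)  # [0, 2, 1, 2]
```

Small residuals are what make prediction pay off: they cost fewer bits to store than the raw values. In a fixed-rate format like ETC2 that saving has nowhere to go, which is astrange's point about rate-distortion not applying.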

~~~
tveita
There is no intra prediction. The proto-block-chains are just an optimization
to choose which encodings will be tried for each block. An alternative is
exhaustive search for each block, which is feasible, but slow.

