
A Tour of the Tiny and Obfuscated Image Decoder - a_e_k
http://eastfarthing.com/blog/2020-09-14-decoder/
======
a_e_k
There's been some discussion on HN [1] of this IOCCC-winning program by
Fabrice Bellard before, but no real explanation of how it works. This post is
my attempt at a thorough exegesis (and something of a tutorial on modern lossy
image compression techniques).

At some point, I plan to write a followup post describing the encoder that I
wrote. Enjoy!

[1]
[https://news.ycombinator.com/item?id=17011872](https://news.ycombinator.com/item?id=17011872)

~~~
ConcernedCoder
Thank you for doing this, it was a wonderful read.

------
userbinator
In summary, Bellard took pieces of JPEG, H.264, and H.265 and combined them
together into a tiny and highly efficient image codec. Were this anyone else,
I would think it's extremely impressive; but it being Bellard somehow makes it
seem like par for the course, given everything else he has done...

~~~
a_e_k
Yes, the impressive thing to me was discovering just how many such pieces he
managed to cram into such a small amount of source code, and the
judiciousness with which he chose them.

While working on an encoder for it (for a future post!), I did find myself
wishing that it'd also had a deblocking filter. But that might have been
asking for a miracle.

------
peterburkimsher
Wow! I'm impressed at how such an image can be decoded from only 4 KB.

Would any mathematicians be able to take it an order of magnitude further? The
legendary DeCSS code is only 433 bytes, and can be represented as a prime
number [0]. Imagine if images could be represented as prime numbers.

[0]
[https://en.wikipedia.org/wiki/Illegal_prime](https://en.wikipedia.org/wiki/Illegal_prime)

~~~
a_e_k
As the sibling comments here say, that should certainly be possible.

However, I don't think that would result in any real compression. At 433
bytes, the DeCSS code takes 3464 bits. Reading the Wikipedia page you linked
tells me that the smallest of the prime representations for it was 1401
decimal digits. That's worth 1401 log2(10) ≈ 4654 bits of information. So I
believe that encoding it as a prime represents an increase in size in this
case.
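The arithmetic above is easy to sanity-check. This snippet just redoes the back-of-the-envelope comparison: 433 bytes versus the information content of a 1401-digit decimal number.

```python
import math

# DeCSS source size, as stated on the Wikipedia page.
decss_bytes = 433
decss_bits = decss_bytes * 8  # 3464 bits

# Smallest known prime representation: 1401 decimal digits.
# Each decimal digit carries log2(10) ≈ 3.32 bits of information.
prime_digits = 1401
prime_bits = prime_digits * math.log2(10)

print(decss_bits)         # 3464
print(round(prime_bits))  # 4654 -- larger, so the prime form is an expansion
```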

~~~
peterburkimsher
Yes, it's an increase in storage size. However, it might be a decrease in
network bandwidth.

It doesn't take many bits for me to send you the illegal prime in decimal
exponent representation: k*256^211+99

With a sufficiently powerful CPU on the receiving end, you can decode that
into DeCSS. I would enjoy being able to send you a picture just by telling you
a prime number like that.
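For anyone curious what "decoding" such a number would look like: a value of the form k*256^n + c is just a byte string shifted left by n bytes with c in the tail, so recovering the data is plain big-endian serialization of the integer. A toy sketch (the k, n, and payload here are made up for illustration; the real prime's k is far larger):

```python
# Build a number of the form k * 256**n + c from a small example payload.
k = int.from_bytes(b"hello", "big")
n, c = 3, 99
number = k * 256**n + c

# Decoding: serialize the integer back to big-endian bytes.
data = number.to_bytes((number.bit_length() + 7) // 8, "big")
print(data)  # b'hello\x00\x00c' -- the shift adds zero bytes, and 99 is ASCII 'c'
```

The catch, of course, is that k itself has to be transmitted, and it is nearly as large as the original data.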

------
pierrec
Nice! I guess you'll be the first to publish an encoder for this format. I
wonder if Fabrice Bellard deliberately never published his encoder in order to
encourage curiosity leading to this kind of research. Even then I never
expected I would see an analysis with this level of detail, it's almost like a
course on compression techniques (but more entertaining).

~~~
a_e_k
I expect I will be the first, unless someone takes what I've posted here and
uses it to create and publish their own before I write up mine (or the
original one used for the contest gets published). Still, I've posted a new
image file in this format at the end to prove that I've done it.

I've corresponded a bit with Fabrice Bellard. He'd indicated that he simply
hadn't found the time to clean up his encoder for publication. To be fair, I
wouldn't yet consider mine to be publishable. I'll be very curious to compare
them if he ever does -- the neat thing about modern lossy codecs is that there
tends to be a lot of latitude for the encoders to make choices.

I'm glad you enjoyed all the detail. I had worried that perhaps there was too
much but I like to be thorough.

~~~
eatingCake
> I've posted a new image file in this format

I couldn't resist the urge to follow along and rewrite the code you've
presented in Rust, and what do I find at midnight after finally getting it to
work and running your test image through? You got me.

~~~
a_e_k
Excellent! I was hoping that someone would notice that and get a small chuckle
out of it.

------
shakna
Slightly offtopic, but I'm having some problems accessing this site due to its
use of... older technology.

Curl reports: SSL connection using TLSv1.0 / DHE-RSA-AES256-SHA

Firefox will refuse to load that without forcibly enabling TLS 1.0, which is
marked for complete removal in the near future. I believe Chrome's approach is
similar, with the same expectation of removal.

Now might be a good time to check the site's architecture, as annoying as that
probably is (especially as it does just appear to be a static blog).

~~~
a_e_k
Thanks for the heads up. That's odd, since it works for me in Firefox (on both
desktop and the new mobile) and I don't think I've had to force-enable
anything.

But it is true that I need to get a certificate and upgrade it to support
https. That's been on my to-do list but lower priority since, as you say, it's
a static blog.

~~~
shakna
Ah, that'll be it. Your web host actually already supports https, but in a
broken way.

> TLSv1.0 (IN), TLS handshake, Finished (20):

> SSL connection using TLSv1.0 / DHE-RSA-AES256-SHA

> ALPN, server did not agree to a protocol

> Server certificate:

> subject: CN=*.phpwebhosting.com

> start date: Apr 7 00:07:56 2020 GMT

> expire date: Apr 8 00:07:56 2021 GMT

> subjectAltName does not match eastfarthing.com

They serve up a wildcard certificate over a deprecated TLS version.

