

Zip Bomb - takinola
http://en.wikipedia.org/wiki/Zip_bomb

======
jefffoster
I found a similar file to this (a zip file that contains itself) and e-mailed
it to a friend at work. He never received it, but I thought nothing of it (I
assumed the email filters just destroyed it).

A few days later the mail server stops working and the sysadmin turns up at my
desk. Turns out the anti-virus scanner had been unzipping and scanning
repeatedly. It eventually filled up the entire disk and bad things happened.
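
For what it's worth, the fix is bounded decompression. Here's a minimal
sketch in Python of the kind of guard that scanner evidently lacked (the
depth and byte limits are invented for illustration, not anything a real AV
engine uses):

    import io
    import zipfile

    MAX_DEPTH = 3            # refuse archives nested deeper than this
    MAX_TOTAL = 100_000_000  # cap on cumulative decompressed bytes

    def scan(data, depth=0, budget=MAX_TOTAL):
        """Recursively scan a zip without looping forever on a
        self-containing archive or filling the disk."""
        if depth > MAX_DEPTH:
            raise ValueError("nested too deeply; likely a zip bomb")
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for info in zf.infolist():
                budget -= info.file_size  # size claimed in the metadata
                if budget < 0:
                    raise ValueError("decompressed size over budget")
                member = zf.read(info)
                if member[:4] == b"PK\x03\x04":  # another zip inside
                    budget = scan(member, depth + 1, budget)
        return budget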

~~~
chrismorgan
That reminds me of an incident when I was in year 8: seeing how deeply nested
I could get directories on Windows. H:\a\a\a\a\... - eventually it stopped
working. (I played the game with my friend: he went for creating a new
directory at each level; after a little while I became sensible and went for
copying and pasting, thus doubling the depth at each step, which of course
achieves the goal pretty quickly - so I won by a considerable margin.)

The school IT manager (who, incidentally, apart from this once I was always on
good terms with) was rather annoyed at me the next day, for the nightly backup
had fallen over the previous night and he had found the problem. You see, what
to me was H:\ was \\galaxy\users$\chrism, which on that server was
D:\users\chrism. So that 256-or-so character path became longer than 256
characters on the server and the backup software hadn't been written carefully
enough to cope with what was a perfectly valid NTFS path, but not a valid path
for the normal Win32 API function calls.

How was I to know it would do that?

~~~
TazeTSchnitzel
One of many examples of why Windows's arbitrary path length restriction is
ridiculous.

~~~
duskwuff
Most UNIX systems have a PATH_MAX. It's not just Windows.

~~~
krakensden
PATH_MAX is defined but meaningless on both Linux and OS X.
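
You can see how little it means on Linux with a few lines of Python: relative
operations happily build a tree whose absolute path is far past the 4096-byte
limit, and only syscalls handed the whole path as one string fail. A sketch
(directory names are arbitrary):

    import os

    part = "a" * 100          # each component well under NAME_MAX (255)
    os.mkdir("deep"); os.chdir("deep")
    length = len(os.getcwd())
    for _ in range(200):
        os.mkdir(part)        # relative mkdir/chdir never see the full path
        os.chdir(part)
        length += len(part) + 1

    print(length)             # ~20,000 chars, yet nothing failed so far
    # open() with the full absolute path would fail with ENAMETOOLONG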

~~~
comex
Unfortunately, this is not true. The OS X kernel can't deal with paths longer
than MAXPATHLEN, which is defined as 1024.

------
simoncoggins
I've seen something similar with a PNG file for a user-supplied profile image
[1]. The image was a 10000x10000 all-black PNG, which compresses to a pretty
small file size.

Unless you validate the image dimensions as well as the file size it can
cause problems; for instance, when GD tried to resize it, it exhausted the
memory limit.
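
Checking the dimensions is cheap, because PNG stores them in the fixed-layout
IHDR chunk right after the 8-byte signature, so you can reject oversized
images before any decoder touches the pixel data. A sketch (the 8192 cap and
the file name are placeholders):

    import struct

    MAX_DIM = 8192  # illustrative per-site limit

    def png_dimensions(path):
        """Read width/height from IHDR without decoding any pixels."""
        with open(path, "rb") as f:
            head = f.read(24)
        if head[:8] != b"\x89PNG\r\n\x1a\n" or head[12:16] != b"IHDR":
            raise ValueError("not a PNG")
        return struct.unpack(">II", head[16:24])

    w, h = png_dimensions("avatar.png")
    if max(w, h) > MAX_DIM:
        raise ValueError("refusing %dx%d image" % (w, h))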

[1] <https://bugs.launchpad.net/mahara/+bug/784978>

~~~
rorrr
Out of curiosity I just made two images:

15,000 x 15,000: <http://i.imgur.com/WzCyE.png>

50,000 x 50,000: <http://i.imgur.com/kgmHu.png>

Both FF and Chrome refuse to open the second one. IE does something weird.
Both Opera and Safari figure out the size correctly, but don't display the
image.

~~~
m4rkuskk
In case Safari crashes and you can't open it any more, sudo rm -rf
~/Library/Caches/com.apple.Safari did it for me.

------
DanBC
Here are some other compression curiosities:

(<http://www.maximumcompression.com/compression_fun.php>)

It includes a 115-byte RAR file that expands to 5 MB. (Those 115 bytes can be
squashed down further; one compressor gets it to 39 bytes.) There's also a
file that compresses well with one program but ends up bigger with another,
etc.

Some say that file compression is linked to AI: good general-purpose
compression relies on being able to predict the text and build a model of it,
and if you can predict something, you understand it.

The Hutter Prize tests this against the 100 MB of enwik8. The best attempt so
far is a bit less than 16 MB.
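
To make the prediction link concrete: even a zeroth-order model - just the
byte histogram, ignoring all context - puts a floor on what a context-blind
coder can achieve. A toy sketch in Python (the sample text is made up):

    import math
    from collections import Counter

    def entropy_bits_per_byte(data):
        """Shannon entropy of the byte histogram: a lower bound for any
        coder that ignores context."""
        n = len(data)
        return -sum(c/n * math.log2(c/n) for c in Counter(data).values())

    text = b"the quick brown fox jumps over the lazy dog " * 1000
    print(entropy_bits_per_byte(text))  # ~4.3 bits/byte, versus 8 raw

A compressor that predicts from context crushes that repetitive buffer to
almost nothing - which is exactly the prediction part.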

(<http://prize.hutter1.net/>)

I remember zip bombs from early-90s BBSing. I also remember ANSI bombs.

~~~
im3w1l
Very relevant:
[https://en.wikipedia.org/wiki/Kolmogorov_complexity#Incomput...](https://en.wikipedia.org/wiki/Kolmogorov_complexity#Incomputability_of_Kolmogorov_complexity)

~~~
DanBC
> _By definition, the Kolmogorov complexity K of a string x is defined as the
> length of the shortest program (self-extracting archive) computing x. K(x)
> itself cannot be computed, only approximated from above, namely by finding
> better and better compressions, but even if we reach K(x) we will never know
> whether we have done so. For a text string like enwik8, Shannon's estimate
> suggests that enwik8 should be compressible down to 12MB._

The current record is 15,949,688 bytes.

------
luriel
See also Russ Cox's "Zip Files All The Way Down" article:

<http://research.swtch.com/zip>

------
vog
Even more impressive: A zip file that extracts to itself. (That's also shown
in the article, as a kind of "Lempel-Ziv quine".)

------
swighton
A similar attack to mess up XML parsers:

<http://en.wikipedia.org/wiki/Billion_laughs>
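
For reference, the whole trick fits in a dozen lines: nine layers of
entities, each expanding to ten copies of the layer below. A sketch that
generates the canonical payload and does the arithmetic:

    # Each layer expands to ten copies of the one below it.
    entities = ['<!ENTITY lol0 "lol">']
    for i in range(1, 10):
        refs = ("&lol%d;" % (i - 1)) * 10
        entities.append('<!ENTITY lol%d "%s">' % (i, refs))

    doc = ('<?xml version="1.0"?>\n<!DOCTYPE lolz [\n  '
           + "\n  ".join(entities) + '\n]>\n<lolz>&lol9;</lolz>')

    print(len(doc))            # roughly 800 bytes on the wire
    print(len("lol") * 10**9)  # ~3 GB once a naive parser expands it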

~~~
vigrant
lol ^ 10,000,000,000

~~~
DoublePlusWill
lol * 10,000,000,000

lol ^ 10,000,000,000 is an insane amount larger. :)

------
ctdonath
Considering the nature of modern "art" (e.g. "here's a hard drive containing
$5M in stolen software!"), and owning a "This T-Shirt is a Munition" shirt
(featuring the then-controversial RSA-in-4-lines-of-Perl code), I'm perversely
inclined to find such a "zip bomb" small enough to print as hex or a QR code
on a business card. The 42 kB file is a bit big; any known smaller versions?

~~~
DanBC
(<http://www.maximumcompression.com/compression_fun.php>)

This has a 115-byte RAR file that expands to 5 MB. You can probably experiment
to get a file just small enough for a QR code, with huge output.

(Note that using an obscure compressor gives a 24-byte file that expands to 5
MB.)

------
acd
You can also use the same technique to cause exhaustion of the number of
files/inodes. I.e., you format an ext4 partition with a crazy number of
inodes, then create a zip file containing a crazy number of 0-byte files. The
receiving side has a normally formatted ext3/ext4 partition and runs out of
inodes. This is not nice, so don't do it.
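
A sketch of the mechanism in Python, for illustration only - each zero-byte
entry costs roughly 90 bytes of metadata in the archive but a whole inode on
extraction (the file name and entry count are arbitrary):

    import zipfile

    # Illustration only: a million empty entries, one inode each when
    # extracted. The archive itself is pure repetitive metadata, so it
    # compresses absurdly well if you zip it again.
    with zipfile.ZipFile("inode-bomb.zip", "w") as zf:
        for i in range(1_000_000):
            zf.writestr(str(i), b"")

    # The defence mirrors size bombs: count entries before extracting.
    with zipfile.ZipFile("inode-bomb.zip") as zf:
        if len(zf.infolist()) > 10_000:
            raise ValueError("too many entries; refusing to extract")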

~~~
thisaccount
Wonder what would happen if you hid a symlink pointing to root as one of the
files. Someone, without a doubt, would rm -rf it.

~~~
bct
rm -rf doesn't follow symlinks:

    
    
        /tmp $ mkdir -p a/b/c
        /tmp $ mkdir -p a/d/e
        /tmp $ cd a/b/c
        /tmp/a/b/c $ ln -s /tmp/a/d .
        /tmp/a/b/c $ cd ../../
        /tmp/a $ ls */*
        b/c:
        d
    
        d/e:
        /tmp/a $ rm -rf b
        /tmp/a $ ls
        d
    

It would be pretty stupid for it to.

------
Zenst
Detection of compression bombs has improved a lot compared to over 10 years
ago, when they really did cause problems on mail servers. Home AV software
detects them; crazily enough, my install of GoLang on a Windows box has a
file that gets flagged as a compression bomb on every full system scan.

But examples like this happen in many forms; heck, Windows generating
thumbnails for some file types/sizes has done wondrous things like
exponentially growing the swap file into an ever-impending churned slowdown.

Even computers have mental farts.

~~~
exch
> crazily enough, my install of GoLang on a Windows box has a file that gets
> flagged as a compression bomb on every full system scan.

This may have something to do with Russ Cox's blog post on recursive zip
archives in Go, "Zip Files All The Way Down": <http://research.swtch.com/zip>

Baseless speculation mode: There is a possibility that the recursive zip file
was part of the Go test cases for the gzip package at some point. If it
lingers in the mercurial commit history, it may still trigger hits from your
AV software.

~~~
Zenst
I just had to dig out my logs and see what it was - the file in question is
located in (default install):

C:\Go\src\pkg\regexp\testdata\re2-exhaustive.txt.bz2 - 385 KB in size, though
opening it reveals a .txt file that is 58 MB in size. Basically it's Avast
being picky, and a false positive: the file is probably so compressed that it
hit whatever per-file decompression limit Avast has, so Avast then thinks
it's a compression bomb. It opens and extracts fine, though it's hardly fun
reading.
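
That ratio is easy to check: 385 KB to 58 MB is roughly 150:1, plausibly past
whatever threshold Avast uses. A sketch of that style of heuristic in Python
(the 100:1 limit is a guess, not Avast's actual number):

    import bz2

    RATIO_LIMIT = 100  # guessed threshold

    def looks_like_bomb(path):
        """Flag files whose output exceeds RATIO_LIMIT times their size,
        decompressing no more than the capped amount to decide."""
        raw = open(path, "rb").read()
        d = bz2.BZ2Decompressor()
        d.decompress(raw, len(raw) * RATIO_LIMIT)
        return not d.eof  # output remained beyond the cap

    print(looks_like_bomb("re2-exhaustive.txt.bz2"))  # True at ~150:1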

------
conductor
Web browsers support compressed data. I wonder, will they try to decompress
something like this?

~~~
chuppo
Yes they will, there was a post a while back which used this to bomb the
browser.

~~~
repsilat
Perhaps more concerning is being able to use this to launch a denial of
service attack on a server that accepts zipped data. Gzipped requests are
unusual with HTTP (no idea how widespread support for it is), but iirc SPDY is
compressed by default.
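
The standard defence on the server side is to cap the decoded size rather
than trust the request. A sketch with Python's zlib (the 10 MiB cap is
arbitrary):

    import zlib

    MAX_BODY = 10 * 2**20  # arbitrary 10 MiB cap on decoded bodies

    def safe_gunzip(data, limit=MAX_BODY):
        """Decompress a gzip request body, refusing past `limit` bytes."""
        d = zlib.decompressobj(16 + zlib.MAX_WBITS)  # gzip framing
        out = d.decompress(data, limit)
        if d.unconsumed_tail:  # more output was available past the cap
            raise ValueError("body inflates past limit; likely a bomb")
        return out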

~~~
kijin
Maybe not at the transport or protocol level, but it wouldn't be too hard to
DoS an application server that handles compressed data, such as images.

Make a billion-pixel PNG image that compresses very well, upload several
copies simultaneously to a LAMP server running on an average Linode, and watch
it run out of memory while trying to create thumbnails with GD.

~~~
debacle
PHP usually has a pretty reasonable memory limit set, so it would puke on
itself pretty quickly.

But I don't think you'd bring the site down.

~~~
kijin
Fair enough, but I've been on Linode's forums long enough to have seen dozens
of people running 50 PHP processes with 128MB memory limit each, on a 1GB
server shared with MySQL and a bunch of other crap. (It seems that 128MB is
the new "reasonable memory limit" these days, since that's how much RAM it
takes for PHP to handle photos from 8-to-12-megapixel cameras and
smartphones.)

------
tokenadult
The most authoritative-looking reference

[http://www.aerasec.de/security/advisories/decompression-bomb-vulnerability.html](http://www.aerasec.de/security/advisories/decompression-bomb-vulnerability.html)

in this typically thinly referenced Wikipedia article looks to be several
years old. What is the current state of the art? Some of the comments already
posted as I post this comment talk about the situation "years ago" and at
least one comment suggests that this is largely a solved problem, currently.
How many wild vulnerabilities like this are there, really?

(I ask questions like this about most "facts" reported in Wikipedia articles,
because I am a Wikipedian myself, and I have become painfully aware of how
often the "the free encyclopedia that anyone can edit" becomes "the
encyclopedia in which every fact is just made up.") From a neutral point of
view, is this really much of a problem in day-by-day computer use and online
network use?

~~~
diminoten
This seems like a ripe opportunity for an enterprising blogger to examine the
changes made to how we compress things over the past few years and see
whether any of them impact the potency of the decompression bomb - a kind of,
as you said, 'update to the current state of the art'.

------
jaylevitt
Comp sci folks: Is predicting whether a compressed file will produce a finite
(or, better, reasonably-sized) output roughly equivalent to the halting
problem?

~~~
lmkg
It depends on the decompression algorithm. It's possible for that to be the
case, but this can only happen if the compressed binary format is essentially
a Turing-complete language, for which your decompressor is the interpreter.

I'm not aware of any data formats for which that is the case, but from a
theoretical standpoint, _eval(s)_ is a perfectly cromulent decompression
algorithm. This fact is essentially the starting point for Kolmogorov
complexity.

"Reasonably-sized" is actually an interesting problem in itself. If your
decompressor is sufficiently advanced, you could embed a busy-beaver function,
which terminates but grows faster than any computable function. I have no idea
whether such functions could be expressed with less-than-Turing-complete data
formats.
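
Concretely: with eval as the decompressor, asking "how big is the output?" is
asking "what does this arbitrary program do?". A toy illustration:

    def eval_decompress(archive):
        """The archive is source code; extraction is evaluation. Predicting
        output size for arbitrary input is then the halting problem."""
        return eval(archive)

    archive = "'lol' * 10**6"  # 13 bytes of "compressed" data
    print(len(archive), "->", len(eval_decompress(archive)))  # 13 -> 3000000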

~~~
dsl
RAR has a built-in virtual machine, for forward compatibility with new
compression algorithms.

------
timdorr
While this is out of the range of most consumers, I wonder if any bored
sysadmins with a new storage system to test have tried unzipping that file...

~~~
lucian1900
It'd be easier to do something like cat /dev/urandom > big

~~~
VMG
/dev/zero is probably faster

~~~
fduran
It is way faster (at least with dd):

    
    
      $ time dd if=/dev/zero of=10MB.dat  bs=1M  count=10
    
      real    0m0.213s
    
      $ time dd if=/dev/urandom of=10MB.dat  bs=1M  count=10
    
      real    0m8.873s

~~~
Dylan16807
Or if you want to measure the speed of the source itself:

    
    
      $ dd if=/dev/zero of=/dev/null bs=1M count=100
      104857600 bytes (105 MB) copied, 0.0237114 s, 4.4 GB/s
    
      $ dd if=/dev/urandom of=/dev/null bs=1M count=100
      104857600 bytes (105 MB) copied, 21.501 s, 4.9 MB/s
    

Also dammit Ubuntu with your Gibis.

------
mariuolo
Welcome to 1988?

------
trool
Old as fuck.

~~~
nhebb
>Old as fuck.

So is algebra, and yet, every year millions of people learn it for the first
time.

~~~
misnome
But they don't immediately rush out to tell the world the "News"

~~~
CapnGoat
Yes they do. Every child that learns anything will tell the world about it -
or at the very least everyone in their family.

------
Zelphyr
Inception

------
ryanpers
Awesome link, but already seen it on reddit yesterday. HUH

