
42.zip (2004) - geekam
http://www.unforgettable.dk/
======
0x0
That's nothing. How about an infinite zip file :)

[http://research.swtch.com/zip](http://research.swtch.com/zip)

~~~
ocfnash
Interesting.

I wonder if it might be possible to prove the existence of fixed points of
compression programs using only general assumptions (which would be satisfied
by the Lempel-Ziv implementations the author discusses so well).

Such a fixed-point theorem would probably not tell us _how_ to find solutions
but it would be illuminating if the assumptions could be made weak enough.

------
DanBC
Here are some other compression oddities:
[http://www.maximumcompression.com/compression_fun.php](http://www.maximumcompression.com/compression_fun.php)

A 42 byte file that expands to 5 MB; a file that compresses in one compression
software but not another; a file that compresses to itself; and some extreme
compression software.

~~~
nileshtrivedi
> a file that compresses to itself

Could this be used as DoS against anti-virus such as the one that GMail uses?

~~~
laurent123456
Probably not, as antivirus programs these days all have protections against
zip bombs [1].

[1]
[https://en.wikipedia.org/wiki/Zip_bomb](https://en.wikipedia.org/wiki/Zip_bomb)

------
DyslexicAtheist
10 years ago most virus scanners were vulnerable to it.

actually you can build your own collection of files with cat /dev/zero
>foo.txt ... then zip all those into a final collection of zip's -> n-levels
deep ... resulting individual zip files will be tiny, but once extracted the
file eats massive space (idea is to fill disk on scanning server up and create
I/O errors or keep the server busy with a DOS). Also servers usually limit now
the depth they are willing to scan in a zip to avoid this type of attack
(workaround: only use up to 4 levels ...)

------
derefr
Here's an information-theory question: would there be a way to estimate the
size of the decompressed output, just given the raw zip data and no size
metadata, without attempting to extract it (i.e. without spending time or
space proportional to the compression factor)?

~~~
nawitus
Well, yes. The linked page shows you how to calculate it.

~~~
euank
Your answer is wrong. He says "Just given the raw zipped data and no size
metadata".

The page calculates it using exactly that size metadata he excluded.

The question, as I see it, is if that size metadata (the 4.3 gigs at the
bottom) can be determined from the zip file without unzipping a rather lot of
data.

It would be interesting if someone who knew the exact details of the zip
format could comment.

------
vezzy-fnord
Context:
[https://en.wikipedia.org/wiki/Zip_bomb](https://en.wikipedia.org/wiki/Zip_bomb)

~~~
masklinn
Here's an old page with various compression bomb demos:
[http://www.aerasec.de/security/advisories/decompression-
bomb...](http://www.aerasec.de/security/advisories/decompression-bomb-
vulnerability.html)

It's not just archive formats, anything which uses standard compression
algorithm could be a compression bomb (e.g. PNG)

~~~
angersock
Any examples of a PNG bomb in the wild?

~~~
DanBC
The link in parent comment has instructions for creating PNG bombs and a link
to an ftp site with some downloads.

~~~
angersock
So, do we need to change our file uploaders and whatnot to autokill after a
certain amount of time? Or just set `ulimit -m` or whatever judiciously?

This looks like an annoying attack vector for people that, say, resize image
uploads to make thumbnails.

~~~
masklinn
> So, do we need to change our file uploaders and whatnot to autokill after a
> certain amount of time? Or just set `ulimit -m` or whatever judiciously?

Most image formats specify the size before providing any data, you can just
check that w*h is below whatever limit you deem fit before loading the image
data. The file upload size should already have a "sane" limit so people don't
upload truckloads of data, the problem with compression bomb is that you can
upload a very small amount of data and it eats the target's RAM on decoding.

> This looks like an annoying attack vector for people that, say, resize image
> uploads to make thumbnails.

It is, although for JPEG you don't need to decode the image to resize it
(depending how whatever you're using is implemented it may still do so
regardless).

------
huhtenberg
That's a good test for pass-through antivirus appliances :)

~~~
raverbashing
Makes me wonder what are the implication of the halting problem for
compression formats

~~~
whatshisface
The halting problem only applies to full Turing machines, and their
equivalents. So, a compression algorithm based on x86 assembly is uncheckable,
but one based on a restricted format like .zip would be fine. Or, you could
just give it a time limit.

------
zupa-hu
Gmail won't allow you to send this file to anyone, throws an error pointing
you to
[https://support.google.com/mail/answer/6590](https://support.google.com/mail/answer/6590)

Ah, I almost forgot we are not spied on. Tried via local email client.

~~~
qnaal
> Mac users, sending a zip file which also contains another zip file increases
> your chances of file corruption. We recommend that you de-compress all files
> first and create only one zip file.

what the fuck?

~~~
0x0
Yeah, that's pretty crazy. If zipping a zip causes "corruption", they should
probably file a bug at bugreport.apple.com

------
IvanK_net
All files at the same "level" are identical. The 4GB file contains zeros only.
I think these "data" could be compressed into even smaller size using some
different methods.

~~~
DanBC
It would be interesting to see how small it could be made.

I posted a link earlier, but here it is again with a quote:

[http://www.maximumcompression.com/compression_fun.php](http://www.maximumcompression.com/compression_fun.php)

> 'What is the best compressor to get really extreme compression?' is a
> question often asked on the internet. The achieved compression ratio of
> course depends on the quality of the compressor used, but the type of data
> that is being compressed is much more important. To show this I created a
> tiny 115 byte rar file. When you decompress it, it will turn into a textfile
> of almost 5 MB (a compression ratio of 99.998%). Here it is: test.rar. Note:
> Turtle 0.07 compresses the same file to 49 bytes, Hook09c (with switches 3 1
> 1 1) to 36 bytes and UHBC (with switches -b128m -m3) to 24 bytes!!.

I enjoy watching this "code golf" style competing to optimise one feature of
something.

------
nsajko
There's this old discussion about zip bombs:
[https://news.ycombinator.com/item?id=4616081](https://news.ycombinator.com/item?id=4616081)
A certain Caspian Maclean published a gzip quine on the Web in 2004, but I
cannot seem to find it.

------
phiri
It's funny, first got to know about this a few days ago when I came across
this [1, 2] interesting thread on Quora.

[1] [http://qr.ae/sQnm5](http://qr.ae/sQnm5)

[2] What is the most compressed file ever?

------
flux_w42
NOD32 detects it as an Arch-bomb. Made me think of the somewhat similar
'Billion laughs' XML bomb but for archives.

------
iancarroll
Now we have to ask, how did he create that file without using petabytes of
disk space?

~~~
scragg
I would guess you would just need to create the single 4G file and zip it.
Once it's zipped, copy 16 times, put in a zip file, repeat.

~~~
AlyssaRowan
Precisely that.

It's rather a pain in the ass for antivirus products, which tend to have a
nesting limit to prevent these kind of shenanigans.

There are much weirder things you can do. With knowledge of what's packing it
and if the packer doesn't bail out on expanded files (like, say, LZMA2 used in
XZ does, but the older LZMA doesn't), you can make a small file which tickles
every worst-case performance/size wrinkle in the cruncher and comes out very
large!

I thought executable compressors had reached a plateau until Mentor/TBC &
Blueberry/Loonies came up with crinkler -
[http://crinkler.net/](http://crinkler.net/) \- which is a compressing linker
specialised for demoscene 4ktros, and cheats like a motherfucker.

~~~
caspianm
Interesting, is there a description of how it works or cheats?

------
level09
so what's the content of the final files ? just a bunch of redundant zero's
and one's ?

~~~
escaped_hn
Isn't that the contents of all files?

~~~
taspeotis
You think all file contents are redundant? Deep man, deep.

------
gegenschall
I'm sorry, but I have to say this: I like how this entry had 42 comments
before this one:
[http://i.imgur.com/KmCodSM.png](http://i.imgur.com/KmCodSM.png)

------
gdi2290
42.zip has a weissman score of 42

------
mundizzle
wow, that has a Weissman Score close to the theoretical limit

------
untilHellbanned
got nothing on Pied Piper

