I get the feeling that Marcus Hutter is going to be able to keep some of his money.
EDIT: This apparently isn't the case! Turns out the prize is based on improvements of the current record, not the starting baseline. From the website's FAQ:
> Theoretically you can win more than 500'000€. For instance, if you halve the current record, then halve your own record, and then again halve it, you would receive 3×250'000€
We're still not particularly likely to see a full payout, but it's much more likely than under my initial interpretation.
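If I'm reading the FAQ right, each payout is proportional to the relative improvement over the then-current record. A sketch of my reading (not the official formula, and the record size is only roughly right):

```python
PRIZE_POOL = 500_000  # euros

def payout(old_record: int, new_record: int) -> float:
    # My reading of the FAQ: you get the pool scaled by your
    # relative improvement over the current record.
    return PRIZE_POOL * (old_record - new_record) / old_record

record = 116_000_000  # bytes; roughly the current enwik9 record
total = 0.0
for _ in range(3):    # halve the record three times in a row
    total += payout(record, record // 2)
    record //= 2

print(total)  # 750000.0 -- three halvings pay 3 x 250'000 euros
```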
You can make a highly specific compressor, but it'll need a highly specific decompressor, which will count against your score.
In essence the competition is "create the smallest executable which produces enwik9 as its output".
If you embed 1GB of text in that compressor, then your 'compressed size' will be 1GB (plus whatever else is in the compressor).
The size of the decompressor is included in the total size of the compressed data.
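In other words, the measured quantity is roughly this (file names are just placeholders):

```python
import os

# Score = compressed data + the decompressor needed to reproduce
# enwik9 from it. File names here are hypothetical placeholders.
score = os.path.getsize("archive.bin") + os.path.getsize("decompressor")
print(score)

# So embedding the raw 1 GB text in the decompressor buys nothing:
# it only moves the gigabyte from one term of the sum to the other.
```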
It just seems like they've hit upon a good idea - AI/compression crossover - then shot themselves in the foot by excluding anything genuinely intelligent. I suppose the problem is how to tune the rules to allow AI but not cheating.
> all human knowledge
> genuinely intelligent
> allow AI but not cheating
Those are all vague, hand-wavy concepts: open to disagreement, and in some cases they might turn out not to exist or make sense. Entire research careers have been spent just trying to define these terms, let alone implement them.
Focusing on compression of a particular chunk of Wikipedia eliminates all of that, and gives us a precisely measurable quantity.
Is it a perfect definition of intelligence? No; it was never meant to be.
Is it a runnable, measurable, comparable experiment? Yes.
That's what they're doing (within the constraint that "all human knowledge" ≈ English Wikipedia). More refined models of representing our knowledge will beat weaker ones in this test.
Whether this is AI or intelligence depends on how you define those terms - but to win this competition, your compressor has to anticipate the rest of the corpus fairly accurately based on having seen part of it.
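That link between prediction and code length is concrete: an ideal arithmetic coder spends about -log2 p(c) bits on each character c, so whatever predicts the text best compresses it best. A toy sketch of the bookkeeping (order-0 frequency model, no actual coder):

```python
import math
from collections import Counter

def ideal_code_length_bits(text: str) -> float:
    """Bits an arithmetic coder would roughly spend on `text` under
    an order-0 model of it: each character c costs -log2 p(c)."""
    counts = Counter(text)
    n = len(text)
    return -sum(k * math.log2(k / n) for k in counts.values())

sample = "the quick brown fox jumps over the lazy dog " * 100
bits = ideal_code_length_bits(sample)
print(bits / 8 / len(sample))  # ~0.5 bytes per character

# A stronger model (longer contexts, or a learned language model)
# puts higher probability on what actually comes next, shrinking the
# sum -- exactly the sense in which it "anticipates" the corpus.
```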
But I'm wondering if starting with a 1 TB knowledge file might lead to more interesting AI outcomes, as it would allow larger models to be in play.
But I guess the probability of such a generator existing (within the size constraints) for this specific text is pretty much zero.
E.g. a Minecraft world seed carries very little data compared to the world that the generator deterministically produces from it.
The catch is that only a minuscule fraction of all possible worlds can actually be generated, since there is only a finite number of corresponding seeds.
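A back-of-the-envelope pigeonhole count shows just how minuscule:

```python
# A 64-bit seed can address at most 2**64 distinct worlds, while a
# specific 1 GB file is one of 2**(8 * 10**9) possible byte strings.
seed_bits = 64
file_bits = 8 * 10**9  # bits in a 1 GB file

# Fraction of all 1 GB files reachable from some seed:
# 2**64 / 2**8_000_000_000 = 2**(64 - 8_000_000_000)
print(f"reachable fraction = 2**{seed_bits - file_bits}")
```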
The goal is maximizing the compression ratio, with the aim of advancing AI: an AI system can be seen as a lossy compressor, where some worldly input is compressed into something that we hope mimics "understanding". Improvements in compression may translate to AI advances, and potentially vice versa.
To draw a parallel away from computing:
One could say we are compressors that experience the world and compress it, losing what we don't care about. Improving our compression is basically akin to "knowing more" or "remembering more of the right things". See Schmidhuber.
That's how lossy compression works, but this competition is about lossless compression. It makes me wonder whether the competition should really be about lossy compression, as it's pretty obvious our minds aren't lossless compressors.
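The operational difference is easy to state, and the lossless half is the property the contest actually checks (a quick sketch):

```python
import zlib

data = b"All human knowledge, abridged. " * 1000

# Lossless: decompress(compress(x)) == x, bit for bit. This is what
# the contest verifies against enwik9.
assert zlib.decompress(zlib.compress(data)) == data

# Lossy, crudely: throw away case information. A smaller description,
# but the original can no longer be reconstructed exactly.
lossy = data.lower()
assert lossy != data  # information was irreversibly lost
```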
To me it seems doubtful whether compression of a 1 GB text corpus could benefit from AI even in theory: if you can get it down to about 15 MB without AI, then any AI would have a very tight budget. Perhaps compression of a 1 TB text corpus could benefit from AI in theory, but that doesn't mean people working on AI would have much chance of winning the prize in the foreseeable future, because there might be lots of non-AI techniques that provide a bigger improvement in the short or medium term. (A similar criticism was made against the Loebner Prize in the 1990s; sorry, I can't immediately find a link.)
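For scale, taking that 15 MB figure at face value (it may well be optimistic):

```python
# Back-of-the-envelope: what would "1 GB down to 15 MB" leave for AI?
corpus_bytes = 10**9           # enwik9
compressed_bytes = 15 * 10**6  # the (possibly optimistic) non-AI figure

print(compressed_bytes * 8 / corpus_bytes)  # 0.12 bits per character

# And since the decompressor counts toward the score, any learned
# model's weights would have to fit inside that same 15 MB budget.
```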
Where do I claim my money?
Edit: I have an alternate solution, though it only works some of the time.
However, think about the first part of this comment from an AI point of view. The program itself is the compression. An intelligent agent could discover this kind of program.
Notes: This isn't perfect, since it depends on having network access, reliability, and so on. However, maximally intelligent agents, in my view, know how to balance situational and environmental factors in order to solve a problem.
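Spelled out, the trick is essentially this (the URL is illustrative only, and an offline-execution rule presumably forbids it anyway):

```python
import sys
import urllib.request

# The whole "decompressor": zero bytes of compressed data, it just
# fetches the corpus from the network. The URL is a hypothetical
# mirror of enwik9.
URL = "http://example.com/enwik9"

with urllib.request.urlopen(URL) as resp:
    sys.stdout.buffer.write(resp.read())
```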
The HN guidelines write:
> Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that." -- https://news.ycombinator.com/newsguidelines.html
The article only contains a paraphrasing of the rules, missing these essential details. Per the HN guidelines, it would have been better to post the original link and avoid the confusion that you yourself fell into.
> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.
> Please submit the original source. If a post reports on something found on another site, submit the latter.
The HN rules aren't terrible, but since nobody actually follows them, and nobody ever has for as long as this site has existed, I don't see the point in bandying them about in pointless comments.
Of course, since standard libraries are allowed, the logical way to win the challenge is for someone who has commit access to a standard library to include the text as part of an update to that library.
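At which point the whole "decompressor" collapses to a couple of lines (the module is, of course, hypothetical):

```python
import sys
# Hypothetical module that a future standard-library update happens
# to ship with the corpus baked in.
import enwik9_corpus

sys.stdout.buffer.write(enwik9_corpus.DATA)  # the gigabyte lives in the stdlib now
```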