Illegal Prime Number? (2001) (fatphil.org)
85 points by ColinWright 6 months ago | 45 comments

Some context for people younger than me: This was a Big Thing On The Internet back in 2000. It was lovely, and it was where I discovered how a bunch of motivated nerds on the internet could make a dent.

So anyway, the thing was DVDs were pretty new, and DVDs had "region protection". Cutting corners, this means that data on DVDs could be encrypted with different keys depending on which continent you were on, to allow complicated per-country movie distribution models. You could change the "region" on your DVD player or computer, but only 3 times or so. Of course people could also make unencrypted DVDs, but movie makers didn't do that. Internally the encryption scheme was called CSS, chosen by people who I assume didn't code websites much.

Now, the DVD consortium people (backed by the film industry and hardware vendors) released DVD playing software for Windows and Mac, but not for Linux. This made Linux enthusiasts sad, because they couldn't play DVDs on their computers. You have to remember, this was before the Pirate Bay, broadband, and Netflix. People really wanted to be able to buy or rent these DVDs and play them; this was not a hypothetical wish. It was the only way to get high quality movies in the house.

Then it became clear that the DVD people really weren't going to ship a DVD driver thing for Linux, not even closed source, not even when we all asked really nicely. I'm not 100% sure whether there were no DVD drivers at all on Linux, or simply that they could only play unencrypted DVDs, but you can see where this is going: the Linux people wanted to be able to play real DVDs, the ones they'd buy or rent in town, the vendors wouldn't make the effort, so the Linux folks did it themselves. This was common, especially then, when most hardware vendors couldn't care less about Linux.

So they cracked the CSS encryption.

I mean, of course they cracked it. This was not the warez people, this was the FOSS people. They had a computer with a perfectly good DVD drive and they wanted to watch movies on it. You can't stop these folks.

The MPAA, however, did not like this. They realized that it would just be a matter of time before people would port this code to OSes that have more usage than a rounding error on their bottom lines. So they did what the MPAA does best: hunt witches. At least one guy, a Norwegian 16-year-old, got prosecuted by the state, and it all got messy very fast.

Meanwhile, the angry nerd mob that is the internet didn't sit still. They soon realized that the big Linux DVD player software wasn't the issue, the only issue was that little library called DeCSS which cracked the encryption. Soon, people started hosting DeCSS on their websites, in objection to the MPAA's witch hunt. It wasn't that big, after all.

MPAA then started sending aggressive takedown notices and even lawsuits to ISPs and hosting providers who had customers who hosted DeCSS. Some customers got in trouble, some providers had a spine. It was bad, but this sparked 2 wonderful developments:

First, someone wrote a Perl script called DeCSS that removes cascading style sheets from HTML files. Nobody had a use for it, but lots of people hosted it on their sites. The MPAA sent takedown notices to those as well, and this was much easier for providers to say no to. After all, it was as harmless as any program could get and, let's be honest, it was aptly named.

Second, one of the first serious internet sizecoding competitions got kicked off, because smaller code is easier to distribute in nifty ways. People remixed each other's work until the core DeCSS algorithm was only a single line of code. Gzipped, there was nearly nothing left. The article this thread is about assumes that you're aware of that (all the geeks were at the time) and starts from there with the insight that there's probably a prime number that includes this code and is, therefore, illegal.

People also put this minified DeCSS code in all kinds of wonderful places. One of the best hacks I recall is that you could make the DVD consortium's DNS servers host DeCSS. Because DNS servers cache data from other DNS servers, you could make a TXT record on your domain with DeCSS in it, then look it up via the DVDCCA nameserver, and it would keep a copy. But there were many ways: http://decss.zoy.org/

This was all straight from memory. I probably got some details wrong and I probably missed many great anecdotes. But this was a beautiful piece of collaborative civil disobedience and to my knowledge it was the nail in the coffin of region-protected DVDs.

Another aspect you didn't mention is how the number came to be illegal in the first place. The law for it was signed in 1998 and is one of very few laws that outlaws distributing information regardless of how you got the information.

It outlaws bypassing any DRM, regardless of how trivial it is mathematically, the legitimate purpose of doing so, or even whether the content inside the DRM is copyrighted.


> You could change the "region" on your DVD player or computer, but only 3 times or so.

Before this law the shops selling DVD players (for TVs, not computers) would also give you a sheet of paper with a controller code to turn your machine into a region free player. You'd tap some numbers into the remote control.

After these laws (which were introduced across the world) shops would no longer do this because it was a criminal act.

The fact that I could not legally buy a DVD that was only sold in Japan (never sold in the UK) and play it was really weird and scary. It was now more legal for me to pirate that DVD than to buy a real copy.

So, when people ask me why I pirate so much, DeCSS and DMCA are why.

they did that till the very end.

they just realized it was the ultimate killer feature so it was cleverly removed from the lower end models. the high end ones could always be turned region free. in the very end the ones with multidisk and what not could be factory reset to clear the 3 region change restriction.

Most "normal" TV-compatible DVD players, in the US, at least, were region-locked and had no way to change region (or at least, no such information was distributed with them).

Region-free players existed, mostly from sketchy/low-end sources.

>This was all straight from memory. I probably got some details wrong and I probably missed many great anecdotes.

Wow, that’s pretty impressive! Your comment made a pretty good article by itself; I’d encourage you to take those final polishing steps and publish it as an article or blog post.

> This was a Big Thing On The Internet back in 2000 . . . before the Pirate Bay, broadband, and Netflix.

Just a nitpick, but this was not before Netflix; I remember this vividly because I was called to a friend's house to troubleshoot their problems with Netflix DVDs and it was because their DVD player had been purchased off ebay for cheap and was region-locked.

The super-cheap DVD rentals offered by Netflix were really game-changing for people who were interested in DVD quality but couldn't justify the cost of purchase/rental in my area.

Just to nitpick your nitpick, that Netflix had little in common with the current Netflix. In fact, it supports my story because Netflix was a DVD rental service at the time.

Eg IMO it's fair to say that Motorola phones were there before Nokia. It's a true statement unless you mean Nokia, the well known Finnish manufacturer of bike tyres, rubber boots and television sets.

I'd never have guessed that. Netflix must have been around for quite a while before it became a big thing.

This was circulating on Twitter this week: https://twitter.com/JonErlichman/status/947909247836319744

Here's how old these companies will be turning in 2018:

Snapchat: 7 years

Uber: 9 years

Twitter: 12 years

Facebook: 14 years

Tesla: 15 years

Google: 20 years

Netflix: 21 years

Amazon: 24 years

Apple: 42 years

Intel: 50 years

HP: 79 years

Disney: 95 years

IBM: 107 years

Amazon and Google I knew had been around since the late 90s as I used both of them fairly early on. I still have a couple of coffee cups I got as "Christmas presents" from Amazon in early years as thanks for being a customer.

Facebook I started with while it still required a school address.

It does look though as if I started using Netflix in late 2001. So not that much later.

I remember having a Netflix DVD subscription when I was still using Yahoo search and getting my munchies from Kozmo.com. Funny thing, only in the last few years have new services begun to match the convenience of the original Kozmo. It was a sad, sad day when they burned through their last VC dollar. :)

Did the FOSS people crack it? My recollection is that a software player (made by Xing, no not the German LinkedIn copycat site) accidentally distributed their keys unencrypted: https://www.wired.com/1999/11/why-the-dvd-hack-was-a-cinch/

Then again, encryption of something that the self-contained software will need to be able to decrypt is probably just a small hindrance.

Thank you for this writeup, this was before my time on the internet.

> People remixed each other's work until the core DeCSS algorithm was only a single line of code. Gzipped, there was nearly nothing left

What was the actual number? Can you post it here?

I recall a version that was an ascii string that you could save as a file with a .zip extension and it would unzip into the source.

I think that's in the steganography wing here: https://www.cs.cmu.edu/~dst/DeCSS/Gallery/Stego/index.html

> This gif file contributed by Robert de Bath, contains a surprise inside. Mr. Bath writes: "There are two tiny facts about GIF files and ZIP files you might like to know about: GIF files have their length defined at the start of the file; any bytes after are ignored. ZIP files have a table at the end; anything at the start of the file is ignored. The result is that a file can be both a GIF and a ZIP, just change the extension."
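The trick in that quote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `TINY_GIF` is a hard-coded 1x1 GIF I typed in by hand (not the gallery's image), the member name `hello.txt` is made up, and the helper names are hypothetical. It relies on the standard `zipfile` module tolerating data prepended to an archive, since it locates the table at the end of the file.

```python
import io
import zipfile

# A hand-written 1x1 GIF used as a stand-in "cover" image (assumption:
# any GIF would do here; these bytes are not from the gallery).
TINY_GIF = (
    b"GIF89a"                                  # signature and version
    b"\x01\x00\x01\x00"                        # width = 1, height = 1
    b"\x80\x00\x00"                            # global colour table flag, background, aspect
    b"\x00\x00\x00\xff\xff\xff"                # two-entry colour table
    b","                                       # image descriptor marker
    b"\x00\x00\x00\x00\x01\x00\x01\x00\x00"    # position, size, no local table
    b"\x02\x02D\x01\x00"                       # LZW-coded pixel data
    b";"                                       # GIF trailer
)


def make_polyglot(gif: bytes, name: str, content: bytes) -> bytes:
    """Build an in-memory ZIP and glue it onto the end of the GIF bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, content)
    return gif + buf.getvalue()


def read_hidden(polyglot: bytes, name: str) -> bytes:
    """Open the combined bytes as a ZIP; the leading GIF data is skipped."""
    with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
        return zf.read(name)


combined = make_polyglot(TINY_GIF, "hello.txt", b"hidden payload")
print(combined[:6], read_hidden(combined, "hello.txt"))
```

Written to disk, the same bytes could be named either `.gif` or `.zip`; the sketch only checks the ZIP half and the GIF signature, not how any particular image viewer treats the trailing data.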

I also like the gallery, which contains a variety of code-golf-like discussions of the edge cases of US First Amendment protections.


And I remember someone was selling t-shirts with that string!

I had the RSA "this shirt is a munition" t-shirt that was technically illegal to take outside of the United States due to export controls.

Though, it wasn't 1/100th as ugly as this block of graffiti spam.


Check out the gallery of css


k·256^211 + 99

...where k is the decimal representation of the original compressed DeCSS file.

(just for context)

Note the implication: the prime basically is the original program represented as a binary number, then shifted 211 bytes (i.e., 00000000 appended 211 times). Then 99 is added to that number. Note that 99 is representable by a single byte.

Consequently, if you take that prime number and chop off the last 211 bytes, you end up with the original program.
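Here is a small Python sketch of that shift-and-chop idea, as I understand it from the formula above. The payload is a made-up stand-in (not the real compressed file), and `wrap`/`unwrap` are hypothetical names. Note that the sketch does not check primality: the result is only prime for the specific k that was actually used.

```python
PAD_BYTES = 211   # number of zero bytes appended, from the formula above
OFFSET = 99       # small value added at the end; fits in the last byte


def wrap(payload: bytes) -> int:
    """Read the payload as a big-endian integer k and return k*256^211 + 99."""
    k = int.from_bytes(payload, "big")
    return k * 256**PAD_BYTES + OFFSET


def unwrap(n: int, payload_len: int) -> bytes:
    """Chop off the last 211 bytes and turn what is left back into bytes."""
    k = n >> (8 * PAD_BYTES)
    # payload_len is needed because leading zero bytes vanish in an integer.
    return k.to_bytes(payload_len, "big")


payload = b"stand-in for the compressed file"
n = wrap(payload)
print(unwrap(n, len(payload)) == payload)
```

The 211 bytes of padding presumably gave the author room to hunt for an offset (here 99) that makes the whole number prime without touching the payload bytes.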

That's it I'm calling the cops!

> Internally the encryption scheme was called CSS, chosen by people who I assume didn't code websites much.

"CSS" in this case stands for "Content Scrambling System." It may well predate the common use of CSS (stylesheets) on the web. They both originated around 1996.

I still have that t-shirt with the decss code on the back.

Sadly the region lock thing still exists in the Blu-ray age.

pi, e, √2 are thought to be normal [0], which would make them disjunctive sequences [1]. This means that every finite string appears as a substring in them, including for example all of Shakespeare's works and the DNA of every person [2].

In particular they contain this illegal prime number, and the gzipped and non-gzipped versions of this program in every programming language possible.

Does that mean that they may become illegal someday if they are proven to be normal?

[0] https://en.wikipedia.org/wiki/Normal_number

[1] https://en.wikipedia.org/wiki/Disjunctive_sequence

[2] http://sprott.physics.wisc.edu/pickover/pimatrix.html

(edited to add links)


What does it mean to "contain" information?

These discussions often use an implicit definition "if you enumerate an infinite series of digits according to some rule, you will eventually generate any arbitrary string". But that'll never fly legally. Under this definition, the boring old system of counting up from 0 also "contains" every number, and in fact is a much more efficient system for doing so. Pickover's pi-based mysticism loses its magic a bit!

On the other hand, the practical, everyday sense is "meaningful information can be extracted, with an input of information very much smaller". Hard drive images "contain" files, requiring only a few dozen bytes to specify which one. Encrypted files likewise generally only require a few dozen bits to recover. This is much more legally relevant, as it allows the possessor of the "container" to act on its contents.

Pi does not qualify, as the index to any meaningful information in its digits will be far larger than the information itself. Writing "pi" on the back of my hand will not help me cheat on my Shakespeare exam. It's merely a highly inefficient coding scheme.


> πfs is a revolutionary new file system that, instead of wasting space storing your data on your hard drive, stores your data in π!


Thanks for the reference. Even the github issues are fun to read. In particular this one is relevant here: PiFS installs large volumes of objectionable content and copyright violations (https://github.com/philipl/pifs/issues/2)

You can easily iterate over all binary strings to generate your illegal number too. That way you don't need to rely on the normality of transcendental numbers. If you concatenate all natural numbers written in base 2, you even get a number that's normal in base 2. There's no proof that it's normal in most other bases, though.
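A quick Python sketch of the "just count in binary" version, for anyone who wants to poke at it. The function names are my own, and the search bound uses a simple observation: the bit string with a `1` stuck in front is itself the binary form of some integer, so the string must show up by the time the count reaches that integer.

```python
def binary_concatenation(limit: int) -> str:
    """Concatenate the binary forms of 1, 2, ..., limit: '1' '10' '11' '100' ..."""
    return "".join(format(i, "b") for i in range(1, limit + 1))


def first_index(bits: str) -> int:
    """Return where `bits` first appears in the concatenation 1 10 11 100 ..."""
    # "1" + bits is the binary numeral of this integer, so counting up to it
    # is guaranteed to include `bits` as a substring.
    limit = int("1" + bits, 2)
    return binary_concatenation(limit).index(bits)


for target in ("0", "111", "0000", "101010"):
    print(target, first_index(target))
```

Keep the targets short: the guaranteed bound grows like 2^n for an n-bit string, so the position needs roughly as many bits to write down as the string itself. That matches the point made earlier in the thread about the index being no smaller than the information.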

Not to mention there's a full archive of all tentacle porn ever produced in there.

An illegal prime reminds me of the article "What Colour are your bits?" from Matthew Skala http://ansuz.sooke.bc.ca/entry/23

I didn't understand what made this illegal from the article, but http://primes.utm.edu/glossary/page.php?sort=Illegal covers it in more detail. Very interesting.

tl;dr- Programming code serialized to binary seems like a Base-2 number to them, so they were looking for "illegal" C scripts that would be prime when read as a Base-2 number.

They give the example of a C script that breaks an old DRM crypto scheme.

It seems kinda silly since primes have a fairly regular distribution, so all they need to do is:

1. Pick any program they want to be "prime".

2. Serialize it and check if it's prime. If so, done; if not prime, continue.

3. Perform some minor code modification that doesn't change the script's functionality, e.g. tweaking a variable name or the content of a comment.

4. Return to Step (2), looping until a prime representation has been found.
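The loop in steps 1 to 4 might look roughly like this in Python. Everything here is an assumption for illustration: the "program" is a toy one-liner, the "harmless modification" is a trailing comment with a counter, and the primality check is a Miller-Rabin probable-prime test with fixed small bases rather than a proof of primality.

```python
def is_probable_prime(n: int) -> bool:
    """Miller-Rabin with fixed small bases (probabilistic for very large n)."""
    small_primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    if n < 2:
        return False
    for p in small_primes:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for a in small_primes:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # base a proves n composite
    return True


def find_prime_variant(program: bytes, max_tries: int = 100_000):
    """Append ' #<counter>' comments until the bytes, read as an int, look prime."""
    for i in range(max_tries):
        candidate = program + b" #" + str(i).encode("ascii")
        n = int.from_bytes(candidate, "big")  # step 2: serialise
        if is_probable_prime(n):
            return candidate, n
    return None


result = find_prime_variant(b"print('hello')")
print(result)
```

For a number of b bits, very roughly one candidate in 0.7·b is prime, so a toy input like this finishes quickly; a real multi-kilobyte program means far fewer hits and much slower tests per candidate.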

There's already a 1401-digit prime that contains the gzip of the source code of DeCSS. He wanted to do more.

* 3. Append a NULL byte.

* 4. Serialise and check if prime; if so, then done.

* 5. Check if any byte value makes it prime when appended. If so, append it; else append NULL. Repeat until done.

Step 2 (checking if huge numbers are prime) is a hard problem.

Additionally, the author wanted to find a prime that encoded the program that was also notable for some other reason (to give people a legitimate pretext to host it). He decided to try to make it one of the largest primes ever discovered, and it was, though it no longer is.

FYI, primality testing can be done in deterministic polynomial time (it is in P, via the AKS test, not just NP). So it is not a hard problem.

Interesting that the prime pages link is dead, and they didn't include the prime number on their own site.

The link may be dead, but the site still exists. Here's the 1401-digit prime:


A very interesting number. The first time I learned about it was through this Youtube video https://youtu.be/LnEyjwdoj7g.

See also related Wikipedia article: https://en.wikipedia.org/wiki/Illegal_prime

So, has the law changed? What's its impact today?

DMCA (and similar in other countries) is still in force. It's still enforced, but selectively.

The UK has an approach which is almost sensible, but ends up being stupid. IP holders have rights to protection from illegal copying, and that's why law about circumvention of technological measure was introduced. Consumers also have rights to make copies in some situations, and those rights need to be protected. So consumers can ask the rights-holders for an exception, and can then go to the secretary of state if the IP holder declines to provide an exception.


