There are several past threads on this article, but this comment about how the DeCSS illegal prime was created seemed the most interesting: https://news.ycombinator.com/item?id=1045394
E.g. publicizing the number and expiration date of Bill Gates' credit card is illegal, but what if we wrote a text commentary about Shakespeare's Hamlet that, when SHA-1 hashed, gives you that same information? Yes, it would be very computationally intensive to generate a plausible literary text, but it could be done. Or one could embed Bill Gates' credit-card number as "transaction amounts" on the Bitcoin blockchain.
The theme is the same: find a "legitimate" information channel to transmit illegitimate data. This then allows the philosophical arguments about information:
- How can commentary about Shakespeare be illegal? It's education!
- How can you censor transactions on the bitcoin blockchain? There are already 100,000+ copies of it on nodes around the globe! And so on...
The law cares about intent and outcome. If you intended to publish Bill Gates' credit card number and did so, and if doing so was a crime, you'd be guilty of that crime. This is true regardless of how you published the information. There is no "out" for putting it behind a pretext.
It's why sharing child pornography is illegal, even though all the creators are really doing is sharing a set of instructions for someone else's computer to generate the image/video.
For copyright, it really doesn't. Hence Disney threatening to sue daycares for having Disney character murals on their walls.
While we have since had the four factors of fair use introduced, it still doesn't care about intent; it cares about how the work is used/transformed, but that's different than intent.
If you take copyrighted material and put it on a daycare mural, then you _intended_ to put copyrighted material on a daycare mural. It is not about intent to damage, it's just pure intent.
They intended to make referential work to characters in our modern common culture/mythology, as people have done since time immemorial. Some rent-seekers merely think they get to own ideas now.
> Some rent-seekers merely think they get to own ideas now.
Are you quite serious? Disney doesn't think they own coming-of-age tales or intrepid children feeling out of place. Disney thinks they own the things they paid to create during the operation of their business.
The daycares aren't making up their own stories; they're directly benefiting by using Disney's products without licence or permission. This is the exact opposite of seeking.
The point is Disney shouldn't really be able to "own" the image of Mickey Mouse for over a hundred years just because they paid for the creation of some initial image.
What you're saying is that action = intent. Meaning intent is immaterial; the action is proof of intent. I.e., the opposite of caring about intent.
To prove intent you'd have to prove the daycare knew the works were copyrighted (i.e., the distinction between public domain Snow White, and Disney's Snow White), knew the copyright law sufficiently to know this was infringing (i.e., painting on the interior of a daycare where you're clearly not economically benefiting), and to choose to do it anyway.
Hence why copyright law doesn't take intent into account; 'ignorance of the law is not a defense'.
This isn't the intent being referred to. They aren't referring to "intent to violate copyright law".
If your intent is to comment on Shakespeare, and it happens that what you produced, when run through some process, produces something under copyright, but the thing you intended to express doesn't, I don't think that would violate copyright law? Or at least, no one would convict you.
Also, my understanding is that if you can prove that your creation of something was independent of someone else’s creation of the same thing, it isn’t a copyright violation?
> It's why sharing child pornography is illegal, even though all the creators are really doing is sharing a set of instructions for someone else's computer to generate the image/video.
The strange matter is that in the U.S.A., mens rea is an element of the crime of sharing and possessing child pornography, meaning that one can only be convicted if one knew, or could reasonably be expected to know, that the actor was under a certain age. This is not the case for actual sexual intercourse with a minor, where one is criminally liable even if one could not be expected to know.
In most jurisdictions, mens rea is an element of both.
The number itself is not enough to have the encoded information; you also need the details of the encoding/decoding scheme. So if you had this 'commentary on Shakespeare' by itself, it's no more interesting than if you had an encrypted copy of Bill's credit card number but not the password to decrypt it. Having that on a T-shirt doesn't seem quite so outrageous though.
However, if you have, or circulate, this number along with an implementation of the decoding algorithm, then you're actually circulating the credit card number itself in a much more direct way. Yes, I know they also publish the unpacking algorithms, but if I walked around with a T-shirt with the encoded file on the front and the decoding program on the back, that's clearly a much more direct and obvious transgression.
The joke about these illegal primes is that by themselves they seem innocuous, they're just seemingly meaningless numbers, but put the two halves of the puzzle together and the problem becomes a lot clearer. Handing out an unreadable, encrypted, encoded number with the latest Marvel movie in it but no way to play it seems harmless, hand it out with a bundled video player as well and that's a problem. Now it's a viewable movie and that's a different kettle of fish.
A friend of mine got very excited when we were CS students because he found a way to theoretically 'encrypt and compress' data to extremely small sizes, until I pointed out the binary compression key would end up being vastly bigger than the 'compressed' file. Basically he was just 'sucking' the complexity out of the data file and putting it in the key. These numbers are a bit like that, a large amount of the information in the 'encoded' content is actually sequestered in the unpacking algorithm.
With the right 'decoding scheme' the number 5 is an encoded copy of Infinity War. Wow, 5 is now an illegal prime!
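A rough sketch of that point in Python, assuming a toy XOR "decoding scheme" (the names `make_key` and `decode` and the stand-in payload are made up for illustration, not anyone's real scheme). The "encoded" file is the single byte 5, and the key ends up exactly as long as the payload:

```python
def _pad(token: bytes, length: int) -> bytes:
    """Right-pad the token with zero bytes up to the given length."""
    return token.ljust(length, b"\x00")


def make_key(payload: bytes, token: bytes) -> bytes:
    """Build a key that turns `token` into `payload` under XOR."""
    return bytes(p ^ t for p, t in zip(payload, _pad(token, len(payload))))


def decode(token: bytes, key: bytes) -> bytes:
    """Recover the payload from the tiny token plus the (large) key."""
    return bytes(k ^ t for k, t in zip(key, _pad(token, len(key))))


# Hypothetical stand-in for a large file; any bytes would do.
payload = b"stand-in bytes for a whole movie"
token = bytes([5])  # the "compressed" file: just the number 5
key = make_key(payload, token)

print(len(token), len(key))           # token is 1 byte, key is payload-sized
print(decode(token, key) == payload)  # the key carries all the information
```

All the information sits in `key`; swapping 5 for any other number changes nothing about what gets distributed.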
Because "illegal information" is an entirely artificial concept that doesn't hold if you think too much about it. You can't exactly "own" information, neither can you "destroy" it.
There is no right and wrong in this world; there are only parties with enough power to enforce their will upon others. Some of those parties believe that what they do serves some public interest.
Laws only work when the majority of people is on board with them, and there's a misbehaving minority against which they're enforced. It's most certainly not the case with the copyright law, at least in its current form. Or, for that matter, any other oppressive laws that aim to manipulate people through fear into doing (or not doing) something elites want.
If you're the only person that knows a particular sequence of letters or numbers you own it.
If you remove all knowledge of that information you destroy it. And by "all knowledge" I mean every known way to generate it, not just a simple representation.
There is nothing special about any single 160-byte string, but the set of all of them contains every SMS ever sent. Some of those messages are of special interest to some people.
I'm replying to myself to address some of the skeptical comments about my scenario because I didn't make it clear what I was trying to communicate.
I was trying to emphasize the people sharing the legitimate data and not the person who generated it from illegal data.
For example, the SHA256("Hamlet was a prince in Denmark.")=="4461565240549538f8888f5ac829800a79763479f28418dec7a13087a3e0d2a8".
Let's say the first 16 hex digits ("4461565240549538") is an "illegal number" ... e.g. somebody's credit-card or whatever.
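For anyone who wants to reproduce that step, a minimal Python sketch using the standard `hashlib` module (the helper names `sha256_hex` and `leading_hex` are mine, chosen for illustration):

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of the UTF-8 encoded text as hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def leading_hex(text: str, digits: int = 16) -> str:
    """Return the first `digits` hex characters of the SHA-256 digest."""
    return sha256_hex(text)[:digits]


sentence = "Hamlet was a prince in Denmark."
digest = sha256_hex(sentence)
prefix = leading_hex(sentence)

print(digest)  # full 64-character hex digest
print(prefix)  # the 16 hex digits treated as the "illegal number"
```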
Yes, the intent of starting with "4461565240549538" and using brute force to eventually find a plausible-sounding sentence that creates a hash with that 16-digit number can be prosecuted. Likewise, you can't just XOR a Blu-ray mp4 file with digits of pi and claim "it's just a random number" to be immune from copyright laws.
I'm talking about the people downstream that do something like this by sharing on reddit, Twitter, etc:
- I ran across some Shakespeare trivia: "Hamlet was a prince in Denmark." I don't want anybody to hash it. I just thought that was interesting to know.
- Streisand Effect[1] the above tweet or forum post
- now everyone debates: How can the sentence "Hamlet was a prince in Denmark." be illegal?!? It's a factoid about a play that you can't remove from everyone's brain!
We're now past the point of the criminal who brute forced the illegal data to publish it. I'm talking about the non-criminals sharing it. That's the information hack: shift the illegal data into the realm of the legal data.
Maybe that hack hasn't been fully tested in courts, but it seemed to work if we look at the DeCSS code repackaged as an interesting prime number. Neither Chris Caldwell nor Phil Carmody has been put in jail.
It can also be especially effective if the would-be criminal who generated interesting hashes or prime numbers did it anonymously such as Satoshi Nakamoto's generation of the bitcoin genesis block. There's now no obvious person to prosecute.
I think this is less a philosophical argument than a legal one. If a court can prove that you explicitly owned the illegal data in its raw format, and used that to generate something (even if it has secondary value), they should be free to prosecute in my opinion. That includes searching through and highlighting data that already exists in the wild such as large prime numbers. Presumably showing statistics of how often random data would decode to someone's credit card number would prove this beyond reasonable doubt.
The fact all data exists 'in nature' - while interesting, is not something I think should override the intent of data laws.
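A back-of-the-envelope version of those statistics, as a Python sketch. It assumes the hash output behaves like uniformly random data (an idealization) and that the "illegal number" is a specific 16-hex-digit prefix, as in the example elsewhere in the thread; the function name is made up:

```python
from fractions import Fraction


def accidental_match_odds(hex_digits: int) -> Fraction:
    """Chance that one uniformly random hash starts with a given hex prefix."""
    return Fraction(1, 16 ** hex_digits)


odds = accidental_match_odds(16)

# 16 hex digits are 64 bits, so the odds are exactly 1 in 2**64.
print(odds.denominator == 2 ** 64)
print(float(odds))
```

Odds like that are why an "accidental" match is not a serious explanation for any single text.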
Yes. A commentary on Shakespeare that just happens to hash to Bill Gates' personal details is so monumentally unlikely to occur by accident that its existence constitutes proof of intent beyond all reasonable doubt. Disingenuously asserting that "it's education!" is a bit like the mafia boss asserting that no, the front business really is a laundromat.
It's really amazing how many experts and professionals in information technology seem to understand what information is less well than stodgy old lawyers and politicians.
>The SHA-1 pre-image resistance would prevent that.
No, I'm talking about computing something much less ambitious than solving pre-image resistance.
Consider the example SHA-1 hash[1]:
SHA1("")
gives hexadecimal: da39a3ee5e6b4b0d3255bfef95601890afd80709
gives Base64 binary to ASCII text encoding:
2jmj7l5rSw0yVb/vlWAYkK/YBwk=
Embedded in that hex is "95601890" ... which could be somebody's driver's license number. Now that the number is publicly visible via some legitimate purpose other than illegally publishing driver's license numbers, we can then argue "How can a SHA-1 of an empty string be illegal?!?"
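This is easy to check with Python's standard `hashlib` and `base64` modules; a minimal sketch (the variable names are just for illustration):

```python
import base64
import hashlib

# SHA-1 of the empty byte string.
digest = hashlib.sha1(b"").digest()
hex_digest = digest.hex()
b64_digest = base64.b64encode(digest).decode("ascii")

needle = "95601890"  # the substring treated as a "license number"

print(hex_digest)               # da39a3ee5e6b4b0d3255bfef95601890afd80709
print(b64_digest)               # 2jmj7l5rSw0yVb/vlWAYkK/YBwk=
print(hex_digest.find(needle))  # offset of the substring within the hex digest
```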
It's a similar concept to creating "vanity" public hashes for Tor or bitcoin addresses or finding a hash that has a certain number of leading zeros in "proof of work". Embedding arbitrary illegal information that way is more computationally feasible with brute force than perfect pre-image attacks.
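A minimal sketch of that kind of brute force in Python, assuming a toy setup: the function name, the text template, and the 3-hex-digit target are all made up for illustration. It varies a counter inside an otherwise fixed sentence until the SHA-1 hex digest starts with the wanted prefix:

```python
import hashlib


def find_text_with_prefix(prefix: str, template: str, max_tries: int = 1_000_000):
    """Try template.format(0), template.format(1), ... until SHA-1 starts with prefix."""
    for n in range(max_tries):
        candidate = template.format(n)
        digest = hashlib.sha1(candidate.encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return candidate, n
    raise RuntimeError("no match found within max_tries")


# A 3-hex-digit prefix needs about 16**3 = 4096 attempts on average.
text, tries = find_text_with_prefix("abc", "Hamlet trivia #{}")
print(text, tries)
print(hashlib.sha1(text.encode("utf-8")).hexdigest())
```

Each extra hex digit multiplies the expected work by 16, which is why this is workable for short strings and hopeless for a full 40-digit SHA-1 digest.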
The underlying idea of the hack is the same: find some way to make illegal data be interpreted in a totally different way that _is_ legal. It's very hard to do for large amounts of information such as a 9 GB mp4 rip of a Blu-ray disc -- but easier for small amounts of information... such as the DeCSS C source code converted to an "interesting" prime number.
"95601890" is just an arbitrary number, the same as any other.
Additional metadata, such as "it's a driver's license number" or "it relates to person X", gives it context and makes it information. 4037 is a bank PIN for millions of people, but as long as you don't combine it with a person it's just a 4-digit number. Once you combine it with a person, it becomes information that is protected in many countries.
This is a collision attack, much weaker than pre-image. It means that you can make two different texts with the same hash, but you can't create a text that matches a specific hash.
Even MD5 has no practical pre-image attack against it. That means, for example, that it is safe in that respect for storing passwords (though not recommended for other reasons).
This might be thought of as a human version of a covert channel: https://en.wikipedia.org/wiki/Covert_channel