
Let's say this service is generating truly random bytes. If a subset of the bytes that you download happens to describe, say, the last episode of Mr. Robot encoded in MP4, does that actually constitute piracy?


No. Copyright only covers copying. If two people independently create the same work, they both have rights to that work.

Though, note that what you're suggesting is basically impossible for a work of any meaningful size, and the act of searching for a specific work in a random stream probably would prevent it from being considered an independent creation.

http://www.techpatents.com/Blog/independent-creation-is-a-de...


See also "Pierre Menard, Author of the Quixote" by Borges (https://en.wikipedia.org/wiki/Pierre_Menard,_Author_of_the_Q...) about the differences between Menard's and Cervantes' Don Quixote. The two texts are 100% identical, but one was written in the era of its setting and the other in modern times. The modern 'copy' has a different meaning because of its author's different circumstances, despite the content being syntactically exactly the same. Really interesting concept, like all of Borges, really!


This is getting even more offtopic, but I'm kind of curious if anyone knows whether it would be illegal to distribute a copy of something copyrighted if it's "encrypted with itself" -- for example, an encrypted eBook with plaintext metadata that tells you how to derive a decryption key from a long sample of words in the book. In theory the copy is random bytes until you possess the book, which should mean you have the right to possess the copy for personal/backup/interoperability purposes. (The use case in mind, which perhaps applies best to books, is being able to turn physical media into high-fidelity, open digital formats that can be distributed lawfully, without the cooperation of the copyright holder.)


> In theory the copy is random bytes

There is not any such thing as random bytes. Here's a byte: 0xca. Is it random?

Well, if I got it from a fair die, perhaps. But if I got it by printing the hex of a Java program, of course not. Bytes cannot be random, they can only be created in a random way (or not).

In your hypothetical you are not creating the bytes in a random way, you are creating them from a book. So the bytes have nothing to do with so-called "random bytes", but are in fact a derivative work of the book, at the very least.


I guess this is bit color, then (a similar "random" example is used in the Colour essay somebody else linked.) Thanks for mentioning "derivative work", it's a good key word to look into this further. After skimming https://en.wikipedia.org/wiki/Derivative_work I guess one question is, is applying cryptography "transformative"?

To take it in a slightly different direction, since parody has been upheld, what about a deterministic, reversible, mechanical parodic transformation? AFAIK machine-created works can't be given a copyright, but I'm not asking for that, merely that a machine can create a sufficiently transformative derivative work.

In fact, maybe it doesn't even need to be parody. In https://en.wikipedia.org/wiki/Perfect_10,_Inc._v._Amazon.com.... search engines were permitted to serve thumbnails of copyrighted images because the copies "served a different function than [the original] use – improving access to information on the Internet versus artistic expression." In the same way, an e-book could be "improving access" rather than offering "artistic expression."

I guess establishing "transformation" is a bit tricky -- presumably ASCII-encoding an English book's text is not transformative. But it seems like somehow you could come up with something interesting and legal, even if it's not quite like my original formulation. Google Books' excerpts are probably like this.

Oh, and after re-reading the Colour article, I see I am not even all that innovative: http://monolith.sourceforge.net/


Now you're just getting into semantics. Do you have a problem with the term 'random number' too? Because that's all a byte is, a number between 0 and 255.


As a matter of fact, I do; as do many people who take cryptography seriously:

> Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin. For, as has been pointed out several times, there is no such thing as a random number -- there are only methods to produce random numbers, and a strict arithmetic procedure of course is not such a method. - John von Neumann

But to your broader point, no, this is not a question of semantics. There is an actual syntactical difference being asserted here.

That difference is this: randomness has nothing to do with numbers. 0xca--the same number--can be either "random" or "non-random" depending on how you got ahold of it.

When we talk about numbers--prime numbers, even numbers, rational numbers, etc., we are talking about properties of the number; whether they are divisible by 2 and so on. But "random numbers" have no property by which we could recognize them. It is entirely a question of where the numbers were made, not what they are or what they look like.
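A minimal sketch of that point, assuming the "hex of a Java program" refers to the 0xCAFEBABE magic number that opens every Java class file (the function names are made up for illustration). The same value 0xca comes out of a random process and a deterministic one, and nothing about the value itself records which:

```python
import secrets

def from_random_process() -> int:
    """One byte from the OS random source (a random *method*)."""
    return secrets.token_bytes(1)[0]

def from_java_class_file() -> int:
    """First byte of the Java class-file magic number 0xCAFEBABE (not random at all)."""
    return bytes.fromhex("cafebabe")[0]

fixed = from_java_class_file()

# Keep drawing until the random process happens to emit the same value.
# On average this takes about 256 draws.
drawn = from_random_process()
while drawn != fixed:
    drawn = from_random_process()

# The two integers are indistinguishable; only their provenance differs.
print(hex(drawn), hex(fixed), drawn == fixed)
```

Any test that looks only at the integer must give the same answer for both, which is why "random" describes the procedure rather than the number.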


IANAL, but I think it would be illegal. Bits have flavour. Even if you generate them randomly, the intent of matching them against existing copyrighted bits is enough to give them the flavour that gets you in trouble.


Colour of bits [0] was an eye-opening read.

0: http://ansuz.sooke.bc.ca/entry/23


Does this also apply to sha256(a_copyrighted_work)?

Note that retrieving any "useful information" about the original is believed to be similarly impossible from sha256(a_copyrighted_work) and encrypt_{sha_256(a_copyrighted_work)}(a_copyrighted_work).
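To make the two constructions concrete, here is a toy sketch of "encrypting a work with its own hash". The keystream (SHA-256 in counter mode, XORed over the data) is an assumption chosen to keep the example dependency-free; it is an illustration, not vetted cryptography, and the function names are hypothetical.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key || counter. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def encrypt_with_itself(work: bytes) -> bytes:
    """Encrypt a work under a key derived from the work itself."""
    return xor_with_key(work, hashlib.sha256(work).digest())

# Placeholder bytes standing in for a copyrighted work.
work = b"placeholder text standing in for a copyrighted work"
blob = encrypt_with_itself(work)

# Only someone who already holds the work (or its hash) can derive the key.
key = hashlib.sha256(work).digest()
recovered = xor_with_key(blob, key)
print(recovered == work)
```

Note that the output is fully determined by the input work: everyone who encrypts the same book gets the same blob, so the bytes are derived from the work rather than drawn at random.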


>In theory the copy is random bytes until you possess the book, which should mean you have the right to possess the copy for personal/backup/interoperability purposes.

In what theory? We can very well theoretically conceive that you just ask for those "sample words" from someone else who has the book, either legitimately or illegitimately, and he gives them to you.

Even if the set of sample words is different for each metadata file, the book would be the same, so you just need a person willing to share the words with someone that doesn't own the book.


Yes, but in such cases you can also just borrow the entire book, photocopy it, and keep the copy. It's still legal to loan books or own photocopiers, but using them together without a license is illegal. The point is to make it so that the distributors can reasonably argue they are not actually distributing copyrighted material and have made some effort to ensure the recipients have a license to the work, and that there are no reasonable physical prohibitions the law or copyright holders can impose to stop this (e.g., banning the sharing of books isn't reasonable -- for the moment). It's the responsibility of the decrypting party to only decrypt lawfully, just as it currently is with disc copying/ripping.

Now, an actual lawyer might obliterate this through some logic I don't know, and/or appeal to "bit color" as others are pointing out.

P.S. I do have to admit my idea reminds me of warrant canaries, a "clever hack" around the law that I ultimately believe can't really be lawful.

P.P.S. CleanFlicks et al may be relevant case law, as from what I understand they made varying efforts to "ensure" their customers held a license, but were basically still distributing copies of works they didn't own.


>Yes, but in such cases you can also just borrow the entire book, photocopy it, and keep the copy.

Yeah, but this way you get the electronic version that you want + the key (from some other user). In the end it comes down to those distributing the keys, and it's no different from sites offering trial versions while others give out serials and cracks.

>The point is to make it so that the distributors can reasonably argue they are not actually distributing copyrighted material

Just because it's encrypted it doesn't mean it's not copyrighted material. If I encrypt "Star Wars" and give it for download (letting others give the key), they'll still be all over my ass.


What if it's not random but π?

https://github.com/philipl/pifs


> What if it's not random but π?

π is known to never repeat (that is, it's irrational); it is not, however, known to be normal.

That means the digits in its decimal expansion (to pick a base) are not known to be normally distributed. This means that it is not necessarily the case that every possible sequence of decimal digits is present in π.

For example, for all we know it's the case that beyond a certain (massively huge) number of decimal places, π never contains another '7'. That would render some sequences impossible, while still preserving the proven-to-be-true property that π never repeats itself in its entirety (that is, that it's irrational).


> That means the digits in its decimal expansion (to pick a base) are not known to be normally distributed.

You mean uniformly distributed. Being a normal number states more than that: every finite sequence of digits of a given length occurs with the same limiting frequency.

While you are right that Pi isn't proven to be normal, constructing a normal number isn't exactly hard. I think the following number is proven to be normal (binary representation, I added spaces for clarity):

0.0 1 00 01 10 11 000 001 010 011 100 101 110 111 ...

It's clear to see that this number contains all possible finite binary sequences.
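A short sketch of that construction: enumerate every binary string in order of length (all 1-bit strings, then all 2-bit strings including 00, then all 3-bit strings including 000, and so on) and concatenate them. The function names are made up for illustration, and the code only demonstrates the "contains every finite sequence" property, not normality.

```python
from itertools import count, islice, product

def binary_strings_in_order():
    """Yield '0', '1', '00', '01', '10', '11', '000', ... forever."""
    for length in count(1):
        for bits in product("01", repeat=length):
            yield "".join(bits)

def expansion_prefix(n_blocks: int) -> str:
    """Digits after the binary point formed by the first n_blocks strings."""
    return "".join(islice(binary_strings_in_order(), n_blocks))

# 2 + 4 + 8 = 14 blocks covers every string of length 1, 2 and 3.
prefix = expansion_prefix(14)
print("0." + prefix)

# Every 3-bit pattern necessarily appears, because it was written down verbatim.
all_three_bit = ["".join(bits) for bits in product("01", repeat=3)]
print(all(pattern in prefix for pattern in all_three_bit))
```

The same argument works for any length n: the string you are looking for is one of the blocks, so it shows up no later than the end of the length-n group.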


How is searching for something in a random stream equivalent to copying it?


It's hard to call it "independently created" if you used a copy of the work in question as an input to your algorithm.


At some point the copy of the work you use to either compare or generate a comparison hash might be considered copying.

The maths tend to work against you as well.


At least for me it is all zeros (0x00). Except when I downloaded the entire file for the first time, there I first got zeros (0x00), then zeros (0x30) separated by two windows newlines (twice 0x0D 0x0A) and reached infinity after 544,930 bytes.


Mandatory "What Colour are your bits?" post.

http://ansuz.sooke.bc.ca/entry/23


Of course it does, that's equivalent to what torrents do. The "index" and "length" would constitute the copyright violation.

In the same way torrents are lists of hashes, which are used as keys to find byte blocks in the "cloud" of torrent clients.
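A rough sketch of that analogy, under loud assumptions: the piece size is tiny, the helper names are invented, and blocks are looked up purely by hash to mirror the description above. Real BitTorrent clients request pieces by index and use the per-piece SHA-1 hashes from the metainfo to verify what they receive.

```python
import hashlib

PIECE_SIZE = 4  # unrealistically small, for illustration only

def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """Split data into fixed-size pieces and return the SHA-1 of each piece."""
    return [
        hashlib.sha1(data[i:i + piece_size]).digest()
        for i in range(0, len(data), piece_size)
    ]

def reassemble(hashes: list[bytes], available_blocks: list[bytes]) -> bytes:
    """Rebuild the data by using each hash as a key into a pool of blocks."""
    by_hash = {hashlib.sha1(block).digest(): block for block in available_blocks}
    return b"".join(by_hash[h] for h in hashes)

data = b"placeholder payload bytes"
hashes = piece_hashes(data)

# The "cloud" holds the same blocks in arbitrary order.
pool = [data[i:i + PIECE_SIZE] for i in range(0, len(data), PIECE_SIZE)]
pool.reverse()

print(reassemble(hashes, pool) == data)
```

The list of hashes contains none of the original bytes, yet it is exactly the information needed to pick the right blocks out of the pool.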


The probability of that occurring is on the order of 1 in 2^1000000000. You would exceed the computational capacity of the universe long before that point.


You're talking accidental piracy. That would make for a great court case!


Unfortunately (or fortunately?) this can pretty much be claimed to be impossible. Even just generating one specific kilobyte of data using uniform randomness yields a probability of 1/2^8192 (Python tells me that 2^8192 > 10^2000, which is far bigger than the estimated number of atoms in the universe, which is somewhere around 10^80).
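The arithmetic above is easy to reproduce, since Python integers have arbitrary precision. This is just a check of the numbers quoted in the comment; the 10^80 figure for atoms in the universe is the estimate given there.

```python
import math

BITS_PER_KILOBYTE = 8 * 1024           # 8192 bits in one kilobyte
outcomes = 2 ** BITS_PER_KILOBYTE       # number of distinct 1 KB byte strings

# The chance of hitting one specific kilobyte uniformly at random is 1/outcomes.
print(outcomes > 10 ** 2000)            # True
print(outcomes > 10 ** 80)              # True, far beyond the atom estimate

# Size of the number in decimal digits, computed two ways.
digits = len(str(outcomes))
estimate = BITS_PER_KILOBYTE * math.log10(2)
print(digits, round(estimate, 2))       # 2467 digits, i.e. roughly 10^2466
```

So the probability is about 1 in 10^2466 for a single kilobyte, and it shrinks exponentially with every additional bit.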



