Hacker News
Illegal Numbers (wikipedia.org)
184 points by Pwnguinz 1607 days ago | 95 comments

Are we sure that this Wikipedia article, kindly submitted for our discussion, lays out the issues thoroughly? The Wikipedia article's talk page includes the tags

"This article has been rated as Stub-Class on the project's quality scale.

"This article has been rated as Low-importance on the project's importance scale."

What are Wikipedians doing about that? The article history makes it look like this is a rather sporadically edited article that needs a lot of work. Are Hacker News participants willing to roll up their sleeves and jump in on editing the article? Or is that part of the problem on Wikipedia, that an article can be known to be lacking, but still not get fixed?

Where else would one go to find good sources on this issue? I've got no special knowledge of this specialized issue, so I can't personally help. My experience as a Wikipedian, fixing articles I know how to fix, is that most Wikipedia articles on most subjects need a lot more reliable sources.


The often relevant "What Colour are your bits?" - http://ansuz.sooke.bc.ca/entry/23

Summary: "Illegal number" is a term that you use to refer to protected digital information when you also want to convey that you have no grasp of the legal issues involved.

> you have no grasp of the legal issues involved

Let's have a thread about it then. What legal issues am I misunderstanding? It seems to me that the anti-circumvention provisions of the DMCA are rather unprecedented.

Brandenburg v. Ohio demonstrated that free speech protects abstract advocacy of force or law violation. The anti-circumvention provisions criminalize the distribution of not just software that can be directly used, but anything that serves as a tool to violate copyright. That could be a small number, a key, or even a description of how a DRM scheme operates.

I cannot imagine a constitutional basis for those provisions of the DMCA. What say you?

> The anti-circumvention provisions criminalize the distribution of not just software that can be directly used, but anything that serves as a tool to violate copyright.

I'm not a lawyer, but I'm going to speculate that if you upload image.bmp, a picture you took with your camera, that happens by coincidence to contain a key that can be used for circumvention, then you would not be prosecuted or convicted. In other words, it's not the number that's illegal.

I'm not speaking about accidental violation of the provision, I'm even talking about intentional violation. Anti-circumvention provisions don't cover the copyrighted work, they don't even cover a derivative work. It's just suppression of speech that tangentially supports or encourages illegal activities, which the Supreme Court has ruled the government has no compelling interest in quelling.

A lot of this speech can take many forms. It can be blog posts delving into how some encryption scheme works, it can be keys that are derived mathematically from Sony's mistakes, and it can be software which may only incidentally be used as a tool to circumvent DRM. This does not survive strict scrutiny.

But does it become illegal if someone (maybe not even you) points out that starting at bit 123094 you find the key to unlock your PS3?

The courts seem to put copyright on a roughly equal level with the first amendment, so any censorship caused by them is considered to be by design.


A non-lawyerly devil's advocate:

I would argue that a secret key by itself does not qualify as a circumvention tool and thus should not be banned by the DMCA. Unfortunately, by my reading the law does not agree with me and specifically bans distribution of a circumvention "device, component, or part thereof"; it seems like a key is a part of a circumvention device.

Looking at circumvention code itself, I would argue that such code should have no artistic or political expression when boiled down to its functional essence, in which case there is no meaningful free speech issue. If the code did have artistic or political aspects, arguably those aspects could be separated from the part of the code which performs circumvention. For example, the political opinion of being opposed to the DMCA can be easily expressed without using any code.

Having read the article, I'm not sure if this summary is satirical or not. On the one hand, it sounds like it could be a relevant summary of his point, on the other hand it sounds like you're saying the author has no grasp of the legal issues involved.

Can you clarify?

It sounds to me like he's saying that the author of the "Illegal number" article has no grasp of the legal (or philosophical) issues involved, not that the author of the "What color are your bits?" article has no grasp of it.

The author of that article specifically made a point of not calling either side "right" or "wrong"; the article just exists to help both sides understand the worldview of the other. Whichever mindset you hold, understanding the other mindset will help you make more effective arguments that don't sound inherently insane.

I'm aware, but the grandparent wrote a "summary" that could be read as either:

  - His point, worded a little oddly.
  - "The author got the law side wrong, and is yet another stupid nerd who doesn't get it."

From the perspective of the law, bits do have color, even though that makes no sense from a technical perspective. So, any given number may or may not qualify as "illegal" depending on how you generated it. That said, I don't think that makes the terms "illegal number" or "illegal prime" nonsensical, depending on context.

In any case, I think the article doesn't deserve the pithy "summary" upthread; the summary provides actively unhelpful information if you haven't read the article.

My summary was snarky, but "illegal number" is a silly term intended to emphasize that "it's just a number (so how could it be illegal?)", which is as sophisticated as the legal arguments "it's just a plastic disk with certain reflective properties" or "it's just some markings on a piece of paper."

Your claim that bits having color "makes no sense from a technical perspective" sounds like more geek misunderstanding to me. There is no technical problem here -- only a problem of geeks applying technical results where they shouldn't.

> Your claim that bits having color "makes no sense from a technical perspective" sounds like more geek misunderstanding to me. There is no technical problem here -- only a problem of geeks applying technical results where they shouldn't.

How dare you suggest that geeks and their rationality are not the rightful masters of the universe?

You've described one side of the debate, yes. The technical side has no less merit; don't dismiss it offhandedly. Hence why I find the essay on bits having color so useful: it frames the debate nicely without actually taking a side.

The arguments you referenced would indeed make for relatively unsophisticated legal arguments, but they don't claim to be legal arguments; they're statements about technology and about how geeks want technology to work. They carry no less weight than statements about how lawyers want technology to work.

Or, to put it more snarkily: your claim that bits have some unrepresentable property of "color" sounds like non-geek misunderstanding to me. There is no legal problem here -- only a problem of non-geeks applying legalistic results where they shouldn't. :)

Let's not rehash the arguments that the essay already eloquently expresses; neither of us will get anywhere that way.

We apparently had different interpretations of the color article. You seem to think the article was about there being two valid viewpoints (color exists and it doesn't), whereas I think it was just about trying to get the computer scientist to understand color.

> Or, to put it more snarkily: your claim that bits have some unrepresentable property of "color" sounds like non-geek misunderstanding to me.

The legality of information should and does involve tracking how it was obtained. This is not a property of mathematical bits, but is a property of the physical encoding of those bits in this world. Since the argument here is over how the law should work, a geek making arguments like this is just wrong. This is not a "they're both right" situation.

> Let's not rehash the arguments that the essay already eloquently expresses; neither of us will get anywhere that way.

Go reread the essay and see if you can really find support for your viewpoint. I just skimmed it (quickly, admittedly) and it seems to say what I remembered it to say.

> Since the argument here is over how the law should work

I think we've talked past each other here. I agree with you about how the law works. As much as I might want it to work differently, it currently does not. However, read the second to last paragraph of the essay: "I think it's time for computer people to take Colour more seriously - if only so that we can better explain to the lawyers why they must give up their dream of enforcing Colour inside Friend Computer, where Colour does not and cannot exist." That one sentence nicely sums up both viewpoints: the law wants bits to have color, and technology does not support properties of bits beyond the bits themselves. You've argued the former, which I agree with. I've argued the latter, and you seem to think that in doing so I've disagreed with the former.

> whereas I think it was just about trying to get the computer scientist to understand color.

I'd argue that the same article would also help lawyers understand why technology cannot represent the color of bits.

In any case, I think the sentence I quoted above makes both viewpoints very clear. And while I recognize that the law sees bits as having color, I intend to continue working to make sure technology does not attempt to care.

I think the few bits about lawyers not understanding the limitations of metadata weren't the thrust of the article. I do agree that the two viewpoints there aren't opposed, but that point about metadata is not the same as the wrong headed one being made by people using the phrase "illegal number."

The only thing that makes the "illegal number" view wrong headed is a belief that the law as currently on the books has the final say on what is ethically/factually/morally/intellectually right and wrong.

I believe you are too hard on those who disagree with you, as you don't seem to acknowledge that those who disagree with the law's desire to stamp a color on everything have valid reasons for their disagreement.

Can you point me to a coherent presentation of the alternative possibility? The only alternatives I see are that either A) there are legally protected (uncolored) numbers, or B) there is no legally protected information at all.

I don't think option A is coherent. In fact most use of the phrase "illegal number" is by people who are attempting a reductio ad absurdum argument against the legal system (but don't understand color). But note that as a practical matter, we do have protected numbers. Good luck convincing a court that you produced an identical text to the last Harry Potter novel through any means other than copying.

Option B, the anarchistic approach, is no doubt consistent but would be a radical change from our present form of copyright.

It's my understanding from watching the DeCSS fiasco unfold that the "illegal number" phrase was coined to mock the DMCA's criminalization of circumvention tools, and the first illegal numbers described were various representations of DeCSS. I don't think that position is unreasonable or wrong headed at all.

You are right that, when applied to actual copyrighted materials, the only obvious options are A: "We'll voluntarily submit to the legal fiction that bits have color (or at least shades of gray) in the form of copyright law, because we can see some benefit to doing so," and B: "We believe that a society whose rules can be fully justified by concrete logic, evolutionary morality, and the laws of physics is preferable to a society founded upon legal fictions, and abolition of the concept of intellectual property follows from this."

However, I believe a good number of people hold position C: "We believe copyright law has significantly overstepped its useful bounds, and will pretend to hold position B in order to achieve a reasonable compromise."

Even your use of the phrase "... the legal fiction that bits have color ..." sounds silly to me. Compare that phrase to "... the legal fiction that bicycles have an invisible ownership property ..." to see why. The entirety of the law is about similar fictions.

The idea that someone owns the part of my brain that has been indelibly imprinted with their copyrighted expression sounds silly to me.

If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it. Its peculiar character, too, is that no one possesses the less, because every other possesses the whole of it. He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me. -Thomas Jefferson

Ideas are fundamentally different from bicycles. One cannot share a bicycle indefinitely without each user having proportionally less access. One can share bits and ideas. Show me where in silicon the color of a bit is stored. The entire concept of colorful bits is a fabrication of the human imagination, not a fundamental property of the universe. The atoms comprising a bicycle, on the other hand, can indeed be under the control of the atoms comprising a human being, to the physical exclusion of others.

Ownership of a bicycle isn't about physically controlling the bicycle. It's about the path of the bicycle through space-time, which is exactly how color operates -- it tracks the flow of information through space-time. There is no physical aspect of a bicycle that controls ownership. It's color, just like with bits.

I'm sure we both understand well enough the trade-offs that society makes in granting copyrights. Reasonable people have differing opinions, but these trade-offs existed long before computers, which have only shifted the balance of the bargain. The fact that bits encode information as numbers is irrelevant to the discussion.

The important distinction is scarcity. All of our intuitive concepts of ownership depend on it, and bits don't have it.

Sometimes when I'm feeling innocent and pure, I like to pretend that the world is a place where everyone realizes that "... everything is just zeroes and ones" and thus anything can represent anything else when viewed just the right way. That means everything is illegal, and so nothing is illegal.

"This coke can plays the melody to Funkytown!" ... http://rachelbythebay.com/w/2012/07/26/encoding/

Then I snap out of it. It's right up there with "what if each of your fingernails contained a universe" talk: inevitable, but predictable.

This is an interesting problem that we, as a society, need to deal with.

HN's readership would be predominantly from a tech background with probably a reasonable basis in mathematics and the sciences. As such we can recognize some of the absurdities of the legal system, like how you can't patent a mathematical formula but you can patent software, which is ultimately indistinguishable from a formula in many (if not all?) cases.

The question is naturally asked: if I make some number produce something illegal is that number then illegal? The engineers among us like absolute rules and certainty as a general rule. I've had a discussion with someone else about how a group of people could route each other's packets and then hypothetically law enforcement couldn't prove who they came from. The same argument comes up with "an IP address is not a person" arguments.

While I sympathize with these arguments, this isn't how the legal system works. The law doesn't work on absolutely provable certainty. It works on reasonable doubt, intent and (hopefully) facts.

So when it comes to numbers, one has to remember that numbers encode information (See [1]). So for a small integer to encode, say, an obscene image or a program that circumvents copy protection, it would require another program that actually does that. So if the number 7 produced a DeCSS circumvention program with a given program A, then program A is the problem, not the number 7.

Now to turn any number into a video or a program or an image requires another program or infrastructure. Even if it's just the raw bytes of an i386/ELF program, you still need a kernel and a filesystem to run it. The test here, as I imagine the law would apply (IANAL), is whether the required infrastructure is general purpose or not.

Turning the number 7 into something obscene can't be done with something general purpose. Turning a 1400 digit number into a program can.

I don't like the stance of the US on IP, particularly the Obama administration, which has to be the most pro-IP anti-tech administration in US history (IMHO). I also don't like how selective enforcement is here. Share a few songs on Limewire and you're up for hundreds of thousands of dollars? Really? At the same time, every city has a place you can go to buy pirate DVDs or counterfeit goods (eg Canal St in NYC). Why is this, which is actual trafficking for profit, ignored while the administration tries to elevate file-sharing to terrorism (the original ACTA draft)?

Anyway, numbers are simply a way to represent information and that information is the problem, not the number.

[1]: http://en.wikipedia.org/wiki/Information_theory

This topic is explored more in GEB[0], with the point made that in general the information exists neither on the storage device alone nor in the reading device alone; in fact the information is encoded in both of them together.

[0] - http://en.wikipedia.org/wiki/G%C3%B6del,_Escher,_Bach

While people may not grasp this technical point, I think the common-sense reaction conforms to it.

For example, if I started publishing a sequence of all the binary numbers between 0 and (something huge), I don't think anyone would argue that I'm violating copyrights.

But if I added to that data stream a statement like "take the Nth number from this series and feed it through Encoder X and you'll get Movie Y", I'd argue that's both the additional information and the intentionality needed to make it a copyright violation.
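A rough Python sketch of why the pointer "N" is where the information lives. The enumeration order (shortest byte strings first, then numeric order within a length) and both function names are my own assumptions for illustration, not anything defined upthread.

```python
def index_in_enumeration(data: bytes) -> int:
    """Position of `data` when every byte string is listed shortest first,
    then in numeric order within each length."""
    # Count of all strings strictly shorter than `data`.
    shorter = sum(256 ** k for k in range(len(data)))
    return shorter + int.from_bytes(data, "big")


def string_at_index(index: int) -> bytes:
    """Inverse of index_in_enumeration."""
    length = 0
    while index >= 256 ** length:
        index -= 256 ** length
        length += 1
    return index.to_bytes(length, "big")


work = b"stand-in for Movie Y"  # placeholder, not real content
n = index_in_enumeration(work)
assert string_at_index(n) == work
# The index needs roughly as many bits as the work itself.
print(len(work) * 8, n.bit_length())
```

Publishing the full list says nothing about any particular work; publishing N is, bit for bit, about the same as publishing the work.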

I just want to mention every movie ever translated to a digital format, every song rendered as an mp3, and every image or sound ever made digital is in effect a very long number in binary form that can be interpreted by decoders to produce copyrighted material. Are all those numbers illegal to distribute, or even copy if you already have an instance of such a number? That is a lot of numbers. Big numbers too.

Owning a number seems weird, but the truth is all information can be seen as numbers.
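To make "all information can be seen as numbers" concrete, here is a minimal Python sketch. The helper names and the 0x01 sentinel byte are my own illustrative choices (the sentinel just keeps leading zero bytes from being lost), not a standard encoding.

```python
def bytes_to_number(data: bytes) -> int:
    """Read a byte string as one big-endian integer.

    A 0x01 byte is prepended so leading zero bytes survive the round trip.
    """
    return int.from_bytes(b"\x01" + data, "big")


def number_to_bytes(n: int) -> bytes:
    """Invert bytes_to_number by dropping the sentinel byte."""
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return raw[1:]


sample = b"\x00any file contents at all"  # placeholder for an mp3, image, etc.
n = bytes_to_number(sample)
assert number_to_bytes(n) == sample
print(n.bit_length())  # about 8 bits per byte of the file
```

Any file on disk can be pushed through the same two functions, which is the whole point: the number and the file are the same information.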

Owning information seems weird to me. It is intangible. I am all for property ownership, but not the ownership of ideas.

As more and more of everything is digitized, it seems like pro-property-ownership will mean pro-idea-ownership and vice versa. Finally Nous of Anaxagoras becomes true.

> While I sympathize with these arguments, this isn't how the legal system works. The law doesn't work on absolutely provable certainty. It works on reasonable doubt, intent and (hopefully) facts.

But is this how the legal system should work? The legal system has obvious flaws. At the very least it needs to move away from duelling expert witnesses to a more scientific method based approach.

> I've had a discussion with someone else about how a group of people could route each other's packets and then hypothetically law enforcement couldn't prove who they came from.

Combine it with encryption and you'll get something like Tor or i2p. Not just hypothetical, but works very well in practice too. Things like Silk Road have been running there for years with LEOs not being able to do anything about it.

>So for a small integer to encode, say, an obscene image or a program that circumvents copy protection, it would require another program that actually does that.

so winzip is the problem, not the zipped file? ridiculous statement, bro

fine, you still need to play the file, so media player is the problem? obviously, it's the source file, not the generic tools that work on everything

He is saying that the problem is those components of the total system used to produce the infringing output that are not generic.

So in the case of a zipped file, the zipped file and not winzip would be the problem, because winzip is generic and has general purpose uses, while the zipped file is a very large specific number that is translatable by a generic process into a specific protected work.

At the other extreme, the number "7" does not encode enough information to contain the infringing content on its own. Even if you were to write a "decoder" that turns the number "7" into a movie, in this case the "decoder" is the problem, as the decoder embodies enough specific information about the content that in effect the number you feed in is little more than a key to unlock access to that content.

That's actually the parent's point. Except specific to large integers, because there aren't enough small ones to encode anything meaningful that can be decoded by a general purpose program.
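A small Python sketch of the generic-versus-specific distinction, for what it's worth. `CONTENT`, `generic_decode` and `special_decode` are hypothetical names I made up; zlib stands in for "winzip" as the general purpose tool.

```python
import zlib

CONTENT = b"stand-in for some protected work " * 20  # placeholder data


def generic_decode(number: int) -> bytes:
    """General purpose: knows nothing about any work; the number carries it all."""
    raw = number.to_bytes((number.bit_length() + 7) // 8, "big")
    return zlib.decompress(raw)


def special_decode(number: int) -> bytes:
    """Special purpose: the work is baked into the decoder, the number is just a key."""
    if number == 7:
        return CONTENT
    raise ValueError("unknown key")


big_number = int.from_bytes(zlib.compress(CONTENT), "big")

assert generic_decode(big_number) == CONTENT
assert special_decode(7) == CONTENT
print(big_number.bit_length(), (7).bit_length())
```

In the first case the specific information sits in `big_number`; in the second it sits in the source of `special_decode`, and 7 is as innocent as it looks.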

> While I sympathize with these arguments, this isn't how the legal system works. The law doesn't work on absolutely provable certainty. It works on reasonable doubt, intent and (hopefully) facts.

Hah, you still think that the legal system makes an ounce of sense. Read the below, weep, weep again and then ask yourself how you didn't realise that you were living in the dark ages up until now.


But that's just the tip of the iceberg. If you care to know, read up on the reoffending rates of prisoners and the types of systems which have particularly low ones. Look around you at all of the seemingly civilized people who call for blood when their sensibilities are offended strongly enough.

Where we now stand, in this aspect, is not at the heart of civilization but on the very fringe of it - and hardly anyone has noticed the fact. The legal system, just like everything else to do with humanity, is solidly anchored in emotion, not logic.

Let's rephrase this problem as not something that people will defend, e.g. anti-DRM, but something people will not defend; let's go straight for the big guns and use child pornography. Clearly, child pornography is illegal, and yet the images are just a series of 0s and 1s; these could potentially be re-interpreted as a single number, giving rise to the problem here. Or maybe each triplet of bytes of the JPEG's data could be represented as an RGB colour, and you could have a "child pornography flag" analogous to the free speech flag here, albeit a lot longer. Does this make it ok to distribute this number or this flag, because it's just a number/sequence of colours?

I've never really understood the child pornography thing.

Criminalizing mere possession of such images seems like a roundabout, largely ineffective, and perhaps even counterproductive, way of tackling the real issue; the sexual abuse of children.

It blurs the very real and very important distinction between those who are sexually attracted to children and those who will act on that attraction to sexually abuse a child.

It criminalizes the inquisitive, has the potential to make people inadvertent criminals, and ultimately — to return to the original subject matter — it makes a crime of possessing a particular sequence of 0s and 1s, which strikes me as particularly absurd.

My personal suspicion is that sexual attraction to children (primarily referring to teenagers here) is not the abhorrent, unnatural illness that people speak of in public, but a perfectly natural, harmless preference that is far more commonplace than most people would like to admit.

Ultimately, I'm for the free sharing of 0s and 1s in any order, and I'm not the least bit swayed by the child pornography challenge.

Criminalizing the possession of child porn photos is a tool to assist in the capture of people who make child porn photos. A person facing prosecution for possession is highly motivated to reduce their own sentence by helping in the arrest and prosecution of the source of their images. If the source didn't make the photos, they certainly possessed them too, so it's possible to walk another link higher in the chain.

"You haven't caused any harm or done anything wrong, but we think you might have information that will lead us to someone who has, so we're going to invade your privacy, drag you through the courts, lock you up, and ultimately ruin your life."

I can't be the only one who sees something wrong there? The sexual abuse of children is horrible, but I don't think it justifies the ritualistic destruction of countless adults' lives.

Fine, replace child pornography with a text file containing your name, birthday, social security number, credit card details, email password and mother's maiden name, the argument is the same.

There would be no crime in possessing such a file, nor should there be. It would and should be criminal to use it to impersonate me, or to access my emails without permission.

> There would be no crime in possessing such a file, nor should there be.

exactly. There should be no crime in mere possession of information. It may be a crime to use that information in ways you are not allowed; for example, if the above file of personal information were copied off my machine by somebody I didn't authorize, and that person then used it to do something, I would have grounds (and the state would have grounds) to pursue criminal charges.

Now swap that with any information that is deemed "illegal" in society these days, and the same arguments should apply.

As lmm says, I don't have any issue with someone possessing that info. Security through obscurity is folly, and it is only as a result of incompetent security procedures that someone could cause harm with these pieces of information.

Could you please post all that info in a reply to this message then please?

There is a significant difference between someone else possessing this information, and actively distributing the information myself.

I would prefer not to have people possess the information (because I am beholden to institutions that, unfortunately, rely on security through obscurity and have abysmal security procedures) but I don't support criminal sanctions against people if they do happen to come into possession of it.

The criminal sanctions should only cover abuse of the information. If you possess my credit card information and do nothing with it, then I don't have a problem with that. If you use it to make unauthorized purchases, or use the information to blackmail or harass me, then you should face criminal sanctions for the harm caused by those actions.

Simply possessing the information doesn't cause me any harm.

The idea, however misguided, is that the consumption of child pornography creates a market demand. If nobody wants the images, nobody will be compelled to create them in the first place thereby eliminating the actual abuse.

There are two issues with this:

1) The causal link described remains a matter of dispute, with evidence supporting both sides; a plausible argument can be made for the opposite case that wider access to child pornography, and making more 'efficient' use of existing images, could quell demand for the creation of new images, and thus reduce new abuse cases;

2) Legislation prohibiting child pornography often extends its reach beyond cases of serious abuse; in many countries it encompasses 'softcore' images, even semi-nude images if they are sexually suggestive. These kind of images (the kind that you would find on /r/jailbait over on reddit) are often created by the subjects themselves, and one could argue that their creation caused no harm to anyone.

1) I think that's fair, but legal arguments are not always logical. I'm reminded of the Illegal Numbers topic that was on here in the last day or so.

2) Many times I think people forget what laws were originally intended to resolve and they get updated and twisted throughout the years. Not unlike the drug laws. Who even knows why they are illegal anymore? Especially when so many other drugs with the same social/economic problems are perfectly legal.

There is a Latin term for arguments based on the reduction of software to numbers: reductio ad absurdum.

You seem to be implying that the argument is absurd, but reductio ad absurdum is valid. "If X were true then we would have impossibility Y, so X must be false."

I'm intrigued by how the law would handle one time pad cryptography in such a case. You could produce two numbers of similar length, each on its own bearing no relationship to the 'illegal' number, but which can be combined to give that number. Then post them separately. Could either of them legally be taken down?

I believe they would both be infringing. The issue here is intent, and the intent behind both of those numbers is to recreate the original number. It's not any different than if one website had the first half and another had the second half.

The alternative here would be if you could somehow find preexisting content whose intent was not to recreate this number. In that case it's trivial that the only infringing content would be something that told you how to recreate it based on those (eg if you just have a website for each digit that is trivially noninfringing, but a website that links to each of those websites in the correct order is infringing).

It's easy to act like this is absurd because numbers exist naturally, but the law is generally pragmatic and not concerned with extreme hypotheticals. You could similarly argue (and plenty have on internet forums) that every mp3 in existence could be found encoded somewhere in pi, which may well be true (it would follow if pi is a normal number, which is conjectured but not proven) but is absolutely irrelevant in copyright law; if you encode someone's content in a novel encoding, it's still infringing the underlying protected content.

The only way to produce such a pair of numbers is by using the restricted material as a source -- therefore the numbers would be a derivative work. This is already covered under copyright law.

Now there is a twist to this concept: plausible deniability. Let's say Bob posts a non-infringing work encrypted with a (randomly generated) one-time pad, and also posts the one-time pad. Both files would appear to be random data until put together. Further, someone else, Bill, produces a "random" number by XORing Bob's one-time pad with an infringing work, uses the result as a one-time pad to encrypt another non-infringing work, and posts both of those files. XORing the two one-time pads together recreates the infringing work. But you can't prove whose "random" file was actually randomly created and which one was produced as a derivative work. Who do you send the take-down notice to?
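The Bob and Bill scenario can be sketched in a few lines of Python. All the byte strings are placeholders I invented, and `xor` is a hypothetical helper; the only real requirement is that the three texts have the same length.

```python
import secrets


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


restricted = b"RESTRICTED WORK."  # placeholder for the infringing work
bob_plain = b"Bob's own essay."   # Bob's non-infringing work
bill_plain = b"Bill's own poem."  # Bill's non-infringing work

# Bob: a genuinely random pad, plus his work encrypted under it.
bob_pad = secrets.token_bytes(len(restricted))
bob_cipher = xor(bob_plain, bob_pad)

# Bill: a "random" pad that is really Bob's pad XOR the restricted work.
bill_pad = xor(bob_pad, restricted)
bill_cipher = xor(bill_plain, bill_pad)

# Each site's pair decrypts to something innocent...
assert xor(bob_cipher, bob_pad) == bob_plain
assert xor(bill_cipher, bill_pad) == bill_plain
# ...but the two pads together give back the restricted work.
assert xor(bob_pad, bill_pad) == restricted
```

Looking only at the four posted files, both pads are uniformly random bytes, so nothing in the data says which one was derived.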

The initial example presents the same which-part-is-infringing conundrum as the more convoluted example you presented.

  - restricted material is X
  - generate Y randomly (using a typical cryptographically secure RNG or PRNG, zero-bias, with an entropy source unlikely to be observed by anyone else)
  - Z = X xor Y
There is no way to prove that it wasn't Z that was generated randomly, with Y = X xor Z.

It's clear that the only way to generate both Y and Z is to have the restricted material, but you don't know which piece is tainted and which isn't. You have to know which piece was generated first to know which one is infringing (the non-infringing one had to be used as input for the infringing one).
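A small sketch of that symmetry, assuming byte strings and a hypothetical `xor_bytes` helper: the same pair (Y, Z) is consistent with either story about which one was drawn first.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(p ^ q for p, q in zip(a, b))

x = b"restricted material"  # placeholder for X

# Story 1: Y is drawn at random, Z is derived from X and Y.
y = secrets.token_bytes(len(x))
z = xor_bytes(x, y)

# Story 2: treat that same Z as the random draw and derive Y from X and Z.
y_from_z = xor_bytes(x, z)

# Both stories end with exactly the same two files.
print(y_from_z == y, xor_bytes(y, z) == x)
```

Because Y is uniform, Z is uniform too, so inspecting the files alone gives no way to pick between the two stories; only outside evidence about ordering could.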

That just encourages the powers that be to take a "shoot 'em both and let God sort it out" approach. Also, the metadata that says Z = X xor Y will be seen as infringing.

I doubt that metadata will be infringing. It doesn't have any real information content.

I think my case is essentially identical to yours. One of my two numbers (the 'key' of the one time pad) is randomly generated, so it can't be a derivative work. But there's no way of telling which that is.

The main difference is that in the case I presented, although it may seem to be a stretch, there is a plausible reason for the two random-looking files to be present on one site (or the other). Whereas if each site only had one random-looking file, in neither case would there be a plausible reason for that file to be there.

They accused me of threatening to kill other people when all I did was send numbers, which happen to be interpreted by a mobile phone and shown on screen as "I'll kill your family if you don't do what I said".

Apparently some numbers are considered illegal.

Whether numbers correspond to obscenity depends entirely on the software that interprets and renders them, doesn't it? If I write software that translates the number 5 into obscenity, would 5 become illegal?

Not entirely, partially. If you could come up with an actual example, you may have a point.

The main problem with "illegal numbers" seems to be that for any illegal number a, which represents someone's intellectual property or trade secret A, I can easily create two original pieces of intellectual property B and C so that b + c = a.

As a general rule, using secret sharing protocols, for any choice of n and k, you can split any illegal piece of information a into n pieces such that k of these pieces together can reconstitute a but <k pieces will not give any information about a. It is hard to tell to what extent an individual piece is illegal or not.
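One well-known scheme with this property is Shamir's secret sharing. The sketch below is only an illustration, not a vetted implementation: the prime, the function names and the example secret are all choices made here, and the secret is assumed to be an integer smaller than the prime.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; secrets must be smaller than this

def split_secret(secret, n, k):
    """Split `secret` into n shares such that any k of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the integers mod PRIME."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = split_secret(42, n=5, k=3)
print(recover_secret(shares[:3]))  # any 3 of the 5 shares work
```

With fewer than k shares, every candidate secret remains equally consistent with what you hold, which is why it is hard to call any single share "the" illegal one.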

Related: http://www.madore.org/~david/misc/freespeech.html (does something similar using XOR, and discusses interesting uses of that kind of system).

This is covered in "color", already posted in this thread.

This is a good point, as is the derivative notion: are my thoughts about illegal numbers themselves illegal? Evidentiary issues aside, etc.

Give us an actual example of b and c that are created without knowledge of a, and you may have a case. But information theory says it is unlikely, if a is deeply original and not derivative of something like b and c to begin with.

I may not be on the same wavelength as rolux here, but compression would be one example. Create B (a series of compressed files) and C (a decompressor), both unique creations in their own right, that when combined create the original work of another person, A.

Now I'm no expert on compression, but there are surely a vast, if not infinite, number of possible pairs B and C that could combine to form the original copyrighted work.
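To make the "many possible B for one A" point concrete, here is a rough sketch using Python's `zlib` with two different compression levels. The sample text is a made-up stand-in for the original work A, and the decompressor C is the same in both cases, so this shows only one small corner of the space of possible pairs.

```python
import zlib

# Placeholder for the original work A.
a = b"la la la, a stand-in for the original work. " * 50

# Two different compressed files B for the same A.
b_stored = zlib.compress(a, 0)  # level 0: data stored without compression
b_best = zlib.compress(a, 9)    # level 9: maximum compression

print(len(a), len(b_stored), len(b_best))
print(b_stored != b_best)
print(zlib.decompress(b_stored) == a, zlib.decompress(b_best) == a)
```

Both B files are different byte sequences, yet each yields A when fed to the same decompressor.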

Thus, with copyright law in its current state, we are not simply granting copyright holders the exclusive rights to a particular sequencing of 1s and 0s, but also rights over any method of creating that sequence.

If there are an infinite number of methods of creating that sequence — that is to say, any B, if combined with a suitable C could form A — then aren't we in effect granting rights over everything to the copyright holder? Where do we draw the line?

The 1s and 0s that make up this post could, with suitable decompression, form an exact copy of a hit blockbuster, but neither this post nor the decompressor would resemble the blockbuster on their own.

Edit: Interesting idea time. If I uploaded a series of files alleging they are encrypted copies of blockbuster films, but resolved not to release the encryption keys to any of the files for 12 months, would the copyright holders have the right to have the encrypted files taken down in the meantime?

They can't actually prove that the files are infringing without the encryption key. Is the mere suggestion that a file may potentially, with some manipulation (i.e. decryption with the appropriate key), resemble a copyrighted work, sufficient to have it taken down?

Again, that can only work if either B or C is a derivative of A.

Apparently what I was describing here has already been put into practice in the shape of Monolith [1]

In the case I describe, as in the case of 'munging' using Monolith, neither B nor C bears any resemblance to A except when the two are combined.

In my view, at least, you cannot say either is a derivative of A. To do so would bind you to declare everything a derivative of A, because any B when combined with an appropriate C can form A.

If I take your post here as B, you will likely deny it is a derivative of any copyrighted work, but if I make the text of the post the encryption key to another file (C) which, when decrypted, becomes copyrighted work A, is your post itself derivative of a copyrighted work?

What makes your post non-derivative and the encrypted file I create derivative? They are both nothing in themselves, and yet a copyrighted work when combined with the other.

[1] http://monolith.sourceforge.net/

Edit: Reading through the article on the color of bits it seems this exact argument prompted the article in the first place. I guess I should finish reading this!

Let me put it a different way. If A = decompress(B), then necessarily B = compress(A), so B is obviously a derivative of A. Introducing xor does not change anything; one of the parts must be a derivative of the original.

Okay, let me propose an alternative procedure.

I set a series of random number generators going, and with each set of results, I apply randomly generated XOR to create a new sequence of numbers.

I perform this process over and over. Eventually, it produces (give or take a few bits) a copy of an MP3 file of a copyrighted work.

Now, once we've eliminated any procedure of creating B and C that includes A, would you still say one of B or C is derivative of A?

Should my random number generator be banned? Perhaps more importantly, do I acquire copyright to all the files it creates?

I could quite easily create every possible variation of an MP3 file of a given length. Does that mean any musician who, using a different procedure, produces one of these files is infringing on my copyright?

>> Eventually, it produces (give or take a few bits) a copy of an MP3 file of a copyrighted work.

...which you recognize by having a copy of that work and specifically matching for it. If I were a copyright lawyer, I'd argue that your algorithm for plucking this value out of the stream of randomness was the infringement.

>> I could quite easily create every possible variation of an MP3 file of a given length.

If you're prepared to pay $35 each to register the copyright on all of those, knock yourself out. I'll enjoy not paying taxes anymore.

> ...which you recognize by having a copy of that work and specifically matching for it. If I were a copyright lawyer, I'd argue that your algorithm for plucking this value out of the stream of randomness was the infringement.

I don't have to recognize it myself. Say I put all the resulting files up for download on an FTP server, and the RIAA stumble across the collection. Within, say, a collection of every possible 30 second long MP3 file encoded at 128kbps, I'd probably be infringing on a few thousand copyrighted works.

For each infringement there'd be many, many more 'infringing' files (i.e. every slight variation on a work that a copyright lawyer would deem indistinguishable from the original work)

> If you're prepared to pay $35 each to register the copyright on all of those, knock yourself out. I'll enjoy not paying taxes anymore.

Apparently you can register copyright for music tracks in bulk. In any case, where I live you don't have to register copyrights.

>> Say I put all the resulting files up for download on an FTP server

128kilobits per second * 30 seconds = (128 * 1000 * 30) = 3,840,000 bits per file.

There are 2 to the 3,840,000 possible combinations of that many bits. Ignoring the fact that many of those won't be valid mp3s, each of those is about a 0.46 megabyte file.

I'm guessing you don't have enough hard drive space to put all those mp3s up. :)

Assuming you did, the RIAA would have a tough time crawling all that content for infringement.

It would make an interesting test for the theory that "linking isn't infringing," since the link would be the only thing distinguishing a song from random noise.

Obviously I'd set the random generator up such that it operates within the rules of the mp3 specification and only creates valid mp3 files; I don't think that detracts from the experiment.

Storage space is the only major limitation here. With current computing power I could easily have random mp3 files spat out at an alarming rate, such that it wouldn't take too long (I'm guessing a matter of months) until I managed to produce an infringing file this way.

I could probably speed the process up by teaching the 'random' mp3 generator certain patterns to pursue; fade-ins and fade-outs, repetition, etc. Again, I don't think these detract from the substance of the experiment.

It's kind of like teaching someone to play a sport; you show them the rules of the game, and a bunch of 'patterns' that players tend to adhere to. Eventually, they'll make a sequence of movements, lasting 30 seconds or so, near enough identical to that performed by a famous sports star.

OK, I got a little help for my sorry math skills. http://math.stackexchange.com/questions/225155/how-can-i-qua...

According to Ross Millikan over there, the number of possible 3,840,000-bit files is a number with more than a million zeros. The number of atoms in the universe is only around 10 to the 80. So if you could use the entire universe as your hard drive, storing a bit on every atom, you'd need many, many universes to store those files.
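The arithmetic can be checked without ever building the huge number. This sketch assumes 1 kilobit = 1000 bits, as in the calculation above, and takes the 10^80 atom count from the comment as given.

```python
import math

bits_per_file = 128 * 1000 * 30      # 128 kbps for 30 seconds
bytes_per_file = bits_per_file // 8  # size of one such file in bytes

# Number of decimal digits in 2**bits_per_file, via logarithms
# (converting the integer itself to a string would be very slow or refused).
decimal_digits = math.floor(bits_per_file * math.log10(2)) + 1

atoms_exponent = 80  # ~10**80 atoms, per the comment above
# Lower bound on how many orders of magnitude the file count exceeds the atoms.
excess_orders = (decimal_digits - 1) - atoms_exponent

print(bits_per_file, bytes_per_file, decimal_digits, excess_orders)
```

So the count of possible files has well over a million decimal digits, and the gap to the atom count is itself more than a million orders of magnitude.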

You're going to have to use some serious algorithmic bias to get mp3s, much more bias to get non-static, much more to get anything resembling music and containing any English words, etc etc.

Bluntly, you won't get copyrighted works ever unless you're specifically targeting them, for any reasonable value of ever. It's theoretically possible only in the sense that it's possible for someone's DNA to spontaneously appear at a crime scene.

This is why the "songs are just numbers" argument is misguided. Yes, they can be represented as numbers. But you'd never discover them that way.

All you've done here is take the 'hidden in pi' argument and customize it. If you are only making a sample of random mp3s you have a statistically zero chance of infringing on anything. If you make a thorough set of X-length mp3s then the actual infringement is in the URL, because a datastore that has every number is equivalent to one that has no numbers--it's really just an encryption algorithm between URL data and mp3 data.
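A toy version of that "datastore with every file" makes the point fairly directly. The `fetch` and `url_for` functions and the hex URL scheme are invented for this sketch; the "server" stores nothing and just decodes the URL.

```python
def fetch(url_path: str) -> bytes:
    """Hypothetical server 'hosting' every possible file: it holds no data
    and simply decodes the hex string in the URL back into bytes."""
    return bytes.fromhex(url_path)

def url_for(content: bytes) -> str:
    """The link that 'points at' a given file in the complete collection."""
    return content.hex()

song = b"stand-in bytes for some mp3"
link = url_for(song)

print(link)
print(fetch(link) == song)
print(len(link), len(song))  # the link is at least as long as the file
```

All the information is in the link, so whoever writes the link is the one who needs a copy of the work.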

If (b) and (c) can be considered derivative works, then should a low-bit hash of (a) not also be considered a derivative? If a given hashing algorithm says the hash of Metallica's latest hit is 42, have I just broken the law by posting this?

It's not just a yes/no on derivation, the amount of information present in the derivative data matters. If you hashed every 32 bits individually then you would be infringing, but one hash for the whole thing is fine.
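A scaled-down sketch of the difference, with assumptions flagged: the "low-bit hash" here is just the first byte of SHA-256, and the chunks are 1 byte instead of 32 bits so the brute force runs instantly. The principle is the same at 32 bits, only slower.

```python
import hashlib

def tiny_hash(data: bytes) -> int:
    """An 8-bit hash: first byte of SHA-256 (an arbitrary choice for this demo)."""
    return hashlib.sha256(data).digest()[0]

work = b"stand-in for a copyrighted song"

# One small hash of the whole work: a single byte, shared by countless inputs.
whole_file_hash = tiny_hash(work)

# A hash of every chunk separately (1-byte chunks in this scaled-down demo).
chunk_hashes = [hashlib.sha256(bytes([b])).digest() for b in work]

# With chunks this small, a lookup table over all chunk values inverts them.
lookup = {hashlib.sha256(bytes([v])).digest(): v for v in range(256)}
rebuilt = bytes(lookup[h] for h in chunk_hashes)

print(whole_file_hash)
print(rebuilt == work)
```

The per-chunk hashes carry enough information to rebuild the work; the single whole-file hash does not.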

I've read an argument that the law has considered a five(?)-note sequence enough to identify one song as a derivative of another - and given reasonable estimates for the "space" of possible 3-minute tunes, and the number of (copyrighted) songs currently in existence, there's a significant chance that a "random" new song will be legally considered a derivative of some existing song.

Of course, here we come back to the "color" of bits; one is unlikely to be prosecuted for "copying" a song one has never heard. Though cases like My Sweet Lord are enough to give one pause.

This sounds absurd only because we have a hard time really understanding the consequences of dealing with a vast search space. Yes, 128^140 is the space of all possible tweets (in ASCII), but just because we're able to count them (sorta) and assign a number to each message doesn't actually give us any power over them. After all, the omega number [1] is just another number too.

[1] http://plus.maths.org/content/omega-and-why-maths-has-no-toe...

Not to mention the binary representation of every single copyrighted MP3, text, software, picture, etc. in every conceivable encoding and file format.

I don't speak binary very well, so correct me if I'm wrong on this subject. If a number that represents illegal information is formed, that number is illegal. What if there is a larger number that has the illegal number in it? Example: 10 is illegal. 1000 has the number ten in it. The "10" can be filtered out. How would this be handled?

What truly matters is intent. If you are trying to convey the illegal number, you are trying to break the law. If you accidentally transmit the bits that form the number while transmitting something else, you are not culpable.

Obligatory IANAL.

You are recognizing just one aspect of the inanity that is having "illegal" numbers.

Are people trying to say the context in which something happens shouldn't have legal bearing?

People are trying to say that criminalization of the possession of or trafficking in so-called circumvention tools is absurd.

The first thing that comes to mind when looking at the flag are Windows 8 theme colors.
