If linking to copyrighted data 'should' be illegal (SOPA), then what about descriptions of that data that are sufficient to identify the original, but not reconstruct it (magnet links)?
And if those were made illegal, then what about descriptions of those descriptions? You can recurse infinitely on this.
Beyond mere amusement, after just one or two recursions, you get to the point where it would be difficult to write a law that would criminalize magnet links without also criminalizing people who link to a Sparknotes-like summary or commentary for a piece of media.
Law has certain resemblances to regular code, but folks here seem to think that if something isn't properly specified, the law will break in the same way that a program will fail to compile or run properly. But that's not how it works. Poorly drafted laws can fail, certainly, but it's not that hard to draft something that focuses on the end result.
Consider ordinary offences, such as robbery. You wouldn't get anywhere by arguing that you're alleged to have put your right hand in your pocket and pulled out a small hatchet, and that since there's no law specifically forbidding right-hand wielding of hatchets, you should go free. The technicalities of how you committed the robbery are irrelevant as long as it can be established that you took someone's property in a violent fashion. I'm a little perplexed as to why folks think torrenting/piracy/filesharing etc. is so different that it can't be addressed legally. Sure, the law needs to be clear and logical, but only up to a point. It doesn't need to be absolutely exhaustive, and 'beyond a reasonable doubt' has never meant 'beyond any imaginable possibility'. People do make arguments like that in criminal defense cases from time to time, but they typically fail because the doubts they attempt to raise are absurdly far-fetched.
But! There are still interesting problems to be acknowledged as far as how you define the transgression. A torrent like this from one perspective is worth billions, from another is worth very little. And practically speaking, if someone were to really use it the way you might expect a pirate to use it, it is practically worth only a tiny fraction of its potential worth. These are things that are difficult to define or restrict, but are immediately obvious to anyone who would use it. That's why it's interesting to me.
A description of copyrighted data would be the torrent file, which content producers would probably like to argue is infringing.
A magnet link is a hash of the torrent file, so it's already two steps removed.
Of course, the Pirate Bay magnet dump is itself a torrent, so it's a hash of a hash of a hash of copyrighted data.
And that torrent itself has a magnet link: 938802790a385c49307f34cca4c30f80b03df59c is a hash of a hash of a hash of a hash of copyrighted data. (In the MP/RIAA's ideal world, I've just committed criminal copyright infringement with damages reaching into the $billions.)
Theoretically, the Pirate Bay dump could include the torrent for the Pirate Bay dump, and be an infinitely recursive description of itself... but that's probably an intractable cryptographic process.
Perhaps its prime factorization also belongs on that list.
It would also be wise to outlaw the URL I've just linked, and perhaps also the combination of letters "JROu8" as they also contain the information in question given proper context.
the most copyright-infringing flag of all time?
Algorithms that append a hash to a file, preserving the same hash, i.e.
hash(s) == hash(s ++ hash(s))
A program that prints (though does not contain) a hash of itself:
(edited for formatting)
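For anyone who wants to play with the hash(s) == hash(s ++ hash(s)) property mentioned above: with a real hash like SHA-1, finding such an s by brute force is out of reach, but with a deliberately crippled hash it takes a fraction of a second. A minimal sketch, assuming a made-up 8-bit "hash" (the first byte of SHA-1); toy_hash and find_self_appendable are names invented for this example.

```python
import hashlib
from itertools import count

def toy_hash(data: bytes) -> bytes:
    """Deliberately weak 8-bit hash: the first byte of SHA-1 (illustration only)."""
    return hashlib.sha1(data).digest()[:1]

def find_self_appendable(prefix: bytes = b"s") -> bytes:
    """Brute-force an s such that toy_hash(s) == toy_hash(s + toy_hash(s))."""
    for n in count():
        s = prefix + str(n).encode()
        h = toy_hash(s)
        # Appending the hash must leave the hash unchanged.
        if toy_hash(s + h) == h:
            return s

s = find_self_appendable()
print(s, toy_hash(s).hex(), toy_hash(s + toy_hash(s)).hex())
```

Each candidate has roughly a 1 in 256 chance of working, which is why the toy version is easy. At 160 bits the same search is hopeless.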
Do you recall the title or author of that paper? I'd be interested in reading about the contradiction argument you mention.
I think Colour is what the designers of Monolith are trying to challenge,
although I'm afraid I think their understanding of the issues is superficial
on both the legal and computer-science sides. (...)
I personally feel comfortable taking the position that anyone claiming that bits have color has an objectively wrong view of reality. I don't think that position needs much advocacy, though; reality always tends to win in the long term. It could certainly use a little help sometimes, though.
Granted, the fingerprinting algorithm used by YouTube is pretty good: http://www.csh.rit.edu/~parallax/
As an (admittedly somewhat contrived) example:
Let's say I have a music player on my computer that, when fed the works of Shakespeare, plays Gaga's latest hit and, when fed the works of Sir Arthur Conan Doyle, plays Jingle Bells.
My actual intent is to read Shakespeare and to listen to Jingle Bells, but that is going to be a pretty difficult case to prove in a real court of law. The assumption will be made that I had the player and copy of Shakespeare so that I could listen to Gaga.
Theoretically, unless Shakespeare has been interpreted by said program, I have not committed any crime. But precedent disagrees. Just having the right sequence of bits on your computer is enough to prove intent, whether your actual intention was to use it to violate copyright or not. That is where I get lost.
EDIT: I think I found it at http://www.findthatfile.com/search-31270472-hPDF/download-do... . The name of the file is "CopyNumbCJ.pdf".
You're thinking legality, but I'm thinking efficiency. So we can now distribute 1.5 million torrents (of a total of several thousand TBs, no doubt) in a file 90MB big, in a torrent which itself has a magnet address that takes up...20 bytes?
The size savings as you go up the tree are incredible. I see no reason why you couldn't create an almost-entirely distributed torrent site in this way.
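To put rough numbers on that, here is the back-of-the-envelope arithmetic using only the figures quoted above. The one assumption is that "MB" means 10**6 bytes.

```python
# Figures quoted in the comment above; "MB" is assumed to mean 10**6 bytes.
NUM_TORRENTS = 1_500_000
INDEX_BYTES = 90 * 10**6
INFO_HASH_BYTES = 20  # a SHA-1 digest is 160 bits = 20 bytes

bytes_per_entry = INDEX_BYTES / NUM_TORRENTS
shrink_factor = INDEX_BYTES / INFO_HASH_BYTES

print(f"{bytes_per_entry:.0f} bytes per torrent in the index")              # 60
print(f"index is {shrink_factor:,.0f}x the size of its own 20-byte hash")   # 4,500,000
```

So each torrent costs about 60 bytes in the index, and the whole index is addressed by something 4.5 million times smaller.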
Think about it: Torrent discovery could be done by regular distribution of index torrents, and the clients use that to find out what can be downloaded and where.
In fact, in the world of magnet addresses, "uploading" a torrent would be as simple as requesting that its URI be put in the day's index. So running a torrent site would be as simple as curating a list of magnet URIs each day into an index, then publishing that torrent's URI somewhere. Like Twitter. You could run a torrent site entirely from Twitter.
I think that requiring torrent files and trackers was a policy decision to deflect liability away from the client implementer to multiple third parties. That's why BitTorrent is still around and Grokster isn't. There's no technical need for them.
One switched pair could give you the position in the magnet-link key, the other switched pair could give you the value. That way, you could never pin down exactly who gave you what information.
Or maybe I shouldn't suggest it?
Such as the name of the copyrighted work?
I went through the description pages like http://thepiratebay.se/torrent/$i by increasing the $i and saving the magnet if Pirate Bay didn't return a 404 error. I went through the pages without logging in, though, so that might be the reason why I got only 1.5m torrents.
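The loop described above is simple enough to sketch. This is not the Perl script, just an illustration of the same idea in Python. The fetch argument is a hypothetical helper that returns the page HTML for an id, or None on a 404; the regex and the fake pages are also assumptions made so the sketch runs offline.

```python
import re
from typing import Callable, Dict, Optional

# Matches a BitTorrent magnet URI with a 40-character hex info-hash.
MAGNET_RE = re.compile(r'magnet:\?xt=urn:btih:[0-9A-Fa-f]{40}[^"\s<>]*')

def scrape_magnets(fetch: Callable[[int], Optional[str]],
                   first_id: int, last_id: int) -> Dict[int, str]:
    """Walk ids first_id..last_id and keep the first magnet link on each page."""
    found = {}
    for torrent_id in range(first_id, last_id + 1):
        html = fetch(torrent_id)
        if html is None:        # gap in the id sequence (404)
            continue
        match = MAGNET_RE.search(html)
        if match:
            found[torrent_id] = match.group(0)
    return found

# Offline stand-in for the real site, so the sketch runs without network access.
FAKE_PAGES = {
    3: '<a href="magnet:?xt=urn:btih:' + "ab" * 20 + '&dn=example">link</a>',
    5: "<p>page without a magnet link</p>",
}
print(scrape_magnets(FAKE_PAGES.get, 1, 6))
```

Passing fetch in as a parameter keeps the loop independent of whatever HTTP client, rate limiting, or parallelism you put around it.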
I didn't know pirate bay has hidden porn torrents; there is TONS of porn in the scrape already.
The script is in perl, I will post it to pastebin in a moment.
All right, the script itself is here
as you can see, it's not very complicated.
It might be an idea to release a diff against this once a week, and write a quick script to grab it, keeping the list up-to-date.
But it would still be more proof of concept than really anything useful - the comments and descriptions ARE important.
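For what it's worth, the weekly diff part is close to a one-liner if each dump is reduced to its set of info-hashes. A rough sketch; diff_dumps and the sample hashes are made up for illustration.

```python
def diff_dumps(old_hashes, new_hashes):
    """Return (added, removed) info-hashes between two dumps, each sorted."""
    old, new = set(old_hashes), set(new_hashes)
    return sorted(new - old), sorted(old - new)

# Placeholder hashes, not real torrents.
last_week = ["aa" * 20, "bb" * 20]
this_week = ["bb" * 20, "cc" * 20]

added, removed = diff_dumps(last_week, this_week)
print(len(added), "added,", len(removed), "removed")
```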
edit: The more I think about it, the less useful it sounds.
First, the information about seeders varies constantly, especially with the new torrents.
Also, it STILL depends on a single point of failure - the Pirate Bay itself. If TPB goes down for any reason, I will have no place to scrape this from and it will all fall apart anyway.
Plus, I think Pirate Bay itself should make dumps like this. It would probably be much better for their database anyway :)
Can't use an undefined value as an ARRAY reference at
piratebay_magnet_scrape.pl line 13 (#1)
(F) A value used as either a hard reference or a symbolic reference must
be a defined value. This helps to delurk some insidious errors.
Uncaught exception from user code:
Can't use an undefined value as an ARRAY reference at piratebay_magnet_scrape.pl line 13.
at piratebay_magnet_scrape.pl line 13
main::__ANON__(20697, 0, undef, 0, 0) called at /usr/share/perl5/Parallel/ForkManager.pm line 354
Parallel::ForkManager::on_finish('Parallel::ForkManager=HASH(0x9cd7ac8)', 20697, 0, undef, 0, 0) called at /usr/share/perl5/Parallel/ForkManager.pm line 333
Parallel::ForkManager::wait_one_child('Parallel::ForkManager=HASH(0x9cd7ac8)', undef) called at /usr/share/perl5/Parallel/ForkManager.pm line 285
Parallel::ForkManager::start('Parallel::ForkManager=HASH(0x9cd7ac8)') called at piratebay_magnet_scrape.pl line 27
The IDs are sequential, but there are substantial gaps. Removed spam torrents, most likely.
I think a magnet link is based on the hash of the contents too, so it might be an interesting problem to include the torrent's own magnet link in itself.
If you're really interested, install the Python module bencode, and use it to de-serialize a .torrent file.
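If you'd rather not install anything, the bencode format is small enough to decode by hand. The sketch below uses a minimal decoder of my own rather than the bencode module, and a hand-built toy torrent with made-up values. It also shows the detail that matters for the recursion question: a BitTorrent magnet link (urn:btih) carries the SHA-1 of the bencoded info dictionary, not of the whole .torrent file.

```python
import hashlib

def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value starting at data[i]; return (value, next_index)."""
    lead = data[i:i + 1]
    if lead == b"i":                        # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if lead == b"l":                        # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if lead == b"d":                        # dict: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            result[key], i = bdecode(data, i)
        return result, i + 1
    colon = data.index(b":", i)             # byte string: <length>:<bytes>
    start = colon + 1
    length = int(data[i:colon])
    return data[start:start + length], start + length

def info_hash(torrent: bytes) -> str:
    """SHA-1 (hex) over the raw bencoded `info` dictionary of a .torrent file."""
    if torrent[:1] != b"d":
        raise ValueError("a .torrent file must be a bencoded dictionary")
    i = 1
    while torrent[i:i + 1] != b"e":
        key, i = bdecode(torrent, i)
        _, end = bdecode(torrent, i)
        if key == b"info":
            return hashlib.sha1(torrent[i:end]).hexdigest()
        i = end
    raise ValueError("no info dictionary found")

# Hand-built toy torrent; every value here is made up for illustration.
toy_torrent = (b"d8:announce18:http://example.org4:info"
               b"d6:lengthi5e4:name5:a.txt12:piece lengthi16384e6:pieces20:"
               + b"\x00" * 20 + b"ee")

meta, _ = bdecode(toy_torrent)
print(meta[b"info"][b"name"])
print("magnet:?xt=urn:btih:" + info_hash(toy_torrent))
```

Hashing the raw byte slice instead of re-encoding the decoded dictionary avoids any mismatch if the file was not written in canonical key order.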
The archive is static, the TPB database is dynamic :)
Hope the community doesn't think I've hijacked the thread for my own purposes. I just thought it was an interesting little discussion and wanted to point it out.
nz -cc -m1.8g
Bonus: it's in chronological order.
but not 100%
putting it through sort ordered by tpb's id, which I imagine are assigned in chronological order. The low-numbered torrents seem to be from 2004.
Just one thing: I see you are splitting by |, and some torrents (very few, but some) have | in their name (I didn't bother with escaping that).
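One way around the unescaped | is to split from both ends, since only the name can contain it. A sketch assuming each line looks like id|name|size|seeders|leechers|hash; that layout is my guess for illustration, so adjust the field count to the real dump.

```python
def parse_dump_line(line: str) -> dict:
    """Parse one dump line, tolerating '|' inside the torrent name.

    Assumed layout (hypothetical): id|name|size|seeders|leechers|hash
    """
    # The id is the only field before the name ...
    torrent_id, rest = line.rstrip("\n").split("|", 1)
    # ... and exactly four fields follow it, so split those off from the right.
    name, size, seeders, leechers, info_hash = rest.rsplit("|", 4)
    return {
        "id": int(torrent_id),
        "name": name,
        "size": int(size),
        "seeders": int(seeders),
        "leechers": int(leechers),
        "hash": info_hash,
    }

sample = "123|Some | odd name|1048576|10|2|" + "ab" * 20
print(parse_dump_line(sample)["name"])
```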
but wikipedia has an example of a magnet link like this:
so could we get the magnet link to ALL of the magnet-hashes for ALL torrents on the Pirate Bay down to, what is that, 35 characters plus the magnet cruft "magnet:?xt=urn:sha1:"?
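On the character count: a SHA-1 digest is 20 bytes, so it comes out at 40 characters in hex or 32 in base32, before the magnet prefix. A quick check using the hash quoted earlier in the thread:

```python
import base64

# The 40-character hex hash quoted earlier in this thread.
hex_hash = "938802790a385c49307f34cca4c30f80b03df59c"

raw = bytes.fromhex(hex_hash)                  # the underlying 20-byte digest
b32 = base64.b32encode(raw).decode("ascii")    # same digest, base32-encoded

print(len(raw), len(hex_hash), len(b32))       # 20 40 32
```

20 bytes is 160 bits, and base32 packs 5 bits per character, so the base32 form needs no padding.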
Edit: To clarify, the line/link he posted splits across three rows automatically.
I used to comment on how I had to reboot my computer the other month. Windows users would be like "yeah, that sucks, I hope you had everything backed up ... wait, did you say reload or reboot?"
Alternatively, HN needs to implement a single-comment-only break somehow.
if you know who to mail who can do that go ahead
If we accept that the linked file is "all of the pirate bay" then isn't my comment just as equally "all of the pirate bay"? Haven't I just included "all of the pirate bay" in my comment?
Piracy is far closer to plagiarism, but even then only to a point. In plagiarism, one attempts to pass off the work of another as one's own. In piracy, one simply copies another's work for one's own use. They are fundamentally different.
This is why piracy is as prevalent as it is: it simply is not as bad as plagiarism, let alone theft. Most people have an intuitive understanding of this, and those who pirate do so without the cognitive dissonance that comes with acting against their moral code. It might be "wrong" in an abstract sense, sort of like lying on your resume is "wrong", but it's not wrong in the absolute sense of harming another person's body or property.
Declaring copies of data to be scarce is a convenient convention (for capitalists), because it allows data itself to be sold exactly like the scarce vessels (records, cassettes, books) which used to be sold as a stand-in for the data itself, and like other scarce things (bread, oil).
Without scarcity, capitalism cannot function. You cannot sell air (yet, but see "Spaceballs") because there is air all around so you can't run out of it. You can't sell "fours" because you can create as many fours as you need with a pen and paper or keyboard. However, the capitalists have successfully legislated that certain sequences of data are scarce. If this fictional scarcity of data collapses many people would be unable to make money selling non-scarce "virtual" things like ideas and data, and would have to fall back to selling real things. And they'd be unable to "allocate" these goods to people because everybody could just have whatever they wanted. Very good for people, very bad for capitalists.
Much like the attempt to legislate Pi (a half-true half-myth btw), this 'fiat scarcity' (just coined that) is doomed to ultimate failure. If not in the law, in practice. Like the War on Drugs.