
Magnet-hashes for all torrents on The Pirate Bay: 164 MB - Zirro
http://thepiratebay.se/torrent/7016365/The_whole_Pirate_Bay_magnet_archive
======
chimeracoder
This gets very meta very quickly.

If linking to copyrighted data 'should' be illegal (SOPA), then what about
descriptions of that data that are sufficient to identify the original, but
not reconstruct it (magnet links)?

And if _those_ were made illegal, then what about descriptions of those
descriptions? You can recurse infinitely on this.

Beyond mere amusement, after just one or two recursions, you get to the point
where it would be difficult to write a law that would criminalize magnet links
without also criminalizing people who link to a Sparknotes-like summary or
commentary for a piece of media.

~~~
Cushman
That's what this is already, eh?

A description of copyrighted data would be the torrent file, which content
producers would probably like to argue are infringing.

A magnet link is a hash of the torrent file, so it's already two steps
removed.

Of course, the Pirate Bay magnet dump is _itself_ a torrent, so it's a hash of
a hash of a hash of copyrighted data.

And that torrent itself has a magnet link:
938802790a385c49307f34cca4c30f80b03df59c is a hash of a hash of a hash of a
hash of copyrighted data. (In the MPAA/RIAA's ideal world, I've just committed
criminal copyright infringement with damages reaching into the $billions.)

Theoretically, the Pirate Bay dump could include the torrent for the Pirate
Bay dump, and be an infinitely recursive description of itself... but that's
probably an intractable cryptographic process.
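
A note on the mechanics: in the original BitTorrent format, the `btih` value in a magnet link is the SHA-1 of the bencoded `info` dictionary inside the .torrent file, not of the whole file. Here is a minimal Python sketch of that step. It assumes a toy single-file `info` dictionary whose values are made up for illustration, and `bencode`, `info_hash` and `magnet_for` are hand-rolled helpers, not calls from any library:

```python
import hashlib


def bencode(value):
    """Serialize ints, strings/bytes, lists and dicts as bencode."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Bencode wants dictionary keys as byte strings, in sorted order.
        encoded = {
            (k.encode("utf-8") if isinstance(k, str) else k): v
            for k, v in value.items()
        }
        body = b"".join(bencode(k) + bencode(encoded[k]) for k in sorted(encoded))
        return b"d" + body + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def info_hash(info):
    """Hex SHA-1 of the bencoded info dictionary."""
    return hashlib.sha1(bencode(info)).hexdigest()


def magnet_for(info):
    """Bare magnet URI (no name, no trackers) for an info dictionary."""
    return "magnet:?xt=urn:btih:" + info_hash(info)


# Hypothetical single-file torrent: one 11-byte file that fits in one piece.
toy_info = {
    "name": "example.txt",
    "length": 11,
    "piece length": 16384,
    "pieces": hashlib.sha1(b"hello world").digest(),
}
```

Calling `magnet_for(toy_info)` gives a link whose hash depends only on the `info` dictionary, which is why the same content can be listed with different trackers under the same magnet hash.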

~~~
katovatzschyn
And, continuing back from hexadecimal, we have the splendid:

    842254760070427756125843951997302119090555319708

in base 10 for the magnet link of the torrent containing the list of magnet
links for the pirate bay's torrents. Shall we add to the list of illegal
numbers?
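
For anyone who wants to check the conversion, here is a small Python sketch. The function names are mine and purely illustrative. It turns the 40-digit hex hash into a decimal string and back, so the result can be compared against the figure quoted above:

```python
# Info-hash of the magnet archive, as quoted in the thread.
INFO_HASH_HEX = "938802790a385c49307f34cca4c30f80b03df59c"


def hex_to_decimal(hex_digits):
    """Interpret a hex string as one big integer and print it in base 10."""
    return str(int(hex_digits, 16))


def decimal_to_hex(decimal_digits, width=40):
    """Go back to zero-padded lowercase hex (40 digits for a SHA-1)."""
    return format(int(decimal_digits), "0{}x".format(width))
```

Running `print(hex_to_decimal(INFO_HASH_HEX))` prints the base-10 form, and `decimal_to_hex` recovers the original hash from it, so the two representations carry exactly the same information.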

Perhaps its prime factorization also belongs on that list.

    2^2×3^2×23×89×24733×462111028392156064869667382167578133253

Best to include the Roman numeral version of the number, as it's equally
illegal.

<http://i.imgur.com/JROu8.png>

It would also be wise to outlaw the URL I've just linked, and perhaps also the
combination of letters "JROu8" as they also contain the information in
question given proper context.

<http://en.wikipedia.org/wiki/Illegal_number>

~~~
robot_cowboy
So, going back to hex and turning it into a flag just like the Free Speech
flag in the wiki link, would that make this image:

<http://i.imgur.com/OHg3l.png>

the most copyright-infringing flag of all time?

~~~
Natsu
What will they do if we find a crazy enough way to make, say, the number 5
infringing? And who owns 5, anyhow?

~~~
bwarp
Even worse, they will make pointers illegal.

------
allisfine
Hello, I am the author of the scrape. I did it more to try it, but who knows,
maybe it will be useful to someone.

I went through the description pages like <http://thepiratebay.se/torrent/$i>
by incrementing $i and saving the magnet if Pirate Bay didn't return a 404
error. I went through the pages as a logged-out user, though, so that might be
the reason why I got only 1.5m torrents.

I didn't know pirate bay has hidden porn torrents; there is TONS of porn in
the scrape already.

The script is in Perl; I will post it to pastebin in a moment.

edit: all right, the script itself is here <http://pastebin.com/8RXXthXB>

as you can see, it's not very complicated.
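
For readers who don't want to open the pastebin, the loop described above can be sketched roughly like this in Python. This is not the Perl script itself, only an illustration of the same idea. The regular expression, the function names and the injectable `fetch` parameter are my own assumptions:

```python
import re
import urllib.error
import urllib.request

# Assumed pattern: a 40-hex-digit btih magnet link somewhere in the page HTML.
MAGNET_RE = re.compile(r"magnet:\?xt=urn:btih:[0-9a-fA-F]{40}[^\"'<>\s]*")


def fetch_page(url):
    """Return the page HTML, or None if the server answers 404."""
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            return response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def scrape_magnets(first_id, last_id, fetch=fetch_page,
                   url_template="http://thepiratebay.se/torrent/{}"):
    """Yield (id, magnet) for every ID in the range that has a page."""
    for torrent_id in range(first_id, last_id + 1):
        html = fetch(url_template.format(torrent_id))
        if html is None:
            continue  # gap in the ID sequence (404)
        match = MAGNET_RE.search(html)
        if match:
            yield torrent_id, match.group(0)
```

Passing a different `fetch` function makes it possible to try the loop against saved pages instead of the live site.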

~~~
redthrowaway
I think it's a great idea, and a nifty hack.

It might be an idea to release a diff against this once a week, and write a
quick script to grab it, keeping the list up-to-date.
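
A rough Python sketch of what such a weekly diff could look like, assuming each snapshot is just a list of text lines and that order doesn't matter (both function names are hypothetical):

```python
def snapshot_diff(old_lines, new_lines):
    """Return (added, removed) lines between two snapshots."""
    old, new = set(old_lines), set(new_lines)
    return sorted(new - old), sorted(old - new)


def apply_diff(old_lines, added, removed):
    """Rebuild the newer snapshot from the older one plus a diff."""
    return sorted((set(old_lines) - set(removed)) | set(added))
```

One caveat with this approach: if a line contains fields that change constantly, it shows up as both removed and added, so diffing on the hash alone would probably keep the weekly files smaller.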

~~~
allisfine
I am thinking of releasing new versions once a week and putting the hash of
the torrent of the newest version on some public site. (Say, some twitter
account.)

But it would still be more a proof of concept than anything really useful - the
comments and descriptions ARE important.

edit: The more I think about it, the less useful it sounds.

First, the information about seeders varies constantly, especially with the new
torrents.

Also, it STILL depends on a single point of failure - the Pirate Bay itself. If
TPB is down for any reason, I will have no place to scrape this from and
it will all fall apart anyway.

Plus, I think Pirate Bay itself should make dumps like this. It would probably
be much better for their database anyway :)

~~~
abailin
I like the idea of a weekly twitter update with the master magnet hash. I feel
like the purpose would not be the usefulness of the string of chars, but more
to prove a point.

------
joejohnson
The Pirate Bay front page claims 4.187.907 torrents. But, this 164MB is only
~1.5 million torrents. Is the discrepancy from exclusion of the porn torrents?
I'm guessing this guy's scrape missed them; you have to be logged in to TPB to
see them.

~~~
schiffern
It does contain adult content.

The IDs are sequential, but there are substantial gaps. Removed spam torrents,
most likely.

------
lucb1e
When TPB had to be blocked in the Netherlands and they switched to
recommending magnet links instead of torrents (pretty close after each other),
I thought someone would have done this sooner. But it's here now, and proxies
do their job just fine ^^. (I couldn't load the page directly as it's blocked
here.)

~~~
denysonique
magnet of the magnets:
magnet:?xt=urn:btih:938802790a385c49307f34cca4c30f80b03df59c&dn=The+whole+Pirate+Bay+magnet+archive&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80&tr=udp%3A%2F%2Ftracker.ccc.de%3A80
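
To make the parts of that link easier to see, here is a short Python sketch that pulls a magnet URI apart with the standard library. `parse_magnet` is a hypothetical helper name, and it only handles `urn:btih:` links like the one above:

```python
from urllib.parse import parse_qs


def parse_magnet(uri):
    """Split a btih magnet URI into info-hash, display name and trackers."""
    scheme, _, query = uri.partition("?")
    if scheme != "magnet:":
        raise ValueError("not a magnet URI")
    params = parse_qs(query)  # also decodes %3A, %2F and '+'
    topic = params["xt"][0]
    prefix = "urn:btih:"
    if not topic.startswith(prefix):
        raise ValueError("not a BitTorrent info-hash link")
    return {
        "info_hash": topic[len(prefix):],
        "name": params.get("dn", [None])[0],
        "trackers": params.get("tr", []),
    }
```

Only the `xt` part identifies the content. The `dn` name and the `tr` tracker list are hints for the client.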

~~~
gmaslov
Should be titled "The whole Pirate Bay magnet archive except this torrent" :-)

I think a magnet link is based on the hash of the contents too, so it might be
an interesting problem to include the torrent's own magnet link in itself.

~~~
haakon
Son, in this house we respect Russell's paradox!

------
devindotcom
Fun discussion here, guys. I've posted my thoughts on it at TechCrunch here:
<http://techcrunch.com/2012/02/08/is-a-hash-of-hash-of-a-torrent-of-a-torrent-of-copyrighted-data-copyrighted/>

Hope the community doesn't think I've hijacked the thread for my own purposes.
I just thought it was an interesting little discussion and wanted to point it
out.

------
cabirum
I wonder if some self-updating mechanism could be implemented in magnet links.
Something like an additional signature part in the magnet URL, so the owner
could inform other peers that the content has changed and needs to be updated.

~~~
ak2012
how can this happen i wonder

------
haakon
Got it down to 70 MB with lrzip. Any better?

~~~
cinch
68MB with "xz -9 --extreme" :P

~~~
schiffern
70841044 bytes (67.56 MB) by running it through "sort" first.

Bonus: it's in chronological order.

~~~
allisfine
it's in sort-of chronological order originally (partially ordered is the
correct term, I guess?)

but not 100%

~~~
schiffern
I don't know how it was ordered originally. I imagine just whatever order the
scraper returned data.

Putting it through sort ordered it by TPB's ID, which I imagine is assigned in
chronological order. The low-numbered torrents seem to be from 2004.
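
If anyone wants to repeat the experiment without the command line, here is a rough Python sketch. It assumes the torrent ID is the first `|`-separated field on each line, which I have not checked against the dump. It also sorts numerically, whereas plain `sort` compares lines as text, so the ordering (and the compressed size) may differ slightly from the numbers above:

```python
import lzma


def sort_by_id(lines):
    """Order dump lines by their leading numeric ID (assumed first field)."""
    return sorted(lines, key=lambda line: int(line.split(b"|", 1)[0]))


def xz_size(lines):
    """Size in bytes after xz compression at the most aggressive preset."""
    data = b"\n".join(lines) + b"\n"
    return len(lzma.compress(data, preset=9 | lzma.PRESET_EXTREME))
```

Comparing `xz_size(lines)` with `xz_size(sort_by_id(lines))` on the real file would show how much the ordering helps. The likely reason is that neighbouring IDs tend to share prefixes and similar names.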

------
Bogdanp
I wrote <https://github.com/Bogdanp/Pirate> as a fun little exercise and
thought some of you might find it useful.

~~~
allisfine
Great!

Just one thing: I see you are splitting by |, and some torrents (very few, but
some) have | in their name (I didn't bother with escaping that).

~~~
Bogdanp
I've accounted for that by grabbing all the other fields before grabbing the
title. `python pirate.py -l | grep '|'` seems to yield correct results :).
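
The general trick can be sketched like this in Python. This is not the code from pirate.py. It assumes a line layout of `id|title|size|seeders|leechers|hash`, which is my guess at the format, and the field names are hypothetical:

```python
def parse_line(line):
    """Parse one dump line, tolerating '|' characters inside the title."""
    # Peel the ID off the left, then the four fixed fields off the right;
    # whatever remains in the middle is the title, pipes and all.
    torrent_id, rest = line.rstrip("\n").split("|", 1)
    title, size, seeders, leechers, info_hash = rest.rsplit("|", 4)
    return {
        "id": int(torrent_id),
        "title": title,
        "size": int(size),
        "seeders": int(seeders),
        "leechers": int(leechers),
        "info_hash": info_hash,
    }
```

Because the split works inward from both ends, any number of `|` characters in the title ends up in the `title` field and nowhere else.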

<https://github.com/Bogdanp/Pirate/blob/master/pirate.py#L27>

------
yason
I was inspired to _steal and pirate the above magnet link_ into this quick
vigilantist internet liberation site: <http://yason.kapsi.fi/piratebay.html>.
I would be positively surprised to catch the interest of even a single MAFIAA
party, though.

------
tiku
I'm in Holland on Ziggo, so can't see it.. :( but good news!

~~~
vidarh
Does www.br3in.nl work to access it?

~~~
c1sc0
I didn't know about that one yet, but that's hilarious!

------
xpose2000
Awesome idea. Magnet links will be around as the standard for some time it
seems :)

------
its_so_on
sorry if i'm being daft, but can't you get this down to a magnet link itself?
The page linked does just that:
magnet:?xt=urn:btih:938802790a385c49307f34cca4c30f80b03df59c&dn=The+whole+Pirate+Bay+magnet+archive&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80&tr=udp%3A%2F%2Ftracker.ccc.de%3A80

but wikipedia has an example of a magnet link like this:

magnet:?xt=urn:sha1:YNCKHTQCWBTRNJIV4WNAE52SJUQCZO5C
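
A side note on the two links above: the 40-character hex string and the 32-character string in the Wikipedia example are both ways of writing a 160-bit SHA-1 value, in hex and in base32 respectively. As far as I know, BitTorrent clients are also expected to accept a base32-encoded `btih`. A small Python sketch of the conversion (the function names are illustrative):

```python
import base64


def hex_to_base32(info_hash_hex):
    """Re-encode a 40-digit hex SHA-1 as 32 base32 characters."""
    return base64.b32encode(bytes.fromhex(info_hash_hex)).decode("ascii")


def base32_to_hex(info_hash_b32):
    """Inverse conversion, back to lowercase hex."""
    return base64.b32decode(info_hash_b32).hex()
```

Since 20 bytes is exactly 32 groups of 5 bits, the base32 form needs no `=` padding, so re-encoding saves 8 characters over hex.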

so could we get the magnet link to ALL of the magnet-hashes for ALL torrents
on the Pirate Bay down to, what is that, 35 characters plus the magnet cruft
"magnet:?xt=urn:sha1:"?

~~~
blhack
Hi, would you mind editing your post to remove the really really long string
of text? It's breaking the page.

~~~
Zirro
Which browser are you using? I am not seeing this no matter how much I resize
the window. (I use Nightly, Firefox alpha)

Edit: To clarify, the line/link he posted splits across three rows
automatically.

~~~
SquareWheel
Works fine for me on Chrome Stable x64.

~~~
Lazare
It breaks the page on Chrome 16.0.912.77m on Windows 7 x64.

~~~
SquareWheel
The stable build is on 17.0.963.46 m. Been a while since you've restarted your
browser?

~~~
abhaga
Breaks with Chrome 18.0.1017.2.dev on Lucid as well.

~~~
xp84
Still broken on Chrome 28.

------
mahannay
Maybe I'm being a bit black-and-white on this, but while the meta and the
philosophy are interesting to talk about, no one is mentioning the morality.
Stealing is wrong. You're taking someone else's work and not compensating them
for it. I think it's sad that we're all so worried about the law when, in
reality, you shouldn't pirate music for the same reason you don't steal a
candy bar from the grocery store or snag 5 dollars out of your coworker's
wallet or hack into Dropbox to get extra storage for free. It's wrong.

~~~
redthrowaway
I'm getting a bit sick of the 'piracy is theft' nonsense. It isn't. Nobody is
deprived of their possession. Me copying your song doesn't result in you no
longer having a song.

Piracy is far closer to plagiarism, but even then only to a point. In
plagiarism, one attempts to pass off the work of another as one's own. In
piracy, one simply copies another's work for one's own use. They are
fundamentally different.

This is why piracy is as prevalent as it is: it simply is not as bad as
plagiarism, let alone theft. Most people have an intuitive understanding of
this, and those who pirate do so without the cognitive dissonance that comes
with acting against their moral code. It might be "wrong" in an abstract
sense, sort of like lying on your resume is "wrong", but it's not wrong in the
absolute sense of harming another person's body or property.

~~~
xp84
The real problem is that piracy is an attack on our capitalist society's
framework for handling a new type of good--a non-scarce good. Copies of data
are not scarce, yet society has decided to declare them to be scarce, by law,
on pain of jail time.

Declaring copies of data to be scarce is a convenient convention (for
capitalists), because it allows data itself to be sold exactly like the scarce
vessels (records, cassettes, books) which used to be sold as a stand-in for
the data itself, and like other scarce things (bread, oil).

Without scarcity, capitalism cannot function. You cannot sell air (yet, but
see "Spaceballs") because there is air all around so you can't run out of it.
You can't sell "fours" because you can create as many fours as you need with a
pen and paper or keyboard. However, the capitalists have successfully
legislated that certain sequences of data are scarce. If this fictional
scarcity of data collapses many people would be unable to make money selling
non-scarce "virtual" things like ideas and data, and would have to fall back
to selling real things. And they'd be unable to "allocate" these goods to
people because everybody could just have whatever they wanted. Very good for
people, very bad for capitalists.

Much like the attempt to legislate Pi (a half-true half-myth btw), this 'fiat
scarcity' (just coined that) is doomed to ultimate failure. If not in the law,
in practice. Like the War on Drugs.

