Internet Archive Starts Seeding 1,398,875 Torrents (torrentfreak.com)
203 points by vibrunazo on Aug 8, 2012 | 40 comments

This is among the most brilliant digital preservation strategies I've encountered. You want to make sure your material lasts as long as possible? Get it out on bittorrent. End of story.

We wonder what books scholars will write about 500 years from now. It won't be what's popular, it'll be what's pirated.

Er... bittorrent is certainly not the end of the story. Anyone who's ever used a torrent tracker knows that access to data on a torrent network is only as good as the network's collective will to share data.

If everyone just hops on, downloads the data, and disconnects without continuing to serve as a source, then the Internet Archive wouldn't be much better off than with plain direct downloads (though they'd still get the benefit of leechers sharing bits with each other during the download process).


Well, we'll see.

I do not understand your last comment? None of the material on archive.org is pirated. It is all freely available, legally...

He does not mean it as a comment towards archive.org, he means information in general. With increasing threats to free speech and censorship, he suggests that the information that's not popular - or, censored/manipulated/bubbled by mass media/governments/etc. - can only survive through decentralized means of storage and distribution, like BitTorrent.

In essence, he argues that BitTorrent can be a fantastic way of digital preservation, which is the goal of archive.org, and I suspect he uses the word 'pirated' interchangeably with 'torrented', since that's what BitTorrent is commonly associated with.

Bittorrent and other decentralized systems are not good choices for unpopular data. As another commenter pointed out, unpopular torrents will only be seeded by archive.org.

I'd argue that bittorrent is still better for unpopular data than direct downloads.

I think BT makes downloading and sharing accessible enough that more people would be encouraged to download more than they normally would. And it is certainly easier for someone to see a reseed request for an unpopular torrent and load it up onto their tracker for a few hours than it is to try to rehost it.

Surely if one other person is downloading the same torrent at the same time it's better (potentially) than a direct download, as part of the torrent can be shared in the swarm.

As long as archive.org is seeding everything, all the bits are there.

Something like Freenet would be better for this I think, since the "what" you share is limited by how much space you give the datastore, unlike bittorrent where you consciously decide what and what not to keep seeding.

It still has a problem with non-popular content being hard to find, but not nearly as much.

That only works for popular stuff, or at least only as long as someone is seeding it. Once no one seeds something, it's gone.

Glad to see more legitimate uses of bittorrent, which currently are basically getting new Linux distros and the odd free indie movie.

Blizzard also uses bittorrent to distribute their games and all patches for their games. I'm not sure about other game companies, but I'd be surprised if Blizzard was the only company doing this.

League of Legends also uses it in their download client.

I bet you use bittorrent more than you realize. I know of at least several content distributors that package their own bittorrent client in their "downloader" (think of what happens when you install Chrome).

Chrome uses BitTorrent in its installer?

I'm not sure. I mean the types of installers where it's "download our downloader", like what Chrome does.

LibreOffice uses torrents too.

This. I love the concept of peer to peer services like bittorrent. I just wish they were used more often for legitimate downloads.

The Internet Archive is the most important project on the internet today, preserving culture. I've already gained so much from it myself. They work hard to make sure that the artifacts of our culture are not lost. They are totally winning, and these torrents are an awesome step in that process.

Agreed. In particular I love their public domain movie archive. It is a crime that with extended copyrights new films won't be added to it any time soon.

are people aware that the internet archive includes netlabels with a huge amount of free music? http://archive.org/details/netlabels (there is some decent chilean electronica at http://archive.org/details/pueblo_nuevo for example; clinical have experimental jazz http://archive.org/details/clinicalarchives (some of it is a bit freaky for me, but their collections are ok))

trying to work out if it's available via torrent now...

Yes! A lot of the audio is available via torrent now [1].

"Decent chilean electronica" nice, but the Internet Archive has an impressive collection of live music; you can thank the Grateful Dead for letting people tape their concerts [2].

[1] http://archive.org/details/bittorrentaudio

[2] http://blog.archive.org/2012/03/27/sharing-works-100000-conc...

But I won't thank GD for stream only soundboards:( (1) I'm glad I saved all my crispy SBDs from the glory days of bt.etree.org.

My recommendation for awesome singer/songwriter with a lot of content (SBDs) on archive.org, Danny Schmidt(2). If you live near Austin you should definitely try and see him live. When Danny was asked if it was okay to put his material on archive.org he said:

"Sure, I'm fine with your posting the recording on there. Do I need to do anything to formally give my permission -- or is this email enough?

Thanks for thinking of me -- and thanks for helping spread my music around to new ears. I really appreciate that. And thanks for the heads up about Archive.org.

All the best -- Danny"

(1) http://archive.org/about/faqs.php#245

(2) http://archive.org/details/DannySchmidt

i'll tell my grandpa about the dead stuff, thanks. meanwhile, how can you tell if a particular track is on a torrent? or how can you search for files on torrents?

ah, ok, so if it's a torrent it's listed as such in the download links.

and it's the etree collection (the concerts) that seems to be what is available. you can search with "collection:etree" in the search. so "bluegrass AND collection:etree", for example.
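
For anyone who wants to script that search rather than type it into the site, here is a rough sketch that only builds the query URL (it makes no network request). The endpoint path and the `q`, `fl[]`, `rows` and `output` parameters are assumptions about archive.org's advanced search interface, and `build_search_url` is a made-up helper name, so check the current documentation before relying on it.

```python
from urllib.parse import urlencode

# Assumption: archive.org's advanced search lives at this path and accepts
# the q / fl[] / rows / output parameters used below.
SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def build_search_url(terms, collection, rows=50):
    """Return a search URL for `terms` restricted to one collection.

    Hypothetical helper: it mirrors the "bluegrass AND collection:etree"
    syntax from the comment above and does not contact the server.
    """
    query = f"{terms} AND collection:{collection}"
    params = [
        ("q", query),
        ("fl[]", "identifier"),  # fields to return (assumed names)
        ("fl[]", "title"),
        ("rows", rows),
        ("output", "json"),
    ]
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


url = build_search_url("bluegrass", "etree")
print(url)
```

The query string is URL-encoded, so the space and colon in "bluegrass AND collection:etree" are escaped automatically.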

If you were not such a good grandson I'd downvote you for making me feel old with the grandpa comment:) While you are at it make sure you thank him for being such a good music role model. I'm 32 years old, no kids or grandkids and still love listening to Jerry...

if there's something you want torrentized that's not, feel free to post here too and I can poke it.

edit: assuming it's in one of the collections we're currently testing on. (all the opensource_* ones) (etree and netlabels _should_ be all torrentized [we hope])

hi, ok, so here's an example (taken at random from the pueblo nuevo archive i linked to above) - http://archive.org/details/pn014

i don't particularly want that album (i probably already have it thanks!) but it doesn't mention a torrent that i can see (the etree files have torrent in the links). and that's the same for all netlabel files (not many) that i have checked.

in comparison, http://archive.org/details/SPBB2006-04-29 (bluegrass etree!) does have a bittorrent link (bottom right).

i don't understand your opensource_ comment, but pueblo nuevo is a netlabel. so it seems to me that etree have torrents, but netlabels do not.

(this is just in case it helps / it's likely i am confused / no criticism intended / thanks for doing all this)

Thanks for linking to the electronica. I've been looking for some new stuff.

Right now the most popular archive.org torrent is a collection of My Little Pony porn from /r/clopclop.

I'm genuinely impressed that we, as a society, have already managed to create a 2.31 GB collection of My Little Pony porn. According to Wikipedia, the first episode aired 668 days ago, so that's an average of 3.7 MB per day. Imagine, if you will, a 343 baud modem continuously sending cartoon porn of talking ponies who are friends. This is one of our species' more embarrassing amazing achievements.
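
The arithmetic in that comment can be checked with a few lines. This is a sketch under two assumptions that the comment does not state: that "2.31 GB" means binary gigabytes (GiB) while "MB" means decimal megabytes, and that the modem carries one bit per baud.

```python
SIZE_GIB = 2.31          # collection size quoted in the comment
DAYS = 668               # days since the first episode, per the comment
SECONDS_PER_DAY = 86_400

# Assumption: "GB" is binary (2**30 bytes) and "MB" is decimal (10**6 bytes).
size_bytes = SIZE_GIB * 2**30
bytes_per_day = size_bytes / DAYS
mb_per_day = bytes_per_day / 1e6

# Line rate from the unrounded daily figure.
bits_per_second = bytes_per_day * 8 / SECONDS_PER_DAY

# Line rate if the daily figure is rounded to 3.7 MB first,
# which is what reproduces the quoted 343.
bits_per_second_rounded = 3.7e6 * 8 / SECONDS_PER_DAY

print(f"{mb_per_day:.2f} MB/day")
print(f"{bits_per_second:.1f} bit/s from the unrounded figure")
print(f"{bits_per_second_rounded:.1f} bit/s from 3.7 MB/day")
```

Under those assumptions the 3.7 MB per day figure holds, and the 343 figure follows from rounding to 3.7 MB before converting; the unrounded rate comes out roughly one bit per second higher.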

It's two of the top three in the most snatched list, though!

archive.org is one of the most brilliant things I've ever seen on the internet. I've been amazed ever since I first stumbled upon it about 12 years ago. Free knowledge for everyone, that's the way to go to make the world a better place.

Just to be clear, the top shares right now are Asimov's Foundation trilogy (always a good read), a bunch of films, and My Little Pony pornography.

"Free knowledge to everyone" -- we're not quite there yet, and I would stick with the various OpenCourseWare-type free-university-education initiatives. The Internet Archive as I understand it is aiming more to be a library than a synthetic learning resource.

My Little Pony pornography? What the ...?

I was referring to knowledge not as in education but as in the corpus of all media. For learning resources universities will almost always be a better bet. That's what they're for after all (and research, obviously).

"My Little Pony pornography? What the ...?"

Rule 34 of the Internet. If it exists, there will be porn of it. In this case, the new series My Little Pony: Friendship is Magic has received a lot of attention due to Bronies, who are men and women outside of the target audience, yet fans of the show. A minority of them create and consume this kind of content.

Awesome! Though I wish archive.org was redesigned to make it easier to browse files. Too much text, too few video/image previews.

There should be a formal way of coupling a torrent file with a bitcoin address, some way of donating money to the creator while trusting that the address is the correct one. Maybe when content is first released it's registered somewhere alongside a bitcoin address, and this shows up on the creator's website too.

Does anybody have a magnet URI? The site is overloaded right now.
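
For context, a magnet URI is just the torrent's info hash plus optional hints, so anyone who already has the .torrent can produce one. Below is a minimal sketch of the usual `xt` / `dn` / `tr` layout; the hash, name and tracker are placeholders, not a real archive.org torrent, and `make_magnet` is a hypothetical helper.

```python
from urllib.parse import quote

HEX_DIGITS = set("0123456789abcdefABCDEF")


def make_magnet(info_hash_hex, name=None, trackers=()):
    """Build a magnet URI from a 40-character hex SHA-1 info hash.

    Hypothetical helper: `name` becomes the dn (display name) field and
    each tracker URL becomes a tr field.
    """
    if len(info_hash_hex) != 40 or not set(info_hash_hex) <= HEX_DIGITS:
        raise ValueError("info hash must be 40 hex characters")
    parts = [f"xt=urn:btih:{info_hash_hex.lower()}"]
    if name:
        parts.append("dn=" + quote(name, safe=""))
    for tracker in trackers:
        parts.append("tr=" + quote(tracker, safe=""))
    return "magnet:?" + "&".join(parts)


# Placeholder values for illustration only.
magnet = make_magnet(
    "0123456789ABCDEF0123456789abcdef01234567",
    name="example item",
    trackers=["http://tracker.example.org/announce"],
)
print(magnet)
```

A client given only the `xt` part still has to find peers some other way (for example through a DHT), which is why tracker hints are usually included.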

Are there any good ways to contribute bandwidth and storage to this without manually choosing and downloading a bunch of files? I would love to dedicate a few hundred GB of storage to mirroring some of these files automatically (I have plenty of bandwidth to go with that too).

