Download a Fresh Copy of The Pirate Bay, With Permission

glogla · on Feb 20, 2013

Great work! I have some questions if you don't mind.

I kind of wonder why you chose XML, and the page doesn't really elaborate on that much. Wouldn't CSV be much smaller (XML is a pretty noisy format, though 7z can probably take care of that) while at the same time more searchable? You can search CSV with just grep, while with XML it gets a little more difficult, and CSV is also simpler to load into a database in case someone wanted to make a TPB web mirror or something.

EDIT: sorry if I seem annoying :) but I had something like this on my todo list for a while, and you did my work for me, so I kind of wonder why you made different decisions.

runn1ng · on Feb 20, 2013

Well, it could be in CSV, yes, but

- I was saving the comments, too, so right now, a torrent element has comment elements as sub-elements; in CSV it would probably need to be two tables/files instead of one, which would get a little more complicated

- I didn't want to think so much about escaping newlines (that are in the comments and infos) and the delimiters, right now I was only escaping < to &lt; and > to &gt;.

- It was easier to check whether the script is working correctly or not (probably the top reason :) ).

- I thought parsing XML would be easier than parsing some other format, since there are tools already available for that.

But well, if somebody really wants a CSV version, they can easily convert it from the XML...
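A minimal sketch of that conversion, splitting the data into two CSV files as described in the first bullet. The element names (`torrents`, `torrent`, `id`, `title`, `comment`) are assumptions for illustration, since the dump's actual schema isn't shown in this thread. Python's `csv` module quotes embedded newlines and delimiters on its own, so the escaping worry from the second bullet is handled by the writer:

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; the real dump's element names may differ.
SAMPLE = """<torrents>
  <torrent>
    <id>1</id>
    <title>Example, with a comma</title>
    <comments>
      <comment>first line
second line</comment>
      <comment>nice &amp; fast</comment>
    </comments>
  </torrent>
</torrents>"""


def xml_to_csv(xml_text):
    """Return (torrents_csv, comments_csv) as strings."""
    root = ET.fromstring(xml_text)
    torrents_out = io.StringIO()
    comments_out = io.StringIO()
    torrent_writer = csv.writer(torrents_out)
    comment_writer = csv.writer(comments_out)
    torrent_writer.writerow(["id", "title"])
    comment_writer.writerow(["torrent_id", "comment"])
    for torrent in root.iter("torrent"):
        torrent_id = torrent.findtext("id")
        torrent_writer.writerow([torrent_id, torrent.findtext("title")])
        # One row per comment, linked back to its torrent by id.
        for comment in torrent.iter("comment"):
            comment_writer.writerow([torrent_id, comment.text])
    return torrents_out.getvalue(), comments_out.getvalue()


torrents_csv, comments_csv = xml_to_csv(SAMPLE)
print(torrents_csv)
print(comments_csv)
```

For a dump of this size, `ET.iterparse` would probably be a better fit than loading the whole file with `ET.fromstring`, but the row-writing logic would stay the same.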

ad edit: no problem.

jerf · on Feb 20, 2013

"I didn't want to think so much about escaping newlines (that are in the comments and infos) and the delimiters, right now I was only escaping < to &lt; and > to &gt;."

Then at the very least you need & -> &amp; too.

Alternatively, wrap the whole comment in CDATA, though don't forget to replace ]]> with ]]]]><![CDATA[> or something like that so you don't get spurious CDATA closures. (There may not be any in there now, but there will be once people hear about you doing this...)
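Both options can be sketched in a few lines of Python. This is only an illustration, not the script used for the dump: `escape_text` and `wrap_cdata` are hypothetical helper names, and the CDATA variant uses the common trick of splitting each `]]>` across two adjacent CDATA sections.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def escape_text(text):
    # escape() replaces & first, then < and >, so existing
    # ampersands are not double-processed.
    return escape(text)


def wrap_cdata(text):
    # Split every "]]>" so the terminator never appears inside a
    # single CDATA section: "]]" ends one section, ">" starts the next.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


comment = "Tom & Jerry <3 ]]> done"

# Both encodings should parse back to the original comment text.
escaped_doc = "<comment>" + escape_text(comment) + "</comment>"
cdata_doc = "<comment>" + wrap_cdata(comment) + "</comment>"
assert ET.fromstring(escaped_doc).text == comment
assert ET.fromstring(cdata_doc).text == comment
```

Either way, a round trip through a real XML parser is a cheap check that nothing in the comments can break the file.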

runn1ng · on Feb 20, 2013

Oh. Thanks for pointing that out, I think I didn't escape & -> &amp;.

I hope it won't cause trouble for anyone.

glogla · on Feb 20, 2013

Ah, that makes sense. I was looking mostly at the poor file, so I haven't considered comments and such.

runn1ng · on Feb 20, 2013

my original submission

http://news.ycombinator.com/item?id=5244196

t0 · on Feb 20, 2013

Thank you for doing this.

runn1ng · on Feb 20, 2013

it's more for fun than anything

simias · on Feb 20, 2013

It's not very useful given that there are no comments on your "original submission".

nitrogen · on Feb 20, 2013

Look at his profile; he's the creator of the original submission.

simias · on Feb 20, 2013

Ooooh, I missed that. He should have linked his blog directly then, not the empty comment section. But point taken.

felipebueno · on Feb 20, 2013

It would be very cool (and useful) if they put that in some kind of version control (git, svn, etc.).

islon · on Feb 20, 2013

I second that. This way, if The Pirate Bay goes down, everyone will have the latest version of the site, not a version from 3, 4, or 5 months ago.

mwfj · on Feb 20, 2013

I think it will only go down if the people keeping it running stop bothering.

They have stable network access covered via the Swedish Pirate Party. This party (which has two of Sweden's twenty seats in the European Parliament) has, by doing solid work, won a lot of legitimacy in the media over the past two years or so, I think.

felipebueno · on Feb 20, 2013

Yeah, think of the hundreds or even thousands of up-to-date mirrors we could have if they did something like that.