Download a Fresh Copy of The Pirate Bay, With Permission (torrentfreak.com)
31 points by Libertatea on Feb 20, 2013 | 15 comments



Great work! I have some questions if you don't mind.

I kind of wonder why you chose XML, and the page doesn't really elaborate on that much. Wouldn't CSV be much smaller (XML is a pretty noisy format, though 7z can probably take care of that) while at the same time more searchable? You can search CSV with just grep; with XML it gets a little more difficult. CSV is also simpler to load into a database in case someone wanted to make a TPB web mirror or something.

EDIT: sorry if I seem annoying :) but I've had something like this on my todo list for a while, and you did my work for me, so I kind of wonder why you made different decisions.


Well, it could be in CSV, yes, but

- I was saving the comments, too, so right now a torrent element has comment elements as sub-elements; in CSV it would probably need to be two tables/files instead of one, which would get a little more complicated

- I didn't want to think so much about escaping newlines (that are in the comments and infos) and the delimiters, right now I was only escaping < to &lt; and > to &gt;.

- It was easier to check whether the script is working correctly or not (probably the top reason :) ).

- I thought parsing XML would be easier than parsing some other format, since there are tools already available for that.

But well, if somebody really wants a CSV version, they can easily convert it from the XML...
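For anyone who does want the CSV version, here is a minimal sketch of the two-file split mentioned above, using only the Python standard library. The element names (`torrents`, `torrent`, `id`, `title`, `comment`) and the sample data are assumptions made for illustration; the real dump may use a different layout. Note that the `csv` module quotes embedded newlines and delimiters on its own, so no hand-written escaping is needed.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; the real dump's element names may differ.
SAMPLE = """<torrents>
  <torrent>
    <id>1</id>
    <title>Example, with comma</title>
    <comment>first line
second line</comment>
    <comment>a &lt;b&gt; tag</comment>
  </torrent>
  <torrent>
    <id>2</id>
    <title>No comments here</title>
  </torrent>
</torrents>"""


def xml_to_csv(xml_text):
    """Return (torrents_csv, comments_csv) as two CSV strings."""
    root = ET.fromstring(xml_text)
    # newline="" leaves the csv module in charge of line endings.
    torrents_out = io.StringIO(newline="")
    comments_out = io.StringIO(newline="")
    torrents = csv.writer(torrents_out)
    comments = csv.writer(comments_out)
    torrents.writerow(["id", "title"])
    comments.writerow(["torrent_id", "comment"])
    for torrent in root.iter("torrent"):
        torrent_id = torrent.findtext("id")
        torrents.writerow([torrent_id, torrent.findtext("title")])
        # Each comment becomes a row keyed by its parent torrent's id.
        for comment in torrent.findall("comment"):
            comments.writerow([torrent_id, comment.text or ""])
    return torrents_out.getvalue(), comments_out.getvalue()


torrents_csv, comments_csv = xml_to_csv(SAMPLE)
```

The comments file carries the torrent id as a foreign key, so the two files load into a database as two tables that join on that column.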

ad edit: no problem.


"I didn't want to think so much about escaping newlines (that are in the comments and infos) and the delimiters, right now I was only escaping < to &lt; and > to &gt;."

Then at the very least you need & -> &amp; too.

Alternatively, wrap the whole comment in CDATA, though don't forget to replace ]]> with ]]]]><![CDATA[> or something like that so you don't get spurious CDATA closures. (There may not be any in there now, but there will be once people hear about you doing this...)
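A rough sketch of both options in Python, in case it helps. The function names `escape_text` and `wrap_cdata` are made up for this example, and nothing here reflects how the actual dump script works. `xml.sax.saxutils.escape` is the standard-library helper that handles `&`, `<` and `>` together.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def escape_text(text):
    """Escape &, < and > for use as XML element content."""
    # escape() replaces & first, so the entities it produces
    # are not escaped a second time.
    return escape(text)


def wrap_cdata(text):
    """Wrap text in CDATA, splitting any "]]>" across two sections."""
    # "]]" closes one section and ">" opens the next, so the
    # terminator never appears inside a single CDATA section.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# Round-trip check: both forms should parse back to the original text.
tricky = "x ]]> y & <z>"
via_escape = ET.fromstring("<comment>" + escape_text(tricky) + "</comment>").text
via_cdata = ET.fromstring("<comment>" + wrap_cdata(tricky) + "</comment>").text
```

Parsing the output back and comparing it to the input, as in the last lines, is a cheap way to check that a dump script escapes correctly.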


Oh. Thanks for pointing that out, I think I didn't escape & -> &amp;.

I hope it won't cause trouble for anyone.


Ah, that makes sense. I was looking mostly at the poor file, so I hadn't considered comments and such.



Thank you for doing this.


it's more for fun than anything


It's not very useful given that there are no comments on your "original submission".


Look at his profile; he's the creator of the original submission.


Ooooh, I missed that. He should have linked his blog directly then, not the empty comment section. But point taken.


It would be very cool (and useful) if they put that in some kind of version control (git, svn, etc.).


I second that. This way, if The Pirate Bay went down, everyone would have the latest version of the site, not a 3/4/5-month-old one.


I think it will only go down if the people keeping it running stop bothering.

They have stable network access covered via the Swedish Pirate Party. This party (which has two of Sweden's twenty seats in the European Parliament) has, by doing solid work, won a lot of legitimacy in the media over the past two years or so, I think.


Yeah, think of the hundreds or even thousands of up-to-date mirrors we could have if they did something like that.



