Archival of 'cloud' data is an issue I've been thinking a lot about lately. People are putting so much of their lives on the internet these days, generally without giving much thought to permanence and availability. Twitter has only been around a few years and people are already running into retention issues. The tweet from the article will still be valuable 30 years from now, but will Twitter even be around then? Will all the tweets from the current system have been migrated to whatever tools we're using in the future?

I've been playing around with a project to locally archive a bunch of data sources that interest me (email, instant messaging logs, Twitter, SMS, some blog and social news comments) in a straightforward and open data format. Unfortunately this type of tool might be something that most people don't realize they need until it's too late.
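
As a rough illustration of what a "straightforward and open data format" could look like, here is a minimal sketch that appends each archived item as one JSON object per line (JSON Lines) to a plain text file. The function names (`archive_record`, `load_archive`) and the record fields are hypothetical, chosen only for this example; they are not taken from the project described above, and fetching the data from each service is left out entirely.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def archive_record(archive_path, source, author, text, timestamp=None):
    """Append one item to a JSON Lines archive and return the stored record.

    The field names are illustrative assumptions, not a standard schema.
    """
    record = {
        "source": source,  # e.g. "twitter", "sms", "email"
        "author": author,
        "text": text,
        # Default to the current UTC time when no timestamp is supplied.
        "timestamp": (timestamp or datetime.now(timezone.utc)).isoformat(),
    }
    path = Path(archive_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Append-only writes: existing records are never rewritten.
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def load_archive(archive_path):
    """Read every record back from the archive, skipping blank lines."""
    with Path(archive_path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

The appeal of something like this is that the archive stays readable with any text editor or JSON parser, so it should not depend on any one tool still being around.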

It's not local, not in an open data format, and not free, but you might like http://www.backupify.com/. They have a nice list of services that they can back up for you (I haven't used it, so I don't know how good it is in practice).

Backupify works, but the formatting is odd. Your Twitter stream is packaged as a PDF book.

