There are some serious problems with their Picasa export (possibly others, I only tried Picasa). First, a good number of my photos simply didn't download - I only got 1328 of my 2283 photos. Second, I uploaded all my photos at their native sizes to Picasa, but the photos I got from Takeout were resized to around 1200x1600. However, my originals are kept by Google and I can download them individually through the Picasa interface, so they must be resizing them specifically for the download.
The first strikes me as a bug that will eventually be fixed, but the second seems to be an unfortunate design choice - Google can save money on bandwidth if they downsize your photos, and if you know that what you get when you export is downsized you're less likely to do it. I feel sorry, though, for people whose only copy of some photos is in Picasa.
Is the amount of bandwidth saved by Google in doing this really going to be all that cost efficient? I would chalk this up to a bug or poor design choice as well rather than something with the bottom-line in mind.
There are things I would _really_ like from google - where I fit in to their adsense categories, for example. (I'm not sure whether this data is anonymised or not)
I just downloaded this data, all the data they would give me. I got my email address and my name back. I also got a handful of contacts. I consider myself a reasonably heavy google[-owned projects] user.
This, for the most part, is useless information.
What I would like is exactly what they would give to the government (obviously, after confirming fifty times that it's actually me they're giving it to).
Regardless of all that, I think this should boost Google's image. Very smart to focus on social privacy, when that's Facebook's one downfall.
> There are things I would _really_ like from google - where I fit in to their adsense categories, for example.
Absolutely. I'd love to be able to get all of my Analytics data, for example. Retrieving some subset of the data report by report is nowhere near the same thing.
This is possible via the Google Apps Email Audit API[1], which is accessible only to paid Google Apps accounts. An export request submitted through the API usually takes several hours to complete and results in a series of large mbox files (1 GB chunks, I think) containing the entire mailbox, including trash (if requested). Some meta information such as labels/folders is not present, but the results can easily be concatenated into a single mbox file containing the entire contents of a Google Apps email account.
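For what it's worth, the concatenation step can be sketched with Python's standard `mailbox` module. This is only a sketch: it assumes the chunks are already plain mbox files on disk, and the function name and any file names you pass in are made up for illustration.

```python
import mailbox


def concat_mbox(chunk_paths, out_path):
    """Append every message from each mbox chunk to out_path; return the count."""
    out = mailbox.mbox(out_path)  # created if it does not exist yet
    count = 0
    try:
        for path in chunk_paths:
            # create=False: fail loudly if a chunk path is wrong
            chunk = mailbox.mbox(path, create=False)
            for msg in chunk:
                out.add(msg)
                count += 1
            chunk.close()
        out.flush()  # write pending messages to disk
    finally:
        out.close()
    return count
```

Going through `mailbox` instead of plain `cat` means each message is re-parsed and re-written, which is slower but avoids gluing together two chunks that lack a trailing blank line.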
Seriously, it's an absolutely perfect match for exporting everything, and it's easy to set up, and you get it in the format of your choice. Infinitely easier than having them support only formats they have time to write exporters for. It's also easily resumable, unlike the 8GB zip file I'd otherwise have to download.
IMAP can be used to download the raw messages. Hierarchical structure (aka threading) comes from parsing the Message-ID, References, and In-Reply-To headers contained in each message.
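A rough sketch of that header parsing, assuming you already have the raw messages as strings (e.g. fetched over IMAP). It is a simplified parent-pointer grouping, not a full threading algorithm, and Gmail's own conversation view may group things differently. The function names are made up.

```python
from collections import defaultdict
from email import message_from_string


def parent_id(msg):
    """Return the Message-ID this message replies to, or None."""
    in_reply_to = msg.get("In-Reply-To", "").split()
    if in_reply_to:
        return in_reply_to[0]
    # Fall back to the last entry of References (the closest ancestor).
    refs = msg.get("References", "").split()
    return refs[-1] if refs else None


def build_threads(raw_messages):
    """Group Message-IDs by the root message of their thread."""
    parents = {}
    for raw in raw_messages:
        msg = message_from_string(raw)
        mid = msg.get("Message-ID", "").strip()
        if mid:
            parents[mid] = parent_id(msg)

    def root_of(mid):
        seen = set()  # guards against reference loops
        # Walk up while the parent is a message we actually have.
        while parents.get(mid) in parents and mid not in seen:
            seen.add(mid)
            mid = parents[mid]
        return mid

    threads = defaultdict(list)
    for mid in parents:
        threads[root_of(mid)].append(mid)
    return dict(threads)
```

A message whose parent is not in the downloaded set simply becomes the root of its own thread.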
I mean directories. Gmail's IMAP interface maps each tag to a directory, but that is not exactly the same. Especially when your extracting algorithm doesn't know about this, you end up with a lot of duplicates.
Unfortunately it is still pretty hard to get any of your Google Talk chat logs out. Most of the solutions are from years ago and no longer work. I ended up being able to download an sqlite database of my chat logs using Google Gears.
I already do. I prefer Pidgin's UI anyways. (Also, the chats are logged by gmail anyways, so if I want to use Google's search for the logs over grep, I still have that option)
I gave this a spin as well and was fairly happy with it, at least as far as it goes. Takeout has a fairly limited scope, but of course the DLF (greatest project name EVAR) has done more good than just this.
Why is this separate from the Data Liberation Front?
Another 20% project destined to be a quick PR hit and never actually become usable enough to fulfill its promise of giving the user control over his own data?
It doesn't seem to work for me. I tried to create an archive for all my data and it just shows 'Files: 0, Size: 0B' on everything.
Edit: Tried a second time and it works now.
But as it was said, the most important thing I miss is a way to extract the mails (in the way they are stored in GMail with all tags and other meta information).
Tags are the most important thing for me. If this is not possible, any extracting is really not worth it for me.
To extract tags right now, you could search in every other directory (and assume that these are all available tags) for the same mail and get by that error-prone algorithm all tags of a message.
Maybe also missing is some meta information about the spam level and/or importance level. And meta information why Google thinks that some message is important to me (on the web interface, it sometimes shows a reason like 'because of recent conversation with this person' or so).
And maybe more. I would just like to get everything.
> get by that error-prone algorithm all tags of a message.
Shouldn't be error-prone. All your emails should have a Message-Id: header. While not impossible, I've never heard of issues due to Message-Id collisions...
You could write something to sync your emails to a database, then when it encounters an email in a 'folder' it can just add it to the email as a tag. I don't know of anything that currently does this, but it's not like the technology (and information) isn't there.
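Something along these lines, as a minimal sketch using sqlite. It assumes you have already pulled the raw messages out of each IMAP folder as strings; the table layout and function names are made up for illustration.

```python
import sqlite3
from email import message_from_string


def open_store(path=":memory:"):
    """Open (or create) the label store."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS labels ("
        "message_id TEXT NOT NULL, label TEXT NOT NULL, "
        "PRIMARY KEY (message_id, label))"
    )
    return db


def record_folder(db, folder, raw_messages):
    """Record the folder name as a tag on every message found in it."""
    for raw in raw_messages:
        mid = message_from_string(raw).get("Message-Id", "").strip()
        if not mid:
            continue  # no Message-Id: cannot be matched across folders
        # OR IGNORE makes re-syncing the same folder harmless.
        db.execute("INSERT OR IGNORE INTO labels VALUES (?, ?)", (mid, folder))
    db.commit()


def labels_for(db, message_id):
    """Return all folder names (tags) seen for one Message-Id."""
    rows = db.execute(
        "SELECT label FROM labels WHERE message_id = ? ORDER BY label",
        (message_id,),
    )
    return [row[0] for row in rows]
```

Keying on Message-Id also takes care of the duplicates: the same mail seen in three folders ends up as one message with three tags.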
> And meta information why Google thinks that some
> message is important to me (on the web interface,
> it sometimes shows a reason like 'because of recent
> conversation with this person' or so).
Of what use is this outside of Google, though? IIRC Google is the only one doing something like this. It's not like you could import that meta information into Outlook/Thunderbird/Mail.app.
Error-prone because it adds stupid complexity and many additional steps for just getting some simple information which Google probably has stored already along with the mail. With additional complexity, you always get further things in your algorithm which could go wrong. Such unnecessary complexity should always be avoided.
This is broken for me. I choose "Contacts and Circles", then "Create archive". It then fails. I try again, it fails a bit later. Again: it succeeds with no obvious way to download the file. I try again and get "Download quota exceeded". The file which I haven't downloaded isn't even a megabyte.
Also, although there isn't any manifest or timeline file, each .html file for Buzz has a last modified date that corresponds to when it was created. In addition, in the file itself, it has a timestamp.
This seems more than reasonable. Sure an XML/JSON timeline might be nice, but it wouldn't be human-readable either.
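If you do want a machine-readable timeline, one could be rebuilt from those modification times. A quick sketch, assuming your unzip tool preserved the mtimes and the Buzz .html files sit in one directory (the function name is made up):

```python
import json
import os
from datetime import datetime, timezone


def build_timeline(directory):
    """Return a JSON list of .html files ordered by modification time."""
    found = []
    for name in os.listdir(directory):
        if not name.lower().endswith(".html"):
            continue
        mtime = os.path.getmtime(os.path.join(directory, name))
        found.append((mtime, name))
    found.sort()  # oldest first; ties broken by file name
    entries = [
        {
            "file": name,
            "modified": datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat(),
        }
        for mtime, name in found
    ]
    return json.dumps(entries, indent=2)
```

The timestamp inside each file would be the more reliable source, but parsing it depends on the HTML layout, which I haven't checked.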
Reader shared items only appear in buzz if you've "connected" reader to buzz. I don't have it connected because people I know hate seeing stuff in buzz they've seen elsewhere.
Plus, I've got hundreds of reader items I shared before buzz existed. I don't think google even keeps them though they imply that they do.
I just used "save web page, complete" in firefox on my buzz feed and got something better than what the takeout buzz download gave me.
The Cloud^TM - because all filesystems should be stochastic. Is my file there? Maybe!! We don't know, honestly. Your query timed out? I guess the system doesn't know either. Try again later!
What use is exported data that cannot be easily imported into another system? Google should provide connectors to other popular tools, e.g. export data from Picasa to Flickr or SmugMug, and similarly export mail to Hotmail, Yahoo Mail, etc.
I haven't looked at the exported format, but I assume it's readable. But surely other than that, import tools for other services should do their share of the work.
Besides not understanding the hate (or low value placed) on the parent's Anti-Data-Lock-in post (on point with "Avoiding vendor lock-in for computer software": http://en.wikipedia.org/wiki/Vendor_lock-in), I am unsatisfied and frustrated that my high-five post gets downvotes. I say that with a certain humility. It's like getting pushed over for waving your hands (or voting).
I passionately support freeing data because of my work and info I want to share freely being archived|trapped away as proprietary property/formats.
I hope a little phrase like Anti-Data-Lock-in can transcend copyright and privacy. Maybe a meme for not losing your data/yourself, by being locked out.