

Google still hasn't removed "deleted" private Docs data from 2007 - e_proxus
http://www.line-of-reasoning.com/issues/privacy-issue-google-docs-seems-to-not-delete-but-only-hide-documents-when-the-trash-is-emptied/

======
magicalist
I came across this old Google explanation[1], but I'm not sure it (or this
blog post) is very relevant today. Google claimed that it kept the image
around because it might have been referenced from another site, even if the
document was deleted, and they appear to still be keeping those old images
around, I guess. The claim also seems to be that a cryptographic hash URL is
as unguessable as a password-secured one, though it's not stated directly.
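
To put rough numbers on the "unguessable URL" claim, here is a minimal
sketch. It assumes the URL token is simply random (Google's actual scheme
isn't documented in the post); the host name and the 128-bit token size are
made up for illustration.

```python
import secrets

# Hypothetical base URL, not a real Google endpoint.
BASE = "https://docs.example.invalid/img/"


def make_image_url(base=BASE):
    """Build a URL ending in a random 128-bit token (assumed scheme)."""
    token = secrets.token_urlsafe(16)  # 16 random bytes = 128 bits
    return base + token


def expected_guesses(bits):
    """Average number of guesses needed to hit one specific token."""
    return 2 ** (bits - 1)


url = make_image_url()
print(url)
print(expected_guesses(128))
```

Guessing such a token is about as hard as guessing a strong password, but
unlike a password the URL itself can leak (browser history, referrers,
someone pasting it into a blog post), and it can't be revoked unless the
file behind it is actually deleted.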

In any case, I actually tried it myself (gasp) with a new doc. Dragged in an
image, inspected it to find the URL, deleted the doc, and then permanently
deleted the doc again from the trash (I assume it hangs out there for 30 days
like with Gmail's trash). The image stuck around for maybe 15 minutes, but is
now gone, so I don't think this applies to Docs today, but I can't find any
help document that says either way.

[1] <http://googledocs.blogspot.com/2009/03/just-to-clarify.html>

------
coderdude
In the title, "from 2007" means the article was published on July 15th, 2007.
This probably would have been pointed out sooner but the site has remained
down for some time now. I'd imagine it has been down since at least the moment
it hit the front page (having some faith in the initial up-voters here). That
said, this would have made an excellent trap for people who vote based on
title alone.

Edit:

veemjeem pointed out that he can see the site just fine, which prompted me to
try it from another network. I can access the site from my connection through
Verizon but the server times out through my AT&T landline connection.

~~~
ontheotherhand
>> In the title, "from 2007" means the article was published on July 15th,
2007

And so is the image that is part of the private document which supposedly was
deleted, but can still be accessed even today, more than five years later.
Therefore the content of that article, and the evidence contained therein,
actually matches the title perfectly. So what "trap" are you talking about?

~~~
coderdude
The trap would serve to ensnare people up-voting an article based on the title
since they cannot actually access the content to read it. It's of no
consequence whether the title just happens to match what is found in the
content (as far as an actual trap would be concerned).

~~~
veemjeem
I can read it just fine over here. I don't understand why you think it's a
trap?

Maybe it's just your internet connection. Perform a traceroute against the
host to see who's at fault.

~~~
coderdude
My connection does seem to be the problem here. I wonder if it's something
that only affects me or if anyone else here is having the same issue
connecting to this server. Traceroute times out.

~~~
veemjeem
Where does it time out? You can probably find the router that is causing the
issue. If it times out before it gets beyond your ISP, chances are it's your
connection. My traceroute looks good here, so it's probably not their
webserver.

------
s_henry_paulson
A devil's advocate could also say that they didn't have everything set up
properly back then and that, now that they have proper security in place, it
isn't possible to apply the new security to old documents because the options
didn't exist at the time the documents were created.

~~~
ontheotherhand
It's not possible to delete those documents because they didn't have
mechanisms in place to delete them back then? It's not possible to physically
delete documents that have been flagged as deleted and emptied from trash?

How about "no"?

~~~
s_henry_paulson
There's no evidence that the document wasn't deleted.

It's not the document that's being accessed, but an embedded image within the
document, accessed by some unique identifier.

It's entirely possible they just have this jpg saved with its identifier, but
don't have good information about the related documents that pointed to it,
hence not knowing that it should be deleted.

I mean, think about it: Google Docs STARTED in 2007 and was based entirely on
products built by companies they acquired.

I am 100% positive they didn't have everything set up properly in 2007.

~~~
ontheotherhand
Then simply look through all non-deleted documents and see which files are
still referenced and which ones are orphans. Delete the orphans. Done.
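
A minimal sketch of that sweep, assuming each document record lists the IDs
of the images it embeds. The field names `deleted` and `image_ids` are
hypothetical, not anything Google has described.

```python
def find_orphans(documents, stored_images):
    """Return image IDs that no live (non-deleted) document references."""
    referenced = set()
    for doc in documents:
        if not doc["deleted"]:
            referenced.update(doc["image_ids"])
    # Anything in storage that no live document points at is an orphan.
    return set(stored_images) - referenced


docs = [
    {"id": "a", "deleted": False, "image_ids": ["img1", "img2"]},
    {"id": "b", "deleted": True, "image_ids": ["img2", "img3"]},
]
images = {"img1", "img2", "img3", "img4"}

orphans = find_orphans(docs, images)
print(sorted(orphans))  # ['img3', 'img4']
```

Note that img2 survives because a live document still uses it. This only
works if the document-to-image references were recorded in the first place,
which is exactly what s_henry_paulson is doubting for 2007-era data.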

~~~
magicalist
If the "working as intended" behavior is that these images won't be deleted
since they can be linked to from elsewhere on the web, as appears to be the
case (see my link to the old blog post above), they actually can't delete
these old images.

~~~
ontheotherhand
"they actually can't delete these old images."

They can, it just would break those old links. A dumb decision ages ago
doesn't force you to stick to it as you seem to imply.

------
DanBC
Matt Cutts has responded in the comments of the linked article.
([http://www.line-of-reasoning.com/issues/privacy-issue-google...](http://www.line-of-reasoning.com/issues/privacy-issue-google-docs-seems-to-not-delete-but-only-hide-documents-when-the-trash-is-emptied/#comment-25))

I think Google make a reasonable point; a bit daft, but still.

Has anyone tried fusking the URLs?

------
beaker52
I'm not surprised. I doubt many readers here would be. That doesn't make it
acceptable though.

I fear more for the people who've had private photographs 'automatically'
uploaded to Google+ via their mobile devices. Even if they weren't posted, I
bet they still exist somewhere in Googleland, just waiting for that Google
intern to run them all through Google's 'safe search' filter in reverse.

------
crististm
When Gmail did not let me create a mail folder with the same name as one I
had just deleted, I knew they weren't telling me the whole truth.

Last time I checked, it looked like they had fixed this issue.

~~~
pirroh
It's not about "not telling you the truth"--it's all about the inherent
complexity of distributed systems. Might sound counterintuitive, but deletions
are not easy to implement, and are very often deferred (obviously this doesn't
apply to the image mentioned in the article).
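
A minimal single-process sketch of what a deferred delete can look like: the
delete call only writes a tombstone, and a later purge pass physically
removes the data once a grace period has passed. The class, its method
names, and the grace period are all made up for illustration; real
distributed stores are far more involved.

```python
class Store:
    """Toy key-value store with tombstoned (deferred) deletes."""

    def __init__(self, grace_seconds):
        self.grace_seconds = grace_seconds
        self.items = {}       # key -> value
        self.tombstones = {}  # key -> time the delete was requested

    def put(self, key, value):
        self.items[key] = value
        self.tombstones.pop(key, None)  # re-creating a key clears its tombstone

    def delete(self, key, now):
        # Only mark the key as deleted; the bytes stay in self.items for now.
        if key in self.items:
            self.tombstones[key] = now

    def get(self, key):
        if key in self.tombstones:
            return None  # hidden from readers, but not yet purged
        return self.items.get(key)

    def purge(self, now):
        """Physically remove keys whose tombstone is older than the grace period."""
        expired = [k for k, t in self.tombstones.items()
                   if now - t >= self.grace_seconds]
        for k in expired:
            del self.items[k]
            del self.tombstones[k]
        return expired


store = Store(grace_seconds=60)
store.put("photo", b"data")
store.delete("photo", now=0)
hidden = store.get("photo")   # None, although the data is still stored
early = store.purge(now=30)   # nothing is old enough to purge yet
late = store.purge(now=60)    # the tombstone has expired, data is removed
```

Between `delete` and the successful `purge` the data is hidden but still
there, which is the window users tend to notice.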

------
PanMan
Could it be that the image hasn't been deleted as it's referenced from this
article?

------
erez
Google, like every other company, "forgets" that when its users delete
something, they want it purged from all the servers, not just marked as
"deleted" in the database. While the practice is very common, it shouldn't be
used when it comes to your customers, even if those customers are basically
the product, as in Google's case and others'.

------
e_proxus
I'm a bit interested in the legal situation here. I think at least in Sweden,
if you asked Google to delete a private document including images (maybe
personal etc.) they'd be legally obligated / forced to do so. Anyone know if
there are any privacy laws regarding this?

------
duskwuff
I have to wonder: If these files are never deleted, could this be used as an
inefficient means of storing big blobs of data online? If someone uploads
illegal content to Google Docs attachments, does Google have any means of
removing it _at all_?

