When you delete items from your Web History, they are no longer associated with your Google Account. However, Google may store searches in a separate logs system to prevent spam and abuse and to improve our services."
The article claims that there's no guarantee that Google does anything other than change the display. Google actually does quite a bit of work to disassociate items from your Google account if/when you delete them.
Storing searches in a log to prevent spam sounds pretty disingenuous.
So, how about an unambiguous (as in, without weasel words such as 'may') update to the privacy policy detailing exactly what google stores in a user profile and what it does not, and stating that what a user sees in the interface mirrors exactly what google sees in its systems, minus some small delta for propagation across google's servers?
Because what you write above is technically quite possibly true as far as the viewpoint of a user is concerned but leaves open a ton of possibilities for clever/creative interpretation on what 'improve our services' means.
Why disingenuous? An easy example of preventing spam would be to detect and stop people trying to spam Google Suggest, which is based on queries that users do.
Jacques, you wrote an article titled "Google Web/Search History Disable Does Absolutely Nothing." I just wanted to point interested people to Google's public documentation which points out what happens when people delete items from their search history.
Right. And you conveniently avoid any discussion about the difference between 'your account' and 'your userprofile' (the data that google stores about a user).
So you are essentially confirming the thrust of the article, that as far as your privacy is concerned nothing changes.
One day the google equivalent of Mark Klein will leak a bunch of userprofiles, we'll all be shocked at what's inside and then you'll go 'Oh, I knew that all along'.
What will make you believe they aren't part of some type of large NSA cover up feeding your clicks to the government? A personal tour of every rack in every datacenter to inspect logfiles?
Trying to discount official policies or responses as "weasel words" or any type of other hand waving to every response they give sounds more like you are set in your agenda vs. wanting to accept a reasonable response.
> What will make you believe they aren't part of some type of large NSA cover up feeding your clicks to the government?
Where did I say that? Or did you just make that up in a second attempt to make me look like some tin-foil hat type?
>A personal tour of every rack in every datacenter to inspect logfiles?
Again, you're trying to pull this into the ridiculous.
An ironclad privacy policy would do me just fine.
If you're fine with ambiguous statements and finely crafted legalese then that's your choice, for me that's not good enough for a company that has presence on approximately 80% of the web.
My agenda doesn't come into play; I don't really have one. I just noticed that what google says unofficially and what google does according to its own privacy policy are not necessarily one and the same. As I noted, I still use google, I just would not trust them with anything that I consider to be personal. So my email is not on google servers (even though likely a lot of my email is because they have the other side of the conversation). My google 'docs' (or drive, or whatever) files can be published in the newspaper for all I care.
Google, Facebook and just about every other large company that collects endless reams of data on the users that visit them should be absolutely clear about what is stored on their users, for how long they store it and what you can do to opt out. By law they are required to do so but it seems to me (and you're free to disagree) that google is just paying lip service to these requirements while leaving itself enormous leeway to do as it pleases with data concerning its users.
I wasn't born in the 'privacy doesn't matter' age, and to me these sort of things deserve scrutiny because power can easily be abused, and google has in this sense enormous power.
Feel free to choose to be supportive of google and their noble goals, I'm sceptical and don't see much to reassure in the words Matt Cutts used above, if you do feel reassured then good for you.
Official responses are not done on blogs or in support forums, they are done in the one place where it matters, the privacy policy, which is supposed to be the document that governs the relationship between companies and their customers in all matters that concern user data.
What would you want to see in this privacy policy?
Any policy is going to be in legalese because that's how legal documents work, so if that's a deal breaker then I don't think western civilization as a whole will work very well for you. It has to be broad enough that the company can continue to do new things, but narrow enough to mean something. Google's lawyers work really hard doing both, making the language both precise and understandable. Do you have any specific feedback for what it should say instead?
That's a serious question which deserves a serious answer. Let me think about this and I'll get back to you (I see you have your email in your profile). I like your definition and I think I see at least a few points where the current privacy policy does not meet my standard for 'narrow enough to mean something' so I'll concentrate on that.
Matt isn't saying anything to contradict Jacques, other than the pedantic 'it does do something'. But Jacques' point that it doesn't do anything to protect your privacy from Google still stands.
I can go and imagine a hundred scenarios that involve some company the size of google in the jurisdiction that it is in leading to some form of abuse but frankly what I can imagine is not relevant, what is relevant is whether or not this is desirable and imnsho it is not.
The fact that google creates all kinds of legal loopholes and that the Google Ambassador to HN finds it sufficient to point out that indeed something changes behind the scenes (namely, that google dissociates your searches from your account and nothing more) says enough.
I hope that nobody here is naive enough not to see the size of the loopholes that statement creates.
What you have said is unconvincing, and gives you quite a lot of wiggle room.
by "association with your Google Account", do you include "association with your personal identity"? The description of the server log suggests that your web history will still be easily connected to your personal identity. The claim that Google is doing quite a bit of work to prevent that wouldn't match the privacy policy.
by "deleted from Web History", do you include "turned off Web History"? This seems possible but is not explicitly stated.
The article says "there is no guarantee whatsoever" and I believe this is no guarantee whatsoever that there is more than a superficial difference.
I'm curious how the author knows that it "does nothing". It seems that the argument ends with "because they can" we can "rest assured" that it "is exactly what they’ll be doing". In other words, if you already have a certainty that Google is pure evil then you can extrapolate from that that they will do the most evil possible thing in every circumstance, including this one. That's not terribly profound. The entire rest of the post is a litany of ways by which Google can see your cookies, which has little to do with what they DO with that information, which is what the user account setting purports to affect.
(to be clear, I have no evidence either whether they continue to track and store web history or not, but it doesn't seem like the author does either, and it's disappointing to see such a baseless trashy post from someone who I have in rather high esteem in general).
Let's turn that around for a bit. The default assumption that I have is that advertising companies that deal in user profiles (such as Google) will collect everything they can about you because this benefits their ability to sell advertising. Google's privacy policy states what they capture. In other words, no matter what their user interface is telling you, their privacy policy (which I consider to be the leading document in cases like these) tells a totally different story and generalizes to all of google's services, including search (they even use that as the example of what they capture).
The fact that things like cookies are in those logs that they do make (again, according to the privacy policy) makes it trivial to re-construct the data that they ostensibly do not keep. If it is trivial, makes good business sense, enhances the value of the profile and makes more money then you can bet dollars to donuts that unless there are strong statements to the contrary from the company involved that they do not engage in such behaviour that they do.
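To make the "trivial to re-construct" claim concrete, here is a minimal sketch of grouping log lines by cookie. The log format, field names and values are all invented for illustration; nothing here describes Google's actual logs or systems.

```python
from collections import defaultdict

# Hypothetical log records: (timestamp, cookie_id, ip, query).
# Every field name and value is an assumption made for this sketch.
LOG = [
    ("2014-03-01T10:00", "cookie-A", "203.0.113.7", "flights to lisbon"),
    ("2014-03-01T10:02", "cookie-B", "198.51.100.9", "python defaultdict"),
    ("2014-03-02T09:15", "cookie-A", "203.0.113.42", "lisbon hotels"),
]

def history_by_cookie(records):
    """Group queries by cookie ID, ordered by timestamp."""
    histories = defaultdict(list)
    # Tuples sort by their first element, so this orders by timestamp.
    for timestamp, cookie_id, _ip, query in sorted(records):
        histories[cookie_id].append((timestamp, query))
    return dict(histories)

histories = history_by_cookie(LOG)
for cookie_id, entries in histories.items():
    print(cookie_id, entries)
```

Note that the IP address is not needed at all: as long as the same cookie shows up in each record, the per-browser history falls out of a single pass over the log.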
Privacy policies are generally written in favour of the company writing them and it would be terribly naive to assume that if it could be written more strict but wasn't that this is an accident or oversight. Note how long google fought the EU commission to have any limits set on their permission to retain user data, and how they tried to spin it as a user benefit when they eventually caved in.
So if google re-writes their privacy policy to state explicitly that they do not datamine their logs and that the data is used only in a statistical sense and never in a personally identifiable sense then I would agree with you (and I would even believe them), but until they do it is fairly safe to assume that they in fact do use that information.
Of course 'only to enhance your user experience' and never to improve the bottom line for google.
That's really just begging the question again, though.
It's also not correct, as their privacy policy doesn't state that they always collect a cookie per log entry, for instance, but that they may. This is an important distinction, because in practice, at least things like doubleclick and analytics requests do not transmit your google account cookie. In a quick test, google fonts and google hosted libraries don't appear to send any cookies at all, though I don't know if that's true under all circumstances.
There's much you could reconstruct from IP addresses and connection patterns if you were sufficiently motivated, but that's a long way from extrapolating from their privacy policy. Regardless, assuming "they can, therefore they will" isn't nearly sufficient here.
At absolutely zero cost to themselves and good PR as a benefit google could re-write their privacy policy if that were the case. Note that they still have not amended their privacy policy to the effect that they indeed anonymize the log files after 9 months, even though they announced that they would do that years ago.
So no, short of going to work for google or an insider coming up with hard proof there is not much to be done there. But with a privacy policy that details what they do log and a strong financial motive I have little doubt that this is an accurate representation of what's happening.
> At absolutely zero cost to themselves and good PR as a benefit google could re-write their privacy policy if that were the case
I just gave you clear examples where that wasn't the case, examples you can verify yourself by inspecting the requests.
> Note that they still have not amended their privacy policy to the effect that they indeed anonymize the log files after 9 months, even though they announced that they would do that years ago.
"We anonymize this log data by removing part of the IP address (after 9 months) and cookie information (after 18 months). If you have Web History enabled, this data may also be stored in your Google Account until you delete the record of your search."
> But with a privacy policy that details what they do log...
I don't understand why you're ignoring the very important distinction between "do log" and logs "may include".
"may include" in a privacy policy is newspeak for "we will".
That 'may' is there so that if you read it you get a warm fuzzy feeling because no way would google ever do such a thing, and it allows them to point at it when they're caught doing it saying 'we told you we were doing this all along, see, we gave ourselves just enough leeway there to squeeze through'. Call me jaded, cynical, old for all I care but I have yet to see a big company that did not act in the way I just described when it came to covering their asses while pulling the wool over the eyes of their end users.
> We anonymize this log data by removing part of the IP address (after 9 months)
That's not exactly anonymization is it? You're making it worse. Anonymization is removing all user identifiable information. This is merely stripping some unspecified number of bits of the IP, which more than likely has changed by then so has lost most of its value, and retains the cookie which has more resolution than an IP to begin with.
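A small sketch of why that matters, assuming the "part" that gets removed is the last octet of an IPv4 address (the policy text quoted above does not say how many bits). The records and cookie values are made up for illustration.

```python
import ipaddress

def drop_last_octet(ip: str) -> str:
    """Zero the final 8 bits of an IPv4 address, e.g. 203.0.113.7 -> 203.0.113.0."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)

# Hypothetical log records; field names are assumptions for this sketch.
records = [
    {"ip": "203.0.113.7", "cookie": "cookie-A", "query": "flights to lisbon"},
    {"ip": "203.0.113.200", "cookie": "cookie-B", "query": "python defaultdict"},
]

truncated = [{**r, "ip": drop_last_octet(r["ip"])} for r in records]
for r in truncated:
    print(r)
```

After truncation both records share the address 203.0.113.0, yet the cookie field still tells the two browsers apart, so the records are no less linkable than before.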
There's an interesting article from CNET (from 2008) on this topic:
"Debunking Google's log anonymization propaganda"
"Company says it will be reduce the amount of time that it will keep sensitive, identifying log data on its search engine customers. This is little more than snake oil."
You lay out the reasoning for your blog post in a succinct and plausible fashion.
One is free to assume, of course, that you're a tinfoil wearing paranoiac and disagree with your reasoning, but down voting such a comment, into which you actually expend the effort to explain your reasoning just seems so wrong, infantile and dumb.
Another trend I have unfortunately observed recently is comments along the lines of: "Don't link to this" (the "this" being a blog entry by Andrew Sullivan, which is hardly controversial) "since [paraphrased] it does not reflect the dogma of the HN politburo."
It's crap like that, which really devalues HN and I think that's a shame.
This sort of thing is par for the course. Especially in articles critical of Google or Apple, less so with articles critical of Microsoft. That's likely both a reflection of the number of people working for those companies that also frequent HN and the dedicated group of fanboys each has.
> If it is trivial, makes good business sense, enhances the value of the profile and makes more money then you can bet dollars to donuts that unless there are strong statements to the contrary from the company involved that they do not engage in such behavior that they do.
Even strong statements don't help, as seen in the Google Apps for Education case where filings in federal court admitting scanning of GAE emails by Google were at odds with public statements denying it. It's an interesting read as it shows the confusing mess of privacy statements versus actual practice, and how hard it is even to figure out which privacy statement applies, as Google started claiming that the consumer privacy statement applied to free student email as well, but it wasn't clear earlier.
Google finally stopped the practice, and it seems they were scanning even Google Apps for Business emails to build profiles to show ads on other sites even if the show ads setting was off.
>if you already have a certainty that Google is pure evil
This is a straw man. Google may not think that doing something that is legal and common is evil, and why are we having theological discussions anyway?
You will have information on what Google is doing with your information when Google volunteers that information. Unless they deny usage in a legally binding sense, there's no reason why you should expect to see any further information about how that data is used internally. What you do know is:
1) that there will be no public relations consequence for breaking users' trust, because what they do with their records is not transparent, and
2) that it's legal and common to use any information gathered about you in just about any way.
You can either assume they'll do what you have indicated that you prefer with the data, even if that likely involves leaving money on the table - or you can assume that they will generally do what they're legally obligated to do, and within that, attempt to maximize profits. IMO, the former position involves imagining that a company has a personality, and doesn't want to take advantage of you. It's a false equivalence to assert that the latter (a company legally maximizing its resources for profit) involves a similar leap of imagination.
You're not going to get information either way about what they DO until Google publicly obligates themselves with a policy document, or somebody leaks.
It does nothing just like Google's own DNT option in Chrome does nothing to its own web properties. It's almost like Google is begging for this sort of stuff to turn into regulations against them, because clearly they can't be trusted to do the right thing.
I disabled the history to make the attack surface smaller.
Even if Google retains that data forever, if someone gains access to my account she won't be able to check my browsing history.
Agreed. I think this is the biggest point missed by the article. Your privacy as far as Google is concerned may not improve, but your overall privacy will. It will be much harder for other parties to get to the data Google collected. Considering that other parties are much more harmful and can lead to real-world harm (death, imprisonment, etc.), I'd say turning off history absolutely increases privacy.
"Considering that other parties are much more harmful and can lead to real-world harm (death, imprisonment, etc.), I'd say turning off history absolutely increases privacy."
Those guys who can imprison you can easily get your data directly from Google. If Google doesn't hand it over to them for the asking, they'll get a warrant (assuming there's probable cause to suspect you of a crime).
I think blackmail/extortion would be a more realistic scenario. As you correctly stated disabling the Web History is probably a futile defense against law enforcement agencies of western countries but think about all the dictatorships or otherwise strongly authoritarian regimes that are the norm for most of the world. As Google pulled from the Chinese market AFAIK they are probably not as receptive to warrants from there. Then there are also most of the African countries and many Asian countries with a high degree of corruption, low to no political freedom and/or freedom of speech. Not to forget the Arab countries where staying logged in while a third party has access to your PC and pulls up your long term search history and figures out that you are gay and you're stoned to death because of that.
Well, this wouldn't just refer to law enforcement. But even for law enforcement in the US for example, I would much rather them have to get a warrant. 99 out of 100 times they will not have probable cause. Privacy increased.
1. Most of what I do is on Firefox. NoScript is turned on, and disallows Google Analytics. I'm not signed into Google, although as this article points out, that changes little. I use CookieCuller aggressively. Ad/pop-up blocker are in play. Etc.
The net effect is to make web browsing less noxious than it otherwise might be -- few pop-ups, few ads, very few cases of some noisy video spontaneously playing when I click a link in Firefox.
2. My monash.com email has long gone through Google Apps. That's in Chrome. I also open links I get through email in Chrome, but do little else there. In that browser I'm usually signed into Google.
Chrome/Google consume a lot of resources, e.g. by insisting on opening Google Talk whether I want it or not. But it's a manageable annoyance, as I keep my open-tab count in Chrome fairly low.
3. I use IE very selectively. If a page won't open in another browser, I try it there. A few of my most-annoying and rarely used apps and sites are relegated to IE -- Facebook, WebEx/GoToMeeting/etc., and perhaps a few others I'm not thinking of now. Unlike the other two browsers, IE is outright closed on my PC much more often than it's open.
I follow many of the practices you list, and then some. I wish there was an easy way for people - especially lay users - to "subscribe" to a privacy-enabled version of Firefox. For instance they could be offered pre-selected choices:
1. Block analytics [x]
2. Destroy cookies after session [x]
3. Block scripts [x]
4. Prevent search results tracking [x]
I say this because few non-tech users know or care about the entire gamut of tracking-blocking services (beyond, say, AdBlock). But if they were presented with an option like this:
Do you want to install Firefox "clean"
-or-
Do you want to install Firefox "Shields Up"
You can try to combat a part of this by installing Ghostery. It will block a lot of these third-party requests. As a website owner you could link to the share page, instead of loading the widgets, or load the widgets only after the user requests them.
As for those server logs, I understand they record my movements, but I don't think it is my right to stop them from doing that. The one who owns the server/web property should be allowed to analyze requests to that server. This can get icky though in the case of major CDN's.
You could choose not to keep server logs as a search engine (forgoing DOS protection), but then what happens when a user clicks on an advertisement? Privacy seems only as strong as the weakest chain.
Ghostery, the last time I checked it out was closed-source, subject to control or influence by advertisers, and reporting to the vendor about users' browsing. Clearly lots of people like it, but I would consider it gross breach of my security policy.
My recommendation for anyone who's serious about controlling his/her online footprint is Request Policy. It's open source and simply blocks requests according to user directions - you can put it on a whitelist or blacklist basis, and decide for yourself what servers to contact from each page. Of course this is too inconvenient for most people, but it gets asymptotically less troublesome as the list is perfected.
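For anyone wondering what per-page whitelist/blacklist blocking amounts to, here is a rough sketch of the decision logic. It is not Request Policy's actual code; the function names, the rule format and the always-allow-same-host rule are assumptions made for illustration.

```python
from urllib.parse import urlsplit

def host_of(url: str) -> str:
    """Return the lower-cased host part of a URL, or '' if there is none."""
    return (urlsplit(url).hostname or "").lower()

def is_allowed(page_url, request_url, rules, default_allow=False):
    """Decide whether a page may contact the host of request_url.

    rules maps a page host to a set of third-party hosts.
    default_allow=False: whitelist mode, only listed hosts are allowed.
    default_allow=True:  blacklist mode, only listed hosts are blocked.
    """
    page, target = host_of(page_url), host_of(request_url)
    if page == target:
        return True  # same-host requests always pass in this sketch
    listed = target in rules.get(page, set())
    return (not listed) if default_allow else listed

# Hypothetical rule set: example.com may load from one CDN host.
rules = {"example.com": {"cdn.example.net"}}
print(is_allowed("https://example.com/a", "https://cdn.example.net/x.js", rules))
print(is_allowed("https://example.com/a", "https://tracker.example.org/p.gif", rules))
```

In whitelist mode every new site starts out broken until you add the hosts it needs, which is the inconvenience mentioned above; the rule set grows more slowly as the commonly used hosts get listed.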
1. Closed-source javascript is not a thing, and Ghostery's code is very readable.
2. Technically, anything is subject to influence by third parties, but I'm quite certain you possess zero evidence that Ghostery is actually influenced by advertisers. Implying that Ghostery might do something nefarious at the behest of advertisers (based on nothing but personal paranoia) seems maliciously disinformational.
3. Ghostrank is opt-in by default. You have to intentionally check a box that plainly says you agree to send "anonymous statistical data" to them.
We are most definitely not influenced by third parties, if anything, companies now contact us directly to provide their registration information for monitoring by Ghostery. Additionally, we keep the database changes public here: https://www.ghostery.com/en/database/changelog
That's one of the advantages of my extension, HTTP Switchboard [1], over many others out there: it shows you everywhere a web page tries to connect -- and then lets you act on what you find. First step is being properly informed. It also shows you behind-the-scenes connections (those from other extensions or the browser). Anything that goes through the webRequest API is reported.
Ghostery is an excellent piece of software but it caused me a few issues. For instance I logged in to my bank and couldn't view any statements. It was Ghostery blocking something so I whitelisted the site. Not a big issue but it took me some time to realise it wasn't a problem with my bank's website.
I went on to uninstall Ghostery because I was worried the unpredictable behaviour it introduces might cause some frustrating issues, particularly when going through a process like filling out a long online form only for it to fail at the end.
> Not a big issue but it took me some time to realise it wasn't a problem with my bank's website.
Actually, it likely was. Your bank probably did not realize it but depended on some external component blocked by ghostery being loaded. I've seen the weirdest cases of this, such as one instance where a missing function in some javascript code was fixed for backwards compatibility in an ad-tag served by someone else...
So disabling the ad tag caused the site to fail with a javascript error.
In general, a banking website should work A-Ok with ads disabled and if it does not it is likely not Ghostery that is at fault but the bank (if only for not testing that their site works with ghostery or adblock installed).
I've also seen behavior like this with NoScript. Sometimes it's a total crap shoot as to which functionality works/doesn't work when you enable/disable certain scripts.
I use a more radical step: browse in incognito mode all the time. No cookies survive a browser restart. If you have a fast connection, the cache actually makes your browsing slower. As for the "convenience" of being logged in all the time on the sites I visit, I would rather not. Coupled with a dynamic IP, they can't correlate any of my traffic and search data.
How dynamic is your IP really, though? My "dynamic" IP from Comcast will go unchanged for months or even years (I think the last time it changed was when I replaced my modem about a year and a half ago).
> [Google] would observe three specific types of data retention periods: deletion of the last byte of IP addresses in Google server logs (9 months); the validity of cookies placed in users’ browsers (2 years); anonymisation of the cookie number in the company’s server logs (18 months).
1. Of course a certain level of logging and archiving of information is necessary to maintain the security of a server, it is not always about “THE USER”.
2. Again, a certain level of logging and archiving information is mandatory to offer some services based on artificial intelligence and to make people's lives easier. Just imagine asking your doctor not to keep a file of your information because it is a violation of your privacy! Learning about users' search patterns does have a lot of benefit in terms of saving time. Google is able to offer better search results this way.
3. It does not make sense to be particularly concerned about Google when you are actually sending that information out to the whole world. This is like shouting something out and then complaining about people listening to it. If you don't want people to smell your kitchen, first you should think of closing the window.
I follow the practices listed in a few comments already [1] on PCs. But this has proven to be painful with increasing number of websites depending on Google via ajax.google.com, etc. As many as a third of the websites won't work on my browser till I take specific actions to allow something.
What are the recommendations for Android along these lines? Is rooting needed/recommended? I currently use the Maxthon browser and have never signed into a Google Account on my phone (this causes a lot of trouble, but has seemed worth it ever since I found my older Android phone wouldn't let me remove a Google Account without a factory reset).
I use Amazon's Appstore, which could be bringing its own privacy issues. I found that their Appstore app by default sends App usage data to them, though this can be disabled.
These are startling statistics, roughly corresponding to the number in the linked blog post. Add to that that they can also probably analyse ~50% of meaningful e-mails (depending on the region) [1].
I should be in your hometown somewhere in the next few months, drop me a line at jacques@mattheij.com with your cell number in it and I'll buy you dinner.
Periodically, Chrome will download a list of hashed URLs believed to be dangerous. As you surf the web, Chrome checks client-side if the URL matches anything in the hashed list. Only if the hash matches will Chrome initiate a more in-depth check.
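A rough sketch of that client-side step, to show why most URLs never leave the browser. The hash function (SHA-256), the 4-byte prefix length and the sample URLs are assumptions for illustration; the real implementation also canonicalizes URLs and checks several host/path combinations, none of which is shown here.

```python
import hashlib

PREFIX_BYTES = 4  # assumed prefix length for this sketch

def url_prefix(url: str) -> bytes:
    """Return the first few bytes of the URL's SHA-256 digest."""
    return hashlib.sha256(url.encode("utf-8")).digest()[:PREFIX_BYTES]

# Stand-in for the periodically downloaded list of hashed URLs.
local_prefixes = {url_prefix("malware.example/bad.html")}

def needs_full_check(url: str) -> bool:
    """True only if the URL's hash prefix matches the local list.

    A match is not proof the URL is dangerous (prefixes can collide);
    it only means the more in-depth check should run.
    """
    return url_prefix(url) in local_prefixes

print(needs_full_check("malware.example/bad.html"))
print(needs_full_check("example.com/"))
```

The lookup is a local set membership test, so for the common case of a non-matching URL nothing is sent anywhere.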
At a guess, never. In fact, it will likely be the opposite, you'll be accessing the web through a device (mobile phone, tablet, thin client (aka ultrabook)) that you have bought at a steep discount from some provider in exchange for a large chunk of your privacy.
People (as in, the general public and a surprisingly large fraction of those that should know better) simply do not care.
> that you have bought at a steep discount from some provider in exchange for a large chunk of your privacy.
I have some doubt that the consumer observes this discount.
If consumers insisted on privacy-by-default, the point of sale cost would not be very different. It's just the industry standard not to provide this.
If we observe a steep discount now, it's because privacy-by-default has become a specialist interest that manufacturers can therefore charge extra for.
Never. People don't care about this level of privacy (except a very small minority). And frankly, thanks to sophisticated data mining/statistical algorithms, just using a VPN only makes tracking/profiling harder, not impossible.
They don't care now, but someday they will, trust me. They don't care now because they get all these freebies in exchange for all their data. These freebies ain't going to be free forever.
This isn't an issue of privacy, it's an issue of deception. Privacy (or the control of privacy) is being insinuated, but it's only smoke and mirrors. It's a lie.
When traffic shaping of typical protocols like HTTP[s] and a lack of net neutrality causes only unidentifiable VPN sessions to operate at a consistent high speed.
As per the OP what would this solve, exactly? You just would disguise your IP address, but cookies and any other identity revealing details of your browser will stay the same.
Active web history on the other hand will still be associated with an account for as long as it's active (not solely to Google's benefit by the way).
I thought this stuff was already sorted years ago and is now common knowledge, it's like those people only now realizing that Gmail does contextual ad targeting, it's somehow disingenuous.
Of course they mine those logs, there is a lot of knowledge to be gleaned from them, mostly to improve their product. Also some places mandate some kind of data retention.
Deriving "history" from those logs doesn't matter much as long as they eventually get anonymized.
You should update your post to reflect the 9 month distinctions between search history and server logs.
I'll update the post when someone from Google (rather than some anonymous account created for the sole purpose of debating this point) steps forward and guarantees that no information mined and/or copied from those server logs survives in user profiles after 9 months. You have definitely not made your point, they only speak about the IP anyway but say nothing whatsoever about the cookies, which are just as good (or even better) at identifying users.
Of course it matters that they derive history from those logs. That's what this whole article is about.
I also note that even though that blog post is now 6 years old wording to that effect still has not made it into the google privacy policy, which I assume to be the binding document in cases like this.
That's not likely to happen. Nor will anyone else from any other company volunteer to walk you through their policies and infrastructure.
If a "history" isn't associated with a user then there is no issue there, on the other hand they can still run a mapreduce job on it and extrapolate spelling corrections for example.
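As a toy illustration of that kind of job, here is a sketch that counts queries with no user identifier at all and then suggests the more popular near-match for a rare query. The sample queries, the similarity cutoff and the use of Python's difflib are assumptions for this sketch; it is not a description of how Google derives spelling corrections.

```python
from collections import Counter
import difflib

# Hypothetical anonymous query log: no cookie, IP or account attached.
queries = ["weather", "weather", "weather", "wether",
           "python", "python", "pyhton"]

# "Map" emits (query, 1) and "reduce" sums per query; Counter does both.
counts = Counter(queries)

def suggest(query, counts, cutoff=0.8):
    """Return a more frequent, similar-looking query, or None."""
    candidates = difflib.get_close_matches(query, list(counts), n=5, cutoff=cutoff)
    better = [c for c in candidates if counts[c] > counts.get(query, 0)]
    return max(better, key=lambda c: counts[c]) if better else None

print(suggest("wether", counts))
print(suggest("pyhton", counts))
print(suggest("weather", counts))
```

The point is only that aggregate counts are enough for this kind of product improvement; whether the logs are also used in a per-user way is a separate question.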
You ought to at least include Google's blog post for the sake of completeness.
> Nor will anyone else from any other company volunteer to walk you through their policies and infrastructure.
Funny how 'if you've got nothing to hide' seems to apply only to people. Companies do in fact walk us through their policies, that's why they are putting them on their websites for us to read.
> You ought to at least include Google's blog post for the sake of completeness.
Google should amend their privacy policy, for the sake of completeness. Assuming of course that anything actually changed. Announcing something on a blog versus actually doing it and updating the privacy policy to reflect the change are two different things. It's the same as announcing your intention to marry someone and actually following through with it.
If you're on the page at https://history.google.com/history/ and click on the gear and then "Help" the page about deleting search history is at https://support.google.com/accounts/answer/465 and it says
"What happens to your history when it's deleted
When you delete items from your Web History, they are no longer associated with your Google Account. However, Google may store searches in a separate logs system to prevent spam and abuse and to improve our services."
The article claims that there's no guarantee that Google does anything other than change the display. Google actually does quite a bit of work to disassociate items from your Google account if/when you delete them.