Google Analytics is on a substantial proportion of the Internet: 65% of the top 10k sites, 63.9% of the top 100k, and 50.5% of the top million. My own partial results from a research project I'm doing using Common Crawl estimate that approximately 39.7% of the 535 million pages processed so far have GA on them.
That means that you're basically either on a site that has Google Analytics or you've likely just left one that did.
The aim of my research project is to understand what proportion of links either start or end on a page with Google Analytics. If the link starts on a page with Google Analytics, your present "location" is known. If the link ends on a page with Google Analytics but doesn't start on one, then when you reach that end page, the referrer sent to GA in the clear will state where you came from.
All of this is then tied to your identity.
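For the curious, the final tally reduces to a computation like this toy sketch (the function name, link graph, and GA set are all hypothetical, not the actual research pipeline):

```python
def exposed_fraction(links, ga_pages):
    """Fraction of links that leak browsing info to GA.

    links: iterable of (source_page, target_page) pairs.
    ga_pages: set of pages known to embed Google Analytics.
    A link leaks if its source has GA (your current location is
    reported) or its target has GA (the referrer reveals where
    you came from when you arrive)."""
    links = list(links)
    if not links:
        return 0.0
    leaked = sum(1 for src, dst in links
                 if src in ga_pages or dst in ga_pages)
    return leaked / len(links)

# Tiny hypothetical graph: 2 of the 3 links touch a GA page.
ga = {"a.com", "c.com"}
edges = [("a.com", "b.com"), ("b.com", "c.com"), ("b.com", "d.com")]
print(exposed_fraction(edges, ga))
```

The real job is doing this over billions of edges, but the per-link test is just that membership check.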
If people are interested when I get the results of my research, ping me. I'll also write it up and submit it to HN as it would seem to be of interest.
http://www.youtube.com/watch?v=pkoIUmP5ma8 (GA-specific results at 1:20)
Of course, as a web developer, it's useful to be able to see where people came from. But we don't have any right to that information. As an end-user, why the hell is my browser giving you this information for no reason when it doesn't have to?
I've been using RefControl for Firefox for years now. It fakes the referrer, setting it to the root of the domain being requested. This hasn't ever caused me any problems, so there can't be that many sites that rely on it.
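For reference, the forging behavior described is roughly this (a sketch of the idea, not RefControl's actual code):

```python
from urllib.parse import urlsplit, urlunsplit

def forged_referrer(request_url):
    """Given the URL being requested, build a forged referrer
    pointing at the root of the domain being requested, the way
    RefControl's forge option does."""
    parts = urlsplit(request_url)
    return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))

# Requesting https://example.com/private/page?q=secret sends the
# site its own root instead of whatever page you were really on.
print(forged_referrer("https://example.com/private/page?q=secret"))
```

Because the forged value belongs to the requested site itself, naive same-domain hotlink checks still pass, which is presumably why it breaks so little.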
I don't give a shit about your analytics or how much money you think you'll lose from referrers disappearing. Privacy is more important.
The problem is, all of these "features" allowed by referrers are user-hostile actions.
If referrers went away tomorrow, users wouldn't notice the difference or care. Publishers would get angry and think "we can't milk our content/visitors for as much money anymore!" But that doesn't really change the relationship with the customers who value your product or business so I personally can't believe it will make a sizable difference in the end.
I've spent tens of thousands of dollars hosting free and open source software for millions of people over the years, and I make sure to prevent bandwidth theft from other sites that cuts into my ability to provide that service. Ad revenue (all responsible ads... no popups, no sound, etc.) doesn't cover the cost of hosting and bandwidth even when sites are prevented from being unethical.
Take away referrers and it will be replaced by more complicated technology that serves the same purpose. CDN providers have secure links, for instance, that use an API to allow sites to generate a one-time use or limited time window link for a download from the CDN of a given file. It's more complex, but it's what I'd switch to tomorrow for downloads if referrers went away.
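A generic version of such a secure-link scheme - sign the path plus an expiry with a key shared with the CDN - looks roughly like this (an HMAC sketch under assumed names, not any specific CDN's API):

```python
import hashlib
import hmac
import time

SECRET = b"shared-with-the-cdn"  # hypothetical key shared with the CDN

def signed_url(path, ttl=300):
    """Build a limited-time-window download link: expiry timestamp
    plus an HMAC over the path and expiry."""
    expires = int(time.time()) + ttl
    msg = f"{path}:{expires}".encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"https://cdn.example.com{path}?expires={expires}&sig={sig}"

def verify(path, expires, sig):
    """What the CDN edge would check before serving the file:
    the link hasn't expired and the signature matches."""
    if int(expires) < time.time():
        return False
    msg = f"{path}:{int(expires)}".encode()
    expected = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)
```

The origin server generates the link per visitor; anyone who copies the URL elsewhere only has it until the window closes.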
How is this a bad thing? How would removing referrers harm you in any way?
For images, the whole point of a CDN is to keep them in one place with a long expiry (a week or more), possibly downloaded from a nice geographically close edge node, so that visitors load the images very quickly once and then cache them for the next pages and later visits to the site. The only mechanism CDNs currently implement for keeping folks from leeching/hotlinking images is checking referrers. The unique download link bit would negate the whole benefit of the CDN (you'd lose caching, and the back and forth to generate the unique URL would slow it down), so that's out. Basically, lots of folks would ditch CDNs and host internally, possibly using server log checking to see if a given IP recently hit a page. Otherwise they have to deal with lots of bandwidth leeches. The end result would be slowing down visitors' experience.
So, for both images and downloads, users wind up losing if referrers go away. It's far better to just leave things as they are: enable referrers by default, let the privacy-conscious disable them (sending blank ones), and build systems that take both userbases into account. Again, as a software developer, publisher, and host, I don't really care about referrers in terms of violating privacy, so I don't care if you disable them and send blank ones. I purposely set up my redirects and CDNs to allow for that. I care about them in terms of continuing to deliver services effectively to my users without competitors stealing my resources.
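The policy described - allow blank referrers, block foreign ones - boils down to a check like this (a generic sketch, not any particular CDN's config; the host names are hypothetical):

```python
from urllib.parse import urlsplit

# Hosts allowed to embed our images/downloads (hypothetical site)
ALLOWED_HOSTS = {"example.com", "www.example.com"}

def hotlink_ok(referer_header):
    """Referrer-based hotlink check: allow requests with no
    referrer (direct visits, privacy tools sending blank ones) or
    a referrer from our own hosts; reject everything else as
    hotlinking."""
    if not referer_header:
        return True
    return urlsplit(referer_header).hostname in ALLOWED_HOSTS
```

Note the blank-referrer case is deliberately allowed, which is what lets privacy-conscious users disable referrers without breaking anything.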
This is actually an interesting problem, because it's already solved but most people aren't using the solution: if you have a large file to distribute to a large number of people without authentication, use BitTorrent. As far as I can see there are two primary impediments to this:
B) Images are exactly the wrong size. They're big enough that you can't just ignore hotlinking but not big enough that you want to pay the overhead of connecting to 50 different peers instead of one to get a good transfer rate. But that just requires some adjustments to the protocol; if you're looking for realtime retrieval for display in a webpage you would probably want to use UDP and then use erasure coding to deal with slow/broken peers and packet loss. If you have a 60KB image, you can send a ~50 byte packet to each of a dozen peers and have ten of them each send 6KB (approximately four packets) to the target with 6KB worth of erasure bits from each of the others (which also allows the image to be constructed once 60KB of data is received in total from any collection of peers), and now the image is costing you ~600 bytes instead of 60KB. And if the image hasn't been received in 150ms, add more peers.
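The erasure-coding idea can be illustrated with the simplest possible code - one XOR parity chunk, surviving any single loss (real schemes like Reed-Solomon tolerate many more; everything here is an illustrative sketch):

```python
def xor(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data, k):
    """Split data into k equal chunks plus one XOR parity chunk.
    Any k of the resulting k+1 pieces can rebuild the original."""
    size = -(-len(data) // k)  # ceiling division
    chunks = [data[i * size:(i + 1) * size].ljust(size, b"\0")
              for i in range(k)]
    parity = chunks[0]
    for c in chunks[1:]:
        parity = xor(parity, c)
    return chunks + [parity]

def decode(pieces, k, length):
    """pieces: dict {index: chunk}; indexes 0..k-1 are data, index
    k is parity. Repairs at most one missing data chunk."""
    have = dict(pieces)
    missing = [i for i in range(k) if i not in have]
    if missing:
        (m,) = missing  # this toy code survives exactly one loss
        rebuilt = have[k]
        for i in range(k):
            if i != m:
                rebuilt = xor(rebuilt, have[i])
        have[m] = rebuilt
    return b"".join(have[i] for i in range(k))[:length]
```

With more parity chunks (and a proper code), the receiver can finish as soon as any k pieces arrive from any mix of peers, which is what makes the slow-peer problem tolerable.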
As for images, the BitTorrent protocol would just be way too slow, even with some changes, when compared to HTTP with SPDY and all the internal tweaks done at the geographically close CDN edge nodes to make them as fast as possible. 150ms before adding a new peer is an eternity in an age when 47% of people expect a web page to load in 2 seconds or less, and the abandonment rate increases with each second that passes, with 40% abandoning a little after the 3-second mark.
Of course, even if this were released today and referrers were phased out in June 2014, we'd still be able to use them for at least 5 years until you could safely assume they were gone. Likely longer.
You are of course correct that we don't have a "right" to this information. But I've discovered, many times, through the referrers in my logs, links to my pages from some very interesting places that I might not have discovered otherwise (because the link information that Google discloses is woefully incomplete).
Any user who wants to hide referrer information can easily do so in a variety of ways. For example, I wrote a bookmarklet that does this for you: http://lee-phillips.org/norefBookmarklet/
It's irrelevant how useful you find the information. You'd probably find it useful to know the name and email address of everyone that visits your site too... So?
AdWords paid search clicks still send the info, if that's what you're talking about.
But this discussion is about completely removing referrers, not just stripping search keywords.
Right now, once you hit 10M pageviews a month you either have to sample or pay $150k/year for Premium.
I don't need support, an account manager, four-hour turnaround on data, an SLA, etc. I just need more pageviews sometimes.
I've heard that running a $10/month AdWords campaign gets you higher caps, but it may be an internet old wives' tale.
> 1 billion hits per month
> Up to 3 million rows of data in unsampled reports
If you're doing conversion tracking etc. you're going to start getting sampled data at some point, but it's the pageviews our folks care about.
I just thought I'm already using NoScript, AdBlock, RequestPolicy, BetterPrivacy, Cookie Monster, Blender, and HTTPS-Everywhere; might as well go all-in.
Firefox, ABE, NoScript, Request Policy, Ghostery, HTTPS-everywhere, hygiene.
The irony of my militant approach toward privacy is that I probably make myself more interesting to would-be eavesdroppers by my carefulness than I would if they could see it all -- I'm just not that interesting.
On the plus side, the lowest common denominator of legitimate-threat hostiles is greatly increased. I'm fairly boring even to neighbors and law enforcement and copyright holders and scam artists and advertisers. I imagine I'm pretty stultifying to nation-state actors. :)
Still, I'd like everyone else to join me so that I can get lost in the crowd. The untracked, encrypted, well-rested crowd.
Come on in, the water's fine.
I would only advise against Ghostery, as they whitelist some trackers if they're paid to. With every update I had to deselect these trackers again.
And Evidon (Ghostery's mothership) selling usage information really bugs me: http://venturebeat.com/2012/07/31/ghostery-a-web-tracking-bl...
I would recommend the FF add-on Disconnect: https://addons.mozilla.org/en-US/firefox/addon/disconnect/
Does anybody have an idea how I could make my own sites secure in a relatively cheap way? It's just a personal site without that much traffic, so spending much money seems a bit much to me.
I work on Disconnect. I don't understand why any hacker would still put Ghostery on their machine:
* Ghostery is run by former ad execs (7/9ths of their executive team): http://www.evidon.com/our-team
* They make their money (I've heard tens of millions of dollars per year) selling user data to ad co's and data brokers: http://www.evidon.com/#block-views-from_our_partners-block
I also like how Ghostery provides URLs for each tracker source (actual payload) that you can easily view on their site.
There's also a database with short description, affiliations and privacy terms for each tracker (e.g. https://www.ghostery.com/apps/google_analytics).
I really appreciate an ethical alternative to tainted Ghostery and hope you guys will catch up soon.
== HTTP Switchboard
For HTTP Switchboard, glancing at the matrix, I could easily identify that what was requested was a CSS file, and I then proceeded to block, with one click, anything coming from `echoenabled.com`. The page still displayed properly for all three blockers.
Also of note, with Disconnect and Ghostery, there were some scripts running that requested data from api.echoenabled.com and echoapi.wpdigital.net every few seconds. These requests were blocked by HTTP Switchboard without any intervention on my part.
The first thing I tried when I wrote Disconnect was to block every third-party domain. Within an hour, I realized that I'd broken the whole web. So I built a crawler to identify and categorize the most prevalent third-party services instead. The domains you list under Disconnect would all be categorized as content by our crawler - resources without which most people would consider pages broken, and that Disconnect therefore doesn't block by default.
You're getting personal. I am talking about the extensions, not you. If for-profit companies are going to claim to care about user privacy, they should expect this claim to be put to the test, especially in the current era.
Above I am providing hard data, not an opinion.
> "The domains you list under Disconnect would all be categorized as content by our crawler"
The page works fine if you whitelist only the page domain. If someone wants the comments, then it's a matter of whitelisting `echoenabled.com`. The rest doesn't appear so important, so I'd personally rather not ping them. But the point is, I am of the opinion that people need the ability to know exactly where their browser connects; then they can agree/disagree/not care. I don't see how one can make an informed decision without proper information.
There is no way this 1x1 pixel gif would break a web page. And yet it's not blocked by Disconnect, as reported in another comment below. I also reported how adobetag.com is reportedly blocked by Disconnect and yet a script from adobetag.com was downloaded anyway.
Can't you appreciate why I am rather skeptical? Going personal rather than provide a credible answer is not going to dispel this skepticism.
* Chrome, Safari, and Opera give precedence to the newest extension - if that extension blocks a request, older extensions don't see the request.
* Firefox gives precedence to the oldest add-on.
* Disconnect has more than a million active users and Ghostery has more than two million.
* Disconnect is also as public as ABP: https://github.com/disconnectme/disconnect.
* I can't seem to see a list of all Google trackers. Some sites have multiple Google trackers, but if I click on the icon to see them it just turns off blocking for them. I'm assuming sites don't have 6 GA trackers, what are the others?
* I can't seem to turn off Content trackers for all sites. Configuring this site-by-site seems clumsy, to say the least.
#2 first: Code to block everything marked as content with one click was either just checked in or is about to be.
For #1: Disconnect groups tracking requests by company. If you want to see all the Google services that Disconnect filters, you could look through the filter list (services are grouped by category and company here, so start at lines 33, 1,882, and 2,326): https://github.com/disconnectme/disconnect/blob/b27abbf033c6....
I've never been able to detect any nefarious network traffic caused by Ghostery (and I've looked), but I don't like the games they play, so I'll be pleased to ditch them without ceremony.
I was interested in your project but your smearing of 'competitors' with FUD is seriously disconcerting.
That seems to go against the OS nature of the project.
The unencrypted list is at
And formatted as JSON at https://disconnect.me/services-plaintext.json.
The encrypted list is also trivial to decrypt with the SJCL code in https://github.com/disconnectme/disconnect/blob/master/firef....
Aren't you a former DoubleClicker turned righteous? And don't you also employ an ex-NSA dude?
And no, we do not sell user data, just tracker data.
Ghostery seems to be in the business of selling the data that I forget to tell them they may not collect. This is intrinsically a sneaky thing to do.
Yes I know about and use the "default to blocking" setting, but I don't think there is much argument that Ghostery users download your software with the expectation that the default would be anything else. But it is. And that's sneaky.
So you offer a very useful product, for free, and make money off of the people who fail to configure it so that it performs the only service they would ever purposely download it for.
Again, I have sniffed Ghostery looking for violations of my configuration settings, and never found any. I believe that it follows its configuration settings, and I am thankful for its existence. And I recognize that development and maintenance of it is not free. Presumably you are not a volunteer.
I have gotten value out of Ghostery, but apparently that has been on the backs of other users who want the same thing, but are less-careful than me about reading configuration options, and that doesn't sit well.
This is somewhat wrong: Ghostery has had the GhostRank feature ever since version 1. It has always been an opt-in deal; users who trust us may turn it on to provide us with data. For the first 4 years the data sat without any use, until recently, when Evidon figured out how to turn it into money. Even so, the data Evidon sells has nothing about any user, merely tracker data. Here are some samples of what's actually delivered to clients: http://www.knowyourelements.com/ and http://www.evidon.com/evidon-trackermap/tagchains-static.htm....
As I said, we do not trick users into anything, and are as transparent about where the data goes as possible; if you have suggestions on how to improve this, please let us know. We currently cover this question in every listing Ghostery has, all options pages, the web site, the FAQ, and many posts on our blog.
As far as defaults: originally, Ghostery was detection software designed to "reveal the invisible web", but it has since added blocking. Our official stance is that we do not make decisions for the user, but we do run every user through an install wizard that explains what's up. Disconnect's stance here is different: they do offer default blocking, though they also have their own whitelist built into it without telling users about it. We are going to add some easy configuration in the near future that will pre-block stuff, but this is still in the works.
>Online marketing companies need better visibility into real-world applications of their technologies and those owned by their competitors. GhostRank data is sold as reports to businesses to help them market to consumers more transparently, better manage their web properties, and comply with privacy standards.
> So you offer a very useful product, for free, and make money off of the people who fail to configure it so that it performs the only service they would ever purposely download it for.
I gave some numbers above that show, in practice, just how many users are in one of these unexpected configurations:
> Ghostery's game seems to be tricking users into sending their data to Evidon. Going off the company's own numbers, something like 45% of Ghostery users send Evidon data (by comparison, only 2% of Firefox users share data through Telemetry).
> And how exactly is it trickery if users have to opt-in to the program and they're told what the program does?
Ghostery seems to rely on vague messaging (last I looked, they don't actually say anywhere in their extension that they sell the data you share to ad co's and data brokers) and UX "optimization" (what quesera dubbed the "reconfigure-on-update dance", for example) to get less attentive users to leave blocking off and to send data - as the numbers show.
When you enable GhostRank, Ghostery collects anonymous data about the trackers you've encountered and the sites on which they were placed. This data is about tracking elements and the webpages on which they are found, not you or your browsing habits.
Online marketing companies need better visibility into real-world applications of their technologies and those owned by their competitors. GhostRank data is sold as reports to businesses to help them market to consumers more transparently, better manage their web properties, and comply with privacy standards.
I'd argue that Ghostery should come with a default configuration of ALL trackers and cookies blocked. I'd argue even more strenuously that after the user configures Ghostery manually to do so, ALL should continue to mean ALL even after updates. Ghostery currently has 700 3P cookies in their database, and almost 1700 trackers. There is no valid argument, imho, that a user who configures to block ALL really means "block ALL right now, but if you see any new ones, I would really like to try them out first!"
However, I mostly agree that Evidon has been up front and straightforward about what they do and how they do it. I want to like Ghostery. I do like Ghostery. This little bit of sneakiness though, honestly, taints the whole operation. You can call it an oversight, and I will agree that it can't possibly have much marginal value to Evidon...but it's somewhere between tone-deafness and carelessness, two qualities that call for heightened vigilance.
If you feel strongly that your opinion is important and should be prioritized, please create a relevant topic here: https://getsatisfaction.com/ghostery/ and gather support to change it so we address it quicker.
Anonymizing data is hard.
Ghostery does what the user tells it to do. If you are seeing unblocked trackers, most likely it's because we've added new trackers and you didn't select "block by default" for new trackers when the list gets updated. You can change this preference by going into Ghostery options, Advanced, and reviewing the "auto-update" section.
And heres a full explanation as to what Evidon gets and what it does with it: http://purplebox.ghostery.com/?p=1016023438
But back on topic:
Why should I (as a fairly technically adept person) have to search deep inside the configuration to maybe find a feature that I expect to be active by default?
So sorry - Ghostery is far off my radar nowadays, as I felt tricked, the victim of a dark pattern. And I nowadays have a strict "zero tolerance" policy regarding sites/services/tools that act this way. Ghostery was, is, and will be on my list of tools that I would never ever recommend to anybody.
Btw, I do not mind the downvoting, as it shows me I must have done something right:
"Methinks thou dost protest too much." (after Shakespeare's Hamlet)
just because you don't like someone (likely based on one comment on a web site...) doesn't mean that they should be downvoted...
His comment seems willfully ignorant of the problem, which is that Ghostery calls itself a tracker-blocker, but squirrels that obviously-desirable config option away under "advanced" settings.
If Ghostery was on our side, really and truly, that would be the default. Indeed, it probably wouldn't even be an option.
Of course when the tracker list is updated, I want to block the new ones!
No post on HN, no matter how helpful, correct, and civil, changes that this operating model is essentially a trick.
I just spent 45 seconds explaining it, but it would have been faster, and pretty defensible, to just downvote.
On the other hand, nuking his comment into gray-land would obscure useful instructions for making Ghostery do what it is assumed to do in the first place. So I agree, downvoting here is destructive.
Err, the option under Advanced just lets you set it to auto-block new elements as they're added. When you first install Ghostery, the walk through lets you pick that option as well without having to be "advanced" (oo, scare quotes!) in your preference setting.
Ghostery does not call itself a tracker-blocker; our users do. This is an obvious oversight for most users, and it's something that we will address, but at this time, Ghostery is designed to reveal the invisible web and give users control over it, not make decisions for them...
As far as the feature: at implementation time, we queried a set of users who agreed that when new trackers are added, there is no need to let the user know until s/he encounters one for the first time and reviews it. Obviously, this is another setting we will be moving out of Advanced and into the wizard, so users may review and select this option at install time.
Sorry, I cannot accept that answer.
From your home page, in big letters, right now:
> Knowledge + Control = Privacy
> See which companies are tracking you
> Block over 1,600 trackers
> Learn how they track
> Ghostery is FREE
Please be honest with us. How do you view your operation internally? What services do you provide, and to whom?
I'm not sure what you mean by that question. Ghostery is a separate team inside Evidon with full control over what we do. I'm one of the people managing the product, and my customers are users of Ghostery. As such, my primary goal isn't improved blocking, it's education - to let users know that they are being tracked, to provide relevant info on who the trackers are and where to find out more about them, and finally, to provide control in the form of blocking.
I won't needle you with follow-up questions, but for the record, I think there's more than a little cognitive dissonance here regarding customers and conflicting goals.
This is why people who spend time thinking seriously about the issue are concerned about Ghostery, but I accept that "trust" doesn't pay the bills.
Regarding your sites' security, what sort of advice are you looking for? OS-level hardening? Service config?
BTW, if you're using Chrome, you might also want to look into the "Users" section of preferences. You can create multiple user profiles with separate history, cookies, cache, etc. You can have a different user profile per window at the same time. (After you create a second user, there will be an icon in the top right corner of the window to open a window as another user.)
I like to use this to protect against CSRF. (I do financial stuff as another profile and facebook as another profile.) It's also useful for QA if you need to be logged in as multiple people at the same time.
You can even modify your Firefox by changing values in about:config
geo.enabled ---> false
keyword.URL ---> Your Search engine query url
browser.urlbar.trimURLs ---> false
noscript.ABE.wanipCheckURL ---> 0
network.http.sendRefererHeader ---> 0
network.http.sendSecureXSiteReferrer ---> false
^ These break some site functionality that relies on referrers, though that's rare at least.
I measured something, and that is the result of my measurement. People can make an informed decision with proper information. I found that the page served well without all the extra requests that Ghostery and Disconnect allowed.
Given the results, I am quite surprised you would say "look to be first parties serving content for the page".
> "If you go to the Guardian, you're going to be tracked by the Guardian"
Aside from `discussion.guardianapis.com`, the others are clearly third parties.
It seems my definition of "third party" aligns more with that of the EFF: https://www.eff.org/deeplinks/2013/06/third-party-resources-...
Now, you focused on the Guardian; how about the two other cases I measured?
I'm sure you don't like the result, but this is what came out when I decided to audit. Your response: You don't think it is a problem. That is settled.
> Given the results, I am quite surprise you would say "look to be first parties serving content for the page".
I believe every single domain name you listed (except the Google and Twitter domains, like I said) is a domain owned by or a CDN used by the Guardian or hosts an app run by the Guardian - prove me wrong:
This is a terrible answer: you are suggesting that Disconnect knows exactly which third parties are legit when visiting a web page, and that somehow you can vouch that none of these hostnames is a threat to privacy (this is what your defense implies).
`static-serve.appspot.com` is no different from `ajax.googleapis.com` (you didn't list this one, why?): they are third-party hostnames, and some are CDNs, which is exactly why they are not to be trusted; you can end up hitting these hostnames from places other than just the Guardian, which is the problem.
In any case, the legitimacy of their purpose is not the point. They are third-party hostnames: unless told, the user wouldn't know that he is also hitting these hostnames.
I will note that you completely disregarded the other results which are even more embarrassing to explain (like `simplereach.cc`: "SimpleReach tracks every social action on each piece of published content to deliver detailed insights and clear metrics around social behavior.")
Yes! You now know how Disconnect works - Disconnect's filter list is based on weekly crawl data that identifies what the most prevalent third parties on the web are.
> `static-serve.appspot.com` is no different than `ajax.googleapis.com` (you didn't list this one, why?)
You think that URL might belong to Google, which I already called an exception 2x?
> I will note that you completely disregarded the other results which are even more embarrassing to explain (like `simplereach.cc`: "SimpleReach tracks every social action on each piece of published content to deliver detailed insights and clear metrics around social behavior.")
I examined and debunked the entirety of the first example on your page, so I'm not inclined to waste any more time on your so-called "science".
Despite "Adobe tag" marked as blocked by Disconnect, these requests were not blocked:
This is the part that bothers me: fooling people into thinking they are shielded against this kind of thing. That is not ok. I accept bugs can happen, but so far your position has been to rationalize why these 3rd-party domains are not blocked.
Oh and in this particular case, Ghostery blocked everything it said it blocked.
Ghostery's database is not static either, and we update it very often; if you feel we are missing something, please let us know.
Curious, how would this differ from whitelisting cookies in the browser's own settings?
There's this add-on called Blender that is supposed to make your browser send headers like the average browser, you might be interested in that.
Like to point out what may be obvious to some but not others, when using NoScript you may want to remove Goog, Yahoo, etc from the default whitelist.
Before Blender I used various iterations of FF for testing & different surfing types(Waterfox/PaleMoon/ESR), but it appears I'll only be doing that for testing purposes anymore.
I also share your concern that my (lack of a) footprint makes me an outlier, and thus inherently more interesting to an adversary with the power and reach of No Such Agency. There's precisely zero I can do about that, without compromising my local objective of not being followed by every damned website, though, so I just carry on.
This is interesting. I would have actually expected more. The last time I remember someone analyzing this, I believe the result was "<script ... ga.js>" was the most popular tag on the web by far.
This was, however, a few years ago.
First, as <bdt101> pointed out, you "cannot track a unique visitor across the web using GA cookies" because of the way they're designed:
Second, the NSA doc as excerpted in the WashPost article talks only about Google's PREF cookie, which is set only when you go to, say, Google.com, not when you go to a non-Google property. It's a first-party cookie used for things like saving language preferences when you're not logged in, not for advertising across other properties. (That's what the DoubleClick cookie is for.)
I've been in the internet industry for a while now.
For what it's worth, I think your thesis has significant value.
Google Analytics requests are also only unencrypted if the site itself is unencrypted, so the fact that the GA request includes the referrer doesn't seem relevant (since the referrer would have already been transferred in the clear in the Referer header on the initial HTTP request.)
I'm curious. In that case, the GA JS is requested from what you call the "end page", so the referrer it has should be the "end page", not the one before it.
It does everything I want it to do (so far), but I'm not an analytics power user by any means.
 - http://piwik.org/
Add this line to your hosts file:
0.0.0.0 google-analytics.com ssl.google-analytics.com www.google-analytics.com
Search - Check (goog.com)
Mail - Check (Gmail)
Browser - Check (chrome)
Devices - Check (Android/Chrome books)
Websites - Check (Double click/AdMob, Unknown number of other companies)
Google Analytics - Check
Your DNA - Check (23andMe)
Cars - Check (self-driving cars)
I am probably missing large chunks of tracking even with this list.
Where do you draw the line so that organizations like Google do not hand over (willingly or inadvertently) our lives to the NSA, GCHQ, ASIO, CSIS, and whatever New Zealand's intelligence spooks go by, on a platter?
Heterogeneity - Make the buggers at least have to work a little bit to invade your privacy.
If every site switched from Google Analytics to, say, Mixpanel... nothing would change. The NSA would just target the equivalent Mixpanel cookie. So long as there are popular third-party cookies, this will be a problem.
Having all your mail sitting on someone else's server means it can be handed over by that company in response to a government request, legal or otherwise. After 6 months, it's not even a fourth amendment issue and no warrant is required; it's not "your" mail when it's data on someone else's server.
This doesn't require technical prowess the average person doesn't have. You can use your ISP's mail server, or a professional service like Rackspace Mail. There are free native e-mail clients for every desktop and mobile platform. You can still get instant mail notifications with IMAP Push. Just set at least one of your computers to delete mail from the server after downloading it.
This is all assuming you consider your adversary to be the NSA. If it's google, well, choose other vendors. If it's both, you'll have to consider both your destination and wire-protocol axes.
FWIW, if your traffic is split evenly between 3-4 main vendors (e.g., Google, Amazon, Bing), and all over HTTPS, it's hard to tell what you're doing.
An enormous number of things have some connection into Google. Other connections into Google's equipment potentially include Voice, Talk, Hangouts, embedded Google Plus +1 buttons, embedded YouTube, Blogspot sites, embedded Picasa images.
Google runs ReCAPTCHA. ( http://www.google.com/recaptcha/ )
If you email someone with a Gmail account, your email address is on Google's servers, with the email headers containing your IP address.
Google's Safe Browsing URL check is built into Firefox. It normally works with hashed URLs, but Google could still track that you are using it, and there is a simpler version of the API that lets applications send plain-text URLs without you knowing ( https://developers.google.com/safe-browsing/ ).
Sites hosted on Google AppEngine ( https://developers.google.com/appengine/ ).
On iOS, Safari defaults to Google suggestions - i.e. sending everything you type in the address/search bar to Google.
Google Maps, built into other websites and services. Google Geolocation API built into other software ( https://developers.google.com/maps/documentation/business/ge... ).
Sites embedding Google Sparklines ( https://developers.google.com/chart/interactive/docs/gallery... )
Links going via Google's URL shortening service Goo.gl
Not counting things you choose to use (Chromecast, music, docs, drive, Now, voice search, News, Groups, Finance, Toolbar, Android sat nav, Chrome's open tab sync between your devices via Google Cloud, etc.).
That's not to say they are good or bad, or that they are or are not tracked. Just that it's way too late to "avoid Google" just by switching away from Gmail and blocking Google Analytics.
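On the hashed-URL lookup mentioned for Safe Browsing above: the idea is that the client sends only a short hash prefix of the URL, not the URL itself. A simplified Python sketch of that computation (the real protocol first canonicalizes the URL and generates several host/path expressions; here we hash a single expression directly, and `hash_prefix` is just an illustrative name):

```python
import hashlib

def hash_prefix(url_expression, prefix_bytes=4):
    """Compute the SHA-256 hash prefix a Safe Browsing-style client
    would send instead of the full URL. Simplified sketch: skips
    canonicalization and host/path expression generation."""
    digest = hashlib.sha256(url_expression.encode("utf-8")).digest()
    return digest[:prefix_bytes].hex()  # 4 bytes -> 8 hex chars

print(hash_prefix("example.com/"))
```

Because many URLs share a given 4-byte prefix, a match only triggers a follow-up request for the full hashes, which is what limits how much the server learns about your browsing.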
Yes, I know Google likely didn't cooperate in this, but they built a giant tracking engine, so it's not surprising to see it repurposed.
I'm sure they have plausible deniability.
Disable 3rd party cookies. It solves a lot of these types of tracking issues.
What kind of "preferences" change in that way each time the user browses away from the page, and how does that help the "user experience"?
Browser string, viewed content, frequency and magnitude of access, user authentication cookies, and ad-tracking cookies all would be tremendously helpful for this purpose.
Also, I'm betting they can easily tell when specific computers on a network are powered on or not based on fixed-interval network traffic from anything that polls regularly, such as anti-virus, news readers, mail clients and background updater services.
All of the above could aid in painting a more complete per-user picture behind the NAT, without actually having to compromise the local network or individual computers in question.
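As a rough sketch of how those signals could be combined, here is a hypothetical fingerprinting function in Python: it canonicalizes whatever passively observable attributes are available and hashes them into a stable per-machine identifier. All field names and values here are illustrative assumptions, not any real observer's scheme.

```python
import hashlib

def fingerprint(signals):
    """Combine passively observable signals (browser string, tracking
    cookies, traffic timing) into a stable identifier for one machine
    behind a NAT. Sorting keys makes the result order-independent."""
    canonical = "|".join(f"{k}={signals[k]}" for k in sorted(signals))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

machine_a = {"user_agent": "Mozilla/5.0 ...",
             "ga_cookie": "GA1.2.123",       # ad-tracking cookie
             "poll_interval_s": 300}         # background updater cadence
print(fingerprint(machine_a))
```

The same inputs always yield the same identifier, so an observer can distinguish machines sharing one public IP without ever touching the local network.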
As long as these companies build the best tracking engines the world has ever seen, engines that can identify anyone and everything they're doing, it's just a matter of time before governments get their hands on that data, legally or illegally. It's just too tempting to pass up.
If I were Google I'd start thinking long and hard about how to solve this problem, and try to make money by actually being on the user's side when it comes to privacy, not against them. Google will ultimately fail if their goals aren't aligned with those of the users anymore.
Of course, I'm sure they have some other way to pwn me, but it's nice to know that I was doing something right.
Also, I'm on Iceweasel/ Firefox instead of Chrome. It's probably nothing to worry about, but you can never be too careful these days.
This news makes me happy to see there's a point to me having had Google Analytics blocked for the last two years. I've noticed a new thing lately, Google Tag Manager. Is there any point in whitelisting it? Does anyone know what it does?
Unless you are something more than Ghostery's DBA.
It would need to be a recognizable image and symbol. The image would link to the site above, that informs users how you respect their privacy and do not track them. Personally, I'd add it to my sites, because with all the recent concern about privacy, I think my users would appreciate this change, and it would provide some advantage over competing sites. I'd like to visit a site, see that image in the footer, and feel more confident using their service.
I think it would be a good way to encourage change from developers. Very few are going to pull Google Analytics on their own. However, if they get pressure from their users to follow a certain privacy standard, and by doing so they can drop an image on their site to illustrate the change and potentially increase trust and improve their reputation, we might see some improvements.
Comments and further improvements welcome.
Can be a little inconvenient at times but seems justified now.
- Cross-site requests not allowed without whitelisting. This means some setup will be required at first (for example, for separate image domains used by Amazon, Google, Yahoo, etc.), but after a bit it shouldn't be a problem. This also serves as a "better adblock" in some ways, as it blocks ad networks without relying on a database that needs to be updated.
- All cookies blocked by default; whitelist as necessary
- No Flash or Java, period. If I need Flash for something, I'll launch a VM.
Sadly, Safari doesn't support whitelisting for any of this. Chrome supports whitelisting of cookies and JS by default, but the Chrome UX is worse than Safari's IMO (for a few reasons, but that's another topic entirely).
RequestPolicy handles the first one quite well, but is unfortunately Firefox-only.
Firefox is the answer. No other option makes any sense, if you're serious about this stuff. I understand that some people like the UI or process model of other browsers better, and that's where the evaluation of priorities comes in.
The good news is that the days of Chrome's technical superiority are truly over. Speed, memory consumption, rendering engine: Firefox is all there and sometimes better.
Firefox is also the only browser with an ability to sanely handle tabs on the side, which is the only sane place to put tabs on modern screens. If I had to choose between sane tabs and sane privacy policies, I might have some soul-searching to do. I understand that everyone has their own equivalent, but be sure not to dismiss Firefox based on historical issues.
It's incredible how much inertia there is with that. The majority of the people I know who switched to Chrome did it back when Firefox was blatantly slower, and that's the image that's stuck in their heads. It's incredibly hard to dislodge, and hard to get someone to try it long enough to change their mind again.
Firefox has a tough marketing problem right now. They need to start a nice "Firefox is faster" campaign.
Posted on HackerNews two days ago.
Disables Google tracking and logs the user out of the Google search engine:
* keeps you logged into Gmail
* also removes ads
* removes cookies and sessionStorage/localStorage
On the first run, you need to refresh the Google page to log off.
It also removes the Google Analytics cookie :)
From a business perspective, why are Google and Facebook getting involved in this and calling for the government not to track users? Won't that just bring more attention to their two business models of... wait for it... tracking users and selling their information?
Previously, when the customers didn't care, they did nothing to involve themselves with this, and almost certainly aided the government.
It's purely business. Google and Facebook don't have morals, they have a bottom line. You can understand their actions by following the money.
You can get your open-source and locally running web analytics here: https://prism-break.org/
Like OWS protesters, for example.