
Pretty sure the mechanics as detailed in the article are factually wrong. Google does not sell personal data. It collects all the personal data it can and then stockpiles it, where it becomes an asset that the company can use to build products and charge rent (largely from advertisers, but also from consumers). Selling an extremely valuable proprietary asset to competitors would be a really dumb move - the money in today's economy is in owning monopoly assets and charging rent for them.

The actual mechanism behind the behavior the author observed is that the user went to Google, searched for a product, and then visited the product's website. The product's website contains a tracking beacon that associates a unique cookie for the user with the visit. The website's owner can then choose to run ads on Facebook or Twitter that are shown only to people who previously visited the website. This program is well-known among Facebook advertisers (it's apparently quite profitable to them), but Facebook understandably doesn't publicize it much to the general public.
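For readers unfamiliar with the mechanics, here's a minimal sketch of the kind of beacon described above. The domain names are invented and Flask is used purely for illustration; the real Facebook pixel is more elaborate, but the cookie-plus-audience idea is the same.

    # Hypothetical sketch of a retargeting beacon, not any real ad network's code.
    # shop.example embeds <img src="https://tracker.example/pixel.gif?page=/dog-beds">;
    # the tracker ties each visit to a long-lived third-party cookie, and the shop
    # can later run ads shown only to the cookie IDs that hit its pages.
    import uuid
    from flask import Flask, request, make_response

    app = Flask(__name__)
    visits_by_cookie = {}   # cookie id -> set of pages seen (in-memory toy store)

    @app.route("/pixel.gif")
    def pixel():
        cookie_id = request.cookies.get("uid") or str(uuid.uuid4())
        visits_by_cookie.setdefault(cookie_id, set()).add(request.args.get("page", "/"))
        resp = make_response(b"GIF89a", 200)            # placeholder 1x1 image payload
        resp.headers["Content-Type"] = "image/gif"
        resp.set_cookie("uid", cookie_id, max_age=60 * 60 * 24 * 365)  # persistent ID
        return resp

    def custom_audience(page_prefix):
        """Cookie IDs that visited matching pages -- the set an advertiser can target."""
        return {cid for cid, pages in visits_by_cookie.items()
                if any(p.startswith(page_prefix) for p in pages)}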

Agree that it'd be cool if the person involved could actually control who the data went to, and where, and be compensated for its use, but that'd likely require a fundamental change in how the web works.




Google does sell personal data, a little at a time. If I make an ad that targets young dog owners in Seattle, then when someone clicks that ad I know Google thinks they're a young dog owner in Seattle. Google sold me data about that person.
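To make that concrete, here's a toy sketch (campaign name and URL invented) of how the inference happens without Google handing anything over directly: the advertiser labels the ad's destination URL with the audience it bought, so every click arrives pre-tagged with Google's guess about the visitor.

    # Hypothetical example: the advertiser encodes the purchased audience into the
    # ad's landing URL, so each click reveals which segment Google placed the user in.
    from urllib.parse import urlencode

    def ad_destination(base_url, audience_label):
        return base_url + "?" + urlencode({
            "utm_source": "google_ads",
            "utm_campaign": audience_label,   # e.g. the exact targeting that was bought
        })

    print(ad_destination("https://shop.example/dog-beds", "young-dog-owners-seattle"))
    # Anyone landing on this URL is someone the ad platform believes fits that segment.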


I guess, but that's basically the same information that you can get through the referer header in HTTP or by A/B testing different print ads. If you put up flyers in a certain neighborhood telling people to visit a URL, and you put up flyers in a different neighborhood with a different URL, then when somebody visits your site you know where they live.
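A toy version of that flyer example (paths and labels made up): one URL per neighborhood means the landing path alone tells you roughly where a visitor saw the ad, and the standard Referer header gives the same kind of signal for online links.

    # One made-up URL per neighborhood; the landing path attributes the visit.
    FLYER_PATHS = {"/welcome-a": "neighborhood A", "/welcome-b": "neighborhood B"}

    def log_visit(path, referer=None):
        area = FLYER_PATHS.get(path, "unknown source")
        print(f"visit attributed to {area}; Referer: {referer or '(none)'}")

    log_visit("/welcome-a")                             # typed in from a print flyer
    log_visit("/welcome-b", "https://www.google.com/")  # clicked through from a search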

This is of a dramatically different scale from the information Google has for its own purposes.


I would be a lot more OK with this if Google were only using that data about me themselves instead of letting anyone with a little money use it too!


> "I would be a lot more OK with this if Google were only using that data about me themselves instead of letting anyone with a little money use it too!"

That would be better, but Google would still be subject to attacks by adversaries and (secret) subpoenas by governments.

Much better would be if they did their analytics on user data in real time and stored only the results, discarding the source data right away.
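A minimal sketch of that idea, with an invented event shape and assuming the only artifact worth retaining is an aggregate counter per coarse segment:

    # Keep the result, drop the raw event: aggregate on ingest, never persist the source.
    from collections import Counter

    segment_counts = Counter()   # the only thing retained

    def ingest(event):           # event = {"segment": ..., "query": ..., "location": ...}
        segment_counts[event["segment"]] += 1
        # the raw query/location are never written anywhere; they go out of scope here

    ingest({"segment": "seattle/dog-owner",
            "query": "best dog beds",
            "location": (47.6, -122.3)})
    print(segment_counts)        # Counter({'seattle/dog-owner': 1})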


They have a project to do exactly that:

https://ai.googleblog.com/2017/04/federated-learning-collabo...

IMHO this'll end up being one of the most influential research developments of the last few years, but it's only a couple years old at this point and needs a lot of supporting infrastructure (which'll have to be done by people outside of Google - the mothership has no economic incentive to support this) before it really works well.
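For context, the core idea in the linked post is that raw data stays on each device and only model updates are aggregated centrally. A toy federated-averaging round in plain Python (the "training" step is a stand-in, not Google's actual algorithm):

    # Toy federated-averaging round: each client computes an update on its own data,
    # and only the averaged weights leave the devices, never the data itself.
    def local_update(weights, local_data, lr=0.1):
        # pretend "training": nudge each weight toward the client's local mean
        mean = sum(local_data) / len(local_data)
        return [w - lr * (w - mean) for w in weights]

    def federated_round(global_weights, clients):
        updates = [local_update(global_weights, data) for data in clients]  # on-device
        return [sum(ws) / len(ws) for ws in zip(*updates)]                  # server averages

    clients = [[1.0, 2.0], [3.0], [10.0, 12.0]]   # private per-device datasets
    weights = [0.0]
    for _ in range(5):
        weights = federated_round(weights, clients)
    print(weights)   # drifts toward the population mean without pooling raw data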


Not that I'm saying we shouldn't be careful with our personal data, but:

There's some serious mental gymnastics going from "young dog owners in Seattle" to selling somebody a dossier with their location, search & browsing history (all of which you can turn off).


The mental gymnastics is in defining "selling data" as "selling all data Google has about you". This isn't a binary where it doesn't count if it's partial.

Google provides selective contextual data about individuals to advertisers who target those individuals in exchange for money. Those advertisers are 3rd parties. Google sells data to 3rd parties. End of.

They don't need to be handing over the full database for it to be a sale of data.


Fine, how about a subset? Prove that an advertiser has seen one instance of my exact location, or an instance of my search or web history.

90% of the 'valuable contextual data' is that I searched for something related to an ad's keywords. The other ~10% is targeting data. Targeting anonymised, generalised demographics/keywords that billions of people fit into does not class as selling private data.

I'm not arguing to trust Google, nor am I denying that there is massive data brokerage in the advertising industry - just that there is a big difference between targeted ads and selling data.


> Prove that an advertiser has seen one instance of my exact location, or an instance of my search or web history.

With all the subsidiaries and partner companies - and even just shell companies - out there, why should we be doing such gymnastics in order to support the advertising industry? If a company bought access to some data, what's to say it couldn't eventually re-sell it to another company wanting another slice of that pie... or have that data inherited on acquisition?

That seems like an awful lot of trouble to go to just to sustain an industry that really isn't providing a lot of value.


> that there is a big difference between targeted ads and selling data.

I'm not arguing that the data received by the advertiser through targeted ads is anywhere near as valuable as selling a dossier full of information about a person, but... they're both examples of selling data. Unless you're redefining English-language verbs... data is being sold. About a person. To a 3rd party.

There's a difference in scale and value, but there's no difference beyond that.


What? You can't turn off any of that. It's recorded whether you see it or not.


Presumably when you "turn off" various Google features that data is no longer used for ad targeting. For Google to do otherwise would invite a scandal. (I wouldn't expect all copies of the data and its backups to be deleted immediately since that would be a lot of work for the tape robots.)

Check out https://www.google.com/ads/preferences/ and https://myaccount.google.com/activitycontrols


Where's your evidence for this? Are you saying Google violates their privacy policy billions of times per day (every search, location update, page view)? (i.e. fraud)


That's one of the concerns I've heard raised over ad-supported email providers that target ads based on the content of your email.

One way that could alleviate this but still allow targeting, although it might be too expensive to implement in practice, would be to only allow ad targeting based on attributes of the viewer that are either very broad (e.g., gender, income quartile, age group, geographic region) or that correlate highly with use of the advertised product and do not correlate well with non-users of the product.

For example, targeting "young dog owners in Seattle" would be OK on the "young" part and the "in Seattle" part because age group and region are whitelisted broad categories, but would not be OK on the "dog owners" part unless the product was in a category that dog owners use but people who are not dog owners do not. Dog food, for example, or a dog walking service.

The idea is that if someone ran an ad for dog food targeting "young people in Seattle", rather than "young dog owners in Seattle", the respondents are going to largely be dog owners, and can reasonably expect that they are outing themselves as such to the seller. So allowing "dog owners" in the targeting doesn't really give away any more information about the responders. The targeting just serves to avoid wasting the time of people who don't own dogs with ads for dog food.

If, on the other hand, "dog owner" could be used on the targeting for unrelated products, someone could target "young dog owners in Seattle" with an ad for something that almost everyone wants, and the people who respond out themselves as dog owners without having any way to reasonably know that they are doing so.
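Expressed as a toy policy check (the attribute whitelist and the correlation table are invented placeholders, not any real ad platform's rules):

    # Broad attributes are always allowed; narrow attributes are allowed only when
    # they correlate with the advertised product's category.
    BROAD_ATTRIBUTES = {"age_group", "region", "gender", "income_quartile"}
    CORRELATED = {                        # invented mapping: narrow attribute -> products
        "dog_owner": {"dog_food", "dog_walking"},
    }

    def targeting_allowed(attributes, product_category):
        for attr in attributes:
            if attr in BROAD_ATTRIBUTES:
                continue
            if product_category not in CORRELATED.get(attr, set()):
                return False              # narrow attribute used on an unrelated product
        return True

    print(targeting_allowed({"age_group", "region", "dog_owner"}, "dog_food"))    # True
    print(targeting_allowed({"age_group", "region", "dog_owner"}, "headphones"))  # False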


Some might opine that you did not so much buy that data as buy the benefit of using that data. Afterwards, Google still has the data they used and you still don't. This might be a distinction worth discussing.


Now that someone has visited your site through a particular ad, you do have that data about them.


That's pretty different from what's in the article.


I propose that it would require only a minor change in web browsers. Consider that we have 'incognito' and 'private browsing' sessions. The implementations differ slightly between browsers (e.g. Firefox appears to allow sharing between tabs in a window; Safari keeps each tab completely isolated), but the principle is the same. Modify the browsers so they're always in 'private' mode, and only permit sharing once the bill is paid.


Pageviews within an individual incognito session still share cookies; if they didn't, nearly every login on the web would break. When you create an incognito session just for one specific task and then close it, this doesn't matter. When you're always in incognito mode you're back where you started: a site that you're always logged into will still get the data from any site you visit that carries its tracking beacon.
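A toy model of the difference between a single shared cookie jar (what an always-on incognito session would still give you) and a jar partitioned by the top-level site you're visiting, which is closer to the Safari-style isolation mentioned above (all names invented):

    shared_jar = {}         # one jar for the session: tracker cookie visible everywhere
    partitioned_jars = {}   # keyed by (top-level site, tracker cookie)

    def beacon_shared(top_level_site):
        uid = shared_jar.setdefault("tracker.example/uid", "user-123")
        return f"{top_level_site}: tracker sees {uid}"

    def beacon_partitioned(top_level_site):
        key = (top_level_site, "tracker.example/uid")
        uid = partitioned_jars.setdefault(key, f"user-for-{top_level_site}")
        return f"{top_level_site}: tracker sees {uid}"

    for site in ("shop.example", "news.example"):
        print(beacon_shared(site))        # same ID on both sites -> cross-site profile
        print(beacon_partitioned(site))   # different ID per site -> no linkage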



