GDPR complaint claims Google and IAB ad category lists leak intimate data (techcrunch.com)
146 points by imbiased 88 days ago | 67 comments

We should also be able to access our marketing categories without a Google account, since sensitive data is collected and a profile is built even if you don't have a Google account.

Throwaway of an ex search engineer here. This can’t work because how would google reliably only give you your information without a login account? They are only useful enough in aggregate to google so they can’t know it is only you with 100% certainty until you log in. AFAICT anyone telling you anything else is probably nonsense fearmongering, or maybe doesn’t understand how sophisticated attackers of google are and hasn’t thought through what happens when you have a logged out way of spoofing someone’s internet behavior to get their whole data.

This is effectively a) putting everyone’s approximate search histories on the internet, or b) outlawing google search’s business model.

The consequences of trying option a and failing even once are so great, I argue if that’s your goal you should ban logged out personalization before anyone deludes themselves into thinking they can do it without leaking everyone’s info publicly. I also think that’s going to harm consumers more than it actually helps them but I am obviously biased as an ex google search engineer.

We don't expect relinked data to be viewable, because that is ripe for exploitation, but we do want to access and control the data that is linked to the advertising cookies that were placed on our devices.

Better yet, google should not be able to store any personal data about users who are not registered

I feel like this is another example of something which can only be enforced by creating a new framework that inevitably benefits the few corrupt actors and defeats its own purpose.

How sure are they that it's one individual person that they are tracking? If you and others use the same computer, then maybe stuff might get mixed up. If "accessing" marketing categories includes read-access then your situation is worse than before.

What about the legal risk for Google if you opted out of the tracking a month ago but now google thinks you are a different person and is tracking you?

There's subtle ways to fingerprint someone, even how they move their mouse:


These are indeed tricky problems to solve.

They're also Google's problems to solve, not society's. We're not required to provide solutions for them.

If google can't solve them, maybe that means google's business model is illegal.

I agree but is this a Google specific thing? So many companies gather data on people without letting them see anything... I feel like we need legislation regarding this for all of them.

How is legislation going to fix this unless you mandate region locking of the internet? The moment a website loads some script from a Chinese site all bets are off from a legislative protection standpoint.

Under GDPR, an EU website owner is responsible for the Chinese scripts they load onto their site, as part of the Controller-Processor relationship. That doesn't help for Chinese companies without a locus of business in the EU, but it covers the hypothetical case that you raised.

In practice, legislation goes into effect globally by being in a large enough market that companies would rather comply than lock themselves out. Several companies have rolled out their GDPR compliance updates globally rather than just to the EU. It's the same reason that lots of products in the US comply with standards that only exist in California.

GDPR already requires all that data access and control even for people that don't have a formal account and haven't agreed to your ToS. No additional legislation in the EU needed.

But more enforcement. GDPR enforcement has been disappointing so far.

That's because DPAs understand that if they enforced GDPR properly then half the companies, particularly small businesses, in Europe would have to be fined. I'm not just talking tech companies either.

That's also because fining isn't the first step, it's pretty much the last. You will have received a warning that you are not compliant and been given a deadline to fix it in most cases.

Like Google did in France? (receiving a warning before getting fined)

All those small businesses mostly have data about subjects they are conducting business with. In general this is a valid reason to have that data and GDPR compliance is merely about implementation details.

The data subjects of ad networks however are completely different entities from their customers, which makes it a very different compliance problem. It might not be possible at all to conduct that kind of business in a compliant way.

These categories apply to the content, not the cookie (when a cookie is even available, which it isn't in many places).

This is not personal, it's the contextual targeting everyone wants. These blog posts never understand adtech.

The point is that bid requests may (do) contain both an identifier and data about that person. "Is reading a financial news article" is an attribute of the content, sure, but it is broadcast such that it can be associated with the person.

That's just how context works. If you visit another site then there would be different categories involved, and they have nothing to do with the user.

There's also no personal identity, it's just a cookie if available, used mostly to frequency cap.

Can't adtech companies associate that cookie with the category and build a profile over multiple pages? Then they can correlate the data and identifiers Google provides to any that they collect on their own (e.g. their own pixels served in the ads that actually get shown). If they connect their own pixel identifiers to data that they buy, then they are building up a decent profile.

Those profiles would quickly become so broad as to be useless. Context in the moment is the most important thing, which is why even Google shows ads targeted to your actual search query.

Cookies are also not an identity and are refreshed very often. Their main use is to cap ad frequency and track conversions over the short-term (hours to days).

Google and Facebook do not provide any personal identifiers. That would be a massive breach of their core 1st party dataset. What little data they did provide is now gone with GDPR.

I'm not sure what you mean by "that's just how context works", but perhaps to illustrate the disconnect, it is just as incompatible with GDPR to share the page URL itself, let alone data derived from it like the page category. Doing so is sharing user data in a way that is not consented to.

I don't know what you mean to say about the cookie either, the whole point of this kind of advertisement is to persist associatable data about a person for the lifetime of the cookie.

Page URL is not personal information, that's a ridiculous overreach and misinterpretation of GDPR.

Cookies are an anonymous identifier, they are specifically not a person. As I said, it's a short-term stable ID used to control the amount of ads shown and track any conversions for campaigns. Adtech companies do not know who you are, only Google and Facebook do.

You're incorrect, or at least that's what this complaint claims, and I personally have been expecting it for awhile.

The fact that cookies are pseudonymous has 0 effect here -- literally their entire purpose is to be able to associate third party data with a person's browser.

Your contention that all they're used for is frequency capping isn't true either, but even if it was, it's not relevant -- "I'm just using it for frequency capping" isn't acceptable under the GDPR, just as much as "I need the data to do advertising" isn't a reason acceptable under the GDPR for keeping a piece of data in the first place.

Here's the text of the GDPR on cookies: https://gdpr-info.eu/recitals/no-30/

I'm familiar with RTB (real-time bidding protocol) details, so I can assess this from the technical POV.

The most ironic thing here: IAB categories are applied not to user profiles but to URLs.

So, their goal is to facilitate targeting ads not at the user profile, but at the page content. This is the use case which is often discussed on HN as the ethical and "right" way of showing ads -- you get the bid request with "Nature, travel" IAB categories and you show an ad about outdoor gear. You don't need to crunch user data to make this simple decision.

However, I have to admit this complaint has its own merit. A bid request usually contains not just the page URL and IAB categories, but the user cookie as well. So, by data-mining the bidstream, you can theoretically find people (well, at least their unique cookies) who are looking for a cure for impotence, and this is against GDPR, for sure.
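To make the bidstream point above concrete, here is a minimal sketch in Python. The record layout and field names (`cookie`, `url`, `categories`) and the sample values are invented for illustration and are not the actual RTB schema; the only thing it is meant to show is that grouping page-level categories by cookie is enough to turn contextual labels into a per-browser profile.

```python
from collections import defaultdict

# Hypothetical, heavily simplified bid requests; real bidstream records
# carry many more fields and use different names.
bid_requests = [
    {"cookie": "abc123", "url": "https://example.com/hiking-boots", "categories": ["Travel"]},
    {"cookie": "abc123", "url": "https://example.org/ed-treatments", "categories": ["Men's Health"]},
    {"cookie": "zzz999", "url": "https://example.com/hiking-boots", "categories": ["Travel"]},
]


def build_profiles(requests):
    """Collect every page category ever seen alongside each cookie."""
    profiles = defaultdict(set)
    for req in requests:
        profiles[req["cookie"]].update(req["categories"])
    return dict(profiles)


profiles = build_profiles(bid_requests)
```

Nothing here needs to know who `abc123` is; the sensitive category ends up attached to the cookie purely because both travelled in the same request.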

Key quotes:

> Lack of transparency makes it impossible for users to exercise their rights under GDPR. There is no way to verify, correct or delete marketing categories that have been assigned to us, even though we are talking about our personal data.


> Under GDPR, processing special category [medical information; political affiliation; religious or philosophical views; sexuality; and information revealing racial or ethnic origin] data generally requires explicit consent from users — with only very narrow exceptions, such as for protecting the vital interests of the data subjects

The last quote is particularly troublesome, as Article 9 GDPR [1] is explicit about this: processing this data is prohibited by default, and none of the exemptions seem to apply even by a stretch of imagination.

Assigning such labels may be the norm from the Ad industry's point of view, but that is simply no longer possible under the GDPR.

[1] https://gdpr-info.eu/art-9-gdpr/

I am curious, if you ask for a dump of your data from Google, where do you have to look to find your ad category? As far as I know, this is not directly accessible from your profile or privacy settings.

Looking at the data selection to export, I am not even sure this is included somewhere.

Yes, it seems that service providers confuse the "provide a dump of my data" with just being the information a user actively uploaded and stored. The point of this is to get access to any information the service provider might have about the individual requesting the data. And have it all deleted upon request also!

I'm pretty sure GDPR is meant to force disclosure of derived data too.

Are you sure? I didn't find a clear sentence on this last time I looked. It seems hard to define what is derived data (if they guess that a 20 year old is a student, is that derived data? or just a guess) and I can imagine it leaking information about other people if it involved aggregating together pieces of data from multiple people.

The GDPR says "'personal data’ means any information relating to an identified or identifiable natural person". So as long as this derived data is directly related to a person, the GDPR applies.

More explicitly, the UK's regulator says: "You should however note that if this ‘inferred’ or ‘derived’ data is personal data, you still need to provide it to an individual if they make a subject access request."

Huh, thanks for that. So e.g. LinkedIn not providing any information on (say) emails they've scraped seems blatantly illegal too?

In that case, if I stored your date of birth then I'd also have to "disclose" your age and star-sign.

Only if you actively stored or processed their age or star sign. It's not 'disclose every possible inference you could make given the data you hold'

What if you don't store the inference, but instead process it when needed internally in a function?

  if (dob.month == december) $birthstone = quartz
  select advert from adverts where stone = $birthstone
Or whatever
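A runnable version of the same idea, as a rough sketch: the inference is computed at call time and never written back to the user record. The function names and the `ADVERTS` table are hypothetical, and the December-to-quartz mapping is copied from the comment above rather than from any real birthstone table.

```python
from datetime import date

# Hypothetical ad inventory keyed by stone; None is the fallback advert.
ADVERTS = {"quartz": "quartz-pendant-ad", None: "generic-ad"}


def birthstone(dob):
    """Derive the stone on the fly; nothing is stored on the user record."""
    return "quartz" if dob.month == 12 else None


def select_advert(dob):
    """Pick an advert using the transient inference only."""
    return ADVERTS[birthstone(dob)]
```

Whether a value that only ever lives inside `select_advert` counts as processed personal data is exactly the question being asked; the sketch takes no position on that.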

For ads, you probably don't need to tell people about the "birthstone". But if that automatic processing "produces legal effects concerning him or her or similarly significantly affects him or her" (such as denying a credit card or job offer), then you have to give the person "meaningful information about the logic involved, as well as the significance and the envisaged consequences of such processing for [them]."

GDPR requires that you log when you use information from a user in models, reports etc so this would probably have to be logged and disclosed.

You would also need explicit consent to use the date of birth for advertising purposes.

I have ad personalization turned off, but you should be able to view and edit your interests from your settings page.


It is mostly empty just my age and sex.

However I am sure I got an interest profile, at least being used with the Discover feed on Android.

Google ad data isn't really linked to data for other non-ad services for privacy reasons.

The discover feed on android is mostly powered by web search history, chrome browsing history, and location history. You can see all that here:


Maybe Google doesn't store it, and just uses these categories in the process of auctioning advertisements, sending them as context?

The problem with that being, of course, that any company participating in the bidding process can decide to store that information and build a profile that does have this information.

Bidders don't get as much info as Google's internal models get. For example, bidders get to 'track' users by a unique id for up to 30 days, but then the id gets reset, so they can never persist any data beyond that unless they can correlate the new id with the old.

That correlation tends to take weeks worth of data to do with any accuracy, and by that time, all the opportunity to actually use the knowledge to place bids has gone.

The bidders can re-correlate if they actually make a bid, and use a creative to inspect their own cookies, and then resell that ad-spot, but typically that isn't worth it for small bidders (there's just too many devices on the internet - you'd need a huge ad budget), and large bidders are bound by privacy laws that stop them doing it (no investor wants the company the wrong side of an EU fine).

A few bidders used to do that on iOS devices, since the ads there are sufficiently valuable to make it worth it, but I haven't seen it for a few years.
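For readers who want a picture of the 30-day reset described above, here is one way such an ID could be derived. This is purely an assumption for illustration (the thread does not say how any exchange actually generates or rotates IDs): the bidder-visible ID is an HMAC of a per-user secret and the index of the current 30-day period, so it is stable within a period and unlinkable across periods without the secret.

```python
import hashlib
import hmac

ROTATION_DAYS = 30  # reset window mentioned in the comment above


def bidder_visible_id(user_secret: bytes, day_number: int) -> str:
    """Hypothetical scheme: same ID within a 30-day period, new ID afterwards."""
    period = day_number // ROTATION_DAYS
    digest = hmac.new(user_secret, str(period).encode(), hashlib.sha256)
    return digest.hexdigest()[:16]
```

Under this sketch a bidder sees the same value on day 0 and day 29, then a different one from day 30 onward, which is why re-linking old and new IDs needs some side channel such as the bidder's own cookie.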

I think you mean "participating"

tnx, updated

Advertising and marketing is a trillion dollar industry that employs millions of people across the globe and I want nothing to do with it.

Advertising atheism and I wouldn't be entirely surprised if in the future people will be prosecuted for it.

Google is the elephant, but it is very rare to see compliant services/sites. The interesting question is when/if EU is going to flex its GDPR muscles.

Never. This is used as another tool to selectively target companies the EU doesn't like while maintaining appearances.

Germany (well, German states actually) alone has handed out fines to 41 companies, and has hundreds of proceedings in flight (i.e. one German state is currently investigating 50 local companies), the vast majority against locals. The average cases just don't make headlines, especially not in English-speaking media, because who cares?

Here’s a few more highly sensitive labels that are being attached to web users’ identities and shared with potentially thousands of bidding ad companies — in this case the labels are ones which the IAB uses: Special needs kids, endocrine and metabolic diseases, birth control, infertility, diabetes, Islam, Judaism, disabled sports, bankruptcy.

I'm jealous that at least Europeans can complain legally.

In the U.S., we believe that the free market knows best and that's freedom and such. Meanwhile, we're being profiled by these vile companies (FB, Google) and our data resold. Aside from individual rights being violated (hint, individual rights aren't just rights against government intervention), there's a huge societal threat here: what happens when this data is used to pit us against one another? Are we still free, then?

In the U.S. it will take a cataclysmic event to reach a GDPR-like desire by the population. The sad reality is that the EU has its citizens' interests generally in mind (consumer protections, GDPR), while in the U.S. Big Brother has the interests of large corporations at heart (namely by allowing them to run roughshod over our rights).

Setting aside penalties for a moment, what is the minimum set of changes to programmatic advertising practices that would bring it into compliance with GDPR? Would removing the targeting categories that relate to intimate data be sufficient? Or is something deeper, more structural in the crosshairs?

Remove userid / cookie sync / whatever you name it from bid requests, and make them only context based. Also, forbid cookies from ads providers to be stored on the end user machine.

Note that this doesn't disallow websites with first party data and user consent to add user related information to the bid requests to increase their value, it just doesn't allow to correlate the information with a person after the RTB process ends. Of course it totally changes the role of data providers in the current ecosystem, but that wouldn't necessarily be a bad thing.
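As a rough sketch of what "context only" could mean in practice, the function below strips user-level and device-level identifiers from a bid request before it is broadcast. The dictionary loosely follows OpenRTB-style field names (`user.id`, `user.buyeruid`, `device.ifa`, `device.ip`, `site.page`, `site.cat`), but the sample values, the function name and the exact set of fields removed are assumptions, not a compliance recipe.

```python
import copy


def to_contextual_request(bid_request: dict) -> dict:
    """Return a copy of the request with user and device identifiers removed."""
    contextual = copy.deepcopy(bid_request)
    contextual.pop("user", None)  # drops the user ID and any synced buyer ID
    device = contextual.get("device", {})
    for key in ("ifa", "ip", "ipv6"):  # device-level identifiers
        device.pop(key, None)
    return contextual


# Illustrative request; all values are made up.
request = {
    "id": "req-1",
    "site": {"page": "https://example.com/hiking", "cat": ["IAB20"]},
    "device": {"ua": "Mozilla/5.0", "ifa": "device-ad-id", "ip": "203.0.113.7"},
    "user": {"id": "exchange-user-id", "buyeruid": "bidder-cookie-id"},
}
contextual = to_contextual_request(request)
```

The page URL and content categories survive, so a bidder can still do contextual targeting; what it loses is any key to join this request with earlier ones.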

I am by no means an expert on the subject, but I believe that the "simplest" change would be to target based on content, rather than the individual user.

Yes, but that's also a torpedo to the way programmatic ads are currently bought and sold. (FWIW I am in favor of such a change, but there are a lot of very large tech companies selling data management platforms whose core value prop is being able to stitch together audiences from this sort of data and precisely target them everywhere their browser cookie or device ID goes.)

Agree. GDPR and programmatic ads are totally incompatible.

I believe this is intentional on part of the EU.

Basically anything that hurts American tech companies is intentional on the part of the EU. That's why they're keeping both eyes shut on the plethora of violations many European companies are doing.

You have to report the violating companies. There are no government organizations actively looking for violations.

The American companies are simply a bigger target and have the attention of more people, so they're reported more quickly.

> That's why they're keeping both eyes shut on the plethora of violations many European companies are doing.

Name and shame. List the EU companies that are shitting on user privacy the way Google, Twitter, and Facebook currently are.

The first fines were levied against EU companies [1], your whining falls flat on its face and has no basis in reality.

[1] https://iapp.org/news/a/germanys-first-fine-under-the-gdpr-o...

The ad categories mentioned in this complaint are the content categories.

I don't mean the "content categories" of the user.

If a page on some website is about cars, then you sell that page as being about cars to the advertisers. At no point would you care about the user, just the assumption that a person reading about the latest Toyota might be in the market for a new car.

That’s literally what this amendment is about. They are talking about the content categories.

Offer users a meaningful reason to actually consent to such targeting. Current "consent" forms are not meaningful in that most users are probably clicking "ok" just to get rid of the pop up and not because they actually agree.

Why would users actually provide meaningful consent to having a tracking profile? You need to actually offer something to users. The law essentially says you cannot just start profiling them without their permission.

You could offer users a subscription based ad free browsing experience. User pays 50 euro a year, you take a 10% margin, leaving 45 euro behind to provide the ad free experience. At 164 impressions per day (stretched inference from the article) you bid 0.075 cents per ad space. If an ordinary advertiser bids less than this to show you an ad, then no ad would be shown instead and the content publisher would still get paid. At any time you could cancel your subscription and demand that the profile be deleted. This is just one idea on how you could collect meaningful consent for an ad profile.
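The per-impression figure in the comment above can be checked with a few lines of Python. The inputs are the comment's own numbers; the 365-day year is an assumption.

```python
annual_fee_eur = 50.0
margin = 0.10
impressions_per_day = 164  # the comment's inference from the article

budget_eur = annual_fee_eur * (1 - margin)         # 45 euro left after the margin
impressions_per_year = impressions_per_day * 365   # assumes a 365-day year
bid_cents = budget_eur / impressions_per_year * 100

print(f"{bid_cents:.3f} euro cents per impression")  # prints "0.075 euro cents per impression"
```

So 45 euro spread over 59,860 impressions a year works out to roughly 0.075 euro cents each, matching the figure quoted.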

Not just Google though, is it?

And your point is what exactly?

I think Google and FB have met their match, EU! Not only penalties but crippling changes to their existing, everything goes, business model. Penalties they can afford...

It just gets creepier and creepier, and to think there are hundreds of thousands of people involved in this sordid endeavour who think nothing of stalking, profiling and dehumanizing others for personal gain.

More evidence there is zero moral compass in SV and given enough money people are willing to do whatever away from public view and posture and pretend to care about niceties like ethics in public. And these are educated folks who are not starving and desperate.

Discussions should move from a default human base ethical position to one where any discussion about ethics is treated as posturing and empty; it's only by actions that any sense of ethics can be gleaned.

But people who behave unethically cannot then expect an ethical society or ethical behavior from others. These others too have a right to exchange their values for money and attempt to normalize, redefine or hand wave away their actions.
