Google’s GDPR Workaround (brave.com)
1868 points by donohoe 16 days ago | 588 comments



I checked the sample log provided.

Below are the google_gid values for different publishers. There is no sign of overlap: each has a different google_gid for the same person, which is exactly what Google describes. [1]

I don't understand what Brave claims.

  d.agkn.com                   CAESEP-S3Zs5f0_kq11XTCZP_mE
  id.rlcdn.com                 CAESEPpf2T4-2AsAR_4rer3RfNs
  image6.pubmatic.com          CAESEB9H3qdV26kxEiz-BJ_TY-M
  pippio.com                   CAESEJyqG1Pg1j-_scqW8kDzTkg
  token.rubiconproject.com     CAESEE1DyZ245WggYaQZEWpQWI8
  us-u.openx.net               CAESEPIJ9jHcY2j4jK3-DPmfar4
[1] https://developers.google.com/authorized-buyers/rtb/cookie-g...


This log [0], right? Did you miss in the article that it's the `google_push` identifier that's being used for syncing between adtech companies? If you search for it (AHNF13KKSmBxGD6oDK9GEw5O0kvgmFa3qM30zpNaKl72Og), you can see it being included in requests to lots of different adtech firms' domains.

[0] https://brave.com/wp-content/uploads/files_2019-9-2/sample_p...


There is unfortunately no way to prevent that part.

The BidRequest data [0] and the request time are already enough to fingerprint the user.

The "Google prohibits multiple buyers from joining their match tables" part is not a technical safeguard; it is contract-based.

[0] Sample data from a BidRequest:

  ip: "F\303\006"
  user_agent: "Mozilla/5.0 (Linux; Android 7.1.1; Pixel XL Build/NOF26V) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Mobile Safari/537.36"
  url: "http://www.myfitnesspal.com/food/calories/popeyes-buttermilk-biscuit-29980768"
  cookie_version: 1
  google_user_id: "CAESEIMlaNwMN-rtiDFzjwNIX6Y"
  timezone_offset: -360
  detected_content_label: 39
  mobile { is_app: false 3: "android" 8: 1 12: "google" 13: "pixel xl" 14 { 1: 7 2: 1 3: 1 } 15: 412 16: 732 18: 70092 19: 3500 }
  cookie_age_seconds: 12960000
  geo_criteria_id: 9023221
  device { device_type: HIGHEND_PHONE platform: "android" brand: "google" model: "pixel xl" os_version { major: 7 minor: 1 micro: 1 } carrier_id: 70092 screen_width: 412 screen_height: 732 screen_pixel_ratio_millis: 3500 }
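
For illustration, a minimal sketch (the dict layout mirrors the sample fields above but is otherwise hypothetical) of how a bidder could derive a stable fingerprint from those fields alone, without any Google ID:

  import hashlib

  def fingerprint(bid_request):
      # Combine quasi-identifiers that rarely change between requests.
      device = bid_request["device"]
      parts = [
          bid_request["ip"],               # even a truncated IP narrows the pool
          bid_request["user_agent"],
          str(bid_request["timezone_offset"]),
          device["model"],
          str(device["screen_width"]),
          str(device["screen_height"]),
          str(device["carrier_id"]),
      ]
      return hashlib.sha256("|".join(parts).encode()).hexdigest()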


> There is unfortunately no way to prevent that part.

It being technically impossible or infeasible does not give them license to ignore the law.

Either they comply with the law or they don't. I'm not a lawyer, but it certainly doesn't look like they're following the law here.


> There is unfortunately no way to prevent that part.

Well there's absolutely a way: single-source JS and no CORS.


Lol.

:/


Like stopping? I don't do that, and I'm still surviving, last I checked.


> There is unfortunately no way to prevent that part.

“There is no way for Google to operate under the GDPR and provide targeted advertising”?

I’m inclined to agree :/


> There is unfortunately no way to prevent that part.

Couldn't each buyer get their own "auction ID" as well as their own "user ID"? Am I completely misunderstanding things?


Another benefit of google_user_id over Google's Ad Manager cookie is that it expires after 14 days. After that, you get a new google_user_id for the same user, so syncing between adtech companies does not have much value.


You are implying that this mechanism is already being used by adtech providers, but we have no proof of that. Those players are often competitors that do not work together, so they won’t share their user data (and this is not a user identifier, only a “page load” identifier). If they want to sync their user IDs (because one is buying inventory from the other), they can run a cookie sync between themselves (the same process as with google_gid: a persistent user identifier, which is much more efficient).


which is the "workaround" to the gdpr the article badly describes (probably because brave upcoming ad network will do the same but more workaroundily)

now those are used to match a 3rd party id. you just need a gdpr_workaround schema in your data base with two columns user-id, google-random-id with N-1 records indexed both ways.

gdpr has restrictions on pin pointing a single person. this is effectively doing that, but claim it is not, because random ids. apple is just a little better with how device-advertiser-id works.
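
A minimal sketch of what such a lookup table could look like (hypothetical schema names, SQLite via Python's standard library, not any actual vendor's code):

  import sqlite3

  db = sqlite3.connect("gdpr_workaround.db")
  db.execute("""
      CREATE TABLE IF NOT EXISTS id_map (
          user_id          TEXT,   -- the buyer's own persistent user ID
          google_random_id TEXT    -- the rotating ID seen in bid requests
      )
  """)
  # Indexed both ways, so lookups work in either direction.
  db.execute("CREATE INDEX IF NOT EXISTS idx_user   ON id_map(user_id)")
  db.execute("CREATE INDEX IF NOT EXISTS idx_google ON id_map(google_random_id)")

  # N rotating Google IDs can accumulate against one internal user ID over time.
  db.execute("INSERT INTO id_map VALUES (?, ?)",
             ("user-42", "CAESEIMlaNwMN-rtiDFzjwNIX6Y"))
  db.commit()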


Well, the “right” workaround is an opt-in system. But that would drastically reduce the number of qualified ad prospects, reducing their wholesale value, killing the online ad business, and drying up the websites that depend on this revenue (some/many of which are trash, but not nearly all).

I don’t think we can have it both ways, or at least it is very difficult and we don’t have a great compromise solution.


I don't think the EU, or consumers in general, are terribly interested in "compromising" with the ad-tech industry.


I know that I'm not interested in "compromising" with the ad-tech industry. They've been spending too much time and money attacking my defenses against their terrible practices for me to treat them as anything but an attacker.


Not that I support the ad-tech industry, but those consumers probably are interested in having their favorite websites kept alive. Which implies that they might indeed be interested in "compromising" with the ad-tech industry.


Why should we give ad-tech their one millionth chance to "do the right thing"? They've proven time and time again they cannot be trusted.


it isn't about giving ad-tech any chance of anything, it is about websites that are liked and used by people (most of whom have either no means or no desire to support those websites with money directly) being able to sustain themselves in order to exist.


If only there were other models for ad sales, say ones that were successfully used for decades prior to the advent of the internet and ubiquitous surveillance, that could be used instead of said ubiquitous surveillance...

But no. The internet enabled vast, invasive user tracking, therefore vast, invasive user tracking is the only conceivable way to sell advertising.


That's pretty much a false dichotomy: a site must either support itself via ads, or cease to exist.

There are other ways to get money to support your work, and if those ways are too painful right now, that's just an opportunity for disruption. Even better, it's an opportunity to prove that disruption doesn't have to be exploitative.


There are entire markets that cannot be accessed by publishers unless they subsidize content with ads. That is not a false dichotomy, that is a market requirement.

Not every website is the WSJ or Bloomberg, which cater to markets that are willing to pay for content.


Then maybe markets are the wrong tool to organize this kind of publishing.


Not saying that it has to be ads only. If a disruptive alternative revenue model comes along that allows all those websites to support themselves, I will be one of the first people to jump ship and advocate for a ban on ads in favor of that new model.


To be honest, that doesn't matter to me. I think that websites who inflict the ad-slingers on their readers are showing great disrespect to and disregard for their readers.


> but those consumers probably are interested in having their favorite websites being kept alive.

I'm one of "those consumers" and I'm actively looking for sustainable ways to pay content producers.

Here's what I do currently:

- subscribe to two newspapers in addition to the mandatory payments to the national news broadcaster.

- donate to the Guardian

- buy on Blendle

If there was a way to pay for single pay-walled stories I would probably use it a few times a week in addition to my current subscriptions.

I'm not interested in any more subscriptions (unless they are all-inclusive like Spotify so I can cancel my current subscriptions, and even then I'm not sure, since I actually want to support those two papers and think I do so better through direct payments than through revenue sharing via a huge international tech company).


If consumers want it they will pay money, plus ads don't necessarily need to be targeted to readers.


More generally, if enough people (including the author) think the content has merit, they will choose to support it (by which I mean “collectively supply all the resources it needs to continue”).

The cost of running a basic website to publish text is modest. Tools like dat (https://dat.foundation/) and scuttlebutt (https://www.scuttlebutt.nz/) make it completely free (once you have a computer and any internet connection) to distribute content to people who actually want it.

On the other hand, if you want to make a living out of producing content (rather than wanting to publish the content purely for its merit), that is harder — the content has to be that much more valuable to enough people.

As long as individuals can publish stuff, and others can see it and choose whether to support it financially (all without 3rd parties mediating/filtering), then I'm content. Our distributed tools make that possible; we just need to make them easier and more ubiquitous.


Given the popularity of ad-blockers these days I'm not sure they're as interested as you think they are.


With the current ad industry, it's more about whether you are OK with being bullied or not.


> probably because brave upcoming ad network will do the same but more workaroundily

AFAICT Brave's plan is to send a block of potential ads to the client and use a client-side machine learning algorithm to choose specific ads. So the claim is that none of the client events, the algorithm's inferences, nor the ad choices travel from the client to the ad networks. (But ad networks retain their crazy microtargeting, which I guess is the selling point.)
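
A minimal sketch of that claimed flow, as I understand it (all names and structures here are hypothetical, not Brave's actual API):

  # Hypothetical client-side selection: the ad block is fetched in bulk,
  # scoring happens locally, and only the chosen ad is rendered.
  # Neither the local profile nor the score ever leaves the device.
  def choose_ad(ad_block, local_profile):
      def score(ad):
          # e.g. keyword overlap with locally stored interests
          return len(set(ad["keywords"]) & set(local_profile["interests"]))
      return max(ad_block, key=score)

  ads = [
      {"id": "a1", "keywords": ["running", "shoes"]},
      {"id": "a2", "keywords": ["mortgage", "refinance"]},
  ]
  profile = {"interests": ["running", "cycling"]}  # never uploaded
  print(choose_ad(ads, profile)["id"])  # -> a1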

Even if Brave were to choose the blocks of ads based on geolocation and other install-time/runtime data which they then sell to third parties, it's still significantly less data leaking from the client's browser compared to, say, a default Chrome install. But them storing/selling that would be a clear GDPR violation as well as going directly against all their explicit public claims so far.

What is your understanding of Brave's upcoming ad network that leads you to believe it requires a surreptitious GDPR violation?


I'm an engineer who has worked on ad systems like this and I'm really struggling to make sense of this article - what hope does a layman have?

Here's my understanding: Google runs real-time bidding ad auctions by sending anonymized profiles to marketers, who bid on those impressions. The anonymous id used in each auction was the same for each bidder, which is in violation of GDPR. If Google were to send different ids for each bidder, it would be ok? Is this correct?

Why would it matter that the bidders are able to match up the IDs with each other, aren't they all receiving the same profile anyway? Wouldn't privacy advocates consider the sending of the profiles at all an issue?


This is a problem because companies can use this ID to correlate private user data, without anyone's knowledge or consent.

There are companies that specialise in sharing user information. Some of them work by only sharing data with companies that first share data with them (an exchange).

If you got this Google ID, and you had a few other pieces of information about the user, you could share that data with an exchange, indicating that the Google ID is a unique identifier. Then, the exchange would check if it has a matching profile, add the information you provided to that profile, and then return all of the information they have for that profile to you.

So, let's say you're an online retailer, and you have Google IDs for your customers. You probably have some useful and sensitive customer information, like names, emails, addresses, and purchase histories. In order to better target your ads, you could participate in one of these exchanges, so that you can use the information you receive to suggest products that are as relevant as possible to each customer.

To participate, you send all this sensitive information, along with a Google ID, and receive similar information from other retailers, online services, video games, banks, credit card providers, insurers, mortgage brokers, service providers, and more! And now you know what sort of vehicles your customers drive, how much they make, whether they're married, how many kids they have, which websites they browse, etc. So useful! And not only do you get all these juicy private details, but you've also shared your customers' sensitive purchase history with anyone else who is connected to the exchange.
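
A minimal sketch of the kind of join being described (entirely hypothetical data, field names, and participants):

  # Retailer's records, keyed on the Google ID it observed.
  retailer = {
      "CAESEIMlaNwMN-rtiDFzjwNIX6Y": {"email": "x@example.com",
                                      "purchases": ["running shoes"]},
  }
  # Another participant's records, keyed on the same Google ID.
  insurer = {
      "CAESEIMlaNwMN-rtiDFzjwNIX6Y": {"vehicle": "SUV", "married": True},
  }

  # The exchange merges profiles wherever the shared identifier matches.
  merged = {
      gid: {**retailer.get(gid, {}), **insurer.get(gid, {})}
      for gid in retailer.keys() | insurer.keys()
  }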


Considering that a google_gid is only valid for 14 days, it is very unlikely that a useful profile could be built around it.


I have no doubt that if you had a record of my browsing habits for 2-3 days you could readily identify who I am the next time you have my browsing habits for that period of time.

I wouldn't be surprised at all if 2-3 hours of active browsing was enough for this.


Your device fingerprint alone is generally enough to tie your new google id to any previous ones.


Which is also a typical example of privacy violations in the name of alleged security.

Some newer Linux kernels (post-2016) use random TCP timestamp offsets to prevent clock-skew profiling.

That is a security feature, not the shit big tech is offering here.

But of course the mechanisms in question are suddenly implemented for fraud protection instead of user security. Yeah, bullshit.


It seems likely that the ad network could detect the change in ID if the expiration happens in the middle of a browsing session. And considering user habits, people are probably online at the same time every day, or have habits that cycle weekly.

Also, considering we largely do the same things every week and every day, I suspect a single day would give you at least 50% of a user's identifying data, and a week at least 80%. That leaves a whole week of pretty accurate tracking.

I think you've made a pretty wild claim that 14 days isn't enough time to build a useful profile. Regardless, even if the usefulness of the data over two weeks is questionable, it's still illegal to share the data in this way. You wouldn't be too happy if someone broke into your house and "only" stole a single fork.


Considering how much time many people spend online, and how efficient these profiling systems have become, I wouldn't be surprised if 14 days was plenty of time.


The time of validity and how hard it might be to build a profile are not factors in whether or not this is legal under GDPR. Here's the actual text from GDPR on pseudonyms and synthetic keys of this type[1]

> The principles of data protection should apply to any information concerning an identified or identifiable natural person. Personal data which have undergone pseudonymisation, which could be attributed to a natural person by the use of additional information should be considered to be information on an identifiable natural person

So PII that has been pseudonymized (mapped to a gid in this case) is protected in exactly the same way as if it had not been, provided the pseudonymized data could be mapped to a natural person by the use of additional data. The pseudonym (gid) is itself also considered PII under the GDPR. [1] https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CEL...


> The pseudonym (gid) is itself considered PII under GDPR.

I know of multiple systems that use a UID but throw away a user’s information, including the UID mapping, when the user leaves. This allows historic metrics to be retained without ever identifying a user who isn’t still using the system.

AFAICT, guids are a grey area.
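
A sketch of that pattern (hypothetical, not any specific system): metrics are stored against an opaque UID, and deleting the mapping when the user departs leaves only unattributable history behind.

  import uuid

  user_to_uid = {}      # PII (e.g. email) -> opaque UID, the only link back to a person
  metrics_by_uid = {}   # opaque UID -> historic metrics

  def record_event(email, event):
      uid = user_to_uid.setdefault(email, str(uuid.uuid4()))
      metrics_by_uid.setdefault(uid, []).append(event)

  def forget_user(email):
      # Drop the mapping; the metrics remain but can no longer be
      # attributed to the person without it.
      user_to_uid.pop(email, None)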


I don't mind that at all, so long as that replacement is never shared with other entities.


Thank you: that explanation is the first that makes sense to me.

I get the impression that this structure would require an exchange: retailers would not trust each other otherwise.

Wouldn’t commercial pamphlets, interviews with salespeople, etc., from the exchange be obvious proof of illegal behaviour there? Google’s implementation is imperfect, but for the loophole to work it would need coordination between several competitors and a third party with a business model explicitly and almost exclusively about working around the GDPR.

If I may risk a comparison: Google is like a chemical company selling fertilizer, and the exchange is selling bombs made from raw material bought by other people.

Am I missing the point? Shouldn’t this article be about those exchanges and their clients, not Google?


> To participate, you send all this sensitive information, along with a Google ID

Isn't that also a GDPR violation?


> Why would it matter that the bidders are able to match up the IDs with each other, aren't they all receiving the same profile anyway?

I would guess that yes, they're all receiving – _from Google_ – the "same profile" but they also are collecting additional info that they can then share with each other and, because they can match profiles exactly, they can access each other's info about specific people.

> Wouldn't privacy advocates consider the sending of the profiles at all an issue?

I'd imagine that the profile Google has and shares is by itself fairly anodyne, but I could be (very) wrong about that. The problem seems to be more (if not entirely) that different advertisers can share info using a common profile ID.

I'd imagine that even a single advertiser would be able to perform a similar 'attack' by, e.g. running multiple different campaigns, but I may be misunderstanding exactly what info is being shared. It's possible advertisers are able to match the Google profiles to specific unique identities and thus are sharing much more than just the info they're collecting directly from their ads.


If it's the different advertisers who are going to share info, then why aren't they responsible for their own adherence to GDPR, rather than Google?


I'd imagine they are responsible too, not just alone, and that Google is a much more attractive target for GDPR enforcement both because they're larger, have more money, are more visible, but also because they're directly facilitating the "different advertisers" sharing that info.

If Google ceases to provide them the means of readily sharing info, then all of those entities will no longer be violating the GDPR, in that scenario anyway.


As I understand it, Google is responsible for not sharing information that would allow them to violate GDPR. Without explicit user opt-in, that is.


The answer is: RTB is illegal and we're just waiting for the courts to decide on it.


Are they maybe only receiving a partial profile, with info relevant to that ad buy? And by compiling that data with the unique identifier, they can match it with other partial data from other ad buys?


I'm glad this story was reported, and I'm thankful to the author for putting in the work required to report this story. But after the first five paragraphs, the author's shameless, repetitive self-promotion and insistence on referring to himself in the third person almost made this unreadable.

The headline was enough to pique my curiosity to explore Brave's product offering. Unfortunately, actually reading the article had the exact opposite effect.


I thought the exact same thing after reading the first few paragraphs but didn't even notice that the author IS Johnny Ryan, the person mentioned in the story, until you pointed it out.

I didn't make it to the end, closed the tab and went over to HN comments for a summary.


I've worked in the sector for years and honestly thought this was well documented, common knowledge: https://developers.google.com/authorized-buyers/rtb/cookie-g...

The only thing Google did with regard to the GDPR was limit the number of parties in RTB they're including by default for syncing to a "trusted set" of parties.


I think the silent/invisible nature of cookie syncing is what upsets people when they discover it.

The diagrams in your link show a single hop for the 302; in my experience it can be many hops going between different advertisers. The same thing happens on non-Google platforms, like TradeDesk and others.

The sync scenario can make it next to impossible to delete cookies when those cookies can be rebuilt using data from others.


I think the HN community, and most consumers, tend to look at things from only one angle. Imagine you start work at some small shop that manufactures widgets for consumers. What would you do when you have to advertise your product? You'd have to turn to Google or a similar company. Are there any real alternatives? (I am asking because I really want to know)

I say this because I am in this position now. I have to figure out how to advertise my company's products and am torn on how to go about it.


The alternative is to spend hundreds of hours finding widget-related websites, trying to contact the owner(s), negotiating what ad spots are available, what ads are acceptable to run, and what pricing/terms will work for both parties, then managing that relationship over time to ensure ads are actually being displayed, being paid on time, contracts renewed, etc.

It's definitely possible, but you're just doing everything manually that ad networks do for you. Whether that is worth your time (or worth it to hire someone to do this kind of thing for you...) is up to you.


At which point you'll very likely learn that a lot of widget-related websites use ad networks because it saves the 2-3 administrators involved a lot of time and energy.

It's definitely possible for small websites to do ads directly, but it's a lot of work. Often more than is justified for a few thousand dollars a year in ads.


> It's definitely possible, but you're just doing everything manually that ad networks do for you.

You've just explained how contextual ads used to work, which didn't need all the invasive surveillance modern internet users have to put up with.


Yeah and there's a reason the tech moved on from that. It was a LOT of work on both ends to negotiate and monitor the relationship. Instead now we have a central broker who both parties work with that has set up a computerized way to manage these relationships.

Personally I think the solution that lets us keep ad-supported content and easy ad placement would be for Google to force companies to provide bots they could run internally, so the profiles never leave Google's datacenters, and to strictly monitor the output so the buyer bots don't leak information back to the companies. I think that would do a lot to alleviate the privacy concerns and breaches, and it is honestly how I thought ads were being sold for the longest time, instead of profiles being sent to companies buying placement.


> Yeah and there's a reason the tech moved on from that. It was a LOT of work on both ends to negotiate and monitor the relationship. Instead now we have a central broker who both parties work with that has set up a computerized way to manage these relationships.

I'm not disputing the necessity of a central broker. Contextual ads based on search keywords or website content used to work fine without surveillance, and can perfectly well be automated by a central broker.

Years ago, I didn't have much issue with online ads (with the exception of popups and spam emails). Nowadays, I'm forced to block them altogether to avoid the extensive surveillance by adtech. It doesn't have to be this way if adtech respected user privacy.


> I think the solution that lets us keep ad supported content and easy ad placement would be for Google to force companies to provide bots they could run internally so the profiles never leave Google's datacenters

Honestly, that wouldn't do much to alleviate my privacy concerns, as it does nothing to protect my privacy from the likes of Google (or other ad-slingers).


Just so long as I don't have to have relationships with dozens or hundreds of hosts.


Native ads like those can certainly be automated to a great degree, and at the very least use a self-serve interface. "Advertise with us!" links in the sidebar or footer or wherever.


AdSense used to do this for you before it tracked individuals...

You can still target adverts at content on a site, and have an aggregate system to make that easy and granular.

There could even be services that collect lists of websites, categorise them along with the rough demographics of their audience, and retail the slots.

Individual tracking is financially unnecessary; you can make as much revenue without it. See the example of the New York Times dropping tracking for EU readers for GDPR reasons and continuing to grow ad revenue: https://digiday.com/media/gumgumtest-new-york-times-gdpr-cut...
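
For illustration, content-based targeting of that kind can be sketched in a few lines (hypothetical categories and keywords, no user data involved):

  # Pick an ad based only on the page's content, not on who is reading it.
  ad_inventory = {
      "fitness": ["protein bar ad", "running shoe ad"],
      "finance": ["index fund ad"],
  }
  keywords = {
      "fitness": {"calories", "workout"},
      "finance": {"stocks", "mortgage"},
  }

  def pick_ad(page_text):
      page_words = set(page_text.lower().split())
      best = max(keywords, key=lambda cat: len(keywords[cat] & page_words))
      return ad_inventory[best][0]

  print(pick_ad("popeyes buttermilk biscuit calories"))  # -> protein bar ad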


You advertise on a website for widget fans. That's how it successfully worked for a long time.

The whole point of targeted advertising is to allow adtech companies to identify users of the "widget fans" site, and then advertise to the same people, but on another, cheaper site.


That's how it successfully worked for a long time... until Google search and Facebook groups killed thematic sites and communities.

I'd love seeing this turned around though.


Targeting existed long before computers. The difference today is the massive amount of very personal data being collected on people without their knowledge or consent and the risks it puts on them.

Imagine if you went back to say the 1920s, and you told a marketing director that you could install a one-way mirror room and back door into every household in the country, and staff them all with analysts who will secretly observe people in their homes 24 hours a day, 365 days a year. People would think you were crazy. In fact, I bet most people today would be completely against that idea, yet wouldn't realize that what companies like Google and Amazon are doing today is effectively the same thing, except _much more_ invasive.


For actual, physical widgets the traditional advertising markets still work: trade magazines and trade shows. Contacting vendors who specialize in your market of interest and running promotions with them also works.

However, it's all a whole lot more expensive and effortful than running some Google ads.


I think the point that's missing from the discussion is that adtech companies should be targeting solely on placement (i.e. what ads should show up on a particular website based on the content of the website), not on selected individuals (i.e. based on user interests regardless of where on the web an ad is shown), when faced with this kind of legislation.

This is exactly how marketing worked before and how people can go about it now, as you describe, through traditional advertising markets.

I'd personally prefer it that way in general, but legislation is necessary for them to optimize on those restrictions.


And there lies the problem. The cost in both time and money is much higher. I'd love to run TV and magazine ads on all the shows and in all the magazines my customers are reading, it just costs a whole lot more. Way more than we have just starting out.


You find your audience and you target them directly.


Yes, thank you. I logged in just so that I could upvote this. Why isn't this a more accepted answer? Why is spam everyone's go-to?


I've started a few successful businesses, and all I can say is what's worked for me.

I've never needed to turn to Google or other ad-slingers. Instead, I've done things the old-fashioned way, by going to where my potential customers tend to congregate and engaging with them (this kickstarts word-of-mouth, which is still the best advertising you can get), hosting my own online forums for customers, going to trade shows as appropriate, and supplementing everything with a few direct-placement ads in carefully selected media.

Yes, it's more work -- what Google et al. are actually selling you is convenience, after all. But the rewards in terms of ROI, as well as fostering a real community complete with evangelists, are more than worth it to me.

Of course, ymmv.


Online shop only, or brick and mortar?

Trade "print" media with an online presence. Online media that doesn't sell all their pixels to Google? Radio, depending on your audience and required reach? Forums specific to your audience that don't sell all their pixels to Google. Podcasts. Submarine articles in the trades? Open source ad networks that don't embed insanity or real-time bidding?

Mostly, I would try to target your initial audience as precisely as possible where they live, rather than with a wide net. Perhaps a Google search returns results for top websites dealing with your product - if they are not vendors, then perhaps you advertise on that site?

Disclaimer: I'm not a growth hacker, but I've thought about these things and run a couple of poor Facebook campaigns for a brick and mortar business.


Look for genuine, verified success marketing stories for people in similar positions and follow similar strategies.

I've personally never heard of a success story for what you describe that involves paying google. But maybe they exist and they're just keeping it quiet?


Surely Google has an obvious competitor, right? Because otherwise it would clearly be a monopoly.


Snippets from the article:

> The evidence further reveals that Google allowed [...]

> Google has no control over what happens to these data once broadcast [...]

Is it possible that Google does have "control" over the data after broadcast, albeit legal control via contracts with advertisers (as opposed to technical control)?

Perhaps Google's GDPR compliance strategy relies on the participating advertisers to comply with their contract with Google. If that assumption is accurate, perhaps Google's advertisers are in breach of their contract with Google which makes it appear as though Google itself is in breach?

I could be off-base, the details in the article aren't incredibly clear to me.

(For the record, I don't like Google's business model and I don't like Google's pervasive tracking -- I'm playing devil's advocate to better understand the issue)


The real-time bidding on ad placements seems like a thing that a user could never give consent to, as it's literally feeding your info to a massive, ever-churning list of companies that get to bid on it.

Aka: you land on a site, it sends your IP and whatever identifiers it has to 10,000+ companies, who all then figure out if they want to bid on showing you an ad.


Do you have to give consent for each individual third party your data gets shared with? I’d thought that if you give consent for some purpose, the company can use whatever processors it wants as long as it ensures they protect your privacy.


If ten thousand people agree to protect your privacy, is it really protected?


IANAL, but I have spent a lot of time reading the GDPR and associated guidance as the DPO for my small company.

As I understand it, you're correct. The Data Controller (Google) is responsible for getting consent, and the Data Processors (the third parties in this case) don't have to get consent themselves.

However, assuming Google's legal basis for processing your personal data is based on consent (rather than fulfillment of a contract or one of the other legal bases), then Google is required to get your unambiguous, opt-in, and non-coerced consent for each specific way your personal data will be used.

It seems likely that Google is covering themselves by acting as a Data Processor, not Data Controller, and the web site using Google is the actual Data Controller. In that case, the web site, not Google, is the one responsible for getting consent.


Yep, that's what those ridiculous pop-up boxes with 400 (I counted one) "carefully selected partners" of the website you visit are supposed to be.

It is IMO just a mockery of the intent of the law and I wonder when this will be punished.

I personally think GDPR might be a bit strict, but adtech have practically been begging for this for years so acting surprised now doesn't cut it.


I seem to recall (correct me if I'm wrong) that European courts ruled that “agreeing” to a very-long EULA for desktop software didn't constitute informed consent, because it's trivial to demonstrate that the users didn't actually read the entire agreement — even if they scrolled to the end, it's unreasonable to believe that most people read 10,000 words in 15 seconds.

So I assume that eventually these performances of consent-gathering will be legally judged meaningless.


But where does your PII end up, only at Google, no?


IP addresses and identifiers are considered to be PII under the GDPR. These get sent to the advertisers.


Is that necessary for some reason? Can't they just send the /24 of the IP? (Or other pseudonymized versions?)
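
For instance, truncating to the /24 is a one-liner (a sketch only, IPv4 for simplicity; whether it counts as sufficient pseudonymisation under the GDPR is a separate question):

  import ipaddress

  def truncate_to_slash24(ip):
      # Zero the host byte so the address identifies a network, not a device.
      net = ipaddress.ip_network(ip + "/24", strict=False)
      return str(net.network_address)

  print(truncate_to_slash24("203.0.113.77"))  # -> 203.0.113.0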


> Is it possible that Google does have "control" over the data after broadcast, albeit legal control via contracts with advertisers (as opposed to technical control)?

That distinction is important. I’m happy that privacy advocates are realising that platforms having access to enforceable contracts is key–too many are quick to paint platforms as the source of the problem and not an agent who, with the right tools, could organise a market.

Facebook has suggested having an independent court decide what is acceptable content on their platform. Google could think about delegating control of how its IDs are used to an independent entity with the power to audit its partners’ data practices properly.


The article doesn’t appear to claim that the author has been tracked in violation of the GDPR, only that the described mechanism makes it technologically feasible to do so.


Indeed. I fully admit I’d need to sit down and diagram some of Brave’s claims, but the large identifier screams “cryptographic entropy” to me.

The GDPR has separate rules that effectively deal with whether you are the business the customer works with or one of their contractors. I could imagine a world where Google is the second party and needs a secure feature like this so the first party can perform (consented) tracking across multiple domains they own. This is just a devil’s advocate argument, since I can’t guess intent.


> Is it possible that Google does have "control" over the data after broadcast, albeit legal control via contracts with advertisers

I don't actually put much faith in such "control".


> "control" over the data after broadcast, albeit legal control via contracts with advertisers

Now there's a claim only a court would buy!

A court probably would buy it, though.

Sigh.


That is the GDPR default though. You're still allowed to give data to third parties; you just need to have contracts with them regarding the handling, deletion, etc. of that data. Of course, mostly that's for data processors, which I don't think ad networks working with Google would fall under.


No, a data processor is any entity to which collected personal data gets passed on and where it is processed as part of the business arrangement. An ad network that receives personal data is definitely a data processor.


> No, a data processor is any entity to which collected personal data gets passed on and where it is processed as part of the business arrangement.

I'm relatively sure that there's another part: it's data processing for the client (here: Google) and the data cannot be used for other purposes. In this case, they don't process data for Google, they process it in cooperation with Google for the ad-buyers. Google also doesn't name them as data processors (which it would have to if that were their relationship).

If some contract was all it took, what would stop hospitals from selling patient info to insurance companies, saying "hey, they are processing the data, and we have a business agreement, this is all fine".


The 'client' in this case is the individual web sites that use Google's ad solutions -- not Google itself. It's the same thing with businesses' Facebook pages -- Facebook simply acts as a data processor in that case.

This is why nothing will be done about this. Sure, we might see a few smaller businesses fined, but the vast majority of sites using Google for ads will simply slip under the radar, while Google simply puts the blame on the site owners.


You are misreading the GDPR badly. A data controller can only pass on PII to a data processor. That is, any entity receiving PII from a data controller is automatically assigned this role by law. There are no alternative roles that could be assumed instead. This means that a data processor must obey the rules laid out for it by the GDPR or it is in violation.


Oh, okay, I believe I understand your point and understand the misunderstanding. My point is that you can't just make everybody a data processor by signing a contract, share PII with them and be compliant (i.e. hospital sharing data with insurance companies). You're saying that by sharing PII with them, you're making them a data processor, but that says nothing about whether the DC or the DP are compliant.


Yup. That clears it up!


I discussed the GDPR with you before, and based on your answer here and maxidorius' answer to you, I will not accept anything you say about the GDPR unless it is obvious or has references I can verify.


You're suggesting that hospitals would be allowed to sell patient info to anybody willing to pay, as long as they have a contract?


In the US, HIPAA would apply to individually identifiable health information. HIPAA Providers share information with other HIPAA-covered entities all the time under contracts where the associate entities (non-providers) agree to comply with HIPAA privacy rules.


Those are generally with the patient's prior consent though, right? Things get a lot easier if you have somebody sign some documents before you start working on them.


I'm suggesting :

1.) it might be stricter than you say.

2.) you (possibly like me? ;-) seem to have stronger views on what GDPR means than you can argue for.


Probably, I'm somewhat of a fundamentalist pragmatist ("this cannot be legal!" - "everybody does it, judges say it's okay" - "oh, I guess it's legal then :("), but in this case I'm not so sure. I still believe that Google does not consider them data processors (possibly because they don't consider a google_push id PII), because if they did, they'd have to name them in their privacy terms as entities they share data with. They don't. Of course, this might be because they don't care, but since it's a delicate issue and the stakes are somewhat high already, that doesn't sound plausible to me.

Pretty much all examples for data processing I've read are similar in this regard: the data controller (DC) passes data to the data processor (DP) so the DP can perform a specific task for them (handle invoicing, do analytics, run a web server, mail packages etc). The DP must not use the data for anything else, must not share the data with anyone (except for sub-processing, which has strict rules, too). "Exchanging/Syncing PII of users so we can create better profiles, more efficiently track them and show ads to them that are more personalized" doesn't fit the bill at all from what I understand. Similarly, landlords cannot get together and share all the data on their tenants to figure out who was a pleasant renter and who sued because the heater broke in winter.

So, in my understanding, even if you and I used the same invoicing provider, they wouldn't be allowed to tell me if they've invoiced a certain person for you previously, because we're different entities using them as a data processor and our data is to be kept separate. If we wanted to do data sharing (or even share aggregate probabilities like credit check agencies), we'd need a different construct, explicit consent and a bunch of additional compliance requirements.


Do they have to prove that the RTB ID can be used to retrieve PII? Or only that the RTB ID is correlated with personally protected information?

Is it enough that a RTB ID is pseudo-anonymous? (it always identifies the same person, but cannot be used to find that person's real information) - OR - is a RTB ID not even pseudo-anonymous?


GDPR definitions are slightly different.

A person is identified if the ID references only one user in the whole dataset [1]. This also makes any information linked to the ID PII.

The ID would be pseudo-anonymous if one would need some extra data, to which they don't have access, to link the ID to one specific user in the whole dataset [2].

So to answer your question, the RTB ID is not pseudo-anonymous, as it on its own references a single user out of all of them.

[1] It's also important to understand the definition of PII in the GDPR context, which is any data that relates to an identified or identifiable person. Identifiable is the same as distinguishable. Knowing this helps to understand where the line is. https://www.lexico.com/en/definition/identifiable

[2] The definition of pseudonymisation (5th bullet point of https://gdpr-info.eu/art-4-gdpr/) sheds some light on this.


Awesome, thanks.

(5) ‘pseudonymisation’ means the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person;


There's some documentation for this mechanism: https://developers.google.com/authorized-buyers/rtb/cookie-g...


This is some great work on tracking down all of these measures to track users. I really hope we get to the point where dumb ads rule the web once more. Hopefully this results in more than a slap on the wrist, but I doubt it.


Why should ads rule the web at all? Surely the cleverest engineers to walk the planet can come up with a new way of making money that doesn’t involve psychological manipulation.


> Surely the cleverest engineers to walk the planet can come up with a new way of making money that doesn’t involve psychological manipulation.

If they could, they would've already done so.

One of the things "the cleverest engineers to walk the planet" would probably need to do is to increase consumers willingness to pay for good content by a factor of ~10 for e.g. online newspapers with quality journalism to be profitable, which frankly sounds near-impossible.


> One of the things "the cleverest engineers to walk the planet" would probably need to do is to increase consumers' willingness to pay for good content by a factor of ~10 for e.g. online newspapers with quality journalism to be profitable, which frankly sounds near-impossible.

After more than two decades newspapers still haven't figured out that even though I want to pay for good journalism I cannot subscribe to every newspaper there is, and I am a user who actively wants to pay.

I already voluntarily pay for two newspapers and involuntarily pay for the national news-and-a-little-propaganda service. Oh, and I donate to the Guardian half the time I visit them.

If more papers allowed me to pay per view I would likely spend more money on journalism.

But I'm not going to have another subscription right now, thanks.


Not that I think their proposition is better but the Brave people particularly are trying to push a different model with their attention token scheme, so it's not that no one can think of something different, just that it's enormously hard to get people on board when the old advertisers are holding on to everyone using every single way at their disposal, legal or not.


Brave is trying to be the middleman and launching their own ad network. I think browsers forcing a business model onto publishers still isn't the right answer.


I disagree; the technical issues are relatively easy to solve, assuming there’s enough budget and buy-in. The issues with these forms of targeting are structural/cultural, and AdTech is a surprisingly slow-moving ship.

Technical issues are mostly used as an excuse.


I think there are simply engineers that are fine with the current state of things. You mention specifically "come up with a new way of making money"; however, for extrinsically motivated people, why reinvent the wheel? Problem solving can mean thinking up a solution or implementing a solution.

In the same vein, there may be engineers that enjoy working on this type of problem - how to identify someone that is actively avoiding you. The current iteration of Do Not Track mentality only makes the problem more interesting by putting up restrictions.


Engineers aren’t businesspeople. We’re tools of the businesspeople - we only stick around to do their bidding because they offer us equity.


I agree with you in theory, but until we can figure out micro-transactional payments that work globally, it seems ads are a good stepping stone. People want to get paid for their work; some users are willing to pay with cash, some with attention to ads. We should not give up our privacy or anonymity for this attention though.


Micro-transaction payments are probably a long way away, for non-tech reasons. Briefly, you might have to deal with collecting and remitting sales taxes or VAT in any jurisdiction in which you have paying readers.

Until there is some sort of agreement among the relevant jurisdictions to greatly reduce the pain of this, direct micro-transactions with your site's visitors are likely to be a bureaucratic nightmare.


These platforms already exist. They fail because most people won’t pay for content, full stop.


> Surely the cleverest engineers...can come up with a new way of making money

New ways of building things is the province of engineering.

New ways of making money is the province of MBAs.

Not that it really matters; I doubt you will succeed in replacing ads with something "better".


Sad that Brave did not do their work correctly: the google_push parameter they are talking about is not an identifier. Otherwise it’s true that RTB should not exist and violates the GDPR, but it’s so complex that even Brave was not able to correctly state the workflow.

See their release note (15 April 2013): https://developers.google.com/authorized-buyers/rtb/relnotes

“Starting in mid-April, we will begin assigning a URL-safe string value to the google_push parameter in our pixel match requests and we will expect that same URL-safe string to be returned in the google_push parameter you set. This change will help us with our latency troubleshooting efforts and improve our pixel match efficiency.”


Okay, but the `google_push` parameter seems to be the same for all adtech providers swarming on the same user in the same RTB session. Nothing in your comment contradicts the claim that this allows them to sync up profiles for that user across providers, in the way that the switch to per-provider `google_gid` values supposedly blocks.


Well, for 2 page views (same session), I have 2 different ‘google_push’ (Chrome with default parameters, no extensions).


Sure, but as long as the adtech providers each have their own stable IDs for you, they can still use `google_push` to link their corresponding stable IDs together, uniquely identify you, and merge their respective profiles.

====

Page View #1:

- Acorp: google_gid=qwerty, google_push=foo

- Bcorp: google_gid=asdfgh, google_push=foo

- Ccorp: google_gid=zxcvbn, google_push=foo

By exchanging their `google_gid` values corresponding to the page load with shared `google_push` value foo, Acorp, Bcorp, and Ccorp can identify you as user qwerty-asdfgh-zxcvbn.

====

Page View #2:

- Acorp: google_gid=qwerty, google_push=bar

- Bcorp: google_gid=asdfgh, google_push=bar

- Ccorp: google_gid=zxcvbn, google_push=bar

By exchanging their `google_gid` values corresponding to the page load with shared `google_push` value bar, Acorp, Bcorp, and Ccorp can still identify you as user qwerty-asdfgh-zxcvbn, even though the `google_push` value has changed.
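
In code, the linkage described above could look like this (purely illustrative, hypothetical data structures, not any provider's actual pipeline):

  # Each provider logs (google_push, its own google_gid) per page view.
  acorp = {"foo": "qwerty", "bar": "qwerty"}
  bcorp = {"foo": "asdfgh", "bar": "asdfgh"}
  ccorp = {"foo": "zxcvbn", "bar": "zxcvbn"}

  # Sharing those logs lets them join their stable IDs on the shared push value.
  linked = {
      push: (acorp[push], bcorp.get(push), ccorp.get(push))
      for push in acorp
  }
  # Both page views resolve to the same triple: ('qwerty', 'asdfgh', 'zxcvbn')
  print(set(linked.values()))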


I now see your point, thanks. I was thinking this “google_push” is probably not unique (i.e. many users could have the same one), but the adtech providers could check the IDs + timestamps to help with the match. NB: Google is not syncing with everyone on the same page view, so the adtech providers have to be lucky enough to be synced on the same page view. Another question is: what is the “google_push” entropy?

Having worked in adtech, I can tell you the adtech providers probably don’t do that, for these reasons: 1) those adtech providers are usually competitors, 2) if they work together, they can already sync their user IDs directly with each other (so using the Google ID is not necessary).

So I don’t think Google’s intentions were malign here on this particular point (contrary to Brave’s communication and all the press coverage). But yes, Google shouldn’t add entropy by sending the same “page view ID” to different adtech providers. Note that Google is “better” than the others here: every other adtech provider sends the same user ID to each partner (a persistent identifier, not a session or page-view one like Google’s). And those providers are sometimes quite big: for example, AppNexus or Criteo trackers are also everywhere on the web. Overall, it’s the RTB system with all those cookie syncs that shouldn’t exist, and except for the “google_push” argument, Brave’s study is quite good (they are just explaining how the adtech world works).


(5) ‘pseudonymisation’ means the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person.


Can somebody explain in simple terms what Brave is actually accusing Google of doing? The article seems to be written in a way that matches the language of the GDPR legislation, instead of language actually meant to be read by people, and I can't figure out what the "workaround" actually is.


Agreed, this is so wordy. Here is what I got:

> Google claims to prevent the many companies ... from combining their profiles about those visitors

> Brave’s new evidence reveals that Google allowed not only one additional party, but many, to match with Google identifiers. The evidence further reveals that Google allowed multiple parties to match their identifiers for the data subject with each other.

BTW, many comments in here seem quick to agree w/this headline given how buried the details are. If someone has better detail, please share it.


I take exception with Brave's phrasing here.

Essentially, Google assigns an anonymized identifier to a user and sends that to prospective ad buyers. The idea is that the ad buyer can use this to target ads to people who have visited their site as they browse other areas of the internet participating in Google's auction. This is called remarketing.

An example. You go to footlocker.com and put a pair of sneakers in your shopping cart but decide not to buy. When you go read an article on the New York Times site, a potential advertiser recognizes your anonymized id and bids to serve you an ad for the sneakers.

The issue Brave is raising is that the same anonymized id is served to each potential ad buyer. This isn't an issue with data Google collects or exposes, but Brave states that buyers could theoretically collude to build profiles by sharing the data collected on their own sites with each other joining by Google's identifier. There is no evidence of this actually happening and Google's contract with ad buyers specifically prohibits this activity.


> essentially, Google assigns an anonymized identifier to a user and sends that to prospective ad buyers.

If it's anonymized then how could they send targeted ads to you? I think you're using a slightly different version of the word anonymous.

The way I use the word anonymous, it means, roughly speaking, that it can't be traced back to you. Or in this context, that Google wouldn't be selling anonymized data to third parties who in turn could contact you.

If they were selling data like "people of type X like product Y more than Z", there would be less of an uproar about this.


It's also written in third-party speech, with emphasis on spooky details rather than technical details.


Are Google engineers quietly working on alternatives? What is this repo? https://github.com/PolymerLabs/arcs

Also there was an interesting story a while back about a clash between advertising and the Fuchsia engineering team https://9to5google.com/2018/07/20/fuchsia-friday-respecting-...


> Fuchsia’s engineers wanted to create a secure platform, but the advertising team, at the time, believed that privacy “goes against everything [they] stood for.”


What is Arcs?


It seems to be 'an open ecosystem for privacy-preserving, AI-first computing'?


Legitimately a meaningless description. I find it very odd that the README and repository description are completely devoid of any meaningful information.


Maybe they value their privacy :-) More seriously this article might shed some light: https://internetfreedomhack.org/re-decentralise-the-commerci...


Brave is incentivized to push this narrative, accurate or inaccurate as it may be. I am not an ad-tech guru, nor a digital marketer. I do know that Brave's entire premise hangs on traditional ad-tech strategy remaining static, consumer sentiment around "big tech" souring, and a groundswell of "privacy-focused consumers" materializing. That groundswell is the identified target market for their product.


Which is the reason Brave is in a good position to do this kind of work. They represent a growing portion of web users, and their research helps to give these users a voice.


What's funnier is that Brave's """product""" is nothing more than a theme over Chrome that any 12-year-old kid could do in 2 hours, an adblocker based on FOSS blacklists, and some compilation flags that prevent Google from enabling its own server features and tracking system, redirecting the tracking system to Brave's own servers instead. Yet their entire PR and marketing is based on "Google is evil!". In any other industry this scam would have been shut down and the management would probably have been sued, or even faced jail time. But in tech, many things are blurry.


I also see how Brave likes to thrive on the anti-Google, pro-privacy camp, and I personally pick Firefox over Brave any day of the week.

There is a de-Googled Chromium OS project, but Brave takes a few steps sideways by making further changes, such as proxying location services, the Safe Browsing API, etc. I doubt a 12 y/o could compile it though, let alone in 2 hours.


EDIT: since everyone seems to be mentioning the 4% rule, I'd just like to point out that I'm not denying the existence of this, just denying that it is actually effective. Google has violated antitrust before, and walked away with a "big" fine that's a slap on the wrist. They've violated GDPR before as well once or twice, and got a "record breaking" 57MM$ fine. The 4% rule exists and clearly isn't enforced well. I know a lot of people love GDPR but I would be beyond shocked if the EU actually managed to hit Google with something that sticks. I very much hope I'm proved wrong!

This sort of resolution was inevitable.

I said it before and I'll say it again: GDPR is an annoying measure for developers, small businesses and startups. It doesn't do much other than put in place so many steps that growth tools for startups become risky to use. For big businesses that (ab)use big data, it's not much of a hassle because they can afford the legal steps as well as the change in infrastructure. They can even work around it and keep abusing data without consequences.

If they're able to beat Google's lawyer army and actually prosecute them, then Google will take a whopping fine in the millions of dollars that'll be more than covered by their daily revs.


The European Union has decided that growth based on clandestine tracking of users and selling their PII without consent is not a legitimate growth tool. You know, like the way we outlawed violence as a "growth tool".

Your other claims are more reasonable. But they would lead me to the conclusion we need bigger fines on bigger businesses. Not absolutely bigger, as the law already does, but relatively bigger. The more power you have to break the law, the bigger the stakes should be.


> Your other claims are more reasonable. But they would lead me to the conclusion we need bigger fines on bigger businesses. Not absolutely bigger, as the law already does, but relatively bigger. The more power you have to break the law, the bigger the stakes should be.

GDPR penalties are a flat fee or a percentage of revenue, whichever is higher.

If Google is truly willfully violating the GDPR, the maximum penalty by law could be up to 4% of their global turnover. I would not call that pocket change. But more importantly, the fine scales with the size of the law-breaking company.

(Will the EU actually fine Google ~6 billion dollars? Perhaps we will find out!)
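Back-of-the-envelope, using the "whichever is higher" rule and Alphabet's 2018 revenue figure cited elsewhere in this thread (treating dollars and euros as roughly equal for illustration):

  // Sketch of the GDPR maximum-fine rule: the greater of a flat ~€20M
  // or 4% of worldwide annual turnover. Figures are illustrative only.
  const FLAT_CAP = 20_000_000;     // ~€20M floor
  const TURNOVER_RATE = 0.04;      // 4% of global turnover

  function maxGdprFine(annualTurnover: number): number {
    return Math.max(FLAT_CAP, TURNOVER_RATE * annualTurnover);
  }

  maxGdprFine(136_800_000_000);    // ≈ 5_472_000_000, i.e. the ~$5.5B ballpark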


> (Will the EU actually fine Google ~6 billion dollars? Perhaps we will find out!)

The EU already fined Google a total of $9 billion over the last two years.


Did Google pay all of it yet?


Of course not. Not even the regular people I know pay their fines so fast :)


If their whole business model is selling personal data, then 4% is clearly just a cost of running their business.


Given that "European Commission fines" is its own bullet point under "Costs and Expenses" in Alphabet's latest quarterly report, that view sounds about right.


The GDPR does consider willful violations and a pattern of behavior.

4% the first time might be something you can shrug off, especially for a company the size of Google. But if you continue breaking the law and give the regulators an easy second or third bite at the apple...

I'd expect Google to make no changes and to fight the regulators the first few times.

E: I used to have a comment here about Google continuing their current practices against non-EU people but it appears from my reading of the GDPR that may not be so simple


Unless I'm misunderstanding what you mean by absolute and relative, I think the law is already relative:

> The maximum fine under the GDPR is up to 4% of annual global turnover or €20 million – whichever is greater – for organisations that infringe its requirements.

From here: https://www.itgovernance.co.uk/dpa-and-gdpr-penalties


In context, "relatively bigger" would mean something like a progressive tax bracket. $20MM up to $500MM rev, 4% up to $1BB rev, 5% up to $2BB rev, 6% up to $5BB rev, etc...

A straight 4% would be absolutely bigger, but relatively the same (once beyond $500M).
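A minimal sketch of that kind of bracketed cap, using exactly the hypothetical thresholds above (the rate past $5BB is a guess, since the bracket list only says "etc..."):

  // Progressive fine cap: the applicable rate depends on annual revenue.
  // All thresholds/rates here are the hypothetical ones from this comment, not the GDPR's.
  function progressiveCap(revenue: number): number {
    if (revenue <= 500_000_000) return 20_000_000;        // flat $20MM up to $500MM rev
    if (revenue <= 1_000_000_000) return revenue * 0.04;  // 4% up to $1BB
    if (revenue <= 2_000_000_000) return revenue * 0.05;  // 5% up to $2BB
    if (revenue <= 5_000_000_000) return revenue * 0.06;  // 6% up to $5BB
    return revenue * 0.07;                                // "etc...": assume the rate keeps climbing
  }

  progressiveCap(136_800_000_000);  // ≈ 9.6B, versus ~5.5B under a straight 4%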


This is a good idea. However, what about phantom businesses which commit the crime but would have no real revenue?

My problem with fines is that they don't really force the PEOPLE in the businesses to play fair.

What about required & [RESPONSIBLE] roles and jail?

I am asking that in general because I am fed up with our world of business entities, where committing a crime is basically RECOMMENDED if the numbers and percentages say so.


I think it's not about "income brackets", it is about profit margins, which can vary a lot between industries. 4% of revenue is enough to bankrupt a traditional business like Walmart, with its profit margin of 2.48%. Google is a low-cost business with a profit margin of 25%, so even the maximum GDPR fine is something they can just write off.


So just make the fine a portion of profit? Maybe a three-layer system that takes into account a flat euro rate, a percent of revenue, or a bigger percent of profit, whichever is highest.


Making the fine a percentage of profit would be even worse: Amazon, for example, has no taxable profit at all, so the GDPR fine for them would be $0 (or $20M, which does not make much difference). And having different fines for different industries, based on gross profit margins, could be viewed as discriminatory, and therefore ruled illegal.


Though I think the GDPR is bad law in some ways (chiefly in terms of the chilling effect on small operators), I think that allowing the cap on the fine to be revenue based (and specifically global revenue based) is nearly essential.

Otherwise, you get into accounting chicanery (or outright loss-making companies being able to operate with impunity while they grow).

There's nothing stopping the enforcement action from taking into account the underlying profitability if something like a grocery store were to run afoul of the GDPR.


This


The fine is already kinda big for the GDPR (4% of global revenue for big companies), but Google has gotten away with way worse on way more regulated fronts, e.g. their antitrust case, which amounted to a slap on the wrist.

If the GDPR wants to be effective, it needs ridiculous, company-breaking fines for big abusers, the way COPPA and similar laws have. Something like a per-user fine for violations, so that they either can't do business in the EU or they have to become compliant, otherwise they risk being basically destroyed (for reference, violating child protection laws in the US can break, and has broken, companies before).

EDIT: also, as for the tracking-as-growth comment -- I agree on this, but the effects of the GDPR on growth tools reach far beyond it. Even basic metrics are hard to get without a bunch of hoops. Even if you store no data you have to have a bunch of checkboxes and banners everywhere. Just for using Google Analytics, which relies only on what it knows about your computer (fingerprinting) and uses no PII at all, you need a banner and a privacy policy. The laws are making it hard to use even basic analytics without fearing a misstep.


>Google has gotten away with way worse on way more regulated fronts i.e. their antitrust case which slapped them on the wrist.

Although I feel that Google has won its position in search, etc. by offering a legitimately better product, we need to punish companies that continue to break the law. Success does not put you above the law. If you use your massive profits to absorb fines for breaking the law repeatedly, you should lose your ability to operate as a corporation and be dissolved. We need to reform antitrust laws. It's not just about being able to control an entire market anymore, it's about being able to ignore international law because you have so much money.


I agree!

IMO, the problem of data abuse can't be solved with data privacy laws. It's regulation with very little ability to be enforced.

In the first place, to abuse data you need a lot of it, so really we're looking mostly at big companies with deep pockets. The easiest way to attack here is to punish unfair markets and to have stronger antitrust laws.

Let's look at the textbook case for privacy: Facebook. Facebook would be the prime candidate for antitrust no matter how you look at it: there are basically no competitors to it in the US social media market. They own everything except Snapchat, which is dying off and failing to turn a profit. Facebook accounts for so much presence in the US that they have login buttons you can integrate on different sites (Google does too, because of how crazy cemented they are). Yet somehow, despite being so obviously monopolistic and out of control, they're hit with no antitrust action. Bell was split up for doing much less.

Antitrust is just a joke right now. We have to get better enforcement first before looking to create regulations to be enforced.


Antitrust was neutered in the US by the Chicago School of Economics and the legal theories of Robert Bork (he of the Saturday Night Massacre and failed Supreme Court nomination).

Let's hope Lina Khan and the seeming bipartisan consensus that Big Tech needs taming are the beginning of an antitrust renaissance.


These are valid points, but I still think it's a reasonable trade-off. Yes, I believe the popups, banners, etc. might be annoying, but all companies share this problem. And I find it hard to believe it really hurts a business with a valid product.


> If they're able to beat Google's lawyer army and actually prosecute them, then Google will take a whopping fine in the millions of dollars that'll be more than covered by their daily revs.

This is why the 4% of global annual revenue fine option exists. A few of those add up quick.


Furthermore, if organizations are explicitly accepting penalties to keep on violating the law, that law will be adjusted accordingly. It might take a while, but it will happen. Laws with the intention to change behavior / culture cannot "work" from day one. This is a continuous process steered by politics, courts, governments, and public opinion.


Penalties can be applied repeatedly if the violation continues. It's not a 4% lifetime cap, it's 4% per enforcement action. The DPA could just churn them out once a week, and then we're at 204% of annual turnover.


And I'm very annoyed that your initial reaction to reading this article is to blame the GDPR instead of blaming Google for these shady practices. Boycott that crap, move to other services. This shouldn't be acceptable.

I'm very happy that the GDPR exists, if only because it forces all these websites to explicitly give me a list of the literally hundreds of partners they want to share my data with, along with a way to say "hell no". Of course Google and friends will try to work around it, but hopefully that won't come to pass and they'll have to actually bother changing their crappy business model. I think the spirit of the law is fairly clear, and I wonder why Google thinks this scheme can work. Maybe they're just trying to buy some time.

As for startups that sink because they can't be bothered to sanely handle my personal data: good riddance.


- My initial reaction is to blame the GDPR, yes, because it's just security theater that does very little to actually ensure privacy. Sure, Google is at fault, but the GDPR was supposed to regulate this and it is clearly failing to do so. And if you want to boycott them you're welcome to, but they've built an empire with their cloud, search, email, etc. to the point where that would be pretty difficult and annoying for the average consumer. They're effectively too big to be boycotted at this point.

- It doesn't explicitly force them to do that. And most sites aren't explicitly sharing the data either; e.g. almost every site uses Google Analytics, which doesn't really respect Do Not Track all that well, and Google will then share that data with everybody else (which is part of the violations described in this article). Also, saying "no" doesn't do much either, as most of these big sites either already stored the cookie or won't do much to delete it. And it's not that they'll try to work around it: they either don't put forth the effort because the GDPR is like a pebble to them, or they already worked around it in a way that changes nothing. Their business model is still the exact same.

I think the spirit of this law is fine, but the actual law does nothing and is just privacy theater. Google isn't buying time, almost 4 years later nothing has changed -- they just know they can't be touched.

- They're not going to sink, they'll just grow much slower and thus won't be an alternative to the big data abusers you hate so much. And they're not failing to handle your data; most of the time startups aren't selling you out, they're just trying to figure out internally who their customer is. To do this they collect some data that is usually optional and very much with your consent; the GDPR just puts a bunch of hoops in front of this so that it's an enormous pain to do so. I run a startup that collects basically no data (literally, we do not have a database for 2 of our products). It was a pain for us to become GDPR compliant because that disables our metrics entirely and requires a bunch of banners and checkboxes everywhere even though we literally store nothing.

I'm all for the spirit of the law. I just think the execution sucks and they definitely didn't think it through enough. I think the evidence for this is clear based on the sheer number of privacy violations we've had since GDPR was enacted alone, and how little enforcement and regulation has actually gone on.


Your reality differs violently from mine.

We're not four years into the GDPR becoming enforceable (it was just last year), and there certainly hasn't been enough time to see real action for complex cases - regulators and courts move slowly.

Startups barely ever ask for consent, and collect way more data than you imply - including, as you mention, by using Google Analytics.

The GDPR requires zero banners and checkboxes if you are not processing data.


> GDPR was supposed to regulate this and it is clearly failing to do so

How do you know? The fact that there's still crime doesn't mean that law enforcement doesn't do anything. Somebody reports an alleged violation, an investigation is started, and maybe the investigation will find that Google is in violation (in which case a fine will "regulate" Google's behavior) or that they are not (in which case it may be fair to criticize the GDPR for allowing that behavior, or it may not be, because the information wasn't correct).

> almost every site uses Google analytics which doesn't really comply with do not track all too well, and Google will then share their data with everybody else

Unless you have specific info, I believe you're mistaken here. GA is generally seen to be compliant if anonymizeIp is active and you're not pushing PII into it via customization. Google is, if I understand it correctly, not "sharing" GA raw data with anyone, but analyzing the data for their own research and providing the website owners with aggregate data (i.e. demographic information) without sharing data on individuals. I'm not a fan of GA, but I haven't seen any info that they're that obviously in violation.
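For what it's worth, anonymizeIp is a documented Google Analytics setting; with the classic analytics.js snippet it's typically switched on roughly like this (the tracking ID is a placeholder, and enabling it obviously doesn't settle the compliance question by itself):

  // analytics.js: ask GA to truncate visitors' IP addresses before storage.
  // 'UA-XXXXXXX-Y' is a placeholder tracking ID.
  declare function ga(...args: unknown[]): void;

  ga('create', 'UA-XXXXXXX-Y', 'auto');
  ga('set', 'anonymizeIp', true);   // enable IP anonymization
  ga('send', 'pageview');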

> It was a pain for us to become GDPR compliant because that disables our metrics entirely and requires a bunch of banners and checkboxes everywhere even though we literally store nothing.

What was the specific pain? I get that having to add a privacy notice sucks (and might cost money if you get a customized version), but I never found it to be that big a deal. If you don't store any PII, it's pretty straightforward, and so will the procedures be if anybody asks about the data you stored: just inform them that your systems do not store any data in general and also didn't store any data on them in particular.


From my (admittedly limited) understanding, this is not actually legal under the GDPR. Certainly the alleged (but not demonstrated) behind-the-scenes trading of personal info isn’t, but the shared id is also personally-identifying information, and directly regulated.


It is very much not legal, but I think the parent was saying these regulations are more onerous for small dev shops than for Google, and the fine for this will be minuscule. Hopefully companies will find paths to revenue that do not require selling out their users to this level, maybe by just having ad auctions without any identifying information at all.


Like it or not, the internet currently would not exist without the ad-supported business model. Regular users expect everything on the internet to be free. I know people that categorically refuse to buy a 1-dollar app. It's starting to change now with people getting used to subscription models (Netflix, etc.). But it will take a while until we start paying for news again, for example. Apple News+ and the Google News Initiative are a step in the right direction, even if it's just for aggregate subscription management.


GDPR does not prohibit ads.


> Currently the internet would not exist without the ad-supported business model.

In its current form, yes. However, I'm not so sure that everybody here would agree that "the web today" is fundamentally better than the web ten years ago, technological advances aside. Everybody smelling gold and starting a blog to mindlessly shill for products in hopes of getting a commission, super-low-quality texts written/generated/spun entirely for SEO reasons to place ads between the paragraphs, etc. doesn't come to mind when I ask myself "what could be better on the web?" If those things disappeared tomorrow, I don't think a lot of us would miss them. We'd notice them being gone, because it might feel like being able to breathe freely after a strong cold, but I don't think many would miss them.

> But it will take a while until we start paying for news again, for example.

Plenty of people pay for news. They won't pay for Gawker or Buzzfeed though. I don't think that's a problem for anybody not invested in or working for those companies.


From gdpr.eu:

"The more serious infringements go against the very principles of the right to privacy and the right to be forgotten that are at the heart of the GDPR. These types of infringements could result in a fine of up to €20 million, or 4% of the firm’s worldwide annual revenue from the preceding financial year, whichever amount is higher."

For Google that would be 4% of its worldwide annual revenue, I'd assume. Taking into account that it's not one infringement but multiple, that could mean a pretty hefty fine.


That is the worst case; no GDPR fine has come near that maximum yet.


The first GDPR fines handed down by the ICO have been hundreds of millions of pounds for negligent breaches - I don't think it would be out of the realm of possibility for breaches by _design_ to result in multi-billion pound fines.


Several billion pounds is still not much! They already broke the GDPR once (or twice, I think?) and received a $57MM fine.

$57MM is nothing to Google. They've escaped even antitrust cases with minimal injury; it would be a truly shocking event if the EU actually managed to touch them.


This has always been absurd. Large companies have way more code and features in general which need to be checked for compliance, whereas small shops with small sets of data and features will have a far easier time complying with GDPR.


Large companies have the means to pay for the manpower (lawyers/consultants, developers, etc) to certify compliance with GDPR. Small companies often don't.

I paid $2K for my first GDPR consulting session for a $7K MRR app and was quoted ~$25K for consulting while I would personally implement what needed to be done. $25K is nothing for a large company, but it's prohibitively expensive for a lot of small companies. This cost also doesn't include the (probably hundreds of) man hours required to implement and certify GDPR compliance, which are also disproportionately valued when it's being done by 1 person in a <5 person company versus N people in a >5K person company.

Hopefully these costs will fall as more people become legally knowledgeable about what the GDPR entails and the market of people available to help grows. Unfortunately there's no "feel free to wait if you can't afford it yet" clause in the GDPR.


Another issue is that as a small company you generally lack the resources to effectively contest violations. Google can, and will, drag these things out in court for years. And ironically for free. Their legal costs are going to be covered by inflation on the fines themselves. 2% inflation on a $1 billion fine reduces it by $20 million a year. And also factor in the interest Google is earning on that $1 billion on top of the 2% 'principal' reduction per year.

The whole penalty system is quite silly. The fines destroy small companies who are the ones struggling to comply, and do little more than offer extremely gentle pokes on the wrist for megacorps that have relatively unlimited resources available for complete compliance, if they actually wanted to comply.


Even from a basic point of view: people go to Facebook, Amazon and Google daily. They accept the GDPR privacy policy once. Every single other website is bombarding users with popups, so there's a far greater chance users will click away from a startup's website.


It's not legal, but there isn't much the EU can really do. It would be shocking if they actually managed to prosecute Google, which has so far avoided much hassle in antitrust and the like, taking, I think, a billion-dollar fine, which sounds like a lot but is basically a slap on the wrist.

That's why, IMO, the GDPR sucks for small businesses, which can be reported to the ICO for a minor oversight, and not so much for big data abusers, which can take on the GDPR and come out unscathed.


That sounds like something fairly trivially avoided by having the punishment be proportional to revenue. And I believe this is already the case for GDPR?

A quick search indicates "Up to €20 million, or 4% of the worldwide annual revenue of the prior financial year, whichever is higher" https://www.gdpreu.org/compliance/fines-and-penalties/


The EU has shown that it's willing to scale up the fines all the way if the company in question keeps on violating the law. Alphabet's global revenue in 2018 was $136.8 billion, so the maximum fine is about $5.5 billion, which is in the vicinity of fines they've already received. It's a separate line item in their yearly financial report. The gain must be significant if they continually keep violating the laws.


This is being quoted in every comment but if you have enough lawyers anything is possible.

Google has come out of antitrust cases relatively unscathed. They've even explicitly violated the GDPR itself once before, and got out with a $57MM fine. This case won't be any different from all the other times that Google has blatantly violated laws and walked away with a slap on the wrist.

I would be very very very shocked if the EU actually managed to touch Google. I welcome and hope to be proved wrong.


I mean, that's an entirely different class of problem.

If the law literally doesn't work because of reasons, then that's just systemic corruption.


I would argue that it's the same problem and the reason GDPR is privacy theater.

It's a lot of regulation that can be worked around, and the fines are hard to enforce and rarely are. There are a bunch of poster children of GDPR fines that make it seem like it's doing a lot, but the principal abusers (i.e. Google) just walk away with a light slap.

It needs the ability to be enforced, and I think this much should be obvious to lawmakers -- a law that can't be enforced well is useless.

That's why I'm calling it privacy theater. It's the EU saying "look what we did!" but in practice it doesn't really do much without enforcement that still does not exist both at a national and global scale.


As far as I know, GDPR fines are purely regulatory and never go to court. So I am not sure how the lawyers are relevant.


I really doubt Google Adx would pass buyer_uid to buyers in EU28 countries. They were the first ones to truncate IPs in EU for privacy reasons.

We've stopped cookie matching in EU28 countries so I can't verify if they do pass the buyer_uid.


Targeted ads are already a serious leak of information.

If somebody looks over my shoulder and sees the ads presented to me, they can infer things about me.

Also, if a malicious actor targets an ad to a group of people, and some of these people buy the advertised items, then the actor can infer things about those people not necessarily related to the items sold.


At my last job the traffic was filtered through a proxy due to FINRA regulations. I’d see Portuguese ads for diabetes medication and there were 2 Brazilian guys in the office.

Seemed like a major HIPAA violation to me.


HIPAA only keeps healthcare providers from sharing your information. It's not an omnibus shield for your health information. If Alice tells her coworker Bob that she had diabetes, it's not a HIPAA violation for Bob to tell Charlie.


> HIPAA only keeps healthcare providers from sharing your information. It's not an omnibus shield for your health information.

Maybe not, but GDPR sure is.


Is it really? If Alice tells Bob she has diabetes and Bob tells Charlie, is Bob in violation of GDPR?


Are Bob and/or Charlie the name of a person or of a company?

The way you're using them, it sounds like Bob or Charlie is a person in your mind. I might be wrong in interpreting it that way. If so, could you give another example where Bob and Charlie are companies and Alice's information is part of a transaction?


GP's comment paints Alice/Bob/Charlie as people:

>If Alice tells her coworker Bob that she had diabetes, it's not a HIPAA violation for Bob to tell Charlie.

I was responding to the parent comment's claim that it's not a HIPAA violation but rather a GDPR violation.


No, GDPR does not apply between 2 persons.


> Individuals can also face fines for GDPR violations if they use other parties' personal data for anything other than personal purposes.

https://www.coredna.com/blogs/gdpr-fines


It would have been a funnier story if it were ads for Viagra ;)


Because erectile dysfunction is funnier than diabetes?


Not really. Those sorts of ads are sent without targeting.

> If somebody looks over my shoulder and sees the ads presented to me, they can infer things about me.

You have to take some personal responsibility, though. If they saw your Youtube recommendations or your Spotify playlist, they'd probably make inferences as well. That porn link in your history you forgot to clear? Be aware of who's watching and browse anonymously if you're concerned.


I've had ads for sketchy shit I googled at home on my personal computer show up on my work computer at the office.


Connecting personal accounts (Gmail, Chrome browser profiles) on a work computer is something that you should only do with careful thought.


I've had ads for things that I only just spoke about, out loud, to someone near me like a friend or family member, show up on a computer in a different country.


I've had ads for things I'd only spoken about show up in FB. I have more of a libertarian mindset, but that really creeps me out, and I think speech-based ads should be outright banned due to privacy concerns. It's not so much the ads; it's being recorded and potentially having those recordings leak in a data breach.


Or it's just one of 100 coincidences that happen to you every day.

Easy to prove: store a log of all your network traffic and record all the audio you speak; then, when you see a match, go back, find the proof, and become world famous.


I stand corrected. I thought I had read this was actually happening, but it appears to only be speculation.


It was widely believed for literally years until the Senate Judiciary and Commerce committee hearing in 2018 where Zuck called it a 'conspiracy theory'. Since then it has been dismissed as such. My question is - if I personally observed it before I even heard about this 'theory', and thousands of others around the world also observed the same thing, why are we dismissing it as a 'conspiracy theory'? Just because Zuck labelled it as such? Why are we trusting him to tell us the truth again?

"somebody looking over your shoulder" can see a lot more than you ads, like your private messages, bank info, medical info, etc.


Not if I'm just showing them a random website. The problem with targeted ads is that they show up on random websites.


I dunno man. This reminds me of the time that someone at defcon said they found a vulnerability in my last company's product because it flashes a WiFi password to an iot device instead of making a user type it in.

"What if we capture the flashes and steal the password?"

Well, if you're positioned to capture the flashes, you're definitely positioned to just watch me type it in...


Would you be ok with it if your browsers at home, in the office and on your mobile phone always showed your bank balance on the top of the screen in a large font?

I assume most users would not. But they would be ok with their bank balance being shown if they specifically opened their bank website.


Why are so many people that paranoid? No one is going to destroy your life because they saw your ads.


Imagine someone giving a presentation to room full of co-workers and a web ad comes up saying something like "Resubscribe to Cannabis Weekly Delivery and get 10% off."

It's not hard to imagine a person's career being affected by something like that.


What if I were anti-abortion or pro-Trump at a progressive tech company? Would my co-workers feel more comfortable destroying my life then?


If someone looked over your shoulder and saw you browsing HN, they could infer things too.


Yet, they choose to surf to HN.

They're not choosing to have targeted ads that share their info around the web and cause someone over their shoulder to infer things about them.

That's the point -- we should have that choice. And the default should be "no".


I understand the opt-in rather than opt-out, but does disabling Ads Personalization [1] not do what you're asking?

[1]: https://adssettings.google.com/authenticated?hl=en


No, a Google account shouldn’t be required.


Why not? How else would Google know who not to track? It's not like they can identify you and remember that preference without a Google account...


Yes, that's why targeted ads shouldn't be a thing unless they're opt-in (not necessarily my opinion, but it seems to be the point the parent was making). At that point, to opt in you can create a Google account. Currently, though, Google will attempt targeted ads on people without a Google account by trying to identify and track them through other means.

Ideally you would have site-specific or content-specific ads normally and personalized ads if you created an account and chose to opt-in.


My children tease me about "being a hacker", by which they mean unlawfully breaching security of internet systems, because they've seen me reading "hacker news".


Yeah, but HN is not shown as part of every website out there.


The problem is if the person looking over your shoulder has power over you.


The sharing of data is what makes RTB valuable and most likely viable.

Because what Google are doing is not dissimilar to how any other RTB participant is acting, saying this is a Google workaround seems disingenuous.

Unfortunately I fear this will only embolden Google to further monopolize digital advertising.


Is it really a "workaround" if they're just breaking the law?

I mean, if the allegations are correct, Google didn't find any loophole, they're just hiding the fact that they're selling person identifiers.


The EU should raise the 4% annual turnover rule to 10%. Google doesn't seem to be deterred.


There is a reason they didn't. They fear the US government's reaction.

Edit: Why the downvote? Do you really think that the US government will stay silent if the European Union threatens Google with such fines? Political tensions are something you take heavily into account.


The EU should ignore the fines this time and start an "information campaign" regarding the behavior of Google and others.

I bet that hurts Google 10 times more.


They could also do both.

I'm _really_ tempted to write that they could use the fine to finance the information campaign, but I know that government finances don't work that way.


Gov finances do work that way. That's how the anti-tobacco campaigns are funded in the USA.


Governments are likely walking a much finer line than we might imagine. Imagine they carried out your idea. The EU is a political organization manned by a large number of mostly professional politicians. Google is the world's largest data-harvesting and advertising company, whose products are used on a daily basis by a pretty sizable chunk of our entire species' population. Imagine if Google decided to fight back. Who would be able to create a more effective "information campaign"?

I can't help but consider one current "information campaign" in the UK. In response to skyrocketing violent crime, they've chosen to put anti-stabbing/knife messaging on fried chicken boxes. Literally [1]. Knife amnesty bins and fried chicken anti-knife messaging. That's a government "information campaign" through and through.

[1] - https://www.cbsnews.com/news/u-k-government-tries-to-fight-k...


I disagree. People were not aware of the shenanigans Facebook was pulling before the media outrage. Now they are and many are leaving the platform.

Exactly how would Google fight back? Kill Android? Close down YouTube??


Consider Brexit for a minute. The most recent polls show support for leaving, but throughout it's been extremely close. However, multinational corporations are universally against Brexit - a global world is a more profitable world. And these same multinational corporations tend to have a stranglehold on the places most people get their news from. This can be from the news agencies themselves (Disney owns ABC, Comcast owns NBC, Time Warner owns CNN, etc.) but, more directly, also from the way that people get their news. For most people that is Facebook and Google. And these corporations tend to promote what is in their own best interest. As a specific example, CNN ends up being chosen for about 20% of Google's news recommendations. It's a deeply partisan site that's not uniquely popular and has a dubious track record when it comes to reliability. But their agenda and Google's agenda fit nicely.

Consider the two topics above, combined. The global media has nearly universally tried to condemn Brexit. And while media clearly doesn't have as large an effect as some would like to imagine (Facebook was seeing an exodus of young users before any media outrage - it's become the social media site for your mom), it equally clearly does have at least some effect. So imagine Google simply swapped its bias and was suddenly disproportionately promoting messaging-cum-propaganda against the EU, in favor of Brexit, promoting things such as the yellow vests in France, the various leave campaigns gaining momentum in other nations, etc.

When topics are this close even with the media disproportionately on one side, then if the media people are presented with suddenly started lobbying for the other side, that would have a massive effect. I don't think it's hyperbolic to suggest that companies such as Google and Facebook could effectively cause the EU to collapse if they so desired. It's already on somewhat shaky ground even with near-universal media support. If they don't play ball with the companies that direct that media, that ground very much stands to give way.

That doesn't mean the governments are completely obsequious to the corporations, yet, but it does mean that the corporations are also in no way obsequious to the governments. And I think this interbalanced relationship is one major reason we increasingly see governments reluctant to do anything that could meaningfully hurt mega-corporations or other very powerful players. It's also why I see us gradually heading towards more overt corporatism. Corporations grow exponentially more powerful by the decade, and this shows no signs of abating.


> I bet that hurts Google 10 times more.

Nah. Too big to bother.


GDPR enforcement is 15 months old, and regulators aren't the fastest bunch. They're also cooperation-based, as the goal isn't to serve fines but to ensure compliance: if you cooperate, you might get away with no fine at all (depending on circumstances).

Also, the 4% global revenue fine hasn't been exercised yet because it's the maximum fine, and there needs to be room for escalation: it's hard to serve a bigger punch if you're already at the maximum.


Is there any way to improve the matching of ads to the viewer without violating their privacy?


The matching is in itself a violation of privacy, at least if you interpret the right of privacy as "The right to be left alone", as former Supreme Court Justice Louis Brandeis put it.


Yes, that's what I was thinking too. For Google, being in the ad business itself necessitates that its trajectory will be on shaky ground w.r.t. privacy.


I think that’s incorrect, relevant ads could be displayed based purely on the site content, without user info attached to ad calls. We’ve been there.


True and irrelevant. If you're displaying ads based on site content, you are matching ads to the site content, not to the viewer.


It is actually relevant, because they are matching the ads to the user; it just happens via a proxy variable, which is the site you are visiting.


The original question was "Is there any way to improve the matching of ads to the viewer without violating their privacy?"

Your answer is that we should match something other than the user, that happens to correlate with user interests. That is, by definition, not matching ads to viewers.


I think either our ideas of "by definition" or something else differ.

Viewers get ads matching their interests, as proven by the fact that they are on a related website. I don't see how that isn't "matching ads to viewers"?


Why the downvote? Care to provide a reason?


From [the guidelines](https://news.ycombinator.com/newsguidelines.html):

> Please don't comment about the voting on comments. It never does any good, and it makes boring reading.


Thanks, didn't know about that. I wish I had received better feedback there, though, as I'm deeply interested in the topic and generally it's quite hard for me to find negative opinions and counterarguments I could work on. I find negative feedback more helpful, if it's constructive.


I think maybe the problem with the comment was that it started with "I think that’s incorrect" but it reads like a non sequitur. The comment to which yours was replying was claiming that matching ads to viewers is itself a privacy violation but your point seems to have been that it – matching ads to viewers – is unnecessary, which, altho related, is a different point and doesn't follow from you thinking that the other commenter's point is "incorrect".

I think you're right in that "site content" is a good-enough proxy for users/viewers/targets for advertising, tho I also readily understand why advertisers would always like more info with which to target their ads.


They will leave you alone if you choose not to send an HTTP GET request for their content.


Yes! Contextual Targeting (targeting based on what I am reading) could work, although the industry seems to be clinging to Behavioural Targeting (targeting based on who I am). This will become more important for the open web due to third-party cookie constraints, regulatory changes, etc., but Google/Amazon/FB are less likely to be impacted.
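To make the distinction concrete, here's a toy sketch of contextual targeting: the ad pick depends only on the page text, never on a user identifier (the categories and keywords are made up):

  // Toy contextual targeting: choose an ad category from the page content alone.
  // No cookies, no user IDs - the same page yields the same ad categories for everyone.
  const CATEGORY_KEYWORDS: Record<string, string[]> = {
    fitness: ['calories', 'workout', 'protein'],
    travel: ['flight', 'hotel', 'itinerary'],
  };

  function pickAdCategory(pageText: string): string {
    const text = pageText.toLowerCase();
    let best = 'generic';
    let bestHits = 0;
    for (const [category, keywords] of Object.entries(CATEGORY_KEYWORDS)) {
      const hits = keywords.filter((kw) => text.includes(kw)).length;
      if (hits > bestHits) {
        best = category;
        bestHits = hits;
      }
    }
    return best;
  }

  pickAdCategory('Popeyes buttermilk biscuit: calories and nutrition'); // -> 'fitness'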

In fact Contextual Targeting predates the current approaches, but it became less important once advertisers/adtech companies started preaching the thinly veiled idea of using behaviourism to trigger conversions/sell products.

Changes like this are slow to introduce due to technical and (mainly) structural/cultural issues in the advertising industry, but that's a topic for an entire essay/series of blog posts.

Source: I work in AdTech and deal with privacy/the ethical impact of programmatic, content monetisation models. Opinions obv. mine.


Mind sharing which company you work for? I'm interested in scaling contextual, privacy-focused publisher targeting solutions.

On the PG/PMP side, the usage is obvious, but I'm also curious what it will take for publisher-provided data to be trusted in the open exchange environment where historically advertisers and DSPs have tended to not trust the publisher-supplied categorization.


Yes, we can do contextual targeting, but after that, what are the ways to improve upon the outcomes (from an advertiser's standpoint) without violating privacy?


Depends on the outcomes/metrics they're interested in. For instance, I genuinely believe that targeting based on content is less dangerous from a brand safety/brand perception perspective.

How about conversions, sweet $$?

I think replacing audiences with a semantic targeting model (NLP) could perform almost as well (if not better). Behavioural Targeting's performance is overrated (feel free to look it up, esp. CPM vs. conversions; it's quite interesting, but I'm on holiday and shouldn't be sitting on HN anyway!).

Another, deeper point: how much advertising do we (or brands) really need? Do alternatives exist, and are they expensive/hard? What problem does advertising solve? I know this sounds like a silly question, but I think it's worth asking given the current technical landscape and the ethical impact.


Well... I'm totally OK with the efficiency of online advertising being low. My point was that after a certain point, there is no way to get a better ad-to-viewer match without violating privacy. Google was never on the right side of privacy, and given the business they're in, they never will be.

