GDPR will pop the adtech bubble (blogs.harvard.edu)
668 points by mpweiher 9 months ago | 437 comments

Oddly, I think the article underestimates the size of the change coming. I think one side effect of the surge of IT into advertising is that it has become easier to measure exactly how well advertising works. By and large, it doesn't, very well.

I am reminded of this, from Paul Graham, about his time at Yahoo: http://www.paulgraham.com/yahoo.html

"...The reason Yahoo didn't care about a technique that extracted the full value of traffic was that advertisers were already overpaying for it. If Yahoo merely extracted the actual value, they'd have made less."

I'm pretty sure that, the more we learn about how well advertising works, the less money people will be willing to pay for it. What's happening to billboards and newspapers right now is probably coming to several other industries soon.

> By and large, it doesn't, very well.

I know Twitter isn't known for being the best at advertising, but it was made exceptionally clear to me that online advertising is a massive bunch of lies when I did my GDPR Twitter data export and it included me in a bunch of incorrect, non-sensical and contradictory ad targeting groups.

Twitter claims I:

    * Own a cat, dog and other animal (I don't)
    * Have between $100k- $999k liquid investible assets (I don't)
    * Have a net worth between $1 and $1m (cool - I own *something*)
    * Am highly affluent (/shrug)
    * Am a high spender (okay...)
    * Am a frugal spender  (...but how can I be both a high spender AND frugal)
    * Own a house (I don't)
    * Have multiple families (I don't)

I was very disappointed that the Google and Facebook data exports don't contain this data. Maybe it's for their best.

> GDPR Twitter data export and it included me in a bunch of incorrect, non-sensical and contradictory ad targeting groups.

This is expected. It's not desirable, but it is expected.

The problem is you're comparing it to an absolute, i.e. to perfect.

If advertisers had that choice, they would love it. They generally don't. Remarketing is kinda close, but limits the scale.

Rather, the only other scalable options are far worse. Think about it. What are the marketers' other choices?

The one that should come to mind, and the one they spend most of their money on: TV.

With TV you pick up a huge amount of waste. Say you buy a spot on Big Bang because you want to reach people thinking of buying a new iPhone. Not a big stretch, right? At the same time, think of all the other viewers you have to pay for, who watch the ad, and aren't buying an iPhone. It's waste.

And that waste is huge relative to what you're talking about above.

So you're asking the wrong question here. It's not how good is the targeting in terms of precision/recall. The question is what's better?

The waste here is generally known & it's priced accordingly.

> I was very disappointed that the Google and Facebook data exports don't contain this data.

It's in my Facebook export. Under "Information About You" did you uncheck "Ads"?

That's where the information is contained & it's checked by default.

You can also view it here for both FB & Google, as well as opt out of both:



> Rather, the only other scalable options are far worse.

Is it really worse than "this article is about X, Y is related, let's show some ads for Y"?

I bet from the point of view of the advertisement middlemen it is not, because they capitalize mostly on showing ads for products that people have already decided to buy or are very near deciding on. But that is yet another racket that only decreases the value of the industry. My question is, from the point of view of the real advertisers (the ones selling something), is it really worse?

Right but the advertising budgets for things actually related to articles or other media content are vastly dwarfed by the advertising budgets for unrelated things that actually make more revenue/profit.

So you'll end up having to try to monetize something that has little to no related things that are profitable to advertise.

> What are the marketers' other choices?

Advertise on a tech site, 80+% will be interested in tech.

Advertise on a dog community site is probably pretty spot on if you want to target dog owners...

Far better than what, 5%? This isn't rocket science; we sold everyone's integrity for pennies.

> who watch the ad, and aren't buying an iPhone. It's waste.

No, advertisements like that are more about selling the brand than selling iphones. Something TV is probably pretty good for.

> Advertise on a dog community site is probably pretty spot on if you want to target dog owners...

That would probably target dog enthusiasts or dog hobbyists or dog fanatics, which are all subsets of dog owners.

I think that one of the things that makes social media ad targeting so attractive to advertisers is that they can target the dog owners, instead of just the enthusiasts/hobbyists/fanatics.

> Advertise on a dog community site is probably pretty spot on if you want to target dog owners...

Big brands struggle with UGC. Witness the events of YouTube & brand safety.

But even without that, what you're saying just isn't scalable. What community site can reach even 50% of dog owners -- let alone what a primetime sitcom or YouTube can reach?

> No, advertisements like that are more about selling the brand than selling iphones. Something TV is probably pretty good for.

What? Sure, they have campaigns running for awareness, recall, perception.

But you're really saying Apple doesn't dramatically increase its spend when a new phone is released? That's poppycock.

Only Google et al. care about scalability; everyone else cares about results. Only Google needs to sell ads without any knowledge of the product or audience. This scales extremely well but leads to poor results.

> But you're really saying Apple doesn't dramatically increase its spend when a new phone is released? That's poppycock.

When is the best time to sell the brand? When your flagship is a year old and still costs as much as on launch day and when the competition has surpassed you? Hardly.

People who have enough interest to use a dog forum know too much about quality to buy things just because a company advertises them. They already know what works for their dog and don't listen to advertising in the normal sense. They might, however, listen to bloggers who have been paid to talk about brands.

If you buy ads, you want them displayed to the non-fanatics, who are not guided by experience and are vulnerable to brand exposure.

As long as the algorithms are close to or better than this model why would they bother? One is all automated and the other is a lot of work dealing with individual websites (even if it is just a matter of manual targeting settings)

I'd say they are much much worse, but they are scalable and cheaper so that kind of makes up for it.

But the only one who cares is the middleman. They are the only ones that benefit from the scale.

> Advertise on a dog community site is probably pretty spot on if you want to target dog owners...

Give me a list of dog community sites to target 1 million dog owners.

Now tell me how expensive it would be to show ads on these websites.

Plenty of dog owners are on Facebook, so it's much easier to target 1 million dog owners there instead.

Did I forget to mention that it's for a dog shop in Montreal? Good luck!

Looking at the hundreds of mostly incorrect or irrelevant demographics that Twitter has me in, I would still say there’s a huge amount of waste going on. At least with offline advertising (like TV or print) the “fuzziness” is very explicit. Here, Twitter sells you that you can target “dog owners”, but nowhere through the Twitter ad buy process does it tell you how BS that targeting is.

And that's why Twitter is not doing as well as Facebook. Facebook doesn't need algorithms to know what you are interested in, where you like to eat, etc. You explicitly tell them in your info and by what types of articles you like and share, and what you post.

Well I’ve told Twitter that I hate Blockchain and think it’s all BS, yet I still get ads for ICOs and “Uber for the blockchain” - whatever the fuck that is.

Again, I guess this is because Twitter sucks at advertising.

Twitter is the same, except even more-so. People tweet 10+ times a day, update FB status infrequently.

All the things FB has done to extract things like restaurant data etc. are product development they undertook. Twitter spent 10 years moving from 140 characters to 280 instead.

Even if that is true, when people sign up for Facebook, they fill out a bio telling FB their work history, their school affiliations, who their family members are, their hometowns and current cities. They check in where they are, and tell FB their favorite books, music, TV shows, their gender, relationship status, their sexual preference, and their political party affiliation. Not everyone fills out everything, but a lot of people do.

That's exactly my point.

There is no reason why Twitter couldn't have done the same 5 years ago. And no reason that they can't do it now.

(Also people are just as likely to follow and talk about things like tv shows and music and politics on Twitter as FB)

Plus the fact that they can also average your interests from information volunteered by your friends and family because "birds of a feather, flock together".

IOW, Twitter followers may be total strangers to many IRL, unlike Facebook followers.

Traditionally, it wasn't necessarily considered a waste to give people a positive feeling about your brand even if they weren't the specific ones who wanted to buy, because everybody talks to everyone else.

I think advertisers have been chasing a chimera via targeting, because not only doesn't it work well, but there is a trade-off to maintaining a brand.

I’ll tell you one thing that’d be better: prices and wages that make sense. I shouldn’t be faced with dozens of subscriptions just to lead basically the same life I did over a decade ago with half those subscriptions (or fewer). And being paid 5x what I made then shouldn’t feel like I’m making it work but not saving much. And I shouldn’t be getting sales calls over 4 months after having used TrueCar.

For the frugal/high spender conflict, I can kind of see how that might happen.

I try to buy quality goods because cheap stuff doesn't last. That means spending more up front to spend less in the long run. So frugal, but also maybe high spender?

My mom always says: if you're poor, you can't afford to buy cheap.

All things equal (features that I need from the product) I like to buy high quality things because they make me feel good (they look well built, they give me this warm feeling inside that I got my money's worth). But I realize this is all emotional and rationally I shouldn't, in most cases.

I don't think they pay off long term, or at least it seems heavily dependent on the product type. That's because we are playing a lottery. Even if this product breaks in 0.1% of cases vs 5% of cases for a much cheaper product, we're not buying large numbers of them, we're buying just one. While the probability that it breaks on me is smaller, it can still happen, and the monetary loss would be much larger than if I had bought the cheaper product. That is, the warranty doesn't scale with the price: the expensive top-quality TV is $2000, the cheaper one is $500, and both have 1 year of warranty. For the same money I can buy 4 of those $500 TVs, which would last at least 4 years (and I'm very likely to find at least one that works much longer, since I'm buying up to 4 of them).

That almost never works for the reason you describe. For any quality item that lasts M times as long as the cheap item, you can almost always find N > M such that you can buy N copies of the cheap item for the price of one quality item. You do not spend less in the long run, though you can convince yourself of that. The primary effect of buying cheap is not financial, it's that you're dealing with less quality over the entire usage lifetime (hence getting less utility out of it) and the replacement effort has cost (but which for poor people is very little by definition).

It is probably because they have to show you the categories/tags but they are not required to show you the exact percentage/probability. So in their db they have it like 80% frugal spender, 15% high spender but when exported you see it as a yes/no which is confusing.
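To make that guess concrete, here is a minimal sketch of how a thresholded export could turn scores into contradictory-looking yes/no labels. Everything in it is hypothetical: the `export_segments` function, the 0.10 cutoff, and the "cat owner" score are illustrative assumptions, not anything Twitter has documented. Only the 80% / 15% figures come from the comment above.

```python
# Hypothetical model scores for one user. Only the 0.80 and 0.15 values
# come from the comment above; "cat owner" is made up for illustration.
scores = {
    "frugal spender": 0.80,
    "high spender": 0.15,
    "cat owner": 0.02,
}


def export_segments(scores, threshold=0.10):
    """Return every label whose score meets the cutoff, dropping the score.

    The 0.10 threshold is an assumption. A low cutoff like this is what
    would let two mutually exclusive labels both appear in an export.
    """
    return sorted(label for label, score in scores.items() if score >= threshold)


print(export_segments(scores))
```

With a cutoff that low, both "frugal spender" and "high spender" survive the export, and the reader has no way to tell that one was rated far more likely than the other.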

I guess the 'spoiler' is that I'm fortunate enough to certainly not be a frugal spender. Yet another data point they have about me that they sell, but is completely incorrect.

I think this might be it eg spending $250 instead of $50 on shoes or boots for longer life and improved comfort.

$250 shoes do not last 5 times as long as $50 shoes. Improved comfort is the main benefit.

The math can work out in some cases. I've had my 400$ Allen Edmonds for 8 years, some wear, but generally still in fantastic shape and very comfortable. Prior to getting them I bought 1-3 pairs of ~70$ department store shoes per year.

Usually it's very anecdotal and hard to tell ahead of time. My current shoes cost 40€ and that was about 4 years ago. So they have lasted at least half as long as your expensive shoes for 1/10th the price.
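For what it's worth, the figures quoted in the last two comments can be put on the same footing as cost per year of use. This is only a sketch: `cost_per_year` is a made-up helper, it ignores repairs and comfort, the lifetimes are "so far" rather than final, and it treats $ and € as roughly comparable without converting.

```python
def cost_per_year(price, years_of_use):
    """Average cost per year of use for a single purchase."""
    return price / years_of_use


# Figures as quoted in the comments above; no currency conversion applied.
quality_pair = cost_per_year(400, 8)   # 400$ pair, 8 years so far
budget_pair = cost_per_year(40, 4)     # 40 EUR pair, about 4 years so far
department_store_low = 1 * 70          # 1 pair of ~70$ shoes per year
department_store_high = 3 * 70         # 3 pairs of ~70$ shoes per year

print(f"quality pair:      {quality_pair:.0f} per year")
print(f"budget pair:       {budget_pair:.0f} per year")
print(f"department store:  {department_store_low}-{department_store_high} per year")
```

On these numbers the expensive pair beats the department store habit, and the 40€ pair beats both, which fits the point that it is anecdotal and hard to tell ahead of time.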

I have some inexpensive Chinese-made boots that are over a decade old, I think. It can't be the norm that shoes last 4 months, unless you are talking about wearing down the soles of running shoes.

easier to repair also

Have they got the following mostly right:

    * Your age bracket;
    * The fact that your net worth is positive (not deep in debt);
    * Whether you live in a city / suburb / countryside;
    * Which part of country is that;
    * Your gender;
    * Your race / ethnicity, broadly?
If so, they still have an immensely precise focus on you, compared to TV, radio, or paper media, even if they're mistaken about your cat.

I don't know about the person in question, but in order:

* Yes, but the age bracket for me is _very_ wide (it says "alive and not in need of new knees yet probably", but not much more)

* They didn't have this info

* This was wrong

* No, they didn't have this (although the country itself was correct)

* Wrong

* They did not have this

He just doesn't want to admit on HN that he has a cat.

Yep, agreed. As a salesperson, knowing someone went to college is often more helpful than knowing exactly which college they went to; too personalized is just creepy. I think the trick is somewhere around being able to accurately engage someone with "hey, you look like the sort of person who might be interested in xyz".

Yeah, my impression is that this whole "smart, targeted ads" thing is actually just a lie, and that it's really more of a "spray and pray" kind of thing.

It's a bubble!

But everyone using this spray and pray knows it is far from perfect. You just use the best you can get and know about the imperfections. Spray and pray also works in a broad sense.

I had a very similar experience with Facebook when I tried to check my profile. Out of 20 interests, only 4 were correct.

I think they intentionally stretch the interests way too far to look better for advertisers.

In more details: https://levashov.biz/can-trust-facebook-audience-interests/

The idea of advertising is also to create interest where there is none. If you all of a sudden see an ad about those other topics you aren't super interested in yet, maybe you will become interested. That's the idea of manipulation and marketing.

That's the kind of advertisement TV is selling, not what social media are claiming to sell.

For Google, something similar can be found under My account -> Personal info & privacy -> Ads settings.

Most of the topics Google thinks I'm interested in are close enough. But American football and Parenting? I don't know where they got those. I have to consciously think to come up with the name of the local team ("Mariners? No, that's baseball."), and we don't have kids nor any plans to do so.

Maybe it was all the clicking on CafeMom.com (mommy blog with tons of trackers and ads) while I was testing the Pi-Hole set up. :-)

Would like to see my own. How do you do that?

> I was very disappointed that the Google and Facebook data exports don't contain this data.

Facebook has it available, if not in their Download Your Information tool. Go to Settings -> Ads -> Your Information -> Categories.

Mine says "You do not have any behaviors in your ad preferences." https://i.imgur.com/giAxfiF.png

But interesting page. Looking at the "advertisers who have added your details to their targeting list" it again shows how bullshit this industry is:

    * Playstation in 19 countries
    * Musicians which I definitely don't listen to, like Keith Urban, Post Malone, Jack White, YBN Nahmir, and Ziggy Marley, whoever these people are.
    * Pages like "Top Kickstarter Watches" and "Top Kickstarter Inventions"
    * A bunch of restaurants that I've never been to, but are in the same complex that I used to live in (thanks whoever sold/'shared' my email, literally probably my former real estate agent)

Last time I checked mine, it was pretty accurate, but I have a lot of the "PlayStation in <random country>" too!

I also see a lot of PlayStation related ads on my news feed... I haven't used a PlayStation since I got an Xbox one!

One can see it a bit like search, where it can be more important to exclude quickly than to include correctly. Those 8 ad targeting groups are potential matches, but what the data does not show is the hundreds of targeting groups which Twitter claims you are not in. This kind of fast classification is good as a performance/cost saving technique when the cost of an error is also low.

Also, one can be a high spender and frugal. A person who buys one pair of very expensive shoes and wears them for 10 years rather than buying 10 pairs of inexpensive ones would be such a person.

> Facebook data exports don't contain this data

Facebook does have demographics and targeting data. Go to Settings -> Ads -> Your Information -> Your Categories. I haven't found a way to export it.

On the categorization side, Facebook does a half decent job considering I don't upload anything on FB or any other sister sites.

Twitter, on the other hand, might be running an algorithm which tries to extrapolate too much from the available data.

I've never looked at the Google export, but you can see what topics they think you are interested in here: https://adssettings.google.com/authenticated?hl=en

Twitter is just entirely incompetent at everything, so no surprise they get their advertising product wrong too.

Those are just arbitrary names given to clusters of users. If your behavior and responsiveness is similar to others in the group, why should the advertiser care if the label isn't literally true?

That's not a failure of advertising. That's a failure of technology.

The fact that you choose to purchase certain items over other equally suitable items is largely because of advertising.

Facebook believes I'm African-American (I'm not)

To Facebook you might as well be, because you like content similar to others in an ad group labelled African-American.

How much do you use twitter?

> By and large, it doesn't, very well.

If that's true, it's not because ad tech doesn't work, it's that most advertisers don't use it properly. A large percentage of my company's revenue comes from performance marketing (running paid traffic to affiliate offers) and it works very well for us. You just have to know what you're doing, make sure you're not getting too much bot/junk traffic, and bid the appropriate amounts. We've written software that handles all of this automatically, and our paid traffic campaigns do amazingly well. Show me another investment in which you can reliably and consistently achieve 30%+ ROI weekly with a fair amount of scalability.

Ironically, native ad networks like Taboola and Revcontent will flourish in the new GDPR world. Since they target ads based mostly on nothing more than the topic of the content that the user is viewing, these ads are effective enough for smart performance marketers to make money with, and GDPR doesn't fundamentally change anything about their model. EU advertisers will flock to native and abandon other forms of advertising that were based on more invasive targeting.

Is that 30% confirmed by a hold out group? The cynic in me suspects most measures, nowadays.

I can't believe nobody is making a profit. I just suspect most of the trick is in generating demand, not finding it. That or diversifying offerings.

> Is that 30% confirmed by a hold out group?

I’m not sure what you mean by that. But generally we get about a 30-50% ROI on our native ad campaigns once they are optimized, and most affiliate networks pay weekly if you produce significant revenue. We do better than most because I wrote some clever stuff to identify sites that were sending us mostly bots/bad traffic and automatically blacklist them from showing our ads. There are a ton of these in the native ad space.

> I just suspect most of the trick is in generating demand, not finding it.

There are evergreen affiliate niches such as weight loss, hair loss, erectile dysfunction, etc. Aging, bad genetics, and poor lifestyle choices generate the demand for us. Making money in these niches is generally as simple as placing ads on sites where people that belong to certain known demographic groups visit, and reducing fraudulent traffic by as much as possible.

How do you measure the 30%? Compared to previous revenue or compared to a cohort you purposely don't advertise to, but is randomly selected from those you do?
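Since the hold-out question didn't land the first time, here is a rough sketch of the difference being asked about. All numbers and both function names are hypothetical and chosen only for illustration; the one thing taken from the thread is the headline 30% figure. The assumption is that the hold-out group is randomly drawn from the same audience and simply not shown the ads.

```python
def naive_roi(revenue, spend):
    """ROI if all revenue from exposed users is credited to the ads."""
    return (revenue - spend) / spend


def incremental_roi(exposed_rev_per_user, holdout_rev_per_user, exposed_users, spend):
    """ROI counting only revenue above what the hold-out group spent anyway."""
    incremental_revenue = (exposed_rev_per_user - holdout_rev_per_user) * exposed_users
    return (incremental_revenue - spend) / spend


# Hypothetical campaign: 1000 exposed users, 1000 spent on ads.
spend = 1000.0
exposed_users = 1000
exposed_rev_per_user = 1.30   # looks like a 30% ROI on its face
holdout_rev_per_user = 0.50   # what similar users spent with no ads at all

print(round(naive_roi(exposed_rev_per_user * exposed_users, spend), 2))
print(round(incremental_roi(exposed_rev_per_user, holdout_rev_per_user,
                            exposed_users, spend), 2))
```

In this made-up case the naive figure is positive while the incremental one is negative, because part of the revenue would have arrived without the ads. Whether that happens in any real campaign depends entirely on what the hold-out group actually does.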

That said, the niches you describe sound like snake oil markets. :(

If true, it makes a very amusing reversal from the usual story. Rather than peddling their wares to suckers, the placers of the ads have themselves been suckered into buying a product that doesn't work as advertised!

The more you think of it, the more it seems to be totally expected. It's the advertising industry we're talking about. They're not exactly a paragon of honesty. On the contrary, I suspect everyone tries to bullshit everyone else. The few observations I made from working next to people in marketing & advertising support that viewpoint.

EDIT: reminds me of that one story about A/B testing, which pretty much directly told us how adtech companies are bullshitting their own customers.


I highlighted the more interesting parts back then: https://news.ycombinator.com/item?id=10873226.

The rumor is that both the Cruz campaign and the Trump campaign found Cambridge Analytica's products to not work as advertised.

The SNP had an introduction to them, and the delegate reported back exactly that: not trustworthy and not a good idea to work with.

Read The Waste in Advertising is the Part That Works: https://www.researchgate.net/publication/4733724_The_Waste_i....

In it E. Ann Hollier and Tim Ambler make a strong case for what has been known from the start about branding: that mass media do it best, without performing at the personal level. One corollary for their case might be, "Not everything you value can be measured, and not everything you can measure has value."

Because digital advertising can be both targeted and measured, the whole advertising business decided that ads perform only when they are targeted and measured. But that's not advertising, really. That's direct marketing, which -- as I say in that post -- is descended from junk mail and a cousin of spam.

It's no accident that the $trillion or more spent on adtech hasn't produced a single brand known to the world.

The unanswered question is the one raised in The Problem With Targeting (https://www.dotcoma.it/2015/06/22/the-problem-with-targeting...), and pretty much everything Don Marti and Bob Hoffman (look them up) have been writing as well: is it possible for online advertising to brand products the way offline print and broadcast media could, and still do?

I suspect the answer is no. But my mind is open on the matter.

> By and large, it doesn't, very well.

Compared to what?

It may be easy to provide evidence for:

* Time and resource intensive prebiased word of mouth has a higher conversion rate.

* Conversion rates for mass advertising are very low numbers.

It is also easy to provide numbers for:

* Conversion rates for targeted advertising are significantly higher than non-targeted advertising.

* Having a non-existent marketing strategy for scale markets has a high probability of failure.

What is your position exactly?

Consider also some of what is implied throughout the article, that changes implied may price smaller companies out of the market. This doesn't bode well for competition, particularly for less funded competition. Small p&l organizations typically need to market to drive revenue, they're rarely in a position to push brand marketing strategies alone. Indeed this is a common huge difference between fine and coarse targeted platforms - when did you last see a small p&l trying to advertise on cable? What about on fine target platforms such as Facebook? Do you want such companies to become competitors to brand giants?

There are three main kinds of ads I see: (1) products I just bought, (2) products I would never ever buy, and (3) creepily manipulative clickbait. None of these should be worth the money to the advertiser.

I test drove a car and after I got almost a dozen follow up calls while I was busy with other stuff, I blocked the dealership number. They kept calling me from other numbers, which I also blocked (with the new Android spam filter on my phone). But the amusing thing was that I started seeing ads for the dealership online promoting their "low pressure" sales team!

I made fun of a company called Salsify for naming their business after oyster plant, a root vegetable I can't stand, and now I see ads for them all the time.

> I'm pretty sure that, the more we learn about how well advertising works, the less money people will be willing to pay for it.

That sounds like a statement from someone not very familiar with advertising.

Minimally it's contrary to the agency model & its incentives. A lot of spend (the majority?) goes through that model, and it's not likely to change any time soon. And if anything, GDPR will increase it.

That said, even as measurement & attribution improve, the reality is that the effect will be to shift spend around. The better an advertiser can identify waste, the better they can redeploy that spend. Because that improves the ROI, the less it hurts the margin, and the more it makes sense for them to spend.

>I think one side effect of the surge of IT into advertising is that it has become easier to measure exactly how well advertising works. By and large, it doesn't, very well.

If that was true, Coca-Cola and McDonald's wouldn't exist.

Coke's most successful marketing trick was a giant fluke. Even then, most of their success is probably in cornering the market, not just marketing. In large, many (most?) places sell it because they have an exclusive license that provides cups and supplies.

McDonald's is a funny one. They really don't seem any worse than most other fast food to me. I suspect they saw more expansion from their franchise methods than anything else, though.

Which is not to say the marketing isn't important. I just don't think it deserves the full victory.

I think you're doing a disservice to marketing - both of the elements you've mentioned actually fall under the marketing banner:

- Coke: their branding, market research, and consumer strategy got them where they are today. All are marketing functions.

- McDonald's: franchise go-to-market strategy, branding, consistency of product. All are marketing functions.

Whether or not paid advertising helped them is a whole different argument (and one I'd argue, as when 'tis the season, it's always the real thing...)

Then I contend that marketing is expanded to too large a term. And at that point, yes, it is responsible from everything including product strategy, but also worthless as a term. :(

Might as well debate that "business" is responsible.

Especially if the success is not repeatable. Which, most early practices of large companies today are not repeatable by smaller companies.

Marketing has always been defined by the 4P's: Product, Price, Place, and Promotion.

The "business" defines the problem-space and strategic direction (in these cases, food/drink), marketing defines how you address the space (specifics of the products from research and evaluation, market positioning, promotion to audiences), operations executes on the above (supply chain, training, maintenance), and then customer support feeds into all three of the above.

This GCSE Bitesize piece (revision material for UK exams students take at 16) outlines why marketing is more than simply advertising and promotion: http://www.bbc.co.uk/schools/gcsebitesize/business/marketing...

Advertising is a subset of marketing, but marketing is a broad term, and is often mistakenly limited in how it's used in arguments (usually to belittle the role of it).

That still seems too broad a definition. I get that it is taught to be concerned with the 4Ps, but those can clearly exist independent of a marketing department. Or any discernible marketing strategy.

Consider, as well, that the engineering department is also responsible for the Product. In large, they are also influential in the Price. Yet, I think it would also be unfair to give engineering full credit for a successful product. Even if it was well engineered or poorly marketed.

Which is to further my point that I was not trying to dismiss marketing as a valid area that a company should invest in. I just don't think it should deserve full credit for Coke and McDonald's existing. :)

Underestimates or overestimates? There is advertising that doesn't require tracking. Simple brand-awareness-building and content-based advertising is pretty profitable without requiring tracking. For a website, the lack of personally-targeted ads may be offset by increasing the number of ad slots. I wonder if there is an estimate of how much value tracking adds.

There is no reason to expect that ad spending will drop in total, because ads never worked very well in any medium, but they worked well enough that it's worth investing in them. The portion of ad spending on TV has not changed significantly in the past decades. What has happened is that spending has shifted from newspapers to online, but overall the volume keeps increasing in all media.

[1] https://cdn.static-economist.com/sites/default/files/images/...

Oddly, I think YOU are underestimating the size of the change coming, too. An upended industry won't go quietly, and I think we're about to see a lot of technology innovation to circumvent the new wave of privacy restrictions.

I think you're missing the point that's being made in the parent post. It's not that enhanced privacy will cause advertising to implode, it's that as the industry has gained more knowledge and understanding of advertising it's becoming apparent that advertising isn't worth what they thought it was.

I saw a similar up-ending of an ad-dependent industry 15 years ago with newspapers. Advertisers for years assumed a certain value based on the circulation numbers as reported by ABC. There was no real way to directly measure that value other than inferred activity within a geographic region that could arguably be tied to newspaper advertising. With the internet, advertisers began to be able to track specific value for ad campaigns based on click-thrus, cost-per-action, etc. Those values were lower than the assumed value of newspaper advertising, which began a shift in where money was spent. Combine that with the fact that ABC numbers turned out to be far less accurate than originally assumed, due in part to gaming by the papers themselves, and revenue started seriously shrinking. The last firewall that newspapers had was classifieds, which quickly folded in the face of job and real estate sites and of course craigslist.

Internet advertising has been suffering the same game-playing by sites that newspapers pulled with ABC circ numbers, and as advertisers gather more and better performance data about the value of ads and campaigns they're realizing that value is less than they thought and seems to be trending down.

Regardless of whether there's more or less privacy for users/consumers, the indications are that advertising is becoming less and less valuable.

advertising has never worked, but the delusion that it could work is strong - it seems like there’s probably a neverending supply of businesses who are going to throw some money down the drain, just in case they get lucky. It’s like buying lottery tickets - you don’t do it because of your great ROI, you do it because you’re having a fantasy about results

> I'm pretty sure that, the more we learn about how well advertising works, the less money people will be willing to pay for it. What's happening to billboards and newspapers right now is probably coming to several other industries soon.

I don't see any evidence to support that. Total spending on advertising is increasing every year. Newspaper spending is down, Google & Facebook is way up. Overall it's up.

Yeah, it pretty much doesn't work well. The unit economics is something like "pay 50 cents a click and hope your funnel can monetize that effectively". More if you have more specific/desirable targeting, less if it's broader, but anything you save on the targeting end you lose in funnel effectiveness. If you've got solid data all the way through to conversions and lifetime value, that can either justify dialing ad spend up or down, and it's usually going to be down.

I agree that as an aggregate we're probably massively overpaying for ads for little average effect, but there may be a game-theoretic reason why it does not go away. What if out of 100 competitors the top one gains a lot from ads but the 99 others lose? All 100 have to pay to play to try to be that one. If they cooperated as a cartel they could all pay less for ads and consumers could pay less for products but that is unlikely to happen.

There's a well-worn advertising quote "I waste 50% of my advertising budget. The problem is I don't know what half."

Most ad rates have dropped pretty dramatically since the early dot-com days, though. Are they still overvalued?

Advertising doesn't work that well if you look at the raw numbers. Assume you have a great campaign and awesome conversion funnel, so that you get 1% clicks and 1% conversion.

The math is 100 x 100, or one sale for every 10,000 eyes that see your ad.
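
To make that arithmetic explicit, here is a minimal Python sketch. The 1% click and 1% conversion rates are the ones assumed above; the function name is made up for illustration.

```python
from fractions import Fraction


def impressions_per_sale(click_rate: Fraction, conversion_rate: Fraction) -> Fraction:
    """Average number of ad impressions needed to produce one sale."""
    # A sale needs a click AND a conversion, so the two rates multiply.
    sale_rate = click_rate * conversion_rate
    return 1 / sale_rate


# 1% of viewers click, 1% of clickers buy.
print(impressions_per_sale(Fraction(1, 100), Fraction(1, 100)))  # 10000
```

Exact fractions are used instead of floats only so the result comes out as a clean 10000 rather than something like 9999.999.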

In a vacuum, those numbers are awful. If your entire marketing strategy is buying ads to convert to sales, then of course, advertising doesn't work, and really, you probably shouldn't be advertising in the first place.

The only thing that matters is ROAS. If you spend $1 and get back > $1 in return, it works.
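
As a rough sketch of that rule in Python: ROAS is just attributed revenue divided by ad spend, and the campaign "works" in the sense above when the ratio is over 1. The function names and the dollar figures are hypothetical, and attributing revenue to a campaign is assumed to be solved, which in practice is the hard part.

```python
def roas(revenue: float, ad_spend: float) -> float:
    """Return on ad spend: revenue attributed to the ads per dollar spent."""
    if ad_spend <= 0:
        raise ValueError("ad_spend must be positive")
    return revenue / ad_spend


def campaign_works(revenue: float, ad_spend: float) -> bool:
    # "Works" in the sense above: more than $1 back for every $1 spent.
    return roas(revenue, ad_spend) > 1.0


# Hypothetical campaign: $500 spent, $650 of revenue attributed to it.
print(roas(650.0, 500.0))            # 1.3
print(campaign_works(650.0, 500.0))  # True
```

This compares revenue to spend, as stated above; a stricter check would probably use gross margin instead of revenue.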

Also about half the market is branding, which is all about exposure rather than direct clicks or conversions.

While reading your comment. I imagined 10,000 one-eyed people. :-)

but what percentage of ad spend on the internet is actually ROI justified? I'd guess it's pretty low, right? I got the impression it's dominated by big brand advertisers that can't calculate ad spend ROI anyway.

The only way to eliminate all waste in advertising is to use an affiliate marketing model and that's the biggest cesspool of them all.

Affiliate marketing does not magically get rid of waste. It just transfers the risk involved to another downstream party, while buying a shield of deniability of the unethical and scam tactics commonly used by affiliates.

We are considering an affiliate model for a new side-product. Can you point out the worst issues we should watch out for?

(It's a virtual-goods product. Not subscription, but usually multiple-return customers. Around 70% gross margin since we have some high upfront provisioning cost.)

Pressure sales tactics, spamming and hijacking the last click are the things that really work in affiliate marketing

> I think one side affect of the surge of IT into advertising, is that it has become easier to measure exactly how well advertising works. By and large, it doesn't, very well.

I just don't understand the anti-social media agenda. The same people who for 2 years have been saying that social media advertising is so effective that it got trump elected are now saying it is ineffective. They demand that political advertising on social media be monitored because it is so effective that it gets presidents elected. Now the claim is that social media ad spending is so ineffective that these companies won't be around.

> What's happening to billboards and newspapers right now is probably coming to several other industries soon.

What's happening to billboards and newspapers is that ad money is flooding into digital space (where the young kids are).



Why speak so authoritatively about something you obviously know nothing about?

Desperate politicians making absurd rules when:

1) US election went “wrong”.

2) The same is happening to them

3) They’re mostly not hitting their own companies

Obviously it’s a lie. The “right” is gaining in Europe like crazy, not because of Facebook, but because the existing governments have seriously underperformed.

But politicians faced with the choice of admitting mistakes that they don’t even know how to fix, or to wildly and randomly strike at whoever got blamed ...

> The same people who

Is it really the same people? Do you have any examples of this?

"The same people who for 2 years have been saying that social media advertising is so effective that it got trump elected are now saying it is ineffective."

So does that mean you think that social media advertising is effective as people claim and Russian use of it got Trump elected?

Regardless of whether your "same people" is a straw man, the opposite of a set of inconsistent claims tends to also be inconsistent, so I'm not sure what you are trying to say here.

It won't do much. It'll definitely get rid of some low-value companies but that's about it.

People think that adtech has some incredible insights but 99% of data is terrible. If you check your profile at any major site, you'll quickly find that you're probably in several conflicting segments that have nothing to do with you. Context is still king for any marketer who knows what they're doing, but unfortunately the industry is overwhelmed with subpar talent and politics through layers of agencies that buy buzzwords like "AI" and "data". Barely any adtech companies have a direct link to a person, and any links that did exist are even harder now with adblocking, 3rd party cookie deletion, ephemeral mobile device ids, IP renewals, and other noisy and loose signals.

Interestingly, big companies like Facebook and Google that have user logins and 1st party data connections will actually benefit from the higher industry regulation and are not going to lose any of the data that users willingly give to them. ISPs are another major source of data, along with credit agencies and banks, and now giant ecommerce companies like Walmart and Amazon, all of which have very accurate and exhaustive histories and will see little change from GDPR. Overall it's well-intentioned, and good progress, but what it will do is vastly overstated.

What you're describing sounds very much like a bubble that is going to pop...

I said that GDPR will not affect much, other than a few companies that add up to a microscopic market share. Where's the bubble?

The bubble is those companies, although I guess the original article would argue that their market share is (currently) not microscopic.

It's hard to overstate how dominant Facebook and Google are in online ads.

I am hopeful that shifting from ads to paid services will reduce clickbait. Right now may be seen as the yellow journalism moment of the internet.

One surprising side effect of this might be that the hordes of machine learners and data scientists who used to work for adtech might go to healthcare/bioinformatics where we might get new breakthroughs.

A lot will just move to finance...

However, financial firms are usually pretty against open academic research that would allow publication of papers. Many researchers truly value the ability to contribute to this field openly. Perhaps you are right and they will move to health.

Dodd-Frank isn't repealed and AFAIU, it made taking investor money much harder than it used to be.

We can see the next most lucrative fields where ML can be applied by looking at Alphabet's portfolio:

    * Calico, Verily - healthcare
    * Waymo - self-driving cars

Another field that comes to mind is robotics, but Google liquidated its holdings related to this (Boston Dynamics) for some reason.

Hopefully some find their way into rocketry research to help further commoditise space travel.

I'm trying to. It's not simple, if you're not in the US or Germany :<.

Germany, eh? Interesting. What kind of companies in that field are in Germany?

I wish I knew more about this subject.

Too many to list. DLR - Germany's NASA - is a key player in Europe's space industry, so it draws all kinds of space companies to its neighbourhood.

EDIT: those two links might give you a quick overview:

- https://spaceindividuals.com/space-companies

- https://spaceindividuals.com/space-jobs?country=Germany (over 50% of all listed, but not surprising, since the site's owner is in Germany as well)

Right now, I'm trying to map the industry myself.

There are not that many actual hard data science experts in adtech. Much of it is smoke and mirrors with every vendor claiming "AI" even if it's just some excel tweaking. The industry can't even agree on standard metrics or run a campaign without 20% discrepancies.

Also you can definitely apply ML techniques to data without PII, it's just harder, which actually makes the job more in-demand for high-quality talent.

As someone who has just left the advertising industry, and was doing data stuff, the real issue is that a lot of companies have a lot of data but almost none of it has any actual useful information content in it. It’s a bit of a hard concept to explain to people because they go “but there’s a hundred something columns, and hundreds of millions of rows!” You can have all the data you want, but if it’s all non-indicative trash you’re not going to be able to usefully predict anything. But hey, you’ve hired a data scientist and added the cost of a big Spark cluster to your AWS bill now, so you may as well continue telling the market you’ve got some real hot stuff.

Yes, agreed. Lots of useless data out there that doesn't really have significant signals. Also a lot of it is duplicated with everyone getting the same logs, so I find it hard to believe that any single company magically derived a big advantage over another.

Not to mention, no amount of AI powered results can beat a slick sales team. $1M spent on quality salesperson is much higher ROI than $1M on fancy AI engineering when it comes to relationship-powered media spend.

In statistics, there's a classic quote by John Tukey "The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data."

This should be printed in large letters on the wall of anybody claiming to do data science.

They can still use sufficiently anonymized data under GDPR. They also will have data from people who have consented to have their data used. They will also have data from visitors who are not in the Union.

I would not at all be surprised if this is enough for the machine learning and data science people to be effective, especially at very high traffic sites like Facebook and Twitter.

I don't believe for a second that they can make the data "sufficiently anonymized". That is extremely hard, and pretty much every time "sufficiently anonymized" data gets leaked it turns out it wasn't sufficiently anonymized after all.

Anonymization won't really help because you can't store unique identifiers. IPs, emails, device identifiers, etc. and anything derived from them (no hashes of that data either) are all prohibited. So you can't really build a profile of someone if you've got no key to look it up when you need to target someone.
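
A small Python sketch of why a hash doesn't buy you anonymity here: the hash of an email is still a stable key, so it works for profile lookup exactly like the raw address does. The function name, the email address and the profile contents are invented for illustration; this shows the mechanism only and is not a statement of what the regulation's text says.

```python
import hashlib


def pseudonymous_key(email: str) -> str:
    """Hash an email address into a fixed-length hex string."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical profile store keyed by hashed email instead of the raw address.
profiles = {pseudonymous_key("alice@example.com"): {"segment": "cat owner"}}

# Anyone who later sees the same address can recompute the key
# and land on the same profile.
lookup = profiles.get(pseudonymous_key(" Alice@Example.com "))
print(lookup)  # {'segment': 'cat owner'}
```

The same input always gives the same digest, which is the whole point of a lookup key, and also why it still identifies a person.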

But that still prevents the entire targeted ad industry. If you can't hold any identifying information on a user that didn't opt-in to tracking, you can't show targeted ads to that user.

There's a lot more stuff to do with ML and make money than ad targeting

Adtech is the highest paid application of ML.

Based on what exactly? ML is not that common in adtech companies, many just lie about some basic regression testing. Some even just put up a facade in front of grunts with excel.

Even the most sophisticated ad modeling pales in comparison to fintech and bioscience/pharma where the real money is.

Maybe that’s true, but if the ML ad targeting industry disappears, several of those currently employed ML workers have little to no reason to exist.

Yikes! Why the hyperbole? I'm sure if someone is clever enough to learn and be productive with ML, they'll be able to pivot and be productive in another field and/or using other tools.

He's talking about the positions, not the people in them.

What did you have in mind?

To me, the best applications are going to be those by which the human and the ML'd AI will benefit each other by a greater degree than if each were on their own.

At my particular company in logistics, I can see many places where it can be applied- especially in the dispatching of deliveries and helping to set up a call with customer support.

Predictive maintenance I think will be big as well

Two "might" occurrences means it's unlikely. The chances are these people will be swallowed by whatever the most profitable industry is at the time.

My guess would be similar learning systems that aren't subject to GDPR, such as government, external to EU (US), the ludicrous information of things (smart sensors, cloud, etc) where information doesn't leave the user's control, AI driven products, etc.

Then you'll have quite a few who will just continue regardless, sometimes it's easier to apologize (and pay a small fine) than it is to ask for permission.

Don't get me wrong, I would love for all of these people to end up in the (Scientific) health industry. (It's worth mentioning Scientific, as there is a lot of useless crap that uses some form of learning and offers no real gain - would be disappointing to see them all end up applying deep learning to homeopathy for example.)

Not unless they're willing to take a paycut.

Well, advertising and marketing firms don't pay their tech talent very well.

You are begging the question.

> tracking people without their knowledge, approval or a court order is just flat-out wrong.

Requiring them to check an "I agree to be tracked" checkbox and signing an agreement (which happened to me just yesterday in an EU country in accordance with GDPR) before they can use a product/service is hardly much better. This reminds me of the Android app permission system which requires you to allow an app to do everything it wants (including ridiculous requirements like when a game requires access to your contacts list) or just give up the idea of installing it (as for me I just grant the permissions at installation time and then block everything redundant with XPrivacy). So I doubt it is going to do much good, the same way the cookie law doesn't really do anything but introduce useless cookie warning banners.

> Requiring them to check an "I agree to be tracked" checkbox and signing an agreement (which happened to me just yesterday in an EU country in accordance with GDPR) before they can use a product/service is hardly much better.

That's specifically not allowed under the GDPR. Either the information is needed to provide the service (and needed means actually needed, not "my business model depends on it"), in which case they don't need to ask, or the use of the service can't depend on that consent.

(By the way, even if the information is needed, they still need consent to use it in other ways, and the same applies)

See the ICO guidelines on the issue: "If you make consent a precondition of a service, it is unlikely to be the most appropriate lawful basis."


How is "my business model depends on it" not a bona fide legitimate purpose for information collection? Nobody is forcing anyone to patronize a business that relies on data collection for profitability.

I'm extremely skeptical of regulation that interferes with consensual deals between economic actors. You want transparency? Fine. But you don't get to randomly outlaw certain entire classes of business.

> How is "my business model depends on it" not a bona fide legitimate purpose for information collection?

Business models are arbitrary and orthogonal to services provided.

> But you don't get to randomly outlaw certain entire classes of business.

Why not? Some business models are clearly antisocial and don't deserve to exist. GDPR is only outlawing business models based on large-scale abuse of people's private and identifying information. If you can obtain proper consent from a large enough percentage of your users, then your business model will be fine. If that's a problem for you, then ask yourself why.

I find it funny to see people complaining that their business model will be in trouble under GDPR. GDPR literally only outlaws being a huge asshole (in context of users' data).

> Business models are arbitrary and orthogonal to services provided.

that doesn't change the fact that the business model depends on that orthogonal action. Is it unlawful for gyms to charge by subscription instead of per-use knowing that most people don't use them? Should the NYTimes be forced to go pay-per article instead of subscription? Should nightclubs not be allowed to overcharge the bottle?

> GDPR literally only outlaws being a huge asshole (in context of users' data).

It goes beyond that, by forcing you to serve people that don't agree to your business model.

> Is it unlawful for gyms to charge by subscription instead of per-use based on the model that most people don't use them? Should the NYTimes be forced to go pay-per article instead of subscription?

It isn't, and it shouldn't. Either can choose whatever works for them best. GDPR only outlaws a few particular antisocial behaviours, which makes a particular class of business models illegal or much less profitable.

The "but business model!" whining sounds a bit like complaining that you can't make money mugging people, because the government disallows theft and assault. Cry me a river.

They can charge any subscription or per use fee that they want, but not if the payment is personal data not necessary for the service.

> Business models are arbitrary and orthogonal to services provided.

That's a very interesting claim. So you're saying it'd be possible to organize fighter jet production as a co-op vegan collective?

Sure it is. How effective this business model is is another story.

Either way, you don't have an inherent right to specific business models and the EU is simply not allowing a business model anymore that has been widely abused. Of course it'll hurt some but I think overall the industry will adapt and change for the better.

One example might be the provision of a ‘free attraction’ service in exchange for surveillance capabilities, and the business model being various unrelated ways to exploit those capabilities (targeted ads, selling the raw data, selling products derived from the data, etc).

This has become the default business model of the B2C Internet, as distinct from offering a service in exchange for payment.

If you can find enough vegan aerospace engineers and have them liquidate their savings for the good of the collective, then sure, why not :).

Maybe "orthogonal" was a bit much. But almost always there's plenty of wiggle room to choose alternative business models.

You wouldn’t think that militant orders of Christianity would have been workable, but they clearly were. I have no doubt that money and power can lead any ideology far from what we’d associate with their base principles. I doubt that vegans are the exception.

There are a lot of companies that are collecting data about you and me that we aren't patronizing. For example, even if you don't have an account at Facebook, they have a profile of you. Even if you don't do business with Equifax, they are collecting your data.

People want more control over how information about them is used.

Personally, I am extremely skeptical of entities where the only motivation is pure profit, and I personally would like to organize to protect myself against these entities that don't align with my civic concerns.

> consensual deals between economic actors

On the web, most people don't understand the scope of tracking and privacy. None of the people I know understand what these web companies are doing, how they are being tracked and what data companies have about them. Nobody reads the privacy notices on those websites. Even after someone tells them Facebook is tracking everything, they don't understand and just ignore it. There are no consensual deals on the web. So GDPR is for all of them (and for people who understand and want protection).

Also in America a company is more important than people. That's why there is so much negativity towards GDPR.

I visited a drugstore to buy some vitamins yesterday. They handed me a touch-screen where I had to check a checkbox (saying that I agree to allow my medicine shopping history to be stored and analyzed to track my health, which I obviously don't actually want them to do) and an electronic signature tablet with a stylus where I had to put my signature. This was a mandatory condition for continuing to use a discount card that gives bonuses when you buy meds. I could opt out, but this would mean I won't be given discounts any more.

Although it initially seems kind of scummy, isn't this better than the majority of ad services anyway? In exchange for selling your medical data, you get a discount. It's essentially getting paid for sharing your data.

Of course if it's essential medication which people can't afford without the discount then that's another matter, but otherwise it seems kind of fair to me. And it gets a bit more complicated if you're in America where that data is going to be used by health insurance to adjust your premiums.

That's a good question, I don't know if it's valid to offer discounts and such in exchange for consent. It goes against the EU principles ("personal information cannot be conceived as a mere economic asset"), but I'm not sure if the law actually prevents it.

One of the criteria for freely given consent is that the customer must be able to revoke it at any time without detriment.

If revoking consent causes a detriment, then it's not freely given, and so that "consent" isn't sufficient to grant the data controller a legal permission to use that data.

Quoting recital 42 from https://gdpr-info.eu/recitals/no-42/ "[...] Consent should not be regarded as freely given if the data subject has no genuine or free choice or is unable to refuse or withdraw consent without detriment.", so it's quite explicit.

I think a minor discount would probably fly (under 15% or so), or at least be defensible to the regulatory bodies.

Bigger discounts would be a problem since that would be more of a detriment.

Thanks, I missed that recital. It's confusing that the phrase you cited isn't in recital 43.

> personal information cannot be conceived as a mere economic asset

I wonder why not? Personal information is useless for most people; they give it away for free to state institutions, and the police won't even ask for your consent. Some websites and services have found a way to make money off it, in exchange for free services etc. Why is this an ethically unacceptable proposition?

Because it ends up abused by marketers, and frequently ends up facilitating crime as well. GDPR as a law came out directly from the last decade of mass surveillance economy.

So what? It was part of the agreement when you consented. The same way that when you go to a club, a strip club, or a casino you know you 'll be exploited in a way. Sounds like advocating for overreaching laws that "save you from yourself" and keep you away from harm / drugs etc.

It's a bit ironic that you brought up mass surveillance because GDPR explicitly exempts the police and security services from its reach.

Edit (i can't post a reply): - consent should always be required, but if you don't consent i shouldn't be legally required to service you. GDPR is more than just consent hence the overreach

- The NSA has more data than any single actor on the internet, we can't possibly claim that private surveillance is worse. The NSA may have a better profile of me than any private actor even though (and especially because) i m not american. And their profiling can harm something that businesses generally don't care to harm: my freedom

> It was part of the agreement when you consented

I didn't consent to shit. GDPR is the response to rampant abuse of personal data without obtaining proper consent. You're still allowed to use my data if I consent, you only have to obtain an actual, informed consent.

> The same way that when you go to a club, a strip club, or a casino you know you 'll be exploited.

What kind of clubs are you visiting? :o. Are you sure they're legal?

> It's a bit ironic that you brought up mass surveillance because GDPR explicitly exempts the police and security services from its reach.

It isn't, because adtech surveillance dwarfs government surveillance. Also, the police and security services are doing something valuable for me, even though they do it imperfectly. Advertising industry exists only to fuck me over. It's a cancer on society.

> It was part of the agreement when you consented.

That's a big point in GDPR, I think -- there never was consent. It's the same as why terms and conditions aren't legally binding: nobody actually considers there to be a valid agreement when they click next. In a sense GDPR is just enforcement of people's expectations, and ending predatory practices that were misusing them.

Replying to your edit:

> consent should always be required, but if you don't consent i shouldn't be legally required to service you. GDPR is more than just consent hence the overreach

Yeah, and in a world where companies were not abusive, it would work that way. As it is, we both know perfectly well what happens - companies have leverage over users, and they'll use it. They'll make you consent to every kind of data abuse and sharing to use the service, exploiting the fact that giving up privacy doesn't feel like it's hurting at the point the data is being taken. GDPR is designed to remove that leverage - to make it impossible for companies to extract arbitrary consents under the threat of refusal of service.

This only really affects you if your business model was baiting users with "free" services, spying on them, and selling that data to adtech industry.

> The NSA has more data than any single actor on the internet, we can't possibly claim that private surveillance is worse. The NSA may have a better profile of me than any private actor even though (and especially because) i m not american. And their profiling can harm something that businesses generally don't care to harm: my freedom

Sure, so NSA may have pulled in your e-mail history at some point in time. But it's mostly sitting there. NSA doesn't care about you unless you make yourself important to US national security. Adtech surveillance, on the other hand, tracks you constantly, through pretty much every device you have, every site you visit, and makes use of your data all the time. And all in all, this data might at some point find its way to NSA too, already nicely packaged. NSA vs. adtech is kind of like choosing high potential loss but very rarely, vs. low loss all the time. I'd say the expected loss is worse with adtech, but I'm still happy GDPR will make life difficult for both.

Legitimate question; if you're not American, why do you think the NSA would be impacting your individual freedom, or perhaps the freedom made available by the state(s) of your citizenship(s)? Or were you more referencing that it is violating your privacy?

There is international law which allows the US to affect me and my freedom even in my home country. It would be up to the local courts to decide. And of course it matters when I visit the US, as does my freedom to do business with US companies. Also, in this case the tracking is both without my consent and without the protections that American law provides to Americans.

> Sounds like advocating for overreaching laws that "save you from yourself" and keep you away from harm / drugs etc.

I understand you probably move in tech/libertarian circles so it doesn't seem like this, but the majority of the world population is in favor of laws protecting people from themselves and keeping them away from harm.

Now what is overreaching or not is a matter of opinion, and hence politics.

I might be wrong, but I don’t think that this is allowed under GDPR. They have to offer to you the same services with or without the consent to collect your data.

Also, under GDPR, you can always request export and removal of all your profile data from their data stores.

Well, I guess that answers the questions "how much is my privacy worth to me" and also "how much does the society i live in value privacy" quite succinctly.

So they have been storing and analyzing your health already and now they need your signature?

Sounds like something I'd opt out of, then request all past data and have it deleted afterwards.

I think a lot of people/companies have been misled into thinking that the GDPR is just another type of cookie law and the same solution will work - get them to ok it by having an obtrusive message at the top of the site. They are in for a rude and costly awakening.

I wonder if there will ever be a suit in which that cookie banner ("Got it!" ugh) is found not to apply because users basically agree to it simply by seeing it. You have their stupid cookie by the time you can read that message. That doesn't seem like an agreement to me, but I'm not even slightly a lawyer.

The whole strategy seems to be in bad faith. I wish more sites would react by minimizing usage of cookies instead of just adopting that dumb overlay.

Completely oblivious and ignorant here:

If a company has no official office in Europe, how does this affect them? All advertisement and business focus is say only in the US, is it business as usual? What if an EU citizen decides to sign up?

Are US companies forced to deny customers not part of, say, an IP block (half-assed method I know, but just speaking in general)?

The guidance we have received is that if your business is not located in the EU, not targeting EU countries, you don't have parts of your site translated to languages spoken in EU countries, and do not have an EU-based domain name extension, your EU traffic is considered incidental and GDPR probably does not apply to you. But, as with everything in the GDPR, that is extremely murky. This law is subject to unique interpretations and degrees of enforcement in the courts and regulatory offices of 28 distinct countries. This is why we simply chose to block EU traffic - there's no need for us to take on the liability of EU traffic, and hope that some country over there doesn't need fines from us badly enough to decide that article 484208408 makes us subject to it.

With regard to enforceability outside the EU, that is anyone's guess. If you're in the US, there are already mechanisms that allow for the domestication of EU judgments in the US. Once domesticated, the judgment would have the same force and effect as if it had been issued by a US judge. However, the treaties that allow this are very complex, and allow for a large number of exceptions. So it would be up to a US judge in each specific case to decide whether or not a judgment for a fine issued under the GDPR can be domesticated. There are currently no treaties specifically relating to GDPR in the US, and I'd imagine there would be (very welcome) strong opposition to such a thing.

Language is a horrible benchmark.

French in Canada, English and Spanish in the US, German among German expats (of which there are millions.)

Targeting a business for extortion because of the languages offered? Ridiculous. GDPR should only apply to businesses with a physical nexus in Europe, anything else is an attempt to assert extraterritorial jurisdiction.

Europeans don’t have to visit US/Canadian/Chinese websites. If they want to “protect” themselves, they simply stop using services they find objectionable. GDPR is nonsense — individuals should be allowed to do what they feel is right for them.

Why not ban all junk food from Europe? Tobacco? Alcohol? Those harm people far more than targeted advertising. If we actually “cared,” we’d be banning those industries.

GDPR is nothing more than a trade barrier.

If you have a site in an EU language that is spoken in other countries as well (Spanish etc) but your site doesn't have an EU domain extension and your content and services are generally not targeted at EU residents, then GDPR likely does not apply. Language isn't the only test. If you have a site in German that is reporting German news though, you're likely going to have GDPR apply to you even if you have a .com extension and are not in the EU.

Unfortunately, I must use words such as "likely" here because there is a large amount of ambiguity in these tests, along with a major conflict of interest - it will essentially be up to the would-be beneficiaries of these fines to determine whether or not you are subject to them. The EU HN crowd seems to believe that their various governments will only fine "bad" companies "reasonable" amounts under this law, and that it will not be abused to extract government revenue from foreign companies and/or hobble foreign competitors of companies in their countries. I certainly hope they are correct, but this would be the first time in the history of the world that such a broadly worded statute was not abused. The only safeguard we have is that the world is watching. If/when the EU gets too out of control in their abuse of GDPR, hopefully countries like the US will implement legislation that makes it impossible to enforce GDPR fines within their borders.

Pakistan was once considering issuing an arrest warrant for Mark Zuckerberg because someone created a Facebook contest that offended some Pakistanis [1] [2]. The case would have carried a sentence of death by stoning. Even if charges had been filed, it is doubtful that the US would have extradited him to be stoned to death under the laws of another country. While GDPR fines are civil in nature, this case underscores the importance of not necessarily allowing the enforcement of other countries' laws in your own. If GDPR enforcement becomes abusive, one would hope that similar protections would apply in our home countries.

[1] http://www.adweek.com/digital/could-mark-zuckerberg-face-a-p...

[2] https://tribune.com.pk/story/342031/blasphemy-arrest-mark-zu...

Don't worry, I am sure it won't be abused. The whole fearful effect was created by adtech companies trying to create public outrage and paranoia to protect their money source. Be sure that there are going to be some nasty penalties for the greatest violators (like online dating sites). I have noticed a lot of them are packaging GDPR into terms and conditions and privacy policy changes, and those will have to get a fine, but I am sure a warning will come first. But as someone from the ICO (UK?) said, "We will always try to use carrot instead of stick, but some companies are carnivores, they don't eat carrots."

Also, I am sure half of the world will have similar laws in the next few years. Private data invasion just went too far; this has to be stopped.

Don't worry, I am sure it won't be abused.

It would be the first time in history that such a law has not been abused. The fear is real and fully warranted.

I am sure a warning will come first

There is no mandate written into the GDPR requiring warnings before fines, nor is there anything preventing multimillion-dollar fines for first-time, minor violations.

It's not a law, it's a regulation, and EU regulation is largely intended to be more of a carrot before they unpack the stick.

See the Smartphone Charger regulation. It requires all smartphone vendors to come up with a standard for charging; everyone picked microUSB (though they are moving to USB-C now). The EU is fine with that, and the smartphone vendors know that if they start pulling the "everyone has their own port" shit again, the EU will get out the stick.

Nobody wants the stick. Not the EU and not the vendors. The carrot was the EU Cookie law, which was largely ignored and the consent dialogs poorly implemented (not even asking for consent the majority of the time). So this is them getting out the stick. Now you can pick which one you want.

>There is no mandate written into the GDPR requiring warnings before fines, nor is there anything preventing multimillion-dollar fines for first-time, minor violations.

Art. 83 of the GDPR details this. Art. 78 details what rights you have against them imposing a fine.

Now you can pick which one you want.

I don’t have to pick either. My company is not subject to the GDPR, and we will never put ourselves in a position to be subject to it. I will not be dictated to or threatened by a foreign government.

Art. 83 of the GDPR details this. Art. 78 details what rights you have against them imposing a fine.

People keep saying things like this, and yet neither article a) requires that a warning be issued before they seek a fine or b) limits fines in any way, except for a top cap of $10 million/$20 million (or percentages of revenue, but the caps are more than 100% of the revenue of most companies).

I would love for someone to just say “yes, technically there are no required warnings or limits other than the $10/$20 million”. Because that’s the only true statement that there is about GDPR fines.

>I don’t have to pick either. My company is not subject to the GDPR, and we will never put ourselves in a position to be subject to it.

Canada, Japan, some other countries, and even the US have indicated they will copy the GDPR, if not in letter then at least in spirit, though the US response is a lot weaker.

>I will not be dictated to or threatened by a foreign government.

The US is a foreign government and does it all the time to me, why is it a problem now?

>I would love for someone to just say “yes, technically there are no required warnings or limits other than the $10/$20 million”. Because that’s the only true statement that there is about GDPR fines.

You won't have that. The GDPR has a strict guideline on how to impose fines; it's not a law and won't be enforced as such. The regulatory bodies have bite because large players like Facebook or Equifax that leak large amounts of user data require more than an angry letter in their mailbox.

As these articles mention, the agency imposing a fine should severely think about the level of fine and ensure it's appropriate. If you get hacked by a 0-day, you followed the advice of your regulatory body, your shit gets leaked and you inform your users immediately, it's very unlikely anything will happen.

If you get hacked because you didn't update your MySQL server in 5 years, you ignored what your regulatory agency said and you don't tell your users, don't expect them to go easy on you.

Easy as that. If you don't like it you can sue back and get the fine reduced or rescinded.

People keep saying all of this. Again, there is absolutely nothing enshrined in GDPR limiting fines, other than $10/20 million. It says they should consider some things when determining the fine. But (for example) one of the 28 countries could decide that in their country, the lowest level fines are “only” $5 million, and they go up from there based on the factors they are supposed to consider. That would still be enough to destroy most businesses.

You cannot tell me that there is anything limiting the fines (other than the cap) because it isn’t written. You’re saying that you hope and think that each of the 28 governments involved here will be reasonable, but in truth you have no way of knowing, and they have every incentive to not be reasonable.

I hope that my government won't do this. As an EU citizen I only have to care about the one in my country.

Again, if you think the fine you got is too heavy you can escalate this to the courts (even EU courts).

There is also no incentive for the regulatory agency to impose such fines if the business cannot pay them. In that case they would get less or even nothing as the business collapses and it has not been the modus operandi in any EU regulatory body I know or experienced.

If they aren't reasonable, then the EU courts will make them reasonable or the EU will add additional paragraphs to the GDPR to prevent excessive fines. Simple as that.

> The fear is real and fully warranted.

Much like the cookie law or CAN-SPAM?

>GDPR should only apply to businesses with a physical nexus in Europe, anything else is an attempt to assert extraterritorial jurisdiction.

It covers the personal data of EU citizens. Similar laws exist going the other way. Betfair can't (or couldn't) give accounts to US citizens. IIRC, various poker sites had to close US citizens' accounts. The US even arrested a CEO of a UK company who was only changing planes on his way home to Costa Rica: https://en.wikipedia.org/wiki/David_Carruthers#Arrest_during...

>anything else is an attempt to assert extraterritorial jurisdiction.

Good. The EU should grasp the nettle and fulfil its role as the leading global hegemon.

It covers the personal data of EU citizens

You're both correct and incorrect. It covers the personal data of EU citizens. However, not all sites are actually subject to the GDPR at all. EU traffic to these sites is considered incidental and no GDPR protections apply, even to EU residents, on those sites that are outside of GDPR jurisdiction. There are legal tests built into the GDPR (which I detailed in my original comment above) that determine this.

Targeted advertising is not being banned, just regulated. Tobacco and Alcohol are already highly regulated. Junk food is being increasingly regulated https://en.wikipedia.org/wiki/Sugary_drink_tax#Countries

I work in the USA, as a sysad. The company I work for has a social media product.

We've had European and African citizens sign up. And that was more than enough for us to discuss "How do we make our stuff comply with the GDPR?". If we ever considered starting up in Europe, ignoring the GDPR would be tantamount to writing them off before even thinking of them.

We also do things the right way. Deletion requests aren't treated as "ignore kthxbai", but all data is zeroed out then nightly purged from the DB. And I really think, with how current society is slowly turning against orgs like Facebook, our approach is one of the right ways to go.
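For anyone wondering what "zeroed out then nightly purged" might look like, here is a minimal sketch using Python's built-in sqlite3. The table layout, the column names and the `pending_purge` flag are all assumptions made up for illustration; the comment above doesn't say how their system actually does it.

```python
import sqlite3

# Hypothetical schema, invented for this sketch.
SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL,
    display_name TEXT NOT NULL,
    pending_purge INTEGER NOT NULL DEFAULT 0
);
"""

def handle_deletion_request(conn, user_id):
    """Step 1: blank the personal fields right away and flag the row."""
    conn.execute(
        "UPDATE users SET email = '', display_name = '', pending_purge = 1 "
        "WHERE id = ?",
        (user_id,),
    )
    conn.commit()

def nightly_purge(conn):
    """Step 2: hard-delete every flagged row; returns how many were removed."""
    cur = conn.execute("DELETE FROM users WHERE pending_purge = 1")
    conn.commit()
    return cur.rowcount
```

Blanking first means the personal data is gone as soon as the request comes in, even if the scheduled purge job is delayed. Backups and replicas would need their own handling, which this sketch doesn't cover.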

If you don't have money, materials, or employees flowing through Europe, there is very little they could do to you even if you actively flouted the law. They could always block access to your site, or even try to sanction your business, but unless they are willing to invade your country to physically stop you they don't have very many options.

There is always the chance that your own government will enforce EU rulings against you, but at that point either your own government thinks GDPR should be enforced or you're in a very weak country and are going to have to capitulate to the EU's power anyway, much like small Latin American countries were forced to follow US policies.

If another company with facilities in the EU then buys the noncompliant company, could they enforce a judgement on the parent company?

If a noncompliant company is going through due diligence prior to being acquired, would they be legally obligated to disclose that judgement? Even if they didn't, how hard would it be for an associate at a law firm to check public records about the company?

Yea they could, but this is basic politics and sovereignty. Nations are allowed to make rules for operating inside themselves. You don't just get to ignore all of those rules when working inside them because you said your home is another nation.

The other side of that is that you can tell foreign governments to fuck off if you aren't dealing with them at all. The only time the foreign governments matter is if they are a superpower able to bend your own country to its will, and at that point you are basically a colony anyway so there's not much you can do.

You can totally tell foreign governments to fuck off. I'm just speculating that doing so would affect the value of your company.

The more likely problem will have to do with a US or Chinese company buying an EU company (given the scale imbalances in question). It'd be critical to maintain the existing GDPR compliance and keep the entities separate if the US or Chinese operations are not GDPR compliant per their domestic businesses.

Alibaba for example likely has no plans to concern itself with GDPR compliance in its domestic Chinese operations. They obviously will segment and comply with GDPR as it pertains to the EU operations / EU customers.

You're right, if an EU resident decides to visit your website and you're tracking them (Google Analytics etc.) you have to comply with GDPR when handling his/her data. The IP block has a logic to it because the law applies to people in the EU rather than EU citizens.

Really? I was under the impression that it applies to EU residents, regardless of where they are accessing the website from.

It's the other way around. The law protects according to point of access (EU soil), not according to nationality. So an American tourist in France would be protected, but not a French tourist in the US.

This is just like most laws, when you're a tourist in a foreign country you have to follow the local laws, not the ones from your passport country.

This is technically true, however if the site is not targeting EU residents, the traffic is supposed to be considered incidental and the GDPR is not supposed to apply.

As long as you don't target EU customers, you're fine.

Do you have any references/citations for this?

Would help me out! I'm trying to put together a one-pager for my team.

When the regulation does not apply

Your company is a service provider based outside the EU. It provides services to customers outside the EU. Its clients can use its services when they travel to other countries, including within the EU. Provided your company doesn't specifically target its services at individuals in the EU, it is not subject to the rules of the GDPR.



And “target” is such an arbitrary idea. Just existing could be argued as trying to target.

The standard should be: “do you have a physical nexus in the EU.” That’s it.

It's either "target", or it applies to all EU customers. "target" is far less of a problem for companies that don't consider their customers' data to be important.

And that’s on top of the main problem: tracking people without their knowledge, approval or a court order is just flat-out wrong. The fact that it can be done is no excuse. Nor is the monstrous sum of money made by it.

I use Piwik (https://github.com/matomo-org/matomo) and track visitors without their knowledge or consent, because I need analytics. Piwik is also configured to respect the "Do not track" header, so opting out is as easy as indicating that you don't wish to be tracked.

Is that wrong?

And I use Piwik precisely because it's self-hosted. I know for a fact your data doesn't get sold, because the data never hits any server except mine.

If this seems acceptable, it's also why legislation seems worrisome. There are a lot of corner cases that law tends to overlook. But hopefully the requirements won't be too onerous.
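To make the "Do not track" opt-out concrete, here is a rough sketch of the server-side check. This is not Piwik's actual code; the function name and the plain-dict headers are assumptions for illustration. The only fact relied on is that browsers with the setting enabled send the request header `DNT: 1`.

```python
def tracking_allowed(headers):
    """Return False when the request carries 'DNT: 1', True otherwise.

    `headers` is assumed to be a plain dict of request header names to
    values; header names are compared case-insensitively.
    """
    for name, value in headers.items():
        if name.lower() == "dnt":
            return value.strip() != "1"
    # No DNT header at all: the visitor has expressed no preference.
    return True
```

A request handler would call this before recording anything beyond the bare hit. Whether an opt-out signal like this is enough under the GDPR is a legal question this sketch doesn't answer.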

I don't see how self-hosting has anything to do with it honestly. This is about tracking people without their consent. It's about preserving rights of individuals. Self hosting is no more or less acceptable than third party hosting. It's not a technical issue.

So is your position that website owners should have no visibility at all on their visitors, and no way of knowing how many people are using it?

They can know how many people, as long as they don't store personal information. Keeping a counter doesn't require consent.

Keeping a reliable counter of meaningful interactions is hard without personal data to correlate unique users.

> Keeping a reliable counter of meaningful interactions is hard without personal data to correlate unique users.

It's hard if visitor numbers directly translate into revenue or similar. The trouble you run into then is telling organic hits apart from click farming.

If all you care about is when, where, and what content is popular on your website, there's a pretty simple method: tally 200 and 304 responses. 200 tells you how many visits you get. 304 tells you how often people hit refresh, or revisit your page within the expiration time of the URL that 304s.
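The 200/304 tally can be sketched in a few lines. This assumes the server writes Common or Combined Log Format (the default for Apache and nginx); the regex and the function name are made up for illustration. Note that it never looks at the client IP or any cookie, only the path and the status code.

```python
import re
from collections import Counter

# Picks the request path and status code out of a Common/Combined Log Format
# line, e.g.: ... "GET /post HTTP/1.1" 200 5123 ...
LOG_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

def tally(lines):
    """Count 200 and 304 responses per path; every other line is ignored."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if match and match.group("status") in ("200", "304"):
            counts[(match.group("path"), match.group("status"))] += 1
    return counts
```

Feeding it an open log file, e.g. `tally(open("access.log"))`, gives a per-page counter keyed by `(path, status)`.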

Also, there's little value in identifying individual visitors. Getting a coarse idea of where visitors are located in the world might be nice (for a regional news outlet, for example). So just slap some coarse-grained GeoIP on it.

Ok, so it’s hard. Nobody is entitled to that being easy in the way that they are to privacy.

What privacy? Knowing that user dhsidhujdjdowhdyyehheis is using my website is not an infringement on your privacy!

Why do you think you can control what I do with data you send to me? Don't send me data if you don't want me to have it.

Why do you think you can control what I do with data you send to me?

Are you really interested in rehashing this conversation? You got plenty of answers last time¹, I doubt you'll get new ones

¹ https://news.ycombinator.com/item?id=16957978

Obviously no one said anything convincing. Sure, it's law under the GDPR and I'll comply. That doesn't mean I think it's a reasonable law.

I'm just saying there's no point starting over, you'll just get the same kind of responses. You should preempt them if you're hoping to get anything new. Though I doubt you'll get it anyway.

Why do you think I can’t photograph your face?

Don’t reflect light in my direction if you don’t want me having pictures of you!

> Why do you think you can control what I do with data you send to me?

Legal protections, like the GDPR.

I agree with this perspective. I haven't seen anyone present a compelling case as to why a person should own all data about themselves.

You would if you could see what the military does with data about people.

Website owners can have visibility on their visitors if those visitors explicitly consent to it.

Not sure what is so difficult to understand about this.

So if I don't think that Walmart should be able to record me with security cameras while I'm in the store I should have the right to demand that they ask me to sign a waiver before entering the store? Shouldn't it be Walmart's right to do what they want on their property, and my right to decide not to visit Walmart if I don't agree with that. Isn't the converse an infringement of Walmart's rights?

GDPR has exemptions for security. You're free to track information for the purposes of blocking vulnerability bots, but only the minimum data required for that purpose... and your visitors' data cannot be used for other applications without their consent.

> Shouldn't it be Walmart's right to do what they want on their property, and my right to decide not to visit Walmart if I don't agree with that. Isn't the converse an infringement of Walmart's rights?

No. Property "rights" are secondary to human rights. Like, Walmart can't knowingly sell poison as food just because it's their property...

In your example, Walmart is free to record you on security cameras for security / theft purposes. However, they can't record what you're looking at and reuse that information for targeted advertising without consent - profiling is simply not required to do business, so your right not to be profiled wins.

> “they can't record what you're looking at and reuse that information for targeted advertising without consent”

What law prevents them from doing this?

The actual Walmart? Nothing. The previous poster used Walmart's security cameras as a strawman argument against GDPR, which I expanded on.

Why are people obsessed with companies rights? Do they not have enough already?

There's an asymmetry between what a company can do politically/legally given its resources and what an individual can do. This is why countries generally have some kind of laws protecting consumers.

Security cameras recording footage and it not being used in 99.999% of the time when no crime occurs is fine. The tapes aren't kept forever. Just as having server logs to identify malicious actors i.e. hackers or scammers is fine. What's not fine is e.g. running facial recognition on the security camera footage, or figuring out who bought what (cough Amazon Go).

Sometimes, guaranteeing freedom for one party requires taking away freedom/rights from another party. For example, freedom from slavery means taking away the right to have slaves. Society has to make a choice: which one is more important? In the case of GDPR, the EU has chosen: the privacy of citizens is more important than business concerns.

I'm not 100% sure, but I believe the cameras at Walmart are for security only. I don't believe Walmart is using facial-recognition technology to identify visitors, track their behaviors (which aisles they visit, how long they look at specific products, etc.) and then passing that on to other retailers so they can target those visitors with ads.

OK, it's the petrol station part of the supermarket, but

"Tesco is set to install hi-tech screens that scan customers' faces in petrol stations so that advertisements can be tailored to suit them, it has been reported.

The retailer will introduce the OptimEyes screen, developed by Lord Sugar's Amscreen, to all 450 of its UK petrol stations, in a five-year deal, according to The Grocer.

The screen, positioned at the till, scans the eyes of customers to determine age and gender, and then runs tailored advertisements."


Consent isn't the only lawful basis for processing personal data.

> It's about preserving rights of individuals.

I'm not sticking up for all the big data perverts, but what about an individual's right to speak freely and disclose information they have observed/recorded? The 'right to be forgotten' seems at odds with everyone else's right to remember and disclose occurrences.

That is not the case. The "right to be forgotten" is not absolute. There are lawful bases for processing under which the right to be forgotten does not apply.

Go read GDPR article 17.3, it's easy to read. A few things for which the right to be forgotten does not apply:

- exercising the right of freedom of expression and information

- for compliance with a legal obligation (e.g. keeping records for tax reporting)

- for public interest reasons related to health, science, historical research

Seriously, there is too much FUD about the GDPR.

Yes! The fud keeps increasing every time another gdpr related post pops up here.

I'm 60% sure it's people who read articles from Adtech Corporations about GDPR and not from people in the EU having to deal with it.

The only bad thing that I've noticed about GDPR is that some niche sites that rely on ad revenue are getting fucked over by Google (if you turn off personalized tracking for your visitors you still need the consent to track, there is no difference) so their income might break down.

That's quite sad but on the other hand they're exploring alternative methods of income and I'm certain adtech will adapt.

I doubt it's only that. I think it's also the tendency of people nowadays to only read titles and not to investigate further. I think some people are triggered by the phrase "right to be forgotten" and they jump to conclusions without finding out what it's really about.

The EU cookie law was almost the same issue, barely anyone implemented it and those who did did so poorly or incorrectly.

This time it bites back tho.

Thanks for the direction. I've read the GDPR a few times but it is (intentionally) non-specific on many implementation details. As I alluded to in my comment, 17.3(a) seems like it would work as a global exemption in practice. I'm not familiar with how court rulings that provide legal precedent have impact or are respected across the diversity of legal systems across the EU. By May 25th 2020, after a multitude of GDPR related cases and verdicts, the answer may be more clearly defined. (Or increasingly nuanced with even greater uncertainty ;-)

GDPR also doesn't forbid individuals from doing pretty much anything. Article 2(2)(c): "This Regulation does not apply [...] by a natural person in the course of a purely personal or household activity;" - you can speak freely and disclose information you have observed/recorded, but the business you are running can not.

And with application layer DDoS attacks and intrusions: surrender? Do not ban bad actors? Or outsource to Cloudflare that keeps an army of EU lawyers and is compliant?

Nah, that's a legitimate interest (the example given in Recital 47 is preventing fraud, which is similar). You can process it as long as you only use it for that purpose and delete it as soon as possible.

Are you also against access logs?

Only if you’re not scrubbing IP addresses and other PII within a reasonable window of time.
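One common way to "scrub" an IP address is to truncate it so it no longer points at a single connection. The sketch below uses Python's standard `ipaddress` module; the prefix lengths (/24 for IPv4, /48 for IPv6) and the function name are assumptions chosen for illustration, and whether truncation alone is enough under the GDPR is not something this thread settles.

```python
import ipaddress

def anonymize_ip(raw):
    """Zero the host part of an address: keep /24 for IPv4, /48 for IPv6."""
    addr = ipaddress.ip_address(raw)
    prefix = 24 if addr.version == 4 else 48
    # strict=False lets us build the network from an address with host bits set.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)
```

For example, `anonymize_ip("203.0.113.77")` gives `"203.0.113.0"`. A log-rotation job could rewrite the address field with this once the "reasonable window of time" has passed.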

IP addresses are not pii. (Maybe personal data under the gdpr, but they are not pii.)

Also, what pii is tracked (by default) by Piwik or GA or access logs? I certainly cannot think of anything.

> Also, what pii is tracked (by default) by Piwik or GA or access logs? I certainly cannot think of anything.

Cookies and other artifacts in the request headers or query parameters that can identify a unique user.

Identify a unique user does not mean that it's pii if I cannot produce the real world identity of the user from the cookie.

True! But some cookies are usually tied to a user account, with an email address, and/or other PII.

How careful are you with tacking cookies to requests? Be mindful, and keep documentation.

Under the GDPR that's PII. IP Addresses too. Personal Data is any data relating to a natural person, if you can uniquely identify them that falls under the monitoring/profiling definition of the regulation.

"A much discussed topic is the IP address. The GDPR states that IP addresses should be considered personal data as it enters the scope of ‘online identifiers’. Of course, in the case of a dynamic IP address – which is changed every time a person connects to a network – there has been some legitimate debate going on as to whether it can truly lead to the identification of a person or not. The conclusion is that the GDPR does consider it as such. The logic behind this decision is relatively simple. The internet service provider (ISP) has a record of the temporary dynamic IP address and knows to whom it has been assigned. A website provider has a record of the web pages accessed by a dynamic IP address (but no other data that would lead to the identification of the person). If the two pieces of information were combined, the website provider could find the identity of the person behind a certain dynamic IP address. However, the chances of this happening are small, as the ISP has to meet certain legal obligations before it can hand the data to a website provider. The conclusion is, all IP addresses should be treated as personal data, in order to be GDPR compliant."


PII or not PII it is still covered by GDPR.

Piwik has several blog posts about this very topic, for example: https://matomo.org/blog/2017/09/gdpr-potential-consequences-...

You can disable the tracking parts in Piwik, or you can assume a legal stance under one of the 6 exceptions like "Legitimate interests" (https://matomo.org/blog/2018/04/lawful-basis-for-processing-...).

Thanks, that's very interesting! I wasn't aware that IP address is considered personal data under GDPR... That seems exploitable...

In what way are you thinking IP address as PII is exploitable?

My first thought is that you could impersonate a person by spoofing their IP address. PII identifies a person but I don't think anyone is advocating for (solely) IP-address based bank logins.

What site do you run, and why do you need to log my personal data (IP, browser, other identifying data)? Do you have good reasons to keep it forever, or do you purge it after you analyze it?

How can a SAAS defend chargeback claims without IP address? Stripe stores IP address and the proof that the user paid for the subscription using the same IP and created an account ties them together.

Have a timeout, that may be a valid reason to store it for the chargeback limitation (120 days IIRC) but definitely not a valid reason to store it forever, as the parent post was talking about.
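A retention timeout like that is easy to sketch. The 120-day figure comes from the comment above (which itself says "IIRC"); the record layout with `paid_at` and `ip` keys and the function name are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone

# Assumed chargeback window, taken from the comment above; verify it for
# your own payment provider before relying on it.
RETENTION = timedelta(days=120)

def drop_expired_ips(records, now):
    """Blank the stored IP on every record older than the retention window.

    Each record is assumed to be a dict with a timezone-aware 'paid_at'
    datetime and an 'ip' string (or None once scrubbed).
    """
    for record in records:
        if record["ip"] is not None and now - record["paid_at"] > RETENTION:
            record["ip"] = None
    return records
```

Run from a daily job with `now=datetime.now(timezone.utc)`, this keeps the IP for as long as a chargeback dispute could still arrive and no longer.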

How are IP and browser personal data? If you think they are, you should not be on the internet.

The article we're in the comments for is about the GDPR - as defined in article 4 of the GDPR, IP addresses constitute personal data. In many cases, a specific IP address, even a dynamic one, can be used to uniquely identify a natural person.


Even though in many cases this won't help you identify a specific natural person (i.e., you don't know whether you're dealing with a specific person or multiple people on a shared connection, and an IP address on its own usually isn't enough for you to de-anonymize a person), they're still considered a personal identifier, are often coupled to a general physical location, and are now explicitly legally protected.

Couple an IP address with a browser user-agent, and you've got the basis for a strong unique fingerprint for a specific person.

Under GDPR guidance, IP addresses are considered personal data because they can be used to identify an individual in a moment in time. Personal data consists of things that identify individuals, but also things that can be used in conjunction with other information to identify individuals.

You might not like that, but the regulators are pretty clear on this point.

> Under GDPR guidance, IP addresses are considered personal data because they can be used to identify an individual in a moment in time.

That's actually the most frightening thing I've heard in a long time. Does the GDPR actually make that connection? If so, it literally links people to an IP address, rather than simply a connection.

If that line is accurate, I'm surprised it hasn't been mentioned before, associating an IP address with a specific person. I have to believe you are wrong, otherwise the legal implications are scary.

For those who don't understand: my concern is that in the US, for a long time, copyright claimants such as the RIAA or MPAA would go after someone because of an IP address, and a common defense was basically: an IP address is not a person. The above commenter made the claim that an IP address alone can be associated with a specific person. So, I'm wondering 1) whether that's accurate and 2) what the ramifications are of an IP address being a person in the world of law enforcement.

No, the GDPR does not actually link people to an IP address. The GDPR never even refers to an IP Address, and where it refers to an Internet address, it is clear it's referring to email addresses.

The ICO (furthermore) has given guidance that they don't think an IP address is uniquely identifying an individual, and have confirmed this to me on the phone.

Where you get into trouble is in transmitting your browser logs/activity to a third party who wants to keep them for their own purposes (e.g. Google). In this circumstance, you have to let people know that you've done this, and to transmit their preferences that you receive onward.

Yes, it explicitly mentions it, because it's actually very often true. Of course there are plenty of examples where it would be extremely difficult to link to an individual, but there are tonnes of examples where it's extremely easy. GDPR says that because it's sometimes easy, you have to consider it personal data.

Again, it's not saying an IP address is always a personal identifier. It just is often enough.

> Again, it's not saying an IP address is always a personal identifier. It just is often enough.

Well, that's not what you said or implied. I'm just thinking of all the cases in the US where the defense is that you can't assume an IP address ties to a specific person. Anyone could use the computer, or someone could attach to an open Wi-Fi network.

Basically, if the legal argument is the IP address can be associated with a person, that raises legal concerns.

I said they can be used to identify an individual in a moment in time. That's correct.

Can it always identify an individual? No. Is the standard of identification good enough for a criminal case? Certainly not. But why are you comparing these? The GDPR is a standard about privacy and data protection; a UK postcode (like a zip code in the US) is considered personal data for exactly the same reason.

The ICO does not consider IP addresses personal data since more than one person could use a computer in a household.

That's not true; I can only assume you're reading the 2011-era DPA guidance.

Under GDPR, an IP address must explicitly be considered as personal data, and any processing of them must be written in the documentation of the data processing activities:


As another commenter has mentioned, this is included in the legislation. There isn't much interpretation to apply here.

Yes it's absolutely true, insofar as I do not have to obtain someone's consent to have logs of their IP address (which is what we're talking about[1])

[1]: https://news.ycombinator.com/item?id=17060280

The GDPR requires informing of use, transmitting preference, and protecting rights, of things that can potentially identify an individual, but this is easy to accommodate by simply not being an asshole. You're not under any requirement to actually identify an individual with your IP log.

Jurisprudence in this area disagrees with you:

"online media services provider may collect and use personal data relating to a user of those services, without his consent, only in so far as that [..] that data are necessary to facilitate and charge for the specific use of those services by that user"


This is related to the DPA; but the GDPR doesn't change anything here, only strengthens it (i.e. making IP addresses explicitly personal data).

So if you're arguing collecting IP addresses is absolutely necessary for you to facilitate the service, no, you don't need consent. But I would not want to have to defend that, since disabling collection is as simple as a webserver reconfig.

I have not read any legal opinion that agrees with yours. I have also been to ICO events where they have stated they expect to treat it as personal data. That's reflected in their site (I gave you a specific example).

I understand that's not the outcome you're looking for.

I'm ignoring German opinions since Art 51-52 suggest only the ICO is going to be involved.

> if you're arguing collecting IP addresses is absolutely necessary for you to facilitate the service, no, you don't need consent. But I would not want to have to defend that, since disabling collection is as simple as a webserver reconfig.

Using IP addresses for audit and security is best practices; I can use the IP address to make sure that a user isn't logging in from two countries at the same time (and then require a call e.g. to whitelist).

Thinking of an IP address in binary terms, as you're suggesting, is extremely dangerous: the GDPR is not supposed to prevent you from thinking about what you're doing.

> That's reflected in their site (I gave you a specific example).

Your example doesn't come out and say IP Addresses are always personal data. Try again.

They've previously said the opposite:


So, three things. First, the case reference I gave you. That "German opinion" is from the Court of Justice of the European Union. This is the highest court that applies, and the ICO must obey it (until Brexit - and even after then, it's highly unlikely that the UK will interpret the GDPR in a different manner, at least for some time).

Second, I've said before IP addresses might not always be personal data. But the issue is they sometimes are, and if you record them without discrimination then you're recording personal data. The old guidance says "An IP address is only likely to be personal data if relates to a PC or other device that has a single user" - ok, so are you able to not record IP addresses that do relate to a single-user device? No?

Third, the ICO do think IP addresses count. I've given you a GDPR reference already, even their DPA tool treats them as such:


It genuinely doesn't matter what the ICO might have said in the past. Right now, they say IP addresses are personal data. The courts say that. The law says that. I've given you multiple references for all of this.

> the case reference I gave you.


a dynamic IP address registered by an online media services provider when a person accesses a website that the provider makes accessible to the public constitutes personal data within the meaning of that provision, in relation to that provider, where the latter has the legal means which enable it to identify the data subject with additional data which the internet service provider has about that person.

If you're not the Internet Service Provider, or more broadly, if you don't have "legal means which enable [you] to identify the data subject with additional data", then the ruling doesn't mean what you claim it does.

You're being intellectually dishonest by trying to tie together irrelevant sources of information. That the high German court took a broader look than the European court is irrelevant.

> their DPA tool treats them as such

It says, as I've agreed, that an IP address could be personal information. You've also agreed with this position. The ICO does not consider IP addresses [by themselves] personal information.

What exactly are you still responding to?

That seems like a weak argument though. Two humans can also have the same name and zip code. They can also have the same personal number and bank account number if they are in different countries. Without tracking and correlation, most information on its own is useless.

This reads like you are intentionally trying to misunderstand. GDPR has two categories of user information: direct identifiers and indirect identifiers.

Direct identifiers are pieces of data that make it possible to target a person, or a very small group of persons, from a single data point. Indirect identifiers are anything that you could use to build a marketing cohort.

Combining a few indirect identifiers makes it possible to target very specific groups of people. Or, using the very examples you quoted:

- The tuple (bank account number, country) is enough to target an individual.

- The tuple (full name, zip code) is enough to target a very small group of individuals. By adding just one more element you can identify individuals.[ß]

Each one of the four data points counts as user information under GDPR. Doesn't matter whether they are direct or indirect.

Disclosure: I wear the DPO hat at Smarkets. As a gambling company we are legally required to know quite a lot about our customers.

ß: For the nitpicking armchair lawyers: unless you happen to have a gated community for John Smiths.

I have no idea what you mean. I was not talking about GDPR.

Currently, or in relation to GDPR? What's your source?

https://ico.org.uk/media/for-organisations/data-protection-r... (20th Oct 2017):

> Like the DPA, the GDPR applies to ‘personal data’. However, the GDPR’s definition is more detailed and makes it clear that information such as an online identifier – eg an IP address – can be personal data

> What's your source?

The ICO. I've called them up, and they've confirmed their 2011 interpretation of personal data:


An IP address is "personal data" in the same way that "lifestyle information" or a "location" is. That someone can combine an IP address with other information to personally identify someone is important, but it doesn't prevent me from logging personal data.

That's their position at the moment. GDPR makes it a bit more explicit: if you can combine the IP address with other information to identify a natural person it becomes personal data.

Sure, but by itself, an IP address isn't personal data.

IP and browser fingerprinting are used to identify you around the internet. Why would someone need to fingerprint me if they have good intentions? If you want some stats on who visits you, like countries and which browsers and OSs they use, you can count them and then discard the data; you would not need to keep it.

IP address alone on an anonymous web access log need not be. Start combining it with persistent cookies or a logged in user etc and it clearly becomes personal data as it is then enough to identify someone.

The clause in the regulation is quite clearly worded:

"Natural persons may be associated with online identifiers provided by their devices, applications, tools and protocols, such as internet protocol addresses, cookie identifiers or other identifiers such as radio frequency identification tags. This may leave traces which, in particular when combined with unique identifiers and other information received by the servers, may be used to create profiles of the natural persons and identify them."

So your face is also not personal data, in your opinion?

The sensitivity of data can change based on context. My name isn't sensitive until it's attached to something like browser history or medical files.

Anything and everything that’s about a person in any way is personal data.

I actually think you need to do some homework if you honestly think so.

IP and browser information are definitely PII, which is Personally Identifiable Information. Those are PII, because they can be used to personally identify an individual.

PII is not a GDPR term, by the way.

Correct. There is a similar concept, but it is not the same as PII in the US.

Recital 26:

To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly. To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments.

Since the concepts mentioned in the comment, IP and browser information, are already being used to single people out for tracking, those particular types of information can definitely be viewed as the equivalent concept under GDPR, as defined in Recital 26.

Serious question: What are the odds of someone official even noticing that you are in violation of GDPR? It's not like they'll enforce GDPR and collect $7.89 in fines from small businesses.

A good question. But that’s not how these things work. The law has to be self enforced or it’s useless - since as you say, how will small to medium businesses ever be caught violating?

Megacorps comply because of mega fines.

Small businesses comply because their owners or future buyers are larger corporations that fear their sub-subsidiary might be in violation, causing a future mega fine.

So small businesses that care about the value of their company follow these rules. It’s almost exactly the same reason small businesses buy software licenses. It’s not out of fear of fines but because otherwise they don’t look like a serious company.

A question I have been wondering about is how many companies will leave some violations in place, such as data in backups, simply because removing it is too expensive, so it’s a risk worth taking. I honestly haven’t understood how backups of data fit into the requirement to delete data of a certain age.

GDPR has the concept of backups and their expiration windows covered.

I'll pick an example from my work. Data can be deleted from the active set, at which point it takes extra effort to retrieve it. (If you can't SELECT it anymore from the warm slaves, it's gone.) But as long as you can make a point-in-time-recovery from your backups, the data is still present in the inactive set. Using the inactive set requires, by definition, extra effort.

So you need to state that fact in the data protection/retention policy, AND put reasonable technical enforcement mechanisms ("controls") in place to ensure that backups are expired and fully deleted after a given retention period. The older your unexpired backups get, the less valuable they should become.
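A rough sketch of what such a control could look like, assuming backups are tracked as a mapping from a name to a creation timestamp and that the policy states a 30-day retention period. The function name, the data layout and the 30 days are all illustrative assumptions, not anything prescribed by the GDPR:

```python
from datetime import datetime, timedelta, timezone

# Assumed retention period; the real value comes from your retention policy.
RETENTION = timedelta(days=30)

def expired_backups(backups, now, retention=RETENTION):
    """Return the names of backups created before the retention cutoff.

    `backups` maps a backup name to its creation time (timezone-aware).
    """
    cutoff = now - retention
    return [name for name, created in backups.items() if created < cutoff]

if __name__ == "__main__":
    now = datetime(2018, 5, 25, tzinfo=timezone.utc)
    inventory = {
        "db-2018-04-01": datetime(2018, 4, 1, tzinfo=timezone.utc),
        "db-2018-05-20": datetime(2018, 5, 20, tzinfo=timezone.utc),
    }
    # Names listed here are the ones a cleanup job would then delete for real.
    print(expired_backups(inventory, now))
```

Listing is the easy part; the control only counts if a scheduled job actually deletes what this returns and someone checks that it ran.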

All it takes is one disgruntled employee

Why would you care about someone official? I'd care more about a bot that is set up to save the officials some work.

Minimum fine is $20 million, isn’t it?

No. It's part of the calculation for the maximum fine.

The maximum fine is defined as €20,000,000 or 4% of your global annual turnover, whichever is higher.


> Up to €20 million, or 4% of the worldwide annual revenue of the prior financial year, whichever is higher
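As arithmetic, the cap quoted above works out like this. A minimal sketch: the function name is made up, amounts are whole euros, and the result is only the upper bound for the most serious infringements, not what a regulator would actually impose:

```python
def max_fine_cap_eur(annual_turnover_eur: int) -> int:
    """Upper bound on the fine: EUR 20 million or 4% of worldwide annual
    turnover of the prior financial year, whichever is higher."""
    four_percent = annual_turnover_eur * 4 // 100
    return max(20_000_000, four_percent)

if __name__ == "__main__":
    for turnover in (500_000, 100_000_000, 1_000_000_000):
        print(turnover, max_fine_cap_eur(turnover))
```

So for anyone with a turnover below €500 million the cap is simply €20 million; the 4% branch only matters for the big players.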


> What are the odds of someone official even noticing that you are in violation of GDPR?

Plenty of ordinary people will be actively looking for opportunities to file GDPR complaints. I know I will. This is a crusade. Taking the Internet back from adtech.

Are you going after Amazon or my neighbour's small online flower business?

It depends. For example, a local restaurant mini-chain has been doing some spamming and refused to take me off their list, so if I get a single message from them after May 26th, I'll definitely file a complaint. From the consumer's perspective, the main GDPR effect is that things that previously were scummy but legal have now become forbidden, and some of the things that were forbidden but not enforced now have an enforcement mechanism with teeth to make it happen.

If they have Google ads they’re fair game.

Why do you need analytics, aside from ads and ad-tech?

...to understand whether I'm reaching my target demographics e.g. for the sake of the success of my non-ad-related business?

How can you tell what demographic I am in just from my IP and browser fingerprint? Do you have more sites, or do you correlate the data with third parties?

I think it is fine if you just count the browser, OS, and country the user is from, but I don't think it is OK to keep more details than that.

And to see what devices you should support? Is all your traffic mobile? Maybe the mobile version gets more love. Lots of old IE users? Guess you gotta use old-school CSS.

Now that I think about it, I’ve never used analytics for anything ad related.

If someone hits you from an iPhone 6 in Turkey, you can just increment your counts of those two values. You don't need to store that a given IP is in Turkey and has an iPhone 6, correlate that data through Facebook's ad network, and end up with a profile built around that person despite them not even having a Facebook account.

Everyone's been sucking up all the data they can just in case they need it, and actual people are being harmed by it through data breaches. I get why people are upset about these changes happening to their businesses, but what did everyone expect when the industry has failed to self-regulate against its worst excesses? The rest of society isn't just going to let you hurt them so you can make more money.
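To make the "just increment your counts" idea concrete, here is a minimal sketch. The class and field names are made up for illustration, and it assumes the device and country have already been derived from the request (e.g. from the user agent and a geo lookup) and that the raw IP is dropped right after:

```python
from collections import Counter

class VisitStats:
    """Aggregate-only counters: nothing here can be tied back to a visitor."""

    def __init__(self):
        self.by_device = Counter()
        self.by_country = Counter()

    def record(self, device: str, country: str) -> None:
        # Two independent tallies; the (device, country, IP) tuple is never stored.
        self.by_device[device] += 1
        self.by_country[country] += 1

if __name__ == "__main__":
    stats = VisitStats()
    stats.record("iPhone 6", "TR")
    stats.record("iPhone 6", "DE")
    print(stats.by_device, stats.by_country)
```

Keeping the two tallies separate rather than counting (device, country) pairs is a deliberate choice in this sketch: the coarser the aggregate, the harder it is to single anyone out from it.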

Analytics is a broad term, and covers things like "how many of my users use this new section of my website" or "Let's A/B test a new navigation menu".

Security, such as analyzing how quickly a form is filled out, or whether the person is coming from a location that is known to be home to a lot of fraud, and other such checks, for example.

Also, using that data to improve learning, so we do a better job of detecting fraud without wrongly flagging legitimate customers.

Also, using analytics to determine if a user is having issues on the site. For example, are users having a difficult time filling out forms, or becoming confused, and being able to provide help in an appropriate manner.

These are just two things I immediately thought of: security and customer support. I'm sure there are many other uses of analytics aside from ads and ad-tech.

To provide data for UX decisions?

You shouldn't need to keep any personal information on your visitors to get meaningful data.

How can AB testing work without storing personally identifiable information?

AB testing doesn't require personally identifiable information.

How do you bucket users into a control and enabled group without having some way of identifying them? (usually IP address)

You could for example bucket them based on the last digit of their IP address.

So, everyone whose IP address ends in 0, 1, 2, 3 or 4 gets the new version, everyone else gets the old version.

You don't need to store the IP address for that, you just need a rule that decides which version to serve as the user requests it.
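Something like the following would do it. This is only a sketch: it assumes dotted-quad IPv4 addresses, the function name is made up, and it uses the last decimal digit of the final octet, which is the same thing as the last digit of the address as written:

```python
import ipaddress

def variant_for_ip(ip: str) -> str:
    """Decide which version to serve from the request IP, storing nothing."""
    last_octet = ipaddress.IPv4Address(ip).packed[-1]  # integer in 0..255
    last_digit = last_octet % 10
    # Addresses ending in 0-4 get the new version, 5-9 get the old one.
    return "new" if last_digit <= 4 else "old"

if __name__ == "__main__":
    for ip in ("203.0.113.14", "203.0.113.7", "198.51.100.250"):
        print(ip, variant_for_ip(ip))
```

The split is not exactly 50/50 (130 of the 256 possible last octets end in 0-4), and a visitor whose address changes can flip groups, so treat it as a rough rule rather than a proper randomisation scheme.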

Why are you bucketing users? If you have some sort of current session for their presence on the site, flip a coin and assign them your A or B. Store nonpersonalized info about their experience on the site associated with the given A or B.

Generate stand-alone, opaque identifiers for whatever sessions you want to analyse. Make sure these tags are ephemeral and decoupled from anything else.

Then, store only the flow data, discard everything else. Or if you need to keep some data around for the test duration, delete all of it once your A/B test has concluded.
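A minimal sketch of the coin-flip-per-session and opaque-identifier approach described above. Everything named here (the class, its methods, the event strings) is hypothetical, and it assumes your site already has some notion of a session into which the returned token can be placed:

```python
import random
import secrets
from collections import Counter

class ABTest:
    def __init__(self):
        self.assignments = {}  # opaque session token -> "A" or "B"
        self.events = []       # (variant, event name) pairs, no token, no IP

    def start_session(self) -> str:
        # Random token, not derived from IP, cookie contents or user account.
        token = secrets.token_hex(16)
        self.assignments[token] = random.choice(["A", "B"])  # the coin flip
        return token

    def record(self, token: str, event: str) -> None:
        # Only the variant is kept with the event, so events are not linkable
        # to a particular session afterwards.
        self.events.append((self.assignments[token], event))

    def conclude(self) -> Counter:
        # Summarise, then delete everything collected for the test.
        summary = Counter(self.events)
        self.assignments.clear()
        self.events.clear()
        return summary

if __name__ == "__main__":
    test = ABTest()
    token = test.start_session()
    test.record(token, "clicked_signup")
    print(test.conclude())
```

Whether the token itself still counts as an online identifier while the test is running is a question for your DPO; the point of the sketch is only that the analysis needs neither the IP address nor anything that outlives the test.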

Sounds like you'll have some urgent reading to do when/if you receive your first GDPR access request.

Yes, that is wrong. You MUST tell your users that you are tracking them, and what the data you collect will be used for.

