Austrian DSB: EU-US Data Transfers to Google Analytics Illegal (noyb.eu)
254 points by sarnowski 12 days ago | 285 comments





Max Schrems is just incredible. Just look at his Wikipedia page [1] and see how many EU-US data transfers he's challenged successfully.

[1] https://en.wikipedia.org/wiki/Max_Schrems

At this point, I wonder why the EU doesn't consult him personally prior to enacting some law. It's not as if they don't consult with others as well.


> At this point, I wonder why the EU doesn't consult him personally prior to enacting some law. It's not as if they don't consult with others as well.

Because it costs companies a lot of money to migrate to an EU-only market, while waiting for the wheels of justice to grind buys time (for the EU-only market).


That's right, but the US does this all the time with its military/intelligence networks, and China does the same for its public internet. It's not like their providers are running at a loss; they just need the right incentives.

What's the end goal of all this? Information (data) technology has essentially left Europe.

The EU is never held accountable for the laws they make, and I'm not aware of them consulting with local entrepreneurs. It seems only lobbyists and politicians have access.


>Tech has essentially left Europe

If your "Tech" can't work without siphoning up and selling people's personal data to the highest bidder, Good.


Then how do they make money? Everything is a subscription service? Cable TV 2.0? Can you use the internet normally by avoiding these services? No Google, no YouTube, no Windows, no Android etc.

Right now we basically get the products of the internet for free. If you kill advertising as monetization you don't have those free services anymore. And if you have to pay for everything then you'll leave even more of a trail on what you did. It just might not be accessible to corporations, but it'll still exist.


>Windows

lol most people pay money for Windows (even against their will, if they want to buy a laptop for example) and still get ads and telemetry cranked to the max. What's up with that?

>Android

Another bad example. In a less shitty world where the majority share of the global mobile OS duopoly isn't owned by a glorified surveillance company, the support of a device would be covered by, and only by, the sale price of the device. Which wouldn't be such a huge cost if Android had a better-engineered HAL and better standards, so every device could update from the same repo tree.

>if you have to pay for everything then you'll leave even more of a trail on what you did.

This is my problem with Kagi[0] (frequently posted here). This is more the fault of the current global payment infrastructure than of any ordinary tech company. But if you think that the current ad-infested browsing experience makes you any more private, I have bad news for you. Namely fingerprintJS[1].

You might be asking what my solutions to these problems are. And my answer is that I have no idea. The only thing I know is that I'm seriously jaded with the current {mono,duo}poly-ridden state of tech.

[0]: https://kagi.com/ (currently throwing internal server error lol)

[1]: https://fingerprintjs.com/


> You might be asking what my solutions to these problems are. And my answer is that I have no idea.

The thing is, you always pay; just the currency changes. In one case you pay with your data and in another with money out of your pocket. At least with the appearance of Kagi, people now also have the second option on the table. For web search it did not exist previously, and having more options is always a good thing from a consumer perspective.


Definitely. I may have gone too hard on you guys previously. Wishing you the best of luck :^)

No worries, I didn't take it as such, critique is always welcome.

Banning behavioural advertising is not the same thing as banning advertising in its entirety.

Newspapers, for example, survived for centuries on a mixture of subscriptions and advertising before they could snoop on their readers to decide which ads to show.

TV and radio advertising right now works as a funding model without syphoning personal data.


TV, radio, and newspaper ads are also local. They do not have international reach. Would ads be useful to you in Russian or Chinese? Would the company paying for the ads find it useful that you get those ads? No, they wouldn't, but on the internet without data everyone would get all of the ads.

You don't need to snoop on your visitors to choose a language. Choosing an ad based on browser preferred language or geoip is not behavioral advertising. It is behavioral advertising when you record the data, when you track the user and show your ad on a site with the lowest bid instead of the relevant site that your target audience originally visited.

You don't need the person's purchasing history, sex, marital status, age, ethnicity, interests, political orientation, a list of visited sites and all the data you can get your hands on to choose a language.
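
The point about language can be illustrated with a minimal sketch. It picks an ad language from nothing but the `Accept-Language` header of the current request, with no stored profile and no cross-site identifier. The function name `pick_ad_language` and the list of available ad languages are hypothetical, and the parsing is deliberately simplified (region subtags are collapsed and wildcards are effectively ignored).

```python
def pick_ad_language(accept_language: str, available: list[str], default: str = "en") -> str:
    """Pick an ad language using only the request's Accept-Language header."""
    candidates = []
    for part in accept_language.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # malformed q-value: treat the tag as not acceptable
        # Compare on the primary subtag only, e.g. "de-AT" becomes "de".
        candidates.append((quality, tag.strip().split("-")[0].lower()))
    # Highest quality first; the sort is stable, so header order breaks ties.
    for quality, lang in sorted(candidates, key=lambda c: c[0], reverse=True):
        if quality > 0 and lang in available:
            return lang
    return default


# Hypothetical usage: the ad server only has English and German creatives.
chosen = pick_ad_language("de-AT,de;q=0.9,en;q=0.8", ["en", "de"])
```

Nothing here is recorded or linked to earlier requests; the choice is recomputed from the header on every page load.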


"Free" commercial services were made possible only in recent years with the advent of ad-tech that was able to monetize the data being gathered.

If you want an alternative to that, yes then of course you have to pay for it.


The primary reason to collect personal data is retargeting, which is a massive waste of money. (Have you ever heard of game theory?) Google (search) and YouTube Ads rely on the content you're being served to actually target ads; that's much more relevant and drives higher click-through rates.

PS: I definitely pay for Windows, MacOS, iPad OS and Android. Developers of those systems get paid from proceeds of hardware sales.


>Google (search) and YouTube Ads rely on the content you're being served to actually target ads; that's much more relevant and drives higher click-through rates.

Then how come I get ads in my native language despite almost never watching or searching for content in my native language? Obviously ads in my native language are more useful, because I might actually partake in those services. Walmart advertising to me would never matter, because I will never be able to get to a Walmart.


For what it’s worth, Windows is paid and there are better alternatives to all the other things you mentioned. I do not use any of them.

Just let the patsies add it to the blockchain themselves, give them tales of web3 ;)

> If your "Tech" can't work without siphoning up and selling people's personal data to the highest bidder, Good.

Not good, not good at all. The idea is to find a common ground, not to push potentially innovative businesses and people to the rest of the planet.


If we outlaw {child slave labour/people trafficking/theft/exploitation/etc} it will just push potential businesses to another country and we will lose out on the innovation.

Instead of locking your front door you should find common ground with the local thieves so they only steal half your stuff, smash the old TV not the new one, only beat you halfway dead and only kill one of your children. Anything less would just be undermining the fantastic innovation potential in your local home invasion and robbery sector.


The fact that you need to equate something most people find quite harmless (data acquisition for targeted ads) with child slave labor, trafficking, robbery and murder only points out how weak your argument is.

There are many people on the timeline of human history that believe that child slave labor, trafficking, robbery and murder are all harmless. These are all things that have been commonplace at various points in history.

If you are in an area with any population density, you are sitting within a few miles of more people who have committed or are currently committing those acts, with no regret at all, than there are CEOs of companies that do data acquisition for targeted ads.


Do you actually believe that child slave labor, trafficking, robbery and murder are equivalent to data acquisition for targeted ads?

It's called hyperbole, and kmix's position is laughable.

No one is banning ads, targeted ads or data collection.


All of these things are illegal for good reason, they harm people.

But, the argument is simple: In all these cases businesses do something illegal and therefore either need to be stopped, or stop existing.

People who use this tech are criminals, and everyone knows it. Everyone has known it, even in the US. Look at how many people use ad blockers and anti tracking scripts. It is time to stop allowing literal criminals at tech companies to repeatedly break the law and face no consequences whatsoever.

It has been going on for years, and it needs to be fought just like any of the other horrible things.


Governments have a duty to protect the human rights of their citizens, and that includes privacy.

Innovation is not always a good thing - Think of it as a vector and not a scalar. Increasing the magnitude of innovation would be bad if the direction of innovation is harmful.


> Innovation is not always a good thing

Maybe so, but you do not get to decide which innovation is good and which is bad. People do. And most of us do not give a flying monkey about data companies collect about us. What's the worst they will do: better ads?!

We do care about privacy from governments though, but that is never legislated, is it?


And it is exactly the people of Austria who elected those politicians who enacted the laws which all innovating companies must respect. So apparently the people of Austria DO give a flying monkey about this topic.

>Maybe so, but you do not get to decide which innovation is good and which is bad.

I didn't decide, the European Parliament decided. And they made the decision to protect our (collective and individual) human rights.

>We do care about privacy from governments though, but that is never legislated, is it?

Privacy from governments is exactly what this case is about.

The US government can - in secret, with practically non-existent oversight, and absolutely no means of redress - simply take personal data sent from the EU to the US. Because of this, the US is not deemed to have "equivalent protection" to the GDPR and thus transfer of personal data from the EU to the US is banned (unless it's made technically impossible for the US entity to comply with an order from the US government to access it).


> European Parliament decided

You mean EU politicians decided. And it was an abusive decision which hurts more than it helps. I want the right to use my data as I see fit, including exchanging it for "free" services and products.

> The US government can

I do not live in the US, it's MY government I worry about, not some far distant boogeyman.


> You mean EU politicians decided. And it was an abusive decision which hurts more than it helps. I want the right to use my data as I see fit, including exchanging it for "free" services and products.

Did someone ban it? Last I checked you can still give away your data.


>> European Parliament decided

>You mean EU politicians decided.

Well parliaments tend to have politicians as members, yes. That's generally how representative democracy works.

>I want the right to use my data as I see fit, including exchanging it for "free" services and products.

You can use your own personal data however you want to, the GDPR has absolutely no limits on that. And if you want to give your express consent for others to use your personal data in any way they see fit, you can also do that. Consent is a legal basis for processing personal data, after all.

They just can't use my personal data - or the personal data of anyone else who has withheld consent - like that.


> You can use your own personal data however you want to

Oh but I can't, when the companies decide not to offer those services to the EU due to the onerous requirements of GDPR. Because the GDPR was not some harmless default being changed but a horribly written regulation that affected the way software must be written, from data retention and storage to logs and analytics.

All for a bunch of politicians pretending to represent us, while I am willing to bet that the vast majority of EU Internet users are busy hunting for the "Allow All" button on every damn cookie popup on every website today. There was no need in the market for this.


> Oh but I can't

Yes, you can

> when the companies decide not to offer those services

They never offered "you can freely sell your personal data to us". What they offered was "we siphon your personal data, for free, whether you want it or not, and sell it to the highest bidder".

> I am willing to bet that the vast majority of EU Internet users are busy hunting for the "Allow All" button

Ah yes. It's the politicians who are at fault, and not the companies who put up these banners in clear violation of the law.


Human rights also include access to learning, education and news. Having them for free online is the best option. But you can't have both at the same time.

Very well put.

In that framing, it is often the case that US-led "capitalism-at-any-cost" over-indexes on scaling the magnitude of the vector while leaving the direction free to be influenced by other entities. The EU approach is to disregard the magnitude while keeping tighter bounds on the direction.

I'd like to think that bodes well for the UK, since we historically have trodden the middle ground between both camps - but I'm sure we'll find some way of fucking it up and gaining neither magnitude nor direction.


This is the common ground. My ideal would be banning all forms of personalized advertisement and banning the sale of sensitive medical information.

Not every potential innovation's cost is acceptable. This draws a line.

Not sure if it's bias due to being part of the netsec bubble, but there is a growing market for privacy as a feature. Also, in the wake of countless breaches, companies are becoming more aware of how and where data is stored.

If your product isn't addressing this, you're probably at a growing competitive disadvantage.


Changing the law to suit "local entrepreneurs" is, politically, seen as the same thing as changing it to suit any other lobbyist. The software industry's interests are not seen as politically "cleaner" than those of any other industry.

Extending laws to apply to all in the market is not "favoring local entrepreneurs".

The parent comment used those exact words ("consulting with local entrepreneurs") as a reason not to implement GDPR. I was pointing out that this did not convey the clean-and-efficient image they had in mind when they framed it in opposition to lobbyists.

>The EU is never held accountable for the laws they make

Of course it is. The tech sector is practically non-existent here, and if something works out, they leave for the US. Isn't that what being accountable is?


The shortsighted definition of "tech sector" of some people is hilarious.

Here in Europe we are running software tools and platforms written in the US, owned by US companies, on hardware designed in the US. And the people who created all that include lots of bright ex-Europeans.

What do you find hilarious about that?


And all those American tools would not work without East Asian (mostly Taiwanese and South Korean) hardware. Tech is global, and that doesn't mean every part of the industry is everywhere.

Without some European hardware the US would do nothing.

True, we're not completely out of the loop. But let's not fool ourselves: we are far from running the show. And with an attitude like this we will be left watching how progress and innovation will keep coming from the US, driven by our own best and brightest emigrated there.

...

Same for hardware. Europe is quite big on embedded, think of Siemens just as an example.


This is a self-inflicted wound. The EU needs to look to China in order to fix this. Impose draconian laws to clamp down and control the tech industry.

Anything else WILL lead to loss of sovereignty. Control the data.


»Max Schrems: "In the long run we either need proper protections in the US, or we will end up with separate products for the US and the EU. I would personally prefer better protections in the US, but this is up to the US legislator - not to anyone in Europe."«

That's the point: we need real data protection in US law for non-US citizens as well. Currently, US lawmakers treat EU citizens' data as US state property. Obviously, that's unfair.


> I would personally prefer better protections in the US, but this is up to the US legislator - not to anyone in Europe.

I don't agree that Europe can't change anything in that regard. Deeming US-based services illegal and banning US-based companies doing business in Europe because of the way EU-customer data is treated in the US would speed up better regulations in the US tremendously.

It's a fact that big corporations are ready to bend over backwards for foreign governments, even when they require "immoral" [1] things, so they would have no problem complying with actual sensible requests [2] if they are forced to do it.

[1] Chinese censorship rules, ... [2] Data protection, ...


> banning US-based companies doing business in Europe because of the way EU-customer data is treated in the US would speed up better regulations in the US tremendously

Maybe it would, or maybe it would spur a tariff-war between the EU and US and a great deal of resentment between traditional allies.

> they would have no problem complying with actual sensible requests

Morality and sensibility don't play a role in modern big corps. The real question is: do these requirements impact their bottom line? Chinese censorship rules don't, but EU's data protection rules clearly do. Hence, their willingness to comply will adjust accordingly (i.e.: US corps will fight tooth and nail to prevent that from happening).


> I don't agree that Europe can't change anything in that regard. Deeming US-based services illegal and banning US-based companies doing business in Europe because of the way EU-customer data is treated in the US would speed up better regulations in the US tremendously.

I think it would do way more damage on the EU side than anything. Imagine having to migrate applications overnight because hosting with AWS has been outlawed, even with all the protections in place (e.g. location in EU, encryption etc etc).


Overnight is rather exaggerated.

GDPR (which the above case is about) was approved in 2016 and became enforceable in 2018, the major legal case that provided that kind of interpretation landed in 2020, and now a concrete (very high profile) enforcement has actually happened in 2022.


>or we will end up with separate products for the US and the EU.

I thought this was the goal the EU was working towards. There was even that policy recommendation for building a firewall similar to the Chinese one. It didn't amount to much, but we seem to be going down a path like that.

Why would the US listen to the EU on this topic though? EU countries are trying to use privacy as a way to limit the reach of these US companies, but we don't have anything comparable to replace them with. Those US news sites that blocked EU visitors? They're still blocked and you can't really blame them - they don't make much money from advertising to European users, so why take the risk and cost of implementing GDPR? I understand it, but parts of the internet are still unavailable to me. And I don't seem to have any more privacy anyway.

Data protection is good, but at this point I find it difficult to believe that this is the actual goal of EU politicians.


Read up on Schrems II. This policy is actually based on a court's decision not on a decision of politicians. Politicians actually tried to save data transfer with the "Privacy Shield".

"The CJEU ruled that the Privacy Shield does not provide adequate protection, and invalidated the agreement. The court also ruled that European data protection authorities must stop transfers of personal data made under the standard contractual clauses by companies, like Facebook, subject to overbroad surveillance. This decision has significant implications for U.S. Companies and for the U.S. Congress because it calls into question the adequacy of privacy protection in the United States."

from https://en.wikipedia.org/wiki/Max_Schrems


That's an unfair assessment.

While I find it hard to believe that European countries are that much more privacy focused... the reason for the divide is that European countries, in or outside of EU, have stricter rules on user data... and much more recourse for users.

Having those rules creates an advantage for any company that doesn't operate by those rules while serving people located in the countries covered by those rules. The goal was never to "limit the reach of US companies", but to prevent an uneven playing field. (The EU was specifically created to keep markets competitive.)

What's worse is that the US government, which is legally barred from snooping on people in the US, says that the data of people not physically present on US soil is fair game to do with as it wishes.


> Currently, US lawmakers treat EU citizens' data as US state property. Obviously, that's unfair.

The unstated assumption being that the data in question belongs to those citizens.

If I write about an orchard, the writing doesn't somehow belong to that orchard. If I photograph a wedding the copyright is still held by me. It's not obvious if we're instead talking about a name or an email address that the subject of your data should magically become the owner.


The reality is that privacy in the US isn't the same as it is in the EU. Making these kinds of deals with the US or China will always fail.

Ultimately there's nothing to stop the US from wiping its ass with any treaty; that's the major advantage of being a superpower. America lives by different values, as is their right.

Yes we need to silo the EU from the US.


The US is a large power compared to any individual European country. But for example Microsoft about as much money in the EU than in the US.

Sorry, I think you missed an important word there - how much do they make in the EU compared to US?

The EU was specifically founded because European politicians were acutely aware that divided they would fall against the US/USSR.

We are at a crossroads: remain independent or try to get as good a deal from the US as we can like Hawaii did.


The Americans should use this to pressure the government into fixing their surveillance laws.

I recognize that this is merely my impression of the matter, but I don't think most Americans are that concerned about it. I very much doubt that enough are sufficiently concerned to convince enough politicians to do something about it.

I don't know. It's kind of worrying that you can't host EU data in the US, isn't it?

Not as much if you're in the EU, I guess. But, yeah, it's not the best feeling as an American.

The deeper background is of course Google's business model of data and privacy prostitution: Users give their private life to Google and they get web search, email, and videos back.

In a more reasonable world users would pay money for the services they want to use.

Of course it needs to be noted that most users don't even understand that they are selling themselves. And of the few who do most still think it's better than paying money.

This ruling, should Google comply in the end, will not change anything. Google will store the data in the EU and that's it. I don't think they share user data with the advertiser when they show an ad. So they could still show ads of US companies. And that's a niche business only anyway because when Europeans do business with Amazon, Disney, and the like they deal with the respective European subsidiaries already.


>In a more reasonable world users would pay money for the services they want to use.

>Of course it needs to be noted that most users don't even understand that they are selling themselves. And of the few who do most still think it's better than paying money.

This is such pretentious snobbery. In a world where you have to pay for a search engine, I am still dirt poor, working some shit entry-level job or doing manual labor, because when I was a kid I couldn't even afford internet and had to hitch off a neighbour. Having free access to Google, tons of free learning material, message boards, etc. is what got me out of that situation.

I pay to avoid advertisement, but that's a luxury I can afford nowadays, and I have almost no concerns about privacy - I don't care at all that Google knows my interests, browsing history, purchase history, etc.

The concerns about data collection I see are mostly blown way out of proportion and most people rightfully don't care TBH.


This is an extremely naive view of the situation. I'm really glad that you were able to help yourself but the material likely wasn't by Google.

In the meantime Google is using this information to manipulate your desires and actions, and you can be sure as hell that it is using data about your behavior and interests to improve its position in the market. Google buys companies and stocks and they have an advantage few of us can ever hope to have.

I promise you, Google knows you better than you know yourself and it's using that knowledge to further its own interests without caring much about the people, countries or economies it's hurting.


>I don't care at all that Google knows my interests, browsing history,

I also don't care that Google, the NSA, the KGB, the CIA or China know my browsing history or what files I have on my PC, but I do care if these groups know everyone's browsing history, because they can affect me indirectly by blackmailing or manipulating key individuals or entire populations with targeted ads or propaganda.

It's the same with fake news like "WiFi is illegal in Japan because it causes cancer": this fake shit won't affect me directly, but it affects people in my family, so it affects me indirectly and I have to reduce the damage done.


> This ruling, should Google comply in the end, will not change anything. Google will store the data in the EU and that's it.

The US CLOUD Act allows US law enforcement to force Google to hand over data; even if that data is stored outside the USA.

It is highly likely that processing and storing analytics data only in the EU is not enough to "fix" Google's issue here, because the USA still has jurisdiction.

See the recent Akamai / Cookiebot case.


>Users give their private life to Google and they get web search, email, and videos back.

>In a more reasonable world users would pay money for the services they want to use.

But would people actually be willing to pay? They would rather use adblock and other services to circumvent ads than pay for YouTube Premium.


> But would people actually be willing to pay

Of course they wouldn't pay as much as they currently do. But the world would be a better place if they didn't. In Google's case it's more indirect, but in Facebook's case it's obvious. It's well known that Facebook has a negative impact on the mental health of many. Most of the turnover created just ruins the planet for no good. People in the Western world lived a reasonable life in, let's say, 1970. The same level produced with the technology of 2020 would reduce destruction of the planet a lot. We would all work 6 hours a day, and in our free time we could walk an hour to the office and an hour back. Or do something else good for physical and mental health. Without a cloud-based GPS tracker and activity cam, of course, because normal mortals couldn't afford those. So what? I don't see how surveillance capitalism has improved or will improve the situation in Africa either.


GA has always been a raw deal. Businesses that use it don’t realise they’re effectively giving away their customers to any competitor that pays Google more.

It looks like this makes Fathom Analytics the only provider for website analytics that you can use if you don't want to maintain a locally hosted version of an open source product – which blows my mind. A small company is the only service that is able to comply with the rules while huge ones simply fail.

I assume that this regulation is also coming to other services soon and analytics isn't the only service that needs to be replaced when a business is in the EU and can't ignore these rules without risking fines. The team at Fathom wrote about alternatives for lots of services here: https://usefathom.com/blog/degoogle


> A small company is the only service that is able to comply with the rules while huge ones simply fail.

I think all the big ones can comply, but gambled they would be able to come up with creative constructs to get around the requirements. Wrong play it would seem.

Fathom did the right thing, isolate by region. Which is handy for a lot more than complying with the GDPR.


Not the only provider, worth looking at Plausible.

https://plausible.io/


Nope, they use US providers. The servers are in the EU but the providers are US companies and that means that they aren’t GDPR compliant at all. This is exactly what Schrems II targets.

All site data plausible.io stores on behalf of the customers is hosted in Germany on servers owned by Hetzner, a European-owned company. Previously it was hosted by Digital Ocean in Germany but the move to Hetzner was made last year.

For our self-hosted version, you can install it with any cloud provider and in any country you wish. Even in the USA.


Can someone tell me if this is even true? Plausible doesn't save any GDPR related data as far as I know?

https://plausible.io/privacy-focused-web-analytics#no-person...

And the backend is hosted @ Hetzner in Germany


> All site data plausible.io stores on behalf of the customers is hosted in Germany on servers owned by Hetzner, a European-owned company. Previously it was hosted by Digital Ocean in Germany but the move to Hetzner was made last year.

That's written on their site, but isn't true:

https://imgur.com/a/9wEanqD


> All site data plausible.io stores on behalf of the customers is hosted in Germany on servers owned by Hetzner, a European-owned company. Previously it was hosted by Digital Ocean in Germany but the move to Hetzner was made last year.

By its very nature, an analytics product must process personal data.

Personal data is "any information relating to an identifiable individual" (see GDPR Art. 4(1)).

Your IP address, browser and OS (via user agent), the website you visited, the pages you visited, time of visit, the site you came from (via referrer) are all personal data.

If Plausible have put a US-owned cloud provider in front of their Hetzner infrastructure, even if for a legitimate purpose (CDN, DDoS prevention), then that is likely an unlawful transfer of personal data to the USA.


>> Your IP address, browser and OS (via user agent), the website you visited, the pages you visited, time of visit, the site you came from (via referrer) are all personal data.

No. These are all not considered PII. Only the IP address in this list definitely is.

All other information with a wholly anonymized user would be considered by most interpretations to be ok. Often it depends on the context and presence of other meta-data on whether something is PII or not.


“PII” is not a term the EU or UK GDPR recognises. It may have a specific meaning in American law; but the GDPR definition of personal data is significantly broader.

It certainly includes the items I listed; particularly when linked to an identifier like an IP address.
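
To make the disagreement concrete, here is a minimal sketch of the kind of record an analytics product receives for one pageview, and of one common reduction step (truncating the IP address, dropping the user agent, keeping only the referrer's host). The schema, the function names `truncate_ip` and `minimise_hit`, and the prefix lengths are all assumptions for illustration. Whether the reduced record still counts as personal data is exactly what this subthread disputes, and the sketch does not settle it.

```python
import ipaddress
from urllib.parse import urlsplit


def truncate_ip(ip: str) -> str:
    """Zero the host part of an address (keeps a /24 for IPv4, a /48 for IPv6)."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48  # prefix lengths are an assumption
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


def minimise_hit(hit: dict) -> dict:
    """Return a reduced copy of a raw pageview record (hypothetical schema)."""
    referrer_host = urlsplit(hit.get("referrer", "")).hostname or ""
    return {
        "ip": truncate_ip(hit["ip"]),
        "path": hit["path"],
        "referrer_host": referrer_host,
        "timestamp": hit["timestamp"],
        # The user agent is dropped entirely in this sketch.
    }


# Illustrative values only; the IP comes from a documentation range.
raw_hit = {
    "ip": "203.0.113.77",
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "path": "/pricing",
    "referrer": "https://news.ycombinator.com/item?id=1",
    "timestamp": "2022-01-13T10:15:00Z",
}
reduced = minimise_hit(raw_hit)
```

Note that the reduction happens after the raw request has already reached whichever server runs this code, so it does not change who receives the full IP address in the first place.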


I really don't understand why countries are so persistent about storing data in their country. It's not like the enforcers could walk into the datacenter, plug in a USB drive and get the data. And it's even hard to see what constitutes user data. Does logging constitute user data? Does that mean that to get logs for an error, the developer needs to travel to every country and memorize the log messages?

And companies could easily copy their data in a click if they need to. A much saner approach should be limiting what the company is allowed to do with the data.


> A much saner approach should be limiting what the company is allowed to do with the data.

Perhaps we should have some sort of GENERAL rules or legislation specifically for DATA to define what companies, based in or with customers in a region, can and can't do for the PROTECTION of the data and end users, so the companies can stay compliant with this REGULATION.


We could call it GDPR for short.

>I really don't understand why countries are so persistent about storing data in their country

It is about having some rights. Say you are from the USA: then Google or the NSA should follow the laws. But if, say, I am a politician from some other country, the Google and NSA employees can just read my emails and then blackmail me (or grab my PayPal code and take my money), because US laws only protect US citizens, terms of service are not laws, and we know that we can't attribute morality to Google, Apple or the NSA.


> the Google and NSA employees can just read my emails and then blackmail me

this is a new level of conspiracy theory.


>this is a new level of conspiracy theory.

This is a new level of denial. I mean, you should be outraged if your CIA is not doing its job. This is what you Americans pay them for: to spy on foreign governments and companies. If something changes in a direction that would disadvantage the US, then blackmail, killing and other methods are what the CIA dudes are required to use.

At least you can use your logic and think

1 The CIA's job is to spy

2 Google/Apple have juicy content on CIA targets

3 The CIA wants that content. They could send dudes in black into the server room at night to grab the data, or they can just ask for it, since foreign people have no rights. Even if an employee sees this, he is a US citizen, so he will also need to respect US laws and shut up, or he will be in trouble.


Blackmail is new? Of course providing the opportunity for blackmail is one of the primary concerns when foreign actors exfiltrate data. Blackmail is a lot older than the cold war, and has been going on as long as people have been doing embarrassing things.

It would be absolutely foolish to think anything other than that there are many spies for various governments working in telecom and other IT companies. It is likely their primary target, even before infiltrating government positions.


Well, them reading this data is a known fact, mostly from Snowden's leaks.

Blackmail happening is harder to prove, but it would approach incompetence if the NSA at least found no opportunity to blackmail someone using the data we know they have access to.


If your data is in Russia, the Russian government can do what they want with it, within the limits of Russian law (at least theoretically). If your data is in France, the French government can do what it wants with it, within the limits of French law (at least theoretically).

Now, most countries have close to 0 protections for non-citizens' data - particularly, the USA has 0 protections for a French person's data sitting on a Google server. If a US government agency wants to read this French person's data (of any kind, including, say, medical records), they can ask Google for access to it and, if Google agrees, they can just use it. If Google doesn't agree, they only need a warrant against Google, not against the French citizen in question.

The same is NOT true for a US citizen's data - which is more or less sufficiently protected, at least theoretically. But foreign nationals' data that happens to reside in the USA has 0 legal protections from the US government.

On the other hand, the US government can not (legally) obtain data that resides in France or Russia, unless they work with the French/Russian legal system to obtain access to that data.


> US government can not (legally) obtain data that resides in France

Explain how? If the US govt orders "copy the data from FR to US, or else" and the French govt orders "you can't do that, or else" then what is the company to do? They are breaking the law no matter what. Something has to give.


In general, such international disputes can be arbitrated either at a diplomatic level between the two governments, or by an international court.

If they're a European company operating and hosting in Europe, the US government has no jurisdiction over them.

If it's an international company, sucks for them (until the countries harmonize their laws to guarantee reasonable privacy protections for everyone internationally). That's exactly why people are now looking for local alternatives to Google et al.


> If they're a European company operating and hosting in Europe, the US government has no jurisdiction over them.

No, that is not true. The region of operation is completely irrelevant. The US could arrest Kim Dotcom. Or take GDPR: non-European companies have to comply with it for European customers.


It's not completely irrelevant. The US has to cooperate with whichever country Kim/Julian/etc resides in. That country can totally reject the extradition request.

They only have to contact them because Kim was not on US soil. They would have to make the same request for anyone, including American citizens, who is physically in some other country. If Kim decided to vacation in the US, they could skip all the extradition requests.

> It's not like the enforcers could walk into the datacenter and plug in the usb drive and get the data.

They can ask and/or force a given company to hand data to them in many cases in most jurisdictions.

> Does logging constitute user data.

Logging of user data? Yes.

> A much saner approach should be limiting what the company is allowed to do with the data.

GDPR does also regulate what a company is allowed to do with data. The thing is: whether the GDPR applies and is enforceable depends on where that data is stored.


Companies are already restricted in what they are allowed to do with the data, but allowing it to be transferred without restrictions would let those limits be circumvented easily, just by moving the data to a location where data protection can't or wouldn't be enforced. The EU rules are therefore not at all insistent on storing data only in the EU; it's explicitly allowed to store data in other countries as long as the data is still protected there.

This decision is a great example of this: The decision isn't made because it's not allowed to export data at all, instead it explicitly references US law which forces the affected companies to violate the data protection guarantees provided by the GDPR.


>I really don't understand why countries are so persistent about storing data in their country.

That's not what this is.

The EU is not saying "data MUST stay in the EU", it's saying "Data can only be transferred to a jurisdiction which has equivalent data protection".

>It's not like the enforcers could walk into the datacenter and plug in the usb drive and get the data.

No, they send a request for the data and threaten to jail anybody who even reveals that a request has been made.

>And it's even hard to see what all constitutes user data. Does logging constitute user data.

Article 4 of the GDPR: ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person

>A much saner approach should be limiting what the company is allowed to do with the data.

That's exactly what the GDPR is...


Relevant thread on open source alternatives to Google Analytics from earlier today: https://news.ycombinator.com/item?id=29888599

posthog seems to be hosted in Digitalocean - an american company

plausible.io is hosted on AWS - an american company

snowplowanalytics.com seems to be hosted in digitalocean as well

as I understand they are equally illegal now.

[Self-hosting and maintaining is not an option for the vast majority of mom-and-pop shops]


Posthog as well as Snowplow are open source solutions that can be self hosted. Snowplow is always hosted in your cloud infrastructure even if you use their managed service.

Fathom's EU docs (https://usefathom.com/features/eu-isolation) seem to suggest that EU-hosted but US-owned cloud infrastructure isn't sufficient either though - you're exposing any data stored/transferred through there to access by the US government.

That means Posthog self-hosted on an AWS server in Frankfurt wouldn't avoid this issue.

What're the best options for non-US owned cloud providers? AFAICT Canada or many other countries with privacy laws would be fine, it's really the US specifically that's problematic.


Well, according to them they use Hetzner, and I'm not sure, but since Hetzner now has US servers they might be in the wrong too. It has nothing to do with US companies...

When creating a new VM at Hetzner, you have to explicitly pick the US location, for the exact same GDPR reasons. Hetzner has to do this or risk losing customers, because hosting in the US and/or using US services is forbidden for some kinds of infrastructure.

Let's be honest, no one wants to self-host a website analytics application, because in most cases they just want to focus on their core business, or at least that's the value proposition of SaaS. This will ultimately limit innovation to bigger companies that can afford to maintain their own infrastructure for everything.

This is why we went with Fathom, as their edge locations are regionally isolated (EU). If I understand it correctly, core hosting is AWS, but the edge locations, run by an EU company in the EU, process the data and remove the sensitive parts.

For me, it makes no sense when companies like plausible say they have EU based hosting when they pay for hosting from a US-only company (DigitalOcean)


All site data plausible.io stores on behalf of the customers is hosted in Germany on servers owned by Hetzner, a European-owned company. Previously it was hosted by Digital Ocean in Germany but the move to Hetzner was made last year.

For our self-hosted version, you can install it with any cloud provider and in any country you wish. Even in the USA.


Uhm, isn't your comment misleading?

https://plausible.io/privacy-focused-web-analytics

All of the data that we do track and collect is kept fully secured, encrypted and hosted on renewable energy powered server in Germany. This ensures that all of the website data is being covered by the European Union’s strict laws on data privacy.


If a US company controls the servers, it’s illegal. If it’s EU-owned, it’s legal.

Plausible seems to use Netlify/AWS for analytics. Both US companies.


Jack, you are free to promote Fathom but I don't think you have the right to spread false information.

In fact, I don't think I will ever use or recommend Fathom to anyone after seeing you act so childish.


I'm confused. Open up Plausible, look for /event in your inspect element (devtools in chrome), look at the IP address that it connects to. Run that IP through ipinfo.io and see which country comes up. If it's the US, it's illegal (as per this entire thread).

What's childish about me not wanting people to potentially get fined?


Yes, I just checked it. It is a testing environment deployed on Cloudflare Workers. What's the problem here exactly? It is the same exact script using the same exact tech behind Plausible.

At what point exactly are they going to get fined? I don't understand so I would love to know, so as long as you actually manage to answer with somewhat of a technical depth.

Maybe you should do one of those "Fathom vs Plausible" pages on your website, then point out that Plausible is using a testing environment and because of that they will be fined.


Sure, happy to explain further. You have found the testing /event but there is another (make sure your ad-blockers are off).

I've put together the details here in an image, so it's easy to follow (https://imgur.com/a/9wEanqD). Hope that explains what I'm talking about.

Sending data from the EU to US-controlled cloud infrastructure is illegal. Please read the noyb article again, read the Schrems II ruling and read the EDPB's advice.


But Plausible doesn't send its data to US-controlled cloud infrastructure? By the looks of it, they're using a self-hosted testing environment through a CDN.

This is unique to Plausible itself and not the services they provide for their customers.

Why do you insinuate misbehavior from a competitive company when you don't have actual proof?

You have the URL of a CDN network that is hosted in the US. What you don't have is the proof of this data being stored in the US. Because it is not. Their FAQ pages clearly state that none of the data is ever stored outside of EU.

Last but not least, you entirely missed my point. Plausible is an extremely successful business, do you really believe they would risk their reputation / livelihood without understanding Schrems II or otherwise?

I honestly have nothing else to say mate. But good luck with Fathom. I am sure it will be a great success.


Yes they do. It's not just about data being stored, it's data processing as a whole. You cannot casually pass EU data subject Personal Data to US-controlled infrastructure.

CDN is processing of Personal Data in the clear. Please read Use Case 6 of the EDPB's recommendations, specifically what they say about US cloud providers (https://edpb.europa.eu/sites/default/files/consultation/edpb...).

And I'm not interested in commenting on what I think Plausible would or wouldn't risk, as it's not relevant.


Well, it's very relevant. Why else would you try and smear a company that is your competitor, yet fights the same fight you're fighting?

And also, you're completely wrong.

As a Plausible customer, your data is never processed in the US, or sent to the "cloud" outside of EU.


Your website visitors Personal Data is processed on US-controlled cloud providers. I've provided evidence that folks reading this need to be careful when choosing analytics software, and I'll leave it at that. I hope to see Plausible move to an EU Isolation approach which doesn't involve US cloud providers.

You have not provided a single ounce of technical proof that Plausible processes their customer data in the US. Furthermore, you have somehow managed to overlook the fact that Plausible does Cookieless tracking without actually tracking any "Personal Data" signals.

I wonder what Paul thinks of your attempts to fear monger people into thinking your crappy product is superior to an open-source alternative.

But hey man, good luck with Fathom. It will be a great success.


I have no skin in this game, but Jack clearly demonstrated that data is passing through servers that are controlled by US-owned entities, namely Cloudflare and Digital Ocean ... what am I missing?

Just posted this thread to a friend and they said I wasn't being 100% clear, so I apologize. I'll clear things up.

Using EU servers that are owned by a US company (e.g. AWS deployed in the EU, DigitalOcean deployed in the EU) is a violation of the Schrems II ruling. The way you check this is by looking at the IP addresses the analytics software are using, seeing where they're located and who they're owned by. You can then run that IP in ipinfo.io to get information about who controls that IP. If it's a US cloud provider, regardless of server location, it's a GDPR violation.
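
For anyone who wants to script the check described above, here is a minimal sketch. It assumes you have already looked up the IP and have a JSON record shaped like an ipinfo.io response with an `org` field. The provider keyword list and the sample record are hypothetical and incomplete, and a match is only a hint to investigate further, not a legal finding.

```python
# Hypothetical, incomplete list of US-headquartered providers; extend as needed.
US_PROVIDER_KEYWORDS = (
    "amazon", "cloudflare", "digitalocean", "google", "microsoft", "netlify",
)


def flag_us_controlled(ipinfo_record: dict) -> bool:
    """Return True if the record's `org` field names a provider on the list above.

    Note that the `country` field is deliberately ignored: the point made in
    the thread is that ownership matters, not where the server sits.
    """
    org = ipinfo_record.get("org", "").lower()
    return any(keyword in org for keyword in US_PROVIDER_KEYWORDS)


# Illustrative record only: a server located in Germany, operated by a US company.
sample = {"ip": "203.0.113.7", "country": "DE", "org": "DigitalOcean, LLC"}
print(flag_us_controlled(sample))
```

Matching on substrings is crude (it will miss subsidiaries with unrelated names), so treat a negative result with the same caution as a positive one.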

The English translation of the ruling can be found here. They go into detail within the rulings about the transfer of Personal Data (IP & User Agent) to servers that cannot be protected from US surveillance laws: https://noyb.eu/sites/default/files/2022-01/E-DSB%20-%20Goog...

"This is a very detailed and sound decision. The bottom line is: Companies can't use US cloud services in Europe anymore. It has now been 1.5 years since the Court of Justice confirmed this a second time, so it is more than time that the law is also enforced." - Max Schrems


All site data plausible.io stores on behalf of the customers is hosted in Germany on servers owned by Hetzner, a European-owned company. Previously it was hosted by Digital Ocean in Germany but the move to Hetzner was made last year.

For our self-hosted version, you can install it with any cloud provider and in any country you wish. Even in the USA. That's the testing one we had on our site as we're testing the latest release of our self-hosted version on our own website. This has nothing to do with what our customers place on their sites.


Yup, just to be clear, I wasn’t talking about site data, I was talking about the processing of Personal Data (IP & User Agent).

You were using Netlify previously, which is a US provider and backed by AWS, and then Cloudflare for the testing.

But yesterday I can see you moved to Bunny (an EU cloud provider), which is great news for your customers, party time!

Provided you’re using Hetzner behind Bunny, that looks like solid Schrems II compliance to me.


What I'm hearing from my Govt agency clients is precisely that. EU Datacenters hosted by EU Subsidiaries of American companies are not ok.

Correct. That's absolutely right. I'm not 100% sure how my comment wasn't clear, but I will apologize to everyone if I confused them. Anyway, Plausible updated their analytics to use Bunny yesterday, which is a win for their customers. We wrote more about this solution back in 2021 (https://usefathom.com/blog/eu-isolation) after a lot of work. We spent a lot of time looking into possible options, the law, and are pleased that our innovation is going to help other companies.

You are correct. Fathom Analytics is the only globally distributed provider that offers EU Isolation (keeping EU data completely away from US cloud providers). https://usefathom.com/features/eu-isolation

I don't think the home country of a company matters that much.

International companies must comply with local laws and regulations. The EU is such a large market that they will implement anything the EU requires. For example, AWS can host and collect EU data and fully comply with EU regulations, never moving data to the US. With AWS, customers can determine where their customer data will be stored, including the type of storage and the geographic region of that storage.


The key points in the article for me:

> Max Schrems, honorary chair of noyb.eu: "Instead of actually adapting services to be GDPR compliant, US companies have tried to simply add some text to their privacy policies and ignore the Court of Justice. Many EU companies have followed the lead instead of switching to legal options."

> In the long run, there seem to be two options: Either the US adapts baseline protections for foreigners to support their tech industry, or US providers will have to host foreign data outside of the United States.

> No penalty (yet). The decision is not dealing with a potential penalty, as this is seen as a "public" enforcement procedure, where the complainant is not heard. There is no information if a penalty was issued or if the DSB is planning to also issue a penalty.

We need more trials related to GDPR breaches. While having the legislation is a huge achievement, it needs to be backed with enforcement.

If there is no enforcement, a third long-term solution arises -- just ignoring the law until you manage to get the necessary amendments to it in order to keep operating as before without fear of penalty.


I'm really curious what would happen if those companies followed the law.

My bet is that they would entirely stop doing business in the EU, because I'm suspecting that data collection is the cornerstone of google/facebook/etc's business model.

They cannot properly advertise if they don't collect data.

To me it's a bit similar to what happened with China. China doesn't want the US to get data on Chinese people, but their solution was to just block those companies.

The EU uses courts to protect itself, but I guess the result would be a bit similar.


The expulsion of US tech led to a native Chinese tech industry fully compliant with Chinese law. They didn't go back to the Stone Age (which was the prediction of US experts at the time).

No, they'll just host their data in Europe

Is that enough? If a company uses Google services, they are liable if Google decides to take a bite at this data (because e.g. the NSA asked them to). And it's not enough if Google makes a contract that promises not to do so. The company is essentially liable for the NSA.

- Ironically, if Google does create an EU spinoff just to run analytics as a free service, it will kill the local competitors

- The NSA is not "exempt": https://www.eff.org/deeplinks/2020/07/eu-court-again-rules-n...


The point of hosting the data in Europe is that the law can intervene before that data is re-shared outside of Europe. (And to be frank, the NSA is not the target; the GDPR has lots of exceptions for intelligence agencies.)

They'll just host data related to their European customers in Europe*

And they will use a European branch, under EU regulation... but money will still flow back to the US

I think that’s all that the court is asking them to do, for the last 1.5 years or so.

Data centers cost money. They need incentives (read: high fines) to do that.

https://www.google.com/about/datacenters/locations/

Turns out, shockingly, that Google already have some in the EU for some totally unknown reason. Might have something to do with making $XX billion in the region each year.


This ruling doesn’t have enforcement recommended (yet), but under GDPR, the EU's data protection authorities can impose fines of up to €20 million, or 4% of worldwide revenue for the preceding financial year, whichever is higher. I’m not sure whether this offense would rise to the full 4%, or only a lesser 2%, if it came to enforcement.

https://gdpr.eu/fines/
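
To make the "whichever is higher" arithmetic concrete, here is a small sketch. The function name and the example turnover figures are made up for illustration. The lower tier (€10 million or 2%) comes from GDPR Art. 83(4) rather than from the comment above, and which tier would apply to this case is not something this code can tell you.

```python
def gdpr_fine_cap(annual_turnover_eur: int, upper_tier: bool = True) -> float:
    """Maximum possible fine under GDPR Art. 83 for a given worldwide turnover.

    Upper tier (Art. 83(5)): EUR 20 million or 4%, whichever is higher.
    Lower tier (Art. 83(4)): EUR 10 million or 2%, whichever is higher.
    """
    fixed_cap, percent = (20_000_000, 4) if upper_tier else (10_000_000, 2)
    return max(fixed_cap, annual_turnover_eur * percent / 100)


# A company with EUR 1 billion turnover: 4% exceeds the fixed EUR 20 million.
print(gdpr_fine_cap(1_000_000_000))
# A company with EUR 100 million turnover: the fixed EUR 20 million is higher.
print(gdpr_fine_cap(100_000_000))
```

These are ceilings only; actual fines are set case by case and are often far below the cap.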


A little bit more. The management of that data should also be done by people falling under the GDPR, so either people in EU member states or people living in countries with compatible laws. The USA explicitly is not one of those, because of the FISA courts with their secret justice.

The problem is, some USA Googler can issue a query to an EU server and still access data he's not supposed to see. A FISA court can require him to do that and not tell anyone. No legal document written by a business can override a court decision, so nothing any US company can do helps here.

Google might create a local company with local personnel. The theory goes: when the USA Googler orders a lookup of some data, the EU Googler says "can't do that, it's illegal".

I wonder why Microsoft's Office365 or Windows platform aren't hit by these lawsuits. The issues are the same, and the information gained seems much more interesting.


I don't believe that 3 letter agencies will care about the physical location of the user data. So storing the data in europe might not solve this issue.

Well, with data residing in the USA, US three-letter agencies can just ask the data operators to give them the data. I'm sure the same is true of French three-letter agencies for data residing in France, or possibly even the whole EU.

BUT, it's much harder for US three-letter agencies to obtain access to data residing in France, or the other way around. That would require hacking, which carries a much higher degree of difficulty and risk (not that I would ever imagine it doesn't still happen).


The GDPR is explicitly not about trying to get you protection from state-operated intelligence agencies, and in fact within the EU state agencies like that are explicitly exempt from it.

It would be if the EU had the legal powers to do so, however the EU's treaties reserve national security to the member states.

EU law does apply when national security concerns of non-member states are engaged, though, hence the Schrems cases succeeding (and why the UK is on shaky ground when it comes to equivalency decisions post-Brexit).


That's wrong, GDPR explicitly applies to government agencies, police, etc.

Europol is currently on trial for violating it, as was the German BND previously.


The GDPR doesn't apply to the police, at least not in their capacity as law enforcement.

But it does have sister legislation, the Law Enforcement Directive[0], that applies many of the same principles.

[0] https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A...


Maybe the GDPR issues would be resolved if countries would settle and agree how to do surveillance on their population... Obviously the gears of silicon valley turn faster than the gears of intelligence agencies.

Why don't they?

I would have thought they do. There is also a decent chance that at any given moment even Google have absolutely no idea where a given piece of information is physically located (and mirrored to X times).

You might be interested in

https://www.enforcementtracker.com/

I agree we need more enforcement, but it's reassuring to see the current levels.


Violations outpace enforcement both in current volume and rate of growth, by orders of magnitude. This is the opposite of reassuring.

But the fines are getting very significant, and the major types of violations are being met with fines (so other violators might see that and stop, or at least when they get caught the ruling will be faster thanks to precedent).

Nothing about GDPR is hard ... unless your business model is to abuse your customers' personal data. Then it might be hard.

I routinely see the loudest complainers about the onerous nature of GDPR compliance suddenly get vague or stop posting when you ask for details of precisely what bit is so hard for them in particular. Note lack of those details in this present discussion, for example.

So far, it seems a safe assumption that the excuse makers are abusing personal data, and they know they're abusing personal data.

Perhaps one day a clear exception will show up.

I wrote up a thing here a few years ago with my actual on the ground experience of getting us compliant: https://reddragdiva.dreamwidth.org/606812.html

tl;dr anything that might vaguely constitute personal data, down to Apache logs, must either be in a writable database for redactability, or deleted.

Since then, our legal team - who are not your legal team! - has advised:

* 30 days for operational purposes is fine actually.

* Go feral on anything over 30 days. You need a named person responsible for GDPR redactions.

* If you want to do analytics on those Apache logs, do them quickly and into a form that doesn't contain personal data.
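
As an illustration of the last bullet, here is a minimal sketch that reduces Apache access log lines to per-day, per-path hit counts and throws away the IP address, user agent, referrer and query string. The regular expression assumes the common/combined log format, and the sample lines are invented. Whether aggregates like these are enough for your own case is a question for your own legal team.

```python
import re
from collections import Counter

# Start of an Apache common/combined log line:
# client IP, identity, user, [timestamp], "METHOD path ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[(\d{2}/\w{3}/\d{4}):[^\]]*\] "(\S+) (\S+)')


def aggregate_hits(lines):
    """Count hits per (day, path), keeping nothing that identifies a visitor."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        _ip, day, _method, path = match.groups()
        # Query strings can carry identifiers, so keep only the bare path.
        counts[(day, path.split("?", 1)[0])] += 1
    return counts


# Invented sample lines using documentation-only IP ranges.
sample_lines = [
    '198.51.100.4 - - [10/Jan/2022:13:55:36 +0000] "GET /pricing?ref=abc HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Jan/2022:14:02:11 +0000] "GET /pricing HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [11/Jan/2022:09:15:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    "not a log line",
]
for key, hits in sorted(aggregate_hits(sample_lines).items()):
    print(key, hits)
```

Once the counts are written out, the raw log lines can be deleted on whatever schedule you have settled on.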

I'm in the UK, which is no longer in the EU, but the GDPR laws still hold here.


I have a small business that operates a website. To be absolutely clear, we don't rent or sell personal data in any form and our business model has nothing to do with tracking or profiling anyone.

We retain certain access records that can potentially be used to identify individuals indefinitely. These records have demonstrably helped us to defend against attacks on our infrastructure and to prevent attempted fraud on multiple occasions, sometimes years after the records used were first collected. We include these general purposes for processing but do not disclose exactly how we use these records for these purposes in our privacy policy.

So, are we compliant because there is a demonstrable legitimate interest in keeping these records? Is holding that personal data indefinitely, knowing that it mostly won't be needed, disproportionate and a GDPR violation? I'd love the people who think the GDPR is simple to show me verifiable, authoritative answers to these types of questions, because so far we haven't found any lawyer who can, nor found any information from any relevant regulator that we could point to as a clear indication either way.


1. You'll have to disclose this in your privacy policy

2. You can store identifying data of website accesses etc for at most 30 days without worry

3. Beyond that, you can only store data that's absolutely necessary, e.g. metadata associated with actual purchases and transactions, but not every access.

4. Usually, you'll have to delete that 2 years afterwards, in some exceptional situations up to 30 years are possible

What I'd do: 1) disclose, 2) delete logs after 29 days, 3) copy all logs associated with a customer's transaction into a separate storage location, shared by customer, transaction and date, so you can delete it 2 years later.
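
A minimal sketch of that two-tier retention scheme follows. The 29-day and 2-year windows are the numbers suggested in this comment, not figures taken from the regulation, and the record layout (`created`, `transaction_id`) is a hypothetical one chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# Retention windows as proposed in the comment above (assumptions, not legal advice).
ACCESS_LOG_RETENTION = timedelta(days=29)
TRANSACTION_RETENTION = timedelta(days=2 * 365)


def is_expired(record: dict, now: datetime) -> bool:
    """Records tied to a transaction get the long window, everything else the short one."""
    limit = TRANSACTION_RETENTION if record.get("transaction_id") else ACCESS_LOG_RETENTION
    return now - record["created"] > limit


def purge(records: list, now: datetime) -> list:
    """Return only the records that are still inside their retention window."""
    return [record for record in records if not is_expired(record, now)]


now = datetime(2022, 1, 25, tzinfo=timezone.utc)
records = [
    {"id": 1, "created": datetime(2021, 12, 1, tzinfo=timezone.utc), "transaction_id": None},
    {"id": 2, "created": datetime(2022, 1, 10, tzinfo=timezone.utc), "transaction_id": None},
    {"id": 3, "created": datetime(2021, 6, 1, tzinfo=timezone.utc), "transaction_id": "T-100"},
    {"id": 4, "created": datetime(2019, 1, 1, tzinfo=timezone.utc), "transaction_id": "T-007"},
]
print([record["id"] for record in purge(records, now)])
```

In practice this would run as a scheduled job against the log store; the in-memory list is only there to keep the example self-contained.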


My response to all of your points is the same: can you cite the authority for those claims please?

For example, no-one processing card payments is going to disclose in any privacy policy exactly how they combine all their signals to determine fraud risk and whether to allow an attempted transaction in real time.


If you're really in business in the EU or associated companies, and not just LARPing in a comment section, you'll have been under these laws for several years now, and should already have contacted your own lawyer on this question.

I've commented about my business interests and the GDPR several times in my more than a decade on HN. You're welcome to scan my comment history if you think I'm LARPing but I have no interest in continuing a discussion with anyone who isn't doing so in good faith.

I addressed your point about taking expert advice in my original comment above: neither lawyers nor regulators have been able to give us a clear answer so far.


I'd believe that you're describing your system's current requirements at a high level. But without exact technical details of how retaining personal information helps you prevent fraud many years later, I don't believe that it is the only way possible.

For example, if the personal information you're talking about is IP addresses, it seems like you could cook those down to non-identifying information pretty quickly - eg zap the last octet. Furthermore, I'd think you would want to cook it down promptly so you can store the current use of the IP block rather than what it might be used for in a few years. (Sidenote: I personally get hassled based on my IP address block way too much, so keep in mind you're harming legitimate customers if this is what you're doing).
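
"Zapping the last octet" could look roughly like the sketch below, using Python's standard `ipaddress` module. The IPv6 handling (keeping only the first 48 bits) is my own assumption, since the comment only talks about IPv4, and whether a truncated address counts as non-identifying in a given setup is still a judgment call.

```python
import ipaddress


def truncate_ip(address: str) -> str:
    """Zero the host part of an address: last octet for IPv4, all but /48 for IPv6."""
    ip = ipaddress.ip_address(address)
    prefix_length = 24 if ip.version == 4 else 48  # the /48 choice is an assumption
    network = ipaddress.ip_network(f"{address}/{prefix_length}", strict=False)
    return str(network.network_address)


print(truncate_ip("203.0.113.77"))         # 203.0.113.0
print(truncate_ip("2001:db8:abcd:12::1"))  # 2001:db8:abcd::
```

Doing this at ingestion time, before anything is written to disk, avoids having to clean up full addresses later.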

Another example - if you're keeping personal details on people who have committed fraud (or not) and referencing that years later, then I'd say that falls squarely in the purpose of the GDPR and you should not be doing that long term.

Or you're doing something else. But without describing exactly what you're doing, you don't make a very compelling case.


> Another example - if you're keeping personal details on people who have committed fraud (or not) and referencing that years later, then I'd say that falls squarely in the purpose of the GDPR and you should not be doing that long term.

You're saying that we shouldn't be keeping detailed records of previous attempts to criminally defraud us that are demonstrably useful for identifying and preventing further attempts to criminally defraud us over a long period of time by the same groups of people?

I'm sorry to be blunt but that is not a serious proposition. If anyone thinks the GDPR says otherwise, chapter and verse please.


I agree with you that this is a good example of a GDPR challenge. I think that building a profile of user patterns to protect against fraud & abuse is a perfectly legitimate business interest, even under the GDPR. But I disagree that this is a problem unique to the GDPR: any means of profiling like this runs the risk of discriminating against protected groups or individuals, and there's been plenty of discussion here about e.g. the use of proxy identifiers for race or background employed by universities to filter applicants.

As with all legislation, there is no clear yes or no answer. If a GDPR watchdog were to evaluate your use of this information, they would primarily care whether you (as a company) are aware of the risks involved in the profiling, whether you have spent the effort to weigh the pros and cons of your approach, and whether you have taken steps to sufficiently anonymize such data without making it useless for your purpose. If you have, I don't think you have to worry about retroactive fines even if the watchdog concludes you're violating the GDPR in some way.

Personally, I'd go even further and say that you don't have to honour data deletion requests from users that have tried to defraud you -- it's unlikely they will do so because they would be required to identify themselves to you, after which you can turn them in to the police, but you can legitimately argue that you need to keep their identity on-file to protect your business. I'm sure the GDPR disagrees with me here, but I'd like to see a watchdog test that case in court.


> I'm sure the GDPR disagrees with me here, but I'd like to see a watchdog test that case in court.

I doubt it. The right to erasure has never been absolute even under GDPR. Typical examples are that you can't compel a bank to delete all records of a loan it gave you, nor compel the police to delete a criminal record of your past behaviour, as long as the data is lawfully and properly handled.


Is that what you are specifically doing and describing above, or are you just choosing one implication of my general pondering?

If this is actually what you're doing, let's discuss the specific details of the information flow you're using to make these decisions, rather than talking in terms of strong sweeping generalizations. If you're just picking out a worst case implication of my general statement, that doesn't seem very productive. An example of what I was specifically thinking:

Customer buys something from Vendor. Customer never receives package. Vendor refuses to issue refund because tracking marks delivered. Customer files CC chargeback (I know this is less common in the EU but work with me here).

From the perspective of the Vendor this Customer has defrauded them, or at the very least is an increased risk. From the perspective of the Customer, they've been unjustly judged for circumstances beyond their control.

Can the Vendor retain that judgement on the Customer forever? Can they share it with other Vendors to create an industry blacklist of "problematic" customers? These questions seem squarely within the aim of the GDPR.


In the example I was thinking of the situation was much more clear-cut than that. Again I'm not going to get into real specifics because this is legal stuff and it's just a discussion forum, but consider this broadly similar example.

You provide a service that anyone can sign up for. It costs money.

As a matter of good customer service your usual practice is to allow significant grace periods when money owed is overdue before you actually cut a customer off.

Someone signs up for a real account using the name "Mallory One" and then exploits the "generosity" of your system to avoid paying part of what they owe you. Eventually you cut them off.

Someone then signs up for a real account using the name "Mallory Two" and does the same thing again. Again you eventually cut them off but miss part of the payment you were due.

After this has happened several times over an extended period, it comes to your attention that the only people signing up using names of the form "Mallory (number)" are ripping you off and the person or persons responsible have already cost you thousands in unpaid bills.

You add a rule to your security system that says when anyone creates an account with the name "Mallory (number)" you will immediately block it.
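Purely as an illustration of the simplified example (not the real rule, which I'm not describing here), such a check could be sketched like this. The pattern, the list of number words and the function name `should_block` are all hypothetical.

```python
import re

# Hypothetical pattern: "Mallory" followed by a number, either as digits
# or as one of a few spelled-out words. Case-insensitive.
BLOCKED_NAME = re.compile(
    r"mallory\s+(\d+|one|two|three|four|five)",
    re.IGNORECASE,
)

def should_block(display_name: str) -> bool:
    """Return True if the whole name matches the blocked pattern."""
    return BLOCKED_NAME.fullmatch(display_name.strip()) is not None
```

Note that the rule itself stores only a pattern, not a record of any individual account.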

How long are you allowed to remember the pattern "Mallory (number)" in your security system if it can potentially be tied to a specific individual and is therefore personal data, but you reasonably believe that person to be responsible for all of that fraud and you reasonably expect that they will continue to defraud you if you don't prevent it?


Is this a practical way of preventing fraud? Can the person not switch their next account from "Mallory Three" to "Eve Smith", thereby evading your rule?

I understand you've simplified the example here for the sake of discussion, but I think the details inform the situation. Like if you really just want to discriminate on any account named "Mallory _____" then that doesn't seem like personal data to me (even though you've created the rule from "personal data"), but also it doesn't seem particularly effective so there must be more to the story.

For an analogous example, you don't need to keep a permanent record of fraudulent transactions with specific IP addresses of 10.0.37.{23,45,67} to remember that 10.0.37/24 is suspicious.
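A rough sketch of what I mean, using Python's standard `ipaddress` module: the only thing retained is the block, and incoming addresses are checked against it. The names `suspicious_blocks` and `is_suspicious` are made up for illustration.

```python
import ipaddress

# Only the block is retained; the individual addresses seen in past
# fraudulent transactions are not stored anywhere.
suspicious_blocks = {ipaddress.ip_network("10.0.37.0/24")}

def is_suspicious(address: str) -> bool:
    """Return True if the address falls inside any flagged block."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in suspicious_blocks)
```

The trade-off is the one I complained about earlier: everyone else in that /24 gets treated as suspicious too.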

(Also what about everybody else who legitimately has the first name Mallory ?)

Your case is interesting because it contains a few unusual qualities that businesses generally don't offer, but smaller "nicer" businesses will give more leeway. You could straightforwardly stop giving a large freebie to new users or require a payment method or identification on signup, but it would be nice to figure out where the line is instead of just giving in to such less friendly practices.


I'm sorry that it's difficult to discuss these issues reasonably based only on simplified analogies. Again, these are real issues with potentially real police or courts involved if it got serious enough, so there is only so far it's sensible for me to go with any examples.

Yes, the situation absolutely was a practical way of preventing fraud. It saved us a significant amount of money with no apparent downside except for a little time to implement the security measures and the slight GDPR concern we've been discussing. The pattern we were looking for in that case wasn't quite as simple as the name example, but perhaps you'll take my word that it really was almost as obvious but it did also have personal data/identification implications. As I wrote in another comment, it's amazing how dumb people are sometimes but even dumb people can still cause damage. I have a few other examples in mind where similar principles apply and those have also prevented material damage to the business and/or other customers.

Just to explain one detail that might look implausible, the grace period being exploited wasn't for new customers, who do have to pay up front. It was for existing customers who pay late (or, as it turns out, sometimes not at all) when further payments are due. Ironically part of the reason we allow that period beyond wanting excellent customer service is for GDPR compliance. We have an obligation to protect any personal data we hold properly and there is at least a plausible argument that deleting everything the moment an account goes overdue on a bill would not meet the standard.

As you have perhaps guessed, this is a smaller business and we do try to be a "nicer" one. Most of the time I think that is a good thing. However it does mean we don't have dedicated staff or budget for any issues like this. When someone on the far side of the world is trying to rip us off, one of us doesn't get to sleep that night until we've fixed the problem, if we can. Every time we have to spend time and money on compliance changes or taking professional advice and every time the business loses money to fraud, that has rather direct consequences for the personal finances of the people who are doing the work to run the business. We do take security and privacy seriously and we try extremely hard to stay on the right side of any relevant rules (far more than most professionals we talk to expect for a business of our size, and I'm told far more than a lot of much larger businesses with dedicated staff for this stuff).

But it really does boil my blood when people say things like GDPR compliance is easy unless you're doing dodgy things or they assume that because I don't agree it means we haven't thought about it or run a business professionally. If the issues were so simple and obvious, there wouldn't be 16 comments under my original one as I write this without a single citation of either the GDPR or any regulatory or court authority to back up any of the answers given or claims made.


Reading between the lines it seems that you're hesitant to share any details because your method would be easy to work around if you did.

However I am still trying to understand your specific difficulty with the GDPR, because I am one of those people who will blindly assert that it should be relatively easy as long as you've built your systems to be legible to user requests (which I will admit is a bit naive). I really am interested in specific cases where this is not true, and furthermore where it's impossible to change data storage to make it true.

Trying to think about this abstractly just leads me to scenarios where it's not a problem. For example if a customer owes you money, then I would think you have a legitimate business reason to hang onto their personal information for as long as it takes to collect their debt. It's kind of hard to argue that a business has no legitimate reason to remember you if you still haven't paid them. Have you been advised to the contrary, xor does this not apply to your situation?


righty-ho. You're in the EU, UK or another country under the GDPR? Have you spoken to an actual lawyer about this? Since, as you say, it's your business.

And this has been a regulation you've been required by law to follow for quite a few years now. Have you just not been worrying about it?

You're asking questions that, as other commenters have noted, are plausibly a valid case, but are quite specific to the precise details of what you're doing and how you do it.


Yes, we took real legal advice in good time. We also had some time with a specialist in GDPR compliance and eventually spoke with the regulator in our country. While I'm obviously not going to discuss specifics here, nothing was hidden from any of those experts. And we are still not 100% clear on what is theoretically allowed here.

This is my point. Literally no-one actually knows whether these kinds of edge cases are permitted under the regulations until you're already at the point that someone in a regulator's office has initiated a formal action to find out and potentially penalise you if they're not.


> Are there not any other ways to secure your system? Seems a bit off to me that some personal info is all that is needed to try fraud or 'attacks'.

You would be amazed how dumb (and yet still dangerous and disruptive) some people can be. Here is an example without getting too specific.

We once had a series of attacks by the same group. They would sign up for real accounts on our site and then take certain actions that violated our terms of service. Everyone here would agree those violations were serious enough to justify immediate termination and potentially reporting to relevant authorities.

Every time they signed up there were certain patterns in the details they gave that allowed us to recognise them. Those are the kinds of data we intend to keep indefinitely so that our security system can intercept any further attempts (which still happen sometimes) and block them.


IP addresses are the problem. Those are pretty important in trying to find bad actors. They are widely stored for a long time to be able to identify various forms of abuse. But the GDPR considers IP addresses to be personal data, as one can potentially be used to identify a unique individual.

For example, most classic forum software stores the IP address of a post submitter indefinitely for anti-abuse reasons. It seems like nobody running such forum software could ever be GDPR compliant. This is despite them never selling this data, trying to mine it for any nefarious purposes or anything like that.


> unless your business model is to abuse your customers' personal data. Then it might be hard.

It's not only your business model, but also the business model of all third-party services you are using on your site.

Also, part of the reason why it's not that hard is that the GDPR is pretty much one of a kind. Imagine the US and maybe some countries in Asia having similar but different implementations of privacy laws, and you having to work with them simultaneously. Or even different laws in each US state (CCPA?). Imagine every country requiring you to store user data only in the user's country of origin, so that you have to manage a separate database for each country.


as I said:

> Note lack of those details in this present discussion, for example.

your comments so far have been apocalyptic GDPR fan-fiction, but are notably short on the actual details of what you're doing and how you do it.


>Also, part of the reason why it's not that hard is that the GDPR is pretty much one of a kind. Imagine the US and maybe some countries in Asia having similar but different implementations of privacy laws, and you having to work with them simultaneously.

That's why treaties like Convention 108+[0] exist, to provide a common framework for implementing data protection laws.

[0] https://search.coe.int/cm/Pages/result_details.aspx?ObjectId...


How about a site like Wikipedia?

Relatively little non-public information about users is kept: the email address, the date and time of a few first-time actions (like creating the account, verifying the email address, going to an edit page, etc.), and some account settings like language. They do keep some data short term, like the IP addresses a given user signs in with. I'm not certain how long this data is kept, but apparently up to 90 days. This is one of the tools used to check for certain types of abuse by logged-in users, like sockpuppetry.

The majority of information about a user that the site stores is publicly displayed information clearly voluntarily submitted, with implied consent for use, like what pages the user edited and when (public info), information they choose to add to their user page etc.

But nevertheless, Wikipedia is potentially a pile of GDPR violations, despite pretty clearly not doing the sort of stuff the GDPR is trying to restrict.

Potential violations include:

#1. When an anonymous user edits the site, the edit is publicly attributed to an IP address, which is kept forever. IP addresses are considered personal data under the GDPR. It is not feasible to only keep these addresses for 30 days, as all edits need to be attributed to something. It is not at all clear that keeping the IP address indefinitely for this falls on the correct side of the legitimate interest line. So this could well be a GDPR violation.

#2. What about users requesting deletion? While the project can delete the user-pages, and even rename the account to something non-specific (like renaming away from being the User's real name), it is likely to not be terribly difficult for someone to identify the renamed user, especially if they ever left a signed message on a talk page. Retroactively modifying such past edits, and editing other people's posts that referenced your old username would be too disruptive. But it is not 100% clear that what Wikipedia reasonably can do is enough under the GDPR.

Also, technically speaking as a rule Wikipedia never actually deletes revisions from the database unless technical reasons require it. Deleted articles are no longer visible but are still stored in the DB. Even copyright violations are normally only rev-deleted (can be restored by admins), or Suppressed (can be restored by oversight users). This sort of not-actual deletion might not actually be enough under the GDPR.

#3a. Let's say a user submits a data access request. Wikipedia could provide them with their own email address, profile settings, and non-public temporarily logged information about the user, like the IP addresses used to log in. They could provide a copy of the user pages, and even the complete history of them, as well as all the edits the user has made, possibly even edits that are not currently public (like articles that have been deleted, edits that were suppressed, etc). But is that all really enough?

What if other users on some talk page end up talking about this user, without specifying the username (so the Wikimedia Foundation cannot easily find the reference), but the prose is sufficiently specific to clearly identify this natural person? The posts could potentially even reference other interesting data about the person, like their religion. While the Wikimedia Foundation may not have the sort of AI needed to parse the conversation, extract the personal data and associate it with the user in question, by the strict letter of the GDPR it still counts as personal data, and there is no infeasibility exception to disclosing it. So if the user later finds this conversation and then wants the relevant data protection agency to go after Wikipedia, the agency technically could justify issuing a fine here. Is it likely to actually happen? Of course not! But it could if for some reason the relevant people at the agency have a personal grudge against Wikipedia.

#3b. Once again a data access request: What if the user is also a public figure? Surely they would also need to be given a copy of their article, and possibly the complete history of the article. But there could well be other articles that reference this person and it is not necessarily feasible to automatically find all of them, especially if any don't explicitly link to the subject's main article. Once again, strictly speaking, not providing any personal data contained in those other articles would be a violation of the letter of the GDPR, despite not violating the spirit.

------------------------------

These are only a handful of edge cases I can come up with. In all of these scenarios Wikipedia is being very reasonable: it is not trying to collect any more personal data than needed to run the site, and it is being fairly reasonable in trying to balance users' rights with practical considerations. But there are still multiple places where it could be argued they violate the GDPR nevertheless. They are not an evil company trying to collect personal data and mine it for profit or sell it. But the extreme vagueness of the GDPR about details makes it hard to really have any idea whether they are on the correct side of it or not.

This is true despite the fact that no data protection agency is ever likely to try to take action against the Wikimedia Foundation for such violations, simply because in practice their actions are good enough, and trying to attack something like Wikipedia would likely piss off the population, who want the agency to go after Facebook instead, or companies that have massive data leaks they try to hide.

One might argue that Article 85 might be interpreted to protect Wikipedia under freedom of expression and information. Or perhaps one might say that the data qualifies for processing under Article 6(1)(e) because identifying users modifying a public resource is a necessary part of the task of developing Wikipedia itself, which is a task in the public interest (questionable, but not impossible to try to argue). But let's say it was not actually Wikipedia in question, but some other forum of user-provided content with similar limitations, that might not qualify for extended protections for freedom of expression and information, or as a task in the public interest?

Some of these same sort of concerns technically apply to any sort of online public discussion forum, even ones that are very much not trying to collect personal information beyond the bare minimum they need for accounts and anti-abuse. Even this very forum we are on right now can potentially suffer from the "other people are talking about you in an identifiable way, but the admins cannot find the conversation to provide it to you for an access request" problem.


I think you're over-complicating some of these.

On point 1, I think legitimate interests covers this fairly well, but it would be arguable for sure.

On point 2, the right to erasure is not absolute so the fact that data are not purged from the database is not relevant. Legitimate interests also come into play here.

On point 3a, the GDPR only mandates that data subjects are given access to personal data, so the WMF need not collate the information to send to them. Surfacing rev-deleted data might be more tricky, I suppose. Wikipedia has policies against posting personal data of other users, and such edits will be oversighted where it's brought to admin attention (see WP:OUTING).

On point 3b, again the legal requirement is to give access to the data. Rectification and erasure are also straightforward (edit the page, ask for other edits to be revdeled/oversighted if they violate WP:BLP). Like you say, Article 85 offers wide protection here, too.


To my knowledge, Fathom Analytics is the only analytics app that has bothered to hire actual lawyers and navigate EU isolation.

They wrote about it here: https://usefathom.com/features/eu-isolation


Great victory. I bet Firebase Crashlytics is illegal as well in the EU.

The reason I uninstalled the Hacker News app 'Materialistic' is that it regularly crashed and was probably involuntarily siphoning off PII through the Crashlytics module.


