Edit: or at least consider that their rules are ridiculous.
But the important question is, do the countries pay their fines?
You can think what you like about the Euro, its implementation, the Maastricht Treaty in general. But let's not downvote actual data.
That depends on how the GDPR is implemented within the country. E.g. the above is factual for Belgium, but _not_ factual for the Netherlands. The "Autoriteit Persoonsgegevens" (the Dutch Data Protection Authority) has been notifying everyone to comply, government website or not. It's a steep learning curve though; there's also a multi-year effort to have government websites make use of TLS/certificates.
Edit: A reference: https://www.rijksoverheid.nl/onderwerpen/privacy-en-persoons...:
"Sinds 25 mei 2018 moeten overheden, bedrijfsleven en verenigingen voldoen aan de Algemene Verordening Gegevensbescherming (AVG)."
"Since 25 May 2018, governments, businesses and associations must comply with the General Data Protection Regulation (AVG)."
Overheid.nl is the official government site.
Also: there’s a guidance document for authorities here: https://ec.europa.eu/newsroom/just/document.cfm?doc_id=47889
EU law does not work with exact codified procedures, which I understand are more common in the US. So indeed, you will find guidance but not exact procedure (though it seems to be clear enough to me).
EDIT: article 83 instead of 82
That's a blatant untruth.
> Indeed they can impose the maximum fines for a first offense, and are fully incentivized to do so
And yet they didn't when they first fined Google, where the fine was 50 million euros -- which was only 1% of the maximum fine they could've imposed. It's almost as if the maximum penalty is the upper ceiling and not the default.
The EU has had the capacity to levy fines far greater than they typically have for a whole spectrum of violations of its laws and regulations.
It never goes full-fine right away. It does show restraint on first offences.
Funny, considering that there already are cases going on and not a single one is close to those maximums.
> Everyone I know takes GDPR seriously
Wow, what people do you know? Considering that the vast majority of sites don't even have opt-in to tracking, but opt-out after they have started tracking, I think the people you know are some weird exception.
Though I did have to do a bit of clicking around until Privacy Badger found something so it looks like they at least are trying.
Facebook is an example of a company they seem to just fine, but a big Dutch media company (RTL) having a cookie wall that quite clearly explains what you are consenting to by clicking "continue" but doesn't strictly fall within the correct opt-in mechanism? They send a warning for that, not a fine.
Yes, it's rather embarrassing that even the government itself doesn't follow the law. But then screaming the 2019 equivalent of "get your pitchforks!" shows how misunderstood the GDPR is. It's supposed to help, not collect extra money.
>It's supposed to help, not collect extra money.
It's supposed to give more leverage over foreign companies to EU countries, because we have somehow managed to create an environment that's very hostile to building tech companies.
on edit: I see someone has already addressed the issue several comments lower and in depth https://news.ycombinator.com/item?id=19426066
Let me put it another way: the European Union is not like the federal government in the US; it's mainly an economic union.
Or another way: European countries are independent, not states in a federation, and the European Union is a separate entity.
Let's not forget this same company also promised to "don't be evil", and then changed their mind. What's backing up this "binding statement" and how do we know they won't change their mind again?
I've heard this ridiculous statement so many times. Do you think a company needs to put "don't be evil" to stop itself from doing evil things and then needs to go ahead and remove that phrase because otherwise it simply cannot proceed with evil? Sounds like a joke. We're talking about humans not robots here.
That said, I don't get why Google gets singled out all the time while all other players often play a dirtier game.
My point was that Google has a history of making public statements that make themselves look good, and then backing down on them later. Maybe 5 years ago they didn't data mine Google Analytics, but who knows what their policy is now or when they may change it? I'm not saying the OP is wrong, just that I'd like more evidence than "they said so."
> That said, I don't get why Google gets singled out all the time while all other players often play a dirtier game.
Because Google went out of their way to tell everybody they were going to be different.
Google's approach to GDPR compliance is entirely based around the idea that they're a Processor and it's not really their data, they're just the middleman. I would believe them because they have a lot riding on that.
If you agree with them then it’s an issue. Google clearly tracks data that is GDPR personal.
Mind you I don't even want to know how much information browser addons have access to. Are there any APIs that forbid any addon from accessing the page? Probably not, that would thwart adblockers.
Of course, this is ridiculous so what I'm really suggesting is that they look at their laws and reconsider how crazy they are.
Businesses that only make sense financially if they can gobble up user data without their consent and sell it to third parties should not exist, just as businesses that can only work financially by not paying their workers should not exist.
What of this is crazy?
Their cookie disclosure regulation IMO has collectively wasted perhaps millions of hours of users' and website designers' time.
So really, users should be angry with websites for intentionally working around the spirit of the cookie law (and creating the annoying pop-up which basically requires you to consent to cookie tracking if you want to continue to use the website). The EU's mistake was not making the cookie law far more strict.
I find this all hilarious.
Are you insinuating the EU has implemented GDPR as a means to get some extra funding instead of to protect its citizens? If so, do you think that before GDPR there were no victims, but now, post GDPR there are lots of victims and they are the US IT industry? If so then let me teach you about a concept that might be new to you: human rights. Privacy is one of those.
Yes, because its "protection" is just another form of security theatre.
> do you think that before GDPR there were no victims, but now, post GDPR there are lots of victims and they are the US IT industry?
How are you defining victims? People and government agencies (in the EU…hahaha) who blindly continue to use the services of non-compliant companies (within the US and outside of the US) while putting up superficial barriers against them at the same time?
Sure, humans can have rights, until they end up on the receiving end of a metadata drone strike from partially collated data from said institutions, supposedly a part of the "protectors".
At the end of the day, neither governments nor corporations will give anyone privacy, especially those who don't take meaningful practical steps to combat intrusions for themselves in their everyday life for whatever reason… though I don't mind having a laugh at those dancing to the tune of this circus of the piper singing what they want to hear.
Tick tock, tick tock…
This continent still remembers when Nazi Germany and the Eastern Bloc tracked people to abuse and even kill them. That was never a direct concern in the UK, but people there are still strongly against government databases (see: UK national id card trial, NHS database trial) and in favour of the right to privacy.
Filling the coffers is just a nice bonus on top of dispensing some much-needed slap-downs.
No Microsoft (for two years now), no Google (for 6 years now), no Facebook (for two years), no Amazon (for a year) services here.
The reason I don't use their services is that I don't want to support shitty companies that have no respect for the people of the countries in which they operate, and I believe these companies are primarily responsible for turning the web into the ad-infested walled-garden shitscape it currently is.
There are alternatives for everything, and some are far better than what the big tech players offer.
The difference is I'm willing to pay for a quality service. Most aren't.
Besides, those companies would never leave the single largest trading bloc on Earth. Their shareholders would crucify them for it.
In that sense, the EU is doing these companies the favour, because they make so much money doing business in the EU that they'll never leave. They'll adapt and fall in line, like everyone else.
Trust me, nothing of what Facebook, Google, Amazon or Microsoft offer is irreplaceable.
I think GDPR is a great law, but if there's one critique of it I'd level without hesitation: god damn is it hypocritical.
Article 6.1.e "in the exercise of official authority vested in the controller;" - Wide open door.
Article 9.2.d - exception to prohibition on racial profiling for political parties on their own membership.
Article 9.2.g "processing is necessary for reasons of substantial public interest, on the basis of Union or Member State law which shall be proportionate to the aim pursued" - The "anything we declare acceptable" biometrics exception.
Article 9.2.h - The "no opting out of online medical records" clause.
Article 17.3.b "or for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller" - The "if we say it's in the public interest not to delete it then we don't have to delete it" clause.
Article 23.1 - The laundry list of cases where any EU government can throw out all rights the GDPR establishes. Includes the following: "other important objectives of general public interest of the Union or of a Member State" as if that's not a goalpost a mile wide.
Article 49.1.d - Allows transfers of data to countries with inadequate data protections to take place if they are declared to be "in the public interest".
Pretty much everything in the GDPR document is untested at this point, and whether government or corporation, quite a lot of cases are going to have to be argued before the courts.
However, this document leaves open many arguments for governments that are not open for others. There is no definition for what might be "in the public interest" in GDPR, nor are there guidelines for interpreting when someone is "exercising official authority". One could argue that police departments are doing that 24/7 and thus large chunks of GDPR don't apply to them at all because processing is always lawful as a result.
By leaving themselves so many fruitful avenues of arguments to present to courts that have not been granted to others, the collective EU governments have created a law that holds others to a higher standard than themselves. Hence, hypocritical.
Why would a law need to be argued in court? There have already been various privacy-related improvements. Various big (national) companies have been ignoring the GDPR (disallowing visitors unless they agree); this practice is now being investigated. Simplified: the Netherlands asked the EU to clarify whether the practice is OK according to the GDPR or not. There's been no court case. There have been discussions between companies, governments, and the EU.
The purpose is improved privacy, not fines.
By the grace of living in a democratic society and not a despotic dictatorship, it is in essentially every western nation the right of any legal person who is accused of breaking a law to request judgement on the matter by the courts.
What you are implying would make judges politicians.
uBlock detected and blocked Google Analytics in all of them (has urchin.js too).
I’ll take the hit though.
It’s not just pageview logs, but GA has great tools to analyze those logs, do reporting on some decent set of actions and to bring it all together in a simple to use interface.
You can take your server logs and then what will a non technical person do with them? Not much.
That said, you can deploy GA while opting out of behavioral data and ad network features, and even fuzz IP addresses.
Analytics has the stigma of ad networks because they historically existed to validate ad spend. We’re past that point and they are often used with strict first-party intent.
There’s nothing preventing us from imagining all the malicious things any analytics tool could do, and imaginations run wild.
Disclosure: I work for an analytics company that doesn’t want to own your data, but I understand why folks have a knee jerk reaction to analytics of any kind.
How useful is the information without this? If they aren't tracking you then they don't have your profile data, ASL is usually the most useful data but only the L is sort of available.
> You can take your server logs and then what will a non technical person do with them? Not much.
IME this is exactly what usually happens with analytics. It's one of those things that management is convinced they just have to have for its pretty charts and feeling of empowerment, but when you ask them what changes they've made based on the data they won't have a lot of examples.
I'm sure they're valuable in the right hands, but for the vast majority it's just a waste of time, similar to most reporting.
You are thinking in terms of ads.
If instead what you want to know is, what parts of the sites do visitors stop navigating. Or which pages are seen by recurring visitors vs other pages seen mainly by new visitors. What pages are almost never visited.
That information doesn’t need ASL; the goal is not to target individuals but to profile the site and see what brings value and what might not.
> management is convinced
I think analytics are not a tool for management though, except perhaps in very broad strokes. I see it more as something for product owners, who need a feedback tool to see the impact of what they do or to get a picture of how users use their product.
It’s like asking management what decisions they make based on NewRelic. None surely. That’s not their job.
I'm thinking in terms of demographics of visitors. Who's visiting, who's not visiting, why are we so big in Japan, that sort of thing. My objection to analytics isn't just ads, it's sending information to a third party behind the backs of users.
> If instead what you want to know is, what parts of the sites do visitors stop navigating. Or which pages are seen by recurring visitors vs other pages seen mainly by new visitors. What pages are almost never visited.
Those examples can be handled just fine by some trivial processing on server logs. Only the first needs a way to identify users (which analytics will also need) and the other two don't even need a user id in the logs. I'll give you the benefit of the doubt and assume they were just simple examples and you want a lot more detailed information, in real time, with output prettier than graphviz. In that case, why aren't you setting up a locally hosted alternative? If the data collected is truly worthwhile then it's surely worth this minimal time investment?
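As a rough illustration of the "trivial processing on server logs" idea, here is a minimal Python sketch that counts page views from access-log lines and lists known pages that never show up. The Common Log Format layout, the sample lines and the list of known pages are all assumptions for illustration; a real server's log format may differ.

```python
import re
from collections import Counter

# Matches the request part of a Common Log Format line, e.g.
# 203.0.113.5 - - [10/Mar/2019:13:55:36 +0000] "GET /about HTTP/1.1" 200 2326
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+" (\d{3})')

def page_counts(lines):
    """Count successful (2xx) requests per path; no user identifier is needed."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group(2).startswith("2"):
            # Drop the query string so /a?x=1 and /a count as one page.
            counts[match.group(1).split("?", 1)[0]] += 1
    return counts

def never_visited(counts, known_pages):
    """Return the known pages that do not appear in the logs at all."""
    return [page for page in known_pages if counts[page] == 0]

# Hypothetical sample data, for illustration only.
sample = [
    '203.0.113.5 - - [10/Mar/2019:13:55:36 +0000] "GET /about HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Mar/2019:13:56:01 +0000] "GET /about?ref=x HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Mar/2019:13:56:09 +0000] "GET /missing HTTP/1.1" 404 153',
]
counts = page_counts(sample)
unvisited = never_visited(counts, ["/about", "/contact"])
```

Neither function looks at the client address, which matches the point that "pages almost never visited" needs no user id. Telling recurring visitors from new ones would need some identifier on top of this.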
> That’s not their job.
That's kind of what I'm getting at, it will be mandated by someone in management or marketing but IME it's usually no one's job and nothing happens with it. Google is the only winner.
I misread your focus on ads; is it more about user acquisition perhaps?
Government sites for instance have fewer of these issues IMO, as they’ll have other means, usually in-person or mail surveys, to directly ask why people are not coming to the site (do they know about it? Do they have a computer? Can they read the language? etc.)
More than anything, these sites have a captive audience so the focus can really be on improving the access to the relevant information.
> processing on server logs
I think that overestimates the technical competency of the agencies handling these websites, and underestimates the cost and time it would take to reimplement a log parser that surfaces all this information, user session by user session.
It definitely can be done; it’s not trivial in any way though. Compared to what some of the government websites do (they’re basically glorified WordPress sites), building a log analyser plus the associated dashboard would cost more than the site itself.
Can this analysis be done offline? Data collection can be done without third-party access, and any analysis on that data can be done offline using separate tools, can't it? That removes the third-party script attack surface.
Then you wouldn't need to imagine what else is done with the data. To imagine that your users' data is exploited behind your back should not be a stretch of anyone's imagination.
At the same time the US is 10 years behind. GDPR is far ahead of anything the US has. Gun control is 50+ years ahead, and let's not try to count years in social security nets or healthcare for the non-rich (or health in itself for that matter). On average I'd say the US is behind the curve and falling further by the day, especially now that China has become a semi-great, soon-to-be superpower.
However, they can easily get lots of compromising data from their own servers. Both from standard web server logs, and from their own scripts and tools.
Once you involve third-party analytics, though, there's another party to worry about. And not just about what they do with the data, but also about how carefully they manage it. That's arguably a key thing in GDPR.
All of this for free. Why?
Fuzz how? The IP is known to Google from the very connection ...
> Analytics has the stigma of ad networks
Well... Google with GA is an ad company, isn't it?
I genuinely don't understand. Is it that the majority have some secret pre-existing conditions and are afraid the insurance companies might realize?
Every time I visit a new doctor I need to spend ~ 10 minutes to fill out a long multi page form on paper listing all my medical history which could've been loaded from some database. I want my data to be analyzed and used to derive insights and help future patients.
It's not US specific, the privacy of health information goes back to at least the Hippocratic oath.
> Is it that the majority have some secret pre-existing conditions and are afraid the insurance companies might realize?
A lot of people do. Not just embarrassing conditions but they keep notes on mental health, drinking habits, illicit drug use, etc. If that information leaked out you could expect everyone from future employers to dates to be taking a look.
It is, when analytics infrastructure is provided by an ad company (Google Analytics)
> Where customers use Google Analytics Advertising Features, Google advertising cookies are collected and used to enable features like Remarketing on the Google Display Network.
If you just use GA on a site, without ads, none of the data is funneled into Ad targeting for the user.
So glad I have uBlock Origin.
I complained about this (and blocked everything), but at one point they required I accept an onerous terms of service to continue to use the site. So I requested they delete my account. Well, apparently they cannot delete an account, only inactivate it. They didn't do that and I still
California DMV not only has Google Analytics, but will log you into Google when you use their site: dmv.ca.gov
It is hard to mindfully resist.
It’s no different than running Webtrends on a website. Companies want to know how people interact with sites and the methods as to how they arrive.
There are however other trackers that could send such information if they are not configured correctly.
I have been using uBlock and have configured my router as a DNS server using https://github.com/StevenBlack/hosts/
The DNS server has the advantage of filtering for the mobile devices on WiFi as well, which is a nice plus.
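For anyone unfamiliar with how a hosts-based blocklist works: each line maps a hostname to a sink address such as `0.0.0.0`, so lookups for that name never reach the real server. A minimal Python sketch of reading such a file follows. The sample entries and the function name `parse_blocklist` are made up for illustration and are not copied from the linked list.

```python
def parse_blocklist(text, sink="0.0.0.0"):
    """Return the set of hostnames that hosts-file text maps to the sink address."""
    blocked = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip comments and whitespace
        if not line:
            continue
        parts = line.split()
        # A hosts line is "<address> <name> [<name> ...]".
        if len(parts) >= 2 and parts[0] == sink:
            blocked.update(name.lower() for name in parts[1:])
    return blocked

# Hypothetical entries, for illustration only.
sample_hosts = """
# comment line
127.0.0.1 localhost
0.0.0.0 tracker.example.com
0.0.0.0 ads.example.net  # inline comment
"""
blocked = parse_blocklist(sample_hosts)
is_blocked = "tracker.example.com" in blocked
```

A DNS server on the router applies the same lookup for every device on the network, which is why the approach also covers phones on WiFi.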
That includes siphoning money from your account.
(In my experience all bank web sites work like above here in Finland)
Much more difficult to circumvent, assuming the user pays attention...
Having direct control of the user interface is very powerful.
There are banks that try to make it faster though, with MFA, though the MFA system is usually SMS.
I look at the data in Google Sheets.
On the page I want to track I paste a script tag that includes a few lines of JS from my counter site. That JS script hits a PHP script with the URL the user requested. I don't track ANY user details. No browser info, no IP address, no fingerprinting, etc. It would be trivial to track those things though. The PHP script logs the data to a CSV file (which I plan to change to an SQLite DB soon).
I have a Google Sheet set up where the first field of data is '=IMPORTDATA("https://example.com/data.csv")'. Google Sheets automatically fetches that data every time you open the sheet; no API integration required. Then I have a simple bar chart on the data.
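The logging half of a setup like this can be sketched in a few lines. The sketch below uses Python rather than PHP (a swap for illustration; the actual script is not shown in the comment), and the function names `log_hit` and `read_hits` and the two-column CSV layout are assumptions.

```python
import csv
import os
import tempfile
from datetime import datetime, timezone

def log_hit(csv_path, url):
    """Append one row (UTC timestamp, requested URL) and nothing about the visitor."""
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow(
            [datetime.now(timezone.utc).isoformat(timespec="seconds"), url]
        )

def read_hits(csv_path):
    """Read back the logged rows as a list of [timestamp, url] lists."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))

# Usage: log two hits to a throwaway file.
demo_path = os.path.join(tempfile.mkdtemp(), "data.csv")
log_hit(demo_path, "https://example.com/pricing")
log_hit(demo_path, "https://example.com/")
rows = read_hits(demo_path)
```

Because only the timestamp and URL are written, there is no IP address, user agent or fingerprint in the file to leak if the CSV URL is guessed.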
I doubt that you care that much since the data isn't sensitive but just a heads up.
It's security-by-obscurity, maybe (as all public "secret token" URLs are) but it's better than what you're implying.
EDIT: You are right in theory though.
This is strictly as a learning exercise, no malicious intent on my part.
My first guess for the CSV was SITE/counter.csv
We as a society haven't agreed precisely on what "privacy" means, so it is effectively impossible to know whether a particular service's handling of the data you provide meets your definition unless you just don't hand it the data in the first place.
I mean, this is always going to be the case with modern computers. No one writes EVERYTHING themselves, so they have to trust someone else. You are trusting the microcode on the CPU, the system calls on your OS, your compiler/interpreter, your standard library.... I get that this is a different sort/level of trust than trusting a third party metrics system, but it isn't fundamentally different. It is all about trusting someone else's work.
If you've got PHP running already, it's straightforward to code up a bar chart from the weblogs you already have (bypassing CSV/SQLite and Google Sheets altogether). That is, after all, how Google Analytics started (as Urchin).
Is "matomo" Japanese? If yes, then the definition is here: https://www.nihongomaster.com/dictionary/entry/36221/matomo
Unfortunately it cannot directly give you the search string used by people who ended up on your site, because search engines stopped forwarding it as a result of the privacy movement some years ago. (I am still against that change: as a website owner selling e.g. clothes, I don't see why I shouldn't know that a person landed on my website by searching for e.g. "yellow pants". This fake privacy just concentrates all power/knowledge in the search providers, but that is just my personal opinion.) They do sell a plugin ($/year) which apparently can do that: https://plugins.matomo.org/SearchEngineKeywordsPerformance
(I guess that it retrieves the search keywords directly from the search provider, but I have neither read the docs nor tried it out.)
That doesn't really matter. But if you set up a content farm / honeypot, you shouldn't be able to tell that the search term that brought the person to you is "how to deal with my XXX infection"
The business argument for Google: they still have the information, and can use it in their analytics, and potential competitors or customers don't have it.
Why not? How else are you going to 1) provide information on dealing with an XXX infection, or 2) recognize enough people are landing at your site looking for advice on their XXX infection that you should provide some answers?
Setting up such a "content farm / honeypot" and making it reach the top results of Google/Bing/Yandex/etc... used to be simple but is nowadays probably successful in only very few cases (as search engines are nowadays more and more context-aware), and Google/Bing/Yandex/etc... can still see & use "how to deal with my XXX infection", but whoever doesn't use directly their services cannot.
What I mean is that, in my opinion, the privacy measures in this case centralized even more power in the hands of few companies with very little added/improved privacy.
In my case, running a small techy website, the search keywords were very useful because they allowed me to understand e.g. which keywords brought users to my website by mistake or correctly, so I could then adjust the contents of my articles to make it clearer what they were about, or to see that users had a very specific problem that I had not taken into consideration when I wrote a certain article, etc. Now I cannot see that information anymore without using Google Analytics, which I don't want to use (or by using the plugin mentioned above, which is good, but for which I would still have to pay $/year, which is bad as it increases fixed costs).
If they gave the information to you too, it likely goes to you, but also to the other 50 .js files you include from various sources of dubious trustworthiness which every site these days includes.
Furthermore, what you are saying is "this admittedly private information used to be available to all and it was useful for some, now it's only available to the entity the user specifically gave it to, and that's bad because the few who actually used it for good don't have it". But the whole idea of GDPR (and similar) laws is to put the control back with the user, which is a good thing.
I think some standard with which the user explicitly lets the website know "yes, the search engine query that brought me here is X and I allow you to have it" would be good, but I don't think dropping this info from the referer (sic) is bad.
Somewhere in the settings, you have the option to anonymise users, which is achieved by deleting/not saving the last octet of IP addresses.
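Masking of this kind can be sketched with Python's standard `ipaddress` module. The function below zeroes the last 8 bits of an IPv4 address and the last 80 bits of an IPv6 address. The IPv6 width and the name `anonymize_ip` are assumptions for illustration, and this is not a claim about how Google Analytics implements the setting internally.

```python
import ipaddress

def anonymize_ip(address):
    """Zero the host-identifying tail of an IP address before it is stored."""
    ip = ipaddress.ip_address(address)
    # Keep the first 24 bits of IPv4 (drop the last 8 bits) or the first
    # 48 bits of IPv6 (drop the last 80 bits; an assumed width).
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)

masked_v4 = anonymize_ip("203.0.113.77")                   # "203.0.113.0"
masked_v6 = anonymize_ip("2001:db8:abcd:12:34:56:78:9a")   # "2001:db8:abcd::"
```

Note that this only affects what is stored. Whoever receives the request still sees the full address at connection time.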
There are some other options, and some functions you should avoid:
- You should set data retention to the minimum of 14 months
- Do not use the User-ID functionality (tracking across devices/browsers)
- Also avoid Remarketing and Advertising Report features.
The downside to all this is that it is mostly invisible to users that you are somewhat protective of their privacy.
But just thinking about my tax return, I can think of a few features that would have to be dropped, or would become more clunky, like some of the client-side validation, and the auto-saving of drafts.
You should try the dlang forums to see what can be achieved if you don’t fetch 2 MB of JS from 5 different origins: https://forum.dlang.org
ELK (Elasticsearch with Kibana). Pretty powerful and AWS recently forked free Elasticsearch.
Has anybody else experienced this?
That being said, ultimately it's a frontend on a MySQL database, so there's lots of ways it could theoretically be slow -- MySQL isn't configured/tuned properly, the database server isn't resourced appropriately for the amount of data it's hosting, etc. But this is going to be the case for any self-hosted solution.
Do you mean loading the UI which displays the graphs etc...?
I really wouldn't want to be the person tasked with explaining these issues to the average politician (although some rare exceptions obviously apply).
* We will never know the full ramifications regulation has on a market. It is impossible to calculate objectively the _full_ effect.
* Regulation _always_ has unintended side effects. (Alcohol prohibition and violence, etc)
* A regulator that doesn't understand the entire problem will likely increase the unintended side effects.
I'm not saying that we always have to use them, but there are cases where they can be useful.
In a physical place of business, for example a retail store or restaurant, keeping track of what times or parts of the store were busiest, or where people spent the most time, would allow you to eliminate waste from your business, and sometimes that involves knowing how much unique customer foot traffic you're getting.
There are also ways to configure Google Analytics to fuzz IP addresses, essentially anonymizing them, as well as setting up explicit data retention periods.
EDIT (responding to grandparent comment): Even so, I'm not sure it's disingenuous to call products adtech which are provided by a company whose main business is advertising, and which are often configured to contribute to the advertising business, even if it's possible to use them for purposes other than advertising. At that point, maybe it's just adtech that's being repurposed.
Since the governments are not subject to the GDPR, it doesn’t have teeth, and I would not be surprised if it fails to get resolved.
But it's pretty crappy that they haven't tried to follow the spirit of the law. And it's pretty crappy that all my interactions with the government, as a UK citizen, are reported back to the Google mothership.