Facebook doesn’t operate in a vacuum. They can’t scale back data collection unless others do. Moreover, the United States doesn’t exist in a vacuum either. Scaling back all US companies will give a leg up to competitors in more lenient jurisdictions.
Laws can be written to only apply to American users. That would leave American companies free to compete on level terms in other countries.
Perhaps the others will be in trouble in future too, but sorry Facebook, it's your turn now.
Haven't heard anything about Google wholesale handing all my data over to anybody who clicks a couple buttons to sign up to their developer program though.
I'm guessing Facebook has the technical expertise to allow 3rd parties to run aggregate (non identifying) queries on the consenting users' data on servers they control and only allowing certain aggregated data, as well as limiting the number and type of queries 3rd parties are running.
Guess it's easier to just let anybody have at it to prove how valuable their platform/data is.
It seems like Facebook gave away that information in hopes of developing a more engaging walled garden with 3rd party help, and perhaps some naivety.
> I'm guessing Facebook has the technical expertise to allow 3rd parties to run aggregate (non identifying) queries on the consenting users' data on servers they control and only allowing certain aggregated data, as well as limiting the number and type of queries 3rd parties are running.
This is a very challenging problem - multiple "non-identifying" queries can actually identify individuals. Differential privacy is the best solution so far; however, it's still challenging to guarantee privacy without lowering quality far below competitors that don't care about consumers.
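To make the differential privacy idea concrete, here is a minimal sketch of the Laplace mechanism for counting queries, with a total privacy budget that caps how many queries a 3rd party can run. The class name `PrivateCounter`, its parameters, and the sample records are all hypothetical, and nothing here reflects how Facebook or anyone else actually does it. A production system would also need to deal with known floating-point weaknesses in naive Laplace sampling.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


class PrivateCounter:
    """Hypothetical query gateway: noisy counts plus a total epsilon budget."""

    def __init__(self, records, total_epsilon: float, seed=None):
        self._records = list(records)
        self._remaining = total_epsilon
        self._rng = random.Random(seed)

    def count(self, predicate, epsilon: float) -> float:
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self._remaining:
            # Refusing further queries is what stops many "harmless"
            # aggregates from being combined to single out one person.
            raise PermissionError("privacy budget exhausted")
        self._remaining -= epsilon
        true_count = sum(1 for r in self._records if predicate(r))
        # A counting query changes by at most 1 when one person is added
        # or removed (sensitivity 1), so the noise scale is 1 / epsilon.
        return true_count + laplace_noise(1.0 / epsilon, self._rng)


if __name__ == "__main__":
    users = [{"age": a} for a in (25, 31, 47, 52, 38)]  # made-up data
    gateway = PrivateCounter(users, total_epsilon=1.0)
    print(gateway.count(lambda u: u["age"] > 30, epsilon=0.5))
```

A smaller epsilon means more noise and stronger privacy, which is exactly the quality trade-off described above: each query spends budget, and once it is gone the gateway stops answering.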
That is one possible self-interested reason for them to keep the data private. They could also just care deeply about guarding user data they collect. This could be for either selfish or unselfish reasons -- it's both good business and moral. It's very hard to tell from the outside, because their actions would look largely the same either way.
Personally I think it's pretty naive to believe any company does anything for a single reason. There is a constellation of reasons, some more important than others.
Zero challenges from me on this. I 100% agree.
However, it is doable. It just takes effort.
If the government said "let's make a database of every Jew in America," people would get rightfully riled up. Yet we've allowed a single entity to assemble a database orders of magnitude more detailed.
You can see the early fruition of this by looking at how useful Google Now/map planning etc are.
Computers could be our all knowing assistants.
That's only at a very superficial level.
Imagine the kinds of benefits when this all knowing entity is applied to medical issues. Real substantive progress could be made.
However, today's mismanagement of data could scare people off all knowing entities for good & end up severely limiting real progress.
As with everything else, all that data is ripe for abuse if the necessary legislation & technical limiting/auditing regarding access/use are not first put into play.
This is an interesting case. You're completely correct, but it's basically a special privilege that Jews enjoy (and that has spread to a taboo on asking about religion on the census). If you're part of any other demographic group and you don't want to be counted, you get yelled at for not seeing the grand scientific project of the US census in the proper light.
But I don't really see that a database of every Jew in America is more inherently suggestive of abuse (if you're a Jew) than a database of every black in America is (if you're black).
Any group with (a) a living memory of violent ostracisation and (b) some ability to conceal their group membership knows this.
Gay men and women, for example, would fight against a sexual orientation question on the census. (One wonders if Tim Cook, had he not been born a gay man in Mobile, Alabama, would be as sensitive to privacy issues.)
Side note: I used Jews in my example because we have a case study for them. In countries where databases of religious affiliation were kept, oftentimes for tax purposes, the Nazis took advantage.
> For the LGBT question, the exact opposite is happening: People who want a head count of gays and transgender people believe the data will then be valuable in influencing federal policies and spending on projects that benefit LGBT people—or, more accurately, to benefit certain LGBT organizations.
See also the effort by certain ethnic groups to split into a special MENA category on the census, despite having successfully passed as white for the entirety of history.
It's not about ability to conceal group membership.
OTOH, anyone who says they clearly understand the implications of GDPR for their site has either spent a lot of money on lawyers or is lying. Let alone someone who has implemented it. Privacy by design requires deletion of data after legitimate interests and/or consent have expired, probably (!!!) in 3rd party systems. How, precisely, do you implement that?
Can you shadow-delete accounts for some period of time to allow users to change their minds? If no, what UI do you put on a "delete my account" button that has absolutely no undo, even in the 24h regrets period?
Do people have GDPR privacy rights over eg comments on YC that may mention them by nym?
Given the GDPR covers EU residents (not just citizens), as an American can I buy a plane ticket to Dublin and start requesting full data dumps? What rules are those provided to me under, and how do you make software that can do that?
0. You require the third party you passed the data on to delete data when you tell them. The third parties should tell the person that they now have their data, where they got it from, how they will process it and how to get in touch with their data protection officer.
1. You can but you must also allow someone to delete in full (assuming none of the many reasons to reject removal requests apply or you don't wish to exercise them).
2. This is murky, but probably not. There's a right of freedom of expression and information.
3. No, you have to be a resident not a visitor. You'd have to see how Eire define residency.
It's long but the language is far easier than American legalese. The implications depend on your site/service behaviors. An RSS reader is pretty trivial, interactive social media... less so.
> Privacy by design requires deletion of data after legitimate interests and/or consent have expired, probably (!!!) in 3rd party systems. How, precisely, do you implement that?
Privacy by design is a design philosophy, it might be a pain to refactor into an existing system but the design constraints aren't onerous.
If your "3rd party system" is something like AWS, just delete the data. If you're sending it off to some other service, they do need to be GDPR compliant (the law covers this situation).
re: legitimate interests, we partitioned our data. Access logs, for example: one stream gets anonymized for simple analytics, another gets dumped into in-depth weekly analytics jobs, and the final log stream outputs encrypted auto-expiring S3 files with strong access control for infosec purposes. When a user withdraws consent, we just stop logging new information. Truly anonymized data is OK, our in-depth analytics data is purged within 14 days, and InfoSec is a justifiable legitimate interest.
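A rough sketch of that kind of partitioning, for anyone wondering what it might look like in code. Everything here is an assumption for illustration: the function names, the event fields, the choice of IP truncation as the anonymization step, and the reading that withdrawn consent stops only the in-depth stream. Whether a truncated IP counts as "truly anonymized" is a legal question this code does not answer.

```python
import ipaddress
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=14)  # in-depth analytics window from the comment


def truncate_ip(ip: str) -> str:
    """Zero the host bits: /24 for IPv4, /48 for IPv6 (illustrative choice)."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def route_event(event: dict, consented: bool) -> dict:
    """Return the record to write to each log stream for one request."""
    streams = {
        # Simple analytics: no user id, coarse IP only.
        "simple": {"path": event["path"], "ip": truncate_ip(event["ip"])},
        # Infosec: full record; encryption and expiry are assumed to be
        # handled by the storage layer, not shown here.
        "infosec": dict(event),
    }
    if consented:
        # In-depth analytics only while consent is in place.
        streams["in_depth"] = dict(event)
    return streams


def purge_expired(records: list, now: datetime) -> list:
    """Keep only in-depth records younger than the retention window."""
    return [r for r in records if now - r["ts"] < RETENTION]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    event = {"user": "u1", "path": "/home", "ip": "203.0.113.77", "ts": now}
    print(route_event(event, consented=False))
```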
> Can you shadow-delete accounts for some period of time to allow users to change their minds?
Yes. GDPR does not require instant response. You should be transparent about what will be kept and for how long; a clearly communicated 24h shadow-delete is completely reasonable.
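For what it's worth, a shadow-delete with a grace period is not much code. The sketch below is an in-memory toy with hypothetical names (`AccountStore`, `request_deletion`, `purge`), assuming a 24h window; a real system would also have to propagate the final purge to backups and 3rd party processors.

```python
from datetime import datetime, timedelta, timezone

GRACE = timedelta(hours=24)  # the regrets period discussed above


class AccountStore:
    """Toy store: accounts are hidden at once and hard-deleted after GRACE."""

    def __init__(self):
        self._accounts = {}  # user_id -> profile data
        self._pending = {}   # user_id -> time deletion was requested

    def add(self, user_id: str, data: dict) -> None:
        self._accounts[user_id] = data

    def request_deletion(self, user_id: str, now: datetime) -> None:
        if user_id in self._accounts:
            self._pending[user_id] = now

    def cancel_deletion(self, user_id: str, now: datetime) -> bool:
        requested = self._pending.get(user_id)
        if requested is None or now - requested >= GRACE:
            return False  # nothing pending, or the window has closed
        del self._pending[user_id]
        return True

    def is_visible(self, user_id: str) -> bool:
        # Shadow-deleted accounts disappear from the product immediately.
        return user_id in self._accounts and user_id not in self._pending

    def purge(self, now: datetime) -> list:
        """Hard-delete every account whose grace period has elapsed."""
        expired = [u for u, t in self._pending.items() if now - t >= GRACE]
        for user_id in expired:
            self._accounts.pop(user_id, None)
            del self._pending[user_id]
        return expired


if __name__ == "__main__":
    store = AccountStore()
    t0 = datetime.now(timezone.utc)
    store.add("alice", {"email": "alice@example.com"})
    store.request_deletion("alice", t0)
    print(store.is_visible("alice"))                # hidden right away
    print(store.purge(t0 + timedelta(hours=25)))    # gone after the window
```

`purge` would run from a scheduled job, and the UI would state plainly that the account is hidden now and erased for good after 24 hours.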
> Do people have GDPR privacy rights over eg comments on YC that may mention them by nym?
This is a good question, I'm also curious about quotes. The recent Google case suggests both fall under GDPR.
> Given the GDPR covers EU residents (not just citizens), as an American can I buy a plane ticket to Dublin and start requesting full data dumps? What rules are those provided to me under, and how do you make software that can do that?
Assume everyone is covered by GDPR.
Except the GDPR is full of hand-wavy stuff. Who needs a DPO? What is "large scale" in that context? How exactly do you conduct a legitimate interest balancing test? Who is your lead regulator and under what criteria as an American company can you decide?
Also, people have a lot more 3rd party systems than most think. Think transactional mailers, marketing mailers, billing systems, payroll, zendesk, etc.
And even an RSS reader is scary. What if someone follows a series of blogs about HIV treatments, or internal trade union politics? If that means you could infer the person is poz or is a member of that trade union, you now have heightened scrutiny data in your possession.
GDPR has explicit provisions for all of these legitimate interests (notifications, clients, employees, customers). Most of these services are aware of and planning for GDPR, I wouldn't want to work with any that aren't.
> And even an RSS reader is scary. What if someone follows a series of blogs about HIV treatments, or internal trade union politics? If that means you could infer the person is poz or is a member of that trade union, you now have heightened scrutiny data in your possession.
Right, and I like that! Attempting to derive sensitive information should require consent, transparency, right to rectification, and stringent data handling requirements. It sounds like overkill for an RSS reader, but why the heck does an RSS reader need to do that kind of profiling in the first place? Maybe that's the right level of scrutiny and prior applications were unwarranted?
On the other hand, there are no concerns with simply storing the followed blogs.
> Except the GDPR is full of hand-wavy stuff.
Can't win, legislation is either micromanaged or hand-wavy... it's worth noting that some of the hand-waving is actually business friendly.
I'm not saying these laws are perfect. There is definitely room for improvement, but this is still a consumer win over the pre-GDPR wild west.
I never said the RSS reader is profiling. It doesn't have to be. Does the mere presence of the inescapable user data -- ie what feeds they monitor -- create heightened scrutiny because someone else could draw inferences from that data, were it to be leaked? It well may. I would seriously consider blocking EU users until this is sorted out.
Worse, the RSS reader could offer suggested feeds and find itself in possession of such data entirely by accident, even if users were clearly asked whether they wanted to see suggestions or allow their data to be used to suggest feeds. The operator need not intend to derive sensitive data in order to possess it.
Or suppose you have a site like YC, and someone puts "hi, I'm poz" in their description. Tada, sensitive data.
The GDPR should have defined when a DPO is required, what a LI balancing test is, etc. Alternatively, the orgs could have pretended to be competent and issued guidance before -- oh right, they haven't issued final guidance yet. I'm sure 6 weeks is plenty of time.
Is it the hacker news mindset that you should need a law degree to launch a website now?
GDPR is so bad that I would rather make a HIPAA compliant service than a GDPR compliant service.
If that's still too hard, a summarized form can be found at https://www.gdpreu.org
For the internet to function, websites need your information. If you want to log into a website using Facebook login, Facebook needs to know what website you are logging into.
When you watch a YouTube video on someone else's website, in order for that data to be sent to you they need to know what your IP address is and they need to know what website you are viewing the video on.
This is how the internet works. You cannot access something from someone else's servers without them knowing what your IP address is.
Are we invading people's privacy when we log IP addresses when someone visits a website hosted on our servers now?
Right but the main question was whether they were getting data on people from data brokers, and they responded by answering a completely different question.
Agreed. But what if you don't want to log into that website, or maybe that site has no login at all. If it has a Facebook "Like" button, that site still sends a ping back to Facebook, letting them know you were there, and feeding Facebook's algorithm about your interests. Same goes for Google Analytics, a Twitter share button, or the growing list of tracking scripts which get executed upon page load without even showing a visual indicator that they serve some purpose.
I'd guess that the vast majority of visitors have no idea that by visiting a 3rd party site, they're feeding that visit to the list of 3rd party trackers you find on many sites today.
People aren't. That's why we have laws like lemon laws: so that not everyone has to be a specialist.
I'm trying to understand what you people who are reacting to this Facebook incident so emotionally actually want done.
Do you want logging IP addresses to be banned? Do you want cookies to be banned? Do you want embeddable HTML to be banned?
Because all I see is outrage without any real suggestions.
"Facebook sent a doctor on a secret mission to ask hospitals to share patient data"
"Facebook held a special breakfast for drug marketers about recruiting people for clinical trials"
"How Facebook can ‘unblind’ a clinical trial"
I'm the same person as the other account btw, not trying to hide that.
Also I'm surprised that nobody has mentioned how important this data is for A/B testing.
Saying "But they do it too!" is no defense. Though it does make the argument for stronger regulation.
But whataboutism misses an important point: we cannot get everything all at once. Everything happens one step at a time.
Maybe Zuckerberg should do the same thing Larry Page did. Create a parent company for Facebook (maybe call it Library), of which Zuckerberg becomes the CEO. Then find someone else to be the CEO of Facebook. In addition, just like Alphabet and other "Bets", Library could have other "Books".
Then, instead of Zuckerberg CEO of Facebook, you have Zuckerberg CEO of Library.
The Alphabet restructuring didn't obfuscate similar issues, nor would Zuckerberg following suit. Facebook's issues are related to unfettered access to a user, their graph(s), and their sensitive information. More importantly, they knew of the risk and did little in response.
Obviously Google and Apple have their own privacy issues to account for, but it's fundamentally different.