(1) This is probably a project by a small "research" group at Facebook. The goal of that group is probably to publish papers in a Psych journal or something like that about how they were able to correlate anonymized medical data with Facebook feed updates. Tech companies have these research groups for prestige, they are not central to the company mission.
(2) According to the article, the project was never actually started. So it sounds like a bit of a non-story.
(1) Or it could have been just another way to add a little more data to your profile. And even if you are right (I don't think you are), do you really want a company to do psychological experiments on you without your knowledge?
(2) Yeah, but it wasn't started because of the pile-on. You can't give Facebook credit for not starting this.
Facebook has been screwing with people's privacy for years, so it's only fair that they are now criticised for years on end.
This has got to be a new Facebook apology meme: the pile-on.
Until Facebook and privacy are regulated in the US, the way GDPR regulates them in the EU, the pile-on should continue. Must continue, no matter how uncomfortable it is for the pro-Facebook, pro-ads or pro-spyware people.
For me, the result of having HN filled with these Facebook non-stories is that I'm going to stop taking an interest in them. As far as I can tell the situation with Facebook is substantially the same as it's been since it started. Users willfully broadcast information using Facebook, and sometimes Facebook uses that information in ways its users didn't intend or expect. Among the recent slew of dramatic stories I haven't seen anything I found particularly surprising or shocking, so I've started to pattern match anti-Facebook stories as fluff. One day something actually shocking will happen, like Facebook leaking people's private messages or browsing histories. Hopefully when that happens we won't all have reached the point of ennui, boiling frog-style.
What makes you think this is a non-story? It is yet another example of a culture that feeds on a lack of empathy and respect for people's lives.
Yes, it is boring, yes, it is repetitive. And it will continue to be that way until Facebook gains a shred of decency. But I guess that we should just forgive and ignore because there are too many of them?
Maybe you are not the target of these stories, the public at large do feel that they are surprising and shocking.
> the result of having HN filled with these Facebook non-stories is that I'm going to stop taking an interest in them.
> the public at large do feel that they are surprising and shocking.
Not OP, but I don't think the point is that it's a non-story to the public at large. I think the point is that it's a non-story to anyone reading HN, which is decidedly not the public at large.
I get where they're coming from. If I go to reddit or local news and see a bunch of Facebook non-stories I would tend to discount that as the public finally waking up to this. When I come here and see a Facebook article I immediately assume it is important and I need to pay attention because the audience here is so different than reddit/local news. If I can't trust HN to filter for only truly important stories then I'll start to treat it like Reddit and look for a better source for truly important news, which would be a shame.
> If I can't trust HN to filter for only truly important stories
The HN frontpage is a pretty poor proxy for "importance." It's simply whatever the users of HN find interesting. AFAIK, there is no "only upvote truly important stories" rule.
That's fair. I was using important to mean "worth spending the time to read" as my interests are fairly well aligned with the HN community (at least as far as my interest in Facebook-related news). Though over time audiences change, so as more of these non-stories continue to proliferate, perhaps HN is becoming more targeted to the interests of the general public and I should accept that or move on.
I don’t know why you’ve been downvoted, you actually raised a serious point.
Two days ago, Facebook’s CTO admitted that the search-by-phone-number functionality has been used by bad actors all over to rake in public profile information from potentially all users.
We all barely flinched.
Facebook’s PR strategy is now coming straight out of Trump’s PR playbook.
Is it not shocking to you how blatantly Facebook breaches people's expectations about the use of their data?
People share data with Facebook for a particular, immediate benefit to themselves. I share my location so my friends can see where I am, I post my photos so my friends can see what I'm doing and who I'm with, I share my contacts so I can find my friends, etc. In and of itself, this should be fine and safe to do. The problem comes when Facebook takes the data that was given to them for one purpose, in one context, and they use it for another purpose now or in the future.
I can't make informed consent when it comes to data, because the real value of data only comes from when it's aggregated with other data -- either my own over time, or other peoples'. I can't know what incremental effect this datum has when it's combined with everything else Facebook knows about me, and all their other users, and run through their current or future machine learning algorithm. So, it's impossible to know whether it's in my interest to disclose any particular bit of information to them.
Details that are innocuous to human eyes can be very salient to algorithms. I might disclose a set of data points and never make any connection between them. I might mention I feel tired on one day, and write with a negative tone on a few other days, and wake up (i.e., open Facebook for the first time in the morning) later than usual. Without my realizing it, that's enough information for Facebook to make a confident inference that I'm depressed, an inference that amounts to discovering private information I never intended to reveal. Of course, they don't disclose that they know that about me, but they do use it against me. They may target ads for anti-depressant drugs, or they may invisibly bias my news feed to have more negative content.
That's even assuming I'm aware that I'm disclosing information at all. If I log into Facebook to see what my friends are up to, then close the tab and start browsing the web, Facebook knows where I go on the web any time I visit a page with Facebook comments even if I don't post any comments. The content of the page, combined with other data they know about me and the other visitors to the site, can be combined to make inferences about me, my interests and hobbies, my sex or sexual orientation, race, socioeconomic class, medical conditions, vices, and so on.
We can't expect every person to become an expert on data analysis so they can fully understand the implications of disclosing their data.
They're hardly alone, just the most visible. Let's remember that many of our fellow hackers are likely Facebook engineers doing regular work, and are probably feeling pretty bad right now, even though the decisions behind what's going on are way above their pay grade.
Yes there's some responsibility due them as well, but perhaps we can remain civil to our silent peers?
>Let's remember that many of our fellow hackers are likely Facebook engineers doing regular work, and are probably feeling pretty bad right now, even though the decisions behind what's going on are way above their pay grade.
Didn't we establish that "just following orders" doesn't cut it decades ago?
Barely. I think FB engineers deserve more than “some” of the blame here. Surely they could see the negative consequences of linking health data into their graph.
The ones I know are proud of the fact they can pull in thousands of data sources. They will show off what they can do using big data, AI and other bingo terms with your data.
Agreed. If you can get hired as an engineer at FB, you can get hired as an engineer at a less shitty company.
If FB started hiring unskilled laborers and promised to train them up to be a software dev, I could see this argument having some weight. But, AFAIK, they don’t.
I think this story could appear harmless - or perhaps not. The potential gold mine of information that insurers and the health industry could get from this kind of data could be staggering, particularly when we think about claims or the price of the insurance policy.
This is what many of us have been suspecting for some time so this confirms our suspicions. I don't think it's a non-story.
If such research were genuine, I see no reason why medical professionals couldn't get this information in a wholly transparent manner, without a middleman selling your data about potentially sensitive issues.
Quite telling that it has been 'put on hold'. That in itself is a story.
> Quite telling that it has been 'put on hold'. That in itself is a story.
It's been put on hold because Facebook PR knows that people don't read the details of stories. By and large, they read headlines - and many of those are misleading at best. The issue with CA, for example, wasn't a data breach - the data they had was collected in compliance with Facebook's rules at the time. Yet many headlines and soundbites have used the term "data breach" throughout this incident.
So, when 90% of the population incorrectly believes, based on some soundbites and a couple of headlines, that Trump hired CA to hack into Facebook and steal their data, read their minds, and steal the election, you don't want to go forward with something else that might sound scary in yet another mischaracterized headline or soundbite. If putting this project on pause is a story, that story is that Facebook has a PR department that understands its audience....I don't think it says anything one way or another about this project.
"Data breach" means an unauthorised access or use of data. Cambridge Analytica was not authorised to access or use the users' data. Therefore, it's a data breach.
It makes no difference if the breach uses a zero-day exploit to access FB's database, or if it uses social engineering to get someone at Facebook to send them a hard drive, or if it's some researcher being given access under false pretences.
"Data breach" is a catch-all like "homicide": that term encompasses murder but also involuntary manslaughter, euthanasia, and capital punishment.
>It makes no difference if the breach uses a zero-day exploit to access FB's database, or if it uses social engineering to get someone at Facebook to send them a hard drive
It makes an enormous difference because it affects what the public should reasonably be afraid of in the future.
Scenario 1 (what actually happened): Facebook used to have bad app policies that were too permissive, and political candidates like Obama and Trump abused data obtained under those policies. They were changed 4 years ago, and this behavior has not been possible since then.
Scenario 2 (what the media is implying to get clicks): Breach! Breach! We have a breach! Highly paid hackers are breaking into Facebook, stealing your data, and using it to brainwash you! Facebook is incapable of securing your information and therefore we must ensure that they never get any information about anyone ever again!
So, your personal definition of a “data breach” notwithstanding, it is both alarmist and inaccurate to use that term in describing the CA situation. Where news headlines are concerned, the most commonly accepted definition of that phrase, which is being intentionally used to conjure up false images of scenario 2 above, is the only thing that matters.
> (2) According to the article, the project was never actually started. So it sounds like a bit of a non-story.
According to the article, the reason it was never actually started was the "Cambridge Analytica data leak scandal". Very important distinction.
It sounds to me like the researcher wanted to use this data to identify at-risk groups and get them preventive care.
I don't mean "identify" as in de-anonymize them, but to use data to figure out who may need help (or who to advertise to -- same difference really).
This actually sounds like a good idea if that's the purpose.
It's not any different than retailers identifying cohorts they can sell to.
It also sounds very similar to what a lot of AI research does (in my limited understanding of it). Take known samples and use that to identify/predict other things...
Sure, all I see is that insurance companies will be the ones with the most to gain (economically) from this information, and thus the ones that will pay the most for it.
Insurance companies already have tons of relevant medical data on their clients, there's no reason why whatever FB can provide them is going to be miraculously more telling than actual medical data such as their current and past diagnoses and medical conditions.
Insurance companies are perhaps surprisingly often ignorant about their own medical data and what it means for their clients or patients (except for what it means in terms of revenue). Entire profitable businesses (who aren't insurers) sell analytics products based on these data to companies because insurers don't have the means or know how to.
> Insurance companies are perhaps surprisingly often ignorant about their own medical data and what it means for their clients or patients (except for what it means in terms of revenue).
You may very well be correct, but how do you know that?
> because insurers don't have the means or know how to.
Again, where do you get your insight into the inner workings of insurance companies in general?
I'm not disagreeing with you outright. I'm just wondering whether you're guessing or you have actual knowledge of one or more health insurance companies and their actuarial and analytics processes.
Yes, I'm sure Facebook's goals are entirely altruistic just as they were when they performed psychological experiments on hundreds of thousands of people without consent.
Why are we still giving this company the benefit of the doubt after years of blatant abuse?
Advertising corporations shouldn't be anywhere near my medical records, period.
> (1) This is probably a project by a small "research" group at Facebook.
With blessings from on high, no doubt.
> (2) According to the article, the project was never actually started.
Oh yes it was -- the FB spokesperson said it was in the "planning stages". That's quite definitely a form of "starting" (especially for large companies).
If a local waste disposal company were to acknowledge that it was in the "planning" stages of, say, a major incineration facility on that vacant lot down the street your kids used to play in... you wouldn't say this was "a bit of a non-story", now would you?
"Hey, major waste company, how about you partner with Facebook and let us sift through people's trash using AI robots before it goes in the landfill? It's for....uhhh, science...yeah, that's the ticket"
For years privacy-minded people have complained about the lack of recognition of privacy issues among the masses. Suddenly, story after story about aspects of Facebook's anti-privacy practices are being read by the public. This will lead to more stories about other data mining entities as well.
The more this keeps up, the more concrete privacy issues will be in the minds of many.
And yet, HN is full of dismissals. I've decided that to many here, the idea of being in an "elite", informed group is more important than the actual issues.
Given that you are not the target audience of these stories, are you really in a position to judge whether they have reached "dead horse" status?
>the idea of being in an "elite", informed group is more important than the actual issues.
Do you mean to say it's voyeurism, i.e. the draw of being "informed"? Sitting in the eye of the panopticon?
I think people just have a tendency to focus on the positive things that might be gained from data mining. There's just a lot of naivete about how compromised an ad-supported business ultimately is.
Only if the subgroup "people who are tired of all this privacy stuff" is representative of HN's readership. There are more people reading than will ever comment. That's true of any community.
I find it funny how we complain so often about the news moving too fast and major stories becoming irrelevant in a day or two, yet when something stays in the news for more than a week people gripe about how much attention we give the topic.
many posts on HN about facebook are not even from within the last few months. some of them are from several years ago, but are only being reposted now for sweet internet points and bandwagoning.
> many posts on HN about facebook are not even from within the last few months. some of them are from several years ago
HN isn't just for "breaking news." It has an established culture of re-posting "old" stories if they're interesting or informative in the context of more recent events.
The situation regarding Facebook and privacy is very much unresolved. How can you call it a dead horse? Facebook is very much a large, living, dangerous horse. I think we shouldn't let up until the horse changes its spots or dies.
Both parties have hashes and source data. If they are able to match people in their lists, then they can find out who they are. So it is a matter of seconds I think.
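To make the matching step concrete, here is a minimal sketch. It assumes the shared identifier is a plain SHA-256 of a normalized name plus date of birth; that scheme, the function name `hash_identity`, and all the records are made up for illustration, since the thread does not say what was actually proposed.

```python
import hashlib


def hash_identity(name: str, dob: str) -> str:
    """Hash a normalized name + date of birth (hypothetical scheme)."""
    normalized = f"{name.strip().lower()}|{dob}"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Party A (say, a hospital) shares hashes alongside "anonymized" records.
hospital_records = {
    hash_identity("Alice Example", "1980-01-02"): "diagnosis: asthma",
    hash_identity("Bob Example", "1975-06-30"): "diagnosis: diabetes",
}

# Party B holds the source data for its own users, so it can compute
# exactly the same hashes and look them up.
platform_users = [
    ("alice example ", "1980-01-02"),
    ("Carol Example", "1990-12-12"),
]

matches = {}
for name, dob in platform_users:
    h = hash_identity(name, dob)
    if h in hospital_records:
        matches[name.strip()] = hospital_records[h]

print(matches)
```

The lookup is a dictionary hit per user, so whoever holds the source data can re-identify every overlapping record about as fast as they can hash their own list.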
It seems like some are forgetting the near 6000 word 'manifesto' where this was all outlined by Mark.
"In times like these, the most important thing we at Facebook can do is develop the social infrastructure... "
"For the past decade, Facebook has focused on connecting friends and families. With that foundation, our next focus will be developing the social infrastructure for community -- for supporting us, for keeping us safe, for informing us, for civic engagement,"
" I have long expected more organizations and startups to build health and safety tools using technology, and I have been surprised by how little of what must be built has even been attempted. There is a real opportunity to build global safety infrastructure, and I have directed Facebook to invest more and more resources into serving this need."
This is barely a year old. Did he not lay out a vision where FB powered our civic, social & community services?
They deserve a pile on. I don't want any social media company having any access to my health data, especially if I haven't authorized it.
(1) There is no guarantee they will accurately associate that "anonymized" data with my profile.
(2) There is no guarantee they will "do no harm" with that data. It's a way around existing HIPAA protections and something a lot of organizations would pay for if they could.
Although the medical data itself may be "anonymized", surely FB is in a position to associate that data with actual people, given that they know so much about a person's schedule, location, searches and private messages.
Deanonymization of medical data is actually pretty easy if you know a little about your target (age group, height, a set of pre-existing medical conditions limit the set of potential people considerably).
Just an aside...of the many things I have entered into Facebook over the years, I am 100% certain that I have never given them my height or pre-existing medical conditions.
There may be some other ways to link it up, at least with a degree of confidence. It really depends on what information is shared from the medical community and by the patient on Facebook.
A couple points of speculation:
* Facebook may possess a machine learning algorithm which can estimate weight from pictures. Getting within 5 pounds would eliminate most other people.
* Facebook could make photos of you and estimated weights into a time series, and pair up appointment dates with photos shared.
* Given enough photos with you and other people, they could probably estimate your height reasonably well. We know height distributions by age and race. If you're a Caucasian 21 year old female and consistently on average 10% shorter than the Caucasian males you're standing next to, that gives some info.
* Many people have willingly given the familial relationships to Facebook (tagging people as mom, dad, cousin, etc.) which will only help in being confident of race and the various risk factors which are higher in each race.
* Facebook knows your gender, which cuts out about half of the people. Such a basic fact would almost certainly be shared by the medical community.
* Facebook either has your birthday or could estimate it based on how you look. Again, being 98% confident of your age +/- 3 years cuts out most people.
All these fuzzy signals added up could lead to a reasonably confident matching up.
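As a rough illustration of how those signals stack, here is a toy sketch that filters a made-up candidate pool one attribute at a time. Every user, value, and tolerance below is hypothetical, and real estimates from photos would be noisier than exact fields.

```python
# Hypothetical pool of platform users with estimated attributes.
candidates = [
    {"id": "u1", "gender": "F", "age": 21, "weight_lb": 130},
    {"id": "u2", "gender": "M", "age": 22, "weight_lb": 131},
    {"id": "u3", "gender": "F", "age": 45, "weight_lb": 128},
    {"id": "u4", "gender": "F", "age": 23, "weight_lb": 170},
    {"id": "u5", "gender": "F", "age": 20, "weight_lb": 150},
]

# Hypothetical "anonymized" medical record with the same attributes.
record = {"gender": "F", "age": 22, "weight_lb": 132}


def narrow(cands, rec, age_tol=3, weight_tol=5):
    """Apply each fuzzy signal in turn; return survivors and pool size per step."""
    steps = []
    remaining = [c for c in cands if c["gender"] == rec["gender"]]
    steps.append(len(remaining))
    remaining = [c for c in remaining if abs(c["age"] - rec["age"]) <= age_tol]
    steps.append(len(remaining))
    remaining = [
        c for c in remaining
        if abs(c["weight_lb"] - rec["weight_lb"]) <= weight_tol
    ]
    steps.append(len(remaining))
    return remaining, steps


survivors, steps = narrow(candidates, record)
print(steps, [c["id"] for c in survivors])
```

No single filter identifies anyone, but the pool shrinks at every step, which is the whole argument in miniature.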
Anonymous data release is difficult. About 87% of people are uniquely identifiable by their date of birth, zip code, and gender.
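A quick way to check this on any dataset is to count how many rows have a combination of quasi-identifiers that nobody else shares. The sketch below uses five invented rows, so the fraction it prints says nothing about the 87% figure; it only shows the measurement.

```python
from collections import Counter

# Hypothetical rows: (date of birth, zip code, gender).
records = [
    ("1980-01-02", "94110", "F"),
    ("1980-01-02", "94110", "M"),
    ("1975-06-30", "94305", "M"),
    ("1975-06-30", "94305", "M"),
    ("1990-12-12", "94040", "F"),
]


def unique_fraction(rows):
    """Fraction of rows whose quasi-identifier combination appears exactly once."""
    counts = Counter(rows)
    unique = sum(1 for row in rows if counts[row] == 1)
    return unique / len(rows)


print(unique_fraction(records))
```

Any row counted as unique here can be pinned to one person by anyone who knows those three fields from another source.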
The scary thing about data is that a lot of it can be inferred from other innocent data, such as photos. Height can be inferred simply from the photos and videos you upload to social media.
Yeah, it's the ole "encrypted in transit" (implying unencrypted at rest) corporate spin gimmick. The data is "anonymized"... By itself. But if you just so happen to have some kind of reference dataset or index, well...
In light of this news, the fact that Zuckerberg General Hospital exists is in some ways completely irrelevant/cosmetic, but is simultaneously kind of hideous.
I mean, I’m no FB fan at all, but why? Assuming it’s named that because he made a donation and he’s not using his leverage at the hospital inappropriately, isn’t it only fair to give him some credit when his surveillance money is used for good? Similar to Bill Gates.
The problem is that the number of inputs is limited and it's trivial to enumerate the input values. Let's take a contrived example: we have the data of a small, entirely made-up island where only two families live, so we have two surnames. Let's name them Foo and Bar. Now, they have a funny tradition: they all get first names based on the order in which they were born. So we have Firstborn and Secondborn. Let's also, for simplicity, assume that each couple has exactly two children. That gives us the following 4 possible combinations of names:
Firstborn Foo
Firstborn Bar
Secondborn Foo
Secondborn Bar
Let's assume that there are 10 million of those people and we hash their names with a salt. That gives us 10 million unique hashes. But to break each hash, we only need to try at most 4 times, so that's 40 million tries. Hashing speed varies with the hash and the hardware, but good old md5 easily achieved a few million hashes per second on a stock CPU in 2012. GPUs are usually around two orders of magnitude faster. So in the worst case, your desktop PC could break all those 40 million hashes in a few seconds without breaking a sweat. Better hashes are slower, but with such a limited input space, even the best hashes are breakable.
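The island example can be sketched in a few lines. This assumes the attacker knows the salt (it is normally stored next to the hash) and that the hash is md5 over salt plus name; the helper names `salted_hash` and `crack` are invented for the example.

```python
import hashlib
import os
from itertools import product

FIRST_NAMES = ["Firstborn", "Secondborn"]
SURNAMES = ["Foo", "Bar"]


def salted_hash(name: str, salt: bytes) -> str:
    """md5 over salt + name, as in the contrived island example."""
    return hashlib.md5(salt + name.encode("utf-8")).hexdigest()


def crack(target_hash: str, salt: bytes):
    """Try every possible name; at most 4 attempts per hash."""
    for first, last in product(FIRST_NAMES, SURNAMES):
        candidate = f"{first} {last}"
        if salted_hash(candidate, salt) == target_hash:
            return candidate
    return None


salt = os.urandom(16)  # per-entry salt, known to the attacker
stored = salted_hash("Secondborn Foo", salt)
recovered = crack(stored, salt)
print(recovered)
```

The salt stops precomputed tables from being reused across entries, but it does nothing about an input space this small: the loop simply runs again for each entry.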
The salt must never be a constant, the entire point of a salt is that two identical inputs do not hash to the same value. However, it must be stored alongside the hash, so that you can later verify the hashed value. Many modern password hash functions (bcrypt for example) do store the salt as part of the hash.
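For anyone who wants to see that behaviour, here is a small sketch using PBKDF2 from Python's standard library rather than bcrypt (a substitution made only to stay dependency-free). The helper names and the iteration count are assumptions for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def make_record(value: str) -> tuple:
    """Return (salt, digest); the salt is stored alongside the digest."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", value.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify(value: str, record: tuple) -> bool:
    """Re-hash the value with the stored salt and compare in constant time."""
    salt, digest = record
    candidate = hashlib.pbkdf2_hmac("sha256", value.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)


rec_a = make_record("Firstborn Foo")
rec_b = make_record("Firstborn Foo")

print(rec_a[1] != rec_b[1])            # same input, different digests
print(verify("Firstborn Foo", rec_a))  # stored salt makes verification possible
```

Identical inputs end up with different digests, yet each one can still be verified because its salt travels with it.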
That's not the point. Salt is constant, but different for each entry. They can encrypt the salt and when they share it with hospitals, those can't reverse the hash but FB can. Doesn't it solve the problem?
FB seems to have (rightly) realised that this is one of those 'just because we can, doesn't mean we should' cases, especially given the current situation. It's a bit of a non-story, and if I squint I can see the possible academic value in it, but given how battered FB's reputation is right now it's definitely not the time to try something like this.
I am sure that the privacy issues could be overcome with a properly run experiment, however there probably needs to be some rigour around that (possibly more than what was going to be provided given FB's history).
HIPAA laws (which protect the privacy of patient health information) have some real teeth. If Facebook was not scared off by them, their partners -- the hospitals -- may well have been.
I don't understand. Why? What was the motivation to conduct such talks?
Facebook says that what it does is connect people. How does a patient's medical data contribute to that? "You were admitted to the local hospital last week, connect with people who shared that experience with you?"
It doesn't make sense at all!
If I were an owner of FB or on this research team, I'd put all my skill into deanonymizing that HIPAA medical data, because it's clearly a gold mine for insurance and pharmaceutical companies. It would be really easy to sell those users and their data, because demand is high.
I was wondering something similar. If Party B acquires Party A's anonymized-but-subject-to-HIPAA data and successfully deanonymizes it, who is liable? If the data is deanonymized, doesn't this mean the data wasn't sufficiently anonymized to begin with and Party A has some liability? Is Party B also liable since their goal from the start was to deanonymize the data?
Hopefully anonymization techniques that are susceptible to ICA are no longer HIPAA best practice. Or perhaps this is a study to prove that newer additive and multiplicative techniques are also susceptible to de-anonymization attacks.
I have to sign documents saying who can view my info when I go for a checkup.
I'd find another doctor if Facebook ever showed up on there.
I would think it would be illegal for the medical side to share and for Facebook to use their massive data collection in this manner if it's not buried in their impenetrable privacy statement.
It seems like a perfect pairing given the rise of the "fitness trackers" that are so popular. They could build vastly better risk models for [to sell to] the insurance companies with access to "anonymous" health history combined with all the data that the fitness trackers collect.
I am curious about the "cryptographic hashing technique" being proposed. How does that work? Is it just a hash of the name / dob / other identifying info? Does it somehow include matching faces?
Facebook might have collected a lot of data in India, especially in the southern states, where they advertised blood donation camps and asked people to volunteer.
They might have a lot of data on people by now, as many of my ex-classmates joined the blood donation drives.
People don't care about privacy anyway (at least where I live) until someone explains the implications to them. :/
I can't prove that they collected the volunteers' data, though, as I didn't take part in it.
The initial partnership was with Stanford hospitals. Many in the HN community live in the Bay Area and may have used these medical facilities. If you are concerned, other boards are suggesting patients can contact:
James Laflin,
Stanford School of Medicine Ombudsperson:
jlaflin@stanford.edu / 650-498-5744
David Entwistle,
CEO Stanford Healthcare: 650-723-4000
It could be an epidemiological study on aggregate populations for some communicable disease. In all cases the hospital side would be bound by HIPAA to anonymize any data they provided. Google does similar prediction studies based on search, and it is very valuable to the CDC for allocating flu vaccine.
Can we just put a pause on all the Facebook stories for a while? Lots of fluff pieces with no real content making it to the homepage from outlets that are banking off the current outrage.
I actually like it. It's been brewing for a while, and for once the news is keeping a story for more than 3 days. I get tired of constantly changing topics and shifting focus. And of course, the media has a big score to settle with Facebook, whose algorithms decide which outlets to promote, which not, and to whom.
This is generally very true, especially for facebook, but there are cases of pure altruism and fraternity, such as open and free software. Some people really do want the world to be a better place and dedicate small or large amounts of their life's work to improving it for the others.
Some free things are actually wholesome. So we have to learn how to discriminate between whether something free is actually good for us.
Fully agree. But a large for-profit company with investors has an obligation to turn a profit. There can't really be true altruism within those rules unless what appears as altruism supports the for-profit goal of the company because, for some reason, it is aligned with your goals. If you are not paying for it, then its interests are not likely aligned with yours. I don't want to say for-profit companies can't do good, because they do. You just have to be constantly vigilant that your goals and the company's goals stay aligned.
Building 8, which is responsible for this project, is under the direction of the same person who wrote the memo about "questionable practices", incidental deaths and "connecting people".
The previous director quit after less than two years. There are videos on YouTube of motivational speeches for Building 8 projects. I watched one; it felt cult-like.
One of these Building 8 projects, Aloha - a video chat device, was set to launch next month but they have sidelined it, for obvious reasons.
Apparently they took surveys and users did not trust FB; they were worried the device would be used to spy on them.
Then they considered marketing it as "a device for letting the elderly easily communicate with their families." They also considered selling it under a name other than Facebook.
Here's an interesting "letter to Mark Zuckerberg" from a professor of health informatics who has worked with the NHS for the last 34 years.
It discusses the issue of "the creepy line" and how to manage it in terms of getting informed consent to use electronic patient records.
He suggests the NHS has "25 years of data on 50 million people" but because consent is required they cannot extract much meaningful information from it.
He says that in an effort to "get around" this problem, the government proposed the concept of "implied consent".
A former shipyard worker in one of the author's workshops evaluated this concept plainly, thus: "Clearly some London-based bollocks. Nobody implies my consent."