
Comparing Cambridge Analytica, who harvested data through means that were not transparent to users (and for malicious purposes), to NYU, which has explained what data it collects and why AND has the consent of its users, seems disingenuous at best.



The point is that CA's data harvesting looked like it was transparent to users at the time they were doing it — which is precisely the appearance you'd expect a malicious app to try to convey.

The NYU project is probably on the level, but "they're probably on the level" isn't a very good security model at Facebook's scale.

More to the point, the FTC's 2019 Consent Decree [1] makes it fairly clear that FB is responsible for third parties' access to its users' data — and it would be prudent (from FB's point of view) to interpret this responsibility as also covering browser extensions.

[1] https://www.ftc.gov/system/files/documents/cases/c4365facebo...


For a project like this to happen at a major US university (especially once outside funding is involved), it needs approval from the university's Institutional Review Board. Getting IRB approval entails researchers proposing a strict set of guidelines for how the data will be collected, used, and stored; examining the potential for harm to participants; and convincing a room of very risk-averse individuals that the project is safe and bounded in scope.

This is in stark contrast to CA. "They're probably on the level" because they have entire systems in place to keep them there.


The data CA used wasn't collected by them. They got it from a research project at Cambridge University's Psychometrics Center. This is exactly the same situation.


You are a little short on facts. Dr Michal Kosinski and Dr David Stillwell of Cambridge University pioneered the use of Facebook data for psychometric research with a Facebook quiz application called the MyPersonality Quiz.

Aleksandr Kogan was a lecturer at Cambridge who then built his own app based on Stillwell's and Kosinski's app and work. Kogan then turned around and sold his version to SCL, the parent of Cambridge Analytica. And the reason Cambridge Analytica wanted his app was that it worked under the social network's pre-2014 terms of service, which allowed app developers to harvest data not only from the people who installed the app but also from those people's friends.

Stillwell also denied Kogan's request for access to his and Kosinski's myPersonality dataset. So no, the Cambridge Analytica data did not come from Cambridge University or the Psychometrics Centre.

The NYU Ad Observatory's data is completely public and the intended audience of that data is journalists and researchers doing analysis of online political advertising. This is the polar opposite of clandestinely harvesting user data in order to manipulate people.

So no, it's not "exactly" the same situation but rather the exact opposite.


From the Wired magazine explainer on CA:

"That data was acquired via “thisisyourdigitallife,” a third-party app created by a researcher at Cambridge University's Psychometrics Centre. Nearly 300,000 people downloaded it, thereby handing the researcher—and Cambridge Analytica—access to not just their own data, but their friends' as well."

https://www.wired.com/amp-stories/cambridge-analytica-explai...

re: "the exact opposite", you are putting a lot of weight on the intention behind this use. After the public response to CA you might appreciate why FB is going to strictly apply the rules.

But I generally agree that users running an extension in their own browser is a different situation than an app developer subject to the FB ToS and am not sure why FB would be allowed to block this.


Hi, I am David Stillwell. I can confirm that Kogan's app "thisisyourdigitallife" was his own endeavour and unrelated to the Psychometrics Centre. I'm not sure why Wired has written this now. They actually already wrote an extensive article about the Psychometrics Centre here in June 2018 if you want the real story: https://www.wired.com/story/the-man-who-saw-the-dangers-of-c...


Thank you for clarifying. Always nice to get first-hand information.


The "thisisyourdigitallife" app was not developed by the Psychometrics Lab; it was developed by Kogan (a lecturer at Cambridge University), who by then had formed his own company called Global Science Research Ltd (GSR). GSR signed the contract with SCL Elections and sold the Kogan app to them. SCL Elections is the parent of Cambridge Analytica.

Kogan's app was based on the myPersonality app, which was developed by Kosinski and Dr David Stillwell, who worked at the Psychometrics Lab and denied Kogan access to their dataset. Cambridge Analytica and Cambridge University are not the same thing at all. So there is no comparison between NYU and Cambridge Analytica, or Cambridge University for that matter.

Saying I'm "putting a lot of weight on the intention behind this use" is kind of a bizarre statement considering the data is literally available to everybody. See:

https://adobserver.org/ad-database/

The Project also clearly states:

> If you want, you can enter basic demographic information about yourself in the tool to help improve our understanding of why advertisers targeted you. However, we’ll never ask for information that could identify you

And to that end, the code for the plugin that the Ad Observatory project uses is also freely available:

https://github.com/OnlinePoliticalTransparency/social-media-...

How much more transparent can you get than that? The goal of the Ad Observatory project is literally to try to understand how we are being targeted and manipulated. How is this in any way the same as the secret harvesting of data by a political consultancy that billed itself as providing "election management" services?


That makes quite a bit more sense. Thanks for clarifying.

To the grandparent: A researcher selling IRB-protected data would be effectively ending their academic career and opening themselves up to a mountain of legal trouble from the university and anyone who participated in the trial.


To clarify:

WHAT they were doing with the data was not transparent. HOW they were doing the data collection was completely transparent.

The worst of both worlds. Which is to say—we're saying the same thing.

University research projects such as these go through extensive review. The university is basically putting its name on the line for any research project that happens under its watch.

I'm not sure what you're advocating for. Is it that Facebook shouldn't be researched because they do not allow it? Not very sound reasoning to me.


Users have to install a browser extension in order to participate in the study. That's a way higher barrier than the personality quizzes that Cambridge Analytica used.

It also happens at a different layer of abstraction. Cambridge Analytica extracted data through the permissions framework that Facebook itself implemented.

Facebook's interest in its users' data doesn't need further explanation after you see that most of their profits derive from their control over it. The same control that allowed the profitable mass political targeting that these researchers are trying to study.


The researchers ask people to opt in to tracking a restricted amount of data, and then to install an extension that has access to their entire Facebook accounts.

There is no way for Facebook or anyone else to prove that the current or a future version of NYU's extension won't scrape more data than people agreed to.


> There is no way for Facebook or anyone else to prove that the current or a future version of NYU's extension won't scrape more data than people agreed to.

How so? The extension is open source, anyone can audit it.


The plugins are just JavaScript, so verifying that is actually a trivial task: you just open the plugin and read the source. NYU could also provide the code to make it even easier.
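Since extensions ship as readable JavaScript, an audit can be as simple as grepping the unpacked source for anything that could phone home. A minimal sketch — the directory layout and file contents below are hypothetical stand-ins for a real unpacked WebExtension:

```shell
# Create a stand-in unpacked extension directory (illustrative only).
mkdir -p ext_audit
cat > ext_audit/content.js <<'EOF'
// Collects ad elements from the page; no identifying fields.
document.querySelectorAll('[data-ad]');
EOF

# Flag every place the code could make a network request.
grep -rn -e 'fetch(' -e 'XMLHttpRequest' ext_audit || echo "no network calls found"
```

This only catches the obvious exfiltration paths, of course; a real audit would also check the manifest's host permissions and any bundled libraries.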


You cannot verify that the researchers won't change the plugin to malware in the future.


You cannot verify that Facebook will not change its product to malware in the future. That is to say, at some point, you trust the software publisher in the same way you trust the service operator.


> You cannot verify that Facebook will not change its product to malware in the future.

I apologize for wasting people's time, but I can't resist taking the low-hanging fruit here.

Facebook is malware.


It's fine for you to trust the software publisher; that doesn't mean Facebook should, especially when they're legally liable for data breaches that could result from it.


Wait a second. How does Facebook trust Firefox? Microsoft Edge? Safari? The other 20 extensions I have installed, three of which save a copy of every single page I visit?

They don’t. They don’t, at the least, care about anyone’s data - they just phrase it that way to sound legitimate because saying “we want no oversight whatsoever” sounds whiny, and it is. (And so does what they ARE claiming to anyone who understands the technical side).


If Facebook is concerned about data breaches from a browser plug-in, why don't they just stop serving the data to the browser? If the data is that valuable and easy to get, it wouldn't be hard for someone to write malware that collects the data and phones home once in a while.


The whole point is that a major problem with CA was the friends' data collection at scale. The NYU app's scraping modality could easily do the same thing, which violates FB's present consent/sharing model in which you control whether your data goes to third-party apps. FB has to fight as hard as possible against such apps. Remember Clearview AI? If we want FB to fight CA and Clearview, they must fight here as well.


> If we want FB to fight CA and Clearview they must fight here as well.

Or they could partner with NYU, offer technical insight to maintain integrity and privacy (me stifles laughter) and do everything to support researchers who potentially could help build trust in their platform.

Going after this group just isn't a good look if you're Facebook. If there are valid concerns then don't start with a Cease and Desist.


They might be willing to partner if NYU is willing to indemnify Facebook against any and all liabilities which may result. How likely is NYU to take on that risk? Why should we expect Facebook to take on the risk for NYU?


So is your opinion that Facebook just shouldn't be researched?


I don't really have a view on that, but I think researchers and universities should be held fully liable for the harms they cause, that way, they'll be more careful.

Some research just isn't worth the risk, but as an outsider, I'm not in a place to make that judgement. NYU could also insure against data breaches; in that case, we might get some good security audits.


Hang on. The whole chain of reasoning started with FB protecting users' interests through the permission system, which NYU ostensibly circumvented. How is it in the users' interests to indemnify Facebook?


If NYU internalizes the cost of all breaches (by indemnifying FB against harm), they will be very careful with the data, and prevent another Cambridge Analytica problem.


> The NYU app scraping modality could easily do the same thing

So could any browser extension with the ol' "read and modify your data on \*" permission. Or any browser. Or any third-party Facebook client.

There is a difference between being technically capable of doing a thing and actually doing the thing, especially in cases where the software authors are well-known and relatively easy to hold accountable. To say otherwise is a little bit goofy!


> especially in cases where the software authors are well-known and relatively easy to hold accountable

Like a certain lecturer and senior researcher at University of Cambridge?

https://en.wikipedia.org/wiki/Aleksandr_Kogan


Suppose NYU sent a person to sit behind every participant and take a photo of their screen each time it changes. That would be exactly what NYU is doing (except more expensive); the participant knows that and gave their consent. It is within their rights to show their screen to anyone.

They are just doing it more economically than sending a person. This is entirely unlike CA, which effectively sent a person to go through all of a participant's available information as quickly as possible while they weren't looking and store a copy of everything.


Sure. So what exactly is the binding rule which Facebook should apply here?

Researchers can get access to anyone's Facebook data if people enable it? What about ones at Chinese universities? Or just respected universities? Which universities are those? How do we decide?

You're missing the point. There needs to be a black-and-white line, and whatever Facebook allows, they're always demonised; nobody gives them the benefit of the doubt.


> Researchers can get access to anyone's Facebook data if people enable it?

Yes. Where is the problem?


This is ironic. Cambridge Analytica was a university with an IRB collecting personal data, then later sold to for-profits.



