
Facebook URLs Dataset Now Available for Academic Research - metahost
https://socialscience.one/blog/unprecedented-facebook-urls-dataset-now-available-research-through-social-science-one
======
echelon
So the researchers have to sign NDAs. Great, that certainly protects us from
leaks.

But what happens when the researchers are hacked or are negligent? That seems
to have a nonzero possibility.

I don't think academics are necessarily experts in security.

~~~
anchpop
The dataset is anonymized using differential privacy. In theory, someone
looking at the data would not be able to tell if any particular individual
were included.
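
To make that guarantee concrete, here is a minimal sketch of one standard
differential-privacy technique, the Laplace mechanism applied to a counting
query. It is an illustration only: the thread does not say which mechanism
the Facebook dataset uses, and the names `private_count`, `laplace_noise`,
and the `shared` field are made up for this example.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of records matching `predicate`.

    Adding or removing one individual changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy for this single query.
    """
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical toy data: 30 of 100 users shared some URL.
    records = [{"shared": True}] * 30 + [{"shared": False}] * 70
    print(private_count(records, lambda r: r["shared"], epsilon=1.0))
```

The noise has the same magnitude whether the true count is 30 or 30 million,
so aggregates over large groups stay useful while counts for small groups can
be dominated by noise.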

------
mmbit
Research on small communities might be worthless.

------
rvnx
Seems like the worst idea from Facebook to have allowed third parties outside
of Facebook to process the raw data (1 exabyte of data!)

~~~
stri8ed
Why?

~~~
amelius
Privacy.

Also: [https://techcrunch.com/2019/07/24/researchers-spotlight-
the-...](https://techcrunch.com/2019/07/24/researchers-spotlight-the-lie-of-
anonymous-data/)

~~~
Alex3917
The dataset is presumably going only to academic researchers under NDA. It's
not likely that any of them are going to leak the dataset.

~~~
ori_b
Why should I trust people working at universities with my private data?

~~~
nine_k
A better question: why would you trust a company like Facebook with your
private data?

To my mind, if you post something to a _social_ network, and doubly so to
Facebook, you should only post stuff which you wouldn't mind seeing posted
openly on every wall in your town.

~~~
rvnx
Because if done properly, employees' interests are aligned with the interests
of the employer.

If you own significant stock options in Facebook, it's not very wise to cut
the branch you are sitting on.

Additionally, external researchers pose a much higher risk, because they may
seek fame, or need money (employees are generally in a safe position, the
intern at a research group is not). An exclusive data leak can bring both.

Also, they may not have the same security practices as internal teams (deep
packet inspection, etc).

