

 Facebook Sues Data Geek, but That Doesn't Solve Its Privacy Problem  - cookiecaper
http://www.fastcompany.com/1603925/facebook-sues-data-geek-but-still-doesnt-solve-problem

======
petewarden
The 'innocent data compiler' is actually a HN semi-regular. _waves hello_

I've got lots of thoughts on all this, obviously, but I'm trying to collect
them all into a considered blog post. I'm happy to answer any questions I can
though.

And if you're interested in related code, you can check out my Google Profile
crawler over at GitHub:

<http://github.com/petewarden/buzzprofilecrawl>

~~~
bootload
_"... Facebook doesn't look totally evil though. According to Warden: 'From my
conversations with technical folks at Facebook, there seems to be a real
commitment to figuring out safeguards around the widespread availability of
this data.' ..."_

Hi Pete, I've been following your threads [0] with interest seeing how the
social graph can be interpreted in code. The above quote in particular stood
out though. I'm not sure if the _"no evil"_ tag applies where you can access
the official Fb API and developer program then expect good to come out of it.
[1], [2]

    
    
      “That pulls the rug out from a whole policy &
       technology perspective that the point is to give 
       you control over your information - because you 
       don’t have control over your information.” 
       
       Hal Abelson
    

It appears that while Fb is trying to tighten up publicly leaked information,
at the same time they allow access to the social graph API, which is just as
damaging to individuals, and potentially more so.

[0] <http://news.ycombinator.com/item?id=1199821> &
<http://news.ycombinator.com/item?id=1106859>

[1] PJF, "Dark Stalking on Facebook", <http://pjf.id.au/blog/?position=590>

[2]
[http://www.boston.com/bostonglobe/ideas/articles/2009/09/20/...](http://www.boston.com/bostonglobe/ideas/articles/2009/09/20/project_gaydar_an_mit_experiment_raises_new_questions_about_online_privacy/)

------
Groxx
TOS != legal right to block access to public data. _Especially_ when it's
crawled, where one is not expected to read the TOS. It's equivalent to having
a document posted in the bathroom of a restaurant that says that by walking
into the building you forfeit your right to see.

Too bad it didn't go to court, he could've countersued then. There could be
_definite_ personal damages because of all this, and because FB is flexing
legal-might it doesn't have.

And I cry bullshit on FB trying to protect privacy. Why then is everything
public by default when a new feature comes out? At _every_ step of the
embiggening process, FB has royally screwed over their users' privacy.

~~~
techiferous
"And I cry bullshit on FB trying to protect privacy."

Absolutely. That's why I rarely use it; I've read their privacy policy
(especially the privacy policy concerning facebook applications) and realized
that if I wanted something to be just between me and my friends, facebook is
not the place to do that.

~~~
joe_the_user
I too am dubious of Facebook and avoid posting anything especially private
there.

But being on Facebook, I also know that most of my friends are much less
careful about this and I am happy that Facebook makes _some_ effort to protect
their rather foolish trust. It would be better for them to protect themselves
but still...

~~~
Groxx
By picking something interesting, which anyone can _still_ do (and far more
invasive uses as well), and making a big deal of it? This is a publicity
stunt, more likely, because his work spread quickly, and _normal_ people
started to notice that this said things about them.

------
blahedo
Yes, because making _him_ delete the data definitely makes the problem go
away.

~~~
Frazzydee
Well it might make the problem go away if the problem is well-intentioned
research whose results are made available publicly.

------
jerf
I don't have the complete answer to this problem, but no discussion of
Facebook opening up their data set is complete without mentioning that there
has been a lot of work done on uniquely fingerprinting people on stunningly
small amounts of data. In Facebook's position, I would in all seriousness say
that I see no _reliable_ way for Facebook to release this data in any form
with the reasonable certainty (by legal standards) that the data will not be
used in a privacy-infringing manner. I can imagine some ways, but I sure
wouldn't be willing to guarantee any of them. It is possible and in some sense
perhaps even likely that it is not possible to have both a nontrivially useful
data set, and a privacy-respecting data set. Information theory is a harsh
mistress.
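
To make the fingerprinting point concrete, here is a minimal sketch of how a handful of harmless-looking attributes can single people out. The records, the column choice, and the `unique_fraction` helper are all made up for illustration; nothing here comes from Facebook's data or API.

```python
from collections import Counter

# Hypothetical toy records: (city, birth_year, favorite_band). No names at all.
records = [
    ("Boulder", 1975, "Radiohead"),
    ("Boulder", 1975, "Wilco"),
    ("Boulder", 1982, "Wilco"),
    ("Austin", 1975, "Radiohead"),
    ("Austin", 1990, "Wilco"),
]

def unique_fraction(rows, columns):
    """Return the fraction of rows whose values in `columns` occur exactly once."""
    keys = [tuple(row[i] for i in columns) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)

# Each added attribute shrinks the crowd a row can hide in.
for columns in [(0,), (0, 1), (0, 1, 2)]:
    print(columns, unique_fraction(records, columns))
```

In this toy set nobody is unique by city alone, but every row is unique once all three attributes are combined, even though no row carries a name.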

~~~
robryan
If it is from public profiles though it doesn't matter, you could just go
break privacy already by visiting a person's public profile.

~~~
jerf
A sufficiently large convenient aggregation of otherwise not-easily-obtainable
public data becomes a privacy hazard. (I choose the word "hazard" with care.)
Knowing that theoretically one could go find all fans of $PERSON_OF_INTEREST
with enough work is one thing, being able to type one query into your data set
and get the answer back in two seconds is another.
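
A rough sketch of what "one query" means once the data is aggregated, assuming a hypothetical list of (profile, page) pairs already gathered from public pages. The identifiers and the `build_fan_index` helper are invented for the example.

```python
from collections import defaultdict

# Hypothetical (profile_id, page) pairs collected from public profiles.
likes = [
    ("u1", "band_a"),
    ("u2", "band_a"),
    ("u2", "politician_x"),
    ("u3", "politician_x"),
]

def build_fan_index(pairs):
    """Invert profile -> page pairs into a page -> set-of-profiles index."""
    index = defaultdict(set)
    for profile_id, page in pairs:
        index[page].add(profile_id)
    return index

fan_index = build_fan_index(likes)

# After aggregation, "all fans of X" is a single dictionary lookup.
print(sorted(fan_index["politician_x"]))
```

Building the index is the expensive part (the crawl); after that every lookup is effectively free, which is the difference between "possible with enough work" and "two seconds".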

------
frederickcook
Meta: can we change the title? It is different than the title of the actual
article, and "does evil" is a pretty subjective term.

~~~
cookiecaper
Am sad to have seen this fulfilled.

Original headline was "Facebook does evil to innocent data compiler".

------
jrockway
Whenever I do crawling, I do it from AWS and I set the User-Agent to Google's.

Also, if you're up against someone that threatens to sue, the latency
introduced by Tor might not be too harmful.

~~~
joe_the_user
If you're trying to publish your data in a scholarly work, you aren't going to be
able to conceal your identity or the origin of your data...

~~~
jrockway
And after you publish your work, Facebook can't threaten to sue you if you
don't delete your data. They just have to do it, which they probably won't.
(And if they do, the world benefited from your work already, so the damage
they can inflict is minimal.)

The idea is to keep Facebook from knowing what's going on until the last
possible moment, so they can't interrupt you in the middle of something. Once
you've published your paper, then they can know.

------
yesimahuman
Where does it say they actually sued him? It just says they _threatened_ to,
and he complied with their requests.

------
thasmin
It's possible that Facebook shut this guy down because they don't want
competition when selling their data. The notion that this information is
publicly and legally accessible may take away a revenue stream from them.

