
Facebook Wanted Gizmodo to Kill Investigative Tool - uptown
https://gizmodo.com/facebook-wanted-us-to-kill-this-investigative-tool-1826620111
======
stingraycharles
As background, the problematic part of Facebook’s policy is that, while
end-users are the owners of all content they produce, they [Facebook] essentially
consider all their HTML their copyright. They have argued in court (and won
[1]) that scraping essentially requires creating an in-memory copy of this
HTML before the user-produced content can be extracted, and thus is a
violation of their copyright.

It’s a very unfortunate precedent in my opinion, that limits pretty much all
forms of scraping. Copyright law should not be abused for this, but yet here
we are.

If the Knight Institute manages to get Facebook to amend their policy, that
is a great step in the right direction, but I feel that copyright should not
be abused to restrict scraping like this: the copyrightable content is not
what is being scraped at all, it is immediately thrown away after the user-
produced content is extracted.

This means there is no way for a user to even legally get their own content
extracted from Facebook this way, and on top of that, web browsers make
in-memory copies of Facebook’s HTML all the time; heck, they even specifically
instruct browsers and proxies to cache a lot of their content.
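
To make the "in-memory copy" point concrete, here is a minimal sketch of what extraction typically looks like. The page markup and the `user-post` class name are made up for illustration and are not Facebook's real HTML. The whole page has to be held as a string before the user-written text can be pulled out, and the surrounding markup is discarded right after.

```python
from html.parser import HTMLParser


class PostTextExtractor(HTMLParser):
    """Collects text inside <p class="user-post"> (a hypothetical class name)."""

    def __init__(self):
        super().__init__()
        self._in_post = False
        self.posts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples
        if tag == "p" and ("class", "user-post") in attrs:
            self._in_post = True
            self.posts.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_post = False

    def handle_data(self, data):
        if self._in_post:
            self.posts[-1] += data


def extract_posts(page_html):
    """Return only the user-written text; the site's own markup is not kept."""
    parser = PostTextExtractor()
    parser.feed(page_html)
    parser.close()
    return [post.strip() for post in parser.posts]


# The full page exists in memory here, if only briefly.
page = (
    '<html><body><div class="chrome">Site layout</div>'
    '<p class="user-post">Hello from a user</p></body></html>'
)
posts = extract_posts(page)
del page  # the markup is thrown away; only the user's text remains
print(posts)
```

A browser does the same thing on every page load, just with a rendering step in place of `extract_posts`.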

[1] [http://www.knowmad.law/single-post/scraping](http://www.knowmad.law/single-post/scraping)

~~~
amelius
This is a bit like selling pens, after copyrighting the chemical composition
of the ink, and then insisting that anything an author writes with the pen
belongs to the company.

How can this kind of reasoning survive a sane court of law?

~~~
code_duck
I’m thinking of it like Facebook is a book (well, a journal that’s a
compendium of everyone else’s writing). You’re only allowed to read from it
while Facebook holds the book and turns the pages for you. If you pick up the
book and turn the page yourself, you’ve taken possession of the book, which is
theft since they didn’t give you permission.

~~~
bo1024
I think that's absolutely how Facebook wants it to work (as well as many other
website operators), but absolutely opposite of how the process physically
works. Also, I wouldn't equate copyright violation (especially non-commercial)
and theft.

~~~
code_duck
What I mean is that is how Facebook wants it to work.

------
docker_up
I realized the futility of trying to keep my data private. I used to never
upload my contacts to any site, because I valued my privacy and my contacts'
privacy. Then I found things like my phone number being detected by Facebook,
and I realized that it didn't matter what I did. Any one of the hundreds of
people who had my phone number could upload my information without my consent,
and thus nothing I did mattered. I was infuriated but now I just give up,
because there's no way to control your own information.

Even something as private as DNA is no longer under my own control. A sibling
signed himself up for 23andme, and I realized that once he did that, I'm as
good as being up there too. My entire family tree can now be identified
because of that single person, which is scary as hell.

~~~
colechristensen
Everybody used to be in the phone book, a few people were unlisted and it
seemed a bit strange but otherwise you could just look anybody up in a
directory and call them.

I don't really understand the motivation trying to keep one's phone number or
address secret.

~~~
aroberge
I'm old enough to have lived with a phone number published and not have to
worry about it. Then came the telemarketers. Still, long distance calls were
expensive and such calls were relatively rare. Next, long distance calling
became effectively free and telemarketing calls were becoming frequent.
However, it was still only a dumb phone - so nothing to worry about. Then,
smart phones came along, storing personal information and being potentially
vulnerable to threats. They were also used for authentication. But hey, not to
worry, since it's just like before: everybody used to be in the phone book ...

~~~
24gttghh
First they came for the Landlines. I was not a landline, so I did not speak
up...

------
shawn
I once managed a Facebook account on behalf of someone else. Normally, you
would expect this to confuse Facebook greatly. Instead the opposite happened:
Whenever I was logged in as them, Facebook recommended me people that they
knew.

This means you can find out who someone knows if (a) they’re not already a
Facebook user, and (b) you create a profile for them. Whoopsie.

I kind of want to write a dystopian story about babies being bartered based on
their social network rating, which of course is derived from their family
history. We’re all reduced to numbers in the end.

hyper-reality: [https://vimeo.com/166807261](https://vimeo.com/166807261)

~~~
iamdave
_I kind of want to write a dystopian story about babies being bartered based
on their social network rating, which of course is derived from their family
history._

I'd have to go look for it but I swear there's an episode of either Next
Generation or Deep Space 9 that covers almost this exact topic to the letter.
Anyone else know what I'm thinking of??

~~~
shawn
Critical Care
[https://en.m.wikipedia.org/wiki/Critical_Care_(Star_Trek:_Vo...](https://en.m.wikipedia.org/wiki/Critical_Care_\(Star_Trek:_Voyager\))

~~~
iamdave
That's the one (lo and behold it was Voyager, after all)! Curious if that's
where your mind was when you threw the idea of writing that story of yours?

~~~
shawn
Star Trek was one of my biggest influences. I’ve been trying to think of a way
to modernize the series without cannibalizing Roddenberry’s ideals. Or go the
other way and do a Game of Thrones style Star Trek universe.

Also
[https://m.youtube.com/watch?v=Wkedd6A6_mU](https://m.youtube.com/watch?v=Wkedd6A6_mU)
was one of the funniest trek videos I’ve seen.

~~~
redbeard0x0a
Looks like CBS is bringing back Picard for a series -
[https://twitter.com/SirPatStew/status/1025840545216823296](https://twitter.com/SirPatStew/status/1025840545216823296)

------
isoprophlex
Quoth Facebook, from the article:

“We don’t expose this information via our API and we don’t allow accessing or
collecting data from Facebook using automated means"

Oh man what a world we live in. Facebook is free to leech your phone of every
last bit of personal data... but heavens forbid an end user or journalist
tries to learn something about Facebook.

~~~
ehsankia
I agree with your sentiment, but if said journalist sold said data to a
foreign government, how different would that be from what happened with
Cambridge Analytica?

Yes, being able to access your data is nice, but you also have to balance it
with people trying to trick people into leaking their data for nefarious
reasons. Obviously, here, it's an open source project and we can see that it's
(probably) secure, but then where do you draw the line?

~~~
minimaxir
Facebook closed up a lot of API endpoints after the CA incident became big
(including _public_ page post data, to my annoyance).

~~~
ehsankia
And this is one of those APIs that is closed. This application is
intentionally going around it, and if any data ends up leaking, all the people
in this thread demanding access to data will be the first to attack Facebook
for not doing a better job at guarding user data.

~~~
freeone3000
It seems like they were able to get the data just fine, so they're not
guarding user data _now_. I think the best way for facebook to protect this
data is to not have it.

------
zaroth
> _Facebook disagreed and escalated the conversation to their head of policy
> for Facebook’s Platform, who said they didn’t want users entering their
> Facebook credentials anywhere that wasn’t an official Facebook site—because
> anything else is bad security hygiene and could open users up to phishing
> attacks. She said we needed to take our tool off Github within a week._

I mean, this is a fair point. It wouldn’t be the first time that a Github tool
was forked to surreptitiously send information to a 3rd party.

So they updated the tool to let users login to Facebook.com through the
browser and just hijacked the session cookie to gain access to the pages.

Since this is a program which runs on the users machine, and downloads
standard Facebook pages over their standard HTTP interface, I don’t see how
Terms of Service can differentiate between accessing Facebook services through
this program versus Chrome.exe or Edge.exe.

What makes one thing a user agent and another thing not a user agent?
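
As a rough illustration of why the distinction is hard to draw on the wire, here is a sketch that builds two requests without sending anything. The URL, cookie name, and User-Agent strings are all placeholders, not what the PYMK tool or any real browser sends. Given the same session cookie, the only thing that differs is a header the client chooses for itself.

```python
from urllib.request import Request


def build_request(url, session_cookie, user_agent):
    """Build (but do not send) a GET request carrying a session cookie."""
    return Request(
        url,
        headers={
            "Cookie": f"session={session_cookie}",  # hypothetical cookie name
            "User-Agent": user_agent,
        },
    )


url = "https://example.com/some/page"  # placeholder URL
browser_like = build_request(url, "abc123", "Mozilla/5.0 (placeholder browser)")
script_like = build_request(url, "abc123", "my-script/0.1")

for req in (browser_like, script_like):
    # urllib normalizes header names, so "User-Agent" is stored as "User-agent"
    print(req.get_method(), req.full_url, dict(req.header_items()))
```

A server can inspect the User-Agent header, but since the client sets it, a script can send whatever a browser would.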

~~~
throwaway427
I think the difference in intent and purpose between PYMK Inspector and
Edge/Chrome is pretty obvious.

~~~
BLKNSLVR
Yes, Edge/Chrome is for the drone users that are Facebook's bread and butter
whilst PYMK Inspector users have some semblance of awareness of user privacy
and the asymmetry of the situation.

Putting on my 'big business c*nt' hat I can totally understand why Facebook
wouldn't want PYMK users.

------
icu
I became deeply concerned when the article seemed to suggest that Gizmodo was
more worried about losing their Facebook page, or being sued, than sticking to
a high standard of journalistic integrity. By the sound of things Gizmodo was
saved by a Cambridge Analytica deus ex machina.

I can't help but think that there will be ever more regulation because of it.

~~~
eropple
Y'all know that that Facebook page is part of the income stream that helps
them keep the lights on for that journalism, right? And that getting sued is a
fast way to no longer being able to do that journalism?

The problem it should raise for you is letting overwhelmingly omnipresent
companies chill journalism in this way, not that the journalists worry about
being chilled.

~~~
icu
Not sure I agree, it's the job of the 4th estate to speak truth to power, that
is why it is afforded special privileges and protections.

If you predicate your ability to speak truth to power on a business model that
relies on the power structure you are criticizing you undermine your raison
d'être.

~~~
eropple
Cool. So what's your solution when private businesses are busily making sure
that those "special privileges and protections" don't mean anything to the
people who have to _hear_ you speaking truth to power?

~~~
icu
I hear your criticism... I personally don't have a solution, as news media
isn't my domain. However, there must be a technical solution to the
problem. I've seen one or two HN threads discussing a decentralised web.
Perhaps that's indeed the solution?

------
patja
Since it is not using the Facebook platform or API, what is Facebook's
recourse? If it was using the API and there was a developer with a Facebook
account they could "hold accountable" then they could do something, but there
is no specific Facebook account associated with creating this software.
Ban/censor the Gizmodo page on Facebook.com? That would be an interesting
gambit that would surely backfire in the court of public opinion.

This is just like Facebook banning anything which tries to track unfriending,
another thing you can't do through the platform API and that they have
actively worked to oppose and suppress.

Let's take Facebook's argument to its logical conclusion which would suggest
that they should go on the warpath to ban all password manager applications
that store a user's Facebook username and password.

~~~
askvictor
Facebook's recourse is to change their HTML specifically to break this tool.
Which leads to a cat-and-mouse game.

------
driverdan
Link to the project's source: [https://github.com/GMG-Special-Projects-Desk/pymk-inspector](https://github.com/GMG-Special-Projects-Desk/pymk-inspector)

------
DubiousPusher
Is there a good reason not to have a law that prohibits service providers from
mandating the means by which you form the requests your machine sends to them?

I totally get limiting the volume and frequency of requests. That's fair.
Bandwidth costs providers money, after all. But why really should you have any
say over how I formulate my requests? Whether they come from your app, a
personal script I wrote, or I hand-form them in Fiddler, what's the problem
besides the service provider's control-mongering?

Am I missing something here?
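
For what it's worth, limiting volume and frequency doesn't require knowing how a request was formed. A minimal token-bucket sketch (one common rate-limiting technique, not anything Facebook is known to use; the class and parameter names are my own) only looks at when requests arrive:

```python
class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec, capacity):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.last = 0.0                # timestamp of the previous request

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) is allowed."""
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_sec=1.0, capacity=2)
# Three requests at t=0, then one a second later.
print([bucket.allow(t) for t in (0.0, 0.0, 0.0, 1.0)])  # [True, True, False, True]
```

Nothing in `allow` depends on whether the client was an app, a script, or a hand-built request.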

------
snowwrestler
The DMCA provides a mechanism for the government to define some activities
that do not violate the DMCA--essentially, a whitelist of activities. It's
done by the Librarian of Congress and the exceptions last 3 years. This is how
jailbreaking became clearly legal.

It provides certainty for activities that, under the law, are arguably legal--
but making the argument in court would be very expensive and time-consuming.

What if a similar process was instituted for the CFAA? This would take some of
the reins out of the hands of self-interested parties like Facebook, and
provide certainty for people who are operating in good faith in the public
interest.

An example of this would be to say that it is legal for a person to use
automated tools to observe their own authentic interactions with hosted
software, and it is legal for other people to make such tools available to the
public. That would cover Gizmodo's tool, I think.

------
kodablah
Had Gizmodo not changed the login approach, could FB send a DMCA takedown
request in good faith to GitHub? I'm asking for a...um...friend that is
developing a client-side-only open source tool that allows them to automate
tasks on websites they visit from their computer, some of which may require
credentials.

Is the law settled with regards to distributing non-commercial tools to do
something a site may not allow in their ToS, or is it similar to doing it
yourself where it's fuzzy based on the reasonableness of the ToS and intent
and a whole bunch of other things?

------
uslic001
It is creepy how much data Facebook has on users. I created a fake Facebook
account using an email account not associated with my name and using a made up
fantasy name. I created it on a computer at a large hospital while logged into
the computer with a coworker's credentials. Facebook immediately suggested all
my family and friends upon creating the account. I still haven't figured out
how they did it.

~~~
beagle3
Did you ever log in to FB with your own account on that machine? If you did,
they leave a cookie there; even if no cookie, perhaps they used canvas
fingerprinting (It's frightening how many websites do -- there's an
about:config setting in Firefox that would let you know and block that).

Did you have your phone with you, with the FB app? If you did, they likely
have your exact location, and possibly also the exact location of the browser.

A false positive costs them essentially nothing, so they'll always offer the
50 or so highest-scoring matches to you. It's possible that those were matched
to you because you had just logged out from the same public IP 5 minutes
before (even if it was another machine on the network), and there were no
other sources of information about that fake account.
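
Purely as a guess at the shape of such a heuristic (I have no knowledge of Facebook's actual system; every name, signal, and weight below is invented), a sketch might score the friends of any account that recently logged out from the same public IP and return the top k:

```python
import heapq


def suggest(new_ip, recent_logouts, friends, k=50, window_seconds=300):
    """Rank friends of accounts that recently logged out from `new_ip`.

    recent_logouts: list of (account, public_ip, seconds_ago) tuples
    friends: dict mapping account -> list of friend names
    All of this is hypothetical; the real signals are not public.
    """
    scores = {}
    for account, ip, seconds_ago in recent_logouts:
        if ip == new_ip and seconds_ago <= window_seconds:
            for friend in friends.get(account, []):
                scores[friend] = scores.get(friend, 0) + 1
    # A false positive is cheap, so always return the k best, however weak.
    best = heapq.nlargest(k, scores.items(), key=lambda item: item[1])
    return [name for name, _ in best]


logouts = [("alice", "203.0.113.7", 120), ("bob", "198.51.100.2", 60)]
friend_lists = {"alice": ["carol", "dave"], "bob": ["erin"]}
print(suggest("203.0.113.7", logouts, friend_lists))
```

With no other information about the new account, a single weak signal like this would dominate the ranking.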

It is creepy. It should be illegal, but unfortunately, it's legal -- and the
governments are happy about it because it's often illegal for THEM to collect
all this data, but it's not illegal to let FB collect it and ask them for a
copy.

~~~
uslic001
I did have my phone with me. Did not log into Facebook at all with my real
account at the computer before making the fake account so no cookie.

------
spunker540
Even though the code does something possibly illegal (by violating copyright)
when executed -- can't it still live on github as free speech? (like an
anarchist textbook or 3d printing schematic of a gun or bitcoin core)? Can't
it just be demonstrative and educational, and if you download it and run it
you may be held liable but otherwise the code itself is fine and legal to
simply exist?

------
jefe_
Facebook knew they were dealing with a popular blog, yet it seems they decided
to approach the issue in the most expensive and inflammatory way possible. Why
not simply implement mechanisms for making the data more difficult to scrape
and let the app die on its own?

~~~
rhizome
Scorched-earth policies serve as a warning to others who might try the same
thing.

------
sequoia
So who here is cutting the cord? I quit about 12 months ago, the dissonance of
using the service and knowing how bad it was got to be too much to handle. It
isn’t the end of the world! It’s actually quite nice to miss all the junk that
constitutes a majority of the network’s content.

It may sound crazy to suggest techies could lead an exodus off Facebook, but
look at twitter: techies led the influx _onto_ that service, so why not the
other way around? When are we going to put our money where our mouths are,
“vote with our feet” and leave Facebook (i.e. permanently delete your
account)? Otherwise all this handwringing seems meaningless- they keep
misbehaving because they know no one’s going to do anything.

------
1337cat
Hi, ex-FB'er here.

> Facebook has nearly limitless access to all the phone numbers, email
> addresses, home addresses, and social media handles most people on Earth
> have ever used.

They're overlooking an obvious one which is location. If you and a stranger
use your FB apps from the same restaurant the same evening, the stranger will
appear higher in your search results for someone with their name. It would be
reasonable for this to feed PYMK.

------
iamleppert
You have to feel a little sorry for them in the position they currently find
themselves in. If they allow journalists this kind of access and power, what’s
to prevent bad actors using it as well?

The very thing they are in trouble for is giving third parties (which would be
to include journalists) unfettered access. It’s kind of disingenuous Gizmodo
didn’t even recognize their own cognitive dissonance in this situation.

~~~
denzil_correa
> You have to feel a little sorry for them in the position they currently find
> themselves in. If they allow journalists this kind of access and power,
> what’s to prevent bad actors using it as well?

On the same lines, here's another question - what prevents Facebook from
turning into a bad actor and using the power it has?

------
hw
Is the data from PYMK subject to GDPR / available for export? As a user I
should have a way to archive and keep track of PYMK, whether it be a third
party open source tool, or via some other means?

~~~
bencollier49
Under GDPR it pretty clearly falls under "opinions we have about you", and
ought to be requestable. I'm wondering when the first GDPR case against FB
will substantively materialise, and it surely will. They're the obvious first
target.

------
arrty88
Might they also be scanning the wifi networks we all connect to and if they
see us on the same network as someone else N amount of times, go ahead and
make the suggestion?
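
If they did, the counting itself would be trivial. A speculative sketch, assuming each observation is just the set of users seen on one network during one time window (the data shape and function name are mine, not anything Facebook has documented):

```python
from collections import Counter
from itertools import combinations


def pairs_seen_together(observations, n):
    """Return user pairs that shared a network in at least `n` observations."""
    counts = Counter()
    for users in observations:
        # Sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(users)), 2):
            counts[pair] += 1
    return {pair for pair, seen in counts.items() if seen >= n}


observations = [
    {"ann", "ben"},
    {"ann", "ben", "cy"},
    {"ben", "cy"},
]
print(pairs_seen_together(observations, n=2))
```

Here ann/ben and ben/cy each share a network twice and would be suggested to each other, while ann/cy, seen together once, would not.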

------
jrockway
It was unclear to me from the article what the outcome was. Did they keep the
tool up and Facebook got bored and went away?

~~~
sergers
"Shortly thereafter, in March, Facebook’s world exploded, when it was revealed
that Cambridge Analytica had gotten access to the profile information of
millions of Facebook users, going through what was considered an “official
route” in 2012. Facebook stopped bothering us about our PYMK Inspector, and
the tool currently remains up."

likely they still have an issue with it, but as the article implies they are
too busy with other issues/data scandals to make a further issue of the "tool"
at this time, at least.

------
echan00
I'm glad events such as this surface to the public. While the merits of Facebook
are undeniable, they are an evil company.

------
jrgaston
Pretty rich given that violating privacy is Fb's business model.

------
se30b
These large tech companies hate people who tell the truth and expose their
evil intentions. Alex Jones got kicked for telling the truth. Richie Allen got
blacklisted for speaking truth.

------
jumelles
Wow. Fuck Facebook.

------
mudil
Google needs to be investigated as well. It violates people's privacy to the
same degree as FB.

~~~
sctb
Like we've already asked, please stop posting generically about Google and
Facebook. We're here to learn and this isn't getting us there.

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

------
eanzenberg
Meta: Who in HN is pushing this story down? It currently has 3x the points of
the "Facebook Field Guide to ML" story, currently at #1. They were published
around the same time and this currently is at #6.

