
Zuckerberg denies knowledge of Facebook shadow profiles - peterkelly
https://techcrunch.com/2018/04/11/facebook-shadow-profiles-hearing-lujan-zuckerberg/
======
johnchristopher
> Lujan: It may surprise you that we’ve not talked about this a lot today.
> You’ve said everyone controls their data, but you’re collecting data on
> people who are not even Facebook users who have never signed a consent, a
> privacy agreement.

> And it may surprise you that on Facebook’s page when you go to “I don’t have
> a Facebook account and would like to request all my personal data stored by
> Facebook” it takes you to a form that says “go to your Facebook page and
> then on your account settings you can download your data.”

This is exactly the kind of request Facebook should be prepared for,
considering it's one of the use cases of the GDPR.

It's highly likely they are prepared for that but won't show it before the
storm.

~~~
tareqak
Congressman Costello from Pennsylvania asked if AI will be used to recognize
the faces of "non-FB users".

Combined with the shadow profiles question by Congressman Lujan from New
Mexico, some of these questions are getting closer to what people need to know
and understand. Unfortunately, a lot of the rest of questioning seems like
blind man's bluff / Marco Polo [0].

It'd be nice if Congress can be asked to host a debate between subject matter
experts in the privacy domain (who are knowledgeable and can effectively
communicate this knowledge and their concerns to lay people) and tech company
C-level executives like Zuckerberg.

2:50 PM EDT: Congressman Duncan from South Carolina is going to give
Zuckerberg a small copy of the US Constitution.

[0]
[https://en.wikipedia.org/wiki/Blind_man%27s_buff](https://en.wikipedia.org/wiki/Blind_man%27s_buff)

2:59 PM EDT - Edited to add "to lay people".

3:02 PM EDT - Added "Marco Polo". Thanks to caabalis for the reminder.

~~~
heurist
> 2:50 PM EDT: Congressman Duncan from South Carolina is going to give
> Zuckerberg a small copy of the US Constitution.

Showboating... Not to defend Facebook and I admittedly have not watched any
testimony or questioning, but I'm sure these politicians made no mention of
FOSTA, SOPA, the PATRIOT Act, net neutrality, NSA domestic surveillance...
Zuckerberg could have turned this all around on them.

~~~
mygo
there's more than a handful of things that actually did happen that Zuckerberg
could have "turned around" on them. And he didn't. Don't you wonder why?

~~~
zie
No, he is clearly in damage control mode, trying desperately to keep his
monopoly without any major concessions.

Will be interesting to see how that works out for him, probably fairly well I
would guess.

~~~
rhizome
Really? By my read Congress is begging him to do their jobs for them, asking
the fox to design the henhouse.

~~~
zie
Indeed, they want Facebook to "fix itself" to some meaningful degree. Except
Facebook has little desire for actual change, so like I said, I imagine very
little in practice will change, just a lot more "yay we <3 privacy" spam from
their PR department.

------
jasode
Some thoughts about the possibilities:

1) he's lying ... which would be dangerous since it only takes one disgruntled
employee to expose emails and work projects that merged in 3rd-party data
(e.g. Acxiom).

2) he's not "lying" ... by doing his version of Clinton's _"depends on what
the meaning of 'is' is"_.[0] Let's say Facebook has an internal system that
collects data on non-Facebook identities into a holding area. Maybe they
internally call it _"deferred profiles"_ or _"pre-activated profiles"_ but
not exactly _"shadow profiles"_. Well, MZ can basically interpret "do you
have shadow profiles?" as a hyper-literal question and answer with a
hyper-literal answer: _"no"_. Again, a Facebook insider would have to come
forward and reveal what the _shadow-but-we-dont-call-it-that_ profile
actually is.

3) he's telling the truth by every meaning of the question. Facebook truly has
zero data on non-Facebook users.

[0]
[https://www.youtube.com/results?search_query=depends+on+what...](https://www.youtube.com/results?search_query=depends+on+what+the+meaning+of+is+is)

~~~
ynniv
He's not under oath, so the consequences of lying are limited.

~~~
briandear
Why wouldn’t they put him under oath? That just seems weird: the cost of doing
it is zero.

~~~
0x00000000
Possibly because, if they did, every single answer would be "I don't know the
exact details, my team will get back to you"

------
TravelTechGuy
Just one question, following Zuckerberg's Congress show-and-dance: are the next
CEOs to be called in front of the committee the ones from Equifax, Experian
and TransUnion?

Because the way I see it, they're collecting millions of "shadow profiles",
with no consent, no remedy, no removal procedures. Furthermore, they sell that
data to 3rd parties at will.

I'd also look into a possible connection between one of these and Facebook. I
imagine if someone crosses your social profile with actual credit records,
real name, address, car, etc. then you have nothing private left to protect
save the thoughts in your head.

~~~
lumberjack
Equifax may have your bank statements, and whatever sensitive numbers they can
collect, but they do not literally have your inner thoughts, in the same
way that Facebook and Google have.

That is an important distinction.

Surveillance like this is harmful, either way. Not just the leaks, but the
actual data gathering itself should be criticized. Why is it that during the
Equifax leak nobody criticized the data gathering in the first place? It was
all about the leak. In this case I am glad that the data gathering is being
put into question.

But anyway, my main point is that they are collecting different sets of
private data.

~~~
JoshMnem
Are you sure about what data those companies have? Example: it seems like you
can't sign in to Starbucks wifi any more (via Google) without giving a real
email address that is verified through Experian's API. There is surely more
going on than consumers know.

------
Chardok
Lujan: It’s been admitted by Facebook that you do collect data points on
non-[Facebook users]. My question is, can someone who does not have a Facebook
account opt out of Facebook’s involuntary data collection?

Zuckerberg: Anyone can turn off and opt out of any data collection for ads,
whether they use our services or not

Is this not just an outright lie? You literally have to use their services to
turn off/opt out anything.

~~~
idunno246
It's probably technically true, if you trust Facebook. They might respect the
DoNotTrack header. If not, most legit advertisers, including Facebook, will
[claim to] respect AdChoices opt-out flows: optout.aboutads.info.
[https://www.facebook.com/help/568137493302217](https://www.facebook.com/help/568137493302217)
directs you to these sites.

Now, they obviously could lie and still record things. Also, this probably
doesn't include things like person A has phone number Z and person B has phone
number Z, so there's some sort of link there, but presumably that's not used
for ads.

I worked for an ad company that respected the optout/DNT as best we could.

~~~
acct1771
When could you not?

~~~
idunno246
Assuming you mean respecting settings: historical data from before you blocked
tracking wasn't explicitly deleted, but it was marked as ignored and rolled out
of the system after a month or so (lots of caching). I'd guess that FB, having
orders of magnitude more data, does something similar.

DNT is easy: that's a header on every call, so you just drop that request's
data on the floor. Unfortunately Firefox was always threatening to make it on
by default, and IE actually did, I believe... and basically all the big ad
providers threatened to ignore it if it went on by default. Management
basically said we'd do whatever the industry does, but it is currently
respected.
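
A minimal sketch of the "drop it on the floor" check described above, assuming a hypothetical ad server that receives request headers as a plain dict. The function names and the in-memory `store` are made up for illustration; only the `DNT: 1` header itself is real.

```python
def tracking_allowed(headers):
    """Return False when the request carries the Do Not Track header (DNT: 1)."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") != "1"


def handle_request(headers, event, store):
    """Record the event only if the browser has not asked to opt out.

    `store` is a hypothetical stand-in for whatever the tracking backend is.
    """
    if tracking_allowed(headers):
        store.append(event)
    return len(store)
```

Because the header arrives on every request, the server needs no stored state about the person to honor it, which is why this is the easy case.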

The problem with the opt-out flow is: how do you store that someone wants you
not to store information about them without storing info about them? The
standard way is to drop a cookie that says don't track, but if you clear
cookies you'd have to opt out again... which is shitty UX and unexpected. Also,
since it's cookies, it's opting out the browser, not the user or computer.
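
To make the cookie problem concrete, here is a rough sketch with a toy `Browser` class standing in for a real cookie jar. The cookie name `optout` is an assumption for illustration, not any ad network's actual cookie.

```python
OPT_OUT_COOKIE = "optout"  # hypothetical cookie name


class Browser:
    """Toy stand-in for one browser's cookie jar."""

    def __init__(self):
        self.cookies = {}

    def clear_cookies(self):
        self.cookies.clear()


def opt_out(browser):
    # The cookie carries only a flag, no identifier, so the preference is
    # stored without storing anything about who the person is.
    browser.cookies[OPT_OUT_COOKIE] = "1"


def may_track(browser):
    return browser.cookies.get(OPT_OUT_COOKIE) != "1"
```

Opting out in one `Browser` says nothing about a second one on the same machine, and calling `clear_cookies()` silently undoes the opt-out, which are the two UX problems mentioned above.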

As a consumer, it sucks because you have no way of knowing if the ad/tracking
companies are respecting any of this. And while I'm positive we tried to
respect it, and I'm not aware of any, bugs can happen. That code was among
the most heavily tested stuff since it hit critical paths.

On most targeted ads (including but not limited to on Facebook) there's a
little AdChoices (blue triangle) button. If you click on it you can see which
company targeted you, some info on what type of targeting, and a link to opt
out.

~~~
acct1771
Appreciate the lengthy response!

Are you still in the industry?

~~~
idunno246
np - there's plenty to be upset about but the ads stuff is pretty standard for
Google and the like too.

I wish people were more vocal about the amount of info publishers leak by
embedding FB like / G+ like / etc. buttons in their sites. You don't even have
to interact with it and FB gets to track that you loaded that page.

Left a year or two ago.

------
bcheung
Isn't anyone who has Google Analytics installed on a page collecting
information and creating "shadow" profiles?

I watched the entire hearing, it was pretty interesting. I was surprised how
ignorant many of the senators were of the subject matter on hand. Some of them
clearly didn't understand the difference between Facebook and the Internet. Or
they thought that Facebook, Google, Twitter, were all the same thing, this
mysterious "Internet" and somehow Zuckerberg was in charge of it all.

Some of the questions were impossible to answer because they made no sense and
they kept hammering him for not answering it.

Then there were senators basically trying to guilt trip him into endorsing
their bill when he already said he supports the principle behind it but it
would depend on the details.

~~~
EquallyJust
Both Google Analytics and Facebook's interest based ads/"shadow" profiles work
similarly.

You can opt out of Google's interest based ads via
[https://adssettings.google.com](https://adssettings.google.com) and
Facebook's interest based ads via
[https://www.facebook.com/ads/settings](https://www.facebook.com/ads/settings)
without having to have an account with either.

I don't believe either "shadow" profile is ever associated with an account
or personal details even when you make an account on the associated service.

~~~
shock
> I don't believe either "shadow" profile is ever associated with an
> account or personal details even when you make an account on the associated
> service.

Based on what? Did you talk to someone in the know? Are you just speculating?

------
thirduncle
_Zuckerberg: Congressman, I’m not, I’m not familiar with that._

 _Zuckerberg: I do not know off the top of my head._

 _Zuckerberg: Congressman, I do not know off the top of my head but I can have
our team get back to you afterward._

This guy must have a whole team of people coaching him on what he can overtly
lie about, what he can obfuscate - and when he can't do either, how to change
the subject.

~~~
saagarjha
It's obvious why he's doing this: it gives him time to come up with a
response, in a non public setting, with the help of his legal team. There's no
reason for him to answer difficult questions while testifying.

------
deagle50
He slipped up yesterday and opened the door to the biggest issue of them all,
coercion and brainwashing of users for profit, and the Senator who caught him
in a lie didn't explore it further. No wonder they were testing if they can
make you happy or sad by manipulating your feed when they sell the OUTCOME not
just the impression.

[https://youtu.be/6ValJMOpt7s?t=2h36m49s](https://youtu.be/6ValJMOpt7s?t=2h36m49s)

~~~
dingo_bat
Link is dead. Can you post what lie it was?

~~~
deagle50
He was maintaining that all Facebook does is sell innocent ads. Then was asked
if Facebook gets a cut of 3rd party business or if they get paid based on
outcomes. He said no, but his example revealed that in fact they do sell
outcomes, which is what makes them so compelling to companies who bid. It's in
Facebook's financial interest to gain as much power over its users as possible
in order to achieve higher success rates for its customers.

This is something the vast majority of users don't understand. The Facebook
experience is engineered to make you do things in a much more sophisticated
way than any other advertising business to date. They can recursively
manipulate your emotions to drive up their outcome compensation.

------
jandrese
I find it incredible how many congressmen seem interested in Facebook
implementing a rich person mode where you pay them money to not share your
data. They aren't worried so much about Facebook gathering data, but that
there isn't an exception for the 1%.

~~~
moate
"Listen, we've been working our asses off to create Rich/Poor tracks for the
Legal system and law enforcement practices, how do we get you in on that
action? Who do I have to bribe at facebook?"

------
alphabettsy
I’m glad someone is asking about the collection of non-user data and data
collection outside Facebook sites, but I can’t help wanting someone qualified
to speak on such technical subjects that would be able to effectively question
without a prepared list.

~~~
bcheung
It's an important concern but anyone who has Google Analytics, or any other
analytics tool, installed on their website is collecting non-user data. This
happens pretty much across the board.

There's also the concern about derived data, like some kind of user embedding
vector from clustering, that provides a lot of information about the user that
is not available in the raw source data.

~~~
confounded
I think this is a bad analogy.

If site-usage data is being collected via use of your site, then by definition
it’s on users.

That’s not the same thing as buying files on individuals who don’t use your
site from data brokers.

------
glup
Am I missing something or has the beacon functionality of the Like button not
come up yet in the hearing? I think it's critical to note that the Like button
can be, and probably is being, used to build non-user profiles as well as user
profiles. The key technical point is that the _load_ of the Like button is the
critical signal; whether the user actually liked something is secondary. It
seems like Senator Gardner got close yesterday but had some technical
misunderstandings about just how powerful the Like button is.
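
A toy model of why the load alone is the signal, assuming the widget is fetched from a third-party server that sees the `Referer` of the embedding page plus whatever cookie it previously set. `WidgetServer` and the `anon-N` ids are hypothetical; this is a sketch of the general mechanism, not Facebook's actual implementation.

```python
from collections import defaultdict


class WidgetServer:
    """Hypothetical third-party server that serves an embedded button."""

    def __init__(self):
        self.visits = defaultdict(list)  # cookie id -> pages that loaded the button
        self._next_id = 0

    def serve_button(self, referer, cookie_id=None):
        """Handle the request a browser makes just to render the button.

        No click is involved: the request itself reveals the embedding page.
        """
        if cookie_id is None:
            # First contact: hand out a cookie so later loads can be linked.
            self._next_id += 1
            cookie_id = f"anon-{self._next_id}"
        self.visits[cookie_id].append(referer)
        return cookie_id
```

Nothing here requires an account: a visitor who never signed up still accumulates a browsing history under an anonymous cookie id.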

~~~
humblebee
It came up yesterday. A question was asked whether or not Zuckerberg thought
users understood that Facebook had the ability to track every website they
visit when Facebook social media icons appear on those pages.

~~~
glup
That's what I was referring to regarding the exchange with Gardner: it was
talking about users and implied that Facebook had to be open at the same time.

>> Senator Gardner: if you're logged in to Facebook with a separate browser
and you log in to another — log in to another article, open a new tab in the
browser while you have the Facebook tab open, and that new tab has a Facebook
button on it, you track the article that you're reading.

So it wasn't emphasizing that this tracking is ubiquitous and could be used to
construct quite information-rich shadow profiles. I still don't think the
senators are aware of this.

[1] [https://www.washingtonpost.com/news/the-switch/wp/2018/04/10...](https://www.washingtonpost.com/news/the-switch/wp/2018/04/10/transcript-of-mark-zuckerbergs-senate-hearing)

------
danschumann
He also denied knowing the political orientation of his content police (during
yesterday's Cruz question); however, he's Facebook, so he knows everyone's
political biases.

~~~
fra
Are you seriously suggesting Facebook uses its employees' private account data
as part of hiring & performance management?

~~~
testvox
Why would they not? Any other employer is allowed to, so why not Facebook?

~~~
fra
Facebook has a different level of access. Seems to me like that would be all
kinds of unethical.

~~~
einr
And clearly, Facebook would not use personal information for unethical
purposes.

------
dragonwriter
The title is extreme clickbait; he _acknowledged_ the existence of nonuser
data collection (and discussed one use of it), but denied knowledge of the
common use of the term “shadow profiles” to refer to the data thus collected.

------
thibautg
When I google my own first and last name, the 4th result is a Facebook “public
figure” page of me. I deleted my account years ago though. Creepy.

------
Springcleaning
Can your face be tagged when you don't have a Facebook account? I would be
very upset if this is true.

~~~
45h34jh53k4j
Yes. I've never used Facebook in my life, yet thanks to someone sharing a
picture of me 10 years ago, and tagging it, it knows my name and will auto-tag
me in others.

~~~
B1FF_PSUVM
And holding data about people who never consented to it should bite them hard.

~~~
Springcleaning
No way to remove your data is even worse. I cannot opt-out without making an
account. I don't want to have a Facebook account.

------
TomMckenny
I sure wish Equifax were questioned about collecting information with equal
outrage.

------
drawkbox
Facebook OpenGraph v2 removed the ability of API app developers to get
friend data without the friends' consent.

For the most part 'shadow profiles' are dead as of 2013-ish when OpenGraph v2
came out. Lots of scammy Facebook companies died because of this. If you
remember, many game companies like Zynga also suffered a bit and threatened to
leave; they were given special access to still do it for a while, and I assume
others like CA possibly were too.

With v2 they also shut off using global Facebook user ids and moved them to a
per-app basis so you can't correlate outside your app; you'd have to correlate
with other identifiers like email or do deeper data comparison.
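
One way per-app ids could work, as a sketch only: derive each id from the global id and the app id with a keyed hash. The HMAC construction, the secret, and the 16-character truncation are all assumptions for illustration; nothing here is confirmed to be how Facebook actually generates app-scoped ids.

```python
import hashlib
import hmac

PLATFORM_SECRET = b"example-secret"  # hypothetical key held only by the platform


def app_scoped_id(global_user_id, app_id):
    """Derive a stable id that is unique to one (app, user) pair."""
    message = f"{app_id}:{global_user_id}".encode()
    digest = hmac.new(PLATFORM_SECRET, message, hashlib.sha256).hexdigest()
    return digest[:16]
```

The same user gets the same id every time within one app, but two apps comparing their id lists find no overlap, so they would need another identifier such as email to join their data.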

Facebook also turned friend access into invitation lists where the other user
had to agree and join the app themselves before you could pull their data.

Facebook knew about shadow profiles and they already addressed it.

~~~
Lionsion
> For the most part 'shadow profiles' are dead as of 2013-ish when OpenGraph
> v2 came out.

> Facebook knew about shadow profiles and they already addressed it.

No, I think you're confused. IIRC "shadow profiles" are the profiles Facebook
builds of non-users based on data those people didn't share, like their
friend's contact lists.

The shadow profile issue has nothing to do with Facebook's public API. It's
all about what data they're collecting and what they chose to do with it,
regardless if the results are accessible to 3rd parties or not.

There are multiple _separate_ problems with Facebook, Cambridge Analytica is
just _one_ (and the one Zuck probably wants to focus on because it makes him
look the best and is not structurally threatening to FB's business model).

~~~
drawkbox
> _IIRC "shadow profiles" are the profiles Facebook builds of non-users based
> on data those people didn't share, like their friend's contact lists._

From a development standpoint I clearly defined the 'shadow profile' you just
explained.

Facebook provides OpenGraph APIs that app developers can access personal data
that is granted by a user. Previously to OpenGraph v2 you could access all
friend data and pull down the whole social graph for any friend level shared
data through one initial person, those were shadow profiles. For each
individual permission you only needed one friend to agree to it if the other
friend shared at the privacy defaults of friend access to everything.

That has been shut down as of OpenGraph v2. Look back on the time when they
added it: lots of scammy sites were mad they couldn't build shadow profiles
anymore. I built games on Facebook at the time and this was obvious to many
game companies that were just there to collect data on users to sell, not for
the game. Remember all the birthday apps and zombie infection games? All data
siphons. Facebook shut it down back then, partly because of privacy but
mostly because they were scared other companies were getting access to the
whole social graph, like many did including Zynga, CA, others.

Were 'shadow users' a problem? Yes, but go try to pull friend data or personal
messages today: those are unique permissions you have to be granted by each
user that you access and for each action. You also have to undergo a Facebook
app review and justify use of them like you do with background geolocation on
Apple.

Part of this attack on Facebook is to bring in the government filter and
firewall [1] as an extension of FOSTA / SESTA and most of these issues are
resolved at Facebook already.

If you don't want a government censorship filter [1] that the ISPs and large
companies like Facebook will run, then understand that this is opening that
can of worms.

[1] [https://www.wired.com/2017/04/internet-censorship-is-advanci...](https://www.wired.com/2017/04/internet-censorship-is-advancing-under-trump/)

EDIT: for the people arguing details about 'shadow profiles'

> _So what is a Facebook shadow profile, and where does yours lurk?_ [2]

> _A Facebook shadow profile is a file that Facebook keeps on you containing
> data it pulls up from looking at the information that a user’s friends
> voluntarily provide._ [2]

> _You’re not supposed to see it, or even know it exists. This collection of
> information can include phone numbers, e-mail addresses, and other pertinent
> data about a user that they don’t necessarily put on their public profile.
> Even if you never gave Facebook your second email address or your home phone
> number, they may still have it on file, since anyone who uses the “Find My
> Friends” feature allows Facebook to scan their contacts. So if your friend
> has your contact info on her phone and uses that feature, Facebook can match
> your name to that information and add it to your file._ [2]

[2] [https://www.digitaltrends.com/social-media/what-exactly-is-a...](https://www.digitaltrends.com/social-media/what-exactly-is-a-facebook-shadow-profile/)
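
The contact-matching mechanism in the quoted passage can be sketched roughly like this, assuming uploads arrive as `(name, phone, email)` tuples keyed by uploader. The data layout and the function name `build_contact_index` are hypothetical, chosen only to illustrate the idea.

```python
from collections import defaultdict


def build_contact_index(uploads):
    """Merge uploaded address books into one record per phone number.

    `uploads` maps an uploader's name to a list of (name, phone, email)
    contacts. The person behind each phone number never takes part.
    """
    index = defaultdict(
        lambda: {"names": set(), "emails": set(), "uploaded_by": set()}
    )
    for uploader, contacts in uploads.items():
        for name, phone, email in contacts:
            record = index[phone]
            record["names"].add(name)
            if email:
                record["emails"].add(email)
            record["uploaded_by"].add(uploader)
    return index
```

Two friends who each upload an address book containing the same number produce a single record listing both of them, which gives a partial social graph for someone who never signed up.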

~~~
Lionsion
> Facebook provides OpenGraph APIs that app developers can access personal
> data that is granted by a user. Previously to OpenGraph v2 you could access
> all friend data and pull down the whole social graph for any friend level
> shared data through one initial person, those were shadow profiles.

So you're saying if someone uploaded their phone's contact list to Facebook,
you used to be able to access that data (about their contacts) from Facebook's
API?

Just to be clear, the person with a "shadow profile" doesn't necessarily even
have Facebook account.

I still think you're mistaken and trying to map "shadow profiles" to the
profile scraping that was done by Kogan and Cambridge Analytica.

~~~
drawkbox
'Shadow profiles' are anything that Facebook got without asking for direct
permission to get.

It could come through a friend list on Facebook back when you could do that or
through harvesting the contact lists on mobile. Both are shadow profiles and
both weren't granted access but Facebook got to it (or app developers did) via
a friend permission or from the native mobile apps contact list harvesting and
tracking.

The most you could get about a user that wasn't part of Facebook was their
email, name and anything in their contact list, and they probably eventually
tracked that user through various means. Many companies still do this, and
other big companies such as LinkedIn did as well. Even on LinkedIn though, one
initial user had to use their app or their import feature to get to the
'shadow profiles'. Most 'shadow profiles' came from initial granting by a
'friend' or associate.

They are both 'shadow profiles', anyone that got their data pulled without
being directly asked is one. Facebook shut down the friend list and global
user id tracking/correlation nearly 5 years ago. Most of the problems were
back in 2007-2014 when app development on Facebook was bigger and games were
huge there because of this.

------
billfruit
Z's response was more to the tune that 'shadow profile' is not a term that he
or Facebook uses to describe the said concept; he was not denying that it may
collect some data pertaining to non-members.

~~~
rhizome
If so, that's a pretty legalistic interpretation of the question.

~~~
dragonwriter
No, the question was a follow-up after he explicitly acknowledged gathering
non-user data; the question about that non-user data was “So these are called
shadow profiles, is that what they’ve been referred to by some?”

It was literally a question about whether that _term_ was commonly applied to
the data Zuckerberg had already acknowledged FB was gathering. Answering “I'm
not familiar with that” because you aren't aware of the use of the term isn't
“pretty legalistic”, it's the only kind of answer that makes sense.

------
matt_the_bass
“So you’re directing people that don’t even have a Facebook page to sign up
for a Facebook page to access their data”

That sounds like a simple solution. Yeah right.

------
noja
Can anyone find an interview or video where he (or his senior team)
contradicts him?

------
benevol
What a liar, it's getting more ridiculous every day.

------
paulie_a
TLDR: Zuckerberg continues to lie to Congress

~~~
dang
This comment breaks the site guidelines. Please post civilly and
substantively, or not at all.

You may not owe better to Zuckerberg but you do owe better to the community
here.

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

~~~
paulie_a
I am sorry to summarize that a billionaire is an outright liar in a
congressional hearing. There is no other way to put it. He might as well say
"I am not a crook"

