Investigating sources of PII used in Facebook’s targeted advertising [pdf] (mislove.org)
79 points by jmsflknr 36 days ago | 6 comments

> Fig. 6. In the privacy settings the user can decide who can look their profile up using the provided email address and phone number. The most restrictive option available is “Friends”. We find that even when the user sets the PII visibility to “Only me” and searchability to “Friends”, the advertisers can still use that bit of information for targeting.

> 4.3.5 PII obtained without user’s knowledge ... Facebook could learn users’ PII from their friends would be by scanning friends’ contact databases, linking contacts to existing Facebook accounts, and then augmenting the Facebook accounts with any additional PII found in the contacts database ... We found that the previously-unused phone number became targetable in 36 days, showing that it had indeed been linked to the corresponding author’s account without their knowledge. Making this situation worse, the matched phone number was not listed on the account’s profile, nor in the “Download Your Information” archive obtained from Facebook [5]; thus the target user in this scenario was provided no information about or control over how this phone number was used to target them with ads.

> Making this situation worse, the matched phone number was not listed on the account’s profile, nor in the “Download Your Information” archive obtained from Facebook [5];

Huh, that sounds like a GDPR violation?

> We investigate a range of potential sources of PII, finding that phone numbers and email addresses added as profile attributes, those provided for security purposes such as two-factor authentication, those provided to the Facebook Messenger app for the purpose of messaging, and those included in friends’ uploaded contact databases are all used by Facebook to allow advertisers to target users. These findings hold despite all the relevant privacy controls on our test accounts being set to their most private settings.

Very frustrating!

The article assumes that the noise was added intentionally for obfuscation. However, for real-time size estimation Facebook would have to rely on some sort of probabilistic data structure, such as sketches, and it's quite possible that the authors are observing the accuracy loss that comes from these data structures.

One can argue that FB doesn't need probabilistic data structures to estimate the size of a small set of externally provided PII, but they probably need to keep them at hand in case they need to intersect with geo-demographic sets. E.g., someone uploads a large list of emails and wants to intersect that audience with the set of males in San Francisco.
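To make the sketch argument concrete, here is a minimal Python illustration using a k-minimum-values (KMV) sketch. KMV is my own choice for the example; neither the paper nor Facebook says which structure, if any, is actually used. The function names, `K = 1024`, and the synthetic `example.com` addresses are all made up for illustration.

```python
import hashlib


def hash01(item: str) -> float:
    """Map an item to a pseudo-uniform float in [0, 1) via SHA-256."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def kmv_sketch(items, k):
    """Keep only the k smallest distinct hash values of the set.

    Batch construction for brevity; a streaming version would keep a
    bounded heap instead of hashing everything first.
    """
    return sorted({hash01(x) for x in items})[:k]


def estimate_size(sketch, k):
    """Estimate the number of distinct items behind a KMV sketch."""
    if len(sketch) < k:
        return float(len(sketch))  # fewer than k items: the count is exact
    return (k - 1) / sketch[-1]    # the k-th smallest hash sits near k / n


def estimate_intersection(a, b, k):
    """Estimate |A and B| from two KMV sketches built with the same k."""
    union = sorted(set(a) | set(b))[:k]  # KMV sketch of the union
    if not union:
        return 0.0
    in_both = set(a) & set(b)
    overlap = sum(1 for v in union if v in in_both)
    # Fraction of sampled union members present in both sets, scaled up
    # by the estimated size of the union.
    return overlap / len(union) * estimate_size(union, k)


K = 1024  # hypothetical sketch size
# Synthetic audiences: 20,000 addresses each, 10,000 of them shared.
uploaded = [f"user{i}@example.com" for i in range(20_000)]
segment = [f"user{i}@example.com" for i in range(10_000, 30_000)]

sk_uploaded = kmv_sketch(uploaded, K)
sk_segment = kmv_sketch(segment, K)
size_estimate = estimate_size(sk_uploaded, K)
overlap_estimate = estimate_intersection(sk_uploaded, sk_segment, K)

print(f"estimated upload size: {size_estimate:.0f} (true 20000)")
print(f"estimated overlap:     {overlap_estimate:.0f} (true 10000)")
```

The estimates land near the true values but not on them, and the error comes purely from the data structure, with no noise added on purpose. Small sets below `K` are counted exactly, which is why the first paragraph's caveat about small PII lists matters.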

I found the statements about PII uploaded by advertisers confusing. The authors say “PII uploaded by advertisers to target customers via custom audiences” was NOT found “being used for advertising” but the whole point of uploading PII into custom audiences is to target them for advertising.

You have to read the details later, where they uploaded 2 different pieces of PII for a customer - one already associated with a FB user, and therefore targetable. The other was brand new PII. Only the latter was not found to be targetable.

So yay - Facebook doesn’t use rainbow table lookups to extract plaintext PII from hashes that advertisers upload. Gold star for them.
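For anyone unsure why new PII in a hashed upload can't be matched, here is a rough Python sketch of the matching step. It assumes advertisers upload SHA-256 hashes of normalized emails and that the platform only compares them against hashes of PII it already holds; the account data, function names, and normalization rule are hypothetical, not Facebook's actual pipeline.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Assumed normalization: trim whitespace and lowercase."""
    return email.strip().lower()


def hash_pii(value: str) -> str:
    """SHA-256 hex digest of the normalized value."""
    return hashlib.sha256(normalize_email(value).encode("utf-8")).hexdigest()


# Hypothetical PII the platform already has linked to accounts.
known_accounts = {
    "alice@example.com": "account-1",
    "bob@example.com": "account-2",
}

# Index built once on the platform side: hash -> account.
hash_index = {hash_pii(email): acct for email, acct in known_accounts.items()}


def match_upload(uploaded_hashes):
    """Return only the uploaded hashes that hit already-known PII."""
    return {h: hash_index[h] for h in uploaded_hashes if h in hash_index}


# The advertiser uploads one known address and one brand-new one.
upload = [hash_pii("  Alice@Example.com "), hash_pii("carol@example.org")]
matched = match_upload(upload)
print(f"{len(matched)} of {len(upload)} uploaded hashes matched an account")
```

The brand-new address stays an opaque digest: without a precomputed table or a brute-force pass over candidate addresses, there is nothing to look it up against, which matches what the authors observed.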

It sounds like privacy advocates are taking the position that "Facebook and private companies are using this private information to target me".

Another way to view the same thing is "Facebook makes a big database of adverts available to me to view, and I (automatically) choose to view the ones most closely matching my personal information".
