Hacker News
Facebook Shadow Profiles [pdf] (cesifo.org)
324 points by Jimmc414 on Feb 18, 2022 | 145 comments



That shouldn't be a surprise. At an old job with far smaller user base, we used to do something very similar and while the shadow profiles were not materialized in the form of actual profiles, the amount of data you can gather about someone just by having his/her number pop up in the contact list of several of your actual users is staggering: full name, workplace, city/neighborhood where they live, their partners, their hobbies and all sorts of things which you generally should not know. And we are not talking people who have their entire life published online, but people with little to no online visibility. Assuming that facebook is not doing something similar considering that their main business comes from profiling and modeling user behavior is just naive.
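A minimal sketch of the aggregation described above, assuming a hypothetical data shape in which each uploading user contributes a list of (phone number, contact label) pairs. The function name, field names, and sample data are all made up for illustration and do not describe any real company's pipeline.

```python
from collections import defaultdict

def build_shadow_records(uploads):
    """Group contact labels by phone number across all uploaded address books.

    `uploads` maps an uploading user's id to a list of (phone, label) pairs.
    This data shape is hypothetical.
    """
    records = defaultdict(lambda: {"labels": set(), "known_by": set()})
    for uploader, contacts in uploads.items():
        for phone, label in contacts:
            records[phone]["labels"].add(label)
            records[phone]["known_by"].add(uploader)
    return dict(records)

# Three users who all happen to have the same non-user in their contacts.
uploads = {
    "alice": [("+15550001", "John Smith"), ("+15550002", "Mum")],
    "bob": [("+15550001", "John (Acme Corp)")],
    "carol": [("+15550001", "John climbing gym")],
}
shadow = build_shadow_records(uploads)
print(shadow["+15550001"])
```

The person behind the first number never uploaded anything, yet the combined labels already suggest a full name, an employer, and a hobby.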


Seriously. Once a mobile app is granted access to contacts, it has your entire social graph. Every app since 2012 has been building these shadow social graphs.


Well before 2012. Started as soon as you could access the address book through an API - not just on the phone. First one that got a lot of bad press was Plaxo which launched in 2002.


Not forgetting sixdegrees.com (1997-2000). Sure, it didn't access your address book directly, but it did abuse your social graph. In my memory Plaxo was connected somehow (same founder?), but it's a long time ago, so I might be mistaken.


Amazon bought six degrees


"Every app since 2012 has been building these shadow social graphs."

No, not every app. There are still open source apps that usually do not do this, and there are probably also some proprietary ones with ethics that have not given in to it. But that this is the default and there is no big outcry is probably because almost no one outside tech circles understands any of it, or they just throw their hands in the air.


Fun fact: Whatsapp (Facebook) was recently fined 75 million Euros for this practice under GDPR.

To be pedantic, that was one part of a 225-million-Euro fine. But that practice was specifically broken out as one of the line items, with a euro amount attached.

And to be more pedantic, the fine wasn't for the practice itself. It was for failing to provide notice to the users whose contact info was harvested. In theory, this practice is acceptable under GDPR if the app immediately informs the user whose contact info was harvested, and gives them an opportunity to reject.

The 225-million-euro fine is still the second-largest fine issued under GDPR, after Amazon's. If the 75-million-euro line item were an individual fine, it would be the fourth-largest after (one of) Google's.


Does that mean Facebook will stop doing it? I find that hard to believe. I doubt the EU understands Facebook's code or servers enough to judge if they're complying.


With 200+ million it sounds like they'll be able to hire a developer.


There definitely is technical competence within the EU and also within the bureaucratic institutions. The thing is just that the people in charge only listen to them when it is convenient.


As an American I have to wonder what it's like to have technical competence within your bureaucratic institutions. We can't even get bureaucratic competence, technical competence is completely out of the question.


That was my point. I know there are people with technical competence in the institutions, but not necessarily in the top ranks who make the decisions. And for you in the US - I think so, too. I remember the resignation statement of a guy who had to reorganize the IT of your military. He did sound technically competent - but he failed because of unavoidable command structures which were not, and he could not change them.


Oh come on folks. If they do nothing "look at them lazies". If they give fines "look at them incompetents". What exactly should they do to stop those practices and look good in the eyes of HN?


GDPR fines are insignificantly small for companies such as Facebook. They will just keep doing illegal things.


I agree, though I suspect fines for "repeat offenses" might be less easy to ignore.


Then keep fining them 200 million until they pay attention


That's true, but more often than not, the benefits that, say Facebook, gains from violating gdpr outweigh the fines they have to pay. Otherwise they would have stopped doing it after the first such fine, yet they keep doing it. Clearly it's paying off.


And these get immediately sold on. I often buy contacts from online resellers who have obtained them through your number being in someone else's contacts who allowed an app to upload their whole address book.


> That shouldn't be a surprise.

This has nothing to do with it being a "surprise". That doesn't justify the action, the erosion of privacy. If your car gets broken into in a bad neighborhood, it not being a surprise doesn't justify the break-in.


I'm by no means justifying their actions but I'm acknowledging that it's not a precedent or isolated case. I don't have a Facebook account but I'm well aware they have tons and tons of data on me, even if I've blocked all Facebook traffic from my devices. The fact that they have so much data about me is to no fault of my own but an unpreventable circumstance. The only thing I can realistically do is just shrug it off


Wait, does the Facebook client have access to one’s contacts?


It does and uploads them regularly unless you disable it.

https://www.facebook.com/help/355489824655936/


Not by default, but they aggressively ask for access to it. It's usually framed as helping you find your friends already on Facebook.


If you explicitly give the app access. Unfortunately, it doesn’t matter. Your contacts may still give the app access.

When I left my previous company, I gave a few coworkers my phone numbers. Within 24 hours they showed up as suggestions on FB.


If you have the app on your phone, or any app that uses their sdk, yes.


> any app that uses their sdk

Do you mean any app that has a “login with fb” button?



That got me curious. How would hobbies be inferred? Is it merely from association?


Examples (and this is why I'm so picky about giving apps Contacts access):

* Friends that I went to graduate school with. Just based on our degrees (MLISes) and career titles, which are pretty public, and the fact that we keep in contact suggests certain types of hobbies we have in common.

* I grew up partially in some pretty rural areas, as did my sister. So which people my sister keeps in contact with from back then and what they're into is a pretty good indication of what she likes. Otherwise, why would they be in her contacts?

Basically, if your contacts have less scrupulous information practices than you regarding hobbies (e.g. your friend tells FB EVERYTHING) and they're in your phone, it's pretty easy with that info to peg you as 'Greg's friend from rock-climbing club who probably goes out with all of them for craft beer afterwards.'


Presumably association including temporal association around events that are described by others. If I text <shadow friend> frequently around the time I post about frisbee games and bluegrass concerts, odds are they are engaged in those types of activities with me.


Just like anything else: contact name "John airsoft team". It is a circumstantial but accurate assumption in most cases. For example I'm currently on the second hand market for a car, scroll through my contacts and you'd easily work out what I'm looking for
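The contact-label inference can be sketched in a few lines. This is a toy illustration: the keyword table and the sample labels are invented, and a real system would presumably use something far richer than exact word matches.

```python
import re
from collections import Counter

# Hypothetical keyword-to-interest table, for illustration only.
INTEREST_KEYWORDS = {
    "airsoft": "airsoft",
    "climbing": "rock climbing",
    "frisbee": "frisbee",
    "garage": "cars",
    "dealer": "cars",
}

def infer_interests(contact_labels):
    """Count interest hints found in free-text contact labels."""
    hits = Counter()
    for label in contact_labels:
        for word in re.findall(r"[a-z]+", label.lower()):
            if word in INTEREST_KEYWORDS:
                hits[INTEREST_KEYWORDS[word]] += 1
    return hits

labels = ["John airsoft team", "Mike car dealer", "Anna", "Pete garage"]
interests = infer_interests(labels)
print(interests.most_common())
```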


Facebook also has shadow profiles for non-users that connect directly to name, email address, and/or phone number. At the very least these seem to be populated when people share their contact list with Facebook.

I have long wondered whether they can match these two kinds of shadow profiles together. One profile with personally identifiable information. The other profile with detailed browsing history. That would raise a huge privacy concern especially since these are non-Facebook users and therefore people who have not opted into this at any level.


I was told by a friend that when they uploaded a picture with me in it Facebook auto tagged me. I have never created a Facebook or Instagram account. I want no business with that company. I am annoyed that I am still profiled by them.


In the grim darkness of the near future, nobody leaves their house without painting their faces with CSAM in order to prevent cloud services from storing their images.


People can be identified by way too many things - if you're Facebook and you buy all data available for sale and correlate and enrich it, then you're known forever and every step you make is also recorded and analyzed. Worst of all, there are those idiots who take pictures all the time and you eventually end up in pictures from many angles (all timestamped and geolocated). Combined with "anonymized" data from, say, carriers, you're no secret to anyone. But the biggest problem is that even if Facebook doesn't trace you, the governments will, and they have a lot more sources. Privacy is extinct!


Too late. And even if you did that, it could likely be deanonymized with little effort.

For those who don't already know, it takes very few data points to deanonymize someone's data.

With greater effort and efficiency you can get by with even fewer data points.
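A rough way to see why few data points suffice: count how many records become unique as attributes are combined. The sketch below uses invented toy records and a hypothetical helper, `unique_fraction`; it illustrates the idea only and says nothing about real-world rates.

```python
from collections import Counter

def unique_fraction(records, keys):
    """Fraction of records whose combination of `keys` is unique in the dataset."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)

# Made-up records with no names attached.
records = [
    {"zip": "10001", "birth_year": 1980, "sex": "F"},
    {"zip": "10001", "birth_year": 1980, "sex": "M"},
    {"zip": "10001", "birth_year": 1975, "sex": "F"},
    {"zip": "10002", "birth_year": 1980, "sex": "F"},
    {"zip": "10002", "birth_year": 1980, "sex": "F"},
]

for keys in (["zip"], ["zip", "birth_year"], ["zip", "birth_year", "sex"]):
    print(keys, unique_fraction(records, keys))
```

In this toy set nobody is unique by zip code alone, but three of the five records are unique once birth year and sex are added.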


Gait recognition already exists.


Moonwalk everywhere and copyright trolls will do the rest.


Wow, that's pretty dark.

On the plus side, pedophiles working at Facebook will have a great time. /s


Did you happen to read this today, or did you just come up with that purely by coincidence?

https://www.invenglobal.com/articles/16506/facebookmeta-exec...


I saw it yesterday. Yes, I was referring to that.


s/working/running


This could happen because previously people explicitly tagged you in their photos, so in new photos you're auto tagged

On one side it is weird, on the other side it makes sense. For example: Einstein is not an FB user. But if people upload photos and tag him, it makes sense that after a couple of photos FB learns what Einstein looks like.


FB is doing with facial recognition what Google is doing with image recognition for captchas.

Calling that "weird" is an understatement, considering what's going on there; Users are asked to identify friends and family for facial recognition, to train an algorithm that FB then monetizes for all kinds of advertisement and surveillance shittery.

The whole thing is imho way past "weird" and firmly in dystopian territory.


But you need a FB account or page to actually tag someone, no?


I killed my facebook account years ago, but back then you could write anything you wanted in a tag.


I think you missed the point about SHADOW PROFILE


Them and the credit reporting companies.


Well, there were multiple AIs in the 90s that could work out who was typing at the keyboard; one of them was about 1-2MB in size. The need for usernames and passwords is somewhat confusing when you know about that and other AIs!

When you consider the level of surveillance that exists today with the mobile phone, smart TVs, vehicle tracking, CCTV, and jobs on computers, you can tell when someone is running late and you can probably even work out why. Even if you don't have a mobile and live in a rural location, as we walk around our bodies interfere with wifi signals (we block them), so you could use routers like a phased radar array, detecting temporary signal weakness and other anomalies. Plus any CCTV that is installed can be used to pick you up, as many might have seen with crime stories or Edward Snowden. I've had PTZ CCTV hacked, and I know of other companies whose PTZ CCTV has been hacked, because they spotted the cameras moving around inside their office as they were not hidden behind darkened domes.

I don't think people realise just how much surveillance there is today, and the US military has largely driven this at arm's length, just like state assets have been IPO'd to keep them at arm's length from the state.

However, when I have reported stuff to the police, it's surprising how MS Windows suddenly doesn't work and makes it impossible to access your screen dumps. That's why you need a basic digital camera with no wifi or Bluetooth to capture what's on screen, because even CD-Rs/DVD-Rs along with RWs will get wiped.

I've even had my home accessed and old hard drives used as last-resort backups wiped from a locked suitcase when on holiday one time, and I only found that out weeks later. But that might be because they held allegations of Jeb Bush & other people rigging the elections in Florida for Bush to get a 2nd term in office. I don't know.


Meetup/LinkedIn invented this. Everyone does it.


A link that is shared on Facebook gets an additional fbclid query parameter (interesting that Google has its equivalent gclid query parameter for links from Google Ads). I think that it is easy to get rid of this query parameter when you want to use such a link. (Just adding this, as I don't see it mentioned in the PDF.)

I think that the purpose of this query param is to act as a substitute for cross-site cookies: when you click on some link, the original URL is passed along as the HTTP referrer header of the next request. If they can get hold of the logs of some affiliated site, then they can possibly track the flow of these fbclid values.

Interesting to observe, how every tiny detail is getting used for tracking purposes. I guess that browsers won't get rid of any specific query parameters when the url is passed as the referrer header, as that would be a violation of the protocol. However it may be possible to write a browser plug-in that does so.
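Removing the parameter by hand is indeed simple. A minimal sketch using only the Python standard library, assuming the set of parameters to drop is just the two click identifiers named in this thread (the set and the function name `strip_tracking_params` are illustrative, not an exhaustive blocklist):

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Click-identifier parameters mentioned in the thread. This set is an
# assumption for the example and is not an exhaustive list.
TRACKING_PARAMS = {"fbclid", "gclid"}

def strip_tracking_params(url):
    """Return `url` with known click-identifier query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))

clean = strip_tracking_params("https://example.com/post?id=42&fbclid=AbC123")
print(clean)
```

A browser extension would apply the same idea to the URL before navigation, so the identifier never reaches the destination site or any later referrer header.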


I work in advertising. The death of cookies is a huge annoyance to the industry. You are 100% correct that gclid or fbclid are just a means to store "the cookie" in plain text in the URL string. All the big programmatic media platforms do it too now (The Trade Desk, Yahoo, etc).


Advertising will change significantly in the years to come. All of this surveillance is not useful for both sides. I think we have reached a point where we can use better social networks and more fun concepts to connect with one another. Hint: I work at finclout.io


not useful? The ability to push targeted advertisements based on this surveillance is what is differentiating internet media from TV; actually cable TV is in steady decline, because they don't have this feature.


Targeted advertisement is an illusion in my opinion. We have been made to believe that it is a game changer for advertisement. But does it really work?

I can only argue from running social campaigns in the past and while the charges to the credit card were real, the clicks from FB, Google, Twitter, etc did not relate to real world app downloads or website visits. Also there is research which at least to some extent backs up my opinion [1]

Just because someone talks about a vacation in Cancun that doesn't mean he/she wants or is able to go for a holiday in Cancun.

You are correct that TV ads are a shotgun approach to advertising. Yet, at least in my opinion, that made them more engaging in the past.

[1]https://www.techdirt.com/articles/20190530/10330742303/new-s...


Thanks! I don't know much about the advertising industry. I just thought about the following question: if some site is hosting a pop-up advertisement, then the JavaScript in that advertisement can read the referrer. Now my question is: do the different advertising networks share that data between one another? I guess that it might be a big business if Yahoo ads traded the obtained gclid parameter values with Google, or vice versa.


I’m reminded of those Apple commercials where a person is wandering around town and is followed by increasingly large groups of people (e.g. the cashiers of all the places they were just shopping), creepily looking over their shoulders at everything.

This was never “OK” for people to be doing before the Internet/Facebook so why should it be “OK” now? Stalking is now stalking “with computers” so that makes it a novel concept?

P.S. That’s why you never give your phone number to ANY company, it’s too easy for them to connect a bunch of dots. And if a service one day decides to ask “for security”, interpret it as “for farming your data” and stop using them.


I've always wondered how this works with CCPA. I attempted to exercise my right to data deletion and was directed to log in. Do those with shadow accounts not have the right to have their data deleted?


I always thought this was an interesting thought exercise just conceptually (not even legally.)

If you and I are friends, is the knowledge of that friendship mine? Is it yours? Can I freely share that knowledge with someone else? The impact of this just became so absurdly large when we started saving those data points forever and mining them for all sorts of purposes they weren't originally intended for.


Technically that isn’t your data it’s their data (or your friend’s data).

It’s definitely a gray area.


If a company has your personal information, it is your data, no matter who uploaded it.


I'm not so sure it's so obvious, legally, that you own facts about you.


Not a lawyer, and I know nothing about CCPA, but under GDPR at least, this is unambiguously your personal data.


I think the point was that this data, the shadow profile, is derived from a friend’s pictures etc, and as such the legal copyright to the picture (that would need to be deleted) lies with the friend.

Or that’s how I understood it anyway, correct me if I’m wrong.


The copyright doesn't affect anything. The information about people derived from the picture is a separate artifact from the picture itself. That information is owned by Facebook, not the copyright holder of the picture, and its distribution is subject to a different set of laws. It counts as Personal Data under GDPR and Personal Information under CCPA, as well as biometric information under BIPA in Illinois.

The real issue is that CCPA can only be enforced by the California Attorney General's office. This means that Facebook's violation of CCPA is, quite literally, as much of a political issue as a legal one. (CCPA's private right of action only applies to data breaches, not to other CCPA violations.)


> We have access to individual-level desktop browsing data of a representative sample of the U.S. population via the market research firm Nielsen. Participants are incentivized to install a software that records all web browsing activity and fill in a survey of basic demographics, such as gender, employment, age, education, and income

This seems to be an inherently flawed collection methodology. The users that one would expect to be involved with these "install this software, download this app and earn free money!!" schemes would also typically be associated with certain activities that would not necessarily reflect the overall population.

The experiment groups themselves are flawed, but then again, I also cannot think of an ethical/legal way to conduct this kind of research.


Selection bias is kind of a HN go-to for dismissing things people don't agree with.

But something I learned in my college statistics courses is that with a large enough sample, self-selection bias stops becoming a problem.

It was a long time ago, so I can't explain the math here. But from what I remember, you need a surprisingly small sample size to actually achieve real representation.


> But something I learned in my college statistics courses is that with a large enough sample, self-selection bias stops becoming a problem.

What you might have learned is that small (random) samples might not represent the full population, but as you get bigger (random) samples they tend to get closer to the true values.

However, if you sample badly, errors will persist even when you sample more.

Example - estimating height in a population:

- If I get a perfect random sample of people then I can estimate population height really well, even with a smallish sample. The estimate gets better the more people I (randomly) sample.

- However if there's selection bias in my sampling and I only sample women, then no matter how many women I sample I'm going to be getting a bad estimate of height across the full population, because I'm excluding men who are taller.

Sample size can't overcome selection bias, you need to use other techniques to manage it.
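The height example is easy to check with a small simulation. This is a sketch on synthetic data: the group means (178 cm and 165 cm) and the spread are made-up numbers chosen only to give the two groups different averages.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Synthetic population of heights in cm; the means and spread are made up.
men = [random.gauss(178, 7) for _ in range(50_000)]
women = [random.gauss(165, 7) for _ in range(50_000)]
population = men + women

def mean(values):
    return sum(values) / len(values)

true_mean = mean(population)

errors = {}
for n in (100, 10_000):
    random_sample = random.sample(population, n)
    biased_sample = random.sample(women, n)  # selection bias: women only
    errors[n] = (mean(random_sample) - true_mean,
                 mean(biased_sample) - true_mean)
    print(n, errors[n])
```

With these numbers the error of the random sample shrinks toward zero as n grows, while the women-only estimate stays roughly 6.5 cm too low at both sample sizes.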


Erm, no, self-selection bias never disappears or even attenuates with sample size, the best you can hope for is to reweight the sample according to known demographics of your target population, which does indeed get a bit easier with larger samples.


> the best you can hope for is to reweight the sample according to known demographics of your target population, which does indeed get a bit easier with larger samples.

That doesn't actually deal with self-selection bias if self-selection correlates with the feature of interest within the demographic groups, which is probably the normal case.
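Both halves of this exchange can be shown with plain arithmetic. A sketch with invented numbers: the population is assumed to be half men and half women, the volunteer panel is 80% women, and the quantity of interest is hours online per day, with a true population mean of 3.0.

```python
def weighted_mean(group_means, shares):
    """Average group means using the given group shares (which sum to 1)."""
    return sum(group_means[g] * shares[g] for g in group_means)

population_shares = {"men": 0.5, "women": 0.5}  # known demographics
sample_shares = {"men": 0.2, "women": 0.8}      # who actually signed up

# Case 1: volunteers are typical members of their group.
typical = {"men": 2.0, "women": 4.0}            # made-up hours online per day
raw_1 = weighted_mean(typical, sample_shares)
reweighted_1 = weighted_mean(typical, population_shares)

# Case 2: within each group, heavier users are the ones who volunteer.
self_selected = {"men": 3.0, "women": 5.0}
reweighted_2 = weighted_mean(self_selected, population_shares)

print(raw_1, reweighted_1, reweighted_2)
```

In case 1 the raw panel mean is about 3.6 and reweighting recovers the true 3.0. In case 2 the same reweighting yields 4.0, because the bias sits inside each demographic group where the weights cannot reach it.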


> Erm, no, self-selection bias never disappears or even attenuates with sample size, the best you can hope for is to reweight the sample according to known demographics of your target population,

Which, to be fair, is literally Nielsen's entire product (for TV at least). I mean, I guess that everyone here understands selection bias at a deep level, but to think that people who sell representative data to big corporations (and have done for longer than many of us have been alive) don't have a similar level of understanding is just weird.


The math is called law of large numbers. It might work well, badly or not work at all depending on the distribution and the sampling methodology.

Statistics as an area is full of gotchas. I never dismiss this sort of complaint unless I have robust assumptions about the distribution being studied.


> The math is called law of large numbers

That reduces sampling error, not non-sampling error.
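The distinction can be stated a little more formally. A sketch, assuming the n observations are drawn independently from a self-selected subpopulation S with mean \mu_S and variance \sigma_S^2, while the target is the full-population mean \mu (these symbols are introduced here for illustration and do not come from the paper):

```latex
\mathbb{E}\big[(\bar{X}_n - \mu)^2\big]
  = \underbrace{(\mu_S - \mu)^2}_{\text{non-sampling (bias) term}}
  + \underbrace{\frac{\sigma_S^2}{n}}_{\text{sampling term}}
```

Only the second term shrinks as n grows. The first is fixed by who ends up in the sample, so the law of large numbers makes the estimate converge to \mu_S rather than to \mu.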


Who needs a distribution when you have a collection of means?


> But something I learned in my college statistics courses is that with a large enough sample, self-selection bias stops becoming a problem.

That's...not true, until you are so close to the whole population that your maximum error from excluding part of the population is less than the error that would otherwise be introduced by bias.

With larger samples, sampling error of a random sample is reduced, but non-sampling error is (with the above caveat) not.


If I'm reading the actual paper correctly, they looked at Nielsen data from 2016 and figured out that FB trackers fire for a large percentage of web traffic for both FB users and non-users. Their conclusion is that it's likely FB has the data to be able to build shadow profiles, not that they have any indication FB actually does build shadow profiles.

I would be curious about an update based on newer data. 6 years later, even more traffic is mobile where privacy protection is stronger and GDPR has companies more concerned about data sharing and trackers. I'm sure if you included mobile traffic, the trend over time is dropping (with a big dip when iOS 14 came out).


Yeah, I was really curious about the current state of things, but lost interest after reading the abstract, which was very careful to use "may" everywhere. It's quite a leap to go from "they could do X if they wanted to" to "they're doing X". FWIW, when I worked at Facebook the internal narrative was that shadow profiles were not a thing. Given that everyone had access to source control, one would think any enterprising engineer could easily contradict this, but I don't recall it being actively questioned internally (while many of the company's policies were).


Man, i hate finally admitting this on HN, but I spent a bunch of years as an FB employee, as a data scientist (on ads).

I can speak with complete sincerity and say that shadow profiles were not a thing, and were never a thing during or before my time (before, I can't be 100% certain, but I schlepped through the repos and never found examples of same).

What generally happens is that you pick a userid (maybe zero for arguments sake), and everyone who doesn't match to an FB userid gets that number. It didn't make it impossible to build an individual profile, but it made it much, much, much more difficult and I never saw anyone do it. I left in 2018, and would be massively surprised if anyone had built this since.

Now, it is entirely possible that not everyone was as rigorous about removing userid=0 (for example), and so some FB data probably counts them, and they may be in some of the clustering models, but the notion that they have profiles indexed by browser/device id is completely false (for ads at least; some of the crap they did for PYMK was insane).
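The placeholder-id scheme described above can be sketched as follows. Everything here is hypothetical (the event shape, the identifier names, the function name); it only illustrates why collapsing all unmatched traffic into one id makes per-person profiles hard to recover.

```python
from collections import defaultdict

UNMATCHED_USER_ID = 0  # placeholder id, following the "maybe zero" example

def assign_user_ids(events, known_users):
    """Attach a user id to each event; unmatched identifiers all collapse to 0.

    `events` is a list of (identifier, url) pairs and `known_users` maps an
    identifier to an account id. Both shapes are hypothetical.
    """
    by_user = defaultdict(list)
    for identifier, url in events:
        user_id = known_users.get(identifier, UNMATCHED_USER_ID)
        by_user[user_id].append(url)
    return dict(by_user)

known_users = {"cookie-a": 101}
events = [
    ("cookie-a", "news.example/sports"),
    ("cookie-x", "shop.example/shoes"),
    ("cookie-y", "blog.example/cats"),
]
grouped = assign_user_ids(events, known_users)
print(grouped)
```

After this step the two unmatched visitors are indistinguishable: their events sit in one bucket under id 0, and the original identifiers are no longer attached.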


Zuck was asked about shadow profiles during his congress hearing. He replied "I'll have to get back to you on that, senator.".

There is also "Your off-Facebook activity" (I guess depending on jurisdiction) which shows me online stores that uploaded my data to FB for ad targeting purposes. Sadly I use the same "junk" email for online shops and Facebook, and FB's page showed me a lot of businesses who gave it my data!


Businesses gave FB data that matched an email with a valid FB profile, so they stored the data and expose it to you in OFA. That's not shadow profiles, that's exposing the info they collected for your real profile.


> Zuck was asked about shadow profiles during his congress hearing. He replied "I'll have to get back to you on that, senator.".

I'm pretty sure that you could ask Mark about loads of existing ads products at FB and he would say the same thing, as he basically delegated all of ads to other people.


> Google... has shifted to using its Chrome browser to track online activities.

I really wish they were more specific about some of these claims.


The whole thing is wall-to-wall innuendo. There's no hard facts in here anywhere.


FB should fix the dozen or so shadow profiles of everyone's grandma that have been cropping up for several years now.


I would be interested to know how the claim here that 50% of all sites are tracked by Facebook compares to Google's profiling across the web...


Without looking up any sources, I would say that Google is even more pervasive. I would estimate around 80-90% of the websites include some kind of Google tracking.


I have been using DDG on mobile for a few years. It shows you the most blocked trackers and Google is by far the worst. Google trackers were found on 31% of websites I visit whereas Facebook trackers are at 4%. My results may be skewed because I never visit or view any content on Facebook, but I do the occasional Google search when I can’t find something.


Only 31%? Are you sure? I mean, I block all Google domains, including fonts and stuff, but 31% seems low. Does that count things like fonts from Google domains and other things that could be used to track users? It feels like whenever I check my ad blocker's block settings for a page, I see Google stuff.


Note that this is on mobile where my searches aren’t all that unique. I mostly visit HN and a few other sites daily. I guess if you want more realistic results then I should look at my pc. I visit a lot more sites when researching/coding so that would be more accurate, but I don’t have the extension installed.


I'm not using DDG but I have checked how many sites use Google facilities which track. A lot more than 0.3.


Are you counting Google Analytics as tracking as well?


Of course. What else would you name it?


If I wanted to learn how to build an algorithm that can create shadow profiles from say a set of data/inputs, where would I best learn that short of working on this at Facebook?



I've seen many apps sending metrics to domains owned by Facebook while I was testing them. This does not surprise me at all.


This evil company going to shit should feel like good news to most of us regular folks.



Direct link to view the pdf file: https://docmadeeasy.com/v/003107877


Every now and then, on HN people complain about how the EU is boring with its stupid regulations (GDPR, mainly). But see, this is exactly the kind of things GDPR is made to prevent, and without it that's what you end up with.


It doesn't prevent much in this particular case though. I closed my Facebook account some 12-13 years ago and I recently contacted them and asked to see what data they had on me (under the GDPR, because of shadow profiles). They absolutely refused to either confirm or deny that they were building shadow profiles but kept repeating that if I didn't have an account, they didn't hold any data on me and then followed up by answering that if someone let Facebook rummage through their contacts and I was among them, then they'd of course hold that data until that someone decided to delete it... So, yeah...


Law alone doesn't prevent anything by itself, enforcement does.

Facebook's answer is likely illegal (Not a lawyer), get in touch with your local privacy defense group.


I watched the Zuckerberg Senate hearings back in 2018. Transcript here [1].

I remember his answer on whether users are tracked when logged off. I mean the answer can really be a very simple Yes. But instead we got this evasion (I lightly cleaned up):

WICKER: One other thing: There have been reports that Facebook can track a user's Internet browsing activity, even after that user has logged off of the Facebook platform. Can you confirm whether or not this is true?

ZUCKERBERG: Senator — I — I want to make sure I get this accurate, so it would probably be better to have my team follow up afterwards.

WICKER: You don't know?

ZUCKERBERG: I know that the — people use cookies on the Internet, and that you can probably correlate activity between — between sessions.

We do that for a number of reasons, including security, and including measuring ads to make sure that the ad experiences are the most effective, which, of course, people can opt out of. But I want to make sure that I'm precise in my answer, so let me ...

WICKER: When — well, when you get ...

ZUCKERBERG: ... follow up with you on that.

[1] https://www.washingtonpost.com/news/the-switch/wp/2018/04/10...


The senate hearings are just theater. Important questions like this should probably be async - compel the companies to provide specific answers on the record.


Especially Feinstein grilling him about privacy, when she votes for and often even sponsors every privacy invading bill and three letter agency out there.


It's not necessarily inconsistent to hold the position that government should be able to invade privacy in ways private corporations should not. As a concrete example, search warrants for crimes.

That said, I'm of the opinion Feinstein is largely senile at this point, and wish she'd retire.


She has supported NSA collection of information on all citizens of the US. There is no reasonable argument for this and it's not even in the same universe as search warrants.


Again, though, "the NSA should be allowed to do this" and "Facebook should not be allowed to do this" are not inherently contradictory.

I share your opinion that the NSA's surveillance is bad, and I'd assert it's unconstitutional, but the hypocrisy/contradiction you're trying to highlight still isn't necessarily there.


We'll have to agree to disagree. I can't square the logic that it's unconstitutional for the NSA to do this and legal for Facebook and yet somehow it makes sense to argue that it's ok for the NSA but not Facebook.


I'm a bit baffled as to why not.

The government can kill me; Facebook cannot. The government can imprison me; Facebook cannot. The government can require I pay taxes; Facebook can not.

It shouldn't be surprising when similar disparities exist on surveillance. The NSA's program has yet to be deemed unconstitutional by the courts, which is what matters.


Ok I get you on that front, the government has a monopoly on violence, might makes right, yes. I mistakenly thought you were trying to argue that somehow it legally or moralistically made sense.


Would you rather have all of your activity information utilized against you for national security or for the private profit of one corporation?


Except for the time she called out the CIA for reading her email.


Oh yeah she's all for spying on everyone but herself.


Yeah, that's the thing: when an everyday American--and especially a political insider--thinks of espionage, they seem to think of it from the protagonist's perspective, like they're going to be James Bond and they're going to have legal privileges ("license to kill") and they're going to have the technological upper hand always. They think spying will benefit them, that they're destined to win because that's how it has always been. Despite identifying much more with the underdog in most contests, Americans don't identify with underdogs in espionage. With exceptions, like in movies about guys beating the system because a branch of the government "went rogue."


Absolutely. Zuck is the ideal subject, as if you asked him whether the sun came up this morning the answer would sound evasive.


Being tracked when logged off/on other sites is a bit different than shadow profiles.

Users are absolutely tracked when logged off or on other sites through 3rd party cookies, aka the Facebook Pixel. If you go on a news site that has the Facebook Pixel, it will record that you went on their site. When you go back on FB, they will check that FB Pixel cookie and see what other sites with that same cookie you've visited. Through that, they can compile a profile of what interests to use to advertise to you.

Shadow profiles are a bit of a different story, since that would essentially be FB compiling a profile of someone that has gone to all these sites with FB Pixel, but doesn't have a FB account. That's entirely possible, and would make it so that if you do eventually make an account with a FB-owned product, they've already got all of that info on you to start targeting ads.
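To make the mechanism concrete, here is a minimal Python sketch. It is a toy model under my own assumptions, not Facebook's actual implementation: the `PixelTracker` class, its method names, and the `.example` sites are all made up for illustration. It shows how one cookie ID ties together visits across sites, and how that history is already there the moment the browser is linked to an account.

```python
from collections import defaultdict
from typing import Optional
import uuid


class PixelTracker:
    """Toy model of a third-party tracker embedded on many sites (hypothetical)."""

    def __init__(self) -> None:
        # cookie ID -> list of sites where the pixel fired
        self.visits: dict[str, list[str]] = defaultdict(list)
        # cookie ID -> account ID, only known once the browser logs in
        self.accounts: dict[str, str] = {}

    def pixel_fired(self, site: str, cookie_id: Optional[str] = None) -> str:
        """Record a page view; issue a new cookie ID if the browser has none."""
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
        self.visits[cookie_id].append(site)
        return cookie_id

    def login(self, cookie_id: str, account_id: str) -> list[str]:
        """Link the cookie to an account and return the history gathered so far."""
        self.accounts[cookie_id] = account_id
        return list(self.visits[cookie_id])

    def shadow_profiles(self) -> dict[str, list[str]]:
        """Browsing histories that are not linked to any account."""
        return {c: list(v) for c, v in self.visits.items() if c not in self.accounts}


tracker = PixelTracker()
cookie = tracker.pixel_fired("news.example")   # first visit, cookie gets set
tracker.pixel_fired("shoes.example", cookie)   # same browser, another site
print(tracker.shadow_profiles())               # history exists with no account
print(tracker.login(cookie, "new-user-1"))     # history is attached at signup
```

Blocking third-party cookies corresponds to `pixel_fired` never receiving the earlier `cookie_id`, so every visit looks like a new browser and the histories cannot be joined this way.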

The lowest-hanging fruit and most directly impactful ways to prevent this as a user:

1. Use a browser or browser extension that blocks 3rd party cookies

2. Use an email alias service like Firefox Relay. This allows you to generate a random email address for every site you make an account on, and all those email addresses forward emails to your actual email.

Using the same email everywhere is essentially the same as what FB Pixel does, it allows all these sites to share with data brokers that bob@gmail.com has made accounts at these other websites.

3. This is a bit harder/not as cheap to do, but the same applies to using the same phone number when signing up to sites: it allows data brokers/ad networks to connect accounts across different sites to the same person. If it's not required, don't provide a phone number.

If it is required and the number will be used to send important info, try to use a disposable phone number service that forwards to your personal phone number. If it's required but the number won't be used for important communication, use a fake number like 123-456-7890
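
As a rough illustration of why per-site aliases help (point 2 above), here is a small Python sketch. Everything in it is an assumption for the sake of the example: `alias.example` is a placeholder domain, `new_alias` and `linked_accounts` are hypothetical helpers, and the forwarding step an alias service performs is not modelled. It only shows the join that a shared email enables and that random aliases defeat.

```python
import secrets

ALIAS_DOMAIN = "alias.example"  # hypothetical relay domain


def new_alias() -> str:
    """Return a random, hard-to-guess address meant for a single site."""
    return f"{secrets.token_hex(8)}@{ALIAS_DOMAIN}"


def linked_accounts(records: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Group (site, email) records by email, as a data broker might.

    Only emails seen on more than one site are returned.
    """
    by_email: dict[str, set[str]] = {}
    for site, email in records:
        by_email.setdefault(email, set()).add(site)
    return {e: s for e, s in by_email.items() if len(s) > 1}


sites = ["shop.example", "forum.example", "news.example"]
same_email = [(site, "bob@gmail.com") for site in sites]
per_site_alias = [(site, new_alias()) for site in sites]

print(linked_accounts(same_email))      # all three sites grouped under one email
print(linked_accounts(per_site_alias))  # nothing shared, so nothing to join on
```

The same reasoning applies to phone numbers in point 3: any identifier reused across sites acts as a join key.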


100% agree with strategy #3, except I'm finding it harder to implement:

1. Everybody wants your phone number these days, especially those you don't want to give it to. From WhatsApp and Signal, which use it as your main identity whether you want to or not; to social sites like Facebook or Twitter that MAY let you sign up without a phone, but "flag" you for security on first login and require a phone; to other sites, whether Gmail or otherwise, that require a phone to sign in

2. More and more of them these days send a text to phone to verify it belongs to you

I'm therefore finding it harder and harder to not give my phone to everybody (of course, "not using the product" is always a possibility, so I still don't have a twitter account and by all accounts my life is better for it :)


For sure, though at the very least the examples you gave are mostly other social networks. So at worst, it allows them to know what other social networks you use.

What you want to avoid is using your phone number when doing things like online shopping, since that's when more personal details about you can be connected to your number, and therefore to the other social networks you used that number with.


> Being tracked when logged off/on other sites is a bit different than shadow profiles.

It all tastes like chicken. The tracking mechanisms are identical. You are probably given some "Ad ID". And that Ad ID correlates with your facebook ID if you have a facebook account.

Calling it a "shadow profile" sounds sinister. But it's just commonplace tactics that any ad network is going to deploy. Facebook just happens to have more information on you than others.


Ha, that is so funny Mr. Zucker! "the ad experiences are the most effective" -- Yes, effective for whom? Surely not the users, because their "ad experience" (even that expression in itself already makes me gag, omg, must be out of some marketing horror movie or something) is surely most effective, when ads do not show up at all. Ads are annoying, manipulating, and getting in the way of productivity. Surely Zucky did not mean to track users to get rid of those pesky ads, which are shown without actual consent.


It's such a broad cop out. Very much like the "we use (horribly intrusive surveillance technology that is complete overkill for any imaginable purpose today, is almost certainly illegal to collect without explicit consent, but we may find useful in 5 years when we trawl all the logs and mine it for a tiny modicum of profitable edge) to improve our products and services, and some other random long sentences that sound innocuous and something else, look at the fluffy bunny, (huge) *CONTINUE* button, pre-selected (tiny) not right now, doesn't even look like a button".

Improving the ad experience could mean they stick a probe up your rectum and see if your bowels move better or not, for all they care.


I've seen several ads in my life that directed me to things I was legitimately interested in. Certainly a hell of a lot more online than on TV.


That is not a good tradeoff for a massive surveillance state that pesters and obsesses over users, profiling them to a degree literally unimaginable in the time of Orwell.


I don't find the surveillance to be intrusive, so it's acceptable.

There are already several companies that build a credit history out of every major transaction I do. There's at least two companies that have parts of a full credit card transaction history on me. Almost every store I walk into has security cameras monitoring me. The level of surveillance I'm already living under is so high that if I had anxiety about that sort of thing I'd have run for the hills when I turned 21.

The ad surveillance networks are impressive in scope, but about on-par with their peers in finance.


> I don't find the surveillance to be intrusive, so it's acceptable.

Interesting opinion for somebody with a nick like 'shadowgovt'. I get your point, but even so, it shouldn't be OK in any meaningful way. It can easily end up as a slippery slope that is extremely hard if not impossible to come back from, and the intrusion into one's privacy goes deeper and deeper till you have absolutely zero.

Nobody alive in this world has absolutely nothing to hide. Maybe ass warts or shape of sub-par penis, some rather unusual preferences or opinions on XYZ, body odor when sweating or locations of body hair, whatever.

We shouldn't have OK categories for intrusions into the most private parts of our lives, period. Terrorists, ad optimization, blahblah whatever, just nope. At least have it disabled by default, and if one is brave enough, go ahead and enable it to get that $5 discount. I feel very strongly that I don't want my children to live in a world like that; how can we fuck up such a basic and important thing?


I don't think we disagree on anything other than whether "the sites we read on a global telecommunications network" count as "most private parts of our lives." Every page accessed is a two-way conversation between you and a stranger's property. That's pretty shifting philosophical sand to ground an argument on the privacy of those conversations in.

Why does the stranger not have the right to let a third-party know you talked to them?


Yes, people tend to have different experiences in life. Ads to me are extremely disruptive, I cannot focus and read text when there are moving images and flashing colors surrounding it. I consider it an affront to accessibility for the attention-challenged, they are designed to distract and derail my train of thought, shouting "forget what you came here for and look this way"


As an accessibility issue, I'd recommend modifying your user-agent with ad-blockers, etc. A11y is something every site should support, but expecting the world to bend everyone's experience around particular accessibility needs is likely to leave one disappointed.


Of course I do what I can to adapt things on my end, but I will also publicly complain that the "ad experience" is not a benevolent one.


When you consider that Zuckerberg lies all the time about everything, it makes sense


I think he doesn't lie, he just has a severely warped view of reality, and an incredibly inflated sense of his own capabilities. He won a lottery and thinks it was intentional.

His public blabber on AI and automation is a great peek inside his mind.

The mindset is "That problem isn't solved because I haven't worked on it yet." There's no self awareness or humility involved, and he can afford the apparatus to maintain that for the rest of his life.


He's not even close to this naive. He's a sociopath, as evidenced by even his earliest statements calling his users idiots for trusting him. I worked closely with Facebook around 2011 and everyone I encountered knew exactly what they were doing and that it was fucked up.

It's very difficult for those with ethical and moral standards to grasp that there are truly nasty people out there.


Both things can be true. He can be a sociopath who knows what he is doing and a naive true believer in his own abilities in alternating time slices. Most crazy people are several contradictory varieties of insane all at once.


What reason do you have to believe that he is anything but a selfish liar only out for himself?


This is solid reasoning here. On the “lottery” side, imagine if he went to college (where he scraped student profiles) a couple of years before or after, his timing would be completely off and there would be no Facebook.


He won a lottery from stealing other people's ideas


Rhetorical question ahead: Wasn't this a conspiracy theory? What happened?

Well, short memory/attention span or straight-out ignorance of the masses happened. When those ideas first circulated, mainly from affected users noticing such practices (shadow-banning, for example), those users were the first to be burned at the stake. Then whistleblowers came and dropped some well-intentioned crumbs, first amongst circles of "techies", mainly anonymously. (It was much later that 'actual proof' was given to the media outlets, the vast majority of which disregarded it.) It did not matter in the grand scheme though: employees were fired, media articles did damage control (>for< the company, more often than not, because a company doing its own damage control is partially admitting a degree of truth to the claim), and let's also mention the 'useful idiots' who believed the authoritative voices because 'history is written by the victors', which is now mostly a cyclic numbers game of how well one controls a narrative in the social network (or any other information channel). Considering the attacked entity is the social media platform itself, the discourse medium was inevitably advantageous and easily skewed for the platform to defend itself. The more principled either stayed in the mud, fighting skeptics of the rumors, or moved to less censorious and/or anonymous platforms. Truth ultimately did not matter; the platform did the required divide & conquer to shift the attention.

It's really miniaturized politics, except it's actually worse: there's no democracy (well, unless you're talking board of directors and such, but that almost never happens: such optics tank your stock). To quote the hypocritical statement of people who like authoritative voices and also 'like the free markets' [which, by the way, is ideologically a contradiction, unless you're by definition a fascist; granted, here the authority has shifted from the state to the company itself]: "They're a private company, they can do whatever they want." At the end of the day FB is already ~dead, and Zuck knows this. Some people (users who know the skeletons in the closet, the company, entities that use it to push their narratives) probably will continue to ride it out as long as the naivety of the vast majority continues.


NSA, Facebook, LexisNexis (née Seisint), others, have uniquely identified and track everyone, in near real time. There need not be any bots, sockpuppets, fake accounts.

Alas, fraud is Facebook's biz model. Effectively preventing inauthentic activity would reveal the lie. Better the bureaucratic kabuki, shielding Facebook (and others) with the respectable veneer of plausible deniability.


For multiple reasons, this is not accurate. The actual situation is difficult; let's not use words like "everyone", "all", "always" for untangling the mess, right?


For the Americas, since at least mid 2000s, everyone is tagged and tracked in near real time. I'd be surprised (shocked) if those systems weren't global today.

How could it be otherwise?


Got proof, or "just trust me"?

How do people still disappear, or commit attacks like the Boston bombing, and the authorities have to go on manhunts trying to figure out who they were?


Proof? You want proof for something that has to be true? Phones, accounts, habits, preferences, property, births, deaths. Everything recorded, aggregated. Everything. Do you doubt this?

Everyone's entire lifestream is archived. To leave no record, no trace, takes extraordinary effort and resources. Or to be completely off grid, living like a neolithic nomad; the same as not existing at all.

--

Even in the 2000s, Seisint was being used to solve cold cases. Queries to identify suspects. Big data to reveal anyone without an alibi.

It's like the 1,000s of rape kits, collected but unprocessed. Why? What possible reason can there be to not investigate? To not close those cases?

We can only guess. Perhaps the people responsible don't want to know.


I think we need a few more dots between "my birth certificate is in the county courthouse" and "everything I do is tracked in real time" to make the connection.


Near real time.


This event was ~9 years ago. That's eons in the tech industry.



