Created a fake account and Facebook still figured out who I am. How? (reddit.com)
129 points by 99percentfound 18 days ago | 89 comments



I think there's a much simpler answer than some sort of "deep browser fingerprinting" or other scary voodoo.

1) User requests account deletion. Facebook does not delete it, keeps track of his phone number via shadow profiles. (Mike is still mike@gmail.com, 800-555-1234)

2) User creates a new Proton Mail address. (Fakey@protonmail.com)

3) One of his friends adds that email address to his contact entry. (Mike, mike@gmail.com, fakey@protonmail.com, 800-555-1234). This friend is on Instagram, Facebook, or has some mobile app that uses their analytics.

4) Facebook makes the association after scraping his friend's contact list.

Essentially, his friends betrayed him, likely unintentionally.
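A minimal sketch of the association in steps 1-4, assuming each uploaded contact entry is just a set of identifier strings. The function name `link_identifiers` and the data layout are hypothetical and chosen for illustration; nothing here reflects Facebook's actual code.

```python
from collections import defaultdict


def link_identifiers(contact_entries):
    """Cluster identifiers that co-occur in any contact entry (union-find)."""
    parent = {}

    def find(x):
        # Register unseen identifiers as their own root, then walk to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for entry in contact_entries:
        identifiers = sorted(entry)
        for identifier in identifiers:
            # Merge every identifier in the entry into the first one's cluster.
            parent[find(identifier)] = find(identifiers[0])

    clusters = defaultdict(set)
    for identifier in list(parent):
        clusters[find(identifier)].add(identifier)
    return list(clusters.values())


# Hypothetical data mirroring the example in the steps above.
shadow_profile = {"mike@gmail.com", "800-555-1234"}
friend_contact = {"mike@gmail.com", "fakey@protonmail.com", "800-555-1234"}
for cluster in link_identifiers([shadow_profile, friend_contact]):
    print(sorted(cluster))
```

A single contact entry that mixes the old and new identifiers is enough to merge both into one cluster, which is the point of step 4.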


This is 99% likely the case. Everyone thinks that we need technical voodoo to figure out who a user is.

No, people are just far worse at separating their “new” identities from their old ones than they think.

You don’t even need a shadow profile tracking previous info to do this. Just infer PYMK via a few friends he adds on the new account.
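As a rough sketch of how a few added friends could be enough, assume the friend graph is a plain mapping from account name to a set of friends. The function `rank_by_shared_friends` and the sample accounts are made up for illustration and are not a description of how PYMK actually works.

```python
def rank_by_shared_friends(new_friends, friend_graph):
    """Rank existing accounts by how many friends they share with a new account."""
    new_friends = set(new_friends)
    scores = []
    for account, friends in friend_graph.items():
        shared = len(new_friends & set(friends))
        if shared:
            scores.append((account, shared))
    # Most shared friends first; ties broken by account name for stable output.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical graph: the new account has added only two friends so far.
graph = {
    "old_mike": {"alice", "bob", "carol"},
    "stranger": {"alice", "zed"},
    "nobody": {"zed"},
}
print(rank_by_shared_friends({"alice", "bob"}, graph))
```

The old account rises to the top because it is the only one that overlaps with every friend the new account added.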


From the comments it sounds like the real answer is cookies: https://www.reddit.com/r/privacy/comments/am1hi0/a_created_a...


My one and only Facebook profile was a complete fake, and yet it managed to recommend I follow many of my real family and some of my social circle. My guess is this was simply based on my location, because the people it recommended were also family/friends of people who had used Facebook from my house.


IP address, ever heard of that nearly impossible to anonymize thing?

Facebook can also build a profile based on loads of their logo image using the IP... Including rough geolocation if it is shared.


Hence my guess that it was likely based on location.


In the Reddit thread someone brought this up, and OP explicitly stated the protonmail address was brand new, and created solely for this purpose


Hi, I'm the original poster on Reddit.

It's a great theory -- and I can see how this could easily happen to someone - but it was not the case in this scenario. In fact, doing so would completely defeat what I was trying to achieve. The email address I used for Facebook was unique, not used for anything else, and not shared with anyone else. It was completely isolated.


So...cookies? https://www.reddit.com/r/privacy/comments/am1hi0/a_created_a... (suggested below by eridius)

And riverdan points out you didn't seem to clear browser cookies since you last used your old personal FB account?


Yeah, that's what the running theory is. Kind of angry with myself if that's the case. However, as I point out in my second update, the whole endeavor seems to be awfully tricky. One misstep and privacy leaks.


Maybe recreate it, but take out all the associations that people are suggesting here?

Spin up a new Windows VPS and create all activity over there and see if FB still connects you?


Yep, IIRC it was confirmed that FB cross-references people's contact lists to match profiles to people.


This shows how just trying to protect your privacy alone doesn't work. Facebook relies on the unintentional betrayal of your friends and associates.

You might not have put your phone number into facebook, but someone you know put your number in with your full name and the facebook app harvested it. Facebook now has your phone number in your shadow profile.

Everyone less privacy savvy than you will inadvertently permit a corporation to harvest your personal information to profit from.


Facebook also matches people by who searches for whom. If one of his friends searched for him under the new name, then it can make the connection that way.


> One of his friends adds that email address to his contact.

And what if a user creates a fake account without adding his real or current/old friends to it?


Here's another, this time with LinkedIn.

I took an ambulance ride. The attendant in the back talked with me during the ride to monitor whether I was remaining lucid. Did not know him, never met him before, never saw him since.

A week later, I was scrolling through LinkedIn recommended connections and saw a face I recognized, but could not place.... until I saw that they worked for the ambulance company that I had ridden with.

Did LinkedIn track both of our locations and figure out that we rode in the same vehicle at the same time? This is absolutely possible. Did LinkedIn use voice prints to confirm that his voice came through my phone and mine through his and therefore we had a conversation? Can do.

Did they?

Color me freaked out.


Most likely explanation: the ambulance attendant looked at your LinkedIn profile.


Attendant Google'd name of the patient -> clicked on LinkedIn profile. That would be enough to get them suggested to you.


On the drive to my location... makes sense.


You said you saw it a week later. He could've looked at your profile any time in that week to establish a viable connection between you two.


Why would it have to happen on the drive?


You could use it to find questions to check how lucid they are.

Equally, they may have said something that piqued the employee's interest enough to check afterward.


Similar thing happened to me with both Facebook and LinkedIn. I visited a bank and spoke to an employee for about 10 minutes about opening an account for my business. My location was switched off and at no point did we exchange numbers.

A day later, I saw that particular employee in my suggestions on both LinkedIn and Facebook

We of course had no mutual connections at all


You had the Facebook app on your phone -- what is it you're surprised about?


So what? There had been hundreds or thousands of faces in the recommendations you didn't recognize.


What's the population of your town?


A suburb of 120,000 in Silicon Valley (3,000,000 in contiguous cities/suburbs).

Start and end points both in the same city limits.


People suggesting outlandish, complex fingerprinting methods should read through the post comments. OP admitted they didn't clear browser cookies since they last used their old personal FB account. Mystery solved.


"complex fingerprinting methods" sounds spooky, but in 2018 that's just a plug and play library, equivalent to cookies.


Yup. And even if he did clear cookies, it would still be pretty trivial for Facebook to associate the accounts through other cookie-like persistence mechanisms. No need for fingerprinting, though they're probably doing that, too.


While the user might have issues (like cookies and other features) - I can guarantee you Facebook does all kinds of creepy stuff to identify who you are. Worse yet, once they think they've successfully identified you - they share your details with who they think you are. Personal example:

Recently I wanted to have a look at a few ex coworker profiles (who are not my friends on FB). I didn't want to use my personal account because then it suggests me to them (something I wanted to avoid, as I'd not been in touch with them for almost a decade).

1. I created a VM (Ubuntu 18.04 + Firefox + uBlock -> enabled everything in uBlock).

2. Tried to create an Fb account -> asks for phone number. I didn't want to be identified so I could not continue.

3. Tried another way to create a new account -> success.

4. Fb obviously tried to figure out who I am -> was unable to do so at that point -> Forced me to post a picture of myself (and suspended my account until I did and they verified it).

5. Posted a made up picture and got past the first hurdle

6. Fb asked me for a phone number -> Logged out and used another means to log in.

7. Fb locked my account and asked for another picture (did similar in Step 5 once again)

8. Looked up my ex co-workers.

9. Up to this point I had not been identified. Then I looked up a friend's profile (this friend is also a friend of my personal account on Fb). FB immediately identified me and showed my entire friends list as suggested friends.

10. I immediately tried to delete that profile (took 30+ days and they asked for Govt ID).

I've had multiple fake FB accounts, and FB's fingerprinting and data sharing is insanely crazy - I recently logged out of one of my fake accounts on iOS via Safari Incognito (no FB app, Safari is always used as incognito) - it showed my personal phone number in the log in field.


You'll need to change your IP as well. Create the account from an address you have never logged in from. Sock puppet accounts can be associated by IP and social graph searches.


I used VPNs. But FB keeps track of a lot of public VPN & Cloud providers and then throws a ton more "captchas" your way - asking for your picture, govt id, phone number verification etc.

As for searches - I've searched a lot of random stuff totally unrelated to my personal account JUST to throw FB off while acting like a real user (liking, reading, scrolling etc.).


It only takes one to five high probability data points to link you to the original identity near flawlessly. Something rare. Clicking on a nonpublic person is one of those.
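To make the "one to five data points" claim concrete, here is a hedged back-of-the-envelope sketch that combines independent signals as likelihood ratios on the odds scale. The prior and the ratios below are invented numbers for illustration only, and treating the signals as independent is a simplifying assumption.

```python
import math


def posterior_probability(prior, likelihood_ratios):
    """Update a prior match probability with independent likelihood ratios."""
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        # Each signal multiplies the odds; in log space that is an addition.
        log_odds += math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical: one candidate among a million, then a few rare signals,
# such as viewing a non-public person's profile.
prior = 1e-6
print(posterior_probability(prior, [1e4]))
print(posterior_probability(prior, [1e4, 1e3, 1e2]))
```

Under these made-up numbers a single rare signal leaves the match unlikely, while three of them push it close to certainty, which is consistent with the comment's point that only a handful of rare events are needed.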


You can use disposable phone numbers to receive text messages


FB blocks those numbers, I've tried those in the past. Almost all major services that use phone verification ignore those numbers outright or act like they accept them but either:

1. don't bother sending texts

2. shadowban you


I've had some iffy experiences doing that with FB, Google and Azure.

Azure wouldn't even let me use my actual main number because it happens to be a google voice number and they actively block voip numbers (seems they look up the CLEC info somehow).


OP was probably not a very technical user, else he/she would've understood that

1. They should've deleted _all_ relevant cookies (in the browser, as well as in the browser's cookie database)

2. There are many 3rd party companies that sell data packs that derive residential IPs from VPN IPs (we use some at work). A trusted/good VPN is a must

3. They probably came via the same User Agent (didn't mention changing browsers)

IP + Cookie + User Agent = Fingerprint (not a good one, but will work for Facebook's needs)
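A small sketch of the "IP + Cookie + User Agent" idea, assuming a session is reduced to just those three fields. The `Session` class, the `shared_signals` helper, and the sample values are hypothetical; a real system would weigh many more attributes.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Session:
    ip: str
    cookie_id: Optional[str]  # None means no cookie was presented
    user_agent: str


def shared_signals(a, b):
    """Return the names of the fields on which two sessions agree."""
    shared = []
    for field in fields(Session):
        value_a = getattr(a, field.name)
        value_b = getattr(b, field.name)
        if value_a is not None and value_a == value_b:
            shared.append(field.name)
    return shared


# Hypothetical sessions: the second one was made after clearing cookies.
old = Session("203.0.113.7", "abc123", "ExampleBrowser/1.0")
new = Session("203.0.113.7", None, "ExampleBrowser/1.0")
print(shared_signals(old, new))
```

Clearing cookies removes one signal, but the two sessions still agree on IP and user agent, which is why the list above treats them as a combined, if weak, fingerprint.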


There are many more identifying characteristics

https://panopticlick.eff.org


Great resource, thanks


> 2. There are many 3rd party companies that sell data packs that derive residential IPs from VPN IPs (we use some at work). A trusted/good VPN is a must

Are you implying that ProtonVPN is not trusted/good? I'm seriously interested in what you know about this particular VPN provider, as this is the one OP mentions he was using.


I suspect that the VPN provider has nothing to do with it (i.e. they're not getting the data from VPN providers ratting you out). Rather, they're linking your IPs using third party cookies.
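A hedged illustration of that linking step, assuming a tracker sees a log of `(cookie_id, ip)` pairs from pages that embed its content. The log format and the function name `ips_by_cookie` are assumptions made for this sketch.

```python
from collections import defaultdict


def ips_by_cookie(request_log):
    """Group IPs by cookie ID and keep cookies seen from more than one IP."""
    seen = defaultdict(set)
    for cookie_id, ip in request_log:
        seen[cookie_id].add(ip)
    return {cookie: ips for cookie, ips in seen.items() if len(ips) > 1}


# Hypothetical log: the same browser cookie shows up with and without the VPN.
log = [
    ("cookie-A", "198.51.100.20"),  # VPN exit address
    ("cookie-A", "203.0.113.7"),    # residential address
    ("cookie-B", "192.0.2.44"),
]
print(ips_by_cookie(log))
```

If the same cookie is ever sent once with the VPN off, the VPN exit address and the residential address end up in the same bucket, with no cooperation from the VPN provider required.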


Yea definitely don't take that as me insinuating anything about ProtonVPN. However _absolutely_ take that as me saying "just because you're behind [insert VPN name], doesn't mean someone can't derive your actual IP" (however that may happen). There are companies that sell this service. Again, I specifically mention this because this is how we at [large, known ad tech company] deal with user VPN traffic


Can you say what companies? or at least how to find them? I tried searching but only found junk.


>2. There are many 3rd party companies that sell data packs that derive residential IPs from VPN IPs (we use some at work). A trusted/good VPN is a must

link to some of those companies?


What is a trusted/good VPN to use?


Take all advice on this with a grain of salt please. And never directly trust someone who says "use this specific VPN because it is good". Their threat model may not match yours, so good for them may not be good for you.

If you would like to find a good VPN using the data yourself, consider using these tools:

This site has a detailed list of all VPN providers and properties of their services.

https://thatoneprivacysite.net/vpn-section/

And this site talks about some of the concerns you may want to have when picking a VPN service.

https://www.privacytools.io/#vpn


NordVPN


NordVPN is in a weird spot.

https://restoreprivacy.com/lawsuit-names-nordvpn-tesonet/

I use PIA but do your research.


NordVPN owners are weird, their side businesses are shady, but the VPN is completely legit.


There are tens of attributes that can be used to identify you and generate a unique id of it. Cookies are just the 101.

Canvas fingerprinting, extensions, screen resolution, etc.

I read somewhere that FB is pre-creating profiles for those who haven't even created a FB account yet (face recognition, etc.).

Avoid using it.


I've heard that when a man and a woman are having sex, Facebook already pre-creates a profile for their future child.


I know for sure if you make a second account and sign into a device like a phone or tablet where you've already signed onto another account - it will realize the connection and start suggesting you friends from your original account.

I'm not sure if that's taking MAC address of the device or the phone number into consideration or what - but it's definitely a bit creepy


If he ever accessed FB by phone, that was the rat. FB is built into most popular apps, and those rat you out many times an hour... in some cases, like Spotify, on every song change.


Of course it's possible; they use tools similar to https://github.com/Valve/fingerprintjs2 . I found a way to "confuse" such algorithms using a browser called Palemoon, yet there are a lot of factors that could affect the result; one simple mistake and everything is screwed up.


If he uploaded his photo, maybe it was facial recognition associating him with group pictures in his friends' accounts.


Browsing just/mostly the profiles of people you know is enough, I suppose. You'd have to browse FB randomly and hide your true intentions in the noise.


Don't forget data-sharing relationships with major digital retailers


IP address, keystrokes fingerprinting, and more


cameradust, browser fingerprinting, graphic/sound card canvas, memory canvas mouse usage, keyboard usage, cpu serial number, MAC, router MAC, system logs, telemetry


Sounds too complicated, and half of that is not even accessible from the browser. He used just the browser.


BTW stop creating voting cadres, it is trivial to trace them back to the central login and ban that too


JavaScript has a lot of power, and it accesses more than the browser, and none of this is remotely too complicated; it happens on a daily basis. Your browser does a lot more than you think. FB most definitely does not stay in your browser when it gets its hooks in you. BTW, modal dialog buttons do more than the label says; the old meme "what button do I push to hack someone" can be rewritten as "what button do I click to let Facebook hack me".

It soon becomes apparent that not a lot of people know as much about their hardware and/or software as they think they do. Don't forget Zuckerberg's roots; he's a black hat 101%.

FWIW, all you downvoters need to bone up on your skills if you think this, and more, is too complex.


How would you access my CPU serial number or my router MAC address from JS? I would think you’d need a zero-day to accomplish this, and it seems dumb for Facebook to be burning browser exploits on fingerprinting when they could make much more money selling them.


> JavaScript has a lot of power, and it accesses more than the browser, and none of this is remotely too complicated

While you can fingerprint a browser relatively well with JavaScript, it does not have access to a number of the things you list, like MAC, router MAC, CPU serial number, etc.


when you click a modal button are you sure it does what is assumed? recall win10 upgrade scandal? (the X)


>when you click a modal button are you sure it does what is assumed? recall win10 upgrade scandal? (the X)

Hijacking a button to do something different is a matter of attaching a different event to it - which is exactly what you can do with windows forms for things like "do you want to save?" To access information that's not actually available is a whole other level. Why would you consider those two are equivalent?

edit: To include parent's question and clarify why it's trivial.


Too much "1337 h4x0r" stuff.

It's public knowledge (or even open source code) what browsers expose, and system logs or CPU serial numbers are certainly not part of that.


If you give permission from an admin login, it is trivial to take carte blanche of a machine; it's trivial to convince someone to give you full access.


You're implying that Facebook is - widely - using unknown privilege escalation exploits in browsers for tracking purposes? That's absurd.


Of course the practice is absurd, but that doesn't make it nonexistent. A nontrivial number of average users run around the web in an administrative account; there is no escalation required when a script is already executed with admin, or even root, permissions. The rest is academic.


Could you provide the JavaScript code that would allow a modal button to access these items?


You should look at what microsoft did to sneak its way into a win10 install, you should also look at what google does to snarf permissions with a button by another name.

the breadcrumbs can start here.

https://apenwarr.ca/log/20190201

You're a smart kid; figure it out. You'll learn more that way versus being spoon-fed.

BTW I'm sure that posting such code here would be a criminal activity that dang and others would frown upon profusely.


Original:

Link?

Edit: >You should look at what microsoft did to sneak its way into a win10 install, you should also look at what google does to snarf permissions with a button by another name. the breadcrumbs can start here.

>https://apenwarr.ca/log/20190201

> You're a smart kid; figure it out. You'll learn more that way versus being spoon-fed.

That has zero technical information, just a lot of vague hand-waving. Give me something technical here.

New Edit:

> BTW I'm sure that posting such code here would be a criminal activity that dang and others would frown upon profusely.

You're claiming this can be done in JavaScript. If it can be done in JavaScript, it's not going to be illegal.


@mod team/dang, can you take a look at the origin of accounts ohWARisme, meetuu, and zucksablackhat? Highly suspect they're just alt accounts, and they're spamming this thread with... questionable quality discussion/FUD.


The best way to get the mods' attention for matters like this is to email them directly via the Contact link in the footer.


cameradust? that's clever if it means what i think it means, aka looking at artifacts in a photo caused by "gunk" on your lens.



I wonder how that's affected by post-processing - most cell phone cameras seem to have a ton of post-processing applied these days, which you'd think would cause that sort of gunk to count for less.


Yes it does. There is also acoustic analysis of keyboard noise, quite a lot, and this all happens at the time of account creation and early account use, until the FB AI thinks it knows who you are. There is no need for constant listening or watching.


So you're claiming that Facebook is using 0days to bypass browser controls on your mic to do acoustical analysis of keyboard noise when you sign up?


Do you have a source for this? I've done a reasonable amount of work on weaponizing these sorts of attacks and it's definitely nontrivial. I'd be shocked to find out that Facebook had successfully deployed them at scale.


Nontrivial is in the eyes of the beholder. FB does a lot of quite shocking things, and uses zero days and soc eng like jack the bear. Don't forget Zuck's roots; he did this stuff from day one. BTW I'm not about to distribute hack source on HN; pearls among swine goes nowhere here.


Let's please not hypothesize exotic attack capability without evidence. It makes it difficult to get people to pay attention when we really need them to be concerned about sophisticated adversaries.


After reading through this thread, I find some very concerning things. The use of the word nontrivial is one thing. Calling something nontrivial does not mean you have expertise; there are a number of analytic tools that are being called nontrivial or too complex, and that means it is nontrivial to the caller. The rest of us who use these techniques as often as breathing find them trivial, to say the least. The lack of knowledge regarding just how extensively big data reaches into our platform is another. Let's look at Google: something mundane like sorting pictures according to location can be accomplished by scraping EXIF data, but Google uses the permission request as an opportunity to turn on location tracking in the background. Facebook most definitely does not refrain from background permission jacking. The tools and APIs lying around in the open for anyone to take advantage of are another. Win10 telemetry provides an intimate snapshot of a Win10 instance, and anyone can use that telemetry to fingerprint your hardware. Even encrypted telemetry is a highly individual number that doesn't need to be decrypted to indicate an individual machine across the internet.

The title of this forum, "hackernews", is concerning; there seem to be no hackers here at all. All of the "nontrivial" actions here are nontrivial to those that know very little about CS in general, or have an extremely antiquated perspective on the nature and extent of system penetration as it stands today. A "normie" is not likely to know anything about lockpicking, but even an apprentice locksmith finds it extremely trivial; it's just a matter of perfecting dexterity over a couple of months to be quick and slick about it.

Facebook is one of the biggest threats to national security we have in our back yard, and the lackadaisical attitudes displayed here regarding security only set that threat in stone and in perpetuity.


> After reading through this thread, I find some very concerning things. The use of the word nontrivial is one thing. Calling something nontrivial does not mean you have expertise; there are a number of analytic tools that are being called nontrivial or too complex, and that means it is nontrivial to the caller. The rest of us who use these techniques as often as breathing find them trivial, to say the least. The lack of knowledge regarding just how extensively big data reaches into our platform is another. Let's look at Google: something mundane like sorting pictures according to location can be accomplished by scraping EXIF data, but Google uses the permission request as an opportunity to turn on location tracking in the background. Facebook most definitely does not refrain from background permission jacking. The tools and APIs lying around in the open for anyone to take advantage of are another. Win10 telemetry provides an intimate snapshot of a Win10 instance, and anyone can use that telemetry to fingerprint your hardware. Even encrypted telemetry is a highly individual number that doesn't need to be decrypted to indicate an individual machine across the internet.

> The title of this forum, "hackernews", is concerning; there seem to be no hackers here at all. All of the "nontrivial" actions here are nontrivial to those that know very little about CS in general, or have an extremely antiquated perspective on the nature and extent of system penetration as it stands today. A "normie" is not likely to know anything about lockpicking, but even an apprentice locksmith finds it extremely trivial; it's just a matter of perfecting dexterity over a couple of months to be quick and slick about it.

> Facebook is one of the biggest threats to national security we have in our back yard, and the lackadaisical attitudes displayed here regarding security only set that threat in stone and in perpetuity.

I see meetuu, zucksablackhat, and ohWARisme all posting in succession in the same places - are you all the same user? edit: Threw in parent's post in case of deletion.


We all send out packets with our MAC address, which is supposed to be unique, so it seems like it would be easy to correlate the two profiles, since the MAC address is the same.

Makes me think there’s not going to be an effective technological means of resisting tracking.


The MAC address doesn't leave the local network, an internet service can't see it. (Sometimes IPv6 addresses are derived from it, but any modern OS should use privacy extensions, which create randomized addresses)
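For the IPv6 aside, here is a short sketch of the Modified EUI-64 derivation (described in RFC 4291, Appendix A) that older stateless autoconfiguration used: flip the universal/local bit of the first MAC octet and insert `ff:fe` in the middle. The function name `eui64_address`, the sample MAC, and the documentation prefix `2001:db8::/64` are illustrative choices.

```python
import ipaddress


def eui64_address(prefix, mac):
    """Build the Modified EUI-64 IPv6 address for a MAC within a /64 prefix."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC such as 00:11:22:33:44:55")
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("EUI-64 interface IDs need a /64 prefix")
    octets[0] ^= 0x02  # flip the universal/local bit
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    # Indexing a network by an integer offsets from its base address.
    return network[int.from_bytes(interface_id, "big")]


print(eui64_address("2001:db8::/64", "00:11:22:33:44:55"))
# prints 2001:db8::211:22ff:fe33:4455
```

Because the lower 64 bits are a reversible function of the MAC, such an address would expose it to remote services, which is the case privacy extensions avoid by using randomized interface IDs instead.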


Your local router strips the Ethernet headers (with the MAC address) before forwarding the packet out to another router.



