I mean, at this point I think everyone should just accept that at the very least their name, age, address(es), email(s), phone number(s) and screen name(s) have been fully leaked if you have ever had any kind of online presence. Not saying that's right or good, but at this point it's just a fact.
So if that's the case, I think we should move beyond really even trying to think of this info as private or a marker of identity, and we need to move everyone to more secure forms of identity verification.
As has been pointed out on HN before, "identity theft" is a made-up concept to make it seem as if you had something stolen from you, when the real problem is banks and other service providers do an absolute shit job of identity verification. They're the ones at fault, and they try to shift the onus onto you to fix things when they screw up.
Indeed, a social security number is pretty much the only additional piece of data to the stuff above that one would need to open up a bank account in someone else's name, and those have been leaked plenty of times too.
The government needs to make harsher penalties for banks and others that can ruin your credit, etc. because they accept all this leaked info as "proof" of identity.
The problem is completely the opposite. The flaw is the existence of social security numbers. Prohibit them from being used for anything but social security.
Then there is no "your social security number" or "your identity" for someone to open a bank account against. You open a bank account and they give you a bank card and you set an address and phone number. The day you open it, they shouldn't need to know who you are, because it's a new account with $0 in it. After that, the owner of the account is the person with that bank card who can be reached at the address and phone number given when the account was created.
Get rid of centralized identity and there is no centralized identity to steal.
None of those rules are worth the cost. Identity theft, created by the concept of centralized identity, costs billions of dollars. Investigations of those other crimes are still possible without government-mandated privacy invasions by banks, and the privacy invasions are a huge cost in themselves.
Why not FIX how an SSN can be used? It was not created for this purpose but that doesn't mean how we use it can't be fixed.
Any time someone attempts to use your SSN to identify themselves as you, you should be notified and your authorization should be required for that use to be allowed.
And the higher the value of the authorization, the more care should be required.
Companies are able to do this already with 2 factor authentication. And I think we SORT of have that for change of address, as the Post Office both requires ID and notifies you by mail. Maybe that's really just one factor.
Allow people to require as many additional factors as they want. Confirming my identity in 6 ways when I decide to sell my house sounds good. 4 ways when I buy a car. One or none is fine when I make a $10 purchase. Let me decide.
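A minimal sketch of what "let me decide" could look like, assuming a per-user table that maps transaction value to the number of confirmations required. The thresholds, names and factor labels here are all made up for illustration; nothing like this is specified anywhere in the thread.

```python
from bisect import bisect_right

# Hypothetical, user-chosen policy: (minimum amount in dollars, factors required).
# The numbers are illustrative only, loosely following the examples above.
POLICY = [
    (0, 0),          # small purchases: no extra confirmation
    (100, 1),
    (5_000, 4),      # e.g. buying a car
    (100_000, 6),    # e.g. selling a house
]

def required_factors(amount: float) -> int:
    """Return how many independent confirmations this amount needs."""
    thresholds = [minimum for minimum, _ in POLICY]
    index = bisect_right(thresholds, amount) - 1
    return POLICY[max(index, 0)][1]

def is_authorized(amount: float, confirmed_factors: set) -> bool:
    """A transaction goes through only with enough distinct confirmations."""
    return len(confirmed_factors) >= required_factors(amount)
```

With this table, a $10 purchase needs nothing, while `is_authorized(20_000, {"sms", "notary"})` is refused because a car-sized amount asks for four confirmations.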
And let them CHOOSE what organization manages authentication. A private company might do the job a lot more effectively than the Post Office.
Add human methods too. A call from a sibling or child might be one people could set up. Require validation by a notary public.
> Any time someone attempts to use your SSN to identify themselves as you, you should be notified and your authorization should be required for that use to be allowed.
So now the government needs a way of contacting you. Suppose they have your address and phone number on file.
Then you lose your way a while and become homeless for two years. You can't afford a phone and no longer have the same address, and have lost your ID or it expired. You finally start to turn it around and go to open a bank account. The government contacts you how? How do they know it's you?
The answer is that it's a new account and you're not trying to prove anything about whether you're the same person who lived at the old address, so you shouldn't have to.
And once you have a bank account or a mortgage or such, you and the bank can arrange for any form(s) of authentication you like. It shouldn't have anything to do with the government, and it definitely shouldn't have anything to do with how you authenticate yourself to your job or your wireless carrier.
If a new account can never be used to establish your identity for other purposes, there's no problem opening such an account without the government identifying you.
Your phone gets stolen so you use your bank card and password to set a new phone number on your account. Your bank card gets stolen so you use your phone and password to get a new bank card.
If someone steals your phone and password and bank card and ability to receive mail at your home address all at once then you're pretty screwed, but you're pretty screwed then regardless, right? That's the level of screwed where somebody else can also get a government ID in your name.
For sufficient amounts or suspicious operations, banks should be in charge of authentication, i.e. showing up in person for a bank transfer that buys a car, showing up and supplying fingerprints if the bank transfer buys a house...
And in fact, for poor people, anything that might engage their entire savings might also benefit from a more advanced identity verification.
For example my stock exchange account just takes a password. It’s a scandal.
I use a slightly different address with every business I give my address to.
For example
123 Main St. Apt 45-WXYZ
Where "WXYZ" is different for each business, to pollute any data mining algorithms trying to collate my address across sites. In some cases when they use address resolving functions, it also helps to spell out your address e.g.
The scary thing is how much one's phone number (a somewhat ephemeral thing) is actually bound to one's IDENTITY.
Considering your phone number is more and more being used in 2FA ... if you were to ever change your number and someone else got it, this would pose a serious security risk if you failed to change over ALL of your internet accounts' 2FA to the new number.
I've always thought the most scary thing about this practice is that your (unique) phone number is a powerful "foreign key" which could potentially join data from many other leaked databases, forming an even larger dataset on you.
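To make the "foreign key" point concrete, here is a small sketch of joining two unrelated dumps on the phone number. The records, field names and the `join_on_phone` helper are entirely made up for illustration.

```python
from collections import defaultdict

# Entirely made-up records standing in for two unrelated leaks.
leak_a = [{"phone": "+15555550100", "name": "Jane Doe"}]
leak_b = [{"phone": "+15555550100", "email": "jane@example.com", "city": "Springfield"}]

def join_on_phone(*leaks):
    """Merge records from any number of leaks, keyed on the phone number."""
    profiles = defaultdict(dict)
    for leak in leaks:
        for record in leak:
            profiles[record["phone"]].update(record)
    return dict(profiles)

profiles = join_on_phone(leak_a, leak_b)
```

Neither dump alone links the name to the email address, but `profiles["+15555550100"]` contains both, and every further leak that includes the number adds more fields to the same profile.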
There are plenty of other places we give our phone numbers to, which might not have anywhere near the protections that Facebook say they provide.
Absolutely, and e-mail or Paypal account name too. Neither of them is trivial to change. If you try to create a new account for each thing at a generic mail provider such as Gmail, your accounts will be shut down by automatic abuse filters. If you roll your own domain, then, well... the domain becomes the foreign key.
The solution to this is unlimited true email aliases, as e.g. StartMail [1] and Fastmail [2] provide. I wish this were more commonplace among email providers. Besides the up-front cost of developing / setting up the solution, email aliases have the marginal cost of one small database row per alias. And it would be such a boon for privacy.
Would using separate per-service email accounts help mitigate issues? seng-banking@gmail.com, then seng-banking+icici@gmail.com, seng-banking+axis@gmail.com, etc.? That way my primary email would stay private and would be used only for email, not for identity.
Your private email that you don't use for signing up anywhere is irrelevant except for phishing and spam. Your secondary email address will become the foreign key that is used to correlate the datasets from everywhere you signed up with it. The +tags can just be removed since it is known how they work. Might give you a small protection against attackers who don't know about email address tags.
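How little effort removing the +tags takes can be shown in a few lines. This is a sketch that assumes the provider uses the common "everything after the first + in the local part is a tag" convention; `strip_plus_tag` is a name made up for this example.

```python
def strip_plus_tag(address: str) -> str:
    """Collapse user+tag@domain to user@domain (assumes the usual '+' convention)."""
    local, _, domain = address.rpartition("@")
    base = local.split("+", 1)[0]
    return f"{base}@{domain.lower()}"

tagged = ["seng-banking+icici@gmail.com", "seng-banking+axis@gmail.com"]
untagged = {strip_plus_tag(address) for address in tagged}
```

Both tagged addresses collapse to the single value `seng-banking@gmail.com`, which is then usable as the join key across datasets.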
No, in the US you can easily open one online just by answering a few questions and giving your SSN. Sometimes they will ask you to upload an ID. But that could also have been leaked.
It turns out that signing up for a new bank account is something that people commonly do at the same time as they're moving to a new place and change their address and phone number.
"As has been pointed out on HN before, "identity theft" is a made-up concept to make it seem as if you had something stolen from you, when the real problem is banks and other service providers do an absolute shit job of identity verification. They're the ones at fault, and they try to shift the onus onto you to fix things when they screw up."
Well, I think at this point we can conclude that companies will not be good shepherds of our should-be-private data if we don't incentivize them to. Who'd have guessed?
I appreciate the pragmatic stance you take, and we should definitely move to more robust identification methods.
But making this case for better ID tech should IMHO not be confused with giving facebook and the others a pass. This should never have happened, and the very fact that we have a data trove this big is already a problem. That they don't seem to even attempt to protect that data is another one.
Close that shop up. The fines for this kind of stuff (and the other stunts they've pulled) should make it economically no longer viable to keep it open, really. The first time you fuck up this hard, the fines should hurt. This is not their first time.
Companies effectively run the government, via the two-party system and the back and forth of high-level corporate execs into gov't. They don't want to rock the boat once that bountiful position is achieved. That said, how do you get the government to regulate corporations who regulate the government? Maybe I'm becoming too cynical, but there has to be an externality like the Great Depression in order for The People to take back the reins of power (at least in the US).
Won't happen. All the politicians the world over have been trained like Pavlov's dogs to chase Likes as a route to power. So it's like asking addicts to not just give up their drugs but to take down the cartels.
The time to shut down Facebook has passed. Now we just have to suck it up and endure them and their effects like we do Cancer.
> Indeed, a social security number is pretty much the only additional piece of data to the stuff above that one would need to open up a bank account in someone else's name
I am not from the US, but is this really all you need to "prove" your identity?
The most common thing I have seen in the EU when companies have a KYC requirement is that during the sign-up process you will have a quick video call where you have to show your ID card while they verify that your ID is legit.
"I mean, at this point I think everyone should just accept that at the very least their name, age, address(es), email(s), phone number(s) and screen name(s) have"
Yes. I think the scary part is WHAT these identifiers are connected to. E.g. your email being linked to a private antifa community, pro-Taiwan communities, adultery meetups, etc.
s/if you have ever had any kind of online presence./if you, your friends, your family, your cleaning lady etc. has ever had any kind of online presence.
> So if that's the case, I think we should move beyond really even trying to think of this info as private or a marker of identity, and we need to move everyone to more secure forms of identity verification.
Even if this info wouldn't be good enough to sign for official stuff it's still private and unique enough to target you though.
We should start thinking of these breaches in terms of their accumulated impact. It's not the 1990s anymore, where data is difficult to store and networking too slow to move it.
We should assume the leaked data doesn't go away; that instead people out there are consolidating Equifax data with Vastaamo data, adding data from Exchange hacks and the Accellion hack, to cross-reference with data from Facebook... it's like water flooding a levee now, instead of evaporating.
Honestly sounds like a fun job for future historians. By aggregating all the leaks over a long period, how much of a person can you reconstruct?
For example even though I am using a throwaway account, HN's logs might one day get compromised. So now they can join the IP address to other compromised sites that I was logged into using my usual email. And from my email they already have my name, SSN, address, phone number, usernames, passwords, etc, exposed from prior breaches. But now they know about my shitposts too.
That's exactly my point. I think I am safe on HN because I'm using a random user name with no email attached. But their logs definitely have my ip address and that ip address will be common across other compromised logs on other sites, some of which I might be logged into with a real email (this is true regardless of incognito mode since it's the same computer).
The root of the problem is not the privacy policy or the system security. The root of the problem is the collection itself. All large businesses, health care providers, and governments maintain databases. Every one of them will eventually be leaked. All it takes is a corruptible trusted insider.
I don't trust in the government, but I think digital "personal data" should be only available for "confirmation" to companies that need it. Say, a government entity could have an API that allows you to send hashed personal data that they can verify is right. This way companies will ask the user for their data and hash it client-side. Then they can send the hashes (hashed with a custom provided salt) to the entity (government, maybe private), who will basically reply with a True or False on the verification of the different data.
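A rough sketch of that confirmation-only flow, assuming HMAC-SHA256 as the salted hash and a made-up `Registry` class standing in for the verifying entity. None of these names correspond to a real government API.

```python
import hashlib
import hmac

def salted_hash(value: str, salt: bytes) -> str:
    """Hash one field with the salt provided for this verification request."""
    return hmac.new(salt, value.encode("utf-8"), hashlib.sha256).hexdigest()

class Registry:
    """Hypothetical verifying entity that holds the authoritative records."""

    def __init__(self, records):
        self._records = records  # person_id -> {field name: value}

    def verify(self, person_id, salt, claimed_hashes):
        """Answer True/False per field without ever returning the stored data."""
        record = self._records.get(person_id, {})
        return {
            field: field in record
            and hmac.compare_digest(salted_hash(record[field], salt), claimed)
            for field, claimed in claimed_hashes.items()
        }

# The company hashes what the user typed and sends only the hashes.
registry = Registry({"p1": {"name": "jane doe", "phone": "5555550100"}})
salt = b"per-request-salt"
claims = {
    "name": salted_hash("jane doe", salt),
    "phone": salted_hash("5555550199", salt),  # user mistyped the number
}
answer = registry.verify("p1", salt, claims)
```

Here `answer` is `{"name": True, "phone": False}`. One caveat on the design: fields with few possible values (a birth year, a phone number) can still be guessed by hashing every candidate, so the salt alone does not make them secret.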
It may even be an interesting use case for a public blockchain, where your personal data is stored in a Merkle tree type of data structure, so that one can verify that certain personal data of a person is true without disclosing the rest of the data.
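For readers who haven't seen the structure, this is a small sketch of a Merkle tree with an inclusion proof, using SHA-256 and duplicating the last node on odd-sized levels (one common convention, assumed here). The leaf format and function names are made up for this example and are not taken from any particular blockchain.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return all tree levels, from hashed leaves up to the single root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                 # odd count: duplicate the last node
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def prove(levels, index):
    """Collect the sibling hashes needed to recompute the root from one leaf."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof

def verify(leaf, proof, root):
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

# Illustrative leaves; real ones would need a secret salt so they can't be guessed.
leaves = [b"name=Jane Doe", b"birth_year=1980", b"phone=5555550100"]
levels = build_levels(leaves)
root = levels[-1][0]
proof = prove(levels, 1)
```

Only `root` would be published. Someone shown `b"birth_year=1980"` together with `proof` can check it against the root without learning the name or the phone number.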
It's probably not implemented as closely as what you described but check out Europe, this small continent across the pond and the tech scene in the smaller countries.
For example, Estonia famously has had online identity services linked via a federal ID (in Europe there are more republics than federations, so it's easier to manage country-wise) [0] [1]
Or, more familiar to me and with a bigger sample, there is a movement in Poland which is gaining popularity: mojeID (myID), a Single Sign On system with major banks as providers (they really religiously check identities when you open a bank account), or the state-backed login.gov. The mojeID system allows other entities to use your actual identity as an authentication factor without having to keep that much data and pose risks; for example, an online alcohol shop can verify age. [2] [3]
Wouldn't this fall down as soon as someone enters your details with the wrong casing or uses a +country code rather than a localised phone number and so on?
Humans will err and enter data incorrectly, so the hash could potentially be different every time despite being "correct" to a human at a glance?
You could standardise everything (lowercase etc) but I imagine there are country and regional edge cases such as capitalisation having meaning in a given language (I can imagine it being a thing but I don't know it for sure)
There could be instructions in place to specify only all lowercase or just use .toLowerCase() when sending. Also there could be a specified format for phone numbers or a function that turns all input into the desired format by stripping all special characters. Possibly only hashing the last 9 digits of the phone number for non-mission-critical applications instead of the full 10 digits.
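A sketch of that kind of normalisation step, run before anything is hashed. The function names, the whitespace handling and the use of `casefold()` are choices made for this example, not part of any agreed format.

```python
import hashlib
import re

def normalize_name(name: str) -> str:
    """Case-fold and collapse whitespace so 'Jane  DOE' and 'jane doe' match."""
    return " ".join(name.casefold().split())

def normalize_phone(phone: str, keep_digits: int = 9) -> str:
    """Strip everything but digits, then keep only the last `keep_digits`."""
    digits = re.sub(r"\D", "", phone)
    return digits[-keep_digits:]

def hash_field(value: str, salt: bytes) -> str:
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()

salt = b"example-salt"
a = hash_field(normalize_phone("+1 (212) 555-0100"), salt)
b = hash_field(normalize_phone("212-555-0100"), salt)
```

Both spellings reduce to the same nine digits, so `a == b` even though one was entered with a country code and punctuation. The trade-off raised in the parent comment still applies: folding case throws information away, which may matter in languages where capitalisation carries meaning.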
This sounds like a good use case for brief documentation.
Equifax has more at stake than most. And they've been hacked. Repeatedly. The government has been hacked. Yahoo was COMPLETELY owned. I mean, if someone would put together a list, it would make for shocking reading. It's become so common, that we go, "Oh no! Anyway."
Yeah, my health insurance company and the company that eventually bought my mortgage have both sent me letters. At this point I just dump them in a folder in the filing cabinet with the rest.
Google has a huge number of activist (and surely some corruptible) employees, and yet the incidents of users data getting out are very close to zero.
I think this demonstrates that user data can be managed safely and effectively.
Usually the incident reports on user data leaks show that the company seemed to barely be trying. We need laws that force them (even small companies) to put serious effort into it.
You don't know that. While the publicly available data leaks are indeed rare, you cannot know if they don't use the data for trading or other purposes for their personal gain without disclosing it to the public.
People do have access to privileged information and that will influence their decisions on both conscious and unconscious levels. It's not possible to detach yourself from work completely and assess on an objective level whether your thinking is influenced by something you saw or not.
It will also be difficult to prove. For example if an employee's hobby is trading, how do you prove whether the trades they made are based on their own independent research or on what they saw?
If they saw something that could make them money, they could easily create a trail of evidence that they researched the matter on their own - it will be difficult to prove that it originated from looking up the privileged information and unless someone is going to be making millions, it's frankly not economical to commit resources to.
It is also in the interest of the company that such incidents don't see the light of day.
I understand your point, but everything you're saying is hypothetical. You're not going to persuade anyone of something happening if you can't come up with any evidence of it happening.
There are infinite things we can't know - opening the discussion up to that really makes anything possible, but the discussion wasn't even about what they might do with the data beyond leaking or selling it.
> Google has a huge number of activist (and surely some corruptible) employees, and yet the incidents of users data getting out are very close to zero
Am I reading this wrong, or are you saying that activists would be more likely to leak data? Then I would wonder what kind of activists you have in mind.
Agreed that yes indeed it seems possible to build a security-serious company, and that Google is (seems to be) a good example. (Now, there are other things I don't like about Google but I guess that's off topic.)
Surely they would. We already learned that members of their own security team don't seem to see any problems with employees abusing privileged access to mandatory Chrome extensions to agitate for unionisation (at Google of all places!!). Twitter employees screwed with the account of the president of the United States.
Ideological employees of big tech firms taking a sudden disliking to someone or some group and abusing privileged access is certainly a threat that ever larger numbers of people are taking seriously. In particular, it is a concern for industries that do things activists don't like, such as working with immigration control (though perhaps that's no longer an issue now Trump is gone).
Kathryn Spiers, who worked as a security engineer, updated an internal Chrome browser extension so that each time Google employees visited the website of IRI Consultants — the Troy, Michigan, firm that Google hired this year amid a groundswell of labor activism at the company — they would see a pop-up message that read: “Googlers have the right to participate in protected concerted activities.”
Note that she wasn't able to do that unilaterally. Some other member of the team approved her CL and others defended her in public.
I have a vague feeling there was another case like this some years ago where some security engineer modified a Chrome extension for political reasons, but I can't remember the exact details and can no longer find it.
I mean, yes, but.. what's the solution? Never collect data? In at least some of those cases (and arguably all), that data does need to be collected and stored. What is the government going to do, not maintain birth registries, tax registries, land owner registries etc? What is a big business like a bank going to do, not collect customer data like your name and address?
I have a different view: it’s not the collection that’s the problem, it’s the firehose attached to the database. For the applications you mention, make aggregation over all records prohibitively expensive by design.
I still think there might be something here. You can allow certain aggregations (like “sum of the tax column”), but they have to be explicitly permitted; otherwise shuffle and hash everything enough times to make a single lookup sort of cheap, while a scan very expensive (plus distribute over enough physical servers and make the network between them low bandwidth to thwart lower level attacks). With enough regulatory or legal pressure on companies to lock down their data, paying this premium might start to look attractive; one could even found a startup peddling the World’s Slowest Database™!
Edit: what I was thinking originally was that in the world of paper-only archives, these massive leaks were all but impossible, yet business could still be done. It should be possible to combine this slowness with the convenience of computers.
I mean, that slowness is the reason we moved to computers.
That said this is certainly interesting, I wonder if there has already been an exploration of this topic. Could definitely make an interesting startup idea :)
Exactly - either the data is basically not valuable at all (the category for which PII rarely fits) or else when the company collapses or is bought, the data moves too.
There's always an incentive to steal or leak it to other companies for money; so as long as the incentives are aligned with GATHER ALL DATA and KEEP IT FOREVER then yes, it will just be a matter of time before each data store is compromised by mistake or purposefully.
I doubt the claim, but the sentiment I think is valid. If you think about what data these entities are holding, it's not unique to a single database or entity. Your name/address/phone/SSN/etc. is likely stored in so many places that the probability it gets leaked from at least one eventually is, I'd say, very nearly, if not exactly, 100%.
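The "very nearly 100%" intuition is just the complement rule. A back-of-the-envelope sketch, assuming every holder of your data leaks independently with the same probability; both numbers below are invented for illustration.

```python
def p_any_leak(p_single: float, holders: int) -> float:
    """Chance that at least one of `holders` independent databases leaks."""
    return 1 - (1 - p_single) ** holders

# Made-up inputs: 50 organisations, each with a 5% chance of a breach.
estimate = p_any_leak(0.05, 50)
```

Even with a modest 5% per holder, fifty holders put the combined chance above 90%, and it only climbs as more places store the same fields.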
Looks like this is the "To match users to their friends by phone number, you need an API which can take as input a phone number, and return information about if that number has an associated account" problem.
There is no way to let a user find their friends on a service without such an API. Yet if you have such an API, someone can simply brute force all phone numbers worldwide (there are only 10^10), and now they have a database of all users...
Rate limits can help defend, but considering many users might have 1000 phone numbers in their address book, you can't set the rate limit very low without impacting user experience. Attackers can reduce the search space dramatically by only checking phone numbers that resolve to an active line (using VoIP stuff to test a number).
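The arithmetic behind that, as a quick sketch. The search space (10^10) and the 1000-number address book come from the comment above; the request rate and the number of attacker accounts are assumptions picked only to show the scale.

```python
def enumeration_days(search_space: int, numbers_per_request: int,
                     requests_per_second: float) -> float:
    """Days needed to submit every number once at a sustained request rate."""
    requests = -(-search_space // numbers_per_request)  # ceiling division
    return requests / requests_per_second / 86_400

# One account uploading a 1000-entry "address book" once per second (assumed rate).
one_account = enumeration_days(10**10, 1000, 1)
# The same job spread over 100 accounts (assumed).
hundred_accounts = enumeration_days(10**10, 1000, 100)
```

That is 10^7 requests in total, which works out to roughly 116 days for a single account and a little over a day for a hundred, before any pruning of numbers that don't resolve to an active line.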
The only real solution is for your app not to have a "Here is a list of your friends already in the app" screen... But as you can imagine that means you won't get any user growth or VC funding...
I think there are way more than 10^10 phone numbers in the world. I think there are 10^10 combinations in the USA alone (filtering by unused area code, etc will decrease that number, but even then https://www.ck12.org/c/probability/permutation/rwa/Wrong-Num... says almost 8×10⁹ remain)
Also, at least some countries have longer phone numbers (Germany, the UK and China have 11-digit ones, for example), and the international public telecommunication numbering plan says plan-conforming numbers are limited to a maximum of 15 digits, excluding the international call prefix (https://en.wikipedia.org/wiki/E.164), so the search space, potentially, is a lot larger.
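As a rough illustration of the 15-digit limit, here is a format check assuming the usual written form of a leading + followed by digits. The regex and the `is_e164` name are conventions chosen for this sketch; E.164 itself is a numbering plan, not a regex, and the two-digit minimum below is an assumption.

```python
import re

# '+', a non-zero first digit, then up to 14 more digits: 15 digits at most.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")

def is_e164(number: str) -> bool:
    return E164_PATTERN.fullmatch(number) is not None

# Loose upper bound on the plan-conforming search space (15 decimal digits).
MAX_SEARCH_SPACE = 10**15
```

A brute-force attacker would in practice prune this heavily by country code and allocated ranges, but the bound is still five orders of magnitude above 10^10.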
> I think there are 10^10 combinations in the USA alone
Canada and USA share the same numbering plan and that's sometimes overlooked.
I always "press 1" to interact with the 'your social has been criminally suspended blah blah' or 'card-services' robo-calls.
When I have time I share circular stories about my grand-kids that don't visit me anymore, but when I don't, I just respond in French with something like "Je pense vous etes une pamplemousse."
Most recently, one responded with "NO HABLOS ESPANOL". lordy lordy...
This is the same fallacy that leads to apps asking for permission to access your whole picture library.
Facebook could have an API by which an app can prompt its user to show a list of all of that user’s friends who have the app installed. The app would only learn the identities of people whom the user explicitly selects, and phone numbers would not be part of that identity.
It works for photos because the threat model is about protecting local files against malicious apps.
But for phone numbers, you are protecting Facebook's API (which is publicly available via the internet) against arbitrary devices, which Facebook has no way to tell apart from legitimate ones.
What I mean is: Facebook should remove that API entirely. Apps do not need a way to look up a phone number in Facebook’s database. The “find my friends using this app” feature does not require this capability.
I think it should be illegal for apps to help find friends. If you genuinely meet someone offline, then they could generate you a token that then you could enter on the site to "connect".
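The token idea could look something like this: a single-use, random token that one person hands to the other offline. This is a sketch with made-up class and method names, keeping state in memory only.

```python
import secrets

class ConnectTokens:
    """Hypothetical single-use tokens for connecting two accounts."""

    def __init__(self):
        self._pending = {}  # token -> id of the user who generated it

    def issue(self, user_id: str) -> str:
        token = secrets.token_urlsafe(16)  # unguessable, unlike a phone number
        self._pending[token] = user_id
        return token

    def redeem(self, token: str, redeemer_id: str):
        """Return the connected pair, or None if the token is unknown or used."""
        issuer = self._pending.pop(token, None)
        if issuer is None or issuer == redeemer_id:
            return None
        return (issuer, redeemer_id)

tokens = ConnectTokens()
token = tokens.issue("alice")
first = tokens.redeem(token, "bob")
second = tokens.redeem(token, "carol")
```

The first redemption connects alice and bob; the second returns `None` because the token is gone. Since tokens are random and short-lived rather than drawn from a small, guessable space, there is nothing useful to enumerate.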
Telegram had this issue too and they made a setting "who can find me by my number" you set it to "my contacts" so only mutual contacts can find each other.
I think you misunderstood that. This setting isn't about the privacy of your friends' numbers; it's about who can find you with your number, for example by brute-forcing numbers.
Whether you upload your address book or not is another story, and it does ask if you want to do that. Obviously, if you decline and then set the above setting to "my contacts", no one will be able to find you by your number (which is exactly what I personally want).
There is absolutely no need to upload your address book to make your number unsearchable.
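A toy model of such a setting, to show why it defeats brute-forcing. This is only a guess at the behaviour described above, with invented names; it is not how Telegram actually implements it.

```python
from dataclasses import dataclass, field

@dataclass
class User:
    number: str
    contacts: set = field(default_factory=set)  # numbers this user has saved
    findable_by: str = "everybody"               # or "my_contacts"

def lookup(directory: dict, searcher: User, number: str):
    """Resolve a number to a user only if the target's setting allows it."""
    target = directory.get(number)
    if target is None:
        return None
    if target.findable_by == "my_contacts" and searcher.number not in target.contacts:
        return None
    return target

alice = User("+15555550100", contacts={"+15555550101"}, findable_by="my_contacts")
bob = User("+15555550101")
mallory = User("+15555550199")
directory = {user.number: user for user in (alice, bob, mallory)}
```

Bob is in Alice's contacts, so `lookup(directory, bob, alice.number)` finds her, while Mallory cycling through numbers gets `None` for Alice exactly as for an unregistered number.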
Obviously it is bad if your personal data is compromised after you (or someone else) upload it to an online service like Facebook.
But in this case, it’s important to remember that phone companies used to regularly leak most of their customers’ phone numbers (and names) in the form of a telephone directory. So a question to consider is: suppose that the white pages were still commonly produced and contained most people’s numbers. How would you then feel about something like this?
Personally I feel like the problem with phone numbers being leaked is mostly the epidemic of spam calls (especially in the US) rather than some particular breach of privacy.
Aside: I think it is good to consider these counterfactuals in general for questions about information privacy, for example how would you feel if everyone’s tax returns were published publicly like they are in Sweden?
The "new" risk with phone numbers is the overreliance on them for login and 2FA and the relative ease of taking one over. I use security keys but still have accounts I can't remove the phone 2FA from despite having two keys tied in.
If you can avoid it, simply don’t give anyone your phone number. Then you can use security keys, hardware keys and recovery codes. Obviously some places require a phone number, but for those that don’t, avoid giving them one.
Spam calls are likely not even affected by leaked numbers. Source of suspicion: My partner and I have phone numbers in close numeric vicinity, and deliberately use one for public purposes and the other one is not known outside of a very close circle of family.
We still get spam on both numbers within short time frames - so I'd say it's likely spammers just auto-dial through.
That's been going on for many years. Brute force calling costs nothing. I've always wondered if charging 5 cents per call would stop them cold, but I am sure no one wants to implement that now.
As far as I can remember, the white pages don't include "biographical information". The kind of details used for idiotic "security questions" on websites too lazy to implement 2FA (your mom's maiden name, your first school, the name of your first pet, etc).
As for public tax returns in Scandinavia, first of all they have guardrails (searches are recorded with your information when you look someone up), and second, countries have different cultures and histories for a reason.
Finnish income for those earning 100k and over is bulk downloadable as a csv/json with full names, birth years and provinces, along with taxable salary, capital gains, etc.
You can't compare that at all! They leaked IDs and from that you can go to user profile and learn more about them. You cannot do that from a phone company leak.
Before the Internet, all you could do with a person's phone number is call them. But now, with PeopleSearchNow.com (the ultimate free source of personal info--feel free to look up your own real phone number in it, assuming you're American) and with Google and the MULTIPLE social networks that virtually everyone between the ages of 20 and 45 has been on now--typically using the same username on most or all of them--you can find out way more in a matter of a few MINUTES than a 1980s private investigator ever could in a span of a month. I can look up your number on PeopleSearchNow and then, if I can't find YOU on Facebook, I'm confident I can find one of your relatives (several of whom will be listed on PeopleSearchNow), and get to you from there--meaning find your other online profiles, thanks to those Facebook friends who know you.
Phone companies didn’t leak phone numbers in the conventional sense of the word. I used it to try to draw a comparison. Phone numbers used to be printed in big books and you could usually look someone’s phone number up if you knew their name and rough location. That is, phone numbers were not considered to be particularly private information at all.
I think the comments I most agree with talk about the different security threats people face today with current usage of phones.
The issue of having someone's phone number is completely different these days; in 1985, I couldn't find out your age and your previous addresses and then find dozens or even hundreds of photographs of you alone and/or with your relatives and friends, as I can now - see my other comment:
This is insane. Phone companies published numbers because it was generally considered helpful and the costs of unsolicited calling were relatively high. By the 70s delisting was an option, and by the late 90s it was very common (in the US). The internet made this a no-brainer, and to suggest that it’s somehow ok just because it used to be (in a totally different world) is beyond ridiculous.
We don’t have the option here — people provide their number to a service to be able to use it, and the numbers are then compromised, in breach of that contract and because of the service’s failures.
The two are not remotely alike, what the fuck are you even talking about.
Please make your substantive points without name-calling and swipes. Those are against the site guidelines and we ban accounts that post that way—it's because we're trying for a different fate than internet-default here, or at least to stave it off a while longer.
I work in the security field and let me tell you something I realized: nobody cares about security. If someone cares about security, it's because they've had many many incidents in the past. We humans are not a species that is good at preventing, we are good at reacting.
The security handbook[^1] has a chapter on that actually, and they basically say that role playing is the only way of not getting burned. Humans are excellent at role playing, and it can help you prevent a lot of catastrophes without having experienced them before.
I think part of the problem is that many orgs see security as an overhead that engineers do to sleep well at night. A few more breaches, a few more fines and it will finally be seen as a feature to keep the CEO out of jail.
This is just it. I also work in the security industry, and the fact of the matter is that we (security professionals) can't give guarantees. I don't know what exotic exploit or bug will exist tomorrow. Security professionals basically offer what (to me) seems like a crappy insurance policy. Depending on your org's threat model, it is often just cheaper to deal with the breaches. (I am not saying Facebook falls into this category.)
Security is not a replacement for data breach insurance. Security is basic hygiene for insurance to work at all.
To me, a good parallel is home insurance. If you get robbed, a good insurance will cover your losses. However, if said insurance determines that you were negligent -- say you never lock your front door -- you are on your own.
Do you have precious art at home that you want insured? No problem. Just make sure you add an alarm and sprinklers.
That is how I want discussions around security to be held. Are you a start-up with 10 users? It's okay to do minimal security. Are you a bank whose wires carry $1B? Make sure you throw sufficient "bodies" at the problem, from the top of the hierarchy to bottom.
"You keep using that word. I do not think it means what you think it means."
I "deleted my account" in the run-up to the 2016 election, because I could see how social media was being manipulated, and what it is doing to society. And I mean I _deleted_ it. I took the extra steps of researching how to REALLY delete it; not just suspend it.
A couple years later, I needed to get a new account to help admin a page. You can guess where this is going. I tried my usual email -- since it should be free, and it was deleted, right? And... what do you know? Everything was still there. All my posts. All my connections. Everything.
I think this is because it is very expensive to delete data, and also to develop a true delete process. I don't think that is acceptable; cost shouldn't be an excuse for keeping content forever.
Even if you manage to get deleted from the live servers, chances are your content is still going to live in backups and will never be deleted. Some backup systems are write-once and could only be deleted by physically destroying the medium.
I wish this was properly addressed in legislation.
That's kinda sad, because that is what's going to happen, and then we'll hear nothing more.
At this point I'm not really sure what it will take for companies like Facebook to understand that you need to not fuck around with people's private data.
Put a monetary cost of holding user data, and a steep monetary cost on losing user data.
E.g., pay x amount per month in perpetuity for each piece of information about a user you keep, and pay the "net present value" of those payments if you lose the data.
Having to pay for hoarding users' personal data changes the incentive from gobbling up as much as possible to paying only for user data that is worth the cost to your business.
And as an extra incentive to not hold unneeded user data, know the costs you'd pay if it was breached.
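To put rough numbers on the "net present value" idea: a payment of x per month forever, discounted at a monthly rate r, is worth x / r today. A minimal sketch follows; the fee, the discount rate, and the simple annual/12 monthly rate are all hypothetical assumptions, not part of the proposal above.

```python
def perpetuity_npv(monthly_fee: float, annual_discount_rate: float) -> float:
    """Present value of paying `monthly_fee` every month, forever.

    Uses the standard perpetuity formula PV = payment / rate, with the
    monthly rate approximated as annual_discount_rate / 12 (an assumption).
    """
    if annual_discount_rate <= 0:
        raise ValueError("discount rate must be positive")
    monthly_rate = annual_discount_rate / 12
    return monthly_fee / monthly_rate


def breach_liability(monthly_fee: float, fields_per_user: int,
                     users: int, annual_discount_rate: float) -> float:
    """Lump sum owed if every stored field for every user is lost."""
    per_field = perpetuity_npv(monthly_fee, annual_discount_rate)
    return per_field * fields_per_user * users


if __name__ == "__main__":
    # Hypothetical: 1 cent per field per month, 5 fields, 1M users, 6% a year.
    print(breach_liability(0.01, 5, 1_000_000, 0.06))
```

Under these made-up numbers each stored field carries a breach liability of about 200 times its monthly fee, which is the incentive the comment is after.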
Who would get this money? I agree that it needs to be some solution involving a cost, given that most of these companies have shown multiple times that profit isn’t just their main concern, it’s the only concern.
Think of it like a class action lawsuit on behalf of investors. Instead of entrusting their savings to a company, people are entrusting them with their personal information. If there is gross negligence on part of the company leading to that data being leaked then all of the people whose data was stolen should be able to claim monetary damages. If a legal precedent is established so that these claims can be pursued whenever this happens it should provide enough motivation for these companies to take preventative measures.
The government, typically... which might in turn do something like a tax rebate (write a check to everyone, as Ontario has been doing with the carbon tax) or just stick it into the general pool of taxes (reducing everyone's taxes).
Honestly, the EU needs to finance an organisation to deal with GDPR violations; hell, it could finance itself. The GDPR is the single best piece of legislation ever written in terms of privacy, but enforcement is lacking.
An "exact" google search excluding adjacent phone numbers seems to work well for my numbers, and culls a lot (not all) of the autogen pages. So if your number was 212-555-1239, search Google with these strings:
Just submitted a removal request for myself, a flow full of dark patterns (in fact the Remove button didn't even show up until I disabled my Pi-Hole). Remains to be seen whether all I did was make the data more valuable by confirming my email address. The page recommends signing up at BrandYourself to prevent various other data brokers from showing the same data. How is this not extortion?
Tried it, you're right. Got 6 of my past addresses, 9 past phone numbers, 8 relatives, all correct. Some incorrect info, but not much as a percentage.
If you reverse search the PO Box address listed on the site contact page, you'll find an Amateur Radio license listed to a person that is probably the owner of the site, based on his past experience.
Also, searching for their Adsense publisher id reveals some other sites they own: peoplesearchnow.com, fastbackgroundcheck.com, smartbackgroundchecks.com
Those sites have new and different PO Boxes in other cities, etc.
I am amazed and horrified that data brokers like that are legal, and at the hoops you must jump through to get your information out of them, even with California privacy laws.
This is just one of the websites. Here is a list of all the websites which have your information with easy links to opt-out from them.
I do not maintain this and don't remember where I got it from, but I have notifications set for when the spreadsheet gets updated, so I can remove my information from each new website.
Yikes. That's the only one of those that was even close to being accurate for me. And I'm not sure I can get it removed because they don't have any of the right email addresses. I don't usually leave much of a trail on these sites, but the info that is correct vs incorrect makes me suspect they probably got it from one of my parents.
so if I search my phone number, it brings me to my name and everything. But if I search my name it doesn't get my phone number right. Any ideas why it's like that?
Not really an answer to your question, but one partial solution to the problem of having your number leaked or sold is to set up a service like Twilio to act as a phone proxy. You can have Twilio forward calls it receives on a different number ("spam number") to your actual phone number ("real number"). You give the spam number to anyone who isn't a business or personal contact. Every few months, you rotate the spam number. If your spam number is leaked, you don't care, because it's only a transient number that isn't permanently associated with you.
You can also have more permanent proxy numbers for services or people that may need to get in touch with you long term.
Can you think of any drawbacks of using this for important services like say PayPal? Or are you strictly using this for throw away products and services?
Is this available to people outside of the US as well and is there a guide for setting this up? Last time I used twilio for a basic sms gateway there was a lot of clicking and typing.
I've been using voip.ms in Canada to great success. Even SMS codes from banks and Whatsapp work correctly. Excellent service, highly recommend, especially with voicemail auto-transcription (then sent to email) and SMS from desktop via email.
I've been getting a lot more recently as well and I figured it was due to the phone companies promising to get rid of caller id spoofing this year so scammers are working overtime until they can't anymore.
Oh, is that a real thing that's happening? Caller ID spoofing is the main reason I hold onto my phone number from [small town] Texas, since only my immediate family ever calls me from there, so I somewhat reliably know anything else from that area code is a scammer.
I recall there were a ton of them in France. Usually pretending to be DHL or another courier asking about a package. Nobody I knew interacted much with the calls.
If you're in Europe, but don't share a language with a much poorer country, you're safe from these.
Telemarketing or political campaigns. Check out the Robocall article on wiki. In Europe it depends on the country. In Poland I receive a few calls daily but they are people calling me, not bots. Never received a robocall here.
In the US, the vast majority of them are simple frauds. For a year I got a robocall every few days from a Chinese woman (in Chinese) that a friend of mine said is a threat to get (the hypothetical Chinese immigrant) me deported unless I pay them.
Right now I'm getting a fake credit card debt collection call (I've never had a credit card in my life, only debit), and a call telling me that I'm eligible to have my AT&T (phone) bill halved (I don't have AT&T phone service) and all I have to do is call the number "on my caller ID." I think those two are both being read by the same woman (not the Chinese one.)
I'm more of a texter than a caller, so the vast majority of calls I get are robocall frauds. I'd love to get a robocall that was just annoying for a change, rather than completely predatory.
Same here, I started receiving both calls and SMS, and I find the latter more annoying. I use Android, and these haven't been detected as spam.
Those are usually generated; they call numbers in your area code/exchange randomly, assuming you will pick up something that seems familiar. Joke's on them: I moved to another state, so it's easy for me to tell.
Best thing I ever did for myself was to get a Google Voice number in an area code I've never lived in.
A lot of these fake calls rely on people assuming "local number = neighbor" kind of mentality which rarely comes into play these days as we're much more mobile than we used to be. Plus area code splits, separate area codes for cell vs landline services becoming increasingly common.
If my GV number were: AAA-BBB-CCCC
And my actual home area code is DDD-EEE-FFFF
Any and all numbers from AAA-BBB-xxxx can be ignored.
I never answer calls that come in on my cell carrier provided number, so that eliminates that issue. Silent ring, no forward to voicemail.
Anything left, about 95% of the time, tends to be legit.
With regional calling being a thing of the past and most cell plans being unlimited text and talk, it makes very little sense to keep a local number. Especially now as it's SOP for roboscammers to fake Caller ID and try to match the first 6 of your 10 digits.
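The "match the first 6 of your 10 digits" heuristic can be sketched as a simple prefix check. This is only an illustration; the function names are made up here, and it assumes 10-digit North American numbers with an optional leading 1.

```python
import re


def digits10(number: str) -> str:
    """Strip formatting and an optional leading country code 1."""
    digits = re.sub(r"\D", "", number)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        raise ValueError(f"expected a 10-digit number, got {number!r}")
    return digits


def looks_neighbor_spoofed(caller: str, my_number: str) -> bool:
    """True if the caller shares my area code and exchange (first 6 digits)."""
    return digits10(caller)[:6] == digits10(my_number)[:6]


if __name__ == "__main__":
    mine = "212-555-1239"
    for caller in ("(212) 555-0100", "+1 415 555 0100"):
        print(caller, looks_neighbor_spoofed(caller, mine))
```

This only works as a filter if, as described above, nobody legitimate would call from the same area code and exchange as the number being rung.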
Not natively, but there is an API that apps can use to do it for you. I use Mr. Number because it’s literally the first one I found and it’s worked good enough for me.
I toyed with that for a while but I kept missing important work calls. I might have a look for an app later, but I have a feeling it might not exist...
Yeah. I tend not to pick up calls that are in the "Who would be calling me from Texas?" vein. But while it's annoying to have to look at my phone when it rings, I do get calls from locations that seem plausible and they usually are legit. I'm not really willing to make myself harder to reach for legitimate and even important reasons because of the occasional junk call.
Work uses Slack/Teams/Webex. One person messages me on Signal. No one has ever used telephony, except that I use it to call the dial-in numbers, because my phone audio is better than Bluetooth on a virus-agent-laden laptop displaying ten videos of people's homes through a VPN.
This could be the first large breach we've seen from FB like this. Most past breaches were of a much different and smaller nature (scraping or API access abuse), and seeing a real leak like this could change the landscape for FB quite a bit, since historically companies like Facebook and Google have been very good with preventing them. I don't know a ton about FB's specifics, but there's a chance this data could be 'public' from people with the given privacy settings, if perhaps 25% of users have that turned on. If that is not the case though, then this would be the first serious breach from FB imo.
Either way at this point I operate under the expectation that most information I input into a database may be leaked at some point. This is particularly rough for services that demand and track a lot of things, but it cannot be helped.
Looking at the leak others have pointed to, there are a surprising number of people working in a particular imaginary company:
sqlite> select company, count(*) as c from usa where length(company) > 0 group by company order by c desc limit 10;
company                                   c
----------------------------------------  ----------
Self-Employed                             459119
Facebook                                  181013
Retired                                   71210
The Krusty Krab                           61550
Hollister Co.                             42304
U.S. Army                                 39682
Stay-at-home parent                       33095
Walmart                                   31600
McDonald's                                30792
Student                                   25326
I wouldn't call any of them SpongeBob "fans". We did grow up with SpongeBob though and SpongeBob is a common subject for Internet memes, so that might be it.
SpongeBob was originally for adults, but adults who grew up on Nickelodeon cartoons. Kids found it funny but didn't always get the jokes, because they were adult humor. They had to dumb it down for the kids. Before Netflix, kids watched cable TV and Nickelodeon, and made a lot of memes about SpongeBob and the Krusty Krab. Every one of them wanted to work at the Krusty Krab with SpongeBob and Squidward.
I associate "The Krusty Krab" with fake/secondary/tertiary/spam profiles. People I know personally, and those I was able to confirm as legitimate profiles, don't use that or any other fake information (or just leave it empty most of the time). I only see such fake information in accounts used to advertise their business in someone else's threads.
Culture / group related maybe? I have barely any friends on Facebook with real work details - a lot of them are made up. Especially doctors like to keep their profession (and real names) private.
I definitely know real people (especially highschoolers or college students) who put fictional jobs in their profile. Also common is using some fake name, like that of a fictional character.
Could you please tell me how you converted it to sqlite? I've got a huge 1 GB txt file that crashes my computer every time I try to search for myself in it :( Thank you!
I suggest using a shell program like grep (or even the search feature of less), as shell programs are notoriously good at lazily seeking through files to keep memory use low.
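If grep isn't at hand, the same lazy, line-at-a-time approach is a few lines of Python. A minimal sketch; `find_lines` is a name made up for this example, and it assumes the dump is a text file with one record per line.

```python
from typing import Iterator


def find_lines(path: str, needle: str, encoding: str = "utf-8") -> Iterator[str]:
    """Yield every line containing `needle`, reading one line at a time.

    Only the current line is held in memory, so file size does not matter.
    """
    with open(path, "r", encoding=encoding, errors="replace") as fh:
        for line in fh:
            if needle in line:
                yield line.rstrip("\n")


if __name__ == "__main__":
    import sys

    # Usage: python find_lines.py dump.txt 2125551239
    for hit in find_lines(sys.argv[1], sys.argv[2]):
        print(hit)
```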
Firstly don’t do something like open it in notepad. 1GB text files are not exactly difficult to work with once you use a proper text editor or parsing tools.
I used to use Borland's Brief to edit large files, but it is commercial. There is a free version in beta, Grief, here: https://sourceforge.net/projects/grief/ It pages the file in and out of memory to save space.
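As for getting a big text file into sqlite in the first place, here is a rough sketch using Python's built-in `sqlite3` and `csv` modules. The table name `usa` and the `company` column come from the query above; the delimiter and the other column names are assumptions, since the file layout isn't described in this thread.

```python
import csv
import sqlite3
from typing import List


def load_into_sqlite(txt_path: str, db_path: str, table: str,
                     columns: List[str], delimiter: str = ":") -> sqlite3.Connection:
    """Stream a delimited text file into a SQLite table, row by row.

    `table` and `columns` are interpolated into SQL, so they must come
    from you, never from untrusted input.
    """
    conn = sqlite3.connect(db_path)
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_defs})')
    with open(txt_path, newline="", encoding="utf-8", errors="replace") as fh:
        reader = csv.reader(fh, delimiter=delimiter, quoting=csv.QUOTE_NONE)
        # Skip blank lines; pad short rows and truncate long ones to fit.
        rows = ((row + [""] * len(columns))[:len(columns)] for row in reader if row)
        with conn:  # commit once at the end
            conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return conn


if __name__ == "__main__":
    # Hypothetical layout: id:name:company
    db = load_into_sqlite("dump.txt", "dump.db", "usa", ["id", "name", "company"])
    print(db.execute("select count(*) from usa").fetchone())
```

Because rows are fed to `executemany` from a generator, the file is never loaded into memory as a whole.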
How is it that NOBODY mentions suing FB for this, the single largest data breach/leak in history? Are they to be let off with zero accountability? Nerds and geeks of ycombinator, pls let us know how you feel.
If something sounds like hyperbole, it probably is. In this case I believe you are mistaken, and the Yahoo breach of 2013 was the largest, with data leaked from over 3 billion accounts.
This does not look like scraping. It looks like a prima facie database leak, and it invalidates Facebook's claims that they don't use your phone number past validation and that they use encryption at rest.
I've had a play with the data for a few people whose phone numbers I actually know, and they all seem to be old enough users that they just have the number on the account anyway. I could be wrong, but I haven't found anyone my age whose number I can confirm.
You need to do a bit more than that; a one-way transform with no secrets isn't good enough for easily brute-forceable data like phone numbers, SSNs, passport numbers, credit card numbers etc. There's just not enough entropy in the data.
There are ways to do these things though so the spirit of your comment is correct.
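To make the "not enough entropy" point concrete, here is a minimal sketch of brute-forcing an unsalted SHA-256 of a phone number. To keep it instant it assumes the attacker already knows the first six digits and only searches the last four (10,000 candidates); the full 10-digit space is at most 10^10 candidates, which is still small for commodity hardware. The function names are made up for this example.

```python
import hashlib
from typing import Optional


def sha256_hex(value: str) -> str:
    """Plain, unsalted, unkeyed SHA-256 of a string."""
    return hashlib.sha256(value.encode("ascii")).hexdigest()


def brute_force_last4(target_hash: str, known_prefix: str) -> Optional[str]:
    """Try all 10,000 completions of a known 6-digit prefix."""
    for n in range(10_000):
        candidate = f"{known_prefix}{n:04d}"
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None


if __name__ == "__main__":
    leaked = sha256_hex("2125551239")  # what a "hashed" column would hold
    print(brute_force_last4(leaked, "212555"))
```

A per-record salt does not help here: it stops precomputed tables, but the attacker can still enumerate every candidate number for each record.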
I’d assume encryption wouldn’t help much since wouldn’t the key most likely be available if the database was compromised?
I would have thought hashing would work if it’s made more expensive such as by choosing an expensive hash function and increasing the number of rounds.
Edit: Would first encrypting the value with the salt and then hashing the encrypted value and salt add more entropy and make hash collisions less revealing?
To protect "sensitive, low-entropy data", the main things I've seen people do are encryption, tokenizing, or anchored hashing. I'm certain there's a bunch of academic work out there I'm not across so I'm writing from the limited perspective of "things I've seen people do in industry".
The best thing to do tends to depend on how you need to use the data, exactly.
With hashing alone there's just no reasonable cost function that will provide (say) 1 year of security in the event of database exfil, but also not DoS your service computing it :/ The problem is being offline-attackable.
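A back-of-the-envelope version of that claim, as a sketch: how slow would each hash have to be for an exhaustive offline search to take a year? It assumes roughly 10^10 candidate phone numbers and that the attacker's workers hash at the same speed as your servers; both figures, and the worker counts, are assumptions.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def required_seconds_per_hash(keyspace: int, attacker_workers: int,
                              target_seconds: float = SECONDS_PER_YEAR) -> float:
    """Per-hash cost needed so that trying every candidate takes target_seconds.

    Total attacker time = keyspace * cost / workers, solved for cost.
    """
    return target_seconds * attacker_workers / keyspace


if __name__ == "__main__":
    for workers in (1, 1_000, 100_000):
        cost = required_seconds_per_hash(10**10, workers)
        print(f"{workers:>7} workers -> {cost:.4f} s per hash")
```

Against a single worker a few milliseconds per hash would do, but against a thousand parallel workers each hash has to cost seconds, and your own service pays that same cost on every lookup.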
Encryption is one possible answer and I think most HNers understand the tradeoffs. Generally the less transparent it is, the more effective it is. Volume encryption or transparent database encryption are good to turn on, but don't protect you much. Keys available at application level only (let's say some fields are KMS'd) are better and will be of use under common failure scenarios (SQLi / DB exfil). You still have to get key management and application security right though and it turns out those are hard to do at scale. Your encrypted fields will also not be efficiently searchable unless you are using deterministic encryption.
The tokenize pattern replaces sensitive data with a random value which is mastered in a centralised, controlled service. This really only makes sense if you can set things up so that almost all operations can be performed using the token. If you allow too many things to do token -> value lookups then it's pointless. Also, all your eggs are now in one basket, so you have to watch that basket. Operations look like:
- Exchange sensitive value for token
- Compare tokens for equality (optional, but usually handy)
- "Domain operations on token". For credit card, "bill the user", for phone numbers your domain operations might be "send SMS" or "robocall".
- Exchange token for value (controls go here; limit access to customer service staff only, auditing, rate limits etc. The value should ideally only come out if a human has to look at it, and you should be able to definitely say who looked at what).
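The four operations above could be sketched as a toy in-memory vault. This is only an illustration of the shape of the interface: the class and method names are invented here, and a real service would need persistent storage, authentication, and rate limiting that this sketch leaves out.

```python
import secrets
from typing import Callable, Dict, List, Set, Tuple


class TokenVault:
    """Toy tokenization service: random tokens stand in for phone numbers."""

    def __init__(self, allowed_actors: Set[str]) -> None:
        self._token_by_value: Dict[str, str] = {}
        self._value_by_token: Dict[str, str] = {}
        self._allowed_actors = allowed_actors
        self.audit_log: List[Tuple[str, str, str]] = []

    def tokenize(self, value: str) -> str:
        """Exchange a sensitive value for a token (same value, same token)."""
        token = self._token_by_value.get(value)
        if token is None:
            token = secrets.token_urlsafe(16)  # random, not derived from value
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return token

    def send_sms(self, token: str, message: str,
                 transport: Callable[[str, str], None]) -> None:
        """Domain operation: the caller never sees the number itself."""
        transport(self._value_by_token[token], message)

    def detokenize(self, token: str, actor: str) -> str:
        """Exchange token for value; access-controlled and always audited."""
        if actor not in self._allowed_actors:
            self.audit_log.append((actor, token, "denied"))
            raise PermissionError(f"{actor} may not read raw values")
        self.audit_log.append((actor, token, "granted"))
        return self._value_by_token[token]
```

Because the same value always maps to the same token, comparing tokens for equality is just `==` on the token strings.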
Anchored hashing uses a secret value in your "hash" operation. Keeping this value actually secret is hard, so an "industrial strength" implementation will use an HSM or other hardware to do the operation. This means any brute-forcing has to happen inside your network where you can see it. You ideally want a bit more entropy than with tokenization to make this work, but with appropriate rate-limits against attack from inside your infrastructure, it has legs. It's hashing, so works well for "have I seen this sensitive data before". The main advantage of this pattern is that it doesn't have to keep state.
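One common way to put a secret into a hash operation is HMAC, sketched below; whether the commenter had HMAC specifically in mind is an assumption. In this sketch the key lives in process memory purely for illustration, where the "industrial strength" version described above would keep it inside an HSM.

```python
import hashlib
import hmac
from typing import Iterable


def anchored_hash(secret_key: bytes, value: str) -> str:
    """Keyed hash (HMAC-SHA256): useless to brute-force without the key."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def seen_before(secret_key: bytes, value: str, known_hashes: Iterable[str]) -> bool:
    """Stateless "have I seen this sensitive value before" check."""
    digest = anchored_hash(secret_key, value)
    return any(hmac.compare_digest(digest, known) for known in known_hashes)


if __name__ == "__main__":
    import secrets

    key = secrets.token_bytes(32)  # stand-in for a key held in an HSM
    stored = {anchored_hash(key, "2125551239")}
    print(seen_before(key, "2125551239", stored))
    print(seen_before(key, "4155550100", stored))
```

An attacker who exfiltrates only the stored digests cannot enumerate phone numbers offline, because every guess requires the key.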
So it sounds like all of these techniques are to force sensitive data access to a single well secured point 1) with the hope that these actions will be eventually noticed and 2) to enforce rate limits on actions to slow progress and increase exfil duration to support (1). (2) can only be made so slow since, as you mentioned, the service needs reasonably quick access.
On a related note some HSMs can enforce rate limiting in hardware so even if the machine enforcing this access is compromised the rate limits still cannot be bypassed.
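The rate-limiting half of that idea can be sketched in software as a token bucket placed in front of the lookup operation. This is a generic illustration, not how any particular HSM implements it; the class name and parameters are invented, and the clock is passed in explicitly to keep the example deterministic.

```python
class TokenBucket:
    """Allow short bursts up to `capacity`, then `rate_per_second` sustained."""

    def __init__(self, rate_per_second: float, capacity: int, now: float = 0.0) -> None:
        self.rate = rate_per_second
        self.capacity = float(capacity)
        self.tokens = float(capacity)
        self.last = now

    def allow(self, now: float) -> bool:
        """Return True and spend one token if a lookup may proceed at `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate_per_second=1.0, capacity=2)
    print([bucket.allow(t) for t in (0.0, 0.0, 0.0, 1.0, 1.0)])
```

At a sustained limit of r lookups per second, enumerating N candidates takes at least about N / r seconds, which is what buys time for the activity to be noticed.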
You cannot prevent the phone number from being found eventually, but that's not the goal. You just need to make it more expensive than a phone number could ever be worth to someone.
If you use a secret you have the same problem as before: the legit system needs to have access to the secret, but an attacker should never get it. So if an attacker gets the hashes and the secret(s), he has everything.
I would like to know if non-Facebook users are included, because Facebook has non-Facebook users' phone numbers due to the fact that WhatsApp uploads the entire phonebook to WhatsApp. That means Facebook is likely to know your phone number even if you don't use Facebook or WhatsApp.
It goes so much further than this, and it is absolutely frightening. The situation sketched below applies even if you don't use Facebook at ALL.
99+% of the people you meet have either FB, IG or WA installed on their phones and share their phonebooks with them (assuming you live in [insert western country here]). There is also a very big chance that at least some have your full name and address in their phonebook. Facebook not only knows who you are, but also who you are in contact with, when you meet new people, and who they are. They also collect phone and text records with their apps, so they know how frequently you are in contact with them, and they can even read the content of text messages (most people grant these permissions to the apps because doing so automatically verifies the associated phone number). Add all the location data, SSID/MAC address collection and countless other data points, and they can draw out your entire life even when you don't use anything from Facebook. There is no escape.
As a counterpoint, I can think of dozens of personal acquaintances who are happily non-users and never interact with Facebook properties (retirees not into tech, busy executives, too-cool-for-Facebook hipsters). If your country or social circle doesn't use WhatsApp, Facebook itself is already dying and Instagram is getting its lunch eaten by TikTok.
It's not about the information you give, it's all those friends and family who signed up for it and uploaded their address book... They now have your phone number and email probably your date of birth, and even some photos of you.
They are like the credit companies, they have information on you whether you allow them to or not.
Exactly. People who save your details in their Address Book and click the "Sync your Contacts" when an app suggests it, are the loophole in privacy and security.
The only way is to have a separate set of email addresses and phone numbers for those family and friends who don't care about privacy and security; that way it is easy to dispose of that information later.
Beyond that … if we truly want to avoid it … the only course of action is to not give them anything at all. Not a single email, not a single phone number, not even our home addresses.
That is exactly how I deal with it. I use a throwaway webmail address and a Google Voice number. No way they're getting my real email or phone number that I use for work and clients.
You're not avoiding having a shadow account created about you though, once I've uploaded my contact list and you're right there, Jim Wyclif, with your real phone number, and I've also tagged you in a photo we took together last year. And then there are all the other people who have your real phone number and who have been on Facebook, Messenger, Instagram, or WhatsApp since they saved the number. And then there are even relatives and former classmates and co-workers who've looked up your name and typed in your town, school, and/or former company to find you. When I use Facebook, it still recommends former classmates and acquaintances who I literally looked up once ever. (As a side-note, Snapchat does the same type of thing; people I've worked for in the past, as a tutor, are recommended to me on the "Add Friends" page.)
> it's all those friends and family who signed up for it and uploaded their address book
Hmm, not many of my friends have my phone number either. I don't usually use phone with friends. Come to think of it I almost never use phone, I always use FB, WeChat, LINE, Hangouts, etc. for voice calls, and sometimes just use a straight up WebRTC call.
I suppose one way to alleviate this for the masses is that when giving your number to a friend, have them use a pseudonym that is easy to remember for them and not use your real name.
I've found that VOIP numbers from certain countries and area codes can evade this problem. Not listing publicly, in case the idiots in charge of the system are prowling this site.
But I don't even understand why they're allowed to look up the provider, or why I can't define myself as a mobile operator.
I don't use Facebook or their other apps (e.g. WhatsApp). Facebook has my email address, as I used to get regular invites to sign up. Facebook also knows what I look like from friends tagging me in pictures, and seems to know my date of birth, as people tell me that they were notified by Facebook.
So even if you have avoided all their stuff, you aren’t immune.
Every time someone argues that people can avoid the privacy problems of Facebook by simply not using it, I point out this issue (plus the shadow accounts).
I recently purchased a phone that had the Facebook app preinstalled. If I had to guess, the mere act of connecting to WiFi caused a whole slew of info to get sent.
I didn’t check what permissions it was given by default but hopefully not too many and with those not much spying. It would be nice to have a clear map of what data can be obtained with what permissions.
There was a dark phase when it looked as if the only way to sign up for various services was going to be Facebook. If memory serves, there was a time when Spotify sign up required Facebook.
The difference between malware and "legitimate" software is whether there's a "legitimate" company behind it and whether that company has a "legitimate" interest in the information. Sad but that's how it is. Just like how governments give themselves the right to crack computer security and surveil everyone but throw citizens in jail if they do the same thing.
Yeah, I feel Messenger or WhatsApp doesn't really come into play here. Facebook has at least two different privacy settings when it comes to phone numbers and how it's shown to people:
1. At https://www.facebook.com/USERNAME/about_contact_and_basic_in... - there's a contact info section where one can fill in a phone number (among other details) and set the visibility to Public/Friends/Friends except <group>/Only me/Custom/custom lists.
2. And https://www.facebook.com/settings?tab=privacy - there's a "How people can find and contact you" section that covers "Who can look you up using the phone number you provided?" with options of Everyone/Friends of Friends/Friends/Only me. I imagine it'd be very easy to select the wrong thing during setup due to the overwhelming number of things to read and click.
I suspect those who have set the latter, or both options to "Everyone" will most likely be in the "free" data dump (except perhaps for most Australian Facebook users, for now).
I feel the second setting is riskier, especially if you don't want employers or colleagues to be able to simply look you up on Facebook by phone number. For example, I could hypothetically display my phone number to "Everyone" on an incognito profile and no one should be able to just wander by and spot my profile and immediately figure out who I am (assuming this profile doesn't somehow get suggested to "Friends you might know" - big assumption, yes; but this would depend on how I complete my profile). Regardless, either one or both set to "Everyone" is a recipe for disaster.
I believe more people would have allowed number searching compared to those who just have their phones displayed to everyone on their contact info sections. In effect, it's a bit of a reverse phone book, plus extra.
How are you able to send WhatsApp messages to people you don't have a prior conversation with?
I am in the same boat... and it was working fine until I lost and replaced my old phone. All conversations were lost, and this makes it challenging to use WhatsApp for any non-group conversations (since I can't start any).
Others can still allow access to their phone book and the information stored in them about you will be transmitted and saved at Facebook, won't it? Is there a way to disable that?
Exactly this. I recently started a Twitter account for my academic career. I didn't share my contacts or anything (I only follow academic Twitter too). I get tons of suggestions of people I know, and several have followed me. The information is from their contact lists, because Twitter knows my number and connected us. There's a clear benefit to this, but there are privacy concerns too. The lack of control over this is what is concerning.
As I've seen on WhatsApp on every smartphone I've had: when you download WhatsApp and tap the "New chat" icon in the upper right corner, the app SHOWS YOU a list of everyone from your phone's contact list who has a WhatsApp account. They also show the list when you click "Chat" at the bottom and then "New Group."
It just occurred to me that the data from the leak could be used to test whether they are in compliance with the GDPR. If it turns out that they don't follow the rules, could this leak be used as evidence in court?
That doesn't seem to be correct, although what does 'phone numbers' mean in this context?
Quote: "WhatsApp, which was acquired by Facebook in 2014, does share some limited data with Facebook, including phone numbers. However, the firm has reassured users that messages will always be protected by end-to-end encryption, which means neither WhatsApp or Facebook can see these private conversations"
"Limited" is a weasel word, as it can mean anything. e.g. a "limited time offer" can mean it lasts for 2 days or 2 years, because it is not unlimited.
Likewise, sharing a limited amount of information with Facebook simply means they don't hoover up every single bit. Perhaps Facebook is not interested in those automated texts you get confirming haircut appointments...
On the other hand, if you just got a haircut, then they know that you'll be looking for another one in a set amount of time (based on your hairstyle, which they also know from photos), and they could advertise hair salons to you then.
I’m not sure their algorithm is this refined, but it’s not impossible.
It claims not to, which isn't a guarantee. After all, they also claimed not to use phone numbers given to them for 2FA for anything else, and yet ended up using them for ad targeting.
"Cathcart: It’s true that we do have some information about how people use WhatsApp and that we do know, for example, the device ID. We collect this only to secure our services and protect from attacks. When you use WhatsApp and allow access to your phone book, we only see the phone numbers, not the name.
DER SPIEGEL: Do you share these numbers with your parent company Facebook?
Cathcart: No, we don’t. The updated privacy policies will actually not change anything globally in our ability to share data with Facebook."
Affiliated Companies. We are part of the Facebook Companies. As part of the Facebook Companies, WhatsApp receives information from, and shares information with, the Facebook Companies as described in WhatsApp's Privacy Policy, including to provide integrations which enable you to connect your WhatsApp experience with other Facebook Company Products; to ensure security, safety, and integrity across the Facebook Company Products; and to improve your ads and products experience across the Facebook Company Products. Learn more about the Facebook Companies and their terms and policies here.
AFAIK, that addition was what caused the uproar earlier this year.
(Also note the dark pattern in both terms of service that sows confusion as to which are the ones that apply to the EU. In the first sentence, “If you live in the European Region, WhatsApp Ireland Limited provides the Services to you under this Terms of Service and Privacy Policy.”, ‘this’ doesn’t refer to the text you’re reading, but to the texts behind the hyperlinks.)
Someone scraped some public profiles. Someone then brute forced a poorly implemented "look up by phone number" feature. They linked the two datasets on the unique facebook user id.
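For anyone wondering what "linked the two datasets on the unique facebook user id" amounts to, it is essentially an inner join on a shared key. A minimal sketch with made-up records; the field names (`user_id`, `name`, `city`, `phone`) and the helper `link_on_user_id` are assumptions for illustration, not the format of the actual dump:

```python
# Hypothetical records: field names and values are invented for illustration.
public_profiles = [
    {"user_id": 101, "name": "Alice Example", "city": "Utrecht"},
    {"user_id": 102, "name": "Bob Example", "city": "Berlin"},
]
phone_lookups = [
    {"user_id": 101, "phone": "+31600000001"},
    {"user_id": 103, "phone": "+49150000002"},
]


def link_on_user_id(profiles, lookups):
    """Inner-join two lists of dicts on their shared 'user_id' key."""
    # Index the phone dataset by user id so each profile is matched in O(1).
    phones_by_id = {row["user_id"]: row["phone"] for row in lookups}
    return [
        {**profile, "phone": phones_by_id[profile["user_id"]]}
        for profile in profiles
        if profile["user_id"] in phones_by_id
    ]


linked = link_on_user_id(public_profiles, phone_lookups)
```

Only ids present in both datasets survive the join, which would also explain why people who never attached a phone number don't show up.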
Leaking data that is or was in the public domain is not much of a leak. The only noteworthy thing would be the leak of the non-public phone number, however that vulnerability has been widely known since 2019 (and has been resolved by Facebook), so there's nothing new here?
Where could I, or any Internet user, trivially download these details on 533M Facebook users prior to this dump? If nothing else, it seems extremely noteworthy that someone was not only able to obtain the data through scraping or some attack, but has shared it with the world.
> Where could I, or any Internet user, trivially download these details on 533M Facebook users prior to this dump?
On Facebook. Literally. You can scrape any public profile info. It's against ToS, but it's not illegal (some caveats apply, see the hiQ Labs v. LinkedIn case for more info).
The only noteworthy thing is the phone number vuln. Except that's been known since 2019, so it's certainly not news.
There's a difference between programming a scraper capable of scraping 500 million records, running it and storing the results without getting caught by Facebook, and simply downloading a file.
How hard is it to change phone numbers? So say I release my old number and take a new one, how do I make sure I am not forgetting any 2FA services I signed up for?
Because of this I use fastmail or Apple sign-in to make a new email for every online service I register for, and when it's a username I pull random words together from a dictionary. I also avoid using my real name as much as possible.
No, nobody has 1-3. It starts at 4, 5 & 6, who are the founders:
- Mark Zuckerberg is 4: https://www.facebook.com/profile.php?id=4
- Chris Hughes is 5: https://www.facebook.com/profile.php?id=5
- Dustin Moskovitz is 6: https://www.facebook.com/profile.php?id=6
Is it possible to download this without giving money to criminals? (The article says free, but my 2 minutes of googling hasn't found it, somewhat unsurprisingly).
Is doing so legal?
If the answer to both of those questions is yes... I'd like to take a peek. Mostly to check whether or not some numbers I know haven't been directly given to fb are there.
Last night I was browsing Facebook, and all of a sudden, it said there's been suspicious activity and I've been locked out of my account. To unlock it, I had to review the email address and phone number associated with my account (in case the hijacker added their own contact info), but all it had was the info I added in 2011 (before I knew what a piece of shit Zuck was). Then it asked me to change my super-complicated password because it said the password is no longer secure.
So, can I assume this leak is related to this strange event?
Highly unlikely to be related. It's not a password leak. It's also not really a leak: someone scraped some public profile info and then used the phone number lookup feature to match up the two.
And yet it is still considered audaciously paranoid among the general public to protect your privacy by not having a Facebook/LinkedIn/Google/... account.
I've noticed that some people who don't have personalised social media seem to assume that other people do because they're mentally deficient or ignorant.
It's the same as how unsympathetic people ask why fat people don't just stop eating, or drug users stop getting high, or the cyberbullied don't just turn off their phone.
It's a lot more complicated than "just don't use facebook".
But parent is not talking about calling out people for having social media accounts. He/She is talking about those having a social media account judging those not having one as paranoid. You've just propped up a straw man here without addressing the point the parent comment made.
I removed my phone number from Facebook when it was reported that Facebook used this as some sort of tracking mechanism across third party vendors - specifically with purchases from merchants - in order to serve more "relevant ads". From what I recall, if the merchant is somehow hooked up into FB APIs then regardless of whether you signed up for their rewards program using an e-mail + password or via FB SSO, they would send "anonymized" data back to FB for each purchase.
I wonder if my phone number still persists (aka "soft delete")
When did you remove your phone number? Looks like this relates to a vulnerability that was patched in 2019.
I'm slightly concerned about this myself. I'm also seriously ticked off with Zuckerberg and co. I can tolerate the fact that internally they do scumbaggy things with my data. I tend to have less forbearance when they let my data out into the wild.
I guess if you use facebook you just deserve all the shit you get. What sucks is that the rest of us have to live with it too. I suppose we shall just keep waiting for that darn market to correct itself!
So when are we going to stop companies from accessing your address book and 'uploading it' as part of the sign up process? Or even using Facebook and its services in general.
Well the biggest offender now has leaked the data of hundreds of millions of users who have attached their phone numbers and full names.
Now let's see if the users REALLY care this time that when they signed up to Mark Zuckerberg's website, it wasn't a good idea to sign up with a phone number in order to 'stop bots'. They did not learn with the Cambridge Analytica scandal, are they finally going to learn?
How is it a leak? There is no information on how the data leaked. My bet would be that it’s hoarded through the FB API and passed around. Nothing new happened here is my guess.
Are there immediate actions people should be taking at this point?
A lot of password reset flows work via username + SMS using "we've sent a code to your phone number (xxx) xxx-xx12". This database unmasks that phone number, so my assumption is this makes sms hijacking more viable, but perhaps someone more knowledgeable can weigh in.
Does Facebook allow password resets like this, and can that be disabled?
Used to be phone books filled with people's phone numbers, names and addresses. In that light, this leak isn't too bad. Not great, but not that bad.
(Easy enough for me to say, I never gave FB my phone number in the first place, I've never logged into FB on my phone, and I'm not in any of these files.)
It's mind boggling how absolutely unaccountable these big tech companies, especially Facebook, have become. I think we hear about such massive leaks at least once a year. And nothing ever gets done to prevent this in future, and even the fines, if any are imposed, are chump change for the companies.
Interested to know the GDPR implications of this for Facebook. This seems like one of those occasions where the regulator might be tempted to impose the maximum fine…
Long story short, regulators already have more than enough evidence about Facebook's lack of GDPR compliance so they could've already imposed large fines if they wanted to. The fact that it hasn't happened yet shows there's no motivation to actually enforce the regulation.
But Facebook have enough money and lawyers to ensure a plea deal is agreed. It will work out as nothing more than a slap on the wrist.
All the big tech companies budget for these fines. You'll often see it called out at earnings where they allocate funds for a particular fine in a given quarter even though the investigation isn't finalised.
Tried to look up some info, but it's not there. Maybe it's from some web scraper which collected public info, or other means (some ambiguous mobile app which had access to contacts?). Or the leaked files are incomplete.
I’m curious about the pool of Facebook users who seldom use the product, retaining it solely for groups and to keep in touch with family. Will this event loosen that final brick and drive these users to delete their accounts?
In my case, no, because I still need to occasionally keep in touch with those people.
I use Facebook a LOT less nowadays though. Here's what I do:
- removed all of the apps from my mobile devices.
- only check it on VPN, with Firefox, using the Facebook Container extension.
- log out each time and do not use the "save this browser" feature.
- uBlock Origin & Pi-hole are active on the VPN too.
I have managed to completely ruin its targeted ads for me. It's been an amusing experiment.
I still use it less because it is a HUGE memory & resource hog and eventually makes my browser window slow to a crawl.
"keep in touch with family" can be subsumed by chat apps. But for discussion groups and special interests, facebook is still the most accessible site to run (small) groups in, or am I mistaken?
If you have some need to know the people, maybe, but if not, hobbyist subreddits are better for discussion groups and special interests. The only thing I can't see leaving Facebook is some group that requires real-world interaction like a buy nothing group, but there are neighborhood specific platforms popping up that would enable applications like that without having to use Facebook.
This option still causes the call/sms to show a notification bubble... which is annoying since you can’t ignore it if you rely on notification bubbles to know if someone you know has messaged you.
Nice option, but whitelisting is a bit much for me. Amazingly, the iPhone still does not have an option to automatically block anonymous calls (those that hide their numbers) and only them.
Interesting numbers in the linked tweet in the article.
5M accounts for the Netherlands exposed. Almost 1/3 of the population. Compared to Germany where “only” 6M are leaked, not even 10%.
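A quick sanity check of those percentages. The leaked counts are the ones quoted above; the population figures are my own rough 2021 numbers (about 17.4M for the Netherlands and about 83M for Germany), so treat them as assumptions:

```python
# Approximate 2021 populations (assumed round figures, not from the thread).
POPULATION = {"Netherlands": 17_400_000, "Germany": 83_000_000}
# Leaked account counts as quoted in the comment above.
LEAKED = {"Netherlands": 5_000_000, "Germany": 6_000_000}


def leaked_share(country):
    """Return leaked accounts as a fraction of the country's population."""
    return LEAKED[country] / POPULATION[country]


for country in LEAKED:
    print(f"{country}: {leaked_share(country):.1%}")
```

With these inputs the Netherlands lands a bit under 30% and Germany a bit over 7%, so "almost 1/3" and "not even 10%" both hold up.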
Found myself in the data set, but didn't find several people I expected to find. Seems to be only those who added their mobile number (I did so for account recovery purposes only).
Thanks. I'm just getting a "Please open Telegram to view this post from @freedomf0x" message. Any way to access this without signing up for Telegram? The irony of giving my personal info to another 3rd party just to check if my personal info was leaked by a different party is too much...
I've tried three different browsers and none can get the download to work. It's possible I'm blocking some tracking domain at the router-level that's integral to the download functioning.
I doubt you'll find it anywhere else, it probably gets removed anywhere it gets posted due to violation of terms of service. I created a Telegram account and the Telegram link still works. I think that's your best bet.
Which link? The ufiles? Why does it go without saying? Not like stuff is instantly executed by downloading. All I got for my selected country was a plain text file.
Thanks. Was just able to verify I'm not affected (deleted my acc years ago), but it's crazy how many of my friends' names plus phone number are on there.
OT: Failed their hCaptcha probably 10-15 times before I gave up and just closed the tab. Hadn't seen it in the wild yet; I'm glad it's not pervasive (yet).
You might give this a look. It's a FOSS browser extension that auto-solves google captchas by switching them to sight-impaired mode and using speech recognition.
I believe they mean it can't effectively be sold if everyone has it. It loses value as a commodity if anyone can access it, but the value of the data is still intact.
But facebook is not in the business of selling your data. It's in the business of selling your attention and it uses data to do so. There's nothing about this leak that changes Facebook's position in this market in this regard.
I can’t find any Facebook resource that says I can buy data, or any other reputable source that reports Facebook is selling details about people. Can you provide any?
This one is interesting and refutes the argument you are making
“When the company argues that it is not selling data, but rather selling targeted advertising, it’s luring you into a semantic trap, encouraging you to imagine that the only way of selling data is to send advertisers a file filled with user information. Congress may have fallen for this trap set up by Mr. Zuckerberg, but that doesn’t mean you have to. The fact that your data is not disclosed in an Excel spreadsheet but through a click on a targeted ad is irrelevant. Data still changes hands and goes to the advertiser.”
The opposing view is that the leaked data is only valid up to the time of the leak, and there is no guarantee it doesn't go stale, while Facebook has the fresh data.
The only way the data can lose value is when everyone in the list changes their number. If making it public drives them to do that, then good, but the inertia to do that is so big that I doubt most would. Spammers and marketers don't care if the list is being used by competitors, so the value of the data as a spam target is also not reduced by making it public.
I wonder why India with population 1.3bn has only 145MB zipped footprint when USA with population a quarter of that has almost 1 GB? AFAIK, FB is huge in India.
It looks like they abused the old feature to locate user by phone number and just bruteforced all numbers in a country. This feature is no longer available.
Personally, I wish Facebook would finally get slammed with the long overdue consequences of its questionable practices around data handling and transparency, let alone the minuscule control users have over their own accounts and PII. This leak may have been preventable for a vast number of individuals. I suppose many are familiar with the old account "deletion" process that would prove itself, even years later, not to be a real removal but a mere deactivation, with the account waiting to return from the graveyard whenever pinged by the simplest of login attempts by bots or ill-intentioned individuals. At this point, considering the sheer number of accounts I believe are stuck in limbo, a dedicated fast-track deletion process should be enforced on Facebook. To my limited knowledge, I have not found any case of a GDPR request succeeding in getting an old account permanently removed from their systems when that account never accepted the newer ToS and cannot be authenticated by any currently permitted means (the registration and connected e-mail addresses are not among them). My attempts, at least, have come up short.
Facebook should be obligated by law to use this leaked dataset to notify each individual user that, if the data is still valid, it is now widely distributed.
There are risks: stalking, sim-swapping and phishing via company info.
Right? They're masters at adopting the (supposedly) moral high ground and acting all hurt when others criticize them - you'll hear 'we need to be better' but there's this overriding sense of, how dare people differ from what we feel is best?
Great, so we get the worst of both worlds: outrageously obnoxious opt-out games (which, if skipped, implies free rein) and non-compliance as a cost of doing business. Wonderful.
GDPR/CCPA were always a way to punish the many (i.e. us) for the sins of the few (Facebook and Google).
And, I might add, it's to their benefit. What we're entering is a future where only Amazon, Google, Facebook, and Apple can handle data.
You see this constantly on HN. "Don't handle your own auth, just let Firebase do it!" People are against anyone other than AWS touching their data. And, well, we're getting the dystopia we deserve. What else can I say.
> Facebook already breaches the GDPR in many ways and has yet to see significant consequences, so this is unlikely.
Not having the data encrypted at rest seems to me a different infraction than the previous ones. The scale also matters, and that it isn't the first infraction.
Facebook's tracking consent flow has been in breach since the regulation went into effect in 2018, and has affected millions of people, both users and non-users. Keep in mind that had Facebook been compliant with the GDPR, the recent Apple changes regarding tracking consent on iOS wouldn't have been an issue for them at all.
I'd argue this is a much bigger issue than the lack of at-rest data encryption, and yet nothing has been done.
In my opinion, everyone should accept the fact that some of their personal information has been leaked online. But to think that every user's phone number is bound to other sensitive information is quite scary.
I don’t have FB or WhatsApp but my Insta account (using a separate email address and no personal details) keeps recommending my therapist to me. How are we still ok with this shit?
The sooner we get rid of the cancer that FB is, the better.
I didn’t share my contact book with FB apps either. It was probably her—a person in her 70s, not necessarily experienced with tech.
The main reason this company exists, or that ad tech can maintain a facade of not being a mainly bullshit industry with made up metrics, is the lack of informed consent.
It’s almost funny how we accept the current situation as normal. Because I think that we’ll look back at these times with disbelief at how reckless we were and how cheaply we sold ourselves.
You do know how this happened right? Wifi SSIDs with similar strengths reveal if people are in the same area, then just correlate timestamps and voilà!
I wouldn't throw the elder person under the bus on this one, the tactics are sophisticated, and honestly, just a precursor to what will happen with AR.
To give a bit more detail on how it could be implemented (at least how I would propose it on iOS): Insta/FB/WhatsApp queries the available wifi SSIDs in a background process (or whatever they have for notifications/networking etc.), and does the same on your therapist's phone since you both have Insta/FB/WhatsApp. Based on the signal strength, it can say with confidence that you two were in the same room because the XYZ wifi strength is -X dB just like yours (walls attenuate the signal strongly), and that you were both there for some time based on the background thread timestamps.
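To make the idea concrete, here is a rough sketch of that kind of correlation. It is purely hypothetical: nothing here is known to be what any app actually does, and the scan format `(timestamp, ssid, rssi_dbm)`, the 5-minute bucket, the 6 dB tolerance and both function names are assumptions picked for illustration.

```python
from collections import defaultdict


def bucket_scans(scans, bucket_seconds=300):
    """Group (timestamp, ssid, rssi_dbm) scans into {time_bucket: {ssid: rssi}}."""
    buckets = defaultdict(dict)
    for timestamp, ssid, rssi_dbm in scans:
        buckets[timestamp // bucket_seconds][ssid] = rssi_dbm
    return buckets


def colocated_buckets(scans_a, scans_b, max_rssi_gap_db=6, bucket_seconds=300):
    """Count time buckets where both devices saw the same SSID at a similar strength."""
    a = bucket_scans(scans_a, bucket_seconds)
    b = bucket_scans(scans_b, bucket_seconds)
    hits = 0
    for bucket in a.keys() & b.keys():
        shared_ssids = a[bucket].keys() & b[bucket].keys()
        if any(
            abs(a[bucket][ssid] - b[bucket][ssid]) <= max_rssi_gap_db
            for ssid in shared_ssids
        ):
            hits += 1
    return hits


# Made-up scans: timestamps in seconds, signal strength in dBm.
patient = [(0, "ClinicWifi", -48), (300, "ClinicWifi", -50), (900, "CafeWifi", -70)]
therapist = [(10, "ClinicWifi", -45), (310, "ClinicWifi", -52), (900, "HomeWifi", -40)]
passerby = [(0, "ClinicWifi", -85)]

same_room = colocated_buckets(patient, therapist)
just_nearby = colocated_buckets(patient, passerby)
```

The passerby sees the same SSID but at a much weaker strength, so it doesn't count as a match. Repeated matches over weeks would be the "relationship" signal.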
haha, that's a good point, but in this case I think it's more trivial than that: she probably shared her contact book with FB or Insta (still, not her fault imho).
But, at the same time I've worked with the FB SDK, which was just one big shit show. It's hard even to describe it without turning a comment into an essay, so I'll pick the two I found somewhat amusing: sending data to FB before the developer could pass user consent (or lack thereof), and sending hashes of the (non-FB) libraries installed on your phone to FB servers.
Minor tangent: The best thing about the web is that user agents are still pretty good at fighting some of the tracking practices (ETP/ITP, cross origin security, etc...). It's actually quite impressive. Then, native is just one big black hole. This is why the current browser changes, although positive overall (less $$ from 3p tracking), are a double edged sword (pushing people towards walled gardens).
It’s almost certainly just the phone number. Recently Instagram told me that a former business partner of mine had joined and I was surprised to learn that his account was a hair braiding service in Atlanta for women with African lineage (we’re both Canadian men with European ancestors). We figured out that years ago we had taken a business trip there and picked up temporary SIM cards back when Canadian cell phone plans charged injurious roaming fees. I still had that phone number in my contacts for him when I joined Instagram, and it had finally been recycled and used to create an account.
It’s a cool thought experiment for nerds and paranoiacs to imagine how you might use relative wi-fi strengths, bluetooth beacons and complex interaction patterns, but it’s less sophisticated than that.
Yeah, my first thought reading the parent comment was two words: "Occam's razor". But, I still find it amusing that companies like FB want to project the image of "informed consent" whereas we have a bunch of developers here trying to figure out what the hell happened and coming up with plausible solutions.
What's interesting though (and I know that from my professional experience in ad tech) is that the "cookiegeddon" did push companies towards non-deterministic, more fuzzy ways of cross-device targeting (and we're talking about people who already think that fingerprinting is ethical).
The upside is that metrics are mostly bullshit anyway.
> It’s a cool thought experiment for nerds and paranoiacs to imagine how you might use relative wi-fi strengths
I'm honored to be called a nerd on HN... I'll ignore the latter ;)
Though while I agree the phone number is absolutely used, I don't think it's the only signal. Trying to get out ahead of the public's changing privacy tastes is a must for any advertiser that collects social-graph-like data. So strategically, if FB is not doing this, I would pull any FB investments because they aren't trying to do their job.
Is it even legal for a therapist to share their clients' contact details with a third party?
Certainly I would expect that a person who works as a therapist would be aware that the concept of client confidentiality exists and that they should not share their clients' details.
It's not like Facebook is being transparent with what data they collect and how it's going to be used. Furthermore they don't understand the concept of "no" and will keep asking, hoping to catch you off-guard as you press the wrong button and give them access.
>You do know how this happened right? Wifi SSIDs with similar strengths reveal if people are in the same area, then just correlate timestamps and voilà!
The problem is that someone decided to correlate them, not to mention without asking.
It is possible to opt-out of Google's Wi-Fi network location mapping by appending "_nomap" to SSID[1], I'm not sure if it works with other providers. Although I think this should have been opt-in instead of opt-out, the least we deserve is a standard, guaranteed way to universally opt-out.
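For what it's worth, the opt-out is purely a naming convention: the SSID has to end in "_nomap". A small sketch of a check and a rename helper; the function names are made up, and the 32-byte cap reflects the 802.11 limit on SSID length, which the suffix can push a long name past:

```python
NOMAP_SUFFIX = "_nomap"
MAX_SSID_BYTES = 32  # 802.11 limits an SSID to 32 octets


def is_opted_out(ssid):
    """True if the SSID already carries the opt-out suffix."""
    return ssid.endswith(NOMAP_SUFFIX)


def opt_out(ssid):
    """Return the SSID with the opt-out suffix appended (hypothetical helper)."""
    if is_opted_out(ssid):
        return ssid
    new_ssid = ssid + NOMAP_SUFFIX
    # The suffix costs 6 bytes, so long names may no longer fit.
    if len(new_ssid.encode("utf-8")) > MAX_SSID_BYTES:
        raise ValueError(f"{new_ssid!r} exceeds {MAX_SSID_BYTES} bytes")
    return new_ssid


renamed = opt_out("HomeWifi")
```

You still have to rename the network on the router yourself and reconnect every device, which is part of why opt-out by SSID is such a chore.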
Why is it always us who have to do the work to avoid being harassed by Google? If I don't want to have my site harvested for snippets I have to add a no-snippet tag. If I don't want my WiFi data harvested I have to append an ugly nomap to my SSID. What about making it opt-in, as you said? I'm tired of doing Google's dirty work...
By the way, quoting from the article:
> "Specifically, this approach helps protect against others opting out your access point without your permission."
Oh, thank you for your kindness, Google. Yes, the idea of another person denying me the joy of having my WiFi data harvested by you is terrifying. Thanks, Google. You really know how to be helpful...
Especially because Google mapping your WiFi comes with real downsides for you. Two years ago a random stranger rang my doorbell and told me their Android phone got stolen and according to Find My Device, the device was inside my house, and even showed it to me live. I told them to wait on the street and checked the roof and yard, but didn't find the device. I simply told them I can't help further and they luckily took it well, thanked me and left. Imagine how easily such a situation can get ugly though. A day or so later I realized that my WiFi router happens to be at an oddly open corner of my house, facing the backyard, and visible for much further than you'd expect since there are also no other structures for quite a distance. I bet their phone was somewhere there but saw my WiFi and so it erroneously located itself in my house. Thanks Google!
This isn't relevant - we're not talking about building a map of SSID to location, we're talking about using SSIDs to infer relationships between people; the SSIDs don't even have to be in any kind of location DB for that, what allowed Facebook to infer this relationship is that both the author's and their therapist's device regularly saw the same SSIDs.
> You do know how this happened right? Wifi SSIDs with similar strengths reveal if people are in the same area, then just correlate timestamps and voilà!
I mean yeah, they _could_ do that, but that's a pain in the arse to do. It's far easier to do it on contact lists, interests and implied location from business page follows.
I don't think iOS allows you to track SSIDs, which explains the lack of wifi scanning utilities in the app store.
Not sure why you're suggesting shenanigans like wifi SSID tricks (and others jumping on the bandwagon), when the actual thing that happened here is obvious:
GP visited their therapist's website, the website had FB/IG advertising tracker installed, the therapist had a campaign running that targeted all visitors from their site.
I appreciate that idea, however, I've been testing my own 'friend suggestions' and keep a strong track of my antics... also, it's become a hobby of mine to debunk each time someone says 'they're listening to my microphone!!!'
Most of the time the 'listening to me' conversations are based on origin IP to insta/fb/whatsapp servers. One person talks about idea X, another person looks it up (either in the room or later at home by themselves), and now everyone who was at that IP together will get ads for X.
What's more, Google maps uses Wifi SSIDs to get better location data when GPS gets a bit spotty... so, I'd venture to say it's a small step to associate accounts and make friends.
What's making it possible is the lack of privacy regulation. People by and large don't care enough about privacy: it's too diffuse, too complicated, the damage to oneself and others is too intangible, etc.
Only way to end this is to destroy the business models that make it possible. What stands in the way of it is the mindset that this somehow harms innovation (innovating on who can drive the Titanic faster into the iceberg isn't innovation), that the government has no right to regulate private companies, and so on. The main problem is that people are trying to incrementally fix a broken thing. As Peter Drucker said:
"There’s a difference between doing things right and doing the right thing. Doing the right thing is wisdom, and effectiveness. Doing things right is efficiency. The curious thing is the righter you do the wrong thing the wronger you become. If you’re doing the wrong thing and you make a mistake and correct it you become wronger. So it’s better to do the right thing wrong than the wrong thing right. Almost every major social problem that confronts us today is a consequence of trying to do the wrong things righter"
>People by and large don't care enough about privacy
Not to play dumb or sealion, but what opportunities are they given to do so? How often have those opportunities been one-and-done, "if you don't do something to protect your privacy in this particular instance at this particular moment, it's gone forever?"
> How often have those opportunities been one-and-done, "if you don't do something to protect your privacy in this particular instance at this particular moment, it's gone forever?"
I don't think that question really captures it, because an easy response to that is "Why do I care? Why is my privacy so important that it's a problem that it's gone forever?" To some of us that might seem like an absurd question; we see privacy as an obviously valuable thing that we are struggling to maintain.
But I don't think that's the case for most people; I think most people adopt the "I have nothing to hide, so what does it matter?" attitude. Especially when they (likely correctly) believe that online services that are central to their lives (like GMail or GDocs or Facebook or Instagram or WhatsApp) wouldn't be free to use if they didn't give up their data (and privacy) in return for the service.
You can try to point to data breaches, but, even then, most of those don't have a tangible effect on people. 533M Facebook users' phone numbers and personal data leaked? Most of those 533M probably won't notice anything bad happening because of it, and any bad stuff that does happen... well, they probably won't be able to draw a causal line from the FB breach to the bad things.
>I think most people adopt the "I have nothing to hide, so what does it matter?" attitude.
I think when this comes up it's a rationalization when the question becomes personal, "what if YOU had your identity stolen." However, for large-scale stories like this, I think defeatism drives the response more than a sense of innocence. You can't fight city hall, a cultural principle that anything a business does is justified, and other "oh well!" type reactions.
Yes, we need better laws, opt-in consent and alternatives to ad tech (such as better ways for supporting publishers). The issues are systemic, going deeper than ad tech itself (e.g. conflicting incentives even within same publishing org, metrics being mostly nonsense, Goodhart's law).
I think that the existing incentives can be moved, but we will need a change in mentality that might require a generational shift, or who knows how many fuck-ups. I'm becoming more and more pessimistic wrt the latter.
Yup, I don't share my contacts with FB or insta, but I think that she did. I don't blame her, she's not a very "technical" person and the UX is not meant to help her make a conscious choice.
There are many other ways this could happen, did you google her address on your phone browser or something like that? IG always seems to give recommendations based on what I've watched on youtube recently or looked up somehow.
Honestly, it's almost certainly either her uploading her contacts, or location. I know that I normally get FB friend suggestions for people I've been at parties with.
The metastasis is companies and organizations that have FB groups and insist that’s the only way to get data or collaborate with them and their members or customers.
The reason we are here is because the one subset of the population which can do something about it has sold out. Is it the congressmen? No, it is us. Also the professors that taught us and the departments that accredited us. Either we did nothing to fight back or we are ourselves complicit and helped them build this world we live in.
I see what you mean but I think it's a bit more complicated than that. It's hard to make the right choice when most of the information you receive comes from the entities in whose interest it is that you don't make the right choice (e.g. Google, FB).
An average HN reader is in a very comfortable situation compared to the remaining 99.9% of the population, who might not have time to think about this.
Unless, and I might've misunderstood you, by "us" you mean the people who work on those platforms, and have the time and resources to think about these matters, in which case I'd say that I agree with your statement. What's worse is how much brain power we're wasting on solving problems that shouldn't exist in the first place.
"The best minds of my generation are thinking about how to make people click ads"
> I don’t have FB or WhatsApp but my Insta account (using a separate email address and no personal details) keeps recommending my therapist to me. How are we still ok with this shit?
I'm no attorney, but isn't there a doctor-patient confidentiality breach (in the U.S.) if a psychologist/iatrist's rolodex gets Facebooked out to the ad tech bidding systems?
> The main reason this company exists, or that ad tech can maintain a facade of not being a mainly bullshit industry with made up metrics, is the lack of informed consent.
Exactly, the industry is built on a foundation of obfuscating the myriad ways in which they are using people's personal data. Uninformed consent is the cornerstone of their business model.
The technology is just creepy. I experienced a wtf moment the other day when a friend stopped by and her new bf was in the car. We said hello and they soon left (I sell eggs). Later that day he was being suggested as a possible friend. I have my location services off but Facebook knew somehow.
Or FB knew that this person was your friend's boyfriend and decided to show them as a possibility. You might have even seen them there before and didn't know them and thus ignored them.
I'm not happy with ANY of it which is why I have no social media accounts and I've been seriously considering a "dumb phone" to replace my smart phone. I simply don't use most of the features and it's a security/surveillance threat anyway.
I had the same problem but figured it out at last. The Instagram recommendations are based on who is in your phone contacts. Anytime I add a new contact number, they show up in my Instagram recommendations even if we never interacted in any way, not even by phone.
People need to accept that there’s no such thing as a free (as in beer) app or service. In addition, there need to be serious laws put in place that give users control of their data. And the users should be getting paid out of Facebook's profits, not the shareholders.
The problem is less about whether people are willing to pay for services and more that it's currently more profitable to provide ad-supported services (paid for by non-consensual data collection) than paid ones.
Regulation that forbids non-consensual data collection such as the GDPR ought to fix that, but its lack of enforcement means it didn't have any effect on the market. Once regulation starts being enforced, it will rebalance the market where paid services will start to be viable because free services would no longer be profitable.
Nice (and detailed) blog post. In such a case there is a clear escalation path (in the EU). Either email your DPA (Data Protection Authority) or take legal action. Here are the email addresses of the various DPAs: https://edpb.europa.eu/about-edpb/board/members_en
We are working on automating the escalation to the DPA part as well.
Yes, I submitted GDPR (Article 17) right to erasure requests, and I got utter garbage (please use the UI)
Facebook:
> Thank you for contacting Facebook. We have reviewed your report and it appears you would like to delete your Facebook account.
>
> Please note, for security reasons, we are unable to delete accounts on behalf of users so you will need to log into your account and delete it yourself. We have put in place a very quick and easy process for people to schedule the permanent deletion of their Facebook account.
>
> Before permanently deleting your account, you may want to log in and download a copy of your information from Facebook. Once your account has been deleted, it cannot be recovered.
However, after back and forth with them for a few weeks, I got this:
> Hi,
>
> Thank you for contacting Facebook. Based on the information you've provided, it looks like you're trying to request the erasure of certain personal data under Article 17 of the General Data Protection Regulation (GDPR).
>
> Additionally, as per your request, your account has been scheduled to be deleted.
>
> Please keep in mind that you have up to 30 days to cancel the deletion. Once your account has been processed for deletion, it may take up to 90 days for all of your information to be permanently deleted.
>
> For more details, please visit the Help Center article below:
>
> We store data until it is no longer necessary to provide our services and Facebook Products, or until your account is deleted, whichever comes first. This is a case-by-case determination that depends on things like the nature of the data, why it is collected and processed, and relevant legal or operational retention needs. For example, when you search for something on Facebook, you can access and delete that query from within your search history at any time, but the log of that search is deleted after 6 months. If you submit a copy of your government-issued ID for account verification purposes, we delete that copy 30 days after submission.
>
> When you delete your account, we delete things you have posted, such as your photos and status updates, and you won't be able to recover that information later. Information that others have shared about you isn't part of your account and won't be deleted.
> I don’t have FB or WhatsApp but my Insta account (using a separate email address and no personal details) keeps recommending my therapist to me.
So what? What's the harm?
People sure like to write emotionally charged posts arguing for privacy, but they're always suspiciously low on details on what bad things (actually) happened.
Even in this case with phone numbers and other data leaked, so what? What harm do data leaks cause?
Seems like making a fuss about nothing.
> How are we still ok with this shit?
We're ok with a lot of shit. I think if we were to make a list of shit this would rank pretty low.
You've obviously never been a victim of identity fraud, stalking or psychological terror.
As long as the legal justice system hasn't caught up with that (in the sense of efficiency and prevention of financial problems) every data point that's leaked about you is a potential threat.
> fuss about nothing
Ever heard about rape victims? Ever heard about stalkers? Ever heard about psychological threats? Ever heard about someone being forced to do something they don't want? Ever heard about the fappening? How do you think those things have happened in the past and literally ruined people's lives?
> You've obviously never been a victim of identity fraud, stalking or psychological terror.
And that's the point: most people haven't, and many who have probably weren't able to link it to something specific like "Facebook vacuumed up all my data and then lost it". And "most people" are the people who influence and make policy.
> Even in this case with phone numbers and other data leaked, so what? What harm do data leaks cause?
Let's imagine a situation. You get an official-looking letter from an organization unknown to you, claiming, for example, that your lawn is infected by a grass variant of COVID-19 and must be disinfected, and that this organization could do it in a jiffy for a mere $1k.
It's probably a scam, isn't it? How do you judge it? One of the signs of a scam is a lack of personal information in the letter. But if you see that the letter contains your name, address, phone number and lawn dimensions, then you probably shouldn't just throw it in the garbage bin; you have to find some other test to judge whether it is a scam. Right?
So once your personal information is public, scam detection imposes bigger costs on you. Even if we assume that you are a perfect scam detector and will not let any scam get past you, a lot of people are not perfect in this regard. So the more difficult detection is, the more prey there is for scammers. That imposes costs on society overall, because society starts giving money to scammers and financing all that activity, which is counterproductive for economic growth.
As for me, it is just a nuisance to decipher such letters while trying to spend as little time on scam detection as possible and still have no false positives.
> People sure like to write emotionally charged posts arguing for privacy, but they're always suspiciously low on details on what bad things (actually) happened.
Two bad things (random selection, because the comments below already make some really good points):
1. targeted behavioural advertising is proven to increase polarisation, literally turning people against each other.
A single instance of violating someone's privacy doesn't matter much on its own, just as your single vote won't shift the result of an election. But a single vote does matter, because it is part of a bigger whole.
2. My family member suffers from PTSD acquired because of living in an abusive relationship for 2 decades. That person started a new life, but ads targeted at her and her partner more than once triggered actual panic attacks. I know this might sound ridiculous without the context. This is because that person didn't understand how clever the tech behind targeting was and assumed that the ads were related to their partner cheating on them. It's irrational, I know, but we're talking about someone who is psychologically vulnerable.
I'd still say that 1. is a more important argument here, 2. just follows the line of thinking presented in your comment. (the main problem behind 2. is that person's mental state and the actions of their abuser, yet the amount of suffering that could've been removed is not negligible.)
> Even in this case with phone numbers and other data leaked, so what? What harm do data leaks cause?
Cambridge Analytica, voter manipulation, bias in behavioural targeting, increased polarisation in media--please Google these queries and educate yourself. There's a tonne of resources on the subject, including peer reviewed academic papers.
> targeted behavioural advertising is proven to increase polarisation, literally turning people against each other.
Can you provide some evidence for this please? Certainly, filter bubbles make it easier for people to radicalise themselves, but I've not seen very much evidence that it's specifically the advertising.
And polarisation in (US) media has been underway since long before Mark Zuckerberg left elementary school.
I guarantee you that the majority of the population does not understand or care about your #1.
And I expect that the majority of the population has not experienced the horror of your #2.
If the majority (in this case, likely vast majority) doesn't care about something, there probably is not going to end up being any public policy protecting against it.
It's a common misconception that the purpose of security is to provide privacy. I'll deal with that first, then we'll get on to the comment thread.
Information security can be about trust, i.e. I trust that person A sent this message because of X, y, z. I also trust that the message hasn't been tampered with because of X, y, z.
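To make the "trust without privacy" idea concrete, here is a minimal sketch in Python of message authentication with an HMAC. The shared key, the message text and the function names are all hypothetical, and it assumes sender and receiver already share a secret. Note that the message itself is sent in the clear: the tag only shows who could have produced it and that it wasn't altered.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for the message under the shared key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Check the tag in constant time; True means the message is unmodified
    and was tagged by someone holding the same key."""
    return hmac.compare_digest(sign(key, message), tag)


# Hypothetical values, for illustration only.
shared_key = b"hypothetical-shared-secret"
message = b"meet at noon"
tag = sign(shared_key, message)

print(verify(shared_key, message, tag))         # genuine message
print(verify(shared_key, b"meet at one", tag))  # tampered message
```

Anyone on the wire can still read "meet at noon", so nothing here is private; only origin and integrity are being checked.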
Privacy is a sub/side topic of information security. E.g. keeping all network connection data about an individual obfuscated at all times i.e. All data is kept hidden in a way that cannot be made unhidden.
Privacy is part of information security, and serves to ensure certain systems could be considered secure in certain cases (depends on the threat model/requirements of the system).
Basically, you've got it the wrong way around. Privacy (as a purely technical idea) exists to keep some information secure in certain cases.
Recent Fawkes paper is a good example of privacy as a security consideration.
Now for a case where it doesn't matter...
Whenever you're asked to run an MD5 hash check of a file you've just downloaded, that's an example of authentication/verification.
Doesn't matter if someone has seen that you've downloaded the file, just that the downloaded file is correct (for you).
Good example is Linux OS distribution ISOs.
Privacy doesn't really matter in that case (depending on your threat model); what matters is that the file you've downloaded matches what you wanted to download. No-one intercepted and tampered with the data in transit.
You can trust the data that you've downloaded.
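For anyone who hasn't done one of these checks by hand, a rough sketch in Python follows. The function names are made up for illustration, and defaulting to SHA-256 is my assumption: MD5 still catches accidental corruption, but it has practical collision attacks, so most distributions publish SHA-256 sums these days. The check is also only as trustworthy as the place you got the published hash from.

```python
import hashlib


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large ISOs don't have to fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published(path: str, published_hex: str, algorithm: str = "sha256") -> bool:
    """Compare the local digest with the one published by the distributor."""
    return file_digest(path, algorithm) == published_hex.strip().lower()
```

Usage would be something like `matches_published("distro.iso", "<hash from the download page>")`, where both the file name and the hash are placeholders. Passing `algorithm="md5"` gives the MD5 check mentioned above.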
It doesn't matter if Mr FBI saw that I downloaded it, because it's not illegal. So why waste energy and resources on solving a problem that's not a problem?!
Now on to protection of confidential data...
Facebook is actually a good example of this. Most people are not anonymous on there. You can search and find people (depending on settings). Privacy, in that sense, is not provided.
However, they do (or are supposed to) keep our data protected from external malicious adversaries, whilst not making it completely private to everyone.
I can see my friends' information, it is not private. It is, however, supposed to be protected and kept safe e.g. a credit card number.
A credit card number can't be completely obfuscated because then it can't be used. Instead, that personal information should be protected.
Now, in relation to the parent of the parent of the.....
The point of the comment, and I agreed with it, is that if personal information is leaked to the public -- that's not privacy, it is improper confidential data access -- really bad things can happen.
I can call that number every 2 minutes to perform a denial of service attack (eventually they'll turn their phone off, no more phone service!).
I could send horrific child porn to that number.
I could do X, y, z with a phone number.
I don't need passwords and encryption keys or zero day access to your hardened Linux box to fuck up your life.
I can do it with a phone number.
And here's the real kicker --- I don't even know who this person is! They're anonymous to me. Their privacy is mostly intact, but I've got access to confidential information which means I can fuck up their life regardless.
So your point of "well, why don't they just give out access to ALL the confidential information" was, actually, kind of on point!
That's exactly the kind of data we definitely do not want out in the wild. That's extremely sensitive data with which I could cause absolute havoc!
Where you fell down was the "leak all of it cos why not". One tiny piece of leaked confidential data can be massively dangerous. That was the point of the comment.
One tiny piece of data and I can ruin your life. I don't need everything, just one thing. One phone number.
Hopefully that was helpful. It's all a shade of grey depending on your threat model tbh.
These events are not a matter of if but when. And since the overwhelming majority of the people in my social circles have zero understanding of the real nature of the relationship between them (FB users) and FB, I just hope this will become an increasingly frequent and painful experience for them. As in: I really hope this will get FB users in trouble as a result of identity theft etc.
This may sound extremely cynical but at this point it's the only way for the non-technical folk to understand the implications of giving away your privacy so that you can share cat pictures with other people.
> people in my social circles ... I just hope this will become increasingly frequent and painful experience for them.
Very strange to wish harm upon your friends with the hope that that will convince them to join your side in a political fight! I would suggest instead that you only wish that if it becomes a painful experience, they would realize why and renegotiate their relationship with FB. Typically wishing pain on your friends is not a good stance.
It's a pretty minor harm and it's one somewhat like ripping a band-aid off. The pain will come sooner or later since we (at least in the US) aren't addressing the irresponsible data practices in industry. The sooner people detach themselves from the likes of FB, the better off they'll be when leaks happen.
Not that strange. The whole "rock bottom" concept for addicts is similar, right? Sometimes you have to see a friend or family member truly experience real pain to get them to want to change. People are like that.
The sad fact is that as much as I wanted to believe that positive reinforcement was "better" for me because it was supposedly "better" for people in general, in practice it's only ever been negative reinforcement that has enacted any change in my life. Trying to deny that fact for so long only accomplished setting my life back by several years. Even the simplest things like dental hygiene only became habits because I suffered catastrophic losses from neglecting them.
I think it's because my imagination of the failing scenario will never compare to the experience of the failure itself. Whereas if there's no singular point at which the failure becomes obvious and decidedly life-changing, then...
I think it would take more than this being leaked. If users had their 'private' messages on these services leaked, then they would start to realize it.
I think most normal people acknowledge that so many companies know their phone number and name that they may be past caring.