267M Facebook users IDs and phone numbers exposed online (comparitech.com)
175 points by JeanMarcS 34 days ago | 65 comments



There's a lot of very obvious "didn't bother reading the article but I'm going to comment on the headline" behaviour in this thread.

FB users put their details on their publicly accessible FB, someone ran a scraper across FB for publicly accessible info and dumped it into an insecure elasticsearch cluster and a researcher found that cluster.

How is FB at fault there? I say this as someone who has colossal issues with that company in general.


Facebook have a history of making settings default to little or no privacy, making those settings obscure and difficult to set, and changing those settings and their defaults faster than most people can keep up. This data dump is the not at all surprising result of Facebook's policies.

I remember the Zuckerberg family being caught out by Facebook's settings over a photo and complaining when the photo spread: https://gizmodo.com/randi-zuckerberg-is-just-as-confused-by-...


I'm not sure I agree. I've had a FB profile, that I admittedly don't look at often, which I've had set to private for its duration and can't remember a time where I've had to jump into the security settings to change something due to my private information suddenly becoming public.

We can argue that the core UI is trivial and hence non-technical people jump on and essentially make user errors, but I'd be inclined to put the onus on people using a private platform to host their private information to ensure they've set up their account in a way that reflects the settings they want.

Don't get me wrong here either, if someone hacks a platform and extracts private information, or if the platform's infrastructure is misconfigured and exposes private information, then by all means I want the company responsible held to account, which is something that currently does not really happen. I'm just hesitant to blame FB for users misconfiguring their accounts and outside actors taking advantage of that misconfiguration.


I disagree, this seems to require users to read and understand the EULA, navigate all the settings (and again each time they change) and have a high degree of information op-sec for the things that aren't mentioned in the EULA or settings (such as 1 pixel tracker images on third party websites). This would have to be done for each and every service you come into contact with.

People may not be able to opt out of these services. Many people will buy phones and devices and, when they receive them, discover that the Facebook app and similar apps are pre-installed and cannot be removed. Additionally, here in the UK many schools use Facebook to communicate important information to parents. Even without semi-official requirements, the effect of so many people and organisations running social interactions through Facebook means that people have little choice but to join themselves.

I think we need both legal requirements and social standards about how these things are handled to make things like Facebook safe by default, not safe if you and the people you know put effort and expertise into controlling your exposure.

PS. I haven't downvoted your comments, I don't know why one of them looks to have been downvoted.


> Diachenko believes the trove of data is most likely the result of an illegal scraping operation or Facebook API abuse by criminals in Vietnam, according to the evidence.

The only ones capable of preventing either the scraping operation or the API abuse would be Facebook. Scraping is an arms race, but I certainly don't trust Facebook to care about protecting my data, except where it would infringe on their ability to sell it. If it's "API abuse," that's definitely on Facebook to prevent.


Wasn't facebook pestering me for my phone number for "extra security" a while back? Seems like the exact opposite.


This is like the Cambridge Analytica scandal: people allowed 3rd party apps to access their data and then they complain when, eh, they had their data.

Solution? Facebook closed the API. And now people complain that Facebook is a silo and they hold onto your data and they don't allow 3rd party apps to access it.


That’s an oversimplification. The Facebook API had a gaping hole in it that allowed third party apps to access data on the authorized user AND friends.

Facebook discovers these kinds of issues on a regular basis. The idea that they’ve clamped down on this type of thing is the joke of 2019.


Facebook could have restricted 3rd party access to only those applications that had given permission, or provided a way to manually export it so users would become familiar with a process they should probably know anyway.


Or they could have a sound app review process like Apple.


Last time I tried, Twitter would disable your account if you didn’t provide a phone number.


You can open a support ticket and they'll restore it and include a BS apology about how your account was supposedly a false-positive of their anti-abuse system (where in reality it's just a way to harvest phone numbers out of easily coercible victims).

However, the main question is: why would you ever give your time to such a toxic platform especially when it's explicitly working against you like in this case? Just move on and give them the finger.


> why would you ever give your time to such a toxic platform

Because a bunch of people you know are on the toxic platform.


A good idea if you want your personal relationships poisoned.


> A good idea if you want your personal relationships poisoned.

not how it works for me! i have friends on fb that i haven't talked to or seen since high school. i don't use fb proper but it's good to know that it remembers everyone so i don't have to. i guess it goes the other way around as well.


> it's good to know that it remembers everyone so i don't have to

There is something wrong with this for me. If you don't remember people, are they really worth keeping around?

It is a great way to reach someone you want to get in touch with after a really long time, but that scenario is far more rare than people claim when they speak of the 'magic of facebook'.


> If you don't remember people, are they really worth keeping around

How in the world are you ever going to know if you never contact them again?


:) I guess I just keep a select group of friends that I want to be with. What I don't know won't kill me and all that, but fair enough. Each to their own!


It's not that rare. I met some friends in middle school about 10 years ago. When I moved to the US, FB was the place to reconnect with them because we had kept our FB connections. A lot of people around me are in the same situation. Like the comment above says, you never know.


Doesn't make it a less toxic place in any meaning. If somebody is such a diva they have an active presence on Twitter, there are almost always other means to connect.


When the Guardian closes its own comments section (usually when they are heavily pushing opinion without evidence) you can usually comment on Twitter.


> why would you ever give your time to such a toxic platform

Did you ask this back in 2008?

As someone who was ridiculed, ostracised and made fun of at uni for not using social media because I didn’t want to submit my data online, it’s nice to find people are finally starting to realise this; just twelve years too late.


I agree, it’s a choice where you feel the consequences pretty much every day when it comes to Facebook or Instagram.

But Twitter? Pretty much nobody uses it for day-to-day communication, so you can opt-out with no consequences.


I think it has changed recently: https://www.theverge.com/2019/11/22/20977436/twitter-2fa-pho...

Still, it's a shame it took them so long to implement a proper 2FA.


Can't confirm. Registered a new account a few days ago (deleted my old one years ago). Next time I tried to log in, they asked me for a phone number. I declined and didn't log in. Now the account is gone.


>Seems like the exact opposite.

Worse still, they used the numbers provided for ad targeting.

https://techcrunch.com/2018/09/27/yes-facebook-is-using-your...


Just a reminder that Comparitech "pays" security researchers for "data breaches" and most likely encourages people to report these things to them without getting servers patched: https://twitter.com/securinti/status/1196850409924681728

No offense, but if you need to "pay" for your researcher, you're probably not that ethical and are most likely behind some intentional offensive hacking, so people can make money off your back.


To me it just sounds like they're offering a bespoke bounty programme.

If you can assume that they are reporting the exploits or breaches through the right channel, it might actually be more convenient for bounty hunters to have 1 place to funnel them all into.

If Comparitech also makes some profit off their reporting of the breaches then you can start to get an idea of where they're getting some funding from.

I am fine with this practice. It incentivises more grey/white hat eyes on potential breaches. And in my book, that's never a bad thing.

Given how public they are about their methods and approach, I will give them the benefit of the doubt for now.


Facebook has fundamentally lost control of their infrastructure. It is insanity. There are now VPNs out of Hong Kong operating out of FB ASN space. I truly have never seen anything like this in my life.

At FB the morale has collapsed. The support forums and bug bounty submissions are piling up and have been for weeks.

FB cannot and will not act. It is a problem of leadership not engineering and I have tremendous respect for nearly all of the staff there.

That being said, the fact that Facebook continues to ignore that servers in Vietnam are hosting what appears to be all 71 million records of the Vietnamese people is shocking. If you are a Muslim in Vietnam the information is shockingly detailed.

http://125.212.244.27:9200/_cat/indices


Isn't that actually another thing to worry about: what is going to happen to all this data when FB eventually goes bankrupt? Seems hard to believe they're just going to delete it from their servers...


> There are now VPNs out of Hong Kong operating out of FB ASN space.

ELI5?


Five-year-olds don't need to understand BGP.


Slight reminder that Whatsapp is Facebook owned and that one is based on phone numbers only. Talk about phone numbers, heh?


Everyone is at fault except Facebook. Vietnam, illegal scraping, criminals.


As much as I hate Facebook, I can't really blame them for someone scraping data users decided to share publicly. When you publish data online available publicly it's normal and expected that someone can make a copy of it, either through manual data-entry or automated scraping.

The only question here is how were emails & phone numbers obtained and whether users were made aware that they would be available publicly.


The consequences of sharing data do not appear immediately. Delayed effects of things are never going to be simple to deal with. People always have been and always will be hard-wired to keep on doing things if they aren't immediately struck down.


I'm sorry but you cannot expect every user to understand the implications of putting data online. Facebook makes zero effort to protect your grandma who has no idea of the implications of putting info online. Blame facebook.

This is a terrible trend of "well you agreed to the terms and conditions so this is YOUR fault", when certain T's and C's shouldn't exist in the first place. IIRC the CEO of twitter even thinks this is bullshit.

We need to blame citizens, consumers, and users LESS than corporations, not more.


> Facebook makes zero effort to protect your grandma

How much effort do you put in to protect your grandma?


She is dead


Maybe you should care for your grandma and not expect a megacorporation to parent her. Maybe grandma shouldn't be using the internet.


There's so much heat about that problem because it's not that simple. Implications of technology changes are not always obvious even to HN users and they are definitely not clear for the majority of people.

So yes, we need something cleaner than tons of ToS that's continually changing and we also need more educated people that know that if they put something online it's now public.

My comment is not saying anything because a balanced view is like zero information. But the same is going on with politics. People tend to polarize into two camps. You can make a good argument for both. By rationalizing away the other one you get a sense of self-coherence and some dopamine because yay brain we solved that issue, it's that simple.

But agreeing on small steps that seem like a good direction, and understanding that some things that don't seem like good ultimate solutions may be good local optimizations, is hard. Plus it doesn't grab attention. The attention economy applies even in HN threads. I'm most eager to respond to views that I heavily disagree with; balanced comments get out of the way.

E.g. if I ask you if you know Bob and you say yes, I don't think you would say you violated Bob's privacy (you probably would think that if you gave me his number). So who you are friends with commonly feels like info that is fine to share. But if I ask everybody on the planet, I know all the friends of Bob, and he may not be fine with that. I think both technology and mindset improvements are necessary. Regulations probably too, but it's hopeless how much they are lagging, and understandably so given the highly technical nature of most of these problems and centralized law making.

Voting with your behavior like not using fb hardly works (see Bob's problem above, but also power law and everything being connected) and we still haven't figured out how to punish bad behavior by big corporations. Losing some money is not an issue, and how can you put somebody in jail if the crime was emergent and hundreds of people participated without necessarily knowing anything about it?


Is there a way to check if your data was in this database? Is it on haveibeenpwned yet?


“product:elastic country:vn” on Shodan. From there grab the IP address and if not over 1000 shards you can simply do...

IP:9200/_search?q=myName


Yes. You can check by attempting to login to Facebook.

If you have an account your data has been in a leak.


I bet, even if I deleted my account 2 and a half years ago my data can be found there.


You did what? There’s obviously no delete on Facebook, only an isDeleted column in the db.
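
For anyone unfamiliar with the pattern being joked about here, a minimal sketch of "soft delete" versus real deletion follows. It says nothing about how Facebook actually stores accounts; the `accounts` table, its columns, and the helper names are all hypothetical and exist only to illustrate the idea using Python's built-in `sqlite3` module.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one hypothetical account row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "id INTEGER PRIMARY KEY, name TEXT, phone TEXT, "
        "is_deleted INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "INSERT INTO accounts (name, phone) VALUES (?, ?)",
        ("alice", "555-0100"),
    )
    conn.commit()
    return conn


def soft_delete(conn, account_id):
    """Flag the row as deleted; the data itself stays in the table."""
    conn.execute("UPDATE accounts SET is_deleted = 1 WHERE id = ?", (account_id,))
    conn.commit()


def hard_delete(conn, account_id):
    """Actually remove the row from the table."""
    conn.execute("DELETE FROM accounts WHERE id = ?", (account_id,))
    conn.commit()


def visible_accounts(conn):
    """What the application shows: only rows not flagged as deleted."""
    return conn.execute("SELECT name FROM accounts WHERE is_deleted = 0").fetchall()


def stored_accounts(conn):
    """What is physically stored, regardless of the flag."""
    return conn.execute("SELECT name, phone FROM accounts").fetchall()
```

After `soft_delete`, the account disappears from `visible_accounts` but the name and phone number are still returned by `stored_accounts`; only `hard_delete` removes them.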


I've opted to delete my account, according to the EU laws, Facebook must delete all my data. But I'm sure Mr. Z keeps everything somewhere…


how is that obvious


> This will reduce the chances of your profile being scraped by third parties, but the only way to ensure it never happens again is to completely deactivate or delete your Facebook account.

Translation: the only way to have an account is to not have an account.


This is what happens when you centralize data - it leaks


There have been a few Elasticsearch data leaks recently. I do not know the product. Is it insecure by default like MongoDB?


Yes and changing it is not easy. In particular securing it on AWS has reached totally preposterous levels. I am a Pro Architect Cert holder and I am routinely baffled by the docs there.
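
If you run a cluster yourself, one rough way to see whether it answers without credentials is to request the `_cat/indices` endpoint on the default port 9200 and look at the HTTP status. The sketch below uses only the Python standard library; the function names and the three-way classification are my own assumptions, and it assumes a cluster with security enabled rejects anonymous requests with 401 or 403. Only point it at hosts you are responsible for.

```python
import urllib.error
import urllib.request


def indices_url(host, port=9200):
    """Build the URL of the index listing endpoint (9200 is the default HTTP port)."""
    return f"http://{host}:{port}/_cat/indices"


def classify_status(status):
    """Map an HTTP status code to a rough exposure verdict (hypothetical labels)."""
    if status == 200:
        return "open"           # answered without any credentials
    if status in (401, 403):
        return "auth-required"  # the cluster asked for credentials
    return "unknown"


def check_cluster(host, port=9200, timeout=5):
    """Send one unauthenticated request to a cluster you operate and classify the reply."""
    try:
        with urllib.request.urlopen(indices_url(host, port), timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx replies; the code is still informative.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

For example, `check_cluster("localhost")` on your own machine returns one of `"open"`, `"auth-required"`, `"unknown"` or `"unreachable"`. A result of `"open"` on a publicly routable address is the situation described in the article.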


Their developers should take as much heat as the guys who left that data in the open, then, in my opinion.


The author describes himself as: "TECH WRITER, PRIVACY ADVOCATE AND VPN EXPERT" (capitalization from source)

"...the trove of data is most likely the result of an illegal scraping operation or Facebook API abuse by criminals..."

More cyber alarmism. What would these "VPN experts" say to a phone directory?

He goes on to describe how this was reported as abuse to the service provider instead of notifying the owners of the DB.

Finally he concludes that users can manage their privacy settings from within Facebook. Thereby acknowledging that users can manage their data or have chosen to provide it publicly.

The cyber-alarmism trend from self appointed security experts has gone too far.


It hasn’t gone far enough; not by a looooong shot.

Billions of compromising documents, photos and personal details are now sitting around on the servers of a half dozen for-profit companies.

Only Equifax has given us a taste of what is in store.

Is the world prepared for the day when a trillion Gmail messages leak? Billions of personal camera-roll photos? Trillions of search history entries?

We need to start taking these issues seriously.


Sure take it seriously, but don't trivialize the issue with alarmism. A few decades ago more information (Full name, phone number and home address) was available in phone directories. These were commonly distributed to every home with phone service.

Today we have security-expert-journalists calling the equivalent of a phone book a "verified threat incident".

I'd wager that gmail data is already available to the intelligence services of the five-eyes countries.

What does taking it seriously mean in this case? Not using Facebook? Acknowledging that data posted publicly is, wait for it... public?

To top it off the bread and butter of this blog is affiliate links to VPN providers. Another centralization of data for those seeking privacy. Not only is this contradictory, but it preys on the ignorance of the audience.

When companies sell this same data to employers it is called OSINT. When someone finds this data through Shodan and it is hosted in Vietnam they call it cyber-crime. Many times the OSINT groups are the same ones making the accusations.


I hear this analogy used sometimes but modern data isn't equivalent to phone book data. You had the option to be unlisted and it was respected. You also didn't have your name and address tied to so much personal data in one place. And do we know this data was public? I don't use FB but I know you have options to control who can see what you post, whether it's public or private.

Sure, if you're the HN crowd and assume every database is penetrable, this may not be a "threat incident", but it is when you're Grandma Jones, who assumes that when she ticks the "private" box on her posts they'll be private. If you care about infosec but are still ignorant of just how incompetent these companies can be, then this breach is a "verified threat incident", and there's no reason to cry alarmism just because someone mentioned that it happened.


    Zuckerberg: Yeah so if you ever need info about anyone at Harvard

    Zuckerberg: Just ask.

    Zuckerberg: I have over 4,000 emails, pictures, addresses, SNS

    [Redacted Friend's Name]: What? How'd you manage that one?

    Zuckerberg: People just submitted it.

    Zuckerberg: I don't know why.

    Zuckerberg: They "trust me"

    Zuckerberg: Dumb fucks.
https://www.businessinsider.com/well-these-new-zuckerberg-im...


No no, you see it's the users' fault for... uh... using a service promising to protect them?

Yes indeed, putting information online that you expect some level of privacy for was your fault, because someone put it on display for everyone to see. Mark is never at fault.


What this highlights is that it is damn simple to be a poor developer yet achieve a particular goal. You can brute force your way towards that goal, ignoring any sort of costly 'useless' security, usability or user privacy aspects. Even more so if you're a criminal. GDPR|CCPA < INTERPOL!

This is never going to end. This is true for criminal orgs but also legit businesses that despite regulations will mostly prioritize features to their customers over less tangible/monetizable value like hardened infrastructure and updated software.

Maybe I'm wrong and this cluster was left exposed for another reason, though.


Facebook only has an issue with people getting the data for free.


just a matter of time before messages are exposed too, ruining a lot of lives in the process. One by one, all castles will fall.


At this point, in 2019, does anyone truly care about leaks anymore? I feel like this is becoming a new norm.


I am so sick of getting nuisance calls that I just don’t answer numbers I don’t recognise or expect anymore (thanks OVH!). If someone leaks my phone number another time I’d still be pretty pissed off.


Normalization of Deviance[0]. Even if people are acclimated to it, it's still a problem.

[0]: https://www.ncbi.nlm.nih.gov/pubmed/25742063



