
540M Facebook Records Exposed - themattress
https://techcrunch.com/2019/04/03/facebook-records-exposed-server/
======
dang
[https://news.ycombinator.com/item?id=19565408](https://news.ycombinator.com/item?id=19565408)

------
testplzignore
The "540 million records" wording seems misleading (probably intentionally so,
on the part of UpGuard and/or TechCrunch). The screenshot on
[https://www.upguard.com/breaches/facebook-user-data-leak](https://www.upguard.com/breaches/facebook-user-data-leak)
leads me to think that this is 540m object records of various types (posts,
comments, etc.), not records of 540m distinct users, as some readers would
assume.

It sounds like a lot, but it's not. You could probably scrape that much data
from public Facebook pages in a few days without even being logged in,
especially a few years ago. Heck, you could say right now that Reddit has
billions of user records exposed if you define them that way. The Hacker News
first page itself links to thousands of user records :)

------
kerng
Amazing how a third party can harvest that amount of data and Facebook is
freely handing it out... they really have no control over the data they
process and handle. It's been shown again and again.

It seems Facebook should be forced to disable any kind of data sharing with
3rd parties since they obviously cannot make it work. They have enough issues
with the security of internal data handling procedures already that they have
to fix, before giving data to third parties.

~~~
SlowRobotAhead
>It seems Facebook should be forced to disable any kind of data sharing with
3rd parties since they obviously cannot make it work.

That is a massive part of their model, so that will never happen. The
alternative of course is to stop giving them data.

------
anonytrary
These things are very hard to stop. First law of the internet says that if you
have a public website, it will be scraped and turned into structured data.
Over the years, Facebook has been adding more options to make profiles
private, etc., but there are still loopholes around these things with 3rd party
"delegated" authentication.

~~~
torqueTorrent
They were moving so fast that they broke things, badly!

------
badwolf
Seems it was 3rd party apps data stored in ... openly accessible S3 buckets.
-_-

~~~
nvr219
Isn't this the most common reason for these leaks? Does Amazon not have
screaming red banners saying "this is gonna be openly accessible?"

~~~
themattress
No screaming red banner, but each bucket does indeed get marked with a large
orange rectangle that reads “Public”. It’s easily noticeable. In my limited
experience, a lot of the open buckets problem comes from the fact that access
to * is the path of least resistance vs a proper IAM and bucket config. When
feature work is on the line ain’t nobody got time for that :/

~~~
wangchungtonite
A lot of devs using AWS in this manner are also likely not using the web
interface and are instead operating from CLI pipelines, where the “Public”
marker is easy to miss.
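
For pipelines like that, a check can run on the policy document itself before it is applied. The following is only a rough sketch, assuming the usual S3 bucket policy JSON layout (`Statement`, `Effect`, `Principal`). The function name `public_statements` and the bucket name `example-bucket` are made up for illustration. It ignores `Condition` blocks and ACLs, so it can both over-report and under-report.

```python
import json


def public_statements(policy_json: str) -> list:
    """Return the Allow statements whose principal is the wildcard "*".

    Hypothetical helper: it only inspects the policy text and does not
    evaluate Condition blocks, ACLs, or account-level settings.
    """
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    # A policy may hold a single statement object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]

    flagged = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if principal == "*":
            flagged.append(stmt)
        elif isinstance(principal, dict):
            aws = principal.get("AWS")
            values = [aws] if isinstance(aws, str) else (aws or [])
            if "*" in values:
                flagged.append(stmt)
    return flagged


# Illustrative policy only; "example-bucket" is not a real resource.
EXAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
    }],
})

if __name__ == "__main__":
    for stmt in public_statements(EXAMPLE_POLICY):
        print("wildcard principal:", stmt["Action"], stmt["Resource"])
```

Failing the pipeline whenever the returned list is non-empty would at least force someone to look at the policy, which is roughly what the orange “Public” label does in the web console.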

------
collingreene
[https://www.facebook.com/data-abuse](https://www.facebook.com/data-abuse) -
as mentioned in the article, this scenario (non-FB companies mishandling FB
user data) is exactly the reason Facebook's data abuse bounty program exists.
Hopefully the finders of this submitted it to the program.

------
jakequist
tl;dr - Somebody scraped massive amounts of FB data over a number of years and
then abandoned it on a public S3 bucket.

------
mindfulplay
It's the 21st century and I think it's time to stop calling these 'records'.

~~~
nvr219
540M Facebook compact discs exposed

