
Millions of Instagram influencers had their contact data scraped and exposed - idlewords
https://techcrunch.com/2019/05/20/instagram-influencer-celebrity-accounts-scraped/
======
astrea
This makes me wonder: How many Instagram influencers are there in total? What
is the percentage of influencers in the entire population of (active) users?
At some point it's just influencers influencing each other, right?

~~~
oldjokes
I love how we're all just using this invented marketing term "influencer" as
if it's actually a real thing (it is not a real thing).

It is slightly less obnoxious than just calling it "first class" vs.
"commoners" instagram, but still pretty obnoxious nonetheless.

~~~
hn_throwaway_99
> I love how we're all just using this invented marketing term "influencer" as
> if it's actually a real thing (it is not a real thing).

That statement is ridiculous to me - actually more ridiculous than I find the
whole "influencer" culture in the first place.

"Influencer" absolutely is a real thing. The reason brands are willing to pay
many thousands of dollars to get something as simple as a tweet, story or post
about a product is quite simply because it works. On the other side, Snapchat
lost $1.3 _billion_ in valuation when Kylie Jenner tweeted "sooo does anyone
else not open Snapchat anymore? Or is it just me... ugh this is so sad." Seems
like "influencer" very accurately describes the role.

~~~
manigandham
Famous people with influence have existed for centuries. There's nothing new
here, other than more people claiming they can do it with fake numbers and the
rise of talent agencies 2.0.

~~~
hn_throwaway_99
> Famous people with influence have existed for centuries. There's nothing new
> here

Baloney. How many teenagers could make literally millions of dollars in
centuries past based primarily on their relatability (in addition to their
video game or makeup application skills). Modern social networks and YouTube
have opened up an entirely new way not to just _have_ influence, but to
monetize, productize and quantify it.

I get it, you probably think it's all dumb and pointless, but to say it's
"nothing new" is like saying the internet is nothing new because people have
been sending messages back and forth for centuries.

~~~
manigandham
"Influencer" is just a buzzword. Influence itself is not a new concept. Fame
is something that provides influence, and can be gained by being recognized
for something and growing a following.

Sure there are new avenues to create and connect with people to gain that
recognition, along with increased opportunity from a bigger population and
lower costs, but how fame works and the monetization of that fame is a very
old concept. There's nothing new there.

Study the principles, not the buzzwords. You can find all this in a
50-year-old marketing book.

------
jedberg
> Security researcher Anurag Sen discovered the database and alerted
> TechCrunch in an effort to find the owner and get the database secured.

Wouldn't it make more sense to contact AWS, who presumably has the contact
info of the owner?

~~~
throwaway13000
AWS does not provide Anurag Sen with any exposure/advertising. (Note: I don't
think this is wrong. Everyone has to earn a living)

~~~
jedberg
This may in fact be the right answer. Which is sad if true.

Unless he tried AWS first and got nowhere. Then Techcrunch would be a
reasonable place to go.

------
downandout
It should be noted that most of this data (including emails and phone numbers)
is published publicly by the influencers themselves. This headline is
clickbait. Thousands of scrapers are constantly running all around the world,
scraping any data that you choose to make public on social networks. Don’t
make your phone number or email public if you don’t like this.

~~~
HNLurker2
I would like to know the phone number and location though.

~~~
downandout
Many influencers include this info in their bio.

------
oooshha
Interesting... Some of the "influencer search" platforms, e.g. Heepsy, allow
users to download a limited number of email/phone details for influencers. I
always wondered how they managed to get this info in bulk to begin with.

~~~
r_singh
I mean just like Twitter data is public, I always assumed that public
Instagram profiles are public data too. Everyone can see their posts, likes,
comments, followers and following, so what stops a computer from doing so?

Update: it's against FB's T&C to scrape and store this publicly viewable data.
Quite convenient, they'll only sue / take action when a rule breaker becomes
non-trivial in size or a nuisance

------
wgj
How is Techcrunch able to figure out the owner of this database, and why did
the researcher expect them to be able to?

> Security researcher Anurag Sen discovered the database and alerted
> TechCrunch in an effort to find the owner and get the database secured. We
> traced the database back to Mumbai-based social media marketing firm Chtrbox

~~~
askafriend
> How is Techcrunch able to figure out the owner of this database, and why did
> the researcher expect them to be able to?

Relationships in the industry and resources.

~~~
ummonk
As in someone at AWS leaked private account information?

------
kitotik
It seems that the data was not all public, so this was more than just
aggregated scraped public info.

So that means Chtrbox did some combination of:

\- Directly exploited the Instagram API bug before it was patched

\- Purchased the leaked data from a 3rd party

edit: formatting

~~~
Sendotsh
> At the time of writing, the database had over 49 million records — but was
> growing by the hour.

This implies it's still going, so not related to the patched bug.

Which begs the question... how are they now, currently, scraping and adding
people's phone numbers from public accounts?

~~~
helloindia
Due to the lack of strong data protection law and awareness in India, I won't
be surprised if it's been purchased from a 3rd party.

There are some "Indian startup" groups on Facebook, where it's common for
people to sell such databases, and nobody asks if the seller has consent from
the people in the database. Such posts also never get taken down, and the
seller doesn't get blocked from the group.

------
sdeep27
Naive question - What is the process like for a security researcher to go
about 'discovering' an open database like this?

~~~
kitotik
I’d imagine it starts by port scanning known AWS IP blocks.
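
As a rough illustration of that first step: AWS publishes its address ranges
as a JSON file (ip-ranges.json), where each entry carries an `ip_prefix`, a
`region` and a `service`. The sketch below only parses a small inline sample
in that shape and works out which blocks belong to a service. The sample
prefixes are reserved documentation ranges made up for illustration, the
helper names `blocks_for` and `owning_block` are hypothetical, and nothing
here touches the network or scans anything.

```python
import ipaddress
import json

# Hypothetical sample shaped like AWS's published ip-ranges.json.
# The prefixes are reserved documentation ranges, not real AWS blocks.
SAMPLE = """
{
  "prefixes": [
    {"ip_prefix": "192.0.2.0/24", "region": "us-east-1", "service": "EC2"},
    {"ip_prefix": "198.51.100.0/25", "region": "ap-south-1", "service": "EC2"},
    {"ip_prefix": "203.0.113.0/24", "region": "ap-south-1", "service": "S3"}
  ]
}
"""


def blocks_for(doc, service, region=None):
    """Return the IPv4 networks listed for a service, optionally by region."""
    nets = []
    for entry in doc["prefixes"]:
        if entry["service"] != service:
            continue
        if region is not None and entry["region"] != region:
            continue
        nets.append(ipaddress.ip_network(entry["ip_prefix"]))
    return nets


def owning_block(ip, nets):
    """Return the first network that contains ip, or None if there is none."""
    addr = ipaddress.ip_address(ip)
    for net in nets:
        if addr in net:
            return net
    return None


doc = json.loads(SAMPLE)
ec2_blocks = blocks_for(doc, "EC2")
# Size of the address space a scanner would have to cover for this sample.
total_addresses = sum(net.num_addresses for net in ec2_blocks)
print(len(ec2_blocks), "blocks,", total_addresses, "addresses")
```

Whatever a researcher does after this (probing for open database ports) is
the part the legal question below is about; the listing of blocks itself is
public information.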

~~~
procinct
Isn’t this illegal? What happens if the owner decides to press charges instead
of being thankful?

~~~
chefkoch
You get so many scans and failed logins that you can never press charges
against them all. Imagine the reaction of law enforcement if you show up with
10,000 login attempts per day and you want to press charges.

------
jmspring
It speaks a lot to modern culture that "influencer" is a thing. In every
article I've read about influencers, the majority sound like entitled, spoiled
children.

~~~
rchaud
Couldn't the same thing be said about '90s grunge/alternative bands and
rap/hiphop groups? They were "influencers" as well, with a clear impact in art
and fashion. And they crafted their legend in part by wrecking hotel rooms,
arriving hours late to shows and making public spectacles of themselves.

The common thread among them is that they were polarizing. It was OK that lots
of people hated them and would never buy anything they were pitching, as long
as there were thousands of others who went the exact opposite way and bought
in to the whole charade.

------
teekert
Wow, there are millions of influencers? Some must be influencing just 1-5
people I guess then.

~~~
asadjb
The audiences for different influencers can overlap. I guess even the smallest
influencers I've seen have audiences of at least 1K to be taken seriously.

------
Cypher
isn't that what they want? how else can they influence if they're not
contacted..

------
dvfjsdhgfv
How is this a story? The contact information is available, and people and
companies scrape the web all the time. There are tools and libraries for that
purpose for any programming language. What's so surprising about this piece of
news then?

~~~
save_ferris
> each record contained public data scraped from influencer Instagram
> accounts, including their bio, profile picture, the number of followers they
> have, if they’re verified and their location by city and country, but also
> contained their private contact information, such as the Instagram account
> owner’s email address and phone number.

They make a point to note that some of the contact info exposed wasn't public.

~~~
benzible
Sloppy reporting. Email & phone number are available for Instagram accounts
linked to a business profile: "Business Profiles include a Contact button near
the top of their profile. You'll be able to include directions, a phone number
and / or an email address. Keep in mind that you must include at least 1
contact option when setting up your Business Profile."

------
backpackway
Don't get this.

All influencers and wannabes on IG expose their email in their about section
on purpose.

Crawling them incl. further specs like followers, posts is the first thing
every Instagram marketer does/should do.

Before GDPR, there was even an open API which gave everything out.

------
cheeyoonlee
Looks like nearly 50 million accounts have been exposed. That's an alarming
amount.. most of which Chtrbox had obtained/scraped themselves without the
account owners knowing.

~~~
781
According to the recent LinkedIn court case, it's legal (at least in the US).

[https://arstechnica.com/tech-policy/2017/08/court-rejects-li...](https://arstechnica.com/tech-policy/2017/08/court-rejects-linkedin-claim-that-unauthorized-scraping-is-hacking/)

------
kot-behemoth
This looks like the perfect time for some of the EU-based influencers to raise
a GDPR infringement request against Chtrbox. They collected the geo-location
of the people, so the company should've known they would be liable.

------
willdotphipps
micro influencers, the data's useless.. it's against Facebook T's and C's to
scrape from Instagram.

------
willdotphipps
Birmingham that's what it was called

------
nippler
Lmao who cares

------
781
Make your contact information public. Companies collect it. Surprise Pikachu :o

In the past, people used to call this crazy technology "The Phone Book".

This stuff is also on Google: "1-800" site:instagram.com

~~~
detaro
According to the article, people did not publish that information themselves,
which rules that out as a possible source.

~~~
izzydata
Then how was it scraped? It sounds more like a security leak of some sort if
it wasn't publicly accessible anywhere.

~~~
9HZZRfNlpR
I suggest you take a look at the page source in your browser and you'll find
out.

~~~
izzydata
I'm mostly suggesting that the word "scraped" is not being used correctly. If
they used an API security flaw to access all of the private data then I
wouldn't consider that scraping how the term is traditionally used.

~~~
barbecue_sauce
If that data is injected into the page (possibly without being visible), I
would say that still counts as scraping.

------
yhoneycomb
Semi off topic, but "influencers" are so absurd to me.

I mean, they're basically people who build a following based on being
attractive, right? And the idea is that other people want to do whatever they
do based on that? Seems so shallow.

Not gonna lie I follow hot people on instagram, but I definitely don't aspire
to be exactly like them.

~~~
Liquix
They're people who have X thousand or more followers. This gives them the
power to _influence_ their audience.

Yes, many got to where they are based on their looks. But there's also plenty
of talented photographers, 3D artists, traditional artists, craftspeople,
athletes, and hobbyists with large followings on IG.

Influencers are created and supported by followers like you. They don't gain
that status because they're particularly skilled at anything or good marketers
- they gain that status because people choose to follow them. If the majority
of Insta users valued higher-quality content over shallow looks, the talented
creators mentioned above would be the top influencers. It's a reflection of
the user base (and on some level society as a whole).

------
elorant
The article is misleading. The data wasn't scraped because nowhere in the
public profiles are emails or phone numbers visible. They were obviously
obtained by hacking Instagram.

~~~
Giroflex
It is, actually. Sometimes it's not visible but it's available in a neat JSON
format if you view page source.
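
To make that concrete, here is a minimal sketch of reading JSON that is
embedded in page source but never rendered on screen. The page markup, the
field names and the values are all invented for illustration and are not
Instagram's actual page structure; `EmbeddedJSONParser` is a hypothetical
helper built on Python's standard `html.parser`.

```python
import json
from html.parser import HTMLParser

# Hypothetical page source: markup, field names and values are invented
# for illustration and are not Instagram's actual page structure.
PAGE = """
<html><body>
<h1>example_account</h1>
<script type="application/json" id="profile-data">
{"username": "example_account", "followers": 1200,
 "business_email": "hello@example.com"}
</script>
</body></html>
"""


class EmbeddedJSONParser(HTMLParser):
    """Collect the contents of every <script type="application/json">."""

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self._chunks = []
        self.blobs = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._in_json_script = True
            self._chunks = []

    def handle_data(self, data):
        # Script bodies arrive here as raw text; keep them only while
        # inside a JSON script element.
        if self._in_json_script:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_script:
            self.blobs.append(json.loads("".join(self._chunks)))
            self._in_json_script = False


parser = EmbeddedJSONParser()
parser.feed(PAGE)
profile = parser.blobs[0]
print(profile["business_email"])
```

A browser would render only the heading from this sample page, yet the email
sits in the source for anything that reads the raw HTML.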

~~~
elorant
I stand corrected.

~~~
Dolores12
viewing source of webpage is not hacking.

