
Collecting huge amounts of data with WhatsApp - jboynyc
https://www.lorankloeze.nl/2017/05/07/collecting-huge-amounts-of-data-with-whatsapp/
======
antirez
If you think there is no problem, you are wrong. The blog post does not show
all the information leaks that this implies. Example: I can modify the script
to monitor all the numbers I have in my phone, so that based on the
online/offline status, in a few weeks I may be able to guess who is having
conversations with whom, discovering cheating, work affairs, ...

EDIT: Practical example. After collecting enough data about user X, I build a
table of the probability of this user being online in each time range of a few
minutes. Then I compare how often that user is online with the online statuses
of another user Y. If the difference from the expected probability is
significant, then I can suspect the two are chatting.

Another thing I can use is the activation delay of the online status, since
often X sends a message to Y and, a few seconds later, Y comes online, and
then the reverse happens.
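
A minimal sketch of the comparison described in the EDIT above. It assumes the
collected presence data has already been reduced to sets of (day, bucket) slots
in which each user was seen online. The function name, the slot representation,
and the z-score test are illustrative assumptions, not something taken from the
blog post's script.

```python
from collections import defaultdict
from math import sqrt


def chat_suspicion_z(x_online, y_online, n_days):
    """Hypothetical helper: how much more often is X online while Y is online
    than X's own per-bucket baseline predicts?

    x_online, y_online: sets of (day, bucket) slots where the user was online.
    n_days: number of days observed.
    Returns a z-score; 0.0 if there is nothing to compare.
    """
    # Baseline table: probability that X is online in each time-of-day bucket.
    days_online = defaultdict(int)
    for _day, bucket in x_online:
        days_online[bucket] += 1
    p_x = {bucket: count / n_days for bucket, count in days_online.items()}

    # Slots where both users were online at once.
    observed = len(x_online & y_online)

    # Expected overlap if X simply followed its usual habits, independent of Y.
    expected = 0.0
    variance = 0.0
    for _day, bucket in y_online:
        p = p_x.get(bucket, 0.0)
        expected += p
        variance += p * (1.0 - p)

    if variance == 0:
        return 0.0
    return (observed - expected) / sqrt(variance)
```

A large positive score means X shows up alongside Y more than habit alone
explains. The baseline here includes any chats with Y, so it is only a rough
first pass.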

~~~
jamesfe
Well, maybe. I don't want to discount this concept entirely because some info
is being "leaked" if that is the right word, but...

Let's say one of your contacts chats a lot because it's a chatty person.
They're online far more than another person. What if that other person only
chats on the bus on the way to and from work at roughly the same time every
day to tell their wife they're on the way home. This activity will overlap
with the chatty person's activity all the time.

By your rationale, they are having a conversation, maybe cheating, and maybe
having a work affair.

I think the more active contacts a user has, the higher the probability that
your model predicts they are having a "conversation" with another user. You'll
probably find that your thresholds are really hard to fine-tune: maybe we say
A chats with B if abs(A.activeTime - B.activeTime) < threshold, but that
threshold is going to be _super hard to find_ and even _harder_ to
validate.
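
To make the objection concrete, here is a small sketch of that naive threshold
rule. The function name and the timestamps (seconds since midnight) are made up
for illustration only.

```python
def naive_matches(a_times, b_times, threshold):
    """Count A's online events that fall within `threshold` seconds of any of
    B's online events (the abs(A.activeTime - B.activeTime) < threshold rule)."""
    return sum(
        1 for a in a_times
        if any(abs(a - b) < threshold for b in b_times)
    )


# Hypothetical data: a chatty user online every 10 minutes all day,
# and a commuter online only at 08:00 and 18:00.
chatty = list(range(0, 24 * 3600, 600))
commuter = [8 * 3600, 18 * 3600]

# Every commuter event "matches" the chatty user with a 5-minute threshold,
# although nothing says the two ever talked to each other.
matches = naive_matches(commuter, chatty, 300)
```

With these numbers both of the commuter's events count as matches, which is
the false positive described above.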

Sure, there is some information here (the picture probably being the most
concretely weird) but the fact that you can just go to the App and check a box
for privacy means that this seems like not a huge issue.

Yes, WhatsApp made the software, but it's your responsibility to apply your own
privacy settings.

~~~
antirez
If you check the model I described in my comment, it should filter out the "bus
problem", since it will detect a chat only if user A is chatting more than the
standard "bus time" probability predicts when B is also chatting in the same
range. If you add to this that people on WhatsApp usually do not chat at
exactly the same minutes, it is definitely possible to create a robust system
for guessing, with good probability, whether two people often have
conversations. Also note that the input phone numbers are not random: they
belong to a connected circle of people. Add to this the fact that we can split
the ranges further, potentially down to a few minutes, and you can even detect
interesting stuff for people having continuous chats with multiple people, like
teenagers. Another thing that is probably possible is "group detection", since
on new messages a set of users will activate at the same time.
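
A rough sketch of what such group detection could look like, assuming the
collected data is a list of (user, timestamp) activation events. The function
name, the fixed 10-second window, and the size and count cut-offs are all
illustrative assumptions.

```python
from collections import Counter, defaultdict


def candidate_groups(activations, window=10, min_size=3, min_count=2):
    """Return sets of users that repeatedly come online in the same window.

    activations: iterable of (user, timestamp_in_seconds) events.
    window: width of each time window in seconds (assumed value).
    min_size: smallest set of users considered a possible group.
    min_count: how many windows the same set must appear in.
    """
    # Collect which users activated in each fixed time window.
    users_by_window = defaultdict(set)
    for user, ts in activations:
        users_by_window[ts // window].add(user)

    # Count how often exactly the same set of users activated together.
    counts = Counter(
        frozenset(users)
        for users in users_by_window.values()
        if len(users) >= min_size
    )
    return {group for group, count in counts.items() if count >= min_count}
```

Fixed window boundaries and exact set matching are crude: a real attempt would
need to tolerate members who are offline or slow to react.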

------
cryptarch
Just checked if I was affected and happily I found I'd set every possible
setting to private or disabled.

I figure Five Eyes already had this information. I think they would have tools
to decrypt all communications from any app, by using rooted phones that scan
their own memory for common crypto libraries and then extract the keys.

On the initial run it would not know where to look, and the phone would be set
up to go through a proxy that blocks all non-decryptable communications, to
avoid detection. A profile would be extracted to quickly and silently extract
the keys from the phone's memory and subsequently send them to the decrypting
proxy.

Then on the second run, the phone would be wiped/reset and the
decrypting/blocking proxy would attempt to decrypt the communications that are
now extracted from the phone in real-time. The wipe functions to avoid
detection (it makes it look like the phone is simply crashing). Perhaps the
wipe would include changing some device IDs and the source IP.

Rinse, repeat until only decryptable signals leave the phone.

(Something similar could be done with stubbing the encryption code in memory
and then "moving" it to the proxy.)

Either based on virtualization tools or on memory inspection. Or perhaps ring
-1 based.

The kind of tool I wish every techy had, so they could easily discover what
their apps are really doing.

_I've seen footage, I stay noided, I've seen footage, I stay-_

Edit: If you know of similar or related tooling, please let me know! I want
this software.

~~~
julioneander
If there was a way of tricking the app to think it is always online, the real
status could be obfuscated. WhatsApp probably uses some Android and iOS API to
know when it's open, tapping into that could confuse the app and make the
online status pretty useless.

LineageOS (previously CyanogenMod) had a Privacy Blocker or something like
that, with which you could block specific apps' access to major APIs like Media
Access, Phone ID Access, etc. It's been a while since I last used it, and I
don't know if it still exists, but it sure helped my paranoia. It was fun seeing
apps trying to access all sorts of stuff on my phone just to see them being
denied access.

Nice Death Grips reference btw.

~~~
sandov
I think that rather than denying access, it fed the app randomly generated
data.

------
topranks
Seems to me that Whatsapp should be able to rate-limit these requests and work
to secure the interface so only the legit website can actually pull the info?

The initial website display comes from a QR code you scan on your phone, which
then authorizes the website. Could they not then limit queries to that
account?

I could be way off the mark, but it seems to me like the worst of this could
be mitigated quite easily without much loss in functionality for users?
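
As an illustration of the per-account rate limiting suggested here, a minimal
token-bucket sketch. Nothing in it reflects how WhatsApp actually works; the
class name, the rate, and the capacity are assumptions chosen for the example.

```python
import time


class TokenBucket:
    """Hypothetical per-account limiter: allows `rate` requests per second on
    average, with bursts of up to `capacity` requests."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Return True if one more request may be served right now."""
        now = time.monotonic() if now is None else now
        # Refill according to the time elapsed since the last call.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A server would keep one bucket per authorized account and refuse or delay
status lookups once `allow()` returns False. The optional `now` argument is
only there to make the behaviour easy to test.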

~~~
detstrat
WhatsApp does monitor their network and the API calls being performed. This is
how they ban spammers and fake clients on the WhatsApp network. I suspect
they use some form of machine learning to filter out the non-users, and once
you're identified, you get banned. This means that you can no longer
use your phone number and need a new SIM. So this story isn't feasible for
"huge amounts of data". You might be able to download a few 100k profile
pictures, but you will get banned.

------
tinus_hn
> Some of the information that’s being sent back include the following:

Reality:

> All of the information sent back is the following:

Why would you post information in a status update if you don't want it to be
public? Why would you use a picture you want to keep a secret?

~~~
rodorgas
I think the concern is that the information is now associated with your phone
number. Personally I don't care, but anyone who does can just enable the right
privacy settings.

------
detstrat
I don't get why people make such a big problem about this.

First of all, the privacy issues with WhatsApp have been discussed many times
before. Yes, the default option "public" is bad, and just like on Facebook,
making your profile picture "public" means anyone can scrape it. The fact that
it's a mobile app doesn't make it different from a website.

Secondly, people in the comments talk about the fact that you cannot control
your presence in WhatsApp, and yes, that is indeed a serious privacy problem
which has been discussed before, but this article mentions nothing about that.

Third, WhatsApp monitors their network for non-user clients (to prevent spam
and non-official clients). You may be able to request profile pictures of 500
people, but what about 1 million? Iterating over such a large set will likely
get your WhatsApp account banned, which means you need to spend another 10
bucks on a SIM card, which makes it infeasible to exploit.

Source: 4 years of experience with chat-bots on the WhatsApp network. I got a
lot of SIM cards banned from WhatsApp by experimenting with how far I could go,
not only sending messages but also scraping.

~~~
thmzlt
Do you have contact info somewhere? I'd like to chat. (Or you can reach me at
<username> at gmail.)

------
bartj3
When you give your phone number to LinkedIn, you're not surprised either if
people find your job title, photo and name based on your phone number,
right?

I'd think most people are pretty aware of how public the Whatsapp info is.

~~~
sdoering
If I ask the semi- and non-technical people within my reach, I find that the
majority are shocked by the described possibility.

Sadly not shocked enough to even change the privacy settings. Or - beware -
delete WA.

~~~
detstrat
Why do people think "public" does not mean the same as "public" on a website?
Essentially there is nothing different here..

------
selckin
Being able to see this information for everyone in your contact list without
having to add them is one of the main features of whatsapp

~~~
piracyde25
What was the service called again?

------
mih
Ah the HN hug of DOS... obligatory link to the cached version

[https://webcache.googleusercontent.com/search?q=cache:gR6rTz...](https://webcache.googleusercontent.com/search?q=cache:gR6rTzHumvAJ:https://www.lorankloeze.nl/2017/05/07/collecting-huge-amounts-of-data-with-whatsapp/+&cd=3&hl=en&ct=clnk&gl=de)

~~~
teekert
It's working for me now.

------
zv
What's more, if you can see their profile pic, you can run a Google image
search to check whether that picture matches any social network profiles. Now
you have phone->name identification.

------
bradvl
Wow. Combine this with a reverse image lookup to try to match numbers with
celebrities and other noteworthy people and there are a lot of opportunities
for a lot of bad things.

I wonder if Trump uses Whatsapp on a personal phone?

------
therealmarv
This is nothing new but well explained and nice script.

I'm totally shocked... maybe I will make a story or picture on my public
Instagram about that issue ;)

------
pearjuice
It clearly says so in the WhatsApp settings what people can and cannot see.
This is the modern day equivalent of cold calling a range of home numbers and
see who answers. If you don't want that information to be public, disable it.
Simple as that.

This functionality is pretty much what made WhatsApp so easily accessible for
anyone in the first place.

~~~
tqkxzugoaupvwqr
If you get cold-called, you will catch on that somebody is checking your
status (home/not home) or tries to elicit information from you – because you
are directly involved. With this API, you'll never notice that somebody tracks
you. Most non-tech people will never check the settings tab and even fewer
will check the privacy settings. They are not aware they are sharing their
info publicly.

~~~
pearjuice
You are directly involved by installing the Whatsapp application and agreeing
to the ToS, which clearly states:

>We collect information about your online and status message changes on our
Services, such as whether you are online (your “online status”), when you last
used our Services (your “last seen status”), and when you last updated your
status message.

And

>Your phone number, profile name and photo, online status and status message,
last seen status, and receipts may be available to anyone who uses our
Services, although you can configure your Services settings to manage certain
information available to other users.

"B-but people don't read those!" - well then maybe that's something to worry
about instead of complaining about an API which is the nature of the product.

------
libeclipse
This could be extended slightly to get usernames, I believe.

When you're added to a group with a person in it who's not in your contacts,
their messages have the name linked to their account in the corner.

~~~
tinus_hn
The push name is not a username and if you don't want to 'expose' it, leave
the group without sending messages. They won't see your name.

~~~
dingo_bat
The username is visible in the group info screen. Also, if anybody types '@'
it will show a list of group members and their usernames.

~~~
tinus_hn
Not if you've never sent a message to the group or the specific user who's
looking.

------
retox
>Please, use it wisely!

Made me laugh.

------
_pdp_
This reminds me of the Citibank incident a few years back. Not technically a
hack, but it can land you in trouble.

------
GrumpyNl
I did almost the same, it's insane how much data you can collect on a person.

------
arjie
Man, my public GitHub profile ties my email address to my photograph. And that
stuff is also in each of my public commits.

With modern Android's anti-spam features, I can even dodge spam calls to my
phone, so this is not a problem for me.

~~~
tqkxzugoaupvwqr
What is the point of your comment? That it isn't a problem for you, so
therefore it shouldn't be a problem for the remaining billion WhatsApp
users?

~~~
arjie
I went back and forth on this, but I think it's quite clear what the point of
the comment is so I'm not going to explain.

------
anc84
This has been done before.

~~~
TonnyGaric
By who and when?

~~~
anc84
eg [https://nakedsecurity.sophos.com/2015/02/17/whatsapp-spy-too...](https://nakedsecurity.sophos.com/2015/02/17/whatsapp-spy-tool-lets-anyone-track-when-youre-online/)

