
Tell-all telephone – Six months of phone metadata visualized - danielhunt
http://www.zeit.de/datenschutz/malte-spitz-data-retention/
======
drpancake
You can infer some amazing things from simple metadata. I spent six months in
an R&D team at a large mobile telco, with the task of trying to infer as much
as possible from anonymous customer data just like this.

Figuring out where you live and work, to a reasonable accuracy, is quite easy;
you simply look at where the most outgoing calls/SMS originate from at certain
hours of the day over an extended period.
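
A minimal sketch of that heuristic, assuming each record is just an ISO timestamp plus a cell ID. The function name, the hour windows (22:00 to 06:00 for "home", weekday 09:00 to 17:00 for "work") and the record layout are illustrative assumptions, not what any telco actually uses:

```python
from collections import Counter
from datetime import datetime

def infer_home_and_work(events):
    """events: iterable of (iso_timestamp, cell_id) for outgoing calls/SMS.

    Returns (home_cell, work_cell); either may be None if no events
    fall into the corresponding time window.
    """
    night, day = Counter(), Counter()
    for ts, cell in events:
        t = datetime.fromisoformat(ts)
        if t.hour >= 22 or t.hour < 6:
            # Assumed "at home" window: late evening and night.
            night[cell] += 1
        elif t.weekday() < 5 and 9 <= t.hour < 17:
            # Assumed "at work" window: weekday office hours.
            day[cell] += 1
    home = night.most_common(1)[0][0] if night else None
    work = day.most_common(1)[0][0] if day else None
    return home, work
```

Over a long enough period the most frequent cell in each window tends to dominate, which is why a count is probably enough for a first guess.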

We built up our own social graph. You treat calls and text messages as
directed edges and phone numbers as nodes. These were fascinating to look at.
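
As a rough illustration of that structure (the names and the record layout here are assumptions for the sketch), a weighted directed graph can be built from (caller, callee) pairs like this:

```python
from collections import defaultdict

def build_call_graph(records):
    """records: iterable of (caller, callee) pairs, one per call or SMS.

    Returns a nested mapping graph[caller][callee] -> number of contacts,
    i.e. a directed graph with edge weights.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for caller, callee in records:
        graph[caller][callee] += 1
    return graph

def in_degree(graph, node):
    """Number of distinct numbers that have contacted `node`."""
    return sum(1 for targets in graph.values() if node in targets)
```

Keeping direction and counts separate matters: a number that only ever receives calls looks very different from one that calls back.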

You can even try to guess when someone gets off a plane. When a plane lands
you'll suddenly see lots of incoming undelivered text messages as people turn
their phones back on. If a node was last seen in a far away cell, but then
reappears in this group, you can cross-correlate with arrival times and make a
reasonable guess.
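
A toy version of that cross-correlation might look like the following. Everything here is an assumption made for illustration: the tuple layout, the single `airport_cell`, the 30-minute window and the burst threshold, and "far away" is simplified to "last seen in a different cell":

```python
from datetime import datetime, timedelta

def likely_arrivals(reappearances, flight_arrivals, airport_cell,
                    window_minutes=30, min_burst=3):
    """Guess which nodes stepped off which flight.

    reappearances: list of (node, cell, seen_at, last_cell) tuples.
    flight_arrivals: list of (flight_id, landed_at) tuples.
    Returns {flight_id: sorted list of nodes} for flights followed by a
    burst of at least `min_burst` reappearing nodes at the airport cell.
    """
    window = timedelta(minutes=window_minutes)
    guesses = {}
    for flight, landed in flight_arrivals:
        burst = [
            node
            for node, cell, seen, last_cell in reappearances
            if cell == airport_cell
            and last_cell != airport_cell  # simplification of "far away"
            and landed <= seen <= landed + window
        ]
        if len(burst) >= min_burst:
            guesses[flight] = sorted(burst)
    return guesses
```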

~~~
DanBC
Interestingly, what you describe is probably not legal under EU privacy laws.
People are horrified by the NSA just collecting this data. And yet you calmly
describe this process.

Your opinions are not given in your post - you're not saying whether it's good
or bad to do this - but it's clear that the company you worked for didn't see
doing this as evil.

I find it fascinating that this kind of data mining has been going on for
years and that opposition has been so quiet.

(Please, this post is not any judgement about you!)

~~~
drpancake
All the telcos collect this data as far as I know. They're allowed to for the
purposes of improving and maintaining their network. A few crunch it for
marketing purposes but this has to be opt-in (not that customers would have
any idea what that might entail, even if the privacy policy describes it
broadly). I can't comment on the legality of the project I worked on, but I
assume it was checked out by legal counsel.

I personally wouldn't want my data mined in this way. I don't retain any brand
loyalty, let's put it that way.

~~~
adamcanady
Does the EU actually have laws against collecting this data without opt-in for
marketing?

On a related note: it would be really interesting to see privacy laws
visualized around the world.

------
rayiner
The argument isn't that meta-data can't be used to get a lot of information
about someone. The argument is that in the U.S., meta-data isn't protected
information. Call meta-data is not your information, but information the
telephone company keeps about you. In the U.S., the 4th amendment does not
protect those sorts of records:
[http://en.wikipedia.org/wiki/Smith_v._Maryland](http://en.wikipedia.org/wiki/Smith_v._Maryland).
Your cell phone, which you use voluntarily, gives the phone company tremendous
information about you, and under U.S. law nothing keeps the government from
getting that information from the phone company.

Does call meta-data give the government a lot of information? Yes. Does it
give the government too much information? Quite possibly. But arguing shrilly
about how collecting call meta-data is "illegal" is counter-productive. Maybe
it should be illegal, but you can't start the process of making it so by
proceeding from an incorrect premise. And you can't dismiss the goal of making
it illegal, by arguing that the government is already ignoring the law, with
reference to activity where the government is clearly attempting to stay
within the law, even if it is pushing the boundaries as much as it can.

~~~
duijf
The purpose of the web-page is to illustrate how much information you can get
about a person with just meta-data.

The law should be a reflection of our morals, not the other way around. That
would be a recipe for disaster.

~~~
rayiner
Totally agreed. The law should reflect our morals. But we judge the legality
of an action by the law, not our morals.

Moreover, the problem for privacy advocates is that the moral debate is even
less clear than the legal one. Being able to declare the NSA's actions
straight-up unconstitutional under the 4th amendment would avoid the mess of
resorting to the democratic process to determine what the people, as a whole,
really thought about surveillance.

------
skwirl
"Metadata doesn't matter" to me seems to be a really poor strawman. Maybe a
small minority of people think that, but I'm pretty sure most people are smart
enough to realize that if it "didn't matter" the NSA wouldn't be collecting it
to begin with.

Also, I don't believe that it has been shown that location information has
been collected. That claim is conjecture only. We've seen a lot of conjecture
related to these leaks that has been taken for fact. Sometimes it is hard to
tell them apart.

~~~
smokeyj
I call it "just the tip" fallacy.

------
mtgx
And that's just from the phone metadata. Imagine how much more they can do
with all your online info from all the services you're using, all the blogs
you're commenting on, and so on.

The same person being talked about above wrote this article in NYTimes
yesterday:

[http://www.nytimes.com/2013/06/30/opinion/sunday/germans-
lov...](http://www.nytimes.com/2013/06/30/opinion/sunday/germans-loved-obama-
now-we-dont-trust-him.html)

~~~
amenod
Thanks, that is a great post and well worth reading! Especially the part that
describes possible consequences of trading privacy for security (Nazis,
Communists).

------
grey-area
What a remarkable visualisation - this is a clear demonstration of just how
intrusive these metadata records can be. If they're not controlled by law,
they should be.

------
blackdogie
Malte Spitz (the guy whose data you see) is a German Green Party politician
and did a TED presentation in 2012
[http://www.ted.com/talks/malte_spitz_your_phone_company_is_w...](http://www.ted.com/talks/malte_spitz_your_phone_company_is_watching.html)

~~~
TheCraiggers
I would encourage anybody who hasn't watched this to do so. It's a very
interesting video, especially for younger people who didn't grow up during
that time period.

------
lazyjones
Let's not forget that combined metadata from millions of people allows much
greater detail than this (who you meet, talk to regularly, share interests
with, are likely to run into ...).

------
moreentropy
I'm afraid the actual definition of "meta-data" is up to interpretation in the
context of IP communication.

What if the NSA considers not only IP source & destination as "metadata" but
also anything down to the application layer that is not strictly content? Like
the HTTP GET line or HTTP headers.
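
To make the distinction concrete, here is a small sketch (the function name and the sample request are made up for illustration) that splits a raw HTTP/1.1 request into the parts that are "not strictly content" and the body:

```python
def split_http_request(raw: bytes):
    """Split a raw HTTP/1.1 request into (request_line, headers, body).

    The request line and headers are what a broad reading of "metadata"
    could cover; the body is the only part that is strictly content.
    """
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    request_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return request_line, headers, body

# Hypothetical request: the body is empty, yet the URL and host alone
# already say a lot about what the user was doing.
raw = (b"GET /clinic/appointments?dept=oncology HTTP/1.1\r\n"
       b"Host: example.org\r\n"
       b"Cookie: session=abc\r\n\r\n")
request_line, headers, body = split_http_request(raw)
```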

~~~
grey-area
I think you can take that as a given. If you look at the GCHQ leak - they're
basically just recording _everything_ (including content) for 3 days, and
keeping headers for 30 (shared with the NSA of course). That would give them
most websites visited by an IP (which would take hardly any space to store,
but are still really intrusive).

The only things preventing this from being a total capture of all information
(to be sifted through later) are technical issues with storage, not moral or
legal ones.

------
qwerta
What do you think graph databases with trillions of connections are used
for? The real fun will start after someone leaks a couple of terabytes of
tracking data.

------
Bosence
Of course it matters, otherwise they wouldn't collect it.

~~~
parliament32
Exactly.

------
tripzilch
Well, if location data is considered part of this "metadata", then I don't see
how anyone could argue against the dangers of this.

My physical location in the real world I consider _way_ more private in
matters of wide scale tracking than what I write or say.

For instance, I hardly ever let my browser determine my location and send it
to some site, it's none of their business where I am, and if I want the local
weather they can get the name of the city I'm at.

But I was hoping this article would be about another, way more dangerous,
because way more information-rich type of "metadata": Social graphs and
contact lists. The problem with this is, humans underestimate the depth of
this kind of data because we're not really well-equipped to reason about them.

If you have a table that consists of (time, location) records, it's pretty
easy to envision what sort of information could be extracted from this data.
Add a few more fields, and it becomes harder, maybe you need some creativity
and statistics, but it's all basic detective work.

A free form directed graph (such as a social graph or collection of contact
lists) doesn't look like a table at all (well, you can represent it as a
table, but that won't make you much wiser). It's in fact a very high-
dimensional object.

The older generation out here may remember when they first encountered the
WWW, when you could only navigate it by clicking links. I got this sense of
vastness, perhaps even helplessness. They don't call it _hyper_ text for
nothing. The sense of vastness comes because clicking and navigating those
links gives an idea of moving through a space. Except this space is in some
sense "larger" than our usual 3D space. Every door (link) can open into every
room, regardless of whether it would be possible in a physical space.

This is why those "graph of (part of) the Internet" pictures you sometimes see
are generally always a tangled clutter of strings, usually vaguely ball-
shaped. This is because there is no sensible representation of this type of
inter-connected data. You can't make a hierarchy or a map, at least, not in
the general case (and the thing you want to reason about _is_ the general
case, most of those graphs are exponential small-world graphs, highly inter-
connected).

Same thing for social / contact list graphs. Except they usually don't have
web-rings or directories (you can sometimes make them like FB does, but they
aren't generally available, again the general case).

So okay, we're not really good at keeping large graph networks of "friends of
friends of friends" and other relationships in our heads and reasoning about
them. We're really not. What you think you can reason about those graphs is
just scratching the surface.

Computers, however, and Big Data Machine Learning algorithms in particular,
have no problems at all with this type of data. An algorithm never lived in a
3D space, it doesn't care if a dataset makes no sense as a physical
configuration of nodes, in order to navigate it and extract information from
it.

Another important distinction is, people tend to think of these social graphs
as labeled nodes with edges between them. Which is correct, in a sense. But it
gives the impression that the labels are more important than they actually
are. This may sound weird, in the building/room analogy, if you have millions
of rooms, and every room is directly connected to 50-200 other rooms, somehow
_the shape of the paths between the nodes and way they are connected becomes a
vastly more information-rich data source than the actual values of the labels
of the nodes themselves_.

They don't need your name or your photo, the local shape of your social graph
is a _highly unique_ fingerprint of whoever you are.

And you can delete Facebook, but on the next social network you sign up for
(or any of the other social graphs you're generating, email/IM contact lists,
etc), this fingerprint will echo, and in many cases be similar enough to
clearly indicate this is the exact same person. No names necessary. (this may
be a bit harder if you have a strictly separate business persona and social
persona, but there are still some unexpected artifacts to pick up for a ML
algo even in these cases) If you're not on a network at all, your presence can
be extrapolated from the "hole" in the graph you left (all your friends are
there, with their particular local graph shapes, but one node is missing),
that is even if you have nothing to hide, you will be leaking info about those
who do.
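
A deliberately tiny sketch of the fingerprint idea, assuming undirected graphs stored as dicts of neighbour sets. The "fingerprint" used here (the sorted degrees of a node's neighbours) is a stand-in chosen for brevity; real de-anonymization work uses far richer structural features:

```python
def local_fingerprint(graph, node):
    """graph: dict mapping node -> set of neighbours (undirected).

    Returns the sorted degrees of the node's neighbours. Labels play no
    part in it, only the local shape of the graph.
    """
    return tuple(sorted(len(graph[n]) for n in graph[node]))

def match_by_shape(graph_a, node, graph_b):
    """Nodes in graph_b whose local shape matches `node` in graph_a."""
    target = local_fingerprint(graph_a, node)
    return [n for n in graph_b if local_fingerprint(graph_b, n) == target]

# Two hypothetical networks with the same people under different labels.
network_a = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob"},
    "dave": {"alice", "erin"},
    "erin": {"dave"},
}
network_b = {
    "u1": {"u2", "u3", "u4"},
    "u2": {"u1", "u3"},
    "u3": {"u1", "u2"},
    "u4": {"u1", "u5"},
    "u5": {"u4"},
}
```

Even this crude feature singles out some nodes in the toy data; others stay ambiguous, which is where more features and more hops would come in.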

~~~
paganel
> Well, if location data is considered part of this "metadata", then I don't
> see how anyone could argue against the dangers of this

I remember a "scandal" that occurred in my country's Parliament in the early
2000s (2002 or 2003), when one of the local mobile carriers decided to display
the GSM cell towers' names on the mobile phones' small screens (close to the
"battery still left" icon). Some of the MPs thought that as being way too
obtrusive, but nobody cared because they're seen as being corrupt by
definition, the mobile company ended up by not displaying the info anymore
(but still collecting it, of course) and everything was fine.

There was of course that other thing that happened to the same company (one of
the 3 largest global brands in the industry) a couple of years later: one of
the company's office staff (a lady) was jealous of her boyfriend and asked some
guys "in the IT department" whether there was a way for them to check said
boyfriend's messages and calls, all this "as a small favor from colleague to
colleague". Of course there was a way to do that. I can't remember if the
boyfriend was cheating or not.

~~~
e3pi
1\. To communicate, Paula Broadwell and David Petraeus shared an anonymous
email account

2\. Instead of sending emails, both would login to the account, edit and save
drafts

3\. Broadwell logged in from various hotels' public Wi-Fi, leaving a trail of
metadata that included times and locations

4\. The FBI cross-referenced hotel guests with login times and locations,
leading to the identification of Broadwell

[http://www.guardian.co.uk/technology/interactive/2013/jun/12...](http://www.guardian.co.uk/technology/interactive/2013/jun/12/what-
is-metadata-nsa-surveillance#meta=0000000)
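
The cross-referencing in step 4 boils down to a set intersection. A minimal sketch, with made-up data shapes (this is not a description of the FBI's actual tooling):

```python
def guests_present_at_every_login(logins, guest_lists):
    """Intersect hotel guest lists across all login events.

    logins: list of (hotel, date) pairs, one per login to the account.
    guest_lists: dict mapping (hotel, date) -> set of guest names.
    Returns the set of guests who were registered at every login location.
    """
    candidates = None
    for key in logins:
        guests = guest_lists.get(key, set())
        candidates = set(guests) if candidates is None else candidates & guests
    return candidates or set()
```

Each additional login shrinks the candidate set, so a handful of hotels is likely enough to leave a single name.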

~~~
Wingman4l7
Didn't the 9/11 hijackers use this same technique _(sharing an email account
and communicating via drafts)_? It sounds very familiar.

------
mikecane
Given the remarkable intel that can be gathered, I'm surprised the NSA/CIA/FBI
aren't giving away smartphones to targets as anonymous presents or under the
pretense of winning a contest.

~~~
PanickedOmlette
Who says they aren't :)

~~~
dendory
New NSA agent position: cellphone vendor!

------
lifeisstillgood
Eventually, all the social and location graphs will be mapped for all of
humankind - and we shall find out that everyone, on the whole planet, is
_exactly_ 42 feet from Kevin Bacon.

------
elgenesys
If some agency like NSA etc wants to know about you in great detail, clearly
they have the data, and will be able to very quickly put it all together.

The other side of this coin is that commercial parties like Facebook etc have
the same potential detail and insight about anyone.

There is also very high probability that similar data is being put together by
entities somewhere between the NSA and Facebook, for purposes that are much
more starkly not in your best interests eg fraud.

Bottom line: anyone is an open book on the internet.

------
binarymax
Does anyone know if these work as advertised?
[http://www.ebay.com/sch/items/?_nkw=cell+phone+signal+block&...](http://www.ebay.com/sch/items/?_nkw=cell+phone+signal+block&_sacat=&_ex_kw=&_mPrRngCbx=1&_udlo=&_udhi=&_sop=12&_fpos=&_fspt=1&_sadis=&LH_CAds=)

I rarely receive calls on my mobile - and only really carry one just in case I
need to make a call.

~~~
phryk
Why don't you just switch off your phone? That would save precious battery
time, too…

~~~
binarymax
I know this may sound ultra-paranoid, but I have heard rumors that switching
it off may not be enough.

\--Edit-- Thanks for all the replies - So does the faraday cage accomplish the
same as battery and SIM removal?

~~~
MisterWebz
Not sure about smartphones, but IIRC, my old cellphone used to ring any alarm
that was set even if the phone was completely turned off.

~~~
vidarh
My generic Android phone does this.

~~~
BHSPitMonkey
Is this common? This is the first I've ever heard of such a feature, ever.
Considering that we're talking about the OS being completely shut down, I'm
skeptical of this existing in smartphones.

~~~
justincormack
I believe it is normally a separate microcontroller. You generally need
something to power up the main phone and to deal with battery charging (that's
not usually the main CPU, although my Android phone does display an animated
icon on screen when powered off and charging, so it's unclear what's driving this).

Most computers have a number of extra microcontrollers. You would have to do a
teardown to see how they might be wired up.

------
sfaruque
Slightly off-topic question: how can I collect my own metadata at this level
(for just calls and SMS)?

From what I can tell I need to collect:

\- List of all incoming and outgoing calls and SMS

\- Get my location data and match them to the timestamp (?) of the calls and
SMS's

\- Display this on a map.

Any suggestions on how to do this?
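
For the matching step, one possible approach is to pair each call or SMS with the location fix closest in time. A minimal sketch, assuming you already have both logs exported as Unix timestamps (the tuple layouts and function names are assumptions):

```python
from bisect import bisect_left

def nearest_fix(fixes, ts):
    """fixes: non-empty list of (unix_ts, lat, lon), sorted by time.

    Returns the fix whose timestamp is closest to `ts`.
    """
    times = [f[0] for f in fixes]
    i = bisect_left(times, ts)
    # Only the fixes just before and just after `ts` can be the closest.
    neighbours = fixes[max(i - 1, 0):i + 1]
    return min(neighbours, key=lambda f: abs(f[0] - ts))

def geotag(events, fixes):
    """events: list of (unix_ts, kind), e.g. (1372672800, "sms_out").

    Returns a list of (unix_ts, kind, lat, lon) ready to plot on a map.
    """
    return [(ts, kind, *nearest_fix(fixes, ts)[1:]) for ts, kind in events]
```

You would probably also want to drop matches where the nearest fix is too old to be meaningful.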

------
teeja
People might think that (apart from GPS) signals to one tower only are
unlocalizable. Add the variable of signal strength (with fairly uniform xmit
pwr) to that single vector and it gets more interesting.
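
To put a rough number on it: with a known transmit power, received signal strength maps to a distance ring around the tower. The sketch below uses the standard log-distance path loss model; the reference level and the path loss exponent are assumed values that in practice depend heavily on terrain and equipment:

```python
def estimate_distance_m(rssi_dbm, ref_rssi_dbm=-40.0, ref_distance_m=1.0,
                        path_loss_exponent=3.0):
    """Estimate distance from received signal strength.

    Log-distance path loss model:
        rssi = ref_rssi - 10 * n * log10(d / d0)
    solved for d. All default constants are illustrative assumptions.
    """
    exponent = (ref_rssi_dbm - rssi_dbm) / (10 * path_loss_exponent)
    return ref_distance_m * 10 ** exponent
```

A single tower only gives a ring, not a point, but combined with the sector antenna or a second tower the ring narrows quickly.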

------
SourApples
Is it just me, or did anyone else just throw up a little bit?

Almost overwhelming.

