
Unique in the Crowd: The privacy bounds of human mobility - lemming
http://www.nature.com/srep/2013/130325/srep01376/full/srep01376.html
======
samolang
Somewhat relevant: My brother attended a conference where a speaker said that
the gait of a human being is so unique that an individual can be identified
with 100% certainty using the accelerometer of the phone in their pocket.

EDIT: Found a paper on the subject from 2009:
[http://www.cs.yale.edu/homes/mfn3/pub/mfn_gait_id.pdf](http://www.cs.yale.edu/homes/mfn3/pub/mfn_gait_id.pdf)

EDIT 2: Found another paper from 2012 claiming 99.4% accuracy:
[http://users.ece.cmu.edu/~juefeix/btas_2012_felix.pdf](http://users.ece.cmu.edu/~juefeix/btas_2012_felix.pdf)

~~~
n1cked
I worked in a lab at university that had several projects using gait analysis.
It is very identifying, but much harder to collect than this kind of data --
you need a physical device on the target.

~~~
samolang
Still, it could be very useful for intelligence agencies. If they have
identified that a phone is being used by a terrorist they can identify which
terrorist is carrying it if they're able to get access to its accelerometer.
Or they could gain access to millions of phones' accelerometers and identify
which ones are being used by known terrorists. At the very least I expect to
see it in some spy movies in the next few years.

~~~
jevinskie
Given the recent leaks, I think we should start using "terrorist" in quotes,
at least when referring to NSA "terrorists".

------
innino
I've always assumed that, given the sheer and ever-accelerating quantity of
data produced by people in the 21st century, the similarly increasing
collectability of that data, and the massive benefits stemming from making all
data more rather than less accessible, algorithmic systems capable of knowing
practically everything about everyone are utterly inevitable. Attempts at
camouflage are hopeless. Data collection from every avenue is only
accelerating (imagine Glass, Kinect, Leap, Streetview and Fitbit recording
everything, everywhere, across the whole globe, 24/7) and even the absence of
a signal is a trace.

The main flipside is that I see no reason why this power should be restricted
to any one sector of society, although it will flow first from those sectors
which can sustain the most focus and wield the most resources (as we see
currently with the primary use of these systems by governments and large
corporations). So the flipside is, okay, maybe the pattern of mouseclicks,
touchscreen interactions, body movements and physiological signals of
incipient terrorists can be identified, but maybe so can individuals planning
government fraud or cronyism. Maybe the positive and negative traces left by
surveillance agencies can also be detected and wielded against them. Code has
no loyalty.

In other words, equilibrium of social power will be reached not by trying to
prevent these possibilities from being explored, but from following simple
economic logic and endeavouring to make use of them yourself. I don't see this
as the death of anything, more as a new and inevitable frontier, a radically
new state of play with massive rewards open to individuals willing to
relinquish the old paradigm and embrace the new.

~~~
visarga
Just imagine the uses:

\- the govt could build a really detailed voter database; they could pin the
political leanings of a person by the list of web pages they browse, their FB
and Twitter feeds, or the analysis of the email and phone contacts. This list
could be used to improve the efficiency of "get out to vote" campaigns and
donation drives

\- the govt could run a "graph rank" or "page rank" algo over the network of
interconnections to determine the influencers; then, in a sensitive situation,
they know who to silence first; this would make political crackdown very
efficient

\- the govt could data mine who's committing crimes and infractions; people
who imagined they would slip under the radar will be caught; in the past there
was a cloak of anonymity and an asymmetry - there were too few policemen and
judges to cope with the illegalities, they had to pick and choose whom to
prosecute, but now they could auto-prosecute by machine learning (just like
MPAA auto-sends lawsuits based on mere IP addresses)

\- if they wanted, they could selectively silence certain people instead of
blocking FB wholesale; this, applied on the list of influencers would wreak
havoc in the activist social networks

\- our health, sexual, religious, political and drug use status would be used
against us by governments and corporations; there would be no forgiveness and
no forgetting

\- economic espionage; ability to blackmail people (because they know all
their secrets); ability to blackmail people inside various companies to
secretly install backdoors (that's how China gets FB data, from what I read)

There is no escape except self censure. Whatever escapes our heads is public
and there is no privacy left. It was inevitable.

The first to be monitored will be people working for the state and especially
in NSA, politics and large corporations. They will be the first victims of
their own creation. Activists too.

~~~
innino
Self censure is futile. We all generate far too many data points; it's simply
inescapable, and machine learning can work with anything. It doesn't have to
be explicit as long as you have enough little details. But I don't think
there's any reason to panic. Governments and large corps just have a head
start over the public in utilising these possibilities, but there is no reason
for the situation to remain one-sided forever. Besides, any serious attempt to
leverage these possibilities to their fullest extent would result in
essentially an instant civil uprising.

------
sologoub
Someone please edit the title - extremely misleading!

Here's what the article states: "In fact, in a dataset where the location of
an individual is specified hourly, and with a spatial resolution equal to that
given by the carrier's antennas, four spatio-temporal points are enough to
uniquely identify 95% of the individuals."

In other words, they need the location from which you made the call. This data
is not available in a normal CDR (call detail record) that carriers routinely
use for billing. This is the same data you see on your phone bill. All it has
is the date/time, to/from, duration and outcome.

As for the location, there were a number of papers floated in the past that
specified that anywhere from 3-5 destinations identified a person uniquely. In
fact, a person's daily commute is enough to identify most people. (Don't have a
link to paper that found this.)
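
To make the "four points identify 95%" kind of claim concrete, here is a rough sketch of how such a uniqueness figure could be estimated. It assumes each person's trace is a set of (hour, antenna) pairs, draws p points from a trace, and counts how many traces in the dataset contain all of them. The function names, parameters and the uniformly random synthetic data are my own assumptions for illustration; real mobility data is far more clustered, so the numbers this produces say nothing about the paper's results.

```python
import random


def unicity(traces, p, trials_per_user=20, seed=0):
    """Estimate how often p points from a trace match that trace alone.

    traces: dict mapping user -> set of (hour, antenna) tuples.
    Returns the fraction of random draws that are unique in the dataset.
    """
    rng = random.Random(seed)
    unique = 0
    total = 0
    for points in traces.values():
        if len(points) < p:
            continue  # too few points to draw p of them
        ordered = sorted(points)  # a fixed order keeps sampling reproducible
        for _ in range(trials_per_user):
            sample = set(rng.sample(ordered, p))
            # Count every trace (including the owner's) containing the sample.
            matches = sum(1 for other in traces.values() if sample <= other)
            if matches == 1:
                unique += 1
            total += 1
    return unique / total if total else 0.0


def synthetic_traces(n_users=200, n_antennas=50, n_hours=168,
                     points_per_user=40, seed=1):
    """Build uniformly random traces (an unrealistic stand-in for real data)."""
    rng = random.Random(seed)
    return {
        user: {(rng.randrange(n_hours), rng.randrange(n_antennas))
               for _ in range(points_per_user)}
        for user in range(n_users)
    }


if __name__ == "__main__":
    data = synthetic_traces()
    for p in (1, 2, 4):
        print(p, unicity(data, p))
```

Coarsening the data (fewer antennas, wider time bins) is the obvious knob to turn when experimenting with this.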

~~~
lemming
That's true, but according to the Guardian here:
[http://www.guardian.co.uk/world/2013/jun/06/nsa-phone-record...](http://www.guardian.co.uk/world/2013/jun/06/nsa-phone-records-verizon-court-order)
Verizon handed over location data to the NSA (the
original court order states "comprehensive communications routing information"
and "trunk identifiers" should be handed over). It's reasonable to assume that
the NSA will have that information, which makes their defence that "we don't
get your name" pretty laughable since they can almost certainly just derive
it.

~~~
yva
According to the Washington Post and other outlets, they do have approximate
location data: "So the NSA is collecting information about my location as well
as who I’ve called? It appears so. Cellphones make calls using the closest
tower. So if the NSA knows you made a call using a specific tower, they can
safely assume you were near that tower at the time of the call."

[http://www.washingtonpost.com/blogs/wonkblog/wp/2013/06/06/e...](http://www.washingtonpost.com/blogs/wonkblog/wp/2013/06/06/everything-you-need-to-know-about-the-nsa-scandal/)

------
FireBeyond
I'd absolutely agree. I'm sure I'm an outlier, but I'm definitely recognizable
- my calls:

* my girlfriend, in my town (Olympia, WA)
* the ambulance company I work for (also in Olympia)
* my "daytime" employer, a software company in Scottsdale AZ

Even the convergence of 2 or 3 of these calls would likely identify most.

------
bobwaycott
Even metadata is _too much_ information.

------
h0w412d
Just like how it only takes a few GPS coordinates to figure out who you are
with a high degree of accuracy. Anonymizing data doesn't mean much in the big
data world.

------
geedy
Seems similar to the claim that you can identify a person based on a handful of
locations she visits from her regular daily routine.

~~~
tuttut
*he/she, his/her

------
philsalesses
Cesar was my advisor and I was his first student. I've never been happier to
see someone make a splash.

