

"Anonymized" data really isn't—and here's why not - timwiseman
http://arstechnica.com/tech-policy/news/2009/09/your-secrets-live-online-in-databases-of-ruin.ars

======
randomwalker
Hey all,

I'm one of the authors of the Netflix and social network de-anonymization
papers referenced, and the author of <http://33bits.org>. I also had the
pleasure of having many talks with law professor Paul Ohm over the technical
aspects of his excellent paper (which another commenter has posted a link to.)

Paul's paper is a great example of people in the law community "getting it."
(He happens to have a CS degree and is a hobbyist perl hacker!) We need more
people like Paul, and we equally need tech people who understand law and
policy. I encourage everyone to give Paul's paper a quick read, gain awareness
of the issues such as why the current privacy laws are wrongheaded and what
needs to be changed, and be on the look-out for ways to change things (as
consumers, as tech influencers, as citizens). Cheers.

~~~
pbhjpbhj
_"Impatient readers can skip to the bullet-point summary at the end."_

Have you heard of jump links? ;0)>

------
patio11
Folks interested in this subject should read this very fascinating blog:

<http://33bits.org/>

(The title comes from the factoid that 33 bits of information is enough to
uniquely address any living person. After reading through a few practical
examples I'm convinced: anonymity is an illusion.)
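
A quick sanity check of the 33-bit figure, as a minimal sketch. The population number is an assumption (roughly 6.8 billion, an approximate 2009 estimate), and `bits_to_identify` is a hypothetical helper name, not something taken from the blog:

```python
# Assumption: world population of roughly 6.8 billion (approximate 2009 figure).
WORLD_POPULATION = 6_800_000_000


def bits_to_identify(population: int) -> int:
    """Return the smallest b such that 2**b >= population.

    Uses integer arithmetic, so there is no floating-point rounding.
    """
    if population < 1:
        raise ValueError("population must be at least 1")
    return (population - 1).bit_length()


bits = bits_to_identify(WORLD_POPULATION)
print(bits)       # 33
print(2 ** bits)  # 8589934592 distinct values, more than the assumed population
```

32 bits would only cover about 4.3 billion people, which is why the figure lands on 33.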

~~~
mmt
Fortunately, we can fall back on pseudonymity to some extent, as well as
lying.

A close friend of mine, throughout over 5 years of college, shared one grocery
store "club" card with all roommates and acquaintances, eventually totaling a
couple dozen users.

The store's disincentives against doing this, usually a bonus paid to whoever
happened to be using the card when it reached some predetermined spending
level, were ineffective: the group would gladly pay $1 for a lottery ticket,
and this was a bargain by comparison.

Personally, I routinely give bogus ZIP codes for any situation other than
where I wish to receive mail.

~~~
pradocchia
The big grocery chain around here recently introduced a new program where
shoppers get $0.10 off a gallon of gas for every $50 they spend in the store,
tracked of course through their "Advantage" card. You show the card at
participating stations, and they take the discount off there.

The actual savings are nominal: spend $150, fill up w/ 20 gallons (the limit),
and save what, $6.00? That's only 4%!

But wow, people have become _much_ more serious about using _their own_ cards.
Participation must be near 100%.
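
The discount arithmetic above can be checked with a small sketch. The function name and its parameters are hypothetical, and the rules (10 cents per gallon per full $50 spent, 20-gallon limit) are only as described in this comment, not the store's actual terms:

```python
def gas_savings(grocery_spend: int, gallons: int,
                cents_per_50: int = 10, gallon_limit: int = 20) -> float:
    """Dollar savings at the pump under the rules described above."""
    # Each full $50 of grocery spending earns cents_per_50 off per gallon.
    per_gallon_cents = (grocery_spend // 50) * cents_per_50
    # The discount applies to at most gallon_limit gallons.
    discounted_gallons = min(gallons, gallon_limit)
    return per_gallon_cents * discounted_gallons / 100


savings = gas_savings(150, 20)
print(savings)        # 6.0 dollars
print(savings / 150)  # 0.04, i.e. 4% of the grocery spend
```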

------
dfranke
Anonymization is in essence a cryptography problem, so any anonymization
scheme should by default be considered insecure without extraordinary evidence
to the contrary.

------
jacquesm
When AOL released their 'anonymized data set' I happened to be in a datacenter
that had some pretty good bandwidth, so I grabbed a copy before it was pulled.
Even now, several years later, it is amazing what you can mine from it in
terms of data that is relevant today.

~~~
Anon84
You can still find it around the interwebs, in torrent sites etc... I'm just
not sure about the legality of using it for practical or academic purposes.

------
mmt
Here's the Paul Ohm paper referenced therein:
[http://papers.ssrn.com/sol3/Delivery.cfm/SSRN_ID1450006_code...](http://papers.ssrn.com/sol3/Delivery.cfm/SSRN_ID1450006_code487663.pdf?abstractid=1450006&mirid=5)

~~~
pasbesoin
Ohm's blog on "Freedom to Tinker".

<http://www.freedom-to-tinker.com/blog/paul>

I couldn't get SSRN to cooperate; the download buttons just kept returning me
to the abstract. BTW, Ohm himself admits SSRN has problems, in the comments of
the post where he announces the paper.

[http://www.freedom-to-tinker.com/blog/paul/anonymization-fai...](http://www.freedom-to-tinker.com/blog/paul/anonymization-fail-privacy-law-fail)

------
xel02
I guess this is analogous to a Heisenberg principle for computer science. You
either have information on a person which increases the likelihood of
identification, or you can have no information at all.

------
ams6110
It's the classic tautology: if more than one person knows something, it's not
a secret.

