
There are multiple definitions. The most basic (and most common) is k-anonymity [1]. Basically, for a given collection of data, you group by all variables that are already non-anonymous (the quasi-identifiers, like age, address, gender, occupation). If you end up with any group of fewer than k people (k=5 is a common choice), then every other data item in the data set linked to those individuals also becomes non-anonymous (PII).
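To make the grouping step concrete, here is a minimal sketch in Python. The function name `k_anonymity_violations`, the field names and the toy records are all made up for illustration; it only counts how many records share each combination of quasi-identifiers and reports the combinations with fewer than k.

```python
from collections import Counter


def k_anonymity_violations(records, quasi_identifiers, k=5):
    """Return quasi-identifier combinations shared by fewer than k records.

    Hypothetical helper for illustration, not taken from any library.
    """
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return {key: size for key, size in groups.items() if size < k}


# Toy data (invented): two people share one combination, one person is alone.
records = [
    {"age": 49, "postcode": "1011", "gender": "M", "diagnosis": "flu"},
    {"age": 49, "postcode": "1011", "gender": "M", "diagnosis": "asthma"},
    {"age": 32, "postcode": "2022", "gender": "F", "diagnosis": "flu"},
]

violations = k_anonymity_violations(records, ["age", "postcode", "gender"], k=2)
print(violations)  # {(32, '2022', 'F'): 1}
```

An empty result means the data set is k-anonymous with respect to the chosen quasi-identifiers; which columns count as quasi-identifiers is a judgment call the code cannot make for you.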

Even if every group has at least k people, though, information elements may still be non-anonymous if there is not enough diversity within a group. For instance, if every 49-year-old male in a given postal code and a given occupation has the same religion, then religion is non-anonymous for that group, according to l-diversity [2].
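A sketch of the simplest variant of that check (often called distinct l-diversity): each group must contain at least l different values of the sensitive attribute. The function name, field names and records below are assumptions for illustration only.

```python
from collections import defaultdict


def l_diversity_violations(records, quasi_identifiers, sensitive, l=2):
    """Return groups holding fewer than l distinct sensitive values.

    Hypothetical helper; implements only the "distinct" flavour of l-diversity.
    """
    values_by_group = defaultdict(set)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        values_by_group[key].add(record[sensitive])
    return {
        key: values for key, values in values_by_group.items() if len(values) < l
    }


# Toy data (invented): the first group is large enough but has a single religion.
people = [
    {"age": 49, "postcode": "1011", "religion": "A"},
    {"age": 49, "postcode": "1011", "religion": "A"},
    {"age": 49, "postcode": "1011", "religion": "A"},
    {"age": 32, "postcode": "2022", "religion": "A"},
    {"age": 32, "postcode": "2022", "religion": "B"},
]

exposed = l_diversity_violations(people, ["age", "postcode"], "religion", l=2)
print(exposed)  # {(49, '1011'): {'A'}}
```

Here the first group would pass a k=3 check and still leak the religion of everyone in it.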

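This can be narrowed down even more by t-closeness [3], which requires that the distribution of a sensitive value within each group stays within a distance t of its distribution in the whole data set.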
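A rough sketch of a t-closeness check, comparing each group's distribution of the sensitive value against the overall one. As a simplifying assumption it uses total variation distance, which is only a reasonable stand-in for unordered categorical values; the usual formulation uses Earth Mover's Distance. All names, the threshold and the data are hypothetical.

```python
from collections import Counter, defaultdict


def distribution(values):
    """Relative frequency of each value in a list."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def total_variation(p, q):
    """Half the L1 distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def t_closeness_violations(records, quasi_identifiers, sensitive, t=0.2):
    """Return groups whose sensitive-value distribution is further than t
    from the distribution over all records (hypothetical helper)."""
    overall = distribution([record[sensitive] for record in records])
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record[sensitive])
    distances = {
        key: total_variation(distribution(values), overall)
        for key, values in groups.items()
    }
    return {key: d for key, d in distances.items() if d > t}


# Toy data (invented): group "A" is all "x", group "B" is split evenly.
rows = (
    [{"group": "A", "value": "x"}] * 2
    + [{"group": "B", "value": "x"}] * 3
    + [{"group": "B", "value": "y"}] * 3
)

too_far = t_closeness_violations(rows, ["group"], "value", t=0.2)
print(too_far)
```

With this data only group "A" is flagged: knowing someone is in it tells you much more about their value than the overall distribution does.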

  [1] https://en.wikipedia.org/wiki/K-anonymity
  [2] https://en.wikipedia.org/wiki/L-diversity
  [3] https://en.wikipedia.org/wiki/T-closeness


