Hacker News

Not GP but I've worked in a similar industry.

For me, I knew how our data was anonymized. So while our system would be able to say "I have seen person 1234 at locations 4, 7, 9, 11 on dates x, y, z", we had absolutely no way of knowing who 1234 was or anything about them; even the unique identifier was just a hash.
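To make the "identifier was just a hash" point concrete, here's a toy sketch (all names and IDs hypothetical, not the actual system described above). An unsalted hash gives you a stable pseudonym, and if the input space is small it can be reversed by brute force:

```python
import hashlib

def pseudonymize(device_id: str) -> str:
    """Derive a stable pseudonym from a raw identifier via SHA-256."""
    return hashlib.sha256(device_id.encode()).hexdigest()[:8]

# The same input always maps to the same pseudonym, so traces under
# "person 1234"-style IDs stay linkable across locations and dates.
assert pseudonymize("device-42") == pseudonymize("device-42")

# If the input space is enumerable (phone numbers, MAC addresses, ...),
# an unsalted hash is reversible by trying every candidate:
target = pseudonymize("device-42")
recovered = next(
    f"device-{n}" for n in range(100)
    if pseudonymize(f"device-{n}") == target
)
print(recovered)  # device-42
```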

Obviously it depends on how much data you collect/store; personally, I don't think the things shown in OP are all that onerous (sex, age group, gender, race, time spent looking at the ad).



> So while our system would be able to say "I have seen person 1234 at locations 4, 7, 9, 11 on dates x, y, z", we had absolutely no way of knowing who 1234 was or anything about them...

Minor nitpick, but giving someone a nickname isn't the same as anonymization.

"Hey Bob, thanks for logging on. Did you know we've been calling you 1234 these past five years!"

When a passive recognition system _uniquely_ tracks and identifies a person, it's only a matter of time before that trace gets cross-referenced with something that names them.

(different story if the data gets aggregated, or you scrub the uid completely after some window)


A friendly reminder that there's no such thing as "anonymized data", there's only "anonymized until combined with other data sets".
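The "combined with other data sets" part is just a join on shared attributes. A minimal sketch with made-up data: a "de-identified" location trace plus any second data set that records the same places and times (check-ins, receipts, photos) is enough to pin a pseudonym to a name:

```python
# Hypothetical data. A pseudonymous trace of (location, date) visits...
trace = {
    "1234": {("loc4", "2024-01-02"), ("loc7", "2024-01-05"),
             ("loc9", "2024-01-09")},
}
# ...and a separate, named data set covering the same places and times.
checkins = {
    "Bob":   {("loc4", "2024-01-02"), ("loc7", "2024-01-05"),
              ("loc9", "2024-01-09")},
    "Alice": {("loc4", "2024-01-02"), ("loc8", "2024-01-06")},
}

# Re-identify each pseudonym by matching its visit pattern: if exactly
# one named person's check-ins contain the whole trace, that's a match.
for pid, visits in trace.items():
    matches = [name for name, v in checkins.items() if visits <= v]
    if len(matches) == 1:
        print(f"{pid} is probably {matches[0]}")  # 1234 is probably Bob
```

With real data the join runs on quasi-identifiers (ZIP code, birth date, gait, device model) rather than exact tuples, but the mechanics are the same.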


By definition, anonymization is supposed to be irreversible. What you're describing is de-identification (https://en.wikipedia.org/wiki/De-identification#Anonymizatio...).


Under this strong definition, anonymization barely exists in practice. Strong anonymization requires serious destruction of information (e.g. reducing all samples to a single average number), which is not what people in the ad industry do.



