
δ-presence, for when being in the dataset is sensitive - p4bl0
https://desfontain.es/privacy/delta-presence.html
======
majos
For those wondering how this compares to differential privacy, the author of
this post has also written a short nontechnical post about that:
https://desfontain.es/privacy/differential-privacy-awesomeness.html

The theoretical guarantees of differential privacy (e.g. resilience to
postprocessing or outside data, which removes the need for most threat
modeling) make it the one to beat from a privacy standpoint, IMO.
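
To make the postprocessing point concrete, here is a minimal sketch, assuming
a counting query with sensitivity 1 and the textbook Laplace mechanism. The
names `dp_count` and `postprocess` are made up for illustration, and naive
floating-point noise like this is not safe for production use.

```python
import random

def dp_count(true_count, epsilon, rng=random):
    """Release a count with Laplace noise of scale 1/epsilon (sensitivity 1).

    The difference of two independent exponential samples with rate epsilon
    follows a Laplace distribution with scale 1/epsilon.
    """
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise

def postprocess(noisy_count):
    """Round and clamp the released value.

    This only touches the already-noisy output, never the raw data, so it
    cannot weaken the guarantee of dp_count.
    """
    return max(0, round(noisy_count))

rng = random.Random(0)            # seeded only to make the demo repeatable
released = dp_count(100, epsilon=1.0, rng=rng)
print(released, postprocess(released))
```

A smaller epsilon means a larger noise scale, which is the accuracy cost that
comes with the stronger guarantee.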

~~~
bo1024
Thanks, yeah, that's what I was wondering -- it doesn't compare them very much
though.

The main complaint against differential privacy is that it's a very
restrictive definition. However, at least in theory, it often seems to turn
out that an algorithm that releases enough information to violate
differential privacy has released almost enough to blatantly violate any
privacy definition.

So the key challenge for any other privacy notion is to show circumstances
where it can be satisfied easily but DP can't (or where DP incurs a much
heavier accuracy loss).

~~~
TedTed
Hi, I wrote these articles.

It's doable to compare syntactic definitions (k-anonymity vs. k-map vs.
δ-presence, for example) and I tried to do that a bit. Differential privacy is
fundamentally different, as it is a property of the mechanism and not of the
output dataset.
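
As a rough sketch of what "syntactic" means here: all three definitions can
be checked by counting rows in the released table (plus, for k-map and
δ-presence, a larger population table). The function names and toy data are
made up, each row is reduced to its tuple of quasi-identifiers, the table is
assumed to be a subset of the population, and δ-presence is simplified to an
upper bound on the ratio only.

```python
from collections import Counter

def k_anonymity(rows):
    """Smallest group size among quasi-identifier tuples in the table."""
    return min(Counter(rows).values())

def k_map(rows, population):
    """Smallest population group size among tuples present in the table."""
    pop_counts = Counter(population)
    return min(pop_counts[r] for r in set(rows))

def delta_presence(rows, population):
    """Largest (count in table / count in population) over table tuples."""
    data_counts = Counter(rows)
    pop_counts = Counter(population)
    return max(data_counts[r] / pop_counts[r] for r in data_counts)

# Toy quasi-identifiers: (ZIP code, age range). Assumed data, not the post's.
population = ([("85535", "10-19")] * 5
              + [("85535", "20-29")] * 4
              + [("60629", "20-29")] * 10)
rows = ([("85535", "10-19")] * 2
        + [("85535", "20-29")] * 4
        + [("60629", "20-29")] * 3)

print(k_anonymity(rows))                  # 2
print(k_map(rows, population))            # 4
print(delta_presence(rows, population))   # 1.0
```

In this toy table, everyone in the population with ("85535", "20-29") is in
the dataset, so the ratio hits 1.0 even though the table is 2-anonymous and
4-map. None of these checks say anything about how the table was produced,
which is the part differential privacy constrains.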

I'm trying to fix this gap: my PhD thesis is about rephrasing syntactic
definitions in terms of relaxed versions of differential privacy. Typically,
I'm trying to get to a point where I can say "hey everyone, remember this old
definition that is easy to use but we don't really know what guarantee it
gave? Turns out it's differential privacy with weakened assumptions, so here,
have a formal guarantee for free". These are (I feel) natural questions that
are still open problems. I will most definitely write about them when I have
time (and solid results) =)

~~~
bo1024
Sounds like a great line of research!

------
donatj
I’m well aware this is an example and an extreme data set, but I feel like
the anonymized data set lost almost all of its value in the process. It might
as well not include age as a metric, since the ranges it includes are nearly
useless.

I would think that anonymizing ZIP Codes would be far more useful to
researchers in this particular case. Perhaps one could even invent a system
where nearby regions are coded as nearby, without giving away identifying
populations or perhaps even physical sizes.
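
One simple way to sketch the ZIP Code idea, assuming 5-digit US ZIP Codes
(where codes sharing leading digits tend to be geographically close): keep a
prefix and mask the rest, coarsening until every group has at least k
records. The helper names below are hypothetical, and this is not necessarily
what the linked post does.

```python
from collections import Counter

def generalize_zip(zip_code, keep=3):
    """Keep the first `keep` digits and mask the remaining ones with '*'."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)

def coarsen_until_k(zips, k):
    """Find the longest prefix length whose groups all have >= k records."""
    for keep in range(5, -1, -1):
        generalized = [generalize_zip(z, keep) for z in zips]
        if min(Counter(generalized).values()) >= k:
            return keep, generalized
    raise ValueError("fewer than k records in total")

zips = ["85535", "85536", "85540", "60629", "60630", "60612"]
keep, masked = coarsen_until_k(zips, k=3)
print(keep, masked)
```

Records that share a masked prefix stay comparable as "same region", but the
prefix still reveals roughly where the region is, so this only covers part of
the suggestion above.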

