
Semi-Supervised Knowledge Transfer for Deep Learning from Private Training Data - Katydid
https://arxiv.org/abs/1610.05755
======
laGrenouille
The title and abstract of the paper make it seem like their approach can
assure some form of absolute privacy. Once you get into the weeds of the
article, however, the guarantees are only probabilistic. This seems like a
problem because (1) most people will incorrectly assume the stronger form of
privacy, and (2) there is no objective way to determine the threshold
required to be considered "private enough" for a given application. These are
exactly the same issues I have with differential privacy, which is in fact
the framework the paper's guarantees are stated in.
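For reference, the paper's guarantee is (ε, δ)-differential privacy: for any
two datasets d and d' differing in a single record, and any set of outputs S,
the released mechanism M must satisfy

    Pr[M(d) ∈ S] ≤ e^ε · Pr[M(d') ∈ S] + δ

The δ term is exactly the "probabilistic" part (with probability up to δ the
ε bound can fail outright), and nothing in the framework tells you which ε or
δ counts as "private enough" for a given application.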

In the end, I could see this being useful for protecting private corporate
data, where the concern is that releasing an external model trained on
internal data would erode the perceived value of the dataset. Theoretical
guarantees that most data will stay private should be good enough for that
case. On the other hand, I would worry about using it on truly sensitive data
(such as medical records), where even one compromised datum is of high
concern.
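To make the shape of that guarantee concrete: the paper's aggregation step is
roughly a noisy plurality vote over the "teacher" models' predictions. Here's
a minimal sketch, assuming NumPy; the function name, the γ value, and the
teacher/class counts are my own illustration, not the paper's code:

    import numpy as np

    def noisy_aggregate(votes, num_classes, gamma=0.05, rng=None):
        # votes: one predicted label per teacher for a single query.
        rng = rng or np.random.default_rng()
        counts = np.bincount(votes, minlength=num_classes).astype(float)
        # The Laplace noise is the source of the probabilistic (not
        # absolute) guarantee: it masks any single teacher's influence
        # on the released label with high probability, not certainty.
        counts += rng.laplace(scale=1.0 / gamma, size=num_classes)
        return int(np.argmax(counts))

    rng = np.random.default_rng(0)
    votes = rng.integers(0, 10, size=250)  # 250 teachers, 10 classes
    print(noisy_aggregate(votes, 10, gamma=0.05, rng=rng))

Each released label spends privacy budget, which is why the student model is
trained semi-supervised on as few labeled queries as possible.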

------
dgax
It will be interesting to see how this plays out, but it's worth noting that
there is a healthy amount of suspicion, some fear, and even a little
animosity toward Google in the US health care industry. Opinions about
surveillance and marketing (mis)uses aside, getting data can be a long and
difficult process even for seasoned researchers in teaching hospitals.
Personally, I don't fancy giving them my medical records any time soon.

