
Protecting privacy in genomic databases - ohjeez
http://news.mit.edu/2016/protecting-privacy-genomic-databases-0809
======
astazangasta
Argh. The amount of headache we must go through to lock down this information
is ridiculous compared to the risk. While this might be an issue for Bill
Clinton or Kanye West, the vast majority of people won't be terribly adversely
affected by breaches of their medical records.

A much better solution than encrypting every byte of data and vainly trying to
secure the impossible (to say nothing of purposely corrupting the data) is
to just step up enforcement: punish people who leak, sell, and buy private health
information more assiduously. Data breaches will happen; the social contract
is our best defense against them.

------
jamesblonde
I've worked with Biobankers/Bioinformaticians on this type of problem before
and none of them wanted us to go the Differential Privacy route. Why? The
vast majority of genomic diseases are rare diseases (with 10s-1000s of
carriers). Destroying information is not an option with small sample sizes.
For huge GWAS studies, it could be an option. But in the general case,
security is preferable over privacy when a choice needs to be made.
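
To make the small-sample concern concrete, here is a minimal sketch of the standard Laplace mechanism applied to a carrier count. The function names, the epsilon value, and the cohort sizes are illustrative assumptions, not anything taken from the article or from a real biobank. It only shows that the same noise that is negligible for a GWAS-sized count can be a large fraction of a rare-disease count.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)

def private_count(true_count, epsilon, rng):
    """Release a count with epsilon-differential privacy.

    A counting query changes by at most 1 when one person is added or
    removed, so its sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)

def mean_relative_error(true_count, epsilon, trials=10_000, seed=0):
    """Average |noisy - true| / true over many simulated releases."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += abs(private_count(true_count, epsilon, rng) - true_count)
    return total / trials / true_count

if __name__ == "__main__":
    epsilon = 0.1  # hypothetical privacy budget for a single query
    for carriers in (20, 1_000, 100_000):
        err = mean_relative_error(carriers, epsilon)
        print(f"{carriers:>7} carriers: mean relative error {err:.4f}")
```

With these assumed values the expected absolute noise is 1/epsilon = 10 regardless of cohort size, so the relative error is around half the signal for 20 carriers and vanishingly small for 100,000.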

~~~
alex_hirner
Differential Privacy has its limits for small data sets. Luckily, more
research is on the way to run arbitrary calculations on multiple datasets that
remain private in their entirety (basically homomorphic encryption, but
efficiently) [1]. Algorithms that work reasonably well are simple statistical
tests and ridge regression [2].

I think these innovations are rare instances where technology can really
solve a social problem.

[1] https://www.microsoft.com/en-us/research/microsoft-researchers-enable-secure-data-exchange-cloud/

[2] https://www.microsoft.com/en-us/research/wp-content/uploads/2016/06/SDE.pdf
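
As a rough illustration of what "computing on data that stays private" can look like, below is a toy sketch of an additively homomorphic scheme (textbook Paillier). It is not claimed to be the scheme used in the linked work, which may well use a different construction; the tiny primes, the function names, and the two "site" counts are all assumptions made for the example. The point is only that a third party can add two encrypted counts without ever seeing either one.

```python
import math
import secrets

# Toy key material for illustration only; real keys use primes of 1024+ bits.
P, Q = 1009, 1013
N = P * Q                    # public modulus
N_SQ = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p-1, q-1), private
MU = pow(LAM, -1, N)         # inverse of LAM mod N (valid for generator N + 1)

def encrypt(m):
    """Encrypt an integer 0 <= m < N using only the public modulus N."""
    while True:
        r = secrets.randbelow(N - 1) + 1   # random r in [1, N - 1]
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ

def decrypt(c):
    """Recover the plaintext using the private values LAM and MU."""
    return ((pow(c, LAM, N_SQ) - 1) // N * MU) % N

def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod N)."""
    return (c1 * c2) % N_SQ

if __name__ == "__main__":
    # Hypothetical carrier counts held by two separate biobanks.
    site_a = encrypt(17)
    site_b = encrypt(25)
    combined = add_encrypted(site_a, site_b)  # computed without decrypting
    print(decrypt(combined))
```

Only the key holder can read the combined total, and no noise is added to the data, which is why this family of techniques is attractive for the small rare-disease cohorts discussed above. The cost is that each supported operation has to be expressed in terms the scheme allows (here, addition only).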

------
hyperion2010
No scientist will ever use a database like that and no government agency will
fund it. Data provenance is absolutely vital for getting your work published
and good data accessible is infinitely more valuable than garbage accessible
to everyone. I say this as someone who is dedicated to open data and open
science. Privacy is something that science has to deal with. There may be a
set of questions that can still be answered with degraded data, but I bet that
if such a system were built scientists would still try to use the degraded
data to answer questions which it could not answer, wasting their time and
federal money.

------
gregw134
So they want to improve privacy by corrupting data...seems like a bad
solution.

~~~
throwanem
GWAS isn't precise in any case. Adding a few bits of entropy isn't going to
materially impair the quality of results.

