

Do we need a Human Data Project? - kumarski
http://kumar.vc/2013/06/20/do-we-need-a-human-data-projecthdp/

======
JunkDNA
There is a fundamental problem with opening up so-called anonymized data:
usually the best, highest-detail data can't be properly anonymized. Take
Fitbit data. You'd think that would safely be anonymous. But all I have to do
is tweet a few times about my progress and I bet you can identify me. I might
not have minded tweeting that I reached 1000 steps today. But now that
innocuous tweet links all my "anonymous" data submitted to the commons
directly to me.

Most people don't realize how easy it is to use scraps of public info to
identify people in data sets (the canonical example is here:
http://arstechnica.com/tech-policy/2009/09/your-secrets-live-online-in-databases-of-ruin/).
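
To make the linkage concrete, here is a minimal sketch, assuming a hypothetical "anonymized" step-count table and a couple of public posts from a known person. Every id, date, and number below is made up for illustration; real data sets are messier, but the idea is the same: each public scrap narrows the set of matching records.

```python
from collections import defaultdict

# Hypothetical "anonymized" commons data: (pseudonymous id, date, step count).
anonymized_steps = [
    ("user_a", "2013-06-01", 8421),
    ("user_a", "2013-06-02", 1000),
    ("user_b", "2013-06-01", 1000),
    ("user_b", "2013-06-02", 7312),
    ("user_c", "2013-06-01", 5230),
    ("user_c", "2013-06-02", 1000),
]

# Hypothetical public posts by a known person: (date, step count mentioned).
public_posts = [("2013-06-01", 8421), ("2013-06-02", 1000)]


def candidates(records, posts):
    """Return the pseudonymous ids whose records are consistent with every post."""
    by_id = defaultdict(set)
    for pid, date, steps in records:
        by_id[pid].add((date, steps))
    return sorted(pid for pid, seen in by_id.items()
                  if all(post in seen for post in posts))


# With only the second post, more than one id still matches.
print(candidates(anonymized_steps, public_posts[1:]))
# With both posts, the candidate set shrinks to a single id.
print(candidates(anonymized_steps, public_posts))
```

Once a single id remains, everything else filed under that id is tied to the person who posted.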

For research to be bioethically sound, people have to be able to give
_informed_ consent. Most people are ill-equipped to get their heads around
these ideas enough to actually be informed. I see this all the time with DNA
data. Even exceptionally bright researchers can delude themselves into
thinking that DNA sequence data isn't identifiable because you'd "need some
other DNA linked to the person to identify them". As if that's going to be a
problem in another few years.

I agree in principle it would be great to have vast troves of data to mine.
I'm just not sure how to square that with existing laws and regulations.

~~~
kumarski
That makes sense. I guess the system would have to be very much about informed
consent, or filter out data that makes it too easy to trace back to
individuals?

Nice Ars link.

~~~
JunkDNA
Just to close the loop on this thread, this summary by Zack Kohane is pretty
spot on:
http://commonhealth.wbur.org/2013/06/more-surveillance-health-records

------
danielsiders
This is one of the long term goals of Tent
([https://tent.io](https://tent.io)). Store all your highly personal data
somewhere you control it completely with the option to share it under certain
conditions with others.

~~~
dangoldin
Pretty cool - thanks for sharing.

I've always wondered why this information can't be kept in your browser and be
shared through that. That way the data is kept locally without even needing to
rely on any 3rd party service.

~~~
danielsiders
Plenty of data isn't part of the web, so really doesn't belong in browsers. It
would also make syncing data across devices more difficult as all devices
would need to be on at the same time and have sufficient storage capacity. You
certainly couldn't store Dropbox-scale data in the current generation of web
browsers.

------
k2xl
Also interesting to note that President Obama announced recently that
taxpayers received a crazy 800 billion dollar return on the 3.8 billion dollar
investment in the Human Genome Project.

~~~
wslh
Maybe he's talking about the value of patenting DNA ;-)

