

Amazon, Google race to get human DNA into the cloud - cjdulberger
http://www.reuters.com/article/2015/06/05/us-health-genomics-cloud-insight-idUSKBN0OL0BG20150605

======
noname123
I think there's been a lot of research/grants for work on population
genetics; specifically, sequencing the genomes and gene expression profiles of
a pool of patients with a particular disease (e.g., schizophrenia), comparing
them to the general population pool, and clustering to find particular groups
of genes that might be responsible for that disease.

Likewise, there are state agencies and startups alike working on pre-natal
genetic screening that basically tries to identify, from pre-natal samples,
particular markers (SNPs) known for Down Syndrome, Tay-Sachs, etc. Not to
mention people who submit their samples to 23andme to find interesting
markers.

I'd love to hear from folks in the field what they think the next big
application of computational biology and clinical genomics will be. Obviously,
opinion has been mixed on both the consumer side (the FDA barring 23andme from
making broad statements about users having a higher probability of cancer, due
to outdated and controversial data) and the research side (has the Human
Genome Project, since its completion in the early 2000s, yielded a significant
gene-to-target-compound-to-drug on the market?).

More practically, Illumina
([https://www.google.com/finance?q=NASDAQ%3AILMN&ei=af5xVYHoB4...](https://www.google.com/finance?q=NASDAQ%3AILMN&ei=af5xVYHoB4eTsQf64IDoBg))
has risen more than 10x in the past 6 years, riding the wave of demand and
growth of next-gen sequencing technology. If cloud computing is being hailed
as the sequencer of the next phase of the biotech revolution, what stocks
should I buy now to take advantage of it?

~~~
donovanr
A lot of the low-hanging fruit, in terms of statistical significance for
heritable disease, has already been picked, in the sense of a naive search for
SNP X in gene Y that causes disease Z. The more people continue to look for
single SNPs or even single gene causal factors, the more we learn that they
are of limited use, due to the complexity and redundancy of biological
interaction networks (genes/proteins/epigenetic modifications/environment/etc).

One thing that people are definitely still interested in regarding single
genes or SNPs is looking for biomarkers. For instance, it's pretty easy to
look at RNA abundance with RNA-seq and make a model with good discriminatory
power to predict who does or doesn't have (say) breast cancer. But those tests
can be expensive, and profiling the appropriate tissue can be difficult. On
the other hand, if we could find SNPs that correlate well with the RNA
expression model that predicts disease, then we could just do a
cheaper/faster/easier test for the SNPs. Even better if we can validate
causation independently using the emerging, though incomplete, corpus of
biological pathways.
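
A minimal sketch of that "find SNPs that track the expression model" idea, assuming you already have a per-person score from some RNA-based model and genotypes coded as 0/1/2 copies of the minor allele. The function names, SNP IDs, and toy numbers below are all made up for illustration; a real analysis would use a proper association test with multiple-testing correction rather than a raw correlation ranking.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A SNP with no variation in the sample carries no information here.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_snps(genotypes, scores):
    """Rank SNPs by |correlation| with the expression-model score.

    genotypes: dict mapping a (hypothetical) SNP id to a list of 0/1/2 calls,
               one per person, in the same order as `scores`.
    scores:    list of per-person scores from the RNA-based model.
    """
    ranked = sorted(
        ((abs(pearson(calls, scores)), snp) for snp, calls in genotypes.items()),
        reverse=True,
    )
    return [(snp, r) for r, snp in ranked]


# Toy data, invented purely to exercise the code.
toy_scores = [0.1, 0.2, 0.5, 0.6, 0.9, 1.0]
toy_genotypes = {
    "snp_a": [0, 0, 1, 1, 2, 2],  # tracks the score closely
    "snp_b": [1, 0, 2, 0, 1, 2],  # tracks it only loosely
    "snp_c": [1, 1, 1, 1, 1, 1],  # no variation at all
}

ranking = rank_snps(toy_genotypes, toy_scores)
for snp, r in ranking:
    print(f"{snp}: |r| = {r:.3f}")
```

The top-ranked SNPs would only be candidates for a cheaper test; whether they are causal is a separate question.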

Beyond that, some folks are excited about the field of functional genomics,
which aims to correlate genome-level data to the structure and function of
gene products (e.g., proteins) to get a more low-level, causal look at things.

Biological network extraction is something I'm excited about. Here, you
usually want to extract pairs of features, or better yet, higher order
structures from data to learn about what factors are important in disease
pathways. This puts you immediately in the regime of way more potential
features than data points, even if you sequence everyone in the world. I think
this consideration is worth keeping in mind when people try to sell you on the
idea that big data and machine learning will just "solve" biology or medicine. L1
regularization only gets you so far. But it gets you somewhere, and can be
hugely useful in suggesting new experiments.
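
To put rough numbers on the features-vs-data-points problem, here is a back-of-the-envelope sketch. The three constants are order-of-magnitude assumptions, not measurements: about 20,000 protein-coding genes, about a million SNPs of interest, and a world population of roughly 7.3 billion. Under those assumptions there are about 2e8 possible gene pairs (still fewer than people), but about 5e11 SNP pairs and about 1.3e12 gene triples, both far more than the number of humans you could ever sequence.

```python
from math import comb

# All three figures are rough assumptions for illustration only.
N_GENES = 20_000                  # approximate count of protein-coding genes
N_SNPS = 1_000_000                # order of magnitude for SNPs of interest
WORLD_POPULATION = 7_300_000_000  # approximate number of people, circa 2015


def n_feature_sets(n_features, order):
    """Number of distinct unordered feature combinations of a given order."""
    return comb(n_features, order)


gene_pairs = n_feature_sets(N_GENES, 2)
gene_triples = n_feature_sets(N_GENES, 3)
snp_pairs = n_feature_sets(N_SNPS, 2)

for label, count in [
    ("gene pairs", gene_pairs),
    ("gene triples", gene_triples),
    ("SNP pairs", snp_pairs),
]:
    ratio = count / WORLD_POPULATION
    print(f"{label}: {count:.3e} candidate features, {ratio:.3g} per person on Earth")
```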

I also think it's important to consider that it may soon be cheaper to store
actual RNA/DNA and sequence it on demand than to store the data itself. DNA
sequences would be a fine thing to sequence once and store, but for other
stuff (e.g., RNA), maybe not so much.

------
jobu
DNA databases in the cloud seem like a scary prospect, but they could also help
a lot of people.

A family member of mine was recently diagnosed with an MTHFR gene mutation
([http://en.wikipedia.org/wiki/Methylenetetrahydrofolate_reduc...](http://en.wikipedia.org/wiki/Methylenetetrahydrofolate_reductase)),
and it's been a life-changer. After years of misdiagnosis as depression,
hypothyroidism, or adrenal fatigue, it's nice to have some answers and see
results from getting the right treatment.

~~~
bduerst
Can insurance companies in the U.S. still reject or charge more for new
customers based on the patients' prior conditions?

That was one of the major reasons to keep patient genomic data offline.

 _Edit_ : Never mind, looks like it's not a problem anymore
([http://www.genome.gov/24519851](http://www.genome.gov/24519851))

~~~
stillaproblem
Nope, still a problem. GINA only applies to health insurance and employment.

[http://www.genome.gov/10002077#al-2](http://www.genome.gov/10002077#al-2)

"Where GINA Does Not Apply

GINA does not apply to employers with fewer than 15 employees. GINA's
protections in employment do not extend to the US military. Nor does it apply
to health insurance through the TRICARE military health system, the Indian
Health Service, the Veterans Health Administration, or the Federal Employees
Health Benefits Program. Lastly, the law does not cover long term care
insurance, life insurance or disability insurance."

While a few states have laws restricting the use of genetic information in the
underwriting of these forms of insurance, nothing addresses the use of this
information for other forms of discrimination. (Think credit redlining.)

Unfortunately, participating in any form of genetic testing is still a
terrible idea for Americans, as it can have real financial consequences. Data
are forever. Once this material is "out there," the law only provides recourse
in a few specific circumstances (and you have to have the resources to enforce
these rights through the legal system).

The only way to win is not to play.

------
click170
Perhaps I'm being naive, but I don't want my DNA in any database without my
express consent or a criminal conviction.

DNA is inherently personal, and while it's true you leave your DNA in almost as
many places as your fingerprints (skin flakes, stray hairs, etc.), DNA can
tell you much more about someone's medical conditions than their fingerprint
can.

I'm interested in personal medical devices, but not if that means leaking my
medical info to the companies selling said devices. That info is personal and
private and is nobody's business except mine and my doctor's.

~~~
amitutk
Your genetic information cannot be used to discriminate against you.

"The President has signed into law the Genetic Information Nondiscrimination
Act (GINA) that will protect Americans against discrimination based on their
genetic information when it comes to health insurance and employment."

[http://www.genome.gov/24519851](http://www.genome.gov/24519851)

~~~
jakeogh
At some point anyone can use it to construct a model of you (face shape
reconstruction exists now: litter in Hong Kong and your face may show up on a
billboard). We have just scratched the surface with regard to what can be
known about someone if you have their genetic code. Or what can be done...
like making person-specific bio-weapons. I wonder how soon individuals will be
able to edit their own DNA on the fly to moot some of this.

~~~
bduerst
That's almost as much science fiction as there being an "intelligence" gene.

Genetics is only one factor of your appearance - believe it or not, your
environment, lifestyle, and upbringing factor in heavily.

~~~
colonelxc
First Google hit for "face generation from dna":

[http://www.nytimes.com/2015/02/24/science/building-face-and-...](http://www.nytimes.com/2015/02/24/science/building-face-and-a-case-on-dna.html)

You're right of course, there is a lot more to appearance than genetics, but
these approximations do exist now, and will get better. jakeogh's litter
example is closer to reality than he probably intended, which just shows how
fast this technology is coming along.

> Nonetheless, the police in Columbia, S.C., last month released a sketch of a
> possible suspect. Rather than an artist’s rendering based on witness
> descriptions, the face was generated by a computer relying solely on DNA
> found at the scene of the crime.

~~~
jakeogh
It's reality. I should have included the link.
[http://www.scmp.com/lifestyle/article/1804420/hong-kong-litt...](http://www.scmp.com/lifestyle/article/1804420/hong-kong-litterbugs-shamed-billboard-portraits-made-using-dna-trash)

The point about genes vs environment/expression is right on; I assumed it was
common knowledge and therefore implied.

------
bbgm
(edit. Had the wrong link)

The NIH has documented guidelines [1] on genomic data sharing that may be of
interest.

1\.
[http://www.ncbi.nlm.nih.gov/projects/gap/pdf/dbgap_2b_securi...](http://www.ncbi.nlm.nih.gov/projects/gap/pdf/dbgap_2b_security_procedures.pdf)

------
stuxnet79
So glad I majored in bioinformatics / comp bio. The future looks bright =)

& FWIW I know of a few postdocs who are either currently employed or are in
the process of being poached by Google. It seems they are very serious about
this.

------
fao_
This seems interesting. How long until people are sequenced at birth and their
ailments predicted from that? (I know they're already doing it on a small
scale, but I'm talking of it as a routine procedure).

And how long until our memories can be stored, etc., ad nauseam?

Very exciting stuff!

~~~
eggie
Exciting but mundane. A lot has been invested but there have been very few
useful results. It turns out the genome offers very little more than you
already know from living in your body every day.

~~~
magicalist
There's a wealth of knowledge in there, it just turns out it's incredibly
complicated and everything is interrelated (exactly as you'd expect from a
system that evolved). The problem is that the media has been selling this idea
of personalized medicine for decades now on the premise that we'll find a
single gene for every malady and physical advantage and be able to then tweak
them at will.

------
Albright
Thanks once again for the lulz, CloudToButt extension.

------
FlaceBook
"Amazon, Google race to get your DNA into their databases"

------
anigbrowl
I didn't expect _The Matrix_ to become reality quite so fast :-/

