

A Decade Later, Human Gene Map Yields Few New Cures - MikeCapone
http://www.nytimes.com/2010/06/13/health/research/13genome.html?hp

======
carbocation
Thank you for posting this. NYT didn't allow comments on this article, so I'm
happy I can do so here.

Nick Wade's articles have long read as though he has an axe to grind against
the Human Genome Project and its progeny (HapMap and GWAS in general). First,
10 years is an awfully short time to go from the development of a scientific
_tool_ (the human genome map) to real-world medical treatments. I emphasize
tool because the genome, per se, is not really a discovery; it is a framework
that helps you make discoveries.

Then there is his failure to understand genetics, or refusal to do so. Take
the following sentence: "If each common disease is caused by a host of rare
genetic variants, it may not be susceptible to drugs."

Let's examine that assertion by way of example: hypercholesterolemia, a common
disease. Its rare familial forms -- and its common forms -- are caused by
dozens of different, often rare, mutations in _APOB_ and other genes like
_PCSK9_. If Nick Wade's assertion is true, then hypercholesterolemia would
probably be insusceptible to drugs, since presumably we would need dozens of
different drugs to target each specific mutation.

Except he's totally wrong. We just put them all on statins, regardless of the
causal mutation. And they work like a charm -- demonstrably reducing all-cause
mortality.

So the current evidence gives the lie to his claims. And this is just
scratching the surface. Nick Wade's articles have long made it clear that he
believes that rare variants are the only important ones. Never mind the fact
that we know where to look for rare variants thanks to the presence of common
ones. And common variants actually can have large effect sizes ( _PCSK9_ ,
anybody?). Etc.

~~~
ben1040
_First, 10 years is an awfully short time to go from the development of a
scientific tool (the human genome map) to real-world medical treatments._

Not only that, it's only really been in the last 3 or 4 years that sequencing
technology has exploded in terms of data volume, alongside modern computer
hardware that can analyze that data in any reasonable time frame. That's
what's enabled whole-genome sequencing of disease patients to become
practical.

When the human genome was published, the state-of-the-art sequencing machines
produced a remarkably tiny fraction of what we can get now, and the
state-of-the-art computer on which to crunch the data was a Pentium 3. The
publication of a human reference genome didn't, by itself, change the
fundamentals of how future research would be conducted.

~~~
p3ll0n
_in the last 3 or 4 years that the sequencing technology has exploded in terms
of data volume, as well as modern computer hardware that can analyze that data
in any reasonable time frame. That's whats enabled whole genome sequencing of
disease patients to become something practical_

As a computer scientist working at one of the major genome centers mentioned
in the NYT article, I can attest to ben1040's claim.

In the last five years alone, because of advances in sequencing technology,
we have moved from talking about genomic data in megabases (Mb) to gigabases
(Gb). Illumina's newest HiSeq instruments are capable of 300 Gb per run, 10x
more than their competitor ABI's SOLiD instruments, which were released as
little as 2 years ago!

------
antirez
Since we are among coders, my mental model of this is more or less the
following:

It's as if the Genome Project provided you with the assembler dump of a
program, but you don't know the CPU very well, nor what the program is doing
as a whole. You understand just a few spots, and starting from here it is
possible to make progress that would otherwise be impossible without the
dump.

So there is a great deal of work to do, but the dump is not useless at all...

------
chrismealy
Wade was the subject of a hilarious post on Language Log: "The hunt for the
Hat Gene" <http://languagelog.ldc.upenn.edu/nll/?p=1896>

_Nicholas Wade is an inveterate gene-for-X enthusiast — he's got 68 stories in
the NYT index with "gene" in the headline — and he's had two opportunities to
celebrate this idea in the past few days: "Speech Gene Shows Its Bossy
Nature", 11/12/2009, and "The Evolution of the God Gene", 11/14/2009. The
first of these articles is merely a bit misleading, in the usual way. The
second verges on the bizarre._

------
Eliezer
> _A medical team led by Nina P. Paynter of Brigham and Women’s Hospital in
> Boston collected 101 genetic variants that had been statistically linked to
> heart disease in various genome-scanning studies. But the variants turned
> out to have no value in forecasting disease among 19,000 women who had been
> followed for 12 years._

...that's really not good. This sounds like the translation is: "Routine
medical abuse of statistics so bad that 101 out of 101 'statistically
significant' results failed to replicate."

~~~
carbocation
You're right, it's not good. But it's also not _quite_ as bad as it's made to
sound. Here is the actual journal article: <http://jama.ama-
assn.org/cgi/content/abstract/303/7/631>

The conclusion from the abstract: "After adjustment for traditional
cardiovascular risk factors, a genetic risk score comprising 101 single
nucleotide polymorphisms was not significantly associated with the incidence
of total cardiovascular disease."

Since traditional risk factors include _family history_ , it's totally
unsurprising that a mere 101 genetic variants cannot beat them out. Since
novel deleterious mutations in cardiovascular genes are rare (the human
mutation rate is ~10^-8 per nucleotide per generation; unpublished data),
your family history contains virtually all of the information you're going to
get from genetics.

Also, we're talking about 101 common genetic associations - not 101 genes. To
put this in perspective, _APOB_ alone has over 101 known deleterious
variants. They didn't test all of these; they just tested 101 common-variant,
disease-associated SNPs. So the study is actually much less interesting than
it might be.

EDIT: Oh, not to mention that traditional risk factors include things like
LDL-cholesterol, the best-established causal risk factor for MI. The genetic
risk score includes many SNPs that influence LDL-C level. LDL-C is ~50%
genetically determined, 50% environmental. So knowing the LDL-C level is,
unsurprisingly, a more powerful predictor of MI risk than is knowledge of the
variants that influence LDL-C.
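
For the curious: a genetic risk score of this kind is conceptually just a
(possibly weighted) sum of risk-allele counts across the tested SNPs. A
minimal sketch, with made-up SNP ids and weights (not the ones from the
Paynter study):

```python
def genetic_risk_score(genotypes, weights=None):
    """genotypes: SNP id -> risk-allele count (0, 1, or 2).
    weights: optional SNP id -> per-allele effect size (e.g. a log odds
    ratio from a GWAS); defaults to an unweighted allele count."""
    return sum(count * (weights[snp] if weights else 1.0)
               for snp, count in genotypes.items())

# Hypothetical patient genotypes at three made-up SNPs:
patient = {"rs1111111": 2, "rs2222222": 0, "rs3333333": 1}
print(genetic_risk_score(patient))  # -> 3.0 (unweighted risk-allele count)
```

The study's point was that even the weighted version of such a score added
nothing on top of traditional risk factors.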

------
tocomment
How does one find a job discovering patterns in the genome? I'm not even
sure what search terms to use on Indeed.

From the article I'd get the impression there should be a huge demand for
statisticians to analyze the genome but I'm just not seeing that when I
search.

~~~
p3ll0n
At Baylor College of Medicine's Human Genome Sequencing Center, where I work
as a Software Engineer, one of our biggest current needs (software-wise) is a
working LIMS (Laboratory Information Management System) that tracks the
progress of DNA samples through our analysis pipeline (on both the chemistry
and software sides).

There is also a great need for Software Engineers/Mathematicians to improve
the analysis software and the algorithms behind it (primarily string
matching) to account for the advances in "chemistry" that companies like ABI
and Illumina are making in their sequencing technology.
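
To give a flavor of that string-matching core: aligning a short read means
locating it in a reference sequence. Production aligners (e.g. BWA, Bowtie)
use indexed structures such as suffix arrays or the Burrows-Wheeler transform
and tolerate mismatches; this naive exact scan is only a toy sketch:

```python
def find_read(reference: str, read: str):
    """Return the 0-based positions where `read` occurs exactly in
    `reference`, including overlapping occurrences."""
    hits = []
    pos = reference.find(read)
    while pos != -1:
        hits.append(pos)
        pos = reference.find(read, pos + 1)  # resume just past the last hit
    return hits

print(find_read("ACGTACGTTACGT", "ACGT"))  # -> [0, 4, 9]
```

The real engineering challenge is doing this for hundreds of millions of
short, error-containing reads per run, which is why the index structures and
the hardware both matter so much.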

These are both just on the "production" side of the process - i.e. the
processes and people that produce the first round of analysis and statistics
from the raw read data (sets of As, Gs, Ts and Cs). Further analysis that
looks for SNPs (single nucleotide polymorphisms, what Wade calls 'variant DNA
units'), carries out genome annotation, and eventually attempts to
statistically link those results and numerous others to disease traits is
carried out by teams of programmers and biologists/geneticists. And as the
data becomes increasingly large and complex, so too grows the role that
programmers and clever software play.

