
Accurate Genomic Prediction of Human Height - gwern
http://www.biorxiv.org/content/early/2017/09/19/190124
======
kickout
Good post/paper in general. Predicting traits in humans has been more
challenging than most geneticists want to admit. The good news is that most
highly heritable traits will become "solved" as we enter a new order of
magnitude of data availability (i.e. 5 million people instead of 500k).

Much work left to do though, and there are interesting methods being developed
in this space.

~~~
jonmc12
> there are interesting methods being developed in this space

Would you mind posting any relevant links? Is there general agreement in the
field that more data is the solution?

------
beefman
This result is explained by Hsu in a recent talk at the Allen Institute.[1]

[1]
[https://news.ycombinator.com/item?id=16386749](https://news.ycombinator.com/item?id=16386749)

------
gumby
This looks pretty good: digging into the supplement, it appears they controlled
for age, which is how they controlled for things like improved nutrition,
elimination of smoking while pregnant, etc.

------
guiambros
Is there any way to export data from 23andme, and find the SNPs and calculate
on your own?

Would be nice if you could predict the height of your kids at a young age.

~~~
scardine
Looks like we are not that far from the dystopian future depicted in the movie
Gattaca[1].

[1]
[https://en.wikipedia.org/wiki/Gattaca](https://en.wikipedia.org/wiki/Gattaca)

~~~
civilitty
Selective in vitro fertilization has been around for at least a decade now,
where you select fertilized eggs based on genetic markers.

It's not direct genetic engineering yet, but we're pretty much already in
Gattaca.

~~~
gwern
PGD has been in use to a very limited extent, testing on just 1 or a bare
handful of specific genetic diseases, along with basic checks for gross
abnormalities like microscope examination. Use of full SNP arrays is still
very new and no one has ever used them to select based on complex traits like
height.

~~~
Footkerchief
For anyone else who's wondering:
[https://en.wikipedia.org/wiki/Preimplantation_genetic_diagno...](https://en.wikipedia.org/wiki/Preimplantation_genetic_diagnosis)

------
petra
"Replication tests show that these predictors capture, respectively, ~40, 20,
and 9 percent of total variance for the three traits."

This seems very low, almost like saying genes don't have any practical
predictive power, right?
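
For a sense of scale, variance explained converts to a correlation by taking
the square root. The sketch below only does that arithmetic on the three
quoted figures; the function names are made up for illustration and nothing
here comes from the paper's own code.

```python
import math

def correlation_from_r2(r2):
    """Correlation between predicted and actual trait for a given R^2."""
    return math.sqrt(r2)

def residual_sd_fraction(r2):
    """Spread left unexplained, as a fraction of the trait's own SD."""
    return math.sqrt(1.0 - r2)

# Fractions of total variance captured, as quoted from the abstract.
for r2 in (0.40, 0.20, 0.09):
    print(f"R^2 = {r2:.2f}: r = {correlation_from_r2(r2):.2f}, "
          f"residual SD = {residual_sd_fraction(r2):.2f} x trait SD")
```

So 40 percent of variance corresponds to a correlation of about 0.63, while
the typical prediction error is still roughly three quarters of the trait's
standard deviation.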

~~~
jamesmishra
In his book Behave[1], biologist Robert Sapolsky explains the issue as
follows:

> Heritability scores are relevant only to the environments in which the
> traits have been studied. The more environments you study a trait in, the
> lower the heritability is likely to be.

He goes on for several pages, describing the ways the surrounding environment
can change an organism's gene expression, and I think this quote best
summarizes his point:

> Here’s a rule of thumb for recognizing gene/environment interactions,
> translated into English: You are studying the behavioral effects of a gene
> in two environments. Someone asks, “What are the effects of the gene on some
> behavior?” You answer, “It depends on the environment.” Then they ask, “What
> are the effects of environment on this behavior?” And you answer, “It
> depends on the version of the gene.” “It depends” = a gene/environment
> interaction.

[1]: [https://www.amazon.com/Behave-Biology-Humans-Best-Worst/dp/1...](https://www.amazon.com/Behave-Biology-Humans-Best-Worst/dp/1594205078)

~~~
StavrosK
Not very related, but Sapolsky's A Primate's Memoir is a very funny and
edifying read. Strongly recommended.

------
amelius
I didn't read the article, but isn't this like predicting country of birth?
I.e. how well is the prediction if you take someone from the US and let them
grow up in Africa?

~~~
moyix
Like many traits, height is a mix of genetic and environmental factors. Based
on things like twin studies, we know that 60-80% of the variation in height is
due to genetic factors [1]. This paper shows that you can predict height from
genetic information with an accuracy that is in line with that heritability
estimate. It doesn't mean that it will be 100% accurate all the time, since
environmental factors (e.g. nutrition) will account for the remaining 20-40%.

[1] [https://www.scientificamerican.com/article/how-much-of-human...](https://www.scientificamerican.com/article/how-much-of-human-height/)
"The short answer to this question is that about 60 to 80 percent of
the difference in height between individuals is determined by genetic factors,
whereas 20 to 40 percent can be attributed to environmental effects, mainly
nutrition."

------
michaelhoffman
Updated version at
[https://doi.org/10.1101/190124](https://doi.org/10.1101/190124)

------
jpalomaki
Step towards designer babies? Check the genes after fertilization, reject
embryos if the genes are not "right".

~~~
sszz
The models are built off of "~20k activated SNPs" - 20,000 separate
mutations, most of which have really small effects on height. "Designer
babies" are pretty far off (IVF is awesome and great, but not trivial, so it
would be risky to do this just to make sure a baby will be...probably
somewhere near average height). Engineering the right mutations with
CRISPR/Cas9-based gene editing would be astronomically expensive (and is
realistically impossible); filtering embryos for the right genetic background
would also be incredibly expensive and probably impossible, as you'd be
relying on the natural, stochastic arrangement of those 20,000 mutations to be
in sync with the vision you have for your baby.

~~~
gwern
Having small mutations is irrelevant; it could be dozens of large effects or
thousands of small effects, it doesn't matter, you get a normal distribution
anyway. The question is how much variance there is and how much of it you can
predict.
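
A quick way to see this is to simulate an additive trait with the same total
variance spread over a few dozen loci versus a couple of thousand. This is a
toy sketch, not the paper's model: equal effect sizes, allele frequency 0.5 at
every locus, and the names `simulate_trait` and `share_within_one_sd` are all
assumptions made for illustration.

```python
import random
import statistics

def simulate_trait(n_loci, n_people, rng):
    """Additive trait from n_loci equal-effect loci, scaled to unit variance."""
    # Assumption: every locus has allele frequency 0.5, so a person's total
    # allele count is Binomial(2 * n_loci, 0.5): the set bits in 2 * n_loci
    # random bits. Its mean is n_loci and its variance is n_loci / 2.
    effect = 1.0 / (0.5 * n_loci) ** 0.5
    values = []
    for _ in range(n_people):
        allele_count = bin(rng.getrandbits(2 * n_loci)).count("1")
        values.append(effect * (allele_count - n_loci))
    return values

def share_within_one_sd(values):
    """Fraction of people within one unit (one theoretical SD) of the mean."""
    return sum(abs(v) < 1.0 for v in values) / len(values)

rng = random.Random(0)
for n_loci in (30, 2000):
    trait = simulate_trait(n_loci, 5000, rng)
    print(n_loci, "loci:",
          "sd =", round(statistics.pstdev(trait), 2),
          "share within 1 SD =", round(share_within_one_sd(trait), 2))
```

Both cases give a standard deviation near 1 and a share within one SD close to
the roughly 0.68 expected of a normal curve (the 30-locus case is coarser,
since the trait takes only a few dozen distinct values).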

Or let me put it this way: how much do siblings (ie embryos) vary in height?
It's by quite a bit, often several inches. Most of this is due to genetics.
And this PGS is able to predict half of the genetic contribution. So...

(If you're curious, my best estimate is that embryo selection could boost
height by about an inch on average.)
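
A back-of-the-envelope version of that kind of estimate can be simulated.
This is only a sketch under loud assumptions, not the calculation behind the
figure above: the score explains 40 percent of population variance (the number
quoted elsewhere in the thread), siblings carry half of that variance between
them, height has an assumed SD of 2.8 inches, and all 5 embryos are viable.
The function name and parameters are hypothetical.

```python
import random

def expected_gain_inches(n_embryos, r2_population, height_sd_inches, n_trials, rng):
    """Average height gain from implanting the top-scoring of n_embryos."""
    # Assumption: among siblings only half of the additive genetic variance
    # is present, so the score's spread between embryos is sqrt(r2 / 2)
    # population SDs.
    score_sd = (r2_population / 2.0) ** 0.5 * height_sd_inches
    gains = []
    for _ in range(n_trials):
        # Scores are measured relative to the family average (mean 0).
        scores = [rng.gauss(0.0, score_sd) for _ in range(n_embryos)]
        gains.append(max(scores))  # gain over picking an embryo at random
    return sum(gains) / len(gains)

gain = expected_gain_inches(5, 0.40, 2.8, 20000, random.Random(0))
print(f"expected gain: {gain:.2f} inches")
```

Under these idealized assumptions the gain comes out at roughly an inch and a
half. The sketch ignores embryo attrition and any loss of predictor accuracy
within families, so treat it as an upper-end illustration of the order of
magnitude rather than a check of the estimate.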

> filtering embryos for the right genetic background would also be incredibly
> expensive and probably impossible

It would cost about $2000 for the biopsies and SNP arrays of the usual ~5
embryos, and can be done either now or in the next few years and could have
been done years ago if any real effort was put into it.

~~~
sszz
You would only be able to select from those 5 backgrounds, however, and the
predicted heights for those embryos will be drawn from a normal distribution;
you can only "design" from what you've sampled.

On the other hand, if you sampled millions of embryos, you could find the rare
few predicted to grow to be a 6'5" person -- but this would be very expensive
and basically impossible without synthetic embryos.

~~~
gwern
> You would only be able to select from those 5 backgrounds, however, and the
> predicted heights for those embryos will be drawn from a normal
> distribution; you can only "design" from what you've sampled.

Yes, and this is cheap and does lead to gains. (Specifically: around 1 inch.)

> but this would be very expensive and basically impossible without synthetic
> embryos.

First, this is not remotely what you said. Second, it's not true: you only
need more eggs and they don't have to be 'synthetic' whatever that means, and
there's a lot of work on inducing egg development from stem cells so in
another 10 years it may well be possible for parents to do massive selection
like that. Third, you don't need millions of embryos if you just want a very
tall person, as the advantage is cumulative over generations (which is the
critical insight behind Iterated Embryo Selection: it's much more efficient to
take a few hundred embryos through multiple generations of selection than it
is to try to brute force a single selective step). That's omitting any gains
from CRISPR or genome synthesis or other methods not yet thought of. Fourth,
why would any parent want that in the first place as that's into the realm of
potential health problems (even if we're assuming only male embryos) and
beyond the useful level of height advantages, especially when they could be
instead spending that count of embryos to maximize a weighted sum of all
health and other complex traits? (Remember, just because everyone talks about
doing embryo selection on a single trait at a time doesn't make that remotely
a good idea; there are big gains to selecting on many traits simultaneously.)
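
The brute-force point can be illustrated with the expected value of the best
of n draws from a standard normal, which grows very slowly with n. This sketch
covers only the single-step case, not Iterated Embryo Selection itself;
`expected_max_sd` is an illustrative helper using plain numerical integration.

```python
import math

def expected_max_sd(n, lo=-10.0, hi=10.0, steps=20000):
    """E[max of n standard normal draws], in SD units.

    Midpoint-rule integration of x * n * pdf(x) * cdf(x) ** (n - 1),
    the mean of the density of the largest of n draws.
    """
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
        total += x * n * pdf * cdf ** (n - 1) * dx
    return total

for n in (5, 100, 10**6):
    print(f"best of {n}: {expected_max_sd(n):.2f} SD above the mean")
```

Going from 5 to 100 embryos roughly doubles the expected gain, and going from
100 to a million only about doubles it again. That diminishing return is what
the comment contrasts with repeating a modest selection step over several
generations.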

