
Zip code better predictor of health than genetic code - mnkc
https://www.hsph.harvard.edu/news/features/zip-code-better-predictor-of-health-than-genetic-code/
======
hedgew
What a massive claim! Of course, after carefully reading the article several
times just to make sure, I found that it rests on one attention-grabbing line
said in one speech, by one person.

To prove this sort of claim, you'd need to challenge the millions of hours of
research geneticists have done, all the open data they've published, all the
medical advances they've fueled. Instead, without even considering that the
claim might be wrong, without any scientific humility, without linking to any
supporting research, Harvard confidently makes the most extreme claim: that
zip codes have more predictive power than genes!

Disgraceful and absurd that Harvard would publish something like this.
Disgraceful that I believed it even for a moment.

~~~
marchenko
At present, the claim made in the article is not as controversial within
public health as it may seem: zip codes are a better predictor of many health
outcomes, including life expectancy, than our present genome-based predictive
models. The public tends to over-estimate the predictive value of polygenic
risk scores, especially for complex conditions like metabolic syndrome. Zip
codes capture a great deal of environmental variance, plus a bit of genetic
clustering by shared ancestry. Most polygenic models fail to capture all of
the relevant genetic variance, and are uninformative about environmental
factors. Additionally, for some conditions, the magnitude of environmental
variance and the relative importance of environmental factors are greater than
the variance and effect size of genetic factors.

~~~
mcherm
The original article did not provide any supporting evidence for this claim.
Can you point us toward actual research to back up your statement?

~~~
marchenko
There are very few genetic risk score models that outperform traditional
observational disease markers (here[1] is a non-paywalled discussion of
cardiovascular GRS performance as an example). The best GRS results tend to be
in relatively genetically-homogeneous populations that are similar to the
population in which the GRS model was developed. In some cases, knowing the
ethnicity or simple family history of a patient can buy you a good portion of
the AUC of the relevant GRS.

So if you have a classifier like ZIP, that (1) epidemiologists have done a bit
of legwork correlating to classical markers like obesity (or their correlates,
such as income and dietary/smoking/prescription patterns) and (2) tends to
follow familial/ethnicity clusters in (3) a heterogeneous population, you can
amass a fair bit of predictive power on the cheap for complex disorders where
environmental variance plays a role, as well as beating the spread on
behaviourally-determined mortality/morbidity factors.

It is likely that the predictive power of GRS-based approaches will improve
for many conditions in the future (they are of course already powerful for
Mendelian disorders).

[1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4527979/
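
To make the argument above concrete, here is a toy simulation, not real data. Every number in it is an assumption chosen for illustration: a shared per-zip environmental effect with weight 1.2, a GRS that captures only part of the genetic signal (weight 0.3 measured versus 0.6 unmeasured), and a liability threshold of 1.0. The names `simulate` and `auc` are hypothetical helpers. It only shows that a cheap zip-level rate can have a higher AUC than a partial GRS under such assumptions, not that it does so in practice.

```python
import random


def auc(scores, labels):
    """Chance that a random case outscores a random non-case (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate(seed=0, n_zips=50, per_zip=80):
    """Return (grs_auc, zip_auc) for a made-up population. All weights are assumptions."""
    rng = random.Random(seed)
    grs_scores, zip_scores, labels = [], [], []
    for _ in range(n_zips):
        env = rng.gauss(0, 1)  # neighbourhood effect shared by everyone in this zip
        people = []
        for _ in range(per_zip):
            g_measured = rng.gauss(0, 1)    # part of the genetic signal the GRS sees
            g_unmeasured = rng.gauss(0, 1)  # part the GRS misses
            liability = (0.3 * g_measured + 0.6 * g_unmeasured
                         + 1.2 * env + rng.gauss(0, 1))
            people.append((g_measured, liability > 1.0))
        # First half of each zip is the "epidemiological legwork" cohort,
        # second half is the group we try to predict.
        train, test = people[:per_zip // 2], people[per_zip // 2:]
        zip_rate = sum(sick for _, sick in train) / len(train)
        for g, sick in test:
            grs_scores.append(g)
            zip_scores.append(zip_rate)
            labels.append(sick)
    return auc(grs_scores, labels), auc(zip_scores, labels)


if __name__ == "__main__":
    grs_auc, zip_auc = simulate()
    print(f"GRS AUC: {grs_auc:.2f}, zip AUC: {zip_auc:.2f}")
```

Shrinking the environmental weight or letting the GRS capture more of the genetic signal reverses the ordering, which is the same point as the remark about GRS approaches improving in the future.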

~~~
mcherm
I don't have the slightest doubt that a classifier like zip code (or
traditional observational disease markers) delivers much greater value than a
genetic risk score model because its accuracy is nearly as good while its cost
is a couple of orders of magnitude less.

I do have some doubts that the zip code actually gives BETTER prediction than
the genetic risk score. I have difficulty believing that if someone had done a
genetic profile on a patient and was willing to tell me either the patient's
zip code OR the genetic risk score, that I would be better off asking for the
zip code because it had greater predictive value. It's certainly not impossible
(because of environmental factors that correlate with zip code), but it is
surprising and I haven't yet seen actual research supporting it.

~~~
resu_nimda
Why do you have those doubts? Do you have any "actual research" to support
them?

As someone who doesn't really care either way, I will say that his arguments
have been more convincing than you simply casting doubt and demanding proof
based on what seems to be a gut feeling.

------
verbify
Bear in mind that people tend to live near family and near people of the same
ethnic group as them. So not only do zip codes include info on how rich you
are, and therefore your access to healthcare and good nutrition, they also
contain some information about your genetics.

------
wavegeek
The linked article just quotes someone saying this with no citation.

------
lngnmn
Probabilistic models based on naive gross oversimplification of vastly
complex reality at its best.

There is indeed a huge distance between DNA and non-genetic diseases, and any
attempt to bridge this gap with a naive probabilistic model will surely yield
nonsense.

Social and environmental factors are much more fundamental than actual DNA
(gene regulation is still poorly understood) for non-genetic diseases. So,
yes, statistically it is true.

------
will_pseudonym
This may be quite useful to health insurance actuaries in the years to come,
depending on state laws. Though I imagine in the long run, they would vastly
prefer a fully sequenced genome for all insureds, or IQ tests.

I wonder if we'll see those kinds of things in the future. I guess on a long
enough timeline, the question isn't if, but when.

~~~
timwaagh
isn't sequencing genomes expensive though? what if they could have 90% of
predictive power for the price of a filled out zipcode instead of requiring
everybody to do lab tests?

~~~
will_pseudonym
It is expensive currently. It has dramatically decreased in price over time,
though (from $2.3B in 2003 to $1,000 in 2014 [0]).

[0] https://techcrunch.com/2017/01/10/illumina-wants-to-sequence-your-whole-genome-for-100/
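
For a sense of scale, here is a quick back-of-the-envelope calculation that takes the two figures quoted above at face value and assumes a smooth exponential decline between them, which is a simplification. The function name `halving_time_years` is just a hypothetical helper.

```python
import math


def halving_time_years(start_cost, end_cost, years):
    """Years per halving of cost, assuming a constant exponential decline."""
    halvings = math.log2(start_cost / end_cost)  # how many times the cost halved
    return years / halvings


if __name__ == "__main__":
    # Figures quoted in the comment: $2.3B in 2003, $1,000 in 2014.
    t = halving_time_years(2.3e9, 1000, 2014 - 2003)
    print(f"Cost halved roughly every {t * 12:.1f} months")
```

Under those assumptions the cost halved about every six months, since a drop by a factor of 2.3 million is roughly 21 halvings in 11 years.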

