
Twins get different results when they put 5 ancestry DNA kits to the test - Gaessaki
https://www.cbc.ca/news/technology/dna-ancestry-kits-twins-marketplace-1.4980976
======
truantbuick
None of the differences, whether split by company or by twin, seem egregious or
even particularly unsatisfactory. They cite that AncestryDNA measured each of
their DNA to be 99.6% similar, and really, that seems like about the error rate
you would expect for the Ancestry results they got. They got the exact same
regions and were only a percentage point or two off in each region's share.

It's not like they're fully sequencing every DNA sample they get.

> AncestryDNA found the twins have predominantly Eastern European ancestry (38
> per cent for Carly and 39 per cent for Charlsie). But the results from
> MyHeritage trace the majority of their ancestry to the Balkans (60.6 per
> cent for Carly and 60.7 per cent for Charlsie).

This part of the article especially seems like hair splitting considering the
Balkans and Eastern Europe tend to have a lot of overlap. In fact, "Balkans"
in particular is an extremely ambiguous linguistic term and can mean so many
different things to so many people.

~~~
endisukaj
> In fact, "Balkans" in particular is an extremely ambiguous linguistic term
> and can mean so many different things to so many people.

Not really, Balkan countries are well-defined and have been so for over a
century now. It's basically European countries that were under the Ottoman
Empire:

Greece, Albania, North Macedonia, Bulgaria, Romania, Serbia, Kosovo,
Montenegro, Croatia, Bosnia. One could also throw Slovenia in there but
historically speaking, they have always been pretty different from other
Balkan-folk.

~~~
truantbuick
"There is not universal agreement on the region’s components." [1]

"It can be difficult to define exactly which countries are included in the
Balkan States. It is a name that has both geographic and political
definitions, with some of the countries crossing what scholars consider the
'boundaries' of the Balkans." [2]

Almost the entire wikipedia article is dedicated to explaining the various
definitions of it and how it is a problematic term. [3]

[1]
[https://www.britannica.com/place/Balkans](https://www.britannica.com/place/Balkans)

[2]
[https://www.thoughtco.com/where-are-the-balkan-states-4070249](https://www.thoughtco.com/where-are-the-balkan-states-4070249)

[3]
[https://en.wikipedia.org/wiki/Balkans](https://en.wikipedia.org/wiki/Balkans)

------
jedberg
You don't need twins for this test. A single person can just sign up twice for
these services and get different results, even if they make a single sample
and split it between the two vials.

I thought it was pretty well known that these services are estimates only to a
large degree? These services only take samples, they don't sequence the entire
genome. Of course there will be errors when you have to extrapolate from a
sample.

Even for me personally, my results changed when my parents did it and linked
up to me, because their results were able to flow into mine.

~~~
benj111
I haven't looked into getting a DNA test at all. But I would expect it to be
accurate and repeatable.

~~~
ramraj07
Unlike what you might think, this is not a sequencing-based method, but
something based on DNA sticking to small parts of a chip coated with
complementary sequences. This method is super error prone; more importantly,
the error rate depends on who does it and with what level of care. So yes,
it's not going to be repeatable!

~~~
nkrumm
It's true that these companies are not offering "sequencing" per se but rather
are genotyping commonly polymorphic SNPs using an array or other genotyping
method.

These SNP arrays, however, are _exceedingly_ accurate and reproducible in
their calls. A common medium-density SNP array used for human genetics is the
Illumina CytoSNP-850K [1], which routinely gets >99.5% accuracy on calls. This
is at least as good, or better than, whole genome sequencing.

The challenge in ancestry assignment is not determining what is in the sample,
but rather in the interpretation and assignment of ancestral haplotypes.

[1]
[https://www.illumina.com/content/dam/illumina-marketing/documents/products/datasheets/datasheet_CytoSNP850K_POP.pdf](https://www.illumina.com/content/dam/illumina-marketing/documents/products/datasheets/datasheet_CytoSNP850K_POP.pdf)

~~~
yorwba
Since the test results for the twins in the article were 99.6% identical, that
seems to match what you say is the usual accuracy of SNP arrays. However, I
don't think that can be called exceedingly accurate and reproducible if you
consider the intended application.

>99.5% accuracy is great for a scientific measurement device that's only been
developed very recently; it's absolutely rubbish if you want to sell the
results to consumers.

~~~
jhall1468
If you feel that way about 99.5% accuracy, I'd advise that there are a number
of consumer products you absolutely shouldn't buy, including: kitchen scales
(small ones), measuring cups, store brand spirit levels, etc.

You can get considerably better accuracy, you just have to be willing to fork
over $1500 rather than $60. How much are you willing to pay to know whether
you're 31% German vs 30%?

~~~
ramraj07
The difference is that it's not 99.5% accuracy for the entire experiment but
for each of the hundreds of thousands of spots, which means there will be
thousands of spots that are not accurate in each chip. Not an apples to apples
comparison.
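
A rough sketch of that per-spot arithmetic, for anyone who wants to plug in their own numbers. The probe count (850,000, taken from the name of the array mentioned upthread) and the assumption that every spot fails independently at exactly 0.5% are illustrative assumptions, not vendor figures.

```python
def expected_miscalls(n_spots: int, per_call_accuracy: float) -> float:
    """Expected number of wrong calls on one chip.

    Assumes each spot is called independently with the same accuracy,
    so the expectation is simply n * (1 - accuracy).
    """
    return n_spots * (1.0 - per_call_accuracy)


n_spots = 850_000   # assumed probe count, for illustration only
accuracy = 0.995    # the ">99.5%" figure treated as exactly 99.5%

print(round(expected_miscalls(n_spots, accuracy)), "expected miscalls per chip")
```

The point of the sketch is only that a per-call percentage and a per-experiment percentage are different quantities: a small per-call error rate still adds up to a nontrivial absolute count on a large array.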

------
hatchnyc
> Lack of oversight

> Despite the popularity of ancestry testing, there is absolutely no
> government or professional oversight of the industry to ensure the validity
> of the results.

> It's a situation Gravel finds troubling.

What is up with this incessant clamoring for a nanny to supervise every little
thing?

This is published on a page where, thanks to "oversight", I have to click a
little spam box warning that this page, like every other damn website on the
internet, uses cookies. No thank you. I really don't want to pay more taxes
for the Central Bureau of Making Sure You Know You Might Be 7% Less Eastern
European than Your DNA Test Indicates.

~~~
panopticon
That's overly reductive. There are people making lifestyle changes and
healthcare decisions in reaction to the results of their DNA test, so the
accuracy of these tests may be more important than "woops, we placed you in
the wrong European ethnic group".

And I guess we remove all oversight now because of a bad example? Pack it in,
FDA. We're done monitoring food safety thanks to cookie banners.

~~~
dahfizz
You're being equally reductive. Obviously all of this exists on a spectrum.
Most reasonable people would agree that the FDA is a net positive. Most
reasonable people would agree that the government monitoring every private
conversation happening in the country and jailing everyone who tells a lie
would be a net negative. The happy medium for government checks on behavior is
somewhere in the middle.

Your parent makes the argument, very reasonably imo, that the overhead and
cost of regulating these types of company's results are not worth the upside.
People make lifestyle choices based on things they hear from palm readers - we
don't use the government to regulate and test the validity of psychics. I
think this falls into the same category.

~~~
panopticon
> The happy medium for government checks on behavior is somewhere in the
> middle.

That was my point.

> Your parent makes the argument, very reasonably imo, that the overhead and
> cost of regulating these types of company's results are not worth the
> upside.

I think we read different comments; all I got was a snarky condemnation of
government oversight.

------
Balgair
Semi-related: I was thinking about getting one of the newer iWatches, the one
with the heart-rate monitoring and all that jazz. To me, those features are
really cool and future-y. So, I waltz on down to the store and try one on. It
reads my heart-rate and it seems to be fairly accurate. Ok, well, let's test
it a bit. I start running in place. The heart rate goes up. I then just stand
there, after a bit, the heart-rate goes down. Nice, the thing actually works!
Well, they had a few of the iWatches sitting there, without anyone else taking
a poke at them. So, I strap all the free ones on the arm.

Well, they all gave different readings. I repeated the running in place test.
Some went up and down much more than the others; maybe 10 bpm more. I tried
re-arranging them on my arm and doing the running in place test again, you
know, maybe it only works on the wrist since the blood is closer to the
surface there? Nope, the readings were all over the place still. It was not
super scientific of me. I mean, I was some strange person running in place in
a store with a bunch of iWatches strapped to my arm.

Maybe the watches have some sort of learning algo in them that was getting
messed up with all the folks trying them on. Maybe I was wearing them 'wrong'.
Maybe I should have waited 10 minutes between jogging sessions. I don't know.

But, to me, these things have a _long_ way to go. A lot of the bio-tech and
bioengineering out there seems to be a bit of snake-oil right now. I really do
want it all to work, I think that would be a great boon to us all [0]. If
something like the iWatch can't give repeatable readings in nearly any way,
then it's all just a gimmick and not useful.

[0] I mean, can you imagine if Theranos' tech actually did work?! That would
change medicine for the better in incalculable ways. I can see why that
company's plan was so intoxicating.

~~~
odyssey7
When I tried on an Apple Watch at an Apple Store, I noticed that the heart
rate it displayed was definitely different from mine.

I mentioned it to the employee who was manning that table, and she told me the
demo watches had some pre-programmed data for demo purposes.

I was puzzled by this since heart rate monitoring is a key feature users care
about. It occurred to me that heart rate data of everyone who has tried on the
watch might count as the kind of medical information you shouldn’t publish on
demo watches.

Imagine ridiculous stories like “CEO of X’s heart rate was 130bpm when he
tried on the new Apple Watch, says journalist who tried it on right after.”

~~~
Balgair
Well, I'll be damned! Thanks for the explanation, I really appreciate it. I'm
still wanting to buy one, but now I don't know what to think. I wonder how I
am supposed to try one out before I buy one and to make sure it works.

~~~
ralfd
In a study two years ago Apple Watch was the most accurate tested device with
a median error of 2%.

[https://www.macrumors.com/2017/05/24/apple-watch-heart-rate-most-accurate-fitness-study/](https://www.macrumors.com/2017/05/24/apple-watch-heart-rate-most-accurate-fitness-study/)

------
inetknght
Disclaimer: I currently work for a genetics genealogy company mentioned in
this article. My comments are my own. Some relevant previous comment threads
of mine about other DNA related articles: [0], [1], [2], [3]

Personally, I would love to help create legislation aimed to protect consumers
from perceived and actual problems related to DNA sequencing and analysis. I
don't know _how_ though and I fear it'd fall to the back burner in the current
US political climate. Our customers are not only US but international though
and that presents yet another dimension to the challenges therein.

This article states:

> _Whatever your ancestry results, don't get too attached to them. They could
> change._

That is absolutely correct. I've stated why in a previous comment [1]; _Some
DNA analysis software employ stochastic algorithms. That means that the answer
they provide can be different if run more than once_.
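
To illustrate what "stochastic" means here, a toy sketch follows. It is not any company's actual algorithm: the marker labels, the 38% share, and the `estimate_share` helper are all hypothetical. It only shows that an estimate built on random subsampling moves from run to run unless the random seed is fixed.

```python
import random


def estimate_share(marker_labels, n_draws, seed):
    """Estimate the share of markers labelled 'A' from a random subsample."""
    rng = random.Random(seed)  # independent generator, seeded per run
    draws = [rng.choice(marker_labels) for _ in range(n_draws)]
    return draws.count("A") / n_draws


# Hypothetical genome: 38% of markers best match reference population "A".
markers = ["A"] * 380 + ["B"] * 620

run1 = estimate_share(markers, n_draws=500, seed=1)
run2 = estimate_share(markers, n_draws=500, seed=2)    # usually differs from run1
rerun1 = estimate_share(markers, n_draws=500, seed=1)  # same seed, same answer

print(run1, run2, rerun1)
```

With a fixed seed the result is reproducible; with a fresh seed each run the estimate wobbles around the true share, which is the kind of run-to-run drift described above.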

The article also states:

> _Despite the popularity of ancestry testing, there is absolutely no
> government or professional oversight of the industry to ensure the validity
> of the results._

That is also correct. I've heard peers state that the industry isn't regulated
outside of the field of medicine during product discussions. I think that's
something which needs to be addressed: leaving it unaddressed can encourage
predatory business behavior.

[0]
[https://news.ycombinator.com/item?id=18196717](https://news.ycombinator.com/item?id=18196717)
[1]
[https://news.ycombinator.com/item?id=18196984](https://news.ycombinator.com/item?id=18196984)
[2]
[https://news.ycombinator.com/item?id=18564380](https://news.ycombinator.com/item?id=18564380)
[3]
[https://news.ycombinator.com/item?id=18569659](https://news.ycombinator.com/item?id=18569659)

~~~
buboard
> there is absolutely no government or professional oversight

They are not allowed to sell health-related reports anymore in all of Europe
and other countries. While I think that some regulation and standards would be
good, banning it overall was a dumb measure.

~~~
inetknght
Can you quote a specific law for that?

~~~
buboard
[https://link.springer.com/article/10.1007/s12687-017-0344-2](https://link.springer.com/article/10.1007/s12687-017-0344-2)

[https://int.customercare.23andme.com/hc/en-us/articles/217696258-Will-I-receive-health-reports-](https://int.customercare.23andme.com/hc/en-us/articles/217696258-Will-I-receive-health-reports-)

~~~
inetknght
Excellent, thank you!

------
drugme
I wouldn't mind paying a fair sum for a service that would:

(1) reasonably guarantee that my data not only won't be distributed - but will
be destroyed (on their side) within a short time frame

(2) give me access to all of the raw data and methodologies used in their
analysis

Does anyone know of such a service?

~~~
_red
The only reasonable way of achieving those goals would be to allow totally
anonymous submissions. Mail in a sample and simply have a code that you keep
to check on results once they are processed.

The fact that none of these services allow for anonymous submission should be
a huge tell about their real motivations.

~~~
erentz
Anonymous submissions are in theory somewhat feasible, albeit cumbersome. You
can buy the 23andme kits with the full fee paid from certain pharmacies (I
think Target had them) using cash. This means you don't have to put down a
credit card when signing up. I leave the rest to the reader to work out wrt
anonymizing themselves while signing up to the website.

~~~
jellicle
Err, isn't the first thing they do to link up the new submission with every
other submission ever received in the past or future, almost none of which are
anonymous?

Sure, you can be anonymous, but you're the son of John and Jane Tompkins, the
brother of Emily Tompkins, the cousin of Mary Smith, and the father of Dave
and Buster Tompkins.... huh, I wonder who you are.

~~~
erentz
Very good point. Added requirement: Make sure all your ancestors and extended
family are dead and did not do 23andme before their passing.

------
kitbrennan
Misleading headline in my opinion...

Carly's results have a larger 'Broadly European' percentage, but that is a
catchall that includes: Italian, Eastern European, Balkan, French & German,
Iberian (Others category), and Broadly Southern European (Others category).
Therefore Carly's 'Broadly European' traits could be used to match Charlsie's
results.

It is unfair to call the 23andMe results 'different'. They are getting the
same results, however one twin is getting more specificity.

~~~
ssnistfajen
Why do people assume ancestry results from these DNA tests are "accurate"?
Human ethnicity is a continuum, not distinct pools of genes. We are applying
labels conceived by modern nationalism to several thousands of years of
migration and intermarriage. It would be great if people stopped treating
ancestry results from these tests as authoritative reference.

There was a controversy about Korean & Japanese ancestry results from 23andMe
where ethnic Koreans are sometimes classified as >30% Japanese. These two
groups shared common ancestors since ancient humans migrated via the Korean
Peninsula to the Japanese archipelago centuries before the modern nations of
Korea and Japan came into existence. Slapping either one of these specific
labels on shared ancestry is wildly inaccurate and short sighted.

~~~
acdha
> Why do people assume ancestry results from these DNA tests are "accurate"?

Years of highly visible advertising by these companies which implies they're
accurate? All of those “discover your ancestry” posters they put up around
here had specific countries and locations.

~~~
lancesells
Agreed. Just a quick look at the 23andme site doesn't show me anything that
says they are "estimates" or "not accurate".

The landing page shows me this headline "This new year, commit to a healthier
you - inspired by your DNA." I'm sure they have it legally covered somewhere
in their TOS but it's still not good if these are inaccurate or estimates.

------
derekdahmer
Their results seem well within the margin of error I'd expect out of a $100
DNA sequencing service. It's incredible that this kind of thing is now
available at this price point.

------
buboard
The great thing is that their DNAs were shockingly similar. It means the
sequencing is valid. Now the ancestry composition thing is known to be a
nonstandard procedure, just look at their research publications.

~~~
captncraig
I'm imagining some kind of machine-learning training set based algorithm for
determining the ancestry percentages. Even at 99.6% similarity (with the gap
due to mutations over time or sampling error) over 700k samples there's gonna
be ~2,800 sequencing differences between twins. Enough to make their ML system
draw different inferences for sure.

~~~
buboard
23andme has some publication about their method. the thing is , they have a
lot more data than what is available in public repositories so they basically
established their own system.

------
faitswulff
Not surprised. Some consumer DNA testing companies can't even tell when the
DNA submitted isn't human:
[https://www.nbcchicago.com/investigations/home-dna-kits-481292431.html](https://www.nbcchicago.com/investigations/home-dna-kits-481292431.html)

~~~
macinjosh
Haha, very interesting. I've always wondered what would happen if I sent in a
sample from my dog.

------
virusduck
99.6% is still awfully dissimilar, depending on what that number means.
Thinking about single polymorphisms, of ca. 3 billion bases, that is still
12,000,000 SNPs between twins. Of course, they aren't doing full genome
sequencing, and they are probably targeting loci that are known to be more
divergent. Still... that similarity seems awfully low for twins. I wonder what
they typically find for similarity between any two individuals. I also wonder
if their lab is a little disorganized....

~~~
klmr
Current estimates are that there’s a roughly 1 in 1000 difference between two
individuals purely due to random mutations in the germline (or, put
differently, the DNA replication error rate in human cells is about 1 in
1000). This would imply 99.9% expected similarity between two identical twins.
However, this is in the germline. Somatic cells accumulate mutations over
their life time (since they, too, divide and therefore need to replicate their
DNA, and are furthermore exposed to environmental stress). In addition, the
tests work off saliva samples which are mixed with all kinds of stuff in the
mouth, and degrade slightly during transport. Lastly, there’s an inherent
replication variability in the genotyping process (though this is _very_
small).

This back of the envelope estimate suggests that 99.6% similarity for somatic
samples of identical twins is not unexpected.

~~~
adenadel
This is not accurate. Coarsely, there are roughly 3 million SNPs (out of 3
billion total haploid bases) per individual. When we talk about individuals
"sharing" DNA we mean these variants and not the 99.X% of DNA that we all have
in common. When we refer to random mutations in the germline (i.e. de novo
variants) current estimates of de novo variants per individual are in the low
hundreds.

The other factors you discuss are more likely to cause issues. One major one
is the error rate of the microarray because of the high number of genotypes
being assayed. If the chip has an error rate of 0.1% and 500k tests are
performed you would expect 500 mis-calls. I used to work for Illumina, who
makes these chips, but I worked on the DNA sequencing side of things so I
don't know the actual error rate off the top of my head.

~~~
klmr
Right, de novo mutation rate is much lower (even lower than what you wrote,
just slightly above 1e-8 according to deCODE). The 1/1000 rate is the average
difference between two random individuals (not twins), which corresponds to
your 3M SNPs per 3B bases.

~~~
adenadel
Yes, but you referred to them as random mutations, which is not the case. They
are inherited variants. The DNA replication error rate is nowhere near 1 in
1000.

~~~
klmr
Hence my correction. Without looking it up now, the actual replication error
rate is probably close to the deCODE rate for de novo mutations (modulo gamete
selection, so probably higher than that).

------
VLM
Translating into IT / tech terms, people are VERY used to the concept of two
.wav files sampled from the same CDrom being bit identical and useful in a
court of law to prove both .wav files came from exactly the same music cd.

However people are really worked up about a new online service that encodes
multiple analog music sources into lossy mp3 samples using different analog
encoders each time, and the service sells an "identify matches similar to this
song" as a service that people think is the same level of exactness as the old
fashioned court-of-law .wav file bit by bit comparisons, but it's actually more
of a lossy best guess pattern matching service instead. Yes they DO use the
same general technology, mostly, but it's quite different in purpose and
outcome.

An even better analogy is we are VERY used to an online service that OCRs a
scan of a music CD and outputs the musician name every time based on OCR of
the disc itself, and now people familiar with that technology are VERY
confused by an app that listens to your cell phone mic and often squirts out
the musician name based on that raw analog sound sample.

It's more a miracle the new tech works at all, than a scandal that both
business models aren't identical in accuracy.

------
chillingeffect
I wonder why they didn't mention male microchimerism, a still poorly
understood phenomenon, in which male DNA is found in females. Possible sources
of gene transfer are "unrecognized spontaneous abortion, vanished male twin,
an older brother transferred by the maternal circulation, or sexual
intercourse"

[https://www.ncbi.nlm.nih.gov/pubmed/16084184/](https://www.ncbi.nlm.nih.gov/pubmed/16084184/)

[https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3458919/](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3458919/)

------
nanomonkey
I'd like to get this done for myself and my parents (to distinguish where the
different traits actually come from), but I'm a little wary of putting this
sort of information out there when we don't know how it will be used in the
future. And by whom.

Are there services that you can pay extra that will do a full DNA sequence and
have access to 23andMe's comparison data?

Or am I being paranoid?

------
AngeloAnolin
From the article (and the company themselves):

"The company said it approaches the development of its tools and reports with
scientific rigour, but admits its results are "statistical estimates."

It is very likely that there's fine print on their procedure saying that
everything is a guesstimate, and while their algorithms will process things the
same for every sample, there are just too many variations that are unaccounted
for even in such controlled environments.

What would surprise me, for example, is if someone sent in their DNA for
analysis twice to the same company and got varying results. That would mean
that the process itself is not well-established.

------
stcredzero
DNA in Europe is all mixed up. A lot of this is due to the Romans. There were
also a lot of other migrations and displacements.

I was watching a BBC documentary about how Rome influenced the history of
Scotland. The announcer, who has a very pronounced Scottish accent, took a DNA
ancestry test. The results (IIRC) indicated Germany, Italy, and eastern
europe. (The announcer interpreted this last one as Dacia) Come to think of
it, he doesn't look so different from Metatron.

The twin results not matching is kinda disturbing, though.

------
EnFinlay
I can't get past the sub headline:

> Twins' DNA 'shockingly similar'

Identical twins are supposed to have identical DNA, right? Or did I really
misunderstand a lot of school.

~~~
ralusek
No amount of schooling could have truly prepared you for precisely how similar
twins' DNA would be. Shockingly similar.

------
husamia
Saying the twins' DNA is "nearly exactly" the same is an oxymoron. Nearly is
not exact. The twins have "virtually identical" ancestry profiles, yet their
profiles are "shockingly similar".

They are comparing a low-resolution map of the genome. The genome is 3+ billion
nucleotides; only 700,000 (about 0.02%) are being compared to find a 0.4%
difference! That is shockingly similar!

------
devereaux
Techniques used to read DNA only read short segments, from which larger
segments are inferred.

Not surprisingly, the results are not reliable.

Feed the unreliable result into any classification algorithm, and you still
get unreliable results (actually I was expecting worse)

Not only that, but you only read small parts- SNPs. So there is no bijection;
just a rough correspondence. On the bright side, it's good for privacy.

~~~
virusduck
I think you underestimate the quality and power of short read sequencing. Most
well-designed sequence-based genotyping assays take into account the length of
the read and are easy to read out. (You are thinking of issues with copy
number variation and perhaps assembly of longer genes, which do have issues).
The sticky part is the analysis--how reliable are your databases and your
algorithm for representing the mix of SNPs and INDELs as %European...that is
the question.

------
drallison
They really should have done the obvious quality control test: each of the
twins should have done each DNA test several times, at least twice. Most
comments seem to assume that the test samples were uncontaminated and that the
sequencing was correct. Redundancy will help identify the source of the
differences.

------
ekianjo
I don't know if any company provides that, but I would like to have access to
my raw data (I guess, several Gbs worth of DNA pairs information) and use a
different 3rd party of my choice for the analysis. (This way there would be a
lot more competition for various ways to look at data).

------
jlarocco
TFA seems to be making a mountain out of a molehill to be sensationalist.

I don't think anybody's ever claimed this type of mail in test is 100%
accurate, and most of the results were within a couple percentage points of
each other.

------
phkahler
What are the chances that these two people are not identical twins? i.e.
fraternal. That could explain the differences in results from the same
company. But they should also be quite trivially declared siblings in that
case.

~~~
gwern
Mistaken zygosity is quite rare, in the single percent range, after doing some
tests like blood groups or hair whorls. In this case, the DNA overlap is
vastly higher than would be expected for siblings:

> According to the raw data from 23andMe, 99.6 per cent of those parts were
> the same, which is why Gerstein and his team were so confused by the
> results. They concluded the raw data used by the other four companies was
> also statistically identical.

If they were siblings, ~50% of the (several hundred thousand) SNP calls would
be the same, not 99.6%. (Which is about what you would expect from identical
twins, since that's where either one disagrees with the other, and errors
usually don't strike twice, so a 99.6% overlap roughly implies a per-SNP
accuracy rate of sqrt(0.996) = 99.8%, which sounds reasonable given the
difficulty of sequencing and the inherent noise & randomness.)
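
A small simulation of that square-root argument, as a sketch under stated assumptions: both twins share one true set of calls, each call is read independently, and a wrong read picks one of the other two genotype values at random. The marker count and the 0/1/2 genotype coding are illustrative choices, not details from the article.

```python
import math
import random


def noisy_read(true_calls, accuracy, rng):
    """Return the calls as one lab run would report them.

    Each call is correct with probability `accuracy`; otherwise it is
    replaced by one of the two other genotype values.
    """
    out = []
    for g in true_calls:
        if rng.random() < accuracy:
            out.append(g)
        else:
            out.append(rng.choice([x for x in (0, 1, 2) if x != g]))
    return out


rng = random.Random(42)
genome = [rng.choice((0, 1, 2)) for _ in range(200_000)]  # shared by both twins

per_snp_accuracy = math.sqrt(0.996)  # the back-of-envelope figure, about 0.998
twin_a = noisy_read(genome, per_snp_accuracy, rng)
twin_b = noisy_read(genome, per_snp_accuracy, rng)

concordance = sum(a == b for a, b in zip(twin_a, twin_b)) / len(genome)
print(f"per-SNP accuracy {per_snp_accuracy:.4f}, twin concordance {concordance:.4f}")
```

Two independent reads agree roughly when both are right, so concordance lands near accuracy squared; the rare case where both reads are wrong in the same way adds only a negligible amount at these error rates.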

On a fun side note, identical twins are not _exactly_ 100% genetically
identical, which is how rapist/murderer identical twins are being convicted
these days based on DNA evidence; but there are so few unique mutations to
each twin, and the new mutations wouldn't be on the common-SNP tests, that you
have to do the equivalent of like 4 whole-genome sequences (IIRC, you need
coverage of ~120x vs the more usual WGS of 30x) to get enough evidence to
override the prior of any difference being just a repeated sequencing error,
and it's very expensive.

------
ekianjo
> According to the raw data from 23andMe, 99.6 per cent of those parts were
> the same

99.6% is hardly an impressive figure. I would expect 99.9999% between twins at
least. Chimpanzees and humans already share about 96% of genes.

[https://news.nationalgeographic.com/news/2005/08/chimps-humans-96-percent-the-same-gene-study-finds/](https://news.nationalgeographic.com/news/2005/08/chimps-humans-96-percent-the-same-gene-study-finds/)

------
kbody
It makes sense. What I was wondering about, though, was the difference, if any,
in the raw DNA (SNP) data they provide.

------
dekhn
I would expect this level of variance in tests given the original twin genomes
are not identical.

------
socrates1998
This is a great reason why I won't get this test done. The math is bad and
(genetically identical) twins getting different results shows how bad the
science is right now.

This reminds me of when economists try to predict what the GDP will be next
year. It's almost always wrong, so why would you ever listen to anyone who is
always wrong?

It's mostly a scam, in my opinion.

~~~
devereaux
Actually, the results are not so bad. The technique is rough, but even some
basic MDS (multi dimensional scaling) on DNA gives things that look like the
actual maps, so there is some scientific base.

Also, you should not underestimate how not knowing where your ancestors came
from is disturbing for many people.

EDIT: for those who disagree, check what just the 1st two PC can give for
European DNA after a simple rotation of the axis to match a map :
[http://blogs.discovermagazine.com/gnxp/files/genmap1.jpg](http://blogs.discovermagazine.com/gnxp/files/genmap1.jpg)

I think it's fair to say there is a scientific base.
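
For the curious, here is a toy version of the idea on synthetic data. Everything in it is an assumption made for illustration: two hypothetical populations, made-up allele frequencies, and a hand-rolled power iteration standing in for a proper PCA routine. It only computes the first principal component, not the two used in the linked map.

```python
import random


def simulate_genotypes(n_people, allele_freqs, rng):
    """Genotype = number of copies (0, 1 or 2) of one allele at each marker."""
    return [[int(rng.random() < f) + int(rng.random() < f) for f in allele_freqs]
            for _ in range(n_people)]


def first_pc_scores(rows, n_iter=50):
    """Project each row onto the first principal component (power iteration)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    v = [1.0 / (j + 1) for j in range(m)]  # arbitrary non-zero start vector
    for _ in range(n_iter):
        # One step of v <- X^T (X v), then renormalise.
        scores = [sum(x[j] * v[j] for j in range(m)) for x in centered]
        v = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(m)]
        norm = sum(c * c for c in v) ** 0.5
        v = [c / norm for c in v]
    return [sum(x[j] * v[j] for j in range(m)) for x in centered]


rng = random.Random(0)
n_markers = 200
# Hypothetical populations whose allele frequencies differ at every marker.
freqs_a = [0.8 if j % 2 == 0 else 0.2 for j in range(n_markers)]
freqs_b = [0.2 if j % 2 == 0 else 0.8 for j in range(n_markers)]

people = simulate_genotypes(10, freqs_a, rng) + simulate_genotypes(10, freqs_b, rng)
scores = first_pc_scores(people)
scores_a, scores_b = scores[:10], scores[10:]
print([round(s, 1) for s in scores_a])
print([round(s, 1) for s in scores_b])
```

The two groups end up on opposite sides of zero along the first component, without the code ever being told who belongs to which population. Real populations differ far less sharply than this toy pair, which is why real plots look like gradients rather than two clean clusters.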

~~~
ralusek
Why should that be disturbing to some people? I think the vast majority of
people, at least in America, have no idea where their ancestors came from. Is
the usual "I'm part Irish part German part Cherokee" meaningfully more
satisfying than "I'm west African?"

I don't think it's a good thing for people to attempt to derive any aspect of
their identity from their genealogical ancestry, it has no bearing on who you
are as a person, and has the potential for great harm. I've seen it used to
give people a false sense of culture and belonging to a community that they
previously had nothing to do with. If you felt a connection to a particular
culture or identity, should we really be telling people that their genome is
responsible for determining whether or not they can associate parts of those
cultures with themselves?

Racism results from a failure to separate identity from genome. If you listen
to the dogma of a white supremacist, it's heavily predicated on associating
the accomplishments of their "European" forefathers with themselves. If you
listen to the dogma of a black supremacist, it's heavily predicated on
assuming the identity of the Egyptians or Israelites and associating their
accomplishments with themselves.

I really think that if an individual is truly "disturbed" by not knowing the
precise nature of their lineage, that is an individual in search of an
identity on a plate, and therefore highly corruptible. The motivation to learn
more about one's ancestry should ideally not surpass the level of an
intellectual curiosity.

~~~
devereaux
I wholeheartedly agree with your analysis on the risks.

I still think it can have a positive role, if only to help the person's
emotional investment.

> Is the usual "I'm part Irish part German part Cherokee" meaningfully more
> satisfying than "I'm west African?"

For that, "I'm west African" is less satisfying than "I did trace my roots to
Mali", which can lead the person to study the history of the country,
Timbuktu, etc.

------
TheArcane
So ancestry companies are a sham?

~~~
docker_up
Being off by a few percent from nothing more than a cheek swab doesn't seem
like a sham. I'm sure if they got a lot more sample to test from it could be
made more accurate.

Have you looked at things like thermometers? They don't offer more than ±2°F
accuracy for those under $100. Some meters, etc. only offer ±5% accuracy or
worse. Those aren't shams.

~~~
kodablah
In the article, one service reports 61% Balkan and another reports 23%. That
one algorithm is mostly accurate within itself is hardly notable. That they're
not close with one another means that one of them is way off or the industry
is very inaccurate. If the latter is correct yet they're purported to be
reasonably accurate, it's a sham.

~~~
mrkstu
My wife's percentages have changed over time without a new sample; they've
just refined their data and can give better results from the original data.

~~~
kodablah
Sure, but does it account for a large percentage change? If their reference
data is so limited that adding to it can result in monumental changes, then it
is currently inaccurate and a sham. If it isn't a large percentage change then
lack of refinement doesn't adequately explain the significant variance across
companies.

------
crb002
Mitochondrial DNA chimeras?

------
Improvotter
I've got a question for all of the Americans here. Why do you all care so much
where you come from? Why are these services so popular? I honestly am not
interested in my origins. Perhaps because I'm from Europe?

------
Dunedan
I'm wondering what would happen if you'd send a GDPR request for deletion of
personal data to such a DNA sequencing company. While you might not have sent
them any of your DNA, relatives of yours might have sent them theirs. Since
DNA material can also be used to infer characteristics of close relatives,
would DNA material of close relatives be considered personal data falling
under the rules of GDPR?

------
aj7
OK now you can see the S/N ratio.

------
wenc
This is one of those HN threads with a lot of misinformation by people who
don't know the subject area well but have a strong opinion anyway.

It would be helpful if folks would state their credentials/experience with
respect to the subject area in their comment.

Mine: none whatsoever, but I am interested in reading informed discussions
with folks with either credentials or experience in the subject matter.

~~~
OldSchoolJohnny
> This is one of those HN threads with a lot of misinformation by people who
> don't know the subject area well but have a strong opinion anyway.

So essentially just like every thread in every online discussion group then?

~~~
wenc
Well, computer-related topics tend to be ok around here... not always, but at
least there's a critical mass of folks who work in the area so the probability
of coming across an informed opinion is much higher.

------
mberning
The results are crap, but at least the national security apparatus has your
DNA to use against you in the future.

~~~
selimthegrim
They swab your cheek if you’re detained at a protest on federal property, why
would they need this?

------
dentemple
It sounds like these companies are running very inefficient machine learning
algorithms.

~~~
hannasanarion
Isn't all scientific endeavour "very inefficient machine learning algorithms"?

~~~
shawn-butler
"recreational scientific activity"

------
paulcole
These DNA kits are like astrology for people who think they're too smart for
astrology.

------
jmull
> ...suspects it has to do with the algorithms each company uses...

    switch (rand() % 100) {...}

