
New vs. Old – a comparison of 23andMe’s health reports and the raw data - devonEnlis
http://www.enlis.com/blog/2015/11/06/new-vs-old-a-comparison-of-23andmes-health-reports-and-the-raw-data/
======
tempestn
Is there any way to have sequencing done anonymously? I would love to get the
data, but I'm not crazy about multiple corporations and/or governments having
a copy of my DNA forever, to do with as they like (privacy policies and such
notwithstanding).

Edit: Ah, it appears that anonymously handing over your DNA may not even be
possible: [http://www.technologyreview.com/news/509901/study-highlights...](http://www.technologyreview.com/news/509901/study-highlights-the-risk-of-handing-over-your-genome/)

~~~
jghn
Your edit touches on this, but a SNP panel, much less an exome or, God help
you, a genome, is hardly anonymous.

------
maxander
> "We are supposed to believe that the technology can tell us accurately if a
> person has 1 copy of a variant, but it cannot determine if there are 2
> copies? What does 23andMe report if there are 2 variants? Nothing?"

Actually, this sounds _eminently plausible_. As I understand it, a DNA
microarray consists of a bunch of probes that each transmit a signal when they
bind to a specific strand of digested DNA. A probe is either bound or not, and
since you're applying a mixture obtained from multiple cells, there are many
copies of each DNA sequence present in either case. There's no way you could
establish whether or not two copies of the same gene were present in an
individual cell.

I could be wrong (I'm a bioinformatician, not a biologist), and 23andMe could
be doing something interesting that reveals this data; in either case, I'd be
interested to hear. But this makes fairly bad advertising copy: everyone else
with the same cursory understanding of this technology is going to have the
same thought, and think that you _don't even_ have said cursory understanding
if you don't explain your reasoning better [1].

[1] For that matter, you come off as wildly unscientific simply because you
make this sort of conjecture without _citing sources_. There is a resource out
there, somewhere, that would confirm or deny what you're saying, and your
ability to make a credible scientific statement rests on you finding it.

~~~
devonEnlis
23andMe's genotyping chip is based on the Illumina Infinium beadarray
technology. ( [http://www.illumina.com/technology/beadarray-technology/infi...](http://www.illumina.com/technology/beadarray-technology/infinium-hd-assay.html) )

There is a nice animation of the technology on Illumina's website. It's a
single-base extension assay: a probe hybridizes to the DNA, and then a labeled
nucleotide is added at the position of the SNP you are studying. For an A/T
SNP, if an 'A' is incorporated you get a green signal; if a 'T' is
incorporated you get a red signal. If you end up with a mix of green and red,
you have a heterozygous variant.
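
To make the green/red idea concrete, here is a minimal sketch of how two channel intensities could be turned into a call for an A/T SNP. The function name, the fixed thresholds, and the minimum-signal cutoff are all assumptions for illustration; production callers cluster intensities across many samples rather than using hard-coded cutoffs.

```python
import math

def call_genotype(green, red, min_signal=0.2, het_band=(0.25, 0.75)):
    """Toy two-channel caller for an A/T SNP (green = 'A', red = 'T').

    All thresholds are hypothetical; real callers learn cluster
    positions per SNP from many samples.
    """
    if green + red < min_signal:
        return "--"  # too little signal in either channel: no-call
    # Normalized angle: 0.0 = pure green (A only), 1.0 = pure red (T only)
    theta = (2 / math.pi) * math.atan2(red, green)
    low, high = het_band
    if theta < low:
        return "AA"  # almost all green: homozygous A
    if theta > high:
        return "TT"  # almost all red: homozygous T
    return "AT"      # comparable green and red: heterozygous
```

The point of the sketch is that a mixed signal is distinguishable from a single-color signal, so in principle one copy and two copies of a variant give different readouts.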

Illumina has a tech note on the call rate and error rate here:
[http://www.illumina.com/Documents/products/technotes/technot...](http://www.illumina.com/Documents/products/technotes/technote_genotyping_rare_variants.pdf)
At a minor allele frequency of 0.5% their minor allele homozygous error rate
was 0.1%.

The nice thing about this technology is that the more data you have, the
better your calls get, and 23andMe has real-world data on over 1 million
customers. (The Illumina tech note used data from 2,000 genotyped samples.) I
think 23andMe has enough data to know which of the SNPs on their genotyping
chip are accurate, but they don't exactly open up their data for public
inspection.
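
As a rough illustration of why sample size matters for rare homozygous calls, the sketch below estimates how many samples of each genotype you would expect at a given minor allele frequency. It assumes Hardy-Weinberg equilibrium, which is an assumption added here and not something stated in the tech note; the function name is hypothetical.

```python
def expected_genotype_counts(maf, n_samples):
    """Expected genotype counts under Hardy-Weinberg equilibrium (assumed)."""
    q = maf        # minor allele frequency
    p = 1.0 - q    # major allele frequency
    return {
        "hom_major": p * p * n_samples,
        "het": 2 * p * q * n_samples,
        "hom_minor": q * q * n_samples,
    }

# At a 0.5% minor allele frequency, compare the two sample sizes mentioned above.
small = expected_genotype_counts(0.005, 2_000)
large = expected_genotype_counts(0.005, 1_000_000)
print(round(small["hom_minor"], 3), round(large["hom_minor"], 1))
```

Under that assumption, 2,000 samples are expected to contain well under one minor-allele homozygote, while a million samples are expected to contain roughly 25, which is enough to start seeing where that cluster actually sits.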

If you are a 23andMe customer you can see the improvements to the raw data
they have made over the years here: (
[https://www.23andme.com/you/download/revisions/](https://www.23andme.com/you/download/revisions/)
) If you can't see it, it says things like: "July 28th, 2014. Analysis of our
data has allowed us to improve the interpretation of over 10,000 SNPs genome-
wide on the V4 chip." When they get more data, they are improving their calls.

Finally, the FDA document about 23andMe's approved Bloom Syndrome carrier test
says that "all homozygous variant genotype samples receive a 'no-call' result,
since the calling software was designed not to detect homozygous variant
genotypes." It sounds to me like they designed the software to ignore and
throw out homozygous data.
[http://www.accessdata.fda.gov/cdrh_docs/reviews/DEN140044.pd...](http://www.accessdata.fda.gov/cdrh_docs/reviews/DEN140044.pdf)
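
Purely as an illustration of what "designed not to detect homozygous variant genotypes" could look like at the reporting layer, here is a hypothetical sketch. The function name and result strings are invented for this example and are not taken from the FDA document or from 23andMe's software.

```python
def carrier_result(genotype, variant_allele):
    """Hypothetical carrier report that masks homozygous-variant genotypes."""
    if genotype == "--" or len(genotype) != 2:
        return "no-call"              # missing or malformed genotype
    copies = genotype.count(variant_allele)
    if copies == 0:
        return "variant not detected"
    if copies == 1:
        return "carrier (one copy)"
    return "no-call"                  # two copies: deliberately not reported
```

In this sketch the underlying genotype is still known; the two-copy case is simply mapped to the same output as a failed call.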

------
geomark
What's going on here? 23andme are not allowed to report on the full data but
Enlis is? Or is Enlis flying under the radar until they get squelched by the
FDA?

~~~
devonEnlis
The information that we report is all publicly available from places like
Pubmed (
[http://www.ncbi.nlm.nih.gov/pubmed/](http://www.ncbi.nlm.nih.gov/pubmed/) )
and ClinVar (
[http://www.ncbi.nlm.nih.gov/clinvar/](http://www.ncbi.nlm.nih.gov/clinvar/) )

It's conceivable that one could look at each of the 600,000 lines in
23andMe's raw data and find these research reports individually.

We simplify that process and make the connections to research reports for you.
Then we wrap it up in a nice package where you can compare your family
members' data, get information on genes, generate PDF reports, etc.
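
The line-by-line lookup described above can be sketched roughly as follows. This assumes the usual 23andMe raw data layout (tab-separated rsid, chromosome, position, genotype, with `#` comment lines); the rsIDs, the annotation text, and the function names are placeholders, not real records or Enlis code.

```python
import csv
import io

# Placeholder data in the 23andMe-style raw format (made-up rsIDs).
SAMPLE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAA\n"
    "rs0000002\t2\t2000\tAG\n"
    "rs0000003\t3\t3000\t--\n"
)

# Hypothetical stand-in for a lookup table built from public databases.
ANNOTATIONS = {"rs0000002": "placeholder annotation"}

def parse_raw_data(text):
    """Yield (rsid, chromosome, position, genotype), skipping comments."""
    lines = (ln for ln in io.StringIO(text)
             if ln.strip() and not ln.startswith("#"))
    for rsid, chrom, pos, genotype in csv.reader(lines, delimiter="\t"):
        yield rsid, chrom, int(pos), genotype

def annotate(text, annotations):
    """Return (rsid, genotype, annotation) for every SNP with a known record."""
    return [(rsid, genotype, annotations[rsid])
            for rsid, _chrom, _pos, genotype in parse_raw_data(text)
            if rsid in annotations]

print(annotate(SAMPLE, ANNOTATIONS))
```

The hard part is not the loop but building and curating the annotation table.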

Unless the FDA wants to shut down public access to scientific knowledge, we
don't expect to hear from the FDA. Others have been operating in this space
for several years.

~~~
davej
Is there anything that Promethease doesn't already provide?

~~~
devonEnlis
From the website:
([https://www.enlis.com/personal_edition.html](https://www.enlis.com/personal_edition.html))

\-----------------

Here are some of the differences and advantages of Enlis Genome Personal:

It's a full software application. Rather than just a webpage or single report,
Enlis Genome Personal is a full application that you install on your computer
and load your genome data.

Ability to load multiple genomes and compare them side-by-side. You can load
all your family members' genomes together, or load multiple copies of your own
genome (e.g. if you have data from both 23andMe and Ancestry.com), or you can
compare your genome to available sample data.

Ability to generate custom disease and trait PDFs, suitable for email or
printing. A sample Macular Degeneration PDF is sent with every free genome
report.

For a particular disease or trait, this software will tell you how many known
disease positions were successfully sequenced in your data, and how many are
missing. For example, if you are interested in a hereditary disease like
Cystic Fibrosis, it will indicate there are 270 known variants that cause
Cystic Fibrosis. Your data covers 101 of those positions, and there is missing
data on 169 positions.

More extensive information on each SNP, like variant mammalian conservation,
or variant deleterious predictions. Much more information on the function of
genes and where in the body those genes are expressed.

User-friendly interface. Includes a genome browser so you can see your data in
the context of genes and chromosomes.

For 23andMe users - better quality control and more data: We discovered over
500 SNPs with inaccurate data in the 23andMe results, and we automatically
filter those out. See blog post here. We are the only service to identify and
analyze over 1,000 of 23andMe's proprietary insertions and deletions. Most of
these can have a health impact. See blog post here.

For users with whole genome or exome data: We give you the tools to make new
discoveries about your data. Advanced variation filter, phenotype explorer,
homozygous region analysis, and over 20,000 built-in gene categories. An
example - our founder used this software to discover a cause of a rare
phenotype in his whole genome data. How long did it take? Less than 30
seconds.
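
For the coverage count mentioned in the feature list above (270 known Cystic Fibrosis variants, 101 covered, 169 missing), here is a minimal sketch of how such a tally could be computed. The variant IDs are synthetic, the function name is hypothetical, and treating a `--` no-call as "missing" is an assumption, not a description of how Enlis Genome Personal actually works.

```python
def coverage_summary(known_variants, genotypes):
    """Return (known, covered, missing) counts for a set of disease variants.

    A variant counts as covered only if the genotype data has a real call
    for it; absent entries and '--' no-calls are treated as missing
    (an assumption for this sketch).
    """
    known = set(known_variants)
    covered = {v for v in known if genotypes.get(v, "--") != "--"}
    return len(known), len(covered), len(known - covered)

# Synthetic example sized to match the numbers quoted above.
known = [f"var{i}" for i in range(270)]
genotypes = {f"var{i}": "AA" for i in range(101)}
print(coverage_summary(known, genotypes))
```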

