23andMe has signed a $300M deal with GlaxoSmithKline (businessinsider.com)
189 points by dwighttk on Jan 5, 2019 | 82 comments

I'm not exactly an expert in this field, but I work in big pharma, and I've seen the presentation 23andMe gives to drug companies. Basically, there are two ways GSK can use your data stored at 23andMe, and both require explicit opt-in. First is that they can basically just give access to huge databases. You might be surprised at how inefficient this will probably be for target hunting. There are already lots of GWAS studies that don't result in a single viable drug target.

The other way, and where I suspect the bulk of the $300M/4yrs is going, is in recruiting patients for clinical trials. GSK can tell 23andMe that they need 150 patients living near metropolitan centers who have indicated they suffer from X disease, and don't have Y mutation. This could drastically reduce the cost of recruiting for Phase 2 trials (though it may skew the patient populations in ways we haven't really had to deal with before). <edit> If I remember correctly, 23andMe contacts these potential patients to ask if they're interested in the GSK trial, then refers them to contact the GSK clinical trial site. They don't just give your phone number to GSK </edit>

I guess I'm not really sure what the panic is about. First, every step is opt-in. Second, the data provided en masse is blinded to patient identity and industry standard is to disincentivize unblinding the data. Am I too close to and brainwashed by the industry to see what's wrong here?
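To make the recruiting case concrete, here is a rough sketch of the kind of query described above ("N patients near metropolitan centers who report disease X and don't carry mutation Y"). Everything in it is an assumption for illustration: the `Participant` fields, the `eligible_for_trial` helper, and the sample records are made up, and nothing here reflects how 23andMe or GSK actually store or query data. The point is only that the filter can run on pseudonymous IDs and consent flags, with contact handled separately.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Participant:
    participant_id: str            # pseudonymous ID, not a name or phone number
    research_consent: bool         # True only if the person opted in
    metro_area: Optional[str]      # None if not near a metropolitan center
    reported_conditions: Set[str] = field(default_factory=set)
    variants: Set[str] = field(default_factory=set)


def eligible_for_trial(participants: List[Participant], condition: str,
                       excluded_variant: str, limit: int) -> List[str]:
    """Return up to `limit` pseudonymous IDs that match the trial criteria."""
    matches = [
        p.participant_id
        for p in participants
        if p.research_consent                      # opt-in is checked first
        and p.metro_area is not None               # lives near a metro center
        and condition in p.reported_conditions     # self-reported disease X
        and excluded_variant not in p.variants     # does not carry mutation Y
    ]
    return matches[:limit]


# Made-up records; "rs0000001" is a placeholder, not a real variant ID.
cohort = [
    Participant("p1", True, "Boston", {"asthma"}, {"rs0000001"}),
    Participant("p2", True, "Boston", {"asthma"}, set()),
    Participant("p3", False, "Chicago", {"asthma"}, set()),
    Participant("p4", True, None, {"asthma"}, set()),
    Participant("p5", True, "Denver", {"eczema"}, set()),
]

selected = eligible_for_trial(cohort, "asthma", "rs0000001", limit=150)
print(selected)  # ['p2']
```

Only `p2` survives: `p1` carries the excluded variant, `p3` never opted in, `p4` is not near a metro area, and `p5` reports a different condition.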

This comment is not correct as to what 23andMe is doing with the data.

The goal is to identify targets for new drugs, and it is known that drugs based on specific genes or mutations often have the best chance of success - look up, for example PCSK9 inhibitors. 23andMe, along with having genetic information, have very, very detailed phenotypic information on the people they sequence, which they collect in the form of surveys. This in turn lets them do GWAS-like analyses better than anyone else on earth. These GWAS hits can in turn go through a pipeline to find those genes which might confer protection, or susceptibility, to a particular disease. Then, it's up to GSK to make a molecule to mimic or inhibit that effect.

What should be clear here is that 23andMe is NOT giving GSK access to their data. They are giving them leads.

Source: work in research R&D in biopharma (with a particular interest in genetics)

Sorry, the blanket dismissal is not entirely correct. Companies like 23andme are indeed trying to repurpose their data to efficiently enroll participants in trials, as the parent post suggests. While it's not the focus of the GSK partnership as publicly reported, it is absolutely part of the strategy in this market, and arguably more likely to succeed in the long run. As far as target discovery: yes, there are examples of success with GWAS data (though probably more with rare variants identified in sequencing), but 23andme phenotypic data is rather craptastic vs traditional studies or biobanks. No doubt they'll find things, but you're overstating the power of their data. Source: work in drug discovery in academia; I've been pitched by companies like this and worked with a few.

Agreed re: target discovery. Most of the 23andMe data is not much better than what is already publicly available to researchers via initiatives like the UK Biobank. 23andMe uses the Illumina Global Screening array, which has significantly worse output (in terms of number of SNPs) than the array used for the 500k participants genotyped in the UK Biobank. Furthermore, these genotyping-based assays make it impossible to detect rare variants that have never been seen before and may be protective. The data being generated by Helix is probably much better for target discovery, but still not competitive with private datasets constructed by companies like Regeneron.

I don't know exactly what phenotypic data 23andme gets, but if it's through surveys then it probably isn't as good as the clinical data that, say, Regeneron gets for its genetic center through partnerships with healthcare providers. Also, 23andme's database is large but presumably much of their data is from young healthy people -- I don't know for sure but it seems a reasonable assumption. So it wouldn't be as useful in studying disease as the overall size of the dataset seems to suggest. It also isn't as useful as the data of a company like deCODE or Genomics Medicine Ireland that studies more homogeneous populations. And GWAS have been done for many years, so much of the low-hanging fruit has been picked.

The best advantage of 23andme's database is probably in studying things like the genetic basis of mild to moderate anxiety and depression, as those diseases are probably more prevalent in their population than others. Plus those are complex polygenic traits, so each individual gene would have a small effect size, and a big database like 23andme's would be useful.

However, that isn't great for drug discovery because drugs only hit one target. A bunch of genes with small effect sizes won't support much of a drug dev program.

I may be wrong as I'm not familiar with 23andme's database, but these are my deductions from what I do know.

The article says it's opt-out not opt-in, and the purpose is to find drug targets and clinical trial participants.

My concern is not only privacy but equity. Commercialization of my data without consent or compensation is my fear. Nice that it’s opt-in now. I hope it stays that way and that there’s not a massive breach.

Breaches can indeed be considered the new 'opt-in without consent'. I think people should have the right to delete their data before that happens. 23andMe seems to have a clear statement about it already:

"If at any time you are no longer interested in participating in our Services, you may delete your 23andMe account and personal data, directly within your Account Settings."

source: https://customercare.23andme.com/hc/en-us/articles/212170688...

The compensation is subsidized genetic testing. Without these deals, I'm sure the service would cost a lot more. I agree that being opt-in is important here, at the very least because people who signed on early did not agree to this. In the long term, if they changed it, I'm sure it'd require new terms, and that older users would be grandfathered into the opt-out.

>Commercialization of my data without consent or compensation is my fear.

I agree sort of, but how interesting is it that many people seem to really care when it’s their immutable genetic information to 23&Me - but when it’s their habitual and social information to Google or Facebook they seem to care a lot less.

I care about Google and Facebook as well.

>Commercialization of my data without consent or compensation is my fear.

You should have assumed when you spit in the tube that you waived all rights to protecting your genetic data.

I'm not saying they should be allowed to use it however, including selling, but I'm saying when you spit in a tube and send your DNA to a company... you should kinda assume the worst.

Recruiting people for drug trials seems unlikely to be a goal, but I could be surprised. I suspect they are planning to use sequence data to find drug targets. Sequence data has more potential than GWAS ever had, e.g. finding rare mutations.

"Blinded to patient identity" is true, but it can be relatively easy to re-identify these individuals using this data. So this data has your identity intertwined with potentially very private information. Some people intuitively desire to be as cautious as possible with their genetic information, and there is good reason for caution. Worst case it somehow ends up on the black market. You can't change your DNA, so if it gets out, there is no turning back.

Recruiting patients for clinical studies can actually be quite hard, especially if you are looking only for patients with a specific mutation. Many companies now focus on studying genetically defined patient populations because the signal-to-noise ratio is better. There is a massive shortage of patients for cancer studies now.

As mentioned, 23andme uses genotyping, not sequencing, so it's unclear whether their data is that valuable for target discovery. And while it's not unprecedented for pharma to spend $300M on a target ID deal, it is quite rare.

23andme uses genotyping and not sequencing, so you won’t be getting those rare mutations. That’s how they can keep the price so low.

I think your point still stands though, your genotype is still identifying information and should be safeguarded.


There is no reason to think that they are not sequencing some of these samples in their research programs, and/or waiting for sequencing to become cheap enough to do so. Considering that there are numerous large GWAS studies around, why would pharma pay big money for access to this data if it were limited to GWAS? They might currently only sequence specific genes and/or specific individuals, guided by the GWAS data, but as sequencing gets cheaper there is nothing stopping them from full genome sequencing in their research.

Their prices are low for the "consumer" (product). The $300 million GSK deal is an example of how they can make real money.

When you sign up they ask you if they can store your DNA sample for up to 10 years. I assume this is so they can sequence it once it becomes cheaper.

> You can't change your DNA


Supposedly. That we know of.

There's probably already some mad biohacker out there doing this as we speak and being called a loon by everyone who knows them.

> Sequence data has more potential than GWAS ever had, e.g. finding rare mutations.

Sequencing studies really don't show this.

Sequencing finds rare variants. GWAS finds loci with common variants. We know that for many common diseases, the population-level variability due to rare variants is small. If you want to find drug targets, there is plenty of reason to use both approaches.

This article is highly disingenuous.

23andMe asks you during sign-up if you wish to take part in anonymized research. They ask you again for each person whose sample you submit. You can also opt out at any time.

The article author even knows this, since when they asked Ancestry (23AndMe's owner) that was exactly what they were told. So why does the article have such a hyperbolic tone and misleading information?

I don't have high expectations for Business Insider but this article still disappoints me. This is tabloid level "journalism."

AFAIK 23andme and Ancestry are competitors. That would explain why the article asked Ancestry what they do but doesn’t use the answer to make its argument, as each company may treat sharing differently.

>AFAIK 23andme and Ancestry are competitors

I'm not sure I'd call them competitors. Yes, they both take your spit and process it, and yes, they both allow you to see potential extended family, but those are about the only similarities.

Ancestry added it as an extension of their genealogical services while 23andme was originally for personal medical discovery before the FDA came down on them and they had to change tactics.

Writing articles that blame pharma and biotech for stealing people's private data is the hot thing now. It increases page hits. Same with blaming Google and Facebook for stealing user data; it's the fear du jour.

While there are legitimate concerns, 23&Me is pretty good in terms of privacy practices and their founders/employees are working in good faith to improve the health of humans. It's kind of sad that gets twisted into "companies are profiting from your personal data", because while that is true, it's not nearly as invasive or Gattaca-like as people seem to think.

In this case, nobody's stealing anything. People are opting to give them not only their information, but those of their descendants as well. Apparently they think that some sort of "privacy clause" (written by the corporation's lawyers no doubt) will remain intact if things get financially tight, and that the courts and governments will step in if the businesses try to take advantage of their knowledge in a way detrimental to the customer.

But we all understand that technology is only going to make more information available from those samples over time. The board has a fiduciary responsibility to its shareholders, and it doesn't take a great imagination to see that if the information becomes valuable in new ways then there will be pressure to use it however they can.

What we have here is a baby step.

> since when they asked Ancestry (23AndMe's owner

Ancestry is not 23andme’s owner. They are competitors.

> They ask you again for each person whose sample you submit.

You can submit a sample for someone else?

Sure, why not? I've seen friends assist their parents many times.

How much of a population’s DNA needs to be mapped before you can reliably triangulate the identity of any individual by comparing to known DNA profiles? For instance if I have two cousins on different sides in the database, can they look at my DNA and narrow down the possible matches to me and my siblings of the same sex? What about more distant relations?

(I’m thinking about how police caught that serial killer recently because some of his relatives had used one of these sites)

That's how a local cold murder case was solved. DNA from a 25-year-old crime scene was run against a genealogy database, and found a familial match. Police set up a sting to obtain DNA from that person's relatives. They found a DNA match with the brother's water bottle taken from the trash. The brother had never submitted a DNA sample to any databases himself.


The alleged killer was the DJ at my wedding. The trial's in May.

This depends entirely on the population. Rare individuals with unique DNA markers are easier to re-ID than individuals with more common DNA markers.

But also consider that ethnicity is relatively easy to get from genetic data. So individuals with unique ethnic combinations can be trivial to identify by their genetic data. Now consider that genetic data can also be used to predict traits such as height, hair color, eye color, etc [0].

Reliable triangulation is already possible, and becomes increasingly more accurate as more data becomes available.

0: E.g. this company: https://snapshot.parabon-nanolabs.com/

Title requires reference to this article being from July 2018

I can't believe 23andMe isn't free yet. Or more accurately, that a competitor hasn't come in and offered a free version which would force 23andMe to make their product free. The fact that they can get away with _charging you_ to collect and sell your genetic information is nuts. Same goes for Amazon's Alexa.

This is almost the case, as far as just the raw SNP genotyping goes. Have you seen the rash of new startups like Nebula or Helix? They can be 'free' in some cases. Which makes sense as the genotyping cost has plummeted to like $20-50 in bulk.

But you're underestimating that one of the big draws of 23andMe is their detailed ancestry analysis - which isn't going to be replicated overnight by a competitor - and their related social networking/genealogy aspects, which definitely isn't replicable overnight. I have zero interest in that but I still get contacted every couple of months on 23andMe by genealogy hobbyists looking for more information. (And they also have FDA-compliance on some of their health reports, which isn't easy either.)

Well, this aspect of their business is 100% opt-in, apparently. Maybe a discounted pricing tier would make sense if you opt in, but it's not as if you are the product. I don't own or understand Alexa because I personally wouldn't use one, so I can't comment.

I'm happy that 23andMe is using my data for drug research. I don't want companies to read my private messages, but drug companies are trying to fix things in me, and ask for a lot of money in return, which is more fair than what many other companies are doing.

Same here. I care about privacy. But in this case I'm willing to trade that for contribution to medical research.

Am I being too optimistic?

I don't think it's possible to be too optimistic. Naivety could be an issue though.

BTW, you can have 23andMe delete your data, including the genetic data they have stored. I did this a couple of years back and it was fairly straightforward. Sounds like it is even more so now:


Also, at the time you could download an export of your genetic data and that is probably still the case.

Do you know if it's relatively easy to analyze your genetic information yourself instead of relying on a third party?

I used Promethease: https://promethease.com/.

It basically takes your genetic information and maps it to what is in SNPedia: https://snpedia.com/.

The problem with this approach is that what you are getting isn't really analysis, and you will get lots of conflicting data and information that is not possible to interpret without specialized knowledge. Still, it can be both fun and interesting.

Sure. There are lots of SNP databases out there. The NIH has a great repository, for instance. There’s even a wiki of most known SNPs, with descriptions.

It’s kind of labor intensive without automation. But, I’m sure someone has written a script to compare raw SNP data to SNP databases.

Things to keep in mind: 1) raw SNP data contains errors. 2) A particular SNP may be associated with condition X, but many other SNPs could also be associated with X. And, rarely is it understood how they work together (since they’re often located in different genes). 3) a lot of SNP research is done using small samples, i.e. not much confidence. (A GWAS will produce more confidence, but also identify fewer SNPs of interest.)
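For what it's worth, the comparison script mentioned above doesn't need to be much. Here is a minimal sketch, assuming the raw export is a tab-separated text file with `#` comment lines and four columns (rsid, chromosome, position, genotype) and that no-calls are written as `--`. The `SNP_DB` dictionary, its `risk_allele`/`note` fields, and the rsIDs are invented placeholders standing in for whatever database you download; they are not real associations.

```python
def parse_raw_genotypes(text):
    """Parse raw genotype text into {rsid: genotype}.

    Assumes tab-separated columns: rsid, chromosome, position, genotype.
    Lines starting with '#' are comments; '--' marks a no-call and is skipped.
    """
    genotypes = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        rsid, _chromosome, _position, genotype = line.split("\t")
        if genotype != "--":
            genotypes[rsid] = genotype
    return genotypes


def annotate(genotypes, snp_db):
    """Return (rsid, genotype, copies_of_risk_allele, note) for each match."""
    hits = []
    for rsid, genotype in sorted(genotypes.items()):
        entry = snp_db.get(rsid)
        if entry is None:
            continue  # SNP not described in the reference database
        copies = genotype.count(entry["risk_allele"])
        if copies:
            hits.append((rsid, genotype, copies, entry["note"]))
    return hits


# Made-up example data; these rsIDs and notes are placeholders.
RAW = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAG\n"
    "rs0000002\t2\t2000\tCC\n"
    "rs0000003\t3\t3000\t--\n"
    "rs0000004\t4\t4000\tTT\n"
)
SNP_DB = {
    "rs0000001": {"risk_allele": "G", "note": "made-up association with trait X"},
    "rs0000002": {"risk_allele": "T", "note": "made-up association with trait Y"},
    "rs0000003": {"risk_allele": "A", "note": "made-up association with trait Z"},
}

hits = annotate(parse_raw_genotypes(RAW), SNP_DB)
for hit in hits:
    print(hit)  # ('rs0000001', 'AG', 1, 'made-up association with trait X')
```

A lookup like this only tells you that a database entry exists for a genotype you carry. It does nothing about the three caveats above (genotyping errors, many SNPs per condition, small studies), and a real script would also have to check that the database reports alleles on the same strand as the raw file.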

No, since an analysis is itself a value judgement. What I mean is that a DNA profile is becoming even more useful every day due to additional research telling us what different parts of the genome are doing and what interactions they have (with drugs, food, environment, other parts of the genome).

So you can read the raw DNA sequence yourself but it is the link into thousands of pieces of academic research that is the real value delivered here. Without that it is just a bunch of letters that mean nothing.

If you are an EU citizen, GDPR applies, which means you can request deletion of your personal data at any time (unless there is an obligation by law to keep that data, for example invoices).

I just don't know how well it can be enforced with businesses that don't have an office in EU land.

This article is from June 2018 - I've seen the news go around social media in the last day or so but I'm unsure what gave it newsworthiness again. Did people just freak out now that they've gone through the 23andMe kits they received for Christmas?

Pass legislation regulating the use of genetic data: require proper encryption standards, and deliberately prohibit tying an individual’s data and history back to a person, the way companies are not allowed to tie a browsing history to a specific person. Impose hefty punishment for not following the laws. And I am completely in favor of this as an opt in. It could save so many lives and be safe as well.

Question from someone ignorant on the topic: if the US ever adopted a universal health care system in the future, would disclosures of 23andMe/Ancestry DNA to insurance companies still have negative effect? I’m assuming that under a universal system, it wouldn’t matter how severe your conditions are, as you’d be covered anyway.

There are a lot of other non-health-related kinds of insurance, and more, that depend on an assumed average lifespan: mortgages/loans, employment, etc. One could become a second-class citizen in such a dystopian future.

I recently watched the film GATTACA. I was shocked by how appropriate it was for our time.

In the film, discrimination by genetics is, strictly speaking, against the law, but a matter of course because of how easy it is to get a sample and sequence someone. Humans are judged based on what's in their genes, not who they are.

The synthetic, designed humans are classified as "valid", while those born naturally are "invalids" or "de-gene-erates".

It would be quite easy for people of all walks of life to become second-class citizens due to information in their genomes if things do not change quickly.

At that point, from a medical perspective, I don't believe so. The other main issue would be privacy, which would not at all be affected. Under the third-party doctrine, there is no expectation of privacy when individuals share information with a third party. Therefore, that information can be demanded by police without a warrant.

Having the information collected into a central location (the insurance companies) would make it easier for a large number of these requests to be filed, rather than needing to track down individual services for each investigation.

23AndMe/Ancestry don't reveal DNA information to insurance companies. Even if you opt into providing your DNA for research purposes, it is anonymized.

You are putting a lot of trust in them. How did that work out with Equifax? Surely such a large company with such sensitive data would take extreme measures to secure it. /s

Medical information is much better regulated than Equifax. HIPAA applies here for one example. Plus we're talking about acknowledged disclosures, rather than leaks which is apples and oranges anyway.

HIPAA applies to healthcare info, and, I believe, 23andMe data is consumer data and so not covered by HIPAA [0]. If they give your data to pharma or a healthcare provider it becomes covered.

Similarly the blood pressure data in my withings app or activity data in my Fitbit app is not covered by HIPAA.

[0] https://slate.com/technology/2017/12/direct-to-consumer-gene...

"We're truly sorry, for we had no idea that we were selling your private genetic data to insurance companies, researchers, advertising firms, and the FBI. We are really sorry. We are going to investigate how this happened and make sure to fulfill our fiduciary duty to our stockholders."

Fool me once.

But if your DNA report reveals markers for some potential health problems, and you are aware of that and do not disclose it to the insurance company when you sign up, they can cancel your policy in the future for not sharing all known health issues with them, if they find out somehow that you obtained that report and did not share it.

Since the introduction of the ACA ("Obamacare") the US no longer works the way you're claiming. You don't have to disclose preexisting conditions during sign-up, and your policy cannot be cancelled due to failure to disclose.

This is true for life insurance but not health insurance. Also, it seems reasonable that your insurance company should have access to data that defines your risk, since it helps them make better statistical models to manage risk across large pools of people as well as tailor the policy costs to your individualized risk.

This is how it works in other areas, like car insurance; with an absence of data, your policy is roughly based on the average individual, but as you gain more years of driving experience (good or bad), they factor that in. Bad drivers who get in more accidents pay more because their behavior is riskier and costs more. (The big difference being, in this case, genetic risk is predetermined and not something you can easily change through behavior modification.)

I worked in insurance. A large part of how they manage costs is by policy exclusions. I believe universal coverage where they can't exclude pre-existing conditions breaks the insurance model.

It really needs to be covered by the government, like police and fire. Universal health coverage makes sense as a public good. It doesn't really make sense as an insurance product.

How can my DNA be anonymized? It's a unique identifier to me and leaks traceable information about my family.

Great question! I am working on a method to anonymize DNA, applying cryptography at the molecular level [0].

Regarding the statement in the article about anonymization, you are correct. These data are not anonymized, your name is just removed from the file. This does not truly anonymize the data, for reasons you outlined.

Interesting quote from Linda Avey, the co-founder of 23andMe: “It’s a fallacy to think that genomic data can be fully anonymized”[1]. I disagree, and think that eventually we will be able to anonymize your genome in a test-tube, so that the data is secure before it ever touches a computer.

0: https://geneinfosec.com
1: https://undark.org/article/dna-ancestry-sharing-privacy-23an...

Because it is impractical to tie the information found in the DNA back to you as an individual (e.g. name, social security number, etc). It is therefore impractical for companies to use that information to target you.

Even if it is impractical now, is there any objective reason to believe it will remain so in the future?

The sequenced DNA exists without any identifying information linking it to an individual. If they re-sequenced you and matched the two, they've gained nothing (since they had to re-sequence anyway), if they didn't re-sequence then they have no way of associating the two.

So now, tomorrow, or into the far future the two pieces of information cannot be re-associated because there's nothing to map them. The information about the link is lost.

It is like asking "Can a lossy MP3 be restored to the lossless version if technology improves enough?" Nope. The information is simply gone. Same is true here, the associated link is just gone.

I think there are degrees to the extent of tying / linking / re-associating (you use all these). If my DNA is present in some database and at a later point in time I am sequenced again, all "metadata" (time, geography, etc...) associated with the circumstances of the first sequence getting into that DB immediately applies. In addition all genetic relatives (known or unknown to me) whose DNA is present in the DB may be affected by the circumstances of the second sequencing.

The entire discussion is predicated on 23AndMe or Ancestry

* not pivoting their business model and selling the unanonymized data to the highest bidder.

* not being forced to reveal the unanonymized data to the authorities.

* not revealing unanonymized data to hackers

This is a common misconception, but you are incorrect. These data can be used to identify you, therefore they are not anonymous. If a name is removed from a file that contains zip-code, age and gender, that data is nevertheless sufficient to re-identify most people. Removing a name from a data-base does not necessarily anonymize that data-base.

DNA can be used to uncover your last name and ethnicity, estimate your height, hair color, eye color and more, so anonymization is not as simple as removing a name.
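The zip-code/age/gender point is easy to check on any table with the names stripped out. Below is a small sketch that counts how many rows are the only one with their combination of quasi-identifiers, which is the idea behind k-anonymity (a row is only "hidden" if at least k rows share its combination). The records and the `unique_fraction` helper are made up for illustration; this is not a claim about any real dataset.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifiers."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def unique_fraction(records, quasi_identifiers):
    """Fraction of records whose quasi-identifier combination is unique."""
    sizes = group_sizes(records, quasi_identifiers)
    unique = sum(
        1 for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] == 1
    )
    return unique / len(records)


# Made-up "de-identified" rows: names removed, other fields kept.
records = [
    {"zip": "02139", "age": 34, "gender": "F", "condition": "A"},
    {"zip": "02139", "age": 34, "gender": "F", "condition": "B"},
    {"zip": "02139", "age": 51, "gender": "M", "condition": "C"},
    {"zip": "94043", "age": 29, "gender": "F", "condition": "D"},
]

print(unique_fraction(records, ["zip", "age", "gender"]))  # 0.5
print(unique_fraction(records, ["zip"]))                   # 0.25
print(min(group_sizes(records, ["zip", "age", "gender"]).values()))  # 1
```

Half of the toy rows are unique on zip, age and gender alone, so anyone who knows those three facts about a person can pick out that person's row. Genetic data adds many more such columns (inferred ethnicity, predicted traits, relatives in other databases), which is why removing the name is not the same as anonymizing.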

> DNA can be used to uncover your last name and ethnicity, estimate your height, hair color, eye color and more, so anonymization is not as simple as removing a name.

It cannot be used to uncover your last name without sampling from several generational family members (and for them to be de-anonymised). Simply having a range of heights, hair color, eye color, is not enough to identify a unique individual.

The mechanism you're claiming could exist seems to be based on already having de-anonymised sequenced DNA for most of the population. But if you had that you could infer someone's DNA profile without having to actually sequence (or more to the point, de-anonymise) it, making the entire activity pointless.

Essentially your entire argument is: If we already have your sequenced DNA (or could easily infer it) then de-anonymising you is easy. Which is true but completely pointless.

A hypothetical health care system could be designed in all sorts of ways.

While your gut feeling is likely correct, they could implement a system that has reduced benefits for people deemed too costly to support. Or prioritize treatment of critical problems and not count preexisting conditions as critical.

It is much more likely that everyone would be covered in the same exact way in my opinion though.

The writing was on the wall for a long time...

Best that people can do in the future will be to make the tools as cheap and accessible as possible so we won't have major genetic data monopolies acting as a walled garden that they can reap the benefits from the most... though they’ll be the head honchos in town for a while until then.

Makes me glad I didn't use 23andMe and makes me not ever want to use products like that in the future.

Can I ask why? Is it because that company is profiting off of your data or something else?

I think there are some simple situations that provoke this type of response. The first is that while there are protections in place they are only paper protections. History has shown us those are not always effective. What if 23andme goes bankrupt and the data is "lost" or just sold to the lowest bidder? I am not certain of the legal loopholes present, but I wouldn't doubt we'll see a case of this happening in the next decade.

The second is long term exposure to the unknown. Even if you knowingly choose to opt-in, what if that data is used against said persons at some point in the future for something they failed to consider, maybe because they couldn't know because it's not been identified as of yet?

Profit isn't always the detractor. I get that these organizations need to make money. But without proper explanation (there should be a better legal framework in place for services like this that offer equal protections to both sides) and a potentially biased legal agreement, informed users will likely question the unknown of giving up information to a company who will continually be realizing ways to use your DNA for the foreseeable future. And, oh by the way, you paid them to do that.


Eventually this data will get into the hands of governments and others. Who is to say that in the future you are not permitted to own a firearm (just an example) because you have some gene that may increase your tendency for violence (as some future research shows)? What about an insurance company denying you because of a gene? That may be illegal but it's you against Goliath. How many laws are broken by banks etc. without recourse because the ones affected don't have the resources to fight it?

Sure, these are extremes, but if this information is available, eventually it will be used, and most likely first for profit.

If you were born in California after 1983, the government already has this data:


* side note *

Government is one of the largest buyers of marketing data as well. They can't do the spying legally, but nothing stops them from purchasing what corporations collect.

Is this story exclusive to BI? Registration is required to read it.

Imagine giving your DNA to the (former) wife of a Google cofounder?

Why are people surprised they would abuse your data for profit?

Because all of the available information indicates that this is very much not the case?

If this triggers you, you kind of missed the fact that Roche was an investor in 23andMe years ago. Plus 23andMe is located on the Genentech/Roche campus.

Which is the funny bit given GSK's takeover.

23andMe headquarters is located in downtown Mountain View, and we have a lab in South San Francisco. Nothing on the Genentech campus.
