
> To create a recurring revenue stream from the tests, Wojcicki has pivoted to subscriptions. As media companies launched streaming “+” channels, Wojcicki rolled out 23andMe+, offering personalized health reports, lifestyle advice and unspecified “new reports and features as discoveries are made” for an initial $229, with annual renewals of $69.

I was a big believer in 23andMe until this point. I answered all of the available research questions, which took absolutely hours and included semi-invasive medical questions. I did this under the premise that I would hopefully be helping research and I felt really rewarded having completed all of them. Then, they dropped the + bombshell and I felt really rugpulled. I paid them for genotyping on their v4 and v5 platforms -- so I paid twice, I referred friends, I bought people kits, I helped research...and now I was being asked to pay a subscription for what I was promised to begin with? Eesh.

I think the fundamental issue is that when the FDA stepped in and told 23andMe to cut back on their reporting, they really hit a roadblock. Promethease always gave better results than 23andMe, but when that happened it commoditized genotyping fully. Just download your data from any provider and use a third-party tool and you're set.

Fast forward to 2024 and they're stuck with a platform that's semi-limited and they haven't delivered on any of the research deliverables that many people wanted in the first place. The idea of getting new genetic reports monthly was appealing and simply never materialized. It's no surprise people aren't hot on this as a business -- but what is surprising is that they were completely passed by startups like Nebula Genomics in offering whole-genome sequencing and competitive data access. I think the stored data they have is their only advantage, but they don't seem to know how to leverage it.




As someone who paid for 23andme+, it was pretty much a scam anyway, I got maybe 4-5 more reports over the year, all of which were for random popular ailments like "Anxiety" where the link between your genes and the disease(?) are pretty dubious in the literature and the takeaway was "you are 5% more likely to experience anxiety".


> where the link between your genes and the disease(?) are pretty dubious in the literature

This is inescapable; genetics usually can't show causation, for instance because you can't do an experiment where you change someone's genes.

Geneticists seem to deal with this by using statistics like GWAS that are obviously just correlation, adding a sentence that correlation doesn't show causation, and then just proceeding and pretending like it does.


Sure, perhaps that was a bit too succinct. What I mean is that, as a customer of 23andme, I'd much rather know about some known genetic disease that I may or may not have than about wide-ranging things that affect everyone like stress and anxiety. It's just not as serious.

23andme went with the "only give very limited information" approach, which I respect, but it is ultimately pretty much useless compared to just running your data against dbSNP and telling me that I have this or that marker for this or that disease backed by this or that paper.


Only showing correlation is one thing, but if the correlations themselves are barely noticeable then that's a big problem for making a useful report.

Though if you find whatever gene is most correlated with something, what are the options for it not to be causation? If the chance of causation is high enough, it makes sense to proceed as if the risk is real.


When I said correlation isn't causation I meant it. Neither high nor low correlation is evidence for causation.

There's a more advanced form of being bad at this where you think you can show causation by controlling for everything in the environment. This is also wrong; it produces something called collider bias.
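A toy simulation can make the collider-bias point concrete. This is only a sketch under made-up assumptions: `gene` and `trait` are independent standard normal variables (no causal link at all), and `outcome` is a hypothetical third variable caused by both. Restricting to a narrow band of `outcome`, a crude stand-in for "controlling for" it, produces a strong correlation that isn't there in the full sample. All names and numbers are illustrative, not from any real dataset.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=20000, seed=0):
    """Return (correlation in full sample, correlation within one outcome stratum)."""
    rng = random.Random(seed)
    # Assumption: gene and trait are generated independently of each other.
    gene = [rng.gauss(0, 1) for _ in range(n)]
    trait = [rng.gauss(0, 1) for _ in range(n)]
    # Hypothetical collider: a variable caused by BOTH gene and trait.
    outcome = [g + t + rng.gauss(0, 0.5) for g, t in zip(gene, trait)]

    # "Controlling for" the collider by looking only at a narrow stratum of it.
    stratum = [(g, t) for g, t, o in zip(gene, trait, outcome) if abs(o) < 0.25]

    r_all = pearson(gene, trait)
    r_stratum = pearson([g for g, _ in stratum], [t for _, t in stratum])
    return r_all, r_stratum


if __name__ == "__main__":
    r_all, r_stratum = simulate()
    print(f"full sample correlation:     {r_all:+.3f}")
    print(f"within-stratum correlation:  {r_stratum:+.3f}")
```

In the full sample the correlation stays near zero, while inside the stratum it is strongly negative: if the outcome is fixed, a higher `gene` value has to be offset by a lower `trait` value.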

> Though if you find whatever gene is most correlated with something, what are the options for it not to be causation?

1. It's a coincidence and it's never causal.

Imagine an OSS project releases a bugfix and you diff the old and new versions. (This is basically GWAS.)

The bugfix part of the diff caused it to be fixed. The updates to the copyright dates or changelogs didn't.

2. It's causal, but the causal chain involves a specific environmental factor, and we should change that instead.

For instance, you can say every human has a genetic disease that prevents them from producing their own vitamin C, which most other mammals can do. But instead of calling scurvy a genetic disease we just eat fruits and vegetables.


> 1. It's a coincidence and it's never causal.

Half the point of analyzing statistics is to filter out coincidences, and that applies to correlations too. If something is a coincidence, it won't hold up as a proper correlation under reasonable amounts of analysis. So when the premise is we're starting with correlations, I think it's alright to assume they're mostly not coincidences.

> The bugfix part of the diff caused it to be fixed. The updates to the copyright dates or changelogs didn't.

In that case the bugfix is the "gene most correlated", isn't it? Give it a few generations to randomly spread, and the signal will be far stronger on the bugfix gene than on the copyright gene. (And if it hasn't been spreading for generations then you won't have enough samples to find either gene.)

> 2. It's causal, but the causal chain involves a specific environmental factor, and we should change that instead.

> For instance, you can say every human has a genetic disease that prevents them from producing their own vitamin C, which most other mammals can do. But instead of calling scurvy a genetic disease we just eat fruits and vegetables.

If you're testing just humans, you'll get a 0% correlation because everyone has that gene.

If you're testing across mammals, then "WARNING: Prone to scurvy". Which is a completely correct and causal result about a genetic problem, with easily accessible treatments.

So, I don't understand your example at all.


Please have some awareness of how little thought you are putting into dismissing a field where thousands of very smart people are working on solving the problem you describe. In fact, you can correct for the exact problem you describe with sibling studies where people have the same environment but different genes.

Have maybe a small ounce of humility in this respect.


> Please have some awareness of how little thought you are putting into dismissing a field where thousands of very smart people are working on solving the problem you describe.

Thousands of very smart people do a lot of dumb things. People still work on string theory. The people working on proper causal inference are also smart, work hard, and have Nobel Prizes.

https://www.nobelprize.org/prizes/economic-sciences/2021/pop...

On the other hand, the people doing genetics who aren't careful about it produced the 23AndMe report which says I have a "17% chance of having green eyes". Against what counterfactual?

> In fact, you can correct for the exact problem you describe with sibling studies where people have the same environment but different genes.

"Correcting" is a wrong way to think about it. Generally speaking, overcorrection is worse than undercorrection because of collider bias. You need to choose a study design that's correct in the first place.

You're describing a natural experiment, which is better than a GWAS of self-selected 23AndMe customers but does have problems. A silly one: selection bias, because the sample only includes people with siblings. More importantly, the result is only guaranteed in the study environment (e.g. people who live in the UK in 2008) but gets reported without identifying what that environment is.


> > In fact, you can correct for the exact problem you describe with sibling studies where people have the same environment but different genes.

> You need to choose a study design that's correct in the first place.

That's precisely what a sibling study is: a study design that's correct in the first place, as far as the problem we're discussing goes.

> You're describing a natural experiment

That is correct: a sibling study, which corrects for the problem we're discussing, often contains aspects of a natural experiment, namely that the siblings tend to naturally be siblings, rather than raised together purely for the purpose of a study :)
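For readers following the sibling-study argument, here is a minimal sketch of why comparing siblings can remove confounding from a shared family background. It is a toy model with assumed numbers, not a real study design: each family has a hypothetical shared factor `family` that shifts both the siblings' gene score and their trait, while the gene itself has zero causal effect on the trait. It does not address the objections raised above about selection bias or about results only holding in the study environment.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


def simulate_families(n_families=10000, seed=0):
    """Return (population slope, within-sibling-pair slope) of trait on gene."""
    rng = random.Random(seed)
    pop_gene, pop_trait = [], []
    diff_gene, diff_trait = [], []
    for _ in range(n_families):
        # Assumed shared background (ancestry, household) for both siblings.
        family = rng.gauss(0, 1)
        # Gene scores share the family background but differ between siblings.
        g1 = family + rng.gauss(0, 1)
        g2 = family + rng.gauss(0, 1)
        # Trait depends ONLY on the family background, never on the gene.
        t1 = 2.0 * family + rng.gauss(0, 1)
        t2 = 2.0 * family + rng.gauss(0, 1)
        pop_gene += [g1, g2]
        pop_trait += [t1, t2]
        # Differencing within the pair cancels the shared family term.
        diff_gene.append(g1 - g2)
        diff_trait.append(t1 - t2)
    return slope(pop_gene, pop_trait), slope(diff_gene, diff_trait)


if __name__ == "__main__":
    population, within = simulate_families()
    print(f"population slope:     {population:+.3f}")
    print(f"within-sibling slope: {within:+.3f}")
```

Across the whole population the gene looks strongly associated with the trait, purely through the shared family factor. The within-pair slope is close to zero, which matches the true (null) effect in this toy setup.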


"10% more likely to experience anxiety, once having clicked on this useless link"


> As an added security measure, we have temporarily disabled the ability to download your raw genetic data. We hope to re-enable this ability soon, and we appreciate your patience.

Apparently 23andMe doesn't let you download the data anymore, I just tried. I wanted to give Promethease a go, seems interesting.


Wouldn't laws in certain jurisdictions require them to allow you to download your data?


I couldn't find it either. Fortunately, I downloaded the genotypes in July.


"We have temporarily disabled raw data download as an additional precaution to protect your privacy. We don't currently have a timeline for when this feature will become available but will keep customers informed of any changes."


> I think the stored data they have is their only advantage, but they don't seem to know how to leverage it.

I really wonder what happens to that data if they get acquired or shut down. The possible new owner of the data might have a completely different business case in mind. This scenario is something you usually don't consider when you give your data to a company that you trust at the moment.


No need to wonder, imagine the worst case scenario and assign it a non zero probability.

> This scenario is something you usually don't consider when you give your data to a company that you trust at the moment.

Why would you not consider it?


Like selling it to insurance companies who can use the data to deny coverage.


This is illegal, at least in the US.


Illegal for health insurance companies. But not so for life insurance, or disability insurance or long-term care.


It's illegal in some US states like California. The California Genetic Information Privacy Act prohibits these companies from disclosing a consumer’s genetic data to any entity that is responsible for administering or making decisions regarding health insurance, life insurance, long-term care insurance, or disability insurance.


If they wanted to use that information for underwriting they'd just make getting a genetic test a prereq before underwriting the policy.


And if they wanted easy cash they'd "just" make you pay $500 as a signup fee.

Which is to say: No they wouldn't just do that, it would cut their business too much despite being a way to make more money per client.

But getting it from a company that already has it avoids that downside.


Promethease is why I initially did 23andme, and it was beyond worth it for me personally.

But for the same reason I'm just a one-time customer: all I needed was for 23andme to give me a very expensive hospital test at a 95% discount, and I didn't need them anymore.


Can you explain how you got a 95% discount?


They are comparing with a full DNA sequencing when going through a hospital.


Is there a Promethease equivalent for ancestry data?


There's this, for example. It's not as open as Promethease, but similar in that it's "bring us your data and we'll give you new analysis on it":

https://yourdnaportal.com/advanced_ancestry_analysis


yTree would qualify: Small one-time fee, no subscription needed, simple but good results.



