
It seems like one of the biggest issues here is that a single test for a given thing isn't enough data to be able to reliably tell whether or not the results are significant. So people are arguing that we should collect fewer data points?

I totally disagree with this viewpoint. The medical community's current approach to testing (positive result, treat ALL THE THINGS!) is an artifact of the difficulty and cost of performing the tests; the inability of many providers to apply basic concepts of probability to test results should not be used as an argument against advancing the state of the art, particularly as the industry begins baking data-driven clinical decision support into automated health systems.

If you have 40 tests spanning 20 years saying that you aren't at risk for Total Scrotal Implosion, and then suddenly, without any symptoms, you get a result saying your testicles will fall off tomorrow, you have context with which to interpret this result. Without the historic data there is much greater risk of you and your healthcare provider agreeing to an unnecessary knee-jerk scrotalectomy.

Less data is never the answer. Just my 2 cents.




"If you have 40 tests spanning 20 years saying that you aren't at risk for Total Scrotal Implosion, and then suddenly, without any symptoms, you get a result saying your testicles will fall off tomorrow, you have context with which to interpret this result. Without the historic data there is much greater risk of you and your healthcare provider agreeing to an unnecessary knee-jerk scrotalectomy."

This would be true if the test's error rate were due to inaccuracy of the measurement itself (e.g. I am trying to measure temperature, and 5% of the time it reads higher than it actually is). In that case, measuring often and keeping track of historical data would help.

However, this is NOT the major problem with these sorts of medical tests. The issue is that they are measuring something only RELATED to the disorder they are screening for, and not the disorder itself.

The actual fact is something more like: We have noticed that people that measure above value X on this test have higher rates of Total Scrotal Implosion.

However, there are lots of people who measure above value X on the test who do NOT ever get Total Scrotal Implosion. You can test them every day, and the test accurately measures that they have more than X of whatever is being measured - but they will never get TSI.

You can't fix this with more tests and tracking historical data - the test is accurate for what it is measuring, so repeated tests aren't going to change the overall accuracy of the PREDICTION that is being made from the test.


I'm not sure that's true, if the prediction accounts for the fact that many people with a high value never get the disease. But that's only possible with sufficient data showing a negative correlation.


If many people with the high value never get the disease, then the historical data won't help you discover a false positive (which is what the person I was responding to was arguing).


You are incorrect.

All kinds of "shotgun testing" (i.e. indiscriminate testing for everything, like you recommend) have been studied and shown to be worthless at best and, more often than not, actively harmful.

First, there is the issue of tests' limitations and extremely low predictive power. For instance, if testing positive on A makes it 20 times more likely that you'll get B, and the prevalence of B is 1 in 1,000,000 in the general population, your own personal risk remains low enough that nothing has changed -- except that you will panic and undergo unnecessary interventions to reduce this risk. That is precisely the reason tests are ordered when you already have symptoms: if your pretest probability of disease is 10%, a positive test result means you most probably have it, and it is worth doing something about it.

Second, all known treatments (including "preventive" ones) carry non-zero risk. When you don't have symptoms, whether or not you test positive, your risk of dying from a disease remains lower than your risk of dying from an intervention to prevent the disease -- thus, you gain nothing by testing.

Let's take for instance a 40-yo female who gets an ECG done for no good reason. It shows signs of heart disease, which could be a variant of normal, or a sign of disease. The lady is worried, so she goes on with a stress test just to be sure. She tests positive (a sizeable proportion of those tests are false positives for multiple reasons), so she decides to follow up with a coronary angiogram to see if there's any blockage. The angiogram is normal, but a coronary artery is perforated during the procedure (a 1/10,000 risk), and she dies on the table, despite never having had any health problems beforehand. This kind of stuff happens all the time.

Finally, from an ethical standpoint, as long as healthcare -- and the individual's stupid testing choices -- are paid for collectively, individual choices should be severely restricted.

If we were in a country without any kind of state-sponsored healthcare, where you'd get to pay for any self-harm from your own pocket, I'd argue for free-for-all testing for anyone without any oversight.


I'm a physician. This is the thing most people don't get about tests. It's tiring to see these startups with their flawed agenda.


But what if we could redesign healthcare systems in a way that doesn't expose people to all their test results, only critical ones, but gives the doctor/system that data in order to enable them to better watch over their patients?

Would that be useful?


No, as I explained, when disease prevalence is low, most tests -- even when clearly "positive" -- don't shift probabilities in any appreciable way, and just add to the noise, and render decision making even harder.

Of all patients, nurses and doctors are the ones who are the least likely to ask for "more tests", precisely because they understand that they are essentially meaningless when pre-test probability is very low.

Suggested readings:

1. Bayes' Theorem: https://en.wikipedia.org/wiki/Bayes%27_theorem

2. Base rate fallacy: https://en.wikipedia.org/wiki/Base_rate_fallacy
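
To make the base rate fallacy concrete, here's a minimal sketch -- the prevalence, sensitivity, and specificity are made up for illustration, not taken from any real test:

    # Hypothetical screening test; all three numbers are invented for illustration.
    prevalence = 0.001     # 1 in 1,000 people actually have the disease
    sensitivity = 0.95     # P(test positive | disease)
    specificity = 0.95     # P(test negative | no disease)

    # Bayes' theorem: P(disease | positive test)
    p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    p_disease_given_positive = sensitivity * prevalence / p_positive

    print(f"P(disease | positive) = {p_disease_given_positive:.3f}")  # ~0.019, i.e. about 2%

Even with a test that is "right" 95% of the time, a positive result still leaves you at roughly 2%, because the disease is so rare to begin with.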


Well, it's a good point but how do you decide what's critical?

To answer my own question; I believe the future lies in machine learning algorithms processing symptoms and tests (I guess a symptom is also a test in the sense that it's the answer to a question).

Most of the time there's also not a simple answer to be found. The right answer depends on many factors, including the capabilities of your hospital/country/economy and the state of the science.


> The right answer depends on many factors including the capabilities of your hospital [...]

Absolutely!

I'll add that the physician himself is a kind of test, in the sense that his own sensitivity/specificity to diagnosing a disease can be calculated.

It is well known -- and I'd argue it's a feature, not a bug -- that the exact same patient with the exact same symptoms will get a different work-up depending on whether they're seen by a GP, an emergency physician, or, say, a heart surgeon. The reason is pretty simple: because disease prevalence is different in those three practices, the doctor has to order more or fewer tests to get the same predictive power. E.g., when every patient has heart disease, every ECG change is probably a sign of disease, whereas when almost nobody has any heart problems, ECGs are pretty meaningless.
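
A rough sketch of that prevalence effect -- the 90%/90% sensitivity and specificity are invented, and only the prevalence changes between the three practices:

    def ppv(prevalence, sensitivity=0.90, specificity=0.90):
        # Positive predictive value: P(disease | positive test).
        true_pos = sensitivity * prevalence
        false_pos = (1 - specificity) * (1 - prevalence)
        return true_pos / (true_pos + false_pos)

    # Same hypothetical test, very different meaning depending on who walks in:
    print(f"GP's office (1% prevalence):      PPV = {ppv(0.01):.0%}")  # ~8%
    print(f"Emergency dept (20% prevalence):  PPV = {ppv(0.20):.0%}")  # ~69%
    print(f"Heart surgeon (80% prevalence):   PPV = {ppv(0.80):.0%}")  # ~97%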


>> I believe the future lies in machine learning algorithms processing symptoms and tests

At the core you're still left with the question: do you tell your patient directly the results of a statistical analysis of a few possibilities and options -- and often see him take the wrong one -- or do you guide him through trust (in you or the machine) while not showing him the full details. Right?


The person you're replying to isn't incorrect. He was saying that having a few tests that you treat as predictive isn't as helpful as a long series of tests that are interpreted over time.

Your example is someone taking a test and immediately moving ahead with treatment -- not someone taking a test, noting the result was interesting, and then taking more tests over the next few weeks, months, or years to confirm the result was something to be concerned about.

If the tests are cheap and simple to run, there is no reason to EVER act on the basis of a single data point.


You have no clue, sorry.


While I agree that it's easy to misinterpret statistics, and that sometimes the government needs to restrict freedom of choice to nudge the collective good for the betterment of society, I just can't see testing as one of those issues compared to the other health issues of society, like overwork, junk food, sedentary lifestyles, etc.


Nah, 'paviva is right. Unnecessary tests are a problem, and you can put them in the same bag as hypochondria and WebMD abuse: things where our brain's heuristics and corner-cutting in evaluating probabilities start to work against us.

> compared to the other health issues of society, like overwork, junk food, sedentary life styles etc...

You can always find bigger issues. But given that there are startups and big companies that try to solve the issues you mentioned with spurious, half-assed, unscientific pseudo-tests (yay "wearables", yay "Internet of Things"!), it's even more worrying, because suddenly testing abuse may get coupled with the problems above.


This is like the people talking about throwing away their scales or only weighing themselves every few weeks while trying to control their weight.

The solution isn't fewer data points, it's to collect the data frequently and rigorously, and then post-process it into a trend.

That's how testing should be done: you get a long series of measurements that are post-processed into a coherent picture of reality.


No, this is like people telling others to weigh themselves only once a week - because they know that members of the general population can't be bothered to understand "mathy" concepts like a running average, or (gasp!) a low-pass filter. It's recommended because otherwise a lot of people end up freaking out over noise in the data. And this is exactly the topic here - laymen freaking out over data they're not equipped intellectually and emotionally to comprehend.
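
For what it's worth, the "mathy" part really is small. A minimal sketch of smoothing noisy daily weigh-ins with a running average (all numbers are made up):

    import random

    # Simulated daily weigh-ins: a slow downward trend buried in day-to-day noise.
    random.seed(0)
    weights = [80.0 - 0.02 * day + random.gauss(0, 0.6) for day in range(60)]

    def running_average(values, window=7):
        # Simple trailing moving average -- a crude low-pass filter.
        return [sum(values[max(0, i - window + 1):i + 1]) /
                len(values[max(0, i - window + 1):i + 1])
                for i in range(len(values))]

    smoothed = running_average(weights)
    print(f"raw day-to-day swing:   {max(weights) - min(weights):.1f} kg")
    print(f"smoothed start vs end:  {smoothed[6]:.1f} -> {smoothed[-1]:.1f} kg")

The raw series jumps around by a couple of kilograms; the smoothed series shows the slow trend that actually matters.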

Now the other point - that moar data is always better; in principle, yes, if you follow rigorous rules about collecting, analyzing and integrating it into the existing body of evidence. Which is not what usually happens outside research conditions. As you said,

> [the solution is] to collect the data frequently and rigorously, and then post-process it into a trend.

The thing is, we're having big problems with the "rigorously" part, as well as with no-bullshit post-processing. The current wave of companies selling "health" sensors ain't helping - they push frequent collection in a totally non-rigorous way, using half-assed measuring equipment, and give that data to normal people first (besides taking it and monetizing it), many of whom will obviously freak out. This is not helping to form a "coherent picture of reality" much. It's just helping those companies line their pockets.


Completely agree. False positives are a fact of statistics; what we need is more data to establish baselines for people and populations and to understand when is the right time to worry. I'm genuinely surprised that doctors aren't leading a charge to get more data about their patients, and are instead arguing for less testing and data. Do they really feel that their diagnostics are so good that they only need a few data points every couple of years? My watch and phone know more about my health than my doctor.


It's been addressed elsewhere in the thread, but the problem is not statistical false positives, but rather technically "true" positives (for what the test measures) that do not correlate to an actual disease. More data points do not help.


More data points do help the individual know their baseline range when they feel healthy, compared to their range when they experience a health problem. Looking at second-order effects, like when a value suddenly changes for a particular individual, can be very indicative of a problem. More data points would also allow us to look at correlations in the data to better refine the interpretation of combinations of tests.

I can understand economic reasons why, since we're all paying for insurance collectively, we might want to limit testing. What I have a harder time understanding is the medical profession demanding that physics change to accommodate their process, rather than changing their process to accommodate physics and statistics.

I really don't expect any tests to be perfect. I especially don't expect any test given when I'm sick to be able to tell me what a normal value for me should be when I'm well. What I would like to see is us embrace the reality of data and have enough of it that we can start to separate the signal from the noise. Just look at examples like the success of the Nurses' Health Study[1], which looked at lots of data over lots of years from lots of people. Not surprisingly, a lot of health issues are difficult to understand from single data points.

[1] http://www.nhs3.org/


> The medical community's current approach to testing (positive result, treat ALL THE THINGS!)

No, that's what the patients want.

"You do have cancer. We're going to watch and wait." is something that's only recently been accepted by some patients, even though the side effects of treatment are so drastic. And those positive results only happen because people push inappropriate testing.

> Less data is never the answer.

More dirty data isn't particularly helpful.


> > The medical community's current approach to testing (positive result, treat ALL THE THINGS!)

> No, that's what the patients want.

And the insurance companies, because they don't want to be sued for a false negative with unfortunate consequences. And there's an additional terrible incentive in that if you treat "just in case" and it has a negative consequence in terms of lifestyle, it's OK because "it's better than the alternative".

Consider prostate cancers that develop slowly and could probably have been left alone (referred to as "watchful waiting") -- but if you operate, the patient will survive; the side effects like incontinence aren't the doctor's or insurance company's problem.


The data isn't "dirty", it's perfectly accurate. You are just using it wrong. The test can give you an objective answer like "you have a 20% probability of having cancer".

But the system doesn't weigh the cost and Quality Adjusted Life Years of treatment vs. no treatment, it just defaults to treatment. This is the problem that needs to be fixed, not eliminating data collection.
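
As a toy illustration of that kind of weighing -- every number here is invented, and a real decision analysis would be far more careful:

    # Hypothetical decision analysis: treat now vs. watchful waiting.
    # Every number below is invented purely to illustrate the comparison.
    p_disease = 0.20              # probability the positive result reflects real disease
    qaly_if_treated = 9.5         # expected QALYs after treating everyone (includes side effects)
    qaly_if_watch_healthy = 10.0  # expected QALYs if actually healthy and left alone
    qaly_if_watch_sick = 8.0      # expected QALYs if actually sick and treated later

    expected_treat = qaly_if_treated
    expected_watch = (1 - p_disease) * qaly_if_watch_healthy + p_disease * qaly_if_watch_sick

    print(f"Expected QALYs, treat everyone:   {expected_treat:.2f}")
    print(f"Expected QALYs, watchful waiting: {expected_watch:.2f}")

With these made-up numbers, watchful waiting comes out slightly ahead; defaulting to treatment skips the comparison entirely.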

And if the patients really are the problem, then don't show them the raw numbers -- but having them is potentially useful. Or maybe they should see the numbers, and if they decide on treatment anyway, that is their right, and taking it away is wrong. Either way, the problem is the system, not the tests themselves.


You're assuming people know what the test results mean. Every time we ask people what the tests mean we find they don't know.

Gerd Gigerenzer (Reckoning with Risk) shows that doctors, nurses, and patients don't understand the results of screening tests.

Here's another example: https://www.sciencenews.org/blog/context/doctors-flunk-quiz-...


That's a problem that can be fixed. Very basic statistics is much simpler than most of the things doctors have to learn. The article explained it elegantly in a simple graphic.

I find it hard to believe there is ever a time where collecting less data is an improvement. At worst the data doesn't change anything, but at best it gives you new information that improves outcomes.

If more (correct) information is actually making outcomes worse, it's not the information's fault. It's the system using that information incorrectly.


> But the system doesn't weigh the cost and Quality Adjusted Life Years of treatment vs. no treatment, it just defaults to treatment. This is the problem that needs to be fixed, not eliminating data collection.

Doesn't it? I mean, it depends on the place, probably, but I remember having a class with an MD once where we discussed the overall goal of healthcare and how to balance physical and mental well-being. The problems that arise there are exactly like this: you know, from your "perfectly accurate" data, that the patient has X and, say, 3 years to live, with serious symptoms showing up only close to the (for lack of a better word) deadline; telling them about it will most likely mean 3 years of stress, painful treatment, and heavy strain on the patient's family and friends for, at best, a small extension of lifespan. Not telling them means they live 2.5 years happy and then get sick for the last half year. Should you tell them?

Most people scream "yes", and that's exactly your approach of "defaulting to treatment". Doctors would sometimes like to answer "no", but that means lying to the patient, and not showing them the data.

> And if the patients really are the problem, then don't show them the raw numbers -- but having them is potentially useful. Or maybe they should see the numbers, and if they decide on treatment anyway, that is their right, and taking it away is wrong.

It seems like a free will issue, except that if 99% of people do the same wrong, stupid thing when experiencing a particular situation, it doesn't seem right to let them suffer from it. It's one of those human rationality errors. Sometimes people do need to be protected from themselves.

Now the problem is that the current trend of separating the doctor's office from the lab - whether via third-party private labs or all those half-assed smartphone-based tests - means that it's hard to hide raw data from the patient.

And yeah, I'm a bit conflicted about it - I want to look at my own raw data, I want to play with it, graph it, whatever, but I'm also aware I might freak out if something really weird shows up in them.


> More dirty data isn't particularly helpful.

Depends on how it is dirty. If it is systematic error, then of course it doesn't help. If it is statistical error, then repeating the test over and over is exactly what you need to do.


Most medical tests suffer from systematic rather than statistical error. False positives will continue to be positive if you re-administer the test, because they are due to variation in the individual and not in the test.
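
A minimal sketch of the distinction (the measurement model and all numbers are made up): averaging repeated measurements shrinks random noise, but a bias that sits in the individual rather than the test doesn't budge.

    import random, statistics

    random.seed(1)
    true_value = 100.0
    bias = 5.0  # systematic error: this individual just runs high, every single time

    def measure():
        # One noisy, biased measurement: random noise plus a fixed offset.
        return true_value + bias + random.gauss(0, 3)

    single = measure()
    averaged = statistics.mean(measure() for _ in range(1000))

    print(f"single reading:        {single:.1f}")
    print(f"average of 1000 reads: {averaged:.1f}  (noise gone, bias of ~{bias} remains)")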


I definitely wish I had a comprehensive log of tests over my lifetime. I'm now more interested in the changes (trends) than the absolute values.


To clear out some of the issues that come from an overly emotional approach to medicine, let me state an equivalent problem: arguing that MOAR DATA is good for medicine is the same as arguing that the TSA needs to do MOAR screenings of all kinds, at every airport, on everyone.

Yes, the same statistical issues apply to both cases, and so do harm/good tradeoffs.


What do you think happens to the person who receives a single lab result, out of many, indicating that they have cancer? Human nature is to focus on the negative. To assume that humans can, by looking at a multitude of data points, ignore the small number of false positives is to misunderstand the evolution of human emotion.

https://www.psychologytoday.com/articles/200306/our-brains-n... talks about why we are evolutionarily programmed to latch on to the negative aspects of our life


> Less data is never the answer. Just my 2 cents.

That assumes perfectly rational reactions. Many people can't deal with "You tested positive for X. We should keep an eye on it and see if it develops into something." It makes them nervous. They want a pill. They want surgery. Etc.

The problem with even really good tests that measure exactly what you want is that they have four modes -- two good (you test positive for X and actually have X; you test negative for X and don't have X) and two bad (you test positive for X but don't actually have X; you test negative for X but actually do have X).

The problem is that when the actual incidence of "you have X" is very low, the "test positive for X / you actually don't have X" cases can swamp your signal.

Add in the natural noisiness of biological systems, and you wind up with lots of incorrect assessments.
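
To put rough numbers on that swamping effect -- a minimal sketch, with the population size, prevalence, sensitivity, and specificity all invented for illustration:

    # Counting the four modes over a hypothetical population of 100,000 people.
    population = 100_000
    prevalence = 0.005     # 0.5% actually have X
    sensitivity = 0.99     # P(test positive | have X)
    specificity = 0.95     # P(test negative | don't have X)

    sick = population * prevalence
    healthy = population - sick

    true_pos = sick * sensitivity
    false_neg = sick - true_pos
    false_pos = healthy * (1 - specificity)
    true_neg = healthy - false_pos

    print(f"true positives:  {true_pos:,.0f}")
    print(f"false positives: {false_pos:,.0f}")   # far outnumber the true positives
    print(f"share of positives that are real: {true_pos / (true_pos + false_pos):.0%}")

With these made-up numbers, only about one in ten positive results corresponds to someone who actually has X.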


> Many people can't deal with "You tested positive for X. We should keep an eye on it and see if it develops into something."

As I see it, this is mostly a healthcare UX problem. If a test is such that a negative result is very reliable in ruling out the condition tested for but, because of the combination of false positive rate and low incidence, a positive result doesn't indicate the presence of the condition, it shouldn't be presented to a non-technical end-user (i.e., most patients) as a positive result. It should be "The test to rule out Condition X was not able to rule it out."
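
A minimal sketch of what that wording logic could look like -- the function, threshold, and messages are purely hypothetical, not any real system's behavior:

    def patient_message(condition, tested_positive, positive_predictive_value):
        # Purely hypothetical wording logic for a rule-out style screening test.
        if not tested_positive:
            return f"The test was able to rule out {condition}."
        if positive_predictive_value < 0.5:
            # A "positive" that mostly means "not ruled out", not "you have it".
            return (f"The test to rule out {condition} was not able to rule it out. "
                    f"Follow-up is needed to find out whether this means anything.")
        return f"The test suggests {condition} is likely; please discuss next steps."

    print(patient_message("Condition X", tested_positive=True, positive_predictive_value=0.09))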


Similarly, when running a hypothesis test, statisticians "fail to reject" the null hypothesis rather than accepting it.



