Beware explanations from AI in health care (science.sciencemag.org)
53 points by pcrh 13 days ago | 51 comments

Whenever I see something about a black box algo in healthcare it makes me think of this study: https://journals.plos.org/plosone/article?id=10.1371/journal... of pigeons trained to read radiology images. Obviously if a doctor were to tell you that the pigeons thought you had cancer you would think it is ridiculous, but if an algorithm said the same thing and was just as likely to be right it comes off as much more credible, yet it might not be any more understandable or effective.

Some cultures would take a dog-smell diagnosis over an algorithm; and others would trust a pigeon before a dog, even more if you called them doves.

Another classic of the genre is the ol' dead fish fMRI analysis:


Idk, people have the same weaknesses that we always blame on AI. People have bias, people lack empathy, people make incorrect moral judgements.

Doctors have bias. Doctors sometimes get apathetic. Not because they have bad intentions, in fact I bet most doctors really care and really try. But because they're overworked, and they have to care for many patients who they don't personally know, and some of whom get very sick. And because they're only human.

In fact, I believe the way most doctor's appointments work is that the doctor gets a list of symptoms, then matches it against a checklist of what they were taught / experience to form a diagnosis. Very similar to an AI algorithm. Of course this isn't as easy as it seems, you can't just Google the symptoms yourself. Doctors' knowledge and experience is very deep, and takes many years to develop. But still, it's not something an AI couldn't eventually automate.

Right now I would be skeptical of an AI doctor, just because I don't think current AI has that capability. But seriously, if ML gets better I would trust it better than any real doctor.

The issue with doctors is that many of them are operating on knowledge that is not completely up to date.

Doctors have an extremely hard time dealing with complex disorders or even diagnosing them. People go years without a proper diagnosis.

I would love for doctors to just take all the information during first visit and then use some kind of search engine/AI that would go and analyze current literature, new research and suggest things to rule out in a cost effective manner while ruling out things that would need immediate diagnosis/treatment such as cancers.

Patients also need to be tracked if they agree to it, especially if they have a serious disabling condition. Their sleeping habits, diet and everything they do needs to be reviewed in detail, but it just isn't done. Most of these things are self-reported and that can be problematic.

Your first two paragraphs are two separate issues. If their info is not up to date, maybe they need some additional training. The reason they have a hard time with complex issues isn't their fault. Medical practice and science in general can't deal with complexity very well, and there is very real danger of iatrogenics if they over test and over treat.

Interesting, so the AI will filter out irrelevant information that the doctor can then focus on. I think from a social perspective that might be hard, because the patient/doctor has to be freely comfortable with the idea that the doctor was not an expert in that sub-area before, and had to recently learn/come up to speed to be able to help. To me it seems like doctors don't want to freely admit that they don't know something (that they can figure out), and patients don't want to think that is the case.

Slightly related but there's a website somewhere that you can upload your DNA sequence (in a .txt file!) to and it shows you relevant scientific literature. About 50% of those articles for me told me that I was probably going to develop male pattern baldness in my thirties/forties.

I technically use my own 'algorithm' before taking any new medications. It isn't sophisticated at all. However, I'm surprised the medical community is completely ignoring the fact that some medications have support groups. While that doesn't mean a drug is unsafe, it definitely can point towards some adverse effect that was missed in the studies. If you try to find any groups for penicillin, you will find nearly nothing... now look at Accutane.

Basically, pharmaceutical marketing is causing doctors to be really misinformed, and a lot of doctors lack knowledge about updates from the FDA on certain drugs. They often prescribe medications which are more dangerous than others.

I am thinking about creating my own little tool to research drugs before taking them.

May want to check out this: http://snap.stanford.edu/decagon/

The repo utilizes SIDER which is a drug side effect repo: http://sideeffects.embl.de/

It also utilizes DrugBank: https://DrugBank.com

So far personalized medicine based on DNA sequencing has been mostly a bust. In most cases the data isn't really actionable. It might tell you that you have an elevated risk for certain conditions but in most cases the absolute risk is still low. Or there's not much you can practically do to lower the risk.

There are a few exceptions like BRCA where the risk from certain genotypes is high enough to justify aggressive preventative care.
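The gap between "elevated risk" and "low absolute risk" described above can be made concrete with a little arithmetic. This is only a sketch: the baseline risk and relative risk below are made-up numbers for illustration, not figures for any real variant, and `absolute_risk` is a hypothetical helper.

```python
def absolute_risk(baseline_risk: float, relative_risk: float) -> float:
    """Absolute risk for a carrier, given the population baseline and a relative risk.

    The result is capped at 1.0 because a probability cannot exceed 100%.
    """
    return min(1.0, baseline_risk * relative_risk)


# Hypothetical numbers, chosen only for illustration.
baseline = 0.005   # 0.5% risk in the general population
rr = 1.5           # carriers have 1.5x the baseline risk ("50% elevated risk")

carrier = absolute_risk(baseline, rr)
increase = carrier - baseline
print(f"carrier risk: {carrier:.2%}, absolute increase: {increase:.2%}")
```

With numbers like these, a "50% higher risk" moves someone from a 0.5% risk to a 0.75% risk, which is why such a result is rarely actionable. A genotype only starts to justify aggressive preventative care when the product of baseline and relative risk is itself large.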

That's really not how doctors actually work in most cases.

There have been clinical decision support systems around for years which can recommend a diagnosis given complete and accurate clinical data. The real problem is gathering that data, and putting it in a form that the algorithm can use. An AI algorithm can't really gather a detailed patient history or make subjective observations about the patient's state. Those are steps which can't really be automated without strong AGI. In theory a doctor could do all the necessary data entry but in practice it would usually be a waste of time and no one wants to pay for it.

> An AI algorithm can't really gather a detailed patient history or make subjective observations about the patient's state

Chat bots can handle data collection, “subjective” patient state can be collected through image recognition and tone analysis. Not sure you need AGI to accomplish either of these tasks.

Feel free to try building a chat bot for that. I guarantee you will fail.

> But because they're overworked, and they have to care for many patients who they don't personally know

I keep hearing this whining about overwork, and yet when the subject of residency quota set by the AMA comes up or delegating duties to nurses with almost the same amount of schooling suddenly you can hear crickets in the room.

> In fact, I believe the way most doctor's appointments work is that the doctor gets a list of symptoms, then matches it against a checklist of what they were taught / experience to form a diagnosis. Very similar to an AI algorithm. Of course this isn't as easy as it seems, you can't just Google the symptoms yourself.

Yes you can, and should. Remember, half the doctors out there are below average...

This AI thing worries me not because of the application but because of how it will make society dumber. There is a lot of potential to help diagnose patients, picking up on patterns humans might not be able to parse.

Every human has a different perspective and approaches problems differently. The old adage "two heads are better than one" comes to mind. With AI there is the possibility that just "one head" will make millions of decisions using a single algorithm. Think about that. No team cooperation. No learning from experience. No brain storming.

Who's second guessing the AI? Where is the alternate perspective? Where can you go to get a second opinion when the computer has replaced humans? Imagine being misdiagnosed by a computer and the clueless human health care workers don't or won't question it because the computer said so. You'll be on the phone talking to an AI which is looking at records generated by AI full of data compiled by AI. Why pay costly humans when you can replace them with a computer?

All of this is going to dehumanize a very human industry (along with everything else.)

I have a friend and his wife had some weird stomach problems that the local doctors couldn't figure out. It sounds like the doctors basically ignored that a problem existed because they couldn't think of anything.

Eventually they had to go to the nearest big city (city they're in now is 60K people; city they went to has 900K people). The doctors there figured it out pretty quickly. And they also did not have good things to say about the local doctors. (paraphrase) "I'm not surprised they didn't find anything. They look for the simple solution they know and if that doesn't work they give up."

Additionally, I went to a doctor for a problem in the same big city and he read (and printed out) a webmd article for me.

Anyway. So, today people are stuck with human doctors who have no idea how to fix anything but the simple obvious problems. Personally, I wouldn't mind replacing those people with an AI as I'm not sure it's going to make any difference. There's always going to be specialists and if the AI doctor fails you then you can go to the specialist. But just like today, you're going to have to be your own medical researcher / advocate if the service you get doesn't actually fix your problem.

Wait until you have a rare disease before you say something like you want AI to help diagnose you (or make decisions about your healthcare for that matter—the US is going to be a mess if they go this route). 7% of the general population collectively has a rare disease, so it’s not uncommon to have a rare disease. Most of them are classified as very rare.

I personally have 2 rare diseases affecting my peripheral nervous system, and one of them is very rare. The very rare disease is believed to have also caused my type 1 diabetes.

The wild part? The very rare disease, which I had prior to my type 1 diabetes diagnosis, which was also undiagnosed at the time, was blamed as “type 1 diabetes related complications” specifically autonomic neuropathy by human doctors.

Do you think AI would get this right? The answer is an obvious no.

Also, I am in my early 30s. My endocrinologist was surprised that I was 30 and was making a big deal about it (as in “you survived and you shouldn’t be alive” sense). It was never statistically likely that I would be alive, which is why AI should never be used.

So do you think AI should be making health decisions in a healthcare system that already has a ton of human error, and is trained on those errors?

I mean, the third leading cause of death in the US is believed to be preventable medical errors: https://www.bmj.com/content/353/bmj.i2139

I don't understand your argument.

You're saying that humans are unacceptably bad. Then why is it a problem to replace them with an also unacceptably bad AI? Like, either way you're not getting the treatment you need.

At least if the AI fails you, then nobody is going to get their ego stepped on if you go to get a second opinion. Whereas with a doctor you have to go through a few uncomfortable conversations and then hope that the next guy isn't going to ignore you because they don't want to sour their relationship with your first doctor.

> Do you think AI would get this right? The answer is an obvious no.

Today? Sure. But really one of the most plausible ways in which AI could be more effective than human clinicians is in rare diseases. It assumes a massive input of data of course, but clinicians are biased against this in ways that are hard to do anything effective against. This is why so many people with rare conditions spend years before getting a proper diagnosis.

Getting that data is an extremely hard problem. In general the data is low quality, incomplete, poorly coded, and full of errors. Cleansing the data to the point where it could be used to train a useful decision support algorithm would require a tremendous amount of human labor by expert clinicians. This isn't something that can be automated.

Okay ... how do doctors (individually or as a whole) get better at doctoring? Either we're saying they look over the data OR we're saying that they somehow just mystically know better as time goes on.

Who cares if it turns out that we can't automate the formation of the AI. But we're doing the same work to continually educate and improve doctors already. At least with the AI it won't die of old age once it starts to get really good at what it's doing.

And maybe it turns out that the best that AI can do is duplicate a middle level doctor who isn't really interested in self improvement. If that's the case then at least everyone can get cheap and fast health care that's the same frustrating quality that we're stuck with now. Only with the AI you can say that it's dumb and everyone will believe you. With a doctor you have to fight past "impugning the reputation of the profession" in order to get a real diagnosis.

Doctors are getting better at doctoring mostly through simple things like checklists and evidence-based medicine treatment guidelines. Those reduce variance and the risk of error. Improved interoperability between clinical systems also has potential to reduce errors by making sure all members of a patient's care team have complete access to the patient's entire record.

Diagnosis isn't particularly difficult or time consuming in most routine cases after gathering the necessary data. If you see a jagged line on the x-ray then the patient has a fractured bone. If the A1c level is 7.5% then the patient has diabetes. Etc. Improving diagnostic accuracy will help in rare cases and is worth doing, but that won't have much impact on the health care system as a whole.

I don't think I understand your argument.

You at first appear to say that gathering data and automating things is hard and might not be possible.

Then here you're saying that checklists, evidence-based guidelines, and improved interoperability between systems are all the source of doctor improvements.

These are all arguably forms of AI that have been created through data gathering. Yes, a checklist is a very boring AI, but it's taking some knowledge out of the heads of a group of people and then making it available for someone else to use. The missing step seems to be that instead of having a person look at a piece of paper and execute the checklist and evidence-based treatment, having a computer do it.
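As a rough sketch of what "having a computer do it" could look like, here is the A1c example from upthread written as a rule. The cut-offs are the commonly cited ones (6.5% and above for the diabetes range, 5.7% to 6.4% for prediabetes); `LabResult` and `a1c_checklist` are hypothetical names, and a real guideline has far more steps than this.

```python
from dataclasses import dataclass
from typing import Optional

# Commonly cited A1c cut-offs, in percent. Used here only to illustrate a rule.
DIABETES_A1C = 6.5
PREDIABETES_A1C = 5.7


@dataclass
class LabResult:
    # None models the case where nobody entered the value into the system.
    a1c_percent: Optional[float]


def a1c_checklist(result: LabResult) -> str:
    """Apply a single checklist rule to one lab value."""
    if result.a1c_percent is None:
        return "no data"
    if result.a1c_percent >= DIABETES_A1C:
        return "diabetes range"
    if result.a1c_percent >= PREDIABETES_A1C:
        return "prediabetes range"
    return "normal range"


print(a1c_checklist(LabResult(7.5)))
print(a1c_checklist(LabResult(None)))
```

The rule itself is trivial. The `None` branch is where the disagreement in this thread lives: the code can only run once someone has captured the value in a structured form.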

You're missing the point. There's no cost effective way to provide the computer with the complete, high quality data necessary to execute the checklists.

I hope I can be forgiven for missing the point when this is the first time (as far as I can tell ... perhaps you said something in a different thread) you've mentioned cost. At first it was "this cannot be automated". Then it was "well it happened, but it's okay because it's written down on paper and not put in a computer".

Now we're talking about cost. Cost where (at least in the USA) going to the ER for a broken bone can cost thousands of dollars. Cost where (at least in the USA) a birth (something people from all walks of life go through) has a minimum of the patient spending $10K and regularly gets into the $50K and up range (not even talking about ultrasounds etc leading into the birth).

I cannot believe that we're really talking about data collection being too expensive for the medical industry. Maybe nobody wants to be the person to pick up the bill. But it's going to be a rounding error of a rounding error compared to all the other money that goes through that industry.

The data gathering and entry can't be automated. It could be done manually, but at a very high cost (not a rounding error). Commercial insurers and Medicare certainly wouldn't be willing to pay. As a practical matter it's just not worth doing in the vast majority of cases.

Thanks again for explaining it so well. Usually me and you do not agree on healthcare issues, but this is one area where we seem to be on the same page.

As for diagnosing rare or very rare diseases, yes, it is possible, as you said. However, if you search for the rare disease I have in PUBMED, there are only 111 journal articles that have been written about it, and some are in foreign languages. The information on it is very sparse. I also doubt that it would be diagnosed by AI, as it effectively looks like "diabetes complications" (I have type 1 diabetes) and even though I had autonomic neuropathy at age 5, the onset of symptoms was very insidious.

> [...] to the point where it could be used to train [...] require a tremendous amount of human labor by expert clinician.

This is how it has been done for decades at this point (at least in some systems), and yes it is expensive. To really tackle things like the rare disease interactions above, it would require at least an order of magnitude change. This is at least plausible, but it would require some systemic changes.

Correct and thank you. Usually me and you don't agree on healthcare matters, but on this, at least we do. Your response is on point, and is exactly the issue.

I have a lot of concerns about AI moving into spaces that involve human life and death.

With that said, I think you've highlighted one possible outcome, but it makes a few assumptions that might play out differently depending on how the tech is implemented.

Imagine, for sake of argument, that the AI is just "part of the team". It doesn't replace a doctor, it doesn't have authority over doctors, it's just another signal. If applied in this way, then you can still:

- Ask for a second opinion

- Go to a doctor/team that doesn't rely on AI

- Consult a different / (better?) AI

I realize you're focusing on an outcome where AI replaces humans in the field, but I'd hope humanity can find an implementation that can both leverage the benefits of the tech while not throwing out the core disciplines of medicine in the process. It's reasonable to be pessimistic about this given the tech-run-amok issues prevalent in 2021, but I still have some hope.

Also imagine the possibilities of AIs sharing knowledge with each other. This has its own semi-terrifying implications, but I don't necessarily think we should conclude that the only outcome is a single AI to rule all AIs (and doctors).

Of course, I could be wrong as well. But I don't think all roads lead to doom.

You could say all these same things about any time in human history where some process got automated, standardized, mechanized, or computerized. Indeed, many people did. They're called Luddites.

People always get scared that automation will completely remove the human touch from a process. They forget that people will still design, understand, troubleshoot, maintain, improve, be inspired by, learn from, and innovatively transform their new automated devices and processes.

The future has always ended up being people and machines working together, and I think it will continue to be that way for a long time. There will not be an AGI that replaces human thinking in our lifetime.

Instead of a pile of machines, imagine a vast opaque bureaucracy, where any individual participant doesn't have any decision-making power and must follow a prescribed set of steps. I think this is partially what fuels nightmares of your average modern luddite.

If you follow a happy path in such a system, everything is fine. But if you don't, it gets Kafkaesque. Check out a first hand account [0] for flavor.

[0] https://slatestarcodex.com/2017/08/29/my-irb-nightmare/

Heartless, rule-based, unappealable decisions have been a feature of human society for as long as there has been organized bureaucratic society. It is not the case that soft-hearted human bureaucrats are all that stands in the way of the Kafkaesque nightmare.

If we want better social and political decisions, we have to engage in society and politics. The AI will follow what we decide, as it always does.

I don't think AI will make us dumber, it will allow us to work faster and more abstractly. Doctors have no incentive to regurgitate an AI prediction; they're the ones liable for care. Doctors will continue to second guess AI while they use its predictions as one of many data points for their decision.

This is actually kind of what I'm hoping for. The saying is "when you hear hoofbeats think horses not zebras." But that's boring and it turns the doctor into a bored and checked out professional who just constantly sees the same symptoms day in and out and prescribes the same treatment. When something interesting does show up, they're too used to the status quo to actually address the issue.

Let the AI prescribe all of the horse treatments. The doctor can then spend their time imagining more fanciful diagnoses that we're centuries away from teaching the AI how to deal with. Then when the primary treatment fails, instead of trying to convince an overworked and checked out doctor that you've got some other problem, they'll actually already have a list of possible alternative issues that they were really hoping they would get to investigate this time around.

James Bridle wrote a rather good book titled "New Dark Age" that covers the continual increase in reliance on algorithms and often non-human-verified data in our decision making processes in all aspects of our lives. This impacts scenarios from the individual level when people drive off roads and into hazards following their GPS blindly, all the way to global markets behaving in unexpected ways due to trade algorithms being unable to account for never before seen changes to conditions that don't fit within their existing parameters.

There's some similar work that frames some of this around cognitive artifacts. There are processes people perform that create an artifact in their mind they can use for other purposes of problem solving.

The model leaves a sort of presence on their consciousness, an artifact, and allows them to reason in ways aided by those artifacts. When you stop creating and using those artifacts and depend on technology to act in place of them, you lose some creative aspects you would normally be able to connect or utilize in other mental models.

For some things I think it's good for technology to act as black boxes we don't have to understand, just know that we can understand it if we need to. Other things are good to understand even if you can automate it. How you choose and differentiate which of those is an incredible challenge in my mind. What do I commit to memory and integrate into my mental model of the world and what do I just reference and use when I need it.

I think that a lot, perhaps most, AI diagnoses would involve a feedback loop that tracks the patient's health outcomes.

Here are some examples of feedback:

- The AI misses a cancer that gets worse and is later detected on a blood test.

- The AI detects a cancer on an x-ray but a (more accurate) CT or biopsy indicates that there is no cancer.

These kinds of feedback loops should allow the clinical performance of AI to be measured and hopefully improved over time.
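A minimal sketch of how that measurement could work, assuming each case ends up as a pair of "what the AI said" and "what follow-up testing later confirmed". The function name and the sample data are hypothetical; the two feedback examples above correspond to a false negative and a false positive.

```python
from typing import Iterable, Tuple


def sensitivity_specificity(cases: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Compute (sensitivity, specificity) from (ai_flagged, confirmed) pairs."""
    tp = fp = tn = fn = 0
    for flagged, confirmed in cases:
        if flagged and confirmed:
            tp += 1          # AI flagged it, follow-up agreed
        elif flagged and not confirmed:
            fp += 1          # AI flagged it, CT or biopsy found nothing
        elif not flagged and confirmed:
            fn += 1          # AI missed it, later test found it
        else:
            tn += 1          # AI and follow-up both negative
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up outcomes for six patients: (ai_flagged, confirmed_by_follow_up)
cases = [
    (True, True), (True, False), (False, True),
    (False, False), (False, False), (True, True),
]
sens, spec = sensitivity_specificity(cases)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Tracking these two numbers over time is one way to tell whether a deployed model is improving or drifting. The hard part is the second element of each pair, since it only exists if the patient's later outcome is recorded and linked back to the original prediction.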

This has been going on for 25ish years now, but the iteration is quite slow and there are systemic barriers to speeding it up.

> Imagine being misdiagnosed by a computer and the clueless human health care workers don't or won't question it because the computer said so. You'll be on the phone talking to an AI which is looking at records generated by AI full of data compiled by AI. Why pay costly humans when you can replace them with a computer.

Has already been happening.

I highly doubt this. Source: work in the industry.

I see, I was replying to another comment and I can't delete this one.

> overstates the benefits and undercounts the drawbacks of requiring black-box algorithms to be explainable.

I wrote the essay linked below [0] a few months ago. It is very relevant here. I argue that asking ML for explanations forces you to get a dumbed down version of the result, just like asking any expert to explain all the subtlety of what they are doing. Asking for explanations is a kind of micromanagement. There are instances where explanations are important (like research), but much less so in model deployment.

The better way is to focus on the results the models provide, and confidence that the model is making supported decisions (i.e. is not extrapolating or predicting on out of distribution data). This is how we would use other kinds of experts - validate their expertise and trust them when they are working in their area.

[0] https://www.willows.ai/blog/getting-more
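One very crude way to approximate the "is the model extrapolating?" check is to compare each input against the range of the training data before trusting a prediction. The sketch below uses a per-feature z-score; this is a stand-in chosen for brevity, not the method from the linked essay, and the function names, the threshold, and the data are all assumptions.

```python
import statistics
from typing import List, Sequence, Tuple


def fit_feature_stats(training_rows: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Return (mean, sample standard deviation) for each feature column."""
    columns = list(zip(*training_rows))
    return [(statistics.mean(col), statistics.stdev(col)) for col in columns]


def is_out_of_distribution(row: Sequence[float],
                           stats: Sequence[Tuple[float, float]],
                           max_z: float = 3.0) -> bool:
    """Flag a row if any feature is more than max_z standard deviations from the training mean."""
    for value, (mean, std) in zip(row, stats):
        if std == 0:
            # Constant feature in training: any different value is unseen territory.
            if value != mean:
                return True
            continue
        if abs(value - mean) / std > max_z:
            return True
    return False


# Hypothetical training data: (age in years, systolic blood pressure)
training = [[50, 120], [60, 130], [55, 125], [65, 135]]
stats = fit_feature_stats(training)

print(is_out_of_distribution([58, 128], stats))  # similar to the training patients
print(is_out_of_distribution([5, 128], stats))   # a child, unlike anything in training
```

A check like this treats features independently, so it misses unusual combinations of individually normal values. It does show the shape of the idea: refuse or flag the prediction when the input is unlike what the model was validated on, instead of asking the model to explain itself.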

When I meet a new person, say in a hiring situation, I question that person to learn how she thinks -- how she arrives at her conclusions. Once she has earned the trust of her collaborators, she can work mostly off her intuition.

Why should that be different for ML models? I would expect to be able to switch between a result based (intuition) and a more thorough explaining mode (XAI) to assess soundness of reasoning. And then I am also fully aware that complexity is increased when I turn on explanations.

Well I'd argue that if you take this approach to hiring, you can only hire people who share your area of expertise, who think the way you do. This is not generally the case with how ML models work. You could create an explainable model that shows you the things you want to see as part of its decisions, but (a) that won't take advantage of the full strengths of ML, it's more like a hard coded model, (b) it just pushes the problem down a level - great, it said it's a dog because it has teeth, how does it know what teeth are, and (c) it is still subject to failure on out of distribution or extrapolated sample data. Like I say, there is a place for explanations, especially for the ML scientist working on building a model. But for "managing" the model as I imagine one would in a healthcare setting, I think explainability as commonly construed (feature attribution) is the wrong focus.

> Well I'd argue that if you take this approach to hiring, you can only hire people who share your area of expertise, that thinks the way you do.

What? Why?

There are different modes of reasoning. When I assess the quality of other people's reasoning, I do not necessarily need to be an expert or have them think like I do.

Think semantics. You have declarative and operational semantics. It is perfectly alright to have multiple implementations of the same type declaration, multiple proofs of the same proposition. It is not really the proof that is of interest, but what mode of reasoning was used to arrive there.

I declare the role I need in a hiring position. And the person being interviewed tells me how she inhabits that role. I assess whether that actually constitutes an inhabitation.

Though it appears that we fundamentally agree, as I read from your last sentence, that the "removal" of the explainable parts is only a last step before deployment.

> But for "managing" the model ...

I suspect the middle-term role of AI in diagnosis and treatment is as a low-cost consult. The pressure point for docs to use the AI will be legal. During a malpractice case, if there is no documentary evidence the doctor(s) used the standard-of-care AI system during diagnosis and treatment, and the AI system would have recommended an improved diagnosis and treatment with the same patient information, it will not go well for the hospital/clinic/doctor. AI will be a CYA (Cover Your A*s) in their day-to-day work that would hopefully cheaply and effectively save lives while keeping human doctors in the loop. Sort of like chess engines evaluating positions during tournaments…except you’re allowed to cheat and use them in the game itself.

The problem then becomes taking a good history, using imaging and tests effectively, and charting all relevant observations. AI and IT can help here too. With the same CYA motivation…

The real problem today seems to be that docs need amanuensis-es. Entering the medical data even in electronic records systems seems to be a real challenge to docs. No data…not much an AI can do.

> We argue that this consensus, at least as applied to health care, both overstates the benefits and undercounts the drawbacks of requiring black-box algorithms to be explainable.

I wouldn't describe this as a drawback of requiring explainability. I'd describe this as drawbacks of today's models and today's explainability methods. Today's explainability tools are pretty nascent and our usual deep learning models know nothing about causation, which handicaps any "why" question. But I'm still pretty strongly in favor of requiring an interrogable answer to "why did you diagnose that," even if that isn't something we really know how to do right yet.

The challenge seems to be that we know we want "why" answers, but we don't know how to describe/codify what constitutes an acceptable "why" answer.

I worked on making a more explainable AI in bio for a while. Probably a lot of the unexplainable AI that is being discussed is about deep learning. Some of the biggest advances in deep learning can also be said to have made the models more explainable. Take computer vision, for example. We have great success with convolutional neural networks because these move the models away from unexplainable fully connected layers, where the mathematics naturally produces many neurons with small effect sizes, and towards detecting larger coherent structures in the convolutional layers.

I agree there is more work to be done and making good mathematical decisions when designing the layers seems like a promising way to go.

The doctor comes up with the why, the AI serves as one data point for coming to that solution.
