
MIT AI tool can predict breast cancer up to 5 years early
https://techcrunch.com/2019/06/26/mit-ai-tool-can-predict-breast-cancer-up-to-5-years-early-works-equally-well-for-white-and-black-patients/
======
aabaker99
Take these results with a grain of salt. There's a large class imbalance in
this dataset and ROC curves can be misleading in this case. The test set
contains 269 positive examples and 8482 negative examples.

From [1]:

> Class imbalance can cause ROC curves to be poor visualizations of
> classifier performance. For instance, if only 5 out of 100 individuals have
> the disease, then we would expect the five positive cases to have scores
> close to the top of our list. If our classifier generates scores that rank
> these 5 cases as uniformly distributed in the top 15, the ROC graph will
> look good (Fig. 4a). However, if we had used a threshold such that the top
> 15 were predicted to be true, 10 of them would be FPs, which is not
> reflected in the ROC curve. This poor performance is reflected in the PR
> curve, however.
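
The quoted scenario is small enough to check by hand. A minimal sketch, in which the exact ranks of the five positives (3, 6, 9, 12, 15) are assumed for illustration:

```python
# Reproducing the quoted example: 100 individuals, 5 positives whose
# scores rank them uniformly within the top 15.
ranked_labels = [0] * 100
for rank in (3, 6, 9, 12, 15):       # 1-indexed positions of positives
    ranked_labels[rank - 1] = 1

predicted = ranked_labels[:15]       # threshold: call the top 15 positive
tp = sum(predicted)                  # 5 true positives
fp = len(predicted) - tp             # 10 false positives
fn = sum(ranked_labels[15:])         # 0 missed positives

precision = tp / (tp + fp)           # 5/15, the PR curve's bad news
recall = tp / (tp + fn)              # 1.0
fpr = fp / (100 - 5)                 # 10/95, so the ROC point looks fine
```

At this operating point the ROC coordinates (TPR 1.0, FPR ~0.11) look excellent while precision is only one third, which is the imbalance problem in miniature.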

The authors seem to be aware of this in the supplement and also evaluate
performance by a hazard ratio they define:

> We calculated the ratio of the observed cancer incidence in the top 10% of
> patients over the incidence in the middle 80% and referred to this metric as
> the top decile hazard ratio. We calculated the ratio of the observed cancer
> incidence in the bottom 10% of patients over the incidence in the middle 80%
> and referred to this metric as the bottom decile hazard ratio.
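
For concreteness, that metric can be sketched as follows. This is a hypothetical implementation of the quoted definition, not the authors' code; the scores and outcomes would come from the model and follow-up data.

```python
# Top/bottom decile hazard ratio as defined in the quote:
# observed incidence in the top (or bottom) 10% of risk scores,
# divided by observed incidence in the middle 80%.
def decile_hazard_ratios(scores, outcomes):
    ranked = sorted(zip(scores, outcomes), key=lambda p: p[0], reverse=True)
    k = len(ranked) // 10                       # decile size
    top = [o for _, o in ranked[:k]]
    bottom = [o for _, o in ranked[-k:]]
    middle = [o for _, o in ranked[k:len(ranked) - k]]
    mid_incidence = sum(middle) / len(middle)   # incidence in middle 80%
    return (sum(top) / k) / mid_incidence, (sum(bottom) / k) / mid_incidence
```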

However, binning is a form of p-hacking [2]. And I'm still wondering why they
don't just post the Precision-Recall curves.

[1] [https://doi.org/10.1038/nmeth.3945](https://doi.org/10.1038/nmeth.3945)

[2]
[https://doi.org/10.1080/09332480.2006.10722771](https://doi.org/10.1080/09332480.2006.10722771)

[Edit] to add link to [2]

~~~
ska
It's one of the difficulties with attacking screening applications, where the
base rate of positives in the screened population is very low. For screening
mammography, it's less than 1 in 1000.

------
wccrawford
Without information about false positives, this basically says they wrote an
algorithm that sometimes points out cancer early. But if it's only correct 1%
of the time, nobody is going to listen to it. It would carry even less weight
than the "You really need to check for cancer!" warnings we already have.

Edit: From the paper:

> A deep learning (DL) mammography-based model identified women at high risk
> for breast cancer and placed 31% of all patients with future breast cancer
> in the top risk decile compared with only 18% by the Tyrer-Cuzick model
> (version 8).

So better than before, but still only detects 31%. If I'm reading correctly,
it's 95% correct? I guess that means 5% false positives? That wouldn't be bad.

~~~
b_tterc_p
95% on a 1 in 1000 event means that over 1,000,000 trials

950 true positives

~50,000 false positives

False positive breast cancer is pretty bad.
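
Back-of-the-envelope arithmetic behind those figures, assuming "95%" means both 95% sensitivity and 95% specificity (a simplification for illustration):

```python
# Base-rate check: a "95% accurate" test on a 1-in-1000 condition.
trials = 1_000_000
prevalence = 1 / 1000
sensitivity = 0.95
specificity = 0.95

positives = trials * prevalence                  # 1,000 actual cancers
negatives = trials - positives                   # 999,000 without cancer

true_positives = positives * sensitivity         # 950
false_positives = negatives * (1 - specificity)  # ~49,950, i.e. ~50,000

# Of everyone flagged, what fraction actually has cancer?
ppv = true_positives / (true_positives + false_positives)
```

Even under those assumptions the positive predictive value comes out under 2%, which is the point about false positives in a screening population.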

~~~
magwa101
The cost of a false positive is a retest and observation, which seems very low.
Also, stop consuming sugar.

~~~
timthorn
What about the anxiety and distress caused to the patient? How many might
undergo an unnecessary mastectomy to be on the safe side?

~~~
colechristensen
I have a problem with this vision.

You can be told the risk factors for everybody, but when you are told the risk
factors tailored to you it becomes dangerous? Should we tell everyone they're
going to live forever so they live in ignorant bliss while healthy?

I can see being unprepared for having that information, but I don't think the
solution is not having the information.

~~~
timthorn
It's not a risk factor; it's a test outcome. You would be told that you've
tested positive, with the follow-on tests and anxiety that causes.

There have been studies into exactly this:
[https://www.ncbi.nlm.nih.gov/pubmed/22859786](https://www.ncbi.nlm.nih.gov/pubmed/22859786)

Specifically on breast cancer screening, the UK's National Institute for
Health and Care Excellence (NICE) has published management guidance that is
quite interesting:
[https://cks.nice.org.uk/breast-screening](https://cks.nice.org.uk/breast-screening)

~~~
wccrawford
Can't read that last link outside the UK.

They wouldn't be told they were positive for cancer. They'd be told that the
computer predicts they are at very high risk of breast cancer. That's
knowledge that I think people should have, personally.

------
melling
According to Craig Venter, early detection is what we need to eliminate
cancer:

[https://youtu.be/iUqgTYbkHP8?t=15m37s](https://youtu.be/iUqgTYbkHP8?t=15m37s)

The reason most people die from pancreatic cancer, for example, is because we
almost always detect it in a late stage.

~~~
skybrian
This depends on the type of cancer. In the case of a slow-growing cancer like
prostate cancer, early detection can find cancers that would never be a
problem (you'll die of something else first).

This has to be taken into account when deciding how much screening to do.

~~~
melling
Not really. You’re probably still better off knowing you have a slow-growing,
indolent prostate cancer that you can monitor rather than treat. Also, you
might find you have the more aggressive form.

[https://www.webmd.com/prostate-cancer/prostate-cancer-survival-rates-what-they-mean](https://www.webmd.com/prostate-cancer/prostate-cancer-survival-rates-what-they-mean)

~~~
skybrian
Maybe the problem is that people don't do that? What do you think of this
meta-analysis?

"Pooled data currently demonstrates no significant reduction in prostate
cancer-specific and overall mortality. Harms associated with PSA-based
screening and subsequent diagnostic evaluations are frequent, and moderate in
severity. Overdiagnosis and overtreatment are common and are associated with
treatment-related harms."

[...]

"Screening resulted in a range of harms that can be considered minor to major
in severity and duration. Common minor harms from screening include bleeding,
bruising and short-term anxiety. Common major harms include overdiagnosis and
overtreatment, including infection, blood loss requiring transfusion,
pneumonia, erectile dysfunction, and incontinence. Harms of screening included
false-positive results for the PSA test and overdiagnosis (up to 50% in the
ERSPC study). Adverse events associated with transrectal ultrasound
(TRUS)-guided biopsies included infection, bleeding and pain."

[https://www.cochrane.org/CD004720/PROSTATE_screening-for-prostate-cancer](https://www.cochrane.org/CD004720/PROSTATE_screening-for-prostate-cancer)

~~~
melling
Many people also insist on getting antibiotics when they have a viral
infection. We have more measles cases now than a decade ago because of fake
news.

You might be right that many people would be better off not knowing but I’d
rather we took a step forwards than backwards.

------
b_tterc_p
Addressing model bias by adjusting which data the model has access to is a bad
idea. Tweaking the data so that the model's output looks equitable is going to
make your model worse across the board. You should train your model on what
you have and then add explicit biases to the classifier for different groups.
That way you have the best model and are clear about your biases.
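
A sketch of what "explicit biases" could look like in practice. The group names and threshold values are hypothetical, purely for illustration:

```python
# One model trained on all the data; any per-group adjustment lives in
# an explicit, auditable threshold table rather than hidden in the
# training data. Values here are made up.
GROUP_THRESHOLDS = {"group_a": 0.50, "group_b": 0.45}
DEFAULT_THRESHOLD = 0.50

def predict(risk_score, group):
    """Flag a patient as high-risk using the group's visible threshold."""
    threshold = GROUP_THRESHOLDS.get(group, DEFAULT_THRESHOLD)
    return risk_score >= threshold
```

The design point is that anyone reviewing the system can read the threshold table directly, instead of reverse-engineering what a rebalanced training set did to the model.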

If this model is equally accurate for black and white women, that means either
race is not a predictive factor, or it is a factor but one the model adapts to
easily, or race is a factor and they’ve reduced their ability to diagnose one
group in the name of equity.

The linked article suggests the accuracy gains come from better risk models
that use more than age. I’m not sure whether that means it’s tied into the
image neural net. I’d like to see the false positive rate too.

------
jszymborski
Here's the paper in question.

[https://pubs.rsna.org/doi/full/10.1148/radiol.2019182716](https://pubs.rsna.org/doi/full/10.1148/radiol.2019182716)

------
stakhanov
I wish they would call it something other than AI, like "Diagnostics", or, if
there MUST be a buzzword, "Predictive Diagnostics".

Once upon a time, a necessary precondition to call something AI was that it
should be something where there is at least the hope that it could one day
generalize to pass the Turing test or something along those lines.

Medical diagnostics is one of the primary applications of pattern processing,
and since it's pretty damned impressive as it is, it's a bit pointless to try
and make it even more impressive by suggesting that you might one day enjoy a
chat with your medical diagnostic tool over breakfast, exchanging views on how
the Knicks' season is shaping up... (Which both the informed readers, and the
people writing this, know pretty damned well is never going to happen, and was
never intended to happen).

------
magwa101
JFC this is great.

