
Pulse oximeters give biased results for people with darker skin - lilrhody
https://bostonreview.net/science-nature-race/amy-moran-thomas-how-popular-medical-device-encodes-racial-bias
======
PascLeRasc
I'm a student EMT and our instructor told us not to rely on a pulse ox reading
if our patient has dark skin, or if they were in a fire, or are a lifelong
smoker, or if they have any nail polish or dirt/grease on their fingertips. We
just look for the 94-100% range and would only take action if we get a reading
around 87% or lower, or see a sudden drop during treatment.

There are a few alternative designs that aim to mitigate this particular
issue by either preventing crosstalk [1] or by using multiple LED wavelengths
and angles - the second one is something a med device engineer told me about a
few months ago, but I'm having trouble finding any papers or information on
it.

[1]
[https://www.hindawi.com/journals/jhe/2018/3521738/](https://www.hindawi.com/journals/jhe/2018/3521738/)

~~~
Edmond
The article notes that the devices inaccurately report higher oxygen levels
for darker skin tones. Meaning if someone dark skinned reports a result that
falls into the 94-100% range, they may in fact be at lower oxygen levels that
put them at risk.

~~~
PascLeRasc
That's true. There are other context clues that we can use, like a patient's
medical history, how their breathing looks/sounds, or if their environment or
mental status suggests CO poisoning (that also won't show up even on the
perfect pulse ox, since it binds to hemoglobin the same way oxygen does).
Overall we just need a better way to get information about oxygen levels.

~~~
sbierwagen
At first thought, a hemoglobin protein loaded up with CO molecules should have
slightly less mass than an oxygenated hemoglobin complex, which could...
theoretically... be measured, maybe with Raman spectroscopy.

That effect is going to be tiny, though. Wikipedia is telling me a hemoglobin
tetramer masses "about" 64,000 daltons, so a complex loaded up with 4CO will
mass 64,112 amu vs 64,128 for one loaded with 4O2. About 0.025% difference.
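
A minimal sketch of that arithmetic, for anyone who wants to check it. It assumes the bare tetramer is exactly 64,000 Da (Wikipedia's "about" figure, so the absolute masses are only illustrative) and uses standard atomic weights for carbon and oxygen; the names are made up for this example.

```python
# Rough check of the mass gap between CO-loaded and O2-loaded hemoglobin.
# Assumption: bare tetramer taken as exactly 64,000 Da (an approximate figure).
HB_TETRAMER_DA = 64_000.0
CO_DA = 12.011 + 15.999   # one carbon plus one oxygen
O2_DA = 2 * 15.999        # two oxygens
SITES = 4                 # one ligand per heme group


def loaded_mass(ligand_da):
    """Mass of the tetramer with all four binding sites occupied."""
    return HB_TETRAMER_DA + SITES * ligand_da


carboxy = loaded_mass(CO_DA)
oxy = loaded_mass(O2_DA)
relative_diff = (oxy - carboxy) / oxy

print(f"{carboxy:.0f} vs {oxy:.0f} Da, {relative_diff:.3%} difference")
```

The gap is the mass of four oxygen atoms minus four carbon atoms, about 16 Da, which is why it is so small next to the protein itself.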

------
treis
This really isn't a big deal. Doctors know that these pulse ox meters aren't
very accurate and use them accordingly. Nobody's changing their plan because
you have a pulse ox of 93 instead of 95. They're there to quickly measure big
changes that indicate a problem. In other words, they're looking for patients
going from 98 to 75, not 98 to 95.

~~~
jrajav
What's the point of casually dismissing this article's premise? Whether the
conclusions are fully sound or not isn't the goal of this article, this is
journalistic not academic. The goal is to raise awareness of a potential bias,
and I think it achieves that (assuming the facts all check out).

For one, according to the article the discrepancy was found to be up to 8
points in some cases, not 3. And it seems reasonable that quick, practical
decisions in hospitals are made based on point thresholds, not purely huge
point leaps as you suggest.

I haven't dug any further than the article and your comment to verify these
facts. But rather than find a way to prove why this doesn't matter, why don't
we assume it does? Then the knowledge and awareness might spread a little bit
more through us, and if it is an issue, it might be that much more likely to
be solved. There's no penalty for being wrong with a weakly held but well-
meaning assumption. At the very least, we can be more aware of another
possible dimension of bias in technology.

~~~
treis
>What's the point of casually dismissing this article's premise?

Because it should be casually dismissed. This person doesn't know what they're
talking about, refused to listen to the people who told them they didn't know
what they are talking about, and wrote an article full of nonsense.

~~~
rednerrus
"Journalists" in 2020 summarized.

------
wombatmobile
GUIDELINES FOR SPO2 MEASUREMENT USING THE MAXIM® MAX32664 SENSOR HUB

The SpO2 measurement performance of a device must be verified before the
device is released to the market. The U.S. Food and Drug Administration (FDA)
suggests using standards presented in the following:

ISO 80601-2-61:2017 – Medical electrical equipment -- Part 2-61: Particular
requirements for basic safety and essential performance of pulse oximeter
equipment

Pulse Oximeters – Premarket Notification Submissions [510(k)s] Guidance for
Industry and Food and Drug Administration Staff

According to these regulations, manufacturers need to declare the calibration
range, reference, accuracy, methods of calibration and range of displayed
saturation level. Furthermore, for the performance assessment, the FDA
requires at least 200 data points equally spaced over a saturation range of
70% to 100%. Test subjects should have different ages, gender, and skin tones.
For instance, the FDA requires that at least 30% of the volunteers must have
dark skin pigmentation. The overall error or the root mean square error (RMSE)
must be below 3.0% for transmissive pulse oximetry and below 3.5% for
reflective pulse oximetry.

© 26 Mar, 2019, Maxim Integrated Products, Inc.
[https://www.maximintegrated.com/en/design/technical-
document...](https://www.maximintegrated.com/en/design/technical-
documents/app-notes/6/6845.html)

~~~
khuey
Seems like it would be better if instead of merely an overall error of at most
3% they required an error of at most 3% in the subpopulations of interest too.
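
A rough sketch of why that matters, using made-up readings (not real data). The 3.0 limit is the transmissive RMSE figure from the app note quoted above; the group labels, sample values and function names are hypothetical.

```python
import math
from collections import defaultdict


def rmse(errors):
    """Root mean square of a list of (SpO2 - SaO2) errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def rmse_by_group(samples):
    """samples: iterable of (group, spo2_reading, sao2_reference)."""
    grouped = defaultdict(list)
    for group, spo2, sao2 in samples:
        grouped[group].append(spo2 - sao2)
    all_errors = [e for errs in grouped.values() for e in errs]
    return rmse(all_errors), {g: rmse(errs) for g, errs in grouped.items()}


# Invented data: 70% of subjects read 1 point high, 30% read 5 points high.
samples = [("light", 91.0, 90.0)] * 7 + [("dark", 95.0, 90.0)] * 3
LIMIT = 3.0  # overall RMSE limit for transmissive devices, per the app note

overall, per_group = rmse_by_group(samples)
print(f"overall RMSE {overall:.2f} (passes: {overall < LIMIT})")
for group, value in per_group.items():
    print(f"{group}: RMSE {value:.2f} (passes: {value < LIMIT})")
```

With these invented numbers the pooled RMSE comes in under the limit while the 30% subgroup is well over it, so a per-subgroup check would catch what the overall figure hides.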

------
oneplane
And darker coloured textiles are less visible on people with darker skin.
Grass is green and bananas tend to be curved.

I get that it is interesting to research optically measuring systems to
'search for bias' and I'm sure there are going to be biases in varying
degrees, but that doesn't mean it's always for the same reason or with the
same result.

There are very real problems of course, take the example with the photographic
film development from the article: that is a problem because a picture that is
supposed to reflect the subject you photographed no longer does what it was
supposed to do. And someone here posted the example of the soap dispenser that
couldn't "see" people that didn't have light skin. Also a problem, because now
you can't get soap. Both of those are mostly examples of errors made during
engineering (apply Hanlon's razor when in doubt - it's not always malice like
people often assume), and might be preventable if there were more different
samples and interactions available during development.

That doesn't mean it's the only type of problem we have with devices for
humans, or that keeping a color palette of people around is always going to
fix it.

Take the soap dispenser example: you could have made one that uses radar or
ultrasonic detection, solves multiple problems because you now don't have to
consider lighting conditions which always vary due to differences in bathrooms
(light fixtures, positioning, positioning of the dispenser itself etc). That
doesn't relate to skin color, but can just as easily be a similar problem that
could have been prevented all the same by doing proper engineering. But it's
likely that in cases like this (soap dispensers) the detection system was just
a module that some other third party company imports and sells, and they
bought it from a module integrator somewhere else and they bought the parts as
generic motion detectors from a company 4 layers down the chain that
specializes in measuring daytime movements of rabbits or something silly like
that.

On the other hand: if you have a government that requires certain controls to
be in place and applied for medical equipment to be used in a medical setting
you would expect some of the requirements to be drawn up in a way that
reflects the use cases. It appears that either the devices are not scoped
within those controls, or the controls failed. Or it's all good and the
difference in measurement is not significant enough to matter for the use case
and thus the devices are used as-is and this research doesn't relate to the
real world as well as people think.

~~~
vmception
The product just wouldn't have shipped with a product manager who also
experienced it not working.

That's the point of inclusion.

Although the soap dispenser example feels like one of those problems, it is
just wear and tear of the sensor and they work properly when cleaned.

In this case, pulse ox failures can affect when people are admitted for
treatment.

~~~
oneplane
It would probably not have shipped in that case, but the number of product
managers actually using their own products sadly isn't at the saturation you'd
expect.

Same with all the white labeling and supply chain and integrators in between;
the disconnect between all of the stages and the contracts that can bind a bad
product to be delivered and sold anyway can't be fixed on just one end of the
process.

~~~
function_seven
I'm going to challenge you on "...can't be fixed on just one end of the
process"

Why not? That seems like the appropriate place to reject the inferior
components arriving from that long and deep supply chain. You're creating a
soap dispenser for sale in North America. It does not work on 15% of the
population. That's a fail. You can't just throw your hands up and say, "Well,
those photoreceptors were designed in Taiwan, and they didn't account for dark
skin". If that's the case then you don't source those, or you request a
different calibration profile, or you add a lens filter that compensates. You
engineer the solution for your target market.

The same company using global components took the time to print logos in
English and the user manual in English. They made sure the power circuitry
matched North America's system. They set up a domestic phone number for
support. Established payment systems to accept USD for their product.

They do all these things to sell better into a particular market, but they
don't bother with the actual people who will be using it?

Don't get me wrong. You are accurately explaining the _how_ of the failure.
But that's exactly _why_ the last step in the supply-chain exists. To make the
actual finished product. That's where you catch deficiencies like this.

------
PaulHoule
If I understand it, they looked at 3 pulse ox meters. The expensive one from
the company that was sponsoring the study read about one point (out of 100)
lower on blacks which is inaccurate but not outrageous.

One of the off-brand meters was completely wrong (e.g. dangerous) for blacks,
the other one was in between.

~~~
simion314
They looked at a few studies too and asked around companies whether they had
handled the issues mentioned in those studies; the most positive result was
that some models from one company are now better (probably some of the others
are not).

IMO it seems that the certification for these devices, especially for the ones
used by medical staff to make decisions, needs better testing.

Another interesting point that I had not considered is that you can have a
device with an average error under a safe threshold that still gives a much
higher than average error for some groups, putting these groups at risk if the
people using these devices are not warned about these issues.

~~~
IndPhysiker
Nonin created the first of these back in the '90s, so I pulled one of their
spec sheets from 2016 ([https://www.nonin.com/wp-
content/uploads/2018/09/NoninConnec...](https://www.nonin.com/wp-
content/uploads/2018/09/NoninConnect-3245-Spec-Sheet.pdf)) which explicitly
mentions the pigmentation issue along with nail polish, poor circulation, and
breathing issues. From that, this isn't exactly a new or unknown thing since
commercial products have been working around this for some time. I wasn't able
to find an industry standard for testing on these, so maybe the article should
cite that as an opportunity to improve a product or add a new method?

~~~
simion314
> I wasn't able to find an industry standard for testing on these, so maybe
> the article should cite that as an opportunity to improve a product or add a
> new method?

It is a long article with many studies cited, so I can't check all of those.
IMO "the readings can be affected by skin tone" is not sufficient for a
medical device. What should a nurse or doctor do? There is no numeric value,
so what does that mean? Is the device useless if you have dark skin, or is it
2% wrong, or, as the article suggests, is the error nonlinear and larger when
you are suffering from low oxygen?

What I would do if I were selling these products is test with a few ranges of
skin tones and, if needed, have a switch on the device that you have to set
for a certain level or, if that is too expensive, print an interval instead of
a number and have the user trained to read that.

------
shalmanese
I wonder if the pulse oxes in China are calibrated for Chinese skin or White
skin? Many people are buying pulse oxes straight off of Alibaba and they might
be getting readings that are too low if the calibration is off.

~~~
dogma1138
I’ll be surprised if they are calibrated at all.

~~~
jschwartzi
Lots of thermometers on Amazon aren’t even FDA cleared so I wouldn’t be
surprised if the pulse oximeters aren’t FDA cleared either.

------
ceejayoz
Reminds me of the automatic soap dispenser at Facebook that can't see black
people. [https://gizmodo.com/why-cant-this-soap-dispenser-identify-
da...](https://gizmodo.com/why-cant-this-soap-dispenser-identify-dark-
skin-1797931773)

~~~
ravenstine
I can believe this because automatic doors seem more prone to having a hard
time opening when I wear black clothes. It's not a big deal or anything, but I
occasionally have to step back and then forth again to get it to open. Is this
a common experience of black people, or am I crazy?

~~~
exikyut
Oh, _THAT'S_ why entrance barriers at supermarkets only occasionally fail,
but fail consistently on a given day...

Wait. Would this happen to PIR-based systems?

~~~
ravenstine
I'm not sure. It seems at least possible with supermarket doors because those
supposedly use a low-power microwave beam. With PIR, it still relies on a
deflection of a beam, so I think it would be a problem with that as well, but
I'm just guessing.

~~~
detaro
PIR doesn't have a "beam", it detects temperature changes (i.e. you, a warm
object, moving around). Clothes could play a role, though I'm not sure about
the color.

------
AlanYx
This article doesn't seem particularly balanced. Unless I'm missing something,
the author doesn't really give any generalized details about the amount of
bias (for example, mean bias).

The author cites one study where doctors noticed "a bias of up to 8 percent",
which sounds high, but if you click through to the link provided, it says:

>The mean bias (Spo2 − Sao2) for the 70%–80% saturation range was 2.61% for
the Masimo Radical with clip-on sensor, −1.58% for the Radical with disposable
sensor, 2.59% for the Nellcor clip, 3.6% for the Nellcor disposable, −0.60%
for the Nonin clip, and 2.43% for the Nonin disposable.

Basically, the amount of bias varies by manufacturer and type of device, and
it is not always a positive bias as implied by the text of the link.

This seems to be more of a case of measurement error bounds inherent to any
measurement device, rather than "a popular medical device encod[ing] racial
bias" and all the other rhetoric in the article.
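
For reference, a small sketch of how a mean bias (SpO2 − SaO2) like the ones quoted above is computed, alongside the scatter around it. The paired readings are invented for illustration and the function name is hypothetical; the point is only that a signed mean captures a systematic offset, while the standard deviation captures ordinary measurement noise.

```python
import statistics


def bias_and_precision(spo2, sao2):
    """Mean and sample standard deviation of paired (SpO2 - SaO2) differences."""
    diffs = [s - a for s, a in zip(spo2, sao2)]
    return statistics.mean(diffs), statistics.stdev(diffs)


# Invented paired readings in the 70%-80% saturation range, not real data.
sao2 = [70.0, 72.0, 74.0, 76.0, 78.0]   # reference arterial saturation
spo2 = [73.0, 74.0, 77.0, 78.0, 81.0]   # what the oximeter displayed

mean_bias, precision = bias_and_precision(spo2, sao2)
print(f"mean bias {mean_bias:+.2f}, precision (SD) {precision:.2f}")
```

A device whose errors were pure noise would show a mean bias near zero with some spread; a consistent positive mean in one subgroup is a different kind of error than a wide spread.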

------
DoofusOfDeath
IIUC, dark-skinned people have higher covid19 mortality rates, and physicians
usually(?) use this kind of oximeter in their decision-making. I wonder if
there's a connection.

~~~
sangnoir
The article also mentions that nonwhite people have a higher rate of their
trouble with breathing being dismissed as due to anxiety compared to white
people... It's possible there is a layer of biases, and at each step, there's
a risk of not getting timely medical intervention resulting in poorer
outcomes.

~~~
DoofusOfDeath
Right. I wasn't trying to imply that the oximeter bias explained the entire
difference in outcome rates. I'm just curious if it matters enough to warrant
further study.

~~~
sangnoir
I am agreeing with you, and speculating that the oximeter is a small tip of
the iceberg, and that a multitude of "small" biases add up to terrible
outcomes.

------
dafoex
Sorry if I'm not demonstrating adequate sympathy, but all I can think of from
reading this headline is "oh dear, a person with a genetic adaptation that
prevents some wavelengths of light from penetrating to deeper layers of the
skin will find it hard to make some wavelengths of light penetrate to deeper
layers of the skin"

As a nonmedical person, this seems blindingly obvious - to the point I could
reference the "floor is made of floor" meme - so it would surprise me to learn
that professionals _haven't_ considered this downfall of the technology when
taking their readings. Judging by the discussion here, I don't think I'm going
to be surprised today.

------
pj_mukh
Interestingly, the new Apple watch will supposedly have a pulse oximeter of
sorts [1]. I was looking forward to that. Now I wonder how it biases as well,
especially given the wide distribution of Apple Watches and them being used as
a de facto emergency health monitoring device [2]

[1]: [https://www.tomsguide.com/news/apple-watch-6-blood-oxygen-
mo...](https://www.tomsguide.com/news/apple-watch-6-blood-oxygen-monitoring-
confirmed-heres-the-proof)

[2]: [https://9to5mac.com/2020/07/01/critical-heart-
disease/](https://9to5mac.com/2020/07/01/critical-heart-disease/)

------
0xdeadb00f
I believe if we were given the race of the developers this bias would become
clear - nothing against the developers or their race, I'm not implying the
developers are racist in any way.

What I am saying is that systems developed by a team made up mostly of a
single race are bound to have biases, purely because of the lack of diverse
opinions while testing the system.

But I could be completely wrong...

Edit: I pressed send by accident, I never meant to actually post this comment
but I'm leaving it up anyways.

------
readams
It's maybe not too surprising that a device that works by shining a light
through your skin is affected by pigments in the skin.

~~~
GaryNumanVevo
That's not supposed to be surprising, it's the fact most pulse OX meters are
known to not work well for darker skin and it simply wasn't fixed.

------
rzmnzm
A company I worked for decided one day to purchase budget fingerprint readers
for door access. They were installed with much fanfare.

Unfortunately it turned out that the fingerprint readers were racist, and
refused to allow black people into the building.

They were swiftly removed.

------
GenerocUsername
Measurement devices do not have 'bias', they have 'tolerance'. Give them some
credit.

~~~
GaryNumanVevo
Two separate things. The tolerance for blood oxygen is centered around sensor
readings from white skin. Which is a bias.

------
jeffreyrogers
And people with sweaty hands. And people with low iron levels. There's this
push to show that everything is biased against certain classes. I'm sure some
of the examples are true, but I can't imagine that constantly seeking evidence
of problematic issues is good for individuals or society.

~~~
krapp
Seeking evidence of problematic issues is how society identifies and
eventually addresses such issues. Do you believe society would benefit from
ignoring evidence or pretending those issues didn't exist?

~~~
jeffreyrogers
I agree that seeking evidence is how issues are addressed. The problem is that
evidence is being sought in a particular way, which systematically biases
which evidence is available for consideration.

Edit: there is a further problem. Because many interpretations of the same
facts are possible, if you go looking for certain things you will very likely
find them, while someone with different initial beliefs will find evidence of
something else. All from the same facts.

~~~
krapp
But you seem to be suggesting that we ignore skin color as a relevant factor,
despite the data, out of a belief that any correlations drawn from that data
must be distorted to further a political agenda.

The article presents what appears to be a plausible, evidence-based hypothesis
as to why skin color in particular (not exclusive of other factors) might lead
to inaccurate results for dark-skinned people. It then mentions studies done
to either confirm or reject this hypothesis, and that the results seemed to
confirm it. They link to a follow-up study done here[0].

You're only giving vague, general dismissals. If you believe the conclusions
reached are in error, what evidence do you have that pulse oximeters are _not_
affected by skin color?

[0][https://journals.lww.com/anesthesia-
analgesia/fulltext/2007/...](https://journals.lww.com/anesthesia-
analgesia/fulltext/2007/12001/dark_skin_decreases_the_accuracy_of_pulse.4.aspx)

~~~
jeffreyrogers
I don't think the conclusions are in error. It's reasonably well known among
medical professionals that skin color affects pulse-ox readings. It is also
true that sweaty hands or low iron levels will cause incorrect pulse-ox
readings.

I'm also not suggesting we ignore skin color as a relevant factor. I'm taking
issue with the idea that everything that involves skin color is evidence of
bias or other more malicious underlying causes.

------
bluntfang
I don't think this should be flagged/removed. It's a technical review of
medical devices.

------
vr46
Quelle surprise - garbage in, garbage out. I'll bet my Apple Watch doesn't
have these issues.

~~~
vinay427
The Apple Watch doesn't have a pulse oximeter (EDIT: it does have one that is
not approved and therefore is disabled), so it would be difficult for it to
have this issue. You might be thinking of optical HR measurement, for which
sensors have also been shown to have skin color biases in some cases.

[https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7010823/](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7010823/)

~~~
ceejayoz
> The Apple Watch doesn't have a pulse oximeter

Yes, it does. The FDA blocked its use.

[https://www.city-journal.org/fda-blocks-apple-watch-blood-
ox...](https://www.city-journal.org/fda-blocks-apple-watch-blood-oxygen-
feature)

[https://www.cultofmac.com/320322/apple-watch-sensors-are-
cap...](https://www.cultofmac.com/320322/apple-watch-sensors-are-capable-of-
measuring-blood-oxygen/)

~~~
vinay427
Noted and thanks for the information, but as far as I can tell my point stands
if the feature is completely inaccessible. I don't really understand what the
GP comment was referring to regarding Apple Watch pulse oximetry accuracy.
Other consumer devices such as my smartphone and watch have working pulse
oximeter sensors built-in that are presumably FDA approved (or not FDA banned
anyway), so it doesn't bode well for the Apple Watch if it can't pass this.

~~~
vr46
I have a lot of confidence in Apple even if they are less than 100% all the
time. They put a vast amount of effort into catering for diverse users, their
iOS accessibility, for example. So I would bet that the Watch sensors are
solid. Or will be, as the feature appears to be coming soon:
[https://9to5mac.com/2020/03/08/apple-watch-blood-oxygen-
satu...](https://9to5mac.com/2020/03/08/apple-watch-blood-oxygen-saturation/)

~~~
TallGuyShort
Your dedication to your Kool-Aid is impressive.

