
This is so incredibly common, it's embarrassing. I was on an expert panel about "AI and Machine Learning in Healthcare and Life Sciences" back in January, and I made it a point throughout my discussions to keep emphasizing the amount of bias inherent in our current systems, which ends up getting amplified and codified in machine learning systems. Worse yet, it ends up justifying the bias based on the false pretense that the systems built are objective and the data doesn't lie.

Afterward, a couple people asked me to put together a list of the examples I cited in my talk. I'll be adding this to my list of examples:

* A hospital AI algorithm discriminating against black people when providing additional healthcare outreach by amplifying racism already in the system. https://www.nature.com/articles/d41586-019-03228-6

* Misdiagnosing people of African descent with genomic variants misclassified as pathogenic, due to most of our reference data coming from European/white males. https://www.nejm.org/doi/full/10.1056/NEJMsa1507092

* The dangers of ML in diagnosing melanoma exacerbating healthcare disparities for darker-skinned people. https://jamanetwork.com/journals/jamadermatology/article-abs...

And some other relevant, but not healthcare examples as well:

* When Google's hate-speech-detecting AI inadvertently censored anyone who used the vernacular referred to in this article as "African American English". https://fortune.com/2019/08/16/google-jigsaw-perspective-rac...

* When Amazon's AI recruiting tool inadvertently filtered out resumes from women. https://www.reuters.com/article/us-amazon-com-jobs-automatio...

* When AI criminal risk prediction software, used by judges in deciding the severity of punishment for those convicted, predicts a higher chance of future offense for a young, black first-time offender than for an older white repeat felon. https://www.propublica.org/article/machine-bias-risk-assessm...

And here's some good news, though:

* A hospital used AI to enable care and cut costs (though the reporting seems to oversimplify and gloss over enough to make the actual analysis of the results a little suspect). https://www.healthcareitnews.com/news/flagler-hospital-uses-...




I agree 100% about how common it is. The industry also pays lip service to doing something about it. My last job was at a research institution, and we had a data ethics czar who's a very smart guy (Stats PhD) and someone I consider a friend. A lot of his job was to go around the org and conferences talking about things like this.

While there's a lot of head nodding, nothing is ever actually addressed in day-to-day operations. Data scientists barely know what's going on when they throw things through TensorFlow. What matters is the outcome and the confusion matrix at the end.
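To make that concrete, here's a rough sketch (made-up numbers, and it assumes scikit-learn) of how the single pooled confusion matrix a project gets judged on can hide a much worse false-positive rate for one group:

    # Made-up numbers: the pooled confusion matrix hides a big per-group gap.
    import numpy as np
    from sklearn.metrics import confusion_matrix

    rng = np.random.default_rng(0)

    def simulate(fpr, n=1000):
        # Hypothetical group: 20% true positives, 80% recall, given false-positive rate.
        y_true = rng.binomial(1, 0.2, n)
        y_pred = np.where(y_true == 1, rng.binomial(1, 0.8, n), rng.binomial(1, fpr, n))
        return y_true, y_pred

    yt_a, yp_a = simulate(fpr=0.05)   # group A
    yt_b, yp_b = simulate(fpr=0.40)   # group B

    # The single matrix the project gets judged on.
    print(confusion_matrix(np.concatenate([yt_a, yt_b]), np.concatenate([yp_a, yp_b])))

    # The per-group false-positive rates that never make it into the report.
    for name, yt, yp in (("A", yt_a, yp_a), ("B", yt_b, yp_b)):
        tn, fp, fn, tp = confusion_matrix(yt, yp).ravel()
        print(f"group {name}: FPR = {fp / (fp + tn):.2f}")

The aggregate numbers look fine at the end of the pipeline; only the per-group breakdown shows who is actually paying for the errors.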

I say this as someone who works in data and implements AI/ML platforms. Mr. Williams needs to find the biggest ambulance-chasing lawyer and file civil suits against not only the law enforcement agencies involved, but also, top down, everyone at DataWorks from the president to the data scientist to the lowly engineer who put this in production.

These people have the power to ruin lives. They need to be made an example of and held accountable for the quality of their work.


Sounds like a license for developing software is inevitable then.


> When AI criminal risk prediction software, used by judges in deciding the severity of punishment for those convicted, predicts a higher chance of future offense for a young, black first-time offender than for an older white repeat felon.

> When Amazon's AI recruiting tool inadvertently filtered out resumes from women.

> When Google's hate-speech-detecting AI inadvertently censored anyone who used the vernacular referred to in this article as "African American English".

There's simply no indication that these aren't statistically valid priors. And we have mountains of scientific evidence to the contrary, but if I dared post anything (cited, published literature) I'd be banned. This is all based on the unfounded conflation between equality of outcome and equality of opportunity, and the erasure of evidence of genes and culture playing a role in behavior and life outcomes.

This is bad science.


Please post your sources. Your comments about

> the erasure of evidence of genes and culture playing a role in behavior and life outcomes

are concerning.


> There's simply no indication that these aren't statistically valid priors. And we have mountains of scientific evidence to the contrary, but if I dared post anything (cited, published literature) I'd be banned.

I'd suggest reading the sources I posted in my comment before responding with ill-conceived notions. Literally every single example I posted linked to peer-reviewed scientific evidence (cited, published literature) supporting the points I summarized.

The only link I posted without peer-reviewed literature was the last one with the positive outcome, and that's the one I commented had suspect analysis.


Let's just consider an example: to avoid sending travelers through unsafe areas, where do you draw the line in the following list?

1. Google's routing algorithm is conditioned on demographics

2. Google's routing algorithm is conditioned on income/wealth

3. Google's routing algorithm is conditioned on crime density

4. Google's routing algorithm cannot condition on anything that would disproportionately route users away from minority neighborhoods

I think the rational choice, to avoid forcing other people to take risks that they may object to, is somewhere between 2 and 3. But the current social zeitgeist seems only to allow for option 4, since an optimally sampled dataset will have very strong correlations between options 1-3, to the point that in most parts of the US they would all result in the same routing bias.
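The correlation point is easy to demonstrate with synthetic data (nothing below reflects real neighborhoods or any real routing system; the numbers are invented to illustrate the argument):

    # Synthetic illustration only: strongly correlated features make
    # options 1-3 produce nearly the same avoidance pattern.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 500  # hypothetical neighborhoods

    minority_share = rng.uniform(0, 1, n)                             # option 1's feature
    crime_density = 0.8 * minority_share + 0.2 * rng.normal(size=n)   # option 3's feature

    # Each "policy" routes around the worst 20% of neighborhoods by its own feature.
    avoid_demo = minority_share > np.quantile(minority_share, 0.8)
    avoid_crime = crime_density > np.quantile(crime_density, 0.8)

    print("feature correlation:", round(np.corrcoef(minority_share, crime_density)[0, 1], 2))
    print("overlap in avoided neighborhoods:",
          round((avoid_demo & avoid_crime).sum() / avoid_demo.sum(), 2))

When the features are that correlated, the policy that only looks at crime density ends up avoiding almost the same set of neighborhoods as the one that looks at demographics directly.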


This is exactly why I suggested actually reading the sources I posted before responding. The Google example has nothing to do with routing travelers. It was an algorithm designed to detect sentiment in online comments and to auto-delete any comments that were classified as hate-speech. The problem was that it mis-classified entire dialects of English (meaning it completely failed at determining sentiment for certain people), deleting all comments from the people of certain cultures (unfairly, disproportionately censoring a group of people). That's the dictionary definition of bias.


You're completely missing my point, and the purpose of my hypothetical. So let me try it with your example:

>The problem was that it mis-classified entire dialects of English (meaning it completely failed at determining sentiment for certain people), deleting all comments from the people of certain cultures

What happens in the case that a particular culture is more hateful? Do we just disregard any data that indicates socially unacceptable bias?

What, only Nazis are capable of hate speech?


> What happens in the case that a particular culture is more hateful? Do we just disregard any data that indicates socially unacceptable bias?

That's not what was happening. If you read the link, you'll see the problem is that the AI/ML system was mis-classifying non-hateful speech as hateful, just because of the dialect being used.

If it were the case that the culture was more hateful, then it wouldn't have been considered "mis-classification."
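To put hypothetical numbers on that distinction: a higher rate of genuinely hateful posts would be a base-rate difference, while what the article describes is a higher error rate on the posts that aren't hateful. A tiny sketch (all values invented):

    # Hypothetical numbers only, to separate the two claims.
    base_rate = {"dialect_A": 0.05, "dialect_B": 0.15}   # share of posts that really are hateful
    fpr = {"dialect_A": 0.05, "dialect_B": 0.45}         # non-hateful posts still flagged as hateful

    for d in base_rate:
        wrongly_deleted = (1 - base_rate[d]) * fpr[d]
        print(f"{d}: {wrongly_deleted:.0%} of all posts are non-hateful yet get deleted anyway")

Even if one group did post more hateful content, the bias being measured is in the second row: non-hateful comments being deleted at wildly different rates depending on dialect.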

> You're completely missing my point.

I'm not missing your point; it's just not a well-reasoned or substantiated point. Here were your points:

> There's simply no indication that these aren't statistically valid priors.

We do have every indication that this wasn't what was happening in literally every single example I posted. You just have to read them.

> And we have mountains of scientific evidence to the contrary, but if I dared post anything (cited, published literature) I'd be banned.

You say that, and yet you keep posting your point without any evidence whatsoever. Meanwhile, every single example I posted did cite peer-reviewed, published scientific evidence.

> This is all based on the unfounded conflation between equality of outcome and equality of opportunity, and the erasure of evidence of genes and culture playing a role in behavior and life outcomes.

Again, the peer-reviewed, published literature disagrees. Reading it explains why the claim that it's all unfounded conflation is incorrect.



