
That ProPublica article is a perfect example of the many aspects of this discussion. It argues just one side, but the scenario it raises is one where an ML algorithm correctly solved a badly framed problem.

Briefly (and simplifying for clarity), it worked like this: the ML algorithm scored the risk that a criminal would re-offend. It gave a "high risk" score to someone 80% likely to re-offend and a "low risk" score to someone with a 20% chance. Out of 100 black and 100 white criminals, here are the scores the algorithm gave, and how many ultimately did re-offend:

    100 black criminals
      * 50 "high risk", 40 re-offended
      * 50 "low risk", 10 re-offended
    100 white criminals
      * 10 "high risk", 8 re-offended
      * 90 "low risk", 18 re-offended
On the one hand, the algorithm was completely correct: 80% of "high risk" individuals re-offended and 20% of "low risk" individuals did, and this holds even when looking at the black and the white criminals separately. By its own goals, it was unbiased.

On the other hand, the errors disproportionately punished the black criminals: 10 black criminals were marked "high risk" yet never re-offended, compared to only 2 white criminals. Meanwhile, 18 "low risk" white criminals went on to re-offend, compared to only 10 black criminals. So the score was stricter for some unlucky black criminals and more lenient for some lucky white ones.

However, a key point is that because the underlying re-offense rates differed between the populations (50/100 for the black criminals, 26/100 for the white criminals), the algorithm could not have done otherwise. That is, given some n% re-offenders, if you have to fit them into 20% and 80% buckets, your "high risk" and "low risk" counts are mathematically fixed. In other words, the problem wasn't the ML by itself, but the COMPAS score it was trying to compute, which had these issues inherent in it.
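To make that concrete, here is a small Python sketch (not the actual COMPAS model; just the simplified numbers from above) that solves for the bucket sizes forced by each group's base rate when the "high risk" and "low risk" buckets are calibrated at 80% and 20%:

    # With calibrated 80%/20% buckets, bucket sizes are fixed by the base rate:
    #   high + low = total,  0.8*high + 0.2*low = reoffenders
    def forced_buckets(total, reoffenders, p_high=0.8, p_low=0.2):
        high = (reoffenders - p_low * total) / (p_high - p_low)
        low = total - high
        return {
            "high_risk": high,
            "low_risk": low,
            "false_positives": (1 - p_high) * high,  # high risk, never re-offended
            "false_negatives": p_low * low,          # low risk, re-offended
        }

    print(forced_buckets(100, 50))  # black group: 50 high / 50 low -> 10 FP, 10 FN
    print(forced_buckets(100, 26))  # white group: 10 high / 90 low ->  2 FP, 18 FN

The error counts fall straight out of the base rates, which is the sense in which the algorithm "could not have done otherwise."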

I think this is a good example where the ML wasn't biased (at least, not any more than reality), but where people were too eager to turn to it for a poorly considered project. By wrapping important questions in high-tech algorithms, it's too easy to fool yourself that what you're doing is the best thing you could be doing, and to miss problems in the fundamental framing.




1) Using skin color (or a proxy for it) as the basis for judgement or systematic discrimination is illegal, i.e. punishable.

2) The whole point of such systems is to impact the world, not to replicate already-recorded history or escalate existing problems.

Can the ML algo predict what the outcomes and consequences of its outputs are, and assess that? No?


The point is that the algorithm wasn't using skin color as an input (directly, at least) and got those outputs anyway.

Would you prefer an outcome where half of the white men were also marked as high risk, with 16 of them re-offending (so that the low risk pools have the same recidivism rate)? Now, instead of unjustly marking ten black men and two white men as high risk, you're doing so for ten black men and thirty-four white men. Is that somehow less discriminatory?
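For reference, the arithmetic behind that alternative (a sketch; the 50/50 split for the white men is the hypothetical posed above, not anything the real system produced):

    # Hypothetical: force the white low-risk pool to the same 20% recidivism
    # rate as the black low-risk pool by marking half the white men high risk.
    total_white, white_reoffenders = 100, 26
    low_risk = 50                                      # half marked low risk
    low_reoffend = 0.2 * low_risk                      # 10, matching the 20% low-risk rate
    high_risk = total_white - low_risk                 # 50 marked high risk
    high_reoffend = white_reoffenders - low_reoffend   # 16
    false_positives = high_risk - high_reoffend        # 34 never re-offend
    print(high_risk, high_reoffend, false_positives)   # 50 16.0 34.0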


Just because another option is worse doesn't make an unjust and repressive kind of social profiling better. The premise of using prediction at all may turn out to have dire consequences, which makes the whole premise dubious. This isn't a binary choice, but there needs to be accountability.


And what if the decision isn't based on race, but on things that happen to correlate with it? Whether your father was present, for instance, since that's a major predictive factor for all sorts of life outcomes even when controlling for race. Should that not be used just because some groups are more likely to come from single-parent households than others?

Because in the end, the choices are "accept that there will be a correlation between the outputs and race", "use no system that produces estimates or imperfect outputs", or "explicitly discriminate based on race to remove the race-result correlation". That's it.



