
Approaching fairness in machine learning - optimali
http://blog.mrtz.org/2016/09/06/approaching-fairness.html
======
Eridrus
The biggest issues of bias/fairness in ML are not to do with the algorithms or
results, but the underlying data.

A trivial example would be: what if you trained a classifier to predict
whether a person would be re-arrested before they went to trial? Some
communities are policed more heavily, so you would tend to reinforce the bias
that already exists and provide more ammunition to those arguing for further
bias in the system: a feedback loop, if you will.

Or what if some protected group ends up needing a higher down payment because
the group is not well enough understood for you to distinguish between those
who will repay your loans and those who won't? Maybe educational achievement is
a really good predictor for one group, but less effective for another. Is it fair to use
the protected class (or any information correlated with it) when it is
essentially machine-enabled stereotyping?

Recently it has been noted that NLP systems trained on large corpora of text
tend to exhibit society's biases, assuming, for example, that nurses are women
and programmers are men. From a statistical perspective the correlation is
there, but humans tend to be more careful about how they use this information
than a machine. We wouldn't want to use it to constrain our search for people
to hire to just those that fulfil our stereotypes, but a machine would.
This paper has some details on such issues:
[http://arxiv.org/abs/1606.06121](http://arxiv.org/abs/1606.06121)

I don't think there are any easy solutions here, but I think it's important to
be aware that data is only a proxy for reality and fitting the data perfectly
doesn't mean you have achieved fair outcomes.

~~~
h4nkoslo
These are not "fairness" issues; these are process feedback issues. The same
problem pops up if you're using algorithmic selection of machine parts to test
for failure, attempting to programmatically evaluate patches for code quality,
writing fraud detection algorithms, etc.

------
tlb
Another recent paper on this topic:
[http://arxiv.org/pdf/1606.08813v3.pdf](http://arxiv.org/pdf/1606.08813v3.pdf).
It shows how naive lending algorithms can skew against minority groups simply
because there is less data available about them, even if their expected
repayment rate is the same.

It can be self-reinforcing. Imagine some new demographic group of customers
appears, and without any data you make some loans to them. The actual
repayment rate will be low, not because that group has a worse distribution
than other groups, but simply because you couldn't identify the lowest-risk
members. A simplistic ML model would conclude that the new group is more
risky.

Of course, smart lenders understand that in order to develop a new customer
demographic they need to experiment by lending, with the expectation that
their first loans will have high losses, but that in the long run learning
about how to identify the low-risk people from that demographic is worthwhile.
And they correct for the fact that the first cohort was accepted blind when
estimating overall risk for the group.
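
To make the mechanism concrete, here is a minimal simulation sketch of the
effect described above; the screening rule and all numbers are hypothetical,
not taken from the paper:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    # True default risk: drawn from the same distribution for both groups.
    risk = rng.uniform(0, 1, n)

    # Established group: the lender has a noisy risk estimate and lends only
    # to the half that looks safest.
    estimate = risk + rng.normal(0, 0.1, n)
    approved_old = estimate < np.median(estimate)

    # New group: no data, so loans are made blind (a random half).
    approved_new = rng.random(n) < 0.5

    defaults = rng.random(n) < risk
    print("default rate, established group:", defaults[approved_old].mean())
    print("default rate, new group:        ", defaults[approved_new].mean())
    # The new group looks roughly twice as risky, purely because the lender
    # could not screen it.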

~~~
yummyfajitas
Of course, this theory of discrimination is only applicable _when minorities
are fundamentally different from majorities_. I.e., if the same ruleset is
accurate for both whites and blacks (i.e., "I don't care about race, if he
puts 20% down he's good"), this argument doesn't work at all - you can train
your model on everyone and it'll work just fine.

However, if blacks and whites need to be treated _fundamentally differently_
in order to make accurate loan decisions, then this argument applies. I.e.,
perhaps whites need a 20% downpayment for a loan to be financially a good risk
but blacks need 40% (or vice versa).

I wonder how many people calling algorithms racist will endorse this
conclusion. It sounds kind of...racist.

(Note that I don't use "racist" as a synonym for "factually incorrect" or "we
should not consider this idea", but merely "this sounds like the kind of thing
a white nationalist might say, or Trump would be criticized for if he said it".)

~~~
danso
Yes, blacks are fundamentally different from whites in terms of the available
data to train algorithms on:

[http://www.nytimes.com/2015/10/31/nyregion/hudson-city-bank-settlement.html](http://www.nytimes.com/2015/10/31/nyregion/hudson-city-bank-settlement.html)

> _The government’s analysis of the bank’s lending data shows that Hudson’s
> competitors generated nearly three times as many home loan applications from
> predominantly black and Hispanic communities as Hudson did in a region that
> includes New York City, Westchester County and North Jersey, and more than
> 10 times as many home loan applications from black and Hispanic communities
> in the market that includes Camden, N.J._

That's, of course, just recent history. Redlining that occurred from the 1960s
onward would be enough to adversely affect the housing history data of minority
groups even today. Treating everyone equally in the eyes of the algorithm is
certainly an easy route to go, but as the non-algorithm expert MLK Jr. pointed
out:

> _Whenever the issue of compensatory treatment for the Negro is raised, some
> of our friends recoil in horror. The Negro should be granted equality, they
> agree; but he should ask nothing more. On the surface, this appears
> reasonable, but it is not realistic._

~~~
yummyfajitas
Did you read what I wrote? Available data _on blacks specifically_ is
completely irrelevant if blacks and whites aren't fundamentally different. The
white model will generalize.

If repayment probability for blacks and whites alike is A x
downpayment_fraction + B x credit_score, you can use training data from whites
and the model will accurately predict black repayment probability. It only
fails if you actually need different coefficients A' and B' for blacks.

As an example, maybe for whites A = 1.0 and for blacks A' = 0.75. In that case
the optimal decision is to demand higher lending standards for blacks - a
black person with a 40% downpayment would be treated the same as a white
person with a 30% downpayment. Is this your belief?
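
The arithmetic behind that example, as a tiny sketch with the hypothetical
coefficients above:

    # repayment_prob = A * downpayment_fraction, with hypothetical A values.
    A_white, A_black = 1.0, 0.75
    required_prob = 0.30                # threshold the lender wants to hit
    print(required_prob / A_white)      # 0.30 -> a 30% downpayment clears it
    print(required_prob / A_black)      # 0.40 -> a 40% downpayment is needed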

~~~
closed
Even in models where race doesn't directly cause an outcome, a model's
judgements may be biased against a race.

For example, suppose that (1) people can be green or blue, (2) green people
tend to live in Idaho, (3) living in Idaho is associated with people not
paying back loans.

A linear model with non-zero, positive coefficients only along the path
p(green) -> p(Idaho) -> p(fail_to_repay), plus p(credit_score) ->
p(fail_to_repay), will create trouble, even though color does not directly
affect repayment. If you use a multiple regression with fail_to_repay ~ B0 +
B1 * Idaho + B2 * credit_score, it will discriminate against green people by
penalizing people from Idaho.
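
A minimal simulation of that scenario (all names and numbers are made up)
shows the effect:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 50_000
    green = rng.random(n) < 0.5
    # Green people are more likely to live in Idaho (made-up rates).
    idaho = rng.random(n) < np.where(green, 0.6, 0.2)
    credit = rng.normal(0, 1, n)
    # Ground truth: only Idaho and credit score matter; color has zero effect.
    p_fail = np.clip(0.2 + 0.3 * idaho - 0.05 * credit, 0, 1)
    fail = (rng.random(n) < p_fail).astype(float)

    # Fit fail_to_repay ~ B0 + B1 * Idaho + B2 * credit_score by least squares.
    X = np.column_stack([np.ones(n), idaho, credit])
    coef, *_ = np.linalg.lstsq(X, fail, rcond=None)
    pred = X @ coef
    print("coefficients (B0, B1, B2):", coef.round(3))
    print("mean predicted risk, green:", pred[green].mean().round(3))
    print("mean predicted risk, blue: ", pred[~green].mean().round(3))
    # Green applicants carry a higher average predicted risk even though color
    # has a zero coefficient in the ground-truth model.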

AFAIK, one of the points of the paper linked in the parent comment is that
blindly using indicators like IP address may indirectly lead to discrimination
against a racial group in this way, e.g. p(racial_group) ->
p(a_specific_IP_address).

Maybe more relevant to your example, though, is that assuming whites and
blacks share the same model in the "ground-truth" scenario I presented could
_cause_ a model to be discriminatory (when it shouldn't be, because the
coefficient for the path from p(green) -> p(fail_to_repay) is 0).

This specific issue is hairy, and exists for traditional approaches also.

~~~
yummyfajitas
If I understand your model right, you are saying that Idahoans don't repay
loans and your model accurately reflects this. This isn't a bias at all. The
model is issuing fewer loans to green people not because they are green but
because they live in Idaho and are unlikely to pay back said loans.

This is like the case described in the article, where even a perfect predictor
(another word for this is "reality" or "hindsight") will still exhibit
disparate impact.

~~~
closed
It is a bias if you calculate the cost to people taking out loans broken down
by color. Green people will pay a higher cost, even though in the ground-truth
model their color is not directly related to loan repayment.

For example, if only blue people in Idaho fail to repay loans, green people
will still absorb a greater cost in the multiple regression case above (in the
sense that they are more likely to be penalized for being Idahoans).

~~~
yummyfajitas
Yes, if it's actually (blue & Idaho) ~> default, and your model ignores blue,
then the greens will pay a higher cost. If color is redundantly encoded, then
your model can partially fix this and penalize the blues in Idaho.

Do you consider this situation unjust? If so, you might be unhappy to learn
that the entire goal of the field of algorithmic fairness is to do something
along these lines.

~~~
closed
> Available data on blacks specifically is completely irrelevant if blacks and
> whites aren't fundamentally different. The white model will generalize.

I should have been clearer that I was responding to this part of your comment:
even if blacks and whites aren't fundamentally different (in the sense that
race does not directly cause the outcome of interest), you can produce biases
that are essentially a misattribution of the relationship between race and
that outcome. Worse, if there _is_ a relationship, a model can estimate it in
the reverse direction (Simpson's paradox).
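
On the Simpson's paradox point, a small sketch (made-up numbers) of how the
pooled estimate can flip sign relative to the within-group relationship:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 20_000
    group = rng.random(n) < 0.5
    # The second group has higher x on average and a higher baseline risk.
    x = np.where(group, rng.normal(2, 0.5, n), rng.normal(0, 0.5, n))
    baseline = np.where(group, 0.8, 0.2)
    center = np.where(group, 2.0, 0.0)
    # Within each group, higher x means LOWER risk.
    y = baseline - 0.1 * (x - center) + rng.normal(0, 0.05, n)

    print("within-group slopes:",
          np.polyfit(x[~group], y[~group], 1)[0].round(3),
          np.polyfit(x[group], y[group], 1)[0].round(3))    # both about -0.1
    print("pooled slope:", np.polyfit(x, y, 1)[0].round(3))  # positive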

> Do you consider this situation unjust? If so, you might be unhappy to learn
> that the entire goal of the field of algorithmic fairness is to do something
> along these lines.

I don't think the creation of tools to accommodate this specific purpose is
bad, per se. Whether or not they are the appropriate tool to use is a
different question.

------
fatdog
What is fairness but political accountability?

There is an old joke about how people use statistics like a drunk uses a lamp
post: for support and not for illumination. Given this, we can expect people
to use AI like everything else in statistics, to support the agenda of whoever
is operating it while defraying negative personal accountability for the
results, because artificial intelligence. It's just an obfuscated and
sophisticated version of, "Computer says no."

The alternative is the near-future headline, "AI confirms racists, sexists on
to something."

~~~
carapace
I wish I could upvote this remark twice (or more).

This is pretty much _the only important concept_ for figuring out how we will
use this tech politically.

Because it takes genius-level intelligence to be able to figure out whether
you're just telling yourself what you want to hear, and incredibly rare
responsibility to remember to [keep on trying to] do so, individuals and tiny
groups may be able to use AI for these sorts of things, but large groups,
municipalities, states, corps, etc. never will.

The systems we can understand and manage as a group are vastly simpler than
those we can understand and manage as individuals.

------
wyager
Everyone suggesting that we ought to legislate that machines must be
illogical/suboptimal is missing the point.

If machine learning algorithms are unfairly discriminating against some group,
then they are making sub-optimal decisions and costing their users money. This
is a self-righting problem.

However, a good machine learning algorithm may uncover statistical
relationships _that people don't like_; for example, perhaps some
nationalities have higher loan repayment rates. In these cases, the algorithm
is not at odds with reality; the angsty humans are. If some people want to
force machines to be irrational, they should at least be honest about their
motivations and stop pretending it has anything to do with "fairness".

~~~
throw_away_777
This is a great point. People believe that most groups are basically equal;
this is true in the sense that if people were raised in identical environments
with equal opportunities, then it probably wouldn't really matter what group
they were in, but wrong because that isn't the world we live in. Different
groups on average experience very different environments. Machine learning
doesn't care why the differences between groups arise, but people do.
Fundamentally the question is whether we want to base our decisions on how the
world is, or on how we want the world to be.

~~~
PeterisP
It comes down to a choice between equality of opportunity versus equality of
outcome (or some mix of the two). You can't have both - granting equal
opportunities will result in unequal outcomes for all kinds of fair and unfair
reasons; and ensuring equal outcomes requires unequal opportunities (e.g.
quota systems).

For unfair stereotypes it's simple, you just ignore them; but there will be
some group differences that are _real_ - it would be a mighty coincidence if
so many diverse groups magically happened to be identical in all respects.

So it's up to society to decide what we will do if it turns out that, other
observable factors being equal, race/religion/ethnic background/etc. X
_actually is_ 10% more likely to default on a loan.

------
yummyfajitas
After studying this issue, and learning a lot more about learning and
optimization, I've come to the conclusion that the best solution [1] is
probably explicit racial/sexual/other special interest group quotas.

Specifically, we should train a classifier on non-Asian minorities. We should
train a different classifier on everyone else. Then we should fill our quotas
from the non-Asian minority pool and draw from the primary pool for the rest
of the students.

As this blog post describes, no matter what you do you'll reduce accuracy. But
every other fairness method I've seen reduces accuracy both _across_ special
interest groups and also _within_ them. Quotas at least give you the best non-
Asian minorities and also the best white/Asian students.

Quotas also have the benefit of being simple and transparent - any average Joe
can figure out exactly what "fair" means, and it's also pretty transparent
that some groups won't perform as well as others and why. In contrast, most of
the more complex solutions obscure this fact.

[1] Here "best" is within the framework of requiring a corporatist spoils
system. I don't actually favor such a system, but I'm taking the existence of
such a spoils system as given.
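
A rough sketch of the quota scheme described above, as I understand it (the
model choice and function shape are illustrative, not a real implementation):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def admit(X_a, y_a, X_b, y_b, X_a_new, X_b_new, n_seats, quota_a):
        """Rank each pool with its own classifier, then take quota_a seats
        from pool A and the remaining seats from pool B."""
        score_a = LogisticRegression().fit(X_a, y_a).predict_proba(X_a_new)[:, 1]
        score_b = LogisticRegression().fit(X_b, y_b).predict_proba(X_b_new)[:, 1]
        take_a = np.argsort(-score_a)[:quota_a]
        take_b = np.argsort(-score_b)[:n_seats - quota_a]
        return take_a, take_b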

~~~
h4nkoslo
The problem is that you run out of "good" NAMs (non-Asian minorities, or women
with the exact same career preferences as men, etc.) extremely quickly. The
demand for "good" NAMs in any given field vastly exceeds supply, since quotas
tend to be set at population proportion.

------
drivingmenuts
Either you allow an algorithm to be ruthlessly fair, or you introduce bias and
never get the problem solved correctly, because someone, somewhere, will still
find a way to gripe about the amount of bias when, inevitably, it goes against
them, or is perceived to be against them due to lack of knowledge. Then you
wind up bikeshedding over the bias and not the actual problem.

------
rubyfan
I am actually optimistic about Big Data's effect on equality.

Small data is actually kind of the problem. When you have limited ability to
process data, or limited data density, your segmentation is limited to coarse
variables like state, county, zip code, credit score, whether you own a home,
etc.

Big data processing, big bad ML algorithms and the ubiquity of data are making
advanced segmentation available that allows us to reach arguably more equitable
outcomes.

------
drpgq
Bayes and discrimination law don't seem like good partners.

------
denzil_correa
> As a result, the advertiser might have a much better understanding of who to
> target in the majority group, while essentially random guessing within the
> minority.

If this is the case, then it should be detected and ML should NOT be used for
the minority class. There are many classifiers out there which work on one-
class problems.
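
One way to read that suggestion, sketched with scikit-learn's OneClassSVM
(the interpretation and all numbers here are illustrative): fit a detector on
the well-covered data and refuse to score applicants it flags as unfamiliar.

    import numpy as np
    from sklearn.svm import OneClassSVM

    rng = np.random.default_rng(3)
    X_majority = rng.normal(0, 1, (1000, 5))     # group with good coverage
    X_incoming = rng.normal(0.5, 1.5, (200, 5))  # possibly off-distribution

    detector = OneClassSVM(nu=0.05, gamma="scale").fit(X_majority)
    familiar = detector.predict(X_incoming) == 1  # -1 means "looks unfamiliar"
    print("fraction the main model should not score:", 1 - familiar.mean())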

