
Replacing Judgment with Algorithms - henrik_w
https://www.schneier.com/blog/archives/2016/01/replacing_judgm.html
======
yummyfajitas
Schneier makes a startling claim:

 _It's already illegal for these algorithms to have discriminatory outcomes,
even if they're not deliberately designed in. This concept needs to be
expanded. We as a society need to understand what we expect out of the
algorithms that automatically judge us and ensure that those expectations are
met._

Schneier is discussing an unpleasant fact; unbiased algorithms often discover
that things we previously attributed to bias were actually unbiased
predictors. But we don't like to admit this, so we instead make silly
statements like the above.

If we want to mandate the outcome of our decision process, let's just do that
openly and honestly. E.g. for credit, just impose a quota of credit that must
be issued to each protected class.

If we want fair and unbiased decisions then we need to instead examine the
internals of our process and accept the outcomes, whatever they may be.

But we need to stop pretending we can do both.

~~~
AnthonyMouse
> Schneier is discussing an unpleasant fact; unbiased algorithms often
> discover that things we previously attributed to bias were actually unbiased
> predictors. But we don't like to admit this, so we instead make silly
> statements like the above.

There is a legitimate question as to whether an algorithm is actually
unbiased. It could be quite accurate or there could be a bias, e.g. because
the algorithm penalizes drug convictions and drug convictions have an
unjustified bias due to racist or sexist enforcement.

The problem is it isn't a simple question to answer. And people are going to
want to choose their answer based on their politics.

But that can't work. We can't let a black box take discriminatory inputs and
expect it to produce a just result. We also can't just assume that every
unequal outcome had an unjust cause, or we'll end up e.g. causing loans to be
given to uncreditworthy people and wrecking the economy (see also housing
crash).

Which means that Schneier is wrong about quotas but exactly right that what we
need is transparency. Because _if_ there _is_ unjustified bias then we need to
understand why so we can prevent it, and if there _isn't_ then we need to
understand why so we can accept it.

~~~
yummyfajitas
Consider your drug conviction example, and let's say the algorithm is linear
regression so I can give a very simple explanation.

The model will then be FICO = stuff + (-1) x (drug conviction).

I.e., your FICO score is lowered by 1 point if you've been convicted of drug
use.

If you are correct, some enterprising quant can take your hypothesis. He can
then rerun his regression with (drug conviction, black) and (drug conviction,
white) as variables. If your hypothesis is correct the result will be:

FICO = stuff + (-0.5) x (drug conviction, black) + (-1.5) x (drug conviction,
white).

This is because the bias in drug convictions makes it less predictive for
black people.

If you are wrong the coefficients will be equal [1]. If you are right he just
made millions of dollars for the bank he works for and probably added $100k to
his bonus.

What you are discussing is strictly a statistics problem.
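The split-coefficient check described above can be sketched with a toy simulation (made-up numbers and numpy least squares; nothing here is real credit data, and the group labels and effect sizes are purely hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data: a random group indicator, a conviction flag, and
# "stuff" standing in for all other creditworthiness signals.
black = rng.integers(0, 2, n)
conviction = rng.integers(0, 2, n)
conv_black = conviction * black          # (drug conviction, black)
conv_white = conviction * (1 - black)    # (drug conviction, white)
stuff = rng.normal(0, 1, n)

# Simulate a world where biased enforcement makes convictions *less*
# predictive for the over-policed group: -0.5 versus -1.5.
fico = (700 + 10 * stuff - 0.5 * conv_black - 1.5 * conv_white
        + rng.normal(0, 1, n))

# Rerun the regression with the conviction variable split by group.
X = np.column_stack([np.ones(n), stuff, conv_black, conv_white])
coef, *_ = np.linalg.lstsq(X, fico, rcond=None)
print(coef)  # ≈ [700, 10, -0.5, -1.5]: the split coefficients differ
```

If the hypothesis were wrong, i.e. the data were generated with a single shared conviction effect, the two split coefficients would come out statistically equal.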

[1] For examples of this type of analysis on social indicators (I don't know
of any on credit allocation), see these papers:
[https://www.rand.org/content/dam/rand/www/external/labor/sem...](https://www.rand.org/content/dam/rand/www/external/labor/seminars/adp/pdfs/adp_ajph.pdf)
[http://egov.ufsc.br/portal/sites/default/files/anexos/33027-...](http://egov.ufsc.br/portal/sites/default/files/anexos/33027-41458-1-PB.pdf)
[http://ftp.iza.org/dp8733.pdf](http://ftp.iza.org/dp8733.pdf)

~~~
AnthonyMouse
You're essentially making an argument from the efficient market hypothesis:

[https://en.wikipedia.org/wiki/Efficient-market_hypothesis](https://en.wikipedia.org/wiki/Efficient-market_hypothesis)

In other words, the algorithm won't be biased because there is a buck to be
made by removing the bias.

But the efficient market hypothesis is weird. It's kind of like Schrödinger's
Cat. If you can find some specific instance where it's wrong and publish it
then the market corrects itself, so if you observe a bias then it ceases to
exist. So now on one hand, we know that biases can exist because people have
discovered them in the past; it's just that _those_ biases have already been
published and are being taken into account. We can speculate that more exist
but we don't know what they are. On the other hand, maybe now we've finally
found them all (or the remaining ones are negligible) and the efficient market
hypothesis becomes true.

But notice that the thing to do if you believe the efficient market hypothesis
is wrong in a particular case is clear. _Prove it_ , because you'll be able to
make a lot of money by making the world more just.

Which is why we need transparency. Because the market participants can't find
the bias if they don't have the information, and then the efficient market
hypothesis will certainly be false.

~~~
yummyfajitas
I made no claims about the EMH. My only claim is that if bias like what you
describe exists, it's something that statisticians rather than politicians
need to identify and fix, and that the statisticians in the position to fix it
have perfect incentives to do so.

Politicians have neither the ability nor incentive.

Similarly, if someone were claiming that we need political oversight of UI/UX
choices, and that American web pages need more red/white/blue to improve
conversion rates, I'd suggest that this is a job for web designers rather than
politicians. I'd also say that if you just want red/white/blue designs for
patriotic purposes, you should openly advocate for that and not talk about
conversion rate.

------
secondtimeuse
Paraphrasing the Sussman-Minsky story [1]: just because you rely on judgement
does not mean there is no algorithm; it's just that you don't know what the
algorithm is. In a country like China with rampant corruption, algorithms are
a welcome relief. Schneier, having never faced any kind of corruption at the
individual level, completely ignores this giant problem. While the exact
details of the algorithm might be debatable, having an algorithm rather than
corrupt bureaucrats making arbitrary decisions is a welcome step.

[1] "So Sussman began working on a program. Not long after, this odd-looking
bald guy came over. Sussman figured the guy was going to boot him out, but
instead the man sat down, asking, "Hey, what are you doing?" Sussman talked
over his program with the man, Marvin Minsky. At one point in the discussion,
Sussman told Minsky that he was using a certain randomizing technique in his
program because he didn't want the machine to have any preconceived notions.
Minsky said, "Well, it has them, it's just that you don't know what they are."
It was the most profound thing Gerry Sussman had ever heard. And Minsky
continued, telling him that the world is built a certain way, and the most
important thing we can do with the world is avoid randomness, and figure out
ways by which things can be planned. Wisdom like this has its effect on
seventeen-year-old freshmen, and from then on Sussman was hooked."

~~~
adrianN
Regarding the Sussman story, it made it into a Hacker Koan

[https://en.wikipedia.org/wiki/Hacker_koan#Uncarved_block](https://en.wikipedia.org/wiki/Hacker_koan#Uncarved_block)

------
Houshalter
This article is mainly about algorithms used by the government for
surveillance and control. Bad stuff, of course.

But in general, I think algorithms are something to embrace rather than fear:

Even very simple algorithms reliably outperform experts by a large margin, in
certain domains:
[http://lesswrong.com/lw/3gv/statistical_prediction_rules_out...](http://lesswrong.com/lw/3gv/statistical_prediction_rules_outperform_expert/)

Despite this, humans are heavily biased against algorithms, even when we know
they perform much better than we do. We also tend to vastly overestimate
human performance:
[http://lesswrong.com/lw/lsc/link_algorithm_aversion/](http://lesswrong.com/lw/lsc/link_algorithm_aversion/)

This has a lot of consequences. Humans are terribly biased, and we often
aren't even aware of our biases. It's not just race: even people who are
ugly are heavily discriminated against:
[http://lesswrong.com/lw/lj/the_halo_effect/](http://lesswrong.com/lw/lj/the_halo_effect/)
We discriminate against people of different political parties even more than
we do based on race:
[http://slatestarcodex.com/2014/09/30/i-can-tolerate-anything-except-the-outgroup/](http://slatestarcodex.com/2014/09/30/i-can-tolerate-anything-except-the-outgroup/)
Or we judge on random factors we can't predict. Nothing about being judged by
a human is "fair".

And beyond fairness, algorithms simply give better predictions. Humans aren't
very good at prediction, even when we think we are, and in some domains, like
medicine, that's extremely important. The first link references several cases
where statistical methods did better at diagnosis than the doctors who
specialized in it. Yet there has always been massive resistance to using
algorithms in diagnosis.

~~~
rando289
> I think algorithms are something to embrace rather than fear

He has the same sentiment in his article.

------
jraines
Replacing judgment with algorithms has always been part of bureaucracy.

Once encoded as programs, though, they have a chance -- however unlikely, and
still not without flaws -- of being audited either by the public or at least
by independent auditors.

~~~
pavel_lishin
Why do you suppose that programs written in a programming language will be
audited more effectively than programs written down on paper?

------
Nutmog
A large part of our lives is already devoted to gaming the human judgement
system. The clothes we wear, the way we speak, the way we walk. Who we
associate with - "Associate with deadbeats, and you're more likely to be
judged as one." - has always happened in real life too. Women wear makeup to
work to game the judgement system and men try to show off their confidence.
All just to try to influence how other people see them.

So adding computers isn't really a big deal. It's already a major part of our
daily lives and efforts.

------
logn
I think we can probably get transparency and ethics built into the algorithms.
But I think we'll have a much harder time controlling how pervasive and
intrusive the algorithms are. And as a result, our behavior will be more
effectively controlled by corporations and governments.

------
rubberstamp
If any such algorithms are going to be built, they had better be open source
or able to explain to me how they arrived at that judgment. I am not going to
trust some black box on this.

------
zdw
Dupe:
[https://news.ycombinator.com/item?id=10864933](https://news.ycombinator.com/item?id=10864933)

~~~
throwaway2048
calling attention to a story without upvotes or comments isn't very useful.

------
hiddencost
There's already very active mathematical work in this area. See fatml.org. I
appreciate Jeremy Kun calling this out in the comments on Schneier's post.

------
ikeboy
>The first step is to make these algorithms public. Companies and governments
both balk at this, fearing that people will deliberately try to game them, but
the alternative is much worse.

This claim needs far more than the single sentence given to justify it. How
do we know that an open-source security algorithm will be better? The
justification isn't strong enough for the claim.

~~~
bnegreve
It seems to me that the whole article is about proving this point.

Relevant quotes:

> The secrecy of the algorithms further pushes people toward conformity. If
> you are worried that the US government will classify you as a potential
> terrorist, you're less likely to friend Muslims on Facebook.

> If you know that your Sesame Credit score is partly based on your not buying
> "subversive" products or being friends with dissidents, you're more likely
> to overcompensate by not buying anything but the most innocuous books or
> corresponding with the most boring people.

> Uber is an example of how this works. Passengers rate drivers and drivers
> rate passengers; both risk getting booted out of the system if their
> rankings get too low.

> Many have documented a chilling effect among American Muslims, with them
> avoiding certain discussion topics lest they be taken the wrong way.

~~~
ikeboy
Those quotes are trying to establish a loss from having these algorithms.
They don't show whether the benefits of open source outweigh the costs; that
part is only handwaved in the sentence I quoted.

~~~
bnegreve
I think you're being unfair. The author has given us many clear issues with
closed-source algorithms that would not exist with open-source algorithms.

Showing that the benefits _outweigh_ the costs would require a model of the
impact of (closed or open) algorithmic judgments on society, and thus a model
of society itself.

Such a model would inevitably be subject to even more subjectivity and
debate.

If you're not buying it, that's your choice, and you may even be right. But
that doesn't make the whole argument inadmissible. Instead, you should point
out the benefits of closed-source algorithms for society; they're not that
obvious to me.

~~~
ikeboy
"the alternative is much worse."

That's a definite statement, and shouldn't be made without such a model. If
you only have handwavey models, you can't claim one is "much worse" than the
others. The level of confidence isn't justified. I'm fine with a list of
possible bad outcomes, but don't claim way A is much worse than way B without
stronger evidence.

>Instead, you should mention the benefits of closed source algorithms for the
society, they're not that obvious to me.

One is mentioned in the part I quoted: secret algorithms are harder to
circumvent. This isn't like regular security software, which is theoretically
secure but practically has holes, so there are gains from opening it up and
letting people report them. There is no theoretical security model here; it's
only probabilistic, and there may be no "patch" for this kind of hole (I'll
give an example below). So security by obscurity might make more sense here
than it does in general.

Example: suppose our data shows that people who drive car X pose a greater
security risk. If this is kept secret, we can target people more effectively
(in practice, many different factors will combine before someone counts as an
extreme risk, but this is simplified). The fact that X is weak evidence of
risk is information that becomes useless relatively soon after it is
publicized (because potential terrorists just put "don't buy X" on their
checklists). So publicizing the algorithm would concretely lead to worse
results.

Realistically, car model probably isn't too predictive, but possibly the union
of all behaviors that are easy to change when known predicts risk reasonably
well.
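As a toy illustration of how a published predictor decays, in likelihood-ratio terms (all numbers hypothetical):

```python
# Toy numbers, all hypothetical: before publication, 30% of high-risk
# individuals drive car X versus 10% of the general population.
p_x_given_risk_before = 0.30
p_x_given_pop = 0.10

# Likelihood ratio: how much owning X raises our estimate of risk.
lr_before = p_x_given_risk_before / p_x_given_pop  # ≈ 3

# Once the rule is public, would-be attackers simply avoid car X, so the
# feature's rate among them falls toward the population rate and the
# signal dies.
p_x_given_risk_after = 0.10
lr_after = p_x_given_risk_after / p_x_given_pop  # ≈ 1

print(lr_before, lr_after)
```

The publicized feature carries no information precisely because it is cheap for adversaries to change; features that are expensive to change would retain some value even when known.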

