
Cathy O’Neil on Weapons of Math Destruction - icebraining
http://www.econtalk.org/archives/2016/10/cathy_oneil_on_1.html
======
pliny
Cathy's use of 'problematic' seems to paper over any need to actually analyse
the costs and benefits of using statistics and statistical learning in
sentencing, personal banking, etc.

In the business of making credit decisions (specifically, who to extend credit
to, and at what rates), preventing banks from using better information only
harms people who would have been the subjects of erroneous decisions in their
favour, and the ability of the bank to consistently make better judgements
means they can also offer lower rates and fees to people they choose to extend
credit to.

This is in contrast to sentencing, where you prefer (in theory, at least) to
bias judgements in favour of leniency (in the spirit of Blackstone's
formulation), and you might even prefer to make mistakes, if making the right
decisions alienates people who interact with the justice system.

~~~
mcherm
> In the business of making credit decisions (specifically, who to extend
> credit to, and at what rates), preventing banks from using better
> information only harms people who would have been the subjects of erroneous
> decisions in their favour

I disagree. For instance, in the credit-card business a key factor in the
ability of a bank to be successful is the ability to assess the likelihood of
a person defaulting on their payments. There are a lot of factors that go into
the formula to calculate that risk today, especially things like previous
payment patterns and previous borrowing history. There are some factors that
do NOT go into the formula like the borrower's race or the payment patterns of
family members.
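In modelling terms, that restriction is usually enforced structurally, by filtering the feature set before a score is ever computed, not merely by hoping the weight comes out as zero. A toy sketch, where every feature name and weight is hypothetical and chosen only to illustrate the mechanism:

```python
# Toy credit-risk score. All feature names and weights are hypothetical,
# invented to illustrate structurally excluding prohibited attributes.
WEIGHTS = {
    "payment_history": -0.6,  # a strong payment history lowers predicted risk
    "utilization": 0.4,       # high credit utilization raises it
    "race": 0.9,              # even if correlated with defaults, barred by policy
    "family_score": 0.5,      # likewise barred
}

PROHIBITED = {"race", "family_score"}  # features the model must never use

def risk_score(applicant: dict, weights: dict = WEIGHTS) -> float:
    """Score an applicant while structurally ignoring prohibited features,
    even if weights for them are present in the configuration."""
    return sum(w * applicant.get(f, 0.0)
               for f, w in weights.items()
               if f not in PROHIBITED)
```

Because the prohibited features are filtered out before the sum, two applicants who differ only on those attributes always receive the same score, regardless of what weights someone supplies for them.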

Now, I am sure (although I have not actually run the numbers) that an analysis
would show that race and family-member credit scores are fairly strongly
correlated with default rates. That means that if a credit-card company chose
to use factors like race as part of its scoring decisions, it would do better
than another bank that didn't use those factors.

But we do not WANT to be using race or family wealth to decide credit
decisions. Speaking as a banker, my company does not want to be using those
criteria; speaking as a citizen, my country does not want banks to be using
those criteria. Restricting what criteria are permissible for making credit
decisions enables the banks that would refuse to use that data for ethical
reasons to remain competitive. Restricting it allows us to craft a society
where any citizen has an equal chance of success.... well, that may be a
stretch but at least it is CLOSER to such a society than if we did not have
such restrictions.

Sure, you can categorize this as "erroneous decisions in the favor of those
who come from a poor or minority background", and I suppose you are
technically correct. But that pretty phrasing doesn't make it ethical.

~~~
anthony_james
It's important that, while we keep in mind the ethical considerations around
minorities, we also remember the ethical consideration owed to marginalized
users of financial capital, who are denied credit because banks lack the
information that would let them offer a lower rate.

This isn't a black-and-white choice between including peripheral information
in credit lending and excluding it. When banks and lenders make a deliberate decision to
ignore information that could allow them to be more accurate with lending,
those costs are passed down to users - and although it may not hurt the
typical HN user, marginalized credit-seekers can literally have their hopes of
home-ownership or education denied because of a bank's choice to be ignorant
but considerate. Neither choice is completely without consequences.

~~~
wfo
It's true -- we also, however, ban firing based on race or sex. Purportedly,
in the view of the managers who would wish to do so, their company would be
more efficient if they were allowed to hire and fire whomever they wish (e.g.
only employ white people) -- perhaps they believe, in their
racist world view, that having a company full of one race would create
workplace unity, etc.

We could just let the market decide: since we know the antiracist hypothesis
is true, racist companies would miss out on a huge amount of qualified labor
and get beaten in a free market (forgetting for a moment that free markets are
a myth and don't exist, and that the markets we do have aren't even close to
efficient).

But we don't. We force companies to act a certain way even though they don't
wish to, because we are forcing ethical standards on them. This is, akin to
your point, at the expense of all of the workers not protected by the law.
Every time we prevent a black person from being fired because he is black,
it's at the expense of the white person who would take his job. The costs are
passed down to white people. Neither choice is completely without
consequences, yet we've made the one against racism.

~~~
anthony_james
I think this case is slightly more complicated because the demographics of
marginalized credit-seekers tend to be minorities. So by catering to minority
groups by keeping their ethnic or racial information private, we may hurt
minority groups financially. In some of the situations you mention, we force
ethical standards on businesses, but ideally they don't hurt other minority
groups in the process.

In this situation, the question is whether ensuring privacy is more ethical
than ensuring access to capital - and this is almost entirely focused on
minority groups. If we assume that banks can make more efficient and
competitive lending transactions given more demographic information, then
denying them that information lowers the ceiling on financial capital for
those marginalized groups.

As of now, I don't have a definitive answer, though I think it would be
beneficial to examine how particular data impacts credit lending and go from
there. A lot of these concerns may be moot if the information in question
isn't even relevant to credit lending.

------
panglott
Some really interesting points in her negative review of Nate Silver's book:

"In baseball, a team can’t create bad or misleading data to game the models of
other teams in order to get an edge. But in the financial markets, parties to
a model can and do. ...Silver gives four examples of what he considers to be
failed models at the end of his first chapter, all related to economics and
finance. But each example is actually a success (for the insiders) if you look
at a slightly larger picture and understand the incentives inside the system.
...Silver confuses cause and effect. We didn’t have a financial crisis because
of a bad model or a few bad models. We had bad models because of a corrupt and
criminally fraudulent financial system.

...Silver has an unswerving assumption, which he repeats several times, that
_the only goal of a modeler is to produce an accurate model._ "

Her other examples are things like pharmaceuticals research.
[http://www.nakedcapitalism.com/2012/12/cathy-oneil-why-nate-...](http://www.nakedcapitalism.com/2012/12/cathy-oneil-why-nate-silver-is-not-just-wrong-but-maliciously-wrong.html)

------
theoh
There's a fine paper called "Bias in Computer Systems" which is helpful in
thinking about these problems.

[https://www.nyu.edu/projects/nissenbaum/papers/biasincompute...](https://www.nyu.edu/projects/nissenbaum/papers/biasincomputers.pdf)

I found this recently in an appendix of Susan Leigh Star's book "Standards and
their stories" which outlines a syllabus for the teaching of "infrastructure
studies". The reading list also discusses the consequences of systems of
categorization such as the DSM and medical notions of gender.

~~~
spangry
For well over a year now I've been mulling over the damaging effects of the
DSM's categorisation of mental pathology. It had never occurred to me that
this issue might be of more universal concern. For what it's worth, I think
the damage caused by DSM stems from inaccurately modelling mental illness as:

(1) binary propositions instead of assessing function/dysfunction on a
continuum (i.e. you either have 'major depressive disorder' or you don't); and

(2) discrete and distinct 'diseases', instead of the cumulative effects of
dysfunction/abnormality in multiple neural 'sub-systems'.

If you're interested in the DSM case, and the likely way forward (from my
understanding, the convergence of psychiatry and neurobiology and increasingly
accurate and affordable neuro-imaging techniques), the textbook "Stahl's
Essential Pharmacology" is well worth a read.

~~~
spangry
Ah crud. That should actually be "Stahl's Essential Psychopharmacology".

------
robocaptain
For anyone who is a fan of Cathy and wants to hear more, she is a co-host on
the Slate Money podcast[0] (along with the amazing Felix Salmon) which covers
a lot of these topics.

Although not ONLY these types of topics, I should say. It's a great nerdy
finance podcast.

[0]
[http://www.slate.com/articles/podcasts/slate_money.html](http://www.slate.com/articles/podcasts/slate_money.html)

~~~
throwaway40483
Definitely one of the better (high wonkish/entertainment ratio) podcasts.

------
thedayisntgray
This is an interesting read that is related. I wonder if she covers this
specific topic in her book:
[https://www.propublica.org/article/machine-bias-risk-assessm...](https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing).
Machine bias in prison sentencing.

~~~
barney54
Yes, and it's a large part of the conversation in this interview.

------
darawk
Firstly, I just started listening to EconTalk and it is truly excellent. Even
though I don't always agree with the host, he's always thoughtful and willing
to listen and seriously consider alternative points of view, and seems to be
genuinely interested in understanding the issue he's discussing.

That being said, I took issue with the discussion at the end of this episode
regarding Google's ad targeting being used for 'bad' products like payday
loans or for-profit online universities. Even though they appeared to be on
opposite sides of the issue, neither addressed what seemed to me to be the
core point, which is: why is it better if rich people have to see ads for
payday loans too? She seemed to be suggesting that Google's targeting somehow
makes this problem worse by focusing these ads on vulnerable people. And while
that may be true, if the thing is harmful when sprayed across un-targeted
media, why is it so much worse when it's targeted? Just because it gives these
people a better ROI on their spend? It just seems like a total red herring
issue to me.

I totally agree that things like sentencing or policing using machine learning
algorithms will strongly tend to reinforce the status quo. But ad-targeting
just doesn't fit into that mould, IMO.

------
mhaymo
Great podcast. It's largely about the fallacy that seemingly intractable
problems can be solved with simple (and proprietary) machine learning models,
and the damage we do by buying into it. In particular I don't know how the
"recidivism risk scores" she rails against can be defended - how can a machine
prediction based on a quiz be a better predictor than the judgment of a
qualified judge who has presided over the defendant's trial?

~~~
throwaway40483
I think she addressed this by saying that the judges usually end up being more
racist than the baked-in racism of the algorithms.

------
madenine
Her PR team seems to have been doing a great job. Can't go more than a day or
two without seeing something about this book.

------
leonidr
One wonders about her opinion on the cost and benefits simulations that lead
to the Affordable Care Act.

------
h4nkoslo
Cathy O'Neil's main concern is that someone might discover something true,
useful, and racist / sexist / homophobic / etc. In order to avoid the
cognitive dissonance it's important to remain appropriately ignorant and not
investigate anything where you might discover something "problematic".

You can tell she's interested in preventing knowledge from how she handled her
job predicting the effectiveness of homelessness services - she actively
decided not to use particular variables on the grounds they _might_ show a
result she was uncomfortable with. That isn't an issue of "being aware of the
limitations of machine learning"; it's intentional ignorance.

[http://www.slate.com/articles/technology/future_tense/2016/0...](http://www.slate.com/articles/technology/future_tense/2016/02/how_to_bring_better_ethics_to_data_science.html)

~~~
inimino
Yes, "intentional ignorance" has value. Justice is blind. That's why some
companies blind gender when screening applications, and why most interviewers
aren't likely to ask a candidate about their religion or politics or sexual
preferences.

Sometimes correlations are self-perpetuating, and it's better to not know
about them when making decisions.
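Mechanically, that kind of blinding amounts to redacting fields before the decision-maker (human or model) ever sees the record. A minimal sketch, with hypothetical field names:

```python
# Fields deliberately withheld from screening; names are illustrative only.
BLINDED_FIELDS = {"name", "gender", "religion"}

def blind(record: dict) -> dict:
    # Return a copy with the blinded fields removed entirely, so downstream
    # screening code cannot condition on them, deliberately or by accident.
    return {k: v for k, v in record.items() if k not in BLINDED_FIELDS}

applicant = {"name": "A. Candidate", "gender": "F",
             "years_experience": 7, "test_score": 91}
print(blind(applicant))  # {'years_experience': 7, 'test_score': 91}
```

The point of deleting the fields, rather than asking reviewers to ignore them, is that nothing later in the pipeline can correlate on what it never receives.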

~~~
h4nkoslo
Well, it only "has value" if you're worried about discovering the wrong thing.
If you think you're going to discover the "correct" thing, then you're
apparently morally free to use whatever variables you want.

------
mnw21cam
Ah, so finally someone else has caught on to the "Implements of Maths
Instruction" joke we had back in 1999.

