
Stop Explaining Black Box Models for High Stakes Decisions - polm23
https://www.arxiv-vanity.com/papers/1811.10154/
======
gumby
Humans do not use explainable models. We confabulate an explanation _post
hoc_. There is some evidence that we do this even in very simple cases
(“why did you reach for that cereal? A: it’s good for me”).

We don’t do this for complex plans where we track multiple intermediate goals,
even if not on a piece of paper, but the intermediate gaps between what is
written down are truly inscrutable.

That being said, I have some sympathy for the technical portions of her
argument (and for the rest too).

~~~
Retric
Humans don’t have completely explainable models, but we do have partially
explainable models which are still meaningful. A doctor who, for example, knows
that antibiotics don’t work on viral infections can explain sufficient reasons
to justify their actions. Shareable heuristics are a powerful tool, and tossing
them away binds you to the biases in your training sets.

~~~
gumby
Perhaps I was not clear; here's an example of what I meant:

We make complex reasoned plans like "how to make coffee": I'm going to need a
filter in the pot, I'm going to need ground coffee, water, etc.; let's make
sure the filter's there before I add the coffee, and both before the water,
etc. Then we use some sort of hierarchical planning heuristics to run it. If
someone asks what I did I can explain it at varying levels of resolution ("I
made coffee", "I got these pieces together (xxx) and then made coffee...")
again, depending on various "explanation heuristics" which we learned as part
of being eusocial organisms.

These plans are complex (even "get out of bed and pee" is a pretty complex
plan).

However below that are a ton of decisions we aren't even aware we're making.
I'm convinced (and my software reflects this) that the actual "plans" we make
organically are very short, and that the interesting plans -- the ones we can
talk about and that we typically care about -- are very abstract ones. Making
coffee is super abstract, after all.

Perhaps for an analogy: the extremely abstract reasoning for not using an
antibiotic is at the level of "chemistry" while what I'm talking about is at
the level of "physics".

------
dry_soup
To quote one Maciej Cegłowski, "Machine learning is money laundering for
bias".

~~~
ayidnelm
It's a good line; what is the context / source for that quote?

~~~
polm23
[https://twitter.com/pinboard/status/744595961217835008](https://twitter.com/pinboard/status/744595961217835008)

His talks, particularly about superintelligence and automation, are also
relevant.

[https://idlewords.com/talks/](https://idlewords.com/talks/)

------
tomp
I'm no expert in ML, but doesn't this paper basically argue for the
reintroduction of rule-based expert systems (though obviously what's going to
happen is that the rules get so complicated that they're no longer sensibly
interpretable), while offering basically no useful suggestions for actually
complicated (and successful) fields of ML like computer vision?

 _> when a new image needs to be evaluated, the network finds parts of the
test image that are similar to the prototypical parts it learned during
training_

This basically just waves away the complexity/black-box-ness of "computer
vision" into the complexity/black-box-ness of "similarity".

~~~
thedudeabides5
I don't think so.

There's a difference between expert systems and black boxes.

Black boxes are problematic in domains where what matters isn't just your
decision today, but the evolution of your decision making process.

Easy examples are finance, medicine, law, education, etc.

In these areas, when you are explicitly weighing competing
interests/rights/harms, it's pretty important that you be able to explain your
reasoning to another, so they can check it, test it, and apply it if it's
good.

Not just because your decision could be wrong, but because the process by
which we evolve our decisions is important (think precedent for law, blow-up
analysis for finance, etc.).

If we want to push our understanding of a domain forward, black boxes
populated with a lot of data aren't super helpful.

They are able to spot complex patterns, yes, many of which can be
cleaned/restructured into simple patterns.

In reality, most of the best uses of ML thus far have been either rapid
screening/classification based on simple patterns (think OCR on a check: the
character recognizer engine in the machine isn't really teaching us much about
language or typology, it's just processing existing patterns), or domains with
extremely rigid game mechanics, where the rules never change but you can run a
billion simulations (chess, go, video games, etc.).

~~~
xivzgrev
Yes. I think the idea is that once you have a predictive model, you go through
a computationally hard process of factorization: identifying inputs that, if
removed, don't affect predictability that much. Rinse and repeat until you have
an explainable model.
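
Something like this, as a minimal sketch of that pruning loop (assuming
scikit-learn, a pandas DataFrame X of features, and labels y; the "doesn't
affect predictability that much" tolerance is an arbitrary illustration, not
anything from the paper):

    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    def prune_features(X: pd.DataFrame, y, tolerance=0.01):
        """Greedily drop features whose removal barely hurts accuracy."""
        keep = list(X.columns)
        model = LogisticRegression(max_iter=1000)
        baseline = cross_val_score(model, X[keep], y, cv=5).mean()
        pruned = True
        while pruned and len(keep) > 1:
            pruned = False
            for feature in list(keep):
                trial = [f for f in keep if f != feature]
                score = cross_val_score(model, X[trial], y, cv=5).mean()
                if baseline - score <= tolerance:  # removal barely matters
                    keep, baseline, pruned = trial, score, True
                    break
        return keep  # a smaller, hopefully explainable, feature set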

Imho you won’t always get an explainable model, because sometimes there may be
too many factors that are predictive, but the effort is what’s important.

------
6gvONxR4sf7o
>(ii) Explainable ML methods provide explanations that are not faithful to
what the original model computes.

>Explanations must be wrong. They cannot have perfect fidelity with respect to
the original model. If the explanation was completely faithful to what the
original model computes, the explanation would equal the original model, and
one would not need the original model in the first place, only the
explanation. (In other words, this is a case where the original model would be
interpretable.) This leads to the danger that the explanation method can be an
inaccurate representation of the original model in parts of the feature space.

This is such a succinct phrasing of what makes me so uncomfortable with these
approximate explanations.

------
kelnage
The Morning Papers had its take on this paper:
[https://blog.acolyer.org/2019/10/28/interpretable-
models/](https://blog.acolyer.org/2019/10/28/interpretable-models/)

------
comex
> Because the data are finite, the data could admit many close-to-optimal
> models that predict differently from each other: a large Rashomon set.

Hmm. But the ultimate goal isn’t actually to find a model that makes good
predictions on the training data. It’s to find a model that makes good
predictions on data that’s ingested in the future when the algorithm is put to
actual use. But the set of possible future data is infinite! (Or at least
exponentially large, depending on whether the input fields have finite
precision.)

~~~
gmueckl
Yet, if an ML model can act as a reasonably accurate classifier on that data,
then the data is proven to have some internal structure. In other words, there
exists a transformation to a space in which the data set separates along
lines similar to what is desired.

Yet, without any kind of understanding of that transformation, we cannot
reason about its properties in a meaningful way. This is the downfall of the
current generation of successful ML models. We may not need an exact
understanding of a training result; an approximate one may be enough, depending
on the kind of insight that needs to be extracted.

My pipe-dream vision is that ML models will some day be mere tools that help
design simpler models with formally guaranteed properties.

------
jhrmnn
While I mostly agree, I’ll also point out that humans are black boxes too, and
often have a hard time explaining their decisions.

~~~
ptah
humans can be held accountable for their decisions though

~~~
taneq
Only because we all agree that a human is a ‘person’, i.e. a thing that can
take blame.

~~~
mehrdadn
Fascinating. I'd never thought of defining a 'person' this way.

~~~
rehasu
You might have never been in a VP+ position in a company then. A lot of tasks
in these positions are about avoiding consequences or shifting blame for
actions that are either necessary but look bad or that had to be made with way
too little input information. And not necessarily just one's own actions.

~~~
taneq
I was definitely thinking about corporate blame-shifting when I wrote that,
and actually nearly said "legal 'person'" instead.

------
antpls
I haven't read the full paper yet, but a quick ctrl+f shows no mention of
Activation Atlases for vision neural networks, a collaboration between Google
AI and OpenAI:

[https://ai.googleblog.com/2019/03/exploring-neural-networks.html?m=1](https://ai.googleblog.com/2019/03/exploring-neural-networks.html?m=1)

Also no mention of the "Unforeseen Attack Robustness" metric by OpenAI:

[https://openai.com/blog/testing-robustness/](https://openai.com/blog/testing-robustness/)

There are probably other publications for NLP models; all the big players are
aware that explanation is key.

------
XuMiao
Human knowledge is built upon high-level logical relations. Take the
three-body problem, for example: the equations over the variables are always
right, although no one can predict the motion accurately using those equations.

Current black box AI does not learn such high-level logical relations, although
it might predict the motions most accurately.

High-level logical relations likely generalize to other domains. Low-level
prediction models are sensitive to the distributions of the data and hardly
generalize across domains.

Perhaps we need a hybrid system to combine both abstract logical reasoning and
semantic tensor computations.

------
phab
Actual arXiv link:
[https://arxiv.org/abs/1811.10154](https://arxiv.org/abs/1811.10154)

------
jeffrallen
To be fair, the original article does have "please" in the title.

~~~
tom_mellior
Still a terrible title for a scientific paper.

------
ncmncm
In many uses, inscrutability is the main feature sought. Accuracy is
secondary, where it counts at all.

This probably needs legislative correction, because scrupulous data scientists
are not in the driver's seat.

------
oli5679
Sparse decision trees and regression models are easily explainable.

In some domains, with sufficient feature engineering you can get reasonable
results with these approaches.
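
As a rough sketch of what that looks like with scikit-learn (the dataset, tree
depth, and regularization strength here are just illustrative choices, not
recommendations from the paper):

    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier, export_text

    X, y = load_breast_cancer(return_X_y=True, as_frame=True)

    # A depth-3 tree: every prediction is a path of at most three readable tests.
    tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
    print(export_text(tree, feature_names=list(X.columns)))

    # L1-regularized logistic regression: most coefficients are pushed to zero,
    # so the model reduces to a short weighted checklist of features.
    sparse_lr = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
    print([(f, round(c, 3)) for f, c in zip(X.columns, sparse_lr.coef_[0]) if c != 0])

With only a handful of splits or non-zero weights, a domain expert can read the
whole model end to end, which is the property the paper argues for in
high-stakes settings.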

------
mattmcknight
"sparsity is a useful measure of interpretability, since humans can handle at
most 7±2 cognitive entities at once "

If you are going to limit models to 9 features, or 9 combinations of features,
or 9 rules, models are not going to work as well. This just seems like a very
weak argument to introduce.

"A black box model is either a function that is too complicated for any human
to comprehend, or a function that is proprietary" These seem like two very
different things. A false equivalence is then made with many arguments against
proprietary models being made as if they apply to complex models.

"It is a myth that there is necessarily a trade-off between accuracy and
interpretability....However, this is often not true" It's a bit of a straw man
to suggest that there is "necessarily" a trade-off, as there are surely cases
where there is not. One could say that there is "often" a trade-off between
accuracy and interpretability. I don't think there are many people out there
with the naive view that one should never do feature engineering.

"If the explanation was completely faithful to what the original model
computes, the explanation would equal the original model" This is just
nonsensical to me. The idea is not to be completely faithful, but to raise
things up to another level of abstraction. The series of videos from Wired
comes to mind where a concept is explained at multiple levels.
[https://www.youtube.com/watch?v=OWJCfOvochA](https://www.youtube.com/watch?v=OWJCfOvochA)

"Black box models are often not compatible in situations where information
outside the database needs to be combined with a risk assessment." This is
absolutely untrue, depending on one's definition of often. The output of a
model, whether it falls into this incorrect definition of a model or not, can
be treated as a feature in another model. Ensemble learning exists.

COMPAS is the punching bag, but no one seems to know what it is. I haven't
seen the evidence that its performance is equal to three if statements. It
certainly doesn't have anything to say about machine learning in general, as
it is a set of expert-designed rules. So, it is actually the kind of algorithm
the author favors, except proprietary.

"typographical errors seem to be common in computing COMPAS.... This,
unfortunately, is a drawback of using black box models" Unclear why
typographical errors only affect black box models.

The BreezoMeter case is not clear evidence of anything broader either. It is
unclear whether the one error noted out of millions of predictions is from bad
source data. Stretching this to cover any sort of proprietary prediction, such
as mortgage ratings, doesn't really tell us anything.

"Solving constrained problems is generally harder than solving unconstrained
problems." This doesn't make sense to me at all. All evidence is that ML works
better on constrained problems.

The idea that CORELS is somehow better, even if it comes up with a ruleset of
millions of rules, doesn't make sense. The proposed workaround for this, "the
model would contain an additional term only if this additional term reduced
the error by at least 1%", could result in the failure to create a model if no
single term provided 1% on its own.

Scoring systems are useful, but it's like a single-layer perceptron in the
example provided. You need to consider combinations of factors to see their
impact: a high X is bad, unless Y is also high and Z is low.
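
As a toy sketch of that point (made-up data, dropping Z for brevity; an
interaction term is one simple way to let an otherwise additive scoring model
express the "unless" case):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    x, y_feat = rng.normal(size=(2, 5000))
    # Toy "high X is bad unless Y is also high" rule.
    label = ((x > 0) & ~(y_feat > 0)).astype(int)

    additive = np.column_stack([x, y_feat])                      # plain additive score
    with_interaction = np.column_stack([x, y_feat, x * y_feat])  # adds an X*Y term

    for name, feats in (("additive", additive), ("with X*Y term", with_interaction)):
        acc = LogisticRegression(max_iter=1000).fit(feats, label).score(feats, label)
        print(name, round(acc, 3))

The specific numbers don't matter; the point is that the purely additive scorer
has no way to express the exception, while the interaction term gives it one.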

The fundamental problem here is trying to limit the power of the algorithm to
the power of the human mind. From the very beginning we have used computers to
do things that are difficult or impossible for us to do. In some cases the
answers were provably correct, but we have now reached a point where computer
generated proofs are accepted. The "oracle" mode of computing will compute an
answer for us, but we ultimately have to choose whether or not to accept it on
the basis of the evidence, much like we do with the opinion of an expert.
Simple techniques such as providing one's own test data set, and the kind of
analysis done by Tetlock around Superforecasters, can go a long way to
building that understanding of accuracy of predictions, such that we have a
guideline for evaluating algorithms that are beyond our ability to understand.

~~~
buckminster
With an interpretable model typographical errors are obvious in the result.
For example, if the system denies bail because you have four convictions, but
you actually don't, then the problem is obvious. If the system denies bail
with no interpretation then the typographical error goes unnoticed.

~~~
mattmcknight
I guess I don't see that part. If the typo is in the number of convictions,
wouldn't an interpretable model also be subject to that typo? An interpretable
model would only consider number of convictions as one of the factors. So if
you look at a model like one of the scoring models shown and there are 20-30
factors under consideration, the impact would not be any more apparent than it
would be from reviewing the input data. Like if it said a person has zero
convictions and allowed bail, but they had four convictions, it wouldn't be
obvious from the result that there was a typo somewhere.

------
buboard
But it's required by law, at least in the EU. And what do you do when you
can't really explain them? You BS people.

------
mlthoughts2018
This is dangerously close to trying to repurpose the term “black box” for
irrational fearmongering.

The higher the stakes of the decision, the more incentive there is to approach
it like a rational Bayesian agent who uses whatever tool has the best risk /
reward tradeoff, totally regardless of explainability. Or, if “explainability”
(which is not some universal concept, but instead differs hugely from situation
to situation) is directly part of the objective, then its importance will be
factored into the risk / reward tradeoff without any of this dressed-up FUD
language, and you might even have to pursue complex auxiliary models to get
explainability in the primary models.

For a good example, consider large bureaucratic systems, like a military chain
of command connected to a country’s political apparatus. The series of
decisions routed through that is _way_ too complex for a human to understand,
and it’s almost impossible to actually get access to the intermediate data
about decision states flowing from A to B in, say, a decision to use a drone
to assassinate someone that accidentally kills a civilian.

You could consider various legal frameworks or tax codes the same way.

What does “explainable” mean to these systems? A human can give an account of
every decision junction, yet the total system is entirely inscrutable and not
understandable, and has been for decades.

Turning this around on ML systems is just disingenuous, because there is no
single notion of “explainable” — it’s some arbitrary political standard that
applies selectively based on who can argue to be in control of what.

~~~
pjc50
> decision to use a drone to assassinate someone that accidentally kills a
> civilian.

So who's held responsible in this case? "Nobody" will not be an acceptable
answer forever.

~~~
mlthoughts2018
I think the prevailing political system is intentionally set up so that it is
“nobody” or a low-level scapegoat. That’s the whole point of the system. It’s
similar with corporate legal structures, corporate oversight, and the way
executive actors can avoid personal liability.

My overall point is that “explainability” is inherently subjective &
situation-specific and whether a decision process “is explainable” has
virtually nothing to do with the concept substrate it is made out of (e.g.
“machine learning models” or “military chain of command” or “company policy”
or “legal precedent” and so on...).

It’s about who successfully argues for control, nothing more.

