
A New Approach to Understanding How Machines Think - laurex
https://www.quantamagazine.org/been-kim-is-building-a-translator-for-artificial-intelligence-20190110/
======
ssivark
I found it easier to understand the idea by skimming the introduction in the
paper [1]:

> _TCAV uses directional derivatives to quantify the model prediction’s
> sensitivity to an underlying high-level concept [...] For instance, given an
> ML image model recognizing zebras, and a new, user-defined set of examples
> defining ‘striped’, TCAV can quantify the influence of striped concept to
> the ‘zebra’ prediction as a single number. In addition, we conduct
> statistical tests where CAVs are randomly re-learned and rejected unless
> they show a significant and stable correlation with a model output class or
> state value._

[1]: [https://arxiv.org/abs/1711.11279](https://arxiv.org/abs/1711.11279)
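
To make that concrete, here's a minimal sketch of the scoring step in plain numpy, assuming you already have a CAV for 'striped' and have collected the gradients of the zebra logit with respect to one layer's activations (the function and variable names are made up for illustration, not the paper's code):

```python
import numpy as np

def tcav_score(logit_grads, cav):
    """Sketch of the TCAV score for one (concept, class, layer) triple.

    logit_grads: (n_examples, n_units) gradients of the 'zebra' logit
                 w.r.t. one hidden layer's activations, one row per
                 zebra image (collected however your framework allows).
    cav:         (n_units,) concept activation vector for 'striped'.
    """
    # Directional derivative of the prediction along the concept direction.
    directional_derivs = logit_grads @ cav
    # TCAV score: fraction of images whose zebra prediction the concept pushes up.
    return float(np.mean(directional_derivs > 0))
```

The statistical test mentioned in the quote then amounts to re-running this against CAVs trained on random example sets and keeping the concept only if its score differs significantly from those.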

~~~
cs702
Me too. Thank you for posting the link.

------
abrichr
> _“AI is in this critical moment where humankind is trying to decide whether
> this technology is good for us or not,” Kim says. “If we don’t solve this
> problem of interpretability, I don’t think we’re going to move forward with
> this technology. We might just drop it.”_

AI interpretability is a very important and exciting field, and I don't mean
to detract from the rest of this article, or from the speaker's work. However:

1) Technology is neither good nor bad in and of itself. It is a tool that can
be used for one or the other.

2) If it's useful for something, some people will use it, despite the protests
of others. Have we ever collectively decided to "drop it" in the past, where
"it" is a powerful technology?

~~~
roywiggins
CFCs and asbestos are very useful, and we've also decided to largely ban them
outright because of the horrible side effects. It turns out there are hardly
any safe ways to use them, so they're not really worth the benefits.

~~~
abrichr
Good point! From
[https://en.wikipedia.org/wiki/Chlorofluorocarbon](https://en.wikipedia.org/wiki/Chlorofluorocarbon):

> _Because CFCs contribute to ozone depletion in the upper atmosphere, the
> manufacture of such compounds has been phased out under the Montreal
> Protocol, and they are being replaced with other products such as
> hydrofluorocarbons_

Maybe this success is at least somewhat due to the fact that there is an
alternative technology that has very similar utility and cost, but without the
significant negative effects.

The absence of such an alternative might explain why asbestos hasn't had
similar success. From
[https://en.wikipedia.org/wiki/Asbestos#Usage_by_industry_and...](https://en.wikipedia.org/wiki/Asbestos#Usage_by_industry_and_product_type):

> _Some countries, such as India, Indonesia, China, Russia and Brazil, have
> continued widespread use of asbestos._

~~~
jcadam
The CFC-free albuterol inhalers (for us asthmatics) introduced in the last ten
years are objectively worse than the ones they replaced. On the bright side,
the new formulations meant new patents for drug manufacturers and the
elimination of cheap generic alternatives for patients, so there's that.

~~~
AnimalMuppet
Was CFC the propellant, or part of what you inhaled? I find it hard to believe
that inhaling CFC could be good for you.

~~~
roywiggins
CFC was used as the propellant.

[https://www.fda.gov/Drugs/ResourcesForYou/Consumers/Question...](https://www.fda.gov/Drugs/ResourcesForYou/Consumers/QuestionsAnswers/ucm077808.htm)

------
b_tterc_p
Imagine making a linear regression to predict runners’ speed using data from
two individuals timing their speeds. Not a very interesting problem. Assuming
they’re both competent measurers, you would probably be best off taking half
of the first measure and half of the second measure (aka the mean). But the
clever linear solver might notice that you can technically get a better answer
by weighting the first measurer by 3.001 and the second measurer by -2.002 due
to natural variance and the way things happened to land. We can adjust the
solver to not do this by punishing it for large coefficients (ridge
regression) and that’ll mostly settle that.
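
A quick synthetic sketch of that fix with scikit-learn (the data and numbers here are made up; with two nearly identical noisy columns, plain least squares is free to pick large offsetting weights, while the L2 penalty pulls both back toward roughly half-and-half):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)

# A handful of runners, each timed (noisily) by two measurers.
speed = rng.uniform(5, 10, size=8)
m1 = speed + rng.normal(0, 0.05, size=8)   # measurer 1
m2 = speed + rng.normal(0, 0.05, size=8)   # measurer 2
X = np.column_stack([m1, m2])

# Plain least squares: the two columns are nearly identical, so the fitted
# weights can land far from 0.5/0.5 (large and offsetting), depending on
# how the noise happened to fall.
ols = LinearRegression().fit(X, speed)

# Ridge penalises large coefficients, pulling both back toward ~0.5 each,
# i.e. roughly the mean of the two measurements.
ridge = Ridge(alpha=1.0).fit(X, speed)

print("OLS coefficients:  ", ols.coef_)
print("Ridge coefficients:", ridge.coef_)
```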

But for a neural network with hundreds of mini logistic regressions built
in... seems tough. Interpretability isn’t just knowing how decisions are made
but also how much irrationality is being built into it. If one factor is being
marked as a huge negative influence, what does that actually mean about the
problem space? Maybe nothing. If you actually want an interpretable neural net
model, you should probably hand craft layers of domain specific ensembles that
you can individually verify and are closer to being self evident. Maybe you
won’t know how it determines stripes or horselike, but if you feel good about
those two models individually then it’s a much easier task to follow the last
step which is: it’s a zebra if and only if it’s horselike and has stripes.
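
For what it's worth, that final step can literally be a couple of lines on top of the two sub-models (the sub-model functions here are hypothetical stand-ins for classifiers you've verified separately):

```python
def is_zebra(image, horselike_model, stripes_model, threshold=0.5):
    """Final, human-readable decision rule over two independently
    verified sub-models, each returning a probability in [0, 1]."""
    # Zebra if and only if it's horselike AND it has stripes.
    return (horselike_model(image) > threshold) and (stripes_model(image) > threshold)
```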

------
quattrofan
Very interesting, although I thought the chainsaw analogy was poor; after all,
while the user may not understand how it works, the person who built it
certainly does.

~~~
ramtatatam
That's what struck me when I first read about neural nets 15 years ago:
researchers admitted that nobody really understands why neural nets do what
they do. Maybe there are new publications that overturn this view?

------
tim333
The question of explaining how a neural network made a decision also comes up
with us humans, of course, and is often not easy. Don't know why I didn't
call...
[https://www.youtube.com/watch?v=tO4dxvguQDk](https://www.youtube.com/watch?v=tO4dxvguQDk)

Human rationalisation has its issues, e.g. “So convenient a thing to be a
reasonable creature, since it enables one to find or make a reason for every
thing one has a mind to do.”

Wonder if AI will do better.

------
goldenkey
Just finished reading the paper. TL;DR: a linear classifier is trained on the
activation values of an already-trained network when it is fed examples of a
specific high-level feature. Assuming the feature is isolated to a certain
linear subset of the activation matrix, a vector normal to the classifier's
decision boundary can be computed. When feeding examples, the magnitude of the
directional derivative in this vector's direction can be computed on the
activation matrix, yielding a measure of how much the feature is responsible
for the overall derivative, i.e. the total activation.
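
Roughly, the CAV step looks like this (a sketch with scikit-learn, not the authors' code; the activation arrays are assumed to come from whichever hidden layer of the trained network you're probing):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def concept_activation_vector(concept_acts, random_acts):
    """Train a linear classifier to separate the activations produced by
    concept examples (e.g. 'striped' images) from those produced by random
    images, and return the unit normal of its decision boundary: the CAV.

    concept_acts, random_acts: (n_examples, n_units) activations taken from
    one hidden layer of the already-trained network.
    """
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    cav = clf.coef_.ravel()              # normal to the separating hyperplane
    return cav / np.linalg.norm(cav)
```

The returned vector is then the direction along which the directional derivative of the class prediction is taken.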

Seems like an obvious technique. A few months back, someone posted a similar
linear reverse engineering technique to make a customizable face generator
from already trained GANs.

~~~
jeromebaek
"how much the feature is responsible for the overall derivative" part is
obvious. But I'm curious how the assumption can be justified - how do you
isolate a feature to a certain linear subset? And how do you know that this
feature is (i.e.) "stripes"? The mapping from vectors to human language is the
part that seems hard.

~~~
goldenkey
By training a binary linear classifier over the vector space of the other NN's
responses to a set of inputs with that specific feature.

Language shouldn't be necessary if your feature can be conveyed through
examples.

But yes, it's a big assumption to say that all features can be isolated as
linearly decidable subsets of the activation space.

I would guess one could get better results with stronger, non-linear
classifiers combined with more abstract generalizations of directional
derivatives.

------
rbrbr
Any decision-making algorithm ever created is nothing more than a program. A
machine is not thinking. It's executing code.

~~~
lm28469
Well, aren't we all powered by organic decision-making algorithms?

Can it be emulated with our current tech / knowledge? Maybe, maybe not. In the
end it all boils down to: is intelligence / consciousness 100% material[0]?
If it is, it would be theoretically possible to replicate it. In practice it's
much more complex.

If these principles can't be explained by materialism, I think we'll have even
bigger questions to answer.

[0]
[https://en.wikipedia.org/wiki/Materialism](https://en.wikipedia.org/wiki/Materialism)

also see:
[https://computing.dcu.ie/~humphrys/philosophy.html](https://computing.dcu.ie/~humphrys/philosophy.html)

~~~
man-and-laptop
It could be material while not being expressible by a Turing machine. If by
"algorithm" you mean something that runs on a Turing machine, then human
intelligence doesn't have to follow any algorithm. If you mean that human
intelligence is governed by equations, then that _must_ be true if materialism
is true, because the laws of physics (or "material reality") are governed by
equations.

~~~
hopler
Physics is approximated by laws, not governed.

Turing machine is only one kind of computing device. It just happens to be
really good at simulating many other kinds.

~~~
naasking
> Turing machine is only one kind of computing device.

Turns out, most differences between computing devices don't really matter.

------
Ace17
"The question of whether Machines Can Think... is about as relevant as the
question of whether Submarines Can Swim."

\- Edsger Dijkstra (EWD898).

~~~
signa11
Obligatory Heinlein quote:

Am not going to argue whether a machine can “really” be alive, “really” be
self-aware. Is a virus self-aware? Nyet. How about oyster? I doubt it. A cat?
Almost certainly. A human? Don’t know about you, tovarisch, but I am.
Somewhere along evolutionary chain from macromolecule to human brain self-
awareness crept in. Psychologists assert it happens automatically whenever a
brain acquires certain very high number of associational paths. Can’t see it
matters whether paths are protein or platinum.

~~~
amitprayal
Not sure about that; what if the path does matter?

~~~
trevyn
What if there is no evidence that the path matters, and on top of that, there
_is_ evidence that human brains and society are biased toward unjustified
belief that the path matters?

~~~
amitprayal
At present the only evidence that the path matters is you yourself; till we
find/create something that disproves it, all else is speculation.

