
Known Unknowns - pron
https://harpers.org/archive/2018/07/known-unknowns/
======
fanzhang
This is not a problem inherent to machines. If I teach a baby the word "tank"
by pointing to a picture of a tank in daylight and the word "forest" by
pointing to a picture of a forest at night, the baby could reasonably deduce
that "tank" means day and "forest" means night.

This is not even specific to deep neural nets (DNNs). Even something like line
fitting (OLS) will have problems with it.

Classical statistics has a great way to deal with this: your training data for
classifying a label A against B should encompass the entire range of A and B
that you want to use the machine on.

This means if you want to identify pictures of tanks in a situation x, you'd
better have a training set of tanks that "overlays a neighborhood" of x.
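
To make that concrete, here's a toy sketch (synthetic data, with a made-up
noisy "tank pixels" feature and a clean "brightness" confound) showing how
even plain OLS puts all of its weight on the confound when the training set
never separates the two:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 500

    # Training set: every tank photo was taken in daylight and every forest
    # photo at night, so "brightness" is perfectly confounded with the label.
    tank = rng.integers(0, 2, n).astype(float)   # 1 = tank, 0 = forest
    tank_pixels = tank + rng.normal(0, 0.5, n)   # noisy "real" signal
    brightness = tank.copy()                     # clean confound

    X_train = np.column_stack([tank_pixels, brightness])
    coef, *_ = np.linalg.lstsq(X_train, tank, rcond=None)
    print(coef)                # ~[0, 1]: all the weight goes to brightness

    # A tank photographed at night: tank pixels present, but the photo is dark.
    night_tank = np.array([1.0, 0.0])
    print(night_tank @ coef)   # ~0: the "tank" comes out as forest

Because brightness predicts the training labels perfectly, the fit has no
reason to use the tank itself, and a tank at night gets classified as forest.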

But maybe the article isn't talking about the ability to train ML models, or
about how humans will be better than them, but about how scary it is to have
machines that can decide things when we don't know how they did it (or more
precisely, the meaning of how they did it).

That's reasonable, but I don't think it's operationally any different than
asking a human who misclassified something and hearing "well uh, it just
kinda looked like B from this angle / at a glance..."

~~~
throwaway2048
I don't think many babies need to see thousands of examples of tanks before
they understand what a tank is, and even with thousands of examples they don't
mysteriously classify a horse as a tank when one pixel changes.

~~~
cdoxsey
You've nailed it. Language is a great example where our intuition about how
one learns is wildly incorrect.

Babies are prewired to rapidly learn language, merely by hearing it around
them. They do this regardless of the language, with teachers who have no idea
what they're doing. It's an incredible process but because it happens so
easily it leads us to think teaching a computer these things must not be all
that hard.

But we're not generic computing devices, and our linguistic developmental
skills are innate and incredibly well adapted. We just can't see it so we take
it for granted.

------
sjclemmy
An interesting article, but some of the conclusions at the end of descriptive
paragraphs seem to come from nowhere and are completely unsubstantiated: “But
machines don’t correct our flaws—they replicate them.”

“We face a world, not in the future but today, where we do not understand our
own creations. The result of such opacity is always and inevitably violence.”

------
btrettel
Is anyone aware of a particularly good explanation of the general confounding
issue (not the tank problem specifically) discussed at the start of the
article?

In a talk I'll give in about a month, I'll briefly mention how confounding
between two variables makes much of the previous research in my field wrong.
Most people in my field will never have heard of confounding, and I don't
have much time to dedicate to it in the talk. Right now I'm planning to say
something like "If you make two changes at once, you can't know which caused
the observed change in the output or the relative contributions of each input
change."
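
Not a full explanation, but here's a minimal numerical sketch (made-up data)
of that sentence: two inputs that are never varied independently, so any split
of the effect between them fits the observations equally well:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 100

    # Two inputs that are always changed together, never varied independently.
    x1 = rng.normal(size=n)
    x2 = x1.copy()
    y = 3.0 * x1 + rng.normal(0, 0.1, n)   # in truth, only x1 matters

    # Any split of the total effect between x1 and x2 fits the data equally well:
    for a in (3.0, 1.5, 0.0):
        b = 3.0 - a
        resid = np.sum((y - (a * x1 + b * x2)) ** 2)
        print(a, b, round(resid, 3))   # identical fit, wildly different "causes"

The data alone can't distinguish "x1 did all of it" from "x2 did all of it";
you'd need observations where the two inputs vary separately.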

~~~
techbio
> "If you make two changes at once, you can't know which caused the observed
> change in the output or the relative contributions of each input change."

And there are millions of changes in the pixel data. I would appreciate "a
particularly good explanation" myself.

------
qmalzp
When talking about chess: "But even the most powerful program can be defeated
by a skilled human player with access to a computer—even a computer less
powerful than the opponent."

Is that true? Can a state-of-the-art chess engine plus a grandmaster really
outperform just the state-of-the-art chess engine?

