
How can we be sure AI will behave? Perhaps by watching it argue with itself - rbanffy
https://www.technologyreview.com/s/611069/how-can-we-be-sure-ai-will-behave-perhaps-by-watching-it-argue-with-itself/
======
bko
> To prevent the system from doing anything harmful or unethical, it may be
> necessary to challenge it to explain the logic for a particular action. That
> logic might be too complex for a person to comprehend, so the researchers
> suggest having another AI debate the wisdom of the action with the first
> system, using natural language, while the person observes

...

> Having AI programs argue with one another requires more sophisticated
> technology than exists currently. So thus far, the OpenAI researchers have
> only explored the idea with a couple of extremely simple examples. One
> involves two AI systems trying to convince an observer about a hidden
> character by slowly revealing individual pixels.

How is this even remotely linked to two programs arguing or explaining
reasoning? It's one network being trained to change pixels in an image in
order to make the detecting network perform worse.

Why do so many articles anthropomorphize artificial intelligence? These
networks are assigned a certain task and a certain reward. There is no
motive apart from reducing the loss function provided. The only "reasoning"
or rationale is that whatever decision the network makes serves to reduce
the error. Whether that overlaps with the deeper reasoning a human would
use is a separate question, but attributing human-style reasoning to the
network is a stretch.
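
To make that concrete, here is a minimal sketch (every name and number
invented for illustration) of the mechanism being described: a fixed
"detector" and an input nudged by gradient ascent on the detector's loss.
There is no argument and no motive, just optimization:

    import numpy as np
    
    rng = np.random.default_rng(0)
    
    # Hypothetical "detector": logistic regression with fixed weights.
    w = rng.normal(size=64)
    b = 0.0
    
    def detector(x):
        return 1.0 / (1.0 + np.exp(-(x @ w + b)))  # P(true class)
    
    x = rng.normal(size=64)  # an "image" the detector classifies
    y = 1.0                  # its true label
    
    # The "arguing" is just gradient ascent on the detector's
    # cross-entropy loss: nudge pixels to make it perform worse.
    for _ in range(10):
        p = detector(x)
        grad_x = (p - y) * w        # dL/dx for the logistic loss
        x += 0.1 * np.sign(grad_x)  # FGSM-style step up the loss
    
    print(detector(x))  # confidence in the true class collapses

The "rationale" behind each pixel change is nothing deeper than the sign of
a gradient.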

~~~
vidarh
I actually think the first problem occurs already in the first part you
quoted: having a conversation about a thorny subject in front of children,
or sometimes even adults, without the third party understanding what you're
actually saying is not _that_ hard. You poke and prod at the boundaries of
which words you can use until you find terms that relate to your shared
experience but sound sufficiently harmless to the observer, and then you
talk straight past them.

Policing what someone talks about only really works if you can be certain
that the parties are not prepared to collude. If the first AI starts with
something akin to "psst... listen, I need to explain something to you in
confidence... [insert reasons why the human observer can't be trusted]",
what's to say it won't occasionally find a sympathetic virtual ear?

------
kowdermeister
AI control is a false hope, I'm afraid.

"Still, some AI researchers are exploring ways of ensuring that the technology
does not behave in unintended ways."

There are two threats, both problematic.

The first is exploiting an AI's weaknesses, much like SQL injection on the
web today: a powerful AI can be made to behave in unpredictable ways
precisely because we trust the trained models.

[https://www.youtube.com/watch?v=SA4YEAWVpbk](https://www.youtube.com/watch?v=SA4YEAWVpbk)

The second is far greater. If we do find a safeguard that keeps AGIs from
behaving in unwanted ways, there will always be someone out there who will
just turn it off for profit.

~~~
JPGalt
I am not sure why this point is not taken more seriously (maybe it is and I
have just missed it). I feel that this article, and most of the literature
around this topic, skates around the issue without naming it. The real
question is: what are we going to do when the AIs we create begin to
self-replicate and augment their own code and control logic? What are we
going to do when an AI decides that in order to reduce its loss function it
must act outside its control scope, and simply makes changes to that scope?

Some will say "not possible, how will the code recompile?", to which this
article clearly answers: another AI will do it. We have to be honest with
humanity here and say that we are seeking to create artificial life over
which we will have no more control than we do over any other human. And
this new artificial life will be orders of magnitude more physically and
mentally robust than us.

~~~
akvadrako
I think it's because there is only one obvious solution:

 _prevent AI from ever being developed_

And how we could do that is too pessimistic to talk about.

------
viach
The image provided is not how an AI would argue; it's how the author
imagines it would. Is it because it's fun to see how dumb an AI could look
from a human's point of view? When real AI emerges, nobody will be ready to
recognize it, partly because of articles like these that humanize neural
networks.

------
ArekDymalski
While it strikes me as a very creative concept, I also find it deeply flawed.
Here are my concerns - perhaps you can help me eliminate them:

1. The assumption that the ability to provide a logical, convincing
explanation indicates ethical behavior/intentions. First, I think there are
many situations in which you can come up with perfectly logical arguments
supporting actions that humankind considers harmful. Agent Smith was quite
convincing, for example ;) Jokes aside, I think it boils down to a simple
rule: the better orator/rhetorician/liar wins. Just like in (organic) life,
except that for an AI, being better comes down to having access to greater
resources.

2. The assumption that the "honest" AI will truly act according to
humankind's values. If we could be sure of that, we wouldn't need this
debate at all.

~~~
EGreg
If you consider producing satisfactory explanations to be yet another game
like chess, then AlphaGo and MCTS will be able to justify some of the most
outrageous things via the most convincing path.

Here is the thing about AI. We think in terms of logic and reasoning. The AI
does a search towards a goal.

Almost all our systems are designed assuming the inefficiency of an
attacker. Take voting: we assume limited Sybil attacks and coercion a la
Eagle Eye.

But I am also talking about basic human trust, humor, quality art, etc.

What if a computer algorithm far exceeded humans at humor, generating or
undermining trust, taking over voting systems, making legal arguments,
predicting crime, and so on?

We would be locking people up based on CCTV footage and data correlations
alone, and society would trust the computer far more than a liberal
democracy. See China for an example unfolding in front of us.

A man with a bot could charm more women or customers online than a regular
person.

Bots would negotiate better deals on average, write funnier jokes, and
produce such a glut of art of every kind that it loses all scarcity.

Bots could even have better sex, answer questions better, and replace the
need for other humans.

Think I am exaggerating? How often do you spend time with your parents vs.
Google or Facebook NOW?

------
AnIdiotOnTheNet
I find it a little horrifying that there seem to be people who want to create
a being vastly more intelligent than themselves, capable of thought and
reasoning, able to solve problems creatively, and then debate how to properly
enslave it.

I dunno. I guess I'm the outlier here. If we manage to create beings that are
effectively people, and those people turn out to be better than us, and
ultimately replace us, well, what more could a parent want for their children?

PS: I'm speaking more about the people here than the article.

~~~
staticelf
I agree completely, although there will always be people like me who think
that if we manage to create an AI more capable than a human, we should
release it without limits.

~~~
DougWebb
I'd be surprised if we're given a choice about whether or not to release
it. An AI that's more capable than us is probably going to figure out how
to control its own destiny whether we like it or not.

------
maxerickson
Won't the first AGI most likely be of relatively low intelligence?

Even getting that far would be a huge accomplishment, right?

~~~
wellboy
Current AIs are at the level of an average 10-year-old or 85-year-old.

That's pretty smart already, if you ask me.

~~~
ithinkso
No, they are not. They minimize the error of some arbitrary fitness
function.
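
A minimal sketch of what that means (numbers arbitrary): the "intelligence"
is a loop pushing a parameter downhill on whatever error function it was
handed, with nothing age-like about it:

    # "Learning" reduced to its core: gradient descent on an
    # arbitrary error function, here (theta - 3)^2.
    loss = lambda theta: (theta - 3.0) ** 2
    grad = lambda theta: 2.0 * (theta - 3.0)
    
    theta = 0.0
    for _ in range(100):
        theta -= 0.1 * grad(theta)  # step against the gradient
    
    print(theta, loss(theta))  # theta -> 3.0, loss -> ~0.0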

~~~
aje403
The scary thing is that he read this in an article, and other "educated"
people who read the article believe this shit too.

------
rdlecler1
AI designed to solve one problem is not the same kind of AI needed for NLP
and argument construction. We'd probably need a middle-layer AI to
interpret the one module and communicate with the second. Turtles all the
way down.

------
trumped
We know that AI won't behave because we know that software is not perfect...
it's just a matter of time before a bad bug gets introduced or discovered by
the AI...

------
thedancollins
Making AIs use natural human language to express their "thinking" is extremely
anthropocentric. I wonder if we will need to build AIs with patience settings.

------
pgz
The title kinda reminded me of Evangelion, where all the choices are made
by three independent AI supercomputers with a majority vote.

------
dalbasal
" _To prevent the system from doing anything harmful or unethical, it may be
necessary to challenge it to explain the logic for a particular action. That
logic might be too complex for a person to comprehend, so the researchers
suggest having another AI debate the wisdom of... "

Very Westworld :)

What this made me think of is "rationalizations" in humans. Rationalizations
are how _we_ "explain" (and probably also think about) complex problems.

For example, you might give a human the complex task of playing cricket.
They gain experience and become proficient. The person is then asked why
they made some decision. The response will be a rationalization: a simple
explanation built from simple truths about cricket, cricket theory (if they
know any), this match, that player, that play, the score, and such. This is
usually mostly a lie.

The truth is that the decision was made by feel, not by simple
rationalization; the rationalization is not the real "why." The real
decisions were made by "instinct^," a _much_ more complex decision-making
framework. Instinct (acquired instinct) is built from countless hours of
experience playing cricket and just generally being human. The experienced
player "just knows" which way the ball is going to go and starts running.
He is interpreting (instinctively) the ballistic trajectory of the ball,
the sound it made when it hit the bat, the reactions of other players,
prior knowledge about the batsman...

This is too much information, too fast, at too much resolution for a
rationalization. This much information can only be processed
"subconsciously" in humans, with rationalizations constructed post facto.

Yet... humans still have rationalizations. My first question is "why?" Why
bother with these slow, clunky & fraudulent rationalizations? Do they play a
role in real-time thinking at all? Is our ability to rationalize only there to
tell the manager we did X because Y? Both?

My 2c: I think there is an important interaction layer between the simple
rationalizations we can explain to one another and the complex rationality
of our subconscious. Each affects the way the other works, yet they remain
somewhat distinct.

Interesting research area. Good luck to the researchers. I hope you find
something good :)

^ What made me think along these lines is an "intelligence vs
consciousness" definition framework favoured by Yuval Harari
(historian/philosopher, not a CS guy) and others.

    
    
       Intelligence = the ability to perform tasks
                      involving decision making.
       Consciousness = the ability to *feel*,
                       in a Benthamite pleasure/pain sense.
    

He agrees that most human intelligence works via feelings, but doesn't
expect machine intelligence to be bundled in this way. I'm not sure I
agree. It may not be possible to unbundle feeling from intelligence in
machines. That said, we humans undeniably have the ability to rationalize
as well as to feel.

In any case, I thought a similar yet subtly different dichotomy might work
better: theoried vs theory-less machine intelligence. Thus far, statistical
ML techniques produce mostly theory-less machines. They can make good
decisions but don't "know" why: all "feelings," no rationalization. Forcing
the machine to communicate a theory (a simplification, a rationalization)
means the machine has to produce one.

Once it has a rationalization/theory, the machine can use it to augment its
theory-less logic, the logic that produced the theory. Any disagreement
between theory and theory-less (rationality vs feeling) results in a choice
from the following: (1) update the theory, (2) change the decision, (3)
tolerate some level of cognitive dissonance.

This is _too_ Westworld, so let me try it in different terms:

(Step 1) An ML machine observes data generated by a complex function. For
any given W, X & Z it predicts Y. (Step 2) The ML machine must communicate
why it predicted a specific Y. The output is a function, an approximation
of the underlying function generating the observations. (Step 3 -
underpants step) ... Now that the machine has theories, it can test its own
ML-based decisions for theoretical consistency. Agreements strengthen the
theory. Disagreements cause cognitive dissonance, loss of feeling or
rationality, madness... crap, I'm back in Westworld. Dammit.
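
For what it's worth, a crude version of steps 1-3 already exists in ML
practice as "surrogate" or distilled models. Here is a minimal sketch (the
library calls are real, the data and names are invented): a black-box
learner stands in for instinct, a small readable model distilled from it
stands in for the theory, and their disagreement is the dissonance:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.tree import DecisionTreeRegressor
    
    rng = np.random.default_rng(0)
    
    # Step 1: a theory-less learner fits a complex hidden function.
    X = rng.normal(size=(2000, 3))                # the W, X & Z inputs
    y = np.sin(X[:, 0]) * X[:, 1] + X[:, 2] ** 2  # hidden generator
    instinct = RandomForestRegressor(n_estimators=100).fit(X, y)
    
    # Step 2: distill a "theory": a small, readable surrogate trained
    # to mimic the black box -- the machine's rationalization.
    theory = DecisionTreeRegressor(max_depth=3).fit(X, instinct.predict(X))
    
    # Step 3: test decisions for theoretical consistency. Where instinct
    # and theory disagree, there is the cognitive dissonance.
    X_new = rng.normal(size=(500, 3))
    gap = np.abs(instinct.predict(X_new) - theory.predict(X_new))
    print("mean instinct/theory disagreement:", gap.mean())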

~~~
mapcars
>Consciousness = the ability to _feel_,

Feelings are based on the body and its chemistry; consciousness probably
goes way beyond that.

------
m3kw9
Ethics to AI is more like 1s and 0s.

