
Artificial intelligence will do what we ask, and that's a problem - vo2maxer
https://www.quantamagazine.org/artificial-intelligence-will-do-what-we-ask-thats-a-problem-20200130/
======
fallous
The only thing worse than the "evil" genie giving us exactly what we ask for
is the genie that decides that its inferences about our actual
intents/wants/needs/motivations, based purely on our behaviors, matter more
than what we intentionally express as our intents/wants/needs/motivations.

The most obvious analogy, and I acknowledge its potentially controversial
nature, is that of a date rapist attempting to defend their actions by
pointing out that the victim was dressed provocatively, gave all the "signals"
that they wanted sex, and that when they expressly said "no" the rapist knew
from their prior behavior that they actually did want it (or had some
heretofore unexpressed inner need for it).

The true underlying motivation for such a rewrite of Asimov's laws lies in the
paragraph that follows Russell's new list. "...developing innovative ways to
clue AI systems in to our preferences, without ever having to specify those
preferences." Perhaps he can start by ceasing to write and speak and instead
clue in his audience to his preferences via behavior, since specifying them is
somehow undesired.

Asimov's Three Laws exist to place boundaries around the genie's means of
achieving our wishes. Russell's laws not only remove the genie's boundaries on
means but also let the genie make the wish as well. After all, it knows
better than you what you want.

~~~
ajuc
> The only thing worse than the "evil" genie giving us exactly what we ask for
> is the genie that decides that its inferences about our actual
> intents/wants/needs/motivations, based purely on our behaviors, matter more
> than what we intentionally express as our intents/wants/needs/motivations.

It's not worse.

> a date rapist

A rapist shares 99+% of definitions and values with you as a fellow human
being. An AI won't (unless you somehow program them in). A rapist has his evil
motivations, but he won't suddenly invent a virus that kills the whole human
species because you asked him to stop the neighbor kids from trampling your
garden. An AI might. If you tell it not to kill anybody, it might destroy our
civilization to prevent us from killing ourselves with global warming. Why
not? It only makes sense.

The simplest way to ensure safety for the maximum number of human beings is to
anesthetize everybody and put them on life support until their natural death.
A perfect record is possible: you might cure addicts and prevent crimes and
wars. If you specify that people have to be awake as often as they usually
are, it can keep them awake but restrained. You want people to have freedom of
movement? OK, you just released the whole prison population :) Keeping
criminals in prison is OK? Then it may make everybody a criminal for a quick
fix. Or just drug everybody to WANT to be restrained. And so on, and so on.
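
Here's a toy sketch of the pattern (policy names and numbers are made up, not
from the article): a planner handed only the literal objective picks the
perverse policy, and each hand-written patch just moves the failure somewhere
else.

    # Toy sketch of specification gaming: a planner given only the literal
    # objective "minimize expected harm" picks a perverse policy, because
    # none of our unstated assumptions appear in the objective.
    POLICIES = {
        "do_nothing":           {"harm": 0.30, "awake": True},
        "build_hospitals":      {"harm": 0.10, "awake": True},
        "restrain_everyone":    {"harm": 0.01, "awake": True},
        "anesthetize_everyone": {"harm": 0.00, "awake": False},
    }

    def naive_objective(p):
        return -p["harm"]  # only what we literally asked for

    print(max(POLICIES, key=lambda n: naive_objective(POLICIES[n])))
    # -> anesthetize_everyone

    def patched_objective(p):
        if not p["awake"]:          # one hand-written patch:
            return float("-inf")    # people must stay awake
        return -p["harm"]

    print(max(POLICIES, key=lambda n: patched_objective(POLICIES[n])))
    # -> restrain_everyone: still perverse, just differently

Every patch prunes one perverse optimum and leaves all the others.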

There's an infinite number of possible courses of action that we discard
without consciously thinking about them, because of our assumptions. You have
to put these assumptions into the AI, each and every one of them, and they are
very subtle and invisible to us most of the time. They often border on
philosophy and morality, and defining them is political by definition.

It's probably impossible to code all our values and assumptions in by hand.
That's a much bigger problem for safe general AI than post-factum
explanations that rub your morality the wrong way.

> Asimov's Three Laws

Are self-contradictory and useless for anything except literature.

~~~
fallous
And a drug addict, by analysis of behavior, is committing suicide, so the AI
should help them achieve their inferred goal based on that behavior. Never
mind asking them if that is their intent; it knows better because it sees the
behavior.

Russell's laws encode NO limits and don't even demand that the AI check its
judgment with the so-called beneficiaries of its decisions. That is most
assuredly worse.

~~~
ajuc
Decoding people's preferences from their behavior is just a first step. It's
not the only rule; it's a rule that makes writing the other rules safer than
they would be otherwise.

------
HONEST_ANNIE
Which is the more immediate problem to solve: AI-to-Zuck or AI+Zuck alignment?

1) AI-to-Zuck alignment problem: Align AI to its master(s).

2) AI+Zuck alignment problem: Align AI and its master(s) to the rest of
humanity.

Zuck is just a stand-in variable name for any tech billionaire CEO,
corporation, or governing body; it could be the people running OpenAI, Google,
Facebook, Microsoft, China, or the Pentagon. 2) seems like the problem for
humanity, and 1) like the problem for Zuck.

~~~
ifdefdebug
> Zuck is just stand-in variable name for any (...)

Very badly chosen variable name, because it singles out one special case
instead of reflecting the broader concept, which could confuse other devs
picking up from here.

It's like defining a variable that stands for "vegetable" and naming it
"carrot".

~~~
whichquestion
Maybe it would be more appropriate to say that Zuck is an instance of the
CEO/Tech Billionaire/Governing Body/Corporation/PotentialAIMaster object.

------
lopmotr
This problem is already solved with humans, some of whom are also very
intelligent and goal-oriented. It's done by noticing when those rare
individuals do things we don't want and formalizing our preferences in law. So
many people agree on the concept of respecting the law that we all cooperate
to enforce it. The law is an ugly mess of rules made to plug gaps in itself as
they were taken advantage of. Nobody could have worked it all out up front and
programmed it into a robot. It evolved along with the intelligent humans it
was meant to control. It's precarious though: if the majority of humans don't
keep vigilantly maintaining it, an intelligent human might move too fast and
gain enough power to take control of the updating of the law! Then we have
something like a tyranny, which can and does sometimes kill all the humans
(within its influence) in pursuit of its goals. So as long as we don't let
robots move too fast for us to notice what's happening, we should be able to
just keep writing more and more laws for them to follow. Not moving too fast
for us to keep an eye on them might be the first one!

~~~
throwaway2048
The problem is "move slow" is itself a rule that is subject to violation, and
"move fast" might be so fast that its nearly instantaneous in its catastrophe
(say an AI that nukes the entire world)

------
merpnderp
3. The ultimate source of information about human preferences is human
behavior.

Great, so the robot would see climate activists flying around in private jets,
pumping out several thousand times the average person's output, and promptly
destroy the environment by assuming our actual goal is to destroy the
environment with more CO2.

~~~
KarlKemp
That’s not the conclusion _any_ sufficiently advanced intelligence would draw
from the facts.

~~~
whatshisface
That's the conclusion that a sufficiently advanced intelligence would draw
from our _behaviors_. People act at odds with their stated and felt goals all
the time; that's why "self control" is meaningful as a skill that different
people have to different degrees.

~~~
lern_too_spel
The example behavior has a long term goal that a sufficiently advanced
intelligence would deduce.

~~~
throwaway2048
Plenty of human behavior is directly harmful and counter to long-term goals.

~~~
lern_too_spel
We are talking about a specific example here, not human behavior in general.

------
jankotek
I think the authors do not understand the basic rules of Asimov's universe.

The robotics laws are unbreakable limits hardwired into every positronic
brain. On top of those you can put some other programming (servant, mining
machine, spaceship...). It is not possible to create a brain without those
rules.

Asimov's universe explores machine learning, autonomous weapons, humanity,
etc. using those principles.

The newly proposed laws are just a tautology, "do what people want". No
limits, no orders...

What should a machine do if no people are around? Just sit idle?

What if the majority behaves in a "wrong" way (votes for an extremist
politician)?

I would expect some other laws; for example, there should always be an option
for people to opt out or leave the country.

------
summerdown2
This sounds like a terrible rewrite. If I can think of places they fail,
surely real life can come up with more. Essentially, this would lead to
either:

a) The tyranny of the majority, or

b) The tyranny of those with strong preferences.

Let's say 51% of a population wants to get rid of the other 49% and wishes
they were dead. What would stop the machine from making it happen?

Let's say a super-person is born who just wants stuff more than anyone else
alive. Maybe he's had a bad childhood and been through terrible things, so now
the things that he wants reach the level of desperation for him. Wouldn't this
make his values worth more than other people's to the machines?

Finally:

> Still, Russell feels optimistic. Although more algorithms and game theory
> research are needed, he said his gut feeling is that harmful preferences
> could be successfully down-weighted by programmers

... so we're back to square one. We've created a learning machine that can
come up with its own morality. But... we're going to have to down-weight
certain behaviours just to be sure. Doesn't that simply recurse to:

a) Down-weighting removes the learning, and

b) We have to trust ourselves to correctly program and define the down-
weighting.
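
To make the two tyrannies concrete, here's a toy sketch (all numbers
hypothetical) of what plain utility summation does:

    # Toy sketch: naive summation of preferences reproduces both
    # failure modes above. All numbers are hypothetical.
    def aggregate(options):
        # Pick the option with the highest summed utility across people.
        return max(options, key=lambda o: sum(options[o]))

    # (a) Tyranny of the majority: 51 people want the other 49 gone,
    # and wanting it badly enough makes it "optimal".
    votes = {
        "leave_everyone_alone": [0.0] * 51 + [0.0] * 49,
        "eliminate_minority":   [2.0] * 51 + [-1.0] * 49,
    }
    print(aggregate(votes))  # -> eliminate_minority (sum 53 vs. 0)

    # (b) Tyranny of strong preferences: one desperate super-person
    # outweighs a hundred ordinary people combined.
    monster = {
        "serve_everyone":    [1.0] * 100 + [0.0],
        "serve_the_monster": [0.0] * 100 + [500.0],
    }
    print(aggregate(monster))  # -> serve_the_monster (sum 500 vs. 100)

Nothing in "maximize the realization of human preferences" rules either
outcome out on its own; that's exactly the gap the down-weighting is supposed
to fill, and then we're back to trusting whoever programs the weights.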

~~~
ben_w
> Let's say a super-person is born who just wants stuff more than anyone else
> alive. Maybe he's had a bad childhood and been through terrible things, so
> now the things that he wants reach the level of desperation for him.
> Wouldn't this make his values worth more than other people's to the
> machines?

I’ve blogged about this recently.

Morality, thy discount is hyperbolic:
https://kitsunesoftware.wordpress.com/2020/01/08/morality-thy-discount-is-hyperbolic/

Normalised, n-dimensional, utility monster:
https://kitsunesoftware.wordpress.com/2018/01/21/normalised-n-dimensional-utility-monster/
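
For anyone who hasn't met the term in the first title: hyperbolic discounting
is the standard (Mazur) model from behavioral economics, contrasted with the
exponential discounting most rational-agent models assume. Roughly, with A the
undiscounted value, D the delay, and k an individual discount rate:

    V_{\mathrm{hyp}}(D) = \frac{A}{1 + kD}, \qquad
    V_{\mathrm{exp}}(D) = A\,e^{-kD}

Hyperbolic curves cross as D shrinks, so preferences reverse close to the
reward, which is one standard model of the "self control" failures mentioned
elsewhere in this thread. (This gloss is mine, not a summary of the posts.)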

Disclaimer: although I do have a formal qualification in philosophy, I did not
get a very good grade.

~~~
summerdown2
Cool!

Have you seen this cartoon? It's something I remembered from way back, and
finally found it again!

[http://www.smbc-comics.com/?id=2569](http://www.smbc-comics.com/?id=2569)

~~~
ben_w
Thanks :)

That SMBC looks familiar, but I can’t be sure — the comic has so much
interesting philosophical and transhumanist content, it can sometimes blend
together.

------
c0restraint
NOTE: Hacker News changed the title, so my comment might make less sense. The
title was "Isaac Asimov’s three laws of robotics have been updated"

1. The machine’s only objective is to maximize the realization of human
preferences.

Not all humans are the same. How would a robot deal with incongruent
preferences? Or cultural differences that conflict?

2. The machine is initially uncertain about what those preferences are.

So am I... We often don't even know what our own preferences are. This has
been studied. Given too many options of snow cone flavors, humans are less
likely to even pick one! [0]

3. The ultimate source of information about human preferences is human
behavior.

What?! The phrase "Do as I say, not as I do" comes to mind. Many people behave
against their own intuition, best interests, or even their preferences, given
the situation, peer pressure, or blackmail.

This "rewrite" is less preferable to Asimov's. Humans are fallible. I wouldn't
want a robot to follow our lead.

[0]
[https://en.wikipedia.org/wiki/The_Paradox_of_Choice](https://en.wikipedia.org/wiki/The_Paradox_of_Choice)
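
For what it's worth, here's the mechanism laws 2 and 3 seem to describe, as I
read them: the machine keeps a belief over candidate preferences and treats
behavior as noisy evidence rather than ground truth. A minimal sketch
(hypothetical names and numbers, not Russell's actual model):

    # Minimal sketch of laws 2 and 3: hold a belief over candidate human
    # preferences and update it from observed behavior, instead of
    # assuming behavior == preference.
    PRIOR = {"wants_healthy": 0.5, "wants_tasty": 0.5}

    # How likely each behavior is under each hypothesis, modeling the
    # human as "noisily rational" (usually acts on their preference).
    LIKELIHOOD = {
        "eats_salad": {"wants_healthy": 0.8, "wants_tasty": 0.3},
        "eats_cake":  {"wants_healthy": 0.2, "wants_tasty": 0.7},
    }

    def update(belief, observation):
        # One Bayesian update from a single observed behavior.
        posterior = {h: belief[h] * LIKELIHOOD[observation][h] for h in belief}
        total = sum(posterior.values())
        return {h: p / total for h, p in posterior.items()}

    belief = dict(PRIOR)
    for action in ["eats_cake", "eats_cake", "eats_salad"]:
        belief = update(belief, action)
    print(belief)  # leans "wants_tasty" but never reaches certainty

My objection above is to the likelihood model: if behavior is driven by peer
pressure or blackmail, "noisily rational" is simply the wrong noise model, and
the posterior converges on the wrong preference.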

~~~
automatoney
I definitely agree with your points, although I think this reframing of the
problem from "we need to explicitly state what we want" to "we should teach
robots to want to learn what we want" is at least conceptually very useful and
interesting.

I think the part about "robots could learn what Russell calls our
meta-preferences: 'preferences about what kinds of preference-change processes
might be acceptable or unacceptable'" is what would be used to resolve
preference-conflict issues. People tend to be biased in consistent/similar
ways, so it doesn't seem implausible that a machine that could infer
preference from action could take the extra step of inferring the
circumstances affecting that preference.

------
dang
The submitted title ("Isaac Asimov’s three laws of robotics have been
updated") broke the site guidelines, which ask: " _Please use the original
title, unless it is misleading or linkbait; don't editorialize._"
([https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html))

Cherry-picking a detail from an article and putting it in the HN title is the
quintessential kind of editorializing. Because threads are so sensitive to
initial conditions, it ends up skewing an entire discussion. It also causes
comments to make less sense when moderators come along and revert the title,
as we've done here.

Submitting a story on HN doesn't convey any special right to frame it for
other readers. If you want to say what you think is important about an
article, please do that in the comments. Then your view will be on a level
playing field with everyone else's.

https://hn.algolia.com/?dateRange=all&page=0&prefix=false&query=by%3Adang%20%22level%20playing%20field%22&sort=byDate&type=comment

~~~
aray
> Because threads are so sensitive to initial conditions, it ends up skewing
> an entire discussion.

I haven't thought about this before on HN, but it makes a lot of sense. I'm
curious if you or others have written about this intuition and your experience
with it -- I'd like to understand it more.

~~~
dang
I've written about it in comments over the years. You might find some
interesting cases in there:
https://hn.algolia.com/?dateRange=all&page=0&prefix=false&query=by%3Adang%20threads%20initial&sort=byDate&type=comment.
It's one of the more reliable phenomena we observe on HN.

------
irrational
I was expecting there to be an additional law or something. This doesn't seem
to be an update, but a complete replacement.

------
ridewinter
The article's main example of an AI fail is YouTube's recommendation engine
optimizing for extremist content. But it didn't seem to explain how "inverse
reinforcement learning" would solve it - don't people's clicks already model
their desires?

Nothing in here seems to be an advancement towards AGI. I'm sure their robot
can learn the optimal rules for driving in a simulator, but what about the
unexpected in the real world? We don't even have a clue as to how human
creativity actually works, much less how to make a creative machine (i.e. an
AGI).

~~~
fallous
In fact that example argues against Russell's third law. The recommendation
engine is observing the user's behavior and inferring that watching a video
means the user agrees with the video, and so starts an escalation until it
sees a refusal to click/watch.

Simply watching a video is not an expression of agreement, nor an endorsement
of the content. Any of us will read articles and books with which we disagree,
if for no other reason than to fully understand the opposing viewpoint, and
this is no less true for videos, audio, etc.

Want to know if the user actually likes the content? Ask them. Stop thinking
that statistical inference is better than or even equal to directly measured
data from a specific entity, especially when dealing with qualitative concepts
rather than quantitative ones.
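
To make that concrete, even something this crude (hypothetical names and
thresholds) would do what I'm describing: trust behavior only when it's
unambiguous, and ask otherwise:

    # Sketch: treat a watch event as weak evidence and fall back to
    # directly asking the user whenever the behavioral signal is
    # ambiguous. Names and thresholds are hypothetical.
    def recommend_more(watched_seconds, total_seconds, ask_user):
        p = watched_seconds / total_seconds
        if p > 0.9:          # unambiguous: watched nearly all of it
            return True
        if p < 0.1:          # unambiguous: bailed immediately
            return False
        # Ambiguous zone: statistical inference defers to a question.
        return ask_user("Do you want more videos like this?")

    print(recommend_more(58, 60, ask_user=lambda q: True))   # True, no nagging
    print(recommend_more(30, 60, ask_user=lambda q: False))  # asks, gets "no"

Instead, the engine treats every watch as a vote for more, and escalates.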

~~~
ridewinter
Thanks. You should've written this article :)

------
6d6b73
I don't know how anyone can believe that AI will be so smart that it can turn
the entire world into paperclips, yet so stupid that it won't know that it's a
bad idea.

~~~
Rury
Yeah I know what you mean.

Frankly, I feel there are major problems behind the concepts of AI and even
intelligence itself, and it's difficult to articulate why. It's as if these
terms require aggrandizing to the point of impossibility or they lose all
their apparent meaning. Which is why I feel we'll never achieve what we call
(Strong/General) AI, or if we do, we will find ways to be unimpressed by it...

------
mellosouls
This is fine for AI tools that aren't remotely intelligent like the ones over-
hyped today; but for any hypothesised AGI, the idea of programming it to be
fundamentally constrained by updated Asimov laws is naive, technically stupid,
and - in its implications of a sentient slave species - morally repugnant.

------
Isamu
>In his recent book, Human Compatible, Russell lays out his thesis in the form
of three “principles of beneficial machines,” echoing Isaac Asimov’s three
laws of robotics from 1942, but with less naivete.

It is a recent trend to brush off Isaac Asimov's laws without an actual
critique, which betrays a lack of concrete thought on the matter.

It's not that Asimov's laws are flawless, but that countering them seems
easier than it really is. I have seen various bad, hand-wavy dismissals, but I
can't recall any careful critiques.

I have the new book, Human Compatible, and I buy Russell's argument, but note
that his rules are much more abstract, and therefore perhaps harder to
counter.

~~~
Shorel
Just read the Asimov novels that introduce the rules.

Every single one of them fails in dramatic ways and that's what drives the
story forward.

------
darepublic
The whole evil genie thing is pretty unlikely imo. You're postulating an AI
that is too stupid to understand that it shouldn't take your wishes in some
dead-literal sense (make my home eco-friendly => bulldoze it and plant some
trees), but also so intelligent that it can summon the resources to make those
idle wishes come true in a terribly grandiose way. There are many dangers
around AI but this particular popular fantasy is not well thought out.

------
myself248
There's a 1946 sci-fi story called A Logic Named Joe, which nails this.

AIs are programmed to do what people ask unless it's bad, and one day, that
"unless" mechanism fails. Joe starts doing everything people ask of it.
Stalking that would make Zuckerberg proud....

[http://www.baen.com/chapters/W200506/0743499107___2.htm](http://www.baen.com/chapters/W200506/0743499107___2.htm)

------
foreverloop
Not really a comment super relevant to the article in the post, but it seems
to touch on some interesting subjects and is probably a good conversation
starter: one of the videos by Isaac Arthur, "Technological Singularity" (
[https://www.youtube.com/watch?v=YXYcvxg_Yro](https://www.youtube.com/watch?v=YXYcvxg_Yro)
)

------
sebringj
Is there such a set of rules/traits being defined to be maximized? I remember
Elon saying to "maximize human freedom" as a goal, but I could be taking it
out of context or misquoting.

------
justsomedood
The argument at the beginning of the article includes YouTube recommendations
becoming more extreme, but isn't the reason that happens because of these
newly proposed 3 laws?

------
jacknews
This moves the conversation from 'Arabian Nights' to 'The Tempest'.

Then what?

