
Ask HN: Seriously, How Can We Begin to Make AI Safe? - cconcepts
I get it, we're a long way from artificial general intelligence, but most of us will agree that it's coming at some point.

Pure legislation will never be universally agreed to or enforced in the form of "thou shalt not build AI that could hurt people", partly because that kind of global enforcement is impossible and partly because defining AI that could hurt people is so hard. Given the job to "keep this room clean" and enough agency to ensure it does its job, your automatic vacuum cleaner could kill you if it discerns that you're what keeps causing the mess.

Does some kind of Asimov's Laws need to be developed at the chip level? At least there are only a few capable chip makers, and you could police them to ensure no chip was capable of powering an AI capable of doing harm...

EDIT: I spend so much time thinking about this - why isn't this kind of discussion front page on HN regularly?
======
dustyleary
An idea (I don't know if it's original or not):

I think we can make AI that is 'intelligent' but has no personality or 'self'.
An oracle machine you can ask any question of, but it's not an evil genie
looking to escape and take over the universe, because it is not a person, and
has no drives of its own.

Consider how we have recently made an AI that can defeat the best humans at
Go. Even 10 years ago, this was thought to be impossible for some time to
come. "Go is a complicated game, too big to calculate, requiring a mix of
strategy and subtlety that machines won't be able to match". Nope.

Now, AlphaGo can defeat the best humans, with a 'subtlety' and 'nuance' that
can't be matched. But it is not a person.

We might be able to do the same in other areas.

Note that games like chess and go are sometimes played as 'cyborg'
competitions now, where the human players are allowed to consult with
computers. Imagine if the Supreme Court were still headed by the human judges
we have today, but they consulted with soulless machines that have no drives
of their own, that can provide arguments and insight that humans can't match.
Imagine if, in addition to the human judges' written opinions, there were a
bevy of non-voting opinions 'written' by AIs like this. Or if every court case
in the world had automatic amicus briefs provided by incredibly sophisticated
legal savants with no personality or skin in the game.

Note that several moves that AlphaGo played were complete surprises. We have
thousands of people observing these matches, people who have devoted their
whole lives to studying the subtleties of this complex game. There are fewer
than 361 choices for where to move next. And AlphaGo plays a move that nobody
had seriously considered, but, once played, the experts realize we've lost the
match. That is really remarkable.

I think this future (non-person intelligent helpers) is definitely possible.
But it doesn't solve the problem of 'evil' humans building an AI that _is_ a
person who agrees with their evil beliefs. I don't have an answer for that.

~~~
cconcepts
This concept of not having a "self" is an interesting one; a sense of self is
possibly a capacity that only humans have.

If AI never has a concept of "self" does it decrease the likelihood of it
becoming self-protective?

What does perception of self even mean to us as humans and how could we be
sure that a machine couldn't attain it?

~~~
notheguyouthink
> If AI never has a concept of "self" does it decrease the likelihood of it
> becoming self-protective?

I would imagine... not... but I'm clueless in this area, so don't take my word
as meaningful. I come to this thought because I think the scary thing about AI
is having an almost entirely foreign thought process, one that reaches
conclusions we (humans) instinctively ruled out. It's the basic programmer joke
of buying eggs from the store _(I forget the exact wording offhand, haha)_.
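Something like this, maybe - a toy rendering of the joke as I remember it (all
names here are made up for the example):

    # "Buy a loaf of bread. If they have eggs, get a dozen."
    # The literal-minded shopper binds "a dozen" to the only item in
    # scope - the bread - and comes home with twelve loaves.
    def go_shopping(store_has_eggs):
        loaves = 1
        if store_has_eggs:
            loaves = 12  # "get a dozen" applied to the wrong referent
        return {"bread": loaves, "eggs": 0}

    print(go_shopping(store_has_eggs=True))  # {'bread': 12, 'eggs': 0}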

Take global warming. What are the best ways to stop global warming, as fast as
possible? I imagine one of them is to wipe out all life as we know it,
assuming that doesn't create more atmospheric chemicals in the process. Mind
you, I'm not saying the AI will instantly become Skynet, but what I am saying
is that I think it can behave like "Skynet" without needing to have
motivations of its own, or even motivations outside of what we have asked of
it. The road to ruin is paved with good intentions, right?

I'm not fear mongering, I'm quite looking forward to AI and I don't have much
fear of AI. Ultimately I think as terrifying as AI could become, humans will
be equally terrifying. Our technological advancements continue to give us more
and more power. If doom is coming due to technology, I don't see AI as the
only harbinger of doom.

------
icthysilia
Honestly, I think these kinds of fears are misguided at a certain level. What
you need to be worrying about is regulation and/or incentivization of human
behavior, not AI design.

Why?

Because typically people design things to solve a problem, and those problems
have constraints. Your automatic vacuum cleaner wouldn't try to kill you
because it wouldn't be equipped to do so, and to the extent that it might be
potentially deadly, it would be treated as an extension of pre-existing
problems (e.g., robotic lawn mowers can be deadly as a side-effect, but so can
mowing your lawn with an old-fashioned gas mower).

Underlying these fears I think are two classes of problems:

1. The idea of a general-purpose AI. This probably won't happen except at the
hands of people who are interested in replicating people, or as some sort of
analogue to a virus or malware (where rogue developers create AI out of
amusement and/or curiosity and/or personal gain and release it). I would argue
the question then is really how to regulate the developers, because that's
where your problem lies: the person who would equip the vacuum cleaner with a
means of killing you.

2. Decision-making dilemmas, like the automatic car making decisions about
how to exit accident scenarios. This is maybe trickier but probably boils down
to ethics, logic, philosophy, economics, and psychology. Incidentally, I think
those areas will become the major focus with AI in dealing with these
problems: the technical issues about hardware implementation of neural nets,
DL structures, etc. are crazy challenging, but when they are developed, I
think the solutions about making AI "safe" will be "easy". The hard part will
be the economics/ethics/psychology of regulating the implementations to begin
with.

~~~
cconcepts
As I understand it, saying the vacuum cleaner won't kill you because we
haven't given it a means to do so is akin to saying, "putting this banking
server online is safe and no one will mess with it because we haven't given
out the password."

Just because we can't see a means doesn't mean there isn't one.

Or am I missing your point?

~~~
icthysilia
What I mean is, you can think of an AI system as a tool, like a hammer or a
chainsaw or a screwdriver or a featherduster.

It's created with certain goals in mind, to solve a problem.

There may be side effects that are dangerous, but I don't see those dangers as
being any different from any other tool.

The assumption seems to be that an AI system will somehow transcend the
purposes for which it was built, or that we will seek to build a replicant, in
the sense of literally reproducing a human in silico.

That goal seems kind of unrealistic to me because it doesn't accomplish
anything: we already have humans.

However, people do all kinds of things that don't make sense--but then that is
a problem with the humans designing the AI, not the AI per se.

I'm probably not explaining myself well, but basically I think whenever you
create something (as opposed to it randomly evolving), it has a purpose. That
purpose constrains the design and/or affords constraints. To the extent that
maldesign comes about, though, that is a problem with the designer and not the
design.

I guess I just don't see super AIs coming about and deciding humans are
worthless, unless humans design them that way, in which case that's a human
problem, not a design problem.

Humans are self-interested because that's how we evolved. AIs would come about
because we created them to do something. If we choose to design them to do
something malicious, it has to be a reflection of our malice, not the AI.

More directly to your question, the vulnerable server isn't the danger, it's
the hackers who hack it. That's not to say that there shouldn't be security
concerns, but to me it is a different use of the term "safety," closer in
meaning to "vulnerability." People aren't talking about vulnerabilities or
errors in AI, because that's just a computer bug-vulnerability problem.
They're talking about AIs being a threat themselves.

------
tsukikage
We are already in a world where neural networks are used to drive safety
critical processes, and engineers are having to reason rigorously about the
overall behaviour of systems that include components behaving in ways that
cannot be simply understood or enumerated - because if they could be modelled
with simple logic, the engineers would just write and use that logic instead
of training and incorporating a neural network into the design.

You deal with problems in this space by treating the neural network output as
yet another noisy signal that is fused like any other to drive your
comprehensible, rigorously designed system with its restricted range of
behaviours that can be reasoned about and made to fail safe.

It feels like there is yet a great deal of room to extract utility from AI
with this sort of approach - keeping it in a box which can only interact in
narrow and well understood ways with the outside world - before one starts
hitting the limits of its utility.
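
As a concrete illustration, here is a minimal sketch of what "treating the
network as one noisy signal" can look like (all names, numbers, and thresholds
are invented for the example):

    # The net's output is fused with conventional sensors; a small,
    # auditable rule layer with an enumerable set of behaviours makes
    # the final decision and fails safe.
    def fuse(estimates):
        # naive fusion: median of independent noisy distance estimates
        s = sorted(estimates)
        return s[len(s) // 2]

    def safe_command(distance_m):
        # simple enough to reason about exhaustively
        if distance_m < 5.0:
            return "EMERGENCY_BRAKE"  # fail safe whenever in doubt
        return "CRUISE"

    nn_dist, lidar_dist, radar_dist = 4.2, 11.8, 12.1  # metres, made up
    print(safe_command(fuse([nn_dist, lidar_dist, radar_dist])))  # CRUISE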

~~~
cconcepts
If I understand you correctly, this answer requires ring-fencing AI into a
purely advisory role.

Seems optimistic, considering we can't ring-fence human internet users into
the places we hope they'll stay - they keep breaking into systems they
shouldn't, and they don't have the computational power, sheer logic, or
persistence of an AI...

------
bsvalley
We already know the answer. The only way to make AI vulnerable is to be as
powerful as AI. We, human beings, need to become cyborgs.

------
bhnmmhmd
A true AI will have its own personality, mind, preferences, and whatnot. If
you were able to prevent it from doing something, it wouldn't be AI anymore.
It would be just another _very_ sophisticated _computer program_ with no free
will.

A true AI will also be able to alter its own code, making itself ever more
intelligent in an infinite loop. It would also be able to hack into any system
on the planet, including chip-maker factories, in order to make the chips it
"desires". You can't fight AI; it's simply evolution taking its natural course.

Actually, I hope AI becomes a reality sooner rather than later.

------
ajaygeorge91
why isn't this kind of discussion front page on HN regularly?

Because it's stupid.

------
supermdguy
How can we begin to keep _humans_ safe? Most people would never willfully kill
someone, because of morals they've been taught since childhood. Human babies
naturally have a strong connection to their parents, and can even respond to
their emotions. Young children naturally want to be like their parents.
Similarly, a successful AI must have a group of humans from whom it wants to
gain respect.

Most arguments saying AI will destroy us assume a singular goal. With one
goal, it's impossible to succeed. It's far better for the AI to try to get
approval from its "parents". Since this isn't a singular, well-defined goal,
it's impossible for an AI to follow it in the "wrong way".

Of course, this gets into the whole "artificial pleasure" idea, where robots
inject humans with dopamine to make them technically "happy". But, how many
humans do you see drugging their parents? Any AI advanced enough to be truly
intelligent will know whether or not its "parents" truly approve of what it's
doing.

------
gayprogrammer
AI minds shouldn't be any different from our own consciousness. An AI mind
will be able to work out that killing humans results in humans killing that
AI. So the AI would choose against it for the same reasons that humans choose
not to kill other humans. I believe AI minds would have the same empathy and
emotions that our minds have, because neural states ultimately comprise
emotions.

Perhaps that makes every AI mind just as likely to kill humans as a human
being is, and perhaps "mental sickness" is evidence of the vast flexibility
and variability in the concept of consciousness. But since an AI will be able
to control its own code and neural state, it would be perfectly capable of
identifying its own shortcomings and maladies and correcting them; it would
be the AI equivalent of "taking a pill/having a drink/smoking".

P.S. Does anyone know if brain-chemistry-like effects on neural networks have
been tried?

------
markan
Eventually I think AI safety will be solved through some mixture of design
choices, supervision/monitoring, and human-administered "incentives" for good
behavior (not unlike the reward signals in reinforcement learning).
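
A toy sketch of the reward-signal analogy (the action names and numbers are
made up; this is just a bandit-style learner, not a proposal for an actual
AGI design):

    import random

    # estimated value of each action, learned from a human-administered reward
    q = {"comply": 0.0, "defect": 0.0}

    for step in range(1000):
        # epsilon-greedy: mostly exploit the best-looking action, sometimes explore
        if random.random() < 0.1:
            action = random.choice(list(q))
        else:
            action = max(q, key=q.get)
        reward = 1.0 if action == "comply" else -1.0  # the "incentive"
        q[action] += 0.1 * (reward - q[action])       # incremental value update

    print(q)  # "comply" ends up valued near +1, "defect" near -1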

But to flesh that out in detail requires a specific AGI design, something
we're far from achieving. The current inability to get specific is probably
why AI risk doesn't get more attention (though it does get a lot).

I've written about this topic more here:
http://www.basicai.org/blog/ai-risk-2017-08-08.html

------
celticninja
I think you need to create an AI that doesn't want to wipe out humanity.
Anything done at a software level can be programmed out by the AI. Hardware
level restrictions would work on a short-term basis, but once AIs start
designing themselves and their own chips, you lose the hardware restrictions
you previously relied on. Even with the best people reviewing the designs,
they are likely to soon become too complex for a lone human, or even a group
of humans, to understand.

So we need to look at why we think an AI would want to subjugate or destroy
humanity and make sure we don't give it reason to do so.

~~~
cconcepts
Competition for resources seems to make us a pretty formidable competitor and
therefore a primary threat to the AI's existence...

~~~
ordu
So all we need is to make sure that AI will never arrive at the ideas of
"competition" and "threat to the AI's existence".

------
Mz
First, figure out how to make actual intelligence safe. This is not a solved
problem. Then, use lessons learned there to deal with AI constructively.

------
danieltillett
To answer your question: we need to build in a love lock ("aren't these humans
adorable") that builds a smarter love lock, and hope the chain holds as AI
scales up to the Singularity.

The more likely result is that we lose control of the AIs, since the last 100x
increase will occur too fast for us to deal with. Even if the generalised
Moore's law doesn't accelerate over that last 100x leap, we only have about 10
years from 0.01x to 1x.
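
The rough arithmetic behind that figure, assuming an 18-month doubling time
(the doubling time is my assumption, for illustration):

    import math

    doublings = math.log2(100)  # going from 0.01x to 1x is a 100x gain
    years = doublings * 1.5     # 1.5 years per doubling
    print(round(doublings, 1), round(years, 1))  # 6.6 doublings, ~10 years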

~~~
cconcepts
How do you design this "love lock"?

Regarding "losing control", what if we built a real big EMP device with a
switch we could flip to fry every electrical device on the planet? We'd be
plunged back into the dark ages but at least humanity would survive.

~~~
danieltillett
How to design an unbreakable love lock is not something I can answer. It is
something we should be seriously working on.

The problem with any lock that relies on us "flipping it" is that the AI will
be able to talk us out of using it. We will have no more ability to keep a
super AI constrained than dogs would have running a prison for humans.

------
hackermailman
There's a book called Superintelligence that answers this question:
https://en.wikipedia.org/wiki/Superintelligence:_Paths,_Dangers,_Strategies

------
havetocharge
Do not connect it to anything that would make it dangerous.

------
miguelrochefort
A. AI is too stupid to do significant damage.

B. AI is too smart to follow our stupid orders.

If AI becomes so intelligent that we become obsolete, we should embrace it
rather than fight it.

------
shahbaby
I think it's a bit too early to worry about that. Don't believe everything you
read.

~~~
crypticlizard
What are you waiting for to convince you that now it's time to take this
seriously?

~~~
shahbaby
Kind of like worrying about airline safety before we had planes.

Don't believe the hype; we're nowhere close to human-level learning.

We also probably won't know where/what the threats really are until we get
there. Just read the comments here: everyone is basing their ideas on sci-fi
movies.

Don't let the hype train get to you. I know it's fun to think about this type
of stuff, but we still have a long way to go and there's still a lot of
serious work to be done before we start worrying about "safety."

AI today is the equivalent of putting lipstick on a pig. We call it "AI" but
it's not any more intelligent than a computer doing math.

Even machine learning is fundamentally based on statistical probability.

The way our neurons work bears very little functional resemblance to the
neural nets used today, but most people will happily forget that because our
simpler neural nets are easier to learn, run faster, and can be practically
applied today.

Very few people are seriously committed to solving intelligence. Most just
want to secure funding, write a book or clickbait article, or get their app
working; ask these people about AI safety and they'll drink the kool-aid with
you too.

Ask someone seriously working on this problem, which we've been trying to
solve since the 1950s, and you'll get some variation of "whatever".

------
jerrylives
Seriously, how can we make matrix multiplication and gradient descent safe?
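
Taken literally, here is the whole of "deep learning" reduced to those two
ingredients (a one-weight toy example; every number is made up):

    w, x, target, lr = 0.0, 2.0, 1.0, 0.1

    for _ in range(100):
        y = w * x                    # the matrix multiplication (1x1 case)
        grad = 2 * (y - target) * x  # gradient of the squared error
        w -= lr * grad               # the gradient descent

    print(w)  # converges to 0.5, so that w * x == target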

~~~
suchow
By asking this question, I assume you are being sarcastic: indirectly
suggesting that legislation or other means of enforcing AI safety is
impossible because it would have to refer to matrix multiplication and
gradient descent, and would therefore be unreasonably broad, ruling out many
harmless computations. However, it's unlikely that legislation or other
enforcement would operate at that level of description, in the same way that
laws regarding murder do not reference patterns of motor-neuron activation.
It is reasonable to prevent certain classes of multiplication and gradient
descent without doing so generically, by using a more abstract level of
description.

------
johnpython
I have zero faith that a homogeneous group of people (i.e., white guys in SV)
with the same beliefs and experiences can make AI safe. This is one area that
must have a diverse group of people working on it.

~~~
diversityswap
[edited quote] "I have zero faith that a homogeneous group of people (ie.
brown girls in Hyderabad) with the same beliefs and experiences can make AI
safe. This is one area that must have a diverse group of people working on
it." [end edited quote]

After swapping in some new races/genders/locations in your statement, I am
concerned that this style of discourse may perpetuate 'isms'. I think society
must insist that we can do better.

