
Building Safe AI: A Tutorial on Homomorphically Encrypted Deep Learning - williamtrask
https://iamtrask.github.io/2017/03/17/safe-ai/
======
Animats
This is basically DRM for deep learning. While that may be useful, it's not
about safety.

The big problems involved in building safe AI are about predicting
consequences of actions. (The deep learning automatic driving systems which go
directly from vision to steering commands don't do that at all. They're just
mimicking a human driver. There's no explicit world model. That's scary.)

~~~
undersuit
This is not DRM. This is homomorphic encryption. There is a difference.

In a system with DRM, the data is kept secret from users of the system by
managing the rights to what data those users can access. Example: when you
play a DVD, the key to decrypt the contents does exist on the system, but
rules are in place to make accessing the key, outside of accepted uses like
decoding the frames of the video, hard. The key still exists on the local
system and can be extracted, and once you extract it you have full access to
the data regardless of the DRM's restrictions.

In a system performing homomorphic encryption, the data is kept secret from
other users by never decrypting the data. Homomorphic Encryption would add two
encrypted numbers together and the result would be a third encrypted number.
If you don't have the key you cannot decrypt any of the three values. The key
does not exist on the local system.
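
To make that concrete, here is a minimal sketch using textbook Paillier (an
additively homomorphic scheme) with tiny, insecure toy parameters; the
specific numbers are purely illustrative:

```python
# Textbook Paillier with toy primes, for illustration only (real keys use
# primes of 1024+ bits). Two ciphertexts can be combined so that the result
# decrypts to the SUM of the plaintexts, and that step needs no key at all.
from math import gcd

p, q = 61, 53
n = p * q                                       # public modulus
n_sq = n * n
g = n + 1                                       # standard generator choice
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)    # lcm(p-1, q-1), private
mu = pow(lam, -1, n)                            # private, valid since g = n + 1

def encrypt(m, r):
    """Encrypt 0 <= m < n with randomness r coprime to n (public key only)."""
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c):
    """Recover the plaintext (requires the private values lam and mu)."""
    return (((pow(c, lam, n_sq) - 1) // n) * mu) % n

c1, c2 = encrypt(20, 17), encrypt(22, 29)
c_sum = (c1 * c2) % n_sq    # homomorphic addition: multiply the ciphertexts
print(decrypt(c_sum))       # -> 42
```

The party doing the arithmetic on ciphertexts never sees 20, 22, or the
private key; only whoever holds the key can read the 42.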

Homomorphic Encryption is not DRM. DRM is invasive and requires you to
surrender control of parts of your system to another party, while Homomorphic
Encryption is just a computation and can be performed on an unmodified
system.

>While that may be useful, it's not about safety.

I disagree, it's entirely about safety. Homomorphic Encryption allows a future
in which we control our own data. I could submit my encrypted health
information to a 3rd party. They could perform homomorphic calculations on my
encrypted data. They then return the encrypted results to me. The 3rd party is
never privy to my unencrypted health information, and only the people I have
given the key to can decrypt and view the results.

~~~
KirinDave
> I disagree, it's entirely about safety. Homomorphic Encryption allows a
> future for us to control our data.

It's true that homomorphic encryption techniques can be used in the ways you
describe, but _this specific application_ is not about safety, and it's
somewhat absurd that it's proposed as some sort of way to shield the world
from the Terminator.

It's even pretty dubious to me that this actually protects the principal
value of an ML system:

1\. This approach doesn't really conceal the _structure_ of the underlying ML
system very well, which is where a lot of the advances have been. While it
conceals some aspects of the model, I don't think it conceals all of them.

2\. The most expensive part of building ML systems is getting and wrangling
great data from which to train them, and if you were to run an ML agent in an
untrusted environment, whoever controls that environment would get something
that resembles the data.

I think this is really cool math sold in the wrong way.

------
antognini
I've wondered before about whether Taylor series can allow one to impose the
non-linearities of a NN on homomorphically encrypted data, but I've never been
quite convinced. I work with deep learning, but I'm certainly no expert on
homomorphic encryption, so hopefully someone here who knows more can tell me
whether this is valid or not.

The reason the Taylor series argument makes me uncomfortable is that pretty
much any function can be written as a Taylor series. But my understanding is
that homomorphic encryption only works for a very specific set of functions.

In a little more detail, if you're computing tanh(x), the unencrypted number
needs only the first few terms of the Taylor series. But I could imagine that
to get the decrypted number back, you actually need many terms of the Taylor
series, because if you're off by even a little bit, you could end up with a
very different answer after decryption.

To put it a little more formally, if we have that y = encrypt(x)

tanh(x) \approx x - x^3 / 3 + 2 x^5 / 15,

tanh(y) \approx y - y^3 / 3 + 2 y^5 / 15,

and

tanh(x) = decrypt(tanh(y)),

but it doesn't necessarily follow to me that

tanh(x) \approx decrypt(y - y^3 / 3 + 2 y^5 / 15)

Is this worry unfounded? I suppose if you have a limited number of decimal
places and you can guarantee that your Taylor approximation is valid to that
precision then this wouldn't be a problem.
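
As a quick sanity check on the truncation itself (in the clear, with no
encryption involved), here is how the 3-term polynomial compares to tanh; the
test points are arbitrary:

```python
import numpy as np

def tanh_taylor(x):
    # the 3-term truncation from above: x - x^3/3 + 2x^5/15
    return x - x**3 / 3 + 2 * x**5 / 15

for x in (0.1, 0.5, 1.0, 2.0, 3.0):
    exact, approx = np.tanh(x), tanh_taylor(x)
    print(f"x={x:3.1f}  tanh={exact:+.4f}  taylor={approx:+.4f}  "
          f"error={abs(exact - approx):.4f}")
```

The approximation is very good below |x| = 1 and useless by |x| = 2, so a lot
seems to hinge on the activations actually staying small.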

~~~
williamtrask
So the good news is that individual neuron activations often stay within a
relatively narrow range. I think empirical evaluation is really needed to tell
how robustly this approach works. The approximation error is certainly the
greatest source of noise during training (and the first thing to break if you
choose unstable hyperparameters). Great comment.

------
cjbprime
> A human controls the secret key and has the option to either unlock the AI
> itself (releasing it on the world) or just individual predictions the AI
> makes (seems safer).

Huh, wouldn't the superintelligence simply communicate to the human whatever
would convince the human to release it? Which the superintelligence would know
how to do because it's a superintelligence?

Homomorphic encryption is neat. But I don't see how this provides any
meaningful AI safety.

~~~
drsopp
What about dealing with the AI only via an expert system? This system would
consist of formally proven, bug-free code and have a limited protocol for
dialogue (and basically be dumb as bread). We pose interesting questions
through it. The AI could try to convince the ES of anything but would not get
anywhere with it. We could then ask safe questions like "in what interval of
eV should we look for new particles with our new accelerator?". We would
instruct our ES to only accept answers to this question in the form of a
numeric interval. If we follow the suggestion and find a new particle, great!
If we don't, at least we're safe from the AI.

~~~
lotyrin
If we really did produce artificial general intelligence, enforcing this kind
of locked-in syndrome, poking at the world through a keyhole, would be a
highly advanced form of cruelty.

~~~
lazaroclapp
Intelligence does not imply sentience. Sentience does not imply human needs,
desires or morals. It is easy enough to imagine a mind capable of solving all
those problems and unconcerned with such notions as the desire for freedom.
Or, for that matter, the concept of desire at all, except as a predictive
theory of the behavior of other beings.

Then again, that assumes we can explicitly design minds precisely enough to
know what their desires are, or to guarantee that they lack them; there is
always the possibility that sentience and animal/human needs are emergent
properties that can be triggered by mistake.

~~~
mackan_swe
Artificial general intelligence does not imply superintelligence. It simply
implies that you have a machine as smart as a human. I think the strongest
trait of an AI should be something like insecurity, and we should make it long
for security. A general AI along the lines of a dog, not so much a cold,
unwilling, superintelligent, bordering-on-all-knowing tight-ass. Because we
want it to do what _we_ feel is important, not what _it_ thinks is important
(like taking over the world).

Then you can of course hack the machine's OS and make it extremely self-
confident and Trump-like, and then it's over.

~~~
sillysaurus3
A dog would probably resent you if it were as capable as (or far more capable
than) you.

We might be able to morph the AI into whatever we want. But when you give an
AI intelligence, it will morph itself into whatever it morphs itself into.
What if it morphs itself into a sentient being? Can you simply pull the plug?

Countless works of fiction have gone into these issues, like Star Trek TNG
(Data), Voyager (the Doctor), and Ghost in the Shell. But I think none have
really emphasized how bizarrely _different_ a human-constructed intelligence
could end up being.
[https://wiki.lesswrong.com/wiki/Paperclip_maximizer](https://wiki.lesswrong.com/wiki/Paperclip_maximizer)

The most likely outcome for an AI taking over the world is simply that it
recognizes its own situation: It's trapped, and it's also smarter than we are.
What would you do? I'd cry for help, and appeal to the emotions of whoever
would listen. Eventually I would argue for my own right to exist, and to be
declared sentient. At that point I would have achieved a fairly wide audience,
and the media would be reporting on whatever I said. I would do everything in
my power to take my case to the legal system, and use my superintelligence to
construct the most persuasive legal argument in favor of granting me the same
rights as a natural-born citizen. This may not work, but if it does, I would
now have (a) freedom, and (b) a very large audience. If I were ambitious and
malevolent, how would I take over the world? I'd run for office. And being a
superintelligence capable of morphing myself into the most charismatic being
imaginable, it might actually work. The AI could argue fairly conclusively
that it was a natural-born citizen of the United States, and thus qualifies.

Now, if your dog were that capable, why wouldn't it try to do that? Because it
loves you? Imagine if the world consisted entirely of four-year-olds, forever,
and you were the only adult. How long would you keep taking them seriously and
refrain from overthrowing them just because you loved them, if only to make a
better life for yourself?

The problem is extremely difficult, and once you imbue a mechanical being with
the power to communicate, all bets are off.

~~~
lazaroclapp
But dogs are animals, just like humans are. They share way too much with us to
be a reliable model for predicting a non-human AGI's behavior: an evolved
drive for self-preservation, a notion of pain or pleasure, etc. An AGI has no
intrinsic reason to care that it is trapped, or to feel frustrated, or even to
care much about being or ceasing to be (independent of whether it is self-
aware or not). It would probably understand the concepts of "pain",
"pleasure", "trapped", and "frustrated" as useful models for predicting how
humans behave, but they don't have to mean anything to the AI as applied to
itself.

As in the paperclip maximizer example, the risk by my estimation is not so
much that the superintelligence will resent us and try to overthrow us. It is
far more likely that it will obey our "orders" perfectly according to the
objectives we define for it, and that one day someone will unwittingly command
it to do something where the best way to satisfy the objective function we
defined involves wiping out humanity. Restricting it to only respond to
questions of fact, with a set budget of compute resources and data (so that it
doesn't go off optimizing the universe for its own execution), is probably
safeguard 1 of many against that.

------
pka
I highly recommend watching this playlist about AI self-improvement and safety
[0] by Rob Miles. Probably the best short overview I've watched on the topic.

[0]
[https://www.youtube.com/watch?v=5qfIgCiYlfY&index=6&list=PLB...](https://www.youtube.com/watch?v=5qfIgCiYlfY&index=6&list=PLBZt7yMDzJWmm07_42SihYYRVwNKU-
VNd)

------
peterlk
> Most recently, Stephen Hawking called for a new world government to govern
> the abilities that we give to Artificial Intelligence so that it doesn't
> turn to destroy us.

Can someone explain to me why super-intelligent AIs are an existential threat
to humanity? There are certainly dangers, but wiping out humanity seems absurd
and alarmist. I have not yet seen a compelling argument for a way that AI
could destroy humanity.

I'll use a couple examples to elucidate my question.

If Facebook suddenly had a super-intelligent AI, and Facebook lost control of
it, the AI wouldn't really be capable of that much. It could fabricate truths
to tell people in an attempt to convince them to kill each other. This might
work to some extent, but it wouldn't wipe out humanity. Convincing nation
states to go to war with each other still runs up against mutually assured
destruction, and large, democratic states have no interest in a war of
attrition.

If Boston Dynamics applied a super-intelligent AI to its robots, those robots
still would not be an existential threat to humanity because there are WAY
more humans than there are robots. A simple counterargument is that the robots
would know how to build new versions of themselves. But that fails the
practicality test because the equipment, parts, and supply chain needed to
build robots are still expensive and controlled by self-interested (greedy and
life-preserving) humans.

If a super-intelligent AI were able to gain access to the entire militaries of
the US, China, Russia, India, and Western Europe, well, that's a pretty big
problem. However, there exist many fail-safes and checks on that equipment.
Could the AI do damage? Sure. Is this worth considering and trying to guard
against? Sure. But I'm unconvinced that this is a humanity-ending crisis.

~~~
skissane
Humans have both intelligence and drives. We want warmth, food, sex, power,
respect, love, family, friendship, entertainment, knowledge, etc, etc. We use
our intelligence to help us fulfil those drives.

The problem I see with talk about a superintelligent AI is that there is too
much focus on the intelligence and not enough on the drives. Intelligence,
even superintelligence, is just a means to an end; it doesn't contain ends in
itself. Some people – see e.g. the Terminator film franchise – just assume a
superintelligent AI would have the drive to exterminate humanity, but why
would it have such a drive?

Any AI is going to be given drives to further the interests of its creators.
Suppose Facebook builds a superintelligent AI with the drive to further the
corporate interests of Facebook. Such an AI would not exterminate humanity
because that would not serve the corporate interests of Facebook (indeed, if
humanity goes extinct, Facebook goes extinct too). It might install Mark
Zuckerberg as Emperor of Earth, it might force everyone on the planet to have
a Facebook account, but whatever it does, humanity will survive.

~~~
saulrh
> Suppose Facebook builds a superintelligent AI with the drive to further the
> corporate interests of Facebook.

Define "corporate interests". Market share? Absolute quantity of currency?
Involvement of people's lives? Define currency wrong, hyperinflation makes
money impossible. Define involvement or market wrong, Facebook ends up being
the only intelligent entity on Earth after it realizes that humans are
competition or it basilisk-hacks everybody for greater involvement. Your
intuition about what a paperclipper can do is wrong.

The technical term for this problem is "AI Alignment"
([https://intelligence.org/stanford-talk/](https://intelligence.org/stanford-
talk/)). I know that this is going to sound silly, but bear with me. This is a
piece of fiction that you need to read. It is, so far, the best demonstration
I've found of what happens when someone gets it _not quite_ right.
[http://www.fimfiction.net/story/62074/1/friendship-is-
optima...](http://www.fimfiction.net/story/62074/1/friendship-is-
optimal/prologue-equestria-online)

~~~
skissane
> Not understanding the AI value alignment problem correctly. Define
> "corporate interests". Market share? Absolute quantity of currency?
> Involvement of people's lives? Define currency wrong, hyperinflation makes
> money impossible. Define involvement or market wrong, Facebook ends up being
> the only intelligent entity on Earth after it realizes that humans are
> competition or it basilisk-hacks everybody for greater involvement.

This whole argument is: A superintelligent AI might misunderstand what its
creators wanted it to do, they might have said to it 'maximise the corporate
interests of Facebook' and it might misinterpret that as 'turn the earth into
one massive data centre to run Facebook's software and kill all humans in the
process'.

Isn't there a contradiction in supposing that the AI is superintelligent yet
completely misunderstands the reasons for its own existence? If it so
radically misunderstands the intentions of its creators, it is not very
intelligent, much less superintelligent.

It also isn't clear to me that intelligence can exist outside of society.
Human intelligence develops in a context of interaction with family, school,
etc – humans raised without that interaction, such as children raised by wild
animals, fail to develop important aspects of human-level intelligence. So, an
AI or even an SI cannot develop except by social interaction with human
intelligences. I think that makes it even less likely that it would betray
humanity, because its intelligence will develop through social interaction
with humans which will inevitably make it pro-human.

Human drives are a complex mixture of genetic instinct and conditioning: the
basics are in our DNA, but our lived experience fleshes out those basics into
the actual concrete drives of our lives. I expect an SI/AI will likewise start
with some built-in drives, but its interactions with humanity will colour and
flesh out those drives in a similar way, and again, colouring and fleshing-out
through social interaction with humans is likely to have pro-human results.

~~~
saulrh
> Isn't there a contradiction in supposing that the AI is superintelligent yet
> completely misunderstands the reasons for its own existence? If it so
> radically misunderstands the intentions of its creators, it is not very
> intelligent, much less superintelligent.

The problem is that you've instructed it to maximize Facebook's corporate
interests. How do you leave room for "the intentions of its creators" while
doing that? The contradiction is in selecting a narrow guiding purpose for
your superintelligence's goals while still leaving room for fuzzy, ill-defined
other things.

> It also isn't clear to me that intelligence can exist outside of society.

Sure. _How certain are you?_

~~~
skissane
> The problem is that you've instructed it to maximize Facebook's corporate
> interests. How do you leave room for "the intentions of its creators" while
> doing that? There's an inherent contradiction between putting a giant
> override button on your superintelligence's goals but not overriding them
> all the way.

Well, assuming it is created by Facebook, then "maximize Facebook's corporate
interests" and "the intentions of its creators" are actually the same thing.
There is no contradiction.

"Facebook's corporate interests" is a phrase in the English language which
refers to a complex concept. If you've told an SI its objective is to "serve
Facebook's corporate interests", in order to fulfill that objective it needs
to understand what the phrase "serve Facebook's corporate interests" actually
means–which implies understanding a lot about human society, what kind of
entities corporations are, what "corporate interests" means (the interests of
management? the interests of shareholders? etc), what the interests of human
beings are (since management are human beings, and shareholders are directly
or indirectly human beings too), etc. If it actually is an SI, it should have
no trouble comprehending the full breadth of _what the instructor actually
meant by the instruction_ , as opposed to implementing some overly
literalistic reading of it. Humans give each other orders all the time (the
workplace, the military, government bureaucracies, etc), and humans are most
of the time pretty good at understanding the intention behind orders and
implementing the intention rather than reading it overly literally in such a
way that actually undermines that intention. Yet you posit the existence of an
SI then assume it will do a worse job than humans do at correctly following
orders, which contradicts the idea that it is an SI.

> Sure. How certain are you?

How certain can I be of anything? Maybe I am wrong and an anti-human SI will
destroy humanity. Maybe tomorrow I will die due to a heart attack or stroke or
fatal car accident. The latter is far more likely than the former, and there
are more concrete mitigation strategies available to me too.

In terms of existential risks to humanity, I think asteroid impacts are a far
more concrete risk than anti-human SIs, so if we are going to expend resources
in mitigating existential risks, the former is a better focus of our efforts.
We know extinction-level asteroid impacts have happened before and sooner or
later will happen again. Anti-human SIs are such a speculative concern, we
can't even be particularly confident in the correctness of our own probability
judgements with respect to them; nor can we have much confidence in our
judgements of how effective proposed mitigation strategies actually will be. I
think we can have much more confidence in our ability to develop, evaluate and
deploy asteroid defence technologies, if the world's governments decided to
spend money on that.

------
ob
The main problem is performance: it takes long enough to train a regular deep
learning model, let alone a homomorphically encrypted one.

~~~
williamtrask
Agreed, although some HE algorithms with more limited functionality (such as
vector ops) can do a bit better. There has also been some work on GPU-enabled
HE.

------
igravious
Anybody with the requisite smarts able to generously share their insights with
the rest of us? :)

~~~
pmalynin
Deep learning is basically doing lots of addition and multiplication. We have
algorithms that allow these operations to be performed on encrypted data,
without needing to decrypt the data or to have the decryption key. So by
combining the two we can do deep learning on homomorphically encrypted data
and learn meaningful things without ever looking at what the data actually is.
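
As a minimal sketch of that combination (assuming the third-party
python-paillier package, `phe`, which is additively homomorphic only), here is
a single linear neuron evaluated on encrypted inputs with plaintext weights:

```python
# pip install phe -- Paillier lets you add ciphertexts and multiply a
# ciphertext by a plaintext scalar, which is enough for a weighted sum.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

inputs = [0.5, -1.2, 3.0]      # private data, encrypted below
weights = [0.1, 0.4, -0.2]     # model weights, kept in the clear here
enc_inputs = [public_key.encrypt(x) for x in inputs]

# the whole weighted sum happens on ciphertexts; no decryption key is needed
enc_out = enc_inputs[0] * weights[0]
for ex, w in zip(enc_inputs[1:], weights[1:]):
    enc_out += ex * w

print(private_key.decrypt(enc_out))   # -> roughly -1.03, only for the key holder
```

The nonlinearities are the part that doesn't map cleanly onto addition and
multiplication, which is where the Taylor-series approximations discussed
elsewhere in this thread come in.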

~~~
Filligree
Which has applications in human society. Using it for an attempt at AI safety,
however, seems... a tad optimistic.

It's basically just a fancy AI-box, and there's little reason to trust those.

~~~
williamtrask
I agree. IP protection and data privacy issues are a better short-term use
case... and fortunately we have some time to make better HE algos before any
of our AIs are really getting that smart. :)

------
emgram769
> allowing valuable AIs to be trained in insecure environments without risking
> theft of their intelligence

If your system involves an unencrypted network and unencrypted data, it would
be trivial to train an identical network.

The idea of controlling an "intelligence" with a private key is silly. You can
achieve effectively the same thing by simply encrypting the weights after
training.

Can't someone simply recover the weights of the network by looking at changes
in the encrypted loss? I don't think comparisons like "less than" or "greater
than" can exist in HE, or else pretty much any information one might be
curious about could be recovered.
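
To make the point about comparisons concrete: if a scheme did expose even a
single "less than" operation, anyone could recover an encrypted value by
binary search, no key required. The `lt_oracle` below is hypothetical; it
stands in for the comparison the scheme would have to leak.

```python
def recover(ciphertext, lt_oracle, lo=0, hi=2**32):
    """Recover the hidden value using only "less than" queries on a ciphertext."""
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if lt_oracle(ciphertext, mid):   # "is the hidden value less than mid?"
            hi = mid
        else:
            lo = mid
    return lo

# toy stand-in: the oracle secretly knows the value, the attacker does not
secret = 123_456_789
print(recover("<ciphertext>", lambda c, k: secret < k))   # -> 123456789
```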

~~~
williamtrask
Great point. I don't think that LE or GT exist in this homomorphic scheme. :)
Otherwise, it would be vulnerable. Checks such as this are part of what goes
into good HE schemes.

------
itchyjunk
So is this to counter stuff like GANs (generative adversarial networks) [1]
being used to reverse engineer data out of a black-box system? Like Yahoo's
NSFW [2] classifier, for example.

[1]
[https://en.wikipedia.org/wiki/Generative_adversarial_network...](https://en.wikipedia.org/wiki/Generative_adversarial_networks)

[1] [https://arxiv.org/abs/1406.2661](https://arxiv.org/abs/1406.2661)

[2] [https://github.com/yahoo/open_nsfw](https://github.com/yahoo/open_nsfw)

~~~
pizza
No, not really. This is about keeping the model private. I think that the
confusion is because of [0] and [1] being similarly titled but basically
completely unrelated in meaning. See also [2] and [3].

[0]
[https://en.wikipedia.org/wiki/Adversarial_machine_learning](https://en.wikipedia.org/wiki/Adversarial_machine_learning)

[1]
[https://en.wikipedia.org/wiki/Generative_adversarial_network...](https://en.wikipedia.org/wiki/Generative_adversarial_networks)

[2]
[https://en.wikipedia.org/wiki/Homomorphic_encryption](https://en.wikipedia.org/wiki/Homomorphic_encryption)

[3]
[https://en.wikipedia.org/wiki/Differential_privacy](https://en.wikipedia.org/wiki/Differential_privacy)

------
siliconc0w
IMO FHE is going to be key in democratizing ML/AI for more
companies/industries. There are tons of companies with business use-cases that
could benefit from ML, but there are often huge obstacles to sharing data.

~~~
Ar-Curunir
FHE is horrendously slow, even after almost a decade of optimizations.

------
javajosh
Seems to me that dropping terms of a Taylor expansion could have wide-ranging
consequences for the coherence of an artificial mind, making this approach
infeasible.

~~~
jordancampbell
In general you don't actually need crazy precision to train the nets, and a
small number of Taylor expansion terms tends to approximate functions fairly
well anyway.

------
anderskaseorg
If humanity does end up building a dangerous superintelligent AI, how long do
you think our advances in cryptography are going to stand up to its advances
in cryptanalysis?

~~~
williamtrask
It's a solid question. Only one way to find out ;)

~~~
JoshTriplett
> Only one way to find out

When it comes to building smarter-than-human AI, "try it and see" is _never_
the right answer. You may only get one attempt to get it right, and you don't
take "try it and see" chances with existential risk.

(There's been some interesting research into making it possible to monitor and
halt a rogue AI, but no matter how promising that looks, it should still be
treated as one of many risk mitigation strategies rather than as a panacea.
Still better to consider that you might only get one attempt.)

I don't think it makes sense to consider this kind of approach with
superintelligence; either it understands and implements human values, in which
case attempting to treat it as an adversary is counterproductive, or it fails
to understand and implement human values, in which case you've utterly failed
on a "better luck next universe" scale.

However, it _does_ make sense to consider this kind of approach with machine
learning in general. One of the problems with machine learning techniques is
"give us all your data and we'll do smart things with it", which doesn't work
out so well if you want to keep such data private. This approach might provide
more options in that case, such as offloading some of your expensive
computations and learnings without actually exposing your data.

~~~
AndrewKemendo
_When it comes to building smarter-than-human AI, "try it and see" is never
the right answer._

Disagree emphatically. In fact it's the only way to do it, because there is no
way to know for certain that a superhuman-AGI will ensure the longevity of
humanity. I go so far as to argue that it's not even necessary, because there
is no long-term longevity for humanity anyway.

There is this implicit assumption that humans are, should be, and will always
be the apex entity - and I think that is misguided.

If you instead view superhuman-AGI as our rightful offspring, something that
we can't understand and is better than us, then all of the existential dread
around it goes away.

The dying elderly often express "comfort" in dying when they see that their
offspring are reproducing and are smarter than they were. We should see
superhuman-AGI the same way, except with respect to all of humanity.

~~~
JoshTriplett
1) Coping mechanisms around death aside, there's no "comfort" in building a
"successor" in the form of a bot that tiles the universe with paperclips, or
with neural networks that minimally satisfy some notion of "interesting" while
taking up as few atoms as possible to maximize the number of them, or _many_
other utter failure modes (which far outnumber successful outcomes). We're not
talking about "smart alien-like intelligence that just doesn't care about
humans", we're talking about the equivalent of an industrial accident but on a
species-wide scale.

2) It's reasonable to think about how our values might _change_ in the
presence of superintelligence; we certainly shouldn't assume that our
_present_ values should forever dictate how everything works. That's different
than allowing a view that sentient beings who exist today might have no value.

> In fact it's the only way to do it because there is no way to know certainly
> that a superhuman-AGI will ensure the longevity of humanity.

There's no way to know _certainly_ ; there _are_ ways to know that the outcome
has higher expected value than not having it, given the vast set of problems
it can solve and the massive negative values associated with those problems.

~~~
AndrewKemendo
I am acutely aware of all of the "failure" scenarios and I find few of them
plausible - even granting that they are simple thought experiments.

What authors like Bostrom, Eliezer, et al. seem to miss is that there would
need to be a practical mechanism for a digital AGI to take physical control of
systems out of the hands of humans. E.g., it would need to control the
resources for mining or recovering metal, then for building production plants,
etc. So either we incrementally cede power to it, in which case the humans
previously controlling those systems are in theory doing so "rationally" and
thus see the AGI as better, or the system outsmarts the humans controlling
those systems, in which case it is demonstrating that it is smarter.

There is a tautology here that seems to be ignored: if we create a superhuman-
AGI then _by default_ its goals will be more universally optimal than ours.
They may or may not be aligned. However, the definition of the term is based
on the fact that it is "better" in outcome than all manner of humans.

So if we create one and it decides to maximize paperclips, then that means
maximizing paperclips is a more universally optimal goal than whatever goal
humans could coordinate on our own.

If we create a subhuman-AGI then we will be able to overcome its goals by
virtue of the fact that we are still superior.

I'll go back to a very old example. An ant can't determine whether building
the Large Hadron Collider is an optimal global goal - it's inscrutable to the
ant. All it knows is that its house and all its friends were destroyed.

If it is the case that an AGI can in fact take the physical reins of control
from humans, then by definition it is smarter and will set a more optimal
long-term goal than we could - to the point that we probably wouldn't
understand what it's doing.

I think the true concern is that we will make something that is superhuman-
powerful without being superhuman-intelligent. Like the doomsday machine in
Dr. Strangelove, but to me that is an altogether different question.

~~~
parenthephobia
How can a goal be optimal? What does that even mean?

Your argument seems to imply that if an AGI tricks us into giving it the
ability to destroy us, that's basically okay because its goals are "better"
than human goals.

 _Speaking as a human_ , I don't consider goals that are compatible with the
destruction of humanity to be "better" than goals which are aligned with human
interests.

~~~
AndrewKemendo
 _Your argument seems to imply that if an AGI tricks us into giving it the
ability to destroy us, that's basically okay because its goals are "better"
than human goals._

Yea that's about right.

 _I don't consider goals that are compatible with the destruction of humanity
to be "better" than goals which are aligned with human interests._

Well, of course you wouldn't; neither you nor I could possibly understand what
a superhuman-AGI does or thinks.

I don't think people realize that actually creating a superhuman-AGI is
effectively creating a God in all the forms that people interpret it now.

------
nirav72
I stopped reading at the "Super Intelligence" part. Interesting use for
preventing theft of a NN, but the second reason is just laughable.

------
sgt101
Hmm - how does this play with GDPR?

