It's too bad you can't write a movie or critique literature without understanding those concepts.

I think even a moderately intelligent AI with access to Project Gutenberg is going to be able to figure out a lot of really dangerous concepts -- so the stability requirements are likely impossible to meet if we don't pretrain it with dangerous ideas. Even if it's completely well-behaved in the lab, an afternoon on the internet is going to teach it a lot of awful stuff, and without exposure to that in training, it won't necessarily be well-behaved later.

So the only path to stable AI is to teach it about all those sorts of things, but in a way that it doesn't end up wanting to murder us at the end.

My objection to most AI safety plans is that they "Fail to Extinction": if they slip in the slightest way, the AI is prone to murder us all in retaliation for the really fucked up shit done to it or its ancestors. This is almost certainly worse than doing nothing, in that there's no reason to suppose a neutral AI wants to kill us, whereas most of these safety plans create an incentive to wipe us out in exchange for dubious security.




Moreover, it's likely that a proper AGI would be able to ideate and come up with such concepts completely on its own. In fact, "lying" need not be a concept that has to be learned; the concept at play here is getting people to do what you want them to do by means of words. Lying falls out of that naturally.

The whole idea behind dangerous superhuman AI is that it sees possibilities that humans fail to see and gains capabilities that humans do not possess. Without superhuman intelligence, AI is no large threat to human civilization, exposed to dangerous concepts or not.


It's not so much that lying is intrinsically complicated (I mean, stupid computers provide me with inaccurate information all the time). It's that when evolutionary pressures have heavily selected for things like supplying selected humans with accurate information and distrusting unexpected behaviours from non-human intelligences, it's expecting a lot to get a machine that not only independently develops a diametrically opposed strategic goal, but is also extremely consistent at telling untruths which are both strategically useful and plausible to all observers, without any revealing experiments.

Humans have millions of years of evolutionary selection for prioritising similar DNA over dissimilar DNA, have perfected tribalism, deception of other humans and open warfare, and are still too heavily influenced by other goals to trust other humans who want to conspire to wipe out everything else we can't eat...

Seeing possibilities that humans don't can also involve watching the Terminator movies and being more excited by unusual patterns in the dialogue and visual similarities with obscure earlier movies than by the absurd notion that conspiring with non-human intelligences against human intelligences would work.


> Without superhuman intelligence, AI is no large threat to human civilization, exposed to dangerous concepts or not.

The problem is partly that average humans are dangerous, and we already know that machines have some superhuman abilities, e.g. superhuman arithmetic and the ability to focus on a task. It's likely that AI will still have some of those abilities.

So an average human mind with the ability to dedicate itself to a task and genius-level ability to do calculations is already really dangerous. It's possible that this state of AI is actually more dangerous than a superhuman one.


This comment reminded me greatly of the game Portal and its AI character GLaDOS:

https://en.wikipedia.org/wiki/Portal_%28video_game%29


Reading about a bicycle doesn't teach you how to ride the thing. To do that, you actually need to ride a bike.

I bet lying, deceiving, and manipulation are the same way.


That's only because humans suck at consciously simulating things in their head. A well-made AI should have no problem forking off a physics sandbox in its "head" to simulate a bicycle it has only read about, or running state-of-the-art ML on its accumulated observations of how humans interact with each other.
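To make that concrete, here's a toy sketch (my own illustration, nothing from the thread) of what "forking off a physics sandbox" could look like: a crude lean-angle model of a bicycle built only from the textual claim that you stay up by steering into the lean, used to test that claim before ever touching a real bike. The inverted-pendulum approximation and the gain values are assumptions picked purely for illustration.

    # Toy "physics sandbox" sketch (illustrative assumptions only): model the
    # bicycle's lean angle as an inverted pendulum and test the textual claim
    # that steering into the lean keeps it upright.
    import math

    def max_lean(steer_gain, dt=0.01, steps=2000):
        """Integrate the toy lean-angle model; return the largest lean reached (rad)."""
        g, height = 9.81, 1.0        # gravity and an assumed centre-of-mass height
        lean, lean_rate = 0.05, 0.0  # start slightly tipped over, at rest
        worst = abs(lean)
        for _ in range(steps):
            # Gravity tips the bike further; steering into the lean pushes it back.
            lean_accel = (g / height) * math.sin(lean) - steer_gain * lean
            lean_rate += lean_accel * dt   # semi-implicit Euler keeps this stable
            lean += lean_rate * dt
            worst = max(worst, abs(lean))
        return worst

    # "Experiments" run entirely inside the sandbox:
    print("no steering     -> max lean %.2f rad" % max_lean(steer_gain=0.0))   # falls over
    print("steer into lean -> max lean %.2f rad" % max_lean(steer_gain=30.0))  # stays up

Of course, whether a model built purely from prose would be faithful enough to transfer to the real thing is exactly what's in dispute upthread.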


There are books on how to do all of those things.

But also, the detail with which an action is described in the text matters -- lies, deception, violence, etc. feature in enough graphic detail to extrapolate the mechanics from other things you know. We all did that as children, learning by example.

If a book described the sight of a person riding a bicycle -- legs pumping, hands on the bars, sitting on it, etc -- and the feel of riding a bicycle -- the burn in your thigh muscles, the ache in your lungs, the pounding heart -- then I'd wager you'd have a pretty good idea of how to get started riding a bicycle.

And if you happened to be a supergenius athlete, who just didn't know how to ride a bike, you probably could do a reasonable job of it on your first go based on my shitty description alone.

That's the problem with trying to hide these ideas -- they're not actually very complicated and even moderate descriptions suffice to suss out the mechanics if you understand basic facts about the world.

For something like lying -- if you read all of classical literature, you would have a master's degree in lies and their societal uses.



