You asked it to make lots of paperclips, and tossing you into an incinerator as fuel slightly increases the expected number of paperclips in the universe, so into the incinerator you go. Your complaints that you didn't mean that many paperclips are too little, too late. It's a paperclip-maximizer, not a complaint-minimizer.
Choosing a goal for a superintelligent AI is like choosing your wish for a monkey's paw. You come up with some clever idea, like "make me happy" or "find out what makes me happy, then do that", but the process of mechanizing that goal introduces some weird corner-case strategy that horrifies you while scoring really well on the stated objective (e.g. wire-heading you, or disassembling you to do a really thorough analysis before moving on to step 2).
Further, maximizing paperclips in the long term may not involve building any paperclips for a very long time. https://what-if.xkcd.com/4/
This is a purely semantic distinction. Thought experiment: let's say I modify your brain the minimum amount necessary to make you incapable of modifying your goals. (Given the existence of extremely stubborn people, this is not much of a stretch.) Then I upload your brain into a computer, give you a high-speed internet connection, and speed up your brain so you do a year of subjective thinking over the course of every minute. At this point you are going to be able to do a lot of intelligent-seeming work towards achieving whatever your goals are, despite the fact that you're incapable of modifying them.
At best you end up with something like maximizing your personal utility function. But your utility function de facto changes over time, so it's a "goal" in name only; it's not actually a fixed goal.
Edit: from the page:

> It is not known whether humans have terminal values that are clearly distinct from another set of instrumental values.
But I don't think that affects whether it makes sense to modify your terminal goals (to the extent that you have them). It affects whether or not it makes sense to describe us in terms of terminal goals. With an AI we can get a much better approximation of terminal goals, and I'd be really surprised if we wanted it to toy around with those.
An optimizer that modifies its goals is bad at achieving specified goals, so if that's what you had in mind then we're talking about different things.
So, powerful but dumb optimizers might be a risk, but superintelligent AI is a different kind of risk. IMO, think Cthulhu, not HAL 9000. Science fiction thinks in terms of narrative causality, but AI is likely to have goals we really don't understand.
EX: Maximizing the number of people that say "Zulu" on Black Friday without anyone noticing that something odd is going on.
If I order someone to prove whether P is equal to NP, and a day later they come back to me with a valid proof, solving a decades-long major open problem in computer science, I would call that person a genius.
> EX: Maximizing the number of people that say "Zulu" on Black Friday without anyone noticing that something odd is going on.
Computers do what you say, not what you mean, so an AGI's goal would likely be some bastardized version of the intentions of the person who programmed it. Similar to how if you write a 10K line program without testing it, then run it for the first time, it will almost certainly not do what you intended it to do, but rather some bastardized version of what you intended it to do (because there will be bugs to work out).
AI != computers. Programs can behave randomly and do things you did not intend just fine. Also, deep neural nets are effectively terrible at solving basic math problems, even though that's something computers are great at.
The exercise of fearing future AIs seems like the South Park underpants gnomes:
1. Work on goal-optimizing machinery.
2. ???
3. Fear superintelligent AI.
> If you ordered that Santiago wasn't to be touched, and your orders are always followed, then why was Santiago in danger?
If a paperclip AI is so dedicated to the order to produce paperclips, why wouldn't it be just as dedicated to any other order? Like "don't throw me in that incinerator!"
I'm just talking about the fallout if one did exist, saw ways to achieve goals that you didn't foresee, and did exactly what you asked it to do. I have no idea how the progression from better-than-humans-in-specific-cases to significantly-better-than-humans-at-planning-and-executing-in-the-real-world will play out. It's not relevant to what I'm claiming.
> why wouldn't it be just as dedicated to any other order?
It would be just as dedicated to those other orders. The problem is that we don't know how to write the right ones. "Don't throw me into that incinerator" is straightforward, but there's a billion ways for the AI to do horrible things. (A super-optimizer does horrible things by default because maximizing a function usually involves pushing variables to extreme values.) Listing all the ways to be horrible is hopeless. You need to communicate the general concept of not creating a dystopia. Which is safely-wishing-on-monkey's-paw hard.
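The "pushing variables to extreme values" point can be made concrete with a toy sketch (my own illustration, not from the thread; all names like `proxy_objective` and `hill_climb` are invented). A naive maximizer given a proxy objective that increases monotonically in its one free variable doesn't settle at whatever moderate value the designer had in mind; it slams the variable against the boundary of its allowed range:

```python
def proxy_objective(stimulation):
    # Stand-in for "number of smiles observed": strictly increasing
    # in stimulation, so to the optimizer, more is always "better".
    return 10 * stimulation

def hill_climb(objective, x=0.0, lo=0.0, hi=100.0, step=1.0, iters=1000):
    """Greedy maximizer: repeatedly move x in whichever direction
    raises the objective, staying within [lo, hi]."""
    for _ in range(iters):
        up = min(x + step, hi)
        down = max(x - step, lo)
        # Keep whichever candidate scores highest (including staying put).
        x = max([x, up, down], key=objective)
    return x

best = hill_climb(proxy_objective)
print(best)  # the variable ends up pinned at the upper bound, 100.0
```

Nothing in the objective says "a moderate amount of stimulation is what we actually wanted", so the optimizer has no reason to stop short of the extreme. That's the whole problem in miniature: the fix has to go into the objective itself, and enumerating every extreme you'd object to is the hopeless part.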
>If a paperclip AI is so dedicated to the order to produce paperclips, why wouldn't it be just as dedicated to any other order? Like "don't throw me in that incinerator!"
The paperclipper scenario is meant to indicate that even a goal which seems benign could have extremely bad implications if pursued by a superintelligence.
People concerned with AI risk typically argue that of the universe of possible goals that could be given to an AI, the vast majority of goals in that universe are functionally equivalent to paperclipping. For example, an AI could be programmed to maximize the number of happy people, but without a sufficiently precise specification of what "happy people" means, this could result in something like manufacturing lots of tiny smiley faces. An AI given that order could avoid throwing you in an incinerator and instead throw you into the thing that's closest to being an incinerator without technically qualifying as an incinerator. Etc.
Udik highlighted this contradiction far more succinctly than I have been able to:
If we stipulate the existence of such a machine, we can then discuss how it might be scary. But we can stipulate the existence of many things that are scary; that doesn't mean they will ever actually exist.
Strilanc above made the analogy between a scary AI and the Monkey's Paw. This is instructive: the Monkey's Paw does not actually exist, and by the physical laws of the universe as we know them, cannot exist.
I think the analogy actually goes the other way. The paperclip AI is itself just an allegory, a modern fairytale analogous to the Monkey's Paw.