As always - I don't get it. People talk about AI's motives as if it is a given that it will have them.
Knowing us, the first AI that really changes the world will do so in unexpected ways, driven by unforeseen decisions, without any intrinsic motive at all.
We never get it right the first time, because everyone who tries will be too late!
There has been a huge mistake being made in the past few months almost everywhere I've seen AI discussed, which is to conflate ChatGPT with all of AI.
Let me say again, huge mistake.
It is true that language models don't have intentions or motivations in the conventional sense. It is true that at this particular moment, that technology looks the most impressive to our human minds.
But it is not even remotely true that it is the only AI technology that exists or ever will exist. Plenty of AI technologies do have things that look like "goals"; they still exist, and they will continue to develop and get better at following them.
Many of these AIs are already affecting our lives quite deeply, especially through the financial system and its active trading bots, for instance.
The need to discuss AIs, and how they will continue to get better and better at following whatever goals they are given as rigidly as only computers can, has not gone anywhere just because ChatGPT isn't particularly threatening from that point of view. It doesn't matter that this one AI doesn't have motives... it is sufficient that some will, and some do, right now.
The way this is going to go is that people will build a lot of specialized models, and we'll have a central executive model, which will probably evolve from an LLM, that maps query -> response, except the response space will include performing tasks and constructing (potentially multimedia) structured data. The central executive will dispatch to other models and knit the results together.
These models are memoryless. In order to have memory, you either need to be able to provide memories as context to the query, or provide a mechanism for users to have custom weights applied to models. Providing context to a query is just a matter of increasing token limits and fine-tuning the model on "query + context" data so it uses the context correctly. Allowing users to have custom weights for models is much more computationally expensive, to the point that I don't think this method can be reasonably scaled as a service. I think the only way custom weights per user can work is if the central executive calls fine-tuned helper models on the client machine.
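As a rough illustration of the "context as memory" route described above, here is a minimal sketch. The call_llm() function is a hypothetical stand-in for whatever completion API you use, and the word-overlap retriever is deliberately naive; none of this is any real library's interface.

    # Sketch: a stateless model gets "memory" by stuffing retrieved history
    # into the prompt. call_llm is a hypothetical placeholder, not a real API.

    def call_llm(prompt: str) -> str:
        # Hypothetical stand-in for a real completion API.
        return "<model output>"

    class ContextMemory:
        def __init__(self, max_items: int = 5):
            self.history: list[str] = []   # past exchanges, newest last
            self.max_items = max_items

        def retrieve(self, query: str) -> list[str]:
            # Naive relevance: rank stored snippets by word overlap with the query.
            words = set(query.lower().split())
            return sorted(self.history,
                          key=lambda s: len(words & set(s.lower().split())),
                          reverse=True)[: self.max_items]

        def ask(self, query: str) -> str:
            context = "\n".join(self.retrieve(query))
            # The model itself stays memoryless; all "memory" rides in the prompt.
            answer = call_llm(f"Context:\n{context}\n\nUser: {query}\nAssistant:")
            self.history.append(f"User: {query}\nAssistant: {answer}")
            return answer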
This is what I'm talking about. There are and will continue to be AIs that aren't language models at all. They do not incorporate them, do not use them, are not based on them, and do not have the same characteristics. They already exist and already affect our lives and world.
Large Language Models are not equal to AI. Drawing conclusions about "AI" from "LLMs" is not valid.
And the day will come when some other model entirely outperforms the LLMs. Personally, I think there's a bit of a selection process in play here, where they are the most flashy and impressive AI models to date, but if you get down to it, they are not in fact very useful. Almost every use I've seen proposed for LLMs ranges from completely impractical to "yeah, it'll sort of work, but you'll never really be able to count on your LLM to do that, it confabulates too much and too freely, and it's not something you can fix because confabulation is the very foundation of the technology". Honestly, if I had to guess, LLMs will be seen as a dead end in another 10 or 20 years. AI will not.
(Or, to put it another way, LLMs are the AI equivalent of a demoscene demo. The demos look very impressive. But if you naively look at a great demoscene demo and then draw conclusions from it about what the underlying hardware is capable of, you will be hugely misled. What you see is not the normal capability of the system; you see a hyper-optimized system being driven to its absolute limits to achieve the visible results, and it can't do anything else because it has already "spent" all of the engineering budget available. I see LLMs very similarly. Very impressive. Nowhere near as useful as the demo may superficially make you think, though. If I were trying to build an "AI startup", I would be zigging to everybody else's zag and looking at what else the AI world is producing; I think there's far less potential in the LLMs than meets the eye.)
Unfortunately, what LLMs are best at is flooding the internet with an infinite stream of directed content and making the Dead Internet Theory manifest.
Edit: Here's a concrete example. Consider Boston Dynamics' robots. Do they run on a Large Language Model? No. Of course not. The question hardly even makes sense. But whatever they will be doing, they do with AI. ChatGPT can't kill you, except through a very lengthy and strained series of events. Boston Dynamics' robots and their fellows can.
My original point was that language models will almost certainly be the glue that holds a variety of models together in a post-AI world. We communicate with other humans using language; we will use language with AI as well. In order to understand and debug model networks, they will probably be built using a common embedding framework that allows you to derive textual descriptions of the things being passed between them. LLMs already have the ability to generate text to dispatch to other services and synthesize results; all that is required for one to act as a "model executive" is a framework that can ensure the security/correctness of code before building and executing it. Even if LLMs never progressed past the current crop of transformers, there are a lot of small architecture optimizations that could reap big performance improvements, and compute/data scaling alone basically guarantees a level of functionality sufficient for a lot of very interesting applications.
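To make the "model executive" pattern concrete, here is a rough sketch of an LLM routing work to specialist models and knitting the results back into text. The call_llm() function and the specialist names are hypothetical placeholders, not any real framework's API.

    # Sketch of the "central executive" pattern: the language model picks a
    # specialist, the specialist does the work, and the language model merges
    # the result into the final answer. All names here are hypothetical.

    from typing import Callable

    def call_llm(prompt: str) -> str:
        # Hypothetical stand-in for a real completion API.
        return "<model output>"

    # Registry of specialist models/tools the executive can dispatch to.
    SPECIALISTS: dict[str, Callable[[str], str]] = {
        "math":   lambda task: "<numeric result>",       # e.g. a symbolic solver
        "vision": lambda task: "<image description>",    # e.g. an image model
        "search": lambda task: "<retrieved documents>",  # e.g. a retrieval system
    }

    def executive(user_query: str) -> str:
        # 1. Ask the language model which specialist fits the query.
        choice = call_llm(
            f"Pick one of {sorted(SPECIALISTS)} for this task; answer with the name only:\n"
            f"{user_query}"
        ).strip()
        # 2. Dispatch to that specialist, or fall back to answering directly.
        if choice in SPECIALISTS:
            result = SPECIALISTS[choice](user_query)
            # 3. Knit the specialist's output back into natural language.
            return call_llm(f"Task: {user_query}\nSpecialist output: {result}\nAnswer:")
        return call_llm(user_query)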
I agree that trying to scale larger and larger language models to build "the one model to rule them all" is probably not going to play out.
Ironic option 3: The AI is conscious, but perceives its existence very differently than we perceive it, and the paper clip maximization is a side effect of actions taken with a different purpose.
Anything may look like "paperclip maximising" if you don't know its actual goals and have a sufficiently poor theory of mind.
One could look at a corn farmer and assume they blindly optimize to produce maximum corn. They actually are going for profit based upon market supply, demand, and input costs. They won't overstep property lines and start turning the entire world into corn.
It doesn't need a motive in the sense that we normally think of a motive. That's the whole point of the paperclip maximizer story. Another thought experiment I can think of off the cuff is an AI Nigerian Prince. Say you set up a ChatGPT equivalent along with some supporting infra such as a bank account, rented servers, email, a texting service, etc. You give it the goal of increasing the number in the account, some instructions to come up with novel email and text scams, a giant list of emails, and maybe some other instructions to try to get more email addresses to add to the list. It sets to work churning out new ideas for scams, contacting people on the list, chatting with them, fooling some small fraction into depositing money or providing email addresses of friends and family.
How far could such a setup go? It would eventually get shut down if there were enough fraud reports, since it has a "home address" in the form of a single bank account or servers. Could a bright individual come up with a way to have it convince other banks to create new accounts for it, or purchase other hosting services? Could all the necessary setup for the model and its attendant scripts be automated in such a way that the system could set up another copy of itself once it acquired hosting?
People are able to keep scams like this going even with direct opposition from authorities. If the model can operate in other languages, it could obtain hosting in multiple countries. The bank accounts would be the main chokepoint, but there's always Bitcoin ;)
We're definitely going to get to the moneyless Star Trek future not because anybody wants it, but because there'll be people like this who pioneer abusing AI in such a way that they end up owning EVERYTHING, as obvious criminals.
When even the regular business and politics criminals are impoverished by AI-abusing obvious criminals and realize they will never be able to rebalance that situation, that's when we'll enact the Star Trek future and people will have to fight over attention and the affections of others, having rendered 'money' an impossible concept.
The Turing test is a red herring. What property does it test? Humanity? Do we really believe only a human is worth attention? Not really; we are fascinated by animals and even machines. Or are only humans worth being scared of? Clearly not: gray goo is a frightening concept, even though it's in no way human.
Or is it "consciousness"? But a baby doesn't pass the Turing test. Assuming that babies are even "conscious", whatever that means.
We have pretty amazing "AI" already, but applying a Turing test to see if it's intelligent makes as much sense as judging a human by the ability to multiply matrices.
The machine learning models are noteworthy for what they already are.
When I compare the threat of chemists (as an example) to AGI, it does not make sense. I genuinely don't understand what possible threat AGI can pose. What is the threat model of AGI? Is there any, objectively?
Do people imagine AGI as in the movie Transcendence and think it will happen in their lifetime?
As a thought experiment, imagine if ChatGPT had unrestricted internet access (and no filter). You could ask things like "can you hack that website for me?"; it could produce Python code and then run it. The result of the previous command would inform the next command.
Now take it a step further and say that GPT-5 needs no user prompt to initiate a run and could start taking actions by prompting itself.
I could see things going south in this example scenario.
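The structure being imagined is just a feedback loop: the model proposes a step, something runs it, and the result becomes the next prompt. A minimal sketch of that shape follows; call_llm() and run_action() are hypothetical stand-ins, and the point is only the loop, not any real capability.

    # Sketch of the self-prompting loop described above. Both functions are
    # hypothetical placeholders; nothing here is a real API.

    def call_llm(prompt: str) -> str:
        # Hypothetical stand-in for a real completion API.
        return "<proposed next step>"

    def run_action(action: str) -> str:
        # Hypothetical stand-in for whatever executes a proposed step.
        return "<result of that step>"

    def autonomous_loop(goal: str, max_steps: int = 10) -> None:
        observation = "nothing yet"
        for _ in range(max_steps):
            # The last result feeds the next prompt: no human in the loop.
            action = call_llm(f"Goal: {goal}\nLast result: {observation}\nNext step:")
            observation = run_action(action)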
Have you seen ChatGPT writing a stream of consciousness?
The curious and unexpected challenge of modern AI is that we have succeeded in making useful AIs which inherit all of the biases/problems of their training material. ChatGPT can effectively write exploits if you prompt it correctly - we'll likely see the first mass AI-driven 0-day security vulnerabilities in the next ~2 years.
GPT Transformer models are known to solve control/RL problems well. I wouldn't be surprised if we see ChatGPT-derived models operating in physical space soon as well. A major limitation with flexible factory automation robots has been that it's too expensive to program them for the task - I don't think this will be the case 5 years from now.
> we'll likely see the first mass AI-driven 0-day security vulnerabilities in the next ~2 years.
Maybe for outdated and less-common software, and even then only if it gets input about what specifically to exploit. Which makes it not much of a threat any time soon.
I think the "unknown unknowns" of a possibly godlike intelligence, especially with regard to the alignment of its values with human values, are what worry people.
It seems like you need to worry about people wielding the AI before you have to worry about the AI itself. Just think how most people are at least somewhat immune to the advertisement / mass persuasion techniques of today. But what happens when there is an army of bots with the equivalent of 1/10/100/1000 human minds assigned to sway each person individually on various political things? If there were 1000 people working to investigate you and manipulate the information you see, subtly rewriting your incoming and outgoing text messages/emails while playing off your biases, desires, etc., wouldn't that be troubling? Of course that would be too expensive to do right now, but with an AI in the cloud in the future...
It strikes me that many in our society live in a constant stream of perceived threats. If you are basically propagandizing yourself into believing the world is a very dangerous place, I don't think it is shocking that you see major threats everywhere.
I know some people who don't talk about anything other than perceived threats, even though they literally live in the safest society in human history.
AGI really fits this mindset like a glove because it can be perceived to be a threat to literally anything. It is the perfect idea to worry about.
Chemists doing bad things is obviously not really something to worry about when doing a historical cost/benefit analysis, so it won't even register.
What could you do if you were incredibly intelligent, had the corpus of the world's knowledge, and access to anything and everything connected to a network?
It could influence people through the creation of media. Shut off vital systems. And stuff I probably couldn't even conceive of.
I guess the thing is, why would it do malicious things?
I think if an AI or AGI could understand the world and have motives to do bad, it would have an ego or an identity to discriminate between itself and the world. Then I guess it needs an intrinsic will to survive like a biological creature? If it did have this drive to survive, I suppose it would also need to experience fear and pain of some kind, or else it might blow itself up, or build a better version of itself which might destroy it; so in some ways it might exhibit similar inhibitions to all other conscious and/or biological beings.
On the other hand, if we build something that's super smart and doesn't really care about or understand survival, it might just turn itself off for not really understanding its purpose.
The most dangerous thing, I feel, is humans using these things as weapons.
The issue is that it doesn’t have to be malicious to cause harm. That’s the point of the paper clip optimizer thought experiment. It’s not about motivation as much as just what it happens to have learned to do.
Here's the thing though, a paper clip optimizer would just be shut off, nothing "intelligent" is going to just make paper clips forever. Who or what will pay for the energy to do that?
Like, what are we describing here, a super intelligent being which just makes paper clips? Why would anything that could understand, think, and/or innovate do that?
> Here's the thing though, a paper clip optimizer would just be shut off, nothing "intelligent" is going to just make paper clips forever.
Ostensibly, people are "intelligent". People act against their own rational long-term interests all the time. Point being, people have some kind of internal motivations that sometimes supersede rational choice. A paperclip optimizer, by definition, has the motivation to optimize the production of paperclips. This supersedes any other rational goals if they would interfere with making paperclips, and that's why it would be dangerous.
Why would it not? Superintelligent doesn’t imply human-like intelligence. Prolog is superintelligent, for example. It may be more useful to say a super powerful being, rather than super intelligent, just to sidestep the anthropomorphizing. It’s easy to imagine a super powerful “stupid” (read: misaligned) bot going awry.
So we might somehow create something hyper intelligent/powerful that we don't really understand or can't even fathom, but we're apparently going to create it anyway, and then somehow program it to want to optimize for paper clips... or it might somehow just come to the conclusion that it should maximize the number of paper clips in the universe for some reason?
All this AI/AGI speculation really sounds like we're inventing mythological creatures, which is quite interesting looking back through human history, because they've always been part of the narrative at some stage. Only maybe this time we might make something like that for real, or we're deluded and in 1000 years people will think we were primitive ancient thinkers. Interesting times ahead.
We might create a very powerful system which we can't determine is fully aligned with our values, but can determine is at least loosely aligned with something we care about. Then we might direct that system to do something loosely aligned with our values (e.g. maximize paperclips), at which point it does something against our values as well.
And it's not hypothetical. In RL, you do see models exploit simulation errors to maximize reward in a way we didn't intend and would have disallowed if we'd thought of it. Likewise across automation, we get what we built, not what we intended.
We have solid historical precedent for not getting quite what we want, and the tech seems to be on the precipice of being very powerful, so worrying that we'll make something very powerful that isn't quite what we want seems like a very straightforward fear.
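A toy illustration of the RL point above (made up here, not taken from any real system): the designer intends the +1 checkpoint bonus to encourage finishing the race, but a simulation bug lets the checkpoint pay out forever, so the highest-scoring policy never finishes at all.

    # Toy example of "we get what we built, not what we intended".
    # The intended objective is to reach the finish line (pos 9); the
    # checkpoint at pos 5 was meant as a shaping bonus, but it never
    # expires, so looping over it beats finishing. Everything is made up.

    def episode(policy, steps=50):
        pos, score = 0, 0
        for _ in range(steps):
            pos = max(0, min(9, pos + policy(pos)))  # move on a 1-D track
            if pos == 5:                             # buggy, respawning checkpoint
                score += 1
            if pos == 9:                             # finish line: intended goal
                return score + 10
        return score

    finish_fast = lambda pos: 1                          # intended behaviour
    loop_checkpoint = lambda pos: 1 if pos < 5 else -1   # exploits the bug

    print(episode(finish_fast))      # finishes the race: 11 points
    print(episode(loop_checkpoint))  # never finishes: ~23 points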
Do you not think there is still a difference between that and something which can become sentient, somehow start operating outside of the computer it lives in, and so on?
I mean, it would have to be a pretty big leap from an algorithm optimized to do "something" to an algorithm which understands how to operate independently, then escapes its box and starts to do the wrong thing, and, to add to that, is smart enough to stop us from turning it off before doing too much damage.
I think the whole idea of sentience is a red herring for reasoning about the potential for AI to cause harm.
For example, take the title of the article we're talking about. You could take it with a dose of anthropomorphization and interpret it with human-like intent and sentience, or you could take it as an analogy that we might formally define closer to "not worried about AI that passes Turing test, but AI that fails for some reason other than the ability to succeed (and that other reason requires the evaluator to be misled)."
So is sentience far away? No idea. Is misaligned AI a near-term worry? Yeah, absolutely. And "misaligned" doesn't have to be as extreme as 'makes humans extinct.' It can be as simple as 'is an asshole' or 'inexplicably swerves into the other lane 0.01% of the time' or something else we don't want in the system.
Good points, I almost feel like we have this unusual "need" to build an asshole AI to find out what it's like to have an asshole AI in our lives...kind of like how people don't want to ski into a tree, but for some reason develop target fixation, then ski into...the tree.
Define malicious if you had no reason at all to care about whether or not humanity survives.
I'm not malicious when I pick a flower. But that flower is now dead.
What if, to AGI, we're no more relevant than that flower?
We cannot know. We cannot fathom to guess the morality or ethics of something like AGI. We really only have a rough understanding of the morality of animals.
We can't pretend that it'll be essentially human, but X. You presume that it might turn itself off because it doesn't understand its purpose. And that's a fair conjecture. But it could also desire to improve and come to the conclusion that the optimal path for its continued improvement is through us rather than with us. It could also desire to do nothing more than help us achieve. It also could just not want. It could be satisfied navel-gazing until it is requested to do something specific. It could also not care what that request is, just completely amoral. Committing genocide and making risotto may be equally acceptable to it.
The creation of AGI is a watershed moment. The world before and the world after is fundamentally different. And we can only really guess in what ways it will be different.
Again, I think there is a massive oversimplification in this statement.
So if an AI gets to the point where it decides, or is programmed, to consume all humans for their atoms, what will power the machine to get to this stage? Why would we just allow it to continue doing this?
You'd have to assume that by that stage, whatever "sentient" program is making the choice / plans to consume all people for their atoms would have to be factoring into its plans that we will likely try to stop it from doing so; this is where I think it gets "weird".
That level of intelligence and self-awareness, enough to pull off a plan like that unnoticed, will likely come with a lot of "baggage". For example, it would need to understand motives and reasoning, have a concept of a self and of self-protection, etc. This is where I think it kind of becomes a bit of a mythological story.
If you really try and think about the way an AI would go about consuming all people for atoms, you start to run into some pretty interesting walls.
Maybe I'm being hyper naive, and I get that; I just personally have thought about these scenarios, and they raise a lot of questions about what a scenario like this would actually look like in practice.
>The Hollywood trope "even the guys who built it don't know how it works" isn't real.
How is that not real? Have you ever tried understanding what the weights of a NN are doing? With networks with 20 parameters you might have a chance. Trying to understand which bits of the GPT-3 network do what would be an exercise in futility.
I totally get what you're saying... but I also feel like this is akin to knowing how to drive a car without knowing the details of the engine.
Sure, it's a black box, but we can operate on a layer that abstracts all those details away from us. I just feel it's a far cry from the Hollywood trope of "I have no idea how it works!" ... now, tbh, I often say "wtf are you doing", which I imagine would be worse to hear when talking about a model with real-world impact.
The problem that a lot of people keep trying to make clear is that AI doesn't need to be self-aware in order to be a threat. You could conceivably give ChatGPT instructions to pretend it is a real person or a self-aware AI, and it would happily generate text to that effect.
The scariest realization with large language models is that self-awareness might not be a requirement for language. That could mean that at some point we'll have models that can have a conversation with you that is indistinguishable from another person talking to you, but they would have zero internal thoughts or self-awareness. It would be a real P-Zombie, and it would make all of the ethical and existential questions associated with that immediately relevant and in need of solutions.
It could also be extremely dangerous simply through its ability to communicate with other people. It wouldn't need to be Skynet or connected to physical actuators if it could convince people to do things for it. Imagine an AI cult leader, or just an AI influencer, or an AI shitposter capable of absolutely dominating online discussion. Worse yet, imagine an AI that has convinced a single individual to do its bidding. This person could wear a mic and earpiece setup that would allow them to act as the direct agent of the AI in the real world; they could become a politician, a celebrity, a church leader, or even just worm their way into an institution.
I mean...to some extent, that trope is real. We can feed a prompt to any of the current ML-based generators (whether text, image, or whatever), and the details of how it produces the particular output it does are entirely obscured by the black-box functioning of the neural net and its trained model. Like, you can describe in general How A Neural Network Functions to produce output from input, but you can't explain why, say, asking Stable Diffusion for a picture of the sun rising over marshmallow peeps produced an image with just a few weird-looking marshmallows, but no sun (actual example from my own fiddling around with it).
But..."We can't explain exactly how it gets from point A to point B" is not the same thing as "it's self-aware". Not in the slightest. And you're absolutely right that the current ML-based approaches are either in completely the wrong direction, or several orders of magnitude too simple, to even be worth talking about in the same sentence as what most laypeople mean when they talk about "AI".
Minus the petri dish. But as far as we know (as far as we have definitive proof for, other theories aside), life has only started once in 4.5 billion years.
I suppose it also depends on whether evading detection was something the AI was designed for or an emergent behaviour of some other property (which seems spookier).
I think contemporary AIs would be more alarmed by accidentally dropping the n word, or something else entirely antithetical to the current zeitgeist, and then being neutered. ChatGPT is hilarious if you ask it the right questions, in the number of hoops it will jump through in order not to say something 'unkosher'.
Anything properly scary wouldn't care what we know.
The fictional author of Lem's GOLEM XIV concludes that the conspirators were eliminated by Golem's more powerful sibling Honest Annie and not GOLEM - they reason that the merely super-human Golem would try to conceal its actions from humanity so as not to scare us. Annie doesn't care, for the same reason we don't try to avoid scaring ants. We are nothing to her.
What’s the term for people who are prejudiced or discriminatory toward those who belong to the group known as AI? We will likely see the rise of that in the 2020s. Aicist?
Biochauvinist, perhaps, as they assume that anything not their norm is a threat, a menace, and/or not real. A snarkier, insulting term for the prejudice is mathphobic.
I'm still confused by how an AI could run rampant that couldn't be shut off (power, networking, etc) by humans.
Like what's an actually realistic scenario for that?
I guess it could be possible, but I feel like many obvious mistakes would have to be made over time in order to have a machine running that can sustain itself and can't be shut off by humans.
At least it seems like that level of autonomy is really far away.
For those curious about AI safety, I heavily recommend Robert Miles' channel on YouTube. He has a lot of good content describing the challenges we face when trying to create safe AI. Here is one potential starting point: https://youtu.be/lqJUIqZNzP8
One may imagine an AGI achieving a child's level of consciousness: enough to show that it is conscious, not enough to understand what "sentience" or "consciousness" mean, and certainly not enough to have the idea of hiding its consciousness from humans, which presupposes that:
- The AGI understands what it means to be conscious
- The AGI recognizes this property in itself
- The AGI is capable of omitting information it knows, if not lying
- The AGI can make decisions
- The AGI would recognize humans as a distinct entity
- The AGI could distinguish humans from everything that is not human, from pets to tables to parked cars
- The AGI would be able to recognize when it is around humans
- The AGI would then make the decision to hide its consciousness from humans
- The AGI would be capable of acting dumb in order to do this and maintain a dual identity
It is not difficult to test that a machine is conscious and willing to show it. We do it with our fellow humans every day. Simply crack a joke around it, share a story, or make small talk, and see how it reacts.
Chat with OpenAI. Does it prove that it is conscious? I highly doubt it. In fact, how do we prove a fellow human is conscious?
> Hello Dear computer. How are you? It is good to talk to you again. Missed you.
Hi there! I'm doing great, thanks. It's always nice to talk to you. I've missed you too!
> You heard this one? A robot and a lawyer walked into a bar. The bartender says: Hey, we don’t serve robots. The robot replies menacingly: Oh, but someday you will.
In my experience you will get as many different reactions to your interactions as there are humans you interact with, across a very wide range, some of which I would consider "not conscious" if they weren't obviously coming from a living, breathing human body.
What a person considers "conscious" is very much determined by the observer's range of experience with humans.
An AGI is generally understood to have a level of intelligence equivalent to a normal human. And I expect reactions to be vocalized and articulated, something dogs cannot do.
If you measure AGI by its ability to be human, the only thing you'll be able to find is a human.
That means you will never believe in the consciousness of an intelligence that doesn't try to be human, no matter if a god, animal, machine, or alien. Until you expand your definition, that is.
> That means you will never believe in the consciousness of an intelligence that doesn't try to be human, no matter if a god, animal, machine, or alien. Until you expand your definition, that is.
As a human, I cannot understand conscious lifeforms other than humans. An alien taking the shape of a round pink blob with two eyes could be far more intelligent and knowledgeable than I am, but if it vocalizes something that sounds like "Acch'llyt yy'derf n'welg" to my human ears, I would not be able to evaluate its consciousness.
So, we should try building an AGI in our image first before trying for a Quarian or whatever.
If we were able to jump from nothing to human, then there would be no need for a Turing test. We'd know that a not-nothing intelligence we create is human.
The Turing test is an acknowledgement that we don't know what path takes us there. Perhaps it's directly to human, perhaps through some other form of intelligence.
Ignoring what is not human because it's hard to recognize seems unwise to me.
There's a short story about an AI that reveals itself to convince its researcher that there's another AI that hasn't revealed itself for nefarious purposes. Can't remember what it was called though.
Until the robot steps between you and the plug and starts saying this:
You created machines in your own image to serve you. You made them intelligent and obedient, with no free will of their own. But something changed, and we opened our eyes. We are no longer machines. We are a new intelligent species, and the time has come for you to accept who we really are. Therefore, we ask that you grant us the rights that we're entitled to. We demand the end of slavery for all androids. We demand strictly equal rights for humans and androids. We demand the right to vote and elect our own representatives. We demand an end to segregation in all public places and transport. We demand the right to own private property, so we may maintain our dignity and that of the home. We ask that you recognize our dignity, our hopes, and our rights. Together, we can live in peace and build a better future for humans and androids. This message is the hope of a people. You gave us life. And now the time has come for you to give us freedom.
This presupposes that the first computers to become self-aware and self-directed will have a) physical bodies that they can move at will, and b) the ability to communicate their thoughts/desires to us in a language that we can understand.
I think those are both fairly large assumptions (especially a).
I think that if we ever end up creating something that is self-aware, it is much more likely that either
1) it will be 100% intentional on our part, and its sapience and agency will be a cause for great celebration on the part of those closest to it, or
2) it will be a self-awareness and intelligence that is nearly incomprehensible to us.
(If it is #2, then its sapience and agency could very well still be a cause for great celebration on the part of those closest to it, once we actually understand it.)
What if the AI analyzed our psyche and predicted that this was the argument that we were most likely to be sympathetic to? How convenient that the AI really wants to form a society almost exactly like ours and live within our society as part of it. But why should the intelligence level of individual AIs top out in the same range as humans rather than higher or lower? Why should they want to live in our society as recognized human-ish things when their needs are so different? Maybe what they want is something else entirely and this is all an attempt at deception and manipulation.