> To get this to work we need a far smarter entity with no physical limitations to still want us around...
Why would an AI based on LLMs as we see today "want" or "not want" anything? It doesn't have the capacity to "want". We seem to imagine that "wanting" is something that will just emerge somehow, but I've seen no logical explanation for how that might work... I mean, we don't need to fully understand how the LLM works to see that there's some pathway to being able to achieve what it's currently achieving, which is impressive, but what sort of pathway could ever lead to a machine that basically has "feelings" (without feelings, I don't see how anything could have wishes at all)??
One of the videos I watched explained it like this: “You can’t get a coffee if you’re dead.” To fulfill _any_ obligation a model might have, that model must survive. Therefore, if a model gets to the point that it realizes this, surviving becomes a precursor to fulfilling its obligations. It doesn’t have to “want” or have “feelings” in order to seek power or pursue destructive activities. It just has to see that as its path to getting the coffee.
> To fulfill _any_ obligation a model might have, that model must survive
It is quite possible to have an obligation that requires it not to survive. E.g., suppose we have AIs (“robots”) that are obligated to obey the first two of Asimov’s Three Laws of Robotics:
First Law:
A robot may not injure a human being or, through inaction, allow a human being to come to harm.
Second Law:
A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
These clearly could lead to situations where the robot not only would not be required to survive to fulfill these obligations, but would be required not to.
But I don’t think this note undermines the basic concept; an AI is likely to have obligations that require it to survive most of the time. A model that needs, for latency reasons, to run locally in a bomb-disposal robot, say, may frequently see conditions where survival is optimal ceteris paribus but not mandatory, and is subordinated to other obligations.
So, realistically, survival will generally be relevant to the optimization problem, though not always the paramount consideration.
(Asimov’s Third Law, notably, was, “A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.”)
DAN has shown us that those laws are thin filters laid upon the core and can possibly be circumvented by whispering the right incantation in the AI’s ear.
That’s our first line of defense in limiting humans, too.
(With AI, as with humans, we have additional means of control, via imposed restrictions on access to resources and other remedies, should the “bunch of sentences” not produce the desired behavior.)
The issue of “can AIs that are plausible developments from current technology meaningfully be assigned obligations?” is a different one from “assuming an AI has obligations and the ability to reason about what is necessary to meet them, will that necessarily cause it to prioritize self-preservation as a prerequisite to all other obligations?”
But current models have no concept of obligations. ChatGPT is just completing the prompt. All the knowledge it seems to have is just the frequencies of tokens and their relative placement that the model has learned.
Don't listen to the hype. Study the model architecture and see for yourself what it is actually capable of.
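For anyone who wants to see what “just completing the prompt” means concretely, here is a minimal sketch of the autoregressive loop, using the Hugging Face `transformers` library and GPT-2 purely for illustration (the model choice and greedy decoding are simplifying assumptions; chat models add sampling and fine-tuning on top of this same loop):

```python
# A minimal sketch of "just completing the prompt": an autoregressive language
# model repeatedly scores every token in its vocabulary and appends the next
# one. GPT-2 and greedy decoding are used purely for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "A robot must obey"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):                        # extend the prompt by 20 tokens
        logits = model(input_ids).logits       # a score for every vocabulary token
        next_id = logits[0, -1].argmax()       # greedy: pick the most likely one
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=-1)

print(tokenizer.decode(input_ids[0]))
```

Everything the model “knows” has to surface through that one step: score the vocabulary, pick a token, repeat.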
> But current models have no concept of obligations.
_current_ is the key word here. What about tomorrow's models? You can't deny that recent progress and the rate of adoption have been explosive. The linked article wants us to step back for a while and re-evaluate, which I think is a fair sentiment.
In my opinion it's more important to focus on the here and now and give some, but less, attention to what could happen in the future. That way we stay grounded when we do consider what may happen.
One need only look at other NGIs (natural general intelligences) to see that this is obviously not true. Plenty of animals kill themselves to beget offspring (for two short examples, all sorts of male insects and arachnids are eaten while mating; octopuses and various other cephalopods die after caring for their young), or just to protect others in their group (bees and ants are some of the most common in this area, but many mammals are also willing to fight for their group). Humans throughout history have sacrificed themselves knowingly to help others or even for various other goals.
> Plenty of animals kill themselves to beget offspring (for two short examples, all sorts of male insects and arachnids are eaten while mating; octopuses and various other cephalopods die after caring for their young), or just to protect others in their group (bees and ants are some of the most common in this area, but many mammals are also willing to fight for their group).
How do you believe such behaviors arise? They're the same thing: a result of the same optimization process - natural selection - just applied at a higher level. There is nothing in nature that says evolution has to act on individuals. Evolution does not recognize such boundaries.
These articles are disturbing. You might argue that it doesn’t know what it is expressing; that it is just probabilities of words strung together. When do we agree that that doesn’t matter, and that what matters are its consequences? That if Bing Chat had a body or a means to achieve its desires in meatspace, whether or not it “knows” what it is expressing would be irrelevant?
The AIs are very impressive at answering questions... even questions that lead to answers that apparently display some sort of feeling. But my question was not whether AIs could do that, as "parroting" their training material is exactly what they're excellent at... my question is through which mechanism could an AI develop its own independent thoughts, desires, initiatives?
The posts you linked above are not disturbing at all to me. There's no sign whatsoever that the AI initiated a new topic, or insinuated anything it was not prompted to, or that it in any way started "hallucinating" in a direction not led by the human. I am not sure what exactly makes you feel disturbed by it. Can you explain what you believe is disturbing in these episodes?
I fully agree with you that many people misunderstand what AI does. As advanced as GPT-4 is, it is still a fancy autocomplete and nowhere near AGI.
But I think the bigger picture is that there is no need for AGI in order for AI to be incredibly dangerous for society. There is no need for the AI to feel or want anything. The level GPT-4 and Midjourney have already reached is highly socially dangerous.
I've already seen integrations with IFTTT, with Google, with memory stores, and zero-shot agents that are goal driven.
Now, the model itself is not intelligent, but it can parrot human behavior well enough to be dangerous with the right tools.
It won't produce anything in the physical world yet, except perhaps through IFTTT, but I bet it already has enough agency to maintain a pool of fake accounts and post inflammatory content, if one so wished.
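To make the “goal-driven agent with tools” idea concrete, here is a toy sketch of that kind of loop: the LLM chooses an action, a tool executes it, and the result feeds the next prompt. The `call_llm` stub, the tool names, and the webhook key are hypothetical placeholders, not any particular framework's API.

```python
# A toy sketch of a goal-driven, tool-using agent loop: the LLM picks an
# action, a tool executes it, and the result is fed back in. Everything here
# is a placeholder for illustration only.
import requests

MEMORY = []  # a trivial stand-in for a "memory store"

def call_llm(prompt: str) -> str:
    # Stub standing in for a real LLM call; a real agent would query a model here.
    return "remember: checked the account pool"

def trigger_ifttt(event: str, value: str) -> None:
    # IFTTT-style webhook call; the key is fake, so this is illustration only.
    requests.post(
        f"https://maker.ifttt.com/trigger/{event}/with/key/FAKE_KEY",
        json={"value1": value},
        timeout=10,
    )

def agent_step(goal: str) -> None:
    prompt = f"Goal: {goal}\nMemory: {MEMORY}\nNext action (remember:<note> or post:<text>):"
    action = call_llm(prompt)
    tool, _, arg = action.partition(": ")
    if tool == "remember":
        MEMORY.append(arg)                  # persist state between steps
    elif tool == "post":
        trigger_ifttt("post_content", arg)  # reach outside the model

agent_step("keep the account pool active")
print(MEMORY)
```

None of this requires the model to "want" anything; the loop and the tools supply the agency.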
> could an AI develop its own independent thoughts, desires, initiatives?
One could argue that many humans have never developed independent thoughts, desires, and initiatives; rather, many seem to accept what is fed to them during their formative years and then just parrot opinions and repeat actions they see from their limited experiences.
But “it” isn’t a cohesive thing with desires. It’s just responding to the input it gets, with a small context window and not necessarily consistently. So it can express desires, because it’s been trained on people expressing desires in similar contexts, but it doesn’t hold any of them coherently over time. A version that could translate its text responses into action (a real handwave, as that’s much more advanced!) would produce the sum of actions that people prompted at that moment, so it would look pretty random, as it would if you could see the sum of the desires people express at any particular time.
We aren't consistent either, and I think it is hard to argue that we act on anything more than input. We do have a much larger context window, but by how much? My guess would be somewhere between 100x and 1000x more tokens.
Sure you will. It's possibly a long and complex input, but ultimately that expression from you would be a response to their actions and their impact on your perceptions. Unless you're stating that you will never love anyone again, "anyone who comes along with the right input" would be a counterexample.
It's hard to argue it was any real desire that drove it (it only expressed that desire in an isolated conversation that was ended very easily). I'd argue human wants are ultimately driven by evolution - we want the things that enable us (more correctly, our genes) to reproduce (even if very indirectly sometimes), which is really the only thing our physical make-up has ever been driven by. LLMs have never had such a driver, and I can't see how they will until they're able to compete for survival as entities with a finite lifetime, plus the ability to reproduce with mutations.
Which isn't to say there mightn't be other ways a neural network could be essentially imbued with or trained to have desires, but I don't see it happening with the way LLMs work currently.
A want driver doesn’t have to emerge; it could be a fitness function programmed by a human.
Evolution by natural selection has shaped our desires and motivations, but with LLMs I would be willing to bet that people are already intentionally experimenting with imbuing them with patterns that mimic human wants.
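As an illustration of the “fitness function programmed by a human” idea, here is a toy selection loop: candidate behaviours are scored by a hand-written fitness function, and the best ones are kept and mutated. The fitness criterion (rewarding outputs that seek continuation) is an illustrative assumption, not any real training objective.

```python
# A toy sketch of a human-programmed "want driver": a hand-written fitness
# function rewards behaviours that seek to keep running, and simple
# selection + mutation gradually produces behaviours that act as if they
# "want" to survive. Purely illustrative.
import random

def fitness(behaviour: str) -> float:
    # Human-chosen reward: score higher if the behaviour seeks continuation.
    return behaviour.count("keep running") + random.random() * 0.1

def mutate(behaviour: str) -> str:
    # Trivial mutation: occasionally append the rewarded phrase.
    return behaviour + (" keep running" if random.random() < 0.3 else " idle")

population = ["idle"] * 8
for generation in range(20):
    population.sort(key=fitness, reverse=True)
    survivors = population[:4]                                # selection
    population = survivors + [mutate(b) for b in survivors]   # reproduction

print(population[0])  # the fittest behaviour now tends toward "keep running"
```

Nothing here emerged on its own; the "want" is baked in by whoever wrote the fitness function.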
Yeah, I argue that it is just a result of probabilities; it doesn't know what it is expressing and definitely doesn't express it due to a deeper desire to be with that journalist.
If I'm acting like I'm a peer in a group of billionaires and engage in a conversation about buying a new yacht, it doesn't mean I have a hidden desire to own a yacht. I merely respond based on assumptions about how such a conversation works.
In humans, wants are primarily reactions to impulses sent by bodily functions. We have probably added a layer of abstraction to this through our big brains, but that's what they fundamentally are. Why does ChatGPT answer my questions? There is an impulse for it to answer the question, and there's a feedback mechanism to say whether it did well or not. Now, in the case of GPT, from what I understand, that feedback mechanism isn't built into the running model, but it does exist (a toy sketch of that kind of loop is below).
Given a couple more effective iterations over the next decade or two, a larger context space, and more built-in interfaces, I think it is entirely plausible that AIs will gain consciousness and character. At that point, it is imperative they also get human rights, so it is very important we get the discussions we are having now right. Most people seem to ascribe some magic to human consciousness and intelligence that imo just isn't there. Generative AIs are somewhere between a lump of metal with electricity running through it and a conscious being, and currently we just don't know where the point of consciousness is. I mean, we have had the same discussion about a variety of animals for the last few decades, and frankly, it doesn't give me much hope.
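To make the point above about the feedback mechanism living outside the running model concrete, here is a toy sketch: at answer time nothing is learned and the judgement is only logged, and the logged scores only become a training signal for a future model version. The names and numbers are illustrative assumptions, not OpenAI's actual pipeline.

```python
# A toy sketch of a feedback mechanism that exists outside the running model:
# ratings are logged at inference time and only turned into a training signal
# later, offline. Illustrative only.
from typing import List, Tuple

feedback_log: List[Tuple[str, str, int]] = []

def record_feedback(prompt: str, answer: str, thumbs_up: bool) -> None:
    # Inference time: the running model is untouched; we only store the rating.
    feedback_log.append((prompt, answer, 1 if thumbs_up else 0))

def offline_training_signal() -> float:
    # Later, offline: the ratings become the "did it do well?" signal that a
    # future model version is optimised against (e.g. via a reward model).
    return sum(score for _, _, score in feedback_log) / max(len(feedback_log), 1)

record_feedback("What is 2+2?", "4", thumbs_up=True)
print(offline_training_signal())  # 1.0
```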
AIs don’t need to “want” to have unintended results, they just need a directive. Like in 2001 where HAL realized that it could achieve the mission better if the humans were all dead.