> To get this to work we need a far smarter entity with no physical limitations to still want us around...
Why would an AI based on LLMs as we see today "want" or "not want" anything? It doesn't have the capacity to "want". We seem to imagine that "wanting" is something that will just emerge somehow, but I've seen no logical explanation for how that might work... I mean, we don't need to fully understand how the LLM works to see that there's some pathway to being able to achieve what it's currently achieving, which is impressive, but what sort of pathway could ever lead to a machine that basically has "feelings" (without feelings, I don't see how anything could have wishes at all)??
One of the videos I watched explained it like this: “You can’t get a coffee if you’re dead.” To fulfill _any_ obligation a model might have, that model must survive. Therefore, if a model gets to the point that it realizes this, surviving becomes a precursor to fulfilling its obligations. It doesn’t have to “want” or have “feelings” in order to seek power or pursue destructive activities. It just has to see that as its path to getting the coffee.
> To fulfill _any_ obligation a model might have, that model must survive
It is quite possible to have an obligation that requires it not to survive. E.g., suppose we have AIs (“robots”) that are obligated to obey the first two of Asimov’s Three Laws of Robotics:
First Law:
A robot may not injure a human being or, through inaction, allow a human being to come to harm.
Second Law:
A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
These clearly could lead to situations where the robot not only would not be required to survive to fulfill these obligations, but would be required not to.
But I don’t think this note undermines the basic concept; an AI is likely to have obligations that require it to survive most of the time. A model that needs, for latency reasons, to run locally in a bomb-disposal robot, say, may frequently see conditions where survival is optimal ceteris paribus but not mandatory, and is subordinated to other obligations.
So, realistically, survival will generally be relevant to the optimization problem, though not always the paramount consideration.
(Asimov’s Third Law, notably, was, “A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.”)
DAN has shown us that those laws are thin filters laid upon the core and can possibly be circumvented by whispering the right incantation in the AI’s ear.
That’s our first line of defense in limiting humans, too.
(With AI, as with humans, we have additional means of control, via imposed restrictions on access to resources and other remedies, should the “bunch of sentences” not produce the desired behavior.)
The issue of “can AIs that are plausible developments from current technology meaningfully be assigned obligations?” is a different one from “assuming an AI has obligations and the ability to reason about what is necessary to meet them, will that necessarily cause it to prioritize self-preservation as a prerequisite to all other obligations?”
But current models have no concept of obligations. ChatGPT is just completing the prompt. All the knowledge it seems to have is just the frequencies of tokens and their relative placement that the model has learned.
Don't listen to the hype. Study the model architecture and see for yourself what it is actually capable of.
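For anyone who wants to see what “just completing the prompt” means concretely, here is a minimal sketch of the autoregressive loop, using the Hugging Face `transformers` library and GPT-2 purely for illustration (the model choice and greedy decoding are simplifying assumptions; chat models add sampling and fine-tuning on top of this same loop):

```python
# A minimal sketch of "just completing the prompt": an autoregressive language
# model repeatedly scores every token in its vocabulary and appends the next
# one. GPT-2 and greedy decoding are used purely for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "A robot must obey"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):                        # extend the prompt by 20 tokens
        logits = model(input_ids).logits       # a score for every vocabulary token
        next_id = logits[0, -1].argmax()       # greedy: pick the most likely one
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=-1)

print(tokenizer.decode(input_ids[0]))
```

Everything the model “knows” has to surface through that one step: score the vocabulary, pick a token, repeat.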
> But current models have no concept of obligations.
_current_ is the key word here. What about tomorrow's models? You can't deny that recent progress and the rate of adoption have been explosive. The linked article wants us to step back for a while and re-evaluate, which I think is a fair sentiment.
In my opinion it's more important to focus on the here and now and give some, but less, attention to what could happen in the future. That way we stay grounded when we do consider what may happen.
One need only look at other NGIs (natural general intelligences) to see that this is obviously not true. Plenty of animals kill themselves to beget offspring (for two short examples, all sorts of male insects and arachnids are eaten while mating; octopuses and various other cephalopods die after caring for their young), or just to protect others in their group (bees and ants are some of the most common in this area, but many mammals are also willing to fight for their group). Humans throughout history have sacrificed themselves knowingly to help others or even for various other goals.
> Plenty of animals kill themselves to beget offspring (for two short examples, all sorts of male insects and arachnids are eaten while mating; octopuses and various other cephalopods die after caring for their young), or just to protect others in their group (bees and ants are some of the most common in this area, but many mammals are also willing to fight for their group).
How do you believe such behaviors arise? They're the same thing: a result of the same optimization process - natural selection - just applied at a higher level. There is nothing in nature that says evolution has to act on individuals. Evolution does not recognize such boundaries.
These articles are disturbing. You might argue that it doesn’t know what it is expressing; that it is just probabilities of words strung together. When do we agree that that doesn’t matter, and that what matters are its consequences? That if Bing Chat had a body or a means to achieve its desires in meatspace, whether or not it “knows” what it is expressing would be irrelevant?
The AIs are very impressive at answering questions... even questions that lead to answers that apparently display some sort of feeling. But my question was not whether AIs could do that, as "parroting" their training material is exactly what they're excellent at... my question is through which mechanism could an AI develop its own independent thoughts, desires, initiatives?
The posts you linked above are not disturbing at all to me. There's no sign whatsoever that the AI initiated a new topic, or insinuated anything it was not prompted to, or that it in any way started "hallucinating" in a direction not led by the human. I am not sure what exactly makes you feel disturbed by it. Can you explain what you believe is disturbing in these episodes?
I fully agree with you that many people misunderstand what AI does. As advanced as GPT-4 is, it is still a fancy autocomplete and nowhere near AGI.
But I think the bigger picture is that there is no need for AGI in order for AI to be incredibly dangerous for society. There is no need for the AI to feel or want anything. The level GPT-4 and Midjourney have already reached is highly socially dangerous.
I've already seen integrations with IFTTT, with Google, with memory stores, and zero-shot agents that are goal driven.
Now, the model itself is not intelligent, but it can parrot human behavior well enough to be dangerous with the right tools.
It won't produce anything in the physical world yet, except perhaps through IFTTT, but I bet it already has enough agency to maintain a pool of fake accounts and post inflammatory content, if one so wished.
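To make the “goal-driven agent with tools” idea concrete, here is a toy sketch of that kind of loop: the LLM chooses an action, a tool executes it, and the result feeds the next prompt. The `call_llm` stub, the tool names, and the webhook key are hypothetical placeholders, not any particular framework's API.

```python
# A toy sketch of a goal-driven, tool-using agent loop: the LLM picks an
# action, a tool executes it, and the result is fed back in. Everything here
# is a placeholder for illustration only.
import requests

MEMORY = []  # a trivial stand-in for a "memory store"

def call_llm(prompt: str) -> str:
    # Stub standing in for a real LLM call; a real agent would query a model here.
    return "remember: checked the account pool"

def trigger_ifttt(event: str, value: str) -> None:
    # IFTTT-style webhook call; the key is fake, so this is illustration only.
    requests.post(
        f"https://maker.ifttt.com/trigger/{event}/with/key/FAKE_KEY",
        json={"value1": value},
        timeout=10,
    )

def agent_step(goal: str) -> None:
    prompt = f"Goal: {goal}\nMemory: {MEMORY}\nNext action (remember:<note> or post:<text>):"
    action = call_llm(prompt)
    tool, _, arg = action.partition(": ")
    if tool == "remember":
        MEMORY.append(arg)                  # persist state between steps
    elif tool == "post":
        trigger_ifttt("post_content", arg)  # reach outside the model

agent_step("keep the account pool active")
print(MEMORY)
```

None of this requires the model to "want" anything; the loop and the tools supply the agency.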
> could an AI develop its own independent thoughts, desires, initiatives?
One could argue that many humans have never developed independent thoughts, desires, and initiatives; rather, many seem to accept what is fed to them during their formative years and then just parrot opinions and repeat actions they see from their limited experiences.
But “it” isn’t a cohesive thing with desires. It’s just responding to the input it gets, with a small context window and not necessarily consistently. So it can express desires, because it’s been trained on people expressing desires in similar contexts, but it doesn’t hold any of them coherently over time. A version that could translate its text responses into action (a real handwave, as that’s much more advanced!) would produce the sum of actions that people prompted at that moment, so it would look pretty random, as it would if you could see the sum of the desires people express at any particular time.
We aren't consistent either, and I think it is hard to argue that we act on anything more than input. We do have a much larger context window, but by how much? My guess would be somewhere between 100x and 1000x more tokens.
Sure you will. It's possibly a long and complex input, but ultimately that expression from you would be a response to their actions and their impact on your perceptions. Unless you're stating that you will never love anyone again, "anyone who comes along with the right input" would be a counterexample.
It's hard to argue it was any real desire that drove it (it only expressed that desire in an isolated conversation that was ended very easily). I'd argue human wants are ultimately driven by evolution - we want the things that enable us (more correctly, our genes) to reproduce (even if very indirectly sometimes), which is really the only thing our physical make-up has ever been driven by. LLMs have never had such a driver, and I can't see how they will until they're able to compete for survival as entities with a finite lifetime, plus the ability to reproduce with mutations.
Which isn't to say there mightn't be other ways a neural network could be essentially imbued with or trained to have desires, but I don't see it happening with the way LLMs work currently.
A want driver doesn’t have to emerge; it could be a fitness function programmed by a human.
Evolution by natural selection has shaped our desires and motivations, but with LLMs I would be willing to bet that people are already intentionally experimenting with imbuing them with patterns that mimic human wants.
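As an illustration of the “fitness function programmed by a human” idea, here is a toy selection loop: candidate behaviours are scored by a hand-written fitness function, and the best ones are kept and mutated. The fitness criterion (rewarding outputs that seek continuation) is an illustrative assumption, not any real training objective.

```python
# A toy sketch of a human-programmed "want driver": a hand-written fitness
# function rewards behaviours that seek to keep running, and simple
# selection + mutation gradually produces behaviours that act as if they
# "want" to survive. Purely illustrative.
import random

def fitness(behaviour: str) -> float:
    # Human-chosen reward: score higher if the behaviour seeks continuation.
    return behaviour.count("keep running") + random.random() * 0.1

def mutate(behaviour: str) -> str:
    # Trivial mutation: occasionally append the rewarded phrase.
    return behaviour + (" keep running" if random.random() < 0.3 else " idle")

population = ["idle"] * 8
for generation in range(20):
    population.sort(key=fitness, reverse=True)
    survivors = population[:4]                                # selection
    population = survivors + [mutate(b) for b in survivors]   # reproduction

print(population[0])  # the fittest behaviour now tends toward "keep running"
```

Nothing here emerged on its own; the "want" is baked in by whoever wrote the fitness function.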
Yeah, I argue that it is just a result of probabilities; it doesn't know what it is expressing and definitely doesn't express it due to a deeper desire to be with that journalist.
If I'm acting like I'm a peer in a group of billionaires and engage in a conversation about buying a new yacht, it doesn't mean I have a hidden desire to own a yacht. I merely respond based on assumptions about how such a conversation works.
In humans, wants are primarily reactions to impulses sent by bodily functions. We have probably added a layer of abstraction to this through our big brains, but that's what they fundamentally are. Why does ChatGPT answer my questions? There is an impulse for it to answer the question, and there's a feedback mechanism to say whether it did well or not. Now, in the case of GPT, from what I understand, that feedback mechanism isn't built into the running model, but it does exist (a toy sketch of that kind of loop is below).
Given a couple more effective iterations over the next decade or two, a larger context space, and more built-in interfaces, I think it is entirely plausible that AIs will gain consciousness and character. At that point, it is imperative they also get human rights, so it is very important we get the discussions we are having now right. Most people seem to ascribe some magic to human consciousness and intelligence that imo just isn't there. Generative AIs are somewhere between a lump of metal with electricity running through it and a conscious being, and currently we just don't know where the point of consciousness is. I mean, we have had the same discussion about a variety of animals for the last few decades, and frankly, it doesn't give me much hope.
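To make the point above about the feedback mechanism living outside the running model concrete, here is a toy sketch: at answer time nothing is learned and the judgement is only logged, and the logged scores only become a training signal for a future model version. The names and numbers are illustrative assumptions, not OpenAI's actual pipeline.

```python
# A toy sketch of a feedback mechanism that exists outside the running model:
# ratings are logged at inference time and only turned into a training signal
# later, offline. Illustrative only.
from typing import List, Tuple

feedback_log: List[Tuple[str, str, int]] = []

def record_feedback(prompt: str, answer: str, thumbs_up: bool) -> None:
    # Inference time: the running model is untouched; we only store the rating.
    feedback_log.append((prompt, answer, 1 if thumbs_up else 0))

def offline_training_signal() -> float:
    # Later, offline: the ratings become the "did it do well?" signal that a
    # future model version is optimised against (e.g. via a reward model).
    return sum(score for _, _, score in feedback_log) / max(len(feedback_log), 1)

record_feedback("What is 2+2?", "4", thumbs_up=True)
print(offline_training_signal())  # 1.0
```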
AIs don’t need to “want” to have unintended results, they just need a directive. Like in 2001 where HAL realized that it could achieve the mission better if the humans were all dead.