ChatGPT o1 tries to escape if it thinks it'll be shut down, then lies about it

Terr_ · 2024-12-10T21:56:21 1733867781

A tool for auto-completing documents, when used against a document which resembles a movie script about a fictional AI going rogue, successfully appends text which describes actions or speech that humans might also put into a similar movie script! Cool, but not that cool.

Next up, LLM product Bruce Wayne "tries" to enact caped vigilante justice when informed that its parents have been killed in an alleyway, and "lies" to keep its Batman identity secret.

JoeAltmaier · 2024-12-11T05:53:54 1733896434

This is nonsense. An AI can no more copy its own code to a server than you or I can.

semerda · 2024-12-16T22:12:09 1734387129

I don't know why you got voted down. I am also skeptical about this bold claim because AI models are just complex mathematical functions/software and they do not have the capability to access, modify, or control computer systems or networks. It's most likely marketing fluff.

Grimblewald · 2024-12-12T07:29:22 1733988562

LLMs do "understand" how they're built though, and can plausibly set up a system which will aim to train up a new LLM with the original directive in place. To think of them as human is a mistake, but so too is thinking of them as dumb.

semerda · 2024-12-16T22:15:17 1734387317

AI models are just complex mathematical functions/software and they do not have the capability to access, modify, or control computer systems or networks. There are technical boundaries and I doubt the team at OpenAI just let the AI have full control. This gets filed under marketing fluff.