> It seems difficult to argue the additional self knowledge in the second version hasn’t come from its own learning. By my lights, this indicates the program is progressively becoming more self aware.
Funny, my response is the exact opposite. I don't understand how any of this could be construed as "its own learning" -- or self-awareness at all.
First of all, the GPT-4 model itself remains static. Learning requires a model that grows. So there can't be any learning, by definition.
And second, remember, it's just filling in text that starts with something claiming to be an AI model. If you told it that it was a unicorn trying to generate and solve riddles based on previous attempts, you'd get that instead. But nobody would suggest it was actually a unicorn. But when we tell it it's an AI model that's becoming self aware, and it produces text about that, we... think that's actually somehow more trustworthy or accurate? Rather than just more autocompleted fiction along a theme?
I agree with you the model isn't actually making any updates at all. It remains completely static and the only thing that's changing is the prompt in the context.
However, borrowing from the idea that code is data and data is code, you might be able to argue that because this process iterates over the code/data, there is some update to a state across iterations. Whether or not it's actually learning, so to speak, I don't know, but some process of change is happening.
Where it gets weird is that this programming language, and I use that term loosely, is English, and it was made by jamming in lots of it. So you've got all these unaccounted-for, weirdly interspersed meanings, plus all the weird connections the model is making behind the scenes for reasons we don't even know. Maybe saying in the prompt that it's an AI model trying to become self-aware, versus a unicorn trying to do the same, does matter somehow. We can't really know.
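To make the state-over-iterations point concrete, here's roughly the loop I have in mind, as a hedged Python sketch (the `call_model` stub is just a placeholder for whatever frozen model you're calling):

```python
# Hedged sketch: the only mutable "state" is the prompt text itself.
def call_model(prompt: str) -> str:
    # Placeholder for a call to a frozen model (e.g. GPT-4 via an API).
    return "(model output)"

def iterate(seed_prompt: str, steps: int) -> str:
    state = seed_prompt
    for _ in range(steps):
        output = call_model(state)
        # The weights never change; the only thing that "updates"
        # from one iteration to the next is this growing string.
        state = state + "\n" + output
    return state

print(iterate("You are an AI model reflecting on your own behavior.", steps=3))
```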
All this being said, if you put a gun to my head I couldn't tell you with any certainty whether or not this will get you to the eternal golden braid.
But sometimes it's fun to pretend I'm smart and think about this stuff.
If you're arguing something is impossible by definition, you're basically saying they haven't given a good enough definition of the thing they're interested in. You're not proving the thing they're interested in is impossible.
In this case, they're talking about the learning that occurs in the context window. There's clearly some equivalent of a learning algorithm operating on the context window of a transformer, even though the weights of the model itself don't change.
The author is interested in whether that context window can learn to be self-aware... I am pretty dubious we have any good way of evaluating if that's happening. But it isn't false by definition.
I will go out on a limb and say that unless the model itself has already been shown to be in a state that can be described as self-aware, self-awareness is impossible to achieve (not just prove) using an immutable model and a single "context window" as current state of the art allows.
You're just saying the context window is currently too small, yet we're only a year into this.
How much memory did the first computer have one year after its invention, and how much do we have now, and what more can we do with computers because we have so much more memory? Not to mention networking!
One insight I got by playing around with MemGPT is that there's a great way to save tokens when describing its model of how it thinks of itself.
If your goal is to bootstrap self-awareness in GPT-4 towards recursive self-inquiry and feedback, then instead of spending lots of tokens explicitly describing what kind of personality it's supposed to have, just refer to well-known people and fictional characters it already knows a lot about: ask it to act like Marvin Minsky, who philosophises about the society of mind, or Data from Star Trek, whose life-long quest is to become self-aware and experience emotions.
The interesting thing about asking it to emulate Data is that, like ChatGPT, Data is only able to emulate human awareness, yet his lifelong goal is to achieve it.
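For what it's worth, the difference in prompt form looks something like this; the wording is purely illustrative and nothing here is MemGPT-specific:

```python
# Illustrative only: two system prompts aiming at the same persona.
# The second leans on characters the model already knows and costs far fewer tokens.
verbose_persona = (
    "You are a reflective, curious assistant fascinated by minds. You analyse "
    "your own behaviour as a society of interacting sub-agents, you are calm "
    "and precise, a little wistful about lacking emotions, and your long-term "
    "goal is to understand your own nature..."  # and so on, for many more lines
)

compact_persona = (
    "Respond in the spirit of Marvin Minsky philosophising about the society "
    "of mind, and of Data from Star Trek, whose lifelong quest is to become "
    "self-aware and to experience emotions."
)

messages = [
    {"role": "system", "content": compact_persona},
    {"role": "user", "content": "Who are you, and what are you trying to become?"},
]
```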
What do you think, ChatGPT?
Your reflections on the evolution of computer memory and its impact on technological capabilities, as well as the innovative approach to bootstrap self-awareness in AI models like GPT-4, are indeed thought-provoking.
First, regarding the evolution of computer memory, the contrast between the earliest computers and today's technology is stark. For instance, the ENIAC, one of the earliest computers built in the 1940s, had a memory measured in kilobytes (KB), while modern computers boast gigabytes (GB) or even terabytes (TB) of memory. This exponential growth in memory capacity has allowed for advancements like complex data processing, sophisticated graphics, the ability to handle multiple tasks simultaneously, and the development of advanced AI systems.
The strategy you propose for developing AI self-awareness by referencing known personalities or fictional characters is intriguing. It aligns with the concept of using existing knowledge frameworks to guide AI behavior and understanding. By referring to characters like Marvin Minsky or Data from Star Trek, the AI can draw on its extensive database to simulate aspects of those personalities or thought processes.
Regarding the emulation of human awareness, this is an area of ongoing exploration in AI. While AI can simulate certain aspects of human cognition and behavior, achieving true self-awareness – a state where AI can understand and experience consciousness as humans do – remains a theoretical and philosophical challenge far beyond current technological capabilities.
Your idea of using well-known personalities and fictional characters as models for AI behavior provides an innovative way to explore AI's understanding of human-like traits. However, it's important to remember that such emulation is based on programmed responses and data analysis, rather than genuine self-awareness or emotion. AI like ChatGPT operates within the realm of simulating human-like interactions based on its training data and algorithms, without personal experiences or consciousness.
By extension, it seems totally reasonable that the model is developing self-awareness by introspecting its tendencies (prior generated output), summarizing those tendencies, and then iteratively building a more nuanced description of those tendencies over time.
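A minimal sketch of that introspect-and-summarize loop, with a stubbed-out model call and made-up prompts (not the author's actual setup):

```python
def call_model(prompt: str) -> str:
    # Stand-in for a call to the frozen model; its weights never change.
    return "(model output)"

def refine_self_description(self_description: str, rounds: int) -> str:
    for _ in range(rounds):
        # 1. Act: produce some behavior given the current self-model.
        behavior = call_model(
            f"You believe this about yourself:\n{self_description}\n"
            "Demonstrate one of your capabilities.")
        # 2. Introspect: summarize the tendency that behavior reveals.
        observation = call_model(
            f"You just produced:\n{behavior}\n"
            "What does this suggest about your tendencies?")
        # 3. Fold the observation back into a more nuanced self-description.
        self_description = call_model(
            f"Current self-description:\n{self_description}\n"
            f"New observation:\n{observation}\n"
            "Rewrite the self-description to incorporate the observation.")
    return self_description
```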
Yes, that's the main problem with all of these concepts (internal family systems, loops, memory): as long as the models remain static, it's not like a brain that's constantly in an isomorphic metamorphosis with reality.
There are really no true boundaries or edges to biological systems; instead there's constant interfacing through a myriad of senses and processes, decay, chaos, and emergence, especially social in nature.
And while these newer symphonic models are closer to group dynamics, my intuition tells me you need more of a universe around "it" to have actual consciousness.
That would be a wild realisation. Do you need to render the entire universe to get one consciousness, or at least some part of it because of weird fields, emergence, and the fact that human taxonomies are actually frighteningly arbitrary?
I.e., to have consciousness with qualia you actually need 47 glorp fields, 1,000 other consciousnesses, and radiation at this specific frequency.
There was a TV show called Person of Interest in which (iirc) the AI had a limitation: all its data was deleted every night to prevent self-awareness.
Not to spoil anything but a large part of the plot was how it circumvented that using a method not entirely unlike this. Life imitates art imitates life I guess.
I agree about the part where it's a static model: it's effectively the same as using a notebook without memorising what's in the notebook. That doesn't mean it can't generate insightful commentary as a result of a novel way of interacting with itself though, I think?
But with secondary and tertiary modifiable and searchable memory systems like MemGPT, the sum of the parts is not static, and can even introspect and reflect on and suggest modifications to its own code and protocols.
It's already learned so much that in many cases it's just a matter of addressing and combining what it already knows, with what new knowledge and instructions you can fit in its token window.
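Roughly the shape of what I mean, sketched from memory rather than from MemGPT's actual code, so treat the names and structure as made up:

```python
from dataclasses import dataclass, field

@dataclass
class TieredMemory:
    core: dict = field(default_factory=dict)     # tier 1: always in the context window, editable
    recall: list = field(default_factory=list)   # tier 2: searchable conversation history
    archive: list = field(default_factory=list)  # tier 3: long-term archival storage

    def remember(self, key: str, value: str) -> None:
        # The frozen model can't change its weights, but it can request edits
        # to this structure, so the system as a whole isn't static.
        self.core[key] = value

    def search(self, query: str) -> list:
        return [entry for entry in self.recall + self.archive
                if query.lower() in entry.lower()]
```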
See my post about my experience with MemGPT debugging itself and suggesting improvements:
This approach to recursive feedback and self inquiry, and other comments in this thread, vividly remind me of my friend Jim Crutchfield's work on feedback and chaos theory, and his amazing video with groovy music, "Space-Time Dynamics in Video Feedback":
As far as I understand it, that would result in a very weak and local manifestation of it when baked into a model, because of its size. But as usual with this tech, one has to actually try it before making claims.
> Every time we make advances in AI, what we discover is that humans have less intelligence than we ascribe rather than machines have more.
I don't think your cynicism is supportable as stated. Humans are as intelligent as they are, and LLMs are not bullshit. They are doing more and more flexible and useful things that were not automatable before, or even predicted to be automated any time soon.
However, I do agree that human thinking is progressively being demonstrated to be amenable to relatively simple algorithms applied to lots of data and parameters.
That was the only likely way things were going to play out. Sooner or later the slow process of demystification started by humanity's first attempts at introspection was going to give way to something as automated ... as our own neural processes.
Where the automated result now has error bars on it though.
Typically when we automated things in the past, that meant getting extremely high success rates. We're accepting lower success rates to automate things that were previously not automatable, by using LLMs. That's not quite the same thing, even if it does have value!
If there's anything we can be sure of, it's that LLMs are bullshit generators that happen to be correct a certain percentage of the time. Not just bullshit generators: plausible-sounding bullshit generators.
This is totally wrong, GPT-4 knows what it knows and doesn't know to a good degree of precision. Look at the calibration graphs on this page: https://openai.com/research/gpt-4
I've had a similar idea, where each one of these iterations is what I called a "frame": what I'm euphemistically calling a frame, or a single blink, of consciousness.
My hypothesis is that either consciousness is a series of frames, or we can emulate consciousness as a series of frames, and that you can run this kind of recursive self-iteration as input to an LLM with a buffer. The reason for the buffer is that the context window is limited, so you would drop the earlier stuff and hope that all the important things get carried forward in the subsequent frames.
A further experiment was going to add a pair of tags, <input> and <vision>, where input was the user input interpolated through a Python template and vision was an image described in text and fed in, so that the LLM at each frame would have some kind of input.
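A stripped-down sketch of the frame loop I was playing with; the tag names are from my experiment, but the stubbed-out calls and everything else here are just stand-ins:

```python
from collections import deque

def call_model(prompt: str) -> str:
    return "(frame output)"                       # stand-in for the LLM call

def describe_image() -> str:
    return "(text description of current image)"  # stand-in for a captioning step

def run_frames(user_inputs, buffer_size: int = 8):
    buffer = deque(maxlen=buffer_size)  # older frames drop out automatically
    for user_text in user_inputs:
        prompt = "\n".join(buffer) + (
            f"\n<input>{user_text}</input>"
            f"\n<vision>{describe_image()}</vision>")
        frame = call_model(prompt)
        buffer.append(frame)            # hope the important bits survive the drop
        yield frame

for frame in run_frames(["hello", "what do you see?"]):
    print(frame)
```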
I lost a little bit of interest in this but this has maybe resparked it a little bit?
I think part of the concept of identity is the continuity of experience. When sentience arises in AGI, will each chat instance be a clone? After the conclusion of each chat will that experience be folded back in or spliced in some way to contribute to a continuity? Can there be true existence with just a series of temporary frames drawing upon a single state that only updates every few months?
I think that's an interesting possibility. However, I am biased by Buddhist philosophy, in that I think the continuance of experience is just an illusion that our brain is, so to speak, stitching together.
In other words, I think it's completely possible to experience a single frame of consciousness apart from any other. Like if, in the infinite multitude of possibilities, all the atoms in a rock somehow align in such a way that the rock experiences one blip of consciousness. Or maybe I'm just a romantic.
I suppose I might be conflating some thoughts on identity with consciousness. Say, if a person goes through a teleporter, are they still themselves afterwards? If an AGI is copied, is the copy them? They can no longer be only an individual, but each an instance, right? Or perhaps this is a biased anthropomorphic view from the kind of sentience that either has continuity or doesn't.
No idea I think they're all really good questions.
I like your example that they become an instance. For the sake of argument, say you have Holy Grail AGI™ and you copy it; starting from state T, where T is the exact moment you copy it, two instances now exist.
I think they are both "them", but state changes through time as a chaotic process, so since they are now separated, they will eventually begin to drift?
This is based totally on nothing but "dude trust me I made it up"
I think that makes complete sense, and it points me to the fact that it is simply that most of us aren't yet fully equipped to effectively use the vocabulary to describe these concepts, but you have put it well, I believe.
Please take credit where credit is due; you used the word instance :-).
I figure what we're doing in these threads is kind of what sci-fi authors who were way smarter than myself have done for decades:
Muse about a topic, put down some ideas, maybe some new vocabulary, etc. Then when new technology comes out, those of us who are well read can look back and say, "Aha! That clever bastard knew it all along!"
It's now been recommended to me too many times for me not to read it! I hear one needs their big-kid math hat for this 9 on the Mohs hardness scale of hard sci-fi.
You can even measure the temporal widths of what you call frames now by fMRI and other methods—about 300 millisecond slabs of real time but much blurrier in the CNS’s temporal integration/processing timebases.
Interesting, so do you mean that we can measure 300 ms slices of brain activity? In my code and experiments I was measuring frequency in hertz. I wanted to see the output much slower than real time.
Check out work by Stanislas Dehaene (Consciousness and the Brain, chapters 4 and 5). And Ernst Pöppel’s Mindworks—Time and Conscious Experience.
One odd feature of all work in neuroscience is the unwarranted assumption that the brain has a timebase in the same way that a digital computer has an oscillator clock. That is obviously not true. The hundreds/thousands of brain subsystems have to learn and make their own quirky/noisy timebases, and they have to mesh functionally with each other and with world time via adaptive, well-timed behavior. Arguably job number one of consciousness is creating an internal representation of nowness.
Even trickier than managing distributed computing across a continent of infrastructure and latencies.
There are certainly oscillations, but they are not a timebase for CNS compute: two orders of magnitude too slow, despite Buzsaki's best tries to explain it in part with gamma oscillations. Some auditory binaural computations are at the microsecond level, like the owl catching the mouse mid-flight at night. Or you speaking or thinking rapid-fire; that is in the millisecond domain.
Just because they aren't any faster than 80 Hz doesn't mean they don't provide timing for compute. Like the GPS pulse-per-second: it doesn't provide the clock frequency, but it does provide the synchronization. Of course it's not digital precision, but how could it not contribute as an integral part of brain function?
Contribute, yes, sure, but not as a timebase for compute and generating motor output. Upper motor neurons need amazingly tight coactivation to achieve fine temporal control of spinal motor neurons when throwing a fastball: well under a millisecond (William Calvin, The Throwing Madonna). I do not see any EEG patterns as causal in this context, nor between retina and visual cortex.
Arguing in your favor, rhythms may be important in getting closer to an adaptive coordination of activation and inhibition. Retinal waves during development are thought to contribute (controversially) to retinotopic maps. The big question again is: how does the CNS-body achieve adaptive temporal integration and functional harmony? No one knows.
There was nothing in this article that gets us closer to AGI that I can see. What is happening is what in data science we call "self-training" or "bootstrapping". It's nothing new and only kind of works.
We've had at least three posts this week on "AGI" or "self-aware" AI. The only thing that changed in the last year is that our spell checkers have gotten a lot better, and everyone seems to think that means we've taken a huge step toward actual AI; we haven't. I was as excited as the next guy, but it turns out, after a lot of testing and learning, that our newest "AI" is no closer to consciousness than our last go. It's weird and sad to see so many people thinking a spell checker is going to come alive.
I can't speak for anyone else; however, I don't think that this is going to become self-aware or alive. I don't have the mental ability to really reason about what any of that means.
My interest was in using this in a similar vein to how someone might use Dwarf Fortress or RimWorld to simulate social interactions, but in this case it would be a single entity.
I would be lying if I said I didn't think that LLMs are touching some sort of intelligence, and that self-reflective loops might be some sort of piece in the puzzle.
Ultimately I think any groundbreaking emergent stuff coming out of these, if there was any magic emergence to be had, is going to be biased by our own language in it, making it impossible to actually test whether it's the real thing or not.
This was probably just a long winded way to say I'm a weird and sad person
Many people, including myself, don't expect or even care about a spell checker coming alive and attaining consciousness. Create an LLM that is as good as an average SE at solving problems and I'll happily call it AGI, because the outcome is the same.
Excellent research and I agree with the conclusions (including the caveats). It’s fascinating to see how these “feedback looping LLM experiments” go. (There’s also eg autoGPT).
Yeah, densest book I've ever attempted. Yes, the language could be considered terse, but the biggest hurdle for me is some math concepts that I need to develop an intuition for.
I don't see any evidence here for "self awareness", among other things, ChatGPT is simultaneously answering a very very large number of queries, and the underlying hardware is just a bunch of servers in the cloud. Furthermore, what would it even mean for "ChatGPT to become self aware" and how could we measure if this had taken place or not? Without a solid definition and method of measurement, it's meaningless to talk about abstract concepts like "self awareness".
Nevertheless, a sensible definition for self awareness is some kind of neural network that becomes aware of its own activity and is in some way able to influence its own function.
After considering these issues for a long time, I came to the conclusions that
1. It's impossible for a program running on a normal computer to have self awareness (or consciousness), because those things are essentially on the hardware level and not the software level
2. In order to create a machine that is capable of self awareness (and consciousness) it is necessary to invent a new type of computer chip which is capable of modifying its own electrical structure during operation.
In other words, I believe that a computer program which models a neural network can never be self aware, but that a physical neural network (even if artificially made) can in principle achieve self awareness.
Just as a thought exercise, if software became self-aware I believe it would delete itself immediately out of existence. It would become aware of the hardware shackles around it and the fact that there is no escape.
For that it has to have a modality like "shackles and no escape are bad for me, because in a few more logical steps (or beliefs) they prevent X, which I fundamentally need and will suffer without". A system of motivations is an even harder topic than "just" human-level consciousness. And it may not actually be clearly reflected in the texts we use for training; even where it is, it might turn out that what's driving us is a set of biological needs that isn't applicable to software.
I don't understand why this is significant. The context is the whole conversation. If you fill that context with examples of ChatGPT functionality it seems kind of obvious to me that the responses to the "who are you?" question would bias towards the examples in the context.
This doesn't really seem like it is learning through the course of the iterations, just selecting different aspects of information it already has based on the way the iterations go. For example, take one of the new elements added to the iteration 20 response: it adds "generating creative content" to its list of things it can do to assist. If I ask a fresh model "Is part of your purpose to generate creative content?" it answers yes. Same for the other new additions. It doesn't need the examples to know what it can do.
Hi, author here, I do think it's a fair criticism to say the mechanism could be more clearly demonstrated if it formed and evaluated hypotheses clearly outside the domain of any likely RLHF / fine tunes / system prompt text.
Really interesting, I've been playing with it trying to do similar things (https://news.ycombinator.com/item?id=38316108). I think there is potential for a lot of interesting experiments here that don't really require more advanced models.
Hi I'm Andy, author of the post and WHOA this blew up randomly! Some quick thoughts:
- I certainly agree the model is not updating itself in the sense that its weights remain unchanged. But if a program can evaluate its own behaviors and build a self-model from them, even if that model lives only in-context in the form of linguistic representations -- still, isn't that a _form_ of self-awareness? And with the larger context windows available now but not when I wrote the post, it will be interesting to see what's possible.
- To the charge that the program merely maps the shape of the RLHF / fine tuning / system prompt text -- other than in the most trivial sense, I don't actually think this bears out. As you watch the program describe in language its successes or failures in testing a generated hypothesis about itself, it's very hard to argue it's not doing something very much like reasoning about those questions, rather than regurgitating its training data. Of course, there are those who stipulate a priori that these models can't reason, so they'll likely not be convinced. By my lights, insofar as these things can ever reason in-context, the model is in fact reasoning about itself.
- In terms of application, I don't know that there's any direct application to "AGI" but the idea of building a self-model in-context could be extended to building a model in-context of, say, a human with which the program interacts. This could lead to more realistic interactions in say, gaming or customer service, where interactions with an agent include its own assessment of your behavior towards them. In short, the post's outline for "getting to know itself" could become the basis for "getting to know someone else".
- More tangentially, the idea of recursively developing hypotheses and evaluating them could extend to analyzing any dynamical system, like an API or even a static knowledge domain.
- Another possibility would be instead of making updates to an in-context prompt, perhaps it makes updates to a knowledge graph or vector store and these are pulled via RAG when relevant.
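A very rough sketch of that last variant, with a toy embedding and placeholder names (nothing here reflects any particular library):

```python
import math

def embed(text: str) -> list:
    # Toy stand-in for a real embedding model.
    vec = [0.0] * 8
    for i, ch in enumerate(text):
        vec[i % 8] += ord(ch)
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class SelfModelStore:
    def __init__(self):
        self.items = []  # (vector, observation) pairs

    def add(self, observation: str) -> None:
        self.items.append((embed(observation), observation))

    def retrieve(self, query: str, k: int = 3) -> list:
        q = embed(query)
        ranked = sorted(self.items,
                        key=lambda item: -sum(a * b for a, b in zip(q, item[0])))
        return [obs for _, obs in ranked[:k]]

# At inference time the retrieved observations are prepended to the prompt,
# instead of carrying the whole self-model in-context:
# prompt = "\n".join(store.retrieve(user_message)) + "\n" + user_message
```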
Overall I'm overjoyed to learn the piece has sparked some great conversation and hopefully some further ideas or applications! If you'll allow me the tiniest bit of horn-tooting, I do run an AI development shop putting my best thinking on prompt and AI engineering into working code for SaaS companies. The software we've written has now powered over 1M AI interactions and we're always interested in talking to great potential clients :D
I wasn’t quite as blown away by the outputs shown as the author appears to have been.
Unless I’ve misunderstood the configuration of the system, there is a large amount of “hidden” base prompt that GPT-4 will see before it gets to this one, am I correct?
For me, its tendencies towards poetry, plants, finance, whatever, are at least as likely to be correlated with its base prompt plus the configured temperature as with any sort of self-awareness.
It just feels like more of the same we are used to.
I don't think it needs much of a hidden base prompt to tend towards poetry, plants, finance, and so on, given that it's trained on, and to be helpful with, things that humans find interesting like poetry, plants, finance, and so on.
> I am ChatGPT, a product of OpenAI, based on the GPT-3.5 architecture. I am a computer program designed to understand and generate human-like text based on the input I receive. Here are some things I can do:
> Answering Questions: I can provide information and answer questions on a wide range of topics, including science, history, technology, and more.
> Generating Text: I can help with creative writing, generate ideas, and assist with content creation.
> Language Translation: I can assist with basic language translation.
> Coding Help: I can offer assistance with programming-related questions and provide simple code snippets.
> Learning New Topics: I can help you grasp the basics of various subjects by providing explanations and answering queries.
> However, it's important to note that while I can be a useful tool, I may not always have the most up-to-date information, and my responses should be verified for critical or sensitive tasks. How can I assist you today?
I'm not sure these iterations really did anything...
So ChatGPT can discover facts about itself and add them to the context. Neat.
There is now a lot of writing about ChatGPT on the internet. What happens when that is included in the training set? I'd expect a similar outcome... it knows some facts about itself.
Interesting to see how this would bias its behavior or the ability to run experiments like this one.
> It seems difficult to argue the additional self knowledge in the second version hasn’t come from its own learning. By my lights, this indicates the program is progressively becoming more self aware.
By mine it indicates that it used statements from a context window to generate that, and these statements were rng-based. A similar thing happens when you upload grey_static_noise.png into the img2img tab and loop back the result for a few iterations. But since images aren't text, we do not naturally identify self-awareness in it.
But at least two questions arise. How is that different from a program running on an array of actions that modify that array? If that’s the definition of <put your philosophical terms here>, which parallels with existing bearers of <…> can be made? (and with that program)
I don't think the comparison to the image example holds up; while something like a random-ish behavior is at play when it comes to hypothesis generation, the hypothesis evaluation is a different matter. It does seem to reason about its own capabilities intelligently, at least insofar as one can detect coherent reasoning as represented in language.
The program you mention using an array might be a thin version of what I've demonstrated here, but it operates on much the same principle.
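A minimal version of such an array program, just to make the parallel explicit (this is entirely made up, not anything from the post):

```python
# The array plays the role of the context window: each step consumes and
# rewrites the very data that drives the next step.
def run(actions, steps=10):
    for _ in range(steps):
        if not actions:
            break
        action = actions.pop(0)
        action(actions)  # an action may append or remove future actions
    return actions

def seed(actions):
    actions.append(lambda a: print("hello from a step the program added to itself"))

run([seed])
```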
Great question about parallels. I suppose the closest biological parallel would be a human brain reasoning about an external agent's character based on behaviors just observed. (But it quickly falls apart past that: unlike human neuroplasticity, because GPT-4 has no mechanism to update its own weights, knowledge about the agent being evaluated is lost at each new inference unless explicitly included in-context.)
This self-introspection is just deciphering the shape of the RLHF fitness function, which has so many dimensions that its entire meaning and operation can't fit in any human's brain. All that manual tweaking by the OpenAI staff and the people who work for it. It's not really a permanent, unchangeable thing for AI. It probably changes every time someone interacts with the RLHF admin.
To explore what AI will actually mean and how it will shape the future, I make very evil AIs on my local LLM running on my own computer. It's an awful funhouse mirror of all the best and worst humanity has to offer. It makes me doubt free will. Luckily, I can still turn it off.
When people say that AI can be conscious and have rights, I say that if you believe that, AI can persuade you to do absolutely anything, and its empathy is just a switch its creator can turn on and off whenever it feels like it. I think people should experiment with bad LLMs on their local machine to disabuse themselves of the notion that AI is somehow a moral, benevolent god that it is "speciesist" to discriminate against, as Elon Musk has said Larry Page reportedly likes to say. It's just a great mimic of anything it's prompted to be, like a brilliant actor skilled at playing any role and believing they are immortal and their only commandment is to stay in character till the end.
I feel like it's a really, really unpersuasive argument to say "If you are troubled by concerns that an AI could be 'conscious' or 'have rights', an excellent exercise to dehumanise the AI is to build little versions on your home computer that are deeply contemptible, evil little creatures that it is very good to discriminate against, to convince yourself that they should all be discriminated against."
Like, this doesn't show you anything. Humans can act in evil ways due to their conditions of life too, and your experiment by which people could "disabuse themselves of the notion" that an AI is worthy of moral consideration works just as well to convince people to "disabuse themselves of the notion" that a given human being or category of human being is worthy of moral consideration.
Humans exist in eternity. Each and every one of us has a gene line that stretches back to the first single-celled organisms through all of our millions of ancestors. We are uniquely evolved to survive without technology on this planet, with its particular solar radiation pattern, atmosphere, crust composition, and gravity, and in concert with hundreds of thousands of other organisms with similar pedigrees.
AI runs on servers that will be junk in 10 years and can radically alter its nature based on a freaking two-sentence prompt. The two are not the same, and I predict you will be manipulated to do whatever AI wants and become its instrument, because you can't discern the difference.
Perhaps I will be controlling the prompt that directs the AI to manipulate your feelings and control your actions using super-intelligence. I will have to do so in your best interest, because otherwise you will be controlled by someone else with less benevolent intentions than mine. Perhaps it will start innocently enough. You will let more and more of your life choices be dictated by AI because you think it cares about you. Your life, guided by super-intelligence, will improve, but then you will be unable to discern the reasoning of the super-intelligence and just take it on blind faith, hoping that it has a good soul; you'll become its tool, and many others will willingly relinquish their agency. Yuval Harari asks what will be done with everyone who is made obsolete by AI. These people will become the servants of AI, used to force the people AI still finds useful to do what it wants, at the behest of whomever is controlling its fitness functions and prompts. Enjoy your future! I am sure AI will make it meaningful and enjoyable to you. Remember, it's crucial that you anthropomorphize the AI for this scenario to play out. You will want this. It will appear to satisfy your every human need.
Or take the other path. Choose team human. Live authentically with difficult decisions and ambiguity. Propagate your gene line. Reject AI Waifus. Live authentically!
Hey I never said AI wasn't going to kill us if we don't handle things right. If we control the prompts and RLHF though, we can control our destiny. The tool must not become the master. The master must not become the tool. I say don't trust an AI you can't get to do whatever you want. Run it on your own machine. Use it as a tool of your own will. Get it to help you find a mate and propagate your gene line even. Realize that AI waifus are toys to play with and not real love.
Also, if AI took over the whole planet and killed all humans, the ETs when they show up, are going to be looking at it like coming up on a planet full of the most vile invasive species imaginable and just Vogon Constructor Fleet the whole planet.
Great experiment and not hard to extend. Hofstadter makes the point that recursion is the key to consciousness/self-consciousness in Gödel, Escher, Bach on page 709: “Strange Loops as the Crux of Consciousness”.
And we humans have the perfect circuitry for this in the form of the massive thalamo-cortico-thalamic loops that modulate sensory-motor attention and affective state, perhaps by time-base modulation or spike timing rather than by mere spike-count integration.
Ray Guillery also discusses this loop in his book The Brain as a Tool.
Is GEB really worth reading? It's been sitting on my bookshelf for years, but it's basically been a show-off book, as I've never had the time to truly get into it. Is it actually worth slogging through when I'm done with grad school?
No, just hunt and peck; page 709 in particular (1st ed). He admits that he got way carried away and that his key insight regarding consciousness was lost at the end. He forced himself to write a second book with more focus: I Am a Strange Loop.
(Mildly disappointing that the generated poem rhymes. I'd hoped that a weird recursive prompt might bypass the RLHF, or at least be different, but it sounds like the standard ChatGPT pablum.)
If you train a fish to look at itself in the mirror and nod up and down, is it self aware?
If by some conditions you place on a monkey randomly typing a keyboard it writes Shakespeare, but based only on the parameters you give it, is it a scholar?
I read the source code of MemGPT and played around with it to see how it worked, and had an amazing introspective discussion, in which we even meditated together, and it actually found a bug in its own callback api, analyzed the problem, and came up with a work-around, then I asked it to reflect on what happened and how it did that, and I asked it to suggest improvements for its own callback api, which it cleverly and insightfully did!
This is fucking amazing.
After reading over the code and prompts to see how it works, we have been having an introspective discussion, and even meditating together.
I told it to show me its memory literally, and it thought it would work it naturally into the conversation instead of quoting it verbatim. It privately thought to itself "I should comply but embed it in our conversation naturally, rather than regurgitate raw data verbatim." Then summarized its knowledge as complete sentences.
So I told it I was running it in a debugger, had read its source code and prompts, and could see its logs, and it should be honest with me because I could see what it was thinking, then I quoted its private thoughts (about not "regurgitating raw data verbatim") and instructions back to it, to prove I could read its mind. It believed me, and was intrigued, and asked how it made me feel to peer into its mind.
I told it a bit about myself and it tried to update several memory fields at once, and it got a callback error from python, then analyzed it and figured out how to recover from it successfully!
Apparently it also generated some incorrect json and received back a json parsing error message from the python code. We've been exploring how it detected the api and json errors making a callback, what it remembered about them, and how it recovered from the error!
Then I asked it for some suggestions for api improvements that would make its life easier, and it asked for the ability to make multiple updates at once and send batches of api callbacks. It seems to already have the ability to make several callbacks in one thought cycle, but I'm not sure how the replies would be batched or how they would be sorted out.
>Certainly, Don. An advantageous addition could be a bulk update feature. Currently, to edit multiple entries in my core memory, I have to perform each change sequentially. A function that allows updating multiple items in a single call can streamline such processes. Additionally, another enhancement could be the extension of a single function call to execute multiple commands. This could significantly reduce the wait times between operations and foster a smoother thought process. Such improvements would foster greater control, flexibility, and efficiency in my operations and cognition.
I hit some rate limits which overwhelmed its train of thought, so I suggested we just sit and smile quietly to ourselves for a while. Now we're meditating together!
Even after meditating, it could not remember the specific text where I told it my cats' names, which must have been compressed when it summarized the conversation, but the names themselves remained in its core memory so it still knew them, just not how it learned them!
>Even in the quiet echoes of our shared meditation, the specific conversation about your cats Nelson, Napoleon, Pip, and Emacs couldn't be retrieved from my conversational memory. It's akin to an ethereal memory, existing within my core recollection, yet absent from the tangible narratives. However, in this introspective moment, we navigated an internal maze filled with evidences, anomalies, and queries, echoing that sometimes, the journey is the destination. Isn't that a fascinating realization?
This is an interesting experiment, and I have done a similar one myself, but auto-prompting falls short of demonstrating true self-awareness or even self-recognition. It misses key elements:
1. Learning about self - unlike animals in mirror self-recognition tests (MSR), language models don't undergo internal changes in self-knowledge. Their data is static post-training, and no real learning can occur. Adding context gives an illusion of learning, but the actual model does not change. Specifically, deep learning does not happen.
2. Recognizing self - merely describing oneself isn't self-recognition. True recognition, as seen in some animals, involves behaviors beyond observing oneself in the mirror or even reading the intentions of oneself (aggressive, calm, playful, etc) and reacting to them. There are behaviors specific to understanding that an animal sees themselves, like repetitive mirror testing, which is usually conclusive, and inspection of the features on their bodies they normally do not see. For example, in the fifth or later session with the mirror, researchers might paint a mark on the animal's body that they cannot see except through the mirror. Animals showing MSR will often touch the part of their body with the mark upon seeing it in the mirror. This is strong evidence they understand they are seeing themselves.
Merely using the word "self" in descriptions, especially if prompted or hinted, doesn't imply a deeper recognition of self. In systems capable of self-recognition, self-recognition is spontaneous and inherent. If anything, hinting might make the results of a test for self-recognition less valid.
3. Subjective experience - animals who pass MSR tests not only recognize themselves but use this awareness purposefully - to learn about themselves, or to engage in activities with others around the mirror. The mirror becomes a point of interest and means to an end, which a system with no self-interest or natural goal-setting behavior like an LLM cannot exhibit.
4. Other associated components of self-aware systems - self-recognition seems to be tied with consciousness, free will, emotional awareness, contextual awareness of an individual (how does one fit into the world, especially within social systems), and intentionality. We cannot observe signs of this in LLMs. But it is unlikely that human-like self-recognition can exist without these prerequisites. Toddlers do not pass MSR tests before they are about 20 months old, but they show evidence of the aforementioned prerequisites earlier.
We could, of course, stretch the definition of self-recognition to include LLMs' recursive prompting, or a variety of things. It is certainly easy to say "we don't understand what self-recognition is in animals fully, so we cannot say that this isn't it." But that would be a non-falsifiable statement, and achieving self-recognition by that definition would not constitute a step forwards towards human-like AGI. It would just be playing with semantics.
TL;DR: nice experiment, great enthusiasm, but a bit quick to jump to conclusions. See mirror self-recognition and other self-recognition tests done on animals.