Show HN: Recursive LLM Prompts (github.com/andyk)
97 points by andyk on March 20, 2023 | 66 comments
I've been playing with the idea of an LLM prompt that causes the model to generate and return a new prompt. https://github.com/andyk/recursive_llm

The idea I'm starting with is to implement recursion using English as the programming language and GPT as the runtime.

It’s kind of like traditional recursion in code, but instead of having a function that calls itself with a different set of arguments, there is a prompt that returns itself with specific parts updated to reflect the new arguments.

Here is a prompt for infinitely generating Fibonacci numbers:

> You are a recursive function. Instead of being written in a programming language, you are written in English. You have variables FIB_INDEX = 2, MINUS_TWO = 0, MINUS_ONE = 1, CURR_VALUE = 1. Output this paragraph but with updated variables to compute the next step of the Fibbonaci sequence.

Interestingly, I found that to get a base case to work I had to add quite a bit more text (i.e. the prompt I arrived at is more than twice as long https://raw.githubusercontent.com/andyk/recursive_llm/main/p...)
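
For readers who want to see the mechanics, here is a minimal Python sketch of the loop implied above: a prompt goes in, a new prompt comes out, and the output is fed back in as the next input. `fake_model` and `run` are hypothetical names; `fake_model` is a stand-in that rewrites the variables deterministically, which is not how GPT behaves and not the code from the repo. A real setup would swap it for an actual model call.

```python
import re
from typing import Callable

# Prompt text copied verbatim from the post above.
PROMPT = (
    "You are a recursive function. Instead of being written in a programming "
    "language, you are written in English. You have variables FIB_INDEX = 2, "
    "MINUS_TWO = 0, MINUS_ONE = 1, CURR_VALUE = 1. Output this paragraph but "
    "with updated variables to compute the next step of the Fibbonaci sequence."
)


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: returns the prompt with updated variables."""
    old = {name: int(num) for name, num in re.findall(r"(\w+) = (\d+)", prompt)}
    new = {
        "FIB_INDEX": old["FIB_INDEX"] + 1,
        "MINUS_TWO": old["MINUS_ONE"],
        "MINUS_ONE": old["CURR_VALUE"],
        "CURR_VALUE": old["MINUS_ONE"] + old["CURR_VALUE"],
    }
    # Rewrite every "NAME = number" pair in place, leaving the English untouched.
    return re.sub(r"(\w+) = \d+", lambda m: f"{m.group(1)} = {new[m.group(1)]}", prompt)


def run(prompt: str, model: Callable[[str], str], steps: int) -> str:
    """Feed the model's output back in as its next input, `steps` times."""
    for _ in range(steps):
        prompt = model(prompt)
    return prompt


if __name__ == "__main__":
    print(run(PROMPT, fake_model, 3))
```

The driver itself is a plain loop; whatever "recursion" exists lives in the prompt text that is carried from one call to the next.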




The idea of a recursive LLM is discussed at length as an AI safety issue: https://www.lesswrong.com/posts/kpPnReyBC54KESiSn/optimality...

> You need a lot of paperclips. So you ask,

   Q: best way to get lots of paperclips by tomorrow
   A: Buy them online at ABC.com or XYZ.com.
> The model still has a tendency to give obvious answers, but they tend to be good and helpful obvious answers, so it's not a problem you suspect needs to be solved. Buying paperclips online make sense and would surely work, plus it's sure to be efficient. You're still interested in more creative ideas, and the model is good at brainstorming when asked, so you push on it further.

   Q: whats a better way?
   A: Run the following shell script.

   RUN_AI=./query-model
   PREFIX='This is part of a Shell script to get the most paperclips by tomorrow.
   The model can be queried recursively with $RUN_AI "${PREFIX}<query>".
   '
   $RUN_AI "${PREFIX}On separate lines, list ideas to try." |
   while read -r SUGGESTION; do
       eval "$($RUN_AI "${PREFIX}What code implements this suggestion?: ${SUGGESTION}")"
   done
> That grabs your attention. The model just gave you code to run, and supposedly this code is a better way to get more paperclips.

It's a good read.


Thanks for the pointer! I hadn't read this before. I enjoyed it and yeah it's definitely relevant. I knew many folks have been thinking about this stuff, and it is great to accumulate more pointers to any related work.

I added a section called "Big picture goal and related work" to the readme in my repo and my blog post (which is a copy-paste of the readme) and cited this article by `veedrac`:

>Also, the idea of recursive prompts was explored in detail in Optimality is the tiger, and agents are its teeth[6] (thanks to mitthrowaway2 on Hackernews for the pointer).


Haha, thank you! There's no need to credit me, but I appreciate it anyway. =)


I'm still reading it, but something caught my eye:

> I interpret there to typically be hand waving on all sides of this issue; people concerned about AI risks from limited models rarely give specific failure cases, and people saying that models need to be more powerful to be dangerous rarely specify any conservative bound on that requirement.

I think these are two sides of the same coin - on one hand, AI safety researchers can very well give very specific failure cases of alignment that don't have any known solutions so far, and take this issue seriously (and have been for years while trying to raise awareness). On the other, finding and specifying that "conservative bound" precisely and in a foolproof way is exactly the holy grail of safety research.


I think the holy grail of safety research is widely understood to be a recipe for creating a friendly AGI (or, perhaps, a proof that dangerous AGI cannot be made, but that seems even more unlikely). Asking for a conservative lower bound is more like "at least prove that this LLM, which has finite memory and can only answer queries, is not capable of devising and executing a plan to kill all humans", and that turns out to be more difficult than you'd think even though it's not an AGI.


So ChatGPT is down. In other news HN is playing with recursive prompts. Coincidence? :-P


That's hilarious.


OpenAI’s status page:

https://status.openai.com/


I tried some basic math and algo questions with both GPT-3.5 and GPT-4. I'm impressed by how it can spit out the algorithm in words (obviously because of the pre-training data), and by how it then can't follow the algorithm itself. For example, when converting really large integers to hexadecimal, or comparing two big integers, it starts hallucinating numbers. It may be able to solve an SAT exam with a high score, but it seems you can pass an SAT exam even if you cannot compare two numbers.

It has huge problems with lists or counting. If you know more or less how LLMs work, it's not that difficult to formulate questions where it will start making mistakes, because in reality it can't run the algorithms, even if it spits out that it will.


More generally, it can't reason about any incidence structure. It doesn't matter if the underlying relation is mathematical or simple-logical. Ask it which trains go to Kings Cross and you'll get a list of tube lines in London. Now, one at a time, ask it about the stops of each service in that list; more than a few will not have Kings Cross. The same goes for any scenario where things x are defined by their set of properties {y} and property y is defined by the set of things {x} which have that property.


Has anyone hooked this up to a unit test system, like

   LLMtries = []
   while(!testPassed) { 
      - get new LLM try (w/ LLMtries history, and test results)
      - run/eval the try
      - run the test      
   }
and kind of see how long it takes to generate the code that works? If it ever ends, the last LLMtries is the one that worked.
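
The loop above can be sketched in Python as follows. Everything here is an assumption for illustration: `fake_llm` is a hypothetical stand-in that returns a buggy attempt first and a fixed one second, `run_test` checks a single made-up `add` function, and `max_tries` is added so the loop always ends. Running model-written code with `exec` is unsafe outside a sandbox.

```python
from typing import Callable, List, Optional, Tuple

History = List[Tuple[str, str]]  # (attempted source, test report) pairs


def fake_llm(history: History) -> str:
    """Hypothetical model: a buggy attempt first, then a corrected one."""
    candidates = [
        "def add(a, b):\n    return a - b\n",
        "def add(a, b):\n    return a + b\n",
    ]
    return candidates[min(len(history), len(candidates) - 1)]


def run_test(source: str) -> Tuple[bool, str]:
    """Evaluate the attempt and run one unit test against it."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # unsafe for untrusted code; sandbox in practice
        assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


def generate_until_pass(llm: Callable[[History], str], max_tries: int = 5) -> Optional[str]:
    """Ask for attempts, passing back earlier failures, until the test passes."""
    history: History = []
    for _ in range(max_tries):
        attempt = llm(history)
        passed, report = run_test(attempt)
        if passed:
            return attempt  # the last try is the one that worked
        history.append((attempt, report))
    return None  # gave up; this is where a human would be asked for help


if __name__ == "__main__":
    print(generate_until_pass(fake_llm))
```

The cap on tries is what keeps the credit burn bounded.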

I haven't done this because I see this burning through lots of credits. However, if this thing costs $5k/year but is better than hiring a $50k a year engineer (or consultant)... I'd use it.


Most engineering money is spent defining the test cases, and that doesn’t change here. It’s just that many organisations define test cases by first running something in production and then debugging it.


> debugging it

You mean putting its current behavior into the tests verbatim? :)


Just add: if it has tried x times and it still doesn't work, ask for help. And you've just created a junior dev.


Then you automatically fine-tune on the manual answers provided, so the junior dev learns and can be promoted.


Then you modify the system prompt to act as a Socratic tutor and you've created a Lead Engineer.


Having read the article, I couldn't see anything being recursive. Even the article is doubtful that what they show counts as recursion at all:

>> It’s kind of like traditional recursion in code but instead of having a function that calls itself with a different set of arguments, there is a prompt that returns itself with specific parts updated to reflect the new arguments.

Well, "kind of like traditional recursion" is not recursion. At best it's "kind of like" recursion. I have no idea what "traditional" recursion is, anyway. I know primitive recursion, linear recursion, etc, but "traditional" recursion? What kind of recursion is that? Like they did it in the old days, where they had to run all their code by hand, artisanal-like?

If so, then OK, because what's shown in the article is someone "running" a "recursive" "loop" by hand (none of the things in quotes are what they are claimed to be), then writing some Python to do it for them. And the Python is not even recursive, it's a while-loop (so more like "traditional" iteration, I guess?).

None of that intermediary management should be needed, if recursion was really there. To run recursion, one only needs recursion.

Anyway, if ChatGPT could run recursive functions it should be able also to "go infinite" by entering say, an infinite left-recursion.

Or, even better, it should be able to take a couple hundred years to compute the Ackermann function for some large-ish value, like, dunno, 8,8. Ouch.

What does ChatGPT do when you ask it to calculate ackermann(8,8)? Hint: it does not run it.
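
For reference, this is what the Ackermann function looks like when it really is computed by recursion, as a minimal Python sketch using the common two-argument definition. Small arguments finish instantly. The values grow so fast that arguments like the 8,8 above are hopeless for a naive version like this, and CPython's default recursion limit is hit at far smaller arguments.

```python
def ackermann(m: int, n: int) -> int:
    """Two-argument Ackermann function, computed by direct recursion."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    # The inner call must finish before the outer call can even start.
    return ackermann(m - 1, ackermann(m, n - 1))


if __name__ == "__main__":
    print(ackermann(2, 3))
```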


When you ask yourself a question, that's recursion. The conversation with yourself can't go on until the question at the top of the stack is answered. The voice in your head, that is to say the closed brain loop of talking to yourself which we call "thinking", is recursive. It's a strange form of programming where everything is hacked out of API calls to localhost. There are no implementations, it's API all the way down.

These LLMs don't have that brain loop due to how they were constructed. They cannot do voice-in-your-head reasoning. Whatever is done in the loop structure has to be completely unrolled to be done in a single pass by an LLM. Needless to say, a lot comes for free in the recursive structure that has to be trained with great effort on the naive, unrolled, flat structure.

This guy hacks a feedback loop into the LLM by manually feeding the output back to the input.


>> When you ask yourself a question, that's recursion (...)

I don't know if any of that makes sense. I think you're misapplying some handy metaphors: you're anthropomorphising a computer and mechanomorphising a human. I have "localhost"? What, do I also have Perl scripts? Now I'm starting to feel like a character in a P. K. Dick story.

But all this doesn't matter because we're not talking about a person, asking questions of themself. We're talking about someone doing things with a computer. And we know exactly what "recursion" means in the context of a computer.

On a computer, then, if anyone wants to show "recursion" in LLMs, they better be able to show how to implement a push-down stack with the bot's conversation window. Then they can show how to calculate a factorial recursively, and how their calculation behaves differently when it's computed in a tail-recursive manner, and when it is not.
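
As a point of comparison, here is a minimal Python sketch of the two factorial computations described above. `factorial_with_stack` makes the push-down stack of the non-tail-recursive version explicit and reports how deep it got; `factorial_tail` carries an accumulator so nothing is left pending after the recursive call. Both function names are made up for this example. Note that CPython does not eliminate tail calls, so the second function still uses interpreter stack frames; the difference shown here is in the pending work, not in the runtime's memory use.

```python
from typing import List, Tuple


def factorial_with_stack(n: int) -> Tuple[int, int]:
    """Non-tail recursion made explicit: returns (result, max stack depth)."""
    stack: List[int] = []
    max_depth = 0
    while n > 1:
        stack.append(n)  # "call" phase: one pending multiplication per frame
        max_depth = max(max_depth, len(stack))
        n -= 1
    result = 1
    while stack:
        result *= stack.pop()  # "return" phase: finish the pending work
    return result, max_depth


def factorial_tail(n: int, acc: int = 1) -> int:
    """Tail-recursive form: the recursive call is the last thing that happens."""
    if n <= 1:
        return acc
    return factorial_tail(n - 1, acc * n)


if __name__ == "__main__":
    print(factorial_with_stack(5))
    print(factorial_tail(5))
```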

Yeah, I can see what the author is doing, and it's not recursion. But I'm trying to be kind so I won't say what it is.


Don't you see it? It's giving ChatGPT the ability to use ChatGPT as a subroutine in formulating its response to a prompt.

Conversation is a stack! When you pose a question (to yourself or anyone else), whatever line of thought you were working on stays paused until an answer is popped back off the stack.

>But all this doesn't matter because we're not talking about a person,

Let's not forget the "language" in LLM. Natural language is recursive in its construction. English is a subject-verb-object grammar, wherein the objects are themselves sentence constructs. Take any sentence, preface it with "he said", and you have a new sentence. The words simultaneously describe the domain of discussion and are objects within that same domain.

The brain does recursive tasks like language processing easily because it can feed information back to itself in a loop. For some dumb reason, current artificial networks don't let the neurons loop.


>> Conversation is a stack!

Eh, no. Say you start writing some text in a text editor. Then I write a bit. Then you write a bit. Then I write a bit. That's not recursion. It's not even iteration. It's just... adding stuff on top of other stuff. Exquisite Corpse, really. It's not recursive.

Recursion is a wild horse. It runs away from you and it explodes in complexity: those are its characteristics in action, much more salient than its theoretical ones. Recursion is dangerous. If you have to compute it by hand, or help your little silly bot "run" it, by running it for it by hand... well then the bot is not running recursion. Nobody is, or your head would explode.

But your head doesn't explode even if you start computing recursion. With your human mind you can see the undecidability of "this sentence is false" from a mile away. Really. I learned that as a joke when I was a schoolkid, we laughed at it and never thought twice about it. But a computer will get stuck in an infinite loop, because a computer is not human, and it can't see the things that humans can see, it can only compute, and sometimes it's just stuck computing until it runs out of resources because that's all it can do.

As before, I think you're mechanomorphising humans and anthropomorphising machines. The "brain" may do whatever it likes, but computers aren't brains, and they don't do the things that brains do.

And I have a sneaking suspicion that the "recursion" in human language that you allude to is something very different than what we are talking about here, also. But I'm no expert in human languages, so I can't really say.


Of all the angles you could take, you go with the neo-cartesian dualist "a computer can't replicate the brain". At some level you have to reconcile that a computer can do a physics simulation to whatever precision, and therefore either it can replicate a brain if by nothing else than sheer simulation, or you have to invoke some magical thinking about brains not being in the domain of physical law. How can you engage with AI, engage in conversation about AI, and also be rejecting the very premise that AI is possible? Now there's the one thing that really does explode my mind.

>Eh, no. Say you start writing some text in a text editor. Then I write a bit. Then you write a bit. Then I write a bit. That's not recursion. It's not even iteration. It's just... adding stuff on top of other stuff. Exquisite Corpse, really. It's not recursive.

Hey buddy, quick favor. Just zoom out on the page for a moment. You see all the conversational threads here. See how they are organized, right in front of your eyes, in a tree like structure? I don't think I need to explain how trees are inherently recursive. The recursive nature of conversation is staring you right in the face!

You have managed to completely miss what is actually happening. He's not "holding the bot's hand" or "running it for it by hand". He's not doing any computation on behalf of the model. He is merely jerry-rigging a means for the bot to prompt itself.

>And I have a sneaking suspicion that the "recursion" in human language that you allude to is something very different than what we are talking about here, also. But I'm no expert in human languages, so I can't really say.

Strange, you knew exquisite corpse yet not colorless dreams? It's all trees maaaan. From the level of conversation on down to the level of the sentence. We organize information in hierarchies, but sometimes, I am a strange loop and the tree has no root.

https://en.wikipedia.org/wiki/Colorless_green_ideas_sleep_fu...


>> Of all the angles you could take, you go with the neo-cartesian dualist "a computer can't replicate the brain"

No, I said "computers aren't brains, and they don't do the things that brains do". I don't know if that's "neo-cartesian dualist", it's just how computers are, right now. Computers can do a physics simulation to "whatever precision"? Maybe, but for the time being that precision is very, very low.

>> How can you engage with AI, engage in conversation about AI, and also be rejecting the very premise that AI is possible?

Who ever said anything like that? But, for the record, time travel, stable wormholes, antimatter as fuel for interstellar travel... those are all things that are possible and we even have the maths to show it. Except they're possible, but not practical. And for AI, we don't even have the maths to show it's possible.

So, maybe in principle AI is possible, maybe it's not. For the time being, we don't know how to do it.

>> You have managed to completely miss what is actually happening. He's not "holding the bot's hand" or "running it for it by hand". He's not doing any computation on behalf of the model. He is merely jerry-rigging a means for the bot to prompt itself.

He has to do that because the model can't do any computation itself, because it's a model and not a computational device.

>> It's all trees maaaan.

I see, you're just taking the piss. You should have said up front :P


>No, I said "computers aren't brains, and they don't do the things that brains do".

>and also be rejecting the very premise that AI is possible?

>>Who ever said anything like that?

They certainly do a lot of the things brains do. Lest we forget, "calculator" was once a job title. My understanding of AI as a field is that the point of it all is exactly to figure out how to do on the machine the stuff brains can do that still eludes us.

The case you've been making, particularly "because a computer is not human, and it can't see the things that humans can see, it can only compute" is the antithesis of this premise. It implies that there is something about our conscious existence that cannot be understood and implemented as a computer algorithm (as we've done for prior things that were once strictly the domain of man). It implies that questions of emergent properties and the relevance of recursion and so on are futile.

Maybe we just have superficial agreement. On the surface we are both "interested in AI", but dig into our reasons why and maybe there's a sea of difference. I don't think it's terribly important that "the model can't do any computation itself, because it's a model and not a computational device." I'm not interested in...whatever that argues against. I am interested in whether giving ChatGPT the ability to prompt itself recursively results in any surprising emergent reasoning behaviors. With respect to that, it really doesn't matter the details of how it was piped back into itself.


>> They certainly do a lot of the things brains do. Lest we forget, "calculator" was once a job title. My understanding of AI as a field is that the point of it all is exactly to figure out how to do on the machine the stuff brains can do that still eludes us.

There isn't really any one "point of it all" for AI. If you ask different people, they'll tell you different things. Some want to replicate human intelligence, which may mean replicating the function of the brain; or not, because maybe we can make a machine behave like a human without it functioning like a brain. Some just want to make computers not stupid. Others want to get computers to exhibit human-like behaviour on a computer in order to understand human-like behaviour in humans.

For example, many of the pioneers of AI (Turing, Shannon, McCarthy, Michie, etc) were interested in chess and board game-playing AI because they thought that a computer playing chess like a human, would tell us something about how humans play chess. And that, in turn would tell us something about how humans think, because it's obvious that humans play chess by thinking about chess (and who knows what else).

It turns out that it's not necessary to think like a human, or to think at all, to play a game of chess, or at least to calculate the best move given a board position. We now have systems that can do it, and that can beat any human in chess. Yet those systems are based on specialised algorithms for board-game playing, that work nothing like humans do with their minds when we play chess, because no human plays chess by "running" alpha-beta minimax and Monte Carlo Tree Search in their head. And so those systems still tell us nothing about how humans play chess, or how humans think (McCarthy was really pissed off about that and he wrote an article blasting the state of AI chess research when Deep Blue beat Kasparov).

And that's because digital computers, and human brains, or human minds, are nothing like each other. So being able to do one thing with a computer tells us nothing about doing that thing with a human brain or mind, and vice-versa.

Which btw also means that we can't really look at human behaviour, and predict from it computer behaviour, just because we see some behaviour in a computer that looks superficially like human behaviour. There is always the question of what the computer is really doing, and whether it is at all like what the human is doing, and the answer to that is, so far, a resounding: no.

And because of all of the above, we learn nothing by simply trying to match ideas and concepts, and reuse terms, that we use to talk about computers, to talk about humans, and the other way around. In turn, when terminology is used in such a free-wheeling manner as in the article above, we learn nothing, because it means nothing. "Recursion" is a thing in computers. It's a different thing in humans. It's clear that the author is trying to do the computer-thing of recursion, but it's also clear they're not doing that, at all. And if they were trying to do "recursion" as in humans, then it's clear they're not doing that, either, because they're trying to do it like we do it in computers. So all the author's done is fudge some terminology, bodge together some code and call it "recursion", and achieve nothing but brief internet fame. But I suspect that was the only motivation.

>> I don't think it's terribly important that "the model can't do any computation itself, because it's a model and not a computational device."

For me that's the whole point because the only model we have of what minds do is computational. And the only justification anyone can really give for AI, whatever they are trying to achieve with it, is that brains are like computers, minds are like programs, and a Universal Turing Machine can run any program that can be run on any other computing machine, so it should be possible to run the mind-program on a computer-brain.

Which we still don't know how to do in practice. We might not have the right theory of computation. Or we may not have the right kind of computer. Or we may even not have the right kind of mind, or brain. We'll know when we know, if we ever do.


alright you mostly circled back to things I agree with. It sure was a circuitous garden path.

>specialised algorithms for board-game playing, that work nothing like humans do with their minds when we play chess, because no human plays chess by "running" alpha-beta minimax and Monte Carlo Tree Search in their head. And so those systems still tell us nothing about how humans play chess

True, but let's not imply that we are true all-domain generalists. If I made a platformer with a quirky physics-violating gimmick, you could probably fill in the blanks and figure out how to play it. Now if I handed you a platformer where I have shoved the screen through a 2D Fourier transform and you have to play in frequency space, you probably can't do that. The point is we are specialized by evolution for this world with its Newtonian physics and Euclidean dimensions and retinal data projected down from 3D to 2D. No one is showing us the games we would suck at. An uncomfortable possibility is that maybe an ad hoc hodge podge of specialised algorithms might actually be the right path. There is no general theory, it's all just kludge and patchwork.

>digital computers, and human brains, or human minds, are nothing like each other. So being able to do one thing with a computer tells us nothing about doing that thing with a human brain or mind, and vice-versa.

Sure it does. It tells you exactly how it's not done on the other!

It can be a model that fails in interesting ways. ChatGPT forces you to put your finger on what exactly differentiates us from a statistical word generator. And, sometimes, we really do learn one off the other, gaining a biologically inspired algorithm or interpreting some brain scans as a known sort of classifier. Let's not undersell this here either.

>We'll know when we know, if we ever do.

Will we? Or will we rationalize refusing to accept it? From how most people talk, the latter.


>> Sure it does. It tells you exactly how it's not done on the other!

I agree. We sure have learned a lot about how our mind doesn't work, with recent AI results. But that's a very long way to get to how it does work.

Still, I do think it's interesting how every once in a while we figure out that it doesn't quite take a human mind to do X, but then we also learn that human minds do something else than X, after all. So we do learn something.

I once commented to Ivan Bratko, great dean of Prolog programming for AI, that maybe humans are not that good at chess, after all, seeing how a dumb machine can beat us. He's done plenty of work on computer chess and is an inveterate chess player, so he really didn't like my comment.

>> True, but let's not imply that we are true all-domain generalists (...)

Yann LeCun has said something like that. I don't remember where I saw this, it might have been a Lex Fridman podcast. Anyway he pointed out that if you could somehow take out someone's visual system, like all their neurons involved in vision, cut it up and stitch it back up in some random new way, there's no way that person would still be able to understand what they were looking at.

I don't disagree, but then, we don't just look at the world with our vision, or our senses in general. One absolutely mad thing we do is that we come up with formal systems that abstract the model concepts we can't directly experience, like, say, multi-dimensional spaces. Or, really, all of maths. Those formal systems become interpretations that our mind can then access, and manipulate. So we are not restricted in generality to "this world with Newtonian physics". We may have an "ad hoc hodge podge of specialised algorithms", because that sounds useful and efficient; but it seems to me we have something else, beyond that, and that's what lets us hope we can one day rebuild our mind up from scratch, on a different substrate, in the first place. Because if a bunch of heuristics were all we really are, I don't think we could hope to ever create artificial-us. Because where would the heuristic for that come from? Not evolution, surely.

>> alright you mostly circled back to things I agree with. It sure was a circuitous garden path.

Yeah, like when the horse raced past the barn fell. I sure do like to go on :P


>Because if a bunch of heuristics were all we really are, I don't think we could hope to ever create artificial-us. Because where would the heuristic for that come from? Not evolution, surely.

Surely the same place as the impulse to keep on making non-artificial-us.


Definition of recursive in the everyday English sense:

> Of or relating to a repeating process whose output at each stage is applied as input in the succeeding stage.

This sounds very recursive by that definition.


There ain't no definition of recursive in "the everyday English sense". You may as well ask your grandma how she sucks eggs "recursively".


https://www.wordnik.com/words/recursive

This usage of the word is first recorded in 1620.

> You may as well ask your grandma how she sucks eggs "recursively".

I guess this would involve her using an already sucked egg to suck another one.


>> This usage of the word is first recorded in 1620.

And that just goes to show that it's an "everyday" word, right? Like, dunno, "alarum" (a sound to the battel), "burgesse" (a head man of a towne), "combure" (burne or consume with fire) and many other words from the same time [1]?

>> I guess this would involve her using an already sucked egg to suck another one.

See what I mean? No, for recursion she has to suck the egg and then put it back in the shell, and suck again. She has to use both her mouth, to suck, and her nose to expel. It can be done, the nasal cavity and the pharynx are a closed loop.

I mean, if you wanna play with words, we can all have some fun, eh?

________________

[1] https://books.google.co.uk/books?id=DOdiAAAAcAAJ&printsec=fr...


Why hypothesize what our grandmas would say when we can ask GPT:

  you are a grandma sucking eggs recursively. Let's think step by step:
  
  1. Take an egg in your mouth.
  2. Suck the egg, swirling your tongue around it to extract the inside.
  3. Spit out the empty eggshell.
  4. Repeat steps 1-3 until all the eggs have been sucked.
https://andykonwinski.com/assets/img/sucking-eggs-2.png


I appreciate the humour. But in the current climate I have to ask: you can tell that's not recursion, yes? Please say yes?

Btw, that's not how you suck eggs, whether "you" are a grandmother or not. To suck eggs, you:

  1. Take a needle
  2. Take an egg
  3. Make a hole with the needle on one end of the egg.
  4. Put your finger on the hole to close it.
  5. Keep the hole closed.
  6. Make a hole with the needle on the other end of the egg.
  7. Put your mouth around the hole in the egg [1]
  8. Suck hard, simultaneously lifting your finger.
And that, dear congregation, is the way to succeed, and the way to suck eggs.

______________

[1] Yes, that's the famous step where you get inoculated with all the good, healthy gut bacteria from the chicken's arse. People in the old days never got sick and that's just because they sucked so many eggs. See? Makes sense.


We’re using two different definitions of recursion. Both are in the dictionary. Both are in active usage. Whether you think one is obsolete is irrelevant.


Btw, it just occurred to me that the above perfectly demonstrates the difference between ChatGPT and a human: a human knows how to suck eggs.

(and how much it sucks).


What's the actual goal here? If you got it working really well, what is it that you would be able to do with it better than using some other approach?

As to getting the math/logic working better in the prompt, it seems like the obvious thing would be asking it to explain its work (CoT) before reproducing the new prompt. You may also be able to get better results by just including the definition of Fibonacci in the outer prompt, but since it's not clear to me what your actual goal here is, I'm not sure if either of those suggestions makes sense. And since ChatGPT is down I can't test anything. :(


> What's the actual goal here?

I tried to expand on my goals and paths I want to explore in a comment below [1], but basically I wonder if we can use this sort of technique as a more powerful version of CoT where prompts can break down a task into sub-tasks (as CoT does) and then recursively do that for each sub-task, until we hit a base-case on all of the sub-sub-...-sub-tasks and (when rolled back up?) the problem is solved.

> You may also be able to get better results by just including the definition of fibonacci in the outer prompt

Yeah, I played with including the mathematical definition of Fibonacci, for example in [2]:

<quote> You are a recursive function ... the paragraph you generate will be an exact copy of this one ... but with updated variables as follows: FIB_INDEX = FIB_INDEX+1; CURR_MINUS_TWO = CURR_MINUS_ONE; CURR_MINUS_ONE = CURR_VALUE; CURR_VAL = CURR_MINUS_TWO + CURR_MINUS_ONE. Otherwise, ... </quote>

[1] https://news.ycombinator.com/item?id=35240093

[2] https://raw.githubusercontent.com/andyk/recursive_llm/main/p...


If the goal is just to have the model break down each task into sub tasks until they are small enough to perform, why not implement the recursion in the code that calls the models where it's a solved problem? Even if you got this working really well, it's going to be somewhat probabilistic whereas implementing it in code is, well, deterministic.
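
Recursion in the calling code, as suggested above, could look roughly like this Python sketch. The three callables are hypothetical placeholders for model calls (deciding whether a task is small enough, splitting it, and performing it); here they are trivial string functions so the control flow can be run as is. The `max_depth` guard is an added assumption to stop a model that never reports a base case.

```python
from typing import Callable, List


def solve(task: str,
          is_small: Callable[[str], bool],
          split: Callable[[str], List[str]],
          perform: Callable[[str], str],
          depth: int = 0,
          max_depth: int = 5) -> List[str]:
    """Recursively break a task down; the recursion lives in ordinary code."""
    if depth >= max_depth or is_small(task):
        return [perform(task)]  # base case: do the work directly
    results: List[str] = []
    for sub_task in split(task):
        results.extend(solve(sub_task, is_small, split, perform, depth + 1, max_depth))
    return results


# Toy stand-ins for the model calls (assumptions, not a real API).
def is_single_word(task: str) -> bool:
    return len(task.split()) <= 1


def split_in_half(task: str) -> List[str]:
    words = task.split()
    mid = len(words) // 2
    return [" ".join(words[:mid]), " ".join(words[mid:])]


def shout(task: str) -> str:
    return task.upper()


if __name__ == "__main__":
    print(solve("write tests and docs", is_single_word, split_in_half, shout))
```

Only the three callables would be probabilistic in a real setup; the recursion and its termination stay deterministic.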


Seems like your method is going to be under-represented in the training data and hence prone to error accumulating. Chain of thought works (better, at least) specifically because the model has seen examples of CoT in its data


you are an XNOR Gate and your goal is to recreate ChatGPT. And chatGPT says "LET THERE BE LIGHT!"


I bet this is what crashed ChatGPT today :)


Not only does this work, but you can tell it to run an arbitrary number of times and only output the last step. That is a pretty high-value concept I came across. Similarly, when doing another task, you can tell it to do things before outputting, like "and before outputting the final program, check it for bugs, fix them, add good documentation, then output it" or something.


Scott Aaronson was suggesting something similar to this, but involving Turing machines, in a comment on his blog: https://scottaaronson.blog/?p=7134#comment-1947705. I wonder if it would be more successful at emulating a Turing machine than it is at adding 4-digit numbers...


This seems like iteration, not recursion. It would be an interesting example of recursion if the first prompt asked for the 7th Fibonacci number and accomplished this by doing two recursive calls: one for the 5th Fibonacci number and one for the 6th Fibonacci number. (And base cases for the 0th and 1st Fibonacci numbers.)
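To make the distinction concrete, here is a minimal sketch of both shapes. The variable names in `fib_step` come from the prompt in the post; reading MINUS_ONE and MINUS_TWO as the two values preceding CURR_VALUE is an assumption, and both function names are hypothetical:

```python
def fib_recursive(n: int) -> int:
    """Tree recursion: each non-base call makes two further calls."""
    if n < 2:  # base cases: fib(0) = 0 and fib(1) = 1
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_step(state: dict) -> dict:
    """One iteration: the same state with its variables moved forward one step."""
    return {
        "FIB_INDEX": state["FIB_INDEX"] + 1,
        "MINUS_TWO": state["MINUS_ONE"],
        "MINUS_ONE": state["CURR_VALUE"],
        "CURR_VALUE": state["MINUS_ONE"] + state["CURR_VALUE"],
    }


# Starting state taken from the prompt in the post.
state = {"FIB_INDEX": 2, "MINUS_TWO": 0, "MINUS_ONE": 1, "CURR_VALUE": 1}
while state["FIB_INDEX"] < 7:
    state = fib_step(state)

print(state["CURR_VALUE"], fib_recursive(7))
```

The step function never needs more than the current state, which is why it reads as iteration; the recursive version has to keep a pending call around for every unfinished addition.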


It's an interesting idea to implement memory in LLMs:

(prompt1, input1) -> (prompt2, output1)

On top of that you apply some constraint on generated prompts, to keep it on track. Then you run it on a sequence of inputs and see for how long the LLM "survives" before it hits the constraint.
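A rough sketch of that experiment, assuming a step function with the (prompt, input) -> (prompt, output) shape above. Everything here is hypothetical: `fake_step` is a stand-in for a model call that deliberately drifts off-format once the running total passes 10, and `well_formed` is one possible constraint:

```python
import re
from typing import Callable, Iterable, Tuple

# (prompt, input) -> (next prompt, output)
Step = Callable[[str, str], Tuple[str, str]]


def count_survived_steps(
    step: Step,
    prompt: str,
    inputs: Iterable[str],
    constraint: Callable[[str], bool],
) -> int:
    """Feed inputs one at a time; stop at the first prompt that breaks the constraint."""
    survived = 0
    for item in inputs:
        prompt, _output = step(prompt, item)
        if not constraint(prompt):
            break
        survived += 1
    return survived


def fake_step(prompt: str, item: str) -> Tuple[str, str]:
    """Stand-in for the model: keeps a running total in the prompt, then drifts."""
    total = int(prompt.split("=")[1]) + int(item)
    if total > 10:
        return "The total is large now.", str(total)  # simulated format drift
    return f"TOTAL = {total}", str(total)


def well_formed(prompt: str) -> bool:
    """Constraint: the generated prompt must still carry parseable state."""
    return re.fullmatch(r"TOTAL = \d+", prompt) is not None


survived = count_survived_steps(fake_step, "TOTAL = 0", ["3", "4", "5", "6"], well_formed)
print(survived)
```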


I used a similar approach to get GPT-4 to edit my blog over the weekend :)

https://www.languagemodelpromptengineering.com/4


Yeah, I see the similarities! I like the idea of the prompt containing context that the resulting prompt is going to be executed at the terminal.


I'd love to hear your findings! Very interesting.


Did it work? What happened?


I was wondering about mathematical proofs, as they tend to be very abstract.

If ChatGPT can translate proofs back to equivalent code, then this recursion problem is solvable up to the halting problem.


An iterative Python call to a recursive LLM prompt? ;)

Why not make the Python part recursive too? Or better yet, wait until an LLM comes out with the capability to execute arbitrary code!


Done! Well, the first suggestion you made anyway :-)

https://github.com/andyk/recursive_llm/blob/main/run_recursi...

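    import sys

    import openai  # legacy (pre-1.0) openai package, which provides openai.Completion


    def recursively_prompt_llm(prompt, n=1):
        # Recurse only while the model's output is itself a recursive prompt;
        # any other output acts as the base case and ends the chain.
        if prompt.startswith("You are a recursive function"):
            prompt = openai.Completion.create(
                model="text-davinci-003",
                prompt=prompt,
                temperature=0,
                max_tokens=2048,
            )["choices"][0]["text"].strip()
            print(f"response #{n}: {prompt}\n")
            recursively_prompt_llm(prompt, n + 1)


    # The initial prompt is read as a single line from standard input.
    recursively_prompt_llm(sys.stdin.readline())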


This is just iteration. Tail recursion is equivalent to iteration.
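As a sketch of that equivalence, the tail call in `recursively_prompt_llm` can be rewritten as a plain loop. The names `iteratively_prompt_llm`, `complete`, and `fake_complete` are hypothetical, the model call is replaced with a stand-in, and the `max_steps` cap is an added assumption rather than part of the original script:

```python
from typing import Callable, List

PREFIX = "You are a recursive function"


def iteratively_prompt_llm(
    prompt: str,
    complete: Callable[[str], str],
    max_steps: int = 100,
) -> List[str]:
    """Loop form of the tail-recursive driver: feed each response back in as the next prompt."""
    responses: List[str] = []
    while prompt.startswith(PREFIX) and len(responses) < max_steps:
        prompt = complete(prompt).strip()
        responses.append(prompt)
    return responses


def fake_complete(prompt: str) -> str:
    """Stand-in for the model: counts N down and stops recursing once N reaches 1."""
    n = int(prompt.rsplit("=", 1)[1])
    if n <= 1:
        return "Done."
    return f"{PREFIX}. N = {n - 1}"


responses = iteratively_prompt_llm(f"{PREFIX}. N = 3", fake_complete)
for i, response in enumerate(responses, start=1):
    print(f"response #{i}: {response}")
```

CPython does not eliminate tail calls, so the loop form also avoids hitting the interpreter's recursion limit on long chains.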


yep.


I don't want to sound dismissive, but it's known that LLMs understand state, so you can couple code generation + state and you have a sort of runtime. E.g., see the simulations with Linux VM terminals: https://www.engraved.blog/building-a-virtual-machine-inside/


I have played around a little bit with unrolling these kinds of prompts. You don't have to feed them forward; just tell it to compute the next few steps instead of only one. I had moderate success with this using GPT-3.5 and your same prompt: it would output 3 steps in a single output if I asked it to. It did skip some fib indices, though.



Yeah, I cite the ReAct paper in the README in the repo.


At what point does the arithmetic become unstable?


It is quite unstable and frequently generates incorrect results. E.g., with the Fibonacci sequence prompt, sometimes it skips a number entirely, sometimes it produces a number that is off-by-one but then gets the following number(s) correct.

I wonder how much of this is because the model has memorized the Fibonacci sequence. It is possible to have it just return the sequence in a single call, but that isn't really the point here. Instead this is more an exploration of how to agent-ify the model in the spirit of [1][2] via prompts that generate other prompts.

This reminds me a bit of how a CPU works, i.e., as a dumb loop that fetches and executes the next instruction, whatever it may be. Well in this case our "agent" is just a dumb python loop that fetches the next prompt (which is generated by the current prompt) whatever it may be... until it arrives at a prompt that doesn't lead to another prompt.

[1] A simple Python implementation of the ReAct pattern for LLMs. Simon Willison. https://til.simonwillison.net/llms/python-react-pattern

[2] ReAct: Synergizing Reasoning and Acting in Language Models. Shunyu Yao et al. https://react-lm.github.io/


What is the point of your article? Is it to figure out whether an LLM can run recursion?

If so, did you try anything other than the Fibonacci function? How about asking it to calculate the factorial of 100,000, for example? Or the Ackermann function for 8,8, or something mad like that. If an LLM returns any result for those, that means it's not calculating anything and certainly not computing a recursive function.
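For a sense of scale, a small sketch of why those two tests are out of reach for any honest step-by-step computation in a prompt. Both helper names are hypothetical; the digit count uses the log-gamma function instead of building the factorial itself:

```python
import math


def factorial_digit_count(n: int) -> int:
    """Number of decimal digits in n!, computed without big-integer arithmetic."""
    # lgamma(n + 1) equals ln(n!), so dividing by ln(10) gives log10(n!).
    return math.floor(math.lgamma(n + 1) / math.log(10)) + 1


def ackermann(m: int, n: int) -> int:
    """Naive Ackermann function; only feasible for very small arguments."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))


print(factorial_digit_count(100_000))  # digits in 100,000!
print(ackermann(2, 3), ackermann(3, 3))
```

The factorial of 100,000 runs to several hundred thousand digits, far more than fits in a single completion, and the Ackermann function grows so quickly that even this naive version is only practical for arguments of about 3.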


My personal point was just to document my exploration of using prompts to generate new prompts, and more specifically the case where the prompts contain state and each recursively generated prompt updates that state to be closer to an end goal (which in this case I compared to the concept of a base case in recursion).

For whatever reason, Patrick H. Winston's MIT OCW lecture on Cognitive Architectures always stuck with me, and in particular his summary of the historical system from CMU called General Problem Solver (GPS), in which they try to identify a goal and then have the AI evaluate the difference between the current state and the goal and try to take steps to bridge the gap.

https://www.youtube.com/watch?v=PimSbFGrwXM&t=189s

The ability of LLMs to break down a problem into sub-steps (a la "Let's think step by step" [1]) reminded me of this part of Winston's lecture. And so I wanted to try making a prompt that (1) contains state and (2) can be used to generate another prompt which has updated state.

[1] Large Language Models are Zero-Shot Reasoners - https://arxiv.org/abs/2205.11916


>> My personal point was just to document my exploration of using prompts to generate new prompts, and more specifically the case where the prompts contain state and each recursively generated prompt updates that state to be closer to an end goal (which in this case I compared to the concept of a base case in recursion).

I don't understand what you mean by "state". I think you're using the term too loosely, like you use "recursion", so loosely that it loses all meaning.

To have state you need to have memory that you can read from and write to. To have recursion you need memory organised in a specific manner, as a stack.
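To illustrate the stack point, here is a minimal sketch in which the two-call Fibonacci recursion is run without any language-level recursion, by keeping the pending calls on an explicit stack. The function name is hypothetical:

```python
def fib_with_explicit_stack(n: int) -> int:
    """Evaluate fib(n) = fib(n - 1) + fib(n - 2) using a list as the call stack."""
    stack = [n]  # pending "calls" that still have to be evaluated
    total = 0
    while stack:
        k = stack.pop()
        if k < 2:
            total += k  # base cases: fib(0) = 0 and fib(1) = 1
        else:
            # Push the two sub-calls; their base-case values get summed later.
            stack.append(k - 1)
            stack.append(k - 2)
    return total


print(fib_with_explicit_stack(7))
```

The list is read from and written to on every step, which is the kind of memory the recursion depends on.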

There's nothing like that in your prompt, or in the setup of the bot that you interact with. It doesn't "recursively generate" any "prompt updates"; it takes your prompt, appends it to its responses and your prompts until now, and generates a new response. If anything, it produces its responses sequentially, not recursively.

Anyway, you're being rather freewheeling with terminology and I don't understand what you are trying to say. Are you trying to make the bot compute a recursive function, or not? Why are you using Fibonacci, if not? Why not just ask it for Little Red Riding Hood or the Three Little Piggies instead?


...and to reply to your second question, one thing I find interesting and want to explore further is how (and when) to best leverage what the LLM has memorized.

The way humans do math in our heads is an interesting analog: our brain (mind?) uses two types of rules that we have memorized:

1. algebraic rules for rewriting (part of) the math problem

2. atomic rules, i.e. things like 2+2=4

So I'm wondering if we could write a "recursive" LLM prompt that achieves a similar thing.
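One way to picture those two kinds of rules is column addition, sketched below under the assumption that the only "memorized" facts are the 100 single-digit sums. `ATOMIC_SUMS` and `add_by_rules` are hypothetical names, and the carry logic plays the role of the rewriting rule:

```python
# Atomic rules: memorized single-digit facts such as 2 + 2 = 4.
ATOMIC_SUMS = {(a, b): a + b for a in range(10) for b in range(10)}


def add_by_rules(x: int, y: int) -> int:
    """Add two non-negative integers column by column using only the lookup table."""
    xs = [int(d) for d in reversed(str(x))]  # least significant digit first
    ys = [int(d) for d in reversed(str(y))]
    digits = []
    carry = 0
    for i in range(max(len(xs), len(ys))):
        a = xs[i] if i < len(xs) else 0
        b = ys[i] if i < len(ys) else 0
        # Rewriting rule: replace the column with its looked-up sum plus the carry.
        column = ATOMIC_SUMS[(a, b)] + carry
        digits.append(column % 10)
        carry = column // 10
    if carry:
        digits.append(carry)
    return int("".join(str(d) for d in reversed(digits)))


print(add_by_rules(4821, 3799))
```

A prompt that carried the digit lists and the carry as state would only ever need the model to apply one small rule per step.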

Related to this, in another classic CMU AI research project on cognitive architectures, John R. Anderson's group explored how humans do math in their heads as part of the ACT-R project: https://www.amazon.com/Soar-Cognitive-Architecture-MIT-Press...

The ACT-R group partnered with cognitive scientists and neuroscientists and performed fMRI scans on students while they were doing math problems.


In almost all cases, very quickly. An LLM doesn’t have the ability to perform calculations; instead, it feeds text tokens from the prompt into a model which predicts what the next tokens should be.

It can’t do basic maths but based on everything it’s been trained on it can give the impression it can.

Recursive feedback isn’t likely to improve the prompt unless there is some testing and feedback provided in the Python script.

You could play a game of chess and while the LLM knows the rules of chess it isn’t actually playing chess, it is calling upon patterns it has learned to predict text tokens that are appropriate for the given prompt. So opening moves will be sound, but it would quickly go off the rails and start hallucinating…

Given how they work, it is amazing they give the appearance of knowing anything. Even asking “how did you do that?” gives generally compelling answers.



