The challenge here is that ChatGPT and other LLMs can only think out loud. They only "think" through writing, and that's always displayed to the user.
Has anyone tried giving LLMs a scratchpad where the model could e.g. run the pipeline in order, generate the poem, and then explicitly publish it to the user without showing the earlier steps?
They have! The ReAct[1] pattern, which is available in LangChain[2], can be quite powerful, especially when given access to search tools.
The user just sees the "Final Answer" / Finish response from the chain's execution, even if several tool calls and model invocations were required along the way.
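A minimal sketch of what that looks like in early-2023 LangChain (API names may have moved since; the SerpAPI tool and the specific agent string are assumptions about your setup):

    # ReAct agent sketch (LangChain ~0.0.x-era API).
    from langchain.agents import initialize_agent, load_tools
    from langchain.llms import OpenAI

    llm = OpenAI(temperature=0)
    tools = load_tools(["serpapi"], llm=llm)  # assumes a SerpAPI key is configured

    # "zero-shot-react-description" runs the Thought/Action/Observation loop internally.
    agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=False)

    # Only the chain's "Final Answer" comes back; the intermediate tool calls
    # stay hidden unless you set verbose=True.
    print(agent.run("Who won the 2022 World Cup, and write a haiku about it?"))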
One approach here would be prompt injection: just insert the 'No' into your own response so ChatGPT tries completing that. Also:
> I speculate that the temperature, when coupled with the mechanism of generating text based on already-generated text, could explain some cases of ChatGPT stupidity. In cases when ChatGPT should be perfectly accurate, the temperature will surely under-optimize its cleverness, and now the entire conversation is broken, because everything else will depend on what foolishness it just wrote.
Absolutely. This is why 'best-of' sampling (not available in ChatGPT's default interface) can be so useful. You decode many different possibilities in parallel; the ones where random decoding makes a fatal error get discarded, and you get back the most plausible one overall, which is much more likely to be correct.
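For what it's worth, the legacy Completions API exposes this directly — best_of samples candidates server-side and returns the one with the highest log probability per token (a sketch using the pre-1.0 openai SDK):

    # Best-of sampling via the legacy OpenAI Completions endpoint.
    import openai

    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt="Q: What is 17 * 24?\nA:",
        temperature=0.7,
        max_tokens=32,
        best_of=8,  # decode 8 candidates server-side...
        n=1,        # ...but return only the most likely one
    )
    print(resp.choices[0].text.strip())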
yes, hopefully I'll write it up soon. TL;DR: I used this on top of GPT-3.5 to generate a magazine page of personalized recommendations: three sets of a title, a paragraph, AI art, a font name, and a rationale. For images, I use SD 2.1 via stability.ai. Be sure to add "5400 dpi digital art" at the front of your prompt :)
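Rough shape of the pipeline, in case it's useful before the write-up (a sketch; the JSON field names and the prompt wording are my guesses, not the actual prompts):

    # GPT-3.5 emits structured recommendations; each image prompt gets the
    # "5400 dpi digital art" prefix before going to SD 2.1.
    import json
    import openai

    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": ("Generate 3 personalized recommendations as a JSON list. "
                        "Each item needs: title, paragraph, image_prompt, "
                        "font_name, rationale. Output bare JSON only."),
        }],
    )
    items = json.loads(resp.choices[0].message.content)  # assumes bare JSON back

    for item in items:
        sd_prompt = "5400 dpi digital art, " + item["image_prompt"]
        # ...send sd_prompt to SD 2.1 via the stability.ai API here...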
I'm a bit embarrassed to, since "real" research finetunes internal models to play a particular role, rather than orchestrating several "conversations" and hoping your prompt gets you the right output format 100% of the time, etc.
Here's a woefully lacking diagram of this user/interpreter/LLM flow for a cohesive longform story generator. [1]
The coolest part of this design pattern you've ID'd is that you can always add one more character/conversation for the interpreter to orchestrate.
e.g. a DB character whose role is to take a new page as input and output an updated DB, where the DB holds all the important facts to sustain over a story. That let me scale to 16+ "pages".
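A sketch of that orchestration loop (the prompts are illustrative, not the ones I actually used, and chat() stands in for whatever completion call you have):

    # Interpreter loop with a "DB" character: after each page, a separate
    # conversation distills the page into a fact store, which is fed into
    # the next page's prompt.
    import openai

    def chat(system, user):
        resp = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "system", "content": system},
                      {"role": "user", "content": user}],
        )
        return resp.choices[0].message.content

    db = "No facts yet."
    pages = []
    for i in range(16):
        page = chat("You are a novelist. Continue the story.",
                    f"Known facts:\n{db}\n\nWrite page {i + 1}.")
        pages.append(page)
        # The DB character's only job: merge the new page into the fact store.
        db = chat("You maintain a terse database of story facts.",
                  f"Current DB:\n{db}\n\nNew page:\n{page}\n\nOutput the updated DB.")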
You can ask GPT what the result of executing a Python program would be, even when a multi-step calculation is needed. It will readily output the result, with no thinking aloud.
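For instance, paste in something like this (a toy example) and ask "what does it print?" — it will often just state the output directly:

    # A small multi-step calculation: running sum of squares from 1 to n.
    def f(n):
        total = 0
        for i in range(1, n + 1):
            total += i * i
        return total

    print(f(10))  # 385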
Yes. And if you give it a database schema, it can answer free-form questions about the data by generating SQL queries, so long as you wire up the results (or just manually copy/paste them). It does sometimes hallucinate fields in tables - but if your wiring reports errors in a readable way, it will usually self-correct.
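The wiring can be a very short loop. A sketch with an in-memory SQLite table (the prompts and the retry-on-error convention are my own, not anything standard):

    # Schema-grounded Q&A: the model writes SQL, we run it, and execution
    # errors are fed back so it can self-correct.
    import sqlite3
    import openai

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
    SCHEMA = "orders(id INTEGER, customer TEXT, total REAL)"

    def ask(question, max_retries=3):
        prompt = (f"Schema: {SCHEMA}\n"
                  f"Write one SQLite query answering: {question}\nSQL only.")
        for _ in range(max_retries):
            resp = openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=[{"role": "user", "content": prompt}],
            )
            sql = resp.choices[0].message.content.strip()
            try:
                return conn.execute(sql).fetchall()
            except sqlite3.Error as e:
                # Readable error report -> the model usually fixes its own query.
                prompt += (f"\nYour query:\n{sql}\nfailed with: {e}\n"
                           "Try again. SQL only.")
        raise RuntimeError("model never produced valid SQL")

    print(ask("What is the average order total per customer?"))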
I think the most interesting potential development of this concept would be to give it the ability to spawn child instances to process subtasks (such that each subtask gets its own token window!) and produce intermediate results that it would then combine. It can be done manually (copy/paste) with a lot of handholding; the trick is to come up with a way to automate it, such that it's clear which part of the output is a request to spawn a submodel + its prompt, and the result is also communicated in some way that's clear to the model.
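One way to make that protocol unambiguous is a sentinel the orchestrator scans for. A sketch (the <<SPAWN: ...>> marker convention and all prompts are entirely my invention):

    # Orchestrator: the parent model emits <<SPAWN: prompt>> markers; each one
    # becomes a child call with a fresh token window, and the parent then gets
    # a second pass to combine the children's results.
    import re
    import openai

    SPAWN = re.compile(r"<<SPAWN:\s*(.*?)>>", re.DOTALL)

    def complete(prompt):
        resp = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    def run(task):
        out = complete(task)
        requests = SPAWN.findall(out)
        if not requests:
            return out
        # Each subtask runs in its own fresh context window.
        results = "\n\n".join(complete(req) for req in requests)
        return complete(f"Original task: {task}\n\nSubtask results:\n{results}\n\n"
                        "Combine these into a final answer.")

    print(run("Summarize two long documents. Emit one <<SPAWN: ...>> request "
              "per document, then I'll give you the results to combine."))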
The amount of compute applied to the problem is roughly linear in the number of input+output tokens. It is hard to predict how much of that compute goes into parsing the problem and building its embedding representation, and how much goes into actually solving it.
And anyway, probably most of the compute is used to judge the social standing of the person asking the question. And if it is worth bothering to answer it ;)
I guided it to write a program for me, which it did correctly, and then I asked it to evaluate the program on different numeric inputs. It got correct answers for small numbers and the first few positions of map(thefunction,[1,2,3,4,5,6,7,8,9]) before wandering off into bad guesses.