I've noticed that if I give ChatGPT an algorithm, it gives me consistent results when it shows its work. But if I ask it not to show its work and just give me the answer, it consistently gives me incorrect answers, even when I ask it to follow an algorithm that I gave it.
I suppose this is similar to humans and probably why my school teachers always told me to show my work, but I'm curious if this has been documented and if there are any explanations for why it works this way with LLMs.
By their very nature they only "know" what they have written down and must infer the final answer from that, token by token.
They fundamentally can't do certain things, such as complex iteration or backtracking.
When you ask for chain-of-thought reasoning, you allow the LLM to create a "buffer space" and break the task down into more manageable substeps, thereby improving the quality of the results.
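A minimal sketch of what that looks like in practice (the question and the prompt wording are made up for illustration, not taken from any particular API):

    question = "A train leaves at 3:40 pm and the trip takes 2 h 35 min. When does it arrive?"

    # Asking for the bare answer: the model has to commit to a result in its
    # first few output tokens, with nothing written down to condition on.
    direct_prompt = f"{question}\nAnswer with only the final time."

    # Asking for chain of thought: every intermediate step the model writes
    # becomes part of the context later tokens are conditioned on -- the
    # "buffer space" mentioned above.
    cot_prompt = f"{question}\nThink step by step, then state the final time."

    print(direct_prompt)
    print(cot_prompt)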
The Bing LM, or rather the service, did have "inner monologue" in the sense of text that it would generate, but not show to the user, and treat as "thoughts" to guide the generation of an actual reply that the user would see.
We know this because it happily told us, including the JSON format it uses internally.
No, but the reconstructed examples have "im_start" and "im_end", which strongly implies that it is, if not verbatim, then a close enough restatement of the real deal. Take a look:
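For reference, "im_start" and "im_end" are the delimiters of OpenAI's ChatML message format. A generic transcript in that style (the roles and content below are invented, not the reconstructed Bing prompt) can be built like this:

    # Rough sketch of ChatML-style delimiters; the text here is made up.
    turns = [
        ("system", "You are the assistant. Keep an inner monologue the user never sees."),
        ("user", "What's the weather like tomorrow?"),
    ]
    transcript = "".join(
        f"<|im_start|>{role}\n{content}<|im_end|>\n" for role, content in turns
    ) + "<|im_start|>assistant\n"
    print(transcript)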
First you wrap the user query with "The user asked you: <user prompt>. What are the reasoning steps you need?" and then you prompt with "Considering <previous answer>, now answer <user prompt>".
Obviously this is hackable, so it would need improvements.
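A sketch of that two-pass scheme, with complete() standing in for whatever model call you use (the exact prompt wording is just a guess at the idea above):

    def complete(prompt: str) -> str:
        ...  # stand-in for a call to the model

    def two_pass_answer(user_prompt: str) -> str:
        # Pass 1: ask only for the reasoning steps, not the answer.
        plan = complete(
            f"The user asked you: {user_prompt}. What are the reasoning steps you need?"
        )
        # Pass 2: feed the plan back in and ask for the final answer.
        return complete(f"Considering {plan}, now answer {user_prompt}")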
GPT is autoregressive. That means each output token becomes part of the new input sequence. Which is to say, the beginning of the model’s answer becomes part of your prompt.
If the model makes some mistake in the beginning, it now needs to explain / make sense of that mistake.
Kind of like a split-brain patient whom you ask why they got up, and they then say, to get a Coke. [1] In psychology, that is called confabulation. In machine learning, they use "hallucination", probably so they can use the term across several disciplines, like language, audio, vision, etc.
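To make the autoregressive point concrete, here is a toy greedy decoding loop; next_token() stands in for one forward pass of the model:

    def next_token(context: list[str]) -> str:
        ...  # stand-in for the model predicting one token given the context

    def generate(prompt_tokens: list[str], max_new: int = 50) -> list[str]:
        context = list(prompt_tokens)
        for _ in range(max_new):
            tok = next_token(context)  # conditioned on everything so far,
            context.append(tok)        # including the model's own earlier output
            if tok == "<eos>":
                break
        return context[len(prompt_tokens):]

An early wrong token stays in the context, so every later token is generated as if that mistake were part of the prompt.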
The psychology of split-brain patients is a nice gateway to 'The Bicameral Mind', a major theme in the series Westworld about how the AIs' minds work. Nice!
I just watched a video where the guy touches on GPT-4's limitations, one of which is simple math. He asks it an order-of-operations question and it outputs the correct answer, but only if it does it step by step. It then apologizes and says its original incorrect answer was "a typo."
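The video's exact question isn't shown, but any expression where operator precedence matters reproduces the pattern; for example:

    # Illustrative only (not the question from the video).
    # Step by step: 4 ** 2 = 16, then 3 * 16 = 48, then 2 + 48 = 50.
    # Evaluating strictly left to right would give ((2 + 3) * 4) ** 2 = 400 instead.
    print(2 + 3 * 4 ** 2)  # 50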