>So, what is your point and forest? That there exists some non-deterministic accurate LLM with a 100% correct training set which will always produce the same answer if you ask it 2x2 (where asking is somehow different from prompting)?
As I already said: modern LLMs mainly map a single input idea to a single output idea. They might express it in slightly different ways, but the result is either correct or incorrect, and you can't easily turn an incorrect result into a correct one by rerolling. If you spend any time with Gemini/o3/Claude, you understand this from first-hand experience. If you know what current RL algorithms do, you understand why this happens.
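To make the rerolling point concrete, here's a toy sketch (made-up numbers, not a real model): if a model puts most of its probability mass on one wrong answer, sampling it again mostly reproduces that same wrong answer, just in different wording.

```python
import random
from collections import Counter

# Hypothetical answer distribution for a question the model has learned wrong:
# most of the probability mass sits on a single incorrect answer.
answer_probs = {"wrong answer": 0.9, "correct answer": 0.1}

def reroll(n: int) -> Counter:
    """Sample the toy 'model' n times, the way temperature sampling re-draws an answer."""
    draws = random.choices(list(answer_probs), weights=list(answer_probs.values()), k=n)
    return Counter(draws)

print(reroll(20))  # e.g. Counter({'wrong answer': 18, 'correct answer': 2})
# Rerolling varies the surface form, not the underlying idea: a confidently
# wrong model stays wrong on almost every retry.
```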
An ideal LLM would learn a one-to-many correspondence and generalize better, and even that wouldn't be a problem as long as the answer is correct, because correctness and determinism are orthogonal to each other.
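Here's a minimal sketch of what I mean by orthogonal (both functions are hypothetical stand-ins, not LLMs): one is non-deterministic yet always correct, the other is perfectly deterministic yet always wrong.

```python
import random

def nondeterministic_but_correct(question: str) -> str:
    """Picks a different surface form on each call, but the content is always right."""
    assert question == "2*2"
    return random.choice(["4", "four", "2*2 = 4", "The answer is 4."])

def deterministic_but_wrong(question: str) -> str:
    """Returns the exact same string on every call, and it is always wrong."""
    return "2*2 = 5"

for _ in range(3):
    print(nondeterministic_but_correct("2*2"))  # wording varies, correctness doesn't
print(deterministic_but_wrong("2*2"))           # fully reproducible, still wrong
```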
Here's what you started with: "The point about non-determinism is moot if you understand how it works."
When challenged, you're now quite literally saying "oh yeah, they are all non-deterministic, will produce varying results, it's impossible to control the outcome, and there's some ideal non-existent LLM that will not have these issues".
I feel like we're walking in circles, and I'm not sure if you're doing this on purpose...
>Here's what you started with: "The point about non-determinism is moot if you understand how it works."
Yes, the author's point about non-determinism is moot because of the conclusion he draws from LLMs being non-deterministic: "what works now may not work even 1 minute from now". This is largely untrue, because determinism and correctness are orthogonal to each other. It's silly to read that as a claim that "LLMs are deterministic".
>When challenged you're now quite literally saying "oh yeah, they are all non-deterministic, will produce varying results, it's impossible to control the outcome, and there's some ideal non-existent LLM that will not have these issues"
That's not what I'm saying. Better LLMs would be even less deterministic than current ones, but even that would not be a problem.
>So what's your point and forest again?
Point: "Determinism and correctness are orthogonal to each other".
Forest: there's much more going on in LLMs than statistical closeness. At the same time, you can totally say that it's due to statistical closeness and not be wrong, of course.
I think (especially in the current offerings) non-determinism and incorrectness are so tightly intertwined that it's hard to say where one ends and the other begins, which makes the problem worse and more intricate.