
Yes, it is still wrong even if you just think "probability", not "p-value".

For people who don't believe me, spin up your LLM of choice via the API in your favourite language[1] and make some query with the temperature set to zero. You will find that if you repeat the query multiple times you always get the same response. That is because it always gives you the highest-weighted result from the transformer output, whereas with a non-zero temperature (or the default chat frontend) it takes a weighted random sample.
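
A minimal sketch of that experiment, assuming the OpenAI Python client (openai>=1.0); the model name and prompt are just placeholders:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask(prompt: str) -> str:
        # temperature=0 makes the model pick the highest-probability token
        # at each step instead of sampling from the distribution
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        return resp.choices[0].message.content

    answers = {ask("Name three prime numbers between 10 and 20.") for _ in range(5)}
    print(len(answers))  # expect 1: every repeat returns the identical text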

So there is no probabilistic variance between responses with temperature set to zero for a given model, but you will nonetheless find that you can get the LLM to hallucinate. One way I've found to get LLMs to frequently hallucinate is to ask the difference between two concepts that are actually the same (e.g. Gemini gave me a very convincing-looking but totally wrong "explanation" of the difference between a linear map and a linear transformation in linear algebra [2]).
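
Using the same ask() helper from the sketch above (the exact prompt wording here is mine, not necessarily what I originally typed):

    # The two terms are synonyms, so any confident "difference" in the reply
    # is fabricated, even though decoding is fully deterministic here.
    print(ask("In linear algebra, what is the difference between a "
              "linear map and a linear transformation?"))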

Therefore the probabilistic nature of a normal LLM response cannot be the reason for hallucination, because when we turn that randomness off we still get hallucinations.

The real reason that LLMs hallucinate is more mundane and yet more fundamental: hallucinating (in the normal sense of the word) is actually all that LLMs do. This is what Karpathy is talking about when he says that LLMs "dream documents". We just specifically call it "hallucination" when the results are somehow undesirable, typically because they don't correspond with some particular facts we would like the model's output to be grounded in.

But LLMs don't have any sort of model of the world; they have weights which are a lossy compression of their raw training data, so in response to some prompt they give the output that instruction fine-tuning has taught them minimizes whatever loss function was used for that fine-tuning process. That's all. When we use words like "hallucination" we are in danger of anthropomorphising the model and using our own reasoning process to try to reverse-engineer how the model actually works.

[1] You need to use the programming API rather than the usual web frontend to set the temperature parameter.

[2] For the curious, it more or less said that for one of them (I forget which) you could move the origin, which turned it into an affine transformation, but it mangled the underlying maths further. The evidence has fallen out of my Gemini history so I can't share it, but that sort of approach has been fruitful in the past. Neither ChatGPT nor Claude falls for that specific example, fwiw.

While I like to call out people for using "hallucinate" for this kind of behavior too (for language models, at least; it might actually be appropriate for visual models?)-

> One way I've found to get LLMs to frequently hallucinate is to ask the difference between two concepts that are actually the same

-this only confirms my belief that "bullshitting" is an appropriate term to use for this behavior: doesn't exactly the same thing happen with (not savvy enough) human students?

You call it "anthropomorphizing", and "not having a model of the world", but isn't it more like forcing a model of the world on the student / language model by the way that you frame the question?

(Interestingly, there might be a parallel here with the article: with the language model not being a real student, but a statistical average over all students, including being "with one breast and one testicle".)


Yes, actually, I think you're right.


