Hacker News new | past | comments | ask | show | jobs | submit login

It's a valid concern that ChatGPT could be hallucinating this, but there are a few things that strongly suggest this isn't what is happening in this case:

1. ChatGPT reliably produces the same output across several different instances, and other people have independently found identical versions of this text with different prompts.[1][2] This typically wouldn't happen with a hallucination, which may change with each time the model is prompted.

2. The instructions accurately describe the capabilities and restrictions of ChatGPT's function calls. When making custom GPTs, the browser tool and DALL-E image generation tool are options, and the "system prompt" given by the custom GPTs reflect whichever tools you've selected.

3. ChatGPT has reliably changed with the changes noticed in the "system prompt". I document a recent change made to the prompt on my post from yesterday.[3] On the older instances of ChatGPT I have, the model will suggest it has no idea what the "guardian_tool" is or make any attempt at stopping a discussion of U.S. elections, while newly made instances of ChatGPT will discuss the "guardian_tool". The "system prompt" repeated then must give at least some sense of what updates are being made under the hood.

I certainly think we should still take the idea that this is the "system prompt" with a grain of salt. There is a good discussion about this here: https://news.ycombinator.com/item?id=37879077

[1] https://www.reddit.com/r/ChatGPT/comments/18494zo/what_are_t...

[2] https://medium.com/@dan_43009/what-we-can-learn-from-openai-...

[3] https://dmicz.github.io/machine-learning/chatgpt-election-up...




You can also play around in the playground or with the API with your own system prompts and see that you can jailbreak them to report the prompt.


In (1) you imply that hallucinations are strictly due to nondeterminism in GPT computation. A hallucination happens (IIUC) because of the numeric imprecision, model regularization and various thresholds. In short, the hallucinations can be reliably reproducible (but, they can also happen due to non-deterministic computation).


There is empirical research showing that hallucinations in LLM tend to vary enormously from one answer to the next compared to accurate answers.

Of course, it could be that GPT-4 has been instructed to lie about its prompt, but failing that, you should expect any answer that stays the same across multiple wordings and prompting methods to be accurate.


That's mostly intuitive.

An accurate answer is often driven by a concrete and highly confident fact in the training dataset (e.g. structured data fact, like a birth date from Wikipedia etc.).

The hallucinations are derived facts of (hopefully) low confidence. Nondeterminism is more common if you have low scores. Only a few facts can take high score (in a usable system), while many can take a low score -- then numeric instability can make a mess.

I'm not very familiar with LLMs, but I do have experience with the traditional ML models and content understanding production system. But, LLMs are not far from them.


I'm not sure if I'm reiterating what you're saying, but isn't an element of it that the model doesn't have the storage capacity to literally contain all the information it's trained with, and so it's necessarily extrapolating from it's representation of knowledge, which is going to sometimes miss the mark?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: