I think, in addition to all the benchmarks used right now for LLM evaluation (HumanEval and the like), it would be interesting to have a 'hallucination benchmark' built on a summarization-based hallucination dataset — something like the sketch below.
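A minimal sketch of how such a benchmark could score a model: check whether each summary sentence is entailed by the source document using an off-the-shelf NLI model, and report the fraction of unsupported sentences. The model name, threshold-free label check, and `hallucination_rate` helper are my assumptions for illustration, not an existing benchmark.

```python
# Sketch: summarization-based hallucination scoring via NLI entailment.
# Assumption: an MNLI-style model (labels CONTRADICTION / NEUTRAL / ENTAILMENT)
# is a reasonable proxy for "is this summary sentence supported by the source?"
from transformers import pipeline

nli = pipeline("text-classification", model="microsoft/deberta-large-mnli")

def hallucination_rate(source: str, summary_sentences: list[str]) -> float:
    """Fraction of summary sentences NOT entailed by the source document."""
    unsupported = 0
    for sentence in summary_sentences:
        # NLI convention: the source is the premise, the summary sentence the hypothesis.
        result = nli([{"text": source, "text_pair": sentence}])[0]
        if result["label"] != "ENTAILMENT":
            unsupported += 1
    return unsupported / max(len(summary_sentences), 1)

if __name__ == "__main__":
    doc = "The company reported revenue of $2 billion in 2022."
    summary = ["Revenue was $2 billion in 2022.", "Profits doubled in 2022."]
    print(f"Hallucination rate: {hallucination_rate(doc, summary):.2f}")
```

Averaging this rate over a corpus of (document, model-generated summary) pairs would give one crude hallucination score; a real benchmark would of course need human-verified labels rather than trusting the NLI model blindly.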
I'm surprised that the Times continues to use the word "hallucinate" rather than the more accurate "confabulate." Alas, that ship appears to have sailed.
"Yes, it would be more accurate to say that AI models, especially language models like GPT-4, confabulate rather than hallucinate. Confabulation refers to the generation of plausible-sounding but potentially inaccurate or fabricated information, which is a common characteristic of AI language models when they produce responses based on limited or incomplete knowledge. This term better captures the nature of AI outputs as it emphasizes the creation of coherent, yet possibly incorrect, information rather than suggesting the experience of sensory perceptions in the absence of external stimuli, as hallucination implies."