Whisper hallucinations flagged by study on medical transcription errors (apnews.com)
3 points by instagraham 1 day ago | 3 comments

> The prevalence of such hallucinations has led experts, advocates and former OpenAI employees to call for the federal government to consider AI regulations. At minimum, they said, OpenAI needs to address the flaw.

I don't see why there should be regulations against models that produce hallucinations. If a model has a high error rate, then it should not be used for critical tasks. Any user of the model should know that it is not error-free.


I wish Whisper got 1/1000th of the love from OpenAI that ChatGPT does. It really is genuinely useful (when it works, which is usually).

This is the source article for the claims about Whisper, but it's not very clear about how exactly Whisper was hallucinating in medical transcripts, apart from a couple of examples. Without a description of the actual audio samples, you can't tell what's a transcription error (to be expected) and what's a hallucination.

It's also unclear what counts as a 'hallucination' here. Is it just mis-transcribing unclear audio, or, like ChatGPT, is it inserting random or targeted nonsense into otherwise legible stretches of conversation?

Misheard sentences are definitely not something you want in medical transcription, but they're also an understandable problem that humans aren't immune to either (I'm sure QC in human-led medical transcription works hard to avoid this).

But for everyone who uses Whisper, yeah, there's some due diligence you're reasonably expected to do: if the audio is bad or unclear in places, you should be cross-checking the recording to make sure the transcription is accurate.

However, that may not be possible in the case of tools using Nabla, according to the article:

> It’s impossible to compare Nabla’s AI-generated transcript to the original recording because Nabla’s tool erases the original audio for “data safety reasons,” Raison said.

Say this is bad design, and fixable. My real concern is whether Whisper is outright hallucinating in random places or just mis-transcribing tough audio. The former makes the tool completely unusable, as you'd have to cross-check every word. The latter means you just keep up your regular diligence (a rough sketch of what that might look like is below).

> Researchers aren’t certain why Whisper and similar tools hallucinate, but software developers said the fabrications tend to occur amid pauses, background sounds or music playing.
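
To make that "regular diligence" a bit concrete: if you're running the open-source whisper Python package yourself (I can't speak for the hosted API or for Nabla's wrapper), each segment in the result carries rough confidence signals like avg_logprob and no_speech_prob, so you can flag the shaky spans for a human to replay against the audio. A minimal sketch, with an illustrative threshold and a made-up file name:

    import whisper

    model = whisper.load_model("base")
    result = model.transcribe("visit_recording.wav")  # hypothetical recording

    for seg in result["segments"]:
        # Segments the model itself is unsure about are the ones worth
        # replaying against the original audio before trusting the text.
        shaky = seg["no_speech_prob"] > 0.5 or seg["avg_logprob"] < -1.0
        tag = "REVIEW" if shaky else "ok"
        print(f"[{tag}] {seg['start']:7.2f}-{seg['end']:7.2f} {seg['text'].strip()}")

Which, of course, only helps with the mis-transcription case; a confidently fabricated sentence won't necessarily look shaky, which is exactly why the random-hallucination scenario is so much worse.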



