All the time, but only when prompted. You have to have a conversation with it and provide more detail that exposes the flaws in its previous answers; then it will happily apologize for its mistakes. (For me, this usually looks like me pasting an error message that its code caused.)
I really hope they find a way to have it apply context from future conversations such that when it learns the error of its ways it emails you a retraction, but that's probably a ways out because humans can't be trusted to not weaponize such a feature into sending spam.
But it doesn't learn from its error; that's the whole problem.
It only responds to 'accusations' from the user in the most statistically common way, which looks like an apology.
The weight of phrases like "you are wrong" is in fact so strong that it fools ChatGPT into apologizing for its 'mistakes' even in scenarios where its text was obviously correct, like telling it that 2+2 doesn't equal 4.
Well yeah, it's an imperfect tool, and you have to treat it as such. Probably there's a lot to be discovered about how to use it most effectively. I just don't find that it's more problematic than the other tools in my box.
Sure, grep has never flat-out lied to me the way ChatGPT does, but it's a statistical model, not a co-worker, so I don't feel betrayed; I just feel... cautioned. It keeps you on your toes, which isn't such a bad state to be in.
It totally would, if Bing doesn't return relevant results.
I've asked BingGPT about myself and it gave me three answers. One was more or less on point (it found my LinkedIn profile), and the other two were hallucinations. What happened was that Bing found two unrelated pages and GPT tried and failed to make sense of them.
Either that, or I am a prince whose name means "goose" in Polish.
Problematically, they're much better bullshitters than ChatGPT. And if you used Google to find them, they're probably either selling you something, or you had to navigate a minefield of people who are, just to find them.
We can downvote human comments and proposed solutions (on Stack Overflow, HN, etc.), and I also don't expect colleagues to lie to me when I ask them about a feature or how to do xyz in a language, library, or framework.
Bing, IIRC, has a way to provide feedback; I'm not sure how useful it is for today's users, or whether it will be able to solve hallucinations one day.
I try to always give Bing+ChatGPT chat or search results a thumbs up or a thumbs down. I am using the service for free, so it seems fair for me to take a moment to provide feedback.
When Google sends me to a website, I can at least judge the credibility of that website.
When ChatGPT tells me something, I have no idea if it's paraphrasing information gathered from Encyclopedia Britannica, or from a hollow-earther forum.
> When ChatGPT tells me something, I have no idea if it's paraphrasing information gathered from Encyclopedia Britannica, or from a hollow-earther forum.
Or it's something it just hallucinated out of thin air.
This is a real question, so I apologize if it comes off as sophistry:
Is the work of judging the accuracy of a summary not just the work of comprehending the non-summarized field?
For example, a summary could be completely correct and cite its facts exhaustively. Say you're asking about available operating systems: it tells you a bunch of true info about Windows and OSX, but doesn't mention the existence of Linux. Without familiarity with the territory, wouldn't verifying the factuality of each reference still leave you with an incomplete picture?
At a slightly more practical level, do you actually save any time if you've gotta fully verify the sources? I assume you're doing more than just making sure the link doesn't 404, as citing a link that doesn't say what it is made out to be isn't exactly a new problem, but at that point we're mighty close to the traditional experience of running through a SERP.
Finally, even if you're reading all the links in detail, isn't that still a situation prone to automation bias? There are a lot of examples of cases where humans are supposed to check machine output, but if it's usually good enough, the checkers fall into a pattern of trusting it too much and skipping the work. Maybe I'm just lazy, but I think I'd eventually get less gung-ho about verifying sources and do myself a mischief.
I'm asking because I've been underwhelmed by my own attempts at using LMs for search tasks, so maybe I'm doing it wrong.
The average human is going to give me the wrong answer to a question I ask him.
But I'm generally not interested in asking an average human. I'm interested in asking someone who knows their butt from a hole in the ground in whichever topic I'm asking them about.
Humans are actually quite reliable; Wikipedia is that trust made manifest. Also, a human liar knows they are lying, while an AI doesn't know it's saying something wrong.
What I've found is that until you see it really hallucinate like mad on a subject you know well, you don't realize how crazy it can be.
Especially when I talk to it about fiction and ask questions about, for example, a specific story, and see it invent whole quotes and characters and so on... it is a masterful bullshitter.
Citations! I never trust Bing Chat's answer. The links usually quickly tell you if the answer is hallucinated. Basically: treat it as a search engine, not an answer engine. Follow the links like you would on any other search engine. Those links will still be more relevant.
It happily made up citations for me. In a follow-up, I asked it not to, and to please use only real papers. It apologized, said it would not do it again, and then in the same reply made up another non-existent but plausible citation.
Checking the links is a good practice.
I feel like we just created an interesting novel problem in the world. Looking forward to seeing how this plays out.
Are you talking about Bing Chat, which cites actual web pages it used to make the summary, or ChatGPT, which is a very different beast and relies on built-in knowledge rather than searches?
That was a problem for ChatGPT3, not so much for ChatGPT4. I've also switched to ChatGPT4 for most of my searches; I only use Google now as a shortcut for navigating to a specific website.
The hallucination problem is easily solved by using it as a code/config template or starter, and actually vetting its output. It's still a huge time-saver, even with the vetting time involved.
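As a minimal sketch of that "generate, then vet" workflow (the config keys and values here are hypothetical, not from any real project), the idea is just to parse whatever the model drafted and sanity-check it before anything depends on it:

    import json

    # Hypothetical LLM-drafted config; keys and values are made up for illustration.
    llm_draft = '{"retries": 3, "timeout_seconds": 30, "endpoint": "https://api.example.com/v1"}'

    REQUIRED_KEYS = {"retries", "timeout_seconds", "endpoint"}

    config = json.loads(llm_draft)  # fails loudly if the draft isn't valid JSON
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"LLM-drafted config is missing keys: {missing}")
    if not isinstance(config["retries"], int) or config["retries"] < 0:
        raise ValueError("retries must be a non-negative integer")

    print("Config passed basic vetting:", config)

The point is just that the model's draft never goes straight into use; every field still gets checked against what you actually expect.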
I’m blown away by the competence of the language model, but its willingness to make up facts makes me leery.