
How do you know what it's telling you is correct and not (as the OpenAI paper has labeled it) "hallucinating" answers?

Over Christmas, I used ChatGPT to model a statically typed language I was working on. It was helpful, but a lot of what it gave me was wrong in a deceptive way. It's tough to describe what I mean by "wrong" here: nothing it spat out was 100% blatantly incorrect; instead it was subtly incorrect and gave inconsistent evaluations/overviews.

Not knowing much about type theory myself, I wouldn't actually be able to evaluate how good the information I got out of it was. I'd be hesitant to take anything ChatGPT gave me at face value, or to feel confident speaking precisely about a given topic.

Did you run into similar problems? And if so, how'd you overcome them?




Not the person you asked, but chiming in. Two things:

First, GPT-4 is far more capable than 3.5 when it comes to not hallucinating. The 'house style' of the responses is, of course, very similar, which can initially make the difference between the two models seem smaller than it really is. Since I don't have API access yet, I do a lot of my initial exploration in 3.5-Legacy, and once I've narrowed things down a bit, I have GPT-4 review and correct 3.5's final answers. It works very well.
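
(For anyone who does have API access, the same draft-then-review loop is easy to script. Here's a rough sketch using the OpenAI Python client; the prompts and model names are purely illustrative:)

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask(model, prompt):
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    # Cheap, fast exploration pass with the weaker model...
    question = "Explain structural vs. nominal typing."
    draft = ask("gpt-3.5-turbo", question)

    # ...then have the stronger model review and correct the draft.
    review = ask(
        "gpt-4",
        "Review this answer for errors and correct anything wrong.\n\n"
        f"Question: {question}\n\nDraft answer: {draft}",
    )
    print(review)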

Second, and this is more of a meta comment: how people use ChatGPT really exposes whether sanity checks are a regular part of their problem-solving technique. The fact that all of the LLMs are confidently incorrect at times doesn't slow me down much at all, and sometimes their errors actually help me think of new directions to explore, even if they're 'wrong' on first take.

Conversely, several of my friends find the error rate to be an impediment to how they work. They're good at what they do; they're just more used to taking things as correct and running with them for a longer stretch before checking whether what they're doing makes sense.

I do think that people who are put off by this significantly underestimate how often humans are confidently incorrect as well. There's a reason 'trust, but verify' is such a common saying.


Third, you just glue a vector search onto it and stuff it with a bunch of textbooks and documentation.
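
(For anyone unfamiliar, that's retrieval-augmented generation in a nutshell: embed your reference texts, embed the question, pull the nearest chunks, and paste them into the prompt so the model answers from your documents rather than from memory. A toy Python sketch of the idea; the embed() here is a bag-of-words stand-in for a real embedding model, and the example chunks are made up:)

    import math
    from collections import Counter

    def embed(text):
        # Toy bag-of-words "embedding"; a real setup would use an
        # embedding model and a proper vector store instead.
        counts = Counter(text.lower().split())
        norm = math.sqrt(sum(c * c for c in counts.values()))
        return {w: c / norm for w, c in counts.items()}

    def similarity(a, b):
        # Cosine similarity between two sparse vectors.
        return sum(v * b.get(w, 0.0) for w, v in a.items())

    # "Stuff it with a bunch of textbooks and documentation":
    # chunk the reference material and embed each chunk once.
    chunks = [
        "A sum type holds a value from exactly one of several variants.",
        "A product type (record/tuple) holds all of its fields at once.",
        "Hindley-Milner inference finds principal types without annotations.",
    ]
    index = [(chunk, embed(chunk)) for chunk in chunks]

    def retrieve(question, k=2):
        q = embed(question)
        ranked = sorted(index, key=lambda item: similarity(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]

    question = "What is a sum type?"
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    # prompt then goes to the model; its answer is now grounded in your docs.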



