
How do you know what it's telling you is correct and not (as the OpenAI paper has labeled it) "hallucinating" answers?

Over Christmas, I used ChatGPT to model a statically typed language I was working on. It was helpful, but a lot of what it gave me was wrong in a deceptive way. It's tough to describe what I mean by "wrong" here: nothing it spat out was 100% blatantly incorrect; instead it was subtly incorrect and gave inconsistent evaluations/overviews.

Not knowing much about type theory myself, I wouldn't actually be able to evaluate how good the information I got out of it was. I'd be hesitant to take anything ChatGPT gave me at face value, or to feel confident speaking precisely about a given topic.

Did you run into similar problems? And if so, how'd you overcome them?




Not the person you asked, but chiming in. Two things:

First, GPT-4 is far more capable than 3.5 when it comes to not hallucinating. The 'house style' of the responses is, of course, very similar, which can initially make the difference between the two models seem smaller than it really is. Since I don't have API access yet, I do a lot of my initial exploration in 3.5-Legacy, and once I've narrowed things down a bit, I have GPT-4 review and correct 3.5's final answers. It works very well.
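
(For anyone who does have API access, the same draft-then-review loop is easy to script. Here's a rough sketch using the OpenAI Python client; the prompts and model names are purely illustrative:)

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask(model, prompt):
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    # Cheap, fast exploration pass with the weaker model...
    question = "Explain structural vs. nominal typing."
    draft = ask("gpt-3.5-turbo", question)

    # ...then have the stronger model review and correct the draft.
    review = ask(
        "gpt-4",
        "Review this answer for errors and correct anything wrong.\n\n"
        f"Question: {question}\n\nDraft answer: {draft}",
    )
    print(review)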

Second, and this is more of a meta comment: how people use ChatGPT really exposes whether sanity checks are a regular part of their problem-solving technique. The fact that all of the LLMs are confidently incorrect at times doesn't slow me down much at all, and sometimes their errors actually help me think of new directions to explore, even if they're 'wrong' on first take.

Conversely, several of my friends find the error rate to be an impediment to how they work. They're good at what they do; they're just more used to taking things as correct and running with them for a longer stretch before checking whether what they're doing makes sense.

I do think that people who are put off by this significantly underestimate how often humans are confidently incorrect as well. There's a reason 'trust, but verify' is such a common saying.


Third, you just glue a vector search onto it and stuff it with a bunch of textbooks and documentation.
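
(For anyone unfamiliar, that's retrieval-augmented generation in a nutshell: embed your reference texts, embed the question, pull the nearest chunks, and paste them into the prompt so the model answers from your documents rather than from memory. A toy Python sketch of the idea; the embed() here is a bag-of-words stand-in for a real embedding model, and the example chunks are made up:)

    import math
    from collections import Counter

    def embed(text):
        # Toy bag-of-words "embedding"; a real setup would use an
        # embedding model and a proper vector store instead.
        counts = Counter(text.lower().split())
        norm = math.sqrt(sum(c * c for c in counts.values()))
        return {w: c / norm for w, c in counts.items()}

    def similarity(a, b):
        # Cosine similarity between two sparse vectors.
        return sum(v * b.get(w, 0.0) for w, v in a.items())

    # "Stuff it with a bunch of textbooks and documentation":
    # chunk the reference material and embed each chunk once.
    chunks = [
        "A sum type holds a value from exactly one of several variants.",
        "A product type (record/tuple) holds all of its fields at once.",
        "Hindley-Milner inference finds principal types without annotations.",
    ]
    index = [(chunk, embed(chunk)) for chunk in chunks]

    def retrieve(question, k=2):
        q = embed(question)
        ranked = sorted(index, key=lambda item: similarity(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]

    question = "What is a sum type?"
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    # prompt then goes to the model; its answer is now grounded in your docs.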



