It really bugs me when people post these threads without saying whether they used GPT-3 or GPT-4. And if it's academic, I sort of suspect it's not GPT-4, unless they are paying for subscriptions for all of their students.
Both 3.5 and 4 hallucinated according to the professor:
> Most used 3.5. A few used 4 and those essays also had false info. I don't think they used any browsing plug-ins but it's possible--it was a take-home assignment and not one they did in class.
The public at large can be forgiven for not assuming a half-version bump is the difference between factually incorrect output and correct output. If OpenAI actually had confidence in that distinction, they'd label one model "sorta true sometimes" and the other "mostly true, but only for certain things, depending on how convincing you need it to be".
Users won't care what's true or not; blind trust in LLM output is the real issue.
It makes a difference... a big difference.