
The premise of function calling is great, but in my experience (at least on GPT-3.5, haven't tried it with GPT-4 yet) it seems to generate wildly different, and less useful, results for the same prompt.
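
For anyone unfamiliar, a function-calling request looks roughly like this (a minimal sketch in Python with the pre-1.0 openai package; the get_weather function and its schema are made up for illustration):

    import json
    import openai

    # Hypothetical schema; the model decides whether to "call" the
    # function and generates the arguments as a JSON string.
    functions = [{
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }]

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",
        messages=[{"role": "user", "content": "What's the weather in Paris?"}],
        functions=functions,
        function_call="auto",
    )

    message = response.choices[0].message
    if message.get("function_call"):
        args = json.loads(message.function_call.arguments)
        print(message.function_call.name, args)

The variability shows up in those model-generated arguments: the same prompt can produce different argument JSON (or skip the function call entirely) from run to run.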


GPT-3.5 is pretty much useless for reliable NLP work unless you give it a VERY tightly prescribed task.

That's really the major breakthrough of GPT-4, in my mind, and the reason we are absolutely going to see an explosion of AI-boosted productivity over the next few years, even if foundation LLM advancements stopped cold right now. A vast ocean of mundane white collar work is waiting to be automated.


You can set the temperature parameter to 0 and get the same output each time for the same input.
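
Concretely (a sketch in Python with the pre-1.0 openai package; the model name and prompt are placeholders):

    import openai

    # temperature=0 makes decoding (close to) greedy, so in principle
    # the same input should yield the same completion every time.
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Summarize: ..."}],
        temperature=0,
    )
    print(response.choices[0].message.content)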


In my experience (with GPT-4 at least), a temperature of 0 does not result in deterministic output. It's more consistent but outputs do still vary for the same input. I feel like temperature is a bit more like "how creative should the model be?"


One theory is that it's caused by GPT-4's Sparse MoE (Mixture of Experts) architecture [1]:

> The GPT-4 API is hosted with a backend that does batched inference. Although some of the randomness may be explained by other factors, the vast majority of non-determinism in the API is explainable by its Sparse MoE architecture failing to enforce per-sequence determinism.

[1] https://152334h.github.io/blog/non-determinism-in-gpt-4/


I should probably re-test it, but I think it wasn't the temperature. The results were unusually useless.
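
An easy way to re-test (a sketch, same assumptions as the snippets above; the prompt is a placeholder): hit the API a handful of times at temperature 0 with an identical prompt and count the distinct outputs.

    import openai

    PROMPT = [{"role": "user", "content": "Extract the dates from: ..."}]

    outputs = set()
    for _ in range(5):
        r = openai.ChatCompletion.create(
            model="gpt-4",
            messages=PROMPT,
            temperature=0,
        )
        outputs.add(r.choices[0].message.content)

    # At temperature 0 you'd expect exactly one unique output; the MoE
    # theory above predicts you'll sometimes see more than one.
    print(len(outputs), "unique output(s)")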



