
> (I didn't have control over temperature settings.)

That's...interesting. You'd think they'd dial the temperature to 0 for you before the demo at least. Regardless, if the tech is good, I'd hope all the answers are at least decent and you could roll with it. If not....then maybe it needs to stay in R&D.





Reducing temperature to 0 doesn't make LLMs deterministic. There are still other sources of nondeterminism, such as floating-point results depending on the order in which mathematically associative operations are performed.
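A minimal illustration of the floating-point point (toy example, not from any inference engine): the same mathematical sum, grouped differently, rounds differently.

```python
# Floating-point addition is not associative: at 1e16 the spacing
# between adjacent doubles is 2.0, so adding 1.0 can round away.
a, b, c = 1e16, 1.0, 1.0

left = (a + b) + c   # each +1.0 rounds back to 1e16, so both are lost
right = a + (b + c)  # +2.0 is exactly representable, so it survives

print(left == right)  # False
print(left, right)    # 1e+16 1.0000000000000002e+16
```

If different threads or kernels group the same operations differently from run to run, the bits of the result can differ from run to run too.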

I keep reading this, but I don't get it: for the same input, shouldn't the order of the resulting operations be deterministic too?

It gets more complicated with things like batch processing. Depending on where in the batch your query gets placed, how the underlying hardware works, and how the software stack was implemented, you might get small differences that compound over many token generations. (vLLM, a popular inference engine, has this problem as well.)
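A toy sketch of how that plays out (illustrative only, not vLLM code): reducing the same set of float values in a different order, as batching or kernel scheduling can cause, produces a different result.

```python
# Same multiset of values, two reduction orders.
vals = [1e16, 1.0, -1e16, 1.0]

in_order = 0.0
for v in vals:
    in_order += v  # 1e16 + 1.0 rounds back to 1e16, so one 1.0 is lost

reordered = 0.0
for v in [1e16, -1e16, 1.0, 1.0]:  # same values, different order
    reordered += v

print(in_order, reordered)  # 1.0 2.0
```

Per token the difference is tiny, but once it flips which token is sampled, every subsequent token diverges.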

Not necessarily. This is a good blog post from a few days ago about it: https://thinkingmachines.ai/blog/defeating-nondeterminism-in...

Fantastic article, thanks!


The associative property of multiplication breaks down with floating-point math because of rounding error. If the engine is multithreaded, it's easy to see how the ordering of the multiplications can change, which can change the output.


