
This is with Together's API via OpenRouter, running DeepSeek V3 0324 and Kimi K2 0905.

I didn't set a top-k. So it seems like Together must be doing something weird in their speculative decoding implementation.





Oh, in that case there is definitely a top-k or top-p behind the scenes; it might just not be exposed to the user as a param they can change through the API. I haven't heard of anyone running an LLM in prod with actual pure sampling.

I see. That's slightly unfortunate. In principle, increasing temperature flattens out the distribution but the ordering between different tokens' probabilities remains the same, so setting a top-k shouldn't break my test. Can't say the same for top-p, though. And all of this is probably too deep into the provider's implementation details for me to make assumptions about.
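
For what it's worth, here's a minimal sketch of that reasoning (plain NumPy, made-up logits, not any provider's actual sampler): dividing the logits by the temperature is a monotonic transform, so the top-k set never changes, but the top-p (nucleus) set can grow because the flattened distribution needs a longer prefix of tokens to reach the same cumulative mass.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def top_k_set(probs, k):
        # indices of the k highest-probability tokens
        return {int(i) for i in np.argsort(probs)[::-1][:k]}

    def top_p_set(probs, p):
        # smallest prefix of tokens (by descending probability) whose mass reaches p
        order = np.argsort(probs)[::-1]
        cutoff = int(np.searchsorted(np.cumsum(probs[order]), p)) + 1
        return {int(i) for i in order[:cutoff]}

    logits = np.array([4.0, 3.0, 2.0, 1.0, 0.0])  # hypothetical logits for 5 tokens

    for T in (1.0, 2.0):
        probs = softmax(logits / T)
        print(f"T={T}: top-2 {top_k_set(probs, 2)}, p=0.9 nucleus {top_p_set(probs, 0.9)}")

    # The top-2 set is {0, 1} at both temperatures (ordering is preserved),
    # but the 0.9 nucleus grows from 3 tokens at T=1.0 to 4 tokens at T=2.0.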


