Hacker News

1B tokens for Gemini Flash (which in my experience is on par with llama3-70b, or sometimes better) at a 2:1 input:output ratio would cost ~$600 (ignoring the fact that they now offer 1M free tokens a day). Even ignoring electricity, you'd break even in >8 years. You can find llama3-70b hosted at roughly the same prices if you want that specific model.
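A back-of-the-envelope sketch of the numbers above. The per-token prices and the GPU cost here are illustrative assumptions picked to land near the comment's ~$600 and >8-year figures, not quoted rates:

```python
# Break-even sketch: API spend on 1B tokens vs. buying a local GPU.
# All prices below are hypothetical assumptions, not published rates.

input_tokens = 2 / 3 * 1e9   # 2:1 input:output split of 1B tokens
output_tokens = 1 / 3 * 1e9

# Assumed per-million-token API prices (input is typically cheaper).
price_in_per_m = 0.35
price_out_per_m = 1.05

api_cost = (input_tokens / 1e6) * price_in_per_m \
         + (output_tokens / 1e6) * price_out_per_m
# api_cost ≈ $583 for the full 1B tokens

gpu_cost = 5000  # hypothetical price of hardware able to run llama3-70b

# Assuming you burn ~1B tokens per year, years until the GPU pays off:
years_to_break_even = gpu_cost / api_cost
# ≈ 8.6 years, matching the ">8 years" estimate above
```

The break-even point is very sensitive to usage: at 10B tokens/year the same hardware pays for itself in under a year.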



I addressed the financial side in another reply, but another factor is that I need to know the model today is exactly the same as tomorrow for reliable scientific benchmarking.

I need to tell whether a change I made was impactful, but if the model magically gets smarter or dumber at my tasks with no warning, I can't tell whether I made an improvement or a regression.

Whereas the model on my GPU doesn't change unless I change it. That's one less variable, and LLMs are black boxes to start with.
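One way to make that guarantee mechanical is to fingerprint the local weights before each benchmark run, so a score change can only come from your own changes, never a silent model swap. A minimal sketch (the weights path is a hypothetical placeholder):

```python
# Fingerprint a local model weights file so benchmark runs can assert
# they are evaluating byte-for-byte the same model as last time.
import hashlib

def weights_fingerprint(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of the weights file; identical hash => identical model."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so multi-GB weight files don't blow up RAM.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Record the hash alongside each benchmark result, e.g.:
#   run_log = {"weights_sha256": weights_fingerprint("llama3-70b.gguf"),
#              "score": ...}
# If two runs log different hashes, the comparison is invalid.
```

A hosted API gives you no equivalent check: the model name can stay the same while the weights behind it change.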

I may be wrong about Gemini, but my impression is that all the companies are constantly tweaking their big models. I know GPT on a Monday is not always the same GPT on Thursday, for example.



