I answered the financial thinking in another reply, but another factor is that for reliable scientific benchmarking I need to know the model I test today is exactly the same one I'll test tomorrow.

I need to be able to tell whether a change I made was impactful, but if the model just magically gets smarter or dumber at my tasks with no warning, then I can’t tell if I made an improvement or a regression.

Whereas the model on my GPU doesn’t change unless I change it. So it’s one less variable, and LLMs are black boxes to start with.

I may be wrong about Gemini, but my impression is that all the companies are constantly tweaking the big models. I know GPT on Monday is not always the same GPT on Thursday, for example.