
If we find a new training technique that is that much more efficient, why do you think we won't just increase the amount of training we do by n times? (Or even more, since it would now be accessible for smaller businesses to train custom models.)



We might, but it’s also plausible that it would change the ecosystem so much that centralized models are no longer so prominent. For example, suppose that with much cheaper training, most training happens on your specific data and behaviors, so that you have a model (or ensemble of models) tailored to your own needs. You still need a foundation model, but those are much smaller (small enough to run on device), so even with overparameterization and distillation, the training costs are orders of magnitude smaller.
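
(For concreteness, here's a rough sketch of what I mean by distillation: a small student model trained to imitate a large teacher's soft predictions while still fitting the labels. The model sizes, temperature, and loss weighting below are made up purely to illustrate the idea, not anyone's actual recipe.)

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Toy stand-ins: a "large" teacher (foundation model) and a small
    # on-device student. Sizes are arbitrary, for illustration only.
    teacher = nn.Sequential(nn.Linear(128, 1024), nn.ReLU(), nn.Linear(1024, 10))
    student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

    optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
    T = 2.0      # temperature for softening the logits
    alpha = 0.5  # mix between distillation loss and hard-label loss

    def distill_step(x, y):
        # Teacher provides soft targets; no gradients flow through it.
        with torch.no_grad():
            teacher_logits = teacher(x)
        student_logits = student(x)

        # KL divergence between temperature-softened distributions.
        soft_loss = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)
        # Ordinary cross-entropy against the ground-truth labels.
        hard_loss = F.cross_entropy(student_logits, y)

        loss = alpha * soft_loss + (1 - alpha) * hard_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

    # One step on a random batch, just to show the call pattern.
    x = torch.randn(32, 128)
    y = torch.randint(0, 10, (32,))
    print(distill_step(x, y))

The point of the sketch is just that the student is a fraction of the teacher's size, which is where the on-device and cheap-inference story comes from.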

Or, in the small business case (mind you, “long term” for tech reaching small businesses is looooong), these businesses again need much smaller models because a) they don’t need a model well-versed in Shakespeare and multivariable calculus, and b) they want inference to be as cheap as possible.

These are just scenarios off the top of my head. The broader point is that a dramatic drop in training cost is a wildcard whose effects are really hard to predict.



