"Scaling" is going to eventually apply to the ability to run more and higher fidelity simulations such that AI can run experiments and gather data about the world as fast and as accurately as possible. Pre-training is mostly dead. The corresponding compute spend will be orders of magnitude higher.
That's true. I expect more inference-time scaling, and hybrid inference/training-time scaling once there's continual learning, rather than scaling model size or pretraining compute.
Simulation scaling will be the most insane, though. Simulating "everything" at the quantum level is impossible, and the vast majority of new learning won't require anything near that. But answers to the hardest questions will require getting as close to it as possible, so it will be tried. Millions upon millions of times. It's hard to imagine.
I don't think so. Serious attempts at producing data specifically for training haven't been made yet. I mean high-quality data produced by anarcho-capitalists, not by corporations like Scale AI using workers, governed by the laws of a nation, etc.
Don't underestimate the determination of a million young people to produce perfect data within 24 hours to train a model to vacuum their house, if it means they never have to do it themselves again, and maybe to earn a little money on the side by creating the data.