That makes no sense. Inference costs dwarf training costs pretty quickly if you have a successful product. Afaik there is no commodity hardware that can run state of the art models like chatgpt-o1.
> Afaik there is no commodity hardware that can run state of the art models like chatgpt-o1.
Stack enough commodity GPUs and they can run o1. Building a chip for LLM inference is much easier than building a training chip.
Just because one cost dwarfs another does not mean that's where the most marginal value from developing a better chip lies, especially if other people are already building it for you. If Google gets a good model, inference providers will be begging to run it on their platforms, or to just sell Google their chips - and as I said, inference chips are much easier.
Each GPU costs ~$50k. You need at least 8 of them to run mid-sized models, plus a server to plug those GPUs into. That's not commodity hardware.
More like ~$16k for 16 3090s. AMD chips can also run these models. The parts are expensive, but there is a competitive market in processors that can do LLM inference. Less so in training.
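A rough back-of-the-envelope sketch of that sizing argument, assuming an illustrative 70B-parameter model, 24 GB of VRAM per 3090-class card, and ~20% overhead for KV cache and activations (none of these figures come from the thread):

    import math

    def gpus_needed(params_billions: float, bytes_per_param: float,
                    gpu_vram_gb: float = 24.0, overhead: float = 1.2) -> int:
        """Estimate GPU count from weight size alone; ignores batch size and long contexts."""
        weights_gb = params_billions * bytes_per_param   # e.g. 70B params * 2 bytes (fp16) ~= 140 GB
        total_gb = weights_gb * overhead                 # rough margin for KV cache / activations
        return math.ceil(total_gb / gpu_vram_gb)

    print(gpus_needed(70, 2.0))   # fp16 weights  -> 7 x 24 GB cards
    print(gpus_needed(70, 0.5))   # 4-bit weights -> 2 x 24 GB cards

The point is that card count and quantization trade off against each other, so a mid-sized model fits on a stack of consumer cards without any single exotic part - which is why the inference market looks much closer to commodity than training does.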