Speaking of hardware, LLMs don't run on air. OpenAI is likely using a boatload of Nvidia A100s.
And this is where the memory analogy starts to break down. OpenAI probably can't fund a hardware "LLM accelerator" to reduce costs. But you can bet Google, Amazon, Microsoft (who are silicon designers) are pondering just that, and Nvidia is probably thinking of hosting one themselves (which they can obviously do at far lower cost).
Maybe OpenAI can leverage a startup like Cerebras or Tenstorrent, but I am skeptical.
> OpenAI probably can’t fund a hardware “LLM accelerator” to reduce costs. But you can bet Google, Amazon, Microsoft (who are silicon designers) are pondering just that
I was thinking of something specifically designed for LLMs though. Aka massive memory pools, among other things. Right now, I think the model has to be split up across devices for inference, which is inefficient.
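To see why splitting is unavoidable today, here's a back-of-the-envelope sketch. The numbers are illustrative assumptions (GPT-3-scale parameter count, fp16 weights, top-end 80 GB A100), not anything OpenAI has confirmed:

```python
# Hypothetical sizing: why a 175B-parameter model can't fit on one A100
# and must be sharded across GPUs for inference.
PARAMS = 175e9          # assumed GPT-3-scale parameter count
BYTES_PER_PARAM = 2     # fp16/bf16 weights
A100_MEM_GB = 80        # largest A100 memory SKU

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9   # weights alone, ignoring KV cache etc.
gpus_needed = -(-weights_gb // A100_MEM_GB)   # ceiling division

print(f"weights: {weights_gb:.0f} GB -> at least {gpus_needed:.0f} GPUs")
# -> weights: 350 GB -> at least 5 GPUs
```

And that's a floor: activations and the KV cache push the real count higher, which is exactly the kind of cross-device traffic a big unified memory pool would eliminate.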
OpenAI building its corpus by scraping under a claim of fair use, and then forbidding users from using its output to build a competitor, does not sit well at all.