
> That still means that AI firms don't have to buy as many of Nvidia's chips

Couldn’t you say that about Blackwell as well? Blackwell is 25x more energy-efficient for generative AI tasks and offers up to 2.5x faster AI training performance overall.






And yet, Blackwell is sold out.

What does that tell us?

The industry is compute-starved, and that makes total sense.

The transformer architecture that current LLMs are based on is 8 years old. So why did it take until only 2 years ago to get to LLMs?

Simple: Nvidia first had to push compute at scale. Try training GPT-4 on Voltas from 2017. Good luck with that!

Current LLMs are possible thanks to the compute Nvidia has provided over the past decade. You could technically use 20-year-old CPUs for LLMs, but you might need to connect a billion of them.
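A quick back-of-envelope check of that "billion old CPUs" figure. All numbers here are rough assumptions for illustration (mid-2000s desktop CPU at ~10 GFLOPS, an H100-class GPU at ~1 PFLOPS dense FP16, and a 10,000-GPU training cluster), not measured values:

```python
# Rough sanity check: how many mid-2000s CPUs would match a modern GPU cluster?
# All constants are ballpark assumptions, not exact specs.

OLD_CPU_FLOPS = 1e10       # ~10 GFLOPS: mid-2000s desktop CPU (assumption)
MODERN_GPU_FLOPS = 1e15    # ~1 PFLOPS dense FP16: H100-class GPU (assumption)
GPU_CLUSTER_SIZE = 10_000  # plausible frontier-training cluster size (assumption)

cluster_flops = GPU_CLUSTER_SIZE * MODERN_GPU_FLOPS
cpus_needed = cluster_flops / OLD_CPU_FLOPS

print(f"Old CPUs needed to match the cluster: {cpus_needed:.0e}")
# → 1e+09, i.e. on the order of a billion
```

Under these assumptions the estimate lands right at 10^9, so "a billion of them" is the right order of magnitude, ignoring the (enormous) interconnect and memory problems of actually wiring them together.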





