
And yet, Blackwell is sold out.

What does that tell us?

The industry is compute-starved, and that makes total sense.

The transformer architecture that current LLMs are based on is eight years old. So why did it take until just two years ago to get to LLMs?

Simple: Nvidia first had to push compute to scale. Try training GPT-4 on Voltas from 2017. Good luck with that!

Current LLMs are possible thanks to the compute Nvidia has provided over the past decade. You could technically use 20-year-old CPUs for LLMs, but you might need to connect a billion of them.
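
A rough back-of-envelope lands in the same ballpark. All figures here are order-of-magnitude guesses, not measurements: assume ~10 GFLOP/s for a mid-2000s CPU, ~1 PFLOP/s of effective low-precision throughput per modern training GPU, and a cluster of ~25,000 GPUs.

    # Order-of-magnitude sketch; every figure below is a rough assumption.
    cpu_flops = 1e10        # ~10 GFLOP/s for a ~2005-era CPU
    gpu_flops = 1e15        # ~1 PFLOP/s effective per modern training GPU
    cluster_gpus = 25_000   # assumed size of a frontier training cluster

    cpus_needed = cluster_gpus * gpu_flops / cpu_flops
    print(f"{cpus_needed:.1e}")  # ~2.5e+09 -- on the order of a billion CPUs

Even granting those CPUs perfect scaling (which they wouldn't get), you end up needing billions of them to match one modern cluster.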





