Hey! One of the lead devs here. A cloud computing company called CoreWeave is giving us the compute for free in exchange for us releasing the model. We're currently at the ~10B scale and are working on better understanding datacenter-scale parallelized training, but we expect to train the model on 300-500 V100s for 4-6 months.

