None of the models other than the 600B one are R1. They’re just previous-generation models like Llama or Qwen trained on R1 output, which makes them slightly better.





"Slightly" is an understatement, though. Distillations of R1 are significantly better than the underlying models.

Yeah, but the second comment you see believes they are, and belief is truth when it comes to stock market gambling.


