
And DeepSeek is just 3% behind. It seems that all LLMs perform well on that benchmark, and the top spots are separated by no more than statistical error.





It could also be that they got "inspired" by DeepSeek, hence the very similar results.

So it could be that their success is mostly about taking an open and free thing and turning it proprietary.


These percentage points don't mean anything. Look up how the Elo system works. They just add 1000 to the result to make it a nicer number.

There are LLMs below 1000 on the leaderboard.

So? Percentage differences are only meaningful on a scale with a true zero point, which is not the case here.
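
To make the Elo point concrete, here is a rough sketch, assuming the textbook Elo expected-score formula with a 400-point scale (the leaderboard's exact fitting method may differ). The ratings 1300 and 1261 are made-up numbers for illustration, not real leaderboard entries, and the helper names are hypothetical. It shows that shifting every rating by a constant leaves the predicted win rate unchanged while the "percent behind" figure changes completely.

```python
def expected_score(r_a: float, r_b: float, scale: float = 400.0) -> float:
    """Textbook Elo expected score of A against B (probability-like, 0..1)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))


def percent_gap(r_a: float, r_b: float) -> float:
    """How far B is 'behind' A as a percentage of A's rating."""
    return 100.0 * (r_a - r_b) / r_a


# Hypothetical ratings, chosen only for illustration.
top, runner_up = 1300.0, 1261.0
offset = 1000.0  # assumed constant baseline added to every rating

for shift in (0.0, offset):
    a, b = top - shift, runner_up - shift
    print(
        f"ratings {a:.0f} vs {b:.0f}: "
        f"gap = {percent_gap(a, b):.1f}%, "
        f"expected score for A = {expected_score(a, b):.3f}"
    )
```

Only the rating difference enters the formula, so both rows print the same expected score, even though the percentage gap differs between them.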


