Hacker News new | past | comments | ask | show | jobs | submit login

The places where they use the same methodology seem within the error bars of the cherry picked benchmarks they selected. Maybe for some tasks it's roughly comparable to GPT4 (still a major accomplishment for Google to come close to closing the gap for the current generation of models), but this looks like someone had the goal of showing Gemini beating GPT4 in most areas and worked back from there to figure out how to get there.



Consider applying for YC's W25 batch! Applications are open till Nov 12.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: