Hacker News | wujerry2000's comments

My takeaways

(1) Companies will probably invest more and more in building their own evals for their own use cases, because it's becoming clear that public (and allegedly private) benchmarks have misaligned incentives when labs are sponsoring them or cheating on them.

(2) Those evals will probably be proprietary "IP", guarded as closely as the code or research itself.

(3) Conversely, public benchmarks are exhausted and SOMEONE has to fund more frontier benchmarks, so this kind of sponsorship is probably going to continue.

