Jirach05's comments

Can anyone explain why these models perform worse on this "MRCR v2 (8-needle)" long-context benchmark when thinking is turned on?
