Hacker News
SWE-Bench Verified Is Flawed Despite Expert Review (ddkang.substack.com)
3 points by yuxuan18 13 days ago
AI agent benchmarks are broken (ddkang.substack.com)
185 points by neehao 24 days ago | 86 comments
Can AI speed up your code? (ddkang.substack.com)
3 points by amdc 11 months ago
