
This is not a good comparison for real-world coding tasks.

Based on my own experience and anecdotes, it's worse than Claude 3.5 and 3.7 Sonnet for actual coding tasks on existing projects. It is very difficult to control the model's behavior.

I will probably write a blog post on real-world usage.



