o3-mini and GPT-4o are so piss poor at agentic coding compared to Claude that you don't even need a benchmark.





o3-mini-medium is slower than Claude but comparable in quality. o3-mini-high is even slower, but better.

Claude really is a step above the rest when it comes to agentic coding.

When I used it with OpenHands it was great but also quite expensive (~$8/hr). In Trae, it was pretty bad, but free. Maybe it depends on how each agent uses the model? (In both cases I was writing the same piece of software, a simple web crawler for a hobby RAG project.)


