
We’ve been testing the upgraded models in the API (where you can control when the upgrade happens), and the newer ones perform significantly worse than the older ones on the same tasks. Tweaking the prompts helps somewhat, but not enough. We’re staying on the older models in production for now.

Hope OpenAI figures this out because quality has been their biggest moat up until now.



