
Your experience makes me think that the models got a better success rate not because they are better at reasoning, but because the problem made it into their training dataset.





We don't know. The paper and the problem were very prominent at the time. Some developers at Anthropic or OpenAI might have included it in some way, either as a test or as a task to improve the CoT via reinforcement learning.

Absolutely! It's the elephant in the room with these ducking "we've solved 80% of maths olympiad problems" claims!


