Your experience makes me think that the reason the models got a better success r...

s-macke · 2024-10-11T15:04:47.000000Z

We don't know. The paper and the problem were very prominent at that time. Some developers at Anthropic or OpenAI might have included it in some way, either as a test or as a task to improve the CoT via reinforcement learning.

andrepd · 2024-10-11T15:04:41.000000Z

Absolutely! It's the elephant in the room with these ducking "we've solved 80% of maths olympiad problems" claims!