Math and chess are similar in the sense that, for humans, both require creativity, logical problem solving, and so on.
But they are not at all similar for computers. Chess has a small, constrained set of rules, and it is pretty straightforward to make a machine that beats humans by brute-force computation. Pre-Leela chess programs were just tree search, a hardcoded evaluation function, and lots of pruning heuristics. So those programs approach the game in a fundamentally different way from strong humans, who rely much more on intuition and pattern recognition than on calculation. It just turns out the computer approach is better than the human one, sort of like how a car can move faster than a human even though cars don't do anything much like walking.
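That classical recipe is easy to sketch. Here is a minimal, illustrative Python version: depth-limited search with alpha-beta pruning over a toy game tree, where `children` and `evaluate` are stand-ins for real move generation and a hardcoded material count (none of this is from any actual engine):

```python
# Minimal sketch of the pre-neural-net chess-engine recipe: depth-limited
# tree search, a hardcoded static evaluation, and alpha-beta pruning as
# the simplest pruning heuristic. The tree and scores below are toy data.

def alphabeta(node, depth, alpha, beta, maximizing):
    if depth == 0 or not children(node):
        return evaluate(node)                    # hardcoded static evaluation
    if maximizing:
        best = float("-inf")
        for child in children(node):
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, best)
            if alpha >= beta:                    # prune: opponent avoids this line
                break
        return best
    else:
        best = float("inf")
        for child in children(node):
            best = min(best, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, best)
            if beta <= alpha:
                break
        return best

# Stand-in game tree; leaves carry a made-up material score.
TREE = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
SCORES = {"a1": 3, "a2": 5, "b1": -4, "b2": 9}

def children(node):
    return TREE.get(node, [])

def evaluate(node):
    # A real engine would count material, king safety, mobility, etc.
    return SCORES.get(node, 0)

print(alphabeta("root", 2, float("-inf"), float("inf"), True))
# -> 3; note the leaf scoring 9 under "b" is never examined (pruned)
```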
Math is not analogous: there’s no obvious algorithm for discovering mathematical proofs or solving difficult problems that could be implemented in a classical, pre-Gen AI computer program.
> there’s no obvious algorithm for discovering mathematical proofs or solving difficult problems that could be implemented in a classical, pre-Gen AI computer program.
It's fundamentally the opposite. Computer algorithms have been part of math research since they were invented, and mathematical proof algorithms are widespread and excellent.
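"Excellent" is concrete here: modern proof assistants ship decision procedures that find proofs with no machine learning at all. A one-line illustration in Lean 4, assuming its built-in `omega` tactic for linear arithmetic:

```lean
-- Proved fully automatically by `omega`, Lean 4's built-in decision
-- procedure for linear arithmetic over integers and naturals.
example (a b : Nat) (h : a ≤ b) : 2 * a + 1 ≤ 2 * b + 1 := by omega
```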
The LLMs that are now "intelligent enough to do maths" are just trained to rephrase questions into Prolog code.
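A sketch of that pattern in Python, with a hypothetical `llm_complete` standing in for the model call and pyswip (a real Python bridge to SWI-Prolog, which must be installed) executing the generated clauses; the "model output" is hardcoded here so the sketch runs on its own:

```python
# Sketch of the "rephrase the question into Prolog" pattern. llm_complete
# is a hypothetical stand-in for a real model call; the logic engine, not
# the LLM, performs the actual deduction.
from pyswip import Prolog

def llm_complete(prompt: str) -> str:
    # A real system would prompt the model to translate the word problem
    # into Prolog clauses; this canned answer keeps the sketch standalone.
    return """
    parent(alice, bob).
    parent(bob, carol).
    grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
    """

question = ("Alice is Bob's parent and Bob is Carol's parent. "
            "Who is Carol's grandparent?")

prolog = Prolog()
for clause in llm_complete(question).strip().splitlines():
    clause = clause.strip().rstrip(".")
    if clause:
        prolog.assertz(clause)       # load the model-generated program

print(list(prolog.query("grandparent(G, carol)")))  # -> [{'G': 'alice'}]
```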
For the OpenAI case, it's unclear; they haven't disclosed the method yet. (Though they have previously had an official model that could query Wolfram Alpha, so they're not strangers to that method.)
But math olympiad questions have been beaten before by AlphaGeometry and a few other systems using Prolog or similar logic-evaluation engines, and it works quite well. (Simply searching "LLM Prolog" gives a lot of results on Google and Google Scholar.)
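Schematically, that style of system alternates a symbolic engine with a model that only steps in when deduction stalls. A toy Python skeleton of the loop (the deduction rule and the "proposal" here are canned stand-ins, not anyone's actual components):

```python
# Schematic of an AlphaGeometry-style neurosymbolic loop: run symbolic
# deduction to exhaustion; only when stuck, ask a model to propose an
# auxiliary construction, then deduce again. Everything here is a toy.

def solve(premises, goal, deduce_closure, propose_construction, max_steps=10):
    facts = set(premises)
    for _ in range(max_steps):
        facts |= deduce_closure(facts)       # pure symbolic deduction
        if goal in facts:
            return facts                     # proved without further help
        # Stuck: add a (hypothetical) model-proposed construction and retry.
        facts.add(propose_construction(facts, goal))
    return None

def deduce_closure(facts):
    # Toy rule: anything constructed as a midpoint lies on its segment.
    return {f.replace("midpoint", "on_segment") for f in facts}

def propose_construction(facts, goal):
    # Canned "model suggestion": construct the midpoint of A and B.
    return "midpoint(M, A, B)"

print(solve({"segment(A, B)"}, "on_segment(M, A, B)",
            deduce_closure, propose_construction))
```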
If OpenAI did it through brute-force text reasoning, it's both impressive and frighteningly inefficient.
Even ordinary algebra is something LLMs struggle with, which is why handing it off to existing algebra solvers is far more effective.
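For instance, with SymPy (a widely used Python computer algebra system), an exact answer comes back deterministically rather than token by token:

```python
# Exact, deterministic equation solving with SymPy, the kind of existing
# algebra solver referred to above.
from sympy import Eq, solve, symbols

x = symbols("x")
print(solve(Eq(x**2 - 5*x + 6, 0), x))  # -> [2, 3]
```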