
IMO Grand Challenge - panic
https://imo-grand-challenge.github.io
======
OscarCunningham
A lot of IMO geometry problems are easy for computers because they can simply
use coordinates to convert the problem to algebra (this approach is
computation-heavy, which is why humans don't use it even though it's
guaranteed to work given enough time). So in a good year the computer might
get two out of the six problems without having to do anything clever.

~~~
saagarjha
IMO problems are generally proof-based and don't always lend themselves well
to brute force solutions using analytic geometry.

~~~
OscarCunningham
The kind of problems I'm thinking of are those like 2 and 6 on this year's
IMO.

You have a bunch of points, lines and circles in some configuration and you
have to prove that some condition holds. So you assign symbols to represent
the coordinates of each of these points and you represent each fact you are
told about them by some equation. The condition you are trying to prove is
also represented by some equation. After multiplying through to get rid of
square roots, each of these equations sets a polynomial equal to 0. So you are
just trying to prove that some polynomial must be 0 whenever a bunch of other
polynomials are 0, which can be done algorithmically via Gröbner bases.

~~~
woopwoop
There is a MathOverflow thread [0] about this. The comments express
skepticism because olympiad problems usually involve choosing specific real
roots of polynomials, not just arbitrary roots. The discussion there mostly
suggests using principles from real algebraic geometry, but these seem too
slow and complicated to use in practice at this time.

[0] [https://mathoverflow.net/questions/337558/automatically-solving-olympiad-geometry-problems](https://mathoverflow.net/questions/337558/automatically-solving-olympiad-geometry-problems)

~~~
OscarCunningham
That question was asked by Kevin Buzzard, who is on the IMO Grand Challenge
committee. So we are really going over old ground here.

~~~
woopwoop
I'm a little perplexed by the dismissiveness of your response (perhaps I'm
misreading?). I'm not saying IMO problems will never be algorithmically
approachable, just that expressing these problems in terms of Gröbner bases
may be harder than is suggested here. Buzzard also expresses skepticism about
using Gröbner bases for these sorts of problems in the comments.

~~~
OscarCunningham
I wasn't trying to be dismissive, just saying that I didn't have much to add
that wasn't already said there (and separately pointing out the connection
between Buzzard and this thread).

------
nebulous1
Speaking of the IMO, I saw this video recently and thought it was pretty
interesting:
[https://www.youtube.com/watch?v=M64HUIJFTZM](https://www.youtube.com/watch?v=M64HUIJFTZM)

------
carapace
The "Reproducibility" proposed rule is very exciting IMO.

> The AI must be open-source, released publicly before the first day of the
> IMO, and be easily reproduceable. (sic) The AI cannot query the Internet.

------
typon
This seems impossible for current ML architectures, but I want to be proven
wrong. If the AI can't query the internet, is there a limit on its model size
and computational capability? I feel like letting an AI use megawatts of power
isn't quite fair.

~~~
ivanbakel
>I feel like letting an AI use megawatts of power isn't quite fair.

Based on what? An AI which could reliably produce proofs for these problems
would be at the cutting edge of research. Putting additional constraints on it
for "fairness" just increases the likelihood of failure. It's not as if
there's already some competition to measure up against.

------
throwaway-ai-ai
I'm curious: what are people's predictions for when it will be possible to run
some program XYZ on a $10/hr EC2 machine and have it beat humans on the
IOI/ICPC?

Furthermore, suppose such an open source program existed, how long would it
take for it to start replacing remote contractors and then in-office
programmers? ["start replacing" as in, say, 10% of human programmers]

~~~
eternalny1
> Furthermore, suppose such an open source program existed, how long would it
> take for it to start replacing remote contractors and then in-office
> programmers?

Until AI can create an Angular application with a .Net Core Azure back-end
that meets ever-changing customer requirements ("can we remove the need for
Bootstrap 4? Can we make the integration with Active Directory seamless?") on
short notice, not very soon at all!

I'm throwing that out there because that's just one of my current projects,
but a lot of programmers do things daily that AI simply isn't intended for.

Or college courses, for that matter. Although since you mentioned 10% you
probably mean more in the area of research science, which may be a different
story.

~~~
throwaway-ai-ai
> Until AI can create an Angular application with a .Net Core Azure back-end
> that meets ever-changing customer requirements ("can we remove the need for
> Bootstrap 4? Can we make the integration with Active Directory seamless?")
> on short notice, not very soon at all!

I think you are arguing that "not all programming tasks can be automated away
by a bot that can win the IOI/ICPC."

However, what I'm asking is: "does there exist some 10% of programming that
can be automated away by a bot that can win the IOI/ICPC?"

Furthermore, I think you are also assuming that human programmers remain
static, while the AI has to be a 'drop-in replacement for humans.' On the
other hand, if we look at AWS -- AWS didn't build some AI that is a drop-in
replacement for sysadmin work. AWS built their own API, and humans adapted to
it.

It seems to me that in a world where such an IOI/ICPC-winning bot existed,
programmers would adapt to it (just as programmers have adapted to AWS), by
figuring out "how can I leverage this API so that I don't have to hire someone
to do FOOBAR?"

------
th-th-throwaway
There was a good link about why AI algorithms tend to approach human-level
performance rapidly for some tasks: [https://www.coursera.org/lecture/machine-learning-projects/why-human-level-performance-FWkpo](https://www.coursera.org/lecture/machine-learning-projects/why-human-level-performance-FWkpo)

The answer turns out to be pretty boring: as long as humans are still better
than the machine, humans can provide algorithmic insights and be a source of
cheap labels.

So I am pretty pessimistic about this particular task. There are only a small
handful of humans in the world who are qualified to help! (according to IMO
scores: [https://www.imo-official.org/hall.aspx](https://www.imo-official.org/hall.aspx))

~~~
OscarCunningham
The pool of people who can solve IMO problems is much larger than the pool of
people who can solve IMO problems at the rate of one every 90 minutes.

Also consider the fact that human IMO contestants have little training in
university-level mathematics. The problem setters try to choose problems where
knowledge of advanced mathematics doesn't help, in order to produce a level
playing field which only measures raw problem-solving ability. But I suspect
that postgraduate-level mathematics will nevertheless be useful in programming
the AI.

~~~
saagarjha
> Also consider the fact that human IMO contestants have little training in
> university-level mathematics.

I don't think this is true; in my experience many of the contestants have
already cultivated a background in calculus but refrain from using it, as the
problems are usually designed to actively discourage its use.

~~~
eru
I guess it depends on what you mean by university level mathematics.

Terence Tao definitely says that doing university math changed his approach
to these problems. See [https://terrytao.wordpress.com/books/solving-mathematical-problems/](https://terrytao.wordpress.com/books/solving-mathematical-problems/)

~~~
saagarjha
> I guess it depends on what you mean by university level mathematics.

Calculus.

~~~
dan-robertson
This definitely depends on the country. There's a reasonable amount of
calculus in high school (strictly, sixth form) mathematics in the U.K. The
topics one tends to see at university begin with (abstract) algebra and
analysis, with the "calculus" topics being things like vector calculus, more
general R^n -> R^m calculus, and contour integration.

~~~
eru
In Germany we had plenty of calculus in the Abitur (like A-levels), too.

Uni adds a much more axiomatic and formal approach.

------
oefrha
An obvious extension / sister project: Putnam Grand Challenge.

------
throwawaymath
I thought Reid Barton was at RenTech? He's back in academia?

~~~
lacker
AFAIK he just got an academic job

------
prvc
I see no reason why the difference between "textbook" and "contest" problems
would be anywhere near as significant to computers as it is to humans.

~~~
saagarjha
Depends on the textbook, but most of those used in schools are quite
formulaic, while IMO problems generally require significant insight to solve.

~~~
eru
School math textbooks are quite formulaic.

University math textbooks less so.

~~~
saagarjha
Depends on the textbook, of course. I've personally found commonly used
university mathematics textbooks for standard courses to be of equal or
poorer quality than high school texts.

------
buyingarmor
>10 minutes (which is approximately the amount of time it takes a human judge
to judge a human’s solution)

Hah, some student solutions take hours to read...

------
hyperbovine
Is Reid Barton back to doing academic mathematics?

~~~
lacker
Yea he just got an academic job

------
pirocks
I'm curious as to whether Lean is the right choice for this. Lean is rather
hard to parse and complex, as a consequence of being designed for use by
humans. Perhaps specifying the problem(s) in a small superset of first-order
logic would be better?
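For a sense of the surface syntax involved, here is a toy statement in Lean 3; this is purely an illustrative sketch, not one of the actual encoded problems:

```lean
-- A trivial theorem in Lean 3 syntax, for illustration only;
-- the real problem encodings are far more involved.
theorem add_comm_example (a b : ℕ) : a + b = b + a :=
nat.add_comm a b
```

Even this toy example shows dependent-type syntax and named library lemmas, which is rather more machinery than a bare first-order encoding would need.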

~~~
andrepd
The people promoting this _are_ the Lean developers... :p

------
robocat
> but there are no other limits on the computational resources it may use
> during that time

I am guessing that defining a CO₂ or $ limit causes problems.

They could specify that the solution must be CO₂ neutral, e.g. suggest an
official CO₂ offset provider?

------
benkarst
Thought this was a link to great reddit and youtube comments.

------
mkagenius
Not sure if reading the problem text and understanding what it is asking is
part of the problem or not. If it is, well, good luck collecting data for it.

~~~
monktastic1
This is explained in paragraph two of the post. No, that is not part of the
problem.

------
lordnacho
Why would this be hard for the machines? A lot of math contest problems are
about trying some rearrangements or substitutions, using a few principles to
guide you towards the solution.

If you give the machine some equations, wouldn't it find a path to the
solution pretty fast? Aren't there solvers that aren't considered AI that do
that sort of thing?

I did some math contests in school, and a lot of the problems needed some
brute-force element along with a bit of overview so you didn't waste too much
time. Often there'd be a problem that you'd look at and think "hmm, this will
yield to integration by parts", but it was complex enough that you'd
potentially screw something up.
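On that last point, a computer algebra system already handles the mechanical bookkeeping without slipping; a small SymPy sketch of the integration-by-parts example:

```python
from sympy import symbols, exp, integrate, diff, simplify

x = symbols('x')

# Integrate x * e^x, the classic integration-by-parts exercise.
antiderivative = integrate(x * exp(x), x)
print(antiderivative)

# Check the result by differentiating it back: the difference
# from the original integrand should simplify to zero.
assert simplify(diff(antiderivative, x) - x * exp(x)) == 0
```

Of course, this is symbolic computation rather than proof search; spotting *that* a problem yields to integration by parts is the part such tools don't do for you.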

I looked in the GitLab repo; it seems the problems are already encoded in
Lean. Haven't you solved a major part of the problem already if you can
formally encode it like that? Half the fun of math contests is getting over
the WTF feeling.

~~~
throwaway-ai-ai
Suppose your assertion that this problem is easy were true.

Most research papers in Math / Theoretical CS are < 50 pages, while many IMO
problems have solutions > 1 page. So it's only a factor of 50 "more complex."

Then we should be able to encode open problems / conjectures in math /
theoretical CS into Lean, run this brute-force approach, and have it start
auto-generating new publications.

To the best of my knowledge, no one has done this yet.

~~~
eru
Well, the IMO problems are constructed with a solution in mind.

Also, who says that effort goes up linearly with size?

(Not necessarily agreeing with lordnacho here, just saying that your argument
ain't a good one.)

~~~
NieDzejkob
That's a good point. If we assume each line of a publication is just applying
an axiom, the complexity of such a search is clearly exponential in the length
of an article. I feel like this wouldn't be _better_ if we dropped the
assumption.

