
Two-hundred-terabyte math proof is largest ever - tokenadult
http://www.nature.com/news/two-hundred-terabyte-maths-proof-is-largest-ever-1.19990
======
CJefferson
Darn, I had no idea one could get into the media with this kind of stuff.

I had a much larger "proof", where we didn't bother storing all the details,
in which we enumerated 718,981,858,383,872 semigroups, towards counting the
semigroups of size 10.

Uncompressed, it would have been about 63,000 terabytes just for the
semigroups, and about a thousand times that to store the "proof", which is
just the tree of the search.

Of course, it would have compressed _extremely_ well, but I'm also not sure it
would have had any value: you could rebuild the search tree much faster than
you could read it from a disc, and if anyone re-verified our calculation I
would prefer they did it by a slightly different search, which would give us
much better guarantees of correctness.
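
To make the "re-verify with a different search" point concrete, here is a toy sketch (not the method used for size 10, and all function names are made up for illustration). It counts associative multiplication tables on n labelled elements in two independent ways: plain enumeration, and a backtracking search that prunes early. It counts labelled tables, not semigroups up to isomorphism, and is only practical for tiny n.

```python
from itertools import product


def is_associative(table, n):
    """Check (a*b)*c == a*(b*c) for every triple of elements 0..n-1."""
    return all(
        table[table[a][b]][c] == table[a][table[b][c]]
        for a in range(n) for b in range(n) for c in range(n)
    )


def count_by_enumeration(n):
    """Generate every n x n table and keep the associative ones."""
    count = 0
    for flat in product(range(n), repeat=n * n):
        table = [flat[i * n:(i + 1) * n] for i in range(n)]
        if is_associative(table, n):
            count += 1
    return count


def count_by_backtracking(n):
    """Fill the table cell by cell, pruning on the first violated triple.

    The recursion is the search tree; it is walked, never stored.
    """
    table = [[None] * n for _ in range(n)]
    cells = [(r, c) for r in range(n) for c in range(n)]

    def consistent():
        # Only triples whose products are all already filled in are checked.
        for a in range(n):
            for b in range(n):
                ab = table[a][b]
                if ab is None:
                    continue
                for c in range(n):
                    bc = table[b][c]
                    if bc is None:
                        continue
                    left, right = table[ab][c], table[a][bc]
                    if left is not None and right is not None and left != right:
                        return False
        return True

    def search(i):
        if i == len(cells):
            return 1  # full table, and the last check covered every triple
        r, c = cells[i]
        total = 0
        for value in range(n):
            table[r][c] = value
            if consistent():
                total += search(i + 1)
        table[r][c] = None
        return total

    return search(0)


for size in (2, 3):
    print(size, count_by_enumeration(size), count_by_backtracking(size))
```

If the two counts agree, a bug would have to hit both searches in the same way, which is the kind of guarantee storing a single search tree doesn't give you.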

~~~
nsimplexx
I have a question about computer-generated proofs, as I am doing something
similar.

When you encode the search tree as a proof, is there a technical name for that
technique?

In my situation I encoded the proof as a sequence which is just the depth-first
linear labelling of the tree. Then I proved why the sequence represents
a proof. But if there is more standard terminology/literature on this step,
it would make it easier for me to talk about.

~~~
CJefferson
I've had a similar problem; there is nothing that useful, I'm afraid. Probably
best to look into general discussions of trees.

One thing worth trying to do (if you can) is separate the generation of the
tree from the tree itself -- hopefully you can imagine the tree existing as a
complete object (even though you never store it all), and formulate proofs
based on that.
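
A rough sketch of that separation, under made-up assumptions (the tree of binary strings is just a stand-in, and `dfs_sequence`, `rebuild` and `binary_strings` are hypothetical names): the tree is defined only by a `children` function, the depth-first sequence is streamed out without storing the tree, and a separate function shows that the sequence determines the tree.

```python
def dfs_sequence(root, children):
    """Yield (depth, node) in depth-first pre-order without building the tree."""
    stack = [(0, root)]
    while stack:
        depth, node = stack.pop()
        yield depth, node
        # Push children reversed so the first child is visited first.
        for child in reversed(list(children(node))):
            stack.append((depth + 1, child))


def rebuild(sequence):
    """Rebuild nested (node, [subtrees]) tuples from a (depth, node) sequence."""
    root = None
    path = []  # path[d] is the child list of the latest node seen at depth d
    for depth, node in sequence:
        entry = (node, [])
        if depth == 0:
            root = entry
        else:
            path[depth - 1].append(entry)
        del path[depth:]
        path.append(entry[1])
    return root


def binary_strings(max_len):
    """Stand-in tree: each string has children s+'0' and s+'1' up to max_len."""
    def children(s):
        return [s + "0", s + "1"] if len(s) < max_len else []
    return children


sequence = list(dfs_sequence("", binary_strings(2)))
print(sequence)
print(rebuild(sequence))
```

The statements you prove are then about the tree that `children` defines; the sequence is just one way of walking it.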

------
Aardwolf
Probably the code used to generate the 200TB was not 200TB itself, right?

So in a sense, the code is the proof, and the 200TB is just something it
generates as a side effect.

~~~
Houshalter
In fact you can encode any proof to a few hundred lines. Just write a program
that brute forces through all possible proofs. If that counts, then no proof
need be longer than its conjecture plus a constant.

~~~
ifdefdebug
I'm not sure: doesn't the halting problem apply to this? In other words, can
you write such a brute force program that is guaranteed to halt after a finite
number of steps, providing a yes or no? I guess not.

~~~
hyperpape
It doesn't, and it's interesting why. You can enumerate all possible proofs.
If there is a proof of a given statement, you will eventually find it. But if
there is no proof, you will never know, because it always could be the next
proof you haven't looked at.

For the same reason, for any program that halts, you can run it long enough,
and see that it halts. But (unless you are lucky enough to detect a loop), if
it doesn't halt, you will never witness that, you will only be able to say "it
hasn't halted yet".

It definitely can take a while until that kind of asymmetry becomes intuitive.
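
A toy sketch of the asymmetry, with a deliberately silly "proof system" that I am assuming purely for illustration: the statement is "n is composite" and a proof is a string like `3*5`. The names `candidate_proofs`, `is_factorization` and `search_for_proof` are hypothetical. The search enumerates every finite string, shortest first; the `max_candidates` cap only exists so the example terminates, and hitting it means "nothing found yet", not "no proof exists".

```python
from itertools import count, product

ALPHABET = "0123456789*"


def candidate_proofs(alphabet):
    """Enumerate every finite string over the alphabet, shortest first."""
    for length in count(1):
        for symbols in product(alphabet, repeat=length):
            yield "".join(symbols)


def is_factorization(candidate, n):
    """Proof checker: accept 'a*b' with a, b > 1 and a * b == n."""
    parts = candidate.split("*")
    if len(parts) != 2 or not all(p.isdigit() for p in parts):
        return False
    a, b = int(parts[0]), int(parts[1])
    return a > 1 and b > 1 and a * b == n


def search_for_proof(n, checker, alphabet, max_candidates=None):
    """Return the first accepted candidate, or None if the cap is reached."""
    for i, candidate in enumerate(candidate_proofs(alphabet)):
        if max_candidates is not None and i >= max_candidates:
            return None  # only means "not found so far"
        if checker(candidate, n):
            return candidate


print(search_for_proof(15, is_factorization, ALPHABET))
print(search_for_proof(13, is_factorization, ALPHABET, max_candidates=5000))
```

Without the cap, the call for 13 would simply never return, and from the outside that looks exactly like a search that is about to succeed.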

~~~
ifdefdebug
> But if there is no proof, you will never know, because it always could be
> the next proof you haven't looked at.

But that's exactly what I meant. So the brute force program proposed in the
post I replied to doesn't exist, right?

Let's say, you have proposition A. You write a program to brute force proofs
for A. If it finds a proof, it will halt and output Yes, if it finds a
disproof it will halt and output No, otherwise it will check the next proof.

So if you say that your brute force program can prove/disprove proposition A
(and A1, A2, A3, ... for that matter), you will have to prove that your
program will halt for sure after a finite number of steps. Now that's
proposition B.

If you find a clever proof for proposition B, then you can claim that you
wrote a program that can prove or disprove proposition A - maybe in a zillion
years, but it can.

If you can't find such a proof for proposition B, you can always try to write
a brute force program to prove/disprove proposition B... then C, D, E, etc. :)

And here comes the halting problem: you may be able to prove that a specific
program will halt on a specific input, but it can be very hard to prove. But
you can't prove that a specific program will halt on every input. This is
directly related to Gödel's incompleteness theorems for formal systems.

~~~
Certhas
Every conjecture that is provable can be proved by the proposed program of
length (conjecture + constant) in finite time.

That was the original statement:

> you can encode any proof to a few hundred lines.

This type of encoding is funny of course. You cannot determine from the
encoding of the proof whether the encoding is the encoding of a proof (as you
point out). But if you have a proof, and thus know that a proof exists, you can
"encode" it very succinctly in this program, for whatever that is worth.

~~~
ifdefdebug
> you can encode any proof to a few hundred lines.

Yeah, I misread that. The sentence implies the existence of a proof. So I was
clearly talking about something different. Thanks.

------
mafribe
There is an increasing appreciation that large proofs need "proof engineering"
that is similar to software engineering. See e.g.

- D. Aspinall, C. Kaliszyk, Towards Formal Proof Metrics.

- G. Klein, Proof Engineering Considered Essential.

- G. Gonthier, Engineering Mathematics: The Odd Order Theorem Proof.

- T. Bourke, M. Daum, G. Klein, R. Kolanski, Challenges and Experiences in
Managing Large-Scale Proofs.

~~~
JadeNB
Links, unfortunately non-free:

Towards formal proof metrics:
[http://link.springer.com/chapter/10.1007/978-3-662-49665-7_1...](http://link.springer.com/chapter/10.1007/978-3-662-49665-7_19)

Proof engineering considered essential:
[http://link.springer.com/chapter/10.1007/978-3-319-06410-9_2](http://link.springer.com/chapter/10.1007/978-3-319-06410-9_2)

Engineering mathematics:
[http://dl.acm.org/citation.cfm?id=2429071](http://dl.acm.org/citation.cfm?id=2429071)

Challenges and experiences …:
[http://link.springer.com/chapter/10.1007/978-3-642-31374-5_3](http://link.springer.com/chapter/10.1007/978-3-642-31374-5_3)

------
infruset
A relevant article on large proofs:
[http://gallium.inria.fr/blog/large-proofs/](http://gallium.inria.fr/blog/large-proofs/)

------
cantagi
"but when you reach 7,825, it is impossible for every Pythagorean triple to be
multicoloured"

So they didn't know in advance how long it was going to take or the size of
the output?

~~~
ColinWright
Correct. They didn't even know if there was a finite number where colouring
became impossible, but it was strongly believed there was such a limit.
Privately, people had their own ideas - sort of Fermi estimates. I don't know
anyone who made such estimates public.
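
For anyone who wants to poke at small cases, here is a minimal sketch of what the quoted sentence means, not the authors' method (they used a SAT solver; this is naive backtracking with hypothetical function names). It lists the Pythagorean triples up to n and looks for a two-colouring of 1..n in which no triple is all one colour. It is only sensible for small n.

```python
from math import isqrt


def pythagorean_triples(n):
    """All (a, b, c) with a < b < c <= n and a^2 + b^2 == c^2."""
    triples = []
    for a in range(1, n + 1):
        for b in range(a + 1, n + 1):
            c = isqrt(a * a + b * b)
            if c <= n and c * c == a * a + b * b:
                triples.append((a, b, c))
    return triples


def is_valid_colouring(colour, triples):
    """True if every triple uses both colours."""
    return all(len({colour[a], colour[b], colour[c]}) == 2 for a, b, c in triples)


def find_colouring(n):
    """Backtracking search over 1..n; returns a dict number -> 0/1, or None."""
    by_largest = {}
    for triple in pythagorean_triples(n):
        by_largest.setdefault(triple[2], []).append(triple)
    colour = {}

    def assign(k):
        if k > n:
            return True
        for value in (0, 1):
            colour[k] = value
            # Only triples whose largest member is k can become monochromatic now.
            clash = any(
                colour[a] == value and colour[b] == value
                for a, b, _ in by_largest.get(k, [])
            )
            if not clash and assign(k + 1):
                return True
        del colour[k]
        return False

    return dict(colour) if assign(1) else None


triples_50 = pythagorean_triples(50)
colouring_50 = find_colouring(50)
print(len(triples_50), is_valid_colouring(colouring_50, triples_50))
```

The recursion depth grows with n, so this toy version is nowhere near the 7,825 case.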

------
userbinator
I think "brute force math" would be a good name for this technique --- that
diagram reminds me of finding hash collisions:

[http://www.links.org/?p=6](http://www.links.org/?p=6)

------
dylanz
Is the data compressed and piped into a known compression algorithm, or is
there some other technique used? I'd love to know how the actual program is
run. Side note... the visual output is really interesting to see, and I'll
admit I crossed my eyes at it to see if I could see some sort of pattern
(read: there was no sailboat).

~~~
nrfc
The authors used this tool:

[https://www.cs.utexas.edu/~marijn/drat-trim/](https://www.cs.utexas.edu/~marijn/drat-trim/)

There are publications and examples on the website that explain how the
compression the authors used works.

------
vanderZwan
> _Although the computer solution has cracked the Boolean Pythagorean triples
> problem, it hasn’t provided an underlying reason why the colouring is
> impossible, or explored whether the number 7,825 is meaningful, says
> Kullmann._

Kind of reminds me of why we were always taught to avoid proof by induction
(the mathematical kind[0]) when possible.

Also, funny coincidence, but the shortest math paper published in a serious
journal[1] was also a "no insight into _why_ " kind of proof.

[0]
[https://en.wikipedia.org/wiki/Mathematical_induction](https://en.wikipedia.org/wiki/Mathematical_induction)

[1]
[http://www.openculture.com/2015/04/shortest-known-paper-in-a...](http://www.openculture.com/2015/04/shortest-known-paper-in-a-serious-math-journal.html)

~~~
j2kun
That's the problem with counterexample proofs, though. If something is only
true for a large-but-finite range of numbers, there might not be a satisfying
reason why.

------
gsam
If you verified the code performed the correct function, ensured it ran
correctly and computed the result, then the code itself serves as most of the
proof (not the 200TB). Really what you want to be peer-reviewed is the code,
not the swathes of data.

~~~
nrfc
_> If you verified the code performed the correct function, ensured it ran
correctly and computed the result, then the code itself serves as most of the
proof_

But then you have to trust their program doesn't have any bugs, and checking
the proof requires a lot of computational power.

Instead, the authors produced a very large input file to a very small program.
This way, instead of trusting the program they used to produce the input file,
you only need to trust:

1) the input file corresponds to the theorem statement in their paper; and

2) the certificate checking program is correct

The benefits of this approach over the one you suggest are numerous:

* The effort of part 2 amortizes and isn't specific to any particular problem.

* The computational cost of re-checking their proof is considerably smaller than the cost of finding the proof certificate, allowing for the proof to be re-checked by anyone with a bit of spare computing power (as opposed to requiring a super computer in order to replicate the result).
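
A very rough sketch of the "small checker, expensive search" split, with one big caveat: this shows the easy direction only, where the certificate is a satisfying assignment. The result in the article is an unsatisfiability claim, which needs a different kind of certificate (the DRAT-style proofs checked by the tool linked elsewhere in this thread), and that is not shown here. Clauses are lists of signed integers; `satisfies` and `brute_force_search` are made-up names.

```python
from itertools import product


def satisfies(cnf, assignment):
    """The checker: one pass over the clauses, a few lines to review."""
    return all(
        any(assignment[abs(literal)] == (literal > 0) for literal in clause)
        for clause in cnf
    )


def brute_force_search(cnf, num_vars):
    """The expensive part: try up to 2**num_vars assignments."""
    for bits in product([False, True], repeat=num_vars):
        assignment = {var + 1: bits[var] for var in range(num_vars)}
        if satisfies(cnf, assignment):
            return assignment
    return None


# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
cnf = [[1, 2], [-1, 3], [-2, -3]]
certificate = brute_force_search(cnf, 3)
print(certificate, satisfies(cnf, certificate))
```

A reviewer who is handed `certificate` only has to trust `satisfies` and the encoding of the problem as `cnf`, not whatever produced the certificate.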

~~~
umanwizard
> But then you have to trust their program doesn't have any bugs

This is no different from a typical math paper. Proofs can have "bugs", too;
and sometimes they're fatal. That's -- in theory -- one of the points of peer
review.

~~~
nrfc
_> This is no different from a typical math paper_

It's substantially different from a typical math paper.

First because programs are typically much larger than even large mathematical
proofs (more on this below).

Second, because even after you trust the program, you still have to
_run the program_ to make sure the paper's result is correct. Far better to
run the program _once_ and then produce something you can _check quickly_ than
to require every peer reviewer to buy hundreds/thousands of dollars of compute
time. Think P vs. NP: why force the reviewer to solve an NP problem when you
can just as easily hand them a P problem?

 _> Proofs can have "bugs", too_

But the reviewer's bug checking obligation is typically isolated to the paper
at hand; i.e., the reviewer doesn't have to worry about the correctness of
cited papers -- it's assumed those results are correct. By contrast, software
implementations -- even for purely mathematical code -- often contain tens of
thousands of lines of implicitly trusted code [1].

Reviewers therefore (rightly!) say of ad hoc programs submitted as a portion
of a proof: "sorry, I can't possibly know whether there are any important bugs
in these tens of thousands of lines of code upon which you depend".

As a result, the SAT and theorem proving communities have developed highly
trustworthy proof checkers so that the portion of code the reviewer has to
trust can be meaningfully peer reviewed (and, moreover, peer reviewed apart
from any particular theorem).

[1] This is true even if you don't include things like the underlying
operating system. Standard libraries, mathematics packages, compiler
implementations, solver implementations, specialized software packages for the
application domain, etc. etc. Some of this software a reasonable person will
trust, but some of it really isn't interrogated well enough for the purpose of
mathematical proof... and even just sorting out what's trustworthy and what's
not can take a significant amount of time.

------
alimw
Surely the value of this is that it provides a counterexample to a long-
standing conjecture. The proof that it is a counterexample may not be too
enlightening, but that's hardly the point.

------
jbmorgado
This isn't mathematics, this is data analysis.

A mathematician doesn't present research by saying: "Hey, look, we
are now extremely confident that this theorem applies in every case... we
don't have a full mathematical proof for it but I can show you all these cases
where it holds true."

No, a mathematician proposes a result and then goes on to prove or disprove
mathematically that it holds true under specific conditions in absolutely
every case where those conditions are met.

~~~
BugsBunnySan
IANAMathematician, but, a brute force proof is still a proof though, isn't it?

If I have the theory that all positive integers smaller than 5 are also
smaller than 6, I would assume I'm mathematically allowed to 'brute force' the
proof by enumerating all the numbers > 0 and < 5 and then testing for each
that it's also < 6.

If I can prove I enumerated all the positive integers smaller than 5 and
didn't find any exceptions, I guess I proved my theory...

It's a silly way to prove my silly theory, but it is a proof. (I have also a
truly remarkable proof, which this comment is too small to contain. ;) )
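
Spelled out as a throwaway sketch (`check_all` is a made-up helper name): the code only does the testing, and the part that still has to be argued separately is that `range(1, 5)` really is every positive integer smaller than 5.

```python
def check_all(candidates, claim):
    """Return every candidate for which the claim fails."""
    return [x for x in candidates if not claim(x)]


positive_integers_below_5 = range(1, 5)  # 1, 2, 3, 4
failures = check_all(positive_integers_below_5, lambda n: n < 6)
print(failures)  # an empty list means no exception was found
```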

~~~
dagw
Computer-assisted proofs are historically frowned upon because you eventually
end up in a black box you have to blindly trust. Can you verify that your code
is correct according to the language spec? Can you verify that your compiler
is bug free and follows the spec? Can you verify that your CPU is bug free and
runs the output from your compiler correctly? Can you verify that your storage
medium is correct and hasn't corrupted any of your intermediate results? And
so on.

Practically, people will eventually say that if your test has been run N times
using M different compilers and P different computers and they all give the
same answer then you're probably correct, but having a proof that humans can
reason through from beginning to end makes people feel more comfortable.

That being said, computer-assisted proofs are becoming more and more common and
mathematicians are getting more and more comfortable with them as the
verification tools surrounding them keep getting better. And even if computers
aren't used for the final proof they are often used for intermediate tests.
For example if you're making a statement about all primes it makes sense to
quickly test the first few hundred million primes to make sure there aren't
any obvious counter-examples.
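
As a small sketch of that kind of sanity check (the helper names and the two sample claims are my own choices for illustration): sieve the primes below a limit and return the first one that breaks a claimed property. Finding nothing proves nothing, but finding something saves you from trying to prove a false statement.

```python
from math import isqrt


def primes_below(limit):
    """Sieve of Eratosthenes: all primes p < limit (assumes limit >= 2)."""
    sieve = bytearray([1]) * limit
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            # Cross out p*p, p*p + p, ...; smaller multiples are already gone.
            sieve[p * p::p] = bytearray(len(range(p * p, limit, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def is_prime(n):
    """Trial division, fine for the small numbers used below."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def first_counterexample(claim, limit):
    """Return the smallest prime below limit for which claim fails, else None."""
    for p in primes_below(limit):
        if not claim(p):
            return p
    return None


# True claim: every prime above 3 is 1 or 5 modulo 6.
print(first_counterexample(lambda p: p <= 3 or p % 6 in (1, 5), 100_000))

# False claim: 2**p - 1 is prime for every prime p.
print(first_counterexample(lambda p: is_prime(2 ** p - 1), 100))
```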

~~~
thfuran
>it makes sense to quickly test the first few hundred million primes to make
sure there aren't any obvious counter-examples

It's kind of absurd the sorts of endeavors which computers have rendered not
only possible, but relatively trivial.

------
infinidim
I have discovered a truly marvelous demonstration of this proposition that
this margin is too narrow to contain.

