
200 terabyte proof demonstrates the potential of brute force math - sharemywin
https://motherboard.vice.com/en_us/article/padnvm/200-terabyte-proof-demonstrates-the-potential-of-brute-force-math
======
Koshkin
Sure, "proof mining" is feasible, but the goal of proving a theorem is only
rarely finding _any_ proof, it is rather finding a proof that _makes sense_ ,
is "elegant", and in the best case scenario a proof that actually forms a
sensible new theoretical framework in which the theorem becomes almost trivial
to prove.

~~~
ISL
If something can be proven by any method, that means that it is _true_. An
elegant proof is often more illuminating, but it isn't precluded by a direct
method.

If the proven theorem is useful, isn't it better to be able to make use of it
earlier, rather than wait for the elegant proof?

An example: If the AdS/CFT conjecture is demonstrably proven, it will elevate
it from "useful computational technique" to "law of mathematics, probably how
nature does it, too". I'll take any proof, as early as I can get it, in order
to guide our experimental work.

~~~
noobermin
But people often write papers anyway assuming X and Y and Z are true just to
see what happens because of it, it's not like you really need it to be true to
at least publish things.

And for your example that touches on applications of math, it approaches
_scientific_ research from the wrong perspective. If some math follows
reality, (agrees with experiment and even better makes predictions that turn
out to be right), there has to be some backing to it, even if the rigor
isn't fully there. Physics often runs with ideas before Math has caught up
(see QM in the early days, HEP in general, etc).

So, if we back off from applications who, quite frankly, only care about
whether X is true in only a few, really important number of conjectures I can
think of--big questions like P=NP for example--then math itself I imagine is,
well, a study for its own sake. So, what about exhaustive proofs of your
run-of-the-mill conjecture? Who would want to read a paper from a math person that
merely iterates through a 200TB cache of data?

People forget that science is a social phenomenon as much as it is a systematic
process.

~~~
rrobukef
I don't want a math paper detailing 200TB of data. I want a paper telling me
the generation of the digital input, the method that was used to actually
prove it if it's novel. I want the authors to show how they verified the
result and if it's important how they verified the automatic verification. The
paper will end with "Oh and by the way, we used this conjecture because it was
difficult, it's true. You can verify it yourself."

I want this all so I can do this myself, so I can prove a run-of-the-mill
conjecture and not waste my time with it. I want this because proving is
difficult and there is more to SAT than mathematical proofs, it is used in the
industry. I want this because I trust a verified proof that the airplane won't
crash more than a human proof.

The availability of computer proofs does not devalue the study of mathematics.

------
danharaj
The article:
[https://cacm.acm.org/magazines/2017/8/219606-the-science-of-...](https://cacm.acm.org/magazines/2017/8/219606-the-science-of-brute-force/fulltext)

Note that most people believe that co-NP /= NP, which implies that an
unsatisfiability proof via SAT solving in general won't be efficiently
verifiable.

~~~
maaaats
Could this be used to prove co-NP = NP, though?

~~~
danharaj
probably not

------
janci
I hesitated before clicking on that link as for a second I thought it would
start downloading that 200 TB proof.

------
rocqua
Abstract:

-------------------------

Recent progress in automated reasoning and supercomputing gives rise to a new
era of brute force. The game changer is “SAT,” a disruptive, brute-reasoning
technology in industry and science. We illustrate its strength and potential
via the proof of the Boolean Pythagorean Triples Problem, a long-standing open
problem in Ramsey Theory. This 200TB proof has been constructed completely
automatically—paradoxically, in an ingenious way. We welcome these bold new
proofs emerging on the horizon, beyond human understanding— both mathematics
and industry need them.

~~~
Xeoncross
> This 200TB proof

How do you verify a 200TB proof?

~~~
slaymaker1907
Not only do you need to verify the code, you also need to generate the proof
multiple times since there is a significant possibility of hardware error with
that large of a dataset.

~~~
noir_lord
..and then rewrite it again via a clean room approach and run that one and see
if they tally.

Preferably on different hardware.

~~~
dsacco
The Pi (and other mathematical constant) computation records don't go that
far. They verify random results towards the end with a different algorithm.

------
bmc7505
Has there ever been a brute force proof that discovered the existence of a
solution previously thought impossible? Something like a new tiling or proof
by contradiction that found a positive example?

~~~
qubex
If I do not remember incorrectly, the proof of the Four Colour Theorem is
essentially a brute-force one: a method of generating every possible graph
that represents adjacency of tiling polygons on a 2D surface, and a subsequent
colouring scheme for each and every one of them that never uses more than the
desired four colours.

~~~
bjl
Everyone already strongly suspected that the FCT was true. Nobody was
surprised by the result, it just took a computer to prove it.

~~~
qubex
You clearly are not a mathematician (nor am I). There's a _huge_,
unimaginably broad cognitive chasm between "suspected", "conjectured",
and "proved".

~~~
Confusion
There is, according to quite a few, also a huge etc. chasm between 'proved'
and 'proved by a computer'. Especially if it's a proof through exhaustive
enumeration.

~~~
qubex
I disagree wholeheartedly. Proof by enumeration of finite cases is the most
iron clad and idiot proof form of proof one can possibly aspire to. Proof by
enumeration is fine and dandy. Some might cast a pall of disreputability upon
a proof by enumeration by computer, but really it is no different — the
computer and its program are merely a formal system somewhat more complicated
and thus harder to inspect than mathematicians are used to, but this does not
pose any real conceptual disconnect.

------
modalduality
Does anyone know what the proof actually proves, and how the reduction was
formulated?

~~~
curiousgal
Theorem — The set {1, ..., 7824} can be partitioned into two parts, such
that no part contains a Pythagorean triple, while this is impossible for
{1, ..., 7825}.

[https://arxiv.org/pdf/1605.00723.pdf](https://arxiv.org/pdf/1605.00723.pdf)
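
To make the reduction concrete, here is a minimal sketch of how the statement can be turned into SAT clauses, assuming one Boolean variable per integer (true = first part, false = second part) and two clauses per triple that forbid all-true and all-false. The function names are illustrative, and the exhaustive loop is only practical for tiny n; it is not the method used for the real proof.

```python
from itertools import product
from math import isqrt


def pythagorean_triples(n):
    """Return all (a, b, c) with a < b < c <= n and a^2 + b^2 = c^2."""
    triples = []
    for a in range(1, n + 1):
        for b in range(a + 1, n + 1):
            c2 = a * a + b * b
            c = isqrt(c2)
            if c > n:
                break  # c only grows with b, so stop this row
            if c * c == c2:
                triples.append((a, b, c))
    return triples


def encode(n):
    """Two clauses per triple: not all in part 'false', not all in part 'true'."""
    clauses = []
    for a, b, c in pythagorean_triples(n):
        clauses.append([a, b, c])        # at least one of a, b, c is true
        clauses.append([-a, -b, -c])     # at least one of a, b, c is false
    return clauses


def brute_force_colouring(n):
    """Try every assignment of the variables that occur in some triple."""
    clauses = encode(n)
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return value
    return None


colouring = brute_force_colouring(20)
print(encode(5))
print(colouring)
```

Numbers that appear in no triple are left out of the search, since they can go in either part.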

~~~
jordigh
Yeah, when I saw "200 terabytes" I thought it was going to be about the Erdős
discrepancy problem, since that's been making the rounds in the popular press,
but turns out that only required 13 gigabytes for the 2014 proof.

I'm glad that Terry seems to have found a different proof. I attended a
discussion of it during last week's Mathematical Congress of the Americas here
in Mtl, and I think they were saying that Terry's proof seems correct.

------
Radim
Mandatory insight-vs-bruteforce-in-science essay:

Peter Norvig, 2011: _On Chomsky and the Two Cultures of Statistical Learning_

[http://norvig.com/chomsky.html](http://norvig.com/chomsky.html)

------
lotsoflumens
It's disappointing that the authors use the term "brute force" to describe
modern SAT solving, in an article intended for people who may not be familiar
with the algorithms used.

That term implies that the speed of the computer that was used to find the
solution is the determining factor.

It is not. Modern SAT solvers use very elegant search algorithms that have
enabled the solution of SAT-encoded problems that could never have been solved
if only "brute force" was used.
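
As a rough illustration of search that is smarter than enumerating every assignment, here is a minimal sketch of DPLL with unit propagation, a textbook ancestor of the techniques in modern solvers. It is not the solver used for the proof, and the names `simplify` and `dpll` are made up for this example. Clauses are lists of signed integers, where `-3` means "variable 3 is false".

```python
def simplify(clauses, lit):
    """Assume lit is true: drop satisfied clauses, shrink the others.

    Returns None if some clause becomes empty (a conflict).
    """
    out = []
    for clause in clauses:
        if lit in clause:
            continue  # clause already satisfied
        if -lit in clause:
            clause = [x for x in clause if x != -lit]
            if not clause:
                return None
        out.append(clause)
    return out


def dpll(clauses, assignment=()):
    """Return a set of true literals satisfying the clauses, or None."""
    # Unit propagation: a one-literal clause forces that literal.
    while True:
        units = [clause[0] for clause in clauses if len(clause) == 1]
        if not units:
            break
        clauses = simplify(clauses, units[0])
        if clauses is None:
            return None
        assignment = assignment + (units[0],)
    if not clauses:
        return set(assignment)
    # Branch on the first literal of the first remaining clause.
    lit = clauses[0][0]
    for choice in (lit, -lit):
        reduced = simplify(clauses, choice)
        if reduced is not None:
            result = dpll(reduced, assignment + (choice,))
            if result is not None:
                return result
    return None


satisfiable = [[1, 2], [-1, 2], [-2, 3]]
unsatisfiable = [[1, 2], [1, -2], [-1, 2], [-1, -2]]
model = dpll(satisfiable)
print(model)
print(dpll(unsatisfiable))
```

Each forced literal prunes whole subtrees at once, which is the sense in which this is not plain brute force.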

------
grondilu
> You could take a basic algebra problem as an example: 2x + 100 = 500. To
> solve this with brute force, we simply check every possible value of x until
> one works. We know, however, that it is far more efficient to rearrange the
> given equation using algebraic rules and in just two computations we get an
> answer.

I gave some thoughts about this lately, and it occurred to me that it's a
problem that is similar to how computers play chess. I'd like to know if I'm
right.

The equation 2x + 100 = 500 is an algebraic identity and with algebraic rules
there is a finite number of ways to transform it. For instance, x + x + 100 =
500, 3x - x = 400, x + 200 = 600 - x, and so on. Each of these new equations
can once again be transformed in a finite number of ways. So basically the
initial equation is the root node of a tree that represents all the ways the
equation can be transformed.

So it's a problem similar to chess, or many other board games, where you
start from a position and you must explore a tree of possibilities to find the
leaves (checkmates, for instance). For an equation the leaves are nodes of the
form x = numeric value.

Does that mean success in solving board games can have applications in
computational math?
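
A minimal sketch of the tree idea, assuming a state (a, b, c) stands for the equation a*x + b = c and a small, hypothetical set of four rewrite moves. Breadth-first search explores the tree level by level until it reaches a leaf of the form x = value. The move set and the `max_depth` cut-off are assumptions made for illustration only.

```python
from collections import deque
from fractions import Fraction


def moves(state):
    """Yield (description, new_state) for each rewrite of a*x + b = c."""
    a, b, c = state
    yield "subtract %s from both sides" % b, (a, Fraction(0), c - b)
    if a != 0:
        yield "divide both sides by %s" % a, (Fraction(1), b / a, c / a)
    # Two valid but unhelpful moves, so the tree actually branches.
    yield "double both sides", (2 * a, 2 * b, 2 * c)
    yield "add 1 to both sides", (a, b + 1, c + 1)


def solve(a, b, c, max_depth=6):
    """Breadth-first search for a state of the form 1*x + 0 = value."""
    start = (Fraction(a), Fraction(b), Fraction(c))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state[0] == 1 and state[1] == 0:
            return state[2], path
        if len(path) >= max_depth:
            continue
        for label, nxt in moves(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [label]))
    return None


value, path = solve(2, 100, 500)
print(value, path)
```

The "double" and "add 1" moves can be applied forever, so the full tree is infinite, yet the search still ends once a leaf is found at a shallow depth.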

~~~
nawgszy
Sorry, this is just a trivial observation, but if you're going to accept

x + x + 100 = 500

and

3x - x = 400

you can probably list any variety along the lines of

(n)x - (n - 2)x + y = 400 + y

so clearly that isn't a finite transformation space.

Now, these transformations might look trivial, but this problem is too. So
there's an interesting question of bounding your algebraic transformations to
a subset of all possible transformations that still make all possible
solutions available, which ultimately doesn't answer yours.

~~~
grondilu
True, it is not finite. Still, considering the algorithm will not attempt to
search the tree exhaustively anyway, maybe the fact that the tree is infinite
does not make much difference.

PS. On second thought the number of rules is finite, so the number of
possibilities _has to_ be finite, at least if you apply the rules one at a
time. It's the iteration of the rules that makes the tree infinite.

~~~
groby_b
Nope, a single rule still has an infinite number of possibilities. As an
example, multiplying both sides of the equation with an arbitrary constant
produces a new valid equation, and there's an infinite number of constants to
multiply with. (Bonus point: You get to choose what kind of infinity -
countable or uncountable)

The reason a move tree[1] in a chess game has a finite number of nodes is not
due to the finite number of possible transforms, but due to the fact that the
number of possible _board states_ is finite.

[1] graph, really.

~~~
cgmg
Infinite structures do not preclude finite analysis.

The theory of real closed fields is complete and decidable, for example.

~~~
groby_b
Yes. Also, not germane to the topic discussed.

------
GnarfGnarf
There's an error in the article.

    ((true AND false) OR true) AND NOT (true OR false)

is false, not true as asserted in the article.
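
The evaluation can be checked step by step; this is just the expression above written out in Python, with intermediate names chosen for illustration.

```python
left = (True and False) or True   # False or True -> True
right = not (True or False)       # not True -> False
result = left and right           # True and False -> False
print(result)
```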

~~~
cakebrewery
Maybe he corrected it afterwards but he clearly says it's not.

------
AstralStorm
A proof being valid does not mean the theory is useful or valid. If SAT was
used, it means it is internally consistent in first order logic assuming the
solver did not have a bug in it. (I bet the program was also verified.)

You can simplify these proofs by extracting subproofs and replacing them from
a database by substitution. It is like compression but makes more sense.

------
opportune
From a technical standpoint, this is a terribly written article.

First, giving an example of solving an algebraic equation using brute force is
misleading, because you can't enumerate all values of a real number.

Second, the security risks of brute-force computing have nothing to do with
SAT.

Also, their description of SAT is inaccurate. It's not new, and it was
actually one of the first formal problems studied in computer science. Also,
the original, linked paper notes that it is not brute-force computing that is
itself the breakthrough, but the addition of heuristics in SAT solvers that
make brute-forcing feasible.

~~~
legulere
This article is about SMT-solvers, which are used in formal verification,
which can be about security. SMT solvers are kind of newish:
[https://excape.cis.upenn.edu/documents/ClarkBarrettSlides.pd...](https://excape.cis.upenn.edu/documents/ClarkBarrettSlides.pdf)

~~~
opportune
The article and the linked paper
([https://cacm.acm.org/magazines/2017/8/219606-the-science-of-...](https://cacm.acm.org/magazines/2017/8/219606-the-science-of-brute-force/fulltext))
are about SAT solvers. The paper mentions SMTs briefly in passing. There may
be a disconnect in whether SMTs are referred to as SAT solvers or their own
distinct entity, though.

Apologies for being nasty

------
Havoc
I suspect the average mathematician won't consider that "elegant".

~~~
vernie
Sure, but so what?

~~~
Koshkin
It is unlikely that such proof can be considered a contribution to
mathematics. Finding "elegant" proofs (and not just "facts"), building
conceptual frameworks in the process, is the heart of mathematics as a human
activity.

~~~
tempay
Can't these proofs be useful in the context of "given that X is true we can
say that"? In that case it doesn't matter how X was proven, it just provides a
tool for future mathematics.

~~~
Koshkin
Ultimately, in mathematics it is important to _understand why_ something is
true; simply knowing that someone (or something) "says so" is never good
enough.

~~~
AnimalMuppet
True, you don't understand why this is true. But you can understand that other
things are true because this is true.

------
hellbanner
What did it prove?

~~~
curiousgal
Theorem — The set {1, ..., 7824} can be partitioned into two parts, such
that no part contains a Pythagorean triple, while this is impossible for
{1, ..., 7825}.

~~~
hellbanner
Wow, interesting. Is the 200 terabyte proof necessary for that -- could it be
refactored into something smaller?

------
phasnox
This one is "from The Book" - P.E.

------
pcmaffey
I can't wait for the day when a proof is 'discovered' that no one can
comprehend.

~~~
krallja
A 200TB proof is already incomprehensible to humans.

The average reading speed is 200wpm. The average human lifespan is 79 years,
or about 41.5 million minutes. If you did nothing but read at 200wpm for your
entire life, you might read 8.30 billion words. A word is 5.1 letters, on
average, and in ASCII, a letter is one byte. Multiply that all together to get
an estimated human reading speed of 42.4 GB/lifetime.

It would take 4728 entire human lifetimes to read this proof once.

This is vastly underestimating the actual number, due to silly things like
"eating", "sleeping", "learning to read", etc.
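
The estimate can be reproduced in a few lines, assuming 365-day years and a terabyte of 10^12 bytes (both assumptions, since the comment does not state them). The result lands in the same ballpark as the figures above; the exact number shifts a little with rounding.

```python
WORDS_PER_MINUTE = 200
LIFESPAN_YEARS = 79
BYTES_PER_WORD = 5.1      # average letters per word, one byte per letter
PROOF_BYTES = 200e12      # 200 TB, decimal terabytes assumed

minutes = LIFESPAN_YEARS * 365 * 24 * 60
words = minutes * WORDS_PER_MINUTE
bytes_per_lifetime = words * BYTES_PER_WORD
lifetimes = PROOF_BYTES / bytes_per_lifetime

print(f"{minutes / 1e6:.1f} million minutes")
print(f"{words / 1e9:.2f} billion words")
print(f"{bytes_per_lifetime / 1e9:.1f} GB per lifetime")
print(f"{lifetimes:.0f} lifetimes")
```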

~~~
coldtea
> _A 200TB proof is already incomprehensible to humans._

Not really, since what matters is the compressibility of the proof, not its
size.

We might not be able to read all individual clauses, but it's enough that we
know how the clauses are constructed.

------
munificent
I really wish I'd known there would be an illustration of a naked woman in an
advertisement on the right before I opened this at work.

~~~
sheepdestroyer
Did not see it, shouldn't you use an adblocker at work for obvious reasons?

------
pervycreeper
Assuming progress continues long enough: as proofs (and theorems) become more
complex, the computational difficulty of verification, combined with the
inherent physical impossibility of obtaining absolute certainty in the
correctness of a computation will transform future mathematics into an
empirical science, where mathematical hypotheses will be tested by computer,
and results will be considered true only probabilistically.

~~~
colordrops
That wouldn't be a proof though. The article is about exhausting the problem
space rather than a probabilistic approach.

~~~
lisa_henderson
pervycreeper is making the point that all computers are, for practical
purposes, probabilistic, since you can never know if a given result is because
of a machine failure. In the normal sense of a math proof, one cannot
"exhaust the problem space" using a computer. Using a computer remains an
empirical approach, since there is always the chance that the computer is
malfunctioning.

~~~
rocqua
The low probability of machine failure, combined with error correction, means
that with 4 layers of error correction, the chances of a failure (4 successive
failures) are essentially nil.

~~~
colordrops
And furthermore, you could run it on different machines and have humans verify
the results. In any case I'd trust the machines more than a human mind when it
comes to consistency, and we already trust human minds with proofs.

