
Learning to prove theorems via interacting with proof assistants - pgunt
https://blog.acolyer.org/2019/08/23/learning-to-prove-theorems-via-interacting-with-proof-assistants/
======
tlringer
There has been a lot of work like this popping up in the past year. I think
it's somewhat promising, but for ML for ITPs to really become useful, I think
the models need to better take into account semantic information about terms
and types and not just tactics. And this is not easy, because Coq's logic is
higher-order and dependently typed, and most work in ML for PL deals with much
simpler logics. Still, anything that helps with the tedium is welcome.

Another problem is that you will very rarely find examples of failing proofs,
since users rarely commit them. I'm working on a joint UW-UCSD project on
collecting and analyzing development data, and this is one of the goals. But
our dataset is not very large, since it is hard to get users to agree to
something so invasive, and there are still only 1000 or so active Coq users in
total in the world right now.

Another big problem is that it is very, very difficult to define what it means
for a proof to be "almost" correct. This makes it difficult to apply many
existing techniques fruitfully.

Nonetheless, I think the general spirit is right. The biggest gains in proof
automation in ITPs this decade are going to come from automation that makes
use in some way of existing proofs, rather than starting from scratch and
"blindly" (often very effectively or even decideably for certain fragments of
the logic or certain domains if the user knows the right tactic to call)
searching for some term with a given type.

~~~
antupis
Have you thought an interactive site where people could learn coq writing
proofs? Something very similar to
[https://projecteuler.net/](https://projecteuler.net/). You could get
thousands of failed proofs very fast.

~~~
tlringer
That's a neat idea. It would mostly help for gathering data from beginners,
which would be very skewed, but I'm sure it could still be useful, especially
for developing tools to help beginners.

~~~
antupis
Yeah skewing would be definitely problematic, it might be better if it would
be similar to Kaggle where people would compete with each other.

------
xelxebar
Oh neat. I recently spent a good bit of time comparing different formal proof
systems. The Mizar project [0] is pretty interesting since it's proofs look a
lot like standard proofs in the literature.

However, at the moment I'm a bit enamored with Metamath [1]. It's basically a
tiny system that just performs formal substitution. Everything else, including
propositional calculus, predicate logic, and ZFC set theory is encoded within
the language, rather than the implementation. I.e. it's a metalogic that only
knows about simple substitution. That way one can easily investigate other
logics and axiom systems, e.g. they have stuff for intuitionist logics,
orthomodular lattice models, you name it.

Anyway, the project already has a pretty substantial collection of theorems,
including a lot of analysis, topology, algebra and even QM. The cool thing is
that you can easily dig up how the high level proofs trace back to the
fundamental axioms, and there's no fancy math happening in the metalogic at
all.

Super cool stuff.

[0]:[http://www.mizar.org](http://www.mizar.org)
[1]:[http://metamath.org](http://metamath.org)

------
kvakil
From the article: "As an extra bonus, the generated proofs tend to be shorter
than the ground truth proofs collected in CoqGym."

This feels a little misleading, the paper itself says that the phenomena "...
suggests that theorems with longer proofs are much more challenging for the
model." It'd be more interesting to see how the automated theorem proving
length compares to the manual length for the same proofs (although I'd expect
this to be biased downwards, for the same reason).

------
nickpsecurity
Two more:

[https://arxiv.org/abs/1907.07794](https://arxiv.org/abs/1907.07794)

[https://arxiv.org/pdf/1904.03241.pdf](https://arxiv.org/pdf/1904.03241.pdf)

~~~
drchewbacca
And another

[https://arxiv.org/pdf/1608.02644.pdf](https://arxiv.org/pdf/1608.02644.pdf)

I'd really love to see someone like Deepmind get a hold of this. They are
perfectly placed to apply a lot of resources to it and it's right in their
wheelhouse.

Don't know how to contact them about it though.

~~~
carapace
Related: [https://arxiv.org/abs/1610.01044](https://arxiv.org/abs/1610.01044)

"DeepAlgebra - an outline of a program"

> We outline a program in the area of formalization of mathematics to automate
> theorem proving in algebra and algebraic geometry. We propose a construction
> of a dictionary between automated theorem provers and (La)TeX exploiting
> syntactic parsers. We describe its application to a repository of human-
> written facts and definitions in algebraic geometry (The Stacks Project). We
> use deep learning techniques.

------
carapace
"Solving Symbolic Equations with PRESS"

[https://core.ac.uk/download/pdf/82605343.pdf](https://core.ac.uk/download/pdf/82605343.pdf)

> The PRESS program was originally developed, in 1974--75, as a vehicle to
> explore some ideas about controlling search in automatic theorem proving by
> using meta-level descriptions and strategies

[https://github.com/maths/PRESS](https://github.com/maths/PRESS)

A lot of this stuff was/is done/doable in Prolog. I've been looking at TLA+
recently and although I'm not to far into it yet, so far everything has been
"Oh, I can do that in Prolog."

~~~
tom_mellior
Can you explain what parts of the TLA+ stuff you can express in Prolog and why
you think that might be better (or equally good)? It's true that modeling
languages all have something in common with Prolog, but do you think that
proofs about a similar Prolog version would be nicer than in TLA+?

~~~
carapace
I don't know if it's better or nicer. "I am but an egg."

Here is Lecture 4 of Lamport's course:
[https://lamport.azurewebsites.net/video/video4.html](https://lamport.azurewebsites.net/video/video4.html)

TLA+ source code:

    
    
        EXTENDS Integers
    
        VARIABLES small, big   
                
        TypeOK == /\ small \in 0..3 
                /\ big   \in 0..5
    
        Init == /\ big   = 0 
                /\ small = 0
    
        FillSmall == /\ small' = 3 
                    /\ big'   = big
    
        FillBig == /\ big'   = 5 
                /\ small' = small
    
        EmptySmall == /\ small' = 0 
                    /\ big'   = big
    
        EmptyBig == /\ big'   = 0 
                    /\ small' = small
    
        SmallToBig == IF big + small =< 5
                    THEN /\ big'   = big + small
                            /\ small' = 0
                    ELSE /\ big'   = 5
                            /\ small' = small - (5 - big)
    
        BigToSmall == IF big + small =< 3
                    THEN /\ big'   = 0 
                            /\ small' = big + small
                    ELSE /\ big'   = small - (3 - big)
                            /\ small' = 3
    
        Next == \/ FillSmall 
                \/ FillBig    
                \/ EmptySmall 
                \/ EmptyBig    
                \/ SmallToBig    
                \/ BigToSmall   
    
    
    

This is what I came up with (SWI Prolog). Note that I made some constraints
explicit to prune the search, like guarding the empty_FOO steps with
constraints that the jugs must not already be empty:

    
    
        :- use_module(library(clpfd)).
    
        type_ok(Small, Big) :- Small in 0..3, Big in 0..5.
    
        next_dh(Moves) :- next_dh(0, 0, Moves).
    
        next_dh(Small, Big, [[Move, Si, Bi]|Moves]) :-
            type_ok(Small, Big),
            die_hard(Move, Small, Big, Si, Bi),
            (Bi = 4 -> Moves = [] ; next_dh(Si, Bi, Moves)).
    
        die_hard( fill_small, Small, Big, 3, Big) :- Small #< 3.
        die_hard(   fill_big, Small, Big, Small, 5) :- Big #< 5.
        die_hard(empty_small, Small, Big, 0, Big) :- Small #> 0.
        die_hard(  empty_big, Small, Big, Small, 0) :- Big #> 0.
    
        die_hard(small_to_big, Small, Big, S, B) :-
            Big #< 5, Small #> 0,
            small_to_big(Small, Big, S, B).
    
        die_hard(big_to_small, Small, Big, S, B) :-
            Small #< 3, Big #> 0,
            big_to_small(Small, Big, S, B).
    
        big_to_small(Small, Big, S, 0) :-
            Small + Big #=< 3,
            S #= Small + Big.
    
        big_to_small(Small, Big, 3, B) :-
            Small + Big #> 3,
            B #= Big - (3 - Small).
    
        small_to_big(Small, Big, 0, B) :-
            Small + Big #=< 5,
            B #= Small + Big.
    
        small_to_big(Small, Big, S, 5) :-
            Small + Big #> 5,
            S #= Small - (5 - Big).
    
    
    

And here's a query with depth limit and some manual reflow of the list for
presentation...

    
    
        ?- call_with_depth_limit(next_dh(Moves), 11, _).
        Moves = [
            [fill_big, 0, 5],
            [big_to_small, 3, 2],
            [empty_small, 0, 2],
            [big_to_small, 2, 0],
            [fill_big, 2, 5],
            [big_to_small, 3, 4]
            ] ;
        true.

~~~
tom_mellior
This is very neat. It shows that Prolog is very close to being the holy grail
of an executable specification language. But my question was about
verification: TLA+ is not just used for modeling but also for proving stuff
about the models. Do you have anything similar for Prolog? If yes, I would
love to use it :-)

If you're only interested in the specification part of TLA+ but in proofs,
that's great. But then it's not fair to claim that you can do _everything_ in
Prolog that TLA+ does.

~~~
carapace
> This is very neat.

Cheers!

> Do you have anything similar for Prolog?

Well, what you think of "Formal Methods: A First Introduction using Prolog to
specify Programming Language Semantics"?
[https://homepage.cs.uri.edu/faculty/hamel/pubs/fcs16.pdf](https://homepage.cs.uri.edu/faculty/hamel/pubs/fcs16.pdf)

Abstract

> An important fundamental idea in formal methods is that programs are
> mathematical objects one can reason about. Here we introduce students and
> developers to these ideas in the context of formal programming language
> semantics. We use first-order Horn clause logic as implemented by Prolog
> both as a specification and a proof scripting language. A module we have
> written facilitates using Prolog as a proof assistant and insures that
> Prolog implements a sound logic. In order to illustrate our approach we
> specify the semantics of a small functional language and demonstrate various
> proof approaches and styles.

I just found this the other day and it seems promising.

> it's not fair to claim that you can do everything in Prolog that TLA+ does.

A claim I didn't make. "So far", I said, and, "I'm not to far into it yet"
(sic). :-)

(Although it's trivially true because of Turing completeness, eh?)

I have great respect for Lamport. I would hate to be unfair to him.

~~~
tom_mellior
Oh, that paper is very interesting, thanks for that. It's not really
comparable to a fully formal system, though: The user essentially chooses
which trivial things to prove to convince themselves of something. For
example, a real proof assistant would tell you what exactly to prove for the
inductive proof in the factorial example. In this system you pick an induction
principle yourself and hope it's correct.

Still, I'd like to see more work along these lines.

~~~
carapace
It's Prolog, so there is probably already more. I just found a bachelor thesis
from 2011 "Implementation of a Proof Assistant in Prolog"
[https://www2.imm.dtu.dk/pubdb/views/edoc_download.php/6043/p...](https://www2.imm.dtu.dk/pubdb/views/edoc_download.php/6043/pdf/imm6043.pdf)
I haven't read it yet.

There's also PRESS, which seems relevant, "Solving symbolic equations with
PRESS"

> We describe a program, PRESS, (PRolog Equation Solving System) for solving
> symbolic, transcendental, non-differential equations in one or more
> variables. PRESS solves autonomously, i.e. without guidance from the user.
> The methods used for solving equations are described, together with the
> service facilities. The principal technique, meta-level inference, appears
> to be relevant to the broader field of symbolic and algebraic manipulation.

[https://www.sciencedirect.com/science/article/pii/S074771718...](https://www.sciencedirect.com/science/article/pii/S0747717189800070)

[https://github.com/maths/press](https://github.com/maths/press)

I'm not trying to promote Prolog over TLA+ or anything, I've only just started
with TLA+.

------
mpoteat
This is really very cool, and makes me want to pick up Coq again. I know the
set of formally verified proofs is very small as compared to traditional
mathematics, but I think Coq and its cousins are vital for a future revolution
in mathematics.

------
krosaen
I was hoping this was going to be about education;anyone have tips or links on
using coq to get better at math proofs?

------
7373737373
Is anyone aware of a proof system that can do independence proofs over a set
of axioms?

~~~
krapht
All of them? I don't understand why you think existing systems are not able to
handle independence proofs. You embed your model in the logic of the system,
and this approach works for all proof assistants.

~~~
7373737373
Because embedding a model in the logic of the system is not practically
achievable in many systems

