For example, when he said at the end that it is obvious to him that this will change how mathematics is done in the future: yeah, I thought the same when I encountered this technology, but I was just a 2nd-year math student back then, and Isabelle was developed in the same building. Still, obvious.
When he said that math is more about structure, not so much about induction: yeah, I said the same at some ITP more than a decade ago. I got a dismissive question from Andrew Appel when I said that, and Tobias Nipkow seemed to agree with the dismissal; I never understood the question, though. Didn't get a reply when trying to investigate further on stage :-)
Of course set theory can be automated. I would argue it can be automated even better than dependent types: just embed the set theory, including Grothendieck universes, in simple type theory. The problem here is how to keep the notation sane, since, for example, + could be defined only once.
Tactic style theorem proving is sooooo old. It is how interactive theorem proving was done from the very beginning. Tactic style proofs are not readable at all, except maybe by experiencing them as you step through them. Declarative proofs are much more readable. Writing declarative proofs with the machine bridging the intermediate steps is the obvious future.
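To make the contrast concrete, here is the same trivial lemma in both styles, sketched in Lean 3 syntax (any assistant with both modes would do; Isabelle's Isar is the classic home of the declarative style):

```lean
-- Tactic style: a script of commands. To follow it you must replay
-- the intermediate goal states, in your head or in the IDE.
example (a b c : ℕ) (h₁ : a = b) (h₂ : b = c) : a = c :=
begin
  rw h₁,
  exact h₂,
end

-- Declarative style: the intermediate statements are written out,
-- so the proof reads like ordinary mathematics even on paper.
example (a b c : ℕ) (h₁ : a = b) (h₂ : b = c) : a = c :=
calc a = b : h₁
   ... = c : h₂
```

The vision described above corresponds to writing only the `calc`-style lines and letting automation bridge each step.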
I was always astonished at how people can learn stuff. When taking Latin classes, I thought: sure, I can kind of learn to read it, but speak it in real time? No way! But of course, at some point in time people have. Without understanding how.
AlphaGo Zero kind of explains how people can do something really difficult, even strategy based, without exactly knowing what they are doing. This is not much different from how mathematics is done. I think we will have a deep mathematician solving at least one of the remaining millennium problems within the next 20 years. The path starts exactly with how Buzzard and Hales envision it: Bring enough mathematics into one system, so that the millennium problems can actually be stated in the system. Machine learn that stuff. Give feedback to users, they will find it easier to use the system interactively. Create more mathematics. Machine learn that stuff. Rinse, repeat.
This can be done. Certainly with 100 million dollars in funding.
I think formal abstracts (https://formalabstracts.github.io) is promising in principle (it doesn't focus on proofs though).
The Archive of Formal Proofs (https://www.isa-afp.org) is the biggest effort I know of, but the logic (Isabelle/HOL) is not powerful enough for doing advanced mathematics comfortably, and the process of contributing is quite arcane.
Yes. Metamath's set.mm is expressly such a project, and I recently created a Gource visualization of set.mm's progress over time. For the first few years it was basically one person, but that increased over time and there have now been 48 different contributors.
Tom Hales' formal abstracts project may be to your liking.
Yet neither does AlphaGo. Finding solutions really is an NP-class problem, while proving them is much easier.
> Machine learn that stuff. Give feedback to users, they will find it easier to use the system interactively.
This is interesting. I wonder how much the computer could learn about our natural (or mathematical) language if we were to express problems before coding them, and then feed both to the ML system.
Going a step further: having the ML system generate the code, and the user correct it. (As far as my understanding of deep learning goes, correcting a net's output and feeding it back is currently strictly equivalent to just providing that data in the training set before performing stochastic gradient descent. We might be missing something here, as correction is obviously much more effective with us humans; capturing that would make ML a lot easier to work with, requiring less training data.)
That's not really true. Proving that an already found solution is correct might be easy for certain solutions (that's the whole idea of proof certificates). But finding a proof for an arbitrary theorem is certainly not easy. And that is what mathematicians do. Here, finding the proof IS finding the solution.
I meant proving in the sense of "checking that the solution works". In a mathematical problem, the solution would be a tentative proof for a theorem. Proof which needs to be verified. I should probably have used "verify" instead of "prove" in this context.
Keeping notation sane is, of course, very important. I think it is harder than it might first appear, but it is certainly possible.
One interesting trick used by Metamath's set.mm is to use variables that look similar to operators but can be defined for the purposes of a particular proof (e.g., via hypotheses). For example, + underline, * underline, and so on.
For an example, see theorem grpridd, "Deduce right identity from left inverse and left identity in an associative structure (such as a group)." http://us.metamath.org/mpeuni/grpridd.html - the hypotheses include statements about "+ underscore", and lead to a proof about + underscore.
It has been a while since I worked with Isabelle locales. So how come, in your example, "x + y" is not confused with "A + B", although all involved entities are ZF sets? Or does A actually have a different type than x?
Never mind, I know the answer to that. That's why set theory notation is a difficult problem, and you need some sort of type presentation / input layer. Some people also refer to that as a user-space type system or soft types: https://www21.in.tum.de/~krauss/publication/2010-soft-types-...
Do you have a take on what systems are best for this? I found Isabelle/HOL quite reasonable in this respect. There are some toy add-ons to Coq, but I don't think any of them work. There's also FStar which has Z3 integration (if I recall correctly), but last time I played with it it needed quite a lot of hand-holding.
I would recommend just following the "Theorem Proving in Lean" book for starters.
One of the more interesting bits for me was in the Q&A, where Prof Buzzard was asked about alternatives to Lean. Lean is the most evolved of the dependently typed calculus-of-constructions provers, but two other approaches might work. One is univalence, which is sexy, but Prof Buzzard observed that they haven't actually got much done in their system.
The other is set theory, which is more familiar and approachable to working mathematicians, but the systems out there lack automation. He didn't mention Metamath by name, but that's currently the most developed such system, and they have managed to get a lot done, either in spite of or because of the lack of type theory sexiness of the foundations.
So the question I'd love to see answered is whether automation is inherently easier in type theory, or whether it might be possible to build automation for a set theoretical approach. John Harrison gave a talk last year at AITP on the topic, but I haven't heard much more of this.
I think quotients are not typical of constructive objects, in that if you construct some object and then project it into the quotient, you have the quotient, but you cannot project that object back out. You get subtly less: only that the resulting quotient satisfies some equivalence relation.
Under a specific set of assumptions, you may very well be able to construct the quotient... but under a different set of assumptions (including the quotient itself as an assumption), perhaps you have the quotient while lacking the assumptions necessary to construct it yourself.
Anyhow, the argument is that they are non-constructive in a Humpty Dumpty sense: you cannot put it together -> take it apart -> put it back together again. Still, it seems reasonable to consider the "put it together" phase as not entirely incompatible with constructivity.
Would you be able to say how "normalizing" fits into this? By "normalizing" I mean applying some function that takes each equivalence class to a representative element (think simplifying fractions). Using something like that, we could round-trip a value (object -> quotient -> object), though at the end we might end up with a different value than what we put in. So yeah, a normalizing function seems like a useful thing when looking at quotients constructively. Sorry for the handwaviness, I hope I managed to get my point across!
In Lean specifically, there is a bit more information here:
"and quotient construction which in turn implies the principle of function extensionality"
There is also, Michael Beeson's "Extensionality and choice in constructive mathematics"
So it seems, at least, that the definition of quotients in Lean is classical and justified by axioms, while NuPRL seems constructive. This leads me to believe quotients as defined in Lean aren't constructive, but I'm not sure we can take the step to "quotients aren't constructive".
Anyhow it's an interesting question, and one which I too wish I had more clarity on.
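For concreteness, here is a minimal Lean 3 sketch of the primitives this subthread is about. The relation `r` is made up for illustration; it relates any two booleans, so the quotient collapses to a point:

```lean
-- Relate any two booleans.
def r (a b : bool) : Prop := true

-- `quot.mk` projects a value into the quotient `quot r`; `quot.sound`
-- is the axiom that makes related values equal after projection.
-- Once tt and ff are identified, nothing in the quotient remembers
-- which one you started from: the "cannot project it back out" point.
example : quot.mk r tt = quot.mk r ff :=
quot.sound trivial

-- A "normalizing" function in the sibling comment's sense would be
-- built with `quot.lift f h`, where `h : ∀ a b, r a b → f a = f b`
-- proves that f respects the relation.
```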
There are, of course, people working on fixing this. I just created a Gource visualization of the Metamath set.mm project, which has been working to formalize mathematics with absolute rigor. Over the years there has been increased activity; there have now been 48 contributors to set.mm.
In the end, that is what is necessary to formalize mathematics: efforts by many people working together to do it.
Some languages (and their approaches) are probably better suited to some mathematical objects, like some demonstrations are easier to perform algebraically than geometrically, for instance.
I once played with Lean but found the tutorial not very approachable. It took an hour of reading through abstract explanations until it finally explained the idea of how it works. Essentially, true statements are expressions that pass the "type check" of a compiler. A function taking type A as a parameter and returning type B is an implication A -> B. To prove this implication, you need to find a function implementation that passes the corresponding type checks. This is what I would have wanted the tutorial to say at the start.
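A minimal illustration of that idea, in Lean 3 syntax (`A` and `B` are arbitrary propositions):

```lean
variables (A B : Prop)

-- A proof of A → A is literally a function from (evidence for) A back
-- to (evidence for) A; Lean "checks the proof" by type-checking the term.
theorem identity : A → A := λ a, a

-- Modus ponens: given a function of type A → B and an argument of
-- type A, function application yields a term of type B.
theorem mp (f : A → B) (a : A) : B := f a
```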
This sounds like the tutorial was geared toward those already familiar with other proof assistants, since this concept is present in most proof assistants.
It's a bit weird because, as Haskell shows, you don't need dependent types for basic theorem proving (but dependent types do give a lot of useful power).
The maths course (in French) that can be seen during the presentation: https://www.math.u-psud.fr/~pmassot/enseignement/math114/
License: Apache 2.0
I was afraid there would be a CLA, as is customary with Microsoft's projects (and the main reason I didn't contribute to any), but I couldn't find one. Good call.
> Lean 4
> We are currently developing Lean 4 in a new (private) repository. The source code will be moved here when Lean 4 is ready. Users should not expect Lean 4 will be backward compatible with Lean 3. [Committed one year ago]
Really, really not a fan of this. This basically prevents anyone from attempting to add new features or fixes, as they might be obsolete by the time the new version comes out (incompatible or already fixed).
I've been wondering about that for quite a while, because I knew someone who was involved in the Mizar project, but never had the time to get into automated theorem proving myself. I was impressed by the semi-natural language proofs.
One view is to look at "Formalizing 100 Theorems" by Freek Wiedijk, which lists 100 mathematical theorems and the various systems that have formalized a nontrivial number of them. It's basically a "challenge list" for these kinds of systems:
This list is discussed in "Formal Proof - Getting Started" (Freek Wiedijk, Notices of the AMS, Volume 55, Number 11, December 2008).
That list is absolutely not the only way to compare different tools. Still, it gives you a sense of how far along each one has come in actually making proofs. Here's the current status:
HOL Light 86
which was written a few hours ago broadly in the vein of Lean vs. Coq, more precisely on the issue of how Lean handles quotients.
After reading this, are you really any wiser now on the topic of quotients in Lean? I didn't learn much from it except that some Coq developers are fed up with the current popularity of Lean among top mathematicians.
Only a little but I'm hoping that some weekend reading up on Canonicity and Subject Reduction (now that I know these are the issues at play) will shed some light.
I'm interested in both Lean and Coq but what I'm most excited about is the (Lean-based) Mathlib project.
I can believe that Lean may have made the wrong call on quotients (I guess with `quot.sound`), though I am not qualified to decide. If this is so, I imagine that it will manifest as a natural limit on how far a project like Mathlib can push its borders (maybe a point where the complexity of formalising what we want to say dominates the inherent complexity of the statements themselves).
However, from what I've seen, Mathlib is by far the most successful formalisation project in terms of what seems to matter most sociologically right now: attractiveness to mathematicians. Whatever its fate, I think it will help make formalisation of mathematics much more mainstream, and will teach us a lot. I still think that a univalent type theory looks like the most promising candidate, but we'll have to wait and see.
Back then I told myself to check out Lean one day... it still hasn't happened yet :/.
Don't forget how essential story-telling is to the human condition: culture, accumulated knowledge and everything we as a species have achieved is built on top of it. With that in mind, it's much easier to see why some people might want to play the role of storyteller at all costs.
But compare all of mathematics to just linear algebra, and specifically neural network implementations. You have a lot of people working on AI who sometimes grossly overstate the capabilities of their systems and fail to understand them when they do succeed. I would venture that the issue is not solving problems so much as understanding things to the level of mastery. It is always worth it to understand something to an exhaustive level of detail.
I think this is what artisans are. If you can make incredible handmade books, then surely you have underlying skills and abilities that transfer as well? If you are a grandmaster chess player, then you may be dismayed that computers will always beat you, until you use a computer yourself to beat another computer (or at the very least until you become resentful towards IBM for misleading you in 1997).
>Basically, Lean can understand mathematics, and can check that it doesn't have any mistakes in. Most of the files here are Lean verifications of various pieces of undergraduate level mathematics.
>Some of the lean files are in a library called Xena. One could imagine Xena as currently studying mathematics at Imperial College London.
- Gregory Chaitin, 2007
What are the future plans for the project? How will this be distributed in a form that mathematicians can use and contribute to?
(still watching the video)
EDIT: follow-up, since the replies helpfully point out that yes, it is: does this limit this kind of software, or can theorem proving software get around this by human intervention?
The smallest 'practical' example I know is a finite state machine with a proof that we cannot prove (within a specific set of axioms) whether it will stop or not.
It's not even very big: 1919 states.
That is a very subtly different question than the one I was trying to ask. Or.. well.. I think it is. What I was trying to ask was if there are formally provable theories that cannot be expressed in a way that would let a computer "compute" the proof.
I think that this is not the same as proving that there are unprovable theories.
EDIT: I just noticed my phrasing was off in the original question... gah, this reminds me of conversations with my mathematician friends IRL, where I would always trip over my own sloppy use of human language. What I mean by "computing" is verification, not execution.
Do you mean "find the proof" or "verify the proof"? That difference is critical.
Theorem-provers that work to find proofs have made many strides, but as of today humans easily run rings around them. That's not to say it can never be done. Years ago good Go players could trounce machines, and that is no longer true. So: Today, most proofs cannot practically be found by a machine, but they are getting better and there's no fundamental reason computers can't be excellent at it.
Verifying proofs is something a computer does par excellence. To verify a proof, the proof checker simply needs to check that each step is valid. No human can compare to the ability of a computer to be picky :-).
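As a sketch of how mechanical that checking is, here is a toy verifier in Python for a made-up system whose only inference rule is modus ponens (the formula encoding and the axioms are assumptions of this example, not any standard axiomatization):

```python
# Formulas are atoms (strings) or implications encoded as ("->", p, q).

def verify(axioms, proof):
    """Accept the proof iff every line is an axiom or follows from
    two earlier lines by modus ponens."""
    derived = []
    for formula in proof:
        ok = formula in axioms or any(
            prev == ("->", other, formula)  # prev says: other -> formula
            for prev in derived
            for other in derived
        )
        if not ok:
            return False
        derived.append(formula)
    return True

axioms = {"A", ("->", "A", "B")}
print(verify(axioms, ["A", ("->", "A", "B"), "B"]))  # True
print(verify(axioms, ["B"]))                         # False
```

Note that the checker exercises no judgment at all: each line either matches an axiom or a rule application, or the whole proof is rejected.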
In some systems it may not be possible to express a proof of a particular claim in a reasonably short sequence. But that would be a limitation for both humans and computers. In both cases, the answer would be the same: find a way to switch to a "different" system where it's easier to express the proof. In some sense normal math does this; people routinely define new terms, and then build on those definitions so that everything else is simpler.
(The responses here have been very nice despite that though, thank you everyone!)
It depends on the logical system + axioms that your formally provable theory is in respect to, and "how soon" you design the computer that verifies the proof. Recall that a proof is a finite sequence of applications of rules of inference on zero or more axioms to derive a new statement that may be regarded as an axiom for applications later in the sequence, so that the last step derives the theorem.
If there are countably many rules of inference and countably many axioms, and each rule of inference has finite valence (i.e. it infers from finitely many formula-patterns as hypotheses), then yes: there is a way of encoding the rules of inference and the axioms of the logical system such that a Turing machine (TM) will find a proof of any provable theorem (i.e. verify it), simply by enumerating which axioms and rules of inference occupy each position of the sequence that constitutes the proof, while checking whether the axioms legally fit the rules of inference. (It is always possible to check in finite time whether a finite-length formula satisfies the pattern required by a rule of inference.)
On the other hand, if there are uncountably many rules of inference or uncountably many axioms, a computer may not be able to verify all the formally provable theorems. For example, trivially, if there are uncountably many axioms, then the axioms admit no encoding; that is, there is no set of finite strings over a finite alphabet that can uniquely label all the axioms. But an axiom is itself a formally provable theorem whose proof just introduces itself, yet that proof cannot be represented for all the axioms. (This depends on the definition of proof; the example may fail if we accept a zero-length sequence as a proof, but in general it illustrates that there are simple theorems in some formal systems with uncountably many axioms that cannot be computed.)
Yet if we know beforehand the theorem and the rules of inference and axioms used in some proof of it, and the rules of inference all have finite valence, then that proof invokes only finitely many rules of inference on finitely many axioms, so we can encode a fragment of the logical system + axioms + expressible formulae such that the TM can verify all formally provable theorems in that fragment, including the desired theorem.
It helps to be precise. Stating the incompleteness theorem imprecisely generally just makes people confused about what it could even mean. Strictly speaking, an incompleteness theorem has to be proven within a particular formal system X (the "meta-system"). It expresses--within X--that "for every consistent formal system Y expressible in X, there is a statement A expressible in Y such that there is no proof in Y of A or not-A". Here Y is the "object system".
The statement A is only unprovable within Y in particular. If it is expressible in X as well, it may be provable (or disprovable) within X. Indeed, one can trivially construct a formal system in which A is provable and which can express Y by adding A as an axiom to X.
The real, philosophically significant, non-particular-formal-system-specific import of the incompleteness theorem is not just that there are unprovable statements (which isn't really true, and is in fact kind of mathematically meaningless, outside of the context of a specific formal system, given that any statement or its negation can be taken as an axiom, and all we have to judge such choices is a non-mathematical notion of "self-evidency") but that (as the name suggests!) no formal system is complete--every formal system can express statements that it can't prove. It's all about that universal quantification.
It’s hardly naive either considering so much has been learned about the limitations of mathematics only in the last century.
Highly recommend this short video
The title mentions Gödel, but it’s a broad treatment that really discusses a lot of the context of the subject.
Keep in mind there are multiple reasons why proofs can become problematic.
Gödel deals with some of them; other problems go unproven for centuries until our techniques catch up. Some proofs are impractical, being nearly infinitely complex, etc.
The shortest answer I can give is no. There are no proofs that are beyond computation. There's some subtlety here and your final question of whether there is mathematics that is beyond computation is a much deeper question.
The question becomes far more interesting if you replace "fundamentally" with "efficiently." But that's getting ahead of myself.
Modern mathematics and logic view (formal) proofs as computation. A formal proof is a purely mechanical process that should require no creativity whatsoever to follow, just as a computation is a purely mechanical process that should require no creativity whatsoever to carry out.
If you require a non-mechanical leap (say of creativity) then it is no longer a "proof" (each leap corresponds to a "leap of logic" colloquially).
Modern mathematical standards usually mean that every informal mathematical proof should in theory be formalizable into a formal proof. If this is not possible, the informal mathematical proof is most likely insufficiently rigorous.
The creativity then lies in the construction of the proof rather than in its verification.
Note though that given infinite time, again all proofs can be generated mechanically. Given that all proofs can be verified mechanically, you just keep generating every possible string and verifying it until you find a proof that either proves or disproves your statement in question. This is an extremely long and utterly impractical process, but it demonstrates that in theory this is possible.
There is a subtlety here. Mathematical statements can be independent of the set of axioms that you consider when deriving a proof, that is the axioms may not be strong enough to either prove or disprove the statement. This mechanical enumeration process cannot find that a statement is independent of the starting axioms, since it cannot know when to stop looking for a proof or disproof if it has not yet found one. This is analogous to the halting problem, where a computer cannot tell if a program has not yet halted whether it will halt in the future (e.g. by waiting just another few seconds more) or if it will never halt.
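The generate-and-check loop described above fits in a few lines of Python. The "verifier" below is a stand-in predicate invented for this sketch rather than a real proof checker, which is enough to show the shape of the procedure and why it can never learn that no proof exists:

```python
from itertools import count, product

ALPHABET = "01"  # stand-in for the symbols of some formal system

def is_proof_of(candidate: str, statement: str) -> bool:
    """Stand-in verifier: accept any string ending in the statement.
    A real checker would validate each inference step instead."""
    return candidate.endswith(statement)

def search(statement: str) -> str:
    """Enumerate all strings in length order and return the first one
    the verifier accepts. Terminates iff a proof exists; if the
    statement is independent, this loops forever, which is the
    halting-problem analogy above."""
    for n in count(1):
        for chars in product(ALPHABET, repeat=n):
            candidate = "".join(chars)
            if is_proof_of(candidate, statement):
                return candidate

print(search("01"))  # shortest string ending in "01" is "01" itself
```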
Hence in a certain sense the verification of mathematics is entirely bound by computation. The creation of it, given infinite time is also bound by computation. However, given that we do actually care about efficiency, that's where things get interesting.
Okay, so what about the things that sibling comments touch on, such as:
1. Godel's incompleteness theorem(s)
2. Arithmetical hierarchy (and the closely related notions of Turing jumps and Turing degrees)
3. (Not touched upon by other comments, but potentially relevant) Tarski's undefinability of truth
4. (Also not touched upon by other comments, but also an interesting one) P =? NP
1 - 3 apply to humans as well as machines. 4 is the only one in my eyes that's interesting.
1. Godel's incompleteness theorems are statements about the limitations of formal systems, but these limitations hold for humans as well. Modern mathematics is usually stated to be built on the foundations of ZFC set theory. All the constraints that Godel's incompleteness theorems place on ZFC (namely that there will always be statements independent of ZFC, and that we cannot use ZFC to prove itself logically consistent) also hold for all humans using ZFC. The only advantage humans have is that we can decide, if ZFC is too restrictive, to switch formalisms and use something else (i.e. the creativity in constructing a proof). This is the creative spark that is not yet captured by machines (although who knows with AGI).
2. The arithmetical hierarchy is a generalization of the following phenomenon. For universally quantified statements, i.e. "for all" statements such as "all cows have spots" or "all people are mortal," it is easy to disprove them, since that requires only a single counterexample, but it is hard to prove them, since you need some way to show that the statement holds for all exemplars. For existentially quantified statements, i.e. "there exists" statements such as "there exists a cow which is a sphere" or "there exists a person who is mortal," the opposite is true: it is easy to prove them, since you only need to find one example, but difficult to disprove them, because you need to show that every potential exemplar fails.
The arithmetical hierarchy generalizes this by examining the "difficulty" of statements in which "for all"s and "there exists"s interleave. It turns out that "there exists" corresponds to the halting problem (there exists a number of steps after which the machine halts) and "for all" corresponds to the non-halting problem (for all finite numbers of steps, the machine will not yet have halted). The hierarchy is the observation that each additional interleaving of a "there exists" and a "for all" makes the halting problem even "harder": an oracle that magically answers the halting problem for a lower number of interleavings has its own halting problem that it cannot decide at a higher number of interleavings.
An example of an interleaving is the question of whether a function is total. This is equivalent to the statement "for all inputs, the function halts." However, we just said that halting is a "there exists" statement, so we can expand this to "for all inputs, there exists a number of steps such that the function halts after that number of steps." Notice the interleaving of "for all" and "there exists."
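The "for all inputs, there exists a number of steps" shape can be written down directly. A tiny Python sketch, where `halted_after` is a fake cost model invented for this example, and both quantifier ranges are artificially bounded so the loops terminate (over the true, unbounded ranges the outer loop could never finish, which is the point):

```python
def halted_after(program, x, steps):
    """Stand-in for running `program` on input x for `steps` steps and
    reporting whether it has stopped. Fake model: halts after x + 1 steps."""
    return steps >= x + 1

def total_up_to(program, max_input, max_steps):
    # "for all inputs ... there exists a number of steps ..."
    return all(
        any(halted_after(program, x, s) for s in range(max_steps))
        for x in range(max_input)
    )

print(total_up_to(None, 10, 100))  # True: every x < 10 halts within 100 steps
print(total_up_to(None, 10, 5))    # False: the budget is too small for x >= 4
```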
However, note again that all the limitations I've described here apply both to computers and humans.
3. Tarski's undefinability of truth is the statement that for essentially all interesting logical systems, there is no way to write a logical predicate that takes another logical statement as an argument and is true if and only if that logical statement is true (essentially a logical version of the halting problem). Again this is a problem that afflicts both humans and machines and is an artifact rather of the attempt to formalize something (the heart of logic and mathematics).
4. This is the interesting question. So far I've outlined a brute-force way for a computer to enumerate all proofs. This clearly does not help actual human mathematicians do mathematics. Can we do better than that? Especially since we don't need to brute-force the validity of a proof? Humans certainly can do this better for specific instances of proofs (in much the same way that humans can prove specific programs halt, but cannot give a general formula for doing so). Is there a globally better way of doing this? Scott Aaronson delves into this far better than I could here: https://www.scottaaronson.com/papers/philos.pdf.
The video mentioned Formal Abstracts, which is still getting off the ground... a similar project that has been around for a while is Metamath. Metamath uses an SRS (string rewriting system) to formalize much foundational mathematics. Its proof explorer is conceptually similar to Egan's Mines: a database full of theorems and connections starting from fundamental axioms and definitions. Today, Metamath only includes fairly basic results, and has not yet reached a level where it can capture the state of the art in modern research mathematics the way Formal Abstracts and other systems aspire to. To be fair, no system today is really at that level yet. Coq has a good set of packages and I think is probably in the lead today. I don't know where Lean is in comparison.
The aspect of formalization where it somehow lends additional credence to proofs seems less important. Famously, when Hilbert went to fully formalize Euclid's geometry, he found a missing axiom. (Euclid had assumed that two circles with centers closer than their radii would intersect somewhere... which is not true over the field of rational numbers! Therefore this requires an explicit axiom, which had been overlooked for 2,000 years!) This seems to be the exception rather than the rule... as far as I know, no important theorem has been "overturned" as a result of formalization efforts.
Not true. I recommend Five Stages of Accepting Constructive Mathematics:
Par for the course for computer scientists - if we had infinite time/space we wouldn't have to optimise anything.
This paper by Ed Nelson presents a brilliant analysis of the difference between classical logicians and intuitionists.
It starts with a crucial distinction in semantics.
Assuming ∀x¬A and arriving at a contradiction is not a proof of ∃xA.
And yet, simply on the grounds of utility, the intuitionist perspective is far more useful to me as an everyday philosophy, because "incomplete communications" is a mental model of how human systems and interactions work, and of how information flows in general. That which we call knowledge is socially constructed and is always incomplete.
But you are aware of this already; I'm just writing it out.
Look at these provers, they're almost all based on classical logic, and even on proofs by contradiction:
Even Isabelle/HOL, which is quite user friendly and has a lot of automation (like Sledgehammer, which can call to the automatic provers mentioned above) is based on classical logic with choice.
But you also end up with proofs that are much, much shorter.
I guess it's all about the degree of uncertainty. Constructive mathematics reduces the degree of uncertainty, which otherwise grows exponentially as you move into higher-level proofs.
What separates math from pure logic is that math has a set of very intuitive axioms. So mathematics is first and foremost about stretching and fixing inconsistencies in our intuition. I personally see no value in trying to refactor the foundations of mathematics, like constructing integers out of sets etc.; taking integers and their operations as axiomatic hurts nobody. Theorem provers seem to be like that as well: people don't prove new things in them, they just refactor old things to fit in, kind of like people rewriting their services in languages with more street cred like Haskell or Rust.
Taking certain sets of axioms can lead to very unintuitive conclusions; that's why some people care about building a solid foundation. A classic example is https://en.wikipedia.org/wiki/Banach%E2%80%93Tarski_paradox , which states "Given a solid ball in 3‑dimensional space, there exists a decomposition of the ball into a finite number of disjoint subsets, which can then be put back together in a different way to yield two identical copies of the original ball." If we change the foundations to remove the axiom of choice, this paradox (and many others like it) is destroyed.
Once there is "enough" in it, the rest will go in at an exponentially increasing rate, for a few years.
After that, you will need to explain to a journal why your submitted paper doesn't refer to a mechanical proof.