Building the Mathematical Library of the Future (quantamagazine.org)
146 points by misotaur 77 days ago | 45 comments

Lean is the basis for my favorite puzzle game of the past five years: https://wwwf.imperial.ac.uk/~buzzard/xena/natural_number_gam...

It's like enemies & bosses are mathematical theorems you have to prove. Each time you slay one, you get it as a weapon you can use against further, more difficult bosses! The grand finale (Inequality World) is very memorable; such a feeling of accomplishment after I cracked it all. Then you zoom out and realize the math you've defeated reaches, like, fourth grade at most.
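For a flavor of what those "boss fights" look like outside the game, here is roughly the kind of lemma the early worlds have you prove, written in plain Lean 3 (the game hides this syntax behind its own interface, so treat this as a sketch, not the game's exact code):

```lean
-- Early "boss": zero is a left identity for addition, proved by
-- induction, using only `refl`, `rw`, and an earlier lemma as "weapons".
theorem zero_add' (n : ℕ) : 0 + n = n :=
begin
  induction n with d hd,
  { refl },                      -- base case: 0 + 0 = 0
  { rw [nat.add_succ, hd] }      -- inductive step uses the hypothesis hd
end
```

Once proved, `zero_add'` becomes a rewrite rule you can wield in later levels, which is exactly the "slain boss becomes a weapon" mechanic.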

Lean revolutionized my understanding of math itself, which I wrote about here: https://ahelwer.ca/post/2020-04-05-lean-assignment/

Oh man that was fun! Totally nerd-sniped my day though.

Got me to go back and re-watch Kevin Buzzard's talk at MS Research on Lean, "The Future of Mathematics?"



Thank you for sharing this, I'm having a lot of fun with it!

I've been approaching proofs from a different angle, going through the book "How to prove it" and making a complicated setup to use an old applet recommended in the book (ProofDesigner, which requires old versions of Firefox to even load the applet). While that part helps with formal symbols, it's very rigid and aging compared to this fantastic resource and language.

How many hours does it take to complete?

I'd say 10 or so?

I’ve long awaited the day we have automated proof checking. As someone who started in CS but went for a degree in mathematics, I held on to the idea of some sort of test suite that newer mathematicians (who tend to come up with already-existing or already-disproven proofs for fun while thinking they’re on to something) could run to determine whether they’re wasting their time, so it doesn't have to be a classmate making fun of them (sadly I’ve done this to kids who were just excited to try and learn, as if I had any room to talk).

And it would only scale from there; hell, we could run such a test suite against all known proofs, and who knows how many ancient theorems we could disprove. I don’t think this is that test suite, but something like it, together with the neural network OpenAI built to check proofs, is a good start.

What do you gain from automated proofs? Take the infinitude of primes. Plenty of textbooks present different proofs of this fact to deepen your insight. Having an oracle-like entity tell me whether something is true or false is unlikely to have the same result, and I’m afraid “following the automated proof” won’t help much in this regard.

They aren't talking about something that tells you whether a statement is true or false, but something which you can present your line of reasoning to, and it will check that your reasoning doesn't include an invalid step, or a gap that would need to be filled in.

I've definitely had the experience of "I think this is a valid proof. I haven't seen it presented anywhere else, and I'm not super familiar with the subject matter. It feels a little socially awkward to ask someone to check if it is right", and I started trying to express it in Lean.

However, I found it difficult to get the implicit type parameters to work right, so I had trouble even formulating the claim in Lean. (the claim was a claim in category theory, and the category theory library for Lean has lots of implicit argument stuff, and probably shouldn't have been the first thing I tried to use Lean for. Also the claim I was trying to show is probably very simple for category theory people.)
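To illustrate the kind of implicit arguments involved (my own toy example, assuming mathlib's Lean 3 `category_theory` library; this is not the parent's actual claim):

```lean
import category_theory.category.basic

open category_theory

-- Even this trivial statement hides a lot: the universe of C, the
-- category instance on C, and the objects X and Y are all implicit,
-- and Lean has to infer every one of them before the claim typechecks.
example {C : Type*} [category C] {X Y : C} (f : X ⟶ Y) :
  𝟙 X ≫ f = f :=
category.id_comp f
```

When any one of those implicit pieces fails to unify, the error messages can be hard to decode, which matches the parent's experience of struggling even to state the claim.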

Increased assurance that the proof is correct and the various standard benefits of digital over analogue data. This software automatically checks the validity of the proof (with a very small, manually verified core of code doing so), it does not automatically generate proofs wholesale. The proof itself is written by the mathematician interactively, which is still a manual process that requires skill, creativity and thought.

One of the usual examples given in reaction to this question is the puzzle of Euclid's 5th postulate, the parallel-lines postulate. People tried for centuries to prove that it could be deduced from the other axioms. Turned out, it was independent of the others.

A proof system could identify cases where you need hidden assumptions in order to make proofs. In this case, you would get a null result from the oracle, which would be valuable in itself.

[1] https://en.wikipedia.org/wiki/Parallel_postulate

At best a proof checker can invalidate a purported proof; it absolutely is not going to prove that no proof exists (that would require affirmatively proving that the result is independent of the axioms you picked, which is a lot harder).

Many incorrect proofs of this postulate were given over the years. (For instance, in some cases these erroneous proofs have steps that implicitly use the postulate itself.)

The proof checker could detect such errors in a purportedly clean proof.
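Proof assistants can make exactly this kind of hidden assumption visible. In Lean 3, for instance, `#print axioms` lists every axiom a proof ultimately depends on (a small illustrative example, not related to geometry):

```lean
-- Excluded middle looks innocent, but it smuggles in classical axioms.
theorem em' (p : Prop) : p ∨ ¬p := classical.em p

-- `#print axioms em'` reports the classical axioms this proof relies on,
-- the same way an implicit use of the parallel postulate would surface
-- in a formalised geometry development.
#print axioms em'
```

A purported proof of the fifth postulate that implicitly used the postulate itself would simply fail to typecheck in a development where that axiom had never been declared.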

First of all, mathematics is not about T/F. Think of the axiom of choice, the law of excluded middle, the axiom of inaccessible cardinals, etc.

Second, understanding why something is true and actually proving it can be related, but they're not equivalent. Intuitive ideas can lead to the right formal technicalities, but that is not always the case.

There are two aspects of mathematical proofs: communication of ideas and formal verification. Conflating the two can be dangerous, and this is part of what led mathematicians to type theory and formal proofs.

Voevodsky (you can google him to find out who he was) lost his faith in "human proof checking" after an error in a peer-reviewed, published result of his was accidentally discovered, after years of nobody doubting its validity.

He dreamed of a world where people do and enjoy mathematics freely, leaving the boring parts to computers, just as mathematicians nowadays don't worry about how their writing will be typeset and communicated: we write the proof and leave the actual typesetting to the computer. It's called (La)TeX.

What you're saying is like claiming a program is correct because its logic is correct, without ever knowing whether the code even compiles. More extremely, there's little value in piling up pseudocode that we can't tell will ever terminate or not.

> automated proofs

You've got a point, but in this case you are barking up the wrong tree. Aren't they talking about automated proof-checking?

Yes, though *waves hands* machine learning can in principle be used to create a proof generator from a proof checker.

You could ask questions that would normally take hours or days or years to know the answer to, and have the answer immediately. Sure you still need to work on understanding why, but you'd only work on understanding why for something correct. How could that not lead to a stronger intuition?

It’s more something that can tell you if you’ve already been disproven and it would save man hours.

There are several proof assistants trying to do what Lean is doing (Coq and Isabelle, to name but a few). Some of them have very well-funded research projects with similar goals: https://cordis.europa.eu/project/id/742178

When programming 'business logic', the multitude of programming languages is a good thing: it allows one to choose the best way to model a problem.

With computer aided proofs, I do wonder if the multitude of systems trying to create a mathematical library of Alexandria is a good thing.

Essentially they're re-formalising the same theorems across different languages. Certainly some languages will model a specific problem better, but it does feel like a duplication of effort.

There is work on making proofs portable from one system to another. One effort is Dedukti, which is based on the idea that "lambda pi modulo" is a universal logic for theorem provers. A less ambitious effort is OpenTheory, which from what I can tell is mostly about being able to interoperate between different HOL flavors.

And one of the efforts I personally find most interesting is Metamath Zero, which attempts to be an interchange format among other things, with connections to HOL and Lean. See section 5 of: https://arxiv.org/abs/1910.10703

I think there are many provers which theoretically could be used for this effort, and am not convinced Lean is the absolute best, but I applaud the community for actually going ahead and doing it, instead of farting around with tools and hoping the library will magically happen (a valid critique of HoTT in my opinion).

The author of the Metamath Zero paper is actually one of the most active contributors to the project described in the article[0].

[0]: https://leanprover-community.github.io/mathlib_stats.html

Thanks for sharing! I remember seeing Dedukti a while back, but I hadn't seen some of these newer translation efforts.

I do hope the translation systems are actively used to consolidate, rather than built as an academic exercise.

There seems to be a large gap between vision ('our project will encapsulate all of mathematics in one place') and reality ('our project will just proliferate another standard')


It seems to me like Lean has a lot of the momentum at the moment. It's clear that other projects like Coq suffer from a lack of traditional mathematicians formalising modern, fashionable topics. Kevin Buzzard's implementation of perfectoid spaces in Lean is a great example of that boundary being broken; as far as I can tell, the most advanced Coq proofs are the four colour theorem and the Feit-Thompson theorem, whose analogue (pen-and-paper) proofs date from the 60s and 70s.

But you have to take into account that the Coq proofs are actual proofs of big THEOREMS. What Buzzard has done is implement a big DEFINITION :-)

Also, the Flyspeck project has formalised the proof of a pretty recent theorem as well, mostly in HOL-Light and Isabelle.

There are indeed several different proof systems. "Formalizing 100 Theorems" tracks the progress of the "top" ones compared to a set of well-known proofs:


The top ones (sorted from most to least complete) are HOL Light, Isabelle, Metamath, Coq, Mizar, and Lean. Lean actually has the fewest proofs of those systems, though of course there are advantages to Lean as well.

I think some duplication is inevitable because there are fundamental disagreements on "what is important".

One problem with most of them as a "store of proof knowledge" is that most of these proof systems don't (normally) store the proof in terms of axioms, definitions, and previously-proven theorems. Instead, they store a program (as a sequence of tactic statements) that, if run, is intended to help the prover rediscover the proof... assuming that tactics never change (even though they do). As a result, older proofs in some of these systems can no longer be verified as being correct. This is one thing that attracts me to Metamath; Metamath stores literally every step, with all mathematical knowledge (including all axioms and definitions) in the database, so proofs stay proven. It also makes verification super-fast, e.g., 28K theorems can be verified in less than 1 second.
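The distinction shows up inside Lean itself: the same lemma can be recorded as a tactic script (re-run on every check, so sensitive to tactic changes) or as a fully elaborated proof term, which is closer in spirit to what Metamath stores (a small illustrative sketch in Lean 3):

```lean
-- Tactic script: replays a small search each time the file is checked,
-- so it can break if the behaviour of `rw` ever changes.
example (a b : ℕ) : a + b = b + a :=
by rw nat.add_comm

-- Term proof: the exact proof object is written out, so nothing needs
-- to be rediscovered when verifying it later.
example (a b : ℕ) : a + b = b + a :=
nat.add_comm a b
```
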

Others believe other traits are important, so they end up with different systems.

As far as translations between systems, I'm very interested in what happens with Metamath Zero. That work might, in the end, enable mutual translations between various systems. We'll see!

(Disclaimer: I'm addicted to Lean. My view is biased.)

Note that Lean is not so high on that list, but also note that serious theorem proving in Lean is less than 5 years old, whereas the other systems are (several) decades old. There are at least two entries on the list that have only been formalized in Lean, and not in any other system. We are catching up pretty quickly.

It's clear that this list has an important status. But in my opinion, building a library of undergrad maths, which is what TFA is mainly about, is a lot more important. We are tracking our progress over at https://leanprover-community.github.io/undergrad.html

I am not aware of such an organized effort in other systems. Without such a fleshed-out undergrad library, we'll never get to systematic formalization of (post)grad mathematics.

The exact same thing is said about other programming languages.

I agree, and discussed this a little.

There is one key difference with the theorem provers: many of them share this lofty goal of building a library akin to a 'library of Alexandria'. (The article discusses this vision, and the research project I linked to at Cambridge is called 'Alexandria'.)

It's a grand vision - to formalise all mathematical knowledge in a library.

This vision doesn't exist in the world of regular software.

Nobody to my mind has claimed they're building a library of Alexandria. You could make the claim for npm, but I might shed a tear if you did that.

This is actually pretty cool. Every time theorem proving comes up, I make some mild complaints about not having any examples from real analysis or calculus, like the mean value theorem. They in fact do have them, and they can be found here [1]. Personally, I don't find it particularly easy to read, but I appreciate that there are some good examples of proofs that more people would be familiar with than, for example, something from number theory.

[1] https://leanprover-community.github.io/mathlib_docs/analysis...

Video demonstrating what it's like to use Lean interactively:


Here’s a nice paper and talk by the mathlib folks at a January 2020 proofs and certified programs conf https://popl20.sigplan.org/details/CPP-2020-papers/14/The-Le...

Might be fun reading for folks

See also

The Mathematical Functions Grimoire http://fungrim.org/

The Future of Mathematics? [video] (youtube.com) https://news.ycombinator.com/item?id=21200721

One cool thing about Lean is that the next version is trying to become something like a general-purpose programming language, with the idea of "using proofs to write more efficient code."

For example, you might prove that your code never accesses a buffer out of bounds, so you can remove all your bounds checks for efficiency. Another example is implementing recursive functions with induction -- you can implement Fibonacci naively (fib 0 = 0, fib 1 = 1, fib n = (fib n-2) + (fib n-1)) and Lean will rewrite it into an efficient (non-exponential) form.
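For the bounds-check case, current Lean 3 already hints at how this works: an index comes packaged with a proof that it is in range, so no runtime check is needed (a toy sketch using core Lean's `vector`; the Lean 4 machinery described in the slides is more elaborate):

```lean
import data.vector

-- `fin 5` is "a natural number together with a proof it is < 5",
-- so `vector.nth` needs no bounds check at all.
def third {α : Type} (v : vector α 5) : α :=
v.nth ⟨2, dec_trivial⟩   -- `dec_trivial` discharges the proof 2 < 5
```
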

See: https://leanprover.github.io/talks/LeanPLDI.pdf

I wonder if in addition to the proofs this could also be used to provide generated "meta textbooks". The towers of theorems must be a graph, right? So shouldn't I be able to say "I want to understand Calabi–Yau manifolds" and get a custom-made exercise set all the way down to set theory?

Add an invertible "examples" generator so examples can become exercises and keep track of all the exercises you've already done (perhaps with some kind of spaced-repetition system built in) and you can just keep "fleshing out" your own graph of practiced math knowledge starting with wherever you want to look next.

The tower of theorems may be a graph, but it is not a directed acyclic graph. (You probably want it to be directed, though...)

That is, you want to get to Z? Well, you start at A. But you can get there by the sequence B, D, H, R, Z, or you can get there by B, C, F, H, R, Z, or by B, C, F, J, Q, S, U, Z. There is more than one route to proving many important theorems.

That's not what the term "directed acyclic graph" means. "Directed acyclic graph" means that a digraph has no directed cycles, not that it's a directed graph with no undirected cycles. Otherwise the term would just mean a digraph which forms a forest.

The actual representation you want wouldn't be a regular graph, but a graph distinguishing theorems and proofs, where the antecedents of theorems are proofs of those theorems, and the antecedents of proofs are theorems they rely on. This kind of graph seems like it should come up regularly enough to have a special name, but I haven't been able to find one.

One important property to note is that a fair number of theorems exist where you can prove A with B or prove B with A, so the underlying directed graph is definitely very cyclic.

Maybe you've encountered it before, but your comment sort of reminds of the concept of a hypergraph (https://en.wikipedia.org/wiki/Hypergraph).

I'm not sure if this is equivalent or related, but I'm imagining an object which can contain, let's call them hypernodes. For example, in the case where A and B mutually imply each other, (A, B) is a hypernode, which internally contains a graph representing the relationship between A and B, but that relationship can be ignored in the main graph. I imagine this would become untenable with a slightly more complex network, though.

A graph whose nodes have graphs as attributes is not a hypergraph. Hypergraphs do not modify the definition of a node, only the definition of a (hyper)edge.

From the best AGI project out there: A hypergraph is much like a graph, except that the edges, now called “hyperedges” can contain more than two vertexes. That is, the hyperedge, rather than being an ordered pair of vertexes, is an ordered list of vertexes. The metagraph takes the hypergraph concept one step further: the hyperedge may also contain other hyperedges. https://github.com/opencog/atomspace/blob/master/opencog/she...

Thanks for the clarification, that makes sense. I just would not have been surprised to learn that two different looking extensions of the basic graph concept might turn out to be equivalent, or dual.

Is the name you are looking for "bipartite directed graph" ?

Well, specifically, a bipartite directed graph equipped with the specific partition, but if the graph is connected[1], which it should be in this case, then the partition is uniquely defined.

I guess such a graph would define another directed graph which has as vertices, the subsets of the "theorem" vertices, and a directed edge from any such subset to another which has 1 more "theorem" vertex, exactly when the "theorem" vertex has (in the original graph) an antecedent proof vertex such that all of the antecedent theorem vertices of that proof vertex are elements of the original subset of the theorem vertices.

so, if G = (V, E) is the original graph of theorems and proofs, and T \subset V is the set of theorems,

V' := \{ S : S \subseteq T \}

\forall S \in V', t \in T, (S, S \cup \{t\}) \in E' iff:

    t \not\in S, and \exists p \in V s.t. (p,t) \in E, and \forall t' \in T s.t. (t',p) \in E, it holds that t' \in S.
G' := (V', E')

G' _will_ be a directed acyclic graph, because all of the edges strictly increase the size of the set. Then, finding "how can one learn all the prerequisite proofs in order to learn [result]" can be rephrased as "given the current vertex of G' (which is the set of theorems one already knows), what is a path which will get to any of the vertices of G' which have as an element the desired element [result] of G ?". The graph G' would be exponentially large in the size of T, so probably should not be represented explicitly in memory, but it seems like some shortest path algorithms should still help with this? There are many other properties of the graph G' which I imagine would be helpful in constructing a shortest path.

[1] in the sense that if you forgot the direction of the edges, the result is connected, not in the sense that every pair of vertices can be connected by a directed path in both directions
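The search over G' sketched above is easy to prototype without ever materialising the exponentially large graph. Here is a hedged Python toy (theorem names and dependency data entirely made up) where a state is the frozenset of theorems already learned:

```python
from collections import deque

# Hypothetical toy dependency data: each theorem maps to a list of
# alternative proofs, each proof being the set of theorems it relies on.
proofs = {
    "A": [set()],                # axiom-level: provable from nothing
    "B": [{"A"}],
    "C": [{"A"}],
    "D": [{"B", "C"}, {"A"}],    # two alternative proofs of D
    "Z": [{"D"}],
}

def shortest_learning_path(goal, proofs):
    """BFS over the 'known set' graph G': states are frozensets of known
    theorems; an edge adds one theorem all of whose antecedents (for at
    least one of its proofs) are already known."""
    start = frozenset()
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        known, path = queue.popleft()
        if goal in known:
            return path
        for thm, alternatives in proofs.items():
            if thm in known:
                continue
            # learnable if some proof's prerequisites are all known
            if any(deps <= known for deps in alternatives):
                nxt = frozenset(known | {thm})
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [thm]))
    return None  # goal unreachable from the axioms

# shortest_learning_path("Z", proofs) → ["A", "D", "Z"]
```

Note that `seen` only ever holds states actually reached, so although G' is exponential in |T| in the worst case, the BFS only pays for the part of it the prerequisites force you to explore.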

> Is the name you are looking for "bipartite directed graph" ?

I've described it as bipartite directed graph in the past, but a key element of what I describe is that the two sets of nodes generally inherit their properties in different ways. For one set, you need something to happen "for all" predecessors; for the other, it needs to happen "for any" predecessor. It's this extra property I have no name for.

As I understand, Lean is a dependently typed programming language. How does it compare to a language like Idris?

How come mathematicians are encoding math in Lean instead of Idris for example?

Someone please correct me, I don't know much about this at all, but I was under the impression that Lean is in a more "imperative" style, while Idris is a functional language. I would have guessed that a functional language would be a better fit for describing math.

I think the "imperative" part is the tactics—metaprogram invocations, the underlying language is still functional.

Idris is geared more towards programming than theorem proving, the difference basically being that in programming one cares about more than just the types ("I didn't formalize my spec completely"), so the specific inhabitant matters. That makes tactics a bit dangerous.

Now, there is also a school of thought that finds this reliance on metaprogramming (imperative or otherwise) ugly, and an indication that the languages and libraries are not good enough yet. I'm sympathetic, but currently the pro-tactic crowd seems to have the edge.

Lastly, "Lean" is kind of the epitome of "success at all costs" in more ways than tactics: implementation in C++ rather than a nice functional language for dogfooding, willing to sacrifice some intuitionist sacred cows not strictly needed for classical compatibility, etc.

> implementation in C++ rather than a nice functional language for dogfooding

To be fair, the main author of Lean has been cloistered for two years writing the next version, Lean 4, which is written in its own (pure functional) language. The siren song is strong :)

Oh, I did not know that! Will check it out.
