> Does anyone know how this compares to something like Idris?

Coq is really two languages:

- Gallina, a pure functional programming language a lot like Idris, used for 'straightforward' definitions.
- Ltac, an awkward imperative language for incrementally building up proof objects using a variety of "tactic" primitives. Ltac is executed at compile time to generate terms which would be cumbersome or impossible to write directly in Gallina (a bit like Lisp macros).
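To make the split concrete, here's a minimal, self-contained sketch (plain vanilla Coq, not taken from any particular project): a Gallina definition, followed by an Ltac script that builds a proof term about it at compile time.

```coq
(* Gallina: an ordinary recursive definition. *)
Fixpoint add (n m : nat) : nat :=
  match n with
  | O => m
  | S n' => S (add n' m)
  end.

(* Ltac: a proof object built incrementally with tactics.
   "induction", "simpl", etc. run at compile time and the
   resulting term is what gets checked and stored. *)
Lemma add_n_O : forall n, add n O = n.
Proof.
  induction n; simpl.
  - reflexivity.
  - rewrite IHn. reflexivity.
Qed.
```

The `Qed.` is where Coq type-checks the term the tactics produced; you could in principle write that term by hand in Gallina, but for nontrivial lemmas it's far easier to let tactics generate it.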
Most of Coq's power is exposed via Ltac, e.g. most of the proofs in http://adam.chlipala.net/cpdt/ are generalised into powerful Ltac functions; the reason for this is explained at http://adam.chlipala.net/cpdt/html/Cpdt.Large.html
Unfortunately, this focus on Ltac has left Gallina a little impoverished. For example, dependent matching in Gallina is pretty horrible (compared to, say, Agda; although there are complications there e.g. with the K axiom), whereas in Ltac we can simply use a tactic like `destruct`.
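For a taste of why dependent matching in Gallina is painful, here's the classic example (a sketch using a standard length-indexed vector type): writing a safe `head` requires an explicit `in`/`return` annotation on the `match` just to convince the type checker, whereas in proof mode `destruct` does this bookkeeping for you.

```coq
(* Length-indexed vectors. *)
Inductive vec (A : Type) : nat -> Type :=
| vnil  : vec A 0
| vcons : A -> forall n, vec A n -> vec A (S n).

(* A safe head: only defined for non-empty vectors. The "in ... return"
   annotation tells Coq how the result type depends on the length index,
   so the impossible vnil branch can return a dummy unit value. *)
Definition vhead {A n} (v : vec A (S n)) : A :=
  match v in vec _ m return match m with 0 => unit | S _ => A end with
  | vnil => tt
  | vcons x _ _ => x
  end.
```

In an Ltac proof, `destruct v` (or `dependent destruction`) generates all of that annotation machinery behind the scenes.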
Personally, I'm a fan of http://plv.mpi-sws.org/mtac/ which allows regular Gallina, plus a few extensions, to be used instead of Ltac. Basically, any 'unsafe' action we might have done in Ltac, like general recursion, gets wrapped up in a monadic datatype "M". During compilation, an "M foo" value can be forced to obtain the "foo" value it wraps, which is then inserted into the output just like an Ltac result. If that forcing fails, or loops forever, so does the compilation (again, like Ltac).
Mtac feels a lot like using Idris IMHO.
It would be closer to "caught" /kɔːt/, but with a short vowel and a /k/ sound at the end.
EDIT: I may have a slight bias as a "General American" user, but I just noticed the parent actually also includes a length suprasegmental modifier, which probably means they're a Received Pronunciation English speaker, or copy-pasted the IPA from somewhere :-D. At any rate, my point that "it sounds like the vowel in 'caught' is pretty ambiguous" still stands.
The LSRA is also extractable as a Haskell library, so they can use it in their compiler:
From what I understand from John, implementing it was quite a challenge. It's certainly not for the faint of heart, but it is not completely impossible. In any case, they seem to have had success with it and are moving on to other components of the project, from what I can tell.
That said, from my own personal experience, you're probably a lot better off trying to see if you can formulate your problems in a more domain-specific way and leverage tools for that. Full blown general theorem proving is much more difficult, and I try to think of it more as something to resort to if easier approaches don't work.
We still haven't entirely wrapped our heads around the fundamentals like equality (just because it's fundamental doesn't mean it's simple), so there's still a lot of work to be done in the DT world. DTs are one of those cases where I feel the curve is very much exponential: getting 98% of the "correctness" of any given piece of code is actually not that difficult. But going from 98% to 99.9999%, like Coq offers? It's outrageously hard. Sometimes that 1.9999% is absolutely mandatory to have, though (considering CompCert is used in e.g. Airbus/aerospace software, yeah, the extra 1.9999% is possibly vital). For most people, it isn't. Even in Haskell, I tend to stay away from super-advanced type level features, because they can introduce incidental complexity, and Haskell is already somewhere around that 95% mark anyway. With enough type level stuff, it can probably get to about 98%, but with some incidental complexity. So in the vast majority of cases I don't stand to gain much, unless things get extreme...
(And personally, out of any theorem prover, I'm most excited about Lean rather than Coq, because of its speed, its API and automation features, and useful capabilities like programming directly through meta definitions and the Lean virtual machine -- and things like C++ code generation.)
> We still haven't entirely wrapped our heads around the fundamentals like equality (just because it's fundamental doesn't mean it's simple), so there's still a lot of work to be done in the DT world.
Yeah, even though in the end the proof techniques are similar in algorithms that are complex enough (invariants, refinement), I think that DTs in particular add a lot of accidental complexity (but that's because they try to do more than "just" verification). The difficulty of equality is not fundamental to computations -- it's an artifact of the functional formalism...
> But going from 98% to 99.9999%, like Coq offers? It's outrageously hard.
That's absolutely true. Much of the problem has to do with the fact that for 100% you need to do end-to-end verification, which means you must reason about the programming language's syntactic constructs, yet with semantics that still generalize and can be reduced (abstracted) to global properties. That's a tall order.
> I'm most excited for Lean
I'll give it a look. I really enjoy TLAPS, the TLA+ proof assistant, but I'm always interested to see what else is out there.
But I don't really mean that "equality is fundamentally hard to compute because of functional programming" or anything. Rather, it might be the other way around: we haven't given a computational meaning to equality, which is why it's all so bad! It's more that equality is ugly at the moment, and this results in some of the incidental complexity I was talking about. To get rid of that ugliness, adopting equality as a computational thing is powerful! The notions of equivalence vs. equality vs. isomorphism, etc. seem to get blurry over time, at least in this amateur's mind. And that's not even getting into the problem that you have to establish "equalities" independently of the theory, which is part of, if not the whole, problem. (E.g. in set theory, "a equals b" only makes sense once you establish an outside rule for what "equals" means. In HoTT, "equals" is not defined independently of the theory at all -- it's in the same universe, so equality is "computational" on its own, in the sense that you can talk about equalities, pass them around, and establish equalities between equalities, etc. This leads to the notion of infinity-groupoids, univalence, and so on.)
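A small Coq sketch of what "equality as a first-class thing" buys you (standard Coq, nothing HoTT-specific):

```coq
(* Equality proofs are ordinary values: we can take them as arguments,
   transform them, and return new ones. *)
Lemma my_sym {A : Type} (x y : A) : x = y -> y = x.
Proof. intros H. rewrite H. reflexivity. Qed.

Lemma my_trans {A : Type} (x y z : A) : x = y -> y = z -> x = z.
Proof. intros H1 H2. rewrite H1. exact H2. Qed.

(* We can even *state* equalities between equality proofs. Whether all
   such proofs are equal (UIP / axiom K) is precisely where HoTT parts
   ways with the traditional picture. *)
Check forall (A : Type) (x y : A) (p q : x = y), p = q.
```

In HoTT, that last statement is famously not provable in general: the `p = q` layer has its own structure, and so on upward, which is where the infinity-groupoid picture comes from.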
Granted, it's not like the entire maths community is scrambling to fix this like the Manhattan project; but the notion of equality-being-hazy seems to be understood, even outside of functional formalisms. It seems to be more of a problem in higher order mathematics, too, which is where the majority of people who care about it work, it seems.
While homotopy type theory is the current en vogue approach to solving the hairy problems of equality (among other things), since it unifies all of this... actually giving it a computational interpretation is still an open question. So one problem is solved and another pops up, as usual...
Perhaps, but not for computation. There is a very clear, very nice, partial order on various kinds of computational semantics, and they all become a nice, "configurable" sliding scale once you work in a formalism that expresses computation more naturally than functions (that's because computations are simply not functions, at least not functions from the input to the output, and not even partial functions). Once you decide to work with functions you need to think of things like extensional vs. intensional equality etc., which simply aren't an issue in other formalisms.
> It seems to be more of a problem in higher order mathematics, too.
True, but the need for "higher-order mathematics" in the modeling of computation in the first place originates with the choice of functions as models of computation...
Consider this: computation bears a lot of resemblance to continuous dynamical systems (computation is essentially a discrete dynamical process). Now, for (continuous) dynamical systems, suppose you want to make your system "higher order": the derivative of the system will itself be a dynamical system. There is still no need to go higher-order mathematically; dynamical systems are usually represented as first-order ODEs regardless of their conceptual "order". Instead, all you do is add another dimension to your state space. As the size of your state vector is arbitrary, and as the mathematics of increasing it is very clear, all you need to do to "go deep" is "go wide".
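For reference, this is the standard reduction-of-order trick: a second-order system becomes a first-order one by widening the state vector.

```latex
% A second-order ODE in x becomes a first-order system in (x_1, x_2):
\ddot{x} = f(x, \dot{x})
\quad\Longleftrightarrow\quad
\begin{cases}
\dot{x}_1 = x_2 \\
\dot{x}_2 = f(x_1, x_2)
\end{cases}
\qquad \text{with } x_1 = x,\; x_2 = \dot{x}.
```

Each extra derivative costs one extra state dimension, never a step up in the mathematical machinery.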
But perhaps I'm being unfair. While functions are not a very good abstraction for reasoning about computation, they are a very useful programming abstraction, and for end-to-end verification you must reason at the programming language level (although there may be better ways of doing that, too).
> True, but the need for higher-order mathematics in the modeling of computation in the first place originates with the choice of functions as models of computation... But perhaps I'm being unfair. While functions are not a very good abstraction for reasoning about computation, they are a very useful programming abstraction, and for end-to-end verification you must reason at the programming language level (although there may be better ways of doing that, too).
I kind of left "higher order mathematics" a bit vague... Personally, in all the proofs I've needed, you don't have to go as far as topological spaces or n-categories or whatever, which is what I'd consider "higher order"; but that's the space where a lot of the focus on wrangling equality seems to come from, which is what I was getting at. Then again, most people using Coq heavily are probably exactly the people who care about this. :) I'm not one of those people, I just like programming.
And personally I find the choice of functions, etc. as a notion of computation quite powerful, even if it gets hairy sometimes! I suppose you could call it pigeonholing or laziness, but I tend to get a lot of insight from viewing mathematics from a computational, constructive point of view like this. I use it more as a "bridge" to get places rather than an overarching view of mathematics, I suppose, and I think it's quite useful at that (e.g. recently I found that I prefer the computational, type-theoretic view of quotients over the non-type-theory 'quotients' where you're really talking about partitions and equivalence classes, etc.)... This is also probably partially due to the fact that I've spent the majority of my programming career with systems like Haskell. So I'm fairly comfortable with "computational mathematics" from this POV.
In other words, I don't think that an interesting way to apply computational concepts (of typed lambda calculus) to the foundations of mathematics is necessarily the best way to apply mathematical thinking to computation. "Computational math" may not be the right math for "mathematical computation". There's an example of this unjustified (although possibly true -- we don't know) reasoning in the introduction to the Lean paper:
> In practice, there is not a sharp distinction between verifying a piece of mathematics and verifying the correctness of a system: formal verification requires describing hardware and software systems in mathematical terms, at which point establishing claims as to their correctness becomes a form of theorem proving. Conversely, the proof of a mathematical theorem may require a lengthy computation, in which case verifying the truth of the theorem requires verifying that the computation does what it is supposed to do.
The problem is that the kinds of computations relevant in the first case and the kinds of computation in the second case can be wildly different. Computations used in proofs are very much "functional" (map input to output). Computations in real-world software systems that really need formal verification rarely are. The most common type of computation in either category is the least common in the other, so the assumption that the same mathematical foundation (and/or the same tool) should be used for both is unjustified. While the authors use the term "in practice" to preface their statement, judging by their publication history, I'm a bit skeptical of their experience in verifying real-world software.
Nevertheless, Lean does seem like a great way to learn about dependent types, and as you pointed out, the work on dependent types and their possible application in software verification is far from done, so it's very hard to judge at this point whether or not the direction is the right one. In the end, the proofs and techniques of software verification turn out to be very similar in all formalisms.
I know of some small scale use at Facebook (where I work), where it has been used to verify a non-trivial protocol under various combinations of failures.
Whilst it does discuss that Amazon used TLA+ in nontrivial circumstances, it's still a long mental leap for me to get to "TLA+ for AWS S3 looked like this, and this is the sort of bug it helped quash".
It'd be great to see some of that sort of thing.
The kind of bug you find when verifying distributed systems is: "if this machine was in that state and that message was in transit, and then the other machine failed, then data was lost".
Basically, because formal verification of this kind ensures the specification is correct, it will find any kind of bug there is. But it's those bugs that are either the result of a severe flaw in the design, or that would be hard to find in testing but whose consequences may be serious, that make this worthwhile.
It has some objective qualities and some subjective ones. The objective ones (as confirmed by others) are that it's easy to learn and you can get productive very fast. I immediately started specifying the large project after two weeks of going through Lamport's tutorial. The other objective benefit is that it has a model checker. Deductively verifying such large specifications (while possible) is simply infeasible (or, rather, unaffordable and overkill) -- regardless of the tool you use.
The subjective benefit is that the conceptual model quickly clicked for me, personally. I've never liked the aesthetics of modeling arbitrary computations as functions, and the TLA formalism handles sequential, concurrent and parallel computations in the same, extremely elegant way (and allows using the same proof technique regardless of the kind of algorithm, if you want to use the proof assistant). It made me understand what computation is, and how different computations relate to one another, a lot better. The concept -- without the technicalities of TLA+ -- is nicely overviewed in .
That said, formal methods are never easy because rigorous reasoning about complex algorithms isn't easy, but it beats hunting down bugs (especially in distributed systems) that are very hard to catch and may be catastrophic; it is also very satisfying. I've found TLA+ to add very little "accidental complexity" to this difficult problem.
There's also a Coq framework called Verdi (http://verdi.uwplse.org) for formalizing distributed systems, but I don't know much about it.
I'm really interested in getting started with formal verification, but I have a more "engineering" background and don't really know where to start. Coq, Isabelle, Lean... There's quite a few of these tools out there!
If you absolutely must have end-to-end verification, namely you must verify that global program correctness properties are preserved all the way to the machine code, then you should probably use SCADE if you're writing embedded real-time software; otherwise, consider Isabelle.
I'm thinking of trying out TLA+ this holidays to spec a couple of algorithms at work.
I haven't used it but I've seen some examples of it. In many ways it is similar to TLA+, and Lamport says it's a nice, mathematical language. From an aesthetic point of view, it is less elegant: it has many more built-in concepts. From a pragmatic point of view, I think it does not support concurrency at all. Before creating TLA+, Lamport had intended to add TLA to Z for concurrency. This was the result: https://research.microsoft.com/en-us/um/people/lamport/pubs/...
Anyway, here's a comparison by Lamport: https://research.microsoft.com/en-us/um/people/lamport/pubs/...
> I'm thinking of trying out TLA+ this holidays to spec a couple of algorithms at work.
Cool. Apart from being very useful, it's a lot of fun.
Also, formal methods, like AI, suffered a "research winter" after making grand promises that they weren't actually able to keep. The hype initially drew a lot of funding, and with the disappointment came a period of relative neglect. Have those two hype-prone disciplines learned their lessons and become able to lower their sights to what's attainable, or will they end up wasting a lot of time and money? I guess we'll see. I think that at least the formal methods community is more cautious now, and focuses on reducing various classes of bugs rather than promising affordable 100% correct software.
It is. That's why Coq is a pure functional language, rather more like Haskell than like C.
> and with the right tooling it should be able to prove many things quickly
Yes, this should be possible, and it is possible.
If you mean you want it to be automated: Isabelle/HOL has more automation than Coq. For many simple things, it suffices to tell Isabelle, essentially, to perform induction on a given variable and from that to try to find a proof automatically. It builds on several external provers to implement this.
What is difficult is for the system to know which existing lemmas to reuse and how to combine them. It's a bit like your math homework, except that these systems know more lemmas (i.e., have a larger search space) and are not always good at ranking them by relevance.
> right now i don't see formal proofs of programs at all (i only hear some rumours that some companies use it at some small scale)
Look up seL4 and CompCert for a start.
ACL2 is an alternative that strives to strike a balance between interaction and automation.
- It is more automated than any of the interactive theorem provers, including Isabelle/HOL, Coq, etc.
- Its authors received the ACM software system award in 2005.
- It has been used in commercial products for several years at companies including AMD, Centaur, Rockwell Collins, etc. There is a not-so-comprehensive but nonetheless impressive list of applications here:
- Also, since the language is a subset of Common Lisp, anyone who knows a bit of programming can very quickly start using ACL2. One does not need to learn type theory or a logical formalism to start using it. It has an Eclipse plugin that is used by first- and second-year undergraduate students (http://acl2s.ccs.neu.edu/acl2s/doc/).
I would guess a database of previously proved lemmas for all the used libraries would help, so that you don't have to recompute everything.
QuickCheck is an awesome library that generates unit tests for specified properties (like "for all lists s: length(sort(s)) = length(s)"; I would then just generate a few hundred lists and check whether the property got violated).
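For comparison, that same property can be stated and proved outright (rather than tested) in a proof assistant. A sketch in Coq, using a naive insertion sort since the point is only the shape of the statement:

```coq
Require Import List Arith.
Import ListNotations.

(* A naive insertion sort on nat lists. *)
Fixpoint insert (x : nat) (l : list nat) : list nat :=
  match l with
  | [] => [x]
  | y :: ys => if x <=? y then x :: y :: ys else y :: insert x ys
  end.

Fixpoint sort (l : list nat) : list nat :=
  match l with
  | [] => []
  | x :: xs => insert x (sort xs)
  end.

(* Inserting one element grows the length by exactly one. *)
Lemma insert_length : forall x l, length (insert x l) = S (length l).
Proof.
  intros x l; induction l as [| y ys IH]; simpl.
  - reflexivity.
  - destruct (x <=? y); simpl; [reflexivity | rewrite IH; reflexivity].
Qed.

(* The QuickCheck property, but for *all* lists, not a random sample. *)
Lemma sort_length : forall l, length (sort l) = length l.
Proof.
  induction l as [| x xs IH]; simpl.
  - reflexivity.
  - rewrite insert_length, IH; reflexivity.
Qed.
```

The trade-off is exactly the one discussed above: the proof covers every input, but writing it takes more effort than generating a few hundred random lists.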
Yes, kind of. For certain values of "similar ease" and "Haskell". AFAIK Agda is pretty much Haskell, but its proving part is not all that automatic. On the other hand, Isabelle has a lot of proof automation (and includes an implementation of QuickCheck that it applies automatically to every property you state when you start proving it!), but its own variant of functional programming is not exactly Haskell. I think you can extract Haskell programs from it, though.
> I would guess an database of previously proved lemmas for all the used libraries would help
It may help, but it may also confuse the proof search if it gets lost in a sea of irrelevant lemmas, which is what I was trying to say above; I never said anything about recomputing stuff.
> Yes, this should be possible, and it is possible.
Regardless of the formal tool you use, be it Coq, Idris, TLA+ or Isabelle, proving what you need proving most -- the non-trivial stuff -- is never quick and easy. The reason it's hard to prove non-trivial algorithms correct is because it's hard to write non-trivial algorithms correctly. Temporal logic, HOL, SMT, dependent types, model checking, abstract interpretation -- they're all useful tools, but in the end, to make the utmost of formal methods, and understand the difficulties and possibilities, one must first understand the nature of algorithms and how they work, and this is always the same, regardless of the particular formalism used. Personally, I've found TLA+ to best isolate the core of the issue with as few distractions as possible, and others may find other formalisms to better suit how they think, but the essential problem is always the same.
It is true, however, that some formal tools impose more accidental complexity than others. The ones that are most refined are those that are used by real engineers on large real-world software every day, like SCADE, TLA+, SPARK etc., while things like Coq and Agda are mostly aimed at researchers and designed to let them explore new ideas.
In practice, the industry uses automated formal methods -- SMT solvers and model checkers -- almost exclusively, with interactive deductive proof used to tie results together or to fill in a few missing gaps. Interactive theorem proving is currently too costly as a primary tool for use in real-world software, in all but the most high-assurance settings.
> Look up seL4 and CompCert for a start.
seL4 and CompCert -- while great achievements -- are neither good examples of formal methods, nor are they particularly encouraging. For one, they were academic projects that took world experts years of effort for what amounts to very small software as far as industry software goes (comparable to a ~10KLOC C program). For another, the kind of verification they required is what's known as "end-to-end", namely global correctness properties are proven all the way down to machine code. This level of correctness is unnecessary for 99.99% of software; perhaps even for no software at all.
Much better examples of the use of formal methods in industry include the use of SCADE to write thousands of safety-critical, hard real-time programs every year; the use of Z and Ada SPARK by Altran UK to write large, complex defense software; the use of SPIN and other model checkers to find flaws in hardware designs; the use of B-Method in the railway sector; and the use of TLA+ at Amazon, Oracle and other companies for complex server-side services.
> proving what you need proving most -- the non-trivial stuff -- is never quick and easy
The OP asked for proving "many things" quickly. You can prove many things in Isabelle by writing something like "by induction n. sledgehammer" and waiting for the automated machinery to do the rest. This doesn't suffice for everything, of course. But it's good enough for "many things", including stuff that would take a lot more work in Coq.
> seL4 and CompCert -- while great achievements -- are neither good examples of formal methods
The OP said they are not aware of any "formal proofs for programs". These are some examples. Many of the other things you mention use formal models, but not necessarily formal end-to-end proofs.
> "end-to-end", namely global correctness properties are proven all the way down to machine code. This level of correctness is unnecessary for 99.99% of software; perhaps even for no software at all.
Compiler bugs are a thing. At Airbus, the generated assembly code is inspected manually for bugs introduced by the compiler (which is untrusted, and takes input C code generated by SCADE's untrusted code generator). You think that's useless voodoo?
> formal methods in industry include the use of SCADE
SCADE is great, and it's "formal methods" in the sense that you write your program as a formal model with clear semantics that can be reasoned about. But it's not the kind of proof the OP asked about, I think.
No, but I also think it's just not worth the effort for 99.99% of software. It's neat that it can be done when absolutely necessary, but that's not where most of the benefit of formal methods can be had. Also, you don't need many certified compilers, so the techniques used to construct them need not be scalable.
> SCADE is great, and it's "formal methods" in the sense that you write your program as a formal model with clear semantics that can be reasoned about. But it's not the kind of proof the OP asked about, I think.
Well, they wrote "I would image that certain languages are easier to prove (haskell vs C)", and SCADE is easier to formally reason about -- be it with a model checker, static analysis or deductive proofs -- than Haskell. Usually, languages with very simple semantics are easier to reason about either automatically, manually or semi-automatically, and the languages that are used in the industry in formal methods (except static analysis) -- even if the formal methods used aren't deductive because that's just too costly -- are those that really are "easiest to prove".
Haskell doesn't have full support for dependent types (yet). The Haskell community is very cautious with adding features that make type inference undecidable (like dependent types), so what people do is gradually add type system features onto GHC and then gate them behind feature flags. So you can do some of the things you can do in Agda in Haskell (like GADTs, kind promotion, limited type functions, etc.) but not everything.
The advantage of Haskell's conservative approach is that it's actually very useful for writing programs, whereas Coq is only really meant for writing proofs and Agda is more of a research testbed than a production language.
I'd say Idris is the closest to a useful general-purpose programming language with proofs, at least that I've found so far. Unless Isabelle is? I've not used it, just came across it a while back, and will have to have a look.
Same when saying "Hoare logic".
It's unfortunate people think of Coq as an academic language. I really think it's something more people should pull out at the workplace rather than confining it to college experimentation.
However, the academic side of it is pretty slick. First, it's amazing how a few dozen lines of Coq can help explain the ins and outs of Hoare logic (https://www.cis.upenn.edu/~bcpierce/sf/current/HoareAsLogic....). If you're into some really heady stuff like homotopy type theory, the HoTT/Coq community has been explosive lately. While it's a great community, they can be a little particular when it comes to pull requests. Don't take it personally.
Writing secure software will become a huge priority over the next few decades. There are many tools out there to create verifiable software, but I suspect there will always be an opening for someone who can crank out rock-solid Coq.
EDIT: Inria is French (although Google translate insists the less fortunate homophone still exists in French).
In French, the word (fr)"bite" translates to (en)"cock". It's pronounced like (en)"bit". And while a lot of people prefer "octet", "bit" has very much become part of the French vocabulary. So, what do we do?
We don't care. We giggle at it when we're in high school and then it stops being funny. In Swedish, (en)"sex" and (en)"six" are the same word (se)"sex", pronounced the same way. They don't care.
I also feel like some people here might be forgetting that (en)"cock" has other meanings in English, including the male bird, which is what (fr)"coq" means.
Edit: Prefixed everything because this post was far too confusing.
Minor remarque changing nothing in the meaning of your message.
PS: remark, not remarque ;)
Which (en)"cock" does (fr)"bite" translate to?
Try playing with https://translate.google.com - the audio is very fair on it (albeit spoken much slower than you normally would).
Maybe Certified Programming with Dependent Types is along the lines of what you are looking for?