Not as much has been going on in that project as I would have liked. It certainly doesn't help that:
- The standard library is actually quite poor. A lot of basic functions must be implemented from scratch, and then you'll also want to write proofs about them.
- I had to reinvent IO in Coq.
- Most AOC problems are actually not that well-suited for formal verification. (It's certainly no fault of the AOC organizer!)
In particular, a lot of them basically ask you to simulate some system which doesn't have particularly interesting properties, so the implementation itself is often the best specification of the problem, and the main point of using a language like Coq is lost.
I did find some interesting problems, though. For example, day 5 was a simple rewriting system, and the question was to find the normal form of some starting string. Well, who's to say the answer is unique? In this case it's easy enough to convince oneself that the given system actually is confluent, but I set out to prove it formally (and (re)learned a few things about rewriting systems on the way). (cf. Confluent_react_steps and react_steps_injective in day_05_common.v)
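For context, this sounds like Advent of Code 2018, day 5 ("Alchemical Reduction") — I'm assuming that puzzle here, where adjacent units of the same letter in opposite case annihilate. A minimal Python sketch of one left-to-right reduction strategy; the confluence proof mentioned above is exactly what justifies that any reduction strategy reaches this same normal form:

```python
def react(polymer: str) -> str:
    """Reduce the polymer to a normal form by repeatedly deleting
    adjacent pairs of the same letter in opposite case (e.g. 'aA').
    A stack performs the left-to-right strategy in a single pass."""
    stack = []
    for unit in polymer:
        # Same letter, different case: the adjacent pair annihilates.
        if stack and stack[-1] != unit and stack[-1].lower() == unit.lower():
            stack.pop()
        else:
            stack.append(unit)
    return "".join(stack)
```

On the puzzle's published example, `react("dabAcCaCBAcCcaDA")` yields `"dabCBAcaDA"`.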
I know functional languages are slowly starting to adopt dependent types, and you can get reasonable certainty of correctness with property-based testing, but Coq just feels kind of universal. If the proof checks, so does your code.
I really wish I could get my employer to sign off on me using it for a real project.
The goal of programming in Coq would, I suppose, be to shift debugging time from after compilation succeeds to before, with the hope that the compiler will direct one towards the bugs and they will be easier to find. I don't think this is always a win. It is a win if what you write must be very reliable, but it seems less obvious that the advantage persists as the required reliability decreases.
Remember that bug Coq had where, if you used more than 256 parameters in a constructor, you could prove anything?
I suppose one could say Coq is good enough, with bugs.
Maybe that's FUD on my part; I've only read Coq code and never written any of my own.
In any case, at least the way I'm thinking of it, it's not that "good enough, with bugs" is what makes the money. It's more a question of proportionality: it doesn't make sense to spend $100 to guard against a $5 bug. I don't think Coq leaves much room for being pragmatic about test coverage like that.
I'm still wondering how often that will be a problem if high-assurance methods get more adoption. I'll note it's mitigated a lot by the advice we give about where to apply formal methods:
1. Highly critical stuff.
2. That doesn't change at a fast pace.
3. That's known to be verifiable with existing methods in the time frame needed. As in, you want pre-verified components or it to be really similar to past projects.
One thing that supports No. 3 is having straightforward, sequential code that's composed hierarchically with clearly-specified interfaces. You'll mostly just be verifying the behavior of individual components. A change at the bottom will be isolated to that module. Then you prove the integration of whatever calls it, then whatever calls that. With the proper structure, you dramatically reduce the work required for reverification.
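A toy sketch of that shape in Python (the function names here are invented for illustration): the upper layer's correctness argument leans only on a stated interface property of the lower one, so swapping the lower implementation means re-verifying one component's property, not the whole stack:

```python
def parse_digits(s: str) -> list[int]:
    """Bottom component. Interface property that callers rely on:
    returns one int in 0..9 per character of a digit string."""
    return [int(c) for c in s]

def checksum(s: str) -> int:
    """Next layer up. Its correctness argument uses only the stated
    interface property of parse_digits, not its implementation, so
    a rewrite of parse_digits stays isolated to that module."""
    return sum(parse_digits(s)) % 10
```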
" I don't think Coq leaves much room for being pragmatic about test coverage like that."
People who use it often try to use it for everything, because formal verification is their job. In industry, you use it where it makes sense. That might be the most critical component, which is often also pretty simple. Actually, most in industry use model checkers (e.g. SPIN, TLA+) and languages with solvers (e.g. Frama-C, SPARK Ada) for verification. That minimizes their work. The most-adopted methods for the $5 bugs, though, are software inspections (i.e. checklists), static analyzers, test generation from source, and fuzzing. I strongly recommend applying them in order from lowest cost and fastest results to highest cost and slowest results, to avoid wasting time.
Isn't the same true for static typing? One small business requirement changes, you change a foundational type, and suddenly all your code needs to adapt.
Of course this is true with dynamic typing too, but there at least you don't know about it.
Business changes are usually changes in logic, not in types. So it's mostly "our cashback is now calculated in percentage of the sum, not as a fixed sum" etc. Of course, this may still have repercussions through the entire system. I don't know how it would work in Coq (I don't know Coq :) ).
Types _are_ logical propositions. In languages like Coq they are more expressive, and it's easier to leverage them as a way to constrain your implementation. Of course it's just a model, and there can be bugs in the specification, so it's not a silver bullet.
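To make that concrete, here is a tiny illustration in Lean (a proof assistant in the same family as Coq); the predicate and proof below are my own example, not from the thread. The declared type is a proposition, and only a genuine proof can type-check as its implementation:

```lean
-- Evenness as an inductive proposition: a value of `Even n`
-- is literally a proof that n is even.
inductive Even : Nat → Prop
  | zero : Even 0
  | succ : (n : Nat) → Even n → Even (n + 2)

-- The declared type constrains the implementation: nothing but
-- an actual evenness proof for 4 can inhabit it.
example : Even 4 := Even.succ 2 (Even.succ 0 Even.zero)
```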
ML and Coq types, yes. Java, C types? Not so much.
SOLID helps a lot with that sort of thing. Avoiding inheritance like the plague helps a lot. I slightly hate that I said old school, because Smalltalk-style tell-don't-ask helps a lot.
I think that the idiomatic ways of structuring code in the ML family of languages help the most, because they tend to be pretty radical about limiting scope of visibility. But they also tend to make more of a muddle of things than they're worth in algol-style languages.
To be a bit facetious: what would terrify me in software development is some tiny change in business requirements pulling the linchpin out of some fundamental premise way down at the bottom of your stack of functions, thus forcing you to go and fix everything.
There is good proof engineering and bad proof engineering, just like there is good software engineering and bad software engineering. (If only all of us could agree on what exactly is good and bad in either ;-)). Coq has all sorts of modularity and abstraction facilities that you can use to build an architecture that is hopefully resilient to change.
But yeah, it's not perfect. Particularly because proofs are not always like unit tests: At the lowest level you prove things about actual implementations of functions, not just their behavior. In tricky cases, changing a low-level function does mean that you may have to change a lot of proofs. How to avoid this is still an open problem.
Of course, for a C compiler the requirements are less prone to change than for, e.g., some SAP-style business application. As a dev, Coq is just another tool in your belt, and being a good dev means knowing when to use which tool.
So I’m sure you _can_ build something in Coq that is fragile as you describe, but I feel sure that there are alternative approaches that won’t be as brittle.
Also, it's less expensive, given that bugs are cheaper to fix earlier in the lifecycle. You can also reduce breakage during changes to existing code. Finally, you can warranty the code at specific defect rates, or for certain kinds of defects, with fixes free within the warranty. Companies using Cleanroom and Correct-by-Construction did that, with the above benefits.
Let’s take HN for example. Is it Coq-proven? I doubt it. Is that a barrier to its success?
Wait, isn't Coq verified by itself?
It's not possible for Coq's consistency to be verified in itself, thanks to Gödel's second incompleteness theorem. However there are still various other aspects of the compiler that could be verified in principle. The CertiCoq project is actively going in that direction.
I don't think Coq is the way forward (I don't think "code that makes money" has many interesting provable properties along those axes), but being able to ship stuff that's pretty close to bug-free will get you far.
This is certainly the only pronunciation in the French formal methods community, where Coq comes from. It is also the standard pronunciation in the international formal methods community. I have never heard anyone pronounce it like "coke".
See for example Adam Chlipala at about 5:40 here: https://www.youtube.com/watch?v=4DYVJdHMV5k
Benjamin Pierce has a slightly longer o here, but it doesn't sound to me like it's deliberately pronounced differently from "cock": https://www.youtube.com/watch?v=KKrD4JcfW90
And here is Andrej Bauer: https://www.youtube.com/watch?v=COe0VTNF2EA
I know from rare comments like yours that some prudish Americans pronounce it "coke", some deliberately, others apparently because they have been fooled by the first group. But this seems to be a very small minority.
> which has a short o like the English "cock"
No, it doesn't. The sound it has isn't a common English vowel sound at all; it is closer to the vowel sounds in “cuck” and “cook” than the one in “cock”, but not exactly either of those, either. In fact of the four (those two, plus coke and cock), the “o” in “cock” is the one most Americans would find it most distinct from, I would think.
Yeah, none of those pronounce it with the vowel sound in English “cock”.
> I know from rare comments like yours that some prudish Americans pronounce it "coke",
It's not prudish; it's a common way Americans who haven't mastered French pronunciation approximate the unaccented French “o”, irrespective of context.
I was mainly talking about the vowel length. However:
> the “o” in “cock” is the one most Americans would find it most distinct from, I would think.
Really? More distinct than from the "oʊ" sound in American "coke"? (IPA from https://en.wiktionary.org/wiki/coke)
It's clearly because I don't pronounce them well, but it's certainly how I do it.
I'm pronouncing it like "coque" from now on.
Just for reference, another HNer offered up the following in Python for inspection.
There are other posted solution sets on HN if you search. Of course, these are very different languages with different goals and philosophies. It's hard to make a fair comparison, but still, I'm making a subjective one: the Coq code just does not seem compelling.
Of course it’s possible it won’t make sense if you’re unfamiliar with the syntax or style. That’s like saying Haskell or Erlang are uncompelling from a brief glance, which ignores the benefits of using those languages.
As a brief aside, I've often encountered the claim that these goals make Python "English-like." For the grand failures achieved by pursuing such a mistake, one need look no further than COBOL and AppleScript. Python is clear and readable because it is relatively consistent, concise, and reads like how we are taught to expect pseudo-code to read, not because it reads like an English sentence.
It could be argued that Coq's goal in the syntax and semantics department is mathematical clarity, which is somewhat different from the kind of clarity Python pursues. Mathematical clarity favors terseness, simplicity, and consistency above all else, because these features are necessary to express complicated ideas succinctly and unambiguously. As an argument in favor of this definition of clarity over Python's, I submit that programming correctly in a formal sense is quite difficult to do without tools that encourage it, as Coq and other languages operating at the level of generality of the calculus of constructions do. Most statically typed languages do not have type systems powerful enough to express the properties that Coq can, and among those that do, I'm not sure any are syntactically simpler.
Coq indeed sacrifices what is, for a lot of traditionally trained programmers (myself included), the immediate familiarity of imperative pseudocode that Python expresses so well. What is gained is a highly general yet simple set of tools which can prove things about programs that are simply impossible with other tools. I would also argue that once the syntax is learned and the terseness adjusted to, Coq code is easy to read as well, and that the tradeoff is worth the learning period for me, but you may have different needs and priorities. I am intrinsically interested in rigorous formulations of software correctness, and I realize that makes me a bit more odd than the average programmer.
but Coq is also proving correctness so that seems expected