
> If you're trying to argue that “computation” is a physical phenomenon in the same sense gravity or electromagnetism are, I have bad news for you: it isn't.

Don't know what you mean by "in the same sense", but Turing, von Neumann, Gödel and Hartmanis would have been saddened by your news because they thought differently (I guess Hartmanis still does). So did Church. The very idea of algorithms -- as Church writes in his review of Turing's paper -- contains a notion of physical realizability. The idea of un/decidability -- only intuitively touched upon in Church's paper, and pretty thoroughly discussed in Turing's -- is predicated on the notion that a computational step takes at least some constant amount of physical time. It is true that a computation isn't a specific physical phenomenon like gravity, but a general, abstract phenomenon that can be realized in different ways. But if it can't be realized, then it isn't considered computation, unless the people discussing it state very clearly that they're discussing hypothetical computation.

Some people take it further and say that not only is computation physical, but, as it is meant to model the work of the mathematician, it grounds all of math in the physical along with it, and things that can't be feasibly computed (in some reasonable amount of time) should be left outside the realm of math. You may have heard this story by Harvey Friedman about the ultrafinitist Alexander Esenin-Volpin who attended his lecture[1], and Paul Cockshott from the University of Glasgow writes this[2]:

> Turing starts a philosophical tradition of grounding mathematics on the material and hence ultimately on what can be allowed by the laws of physics. The truths of mathematics become truths like those of any other science — statements about sets of possible configurations of matter. So the truths of arithmetic are predictions about the behaviour of actual physical calculating systems, whether these be children with chalks and slates or microprocessors. In this view it makes no more sense to view mathematical abstractions as Platonic ideals than it does to posit the existence of ideal doors and cups of which actual doors and cups are partial manifestations. Mathematics then becomes a technology of modeling one part of the material world with another.

As to the rest, there's much more to computation than programming, and your personal line of work is not the definition of utility.

[1]: https://en.wikipedia.org/wiki/Alexander_Esenin-Volpin#Mathem...

[2]: http://blogs.nature.com/soapboxscience/2012/06/20/turing-the...




The notion of a function having any inherent complexity is rather at odds with your earlier assertion that putting any sort of meaning on the result of running a Turing machine is strictly human interpretation rather than a hard fact of computation (and of course, you also turn around and favor certain interpretations over others based on the cost of embedding them in your favorite computation model).


> The notion of a function having any inherent complexity is rather at odds with your earlier assertion that putting any sort of meaning on the result of running a Turing machine is strictly human interpretation rather than a hard fact of computation

I don't see how. That inherent complexity is precisely that computed by the machine (provided that you're talking about a function in the ordinary, set theory sense, rather than in the type theory sense, in which case a function is a set-theoretic function plus a proof of set membership).

> and of course, you also turn around and favor certain interpretations over others based on the cost of embedding them in your favorite computation model

No, the complexity of evaluation differs in the different models by a small factor (at most polynomial), which is much smaller than the complexity differences of the representation. There is no room for (significant) interpretation here: the typed formalisms can perform computations of possibly arbitrary complexity during their validation stage. They aim to prove something, and proving something comes at a cost. There is no interpretation by which proving a possibly complex theorem is considered to have little complexity.
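To make that concrete, here is a small Haskell sketch of my own (a toy, not any particular formalism): accepting the last definition forces the type checker to evaluate the `Add` family, so the knowledge that 1 + 1 = 2 is paid for during validation, before anything runs.

    {-# LANGUAGE DataKinds, TypeFamilies, TypeOperators #-}
    import Data.Type.Equality

    data Nat = Z | S Nat

    -- Peano addition, computed by the type checker.
    type family Add (a :: Nat) (b :: Nat) :: Nat where
      Add 'Z     b = b
      Add ('S a) b = 'S (Add a b)

    -- Accepting this forces the checker to reduce Add ('S 'Z) ('S 'Z) to 'S ('S 'Z):
    -- the "proof" that 1 + 1 = 2 is computed at validation time, not at run time.
    onePlusOne :: Add ('S 'Z) ('S 'Z) :~: 'S ('S 'Z)
    onePlusOne = Refl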


> I don't see how. That inherent complexity is precisely that computed by the machine

The same function may be computed by different procedures with different complexity.

> rather than in the type theory sense, in which case a function is a set-theoretic function plus a proof of set membership

Not all types denote sets.

> There is no room for (significant) interpretation here: the typed formalisms can perform computations of possibly arbitrary complexity during their validation stage.

This is the price of capturing the meaning of computation. Of course it doesn't come for free. But, in general, the earlier you do it, the less work it requires. In the limit, if you do it really late (e.g., test suites), it requires an infinite amount of work (searching infinite input spaces).


> The same function may be computed by different procedures with different complexity.

The whole idea of computational complexity theory -- the time- and space-hierarchy theorems, which are generalizations of the halting theorem and are probably the most important theorems in computer science -- is that every function has an inherent complexity, independent of the algorithm that implements it. The time or space complexity of an algorithm may be greater than or equal to the complexity of the function. You may want to read my blog post about computational complexity in the context of software correctness: http://blog.paralleluniverse.co/2016/07/23/correctness-and-c...

> Not all types denote sets

In this case, what matters is what set they specify (not necessarily denote) and what proof.

> This is the price of capturing the meaning of computation. Of course it doesn't come for free. But, in general, the earlier you do it, the less work it requires. In the limit, if you do it really late (e.g., test suites), it requires an infinite amount of work (searching infinite input spaces).

First, the computational complexity of verifying a program is the same no matter when it's done. Read my other blog post. Second, I'm talking about theoretical computer science here. Obviously there are all sorts of interesting trade-offs in real-world programming. Finally, the whole point of this article is that language models do this extra computational work and solve a different problem, while the people discussing them often don't notice or point this out.


> In this case, what matters is what set they specify (not necessarily denote) and what proof.

How does this even make sense? The whole point to specifying something abstractly is that you want to denote anything that satisfies the specification. And what proof are you talking about? Are you confusing “proof” with “typing derivation”? A single term may admit multiple typing derivations.

> First, the computational complexity of verifying a program is the same no matter when it's done.

You're assuming that you can write correct programs, without taking verification into consideration right from the beginning. This assumption is incorrect.


> How does this even make sense? The whole point to specifying something abstractly is that you want to denote anything that satisfies the specification. And what proof are you talking about? Are you confusing “proof” with “typing derivation”? A single term may admit multiple typing derivations.

I am not confusing anything. Again, this is a matter of overloaded terms. When a typed language expression to compute a function Nat -> EvenNat is given, there are two computations that must be carried out: First, a proof that the function yields an even natural is computed (you call that type checking); second (that's the evaluation step), the actual computation is carried out, and an element in the set of even numbers is computed.

When you want to start counting operations is up to you. You can declare that the validation (type checking) phase is not counted. But the fact remains that if you know something -- i.e. that the result of the computation is a member of a certain set -- that knowledge must be gained at some computational cost. If you know it, then it means that something has paid that cost.
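A deliberately crude Haskell sketch of my own (not the dependently typed setting itself) may illustrate where the cost lands: the knowledge that a value is an even natural has to be computed somewhere; a dependent type checker pays for it during validation, while here the same cost surfaces as an explicit check when the value is constructed.

    newtype EvenNat = EvenNat Integer  -- intended invariant: non-negative and even
      deriving Show

    mkEvenNat :: Integer -> Maybe EvenNat  -- the check that "pays" for the knowledge
    mkEvenNat n
      | n >= 0 && even n = Just (EvenNat n)
      | otherwise        = Nothing
    -- e.g. mkEvenNat 4 ==> Just (EvenNat 4);  mkEvenNat 3 ==> Nothing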

> You're assuming that you can write correct programs, without taking verification into consideration right from the beginning. This assumption is incorrect.

I am not assuming anything. When to do something is a practical, engineering concern. It is obviously of extreme importance. But from a theoretical point of view, the total computational effort required to prove a property of a program is the same (given the same conditions), no matter when it is carried out. Of course, engineering concerns move that complexity around between different machines -- the computer and the human programmer -- based on their respective strengths, but the total work is always the same. Just as sorting an array has a minimum required complexity (O(n log n)) no matter how you do it, so too writing and proving a program correct has an inherent complexity that has to be paid no matter what.


> First, a proof that the function yields an even natural is computed (you call that type checking); second (that's the evaluation step), the actual computation is carried out, and an element in the set of even numbers is computed.

As stated above, not all types denote sets. For instance, in a language with general recursion, (simple) types denote domains, not sets.

> But from a theoretical point of view, the total computational effort required to prove a property of a program is the same (given the same conditions), no matter when it is carried out.

You are ignoring two things:

(0) There is no magical way to implement the correct program if you don't have verification in mind right from the start. You will implement the wrong program, then attempt to fix it, then have another wrong program, then attempt to fix it, etc. That too has a computational cost.

(1) When you design your program upfront with verification in mind, you actually have a different mathematical object than when you don't. (Just like 2 as a member of the naturals is a completely different mathematical object from 2 as a member of the reals.) If your program design is split into modules (as it should be, if your program is large enough), then enforcing abstraction barriers across module boundaries constrains the size of the design space, often dramatically. (e.g., there are too many non-parametric total functions of type `forall a. a -> a` to even form a set, but there is only one parametric such total function: the identity function.) You note that verified properties could be thrown away after the verification is done, but why would anyone in their right mind want to?
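To illustrate the `forall a. a -> a` point, a minimal Haskell sketch (mine, purely illustrative): a total, parametric function of this type cannot inspect or fabricate a value of the unknown type `a`, so the only thing it can do with its argument is return it.

    {-# LANGUAGE ExplicitForAll #-}
    polyId :: forall a. a -> a
    polyId x = x   -- any total, parametric inhabitant of this type behaves like this
    -- e.g. polyId 3 == 3, polyId "abc" == "abc"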


Again, you're talking about engineering. That's great, but I'm talking about theory. You're saying that it's easier to build a building from the bottom upwards -- and that's true -- but theoretically a building would be just as strong if it were somehow built from the top down. Of course, no one would think to build a skyscraper starting from the top, but that doesn't mean that the theory of structure must always incorporate the study of construction. I'm not talking about the process of construction, but the properties of the structure. I'm not arguing with any of what you've said here, simply talking about different aspects. Keep in mind that not all computational processes -- certainly not all studied in TOC -- are programmed by humans at all. Computational processes in a living cell, for example, also have to ensure correctness -- and therefore spend computational resources on "verification" -- yet they don't use symbolic logic for that task. This means that there's value in studying the problem of verification even separately from the study of programming. Indeed, this is how it has mostly been done in academia. The software verification community uses very different concepts from the PLT community, concepts that are much closer to abstract machine models than to language models.


> Again, you're talking about engineering.

No, I'm not. I'm talking about math. (If you read my comment history, my distaste for “engineering” should be obvious, at least if by “engineering” we mean “software engineering”.)

> but theoretically a building would be just as strong if it were somehow built from the top down.

The amount of mental gymnastics required to produce this statement (not to mention expect someone else to believe it) completely baffles me.

> Computational processes in a living cell, for example, also have to ensure correctness -- and therefore spend computational resources on "verification" -- yet they don't use symbolic logic for that task.

Nature doesn't do “verification”, because there's no specification in the first place. Nature just evolves a variety of systems, and those that aren't fit enough to survive, well, just die. Also, the evolution of biological systems happens way too slowly to use it as inspiration for how to design computational systems for our own purposes.


> No, I'm not. I'm talking about math.

Good. In that case, verifying that a program conforms to a spec has, in the general case, the same computational complexity no matter when it's carried out. This is proven and well known. For an overview, you can see my post here: http://blog.paralleluniverse.co/2016/07/23/correctness-and-c...

In short, the reason for that is that every computational problem with a computable upper bound on complexity can be efficiently reduced to verification. The complexity required to verify is a function of the spec, and can be made arbitrarily high.

> The amount of mental gymnastics required to produce this statement (not to mention expect someone else to believe it) completely baffles me.

OK, so let me use another analogy. That you can wind up a wind-up toy before I enter the room, and hide that from me, doesn't mean the toy requires any less external energy to run than if you'd wound it at any other time. Discussions of ergonomic ways to wind the toy (with a larger lever perhaps) may be interesting, but they're separate from this basic fact.

Verification is a computational problem, and one of a certain complexity. That complexity must be paid, no matter what, no matter when. The knowledge that a certain program has a certain property has a price-tag, and there are no discounts. Because all computational problems efficiently reduce to verification, if you believe you've found a loophole, I could use it to solve much bigger problems than ensuring that your recipe application is correct.

> Nature doesn't do “verification”, because there's no specification in the first place. Nature just evolves a variety of systems, and those that aren't fit enough to survive, well, just die.

You're wrong because, again, you're blind to computational complexity and to computational processes that aren't man-made programs. What you described is precisely how nature computes. That things that aren't fit enough to survive "just die" is a verification process by a genetic algorithm (that requires tremendous time and space complexity) to conform with the spec for genes -- i.e. life's "software" -- which is: reproduce.

Computational complexity results are used in the study of biological systems. Look at the work by Leslie Valiant (Turing award) or Stuart Kauffman (https://en.wikipedia.org/wiki/Stuart_Kauffman). Incidentally, TLA+ can also be used to model systems such as metabolic pathways, as those are very easily described as nondeterministic state machines: http://www.cosbi.eu/research/publications?pdf=5437

> Also, the evolution of biological systems happens way too slowly to use it as inspiration for how to design computational systems for our own purposes.

Maybe, but TOC doesn't study how to design computational systems for our own purposes. It's pure science. If its discoveries turn out to be useful, then that's a happy coincidence.

For that matter, neither does most of TOP for the past couple of decades or so. If you do math to study a system designed for use by a computationally-complex collaborator (the programmer in this case), you must study the collaborator as well. TOP shows no interest in such empirical study, and has therefore been utterly unable to prove that the approaches it advocates actually help write very large complex software systems; indeed, it has been unwilling even to test this hypothesis, not even once.


> In that case, verifying that a program conforms to a spec has, in the general case, the same computational complexity no matter when it's carried out.

Then, as Dijkstra said, don't focus on the general case: “So, instead of trying to devise methods for establishing the correctness of arbitrary, given programs, we are now looking for the subclass of "intellectually manageable programs", which can be understood and for which we can justify without excessive amounts of reasoning our belief in their proper operation under all circumstances.” https://www.cs.utexas.edu/users/EWD/transcriptions/EWD02xx/E...

> The knowledge that a certain program has a certain property has a price-tag, and there are no discounts.

So what? If you're designing programs out of reusable components and you don't throw away their proofs of correctness, then much of the cost has already been paid by someone else before you even start.

> You're wrong because, again, you're blind to computational complexity and to computational processes that aren't man-made programs. What you described is precisely how nature computes. That things that aren't fit enough to survive "just die" is a verification process by a genetic algorithm (that requires tremendous time and space complexity) to conform with the spec for genes -- i.e. life's "software" -- which is: reproduce.

This still doesn't make sense. No biological system is “correct” in any meaningful sense. Correctness is always relative to a specification, and, even if your specification is as vague as “survive”, what's happening in nature is that some life forms are lucky enough to be fit for their current circumstances, not all possible ones.


> Then, as Dijkstra said, don't focus on the general case

Exactly, except that since Dijkstra gave that talk (early seventies) we have learned some important results in computational complexity. The bottom line is that we know that it is impossible to create a generally useful programming language (even one that isn't Turing-complete; even one restricted to finite-state machines) where every program is relatively cheap to verify (in 2000 it was proven that language abstractions can't significantly reduce verification cost, even though it would have been reasonable to believe that they could; that result is discussed in my linked post). Whatever language is chosen, only a small subset of programs written in that language will be verifiable. Identifying that subset requires extensive empirical research. This is done in the software verification community; unfortunately the PL community is reluctant to conduct empirical research.

> much of the cost has already been paid by someone else before you even start.

Not true. See the Schnoebelen results quoted in my post that I linked to. Composing proofs of program correctness requires proof effort that is intractable in the complexity of each component, not in the number of components. This was proven in 2005. In technical terms, verification isn't Fixed-Parameter Tractable, or FPT, a relatively modern[1] complexity class created to incorporate problems that are intractable in the general case but may be tractable in practice. In simple terms, even if each of your components has been proven correct, the difficulty of proving the composite's correctness is still an intractable function (at least exponential) in the complexity of each component. Of course, theoretically it's better than nothing, but not by enough.

> is that some life forms are lucky enough to be fit for their current circumstances, not all possible ones.

That is only if you view each life form as the computational system. However, that genetic algorithm I was referring to is conducted by all living systems as a single computational system.

[1]: https://en.wikipedia.org/wiki/Parameterized_complexity


> Whatever language is chosen, only a small subset of programs written in that language will be verifiable.

You're still focusing on arbitrary programs. Just don't write arbitrary programs. Let your intended proof tell you what program you need to write.

> Composing proofs of program correctness requires proof effort that is intractable in the complexity of each component, not in the number of components.

So keep each component small.

> However, that genetic algorithm I was referring to is conducted by all living systems as a single computational system.

This is still not verification. It's just testing on a much larger scale.


> Just don't write arbitrary programs. Let your intended proof tell you what program you need to write.

Of course, but easier said than done. In other words, we haven't figured out how to do this yet. So far, the cost of proof exceeds what we're willing to pay for what needs proving. OTOH, researchers often focus not on the programs we need to write, but on the properties they can prove (searching under the lamppost, so to speak); this is the best way of getting results. Some of them then try to convince themselves and others that the properties they can prove of the programs they can prove them on are really those that need proving, and that the cost is affordable. Sadly, this wish hasn't been shown to be true just yet.

Look, I suggest that you try to write a large-scale formally verified program. Forget large; small and non-trivial will do. And I don't mean "verified" as in it type-checks in Haskell, because the Haskell type system can only prove extremely weak properties, but actually verified with respect to interesting logical properties. There is a reason why no one in the history of computing has ever come close to achieving such a feat. Interestingly, you find that the people who actually do formal verification (you can watch talks by Xavier Leroy, who comes to verification from the PLT direction, or statements made by the people who worked on seL4) are usually the ones most restrained in discussing its possibilities. It can work, depending on what exactly you want to do, but requires a great investment, and end-to-end verification is possible only on very small programs (like seL4 or CompCert) and only at a prohibitive cost that makes the process worth it only for a handful of very special projects.

Let me put it another way: so far, practice has turned out to be very much in line with the difficulties predicted by the theory. Even Schnoebelen, when discussing the result that language abstractions don't help, says that while it is a worst-case result, it "has been observed in many instances, ranging from reachability problems to equivalence problems, and can now be called an empirical fact"

Personally, I do formal verification extensively on large projects (much larger than CompCert or seL4), but I don't even attempt end-to-end or manual deductive proof because the cost of those becomes very clear very fast once you start trying these things on big systems. These observations have been reported, AFAIK, by every single industry or research unit that carries out significant software verification efforts.

Formal verification can and does work. But in order for it to work, you need to approach it humbly and know the limitations. Also, you need to know where best to spend your efforts, which is a bit of an art. When you do, you discover that maybe you've not found a way to consistently churn out bug-free programs, but that the quality of your programs has improved considerably.

Just to give you a taste of program proof, this is a simple exercise by Leslie Lamport:

Consider N processes numbered from 0 through N-1, in which each process i executes

    x[i] := 1;
    y[i] := x[(i-1) mod N]
and stops, where each x[i] initially equals 0. (The reads and writes of each x[i] are assumed to be atomic.) This algorithm satisfies the following property: after every process has stopped, y[i] equals 1 for at least one process i. It is easy to see that the algorithm satisfies this property; the last process i to write y[i] must set it to 1… The algorithm satisfies this property because it maintains an inductive invariant. Do you know what that invariant is?

If you can produce a machine-checked proof of this 2-line program that has no loops or recursion in under one hour, hats off to you. The difficulty of the problem grows super-linearly with the size of the code. People writing real programs, even those intended for verification, write programs that are several orders of magnitude more complex than this toy example.
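If you want to play with the exercise before attempting the proof, here is a small Haskell sketch of my own that brute-forces every interleaving for tiny N. It's a check, not the inductive-invariant proof the exercise asks for, but it lets you watch the property hold.

    import Data.List (permutations)

    -- Each process i performs two atomic steps: Write i (x[i] := 1) and
    -- Read i (y[i] := x[(i-1) mod N]).
    data Step = Write Int | Read Int deriving (Eq, Show)

    -- All interleavings of the 2N atomic steps in which each Write i precedes Read i.
    interleavings :: Int -> [[Step]]
    interleavings n = filter wellOrdered (permutations steps)
      where
        steps = concat [[Write i, Read i] | i <- [0 .. n - 1]]
        wellOrdered order = and [pos (Write i) order < pos (Read i) order | i <- [0 .. n - 1]]
        pos s = length . takeWhile (/= s)

    -- Run one interleaving; x and y both start as all zeros; return the final y.
    execute :: Int -> [Step] -> [Int]
    execute n order = snd (foldl step (replicate n 0, replicate n 0) order)
      where
        step (xs, ys) (Write i) = (setAt i 1 xs, ys)
        step (xs, ys) (Read i)  = (xs, setAt i (xs !! ((i - 1) `mod` n)) ys)
        setAt i v l = take i l ++ [v] ++ drop (i + 1) l

    -- The claimed property: after every process has stopped, y[i] = 1 for some i.
    propertyHolds :: Int -> Bool
    propertyHolds n = all (\order -> 1 `elem` execute n order) (interleavings n)

    main :: IO ()
    main = print (map propertyHolds [1 .. 4])  -- prints [True,True,True,True]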

> This is still not verification. It's just testing on a much larger scale.

It's verification (by an algorithm that is not common in software). The cost of verification comes from the knowledge you wish to obtain, not how you obtain it. Computational complexity doesn't care if you know something by proving it using FOL and natural deduction, dependent types, or by exhaustively trying each input. BTW, as far as the theory is concerned, a test is verification of a more specific property (e.g., the property that your program, when given input x, would, say, emit output y followed by output z).


> Look, I suggest that you simply try to write a large-scale formally verified program. And I don't mean "verified" as in it type-checks in Haskell, because the Haskell type system can only prove extremely weak properties, but actually verified with respect to interesting logical properties.

This is exactly what I'm doing: a library of data structures and algorithms, where the verification effort is suitably split between the language and me. The language (in this case, Standard ML) ensures that abstract types can only be used abstractly, and I prove (using paper and pen) that the implementations of these abstract types are correct. A primary concern is simplicity, so if a module has too many invariants to manually prove, it is a natural candidate for splitting into smaller modules.

I'm not saying that Standard ML is the ideal language for this kind of thing, e.g., some limited form of dependent typing would be very helpful (as long as it doesn't compromise type inference), but it's much better than anything else I have tried (including Haskell).

> Do you know what that invariant is?

The invariant to maintain is “there exists some distinguished index i such that x[i] is written by process i before it is read by process (i+1) mod N”. If N = 1, then i = 0. If N = N' + 1, consider some valid interleaving of N' processes, with distinguished index i', and then consider the effect of interleaving one more process.

> The cost of verification comes from the knowledge you wish to obtain, not how you obtain it.

In this case, what I wish to ascertain is whether a living organism is fit to survive in a range of circumstances, most of which will never actually happen.

> BTW, as far as the theory is concerned, a test is verification of a more specific property (e.g., the property that your program, when given input x, would, say, emit output y followed by output z).

Tests can verify atomic propositions. They can't verify that a predicate holds for all inhabitants of an infinite set.


> a library of data structures and algorithms

That's excellent, but I didn't say "program" for nothing. Programs are usually interactive, with temporal components. But if data structures are your thing, try writing and verifying, say, a concurrent R-tree.

> The only invariant I can see is that “y[i] has been assigned to” implies “x[i] has been assigned to”. From this, we can see that “all y[i]s have been assigned to” implies “all x[i]s have been assigned to”. But this doesn't involve any induction on N so far.

Your reasoning is OK, but it shows how much you're thinking in terms of the language rather than more abstractly. It's also not enough to prove the property.

The crucial state here is the hidden state in the program counter. The induction is not on the number of processes, but on the steps of the algorithm. Each step executes one line of one process (chosen nondeterministically). The invariant, expressed in TLA+, is:

    Inv ≜ ∧ (∀ p ∈ 0..N-1 : Done(p)) ⇒ ∃ p ∈ 0..N-1 : y[p] = 1   \* (1)
          ∧ ∀ p ∈ 0..N-1 : pc[p] ≠ "L1" ⇒ x[p] = 1               \* (2)
(1): this is the simple property invariant

(2): this is the inductive bit; `pc` models the program counter

This is instantly verified with a model checker. A machine-checked proof is about 60 lines long (I try to avoid formal proofs as they are extremely laborious, but I guess that a more experienced machine-prover than me could make it shorter).

> Tests can verify atomic propositions. They can't verify that a predicate holds for all inhabitants of an infinite set.

Exactly, and each of those atomic propositions has a cost of verification. Moreover, we can prove (again, it's in my post) that we can't in general generalize from one input to another, so the cost of verification over an infinite number of inputs is infinite in the general case. Of course, it can be done in practice, but we can tell exactly how hard that would be: the cost of verification is proportional to the number of states in the algorithm -- the minimal algorithm that satisfies the full spec, not necessarily your actual program. Properties that are easily checked by simple type systems like those of Haskell and SML can always be satisfied by much simpler programs (in fact, an FSM or, if polymorphic recursion is used, then a PDA).


> It's also not enough to prove the property.

Errr, yes, you're right. I noticed I messed up after I posted my comment, then edited it accordingly. I don't see why I need to perform induction on the program counter, though.

I posit that, no matter how you interleave your processes, there is some distinguished index i such that x[i] is written by process i before it is read by process `(i+1) mod N`. To prove this, I use induction on N:

(0) If N = 1, then i = 0.

(1) If N = M + 1, inductively assume that we have a collection of M processes with distinguished index i. Now consider the effect of adding one more process, respecting the interleaving of the original processes:

(1.a) If the new process is added at any index other than `i+1`, then the original processes i and `(i+1) mod M` still have contiguous indices in the new collection, and the former still writes to the common variable before the latter reads it.

(1.b) If the new process is added at index `i+1`, then at least one of the two following things must happen:

(1.b.i) The original process i writes to x[i] before the new process reads it.

(1.b.ii) The new process writes to x[i+1] before the original process `(i+1) mod M` (now `(i+2) mod N`) reads it.

No analysis of the program counter was required, and as a result we get a far more beautiful proof. You used a sledgehammer, I used a sculptor's chisel.

> Properties that are easily checked by simple type systems like those of Haskell and SML can always be satisfied by much simpler programs (in fact, an FSM or, if polymorphic recursion is used, then a PDA).

I never claimed that I could delegate the entirety of the formal verification task to a type system like that of Haskell or SML. And, to be honest, I don't even think using computers to verify everything is such a good idea. What I want is automation for the boring and repetitive parts (the vast majority), so that I can concentrate exclusively on the hard ones. (Which isn't to say that I'm happy with the little amount of work Damas-Milner does for me. It's possible to do much better, without sacrificing automation.)


> and the former still writes to the common variable before the latter reads it.

This is not true. At no point in your induction is it proven that it is always the last process that writes x before the first reads it, nor is it necessarily the case.

> then at least one of the two following things must happen:

This, too, is untrue. The events may occur in the precise opposite order. If this were true, then it would have been an invariant that either process N-1 or process 0 would necessarily write 1, but this is not the case.

> No analysis of the program counter was required, and as a result we get a far more beautiful proof. You used a sledgehammer, I used a sculptor's chisel.

Excellent, except that my "sledgehammer" is a proof technique that's been proven complete, i.e. if a program property is provable, it is provable via an inductive invariant, which is amenable to mechanical proof checking (and, indeed, I mechanically checked it in TLA+). Good luck getting your proof (which is called "behavioral") -- even if it were correct -- to verify formally.

Chord is a distributed algorithm that was published in 2001 and quickly became popular. It got thousands of citations and plenty of industry implementations. In 2011 it even won a "test of time award". Its authors said, "Three features that distinguish Chord from many peer-to-peer lookup protocols are its simplicity, provable correctness, and provable performance". They, of course, provided proofs. However, in 2012, Pamela Zave, using a model checker, showed that "of the seven properties claimed invariant of the original version, not one is actually an invariant". The point is that when algorithms are concerned, "proofs" are often not proofs.

See: https://brooker.co.za/blog/2014/03/08/model-checking.html and https://www.youtube.com/watch?v=iCRqE59VXT0

> I never claimed that I could delegate the entirety of the formal verification task to a type system like that of Haskell or SML.

Just to be sure: proof with pen and paper isn't formal verification. It is pretty much the very definition of informal verification. Formal proof means using natural deduction or sequent calculus. This is too hard, so a machine-checked proof is acceptable. But any proof or any other kind of verification not carried out or verified by a computer is not formal.


> At no point in your induction is it proven that it is always the last process that writes x before the first reads it, nor is it necessarily the case.

I didn't say anything about “the last process”, though. I just took a collection of M processes, with some valid interleaving, and inserted one more process, at whatever index j, displacing all pre-existing processes j...M-1 one place to the right.

>> then at least one of the two following things must happen:

> The events may occur in the precise opposite order.

Let A and B be processes with adjacent indices (for my purposes, M-1 and 0 are considered “adjacent”), such that A's write happens before B's read. If we insert a new process C in between A and B, then A and B no longer manipulate a shared variable, but instead A writes to a variable C reads, and C writes to a variable B reads. Then at least one of the two following things must be true:

(0) A's write happens before C's read.

(1) C's write happens before B's read.

This follows from these facts:

(0) A's write must happen before B's read.

(1) C's write must happen before C's read.

Was that too hard to understand? If it was, you need to relearn to prove things by hand.

> Just to be sure: proof with pen and paper isn't formal verification. It is pretty much the very definition of informal verification.

While the proof I gave above is most certainly informal, you can prove things formally using pen and paper. The formality comes from the use of explicitly given deduction rules, not from the use of a computer.


> I didn't say anything about “the last process”, though

Of course. Sorry. For some reason my mind read i as N. It was late at night... I'm convinced the proof follows from your invariant, but the inductive step is a bit tricky, as i is not the same for every interleaving. My proof is simpler, as the invariant holds for -- and more importantly, is maintained by -- every step made by the system, and there's no need to consider various interleavings. It also doesn't require induction on N.

> The formality comes from the use of explicitly given deduction rules, not from the use of a computer.

That's true and I said as much, but doing so is extremely laborious. Do you actually use natural deduction?


> but the inductive step is a bit tricky, as i is not the same for every interleaving

Ah, the joys of nondeterminism! It gives “there exists” an, ahem, “interesting” meaning. This is why I only do imperative and concurrent programming when it's absolutely necessary (e.g., when no functional and deterministic program could be asymptotically as efficient).

> My proof is simpler, as the invariant holds for -- and more importantly, is maintained by -- every step made by the system

My invariant is prettier. It doesn't need to be maintained “throughout the computation”, because it's a static assertion about the program: “regardless of the interleaving of operations, there exists some i such that...” Almost invariably, the program is a much simpler mathematical object than the computations it gives rise to. Dijkstra noted “that our intellectual powers are rather geared to master static relations and that our powers to visualize processes evolving in time are relatively poorly developed.” https://www.cs.utexas.edu/users/EWD/transcriptions/EWD02xx/E...

> It also doesn't require induction on N.

Induction on the program counter is even harder! The inductive case of my proof doesn't even look at the entire computation: it looks at most at three processes at a time. It's literally less work than what you propose.

> Do you actually use natural deduction?

In the design of type systems, yes. If I want to prove something by hand, hell no. Natural deduction is much nicer to prove things about than it is to prove things in.


> Ah, the joys of nondeterminism!

In TLA+ nondeterminism is as natural and graceful as determinism (it's a matter of ∨ vs ∧), and the functional concepts of parameterization and function types are just special cases. Nondeterminism is what allows refinement relations. You have it whenever you have types, but the language you use just doesn't use the same words to describe it.

> because it's a static assertion about the program

Mine is as static and as global as yours (and has the benefit of being easily verifiable mechanically)! You're just not trained to see it. You've been told that concurrency and imperative constructs are "dynamic" and are hard to reason about. That's just because FP is bad at reasoning about them, and prefers syntactic decomposition. There is nothing dynamic about a proper, pure description of what to you seems a messy dynamic process. The entire program is just a single, pure, semantically unambiguous first-order predicate.

I had a discussion about this with a PLT person once, and he kept asking how you prove static properties in TLA+. It took me a while to understand what he meant, as "static" vs. "dynamic" are distinctions that are only necessary when you want to reason about syntax. In simple logic, there's just a single formula.

> Almost invariably, the program is a much simpler mathematical object than the computations it gives rise to

You don't reason about each computation, but about what you'd call the "program", which in TLA+ is called the algorithm, so as not to unnecessarily confuse it with ideas that are related to syntax. The algorithm is just a boolean predicate, and my proof is that that predicate implies another (which implies the property).

Now when I look at how FP people reason about code I wonder why they have to complicate things so much. Sequential, concurrent, parallel, nondeterministic, deterministic, algorithms, properties -- they can all be reduced to a single first-order predicate, with the simple implication relation proving anything you want to know about any of them (that the program satisfies a property or a "type", that the deterministic program is a refinement of the nondeterministic one, that the parallel program computes the same result as the sequential one, that the sequential program has a worst-case complexity of such-and-such), all with simple, familiar logic and semantics that fit on a single page.


> But any proof or any other kind of verification not carried out or verified by a computer is not formal.

Very surprised to hear this from someone who wrote this very article on the nature of computation!


It was an informal statement :) In any event I said deduction rules or a computer, by which I meant some tool that can reduce the rote effort, and that we can be sure is itself correct.


There is still one glaring issue with your thesis that remains unresolved.

Verification is hard. We all agree. But that is for the worst case. Alternatively we can say that it is hard to verify arbitrary programs. I am utterly unconvinced that it is hard to verify programs that are designed to be verified, and I am utterly unconvinced that it is hard to design programs to be verified.

How should I convince myself that it is hard either (1) to verify even programs that are specifically designed to be verified, or (2) to write such programs?

(I don't mean hard in practice, I mean in theory.)


> How should I convince myself that it is hard either (1) to verify even programs that are specifically designed to be verified, or (2) to write such programs?

Didn't we discuss that issue on Reddit? Anyway, the complexity of verification is proportional to the number of states. If you write the program from scratch, the number of states is dictated by your spec (as every computational problem, every function, has an inherent computational complexity). Your spec can be arbitrarily hard.

Now, it's not a matter of worst-case. That writing and verifying arbitrary programs is the very same problem is easy to see: given a program A and a property P I wish to verify, I then create a spec S = A ∧ P, i.e., the program you write must behave like this arbitrary program and satisfy P. So there is no difference between writing from scratch or verifying. Indeed, you can see that static analysis tools can often prove on arbitrary programs the very same properties that are enforced by type systems.

Of course, not every algorithm and property are worst-case, regardless of whether the program is being written or given. There are useful classes of programs/properties that are easier to verify, and the language can make useful restrictions. So far, examples that work well in practice are SPARK and SCADE. Each forbids memory allocation, including unrestricted stack growth.


> That writing and verifying arbitrary programs is the very same problem is easy to see: given a program A and a property P I wish to verify, I then create a spec S = A ∧ P, i.e., the program you write must behave like this arbitrary program and satisfy P. So there is no difference between writing from scratch or verifying.

I don't follow. What has A got to do with it? I am given a spec P. You claim that it's hard to write a program that satisfies P, pretty much regardless of what P is. I don't see how A is relevant at all.


Yes, but P can be arbitrarily hard to implement. This is hard for you to see, so I'm incorporating A into P just as an example of how we can make the spec hard. If it were easy to write programs that conform to arbitrary specifications, it would have been just as easy to verify arbitrary programs, because I can reduce the latter to the former by making the arbitrary program the spec of the new one.

If you want another example of how a spec can be hard, consider this one: "write a program that for every input n, an even integer greater than two, outputs the smaller of two primes whose sum is n". Do you think writing that program and proving it correct is any easier than verifying the 10-liner in my post?
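To make that concrete, here is a hedged Haskell sketch (the code and names are mine): writing a program for that spec is trivial, but proving it total -- that the search below always finds a witness -- is precisely Goldbach's conjecture.

    -- Trial-division primality test; good enough for a sketch.
    isPrime :: Integer -> Bool
    isPrime k = k > 1 && and [k `mod` d /= 0 | d <- [2 .. isqrt k]]
      where isqrt m = floor (sqrt (fromInteger m :: Double))

    -- For an even n > 2, return the smaller member of a pair of primes summing to n.
    -- Writing this is easy; proving that `head` never fails is Goldbach's conjecture.
    goldbachWitness :: Integer -> Integer
    goldbachWitness n = head [p | p <- [2 .. n `div` 2], isPrime p, isPrime (n - p)]
    -- e.g. goldbachWitness 28 == 5   (5 + 23 == 28)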

Or even this: "write a program that sorts a list of integers". You know for a fact that that program has a minimal time complexity of O(n lg n). That complexity, in turn, partly determines how hard it will be to prove the program correct. You know that it will be harder to verify than "write a program that returns the first element of its input list", whose time complexity is O(1) (actually the difficulty of verification is more closely related to the space complexity, because the space complexity is a better approximation of the number of states the algorithm must have in order to solve the problem, unless the algorithm can be a pushdown automaton).


Sure, I know that there are some specifications P that are hard to implement (and some that are impossible!). What's not clear to me is whether the specifications we care about implementing in practice are hard to implement.


In practice it's obvious that they're hard! Look at the bugs people make and look at the effort required for seL4/CompCert (both are fairly simple systems).


Ah, but that's not my question! My question is whether the specifications we are interested in implementing in practice are hard in theory.


That's a good question and I don't think anyone has a definitive answer to that. I certainly don't. I think we should do empirical research and try to find out. What would be your guess and why?

I do have some ideas, though. First, I think that the effort required for end-to-end proofs -- even for heavily simplified, relatively small programs -- does indicate that the problems are inherently hard, because I don't think we're lacking some secret proof technique that would all of a sudden make things significantly easier. The difference between cases that are extremely hard (Goldbach's conjecture) and real-world code doesn't seem big. Second, I think that empirical research might uncover something that can have a bigger impact. End-to-end proof is not a very important practical problem, because very few software projects ever need it or anything close to it. Improving correctness or reducing development costs (two sides of the same coin) for non-high-assurance programs is much more important, and it is where the theory leaves much more wiggle room (because once you step away from actual watertight proof, basically anything is possible). I believe that empirical research can uncover certain common program patterns and bugs that can be analyzed/prevented, and that those may give a significant boost. But without empirical research, we won't even know which patterns/bugs to tackle.

I think that emphasis on deductive proofs is misplaced (in light of theory and practice) as it's hard to fine-tune[1], and indeed, the software verification community has been moving further and further away from that direction for a long while; since the seventies, I believe. It's the PL community that's decided this is interesting to them. Software verification is interested in many other approaches, like static analysis (that can verify some properties -- hopefully important ones -- some of the time), probabilistic methods, combinations of formal methods and testing (like so-called "concolic" testing), etc.

------

[1]: I'm referring to actual proofs with dependent types. Simpler type systems can be invaluable, if only researchers tried to figure out which properties are actually important. For example, Rust's borrow checker is extremely interesting, and tackles a class of bugs that's known to be both very harmful and easy to prevent mechanically. We should be looking for more of those. An example of what not to do if you want real-world impact is the peculiar focus on effect systems. Other than shared state (which is a whole other matter), how did some PL researchers get it into their minds that effects are an important problem to tackle?


> That's a good question and I don't think anyone has a definitive answer to that. I certainly don't. I think we should do empirical research and try to find out. What would be your guess and why?

I have absolutely no idea. The entire focus of my software development philosophy for a decade or so has been to make programs easier for humans to understand, not easier to write proofs about (although there's some overlap). I do wonder how big the weak but extremely useful class of properties (like Rust's memory safety) is.


This, I think, is a beautiful example of what empirical research can uncover: https://www.usenix.org/system/files/conference/osdi14/osdi14...

> almost all (92%) of the catastrophic system failures are the result of incorrect handling of non-fatal errors explicitly signaled in software.

The exception-handling code is there, it's just that no thought has been put into it. They say that their simple linting tool could have detected and prevented a good portion of those very costly bugs.



