
Developing Provably-Correct Software Using Formal Methods - ashurov
http://www.infoq.com/news/2015/05/provably-correct-software
======
gizi
_some tools such as Verum Dezyne and Quantum Leaps can generate 100% correct
source code in C, C++, Java ..._

The input to these tools, from which they generate source code, is
therefore the real program and the real source code. How do they prove that
this real source code, from which they generate C, C++, or Java, is
"correct"?

In fact, this input just constitutes a new programming language -- a DSL --
and the tool is just another compiler or scripting engine. If existing
compilers are unprovable, this new compiler is unprovable for the same
reasons. In other words, they did not solve the problem.

~~~
tunesmith
Because the code literally will not compile if it isn't correct, as opposed
to weaker languages where bugs (run-time problems) can still exist.

I'm still something of a layman on this, and I know less about formal
specification languages like TLA+, but languages like Coq work by using things
like dependent types.

Here's an example: if you have a method that takes (through specifying type
parameters) an Int and returns an Array of Integers, and it compiles, then you
have effectively _proven_ that it is possible to take an Int and return an
Array of Integers.

This is not particularly useful to a method author who is only using that
method signature to write some more specific business logic. Dependent types
allow you to spruce up that method signature until it actually reflects your
business logic. So then you know that if it compiles, the actual intent of
your method/function is bug-free.
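
As a rough illustration (my own minimal sketch, not from the article, using
GHC's DataKinds/GADTs extensions rather than full dependent types), compare a
plain signature with one that encodes part of the intent -- here the exact
length of the result -- in the type itself:

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

    -- Type-level naturals, so lengths can appear in types.
    data Nat = Z | S Nat

    -- A vector whose length is part of its type.
    data Vec (n :: Nat) a where
      VNil  :: Vec 'Z a
      VCons :: a -> Vec n a -> Vec ('S n) a

    -- Plain signature: promises only that some list of Ints comes back.
    repeatTwicePlain :: Int -> [Int]
    repeatTwicePlain x = [x, x]

    -- Refined signature: the type states that the result has exactly two
    -- elements, so returning the wrong shape is a compile-time error.
    repeatTwice :: Int -> Vec ('S ('S 'Z)) Int
    repeatTwice x = VCons x (VCons x VNil)

If repeatTwice returned one or three elements it simply wouldn't compile; the
check has moved from run time into the type checker.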

Yes, it is true that there is a "turtles all the way down" element to this
kind of thing - "how do they prove that this real source code is correct?" The
answer is that languages like Coq rely on a small kernel of code that cannot
be proven. It's like their axiom from which everything flows. But the kernel
is small enough that people verify it with eyeballs. Bugs can still happen
there, and I think they even found one a few years ago, but it didn't have any
impact on all the mathematical proofs that have been written using Coq.

I'm sure some experts can jump all over what I just wrote but that's the gist
as I understand it so far.

~~~
schoen
Is the Coq bug you're thinking of this one, from earlier this year?

[https://github.com/clarus/falso](https://github.com/clarus/falso)

~~~
tel
That wasn't really in the core verifier exactly. It was more like an
optimization.

------
nickpsecurity
It's all definitely worth looking into. Anyone trying to start will find
Wheeler's page [1] much more helpful. I also included a book [2] on using
dependent types for certified programming. There are also a few examples of
verifications after that, with [3] being pretty cutting-edge in automation &
thoroughness. I think we need people to put together a single resource where
newcomers can see the various approaches, tools, and practical examples of how
to use them. Further, we need more research on which tools give you the most
bug-hunting benefit with the least time and expertise. There's some research
done already, but everything in this field is scattered.

Another point came up in a similar discussion: we need to focus most of our
energy on raising the baseline of correctness for the average programmer. This
includes things such as automated test generation, design by contract, and
better type systems. The developer should be able to specify his or her
intention with the code while the type system or runtime catches most of the
errors. Maybe we just need a version of BASIC/Pascal with Ada-like safety
built-in or similar restrictions on certain constructs. Not sure of the
specifics but this is an important problem to solve.
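
To make the design-by-contract point concrete, here is a minimal sketch (my
own, in Haskell; withContract is a hypothetical helper, not any particular
library's API): the intent is written next to the code and checked at run
time.

    -- A tiny design-by-contract helper: check a precondition on the input
    -- and a postcondition on the result, failing loudly if either is violated.
    withContract :: (a -> Bool) -> (b -> Bool) -> (a -> b) -> a -> b
    withContract pre post f x
      | not (pre x) = error "precondition violated"
      | post y      = y
      | otherwise   = error "postcondition violated"
      where y = f x

    -- The contract states the intent: you may not withdraw more than the
    -- balance, and the resulting balance must never be negative.
    withdraw :: Int -> Int -> Int
    withdraw balance amount =
      withContract (\(b, a) -> a >= 0 && a <= b)  -- precondition
                   (>= 0)                         -- postcondition
                   (\(b, a) -> b - a)
                   (balance, amount)

    main :: IO ()
    main = print (withdraw 100 30)   -- 70; withdraw 100 150 would abort

Real contract systems (Eiffel, Ada 2012, SPARK) go much further, but even
this much moves intent out of comments and into something that is actually
checked.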

[1] http://www.dwheeler.com/essays/high-assurance-floss.html

[2] http://adam.chlipala.net/cpdt/

[3] https://www.umsec.umn.edu/sites/www.umsec.umn.edu/files/hardin-icfem09-proof.pdf

[4] http://se.inf.ethz.ch/people/morandi/publications/prototyping.pdf

[5] http://compcert.inria.fr/compcert-C.html

[6] http://repository.readscheme.org/ftp/papers/vlisp/guide.pdf

[7] http://research.microsoft.com/pubs/122884/pldi117-yang.pdf

------
Wayne17
(I'm the author of the original post.) Model-checkers don’t prove that a
_program_ is correct, only that the logic in the program’s abstracted and
simplified communications/state _model_ has no inherent race conditions,
deadlocks and so forth, i.e. no typical concurrency flaws. Model-checkers
don’t “prove” correctness in the mathematical sense (and maybe I should have
written “verifiably-correct” instead of “provably-correct”). Rather, model-
checkers search for counterexamples to the hypothesis “there are no
concurrency flaws in this model”, by exhaustively tracing through all possible
distinct execution paths through the model. Amazon calls their models
“exhaustively testable pseudo-code”. If no counterexample is found, then we
can be sure the model has none of the standard concurrency flaws, and we can
be sure that the communications “housekeeping” code auto-generated from the
model will have none either. This is a limited yet extremely powerful and
valuable aspect of correctness, the hardest kind of correctness for humans to
achieve in concurrent software.
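
To give a feel for what "exhaustively tracing through all possible execution
paths" means (this is only a toy sketch in Haskell, nothing like a real model
checker such as TLC or Dezyne's verifier), here is an explicit enumeration of
every interleaving of two processes that take two locks in opposite order,
reporting any deadlocked states it reaches:

    import qualified Data.Set as Set

    -- Each process is a list of lock operations over named locks.
    data Step = Acquire Char | Release Char deriving (Eq, Ord, Show)

    type Proc  = [Step]
    type Locks = Set.Set Char          -- locks currently held by anyone
    type State = (Locks, Proc, Proc)   -- held locks + remaining steps of P1, P2

    -- A step is enabled if it acquires a free lock or releases a held one.
    enabled :: Locks -> Step -> Bool
    enabled held (Acquire l) = not (Set.member l held)
    enabled held (Release l) = Set.member l held

    apply :: Locks -> Step -> Locks
    apply held (Acquire l) = Set.insert l held
    apply held (Release l) = Set.delete l held

    -- Successor states: either process takes its next step, if enabled.
    successors :: State -> [State]
    successors (held, p1, p2) =
         [ (apply held s, rest, p2) | (s:rest) <- [p1], enabled held s ]
      ++ [ (apply held s, p1, rest) | (s:rest) <- [p2], enabled held s ]

    -- Explore every reachable state, collecting deadlocks: states with no
    -- successors where at least one process still has steps left to run.
    deadlocks :: State -> [State]
    deadlocks start = go Set.empty [start]
      where
        go _ [] = []
        go seen (st@(_, p1, p2) : rest)
          | Set.member st seen = go seen rest
          | null (successors st) && not (null p1 && null p2) =
              st : go (Set.insert st seen) rest
          | otherwise = go (Set.insert st seen) (successors st ++ rest)

    -- Classic flaw: the two processes take locks 'A' and 'B' in opposite order.
    main :: IO ()
    main = mapM_ print (deadlocks (Set.empty, proc1, proc2))
      where
        proc1 = [Acquire 'A', Acquire 'B', Release 'B', Release 'A']
        proc2 = [Acquire 'B', Acquire 'A', Release 'A', Release 'B']

A real model checker does essentially this -- state enumeration plus checks of
the desired properties -- over a much richer modelling language and with
aggressive state-space reduction, which is what makes it practical on real
designs.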

I believe software engineers can and should start routinely using model-
checking on concurrent logic, taking advantage of the machine’s ability to
race relentlessly through enormous numbers of execution paths to uncover
subtle flaws in logic. Even a small change to concurrent logic can have far-
reaching consequences that are very hard for humans to envision. I believe
competitiveness and sound engineering practice demand use of model-checkers on
concurrent logic.

------
tracker1
Okay, so the software is now provably correct... now what were the costs of
doing so, both in additional time and in pre-writing definitions and other
abstractions for the software itself? Also, how does one prove that said
software actually meets the business needs behind it?

In an academic environment, or something tied to physical devices, this
approach isn't a bad one at all... These are settings where software is much
more akin to an engineering discipline than it is a craft.

In _most_ software development, which happens in any number of business
environments building, adapting, or bridging custom software that will run in
any number of environments... the additional time this will take over current
approaches will simply be unacceptable. Most people don't care whether a piece
of software is provably correct... and for that matter, most bugs aren't
critical in nature and don't lead to end-of-the-world scenarios, since most
systems don't actually touch personal medical information or financial data.

I think that on an intellectual level approaches like this make a lot of
sense for some systems... The problem is that you rarely have well-defined
interfaces/algorithms for well-defined problems... so you build your system
with provably correct software, and then the proverbial moose's neck breaks,
because you didn't know about the moose.

~~~
Wayne17
In the context of physical systems, companies in the Netherlands, including
ASML, FEI, Philips Healthcare and Ericsson, have found over the past 10 years
that applying model-checking to the concurrency logic in control systems
_reduces_ both costs and time-to-market while increasing the reliability of
the built software. Why the reductions? Because concurrency bugs are removed
at the earliest possible moment, and hence cost orders of magnitude less than
if they were found during testing or (worse) at customer sites.

