
A Hoare Logic for Rust - steveklabnik
http://ticki.github.io/blog/a-hoare-logic-for-rust/
======
Animats
Nice. It makes sense to try to formalize the semantics of Rust's new
intermediate representation, MIR. It's much easier to get unambiguous
semantics at that level. All name issues such as shadowing are gone, type
issues have been resolved, and operations are machine-level unambiguous (not
"+", but "32-bit unsigned add").

An old project I worked on, the Pascal-F verifier,[1] from the early 1980s,
worked in a similar way. We took the output from the first pass of the
compiler, which was something like Java bytecode, and verified from that.

I found some old 9-track tapes of the sources recently, and I'm going to run
them through a data recovery service in Morgan Hill next week and see if I can
bring the system back to life. It's a historical curiosity, but should be fun
to play with on today's machines. It was just too slow in 1982. Meanwhile, I
brought the original Boyer-Moore theorem prover back to life and put it on
Github.[2]

Some of our lessons learned were:

1\. Integrate the verification statements into the programming language. Don't
try to do it with comments. You want the language's regular syntax, type and
variable checking to apply to the verification statements. We added the
keywords ASSERT, ENTRY, EXIT, MEASURE, STATE, EFFECT, INVARIANT, DEPTH,
PROOF, EXTRA, and RULE.

2\. Make verification program-like, not math-like. For example, if we wanted
to verify the structure of a tree, we would add fields for a back-pointer and
a tree depth to each node's data structure. We'd put in all the code to
maintain them, and verify that the forward pointers and back pointers were
consistent and that child objects always had a greater depth than their
parents. This would all be done by writing ordinary-looking code with ASSERT
statements. But the new data fields would have the EXTRA attribute, and the
code manipulating those fields would have PROOF in front of it. This meant it
was there only for verification purposes. No executable code could depend on
that data, so it could all be stripped out for execution.
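A rough sketch of that style in today's Rust (my illustration, not Pascal-F; the names and the `invariant_holds` helper are invented, with comments standing in for the EXTRA and PROOF attributes):

```rust
// Sketch of point 2: `parent` and `depth` are the EXTRA fields
// (verification-only), and the code that maintains them is the PROOF
// code. No executable code reads them, so a stripping pass could
// delete them all for production builds.

struct Node {
    children: Vec<usize>,   // forward pointers (arena indices)
    parent: Option<usize>,  // EXTRA: back-pointer, verification only
    depth: usize,           // EXTRA: verification only
}

struct Tree {
    nodes: Vec<Node>,
}

impl Tree {
    fn new() -> Tree {
        // Node 0 is the root.
        Tree { nodes: vec![Node { children: vec![], parent: None, depth: 0 }] }
    }

    fn add_child(&mut self, parent: usize) -> usize {
        let id = self.nodes.len();
        let depth = self.nodes[parent].depth + 1; // PROOF: maintain EXTRA field
        self.nodes.push(Node { children: vec![], parent: Some(parent), depth });
        self.nodes[parent].children.push(id);
        id
    }

    // The invariant the ASSERTs would state: forward and back pointers
    // agree, and every child is strictly deeper than its parent.
    fn invariant_holds(&self) -> bool {
        self.nodes.iter().enumerate().all(|(id, n)| {
            n.children.iter().all(|&c| {
                self.nodes[c].parent == Some(id) && self.nodes[c].depth > n.depth
            })
        })
    }
}

fn main() {
    let mut t = Tree::new();
    let a = t.add_child(0);
    t.add_child(a);
    assert!(t.invariant_holds()); // the ordinary-looking ASSERT
    println!("invariant holds");
}
```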

3\. Use two theorem provers. One prover was an Oppen-Nelson decision
procedure. This is fully automatic proving for integer +, -, multiplication by
constants, inequalities, conditionals, logic operators, structures, and
arrays. This subset of mathematics is decidable and there's a fast, complete
decision procedure for it. That knocks off all the easy stuff automatically.
Easy stuff usually includes subscript checks and overflow checks. The second
prover was the Boyer-Moore prover, which is semi-automatic; you have to
propose lemmas to help it along. This is hard. By using ASSERT statements to
narrow the area of trouble, you could reach the point where you had two
successive ASSERTs, one of which should imply the other, but the Oppen-Nelson
prover couldn't prove it and there was no previously proved rule on file to
cover it. But now the problem had been narrowed to an abstract mathematical
problem - prove the second ASSERT from the first. Someone could work on that
in the Boyer-Moore prover and then export the rule for use in the main system.
This provided a separation of functions - you could have mathematically
oriented people struggle with the theorem prover, independent of the code.
You could reuse that rule elsewhere, and change other code without having to
re-prove it. Today, we'd put files of useful theorems on Github. We never let
the user add "axioms". That opens a huge hole.
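To make the two-prover workflow concrete, here's a hedged Rust-flavored sketch (the function and the lemma are my invention): the first two `assert!`s sit in the Oppen-Nelson fragment, while the last one needs a lemma proved separately and exported as a rule.

```rust
// Sketch of narrowing a proof obligation with successive asserts.
// The first two asserts are in the decidable fragment, so the decision
// procedure discharges them automatically. The last one follows from
// sortedness, but only via a lemma ("the last element of a sorted,
// nonempty slice is its maximum") that someone would prove once in the
// interactive prover and export as a reusable rule.

fn max_of_sorted(v: &[i32]) -> i32 {
    // Preconditions: trivially decidable subscript/emptiness checks,
    // plus the sortedness hypothesis the later assert derives from.
    assert!(!v.is_empty());
    assert!(v.windows(2).all(|w| w[0] <= w[1]));
    let m = v[v.len() - 1];
    // Should follow from the asserts above, but only via the exported lemma:
    assert!(v.iter().all(|&x| x <= m));
    m
}

fn main() {
    assert_eq!(max_of_sorted(&[1, 3, 7]), 7);
    println!("ok");
}
```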

4\. Expect to do a lot of compiler-type analysis and bookkeeping as part of
the verification process. For example, the static analysis to determine that a
function is pure ("pure" meaning x = y implies f(x) = f(y), with no side
effects) is something to do as a routine compiler operation. Potential
aliasing has to be discovered. Do this using compiler techniques; don't dump
it into the theorems-to-be-proved mill, where it's much harder to give the
programmer good error messages.
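As a small illustration (names mine) of the distinction such an analysis draws:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Pure in the sense above: equal inputs give equal outputs, and there
// are no side effects. A compiler-style pass can mark this, letting the
// prover treat it as a mathematical function.
fn square(x: i64) -> i64 {
    x * x
}

static CALLS: AtomicU64 = AtomicU64::new(0);

// Not pure: the same result, but it mutates global state, so the
// analysis must refuse to treat it as a function in the math sense.
fn counted_square(x: i64) -> i64 {
    CALLS.fetch_add(1, Ordering::Relaxed); // the side effect
    x * x
}

fn main() {
    assert_eq!(square(3), 9);
    assert_eq!(counted_square(3), 9);
    assert_eq!(CALLS.load(Ordering::Relaxed), 1);
    println!("ok");
}
```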

5\. Don't fall in love with the formalism. Too much work in verification has
been done by people who wanted to publish math papers, not kill program bugs.
This is about creating bug-free code. I think people will get this now, but
when I was doing this, everybody else involved was an academic.

[1]
[http://www.animats.com/papers/verifier/verifiermanual.pdf](http://www.animats.com/papers/verifier/verifiermanual.pdf)
[2]
[https://github.com/John-Nagle/nqthm/tree/master/nqthm-1992](https://github.com/John-Nagle/nqthm/tree/master/nqthm-1992)

~~~
rando832
> Nice. It makes sense to try to formalize the semantics of Rust's new
> intermediate representation, not MIR.

I'm no expert, but according to this:
[https://blog.rust-lang.org/2016/04/19/MIR.html](https://blog.rust-lang.org/2016/04/19/MIR.html),
MIR _is_ Rust's new intermediate representation.

~~~
Animats
Oops, put an unwanted "not" in there. Yes, formalize MIR. Don't try to
formalize the source language.

------
nickpsecurity
Nice description and first steps. Ideal moves would probably be the following:

1\. Modifying the front-end of Frama-C or a similar tool to take a subset of
Rust with specs, generate the VCs, and feed them to supported provers. He
seems to be doing something similar, but I'm sure there are some good tools to
build on.

2\. For manual and high-assurance work, embed Rust into the Simpl/HOL
framework that seL4 used, and build an AutoCorres-style tool on top of it.
You'd then have the ability to pull off a similar effort, with translation
validation down to machine code. Next, extend COGENT to generate the Rust
subset that Simpl supports. You could then do, as in their C example, a whole
filesystem functionally, outputting verified Rust or machine code. Optionally
extend the COGENT tool with QuickCheck, QuickSpec, etc., especially where
tests/specs can translate to Rust too. Quite a foundation for other Rust
developments to build on, including redoing the Rust compiler in COGENT, at
least for a certified reference version. ;)

------
cbHXBY1D
Very cool. I've long thought that having formal program verification built
into a language, run as part of the compiler, is the way forward. Giving the pre
and post-conditions for a function is much more reasonable than dropping down
to a verification tool like Coq or TLA+. I know that AWS uses TLA+ to reason
about the correctness of algorithms.

~~~
Jtsummers
Design by Contract:
[http://c2.com/cgi/wiki?DesignByContract](http://c2.com/cgi/wiki?DesignByContract)

Which is a key element of the Eiffel language (
[https://en.wikipedia.org/wiki/Eiffel_(programming_language)](https://en.wikipedia.org/wiki/Eiffel_\(programming_language\))
).

SPARK Ada also includes this concept of pre/post conditions:
[https://en.wikipedia.org/wiki/SPARK_(programming_language)](https://en.wikipedia.org/wiki/SPARK_\(programming_language\))

I haven't programmed in the latter, but I'd like to. Since I'm often in the
maintenance end of the software cycle I don't have much opportunity
(professionally) to introduce new languages (new tools, yes, but not new
languages).

I have a coworker who _thinks_ we do this with asserts (in C and C++), but
it's only half the battle. The asserts only tell us that our exercised version
of the program (through our non-comprehensive testing) hasn't failed pre/post-
conditions. It's helpful, but not sufficient.

~~~
catnaroek
Design by contract has nothing to do with formal verification, though.

~~~
Jtsummers
Respectfully, I have to disagree. Design by contract is, or can be, a
component of formal verification. If it's all you've got, it's probably better
than nothing.

Design by contract brings the concept of using preconditions, postconditions,
invariants, and other contractual objects into the design and implementation
of the software. These are able (depending on implementation) to be analyzed
statically by machine or by hand, or dynamically during testing (with some
systems the tests can be automatically generated to exercise the contracts).
This may not be the limit or ideal of formal verification, but it is certainly
a part of it and a good measure forward compared to the typical approaches
used in our industry.

EDIT:

Let's settle a pertinent question first: Is Hoare logic a tool/approach for
formal verification? If it is, then design by contract (being largely based on
the concept of Hoare logic) is also a tool/approach for formal verification.

The potential points of disagreement, as I see them, are that Hoare logic is
_not_ a tool for formal verification, or that design by contract is not based
on Hoare logic. Is there something else that I've missed or are these the
points of disagreement?

~~~
catnaroek
Design by contract forces you to be explicit about your _intended_
preconditions and postconditions. But “intended” isn't the same thing as
“actual”. _Verification_ is making sure that the preconditions and
postconditions _actually_ hold, in every possible case where the statement (or
procedure or whatever) could be reached.

EDIT: Hoare logic is a tool for formal verification, indeed. DbC is not,
though.

~~~
Jtsummers
We're talking around each other. What is your definition of formal
verification?

~~~
catnaroek
Proving that a program meets its specification.

~~~
Jtsummers
Is Hoare logic a tool in that process?

~~~
catnaroek
Yes. Hoare logic's inference rules define which Hoare triples can be “legally”
derived. The programmer's task isn't only to provide a Hoare triple (which by
itself is just an assertion that a program meets a specification), but also to
provide a _derivation_ of that triple (the actual proof of the assertion).
Design by contract is akin to giving a Hoare triple, but never giving its
derivation.
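A minimal worked example (mine, not from the thread): the triple {x > 0} x := x + 1 {x > 1} is by itself just a claim; its derivation, via the assignment axiom and the rule of consequence, is the proof that DbC never asks for.

```latex
% Assignment axiom:  {P[x := e]}  x := e  {P}
% Consequence:       from  P => P'  and  {P'} c {Q},  infer  {P} c {Q}
\[
\frac{(x > 0) \implies (x + 1 > 1)
      \qquad
      \{x + 1 > 1\}\ x := x + 1\ \{x > 1\}}
     {\{x > 0\}\ x := x + 1\ \{x > 1\}}
\]
```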

~~~
Jtsummers
Design by contract, is it based or not based on Hoare logic?

~~~
catnaroek
It's not. In DbC, you have preconditions and postconditions, but no means
whatsoever to actually prove that, whenever the procedure's initial state
meets the precondition, its final state will meet the postcondition.

~~~
naasking
> In DbC, you have preconditions and postconditions, but no means whatsoever
> to actually prove that, whenever the procedure's initial state meets the
> precondition, its final state will meet the postcondition.

Except many implementations of design by contract execute the postcondition
check and raise an exception if it's not satisfied, so I don't think your
statement is correct.
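A sketch of that runtime checking in Rust (my example, not any particular DbC library): the contract is executed, so it catches violations only on the inputs that actually occur.

```rust
// Runtime contract checking, DbC-style: the postcondition is evaluated
// and a violation panics (Eiffel would raise an exception). Note that
// this checks only the inputs we happen to run - x = i32::MIN violates
// the postcondition (wrapping negation yields i32::MIN again), and
// testing may never exercise it. Static verification would flag that
// case without ever running it.

fn abs_contract(x: i32) -> i32 {
    let result = if x < 0 { x.wrapping_neg() } else { x };
    assert!(result >= 0, "postcondition violated"); // runtime check only
    result
}

fn main() {
    assert_eq!(abs_contract(-5), 5);
    assert_eq!(abs_contract(7), 7);
    println!("ok");
}
```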

~~~
catnaroek
The very reason why you need to raise the exception is because the
postcondition _doesn't_ hold. The point of verification is to make sure the
postcondition _holds_. So I stand by my original assertion: DbC isn't
verification.

~~~
nickpsecurity
If I were to categorize it, I'd call DbC _formal specification_ that, combined
with Eiffel or SPARK tools, can be used to verify that interfaces or code
maintain those specific properties. It's a partial form of formal
specification and verification that's still useful.

It doesn't have to be fully specified or proven to count as formal
verification. Partial verification is a thriving field.

~~~
catnaroek
Yep, that's perfectly reasonable.

------
agentultra
Very nice! Is work being done to produce a formal, verifiable specification?

 _edit_ : I'm particularly interested in formal specifications and their use
in open source software.

~~~
dikaiosune
I know of a few efforts to better specify Rust in addition to ticki's (OP):

* [https://github.com/nikomatsakis/rust-memory-model](https://github.com/nikomatsakis/rust-memory-model)
* [http://plv.mpi-sws.org/rustbelt/](http://plv.mpi-sws.org/rustbelt/)
* [https://kha.github.io/2016/07/22/formally-verifying-rusts-binary-search.html](https://kha.github.io/2016/07/22/formally-verifying-rusts-binary-search.html)

The first isn't using formal methods AFAIK, the second is, and the third is as
well, but I think they're targeting safe code only.

~~~
weinzierl
Also, the Rust compiler had typestate analysis at one point in its
development. [1]

[1]
[http://marijnhaverbeke.nl/talks/rustfest_2016/#4](http://marijnhaverbeke.nl/talks/rustfest_2016/#4)

~~~
pcwalton
Typestate is basically all subsumed by the uniqueness system. The two systems
are equivalent, so in a sense Rust still has typestate.

------
andrewflnr
This is a pretty good general introduction to Hoare Logic.

------
taktoa
Tony Hoare meets Graydon Hoare :^)

~~~
weinzierl
I always wondered if they are related.

~~~
brson
They are not.

------
Ericson2314
Heh, just realized my
[https://github.com/Ericson2314/a-stateful-mir-for-rust](https://github.com/Ericson2314/a-stateful-mir-for-rust)
is a lot more like separation logic than I knew.

------
tudelo
Very broken for me on Chrome. All the equations randomly overlap the
surrounding words.

(Version 53.0.2785.116 m (64-bit))

~~~
lifthrasiir
Probably your connection to cdn.mathjax.org is somehow blocked or failing.

