
Coq to Rust Program Extraction - kushti
https://github.com/pirapira/coq2rust
======
harveywi
Most of the action happens in this file:
[https://github.com/pirapira/coq2rust/blob/rust/plugins/extra...](https://github.com/pirapira/coq2rust/blob/rust/plugins/extraction/rust.ml)

I have considered attempting to write a few code extraction plugins for Coq,
but I took a long look at the extraction plugin for Scala
([https://bitbucket.org/yoshihiro503/coq2scala/src/1657d65c747...](https://bitbucket.org/yoshihiro503/coq2scala/src/1657d65c747bb9747a4ec32b3da36464631bcd18/coq-8.3pl2/plugins/extraction/scala.ml?at=master&fileviewer=file-
view-default)) and lost all courage.

------
kartD
Excellent idea in theory. Translating a formally verified system to actual
application code is possibly the next frontier of system critical programming.
A verified system along with a memory safe concurrent language should be able
to avoid a significant set of common bugs. I hope to see some industry
adoption.

~~~
nickpsecurity
It's been going on in safety-critical field with groups like Rockwell-Collins
and Eschersoft extracting to Ada/SPARK. Rockwell also compiles formally-
verified CPU, the AAMP7G. My proposal was to extract to SPARK, Rust, C, and
Java simultaneously since all the best static verifiers are for those. What
one misses other will likely catch.

~~~
kartD
Sounds awesome! What kind of equipment is this done for (if you're comfortable
answering) ?

~~~
nickpsecurity
I work outside defense industry on purpose. No discomfort at all. ;) The
remains of high-assurance safety and security fields are mostly used for
defense (esp guards), aerospace, trains, private high-security, some
industrial control (esp energy), some space, some medical, and plenty
academics. Far as this conversation, here's an example of what I was talking
about that mentions many others:

[https://www.cl.cam.ac.uk/~mjcg/cambridge-wilding-
sept08.pdf](https://www.cl.cam.ac.uk/~mjcg/cambridge-wilding-sept08.pdf)

Specifics on how they do EAL7 CPU and crypto here:

[http://www.ccs.neu.edu/home/pete/acl206/slides/hardin.pdf](http://www.ccs.neu.edu/home/pete/acl206/slides/hardin.pdf)

Eschersoft's tools below. They also have C analyzer.

[http://www.eschertech.com/products/pd_and_spark.php](http://www.eschertech.com/products/pd_and_spark.php)

Specware from Kesterel Institute, like lots of their stuff, is pretty cool but
I think efficiency dictates manual derivation of code. Gets you close to it,
though.

ftp://ftp.kestrel.edu/pub/papers/specware/specware-jm.pdf

Esterel's SCADE w/ SPARK Ada below. Has huge list of application areas. I
clearly forgot automotive in applicatino areas.

[http://www.sigada.org/conf/sigada2003/SIGAda2003-CDROM/SIGAd...](http://www.sigada.org/conf/sigada2003/SIGAda2003-CDROM/SIGAda2003-Proceedings/Vendor-
Esterel-SCADE-Suite-for-Ada.pdf)

Note: There's also been hardware synthesis from PVS and HOL specs. Most cool
stuff like that is hard to get legally due to ACM/IEEE/Spring BS. I mention it
so you know it's been done. I even saw one manual, formal verification of
standard cells themselves. Getting close to the bottom. If my brain was whole,
I'd be pulling out solid-state, physics books & doing HOL models of process
nodes' molecules to get a jump on their asses lol.

~~~
kartD
Cool stuff, you're work sounds really cool.

~~~
nickpsecurity
That's others work. I dont do formal methods. Glad you enjoyed it though.

------
eggy
I wonder if it helped at all that Rust was originally written in OCaml and Coq
is written in OCaml.

Idris produces C as one of its outputs, but this is producing similar proof
code, not implementation here? I am trying to understand the secondary usage
other than the primary of correspondence.

~~~
tedks
Well, very indirectly, in the sense that OCaml hackers will think alike to
some extent, and all languages involved have strong types. But otherwise not
very much.

This is producing implementation code. Basically, this is program synthesis
via dependent types. See
[http://adam.chlipala.net/cpdt/](http://adam.chlipala.net/cpdt/) for a deep
dive into this concept; program extraction is introduced in the first chapter
past the introduction.

This is possible because of a false dichotomy in your post: thanks to the
Curry-Howard isomorphism, proof IS implementation.

~~~
alimw
> This is possible because of a false dichotomy in your post: thanks to the
> Curry-Howard isomorphism, proof IS implementation.

From your own reference (section 1.3), "a correctness proof for a program [may
have] a structure that does not mirror the structure of the program itself".

~~~
tedks
In context:

> Nonetheless, almost any interesting certified programming project will
> benefit from some activity that deserves to be called proving, and many
> interesting projects absolutely require semi-automated proving, to protect
> the sanity of the programmer. Informally, proving is unavoidable when any
> correctness proof for a program has a structure that does not mirror the
> structure of the program itself. An example is a compiler correctness proof,
> which probably proceeds by induction on program execution traces, which have
> no simple relationship with the structure of the compiler or the structure
> of the programs it compiles. In building such proofs, a mature system for
> scripted proof automation is invaluable.

What this means is that the proof of the compiler's correctness will often
look different from the implementation of the compiler, and the proof language
can and should have different structure and produce different output.
Isomorphic doesn't mean identical.

But, this section also doesn't refer to program extraction. That's introduced
later. In the program extraction world, rather than having a program, and a
proof of that program, you start with the proof, encode the program behavior
into the proof, and extract the program mechanically from the proof. Keep
reading and you'll see how :-)

Edit: Even more context, for those unfamiliar with coq:

> In comparisons with its competitors, Coq is often derided for promoting
> unreadable proofs. It is very easy to write proof scripts that manipulate
> proof goals imperatively, with no structure to aid readers. Such
> developments are nightmares to maintain, and they certainly do not manage to
> convey "why the theorem is true" to anyone but the original author. One
> additional (and not insignificant) purpose of this book is to show why it is
> unfair and unproductive to dismiss Coq based on the existence of such
> developments.

Coq has a proof language that's very powerful, and very rich, but can also be
very obtuse at times. Because a single proof statement might discharge a
number of proof goals without really making it clear how, and because you can
write your own proof "tactics" (functions/macros, essentially), you can write
a proof that looks totally different from the program you could extract from
it. Nevertheless, the fact that you can extract the program proves that the
information required to construct the one gives you the other -- they're
isomorphic. (We currently lack the tools to go from a correct program to the
proof of the same, but this is a statement about ourselves, not about the
correct program, modulo incompleteness.)

~~~
alimw
I skimmed the subsequent chapter to its end where program extraction comes up.
It is invoked on a function called tcompile, defined on page 35. Meanwhile the
theorem about the correctness of tcompile, called tcompile_correct, has only
been stated on p.36 and concluded on p.37.

The program extracted from tcompile is surely independent of its proof of
correctness tcompile_correct. I suppose that by the Curry-Howard isomorphism
there must be some program corresponding to tcompile_correct, but it's
probably not a compiler...

We may be talking at cross-purposes :)

~~~
tedks
I'm afraid I don't know enough about dependent types and PL theory to really
explain this well, but bear in mind that you're on page 35. You can use
program extraction on a proof itself and yield an executable program. You can
use program extraction on anything in Coq, so if you use it as an
implementation language you can compile it down to OCaml (or Rust now!) after
proving your implementation correct.

I suppose it's possible that there are only some programs and proofs amenable
to being written that way, but also suspect that's more due to the human
difficulty than anything else.

~~~
alimw
Thanks but I only wanted to point out that there is a real dichotomy between
an implementation and its proof of correctness, even up to isomorphism. But I
maybe misunderstood what was being discussed!

------
cpeterso
The seL4 microkernel generates C code from a formally-verified "executable
specification" written in Haskell. They could instead generate Rust code that
would extra typechecks and might be closer to the Haskell specification.

[https://en.wikipedia.org/wiki/L4_microkernel_family#High_ass...](https://en.wikipedia.org/wiki/L4_microkernel_family#High_assurance:_seL4)

~~~
hga
As far as I know, _The seL4 microkernel_ team _generates C code_ by hand _from
a formally-verified "executable specification" written in Haskell._

Which is then verified by analyzing the binary code and linking it back to the
spec.

