
CompCert – A formally verified C compiler - cjg
http://compcert.inria.fr/
======
ttul
So when you implement a buffer overflow in your C program, you can be assured
it will overflow in the manner specified.

~~~
pslam
It's just one part of the tooling, and doesn't solve program-correctness on
its own. A formally verified toolchain is only really useful for projects
which have pervasive mitigations against these kinds of errors. In other
words, if you have a code base which has been verified to a high standard, you
would also want a toolchain verified to a high standard.

In an ideal world you'd have a formally verified toolchain for a language
without as many deficiencies as C, but here we are.

~~~
pjmlp
Yep, where all C compilers would be Frama-C compliant, integrated into the
CI/CD, and all OSes would be targeting CPUs with memory tagging like Solaris
on SPARC ADI, but oh well.

------
als0
One of the really nice outcomes of this approach is that it allows you to use
different optimisation levels.

Other approaches try to bypass the compiler by verifying the generated binary
against the semantics of the source code (treating the compiler itself as a
black box). The major drawback is that you have to completely disable all
optimisation.

~~~
monocasa
> Other approaches try to bypass the compiler by verifying the generated
> binary against the semantics of the source code (treating the compiler
> itself as a black box). The major drawback is that you have to completely
> disable all optimisation.

seL4 verifies the binary against the specification, _and_ lets you use
optimization. Once you've extracted the flow graphs from the binary, following
the optimizations isn't difficult.

~~~
sanxiyn
Yes, AutoCorres works, but it is somewhat user-hostile, even by the standards
of these tools, and especially compared to CompCert. CompCert looks and feels
like a compiler; AutoCorres doesn't.

[https://github.com/seL4/l4v/tree/master/tools/autocorres](https://github.com/seL4/l4v/tree/master/tools/autocorres)

------
herodotus
In 1983, Ken Thompson gave a Turing Award talk in which he showed how to embed
a backdoor into a compiler in such a way that it would not be visible, even if
you had access to the compiler source.
([http://delivery.acm.org/10.1145/1290000/1283940/a1983-thomps...](http://delivery.acm.org/10.1145/1290000/1283940/a1983-thompson.pdf))
I wonder if the verified compiler could be altered in the same way? (A snippet
from the paper so you get the idea:

>We compile the modified source with the normal C compiler to produce a bugged
binary. We install this binary as the official C compiler. We can now remove
the bugs from the source of the compiler and the new binary will reinsert the
bugs whenever it is compiled. Of course, the login command will remain bugged
with no trace in source anywhere.)

~~~
fulafel
As long as the compiler system is self-bootstrapping. I don't know if this is
the case with CompCert C. Probably not, because of the ML compiler tech
involved, which usually doesn't bootstrap on C.

But aside from these incidental properties, yes the same thing could be done.

(This still leaves a large class of related, less recursively symmetric
integrity attacks that could be made without the language fully bootstrapping
itself, of course)

~~~
nickpsecurity
There's a new extractor that can output C from Coq. There are also multiple
compilers for each language you can extract to. My solution was to do diverse
compilation, where multiple compilers, OSes, and ISAs compile the compiler.
Then you run it through itself on each one and use the end result that most of
them agree on.

~~~
unboxed_type
Are you talking about the CertiCoq project? It is not ready yet, and it's not
clear when it will be.

[1]
[https://www.cs.princeton.edu/~appel/certicoq/](https://www.cs.princeton.edu/~appel/certicoq/)

~~~
nickpsecurity
Talking about this:

[https://staff.aist.go.jp/tanaka-akira/succinct/slide-pro114.pdf](https://staff.aist.go.jp/tanaka-akira/succinct/slide-pro114.pdf)

They extract from Coq to C. Like in my Brute-Force Assurance concept, that
would let us confirm and supplement the formal verification by hitting the C
output with every V&V tool for C programs that we have on hand.

------
fuklief
Does anyone in the industry use it? Does anyone know for sure whether Airbus
uses it?

~~~
jjrh
[https://www.absint.com/](https://www.absint.com/) is the company that sells
licenses/support and they advertise:

> For two decades now, Airbus France has been using our tools in the
> development of safety-critical avionics software for several airplane
> types, including the flight control software of the A380, the world’s
> largest passenger aircraft.

More customers are listed at
[https://www.absint.com/success.htm](https://www.absint.com/success.htm).

~~~
jng
I have no idea about this specific case, but this kind of claim can be very
misleading, while remaining true. Maybe they used the tooling once for an
offline script of some sort, and that is all.

~~~
fmap
It's not misleading in this case. It's used in practice.

You have to understand that CompCert's main competition in this space was an
ancient version of GCC without _any_ optimizations. This is mostly an issue of
certification. CompCert got the same certification and is already a great
improvement just by virtue of having a register allocator...

~~~
jng
Good to know, thanks for explaining.

A register allocator! :)

------
kazinator
> _Such verified compilers come with a mathematical, machine-checked proof
> that the generated executable code behaves exactly as prescribed by the
> semantics of the source program._

Which semantics? The ISO C semantics is rather lacking; "undefined behavior"
lurks around every corner.

A C compiler that correctly implements all ISO C requirements is a fine thing,
and certainly better than one which gets some of that semantics wrong, but
doesn't achieve all that much in the big picture.

If the verified compiler actually provides extended semantics beyond ISO C
that eliminates much of the undefined behavior, such that programmers using
this dialect can rely on a safer language, then we're talking.

~~~
chrisseaton
> doesn't achieve all that much in the big picture

Bizarre way to talk about award-winning breakthrough research in applying
verification to large practical programs.

~~~
kazinator
Don't mean to say that the technique isn't valuable; just the particular
application (verified C compiler) not as much as other potential applications
(verified ... almost anything else).

------
User23
Sadly even a proved correct C compiler can produce incorrect behavior in a
real system:
[https://blog.regehr.org/archives/482](https://blog.regehr.org/archives/482)

------
jacquesm
How does a project like this deal with the bootstrap problem?

~~~
dranov
CompCert doesn't do this, but CakeML, a verified compiler for a variant of ML,
bootstraps itself _in the logic_:

> A unique feature of the CakeML compiler is that it is bootstrapped “in the
> logic” – essentially, an application of the compiler function with the
> compiler’s source implementation as argument is evaluated via logical
> inference. This bootstrapping method produces a machine code implementation
> of the compiler and also proves a theorem stating functional correctness of
> that machine code. Bootstrapping removes the need to trust an unverified
> code generation process. By contrast, CompCert first relies on Coq’s
> extraction mechanism to produce OCaml, and then relies on the OCaml compiler
> to produce machine code.

For details, see section 11 "Compiler Bootstrapping" in
[http://www.cs.cmu.edu/~yongkiat/files/cakeml-jfp.pdf](http://www.cs.cmu.edu/~yongkiat/files/cakeml-jfp.pdf)

~~~
sansnomme
Does this guard against trusting trust attacks?

~~~
dranov
I might be mistaken, but my understanding is that it does not. You still need
to trust the implementation of the logic. But if you didn't trust that, you
wouldn't trust the compiler correctness proof anyway, so bootstrapping in the
logic adds no new trust assumptions and still gives a benefit.

~~~
sansnomme
So Coq basically has to be correct and trusted for this to be secure?

~~~
smallnamespace
Note that Gödel's incompleteness theorems tell us that formal systems of
sufficient power cannot 'prove themselves' in some sense, and we see that
rather convincingly:

1. Implementations can be wrong, so we want a formal proof system

2. The proof system can be wrong, so we want to apply the proof system to
itself

3. But the proof system may be wrong (either in implementation or
specification) in a way such that its self-proof is actually invalid

At some point, we want to assert some property (completeness, correctness) of
the system as a whole, but whenever we augment the system to allow this, that
augmented system's own properties now come into question.

~~~
ChrisLomont
Logic proving systems need never hit Gödel incompleteness. Many systems weaker
than Gödel's threshold are proven consistent within themselves, and are strong
enough to verify code.

Gödel's proofs rely on encoding statements using properties of the integers
that are not required for these logic systems.

Presburger arithmetic, for example, avoids Gödel incompleteness, yet provides
a decent system for a lot of work.
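As an illustration (my example, not from the comment above), a classic
sentence Presburger arithmetic can decide — every natural number is even or
odd — stated using only addition:

```latex
\forall x\,\exists y\;\bigl(x = y + y \,\lor\, x = y + y + 1\bigr)
```

Because the language has addition but no multiplication, Gödel's encoding of
syntax into arithmetic doesn't go through, and the theory admits a decision
procedure.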

------
m4r35n357
See limitations in the docs.

------
marinintim
Going to great lengths to avoid writing Rust.

~~~
stcredzero
Would the world be a better place if Linux, the BSD kernel behind OS X, and
the Windows kernel were all rewritten in Rust? (I don't think it would be
dramatically better overnight.)

~~~
c1yd3i
It would be better overnight. But it would be even better if we had one
NT-like kernel and a new OS built from scratch with a browser-based (WebKit)
UI.

~~~
stcredzero
_a new OS built from scratch with a browser (webkit) based ui_

So basically, ChromeOS rewritten in Rust? Why not a Servo based UI?

~~~
cjmaria
The hard part is the NT-like kernel in Rust, not the web-based UI. ChromeOS
uses the Linux kernel.

