Translation of Rust's core and alloc crates to Coq for formal verification (formal.land)
358 points by p4bl0 13 days ago | 116 comments





That's really impressive.

Automatic translation like this shifts the trust to the tool. coq-of-rust itself is written in Rust, not in Coq. The recursive nature is somewhat mind-boggling, but I think a proof of its correctness is possible with a process similar to David A. Wheeler's “Countering Trusting Trust through Diverse Double-Compiling” (2009) [0] (which circumvents Ken Thompson's Trusting Trust attack by using a second compiler), mixed with a CompCert-style approach.

To verify it, you'd use coq-of-rust to convert the coq-of-rust translator from Rust to Coq. That translation is not trusted, because it was performed in Rust, but it doesn't matter. Once in Coq, you prove the desired correctness properties—crucially, that it preserves the semantics of the Rust program when it translates a program to Coq.

As in the article, it is likely easier to work with more functional definitions in proofs instead of generated ones, so you'd undertake the same process as they do with the stdlib of proving equivalence between definitions. Since the current line count for the coq-of-rust translator (specifically, lib/ [1]) is 6350 lines of Rust, it even seems feasible to write a full translator in Coq and prove its equivalence to the generated one.

Then, you execute the proven-correct Coq coq-of-rust translator on the Rust source of the coq-of-rust translator. The Coq definitions it outputs should match the output of the Rust coq-of-rust translator that you started with.

As an aside, it's nice to see industry funding for work like this. I'm often cynical of cryptocurrency, but its correctness constraints really push for improvements in areas I like (Rust, Coq, funding for masters students I know, etc.).

[0]: https://dwheeler.com/trusting-trust/wheelerd-trust.pdf

[1]: https://github.com/formal-land/coq-of-rust/tree/main/lib


Thanks for the comment! One of the authors here.

Indeed, this would be a nice process to verify coq-of-rust. Also, although the code is rather short, we depend on the Rust compiler to parse and type-check the input Rust code. So that would also need to be verified, or at least formally specified without doing the proofs, and the API of rustc is rather large and unstable. It could still be a way to get more assurance.


I didn't touch on that, but I did assume trust in the Rust toolchain, with verification starting at THIR. Verifying rustc would be a monumental undertaking, though I think some people are working on it.

Since we don't have a verified rustc (à la CompCert [0]), I wonder if an approach like the translation validation of seL4 [1] would work. They prove that the artifact (ARM machine code) produced by an existing compiler (gcc) for a chosen program (seL4) matches the source semantics (C). Thus you could circumvent trusting rustc, but it only works to verify a specific output of a chosen program. If the chosen program were coq-of-rust, I don't think this would be easier than the approach I detailed above. The seL4 kernel is 9,500 lines of C, while their Isabelle/HOL specification is over 200,000 lines, so the technique doesn't seem to scale to a large chosen source like rustc.

Isn't bootstrapping fun?

[0]: Xavier Leroy. 2008. “Formal verification of a realistic compiler”. https://xavierleroy.org/publi/compcert-CACM.pdf

[1]: Thomas Sewell, Magnus Myreen, and Gerwin Klein. PLDI 2013. “Translation Validation for a Verified OS Kernel”. https://sci-hub.st/10.1145/2491956.2462183


Since you contrast the two approaches, you might be interested to learn that CompCert also uses translation validation in some parts of the compiler (notably for register allocation); see Jean-Baptiste Tristan's papers with Xavier.

Is there a Coq formalisation for enough of Rust that would be usable here? I thought people were still figuring that out.

The formalization work for Rust was done mostly at the MIR level, which is one step lower than the THIR level we use here. See, for example, the https://plv.mpi-sws.org/rustbelt/ project. MIR should be more amenable to formal specification, as this language is more compact than THIR and aims to be more stable.

However, we also lose some information going to MIR, as there are no expressions/loops from the original code anymore. There are still ways to reconstruct these, but we preferred to use the THIR representation directly.


That makes sense and fits what I had in mind: there is no formalization at the source level. Thanks for the details!

Very interesting work! I'm curious how you handle loops/recursion? I imagine the `M` monad seen in the examples has a special primitive for loops?

Yes, there is a special primitive for loops in the monad. The primitives are uninterpreted: we define valid finite traces of execution of a program and reason about these traces.

If a program has a loop we show that it terminates by constructing an execution trace. Note that we do not yet consider concurrency, so programs are deterministic.


It reminds me of when I used to work in SPARK Ada. On a number of projects where there was no supported Ada target (especially for very small devices), the code would be converted to C and then compiled for the target. That allowed us to perform the various forms of static analysis in SPARK land.

It obviously introduced issues around verifying the output or the transpiler, but I think the resulting C code was quite readable and limited in style for verification purposes, and the tools had a high degree of trust.

The SPARK static analysis was only ever a part of the whole verification and validation process, and there was plenty of testing and other activities that provided additional levels of trust.

The whole approach seemed to work pretty well.


> As an aside, it's nice to see industry funding for work like this. I'm often cynical of cryptocurrency, but its correctness constraints really push for improvements in areas I like (Rust, Coq, funding for masters students I know, etc.).

That's part of it, but another part, I think, is that cryptocurrencies simply transferred a lot of wealth to people interested in CS (and with a wouldn't-it-be-cool-if mindset in general). And they are spending that money on CS research not just because it benefits them but because it's philanthropy aligned with their hobby.


You reminded me of this blog post by Stephen Diehl:

https://www.stephendiehl.com/posts/crypto.html


I’m not sure how that approach prevents the rust compiler that’s used to build binaries from injecting its malicious payload. (You could build A with B and B with A, and diff the binaries, I guess).

Another approach is proof carrying code: The rust compiler would emit a coq proof that its executable output matches the semantics of the source code input.

Of course, you could also write a compiler of a subset of rust (perhaps without the borrow checker or optimizations) in machine code for one architecture, then bootstrap everything with that.


A compiler that injects backdoors in targeted programs and self-propagates the meta-backdoor (to avoid detection in the source) is exactly the trusting trust attack and it can be mitigated by diverse double-compiling (paper linked above). It requires a second compiler and we have mrustc, a Rust compiler in C++ built specifically for circumventing the unverified bootstrap chain of rustc.

The process is: compile mrustc with a C++ compiler. Compile the rustc sources with the untrusted rustc binary, and compile the rustc sources with mrustc (these two binaries have identical behavior but different codegen). Then compile the rustc sources with rustc-built-by-rustc and with rustc-built-by-mrustc (these will have identical behavior and identical codegen), so they will match, and they keep matching if you compile once more. Since mrustc is never compiled by rustc, such a backdoor would have to also exist in gcc/clang and propagate with exactly identical behavior through mrustc. The process could be repeated for gcc/clang.


The size of programs (as opposed to libraries, which could be easier because their code is not nearly as self-interacting) that have been end-to-end verified with semi-automated deductive proof systems like Coq is quite small. In fact, the size of programs that can be verified in this way has grown at a slower pace than the average size of programs overall. While sound software verification (i.e. 100% conformance to a spec) certainly has its place (usually to prove the correctness of some core algorithms), it doesn't scale well. That's why there has been a shift in research toward unsound methods: the cost of 100% assurance can be 10x that of 99.9999% assurance, and the difference in probability may fall below that of non-software failures (after all, a physical system cannot be proven to conform to a spec) or below the probability that the spec itself doesn't conform well enough to reality.

The point is that, by proving the unsafe parts of the standard library correct and Rust's safety guarantees correct, we transitively prove the memory safety (as well as data race freedom, etc.) of all safe Rust code (that uses only the standard library). Only proving the unsafe code correct is far less effort than proving all Rust code correct would be. This addresses the common criticism of "what about unsafe code?" when people talk about Rust's safety guarantees. We can use a more powerful system, like Coq, to fill in the gaps that the Rust type system itself isn't powerful enough to handle.
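
To make the pattern concrete, here is a minimal sketch (my own example, not from the article): the safety argument lives entirely inside the unsafe block, so discharging that one obligation, whether by a Coq proof or by careful review, covers every safe caller.

    fn first_or_zero(xs: &[u32]) -> u32 {
        if xs.is_empty() {
            0
        } else {
            // SAFETY: `xs` is non-empty, so index 0 is in bounds. This is the
            // kind of obligation a proof about the library would discharge.
            unsafe { *xs.get_unchecked(0) }
        }
    }

    fn main() {
        // Safe callers never write `unsafe`, yet they rely on the argument above.
        assert_eq!(first_or_zero(&[7, 8, 9]), 7);
        assert_eq!(first_or_zero(&[]), 0);
    }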

Sure (assuming Rust's type system were sound, which, I believe, it isn't yet). But my general point -- which isn't at all specific to Rust -- is that with deductive methods, sometimes the effort required may not be worth the added assurance probability they give over unsound methods, while in other cases, deductive methods may be cost-effective. The question of cost-effectiveness is one that always needs to be asked in software assurance.

The level of confidence that software requires (as opposed to an abstract algorithm) is not that much higher than one minus the probability of an "unaccounted" failure (such as a hardware failure). Sometimes the extra confidence beyond that point is free, and sometimes it can be very expensive.


Rust's type system is sound. There are some known bugs in the compiler, however.

Has this been verified somewhere?

What I don't quite understand with these efforts is this: if we have to translate the code to Coq manually or semi-manually, isn't the likelihood of making mistakes there much higher than what we ultimately gain from the formal verification? In other words, how do we know that what we prove is still valid for the original code?

> In other words, how do we know that what we prove is still valid for the original code?

We don't. We will always have to trust _something_ (the hardware, the compiler, the specification, Coq's trusted kernel, ...). Formal verification is sadly often discussed as removing the possibility of bugs in their entirety. In practice, formal verification only makes bugs much less likely.

That said, yes, automatic translation between Rust and Coq massively reduces the complexity of the things you have to trust and should therefore be preferred over manual translation.


Every tool can only guarantee the absence of some categories of bugs.

The most common ones, business logic bugs, often escape verification simply because of wrong assumptions.

A simple example: some websites use the browser's default language to set your locale, some use your IP. Thus I travel to Poland and suddenly YouTube shows me Polish content and ads (this can often be changed, but not always; see the Google Docs idiocy). These approaches will all unavoidably lead to business logic bugs because the assumptions are wrong in the first place.


That's why some industries have separate "verification" and "validation" (V&V) activities.

Verification is checking against formal or at least known requirements like those being discussed in this thread.

Validation is a set of processes for ensuring the specifications actually correspond to how the product ought to work. "How it ought to work" is open-ended and meant to include real-world aspects you didn't know beforehand. For example business logic assumptions. But despite being open-ended it is possible to formalise and track the process of looking for issues systematically.

In your example of YouTube showing Polish content when you travel, verification is the process of checking the website uses the IP correctly because using the IP is specified somewhere, but validation is the process of asking "what's the right thing to do when a user travels, is it really to use the IP?" and "let's systematically research what issues are affecting users due to incorrect specifications that we should change".


Yes, that is a limitation. But this limitation is not too bad.

In many cases, a bug in the translation simply makes the proof impossible. Somebody then investigates why the proof does not go through and finds the bug in the translation.

We only have a problem if the bug in the translation specifically cancels a bug in the original code. If there is no systematic risk, it is quite unlikely that two bugs cancel each other in this way.


Let's say you want to check if use after free can ever occur. Your translation is bugged and translates the whole program to a single "nop". Then the verifier happily asserts that the translated program does not cause use after free.

I doubt the translation would be that bad but it could have more subtle issues of the same kind.


My argument is that you typically also check other properties, like properties describing what your program should do. These other properties likely wouldn't be satisfied by a nop program, and hopefully the same is true for many subtle issues.

In the case of Coq-to-OCaml: is it feasible to do an extraction to OCaml of the translated code and compare it with the original?

You can write programs in Coq and extract them to OCaml with the `Extraction` command: https://coq.inria.fr/doc/v8.19/refman/addendum/extraction.ht...

This is used by CompCert: https://compcert.org/


Yes, I know, I mentioned the extraction.

My question was whether it can help detecting translation errors from the first step.


I'm not sure which first step you are talking about. Typically, one would write the program directly in Coq and use the extracted code as-is.

The code is translated automatically with coq-of-rust! When issues are found in the translation they can be fixed once in the coq-of-rust tool, and all the translations are updated.

Ok, what I think I do not understand is what they mean by "tedious and error prone". Is it tedious to write the automated translation (aka the coq-of-rust tool in this case), or does the translation of a concrete piece of code (e.g. the core crate) still involve manual work?

The "tedious and error prone" code was what we were doing before, when the translation of the standard library was not yet working automatically with coq-of-rust. Now, this is automatic. Maybe the explanations on the blog post were not clear.

Note to interested readers: I submitted this blog post because it is less directly concerned with cryptocurrency-related topics than other posts from the same blog, but there are many more technically interesting posts there, especially the last two, which discuss the same approach but for Python instead of Rust.

I remember a lecture [1] that mentioned a fuzzer finding bugs in a formally verified C compiler - because the formal verification didn't cover the frontend and backend.

I know people raise questions about trust in Coq itself and in the translation (and I question how it will keep up with Rust updates), but even perfect formal verification doesn't mean 100% correctness from start to finish.

[1] https://youtu.be/Ux0YnVEaI6A


Totally not an expert on formal verification. If Rust's base libraries get formally verified, and one does not use unsafe code, does that mean all Rust programs that use the formally verified libs get de facto formally verified quality regarding their memory handling?

Rust safety has very little to do with bugs.

Rust has its own definition of what "safe" means, which is arguably a subset of memory safety.

You can have perfectly safe Rust code with data races, deadlocks, memory exhaustion, etc. Not even talking about logical errors.


> You can have perfectly safe Rust code with data races

No, you can't. Data races are explicitly one type of race condition that Rust protects you from. Anything else would indicate an unsound library or compiler bug. For reference, you can take a look at the non-exhaustive list of undefined behaviors[0]. None of those behaviours should be possible to trigger from safe Rust.

[0] https://doc.rust-lang.org/reference/behavior-considered-unde...


As per https://doc.rust-lang.org/nomicon/races.html

> However Rust does not prevent general race conditions.

> This is mathematically impossible in situations where you do not control the scheduler, which is true for the normal OS environment.

> For this reason, it is considered "safe" for Rust to get deadlocked or do something nonsensical with incorrect synchronization: this is known as a general race condition or resource race. Obviously such a program isn't very good, but Rust of course cannot prevent all logic errors.
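
To make the distinction concrete, a minimal sketch using only the standard library (my example, not from the Nomicon): the data race does not compile in safe Rust, while the deadlock, a general race condition, compiles without complaint. Running it will likely hang, which is precisely the point.

    use std::sync::{Arc, Mutex};
    use std::thread;
    use std::time::Duration;

    fn main() {
        // A data race is rejected at compile time: two threads would need
        // simultaneous mutable access to `counter`, which the borrow checker
        // forbids without synchronization.
        //
        //   let mut counter = 0;
        //   let t = thread::spawn(|| counter += 1); // does not compile
        //   counter += 1;

        // A general race condition (here, a deadlock) is accepted just fine.
        let a = Arc::new(Mutex::new(0));
        let b = Arc::new(Mutex::new(0));
        let (a2, b2) = (Arc::clone(&a), Arc::clone(&b));

        let t = thread::spawn(move || {
            let _x = a2.lock().unwrap();
            thread::sleep(Duration::from_millis(50));
            let _y = b2.lock().unwrap(); // may block forever
        });

        let _y = b.lock().unwrap();
        thread::sleep(Duration::from_millis(50));
        let _x = a.lock().unwrap(); // may block forever: a perfectly "safe" deadlock

        t.join().unwrap();
    }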


On the one hand, we like to encourage learning here. On the other, we prefer not to copy-paste a bunch of possibly-irrelevant content. Well, forgive me for pasting in a StackOverflow answer that may be relevant here: https://stackoverflow.com/questions/11276259/are-data-races-...

> Are "data races" and "race condition" actually the same thing in context of concurrent programming

> No, they are not the same thing. They are not a subset of one another. They are also neither the necessary, nor the sufficient condition for one another.

The really curious thing here is that the Nomicon page also describes this distinction in great detail.


[flagged]


I apologize if my comment came off as snark. Your comment was nothing but pasted text which omitted relevant detail, so it was not clear what the intent was. In context, to me, it did not seem to be illuminating. It actually seemed to be introducing confusion where there previously was none.

Data races are not possible in safe Rust. Race conditions are. The distinction is made clear in the Nomicon article, but commenters here are really muddying the waters...


Clearly there is still confusion, since we don't agree (as does the other aforementioned poster).

I could have also belittled your comment as a "bunch of possibly-irrelevant content", since most of the content was and still is unnecessary snark.

But then it would have said more about my own etiquette and capability to debate objectively than about the topic at hand.

Our definition of data race seems to differ, and because you don't seem to be able to separate objective discussion from personal attacks, I'll stop here.


> I could have also belittled your comment as "bunch of possibly-irrelevant content"

That doesn't really make sense because there are other witnesses, so everybody who knows about this topic can see immediately that you're wrong and the other person is right.


Regardless of whether they were technically right or not, tialaramex, couchand's messages also contained belittling and snarky content.

So one could have said just the same about what they wrote, because of what it also contained: a "bunch of possibly-irrelevant content".

This kind of behavior is not justifiable on technical merits alone. At least it shouldn't be.


You can have both data races and race conditions in a perfectly safe Rust program.

Rust can only offer its safety for the lifetimes it tracks, and this tracking is limited to a single instance of your process.

In other words, if you perform concurrent accesses through multiple threads spawned from the same process, you're safe from data races but at risk of race conditions. If you perform concurrent accesses through multiple processes, you're at risk of both.

That implies that even in a world where everything is safe Rust, you cannot guarantee that these two Rust processes will not data race with each other.


> If you perform concurrent accesses through multiple processes

How would you do that in safe code? If I understand correctly, the problem you're referring to is the exact reason that mmap APIs in Rust are unsafe: https://docs.rs/memmap2/latest/memmap2/struct.Mmap.html#safe...


I don't agree with the point GP is making, but presumably something like two processes writing a file and reading their changes back. If the writes and reads interleave you'd get a race condition.

That's decidedly not a data race, however.

I don't think most people expect language safety guarantees to extend to other processes. That's a pretty unrealistic expectation you are bringing.

I don't think that's unrealistic; it all depends on which sphere of development you're navigating in.

In highly concurrent/long-lived systems, you often end up using a multiprocess approach between your producers and consumers rather than multithreading. This is because it allows you to dynamically spawn new processes to consume from shared memory without going through a bastion process managing threaded consumers. E.g. it allows you to spawn newly developed consumers at will without redeploying and restarting the producers. Note that this does not mean a different codebase.

As for the expectations, I think it's fair to highlight them, because people tend to think that a world of applications fully developed in Rust would somehow imply no data races. That's not the case: on a fully fault-free operating system, with only 100% safe Rust applications running, you can, and will, still run into data races, because in the real world applications cross process boundaries.


What is your definition of "data race"?

While I'm aware of some differences in opinion in the details, all of the definitions I'm aware of exclusively refer to multiple threads of execution writing to the same memory location, which makes the concept of a multi-process data race impossible. For example, Regehr: https://blog.regehr.org/archives/490

A data race happens when there are two memory accesses in a program where both:

- target the same location

- are performed concurrently by two threads

- are not reads

- are not synchronization operations


The subtle differences (or lack thereof) between threads and processes don't really matter IMHO. A data race happens when two concurrent accesses to the same memory location happen, one of these accesses is a write, and there is no (correct) synchronization in place. The means by which this memory location was shared don't really matter. Whether it was because both processes share the same initial memory space, or whether this memory space was mapped later on, or even whether both accesses were from a mapped space, is beside the point. You could have two threads mmap the same shared memory to communicate, for all it matters.

Okay, at least there's definitional agreement here. However,

> The means by which this memory location was shared don't really matter.

It does in the context of this conversation, because you are claiming that safe Rust can create data races. Is there a safe API that lets you map addresses between processes? The closest thing I'm aware of is mmap, which is unsafe, for exactly this reason.


Well, see, that's the point of the conversation where I just don't follow the Rust logic.

"Safe" and "unsafe" are scoped properties. They apply to the block of code that they encompass, by very definition of the fact that they are scoped. As long as an unsafe block respects the constraints that would be enforced if Rust could understand them, then that unsafe block can be wrapped in a safe block and exposed.

In other words, "unsafe" is not transitive as long as what is done in the unsafe block respects the constraints. Mapping a block of memory is not unsafe. AFAIK Rust is able to allocate memory, which means Rust calls mmap, and that is safe.

Thus, mmap being marked unsafe is not relevant here, because the actual unsafe block using mmap can be safe by knowledge of the programmer, while still inducing a data race later on (or not).

If your argument is that unsafety is transitive by virtue of calling something that might have unsafe side effects later on, then that is seriously broken. Because by the same logic, every file access could use /proc to overwrite your process memory or change your memory maps, so should we mark them all unsafe? Or worse, you propose a "maybe unsafe" approach, where things are maybe transitively unsafe, in which case why bother; let's rename "unsafe" to "shady", or unsafety just becomes some qualitative measure.


> If your argument is that unsafety is transitive

It is not. I am in full agreement with this:

> As long as an unsafe block respects the constraints that would be enforced if Rust could understand them, then that unsafe block can be wrapped in a safe block and exposed.

But I am not sure what you mean by

> AFAIK Rust is able to allocate memory, which means Rust calls mmap, and that is safe.

liballoc is not required to call mmap, and even if it did, that mmap isn't shared with other processes, as far as I'm aware.

> while still inducing a data race later on

If this is possible, then it is not safe, otherwise you lose the properties we're in agreement about above: no data races are allowed in safe Rust.

> Because with the same logic, every file access could use /proc to overwrite your process memory

It is true that Linux specifically exposes the ability to mess with process memory via the filesystem API, and you can use that to subvert Rust's guarantees. But this one specific hole on one specific platform is considered to be outside the scope of what any programming language can guarantee. An operating system corrupting your memory is a completely different thing.

I doubt we're going to come to an agreement here, but I appreciate you elaborating, as now I think I understand your position here. It's not the one in general use though, so you're going to encounter these sorts of arguments in the future.


Honestly, I see this as a losing argument. The real answer here is that Rust relies on certain guarantees from its environment, and when those are not provided, what it layers on top of them is not guaranteed to be valid either. Whether you mark the interfaces that do this as safe or unsafe is really a choice made by the implementer, not a hard-and-fast rule. Nobody will make the interface to open files unsafe, even if it lets you open a file that lets you touch memory that should not be modified, because the actual use case for it is so overwhelmingly likely to be safe that marking it unsafe makes little sense. But there are plenty of "unsafe" ways you can subvert the Rust model, by changing what the OS or processor or hardware does, and it's not worth classifying all of those.

That's my thinking as well. Any library that exposes such shared-memory IPC via shared references as a safe API seems fundamentally unsound.

mmap is unsafe in Rust.

I think that to some extent represents the dream, with caveats. A few pieces need to fall into place.

1. You need to formalize the semantics of unsafe Rust. Ralf Jung's pioneering RustBelt[1] work was a huge step in that direction, but it is not yet complete. One element that's emerging as tricky is "pointer provenance."

2. Part of that is making a formal model of the borrow checker. Stacked borrows[2] was a good attempt, but has some flaws. Tree borrows[3] may correct those flaws, but it's possible something even more refined may emerge.

3. Another part of that is a formal memory model, which is mostly about the behavior of atomics and synchronization (thus, is critically important for standard library components such as Arc). It is widely understood that Rust's memory model should be similar to the C++ one (and that they should interoperate), but there are some remaining flaws in that, such as the "out of thin air" problem and also some missing functionality like seqlocks (part of why the Linux kernel still has its own).

4. You need a proof that the safety guarantees compose well; in particular if you have sound code written in unsafe Rust, composed with other code written in safe Rust, the safety guarantees still hold. There are good results so far but it needs to be proved over the entire system.

5. You do need to close all the remaining soundness bugs in Rust[4], for such a proof to hold for all code. Progress on this is slow, as many of these problems are theoretical, or are only really relevant for adversarial code[5].

When all this is done, you still only have a partial guarantee. Huge amounts of the surface area to interface with the system are based on unsafe code. If you're doing only pure computation, that's one thing, but if you're building, say, a graphical UI, there's a massive amount of stuff that can still go wrong.

Even so, there is a practical path forward, and that leads to a much better situation than the usual state of things today, which is of course systems riddled with vulnerabilities.

[1]: https://people.mpi-sws.org/~dreyer/papers/rustbelt/paper.pdf

[2]: https://plv.mpi-sws.org/rustbelt/stacked-borrows/

[3]: https://www.ralfj.de/blog/2023/06/02/tree-borrows.html

[4]: https://github.com/rust-lang/rust/issues?q=is%3Aissue+is%3Ao...

[5]: https://github.com/Speykious/cve-rs


Provenance is a serious problem for a language which has pointers and optimisation.

It's interesting to me that serious people propose "bag of bits" semantics (basically "What is provenance? Let's just say pointers are raw addresses") for C++, which seems like a terrible idea, and I can't tell whether they really mean it. P2414R3: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p24...
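
For context, here is the kind of example (my own sketch) that a "bag of bits" model cannot explain: two pointers may compare equal as addresses, yet only one of them is allowed to perform the access.

    fn main() {
        let x = [0u8; 1];
        let y = [0u8; 1];

        // One-past-the-end pointer of `x`: fine to compute and compare, but it
        // only carries provenance for `x`, never for `y`.
        let one_past_x = x.as_ptr().wrapping_add(1);
        let py = y.as_ptr();

        // Whether the two addresses actually coincide depends on stack layout.
        if one_past_x == py {
            // Same address, different provenance:
            let ok = unsafe { *py }; // fine: `py` is allowed to access `y`
            // let ub = unsafe { *one_past_x }; // UB: no permission to access `y`
            println!("equal addresses, but only one read is allowed: {}", ok);
        }
    }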


Thanks for the explanations and all the links!

I'm no Rust expert but it seems to have a lot of unsafe in core code. And understandably so:

https://github.com/search?q=repo%3Arust-lang%2Frust+unsafe+l...

I don't think Coq, as presented here, can verify all the unsafe calls.


Most of the interest in formalizing Rust is specifically about formalizing the model that unsafe code is supposed to uphold.

Can you describe how your approach contrasts with Aeneas or RustHornBelt? How do you handle pointers and mutable borrows?

I do not know how RustHornBelt works. We are focusing on safe code, although we still generate a translation for unsafe blocks as a "best effort".

Compared to Aeneas, the goal is very similar, as we want to verify Rust programs using interactive theorem provers. However, with coq-of-rust we write the purely functional version of the code (the one on which we make proofs) by hand, or with the help of some GitHub Copilot as this is rather repetitive, and prove it equivalent to the automatic translation. In Aeneas the aim is to directly generate a functional version.

We handle all the pointers as if they were mutable pointers (the `*` type). We do not use any information from Rust's borrow checker, which simplifies the translation, but we pay that at proof time.

To reason about pointers in the proofs, we let the user provide a custom allocator that can be designed depending on how the memory will be used. For example, if the program uses three global mutable variables, the memory can be a record with three entries. These entries are initially `None` to represent when they are not yet allocated. We do not yet know how this technique can scale, but at least we can avoid separation logic reasoning for now. We hope that most of the programs we will verify have a rather "simple" memory discipline, especially on the application side.


I should add that for pointers to immutable data, we can translate them to immutable values on the Coq side. Thus for Rust code written in a purely functional way (using only immutable data structures) the translation is almost one-to-one with Coq, that is to say the same code up to syntactical differences and the verbosity of our monad that behaves like the identity monad in that case.

Thus we can provide a path for people who are ready to sacrifice performance for proofs. I guess that immutable Rust is simpler to verify with the other systems too.


> I guess that immutable Rust is simpler to verify with the other systems too.

I don't think that's the case (it's not harder either). The type system of Rust makes handling mutability fairly trivial in verification; in fact, it could be translated to immutable code in a continuation monad, like the Tardis monad from Haskell, I think.


Is formal verification specification writing similar to writing a more complicated property test? I’ve found writing property tests to be kinda difficult and time consuming for non-trivial programs.

Similar, not always the same. Property tests generally work at the interface of a system (either down to the function/procedure level or higher level) where you specify a description of the inputs and test the outputs satisfy some property or set of properties.

This will look a lot like what you do with formal verification at the same level. Formal verification tools can go deeper, though, like answering questions about the state of the system internal to a computation. Like "I have this loop invariant, can I prove (or have a system prove) that it actually holds across all iterations?" or "Can I prove that while doing this computation there is never underflow/overflow in any intermediate computation?" The former could be done with property testing if you pull the loop kernel out into its own procedure and then test the invariant property by executing the kernel with a large number of intermediate states. The latter is harder to do without really, really tearing your program apart down to the level of making each line separately executable.
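
As a toy illustration of the loop-invariant case, using nothing beyond the standard library (my sketch, not from the parent comment): pull the loop body out into its own function, then exercise it over many intermediate states and check that the invariant survives each step. A deductive tool would instead prove it for all inputs.

    // Integer square root by linear search; invariant: lo * lo <= n throughout.
    fn isqrt_step(n: u64, lo: u64) -> u64 {
        if (lo + 1) * (lo + 1) <= n { lo + 1 } else { lo }
    }

    fn isqrt(n: u64) -> u64 {
        let mut lo = 0;
        loop {
            let next = isqrt_step(n, lo);
            if next == lo { return lo; }
            lo = next;
        }
    }

    // Poor man's property test: check the invariant over many intermediate states.
    fn main() {
        for n in 0..2_000u64 {
            for lo in 0..2_000u64 {
                if lo * lo <= n {
                    let next = isqrt_step(n, lo);
                    assert!(next * next <= n, "invariant broken at n={}, lo={}", n, lo);
                }
            }
        }
        assert_eq!(isqrt(99), 9);
        println!("invariant held on all sampled states");
    }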


I think my main point is that writing the specification is difficult enough for a non-trivial program with property tests, with many places tending to avoid it entirely if they can. If formal verification is even more difficult, it doesn't matter that it has more power and can make the code even more correct - the cost is a huge blocker.

It depends on how you do the formal verification. There are systems like SPARK/Ada, which brings the specification into the language you're developing in (it supports only a subset of Ada, but every year they're expanding what it covers), JML, which does something similar for Java but with the specification expressed in its own notation in Java comments, and others.

If you use these kinds of systems you can be more judicious in what you try to prove. Whole program proofs are infeasible (or at least impractical) generally, but you pick the subsets you care about most and you can add annotations and prove them (automatically ideally, but you can bring in an external proof assistant as well). You leave the rest of the system to testing which has proven remarkably effective over the decades (when taken seriously, at least).

This isn't an all-or-nothing proposition, you choose how to spend your time and effort with formal verification just like you do with testing or other work.


There are some specifications that are always there, like the absence of reachable panics or the backward compatibility between releases on stable entry points.

Otherwise you can outsource this work to specialized companies such as Formal Land or others!


> the cost is a huge blocker.

It is. Formal verification is and will be (for a very long time) a thing you only do for airplanes and nuclear reactors.


Didn't know that was even feasible.

I wonder if efforts like this could speed up the adoption of Rust in key parts of the kernel.


Could someone ELI5 the concept of "verification", why does an entire language (Coq) exist for this purpose, and what practical implications this has on wider society?

I wonder how much work it would be to add a Rust backend to F* [https://github.com/FStarLang/FStar]

It should be possible. A specific feature of Coq that we use is impredicative Set. I do not know if this is the case in F*.

Did you find any bugs in the crates?

No, we have not found bugs! We have only scratched the surface and bugs are probably well hidden as the standard library is very well tested. We do not expect to find bugs there, especially as we also assume the unsafe code is safe.

Our main hope with the translation to Coq of the standard library is to get a formalization of it, so that we can precisely verify Rust programs calling core and alloc.


[flagged]


The 'problem' was very obviously directed at politically correct English people who have no idea what the word means in other languages.

Also, people don't have a problem with the 'git' brand, somehow?


Well, the primary meaning of "cock" is a male chicken (also applied to some other male birds) in English too, irrespective of the slang terms. "Le coq" is also a symbol of France https://en.wikipedia.org/wiki/Gallic_rooster, see also https://en.wikipedia.org/wiki/Le_Coq_Sportif, and coq being a French project, it's understandable why they would choose this name (and have second thoughts about changing it because of other people's sensitivities).

> and have second thoughts about changing it

Do you have a source? As of now this is still in the roadmap[1] and I saw no information about removing it from there.

[1]: https://github.com/coq/ceps/blob/coq-roadmap/text/069-coq-ro...


of course it's the 069 roadmap doc...

"Coq" is also the first letters of the creator's name (Coquand).

It is also very likely (if not definitely known) that the name was chosen at least in part because of the English meaning of the word, making the "people have no idea what the word means" argument moot.

Outside of France, most users of Coq are in the US, where it is taught in many universities. I am sure there were no female professors or TAs whose job was made harder purely due to being "the coq teacher" (this is sarcastic; multiple such professors and TAs have mentioned having this type of issue on the Coq mailing list).


Oh no, giggling students! Someone do something, quick! I can only imagine the horror.

This is what happens in french classrooms when you talk about bits :-)

Don't forget strings.

So teachers complain on the mailing list instead of telling the students to grow the f* up.

I don't doubt this is true, but I question the worth of complaints.


You are assuming a lot here: as far as I remember, someone proposed a name change, then these accounts came up supporting the change. It's not like people were complaining left and right on the mailing lists, but when the opportunity arose to fix once and for all a problem they had, they spoke in favor.

And in my opinion multiple such accounts coming up are a good reason to proceed with a name change: the problem shows up as a pattern rather than isolated incidents, and a reasonable fix is identified.


Ok, but you claimed otherwise in your parent comment.

Do you know what a «string» means in French? Or a «bite» (which is pronounced exactly like “bit”). If not, look that up.

And besides lame jokes when 14 years old hear it for the first time (both male and female, btw), it's fine.

It's essentially a problem of American puritanism (and likely also a problem of misogyny among students, but the solution to that really isn't to change the project name!)


The English word bit has the same meaning in French as the word Coq has in English.

I am French, I know :)

But in my experience this simply causes some giggling in the classroom for a few days and that's it, which is not the experience I have seen recounted regarding Coq in the US.

There are multiple differences between these cases:

- The word "bit" is mostly used when speaking English, not in a French sentence (we use "octet" - I don't know the history here but I wouldn't be surprised that this is specifically because of the French meaning of "bit"). Coq, being the name of the tool, is used as-is in English sentences. [This is wrong, I somehow confused bit and byte here]

- Even when used in a French sentence, the gender is different ("un bit" vs "une bite"), removing ambiguity

- Bits are just one fraction of the curriculum, not the name of the tool used in every single lesson of the course

I will refrain from commenting further on this topic, as it has been rehashed many times already and distracts from the work on rust-to-coq translation.


Are you sure? I'm not French but according to French wikipedia "octet" means "byte" not "bit": https://fr.m.wikipedia.org/wiki/Octet

Sorry, you are right of course! I somehow confused bit and byte, I have updated my comment. Thanks for pointing that out.

Reminds me of renaming NeurIPS. Another change we're better for.

I mean... I _do_ think that language sensitivities are overblown these days. E.g. the majority of women participants did not "agree or strongly agree" that NIPS should become NeurIPS, yet it did.

GIMP and Git are words that most non-native English speakers don't have a strong association with. So they get a pass in most of the world.

But cock is as universal a swear word as you will find on this planet.

If I do a retreat in the village of Fucking, and write a compiler there, and then call it TFC -- the Fucking compiler, I know what I am doing.


> If I do a retreat in the village of Fucking

Funnily enough, they renamed it to Fugging in 2021 because of annoying tourists.

https://en.wikipedia.org/wiki/Fugging,_Upper_Austria


> But cock is as universal a swear word

is it?

Are you confusing it with Dick? The name, of course...

Ironically the word dick comes from the German language, meaning "dominant ruler"


Dick means fat/big in German

sorry, my bad, you're right of course, I meant not literally but as a short name for Richard

Well we do have the Brain Fuck Scheduler

Frankly I'd welcome more rejection of corporate safetyism. We're adults, not kids in a "Christian Minecraft server"


> But cock is as universal a swear word as you will find on this planet.

It's not, though. Come to France, say "coq" - people will wonder why you're talking about roosters all of a sudden. Holy Anglo-centrism, Batman...


I am German. German culture does not have the Anglo Saxon obsession with swear words to begin with. The US/British obsession with keeping dirty words from children's ears is a never-ending source of amusement for me.

I think you missed my point. "As universal as you will find" does not mean universal. It means there is nothing "more universal" - no word understood as a swear word by more people. There is no absolutely universal swear word. Obviously.

But English is spoken by a lot of people, most of them not native, most of whom don't have a native word that sounds like cock as a first association.


Python has a pretty big naming problem too. Ophidiophobia is the number one phobia in the world, and even though "python" is from Monty Python and not the snake, people keep posting enormous pictures of snakes on book covers and blogs, making it basically impossible for ophidiophobia sufferers to go near python. There are estimates that this is 10% of the population.

Now.. do I think Python should change its name? No.


I am extremely skeptical about the idea that 10% of the population has a phobia of serpents severe enough to avoid learning a programming language named after a kind of snake.

You mean severe enough to want to avoid being exposed to pictures of snakes. You have to see it from the sufferers' perspective.

When I saw they wanted to rename the project, I thought to myself that a rename to FACT Prover (Formerly known As Coq Theorem Prover) would be nice, and it's a bit of a disappointment that didn't quite pan out.

Rocq is similar to the Roc programming language which is confusing naming (and they sound identical as far as I know).


Since I am both gay and a user of coq, I have had the opportunity to jokingly say that I like coq in all of its spellings. It goes without saying that I am against the name change.

> niche theorem prover

As opposed to popular theorem provers.


Quite fittingly, there is a South American bird called Coq of the Rocq... errr, Cock of the Rock: https://en.wikipedia.org/wiki/Cock-of-the-rock

A new name always takes time to disseminate after a rebranding, especially when the change is controversial. The most glaring example being X.

The decision to change the name has been made by the Coq team, but I believe they do not want the new name to be used until it has been enacted administratively, legally, and technically (e.g. the documentation should mention the new name and be hosted on rocq.inria.fr).

True: W(indow), X, XFree86, X.org. And you might have thought that they would come up with Y and Z next, but no, they throw Wayland (named after some locality) in your face instead. How are you supposed to follow this stuff? :D

There was a Y windowing system.

http://www.y-windows.org/



