TLA+ (wikipedia.org)



I have successfully used TLA+ to clear my 2-phase "put a thread to sleep" protocol of deadlocks and livelocks.

I also proved that one of the nasty deadlock bugs I was fighting on Linux (but not on Windows or Mac) was actually glibc missing condition variable wakeup signals from time to time.

100% convinced it's worth learning to debug or verify any kind of distributed or multithreaded synchronization primitive. I was bug hunting for a couple weeks before deciding to learn formal verification / model checking via TLA+.

Writeup: https://github.com/mratsim/weave/issues/56


I have compiled my research on formal verification tools and papers here; there are a couple dozen:

- https://github.com/mratsim/weave/issues/18


>"100% convinced it's worth learning to debug or verify any kind of distributed or multithreaded synchronization primitive."

Might you or anyone else have some resources you could recommend for learning TLA+?


Check out the video course from Leslie Lamport (the creator of TLA+) himself:

https://lamport.azurewebsites.net/tla/learning.html


Leslie is one of those ridiculously over-accomplished people in computer science. I think it's safe to say you've earned respect when people can legitimately recognize your name for multiple, mostly-unrelated achievements, without realizing everything else you've done.


Perhaps I'm missing something but shouldn't that bug be filed with the glibc project rather than with the weave project?

Has glibc since fixed the issue?


Weave is my multithreading library. It's just a recap of what I went through for future reference.

I unfortunately didn't have time to go through:

- compiling glibc from source

- converting my code to C

- trying to reduce multithreaded synchronization code to a minimal example given how non-deterministic any of those bugs are

- do a write-up

That said, it seems like people went through the motions 4 months after me in this bug:

- https://sourceware.org/bugzilla/show_bug.cgi?id=25847 which affects C#, OCaml, Python

Someone also fully reimplemented glibc's condition variables in TLA+ to prove that the glibc code was buggy:

- https://probablydance.com/2020/10/31/using-tla-in-the-real-w... (at the bottom)


Wow.

Looking at the very complex relevant glibc code, I'd normally respond with "complexity is a code smell", but I don't understand the problem space well enough to legitimately suggest that here, other than to say that in my experience programming thus far, if you require this much complexity for any part of your system, it is in all likelihood reflective of a design flaw (possibly elsewhere).


High-performance concurrency is just the kind of problem that may well require a complex solution.


Why do so many people use the word smell to disparage code? The human part of me always feels it's super douchy and elitist, and the engineer part hates it because it's inaccurate and misleading.


It's an analogy to the way we detect things like food spoilage or gas leaks by smell. For the vast majority of us most code smells are actually observed visually (Why is this nested 10 levels deep? Why are there 20 parameters to this method?) so I suppose we could start saying code eyesores, but that often has the wrong sense (even more strongly negative) than smell.

It's really only douchy or elitist if it's strictly used in an insulting or disparaging way. Most of the time when people talk about code smells it's meant to indicate, just like with real-world odd, bad, or unexpected smells: Something doesn't seem right. I could be wrong, but it warrants investigation.

Using the examples above, deep nesting and large numbers of parameters are indications of a potential problem, but are not necessarily problems. It could be that a refactoring is in order, but it may also be that everything is fine (just not pretty).


I understand the point of the analogy; my question was more about why such a bad one is so popular. "Makes my hair stand up" would be an alternative that actually works, since it's used to mean a feeling of anxiety. No one mentions smells when they're talking about good code. And in normal American English conversation, if someone mentions a smell unprovoked, it's normally a disgusting smell. So when used against code, it implies the code is repulsive. To me at least it feels super insulting to respond to code (text on a screen) with such a visceral response of distrust/dislike.


I don't think everyone, let alone most people, makes the leap from "smell" to "repulsive". "My fridge smells funny" is not the same as "My fridge smells so bad it makes me want to puke". The first is a cause for investigation, the second is a cause for real concern.

Most code "smells" are in the former category, they are not describing the code as "repulsive" but "off". If you're making that leap then you're going to need to supply (or find) another term for what is described, for now over 20 years, as a code smell. Since most people don't have the same visceral reaction you do, they won't care.


It's a qualitative word for a qualitative assessment.

For example, it is quite possible that code that is nested 30 levels deep is provably correct. Since we are humans, is that likely? No, it is not. Or at least, it is certainly less likely than 3-level-deep code. So you look at some code and without doing any deep analysis you can make some "probably valid" assumptions which come from your experience.

If you're so bent on accurate engineering, answer me this: Why are technology choices often "fashion-driven", without any significant empirical data to support their choice over other alternatives? (My latest pet peeve in this space is widespread use of Docker.) And where data DOES exist (such as to support the statement "Functional languages produce far fewer bugs per LOC written"), why don't more programmers/engineers heed that? Is it perhaps because... you're not as "strictly engineer" as you think?


This is beautiful and satisfying.


I find it suspicious that glibc and musl should have the same bug. Double check your code?


I checked my code dozens of times over days.

I thought I had bugs in my code for sure, I couldn't believe glibc and musl had a bug.

Then I decided to spend a couple days to learn formal verification, solved all the bugs it found.

It worked on my Mac, it worked on Windows, it was still deadlocking on Linux. I replaced my code with raw futexes, no deadlock anymore. And I added logs that showed signaling happened but the thread signaled wasn't unblocked.

The futex<->condvar swap is pretty simple. - https://github.com/mratsim/weave/blob/f41a562/weave/cross_th...


Yes, I saw your screenshots, and I think they demonstrate a lack of understanding of how condition variables work.

Any use of condition variables where the lock is "never used" screams fundamentally broken to me. No amount of atomics and memory orderings or manually inserted fences/memory barriers can help you if you leave a race window open between checking your predicate and starting to wait on a signal. Condvar signals are not sticky. If nobody is waiting for a signal when you signal, that signal is lost. There is nothing between your atomics and lock & wait to prevent such loss.

Also your TLA+ (or rather PlusCal) does not model a condition variable correctly, so it would obviously not detect this issue. You have a sticky boolean variable and await, which blocks the procedure until the condition becomes true. This detects lockups of the class where the condition is never signalled at all, but it does not model the possibility of losing a signal because nobody was waiting for it.
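
To make the distinction concrete, here is a minimal sketch (my own variable names, not the ones from your spec) of the two ways to model signaling; only the second can lose a wakeup:

    VARIABLES flag, waiting

    \* Sticky boolean + await: the signal persists until consumed,
    \* so a thread that starts waiting late still sees it. This can
    \* never exhibit a lost wakeup.
    StickySignal == flag' = TRUE /\ UNCHANGED waiting

    \* Condvar-style signal: it only releases a thread that is
    \* already blocked. If waiting = 0 the step changes nothing and
    \* the signal is simply gone -- the race window described above.
    CondvarSignal == IF waiting > 0
                     THEN /\ waiting' = waiting - 1
                          /\ UNCHANGED flag
                     ELSE UNCHANGED <<flag, waiting>>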

I don't believe you discovered a bug in glibc and musl.


Also worth noting that the VS Code extension for TLA+ is steadily improving:

https://marketplace.visualstudio.com/items?itemName=alygin.v...

https://open-vsx.org/extension/alygin/vscode-tlaplus


Heckin'! One of my favorite things to see in TLA+ adoption is people making more tools to work with it.


There are two pieces of TLA+ and formal modeling that I don't get. (And I'm pretty sure all the authors and documenters have tried their best at explaining it again and again, but maybe asking the question again is not useless.)

1. Given you have to learn the language, and, at first, you don't know it - how do you even trust that you're writing a proof right, and that you're not simply writing gibberish? TDD has code execution as an "arbiter of peace". If, when calling a function, you get an unexpected result, then... you did something wrong.

If my TLA+ set of rules holds, how do I get confidence that I wrote the right set of rules?

2. Once I've got a set of rules that I decided to trust, when I implement it and I get bugs (as demonstrated by executable code), how do I decide if the most plausible cause is a bug in my code, or a bug in the spec?

Maybe it becomes obvious once you've done it, and I'm sorry to be the million'th to say such nonsense. Sorry.


TLA+ is fear driven development. If you have any fears that your TLA+ model or proofs might have been written wrong, your job is to encode those fears as further TLA+ proofs.

The last TLA+ specification I wrote was only about two pages long. The actual model was a handful of lines. A handful of definitions. And then a page and a half of proofs and proof proofs.

So it's easy to look over your spec exhaustively. And if there's any part of it you aren't sure about, or think you might have written wrong, then write more proofs that show you can't have written it wrong that way.

The IDE provides lots of feedback and ways to debug. So like learning any language, you'll spend your formative time just trying out different parts of the TLA+ semantics and introspecting the outputs to make sure the models are behaving the way you expect them to.

TLA+ is very much like doing math. If you do it wrong, you get answers that don't make sense. And it's easy to check if you're doing it wrong, by doing something you _know_ is wrong. If you get a "right" answer, you know you are making a mistake or don't understand a semantic correctly. The joy is that TLA+ allows you to write out these proof proofs and have them checked every time along with the rest of your spec.
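
For example (a hypothetical toy spec, not the two-pager I mentioned above): alongside the invariants you believe, add one you know to be false. TLC must flag it; if it doesn't, the model isn't reaching the states you think it is:

    VARIABLE owner   \* "none", or the thread currently holding a lock

    \* The invariant we actually believe:
    TypeOK == owner \in {"none", "t1", "t2"}

    \* A deliberate lie -- false as soon as t1 ever takes the lock.
    \* If TLC does not report a violation here, the spec never
    \* explores the acquisitions we meant to model.
    NeverT1 == owner # "t1"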

> how do I decide if the most plausible cause is a bug in my code, or a bug in the spec?

Your implementation and your TLA+ model share the same state data structures (though the TLA+ model might have a simplified representation of those states, eliding unimportant details). If you suspect your spec, you just copy the broken state sequence your implementation is encountering into your TLA+ model. If the model displays the same behavior you know your spec is wrong. If it doesn't, you know your implementation is wrong.
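
For instance (hypothetical names), if the implementation's logs show a duplicated job in the queue, seed the model with that exact state and let TLC run from there:

    \* Normal initial state:
    Init == queue = <<>> /\ status = [w \in Workers |-> "idle"]

    \* Initial state copied from the broken run. If TLC reproduces
    \* the bad behavior from here, the spec permits it and is wrong;
    \* if it can't, the implementation diverged from the spec.
    DebugInit == /\ queue = <<"job42", "job42">>
                 /\ status = [w \in Workers |-> "busy"]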


> 1. Given you have to learn the language, and, at first, you don't know it - how do you even trust that you're writing a proof right, and that you're not simply writing gibberish? TDD has code execution as an "arbiter of peace". If, when calling a function, you get an unexpected result, then... you did something wrong.

I write TLA+ specs in the same way I do TDD: first write an invariant or property I know I should have, then make it fail, then modify the spec to pass the invariant.

> If my TLA+ set of rules holds, how do I get confidence that I wrote the right set of rules?

Ultimately, nothing can guarantee this for us, just as nothing can guarantee we wrote the right code. This is one reason why I'm a strong advocate of spec reviews. Getting another pair of eyes on the spec goes a long way.

> 2. Once I've got a set of rules that I decided to trust, when I implement it and I get bugs (as demonstrated by executable code), how do I decide if the most plausible cause is a bug in my code, or a bug in the spec?

It depends. This has happened to me a couple of times before. In one case, it was because I forgot to include something in the spec. Adding that caused the spec to immediately fail, which corresponded to the executable bug. In another case, I was able to identify, assuming the spec was correct, where the spec-code mismatch must have been to cause the bug. So I was able to identify the code issue much faster than without the spec.

It helps that specs are usually much smaller than the corresponding codebase, so you can sweep it for issues much faster.


TLA+ specs are often short which aids in verifying and validating them, especially in comparison to typical development projects. TLA+ also permits ignoring non-essential details and focusing on the core aspects of the problem.

As an example of the latter, I used TLA+ once (only) for work to great success. But the spec I wrote ignored nearly every detail about the larger system, and focused on a few hardware details and modeled the shared memory system as a simple sequence of, rather arbitrarily, 5-11 elements (fixed in the models, in the spec it was a variable). What was stored in the memory? It didn't matter at all, it could literally be anything as far as the model and spec were concerned.

My point being, the spec was so simple that visual inspection and a bit of analysis could tell us that it was correct, it fit in a couple pages if printed out. Model checking let us verify that it had the desired behaviors (with respect to the invariants that I put in based on the official specs for the hardware). Later, weakening the invariants allowed me to "recreate" the race condition that existed in the hardware implementation and was causing all our problems.


I'd love to find a "bottom up" description of a language like TLA+. Start from the appropriate primitive (SAT solver? SMT solver?), show how to encode an example model directly, add layers of syntax etc as necessary for more complex examples.

I've watched all of Lamport's lectures but it really leaves me wondering what's happening under the hood: what are the primitives and how are TLA+ models mapped onto them.

(Lamport likes to say that TLA+ is just mathematics and imply that the engine underneath is not relevant but c'mon.)


TLA+ is just a mathematical language; it doesn't do anything other than help you formally express the behaviour of discrete dynamical systems and write formal proofs of propositions about them. It was designed to be used with pen and paper, and often the greatest benefit comes from using it that way. Various automated tools -- two model checkers employing completely different algorithms and a proof assistant that uses different techniques to talk to different backends (SMT solvers, Isabelle, tableaux provers) -- were added much later, and at different times, and they each cover different subsets of the language. So the multitude of algorithms employed by the tools are not the primitives.

I've written a series of posts that covers the TLA+ theory (but it isn't a tutorial): https://pron.github.io/tlaplus


SAT solvers deal with propositional logic only, and SMT solvers with quantifier-free fragments of first-order theories. TLA+ is based on propositional logic with added modalities (i.e. what a type theorist would call monads) for "time" (in the temporal logic sense) and non-determinism. It is essentially a somewhat bespoke modal logic. Modal logics in general have lower expressive power than FOL, but are conversely more computationally tractable.


Just found the TLAPS Tactics page: https://tla.msr-inria.inria.fr/tlaps/content/Documentation/T...

Sounds like the first thing TLAPS does is to translate the TLA+ program into SMT and try to prove it with Z3. If that times out then it translates it into a couple of other kinds of provers before giving up (...).

Maybe I should look at how TLA+ "hello world" is compiled into SMT as a first step.


Depending on the problem that you are trying to solve with TLA+, you may prefer one encoding or another. For instance, here is one encoding for the proof system: https://hal.archives-ouvertes.fr/hal-01768750/. And here is another encoding for model checking: https://dl.acm.org/doi/10.1145/3360549


Direct link to the Apalache model checker: https://apalache.informal.systems/


I think you'd be better off starting with a simpler formalism for this, like Linear Temporal Logic. That's a good pick because there's a straightforward mapping to Büchi automata. This will give you some more intuition on model checking in general, but TLA+ is much more complex than LTL.

(The most popular model checker for TLA+, TLC, is a brute forcer that represents the state space as a directed graph. Lamport talks about it a bit in Specifying Systems.)


TLA is the awesomest thing I've ever learned, yet the technique I use least often as an "enterprise product" software engineer.

No regrets. I recommend watching the video series introducing TLA+ by Lamport, what a wonder.


TLA has some amazing capabilities, no doubt about it. A few years ago I had to try to fix some security issues in the most complicated code I'd ever worked with [1]. Some of the "attacks" involved race conditions across three different processors: it was incredibly difficult to keep everything in my head.

So I entered the key parts of the algorithm in to TLA+. It found all of the races that I'd found in the original algorithm; and then didn't find any issues in my fixed algorithm. That gave me at least some confidence that I'd gotten things mostly right.

That said, it's really quite a wonky tool in a number of ways. The almost-C dialect which doesn't have local variables is quite a footgun. And it's quite tightly bound to their particular UI tool, which makes keeping the models in git possible but somewhat quirky.

What I'd honestly like more is to be able to make some files which could be compiled either in the real software project, or as part of a formal verification testing step. That way you could verify the properties as part of your CI loop (or perhaps part of more extended testing done elsewhere, if that's too expensive), and also have more confidence that the implementation matched your model.

[1] https://xenbits.xen.org/xsa/advisory-299.html


> And it's quite tightly bound to their particular UI tool, which makes keeping the models in git possible but somewhat quirky.

I couldn't find a way of tracking the TLA+ Toolbox's files nicely in git, but with a bit of manual work it's possible, using the `tla-bin` CLI[0]. I used the TLA+ Toolbox to write the spec[1], but wrote the models by hand (eg. [2]).

This was before I heard of the VS Code extension. I think it's quite good now, so there might be a better workflow that avoids the TLA+ Toolbox altogether.

> What I'd honestly like more is to be able to make some files which could be compiled either in the real software project, or as part of a formal verification testing step.

Do you mean you'd like to generate a TLA+ spec from your source code?

[0] https://github.com/pmer/tla-bin

[1] https://github.com/statechannels/tla-specs/blob/6d7227e2d183...

[2] https://github.com/statechannels/tla-specs/blob/6d7227e2d183...


Yes, I think I was more or less inspired by tla-bin; but in the end it didn't give me the control that I wanted, so I ended up just putting the Java command into my Makefile.

I'm not sure what you're calling "spec" vs "model"; I wrote stuff in PlusCal, which I checked into git; then used `make` to translate that into .tla automatically as needed. The .tla files are .gitignored, so don't choke up your commit history with machine-generated stuff.

Here's a snippet of my Makefile:

    %.tla: %.pcal
        cp $< $@
        java -cp $(TLATOOLS_JAR) pcal.trans -nocfg $@
        rm -f $*.old

    .PHONY: all-sources
    all: all-sources

    SOURCETARGETS += XSA287.tla
    TESTTARGETS += XSA287-A.out
    XSA287-A.out: XSA287-A.cfg XSA287.tla
        java -cp $(TLATOOLS_JAR) -XX:+UseParallelGC tlc2.TLC -config $< -workers 8 XSA287.tla | tee $@
The idea here is that doing "make" would simply compile all the .pcal files into .tla files (thus effectively doing a syntax check); but if you wanted to run any specific test, you would name the output file the test would generate.

You can see how I'm working against the way TLA wants to work: It wants to take your PlusCal code at the beginning of the file and rewrite the end of the file, "helpfully" backing up the old file with a ".old" extension; I want to take plain PlusCal code and generate an intermediate file (like an object file) which is not checked in. So I have to copy $FOO.pcal to $FOO.tla, run `pcal.trans` on $FOO.tla, and then delete the useless "$FOO.tla.old" file.

I always meant to publish these after XSA-299 went public, but haven't gotten around to it.

> Do you mean you'd like to generate a TLA+ spec from your source code?

Well obviously the whole project is so large as to be completely prohibitive. The model is written in the C-like variant of PlusCal, and uses a lot of the same variable names and such to help make sure they match the real C functions; and then translated into TLA. It would be nice if I could designate just one file (written in C) whose functions would be translated into TLA. The properties you want to verify about those functions would obviously need to be written in TLA directly.


> I'm not sure what you're calling "spec" vs "model"

This is from memory, but the TLA+ Toolbox distinguishes between your "specification of your algorithm" and a (bounded) "model of your specification".

So, you might have some spec for a sorting function that works for arrays of natural numbers of arbitrary length. And your model would limit it to arrays of length 5, and "define" the natural numbers to be `range(0,10)` -- ie. you are checking a finite "model".

I think this is consistent with the language used in the wiki article (the OP).
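
To make it concrete (a hypothetical sorting spec, not one of the ones linked above): the spec declares constants, and the bounded model is little more than a .cfg file pinning them to finite values for TLC:

    \* Sorting.tla declares:  CONSTANTS MaxLen, Values
    \* Sorting.cfg -- the finite "model" TLC actually checks:
    SPECIFICATION Spec
    CONSTANTS
        MaxLen = 5
        Values = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
    INVARIANT Sorted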

> You can see how I'm working against the way TLA wants to work

I'd much prefer your suggested workflow. I don't recall that you can write plain old pcal.

It's amusing that you want some TLA CLI tools that work kind of how the LaTeX toolkit works, another of Leslie Lamport's contributions...

> I always meant to publish these after XSA-299 went public, but haven't gotten around to it.

Please do!

> It would be nice if I could designate just one file (written in C) whose functions would be translated into TLA. The properties you want to verify about those functions would obviously need to be written in TLA directly.

That seems like a really tractable problem to me, if the C files were written with this in mind. I've actually been thinking about this since giving an internal talk on my TLA+ project last week!


> This is from memory, but the TLA+ Toolbox distinguishes between your "specification of your algorithm" and a (bounded) "model of your specification".

That's right, I'm remembering now. So yeah, the "models" are the ".cfg" files (which, as you can see from the Makefile snippet, are actually specified with the "-config" argument).

The .pcal files are actually just "normal" TLA files with the generated content replaced with

    \* BEGIN TRANSLATION
    \* END TRANSLATION
pcal.trans doesn't check to see that there's something there, since it's just going to throw it away. So I call them .pcal, but they're "really" non-generated TLA + pcal.

> It's amusing that you want some TLA CLI tools that work kind of how the LaTeX toolkit works, another of Leslie Lamport's contributions...

I've got a number of tools I've written for my own personal use over the years, and TLA+ has "one person spent a few decades being both sole author and user" fingerprints all over it. :-)

> Please do!

Tomorrow I'll look through things and if it's suitable I'll post my git repo somewhere.


Not sure if you'll see this, but finally got around to posting my repo:

https://gitlab.com/xen-project/people/gdunlap/tla


I use a personal wrapper around the command line for most of my work. It's incomplete and has a few holes, but it gets the job done for me. https://github.com/hwayne/tlacli


Thanks for all your content on TLA+ -- learntla.com was very helpful!


I've mentioned this a number of times here. AWS uses TLA+ all over the place, and use is strongly encouraged and supported.

A service team I worked for had to implement a back-end process that had zero tolerance for mistakes. The engineers involved had a process worked out that had been peer reviewed by a principal engineer, and were encouraged to model it in TLA+ as part of that review given what was at stake. Two engineers learned TLA+ and PlusCal over the space of about a week, then outlined the process over the next week or so, and kept finding minor, improbable correctness bugs.

Once fixed, they found translating the code into actual Java to be very straightforward: "It's fill in the blanks. The whole logical structure is there already."

In the end, the feature landed well ahead of planned time, even accounting for the time taken to learn and prove correctness, and was able to be launched with little fanfare and high degrees of confidence.

I know at least one of those engineers has gone on to use it very regularly. It gives them a good chance to model the logic first, solving the actual problem at hand, and then start thinking about "the boring stuff" like database connections, unit tests etc.


> I recommend watching the video series introducing TLA+ by Lamport

I'm guessing this is the video series for anyone looking for a link https://lamport.azurewebsites.net/video/videos.html


I love his meticulous explanation of how to use the toolbox and his hammering on Getting Started and the help. It really shows his experience with explaining this to so many students and reflects the questions he got the most.


Relevant: "Hillel Wayne is Designing Distributed Systems with TLA+" https://www.youtube.com/watch?v=qubS_wGgwO0


If you're interested in TLA+ I recommend https://learntla.com which uses the simpler notation PlusCal


How do you go from a TLA+ spec to code that can be executed, and verify that they both do the same thing?

Is this a manual (and therefore, subject to human error) process, or do tools exist for automating it?


Think of writing a TLA+ spec as structured thinking with error checking. It helps create a correct algorithm. Having your code implement the algorithm correctly is a different problem.

There was a Racket language that actually synthesized programs from specs (e.g. going from the spec of a sorting algo to its implementation automatically), but I forget what it's called. Fun to play around with.


I've seen a couple things on using Racket for code synthesis. My google search revealed Rosette, which wasn't what I was thinking of. The other work may have been in Scheme, using minikanren, but my Google Fu seems weak this morning.

https://docs.racket-lang.org/rosette-guide/index.html


Fun fact: Emina Torlak also wrote Kodkod, the constraint solver used to convert Alloy specs into SAT problems. She's pretty cool!


For TLA it's manual. For other systems, like Coq for instance, you can extract to another language.

(Also important: systems in this space are different. What something like Coq is good for is completely different from what TLA+ is good for. Another idea in the same vein is property-based testing. Here the concrete code is tested by invoking random test examples, so you are testing the actual code/implementation. But this also means you can't use the typical tricks of model checkers, where whole classes of inputs can be regarded as the "same" in a test.)


It's a manual process, but I read there are tools that try to automate it.

But as far as I understood it, if you know TLA+ and the target language well, it's like writing an article from a good outline.


TLA+ is what rekindled my genuine love for mathematics.

I am a self-taught dev, and learning TLA+ gave me the feel of "real" computer science.

Learning how to think of programs exclusively as state machines gives a kind of intellectual clarity I didn't know existed.

It is also the only thing that actually improved my dev productivity by magnitudes.


James Hamilton, the architect of AWS and one of the authors of DynamoDB wrote a blog post on formal verification of distributed systems. He specifically mentioned TLA+ as what his team at Amazon used to do this.

https://perspectives.mvdirona.com/2014/07/challenges-in-desi...


How valuable is TLA+ for non-concurrent applications? I've always been interested in formal verification (is TLA+ technically that?).


If you don't have any concurrency or nondeterminism in your spec, then you'll probably want different tools. It's not that TLA+ is bad at modeling single-threaded algorithms; it's just that it makes a lot of design decisions around representing concurrency, and there's an opportunity cost to that. I'd consider a code-level formal verification language, like Liquid Haskell or Dafny.


> How valuable is TLA+ for non-concurrent applications?

My understanding is that TLA+ is for concurrent applications, but with a broad definition of "concurrent." For example, it could be...

* multiple hosts messaging each other

* multiple threads in one process sharing a data structure

* multiple API calls that all use the same Postgres DB

In other words, any sort of system where different actors can step on each others' toes.


You define variables and code that changes the variables over time. TLA+ lets you observe variable states and assert constraints. The amount of concurrency is up to you, it works fine with a single process.

Non-determinism might be a better thing to anchor the value to. For example in a deterministic "normal" programming environment, an if/then/else statement will execute only one out of two possible code paths. You have to run the code with a complementary condition to observe the other code path.

In a non-deterministic environment like TLA+ both possible code paths are executed. You can observe state transitions in both possibilities.

In a deterministic context the combinatorics of code paths can get out of hand quickly. Non-deterministically you have one set of assertions for all code paths, or patterns of paths.
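
A tiny sketch of what that looks like (my own example, not from the article):

    VARIABLE path

    Init == path = "start"

    \* In ordinary code this disjunction would be an if/else and one
    \* branch would run per execution. TLC instead generates BOTH
    \* successor states of "start" and checks invariants along each.
    Next == /\ path = "start"
            /\ \/ path' = "left"
               \/ path' = "right"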


That's definitely the main use. But TLA+ can also be applied to single threaded specifications by encoding various invariants (pre/post conditions on entry and exit, loop invariants).
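
For instance, a toy single-process PlusCal sketch (my own example; the enclosing module needs EXTENDS Naturals, TLC for \div and assert) with the loop invariant and postcondition written as assertions:

    (* --algorithm sum
    variables i = 0, total = 0;
    begin
      Loop:
        while i < 5 do
          assert total = (i * (i + 1)) \div 2;  \* loop invariant
          i := i + 1;
          total := total + i;
        end while;
      Post:
        assert total = 15;  \* postcondition
    end algorithm *)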


You can construct formal proofs about a TLA+ spec, using the "TLA proof system" (TLAPS). Or, you can verify properties for "bounded models" of your spec via brute force.

The latter is much less overhead, so it's good to use bounded models (with the TLC model checker) to become confident in the spec, then construct proofs if warranted.


TLA+ is a nice tool to help with thinking. It is surprisingly approachable for a formal system, so don't hesitate to try it.


It looks quite useful; especially for cannot-fail systems, like operating systems, development tools, and Mars Rover firmware.

It seems a bit overkill for the kinds of code I write, which represents a design flexibility that makes wet spaghetti look like a load-bearing truss.


Software companies widely use TLA+: https://github.com/ligurio/practical-fm


In CS212 we're mostly writing simple recursive programs in SML and proving using induction that they evaluate to the correct result. We step through the programs by hand on paper. This seems very different, and I'm wondering if what we're learning has any practical application.


That's a different style of proof, based on equational reasoning; Agda [0], Coq [1] and Lean [2] are three theorem provers that work with those proofs better. Being able to quickly and informally perform equational reasoning on functional programs is also what (imo) makes those languages easier to think about, so it's a valuable skill outside of proving.

[0]: https://wiki.portal.chalmers.se/agda/pmwiki.php [1]: https://coq.inria.fr/ [2]: https://leanprover.github.io/


Not familiar with that course, but there are definitely applications for proof-writing to show that code does what you expect and/or satisfies certain properties - the seL4 microkernel [1] is probably the highest-profile example - but it quickly becomes impractical beyond small programs, or if the code wasn't written with proving in mind initially.

IMO the most "practical" application for those methods is in strengthening type systems to allow more expressive constraints - one small example that comes to mind is Idris [2] allowing you to explicitly mark certain functions as total [3] and fail type-checking if they don't return a value for every possible input.

The main thing to keep in mind with TLA+ is that it's not really meant for validating code, it's more for showing that your design is consistent and satisfies safety requirements (e.g. "no temporary node failure should result in data loss"). However, having a TLA+ specification usually makes the implementation fairly straightforward, especially in languages/environments with message-passing concurrency.

You can use TLAPS [4] to write proofs-by-induction similar to yours for TLA+ specifications, but IMO the real power is in model-checking them with TLC, which gives you a lot of the advantages of formal methods with much less proof work.

[1] https://github.com/seL4/l4v

[2] https://www.idris-lang.org/

[3] https://idris.readthedocs.io/en/latest/tutorial/typesfuns.ht...

[4] https://tla.msr-inria.inria.fr/tlaps/content/Home.html


At my school, our Software Eng program is heavily focussed on formal verification. In my experience, the practical applications we learn are still too far detached from industry to be applicable.

Everything starts with two logic courses: EECS/MATH 1028, which covers basics like induction and the pigeonhole principle, and MATH 1090, which covers first principles and axioms of logic.

Then, building on the logic courses that we did by hand, we have a system verification course (EECS 3342) with SMT-backed tools like Rodin, which can solve proofs on their own up to the point where we need to manually input them. Beyond this, we also cover proving preconditions and postconditions through design by contract in EECS 3311 - Software Design (which is usually taught in Eiffel, but is finally being taught in Java after a year of lobbying the faculty).

Then our final courses are EECS 4312 and 4315 for requirements engineering and mission critical systems respectively. In 4312 we learn TLA+ and 4315 is focussed on tree and path logic for states. As well, we have EECS 4313 advanced software testing which is primarily about writing bug reports, creating tests (JUnit) and touches on distributed systems tests a bit.

The point is that we learn the "practical" material (quite impractical, since our profs are not in industry) only once we have had a lot of exposure to proving things by hand. But even once we have learned everything, they fail to connect the material to realistic SDLCs and how to apply the tools in an agile methodology.


What you will realize years after you leave college is that in many cases, the way you do things is designed to get you to integrate a concept with your understanding of the domain. For instance, this is why many courses will ask you to write code to do something while disallowing the standard library. I see a lot of students complain "There's a library function that does exactly this but for some reason I can't use it!" But not using it is exactly the point.

Viewing college as a trade school is a mistake many of us make. No, you're almost certainly never going to be manually proving things by induction in the real world. But that doesn't make the exercise valueless.


IMHO your CS212 work is didactic, not applied


Microsoft built Azure CosmosDB based on TLA+: https://github.com/Azure/azure-cosmos-tla



