Hacker News new | past | comments | ask | show | jobs | submit login
Introduction to TLA+ Model Checking in the Command Line (medium.com)
176 points by mbellotti 3 months ago | hide | past | web | favorite | 44 comments



I've considered using the command-line when I started using TLA+, but the manual discouraged it and I use it only occasionally so I didn't want to invest too much time on it.

I'd love to see a TLA+ package for vscode [1]. It seems there is more and more TLA+ adoption, it is surprising there aren't better environments to use it. The TLA Toolbox is awful (well, to be fair to the people that developed it, it does the job and maybe it's ok for some category of users).

EDIT: [1] there's one actually, https://github.com/TeamTilapia/vscode-tilapia


I like the Toolbox.

- It shows you the ASCII and pretty printed spec side-by-side.

- It plots TLC data as it runs and presents live coverage data in a nice way. Soon it will also show TLC performance profiles as color maps. (https://twitter.com/lemmster/status/1082787089140183040)

- It makes running TLC in the cloud (single-node or distributed) very easy. (https://tla.msr-inria.inria.fr/tlatoolbox/doc/cloudtlc/)

- I don't know of any other way of using TLAPS. (https://tla.msr-inria.inria.fr/tlaps/content/Documentation/T...)

Then again, I'm really not a fan of either vim or emacs, despite having used the latter as my main editor for some years. I think this may be because I value tools that present more information in a convenient way that allows me to inspect and analyze my text over tools that allow editing more quickly (when I write code, I spend 80% of my time in Sourcetrail and only use the editor to punch-in characters), but I don't know.


I think it's mostly just sticker shock. Downloading another thing, and it's a bit spoiling these days when most toolsets have some amount of cross-editor support. You can do similar things in various language modes in VS Code, after all (plotting/fancy rendering/cloud APIs).

There don't seem to be many developers working on TLA+ outside of Microsoft Research though, so I think their work is probably better spent elsewhere than just recreating the Toolbox in other editors. It's not that bad. Maybe in a year or two if TLA+ continues growing in popularity someone will help hash out an alternative.

In a prior project one of our developers checked their models in the Toolbox originally and then we just ran them in CI from the command line, which also worked fine and made it easier if people wanted to run/edit them "by hand" in vim or something.


Thanks for the link. I always prefer emacs above anything, but VS Code is not bad at all.


It doesn't seem to be maintained, and it's not packaged either. Not sure it's worth installing.


I am glad to see that I am not the only person who hates the TLA Toolbox. TLA+ is brilliant, but I always feel like an outsider using Vim to edit my specs.


I rarely use the Toolbox (especially since I couldn't find a pre-built Toolbox for ARM processors). I'm happy with vim or emacs and I usually have a script (that also does some massaging on the PlusCal-generated code): https://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/ker...


Frama-C is similar; every one says to use the gui, but I find it horrible and just use the command line while feeling like a cave man.


If it helps at all, I do that a lot too, and have started working on extending my vim plugin[1] to also do configuration and run TLC.

Honestly the big blocker here is learning how vimscript better.

[1] https://github.com/hwayne/tla.vim/


Out of curiosity (and I'm speaking largely out of ignorance on this), have you considered making this plugin NeoVim specific and doing it in another language you are more comfortable with?

EDIT: Also, just realize that you are the person who wrote the Practical TLA+ book...I loved it! Thanks for writing it!


I do it in emacs, have been meaning to write a mode for a while, but.. time.


The TLA+ Video Course by Leslie Lamport: https://news.ycombinator.com/item?id=16956778


Another question is whether translation from TLA to, say Haskell or Scala is on the roadmap? That would be a killer feature but I've never heard this being something that is planned for implementation.


I’ve actually found translation between tla plus and Haskell or agda pretty straight forward using what I call the coinductive approach (which could also be called purely functional OOP done right), I have some examples I shared on reddit two years ago https://www.reddit.com/r/haskell/comments/4aju8f/simple_exam...


It would be impossible. The whole point of TLA is to test & document the high-level design. You can only generate complete implementation for toy problems (and that is a feature)


I wouldn't say that's "the whole point." You can specify at any level you choose, even electronic circuits (in fact, TLA+ has been successfully used in the design of microprocessors). It's just that the most effective general-purpose use of formal methods, given our current technology, is in a high-level design (where high-level is relative to whatever your domain is), as formal methods are constrained by a certain amount of detail, and therefore it is best to use that budget on the actual tricky/subtle parts of the problem, rather than waste it on stuff that's fairly simple (translating the spec into code, which is simple but contains a lot of "uninteresting" detail).


I would still say that, though. For complex/ non-"toy" systems, the required level of detail increases so much, that if you specify it all in something like TLA+, even ignoring the performance aspects, it probably becomes too hard to understand it all and you lose most of the advantage that TLA+ is supposed to give you. It's not just about the "formal verification" part - it's about the whole cycle of model -> verify -> learn/improve understanding -> improve model.

And the fact that it was used to design microprocessors doesn't say that it can or should be used to specify all the details. I bet nobody ever attempted to do the layout from the TLA+ model; it's just like in software, TLA+ was used to design geo-distributed databases... but not to write any piece of the code; just to check that the general design is right.


> it probably becomes too hard to understand it all and you lose most of the advantage that TLA+ is supposed to give you.

Not necessarily. There are tools that compile C or Java to TLA+[1]. But, like all other code-level reasoning, when using them you are limited to very small programs.

> It's not just about the "formal verification" part - it's about the whole cycle of model -> verify -> learn/improve understanding -> improve model.

I agree, but again, that's because TLA+ is intended for industry use, not as a research tool, and as we simply don't have the technology to verify most real programs all the way down to code, TLA+ places an emphasis on the best ROI you can get from formal methods, in practice, given current technology.

[1]: https://link.springer.com/chapter/10.1007%2F978-3-319-17581-...


Have any cloud providers or open source projects started publishing their proofs of correctness yet?

Is it cheaper to hire someone who knows TLA+ or to find a consultant?


I haven't seen Amazon publish any, but various teams in AWS use TLA+ for critical parts of their infrastructure.

A service I worked for there used it for an "absolutely must not make a mistake" component. Even with the initial learning curve of TLA+, two engineers were able to define the application logic, identify bugs and come to a final model in comparatively short order. Once they'd got that final model they found it relatively easy to translate that in to the final service code.

By the time they were done, they realised they'd actually spent less time in total, TLA & coding, than they'd anticipated just building the component from scratch without TLA+.


The architect of AWS, James Hamilton, is a huge fan of formal verification and has spoken about it at length.

https://perspectives.mvdirona.com/2014/07/challenges-in-desi...

http://lamport.azurewebsites.net/tla/amazon-excerpt.html


Indeed. As I recall, he was really happy to hear my team was using it for the component.


My understanding is that TLA+ is primarily used for model checking not correctness proofs. I've dug around for TLA+ proofs and found only simple examples.

That being said, model checking alone is really powerful and can uncover serious design flaws.


You can find some big deductive proofs here: https://members.loria.fr/SMerz/papers.html

But you're right that deductive proofs are hardly ever worth the effort when you have a model checker. If you can get 99.99999% confidence for essentially free, most people wouldn't pay extra months of work to get to 99.999999% confidence.


Cosmosdb is Microsoft multi model, multi master, geo database with five consistency levels for performance.

https://github.com/Azure/azure-cosmos-tla



SPIN has long been used for similar stuff in academia and industry. They have published a lot of their specs and results.

http://spinroot.com/spin/whatispin.html

http://www.imm.dtu.dk/~albl/promela.html

There were many projects using Pi Calculus, too. One of my project ideas is something that converts TLA+ to SPIN and Pi Calculus to use their tools and works. Or just otherwise integrates them.


Which model checkers use Pi Calculus? I've thought about learning Pi Calculus but an much less interested if there's no checker.


I don't think TLA+ is far too expressive to translate to Promela. There are, however, tools that translate subsets of TLA+ to ProB: https://www3.hhu.de/stups/prob/index.php/TLA


Yup, here's some good ones! https://github.com/elastic/elasticsearch-formal-models

Along with someone fixing a bug well before someone found the bug in the wild: https://github.com/elastic/elasticsearch/issues/31976#issuec...


1. You can find plenty of examples, some for production systems (Elasticsearch, some algorithms in the Linux kernel) on the TLA+ subreddit: https://www.reddit.com/r/tlaplus/

2. Learning TLA+ from available tutorials until you can write serious specs of real systems takes 2-4 weeks (part time); achieving the same competence level in a full-time workshop takes 3-5 days.


For those who care to provide constructive criticism, the TLA+ Toolbox issue tracker is just a click away (https://github.com/tlaplus/tlaplus/issues) and - since today - even has a UX label.


Sorry, I still am wobbly on the matter of how TLA+ is different from QuickCheck and it's clones?

The idea that "you don't test a use-case, you test many inputs that may work differently" is solved by fuzzing/test factories.

So TLA+ without the fuzzing, leaves what, temporal stuff? Did I miss anything?


TLA+ is a language that allows you to specify your system in any level of detail. As such, it is more expressive than any programming language is or could ever be.

TLA+ does indeed have a model checker, but model checkers don't "try many inputs." They check the entire state-space, covering potentially infinitely many executions.


> TLA+ is a language that allows you to specify your system in any level of detail. As such, it is more expressive than any programming language is or could ever be.

I don't see that this necessarily follows? In most programming languages it's normal to have high level functions that describe your program in broad strokes and lower-level functions that describe your program in more detail. If you're talking about the ability to check aspects of program correctness without necessarily implementing an executable program, several languages support a technique of using "undefined" or similar to typecheck a "program" that can't actually run, or maybe can't even be built.


It does follow from the need for arbitrary nondeterminism (basically, the unconstrained use of existential quantifiers) for arbitrary-level specification, and the absolute inability to compile/interpret such uses in any feasibly efficient manner. In TLA+ you could describe an algorithm that "somehow" performs some operation, without specifying how. E.g. in TLA+ you can say "this component somehow sorts a list efficiently," if the particular sorting algorithm is not central to what you want to specify.

The requirement that descriptions of computation be mechanically translated into something you can run is an extremely constraining one (both theoretically and in practice -- most high-level descriptions of computational systems cannot be mechanically turned into executables). If you remove it, you get something that is not a programming language, but that is strictly more expressive. So, if all specifications of computations (programs) in your language can be interpreted/compiled into an executable, it necessarily follows that it is less expressive than TLA+.

While programming languages could add constructs for such specifications (contracts/dependent types) they cannot compile them into an executable. This means that you get a language that really contains two separate languages (and they must be separate for compilation to work), one for nondeterministic specification and one for deterministic specification, with only the latter used to describe actual computation, and resulting in a language that is both an exceptionally complex specification language and an exceptionally complex programming language -- basically the worst of both worlds. That such language have only ever been used for software that's substantially smaller and simpler than jQuery is the empirical manifestation of that (not that such use cases aren't important, but that you can only effectively fully specify tiny programs does imply that their expressivity is limited).


> If you remove it, you get something that is not a programming language, but that is strictly more expressive. So, if all specifications of computations (programs) in your language can be interpreted/compiled into an executable, it necessarily follows that it is less expressive than TLA+.

> While programming languages could add constructs for such specifications (contracts/dependent types) they cannot compile them into an executable. This means that you get a language that really contains two separate languages (and they must be separate for compilation to work), one for nondeterministic specification and one for deterministic specification, with only the latter used to describe actual computation, and resulting in a language that is both an exceptionally complex specification language and an exceptionally complex programming language -- basically the worst of both worlds.

I'm not convinced. Using "phantom types" to express additional specifications that don't get compiled is a fairly common/widespread technique (indeed any use of generics in Java could be considered an example). There is a lot of overlap between the things you want to do with specifications and the things you want to do with executable code (composition, structuring, reuse), and in a lot of simple cases (plumbing rather than logic - but I'd estimate that's the majority of business code) one can be inferred directly from the other. So to my mind there's a lot of value to be had from a language that integrates specification and executable and is able to express both.


I'm not talking about phantom types/ghost variables, and I don't mean "not compiled" but cannot possibly be compiled or not compileable. You can easily write non-computable specifications in TLA+, and it's useful! (e.g., to state that a certain program decides halting in certain circumstances). You can also very easily define things like being in NP with angelic nondeterminism. Many interesting and useful things you can say about a computation cannot themselves be compiled to computations.

> There is a lot of overlap between the things you want to do with specifications and the things you want to do with executable code

Sure. Programs are a special case of specifications. TLA+ easily expresses both. The tradeoff is this: is every specification of an algorithm compilable or not? While TLA+ can easily specify algorithms at any level of determinism, not ever specification of an algorithm is compilable -- i.e., some may be too nondeterministic. For example, even Quicksort is not-compilable. Quicksort implementations that can run (i.e., those expressed as programs) do not actually specify Quicksort, but particular refinements of it (see https://pron.github.io/posts/tlaplus_part3).

> So to my mind there's a lot of value to be had from a language that integrates specification and executable and is able to express both.

As I said above, TLA+ easily expresses both (in fact, there are research tools that compile C and Java bytecode to TLA+). But to guarantee compilability you must demarcate the executable part (a language like, say, Agda, does this with a formal separation between types and "terms"; Java+JML does this with a formal separation between code and specification). So as to languages that are made of two sub-languages -- one that affords arbitrary nondeterminism and one that can always be compiled -- sure, that has a value, but that value comes at an extremely steep price. Sometimes that price may be worth it; sometimes not. Not to mention that even those languages rarely achieve TLA+'s strengths, because very often TLA+ is used not to specify a particular executable, but a whole system (which includes multiple executables and their environment). In those very common cases the ability to compile becomes pure cost.

What TLA+ shows is the great value in giving up the ability to compile every specificaton -- an extremely expressive specification language that can be easily learned and used. I've written someplace else (https://www.reddit.com/r/tlaplus/comments/abi3oz/using_tla_t...) that TLA+ might well be the first rich formal logic (as opposed to programming languages that have no quantification) in the history of formal logic that is actually used by practitioners whose job isn't to use formal logic all day (or have been trained in formal logic for a very long time) -- in effect, the first to achieve formal logic's stated goal as a simple tool for practitioners. A goal that has been abandoned by logicians very early on.

In TLA+, you could write a realizable implementation of Quicksort, but also specify Quicksort itself and show that any implementation works. You could do the same in, say, Agda (or Java+JML), but at a much higher cost. TLA+'s power is in being extremely expressive, and at the same time extremely simple (never mind Agda - it is simpler than Python).


TLA+ lets you work at much different scales. I have one spec I'm working on where the "domain model" is multiple nodes trying to reach consensus, where some fraction of the nodes are adversarial and intentionally trying to subvert the others. With that you can ask questions like "what is the maximum fraction that can be adversarial and still guarantee these specific global system properties."

Another one involved modeling several microservices, each of which had its own team and codebase. That kind of scale is really hard to do at the lines-of-code level of verification.


Normal testing feeds selected inputs to the software and succeeds for the corresponding output.

QuickCheck feeds random values to the software and looks for "good" results.

TLA+ takes a model of the system (not the software) and explores the entire state space and looks for anything that violates given logical predicates. It is a different later of abstraction.


TLA is about testing the design, not the implementation. Read the TLA homepage, I promise it's fun.


Thank you, hn and mbellotti. I was getting ready to learn tla+ but IDE seemed like impenetrable wall to me.


You still might want to use the IDE at first, mainly because resources like Lamport's video lectures use it.


I've watched TLA+ creators videos and was impressed positively. It is lovely to see such logic in action with powerful results.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: