
TLA+ model checking made symbolic - feross
https://blog.acolyer.org/2019/11/29/tla-model-checking-made-symbolic/
======
QuinnWilton
Is anybody using TLA+ in production outside of academia? What for? I've always
wanted to, but I'm not familiar enough with it to always recognize when it'd
be the right tool to reach for.

I primarily code in Elixir and Erlang, and have used property testing to
perform some level of model checking when working with state machines, and I
suspect TLA+ could be helpful for formalizing that a little bit, but it isn't
a theory I've ever put into practice.

~~~
im_down_w_otp
We use TLA+. There's nothing particularly magical about TLA+. There are a
variety of modeling languages and model checking toolkits out there. I've yet
to encounter someone in industries that have commonly adopted model definition
& checking who uses TLA+. Its popularity seems higher among folks in newer
SaaS/IT arenas than elsewhere.

More often I run into folks using SCADE's model checker. Tied for a distant
2nd, I'd say I've run into more folks who work with NuSMV, CADP, and mCRL2
than TLA+. This could also be mostly sample bias, and it's worth noting that
the folks who use _any_ of these things are in the very extreme minority of
architects & developers that I speak to in automotive, med. tech., industrial
automation, etc.

Though, like I said, we use TLA+. Not for any better reason than that it was
a sufficient tool among the variety of comparable tools for what we needed
to do, but I expect that if we had to go back and do it all over again we'd
have selected something else. The specification language has some annoying
warts, the checking performance isn't comparatively very good, and the
provided IDE works in almost exactly the opposite way any of us wish it would
work. To the point that one of our engineers figured out how to wire it all up
through Emacs so we could stop monkeying around with TLA Toolbox for 90% of
the work. Of course, everything has problems, so TLA+ isn't notably
different here. We've just used it in anger more than other things.

For whatever it's worth, I also used to be an Erlang developer and used to
play around a bit with McErlang, which I think might be a dead project now.
However, there's an active project called Concuerror
([https://concuerror.com/](https://concuerror.com/)) which kind of covers the
same domain. I don't know how well it works or what its level of maturity is,
but it is very interesting in this one very important regard... it's model
checking against your actual program implementation. That's an enormous and
fantastic difference. Conformance of the concrete implementation to your
designed (and checked) model is an enormous problem, and tools which can close
or remove that gap can be enormously valuable. What I don't know is the level
of sophistication of Concuerror, nor if it is able to check for the kinds of
properties you care about in your systems.

~~~
pron
TLA+'s secret power is refinement; the entire logic is built around the
abstraction/refinement relation. One of the best ways to use TLA+ is to
specify your system in two or three levels of detail and check that they are,
indeed, refinements. If you just use TLA+ for model-checking and don't use
refinement mappings that are possible due to the language's elegance, you're
giving up on most of its power, and if that's the case, you might as well use
weaker specification languages like Promela and NuSMV. But I think it's
worthwhile to learn how to use refinements and enjoy the full power of TLA+
rather than just as a language for model-checking. Unlike Promela and NuSMV,
TLA+ is not designed as a frontend language for a model-checker. It's a
language primarily designed to allow you to think about and understand your
system in interesting ways, and it _also_ has a model checker. It's designed
to be read and discussed (hence the emphasis on pretty-printing) much more
than written and edited. If you're using it anyway, make the full use of it.

~~~
hwayne
I'm now _really_ curious as to how many people who use and value TLA+ have
actually used refinement. I've used it, you've used it, Lamport and (probs)
Merz and Markus have used it, most other people I've talked to haven't.
Probably time for a community poll or something.

~~~
pron
Technically it's impossible to use TLA+ without using refinement, as the
entire TLA logic is based on it. Checking Machine ⇒ □P is checking that
Machine refines □P. But, of course, the real power -- and what you and I both
meant -- comes from writing machine formulas on _both_ sides of the
implication sign.
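
To make "machine formulas on both sides" concrete, here is a toy sketch in
the style of the hour-clock example from Lamport's book. The module names and
operators are illustrative, not from the thread, and the two modules would
normally live in separate files:

```tla
---------------- MODULE Hour ----------------
EXTENDS Naturals
VARIABLE hr
Init == hr \in 1..12
Next == hr' = (hr % 12) + 1
Spec == Init /\ [][Next]_hr
=============================================

---------------- MODULE HourMin -------------
EXTENDS Naturals
VARIABLES hr, min
Init == hr \in 1..12 /\ min = 0
Next == IF min = 59
          THEN hr' = (hr % 12) + 1 /\ min' = 0
          ELSE hr' = hr /\ min' = min + 1
Spec == Init /\ [][Next]_<<hr, min>>

\* Instantiate the abstract spec; its hr is implicitly this module's hr
H == INSTANCE Hour
\* Refinement: a machine formula on each side of the implication
THEOREM Spec => H!Spec
=============================================
```

Asking TLC to check HourMin's Spec => H!Spec verifies that every behavior of
the detailed clock, projected onto hr, is a behavior of Hour; steps that
change only min are stuttering steps of the abstract spec.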

I don't think many of those who use TLA+ use refinement between two machine
formulas, but I think that's partly because they still consider TLA+ as a
frontend language for TLC, which is unfortunate, because if all you want (or,
rather, think you want) is a model-checker, there are probably better ones. In
fact, the little use of refinement is just a symptom of this bigger issue. I
don't think refinement is always useful, but I do think that everyone who uses
TLA+ should use its power for analysis rather than just mundane
model-checking. I admit, TLC is so enticing that once you use it, TLA+ can feel
secondary to it. Of course, I'd be happy if people used model-checkers more,
whether TLC, Spin, NuSMV, Simulink or others, but I'd be even happier if
people used TLA+ for its primary purpose, perhaps after being enticed by TLC:
thinking about system design. I guess the problem is exactly what Lamport
pointed out: programmers (including myself) don't like to think, especially
when we can use automation. To be honest, I don't think I would have come to
TLA+ if it weren't for TLC. So, in a sense, TLC both pulls people to TLA+
_and_ obscures its main purpose.

In the TLA+ workshop I imagine one day giving to my colleagues, I _start_ with
refinement. I would also start without TLC at all; just showing how TLA+ helps
you think -- its main purpose and advantage. When writing specifications, I
now always start with refinement, first writing how the system would behave
at a very high level, and only then refining to a specific algorithm.

(If anyone reads this without having read the article, and doesn't know what
TLC is, it's a model-checker for TLA+ that's included in the TLA+ Toolbox.)

~~~
im_down_w_otp
Why exactly is TLA+ better suited to thinking about system design compared
to, say, mCRL2, SCADE, or Simulink? Especially relative to the latter two, since
they provide a paved road to going from your system model to your system
implementation, which TLA+ makes no attempt to do.

~~~
pron
I don't know mCRL2 at all, but SCADE and Simulink are niche tools (excellent
at what they're for). TLA+ is completely general-purpose because it can
describe a system at _arbitrary_ levels of abstraction (and link the levels).
You could hypothetically describe the same system at the level of logic gates
or at the level of human interaction. Now, it is not true that TLA+ makes no
attempt to directly link to the implementation, but it makes no _particular_
attempt to do so. Its ability to do so stems directly from its power to
describe a system at truly arbitrary levels. You want to describe your system
at your code level so that you can verify your actual code? You can do that.
But here's where things get interesting. If you want to _verify_ a
specification using any sound tool -- be it a model-checker or deductive proof
-- the size of the specification becomes very limited. We can verify about
3000-6000 lines of a specification. It can be this number of lines of a low-
level specification, i.e. code, or of a high-level specification. So this
means that if we want to verify our actual implementation, our program must be
tiny. But TLA+ gives you the option to specify and verify, say, a 2000-line
high-level specification of a 2MLOC program to find problems at that level,
and that's what people in general-purpose software need most.

If your tool -- any tool -- links directly to the implementation, we know that
the verified implementation will be very limited in size. It is TLA+'s ability
to choose arbitrary levels of detail that gives it the power to help design
large systems. Such designs, however, cannot be formally (i.e. mechanically)
linked to the implementation. If they could, we'd know how to fully verify
large systems, but we don't.

~~~
im_down_w_otp
Everything you said about TLA+ here also applies to the other tools I
mentioned. You'd just work with them differently to get to similar ends, but
with the added benefit that you can actually use the output of your efforts
directly. I also don't understand how you arrived at SCADE & Simulink being
niche tools, but somehow TLA+ is general purpose?

There must be _at least_ 2 orders of magnitude more Simulink users doing
model-based design & development than there are people who use TLA+ for pretty
much anything. Simulink has limitations, but being comparatively niche
relative to something like TLA+ isn't one of them.

As an aside, not only does TLA+ not make an effort to provide a means of
mechanically linking one's model to one's program, but when the question was
asked of one of its progenitors how a developer might do this they were
literally scoffed at for the suggestion.

You're not describing anything to me that I'm not already aware of insofar as
TLA+ is concerned. Having also used several other similar or adjacent tools,
and being in constant contact with folks who use them for avionics systems,
medical devices, large scale industrial automation, etc., I'm still inclined
to say that TLA+ is _fine_. It's unnecessarily annoying in a bunch of
dimensions. I am skeptical that if TLA+ weren't descended from one of the
distributed systems godfathers, and weren't associated with AWS in the
software developer zeitgeist, there'd be as much intrigue & interest around
it.

Also, I begrudge having had to come to the defense of SCADE and Simulink. I
spend a non-trivial amount of my week essentially arguing against them in some
respect, but they deserve their due.

~~~
pron
> You'd just work with them differently to get to similar ends, but with the
> added benefit that you can actually use the output of your efforts directly.

With TLA+ you can also directly use the output of your efforts. But if you're
talking about compilation to some executable, that, unfortunately, requires a
certain level of detail above which you cannot rise due to computational
complexity concerns. So if you want that, you'd have to use TLA+ at the same
level as other tools that emit code.

> There must be at least 2 orders of magnitude more Simulink users doing
> model-based design & development than there are people who use TLA+ for
> pretty much anything. Simulink has limitations, but being comparatively
> niche relative to something like TLA+ isn't one of them.

Oh, of course! Those are excellent tools, but still niche (they target mostly
embedded and/or real-time). You cannot use them to design and verify a 2MLOC
system. I wasn't talking about popularity, but about target domains.

> As an aside, not only does TLA+ not make an effort to provide a means of
> mechanically linking one's model to one's program, but when the question was
> asked of one of its progenitors how a developer might do this they were
> literally scoffed at for the suggestion.

First of all, as I said, it does not make a _particular_ effort, because it's
not necessary. People have used TLA+ to verify actual C, Java and Go code
(that I know of). The problem is that once you do that you are bound by the
scaling limitation of _all_ tools that operate at that level. TLA+ is
relatively special in that it can rise beyond that limitation, and _that_ is
where most of its users find value. Using TLA+ at the code level is just not
using it to its full potential, and so misses the point. Unfortunately,
specifying and verifying above the code level can only be done by abandoning
what we call "end-to-end", i.e. verification all the way down to the code.

> Also, I begrudge having had to come to the defense of SCADE and Simulink. I
> spend a non-trivial amount of my week essentially arguing against them in
> some respects, but they deserve their due.

Of course, and I'm sorry if it sounded as if I was suggesting that TLA+ is
better than them at what they do. I, myself, used Esterel decades before first
learning of TLA+. TLA+ is better at _other_ things -- things that are more
applicable to a much wider application domain. My point is that due to
fundamental limitations -- we can only verify a specification of a certain
size -- there is an inherent tradeoff between scale, application domain, and
the ability to generate an executable.

------
adizere
[https://github.com/interchainio/verification/tree/develop/sp...](https://github.com/interchainio/verification/tree/develop/spec)

