Hacker News new | past | comments | ask | show | jobs | submit login
Building a “Simple” Distributed System – Formal Verification (jack-vanlightly.com)
151 points by pron 86 days ago | hide | past | web | favorite | 9 comments

If you're interested in learning TLA+, the best resources are:

(1) Specifying Systems by Leslie Lamport (https://lamport.azurewebsites.net/tla/book.html)

This is the original TLA+ textbook (I myself learned from it) and is very thorough, but sometimes fails to differentiate between specs which are model-checkable and specs which are not. Since model-checking has become recognized as the main source of value from TLA+, this might frustrate readers who think "woah, TLA+ can do THAT!?!?" only to be disappointed. I still like it, though. You can get a free digital copy.

(2) The TLA+ Hyperbook by Leslie Lamport (https://lamport.azurewebsites.net/tla/hyperbook.html)

Unfinished and unlikely to ever be finished (Leslie has stopped working on it), serves as a good companion book to Specifying Systems. I believe this is the only resource for learning the formal proofs capability of TLA+, which I myself have not yet used.

(3) The TLA+ video course by Leslie Lamport (http://lamport.azurewebsites.net/video/videos.html)

Very goofy, very fun. When Leslie teaches courses to engineers in person, he uses these videos. They're effective.

(4) Learn TLA+ by Hillel Wayne (https://learntla.com/introduction/)

A free online resource for learning TLA+.

(5) Practical TLA+ by Hillel Wayne (https://www.apress.com/us/book/9781484238288)

I own this book, and it is a great introduction to PlusCal (a pseudocode-type language that transpiles to TLA+) and TLA+ in general. Highly recommend! Along with the video course, probably the ideal starting point these days.

If you're capable of understanding TLA+ proper, is there much value in starting with PlusCal, in your opinion?

Is PlusCal more a gentle gateway or properly useful in real-world applications?

I lean towards starting with Lamport's book over Wayne's, because the latter seems very PlusCal-centric (and the layout and typography don't appeal to me), but that may be a mistake.

I use PlusCal and TLA+ for different things. TLA+ shines in almost every area, with one huge hole: sequential algorithms (or communicating sequential processes) are a PAIN to specify - you have to keep track of which "instruction" you're executing in each process, and it's a ton of boilerplate. TLA+ works best when the interacting state machines you're specifying are individually small but emergently generate extremely complicated behavior. PlusCal patches this hole perfectly by letting you write the logic in pseudocode and transpiling it to TLA+, taking care of all the boilerplate.

I think Hillel has said people find PlusCal easier to understand off the bat, but I don't recall exactly where he said that. In the Amazon TLA+ whitepaper (https://lamport.azurewebsites.net/tla/formal-methods-amazon....) different engineers found TLA+ or PlusCal easier to learn.

> different engineers found TLA+ or PlusCal easier to learn

That continues to be my experience. It varies both by engineer, and (as you point out) domain. Also, as I said in a sibling comment, it’s also important to think about what people want to read, because models are extremely useful formal documentation of protocols and interactions.

Source: I’m one of the authors of that Amazon paper.

One advantage of PlusCal is that it’s super easy and convenient to experiment with which operations need to be atomic and isolated than it is in raw TLA+. In pluscal, you can just move labels around, and try out “if the system could crash between these two operations, would it still be correct?” and similar questions. You obviously can do that in TLA+, but it’s more work.

Also, for me, TLA+ and PlusCal have huge value as documentation and communication tools. It’s not about whether I would prefer to read TLA+ or PlusCal, it’s about which my intended audience would prefer. Different situations lead to different choices.

Also, TLA+’s syntax is (in my entirely subjective opinion) needlessly weird, and makes the semantics somewhat less approachable.

> Also, TLA+’s syntax is (in my entirely subjective opinion) needlessly weird, and makes the semantics somewhat less approachable.

For me it's the opposite. I find programming language (and PlusCal) syntax weird, though not needlessly so, as programming languages are necessarily extremely complex. TLA+ is the cleanest, simplest, most natural syntax I've yet seen for describing computations (and it is not surprising that research into "natural" programming for non-programmers in languages like Eve has ended up with something similar), and it's hard for me to imagine any other syntax with clearer semantics. But obviously this is a matter of personal taste.

I don't particularly like PlusCal (and I find TLA+ easier to understand), but I've used it for low-level concurrent algorithms.

This is a question I lose sleep over. I keep going back and forth on it. Here's my current position:

_In my experience_, PlusCal is easier for the average engineer to start with, because it lets me decouple learning about behaviors, learning temporal logic, and learning basic predicate logic. It's also my experience that for many people, it's easier and faster to learn PlusCal and then TLA+ than to start with TLA+.

But PlusCal adds syntax and has its own quirks and logic. So if you are fine with TLA+, it's unnecessary overhead and can distract from the core ideas. Here's a quick test:

   Op(x, y) ==
      /\ x > y
      /\ \E s \in S:
          /\ x' = F(s)
      /\ UNCHANGED y
If that makes sense to you, start with TLA+ proper.

I think this is ultimately an empirical question about "the average beginner." The main issue I have with PlusCal (aside from being more complicated in that it has complicated constructs like subroutines and a stack) is that it could confuse people into thinking that specifying (in TLA+/PlusCal) resembles programming because the PlusCal syntax resembles programming. This can lead to confusion down the line. This is why I think it's a question of would you rather be confused now or later? But if someone finds it easier to start with PlusCal, I see no issue with that.

I'll give it a try:

We choose some element s out of the set S, but only if x > y. That element gets fed into a function whose output is the value of x in the next step. y stays unchanged.

If x <= y the whole clause doesn't apply.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact