Absolutely, and thus types and specification via TLA+ are not in opposition. In fact they can work in harmony.
> someone who's familiar with programming languages may view specification languages as groundbreaking
That's not what I find groundbreaking. What I find groundbreaking is that TLA+ is (supposedly) so composable despite being so simple. I remain skeptical. Indeed, even the fundamental TLA+ notion of "priming" variables seems that it would immediately be an impediment to composability.
> There's a result (mentioned in my Curry On talk/post) that simply put states that correctness doesn't compose.
Yes, I recall that. I believe the theoretical result, of course, but I am yet to be convinced of its practical relevence. Correctness does compose in my experience as a mathematician and as a software developer. In fact, I can't see any way that correctness can be achieved except through compositionality.
> A TM with 6 control states can have 10^35,000 states or so
Please explain your terminology. Surely it can have infinitely many states? Indeed a Turing machine with 1 control state can have infinitely many states.
> states in sub components simply cannot be hidden
I find this a very puzzling statement. States in subcomponents can be easily hidden in every common-or-garden programming language.
> You write your assumptions, and hope to get verification for your claims.
Sure, but as we've observed, it doesn't actually tell you anything about the final implementation you come to write.
> The problem here may still be that you're thinking of composability in terms of functional languages instead of composability of abstract programs.
I'm thinking of composability in terms of building things out of smaller things and being able to deduce properties of the things based on the properties of the smaller things. That notion precisely captures "composability" (even "compositionality", if we want to get formal) as far as I am concerned. What does it mean to you?
> The book won't tell you that what you know as parameterization is just another form of many of nondeterminism
OK, but you could. What exactly does that mean?
Sure, it's just that TLA+ can open your mind to the idea that you may not need types for things you think you do -- even in a programming language. It can also show you how you can do mutation and still remain pure without using monads or any higher-order function composition. This idea is used in synchronous programming languages, too, so this is something that definitely applies to programming languages.
> Indeed, even the fundamental TLA+ notion of "priming" variables seems that it would immediately be an impediment to composability.
Why? It's a perfectly pure operator and it distributes. Op(x, y)' = Op(x', y'), assuming Op doesn't have any other variables in it, i.e., it's a "level 0" operator.
> Correctness does compose in my experience as a mathematician
I can show you a trivial example of why that's not true: the gravity equation. Everything composes, yet two instances are already chaotic. You can't extrapolate almost anything you know about the gravity equation to predict the bodies' positions over two instances. For discrete processes, such chaotic behavior is small potatoes. We don't even use the word "chaotic" because we're way beyond that. The problem is this: if everything is simple, you can compose correctness. But when everything is simple, composition isn't really required. This is what I'd like to call the "FP extrapolation problem". FP works beautifully for simple examples, i.e., where new approaches don't really matter. People then extrapolate from toy examples and assume it can scale (in spite of the theory that says otherwise), yet no one has yet shown that it does in practice. By that I don't mean that you can't write large programs in FP. Of course you can. But the advantages that are very clear in the simple cases (where they don't matter), start dissipating and you just don't see a drastic effect at scale (maybe the effect is even negative; nobody knows).
> Please explain your terminology.
Sorry, I meant a TM that terminates eventually. I think that's either the current record or the lower bound. Don't remember.
> I find this a very puzzling statement. States in subcomponents can be easily hidden in every common-or-garden programming language.
They're hidden from the programmer's view, not from their effect on correctness. That is precisely the result that says that verification isn't fixed-parameter tractable. Namely, if the cost of verification is O(|S|) when S is your number of states, you can't say, well, each of my components has n states and I verified them individually, and now I want to compose k components, and so the cost of verification would be O(small function of k). It will be O(n^k) or something like that.
> Sure, but as we've observed, it doesn't actually tell you anything about the final implementation you come to write.
Right, but nothing else can, either, unless you do end-to-end verification, which is far beyond what 99.9% of projects can afford to do (or need to). Here is where you rely on software engineering practices, tests and whatever tools you have for local properties that help prevent easy bugs (like types and static analysis). There is nothing that can help you prove interesting logical properties at this level, at anything close to an acceptable cost.
> What does it mean to you?
Oh, the same. It just doesn't look the same way in TLA+, where you compose things as A ∧ B rather than A(B) (they don't mean the same thing).
> What exactly does that mean?
Alright, let me try: suppose you have a higher-order function, say `filter`, that takes a function of type `a -> bool` (where a is a type parameter), and filters a list. Now suppose you want to prove that, say, the length of the resulting list is no bigger than the input list. In a functional language you say, I have a function parameter `p` that could be anything of the appropriate type `a -> bool`. In TLA+ you don't need this because you can say that p is not a parameter of a certain type, but a concrete, specific program that nondeterministically returns either TRUE or FALSE. You prove your theorem for that specific p, but it really covers the same as all predicates p of that type.
The cool thing, then, is this: you can ask, what is the relationship between that nondeterministic program p, and a deterministic program p1, that, say, returns true iff its argument is an even number? Well, it's a refinement relation (any behavior of p1 is a behavior of p, but not vice-versa), and in TLA+ this is expressed simply as p1 => p, where => isn't some special notation but the same-old familiar implication operator from propositional calculus, because in TLA+, a program is a binary boolean predicate that evaluates to TRUE iff there's a next-state relation between its two input states, the unprimed and primed.
The key to that is understanding the model (in the logical sense) of a TLA formula (i.e. a program). The model is a set of all possible behaviors, where a behavior is a sequence of states (similar, but not exactly a trace in the process-calculi sense). So a program is a set of behaviors. A program property (i.e., the program never sends the character "x" over the network), or a "type", is the set of programs that satisfy it (I mean by the simple isomorphism between predicates and sets). Here is the crucial bit: as the TLA logic only allows to specify properties that must be true for all behaviors or none (hence the problem with probabilistic algorithms), there is no need to keep the behavior sets for each program in the "set of programs" separated, and instead of a set of programs -- i.e. a set of sets of behaviors -- a program property can simply be a set of behaviors. But a set of behaviors is a program! Therefore a program property, or a "type", in TLA is the same thing as a program. Program properties and actual algorithm specifications become a hierarchy of nondeterministic programs that refine or abstract one another. So the program that does parallel merge-sort is a refinement of a program that does some sort of merge sort, and that, in turn is a refinement of a program that "sorts somehow". And all those relationships are simple propositional calculus connectives, and you can reason about what kind of low-level steps a low-level specification does for each abstract step in a higher level specification.
I'm very much hoping this will be the case!
> > Indeed, even the fundamental TLA+ notion of "priming" variables seems that it would immediately be an impediment to composability.
> Why? It's a perfectly pure operator and it distributes. Op(x, y)' = Op(x', y')
> assuming Op doesn't have any other variables in it, i.e., it's a "level 0" operator.
That's not an assumption I want to have to make! Since I don't understand this specific example perhaps my challenge here is unwarranted, but in general if I have to "assume" anything then that's an impediment to composability.
> In a functional language you say, I have a function parameter `p` that could be anything of the appropriate type `a -> bool`. In TLA+ you don't need this because you can say that p is not a parameter of a certain type, but a concrete, specific program that nondeterministically returns either TRUE or FALSE
OK, I think I see. You prove something using the specific, concrete p which has fewer properties than any other p, so therefore anything that holds for it must hold for the other p too. I don't quite see how this is different from proving it for "arbitrary p" but perhaps it will be come clear as I learn more.
> I mean by the simple isomorphism between predicates and sets
And this is why I get worried about TLA+ using "simple mathematics". This "simple isomorphism" simply doesn't exist, as I'm sure you know. You have to restrict yourself to saying "simple isomorphism between predicates on a set and subsets of that set", at which point I start to worry that any higher-order possibilities go out of the window.
> But a set of behaviors is a program! Therefore a program property, or a "type", in TLA is the same thing as a program.
Sure, but a possibly nondeterministic one that you can't actually implement (at least not directly).
Anyway, thanks for the introduction. When I get some time I will look into it further.
You don't need to make that assumption, though. TLA+ checks it for you. My point is that everything is pure and well-behaved. The very same issues of modality happens when you use monads. Only in TLA+ it's very simple to see when an operator is a constant (called level 0), a state function/predicate, i.e., a function of the current state (level 1), an action/step (level 2), or a full temporal formula (level 4). In a temporal formula, you can say something like, whenever my program sees an event x, it would eventually output a response y. In a way, TLA+ has 4 syntactic types, and the relationship between them is checked instantly at parsing. E.g. you can't take an action or a temporal formula and prime it (TLA+ checks "inside" your formula and knows what level it is). Priming a constant keeps it a constant; priming a state function turns it into an action (a relationship between two states).
> This "simple isomorphism" simply doesn't exist
No, I meant something very simple. Any boolean predicate, e.g. "this number is even", is the same as the set of all even numbers. It's a characteristic function. It always exists.
> Sure, but a possibly nondeterministic one that you can't actually implement (at least not directly).
Right, but nat -> bool, is the same as saying "all programs that take a natural and return a boolean", which is the same as the ND program that for any natural nondeterministically returns true or false. It's describing the same set or type, but in a different way, which makes simple logical implication the same as subtyping or type inhabitance. B is a subtype of A, is, in TLA+, B => A. a is an instance of A, is in TLA, a => A.
BTW, it's not true that TLA+ can't tell you that your implementation is correct. It can, it's just hard (but probably no harder than in Coq). What you do is, you write a high-level model and prove it. Then you implement it in a programming language, and you write a compiler from that language to TLA+. Then you check that the TLA+ version of your implementation is a refinement of your high-level model. Academic teams have done this for C and Java. The problem is that implementations have many details, so it doesn't scale beyond small programs.
> When I get some time I will look into it further.
I'd say that the effort to learn TLA+ is comparable to learning a very simple programming language with a couple of deep concepts. Scheme maybe. It takes two weeks to be able to write large, complex specifications (though probably longer for the theory to become intuitive). The entire documentation for all the "standard libraries" fits on a single page. The reference for all the language, with full explanations, fits on 4 pages. This is all of TLA+, including the standard library, minus the proof language, including all syntax sugar: http://research.microsoft.com/en-us/um/people/lamport/tla/su... (e.g. all the special syntax for sequences and records is just sugar for functions, so r.f is sugar for as r["f"]). Personally, I'm interested in theory for intellectual reasons, but the single reason I picked TLA+ is that Amazon technical report, where they showed that 1. learning it is easy for engineers to do on their own, and 2. it scales well for very large systems.