Please suggest alternatives - that actually work.
> Most of the relevant meaning is in human interpretation of the input and output or the sensors and actuators.
The relation between the input and the output. This can only be stated formally.
> Components' meaning may be stated informally.
That leads to disaster very quickly.
> I am not aware that formal meaning is the only valid meaning.
Well, I am. Too much nonsense has been said as a result of not formalizing things.
> Second, you keep identifying the notion of computation with the notion of a program written by a person.
No. Even if a computation “arises in nature” (which is just your way of looking at things, actual physical phenomena is what you can measure, plain and simple), the only way to make sense of it is to write a program that reproduces the computation. This is how scientific theories work.
> Or TLA+, which makes types notoriously more difficult to use by comparison.
Some type systems are more difficult to use than others, and the type theory community is definitely guilty of not paying enough attention to usability, but I don't see how model checking has a higher power/cost ratio than sensibly designed type systems. Which tool satisfies the following criteria?
(0) Compositionality: you can understand a large system by studying its parts in isolation.
(1) Handles higher-order constructs (first-class functions, objects) gracefully.
(2) Stops you from saying nonsense as early as possible.
I rest my case.
AFAIK, the only languages that fully express the computational meaning formally with types are the dependently typed ones. AFAIK, there has been exactly one non-trivial real-world program written in such a language, and its author wasn't too pleased with the process (I'm referring to CompCert and Xavier Leroy). Every other program in every other languages has most of its meaning expressed informally, in the mind of the programmer. Unless you believe that no programs work, then nearly all programs that do express most of their meaning informally.
> Too much nonsense has been said as a result of not formalizing things.
Formalizing things doesn't imbue them with good sense. You can state plenty of nonsense formally. The only thing you're guaranteed is that it's consistent nonsense, and, as someone said on one of my favorite TV shows, "The sole virtue of the ineffectual is consistency".
> Which tool satisfies the following criteria? ...
TLA+, and far more elegantly and simply than any typed language you've seen.
I said “formally”, not “with types”. I firmly believe in using the right tool for the job, and some things are best handled with manual proofs. As helpful as types might be for computer-generated proofs (type inference), I'd rather carry out my manual proofs using good old-fashioned predicate logic. Some proof techniques still need to be backed up by type structure, though (e.g., induction over datatypes).
> Every other program in every other languages has most of its meaning expressed informally, in the mind of the programmer.
Unfortunately, I can't read minds. I can only read proofs that have been explicitly written down.
> Unless you believe that no programs work, then nearly all programs that do express most of their meaning informally.
My definition of “work” is “works in all cases”. No bugs. No unforeseen cases.
> TLA+, and far more elegantly and simply than any typed language you've seen.
Still eagerly awaiting your elegant and simple explanation of this! It sounds groundbreaking (genuinely).
and an older paper covering an interesting theoretical property has over 1000. Also, the author did get a Turing award -- not primarily for this, but it got a mention. So yeah, we're talking about a very well-known formalism in software verification, and people were impressed. I don't know about groundbreaking because the community is different. In software verification you get points for new proof methods and such; not necessarily for creating an elegant language. TLA+ is about finding the ideas that have been shown to work at scale, and creating a very elegant, minimalist language to express them, based on simple ordinary math as much as possible.
There is no shortage of good tutorials (especially considering how simple TLA+ so there's no need for 500 tutorials on monads), and the language was designed for and is used by engineers in industry working on large, complex systems. Companies that have started to use TLA+ (or use it extensively) on large systems -- Amazon, Oracle and Microsoft -- report that their engineers pick it up on their own from available materials in two weeks (that was my experience, too; it's easier than most programming languages). But those are tutorials and don't focus on the theory. Amazon published two excellent technical reports, one in CACM, about their experience, with pretty good data (that's what convinced me; I'm impressed with what's been shown to work in the field).
There is also no lack in more academic, theoretical material, but, as usual, that is mostly concerned technical details.
What's missing is a brief overview of the theory. I'd read at least the introduction to the original paper above, this 4-page historical note, and this.
Your point would be so much more effective if you could bring yourself to avoid having a dig at other technologies.
> I don't know about groundbreaking ... TLA+ is about finding the ideas that have been shown to work at scale, and creating a very elegant, minimalist language to express them
TLA+ is groundbreaking, if it satisfies your description. I have never found anything that hits the sweetspot of elegancy and simiplicity along with robustness and composability as well as Haskell does. I'm not ideological about this. That is simply my practical observation borne out of a couple of decades of experience writing software. If TLA+ is genuinely better than Haskell on those axes then I'll be delighted to switch without a second thought. I am hoping to be convinced!
> I'd read at least the introduction to the original paper above, this 4-page historical note, and this.
I read them ages ago when you first suggested them to me. The impression of TLA+ I got was far from compelling in the composability stakes. I suppose I could go Googling for info on how TLA+ is composable but if you know the answer yourself why not explain it or point me to a specific article which explains it?
Thing is, TLA+ does what specification languages are supposed to do. It is just an exceptionally small and elegant specification language. It may feel like groundbreaking to people who have never used specification languages. Haskell is not a specification language. It's a programming language, therefore it is much more complicated -- like all programming languages -- and much weaker -- like all programming languages. There are programming languages that seek to do what both specification languages and programming languages do. Examples include SPARK, Coq, Agda. Of them, the only successfully used language is SPARK, but it is limited as a programming language. Coq and Agda have never been used for large programs, so nobody knows if they work at all. Early results (like CompCert) are not promising. TLA+ is used on very large projects every day at Amazon, Oracle and Microsoft. No academics are involved. Same goes for other specification languages like Z.
Because for years the PL community has been trying -- without success -- to combine a specification and a programming language into one, someone who's familiar with programming languages may view specification languages as groundbreaking, but they've been doing their job for decades. They just don't try to do what the PL community is trying which is to also serve as programming languages. Once you give that up, the things you want and that are so hard to get become much easier.
Here's how things become easier:
1. There's a result (mentioned in my Curry On talk/post) that simply put states that correctness doesn't compose. In intuitive terms, if your program consists of 10,000 functions, each 3 lines long, composed in various ways, you can't prove global properties by doing simple proofs that are as easy as you'd expect for 3-line functions, and have the global result pop out. The idea of the proof is very simple: in computations, complexity builds up very fast. A TM with 6 control states can have 10^35,000 states or so, and states in sub components simply cannot be hidden. The way to tackle that is not to try to prove the global properties directly from the local properties, but to describe the entire program in various abstraction levels and then prove that each lower level refines the higher one. This is very hard to do in programming languages at scale. In a specification language, you have better control over level of abstractions and what details you want to leave out.
2. There's nothing external; I.e. there's no IO. An example of something that's very important to prove is, e.g., if you read a certain record from the DB many times, its "last_updated" field gorws monotonically. I don't know how you can try to do it in a programming language. In a specification language, you specify the entire system -- the DB, the network, the OS, the users -- and you do it in the abstraction level that precisely suits your needs. There is no information overload and no missing information. You write your assumptions, and hope to get verification for your claims.
3. There's no need for efficiency. You can freely use simple mathematical definitions whose "automatic" implementation is extremely inefficient, because the code never gets executed.
So, specification languages work, and have worked for a long time. What doesn't work is a specification language that's also a programming language. That would be truly groundbreaking, but I don't think anyone is even remotely close, or is even looking in the right direction.
> I suppose I could go Googling for info on how TLA+ is composable but if you know the answer yourself why not explain it or point me to a specific article which explains it?
See chapter 10 of this freely downloadable book http://research.microsoft.com/en-us/um/people/lamport/tla/bo...
The problem here may still be that you're thinking of composability in terms of functional languages instead of composability of abstract programs. The book does not translate the terms you're familiar with to TLA+ terms because it doesn't assume any existing predilections. For example, you're used to thinking of the "map" function as a higher-order algorithm. That doesn't mean that's what it is; that's just how it's described in Haskell. Another way of looking at it is that map is nondeterministic. The book won't tell you that what you know as parameterization is just another form of many of nondeterminism (concurrency, IO are other forms). If you have nondeterminism as a built-in construct, there's no need to work hard on higher-order things; they're just special cases. But, again, the book starts "from scratch" and doesn't assume that you're used to thinking in Haskell. So there's no lack of material; there just isn't a dictionary for people who are already used to thinking about things in some specific way. I hope to write such a dictionary one day.
Absolutely, and thus types and specification via TLA+ are not in opposition. In fact they can work in harmony.
> someone who's familiar with programming languages may view specification languages as groundbreaking
That's not what I find groundbreaking. What I find groundbreaking is that TLA+ is (supposedly) so composable despite being so simple. I remain skeptical. Indeed, even the fundamental TLA+ notion of "priming" variables seems that it would immediately be an impediment to composability.
> There's a result (mentioned in my Curry On talk/post) that simply put states that correctness doesn't compose.
Yes, I recall that. I believe the theoretical result, of course, but I am yet to be convinced of its practical relevence. Correctness does compose in my experience as a mathematician and as a software developer. In fact, I can't see any way that correctness can be achieved except through compositionality.
> A TM with 6 control states can have 10^35,000 states or so
Please explain your terminology. Surely it can have infinitely many states? Indeed a Turing machine with 1 control state can have infinitely many states.
> states in sub components simply cannot be hidden
I find this a very puzzling statement. States in subcomponents can be easily hidden in every common-or-garden programming language.
> You write your assumptions, and hope to get verification for your claims.
Sure, but as we've observed, it doesn't actually tell you anything about the final implementation you come to write.
> The problem here may still be that you're thinking of composability in terms of functional languages instead of composability of abstract programs.
I'm thinking of composability in terms of building things out of smaller things and being able to deduce properties of the things based on the properties of the smaller things. That notion precisely captures "composability" (even "compositionality", if we want to get formal) as far as I am concerned. What does it mean to you?
> The book won't tell you that what you know as parameterization is just another form of many of nondeterminism
OK, but you could. What exactly does that mean?
Sure, it's just that TLA+ can open your mind to the idea that you may not need types for things you think you do -- even in a programming language. It can also show you how you can do mutation and still remain pure without using monads or any higher-order function composition. This idea is used in synchronous programming languages, too, so this is something that definitely applies to programming languages.
> Indeed, even the fundamental TLA+ notion of "priming" variables seems that it would immediately be an impediment to composability.
Why? It's a perfectly pure operator and it distributes. Op(x, y)' = Op(x', y'), assuming Op doesn't have any other variables in it, i.e., it's a "level 0" operator.
> Correctness does compose in my experience as a mathematician
I can show you a trivial example of why that's not true: the gravity equation. Everything composes, yet two instances are already chaotic. You can't extrapolate almost anything you know about the gravity equation to predict the bodies' positions over two instances. For discrete processes, such chaotic behavior is small potatoes. We don't even use the word "chaotic" because we're way beyond that. The problem is this: if everything is simple, you can compose correctness. But when everything is simple, composition isn't really required. This is what I'd like to call the "FP extrapolation problem". FP works beautifully for simple examples, i.e., where new approaches don't really matter. People then extrapolate from toy examples and assume it can scale (in spite of the theory that says otherwise), yet no one has yet shown that it does in practice. By that I don't mean that you can't write large programs in FP. Of course you can. But the advantages that are very clear in the simple cases (where they don't matter), start dissipating and you just don't see a drastic effect at scale (maybe the effect is even negative; nobody knows).
> Please explain your terminology.
Sorry, I meant a TM that terminates eventually. I think that's either the current record or the lower bound. Don't remember.
> I find this a very puzzling statement. States in subcomponents can be easily hidden in every common-or-garden programming language.
They're hidden from the programmer's view, not from their effect on correctness. That is precisely the result that says that verification isn't fixed-parameter tractable. Namely, if the cost of verification is O(|S|) when S is your number of states, you can't say, well, each of my components has n states and I verified them individually, and now I want to compose k components, and so the cost of verification would be O(small function of k). It will be O(n^k) or something like that.
> Sure, but as we've observed, it doesn't actually tell you anything about the final implementation you come to write.
Right, but nothing else can, either, unless you do end-to-end verification, which is far beyond what 99.9% of projects can afford to do (or need to). Here is where you rely on software engineering practices, tests and whatever tools you have for local properties that help prevent easy bugs (like types and static analysis). There is nothing that can help you prove interesting logical properties at this level, at anything close to an acceptable cost.
> What does it mean to you?
Oh, the same. It just doesn't look the same way in TLA+, where you compose things as A ∧ B rather than A(B) (they don't mean the same thing).
> What exactly does that mean?
Alright, let me try: suppose you have a higher-order function, say `filter`, that takes a function of type `a -> bool` (where a is a type parameter), and filters a list. Now suppose you want to prove that, say, the length of the resulting list is no bigger than the input list. In a functional language you say, I have a function parameter `p` that could be anything of the appropriate type `a -> bool`. In TLA+ you don't need this because you can say that p is not a parameter of a certain type, but a concrete, specific program that nondeterministically returns either TRUE or FALSE. You prove your theorem for that specific p, but it really covers the same as all predicates p of that type.
The cool thing, then, is this: you can ask, what is the relationship between that nondeterministic program p, and a deterministic program p1, that, say, returns true iff its argument is an even number? Well, it's a refinement relation (any behavior of p1 is a behavior of p, but not vice-versa), and in TLA+ this is expressed simply as p1 => p, where => isn't some special notation but the same-old familiar implication operator from propositional calculus, because in TLA+, a program is a binary boolean predicate that evaluates to TRUE iff there's a next-state relation between its two input states, the unprimed and primed.
The key to that is understanding the model (in the logical sense) of a TLA formula (i.e. a program). The model is a set of all possible behaviors, where a behavior is a sequence of states (similar, but not exactly a trace in the process-calculi sense). So a program is a set of behaviors. A program property (i.e., the program never sends the character "x" over the network), or a "type", is the set of programs that satisfy it (I mean by the simple isomorphism between predicates and sets). Here is the crucial bit: as the TLA logic only allows to specify properties that must be true for all behaviors or none (hence the problem with probabilistic algorithms), there is no need to keep the behavior sets for each program in the "set of programs" separated, and instead of a set of programs -- i.e. a set of sets of behaviors -- a program property can simply be a set of behaviors. But a set of behaviors is a program! Therefore a program property, or a "type", in TLA is the same thing as a program. Program properties and actual algorithm specifications become a hierarchy of nondeterministic programs that refine or abstract one another. So the program that does parallel merge-sort is a refinement of a program that does some sort of merge sort, and that, in turn is a refinement of a program that "sorts somehow". And all those relationships are simple propositional calculus connectives, and you can reason about what kind of low-level steps a low-level specification does for each abstract step in a higher level specification.
I'm very much hoping this will be the case!
> > Indeed, even the fundamental TLA+ notion of "priming" variables seems that it would immediately be an impediment to composability.
> Why? It's a perfectly pure operator and it distributes. Op(x, y)' = Op(x', y')
> assuming Op doesn't have any other variables in it, i.e., it's a "level 0" operator.
That's not an assumption I want to have to make! Since I don't understand this specific example perhaps my challenge here is unwarranted, but in general if I have to "assume" anything then that's an impediment to composability.
> In a functional language you say, I have a function parameter `p` that could be anything of the appropriate type `a -> bool`. In TLA+ you don't need this because you can say that p is not a parameter of a certain type, but a concrete, specific program that nondeterministically returns either TRUE or FALSE
OK, I think I see. You prove something using the specific, concrete p which has fewer properties than any other p, so therefore anything that holds for it must hold for the other p too. I don't quite see how this is different from proving it for "arbitrary p" but perhaps it will be come clear as I learn more.
> I mean by the simple isomorphism between predicates and sets
And this is why I get worried about TLA+ using "simple mathematics". This "simple isomorphism" simply doesn't exist, as I'm sure you know. You have to restrict yourself to saying "simple isomorphism between predicates on a set and subsets of that set", at which point I start to worry that any higher-order possibilities go out of the window.
> But a set of behaviors is a program! Therefore a program property, or a "type", in TLA is the same thing as a program.
Sure, but a possibly nondeterministic one that you can't actually implement (at least not directly).
Anyway, thanks for the introduction. When I get some time I will look into it further.
You don't need to make that assumption, though. TLA+ checks it for you. My point is that everything is pure and well-behaved. The very same issues of modality happens when you use monads. Only in TLA+ it's very simple to see when an operator is a constant (called level 0), a state function/predicate, i.e., a function of the current state (level 1), an action/step (level 2), or a full temporal formula (level 4). In a temporal formula, you can say something like, whenever my program sees an event x, it would eventually output a response y. In a way, TLA+ has 4 syntactic types, and the relationship between them is checked instantly at parsing. E.g. you can't take an action or a temporal formula and prime it (TLA+ checks "inside" your formula and knows what level it is). Priming a constant keeps it a constant; priming a state function turns it into an action (a relationship between two states).
> This "simple isomorphism" simply doesn't exist
No, I meant something very simple. Any boolean predicate, e.g. "this number is even", is the same as the set of all even numbers. It's a characteristic function. It always exists.
> Sure, but a possibly nondeterministic one that you can't actually implement (at least not directly).
Right, but nat -> bool, is the same as saying "all programs that take a natural and return a boolean", which is the same as the ND program that for any natural nondeterministically returns true or false. It's describing the same set or type, but in a different way, which makes simple logical implication the same as subtyping or type inhabitance. B is a subtype of A, is, in TLA+, B => A. a is an instance of A, is in TLA, a => A.
BTW, it's not true that TLA+ can't tell you that your implementation is correct. It can, it's just hard (but probably no harder than in Coq). What you do is, you write a high-level model and prove it. Then you implement it in a programming language, and you write a compiler from that language to TLA+. Then you check that the TLA+ version of your implementation is a refinement of your high-level model. Academic teams have done this for C and Java. The problem is that implementations have many details, so it doesn't scale beyond small programs.
> When I get some time I will look into it further.
I'd say that the effort to learn TLA+ is comparable to learning a very simple programming language with a couple of deep concepts. Scheme maybe. It takes two weeks to be able to write large, complex specifications (though probably longer for the theory to become intuitive). The entire documentation for all the "standard libraries" fits on a single page. The reference for all the language, with full explanations, fits on 4 pages. This is all of TLA+, including the standard library, minus the proof language, including all syntax sugar: http://research.microsoft.com/en-us/um/people/lamport/tla/su... (e.g. all the special syntax for sequences and records is just sugar for functions, so r.f is sugar for as r["f"]). Personally, I'm interested in theory for intellectual reasons, but the single reason I picked TLA+ is that Amazon technical report, where they showed that 1. learning it is easy for engineers to do on their own, and 2. it scales well for very large systems.