
Clojure.spec – Rationale and Overview - SCdF
http://clojure.org/about/spec
======
pyritschard
This is an important move for the Clojure language. As with all dynamic
languages, and exacerbated by the fact that it encourages relying on data
first (as opposed to, say, classes in other languages), Clojure puts the onus
on the developer to catch errors in data.

The Clojure community as a whole embraced Prismatic's Schema [1] as a way to
provide occurrence typing for data. But since library authors tend to keep
their dependency surface small, most libraries do not ship with it.

With its inclusion in Clojure core, it will now be possible to provide
occurrence typing at the edges of functions instead of tracking malformed
input deep inside apps.

The Racket-style contract notation is also a very welcome change from other
similar approaches, in my opinion.

[1]: [https://github.com/plumatic/schema](https://github.com/plumatic/schema)

~~~
gshayban
Agreed that this is a big step, but what is being included in core is very
distinct from Prismatic Schema. For example the way keysets vs values are
handled is intentionally different from schema.

The rationale is a good read.

~~~
jrcii
> The rationale is a good read.

Hickey's design choices consistently appear to be exceptionally well reasoned.

~~~
solipsism
It's because of the hammock time.

------
lispm
Luckily Common Lisp already has ways to do that:

Functions have defined keyword argument lists and can have type declarations,
which a compiler like SBCL can partially check at compile time.

For anything more complex than lists of keyword/value combinations, Common
Lisp uses CLOS classes, which form a multiple-inheritance hierarchy and
provide types for slots.

That way I can look into a debugger at runtime and see what functions are
actually supposed to receive, and what these collections of keyword/value maps
actually mean (since they are CLOS classes).

Using simple lists as maps was popular with assoc lists in the 70s; basically
the same as hash tables, but slower. Then lots of dynamic object systems were
developed, which were in essence dynamic key/value sets. Later various object
systems appeared, with CLOS as a local maximum.

Where Rich Hickey points to RDF, there is a lot of prior art in Lisp as well:
frame systems, description logics, and so on.

[http://wilbur-rdf.sourceforge.net/docs/ivanhoe.html](http://wilbur-rdf.sourceforge.net/docs/ivanhoe.html)

[http://franz.com/agraph/racer/krss-spec.pdf](http://franz.com/agraph/racer/krss-spec.pdf) and a zillion others

CLIM, the Common Lisp Interface Manager, uses 'presentation types' to specify
user-level data structures...

------
ComNik
Clojure, with its dabbling in schemas and rule systems / logic programming (as
a substitute for conditional statements) and thanks to its great tools (like
figwheel or devcards), could really be establishing a cheaper, "as-good"
alternative to static typing and full formal verification.

Instead of encoding constraints as type signatures, the Clojure folks (true to
character) encode them in data. In my eyes a very interesting, pragmatic
trade-off between expressiveness and automatic verifiability.

~~~
llamaz
> dabbling in schemas and rule systems / logic programming (as a substitute
> for conditional statements)

That sounds interesting. Could you elaborate or post a link?

~~~
ComNik
As a side note in his talk "Simple Made Easy"
([https://www.infoq.com/presentations/Simple-Made-Easy](https://www.infoq.com/presentations/Simple-Made-Easy), around minute 42),
Rich Hickey mentions that conditional statements are complex because they
spread (business) logic throughout the program.

As a simpler (in the Hickey sense) alternative, he lists rule systems and
logic programming. For example, keeping parts of the business logic ("What do
we consider an 'active' user?", "When do we notify a user?", etc.) as
Datalog expressions, maybe even storing them in a database, specifies them all
in a single place. This helps to ensure consistency throughout the program.
One could even give access to these specifications to a client, who can then
customise the application directly in logic, instead of chasing throughout the
whole code base.

Basically everyone involved agrees on a common language of predicates
explicitly, instead of informally in database queries, UI, application code,
etc...
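
The single-place-predicate idea doesn't depend on Datalog. Here is a minimal
Haskell sketch (the `User` fields and the `isActiveUser` rule are invented for
illustration) where one business predicate is defined once and reused by two
independent consumers:

    import Data.List (filter)

    -- A user record; the fields are invented for illustration.
    data User = User { userName :: String, loginCount :: Int, optedOut :: Bool }

    -- The business rule, stated in exactly one place.
    isActiveUser :: User -> Bool
    isActiveUser u = loginCount u >= 3 && not (optedOut u)

    -- Two independent consumers reuse the same predicate instead of
    -- re-encoding the rule as scattered conditionals.
    usersToNotify :: [User] -> [String]
    usersToNotify = map userName . filter isActiveUser

    activeUserCount :: [User] -> Int
    activeUserCount = length . filter isActiveUser

If the definition of "active" changes, only `isActiveUser` changes; the
consumers cannot drift apart.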

But Hickey also notes that this thinking is pretty "cutting-edge" and probably
not yet terribly practical.

~~~
goldbrick
It can work. My current company uses a rule system to represent most of our
business logic since it is so dynamic. The downside is that we have to rebuild
the entire graph into memory (times the number of threads, times the number of
app servers) every time anything changes (which is constant).

Facebook wrote about rebuilding a similar system in Haskell that only changes
memory incrementally, so it's definitely possible to do better.

~~~
ComNik
Interesting note, thank you. Are you referring to "Sigma"
[https://code.facebook.com/posts/745068642270222/fighting-spa...](https://code.facebook.com/posts/745068642270222/fighting-spam-with-haskell/)?

~~~
goldbrick
That's the one.

------
drcode
> spec has been designed from the ground up to directly support generative
> testing via test.check. When you use spec you get generative tests for free.

Great, I've been holding out on digging into the test.check generator
syntax... looks like my laziness paid off!

~~~
puredanger
It's still worth digging in! There are times when you will want to write your
own generators.

------
noelwelsh
It's unfortunate that Rich Hickey (I assume he's the author) can't write this
without trolling about type systems. It's pretty clear to anyone who's read
the literature, or has thought for a few moments, that contract systems
provide different guarantees from type systems, so there's no need to poke at
type systems throughout the document.

~~~
lomnakkus
It's quite annoying, but he definitely has a tendency to do this, though I
wouldn't go so far as to call it 'trolling' per se. I think he's more sort of
speaking to the 'in-group' and that can often be quite off-putting for people
who are not members of that group because all the "but..." caveats are
missing.

Btw, remember 'transducers'? He did the exact same thing there and claimed (in
so many words) that it couldn't really be done in a typed language until
people showed that, yes, it really could be done in a typed language such as
Haskell. I suspect it's because he just doesn't know enough Haskell and
couldn't get the type signatures to work. However, it's a big leap from "I
cannot figure out how to do it" to "It cannot be done" and it requires a wee
bit of (probably 'well-earned') arrogance to make that leap. :)

~~~
puredanger
Rich provided the Haskell translation at
[https://www.reddit.com/r/haskell/comments/2cv6l4/clojures_tr...](https://www.reddit.com/r/haskell/comments/2cv6l4/clojures_transducers_are_perverse_lenses/cjjyay7).
His point was that some aspects of transducers were not possible to capture
well in a type signature.

~~~
bad_user
Some aspects, as in: not possible to describe state machines by types.

This is a well-known problem, an active subject of research actually. Add
asynchrony into the equation and you're out of luck. And transducers can also
have asynchronous behavior.

~~~
lomnakkus
> Some aspects, as in: not possible to describe state machines by types.

I _really_ don't understand this, I mean:

    
    
        data TrafficLight = Red Int
                          | Green Int
                          | Yellow
    

(let's assume that you have a flexible traffic control system which permits
changing the duration of Red/Green due to live traffic data, but Yellow _must_
always be, say, 5 seconds because the safety requirements say so.)

Add a function

    
    
        nextState :: Input -> TrafficLight -> TrafficLight
        nextState = ...
    

and you're done. Am I missing something?
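
For what it's worth, that sketch can be completed into a runnable toy (the
`Input` type here is an invented stand-in for live traffic data):

    -- Invented stand-in for live traffic data: the controller
    -- suggests the next Red/Green duration.
    newtype Input = Input { suggestedDuration :: Int }

    data TrafficLight = Red Int | Green Int | Yellow
      deriving (Eq, Show)

    nextState :: Input -> TrafficLight -> TrafficLight
    nextState i (Red _)   = Green (suggestedDuration i)
    nextState _ (Green _) = Yellow  -- Yellow carries no duration; it is fixed
    nextState i Yellow    = Red (suggestedDuration i)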

 _EDIT_ : Oh, I think I see what you're saying. This doesn't provide _that_
many guarantees at the _type_ level about state _transitions_, but I'd
counter that: a) you have other options than "Int", so you could manage valid
states more strictly, and b) there's such a thing as dependently-typed
languages, say Idris, so you can go even further. Then again, perhaps
dependently-typed languages were what you meant by "active subject of
research". I'd probably agree on that :)

> Add asynchrony in the equation and you're out of luck.

Not sure how that's to be interpreted? Are you talking about asynchronous
updates to state, or...?

> And transducers can also have asynchronous behavior.

Is there a formal description of what semantics they have in that scenario? Or
are we just going by implementation details?

EDIT: I should say: I'm not being facetious or going for any kind of 'gotcha'
type thing here. I'm really curious; maybe I'm just misunderstanding your
post.

~~~
mac01021
I'm as clueless as you are about that asynchrony business, but would like to
point out that you are not fully describing your state machine within the type
system unless the typechecker prohibits you from defining (nextState Yellow ==
Green).

~~~
lomnakkus
That would be an invalid definition since you _must_ provide an (integer)
parameter to Green. Effectively you must "magic" a default value for what
happens when Yellow goes to Green.

(Obviously in my sketched scenario the value would really come from Input.)

~~~
mac01021
That's true. My point was just that your type definition does not tell the
compiler/typechecker that the light, if it is currently yellow, must become
red before it can become green.

That's a rule of the state machine and, if the compiler doesn't stop you from
writing a function that breaks that rule then your state machine has not been
modeled fully within the type system.

I'm pretty sure this is what Hickey et al mean when they say that Haskell's
type system is not powerful enough to describe a state machine.

~~~
lomnakkus
> That's true. My point was just that your type definition does not tell the
> compiler/typechecker that the light, if it is currently yellow, must become
> red before it can become green.

Ah, yes, I understand what you mean now, but you _can_ actually encode that if
you _really_ want to. (If we assume Haskell, there's very limited DT
programming which will probably let you state which transitions are legal. I
must admit I'm not very up-to-speed on it, but see the next paragraph.)
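
For the curious, here is one way to encode the legal transitions in plain GHC
Haskell using `DataKinds` and GADTs rather than full dependent types (a sketch,
not a recommendation): the phase is lifted into the type, and only the allowed
transitions are given functions, so a yellow-to-green step is unrepresentable.

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

    -- The colour is lifted to the type level, so each light value
    -- carries its phase in its type.
    data Colour = R | G | Y

    data Light (c :: Colour) where
      Red    :: Int -> Light 'R
      Green  :: Int -> Light 'G
      Yellow :: Light 'Y

    -- Only the legal transitions exist; there is no function of type
    -- Light 'Y -> Light 'G, so yellow -> green cannot even be written.
    yellowToRed :: Int -> Light 'Y -> Light 'R
    yellowToRed d Yellow = Red d

    redToGreen :: Int -> Light 'R -> Light 'G
    redToGreen d (Red _) = Green d

    greenToYellow :: Light 'G -> Light 'Y
    greenToYellow (Green _) = Yellow

    duration :: Light c -> Int
    duration (Red n)   = n
    duration (Green n) = n
    duration Yellow    = 5  -- fixed by the safety requirement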

The thing is that people usually don't want to go that far, whether they're
working in a typed scenario or not. In the unityped scenario, you _can't_ go
that far without reinventing a (run-time?) type checker.

> That's a rule of the state machine and, if the compiler doesn't stop you
> from writing a function that breaks that rule then your state machine has
> not been modeled fully within the type system.

> I'm pretty sure this is what Hickey et al mean when they say that Haskell's
> type system is not powerful enough to describe a state machine.

Certainly this is true, but if you're in an entirely unityped setting... do
you have _any_ guarantees?

(To which the answer is, of course, "no".)

Foregoing _all_ guarantees just because you can't guarantee _everything_
seems... rash. (I believe this is the position that Hickey takes, for better
or worse.)

Obviously, that's why _I_ , personally, prefer languages with multiple types.

Note: I _do_ think there are genuine scenarios where unityped may win out,
e.g. "generic data transformation without knowing anything about the
structure", but there are even "typed" answers for that, e.g. lenses. For my
money -- and I'm obviously semi-joking -- ekmett >> hickey :)

~~~
tree_of_item
> Certainly this is true, but if you're in an entirely unityped setting... do
> you have any guarantees?

This is what clojure.spec is all about. The answer is not, of course, "no".
They are "reinventing a run-time type checker" and _prefer_ it that way
because it allows increased expressive power at the cost of delayed
verification.

> Ah, yes, I understand what you mean now, but you can actually encode that if
> you really want to. (If we assume Haskell, there's very limited DT
> programming which will probably let you state which transitions are legal. I
> must admit I'm not very up-to-speed on it, but see the next paragraph.)

So you don't actually know how to do it, but assume that you can. This is a
little strange. Are you sure that "people usually don't want to go that far"?
Because the real answer is probably "it's too hard to do".

By the way, you probably want linear types in order to encode state machines
in the type system in a natural way, which Haskell does not have.

~~~
pron
> reinventing a run-time type checker

It's not a runtime _type_ checker, but a spec (not a type[1]!) that tools can
use to check in different ways.

\------------

[1]: A spec is a proposition; a type is a judgment (that corresponds to a
proposition). E.g., you can have types `data A = 1 | 2 | 3` and `data B = 3
| 4 | 5`, but then testing `a < b` (assuming `a:A b:B`) may not be
straightforward (depending on the language), while a spec of `A = {1, 2, 3}, B
= {3, 4, 5}, a ∈ A, b ∈ B` does let you test `a < b`.
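
To make the footnote's contrast concrete in valid Haskell (the snippet above
is only notation): two enum types whose values overlap in meaning still cannot
be compared until both are projected into a shared carrier, whereas specs,
being plain sets of values, compare directly.

    data A = A1 | A2 | A3
    data B = B3 | B4 | B5

    valA :: A -> Int
    valA A1 = 1
    valA A2 = 2
    valA A3 = 3

    valB :: B -> Int
    valB B3 = 3
    valB B4 = 4
    valB B5 = 5

    -- For a :: A and b :: B, the expression `a < b` is ill-typed; the
    -- comparison only exists after both values are projected into the
    -- shared carrier Int.
    lessThan :: A -> B -> Bool
    lessThan a b = valA a < valB b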

~~~
lomnakkus
Types _are_ propositions. See Curry-Howard.

~~~
pron
Types _encode_ propositions (see Curry-Howard) but are logical _judgments_ in
the language's logic. The expression `a:A` does not have a boolean truth value
(like a logical proposition), but is rather a judgment. In my example above,
the value of the expression `a ∈ B` is true iff a = 3 and false otherwise, but
the expression `a:B` is just not a well-formed expression. In other words, you
could create a type that corresponds to a proposition, but type assignment is
distinct from a proposition (e.g. the proposition `b1 < b2`, assuming
`b1,b2:B`) in the language.

More precisely, you could say that types encode propositions at the type level
(even if the language allows the same data-level syntax in type definitions,
as in Idris), whereas propositions at the data level are distinct; specs are
data-level propositions.

~~~
catnaroek
`x:T` is a _typing judgment_. A typing judgment is not a type, even if one of
its constituent parts (`T`) is a type.

What you call “data-level propositions” is really just “data-level abstract
syntax representation of propositions”. Propositions _really_ are types.
(Though, of course, there are types that are not propositions.)

And your example is simply not valid Haskell.

~~~
pron
> A typing judgment is not a type

You are absolutely right. `x` is a proof object of `T` in the logic induced by
the language, but I didn't want to get to that level of technicality, just to
show the difference between data-level propositions and type-level ones.
That's why I added the second paragraph, but in the process confused matters;
sorry.

> Propositions really are types.

Types have a syntactic meaning that determines the well-formedness of an
expression. Once you talk about types, you must talk about a language; a type
cannot exist without a language because it is a syntactic construct, but a
proposition can: it is a set (of programs, or Turing machines if you like;
people in the verification community may say it's a set of states in a Kripke
structure). Now, you could say that given the proposition (or the set) _P_ ,
you can construct type P in language L that expresses it, but P would still
not be a data-level expression in L. You can then argue about whether or not
this matters, but you can't deny that the two expressions of proposition,
i.e., the dynamic `x < 10` and `T`, are distinct and "live" in two separate
worlds. When data-level propositions are concerned, this is not the case.
E.g., in TLA+, an untyped logic (TLA+ZFC), the program and the propositions
are the very same objects and not at all just a "data-level abstract syntax
representation of propositions".

> And your example is simply not valid Haskell.

It wasn't meant to be (I haven't written a Haskell program in over 15 years).
Sadly, unlike standard logic and set notation, there isn't a standard notation
for typed LC with data types, or any typed logic for that matter (not one that
I'm familiar with), so you have to use something with a meaning people would
understand or define a syntax every time.

~~~
catnaroek
> but a proposition can: it is a set

Set theories are normally built _on top_ of the apparatus of logic, so a
proposition can't be a set.

> (of programs, or Turing machines if you like; people in the verification
> community may say it's a set of states in a Kripke structure)

Ah, I see. You aren't talking about general propositions. You're talking about
assertions about program states that can be checked dynamically, but want to
verify statically. You can't represent an arbitrary proposition as a data
structure, but it's of course a-whole-nother matter if you can decide the
proposition's truth value by evaluating a Java expression of type `boolean`.
The `boolean` expression itself is still not a proposition, though.

> but you can't deny that the two expressions of proposition, i.e., the
> dynamic `x < 10` and `T`, are distinct and "live" in two separate worlds.

Most programming languages don't have propositions at all, for a very good
reason: a proposition is the type of a programmer-supplied proof that has no
runtime computational relevance, and how to integrate proofs into programming
is still a research question whose answer language designers can't wait for.
In Java, “the dynamic `x < 10`” is just a `boolean` expression - not a
proposition!

> E.g., in TLA+, an untyped logic (TLA+ZFC), the program and the propositions
> are the very same objects and not at all just a "data-level abstract syntax
> representation of propositions".

In TLA+, or any other external program logic, the program itself is just a
syntactic object. And by extension so is any `boolean` expression the program
might contain. OTOH, if Java had propositions, you could carry out your
reasoning internally, side-stepping the need to encode propositions as
abstract syntax for a metalanguage to see.

~~~
pron
> Set theories are normally built on top of the apparatus of logic, so a
> proposition can't be a set.

You don't need a set theory. P is a "naive" set. It's well defined no matter
what, because TMs are a recursive set, and P is just an RE subset. You're not
in danger of running into any sharp edges.

> The `boolean` expression itself is still not a proposition, though.

I'm not sure what you mean by that, but I think we're talking past each other.
That boolean expression is not a proposition _about the program_ (that is
correct), but the distinction may be artificial. You may be used to languages
where there's a clear distinction between program properties and program code,
but if you think about it, that distinction is artificial. For example, the
expression `IF x > 0 THEN x ELSE -x` can be understood to be a program
description taking the absolute value, but it can also be a proposition, or a
description of the set of programs returning the absolute value of their
input. Your language may use the same expression in both cases, yet still make
a distinction about its meaning based on where it is used. Another language
(like TLA+) has no such distinction.

> a proposition is the type of a programmer-supplied proof that has no runtime
> computational relevance

That's _your_ definition. For me, a proposition about code is a set of TMs
with a property (in particular, safety properties are a closed set of TMs, and
liveness properties are dense etc. etc.). Its truth value is whether your
program is in this set or not. To be more precise, a proposition to me is a
sentence in any logic denoting that set.

I think my definition is both more encompassing and more precise than yours.
It is more encompassing because every proposition by your definition is a
proposition by my definition; it is more precise because the converse may not
be true (although, again, I'm not certain about what your definition really
means, its ties to a specific language, or whether or not it applies only to
denotational semantics of programs).

> has no runtime computational relevance

I'm not sure what you mean by that. Does the proposition "the program returns
a list containing exactly the same elements as its input, each appearing the
same number of times, only in sorted order" have runtime computational
relevance or not?

> In TLA+, or any other external program logic, the program itself is just a
> syntactic object.

TLA+ is not an external program logic, but a complete logic -- i.e. a TLA+
expression can describe any unique and arbitrary TM, or a set of them -- and
there is no difference between a program and a proposition about a program in
TLA+. The program is certainly not a syntactic element in TLA+ (certainly no
more than any proposition is a syntactic element). Conceptually, in TLA+ you
can describe the set of all programs that perform quicksort (it can be more
than one; TLA+ allows nondeterminism) and then ask whether quicksort implies
sorting, or, in other words, whether the set of quicksort programs is a subset
of all programs that sort their input. Both the program and the proposition
are logical expressions (describing a set of TMs, if you will), and you may
then try to prove that one implies the other (or any logical combination of
the two):

 _In a quest for minimality and orthogonality of concepts, TLA+ does not
formally distinguish between specifications and properties: both are written
as logical formulas, and concepts such as refinement, composition of systems
or hiding of internal state are expressed using logical connectives of
implication, conjunction, and quantification. Despite its expressiveness, TLA+
is supported by tools such as model checkers and theorem provers to aid a
designer carry out formal developments._ [1]

[1]:
[http://www.loria.fr/~merz/papers/tla+logic2008.pdf](http://www.loria.fr/~merz/papers/tla+logic2008.pdf)

~~~
pron
P.S.

If you think about it, the conjunction of all true propositions about your
program _is_ the program (modulo the equality semantics of your logic). This
is true whether or not your program is deterministic (or, in the case of LC,
a collection of state machines, depending on the evaluation strategy).
Conceptually, then, there is no difference between a program and the true
propositions about it. Some languages make a syntactic distinction between the
two (Haskell), some languages use similar syntax but different semantics
(Idris), and some make no syntactic or semantic distinction (TLA+): your
program _is_ a logical sentence (a conjunction of axioms, if you will), from
which all true propositions about it can be derived (and, obviously, only
those). A program _is_ the complete axiomatization of all true propositions
about it[1].

[1]: This may get more tricky if intuitionistic logic is used, where the
program is actually the proof object of all true propositions, but I think
that even in intuitionistic logic, the distinction may be artificial: a proof
may be no more than a hidden proposition.

~~~
catnaroek
There's so much that's wrong with this comment. You're either ignorant of
formal logic, or deliberately trolling. If the former: being ignorant isn't a
crime, but pretending you know what you don't know is uncivilized. If the
latter: of course trolling is uncivilized. Let's keep things civilized, okay?

> If you think about it, the conjunction of all true propositions about your
> program is the program (modulo the equality semantics of your logic).

This doesn't even make sense. A program can be run. What does it mean to “run
a proposition”? Even if I charitably interpret that as “decide its truth
value”, well, not all propositions are decidable!

> your program is a logical sentence (a conjunction of axioms, if you will),

What you can treat as a bunch of axioms is the semantics of your programming
language of choice. But a program isn't the same thing as the language it's
written in.

> the distinction may be artificial: a proof may be no more than a hidden
> proposition.

This isn't true in either classical or intuitionistic logic. A proposition has
a truth value - a proof does not.

~~~
pron
> Let's keep things civilized, okay?

Formal verification of large, very complex real-world systems is what I do all
day every day. If you don't understand what it is that I'm saying (and that's
my fault -- I'm probably not clear enough) you can just ask. I'm a
practitioner, not a theoretician, so I may make mistakes, but I think that in
this particular case you are too attached to the typed LC to consider other
reasoning systems (which, BTW, happen to be far more prevalent in the
verification community).

> This doesn't even make sense. A program can be run. What does it mean to
> “run a proposition”?

Let's step back and consider the theory and then get to the specifics.
Consider the set of all TMs (perhaps TMs isn't the best name because it may
imply a specific programming model, but that is not my intent; I'm using the
term here interchangeably with an abstract state machine, or a program). Each
and every one of them is "runnable", right? Each and every subset of that set
also consists of TMs, which are also runnable, right? A proposition about a TM
constitutes such a subset. Under certain conditions, you can incorporate all
DTMs in the set corresponding to the proposition into a single NDTM, which can
also be run. Hence, there is a direct correspondence between propositions
about programs (e.g., the program returns its input sorted) and NDTMs. Hence,
you can most certainly run a proposition about programs (though perhaps not
all propositions).

The temporal logic of actions (TLA) is a _complete_ logic in the sense that
for every NDTM (or a DTM as a special case) there exists a (non-unique) TLA
expression, consisting of the simple logical connectives, the priming operator
(an action) and possibly LTL operators (describing say, fairness, properties)
that describes that particular TM. But, of course, a TLA expression can be
less precise, and describe a whole set of TMs -- a proposition. Therefore in
TLA, propositions and programs are the same object. Here's Lamport in the
opening paragraph introducing TLA in the 1993 paper[1]:

 _Correctness of [an] algorithm means that the program satisfies a desired
property. We propose a simpler approach in which both the algorithm and the
property are specified by formulas in a single logic. Correctness of the
algorithm means that the formula specifying the algorithm implies the formula
specifying the property, where implies is ordinary logical implication._

FYI, that paper has 2458 citations and is far from obscure. This theory is
much better known in the software verification community than type theory
(certainly when you consider it within the greater framework of temporal
logics, of which TLA is one), and has had a greater impact -- both academic
and real-world -- on formal methods, at least so far. Specifying programs as
propositions in temporal logics was a modern theoretical turning point in CS,
and has yielded two or three Turing awards (Pnueli in '96, Clarke et al. in
'07, and Lamport -- at least partly for this -- in '13).

> Even if I charitably interpret that as “decide its truth value”, well, not
> all propositions are decidable!

This is where things are a bit tricky. For a complete treatment see the
preliminaries section in this '88 paper by Abadi and Lamport[2].

> A proposition has a truth value - a proof does not.

Of course, but there's room for interpretation here (like I said, I'm not a
type expert so I'm not 100% sure about it): `<x = 3, y = 5>` can be viewed as
a proof object of the proposition `∃x,y ∈ 1..10 . x < y`, but it can also be
viewed as the more precise proposition, `x = 3 ∧ y = 5`. Existential
propositions (or types) can therefore be viewed as a set of multiple programs,
while the proof object is a particular program, that could have been specified
with a more precise proposition. That is what I meant when I said "hidden
proposition".

If you step back from typed LC and consider the original proof objects, those
objects are just a series (a tree, actually) of logical deductions. Each of
those deductions is a proposition. A proof is, therefore, a series of true
propositions, each derived from the previous one using one of the deduction
rules. You can certainly say, then, that a proof is a collection of hidden
(true) propositions.

[1]: [http://research.microsoft.com/pubs/64074/lamport-actions.pdf](http://research.microsoft.com/pubs/64074/lamport-actions.pdf)

[2]: [http://research.microsoft.com/pubs/64046/abadi-existence.pdf](http://research.microsoft.com/pubs/64046/abadi-existence.pdf)

~~~
catnaroek
> Consider the set of all TMs (...) Each and every one of them is "runnable",
> right? Each and every subset of that set also consists of TMs, which are
> also runnable, right?

This much makes sense.

> A proposition about a TM constitutes such a subset.

A unary _predicate_ whose argument ranges over TM may have such a subset as
its _extension_. Two syntactically different predicates may have the same
extension.

> If you step back from typed LC

At this point I'm not even talking about types. I'm talking about deductive
systems in general.

> and consider the original proof objects, those objects are just a series (a
> tree, actually) of logical deductions.

This part is correct.

> Each of those deductions is a proposition.

Propositions exist inside of a logic, deductions exist in the logic's
metatheory. There oughtn't even be the possibility of conflating them.

> A proof is, therefore, a series of true propositions, each derived from the
> previous one using one of the deduction rules.

A deduction is a series of _judgments_ related by rules of inference. But a
judgment is _not_ a proposition!

~~~
pron
> A unary predicate whose argument ranges over TM may have such a subset as
> its extension.

Yes, but now we can talk about a specific logic/language, and now it matters
exactly how the NDTM is expressed syntactically. E.g. in TLA (and LTL, or CTL
for that matter) `[](x ∈ {1,2,3})` means "the variable x in the program is
always in 1..3", or (semantically) "the set of all TMs where x is always in
1..3". x is a name given to a variable in an infinite set of variables
defining the abstract machine's state (i.e. a state in TLA is defined to be an
assignment to the infinite set to all possible variables), and the existence
of the machine (or a Kripke structure, if you will) is implied. If you only
supply a proposition on variable x, it means that all other (infinite set of)
variables can take any value.

You can read about TLA/TLA+ in one of the links I provided to get the
specifics (it's extremely easy to pick up, BTW). In any event, a proposition
in TLA can be thought of as a predicate over a(n implicit) TM. You may also
find this short note by Lamport about the history of TLA/TLA+ (and in
particular the unification of temporal-logic propositions and programs)
interesting[1].

> Two syntactically different predicates may have the same extension.

True, I didn't say the syntactic representation is unique. Also, in TLA it's
no longer just a predicate but any proposition.

> There oughtn't even be the possibility of conflating them.

I am not conflating them, and I'm not saying a proof _is_ formally a
proposition. I'm saying that in intuitionistic logic, a proof (especially of
an existential quantification) may be also interpreted as a more
precise/limited proposition. But this is just a vague notion, and it could be
completely wrong (even if I were able to communicate it clearly), so I see no
point in debating this.

[1]: [http://research.microsoft.com/en-us/um/people/lamport/pubs/c...](http://research.microsoft.com/en-us/um/people/lamport/pubs/commentary-web.pdf)

------
jraines
Could someone explain what the paragraphs under "Map specs should be of
keysets only" are getting at?

I'm trying to map it onto my experience with Clojure and plumatic/schema, but
nothing is clicking.

~~~
puredanger
Generally in most of the existing libs like Schema you specify that a "person
entity" is a map that has particular attributes that have particular
properties and a "company entity" is a map that has other attributes (some of
which might overlap).

In spec, you specify that a first-name attribute is a string and an email
attribute matches some regex. Those are specifications about the attributes
and can be used across many entity types. A person entity spec is then defined
as a map that contains a set of required or optional attributes. A company
entity spec is a different map with different attributes, some of which might
be shared with a person (like email address).

This is in many ways a shift in perspective, allowing you to start defining
and reusing attribute specs to a much greater degree.

The use of structural systems with mathematical properties (regular
expressions for sequence structure and sets for map keys) turns out to have
many useful effects in computing error sets, etc.
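
A minimal sketch of what this attribute-first style looks like (the namespace
below is `clojure.spec` as in the current alphas and may change; the attribute
names are made up):

```clojure
;; Attribute specs are registered once, under namespaced keywords,
;; and reused across entity types. (Names here are hypothetical.)
(require '[clojure.spec :as s])

(s/def ::first-name string?)
(s/def ::email (s/and string? #(re-find #"@" %)))

;; Entity specs are then just keysets over those shared attributes:
(s/def ::person  (s/keys :req [::first-name ::email]))
(s/def ::company (s/keys :req [::email] :opt [::first-name]))

(s/valid? ::person {::first-name "Ada" ::email "ada@example.com"}) ; => true
```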

~~~
greghendershott
The division -- between predicates for attribute values vs. specs for
maps/seqs -- is interesting because AFAICT Datomic schemas ~= the former,
while leaving the latter to your app? (Although I guess :db.cardinality/many
straddles that line.)

------
junke
Those regular expressions for sequences seem related to "Efficient dynamic
type checking of heterogeneous sequences", or maybe some other prior work I
don't know about.

See [https://github.com/jimka2001/regular-type-
expression](https://github.com/jimka2001/regular-type-expression) and
[http://www.lrde.epita.fr/dload/papers/newton.16.rte.report.p...](http://www.lrde.epita.fr/dload/papers/newton.16.rte.report.pdf)

------
xmlblog
Incidentally, the source is here:
[https://github.com/clojure/clojure/blob/master/src/clj/cloju...](https://github.com/clojure/clojure/blob/master/src/clj/clojure/spec.clj)

------
minikomi
I enjoy using racket and contracts make it quite enjoyable to use - error
messages in particular are much improved by their presence. Looking forward to
this!

------
DigitalJack
If you are new to using a snapshot of clojure (like me), you can find some
helpful information here:
[http://dev.clojure.org/display/community/Maven+Settings+and+...](http://dev.clojure.org/display/community/Maven+Settings+and+Repositories)

tl;dr

    
    
      In a Leiningen project.clj file:
      :repositories {"sonatype-oss-public" "https://oss.sonatype.org/content/groups/public/"}
    
      Also, in that lein project file, reference clojure as
      [org.clojure/clojure "1.9.0-master-SNAPSHOT"] (1.9.0 being the dev head right now)

~~~
drewr
This won't work with Cider btw. I had to change it to 1.9.0-SNAPSHOT in
pom.xml before a mvn install.

    
    
      error in process filter: version-to-list: Invalid version syntax: '1.9.0-master-SNAPSHOT'
      error in process filter: Invalid version syntax: '1.9.0-master-SNAPSHOT'

~~~
DigitalJack
Interesting. It works for me, but I'm using a dev snapshot of cider too.

Having -master- in there is pretty non-standard, but I was poking around in
oss.sonatype.org and that's what the jar looked like in there. Maybe I goofed.
There is a little bit of magic that happens with maven and snapshots (magic to
me anyway).

------
jcadam
Excellent. This definitely looks to be an improvement over schema, which I use
but don't particularly like. Though, I'm fairly new to Clojure (about a month
or so), so it may just be my inexperience :)

------
sdegutis
> Defining specifications of every subset/union/intersection, and then
> redundantly stating the semantic of each key is both an antipattern and
> unworkable in the most dynamic cases.

This is the reason I haven't jumped on the Schema bandwagon. Surprised to see
I'm not the only one who had reservations about it.

> Finally, in all languages, dynamic or not, tests are essential to quality.
> Too many critical properties are not captured by common type systems.

Glad to see some acknowledgement of the need for at least some tests from
Rich, who previously criticized TDD, saying that you don't keep a car on the
road by banging into the guardrails, and advocated H(ammock)DD.

> But manual testing has a very low effectiveness/effort ratio. Property-
> based, generative testing, as implemented for Clojure in test.check, has
> proved to be far more powerful than manually written tests.

(Assuming by manual testing he means manually writing automated tests,) I
can't help but wonder if he came to that conclusion by writing the _wrong_
tests and, naturally, finding them unhelpful.

> Enable and start a dialog about semantic change and compatibility

Okay, slightly rant-y tangent, but out of all the fluff phrases and buzzwords
I've ever heard, "start a dialog" is probably the worst offender, and I wish
it was wiped from the English language.

> I hope you find spec useful and powerful.

Looks useful so far. But where's the Github repo? Or is it coming in the next
version of Clojure? Or, like, how/when can we use it?

~~~
nickik
Rich was never against testing, he was against TEST DRIVEN development, and
that's what TDD stands for.

------
thesorrow
This is awesome. Is it available for ClojureScript too?

~~~
MikeOfAu
From Alex Miller via slack:

> will be in cljs as well, but not on first release

> most of the code should work as is, most of the tricky bits are around vars

------
hellofunk
> The basic idea is that specs are nothing more than a logical composition of
> predicates.

This reminds me of this technique:

[https://github.com/astoeckley/Eat-Static#going-
further](https://github.com/astoeckley/Eat-Static#going-further)
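
Concretely, that composition is just `s/and` (and `s/or`) over ordinary
predicates. A sketch, using the `clojure.spec` namespace from the current
alphas:

```clojure
(require '[clojure.spec :as s])

;; A spec is built by logically composing plain Clojure predicates:
(s/def ::port (s/and integer? #(< 0 % 65536)))

(s/valid? ::port 8080) ; => true
(s/valid? ::port -1)   ; => false
```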

------
georgewsinger
I'm confused what a function spec would look like using fdef (etc). Can
someone write an example?

If in Haskell we have a type signature

f :: Type1 -> Type2

...what will a clojure function spec roughly look like?

~~~
puredanger

      (spec/fdef clojure.core/range
        :args (spec/alt
                :infinite (spec/cat)
                :range-0  (spec/cat :end number?)
                :range    (spec/cat :start number? :end number?)
                :step     (spec/cat :start number? :end number? :step number?))
        :ret #(instance? clojure.lang.Seqable %))

~~~
georgewsinger
Interesting. I don't know what to think. It's a lot to read. But in some ways
adds more information than types (you can make reference to values). Maybe you
could make it more concise with more definitions? For example:

    
    
        (def UnboundedInterval (spec/cat))
        (def PositiveUnboundedInterval (spec/cat :end number?))
        (def ClosedInterval (spec/cat :start number? :end number?))
        (def StepClosedInterval (spec/cat :start number? :end number? :step number?))
        (def SeqableInstance #(instance? clojure.lang.Seqable %))
    

And then:

    
    
        (spec/fdef clojure.core/range
            :args (spec/alt
                    :infinite UnboundedInterval
                    :range-0  PositiveUnboundedInterval
                    :range    ClosedInterval
                    :step     StepClosedInterval)
            :ret SeqableInstance)
    

Perhaps this would be non-idiomatic, and perhaps I'm trying to make this into
something that looks like a static type annotation when it _isn't_.
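
For what it's worth, this kind of reuse is what spec's registry is for, except
that named specs are registered with `s/def` under namespaced keywords rather
than bound with plain `def`. A sketch (the function name is hypothetical):

```clojure
(require '[clojure.spec :as s])

;; Register the reusable piece under a namespaced keyword:
(s/def ::closed-interval (s/cat :start number? :end number?))

(defn my-range [start end] (range start end)) ; hypothetical example fn

;; The registered keyword can then name the :args spec directly:
(s/fdef my-range
  :args ::closed-interval
  :ret #(instance? clojure.lang.Seqable %))
```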

~~~
catnaroek
> (you can make reference to values)

Dependent types let you do just that as well. While theoretical computer
scientists for the most part focus on dependently typed systems capable of
expressing and proving any mathematical proposition (which obviously results
in increased complexity), there exist more pragmatic variants of dependent
types aimed at solving the everyday needs of programmers, with a minimum of
annotation burden, see:
[http://hackage.haskell.org/package/liquidhaskell](http://hackage.haskell.org/package/liquidhaskell)

------
macmac
clojure.spec is available as of 1.9.0-alpha1 via [org.clojure/clojure
"1.9.0-alpha1"] in your project.clj.

There is an excellent guide available here:
[http://clojure.org/guides/spec](http://clojure.org/guides/spec)

------
georgewsinger
Will there be performance hits to using this?

------
elwell
This site could really benefit from a custom stylesheet for printing. I had to
delete lots of nodes with devtools just to get a printable doc.

~~~
puredanger
Noted. Won't get to it anytime soon though.
[https://github.com/clojure/clojure-
site/issues/99](https://github.com/clojure/clojure-site/issues/99)

------
nikolay
I'm waiting for the day somebody will write a Clojure transpiler for a non-
Lispy syntax! Although I love Lisp (I used it daily 20+ years ago), the year
is 2016!

~~~
bad_user
Not sure what the year has to do with anything, but most things in computer
science were invented in the sixties and the seventies or even earlier and
since then we've only been refining them. In terms of programming languages,
the theoretical breakthroughs have been in static typing, though note that
even the Hindley-Milner type system was first described in 1969.

In other words, it's a strange thing to complain about a syntax that has stood
the test of time. After all, most mainstream languages (C, C++, Simula, Pascal,
Java, etc.) are Algol-based, a language that first appeared in 1958. And no,
you can't have LISP without its syntax, because that's what gives it its power.
The "_code is data_" mantra only works because Lisp is homoiconic, because it
basically has no meaningful syntax other than syntax for describing its
primitive data-structures, which makes its macros first class and its "eval"
natural.
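
The homoiconicity point can be made concrete in a few REPL lines: a Clojure
form is literally a list, so a macro is just a function from lists to lists
(`unless` below is a made-up example, not a core macro):

```clojure
;; A form is plain data: a list holding a symbol and two numbers.
(def form '(+ 1 2))
(first form) ; => + (the symbol)
(eval form)  ; => 3

;; A macro manipulates that data before compilation:
(defmacro unless [test then] (list 'if test nil then))
(unless false 1) ; => 1, expands to (if false nil 1)
```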

But in the LISP family Clojure is actually quite modern. It runs on both the
JVM and Javascript engines. It does have extra syntax for describing sets and
maps. Its data-structures are immutable and it encourages immutability to a
far larger degree than other LISPs. But it also has very good interoperability
with the underlying platform.

~~~
xearl
> And no, you can't have LISP without its syntax, because that's what's giving
> it power. The "code is data" mantra only works because Lisp is homoiconic

No. Homoiconicity is nice for "code is data" and macros, but not necessary.
Being able to reify and generate ASTs is sufficient. See for example Dylan,
Terra/Lua, Scala LMS, MetaML, or Julia's or Rust's macros.

~~~
bad_user
Not sure what you're arguing against. I stand by my claim.

I'm actively working with Scala's macros and played with Scala LMS. In fact
working with those convinced me of how much all LISP alternatives suck for
expressing macros.

Do you know why? Because in these languages macros are not first class, hence
(1) they tend to be an afterthought, (2) hard to use and (3) filled with bugs
because, wouldn't you know, they are targeting "library authors" and not
users.

Scala's macros for example are exposing the compiler's internals. And boy, I
can tell you stories. Like how it wants an "untypecheck" call on the AST if
you modify it, because those ASTs happen to be mutable (the horror), but then
"untypecheck" has a bug in it, crashing the compiler if you have an implicit
"unapply" in a pattern match, so you have to rewrite your AST first to get rid
of implicit "unapply" calls to work around it. Behold Sincron, an otherwise
simplistic library, for which it took me a whole month to achieve inlining
Function0 literal arguments, with help from others and copy/pasted code,
between cries that I crashed the compiler, again and again:
[https://sincron.org](https://sincron.org) - and btw, this rewrite for getting
rid of unapply was the still-dangerous shortcut I took, because the general
consensus at this point is that if you want to work around the bugs of
"untypecheck", you need to fix that AST manually.

You can't blame Scala too much really. This is definitely preferable to
hacking your own compiler plugin or to forking the compiler. And eventually
beautiful solutions can emerge from that.

The alternative to something like this is to expose a limited form of macros.
For example F# quotations, or Rust's macros, since you mentioned those. The
problem with these macro systems is that they are very limited, only
applicable to a narrow set of use-cases. Sorry, but I personally think Rust's
macros are next to useless as they needlessly complicate the language without
that big of a gain. And the irony is that macros are basically used for error-
handling in Rust. You know, if they added higher-kinded types, there wouldn't
have been such strong demand for those macros in the first place.

LISP is the only language (family) exposing macros to users.

~~~
steveklabnik
The standard library contains many macros that are not about error handling:
[http://doc.rust-lang.org/stable/std/#macros](http://doc.rust-
lang.org/stable/std/#macros) Only try! is, specifically. vec!, write!, panic!,
and println! are also very popular.

