
What Clojure Spec is and what you can do with it - icey
https://www.pixelated-noise.com/blog/2020/09/10/what-spec-is/
======
gw
I think the instrumenting and generator stuff gets disproportionate attention.
For me by far the biggest win from spec has been with parsing. This completely
changes how you'd write a library that takes a data structure and parses it
into something meaningful (for example, what hiccup does for html or what
honeysql does for sql).

In the past, this required a lot of very ugly parsing code and manual error-
checking. With spec, you write specs and call s/conform. If it failed, you get
a nice error, especially if you pair it with expound. If it succeeded, you get
a destructured value that is really easy to pull data out of. I've done this
in a half dozen different libraries and I'm pretty sure I wouldn't have even
written them without spec.
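
A minimal sketch of the parse-with-conform workflow described above. The `::element` spec is a hypothetical, heavily simplified hiccup-like shape chosen for illustration; it is not how hiccup itself is implemented.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical hiccup-like element: [tag attrs? & children]
(s/def ::element
  (s/cat :tag      keyword?
         :attrs    (s/? map?)
         :children (s/* (s/or :text    string?
                              :element ::element))))

(def parsed
  (s/conform ::element [:div {:class "box"} "hello" [:span "world"]]))

;; The conformed value is labelled with the names used in the spec,
;; so pulling data out is plain keyword access.
(:tag parsed)              ;; :div
(:attrs parsed)            ;; {:class "box"}
(first (:children parsed)) ;; [:text "hello"]

;; On failure s/conform returns :clojure.spec.alpha/invalid,
;; and s/explain-str describes which part did not match.
(s/invalid? (s/conform ::element ["div"])) ;; true
(s/explain-str ::element ["div"])
```

Pairing the failure branch with a library such as expound, as mentioned above, would only change how the explanation is printed.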

~~~
stingraycharles
and if you then combine s/conform with core.match, you can build really
elegant code to traverse these structures.

~~~
bgorman
Why not just pattern match immediately if you are going to bother with using
core.match?

~~~
stingraycharles
How would you pattern match with Clojure without using core.match? Do you mean
using another library?

Edit: oh I understand what you mean, you were thinking of skipping the
s/conform part, and use core.match directly. Personally, I consider spec an
excellent library to describe the shape of data, and core.match allows for
better describing the actions you want to take based upon that.

For example, with spec you can define a bunch of alternatives using s/or, and
then use core.match to then easily traverse the results.

It’s more a matter of separation of concerns to me. I don’t use core.match for
validation or describing shapes of data.

~~~
jmiskovic
Do you have to adapt spec definition to core.match somehow? Is there a
resource to read more how to use them together?

~~~
stingraycharles
No you don't have to do anything.

Let's say I have something like

    
    
      (s/def ::thespec (s/or :foo ::foo
                             :bar ::bar))
    

I can then use it in conjunction with core.match like

    
    
      (match [(s/conform ::thespec val)]
        [[:foo x]] (do-something-with x)
        [[:bar y]] (do-something-else-with y))
    

Which imho is an easier way to navigate these things.
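
For anyone who wants to try this at a REPL, here is a self-contained sketch of the same idea. It assumes `org.clojure/core.match` is on the classpath; the `::ref` spec and `describe-ref` function are made-up names for illustration.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.core.match :refer [match]])

(s/def ::id pos-int?)
(s/def ::name string?)

;; s/or conforms to a [tag value] pair naming the branch that matched.
(s/def ::ref (s/or :by-id ::id
                   :by-name ::name))

(defn describe-ref [v]
  (match (s/conform ::ref v)
    [:by-id id]  (str "lookup by id " id)
    [:by-name n] (str "lookup by name " n)
    :else        "not a valid ref"))

(describe-ref 42)    ;; "lookup by id 42"
(describe-ref "bob") ;; "lookup by name bob"
(describe-ref :nope) ;; "not a valid ref"
```

The spec stays responsible for the shape of the data, and the `match` form only decides what to do with each tagged branch.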

------
tekacs
As a very heavy user of spec, I’ve since switched to using Malli [0], which is
similar but uses plain data to model specs (and doesn’t use Clojure Spec at
all).
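
To illustrate what "plain data" means here, a small sketch assuming `metosin/malli` is on the classpath. The `Person` schema is a hypothetical example.

```clojure
(require '[malli.core :as m])

;; The schema is an ordinary vector: it can be stored, printed,
;; or built at runtime like any other Clojure data.
(def Person
  [:map
   [:name :string]
   [:age  :int]])

(m/validate Person {:name "Ada" :age 36})   ;; true
(m/validate Person {:name "Ada" :age "36"}) ;; false
```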

Also, Malli is being funded / supported by Clojurists Together [1], which is a
wonderful initiative that’s also worth a look.

[0]: [https://github.com/metosin/malli](https://github.com/metosin/malli)

[1]: [https://www.clojuriststogether.org/news/q3-2020-funding-announcement/](https://www.clojuriststogether.org/news/q3-2020-funding-announcement/)

------
jwr
I found spec very useful and use it more and more. I'm looking forward to
newer revisions with support for optionality, which has been a big problem area
in my case.

Here's a quick list of gotchas (well, they got me, so perhaps other people
will find this list useful):

* `s/valid?` does not actually tell you that the data is valid

The naming of `s/valid?` suggests that you can call it on your data and find
out if the data is valid according to the spec. This isn't true. What it
actually tells you is if the data, _when conformed_ , will be valid according
to the spec. If you pass the data as-is to your functions (without passing it
through `s/conform`), you might find that they will be surprised at what they
get.

* Conformers are likely not what you think. They are not intended for coercion and have many pitfalls (for example, multi-specs dispatch on unconformed value).

* s/merge doesn't necessarily do what you wanted if you're using conformers, only the last spec passed to merge will be used for s/conform (but you're not using conformers, right?)

* specs are checked eagerly. If you think that (s/valid? ::my-spec x) will only check ::my-spec, that is not the case. It will check any key of x found in the spec registry.

I settled on a subset of spec, because of the pitfalls.
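
Two of the gotchas above can be reproduced with a short sketch. The `::port` spec with its coercing conformer is a hypothetical example of the pattern being warned against, not recommended usage.

```clojure
(require '[clojure.spec.alpha :as s])

;; Gotcha: with a coercing conformer, s/valid? answers for the
;; conformed value, not for the value you actually hold.
(s/def ::port
  (s/and string?
         #(re-matches #"\d{1,5}" %)
         (s/conformer #(Long/parseLong %))
         pos-int?))

(s/valid? ::port "8080")  ;; true, yet "8080" is still a string
(s/conform ::port "8080") ;; 8080
(pos-int? "8080")         ;; false

;; Gotcha: s/keys also checks every namespace-qualified key that has a
;; registered spec, even when the spec being checked never mentions it.
(s/def ::name string?)
(s/def ::age int?)
(s/def ::person (s/keys :req [::name]))

(s/valid? ::person {::name "Ada"})                 ;; true
(s/valid? ::person {::name "Ada" ::age "unknown"}) ;; false, because of ::age
```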

~~~
didibus
I think all of those problems are only true when you try to do coercion with
custom conformers. Which is not the intended use of conformers.

Conformers are meant to parse the data when there are multiple possibilities
for what something can validate against; the conformer will disambiguate and
return a result that tells you which path was chosen.

Coercion is not supported as part of Spec, you're expected to do that
separately either before or after validating/conforming.
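
One way to read that advice is sketched below: coercion lives in an ordinary function and the spec only validates. `coerce-person` is a hypothetical helper, not part of spec.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::age pos-int?)
(s/def ::person (s/keys :req-un [::age]))

;; Hypothetical coercion step, kept outside the spec:
;; turn a string :age into a number, leave everything else alone.
(defn coerce-person [m]
  (cond-> m
    (string? (:age m)) (update :age #(Long/parseLong %))))

(s/valid? ::person {:age "42"})                 ;; false: raw input
(s/valid? ::person (coerce-person {:age "42"})) ;; true: coerced first
```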

~~~
jwr
Yes, that was largely my point. Many people (me included) assume that
s/conform is a kind of coercion, which it is not.

Not all of the above problems are due to coercion, but the majority are.

To be clear: I'm not complaining here, I find spec to be very useful and I
like it, just pointing out traps for the unwary.

~~~
didibus
Ya fair enough, I've definitely seen a lot of people think conforming is meant
for coercion. But it's not, it's only meant for disambiguating the chosen path
(when multiple are possible) for validation.

------
hospadar
We use spec A LOT to validate a huge config file/DSL thing for our internal
ETL application.

My general feelings are:

\- spec is great, you should use it, the composability and flexibility are
awesome features.

\- I've never once used conformers - maybe I just don't "get it" (which if so,
speaks badly of them I think since I've been heavily using spec for years),
but the use cases for them seem strange to me and I feel they cause more
confusion than they're worth. I wish they were separated out into more
separate/optional functionality.

\- It's SO MUCH more powerful than things like JSON schema, but that comes at
the cost of portability - there's no way we could send our spec over the wire
and have someone else in a different environment use it. But also, there's no
way we could implement some of our features in a tool like JSON Schema [and
have it be portable] ("Is the graph represented by this data-structure free of
cycles?" "When parsed by the spark SQL query parser, is this string a
syntactically valid SQL query?").

\- Being able to spec the whole data input up front has saved hundreds of
lines of error-checking code and allows us to give much better errors up-front
to our users and devs

\- Spec has a lot of really cool features for generative testing, but we
rarely use them since we've implemented lots of complex specs where it's not
really practical to implement a generator (i.e. "strings which are valid sql
queries" or "maps of maps of maps which when loaded in a particular way meet
all the other requirements and are also valid DAGs"). I feel torn about this
because the test features are great, but the extreme extensibility of spec is
what I love most about it. I haven't often found a scenario where I actually
have a use for the generative features (either the data is so simple I don't
need them, or so complex that they don't work).
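
As an illustration of the "arbitrary predicate" point, here is a sketch of a spec that requires a dependency graph to be free of cycles. The graph representation (a map from node to the set of nodes it depends on) and the `acyclic?` helper are assumptions for the example, not the commenter's actual code.

```clojure
(require '[clojure.spec.alpha :as s])

(defn acyclic?
  "True when graph (a map of node -> set of nodes it depends on) has no cycle.
  Repeatedly removes nodes whose dependencies have all been removed already."
  [graph]
  (loop [g graph]
    (if (empty? g)
      true
      (let [ready (for [[node deps] g
                        :when (not-any? #(contains? g %) deps)]
                    node)]
        (if (empty? ready)
          false ;; every remaining node waits on another remaining node
          (recur (apply dissoc g ready)))))))

;; Shape check first, then the whole-structure predicate.
(s/def ::dag
  (s/and (s/map-of keyword? (s/coll-of keyword? :kind set?))
         acyclic?))

(s/valid? ::dag {:load #{} :clean #{:load} :report #{:clean :load}}) ;; true
(s/valid? ::dag {:a #{:b} :b #{:a}})                                 ;; false
```

A spec like this validates fine, but spec cannot derive a useful generator for it, which matches the trade-off described above.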

~~~
dimitrios1
Your feeling on there being no way to transport this over the wire is puzzling
to me but I admit I don't have all the details. My feeling is why not? Surely
if we have wire formats for self describing binary objects that can then be
serialized into an in memory structure, transporting a spec shouldn't be
harder than that?

~~~
joshlemer
Specs in the general case require code execution, so you'd essentially need to
execute that (untrusted) clojure on the other end of the wire.

~~~
dimitrios1
Again, apologies if this sounds ignorant, but we have pretty standard
practices now for sandbox execution of untrusted code. A LISP seems especially
suitable for this type of task.

~~~
joshlemer
I don't have a clue how you would implement this. The difference in
portability between spec and something like json-schema/protobuf/avro is that
you can serialize the schema in these and then clojure and (say) python, go,
java, C#, JavaScript applications can talk to one another.

How would you propose to serialize Clojure specs and use them from a python
app? Port the clojure compiler to python?

~~~
hospadar
I second this emotion - "How would you check a spec from anything other than
clj/cljs" is (IMO) the critical question here. Sure you could check out my git
repo in a safe VM and execute it there, but that's a WHOLE lot more hassle
than an XML or JSON schema. It's not just a language barrier thing.

There's nothing stopping spec predicates from making network calls, looping
forever, etc. If I wanted to be able to call my spec from other apps I'm
writing, I could package it as a library easily, but a workflow like rest
call->get spec->validate data (which I've implemented many times for JSON
schema for simpler things) wouldn't really be practical with spec (without at
least setting some really tight restrictions on what features of spec you're
allowed to use)

Again, not really a failing of spec, it's just not designed for that kind of
workflow.

~~~
slifin
Yeah spec is quite a general thing, there are libraries which convert specs
into database schemas, JSON schema, swagger etc

If communicating constraints to another environment is required they should
help

------
Jeaye
I strongly recommend that anyone using spec for validation should check out
Orchestra, to instrument the functions and have them automatically validated
on each and every call:
[https://github.com/jeaye/orchestra](https://github.com/jeaye/orchestra)

For my team, generators and parsing are basically useless with spec. We just
don't use them. But describing the shape of data and instrumenting our
functions, using defn-spec, to ensure that the data is correct as it flows
through the system is exactly what we want and nothing I've seen in Clojure
land does it like spec + Orchestra can.

I think part of this may boil down to different types of testing. We primarily
use functional testing, especially for our back-end, so we're starting from
HTTP and hitting each endpoint as the client would. Then we ensure that the
response is correct and any effects we wanted to happen did happen. This is
much closer to running production code, but we do it with full
instrumentation. Being able to see an error describing exactly how the data is
malformed, which function was called, and what the callstack was is such a
relief in Clojure.

    
    
        Call to #'com.okletsplay.back-end.challenge.lol.util/provider-url did not conform to spec.
    
        -- Spec failed --------------------
    
        Function arguments
    
          (nil)
           ^^^
    
        should satisfy
    
          (->
           com.okletsplay.common.transit.game.lol/region->info
           keys
           set)
    
        -- Relevant specs -------
    
        :com.okletsplay.common.transit.game.lol/region:
          (clojure.core/->
           com.okletsplay.common.transit.game.lol/region->info
           clojure.core/keys
           clojure.core/set)
    
        -------------------------
        Detected 1 error
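
For context, a minimal sketch of how a failure like the one above comes about, using only the built-in `clojure.spec.test.alpha`. The names are hypothetical and only loosely modeled on the output shown; this is not the actual code behind it.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest])

;; Hypothetical lookup table; the spec is simply the set of its keys.
(def region->info {:na  {:host "na.example.invalid"}
                   :euw {:host "euw.example.invalid"}})

(s/def ::region (-> region->info keys set))

(defn provider-url [region]
  (str "https://" (get-in region->info [region :host])))

(s/fdef provider-url
  :args (s/cat :region ::region)
  :ret  string?)

;; Wrap the var so every call validates its arguments against :args.
(stest/instrument `provider-url)

(provider-url :na) ;; "https://na.example.invalid"

;; (provider-url nil) now throws an ExceptionInfo explaining that nil
;; is not a valid ::region, instead of quietly returning "https://".
```

The built-in `instrument` checks only `:args`. According to its README, Orchestra's `orchestra.spec.test/instrument` is meant as a drop-in replacement that also checks `:ret` and `:fn`.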

~~~
didibus
I'm onboard with orchestra because I tend to be lazy. But I also want to
explain the rationale for why Spec's instrumentation only instruments the input.

This has to do with the philosophy behind it: what you would do if you wanted
to write bug-free programs, and I mean, if you cared A LOT about software
correctness.

The idea in that case will be that all your functions will have a set of unit
tests and generative tests over them that assert that for most possible
inputs they return the correct output.

Once you know that, you know that if provided valid input, your functions are
going to return valid output. Because you know your function works without any
bugs.

Thus, you no longer need to validate the output, only the input. Because as I
just said, you now know that any valid input will result in your code
returning valid output as well. So re-validating the output would be
redundant.

And this goes one further. After you've thoroughly tested each function, now
you want to test the integration of your functions together. So you'd
instrument your app, and now you'd write a bunch of integration tests (some
maybe even using generative testing), to make sure that all possible input
from the user (or external systems if intended for machine use) will result in
correct program behavior and an arrangement of functions that all call each
other with valid input.

Once you've tested that, you now also know that the interaction/integration of
all your functions work.

At this point you are confident that given any valid user input, your program
will behave as expected in output and side-effect.

You can thus now disable instrumentation.

But before you go to prod, you need one more thing, you have to protect
yourself against invalid user input, because you haven't tested that and don't
know how your program would behave for it. Thus with Spec, you add explicit
validation over your user input, which rejects invalid input from the user at
the boundary.

You now know three things:

1\. All your individual functions given valid input behave as expected in
output and side-effect.

2\. Your integration of those functions into a program works for all given
valid user input.

3\. Your program rejects all invalid user input, and will only process valid
user input.

Thus you can go to prod with high confidence that everything will work without
any defect.

\---

Now back to orchestra. Orchestra assumes that you weren't as vigilant as I
just described, and that you might not have tested each and every function, or
that you only wrote a small number of tests for them which only covered a small
range of inputs. Thus it assumes that, when you go towards running
functional/integ tests, you probably want to continue to assert that the output
of each function is still valid, as you anticipate those tests will create
inputs to functions that your tests over that function did not cover.

~~~
Jeaye
Something like Haskell, or even Rust, requires similar vigilance in order to
get the program even into a working state. With thorough, strong, static type
checkers, novel borrow checkers, and more, a lot of development time is spent
up front, dealing with compiler/type errors. Thus you can go to prod with high
confidence that everything will work without any defect.

Now, back to Clojure. Clojure assumes that you weren't as vigilant as I just
described, and that you don't have static type checking for each function, or
that you don't have a fixed domain for all of your enums. Thus it is assumed
because of that, probably when you go running toward testing (unit,
functional, or otherwise), you want to assert the validity of all of this
data.

My point in re-painting your words is that we all trade certain guarantees in
correctness for ease of development, maintainability, or whatever other
reasons. Developers may choose Clojure over Haskell, for example, because
maintaining all of that extra vigilance is undesirable overhead. Similarly,
developers may reasonably choose not to unit test every single function in the
code base, but instead functionally test the public endpoints and unit test
only certain systems (such as the one which validates input for the public
endpoints), because maintaining all of that extra vigilance is undesirable
overhead.

~~~
codygman
Also note that if you try thinking with types, you may start seeing them as
tools rather than overhead.

A good blog post about this is:

[https://lexi-lambda.github.io/blog/2020/08/13/types-as-axioms-or-playing-god-with-static-types/](https://lexi-lambda.github.io/blog/2020/08/13/types-as-axioms-or-playing-god-with-static-types/)

------
reitzensteinm
My pet project is a partial evaluator for Clojure code that uses data
generated from spec to fuzz code and optimize it. The coverage is accepted as
complete, so there are no guards and deoptimizations like you'd have in a JIT,
the programs are just wrong.

It seems like a fairly powerful technique, although you couldn't ever rely on
it with production code. After several years of tinkering I managed to get a
Forth interpreter written in Clojure executing a specific input string
partially evaluating down to OpenGL shader code, to hardware accelerate my
friend's Stackie experiment (link to his version below).

(nth (sort [0 n 5]) 1) where sort is a merge sort also successfully compiles
down to just the branches you'd hand optimize it to, which is Graal's party
trick. Although they're solving the problem in a bulletproof general way, so
the difficulty is incomparable.

The eventual goal is to write Clojure in Clojure without it being horrendously
inefficient.

[http://blag.fingswotidun.com/2015_09_16_archive.html](http://blag.fingswotidun.com/2015_09_16_archive.html)

------
dustingetz
At Hyperfiddle we are using spec to specify UIs, so e.g. (s/fdef foo :args
(s/cat :search-needle string? :selection keyword?) :ret (s/coll-of (s/keys
...))) describes exactly what you need to render a masterlist table with some
query parameters. It's being used with pilot customers, now in production.
This stuff works!
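
To show why an `s/fdef` carries enough information for this, here is a sketch that reads the named arguments back out of a registered function spec. It is only an illustration of introspecting specs with `s/get-spec` and `s/form`, not Hyperfiddle's implementation; `search` and `arg-names` are made-up names.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical query function standing in for `foo` above.
(defn search [search-needle selection] [])

(s/fdef search
  :args (s/cat :search-needle string? :selection keyword?)
  :ret  (s/coll-of map?))

(defn arg-names
  "Returns the argument labels of an s/cat :args spec, in order."
  [fn-sym]
  (->> (s/form (:args (s/get-spec fn-sym)))
       rest          ;; drop the leading `cat` symbol
       (partition 2) ;; [label predicate] pairs
       (map first)))

(arg-names `search) ;; (:search-needle :selection)
```

A UI layer could use those labels and predicates to decide which input widgets to render for the query parameters.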

------
krn
Malli[1] and specmonstah[2] are my favourite Clojure(Script) libraries built
on top of Clojure Spec.

[1] [https://github.com/metosin/malli](https://github.com/metosin/malli)

[2]
[https://github.com/reifyhealth/specmonstah](https://github.com/reifyhealth/specmonstah)

~~~
elwell
+1 for Specmonstah; been very useful for us at Reify Health.

------
kmclean
I find spec really interesting but struggle to find a good fit for it in my
projects. I end up getting bogged down writing generators and give up, or end
up breaking down functions into such small pieces that speccing every one of
them results in an unreasonable proliferation of specs and generators. This
was an interesting and helpful overview of some different ideas about how to
take advantage of spec, so thank you!

Also I definitely share your feelings about clojure massively increasing my
job satisfaction and teaching me how to think about programming in a new (and
better tbh) way.

~~~
jetti
We use spec at work for validating data passed to our HTTP routes in our
luminus web application. When we get data as JSON we use spec to validate that
there are the required fields being passed in and if it isn't valid we send
back an error HTTP status code. It is nice to be able to easily see what
fields are required without having to dig deep into code.
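
A sketch of that kind of boundary check, assuming Ring-style request and response maps. The `::signup` spec and `handle-signup` handler are hypothetical and do not rely on any Luminus-specific API.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::email (s/and string? #(re-matches #"[^@\s]+@[^@\s]+" %)))
(s/def ::name  string?)

;; The required fields are visible in one place.
(s/def ::signup (s/keys :req-un [::email ::name]))

(defn handle-signup
  "Validates already-parsed JSON params and rejects bad input up front."
  [params]
  (if (s/valid? ::signup params)
    {:status 200 :body "ok"}
    {:status 400 :body (s/explain-str ::signup params)}))

(:status (handle-signup {:email "ada@example.invalid" :name "Ada"})) ;; 200
(:status (handle-signup {:name "Ada"}))                              ;; 400
```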

------
john-shaffer
If you're considering using spec, you should know that spec alpha2 is
significantly different internally, and is much easier to write tooling for,
but has not been ported to ClojureScript. I would highly recommend trying
alpha2 if you're not using CLJS, even if it's a little less stable.

[https://github.com/clojure/spec-alpha2](https://github.com/clojure/spec-alpha2)

------
amgreg
> Still alpha: spec is still alpha and it looks like it will remain like that
> since the intention is for spec2 to completely replace it in the future.

Out of curiosity for the folks in the know: why is this the state of affairs?
Why has the library been in alpha for so long, and why is it already being
replaced?

~~~
parsimoniousplb
I remember Rich mentioning this in a talk. There's one problem he considers
unresolved and he wouldn't finalize the API until he had a good solution.
Unfortunately, I can't remember what it was. Also, this must have been more
than a year ago, so whatever goes on in spec2 might have evolved since then.

------
fnord77
For simple typing, does this seem more complex than plumatic/schema?

~~~
vemv
I have used both Spec and Schema professionally.

It's not exactly a secret that Spec is not _intended_ to be a type-like
system. But it turns out, it's perfectly possible to use Spec as a building
block to do so!

As for complexity, I think instrumentation (Spec's approach) _is_ complex.
People routinely struggle with it, in fact you must use an external lib
(Orchestra) to fully enable it. It also slows down execution, sometimes
disproportionally.

Whereas Schema simply has a global toggle that also can be overridden in a
fine-grained manner. It's explicitly made to be fast and have a type-like use.

As for type definitions, honestly both are elegant and powerful; you can quite
easily use arbitrary predicates as "types" with both, and compose those
predicates. Ultimately Spec is better designed because it fosters
namespace-qualified keywords which compose better.

For addressing the instrumentation problem, I created
[https://github.com/nedap/speced.def](https://github.com/nedap/speced.def)
which uses Clojure's :pre system. I have used :pre for over a decade so I know
it has an extreme simplicity. Spec can take :pre's usefulness to the next
level.
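
For readers unfamiliar with `:pre`, a sketch of combining it with spec using only core Clojure. This is plain `:pre`/`:post`, not the speced.def API; `double-quantity` is a made-up function.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::quantity pos-int?)

(defn double-quantity [quantity]
  ;; Conditions are checked on every call while *assert* is true.
  {:pre  [(s/valid? ::quantity quantity)]
   :post [(s/valid? ::quantity %)]}
  (* 2 quantity))

(double-quantity 3) ;; 6

;; (double-quantity 0) throws an AssertionError naming the failed condition.
```

Because the checks are part of the function definition, there is no separate instrumentation step to switch on.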

------
tanrax
Very interesting

