
“Transducers” by Rich Hickey at Strange Loop [video] - sgrove
https://www.youtube.com/watch?v=6mTbuzafcII
======
hawkice
I enjoyed the discussion of the types. I dig haskell (and clojure), and I
think this is perhaps the perfect lens with which to view how to make choices
between them. You can have an insanely complex typesafe haskell transducer, a
still very complex but unsafe haskell transducer, a weaker and less flexible
version of transducers with a simpler type encoding in haskell, some
combination of those ideas, OR...

you just test your code out in the repl while developing in clojure and just
kinda rely on the fact that core infrastructure or popular libraries will
generally work.

~~~
lomnakkus
Personally, I don't think the gist posted by tel qualifies as "insanely
complex" by any stretch.

But a more serious question is: how do you know if "will generally work"
applies to your particular case? The short answer is that you don't and you
end up having to test library-provided functionality as part of your own test
suite. You can argue that you'd have to do the same in e.g. Haskell, but
really the surface area of potential failures is _hugely_ reduced by the fact
that you _know_ whether a function "f: a -> b" can have side effects and that
given something shaped "a" it _will_ give you something shaped "b" (modulo
termination, but that applies in any non-total language).

This is not merely a theoretical issue: Compare the amount of documentation
you need when using some random library in Haskell vs. e.g. JavaScript. The
types act as compiler-checked documentation. In JavaScript it's really a
crapshoot whether the library has sufficient documentation, and it's always
specified in an ad-hoc manner (which naturally differs from library to
library).

~~~
hawkice
I think if you sit down with the code clojure uses to implement transducers
and compare it directly against the haskell code and _don't_ come to the
conclusion that the clojure code is simpler, there might be some motivated
cognition at work. I don't think it's trivial for someone with Haskell
experience to even verify that code implements the same (or a similar) thing.

I also believe it lacks sufficient polymorphism, for instance surrounding the
output of the step function, and lacks safety around (notably, but not
exclusively) the application of the step function (i.e. to only the last value
of the transducer, not just something of that type). So this would be squarely
in the tries-to-be-simple category, despite its use of profunctors (I don't
know why that was used here; it's not a super-standard abstraction).

But this is all beside the larger point. Things generally working is learned
through philosophical induction in the case of clojure -- just seeing
something work a bunch of times and understanding the mechanisms in some level
of detail. That's not the same as having a machine-verified proof, but it's
also not the same as not knowing at all.

~~~
lomnakkus
It depends on what you mean by "simpler", I suppose.

> there might be some motivated cognition at work.

Did you really just say that?

> I don't think it's trivial for someone with Haskell experience to even
> verify that code implements the same (or a similar) thing.

No, not to verify that it does the same thing. For that you'd have to
understand exactly what the Clojure version does too. I'm quite rusty on
Clojure, so I can't make a fair comparison on how easy it is to understand vs.
the Haskell version. However, you'll note that I didn't actually say anything
about the Clojure version being harder to understand.

In fact, my point is that I don't even _have_ to understand the implementation
of the Haskell version: I just have to understand its interface (i.e. the
types) and have a general idea of what it's supposed to do (in fuzzy terms).

------
tel
First, to be clear, I really liked this presentation. The criticism below is
both technical and small---all in all I greatly enjoy Rich Hickey's work and
generally admire his ability to talk compellingly about complex technical
topics.

That said.

I somewhat disliked Hickey's presentation of typing transducers here. I feel
as though he builds a number of strawmen toward typing and then tries to knock
them down and suggest that either Clojure has some kind of mystical mechanism
that is ineffable to types or that the exercise of typing transducers is
wasteful. I disagree on both counts, I suppose. I think types are useful for
analysis and teaching.

The two major points he seems to make are that in order to "properly type"
transducers you must

    
    
        1. Index the type of the "accumulation so far" so 
           that it cannot be transformed out-of-order
        2. Implement early stopping "without wrapping anything
           except for the reduced value"
    

There may be other critiques as well, but I want to examine these two in the
context of Haskell.

With respect to the first point, the major concern appears to be prohibiting
behavior loosely described as "applying the reducing function, say, 3 times
and then returning the first resulting accumulation". In some sense, the idea
is to force us to be faithful in passing on the accumulating parameter. In
code, a pathological setting is the following:

    
    
        transduce :: (r -> a -> r) -> (r -> a -> r)
        transduce reduce acc0 a =
          let acc1 = reduce acc0 a
              acc2 = reduce acc1 a
              acc3 = reduce acc2 a
          in  acc1
    

The concern is unfounded in a pure language, however, since calling `reduce`
can have no side effects. This entails that all possible effects on the world
of calling `reduce` are encapsulated in the return and, therefore, we can
completely eliminate the steps producing `acc2` and `acc3` without worry.

    
    
        transduce :: (r -> a -> r) -> (r -> a -> r)
        transduce reduce acc0 a =
          let acc1 = reduce acc0 a
          in  acc1
    

Now, there may be concern here that we still want to index the `r` type
somehow to allow for changes of accumulation to occur. This is not the case
(in _this_ simple model!) as in order to achieve the "baggage carrier
independence" property the `r` type must be left unspecified until the
transducer is actually applied. The cleanest way to do that is to use a
higher-rank type (Hickey mentions these briefly and offhandedly toward the end
of his talk)

    
    
        type Transducer a b = forall r . (r -> b -> r) -> (r -> a -> r)
    

which thus prohibits the implementer of a Transducer from affecting the values
of `r` in any way whatsoever---they must be left anonymous until someone
decides to _use_ the Transducer on a particular collection of values `a`.
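To make the higher-rank encoding concrete, here is a minimal sketch. The names `mapping`, `filtering`, and `transduceList` are illustrative, not from any library:

```haskell
{-# LANGUAGE RankNTypes #-}

-- The universally quantified r keeps the implementer from touching
-- the accumulator in any way except passing it to the inner reducer.
type Transducer a b = forall r. (r -> b -> r) -> (r -> a -> r)

-- Lift an ordinary function into a transducer: transform the element,
-- then hand it (and the untouched accumulator) to the inner reducer.
mapping :: (a -> b) -> Transducer a b
mapping f reduce acc a = reduce acc (f a)

-- Forward only elements satisfying the predicate; otherwise return
-- the accumulator unchanged (the only other thing `r` permits).
filtering :: (a -> Bool) -> Transducer a a
filtering p reduce acc a = if p a then reduce acc a else acc

-- Only at the use site is `r` pinned down (here, to [b]).
transduceList :: Transducer a b -> [a] -> [b]
transduceList t = reverse . foldl (t (flip (:))) []
```

For example, `transduceList (mapping (+1) . filtering even) [1..6]` gives `[2,4,6]`: the composition reads left to right over the data, so elements are incremented before they are filtered.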

(It must be noted that the model given above is isomorphic to a "Kleisli arrow
on the list monad" which I described a little bit here
[http://jspha.com/posts/typing-transducers/](http://jspha.com/posts/typing-transducers/).
It should also be noted that this model includes neither (a)
the ability to use local state to capture time-varying transductions or (b)
the ability to terminate early)

With respect to the second point, I'd like to suggest that there is a
difference between the semantic weight of wrapping the result types in an
Either in order to indicate early termination and the implementation weight. I
completely agree that using an Either to implement early stopping (as it's
easy, if finicky for the library implementor, to do) will involve wrapping and
unwrapping the "state" of the transduction continuously. I also would like to
suggest that it's a very natural way of representing the "accumulation |
reduction" notion Hickey uses in his own "Omnigraffle 8000 type system".

We really would like to capture the idea of the transducer state as being
"either" in-progress or fully-reduced and act accordingly. If Clojure's
implementation of that requires fewer runtime tags than an Either, so be it,
but I personally fail to see a semantic difference except in the way one can
play fast-and-loose with dynamic types over static types to begin with.
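One way to sketch that "in-progress | fully-reduced" idea with Either (a sketch of the encoding discussed above, not Clojure's implementation; all names here are illustrative):

```haskell
{-# LANGUAGE RankNTypes #-}

-- Left marks a finished ("reduced") accumulator, Right one still in
-- progress.
type ReducerE r a = r -> a -> Either r r

type TransducerE a b = forall r. ReducerE r b -> ReducerE r a

-- A left fold that honors early termination: a Left short-circuits
-- the remaining input.
foldE :: ReducerE r a -> r -> [a] -> r
foldE step = go
  where
    go acc []     = acc
    go acc (x:xs) = case step acc x of
                      Left done  -> done       -- stop: ignore xs
                      Right acc' -> go acc' xs

-- Stateless transducers only need to thread the Either through.
filteringE :: (a -> Bool) -> TransducerE a a
filteringE p step acc a = if p a then step acc a else Right acc

-- A base reducer that stops once the running sum reaches a limit.
sumUpTo :: Int -> ReducerE Int Int
sumUpTo limit acc x
  | acc + x >= limit = Left (acc + x)
  | otherwise        = Right (acc + x)
```

Here `foldE (filteringE even (sumUpTo 10)) 0 [1..100]` returns 12 (2 + 4 + 6) without touching the remaining elements; the wrapping and unwrapping is confined to `foldE` and the reducers, which is exactly the implementation weight conceded above.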

\---

So, I gave above an implementation of Transducers in types which has some of
their properties, but certainly not all. In fact, I abused the fact that there
is no ambient state in Haskell in order to ensure that a certain property
would hold (notably this doesn't require a type system at all, just purity). I
also argued that using Either is a perfectly natural way to implement early
termination in such a transduction pipeline.

I've also made an extension to the `(r->b->r) -> (r->a->r)` mechanism which
enables local state for various components of the transduction pipeline. A
version without early termination is available here:

[https://gist.github.com/tel/714a5ea2e015d918f135](https://gist.github.com/tel/714a5ea2e015d918f135)

Notably, this uses most of the same typing tricks as `(r->b->r) -> (r->a->r)`
but adds a "reduction local hidden state" variable which lets us implement
`take` and `partition`. This takes Hickey's notion of needing to be explicit
about the state being used to a whole new level.

\---

So what is the point of all this?

I'd like to argue that Transducers do not present such a mysterious mechanism
that they cannot be sanely typed in a reasonably rich language. I believe that
I can capture most of their salient features in types without using the
dependent indexing Hickey suggested was necessary.

More than this, the compartmentalized, hidden reducer-local state in the Gist
implementation allows for each reduction step to include fairly exotic local
states in their state machine. You could implement a kind of type indexing
here if desired and no end-user would ever know of its existence.

I also absolutely concede that many type systems people regularly use could
not achieve this kind of encoding.

Finally, what I really want to say is that type systems are not something to
be denigrated. I believe some of the earliest "transducers v. types"
argumentation took a nasty turn as amateur type theorists (myself included)
rushed to write things like "Transducers are just X".

I want to apologize for any kind of bad feelings my own writing in that thread
may have stirred up. I try not to be haughty or dismissive with this kind of
writing, but I also make mistakes.

So what I'd really like to suggest is that types should not be taken as
reductivist on interesting techniques like Transducers but instead as a tool
for analyzing their construction and either improving it or better teaching
it. Hickey himself often turns to some kind of "pseudotyping" to talk about
how Transducers work---formalizing those notions should only lead to greater
clarity.

Of course, implementations will differ in small ways. As I've noted abundantly
here, a major difference between the Haskell and Clojure implementations is
driven more by Haskell's purity than its typing. Hopefully, however,
exploration of alternative implementations and the rich analysis produced by
their typing can help to introduce new ideas.

For instance, the Gist implementation, if you strip the types away, is an
interesting divergence in raw functionality from Clojure Transducers---if
Clojure Transducers are "reduction function transformers" then the Gisted
Transducers are "Moore-style (Infinite) State Machine transformers" and that
difference allows the implementer to be extra explicit about the use of local
state.

I'd rather see discussion about whether such InFSM transformation techniques
have a place in Transducers literature than a fight over whether or not it's
possible or reasonable to "type transducers".

~~~
nickik
> higher-rank type

I'm not a type person, so here is my question: how many languages other than
Haskell have this?

~~~
ufo
Any dynamic language should do, kind of :)

For example, the following code needs Higher Ranked Types to typecheck:

    
    
        id :: forall t. t -> t
        id x = x
    
        -- mk_pair's type is a rank-2 type, because its
        -- f parameter is a polymorphic function that gets applied
        -- to different types inside mk_pair.
        mk_pair :: (forall t. t -> t) -> (Int, String)
        mk_pair f = (f 10, f "str")
    
        a_pair :: (Int, String)
        a_pair = mk_pair id
    

But it can obviously be written in an untyped language if you wanted to. Just
erase all those type annotations:

    
    
        function id(x){ return x}
    
        function mk_pair(f){ return [f(10), f("hello")] }
    
        var a_pair = mk_pair(id);
    

The deal with higher-ranked types is basically a tradeoff between the
flexibility of the type system and the power of type inference. If you
restrict yourself to rank-1 types (that is, none of your function arguments
are polymorphic) then the compiler can infer all types in the program
without you having to write any type signatures. If you want to use higher
ranked types then the compiler can't infer everything anymore so you might
need to add some type annotations yourself.

~~~
tel
The "kind of" being really pertinent. In particular, this forgoes the
guarantee that the only thing the transducer is allowed to do is return a
value from the reduction. Hickey notes that this is a law of writing a proper
transducer. Higher rank types ensure that this law is properly encoded
directly into the system.

That said, I'm pretty sure you know this :)

------
rovjuvano
Correcting two flaws with this leads to an interesting result: 1) no error
handling, at all, anywhere, not just missing from the presentation, but not in
the code. transduce just wraps reduce. FAIL!!! 2) the 'result' parameter
unnecessarily pollutes the interface. reduce can be reduced to foreach (doseq)
with state and be implemented like other transducers that require state.

Correcting for these two errors: a) the 0-arity init function is removed b)
the 1-arity completed function becomes 0-arity c) the 2-arity reducing
function becomes 1-arity d) a 1-arity error function is added. Since these
functions cannot be distinguished by arity alone, we give them names: call the
reducing function 'onNext', the error function 'onError', and the completed
function 'onCompleted', and optionally, group the three functions into an
object and voila, we have Rx Observers.

Hickey's twist here is composing Observers instead of Observables. Whether
this buys you anything worthwhile is debatable.

Two derivations of Rx often accompany its introduction: 1) Observable being
the dual of Iterable 2) Observables being Promises with multiple return
values. Thanks to Hickey, we can add Observers being an abstraction from
reduce/fold (along with its many other names).

------
raspasov
Great talk by Rich on transducers, instrumental in understanding the "hows"
and "whys" behind the concept.

------
kazagistar
I'm still a little confused and will have to go over some code or something to
really understand the limitations of what transducers can do... can any
transducer be used in a "parallel" context (like map and filter) or are they
limited to a linear context (like the fold makes me suspect)?

~~~
mistaken
I think it could be parallelized by chunking up the input; however, the
transducer doesn't know how to do that by itself, so the step function would
have to be modified somehow.

~~~
jjcomer
Exactly! core.async had the pipeline mechanism added which allows for the
parallel application of a transducer to a channel.

[https://github.com/clojure/core.async/blob/17112aca9b07ebba6...](https://github.com/clojure/core.async/blob/17112aca9b07ebba6ce760ca01d117c24c80cc9a/src/main/clojure/clojure/core/async.clj#L500)

------
atratus
Removing conj is what finally made it click

------
scythe
I threw together a toy implementation in Lua:

[https://gist.github.com/scythe/d28c3f4933ff2f1e5c47](https://gist.github.com/scythe/d28c3f4933ff2f1e5c47)

Granted, none of the cool out-of-order-iteration is there, but the reverse
composition looks natural to me now, so I can sleep at night.

~~~
GordonFreeman
Took me a while to understand the reverse composition too so I hacked on it in
Python

[https://gist.github.com/colinsmith/3140e7cb57c4095ed83f](https://gist.github.com/colinsmith/3140e7cb57c4095ed83f)

------
iamwil
Transducers really sound like monads. Are they the same thing?

~~~
tel
In almost every meaning of that sentence—no.

One simplified model of transducer happens to be similar to half of a
particular monad. It's very disingenuous to say that "transducers are monads"
or even "transducers are a particular form of monad", however.

~~~
iamwil
:( it was an honest question. I really didn't know. I watched the talk. I
didn't get everything that was said, so I came here to ask a question. Man, HN
is used to too much snark.

~~~
tel
Oh! Sorry, I may have written that poorly. I meant that technically: "in
_almost_ every sense, no". There is one sense in which "yes" they kind of are:
a simplified model can be seen as

    
    
        Transducer a b ~ (a -> [b])
    

where the right side is a pretty fundamental type when talking about the list
monad. In fact, composition of transducers is exactly "monad-arrow
composition" or "composition in the list monad Kleisli category".

So in that sense, they are a particular form of monad. But it's a bit of a
stretch.

Thus, I really meant it in "almost every way", not every way entirely.

~~~
iamwil
So by what you wrote, it sounds like they're related, but a completely
separate concept. Where can I find an intro to transducers that's readable for
beginners? I don't know what "list monad Kleisli category" means. Searching
for "transducers" came up with a physical transducer.

~~~
tel
If you know what monads are then you may know of

    
    
        join :: Monad m => m (m a) -> m a
    

Since list is a monad then we know that join can be specialized to

    
    
        join :: [[a]] -> [a]
    

which is just `concat` then. Now Transducers are a bit of an elaboration over
functions like

    
    
        Transducer a b === a -> [b]
    

and so we might compose them a little like

    
    
        comp t1 t2 x = concat (map t2 (t1 x))
    

which if you follow in types ends up working out. Finally, that concat/map
business is just monadic bind in the list monad. We could also write

    
    
        comp t1 t2 x = t1 x >>= t2
                     = concatMap t2 (t1 x)
                     = bind t2 (t1 x)
    

depending on what syntax is most familiar. That's why they're (an elaboration
of) "monad composition", i.e. a Kleisli category.

The only thing left is just that Hickey did a transformation called "Church
Encoding" or "Continuation Passing Style" to the regular list functions. This
eliminates the need to generate concrete lists and turns this fancy
composition into reverse normal function composition. It's a clever trick.
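That correspondence can be sketched end to end. Below, `T`, `comp`, `mapT`, `filterT`, and `church` are illustrative names (not from any library); `church` is the Church/CPS encoding that turns a Kleisli arrow into a reducer transformer:

```haskell
-- Kleisli arrows in the list monad, and their composition.
type T a b = a -> [b]

-- Run t1, then t2, flattening with concatMap.
comp :: T a b -> T b c -> T a c
comp t1 t2 x = concatMap t2 (t1 x)

mapT :: (a -> b) -> T a b
mapT f a = [f a]

filterT :: (a -> Bool) -> T a a
filterT p a = [a | p a]

-- Church encoding: instead of producing a list, each arrow is told
-- how its outputs will be consumed (the reducer), so no intermediate
-- lists are built when stages are composed.
church :: T a b -> (r -> b -> r) -> (r -> a -> r)
church t reduce acc a = foldl reduce acc (t a)

-- Kleisli composition becomes plain function composition, with the
-- first stage written first:
--   church (comp t1 t2)  ==  church t1 . church t2
```

For instance, `foldl (church (comp (filterT even) (mapT (+1))) (+)) 0 [1..6]` and `foldl ((church (filterT even) . church (mapT (+1))) (+)) 0 [1..6]` both sum 3 + 5 + 7 = 15, which is the "reverse normal function composition" in action.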

------
Animats
I think somebody just reinvented data flow programming.

~~~
blackkettle
yeah, is this going to become a 'thing'? i'm finding it really kind of
annoying that terms from the fundamentals of theoretical & applied compsci
([weighted] finite-state transducers WFSTs) are being appropriated like this.

