
Transducers for Python - espeed
https://github.com/cognitect-labs/transducers-python
======
fulafel
The README sounds like it's describing Python's venerable iterator protocol.
The example code looks like higher order functions and iterator composers from
the standard functools and itertools modules. It should probably explain how
this is different from those if the audience is Python programmers.

~~~
anon4
I concur. I don't really get the name "transducer". It's just map+reduce?
So... why do we need a new name and new libraries?

~~~
tel
It's not really map+reduce.

Transducers arise from the realization that transformations between (possibly
infinite) lists can be encoded as "reducer transformations": transducers.
When you work in this encoding you automatically get lazy transformation
(since everything goes through a function call) and the avoidance of
intermediate data structures (since everything goes through a function call!).

Moreover, there's a neat property of this encoding, a CPS encoding, which is
that function composition still "works", it just works "backwards".
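A rough sketch of this encoding in Python (the helper names `mapping`, `filtering`, and `append` are made up for illustration, not any library's API):

```python
from functools import reduce

# A reducer takes (accumulator, item) and returns a new accumulator.
# A transducer takes a reducer and returns a new reducer.

def mapping(f):
    # Transducer that applies f to each item before handing it on.
    return lambda reducer: lambda acc, x: reducer(acc, f(x))

def filtering(pred):
    # Transducer that only hands on items satisfying pred.
    return lambda reducer: lambda acc, x: reducer(acc, x) if pred(x) else acc

def compose(f, g):
    return lambda x: f(g(x))

append = lambda acc, x: acc + [x]

# compose nests right-to-left as usual, but the data flows through
# mapping first -- composition "works backwards" in this encoding.
xform = compose(mapping(lambda x: x + 1), filtering(lambda x: x % 2 == 0))
result = reduce(xform(append), [1, 2, 3, 4], [])  # [2, 4]
```

Each element is incremented first and then tested for evenness, even though `mapping(...)` is the outer function in the composition.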

~~~
fulafel
Python iterators also give you lazy transformation and no intermediate data
structures, no?

~~~
rob_smallshire
Yes, although transducers are more general than iterators. Iterators model
lazy series from which you must 'pull' elements. Transducers can also be used
with elements that are 'pushed', such as events coming from channels, or
whatever. Essentially they decouple the transformation from the mechanisms by
which input elements arrive and result elements are received, whilst still
allowing separate transformations to be composed into more complex composite
transformations.

~~~
sametmax
That's what I was saying: this looks like it comes from somebody who doesn't
know Python.

Generators have a "send()" method to push data into them.

We have "yield from" to delegate to sub-coroutines.

And generators can be classes for more advanced behaviors.

All this library does is recreate something that already exists in the
language, in a heavier and slower way.

~~~
danidiaz
> Generators have a "send()" method to push data in it.

What would be the pythonic way of "mapping over" the inputs that are sent into
a generator, or performing grouping operations on those inputs?

For example, how to transform a generator that expects to be sent sets of
strings into a generator that simply accepts a flat stream of strings?

~~~
rob_smallshire
You could use the python-transducers library
[https://github.com/sixty-north/python-transducers](https://github.com/sixty-north/python-transducers)
I mentioned in my other comments to do this. It supports using the same
transducers with iterables and with push-event coroutine generators.
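For the push-driven case, a hand-rolled coroutine adapter along these lines would flatten batches into a downstream consumer (a sketch only; `flattening` and `collector` are made-up names, not that library's API):

```python
def collector(sink):
    # Trivial downstream coroutine: appends everything sent to it.
    while True:
        sink.append((yield))

def flattening(target):
    # Hypothetical adapter: accepts batches (e.g. sets of strings)
    # and pushes each element into the target individually.
    while True:
        batch = yield
        for item in batch:
            target.send(item)

sink = []
downstream = collector(sink)
next(downstream)              # prime
flat = flattening(downstream)
next(flat)
flat.send({"a"})              # singleton sets keep the order deterministic
flat.send({"b"})
# sink is now ["a", "b"]
```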

------
fermigier
A similar, but different, project, with associated learning material:

\- [https://github.com/abingham/python-transducers](https://github.com/abingham/python-transducers)

\- [http://www.slideshare.net/alinadolgikh/austin-bingham-transducers-in-python](http://www.slideshare.net/alinadolgikh/austin-bingham-transducers-in-python)

\- [http://sixty-north.com/blog/series/understanding-transducers-through-python](http://sixty-north.com/blog/series/understanding-transducers-through-python)

The last link is an 8-part blog series that develops transducers from first
principles and discusses their use, still in the context of Python.

~~~
rob_smallshire
Note that the first link has been superseded by
[https://github.com/sixty-north/python-transducers](https://github.com/sixty-north/python-transducers)
which is also available on PyPI as
[https://pypi.python.org/pypi/transducer/](https://pypi.python.org/pypi/transducer/)

------
rob_smallshire
This code by Cognitect is a fairly literal port of Clojure transducers to
Python. If you'd like to see a more Pythonic interpretation of transducers,
take a look at Sixty-North's alternative:
[https://github.com/sixty-north/python-transducers](https://github.com/sixty-north/python-transducers)

------
danidiaz
My own attempt at implementing something like transducers in Haskell:
[http://hackage.haskell.org/package/foldl-transduce](http://hackage.haskell.org/package/foldl-transduce).

It depends on the useful "foldl" library of reducers.

~~~
boothead
This looks really useful!

Isn't your Transduction the same as doing premap (or lmap from Profunctor)?

~~~
danidiaz
Transformations created with premap are stateless, they depend only on the
current input. So one can't build UTF8 decoders with them, for example. That
said, if you supply a function to premap, the result typechecks as a
"Transduction".

------
tallerholler
eli5: what can you use these for? what are some examples of when you'd use
something like this...

~~~
fnordsensei
The idea is that you can define a transformation separate from its use and
then reuse it in many contexts. Some of the win sits in decoupling what you
are doing from when and where you are doing it.

Rich Hickey wrote about them on Cognitect's blog a while back:
[http://blog.cognitect.com/blog/2014/8/6/transducers-are-coming](http://blog.cognitect.com/blog/2014/8/6/transducers-are-coming)

The author refers to the test suite for examples:
[https://github.com/cognitect-labs/transducers-python/blob/master/tests/transducer_tests.py](https://github.com/cognitect-labs/transducers-python/blob/master/tests/transducer_tests.py)

~~~
ufo
I always had a hard time figuring out how transducers are different from
"stream transformation" functions like map and filter.

I know that they are more complicated/powerful than that, but I can't quite
pin down the difference precisely.

~~~
fnordsensei
You can transform map or filter into a transducer by decoupling the
transformation from the data. When you call map, you usually call it on some
data, or some stream. When you create a transducer, and you call for example
(map inc), you're saying "whenever this transducer is applied, on whatever it
is applied to, increase each unit by one".

You can lug around the transformation itself as if it were an object, almost,
separate from the data it's meant to transform.
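In Python terms, the same idea might look like this (a minimal sketch; `mapping` is a hypothetical helper, not part of any library mentioned here):

```python
from functools import reduce

# A transducer built from map logic, with no data attached.
def mapping(f):
    return lambda reducer: lambda acc, x: reducer(acc, f(x))

inc = mapping(lambda x: x + 1)  # "add one to each unit, wherever applied"

# The same transformation reused against two different reducers:
to_list = lambda acc, x: acc + [x]
to_sum = lambda acc, x: acc + x

as_list = reduce(inc(to_list), [1, 2, 3], [])  # [2, 3, 4]
as_sum = reduce(inc(to_sum), [1, 2, 3], 0)     # 9
```

The transformation `inc` is a plain value you can pass around; it only meets the data (and the output strategy) at the point of use.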

~~~
ufo
How is that different from just partially applying/currying the map?

~~~
closed
A key component of transducers is realizing that map, filter, and take can
all be expressed as special cases of the reduce function.

So a curried map function can be expressed as a reduction operation. For
example, in Python you could do:

    
    
      # map that subtracts 1 from each element
      # same as a curried map(sub1)([1,2,3])
      sub1 = lambda x: x - 1
      map(sub1, [1,2,3])
      
      # same outcome using reduce
      reduce(lambda acc, x: acc + [sub1(x)], [1,2,3], [])
    

Notice that when writing a map operation using the reduce function, we've put
two pieces of logic on each step. (1) we have a transformation called sub1,
and (2) we have a step operation (returning an array with the transformed x
appended to it).

We can use this same logic to create filters with an if statement..

    
    
      lt2 = lambda x: x < 2
      filter(lt2, [1,2,3])
      reduce(lambda acc, x: acc + [x] if lt2(x) else acc, [1,2,3], [])
    

A critical thing to notice in each of these cases is that if we ran a curried
filter and map (with the transformation/predicate functions already passed),
they would loop over the data twice. However, if you look at the example in
this GitHub repo, they want to

* take 3 elements from the sequence

* perform a transformation (map operation) to those elements

Notice that in this case take is being performed before the map, but you could
imagine doing something like

* map items from geometric series to float

* filter items that are greater than .15

* take 3 of the remaining items

How would you do that with map and filter functions without looping over the
entire (inexhaustible) sequence?

This is the beauty of transformers and transducers, they provide a composable
approach to filtering, transforming, and taking elements from a (possibly
infinite) sequence.
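One way to sketch that pipeline, with a minimal hand-rolled early-termination mechanism (the `Reduced` wrapper and helper names are made up for illustration, not this library's API):

```python
import itertools

class Reduced:
    # Sentinel wrapper signalling "stop consuming input".
    def __init__(self, value):
        self.value = value

def mapping(f):
    return lambda r: lambda acc, x: r(acc, f(x))

def filtering(pred):
    return lambda r: lambda acc, x: r(acc, x) if pred(x) else acc

def taking(n):
    def transducer(r):
        remaining = [n]  # boxed so the inner step can mutate it
        def step(acc, x):
            remaining[0] -= 1
            acc = r(acc, x)
            return Reduced(acc) if remaining[0] <= 0 else acc
        return step
    return transducer

def transduce(xform, reducer, init, iterable):
    step = xform(reducer)
    acc = init
    for x in iterable:
        acc = step(acc, x)
        if isinstance(acc, Reduced):  # early exit: infinite input is fine
            return acc.value
    return acc

def compose(*fns):
    def composed(x):
        for f in reversed(fns):
            x = f(x)
        return x
    return composed

append = lambda acc, x: acc + [x]

# An inexhaustible geometric series: 1, 1/2, 1/4, ...
geom_series = (2 ** -i for i in itertools.count())

xform = compose(mapping(float),
                filtering(lambda x: x > .15),
                taking(3))

result = transduce(xform, append, [], geom_series)  # [1.0, 0.5, 0.25]
```

Each element makes a single pass through map, filter, and take, and `taking(3)` halts the reduction without ever exhausting the source.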

Of course, all of this could be done in a for loop, but often it introduces
complexity..

    
    
      # assume geom_series is the generator from the github example
      result = []
      take = 3
      for el in geom_series:
          el = float(el)
          if el > .15:
              result.append(el)
              if len(result) == take: break
    

However, this for loop is using the same ideas: we have a transformation/
predicate operation, and a step operation to append the data we want. The
problem transducers try to solve is writing a series of these kinds of
operations compactly and clearly as a pipeline of functions (erm,
transducers), so those functions can be replaced or moved around quickly, and
the pipelines can be combined together simply.

A good article that covers transducers in JavaScript (sorry!) is here:

[http://simplectic.com/blog/2014/transducers-explained-1/](http://simplectic.com/blog/2014/transducers-explained-1/)

~~~
ufo
> How would you do that with map and filter functions without looping over the
> entire (inexhaustible) sequence?

But isn't this just a matter of defining map and filter over a lazy stream
datatype (aka iterators) instead of over lists?

So the workflow would be

    
    
       list -> stream -> filter1 -> filter2 -> list
    

This would let you run the filters "in parallel" without iterating through
things twice.

~~~
closed
Python's map and filter already run over iterators. If you're talking about
creating filters that are composable, so that you could do

    
    
      filter1 = filter(predicate1)   # curried filter
      filter2 = filter(predicate2)   # curried filter
      pipeline = compose(filter1, filter2)
      pipeline(<generator of some kind>)
    

and make it so that the sequence of operations will be

    
    
      for el in some_generator:
        if predicate1(el) and predicate2(el): <accumulate value>
    

Then I would say defining them in this way is useful and important--
transducers are exactly one way of doing this! A critical aspect I didn't go
into detail on is the idea of a take function. In the github repo, T.take(3)
is the portion that allows the transducer to operate on an infinite stream of
values.

This is the piece your workflow example would need to take into account. How
could I apply a filter followed by something that takes 3 passing values from
an infinite sequence? I'm sure you could come up with a way, and it would be
worth comparing to the transducer approach :).

(it's worth noting that curried map, filter, etc. are very complementary to
the transducer approach)

~~~
bubulle
If you are wondering how to make a lazy take() in python, here is one
solution:

    
    
        def take(num):
            def gen(iterable):
                for i, item in enumerate(iterable):
                    if i == num:
                        break
                    yield item
            return gen
    

and then

    
    
        list(take(3)(range(100))) == [0, 1, 2]

------
agumonkey
Also transducers for Common Lisp
[https://www.reddit.com/r/Clojure/comments/3tha86/transducers...](https://www.reddit.com/r/Clojure/comments/3tha86/transducers_in_1989/)

------
espeed
NB: Here's the Clojure page for transducers, which includes a link to Rich
Hickey's introductory blog post and Strange Loop video.

[http://clojure.org/transducers](http://clojure.org/transducers)

------
anentropic
Genuine question: what is the point of this?

It seems like a long-winded and not very clear way to write something you
could just do with a function and a list comprehension.

~~~
moomin
Well, in Clojure they solve the problem that the standard functions are
inefficient when you chain a bunch of list manipulations together (because
they all return lazy lists) and neatly sidestep the fact that Clojure doesn't
have generators.

~~~
ch
So if Clojure had some form of fusion of composed functions on lists, that
would equate out to what transducers are doing?

~~~
fnordsensei
That's almost literally what happens. (comp (map inc) (filter even?)), when
applied to a collection/channel/what-have-you, increments each element and
then keeps the even results (transducer composition applies left-to-right).
Filter and map normally take a collection as their last argument, but when
none is supplied, a transducer is returned instead.

------
amirouche
How do transducers compare to the Gremlin traversal language?

