It seems folks want a working example. Here's one in prod:
Metabase is a BI tool with a backend written mostly in Clojure. Like basically all BI tools they have this intermediate representation language thing, so you write the same thing in "MBQL (Metabase query language)" and it theoretically becomes the same query in, like, Postgres and Mongo and whatever. The end user does not usually write MBQL; it's mainly used as a service for the frontend query-building UI and lots of other frontend UI stuff.
The whole processing from MBQL -> your SQL or whatever is done via a buncha big-ass transducers. There's a query cache and lots of other stuff, so you need state, but you also need it to be basically fast. Metabase is not materially faster than other BI tools (because all the other BI tools do something vaguely similar in their langs, and because the limiting factor is still the actual query running in most cases), but it's pretty comparable in speed and the whole thing was materially written by like 5 peeps.
Transducers are an under-appreciated feature in Clojure. They are incredibly useful, allow for composable and reusable code, and come with really nice performance benefits as well (by not creating intermediate collections).
Once you get used to them, they become a very natural tool — in my case, almost every time I do something to a sequence, I'll start with `into`. Even if I'm just applying a single transformation, this lets me quickly change the resulting collection and easily add more transformations if needed.
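For instance, a minimal sketch of that habit (the collection and transforms are just illustrative):

```clojure
;; Start with a single transformation...
(into [] (map inc) [1 2 3 4])                          ;=> [2 3 4 5]

;; ...then changing the target collection or growing the pipeline is a local edit:
(into #{} (comp (map inc) (filter odd?)) [1 2 3 4])    ;=> #{3 5}
```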
On a higher level, I found myself thinking about business logic (model) in terms of transducer pipelines and ended up with a number of reusable transformations, and clearly specified pipeline logic.
One gripe I have with transducers is that writing stateful transducers is hard. As in, well, really hard (hope you remembered to flush using `unreduced` in your single-arity version!). I still write them sometimes, but it's never a walk in the park (it does provide satisfaction when done, though). But I guess that's what you need to go through in order to get good performance.
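For what it's worth, here's a sketch of what that looks like, modelled loosely on `clojure.core/partition-all` (not production code):

```clojure
(defn partition-all-xf [n]
  (fn [rf]
    (let [buf (java.util.ArrayList.)]
      (fn
        ;; init
        ([] (rf))
        ;; completion: flush whatever is still buffered, and wrap the nested
        ;; call in `unreduced` so a `reduced` value doesn't leak out of here
        ([result]
         (let [result (if (.isEmpty buf)
                        result
                        (let [v (vec (.toArray buf))]
                          (.clear buf)
                          (unreduced (rf result v))))]
           (rf result)))
        ;; step: buffer until we have n items, then emit a batch
        ([result input]
         (.add buf input)
         (if (= n (.size buf))
           (let [v (vec (.toArray buf))]
             (.clear buf)
             (rf result v))
           result))))))

;; (into [] (partition-all-xf 2) [1 2 3 4 5]) ;=> [[1 2] [3 4] [5]]
```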
I use pipelining everywhere. Going from piping through map/reduce/filter/etc. back to inside-out Lisp-style calls felt like a huge ergonomic step back. Maybe it's a matter of familiarity, but it's hard for me to visually scan that kind of code.
I have had some plans on updating the SRFI to add some reducers I did not include because I never actually used transducers before writing the initial implementation.
I don't really understand what you mean by types (my implementation stays monomorphic, so new types are easily introduced by TYPE-transduce), but I have thought about generalising things like numerical ranges by having something like unfold-transduce.
> but I have thought about generalising things like numerical ranges by having something like unfold-transduce.
This is more or less what I was wondering about. Numerics, ports, SRFI-41 streams, etc. There's a lot of stuff that isn't in e.g. r7rs-small but is more or less expected in most Scheme implementations.
OK. It's a better map that can keep some state. (I scanned, rather than read, your doc and that was my takeaway; I have not thought about what I can do with state yet, particularly since I had never associated transducers with being able to keep state.)
Can you give me an example of a classic problem (since we're talking about transformations, raytracers and compilers come to mind) where if you involve a transducer vs a map you get an interesting difference?
Edit: 'How state is kept is not specified': I assume that there are limitations to state keeping, particularly with the composability and performance pillars of transducers, but I'm just having a really hard time synthesizing everything.
You could keep the state using the state monad, or by letting every reducer keep a transparent state in a linked list that whoever is pushing values through it has to handle. The reference implementation keeps it hidden using closures.
This is mostly an API thing. In Clojure you can pass a transducer when you create a channel. That way you can make a channel do just about anything: send data in chunks of N, filter odd numbers, or just do arbitrary transformations.
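For example, a rough sketch of attaching a transducer to a channel (the buffer size and transforms are arbitrary):

```clojure
(require '[clojure.core.async :as a])

;; every value put on the channel goes through the transducer before takers see it
(def c (a/chan 10 (comp (map inc) (filter odd?))))

(a/>!! c 1)   ; becomes 2, which is even, so it is dropped
(a/>!! c 2)   ; becomes 3, which is odd, so it is kept
(a/<!! c)     ;=> 3
```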
It is a protocol for composable transformations of data being passed in one direction. It is not fancy. Not really hard to understand. A generalization of map, filter, and friends.
Regarding state: you can make thread safe transducers. The current SRFI 171 reference implementation is NOT thread safe. You can create a transducer and use it across different threads no problem. But you cannot start a transducer and use the returned reducer in different threads. It uses hidden mutable state.
What is generally missing from this category of article is a motivating statement, eg here is a problem that is easier or at least different (transformed into a different category of problem) given this idea. Up top. When I don’t see this up top or scanning forward I assume the article assumes knowledge that I do not have, and I bounce.
To give a shallow overview, transducers allow you to define steps in collection processing _per item_ rather than having collection processing as a series of transformations of collections.
So rather than processing the collection, passing it to the next function that processes the collection, passing it to the next... etc.. consuming all the CPU and memory that involves, you can define steps that are applied for each item in the collection thereby having the iteration through the collection happen once.
These steps (transducers) are also composable and reusable.
I suspect you know this, consider this a basic explanation for other people reading.
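In Clojure terms, a rough sketch of the two styles (the particular transforms are just for illustration):

```clojure
;; Collection-at-a-time: each step hands a new (lazy) sequence to the next one.
(->> (range 1000) (map inc) (filter odd?) (take 10))

;; Item-at-a-time: the steps are composed into one transform and the input
;; is walked once, with no intermediate sequences.
(into [] (comp (map inc) (filter odd?) (take 10)) (range 1000))
```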
> That seems like insufficient magic for the respect that transducers seem to have.
This is 100% correct. It's amazing how over hyped they are.
If you have ever used C#'s LINQ you are using transducers. The fact that LINQ works item at a time instead of collection at a time is all that's being discussed. That if you say take the first two in a 1,000,000 long collection LINQ will only enumerate the first 2 items and not all 1,000,000 is the other behavior. And the way to do this is by composing operations using a "." operator into a "query" to run.
C# has had this since 2007. It doesn't require a fancy name, it doesn't require streams, it doesn't require "thought pieces" every few months for over a decade for people to "master thinking in LINQ". Just sequence your operations and get back to coding.
And Clojure is a really great language, but the amount of mental space occupied by transducers is a bad look for the language. Via analogy, it's like watching people be amazed by for loops for over a decade. And I know there are a lot of really smart people in the Clojure community, so I honestly put it on Rich. Either for hyping it up so much when he released it, like he had just invented sliced bread and it's a deep advanced topic, or for how it's presented in the language, such that people have to understand so much beneath the abstraction layer to use it correctly.
Your average blub enterprise programmer has been using LINQ for 15 years and never needed 100 thought pieces on how to use it and reason about it. Yes it's a monad, yes it lets you short-circuit, yes it's item at a time, but a user doesn't need to know lots of detail to use it. It's like watching a language community that can do calculus continuously hyping the fact that it can do long division.
Clojure is an amazing language and transducers are not that special; I can't figure out why they are so hyped in Clojure.
Or maybe I have it backwards and linq/transducers really are partial differential equations and C# snuck it into the 4th grade curriculum and nobody noticed.
I think there's a big difference between how transducers work and how other lazy stream libraries work. I'm not too familiar with LINQ, but more with Java streams and Scala views and iterators. Usually in these libraries, wherever the work is forced, like in a final call of `.toList()` or whatnot, what they materialize is something like a stack of iterators. Each iterator calls its child/children iterators and the pipeline logic is in the stack of `.next()` calls.
The difference with transducers is that the transformation logic is completely decoupled from the fact that this is happening in collections. That is what makes them usable outside of the context of collections, most famously in core.async where you can map/filter/mapcat/flatten/take/drop channels just like you do with collections. This only works because transducers are completely decoupled from the fact that elements are coming from, or going into, a collection. I think it is a really fascinating and creative achievement in decoupling. Java Streams and Scala view/iterator/iterable transformations can never be reused in other contexts since they are fundamentally about the collections. Whatever kind of Observable/Async Stream/Future or other place where you might want map/filter/reduce functionality has to reimplement the whole set of these operations anew (see for example Akka Streams, RxJava).
I think these languages use “Collection” as another way of saying “iterable”. So the only thing such an object needs to provide is a “next()”. So a stack of lambdas/generators all working on a “next()” seem to be doing the same “per element” work of a Clojure transducer. Am I misunderstanding?
The difference is in the way that the transformations compose.
Iterators are generic over some iterable type `U` (roughly `T: Iterator<E, R, U|T>, U: Iterable<E>`), in that they either directly or indirectly capture some nested iterator.
However that means that code composing iterators must also be generic over that iterable `U`, so you can't simply take two transformation pipelines and concatenate them, because they will both provide an element source already.
Transducers separate the transformation part and the processing part.
So you can have a `E, R, T: Transform<E, R>` which doesn't have a generic type parameter for the Iterable, and a function `transduce<E, R, S: Source<E>, I: Sink<R>, T: Transform<E, R>>(source: S, tx: T) -> I` which contains all the transformation application machinery, be it iterator style `next()` operations, or stream based `pull/push` operations,
and a function `comp<E, R, S>(left: Transform<E, R>, right: Transform<R, S>) -> Transform<E, S>` that composes transformations. The way transformers are implemented, this `comp` operation is actually simply function composition.
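In Clojure terms, a minimal sketch of that separation; the transform knows nothing about its source or sink:

```clojure
(def xf (comp (map inc) (filter odd?)))   ; the "Transform" part, on its own

(into [] xf [1 2 3 4])        ;=> [3 5]   ; vector in, vector out
(sequence xf [1 2 3 4])       ;=> (3 5)   ; lazy sequence out
(transduce xf + 0 [1 2 3 4])  ;=> 8       ; reduced straight to a number
;; (clojure.core.async/chan 8 xf)         ; or attached to a channel
```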
Oh I see, so with transducers, you can use the same transform pipeline (stack of functions) across different composition contexts (iterable next(), function composition, channel transforming, etc). I see the utility now!
Still I think just the “next()” context is quite enough because you can have everyone use it by convention for everything, even things that aren’t collections. Like a fluent tensor library for example. This is based on a quick understanding of transducers…
Yes, but don't forget that you can't compose `next` based iterators.
That might be a small limitation, but it means that you can't do something like:
```js
let pipelineA = map(i => i+1); // or makeComplexPipelineA();
let pipelineB = filter(i => i.isEven()); // makeComplexPipelineB();
let pipeline = pipelineA.combine(pipelineB);
let result = [1,2,3,4,5].transform(pipeline);
```
I see, thank you! I hadn’t realized the hidden limitation in the usual iterator/generator/streams concepts in modern languages. It looks like the thing Linq and Java Streams have in common is that they’ve been added later, with a pragmatic focus on collections. They “bake in” iterator/next based composition. If you start from a more principled functional foundation, the limitation probably seems more jarring (why fix one “template parameter”?)
It also reinforces my view that you should try to separate logic and control flow as much as possible. At work, we use awful callback chains. There must be a better way to express the logic and hide the callback logic, even in C++ (lots of rope to … find alternative solutions.)
> If you have ever used C#'s LINQ you are using transducers. The fact that LINQ works item at a time instead of collection at a time is all that's being discussed. That if you say take the first two in a 1,000,000 long collection LINQ will only enumerate the first 2 items and not all 1,000,000 is the other behavior. And the way to do this is by composing operations using a "." operator into a "query" to run.
Isn't that just function composition to build the map function? I guess where the magic can come in is that function composition is associative, which allows for some really interesting opportunities for runtime optimization that, so far as I know, haven't been seriously explored.
IIRC it's technically a bit more. It's a monad just like function composition, but its bind has to handle data threading and short-circuit behavior. If you squint, it's not too different from a parser combinator over Applicative matching a sequence of characters: composing operations to correctly thread sequential data while handling short-circuiting.
I'm not certain about the optimizations due to associativity. While yes, function composition is associative, that just builds the query. Running the query itself must be sequential, as each operation depends on the data from the prior one, leaving (I believe) little room for optimization.
It’s been a while since I did any C#. How do I write a function that returns something that encapsulates the operations to perform on the stream that I can then compose with other operations? And can I then apply to whatever sequence abstraction I have? Isn’t it limited to things that implement IEnumerable? I seem to remember when transducers were introduced, one of the selling points was it wasn’t fixed to a specific sequence abstraction.
Transducers do not require streams. You might want to learn a bit more about Clojure before poorly generalizing about this feature of the language. The OP did use streams to explore some specific applications of transducers. To say that they are overhyped is to be hung up on the buzz surrounding their initial release nearly 8 years ago.
Yes, and like most language features it's not about the feature, it's about having _that feature_ in a language with other benefits.
Think generics in Go or concurrency (effects) in OCaml or smart pointers in Rust. Not at all unique things, but having them in the language with other benefits is worth some discussion, as it may provide extra leverage in context.
It’s the mapping itself that can be composed with transducers. Where before you had pipelines of map/filter/whatever, now you have a single function representing the sequence operations, which can be used for any kind of sequence (a list in memory, or items coming in over a channel or message queue) item by item.
That sort of reminds of broadcast fusion [1] in Julia.
Funnily enough, Julia is where I came across the term transducers too, via the Transducers.jl package [2]. The article and the comments here now make me wonder what the difference is, between broadcast fusion and transducers.
Generally, yes, but I also think the fn1, fn2, fn3 steps can happen independently (which is why it's powerful). So items of the collection may be at different steps.
> To give a shallow overview, transducers allow you to define steps in collection processing _per item_ rather than having collection processing as a series of transformations of collections.
This is a much better and clearer explanation than the entire article.
True, though I think this is part personal note, and part intended as additional material for people already trying to apply transducers effectively. The author suggests this by casting it as "my" mental model, in a work context.
Maybe this variant of explanation will suit your context better:
Most folks on HN who are interested in PLs, including me, are familiar with transducers at a high level. Composable, performant, yada yada ya. What we (or maybe just I) do not have is a nontrivial example of advantage.
Your comment sharpened that for me. I don't want FizzBuzz; I want someone taking a reasonable toy problem, such as a trad+photon+quadtree raytracer and demonstrating advantage by applying the concept.
I put together a transducer in JavaScript with the intent to simplify Redux/flux architected front-ends. Rather than have synthetic actions that trigger reduction, I allow the caller to simply add objects directly into the store, and rely on them being combined in a useful way. And rather than leave it purely at dead data, I allow you to add functions, too, such that once added they can handle subsequent input. My `combine()` is a recursive `Object.assign()` plus reification of function calls: https://simpatico.io/combine2. It's that second part that makes it a transducer.
This work predates the term "transducer" and I was happy to see Rich Hickey defining it, so I use the term retroactively.
Note that I don't see much use for transducers as a general programming technique. This one use is very specific, but I have never before or since needed one.
> This work predates the term "transducer" and I was happy to see Rich Hickey defining it
IIRC Hickey did not define this term (nor claimed to have done so); he found it in existing compsci research papers that approached them from a mathematical proof angle and realized they solved a problem he needed to solve (basically avoiding allocating intermediate collections and avoiding wasteful work on parts of the data that will be immediately discarded).
I think one of the authors of the papers he cited even gave a talk at a Clojure conference once.
Thanks for the feedback, fixed. I like to use a simple method to speed up my BTD loop when doing webapp work, which is to add a `meta refresh` tag. But that's inappropriate for deployment. Note that my site is very strict about 3rd party content (none is allowed), and about privacy (the only log is in a tmux buffer), so there's no reason for this behavior other than my forgetfulness.
The initial problem was duplication of the sequence library for CSP channels (core.async in Clojure). When the idea of 'transducers' (transformers of reducing functions) was discovered, it turned out to have many additional useful properties and was integrated deeply into the standard library.
I’ve always felt any mention of reducers is a big distraction. They’re not really key to the idea of transducers which are almost entirely about pure sequence to sequence transforms.
For the benefit of downvoters, let me explain myself: the 'reducing function' (also called a 'reducing step function' in other parts of the docs) isn't always what you'd think of as the normal arity-2 function passed to reduce itself. Take a look at the reducing functions in clojure.core, the 3-arity, innermost function returned inside all the transducers. They almost never do anything with `result`, which would be the accumulator in a normal reduction; they just pass it on or return it.

In fact, it wouldn't make much sense for transducers to do any sort of reduction, because if they did you'd no longer be doing a step-wise transform over a sequence, and you could just use normal function composition because you'd have a single value to pass forward. The `reduced` mechanism is only really used in transducers to short circuit, so even there the name clouds the meaning.

The only time that something we'd recognise as reduction is happening is when you call `transduce` and specify a reducing function as its `f` argument. You'll have seen lots of examples with `+` as the reducing function because it handily returns 0 when called with no args, acts as identity with 1 arg, and reduces (actually reduces!) with two args. But instead look at `sequence`, which can take an xform. It's implemented in Java code, but the inner reducing function it uses completely ignores the accumulator value! Everywhere you're using `result` inside a reducing function? It's just null and ignored. But `sequence` elsewhere stashes away the (transformed) output of the transducer stack and lets you have it back as a sequence. No reducing is happening, just transforming sequences. Elsewhere, `into` supports an xform argument; what's its reducing function? It's just `conj`ing stuff back together!

So to me, the interesting bits of transducers are that they give you an efficient and composable way of creating sequence-to-sequence transformations. Any and all mentions of reducers are necessary plumbing but slightly distracting from the core ideas.
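To make that concrete, here is roughly what the transducer arity of `map` looks like (a simplified sketch of `clojure.core/map`, single-input case only):

```clojure
(defn map-xf [f]
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result input]
       ;; `result` is never inspected here; it is just threaded through to rf
       (rf result (f input))))))
```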
I'm not sure why the comment above received downvotes (I don't have enough karma to downvote). I agree that perhaps the term "reducing function" isn't entirely accurate.
However, it's worth noting that transducers aren't strictly about sequence-to-sequence transformations. From its inception, the Clojure standard library was built upon the `seq` protocol, and all its collection functions were constructed using this protocol. If it were possible to implement the `seq` protocol on channels, it would address the issue. However, it's not feasible. When querying a channel about its next element, the only honest answer is, "I can't determine that without potentially waiting indefinitely until the next message comes in or the channel closes." Turning the `seq` protocol into an asynchronous IO-bound process would be problematic. This is why transducers are essential.
Yes, I'm being pernickety about naming and then misusing it myself! I use 'sequence' in a very loose sense to mean any sequential stuff you might push through a stack of transducers one at a time. core.async does actually use all the reducer plumbing to fill up buffers and pass them around. But I still think the key intuition for transducers should be push-based, stepwise operation (with the option to accumulate state or short circuit, as you go). It definitely helps that you can perform the final step of whatever job you're doing and have full control over how your stuff is materialised, without any intermediate collections, but it still feels slightly secondary to me.
```python
# The Belt.
def belt():
    # return itertools.cycle([1, 2, 3, 4, 5])
    x = 1
    while True:
        yield x
        x += 1
        if x > 5:
            x = 1

# The +1
def add1(iterable):
    for x in iterable:
        yield x + 1

# The +y
def add(y):
    def _(iterable):
        for x in iterable:
            yield x + y
    return _

# The Oddifier
def oddonly(iterable):
    for x in iterable:
        if x % 2 == 1:
            yield x

# The Reaper
def partition(n):
    assert n > 0
    def _(iterable):
        batch = []
        for x in iterable:
            batch.append(x)
            if len(batch) == n:
                yield batch
                batch = []
        if batch:
            yield batch
    return _

# The Composer
def compose(iterable, t1, t2):
    yield from t2(t1(iterable))

# The Plumber
def pipe(iterable, p):
    for xform in p:
        iterable = xform(iterable)
    yield from iterable

# The Press.
def printall(iterator):
    import time
    for x in iterator:
        print(x, end=", ", flush=True)
        time.sleep(0.3)

# printall(pipe(belt(), [add(1)]))           # 2, 3, 4, 5, 6, 2, 3, 4...
# printall(pipe(belt(), [add(1), oddonly]))  # 3, 5, 3, 5, ...
printall(pipe(belt(), [add(1), oddonly, partition(3)]))  # [3, 5, 3], [5, 3, 5], ...
```
Sometimes sequence programming logic is easier to write and understand. Network streams being an example for me. The benefit I see of transducers is that it allows you to think with that model. For example if I want to find all image files on my computer, remove duplicates, etc. The processing is much easier for me to think about if I can think of it as one file at a time. Then some more sequential programming based on other characteristics of the file - like batching where and when it was taken.
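A rough sketch of that kind of pipeline; the directory path, extensions and helpers are purely illustrative:

```clojure
(require '[clojure.java.io :as io])

;; walk the directory once, deciding file by file what to keep
(def image-files
  (into []
        (comp (filter #(.isFile %))
              (filter #(re-find #"(?i)\.(png|jpe?g|gif)$" (.getName %))))
        (file-seq (io/file "/home/me/Pictures"))))

;; further steps, e.g. (distinct) or (partition-all 100) for batching,
;; slot into the same comp without another pass over the files
```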
I like the concept so much that I built a JavaScript version using generators.
Unfortunately, I am not a fan of JavaScript at all, and it appears that Clojure now requires that I install a JVM. I realize that this is a very personal issue, but I am a "just say no to Java" person after using it for 10 years.
I realize this is controversial, and am not saying it is even rational. I spent years writing Java. And at the end I just decided that I could have written the same stuff in maybe 1/4 of the time. I understand that some people see benefits from all the tedium of it, but in the end I just felt like I had wasted an enormous amount of time. The cost was way high compared to the benefit.
My other reference is just as personal and perhaps just as irrational. Java was free to use. Then after many people and companies used this free tool, Oracle came along and wanted to exploit the fact that people had an investment in this. Yes. I know. Oracle can do what they like. They own it. But I can do what I like and that is not to use anything associated with Oracle. Ever. (EDIT: and yes I do know there are open source JVM's)
Oracle's main Java product (OpenJDK) is 100% open source and free to use. If you also want extra support, then you can also use the "Oracle JDK", which is technically the same as OpenJDK. The only difference is the support, which of course was never free.
> I could have written the same stuff in maybe 1/4 of the time
The only way that might be even remotely true is if you were doing some very over-abstracted, bureaucratically complex bullshit app, but that would have been similarly long no matter the language.
As mentioned, Oracle itself offers that very free, libre, same-license-as-Linux OpenJDK; they are responsible for 90+% of commits there. Their own support license sits basically on top of the free offering for the very few gigacorps that need it, so basically almost every Java implementation is open source, as they just redistribute the OpenJDK codebase (yes, there are a few niche ones as well that no one has ever heard of).
Okay, so you know streams in Java. You know a Collector? A transducer is roughly a Function<Collector, Collector> (with some more type variables to make that valid). Given a Collector, a transducer will make another Collector which does some stuff to its input and delegates to the underlying Collector.
The nice thing about this is that you can implement all the non-terminal operations on Stream in terms of this. Mapping is passing each input element through a function before passing it on. Filtering is testing each element with a predicate, and only passing it on if it passes. Flat mapping is passing the element through a function to get a number of new elements, then passing them all on one by one.
But you can do more! You can write a transducer which batches elements, a bit like the inverse of flat mapping. You can't do that with the existing Streams API at all!
The things transducers operate on, which Clojure calls reducing functions, end up being a bit different to Collectors: their accumulate step can signal early stopping (in Clojure, by returning a `reduced` value), which lets them implement take-while, limit, and so on, and that means they can't be parallel. But the vibes are exactly the same.
A similar mechanism in another JVM language is Kotlin's sequences, which can also do windowed/chunked/batched operations by holding a little bit of intermediate state. Combine that with coroutines and it's pretty trivial to perform a parallel map/flatmap and saturate all cores. I'm surprised Streams can't do this.
In case what I wrote wasn't clear: Java's streams are (or can be) parallel, and can saturate as many cores as you like.
But parallelism is not compatible with early stopping at any stage - early stopping is inherently serial! - and transducers choose allowing any stage to stop early over being parallel.
The power it has on top of map is it can skip items and change the shape of the container: that is, it’s a decomposition of a fold and not just a composable map.
So, you can consume a vector and produce a subset of the vector as a map.
Finally, transducers separate the element-wise processing from both the construction of the final sequence and the details of iterating over the input.
It’s more than bind, because a monad requires that the input monad be the same as the output one (if you consume a list, you can only produce a list). Clojure’s Transducers are as powerful as foldr because the output type can be different. E.g.:
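A minimal sketch, using a small illustrative `xf`:

```clojure
(def xf (comp (filter odd?) (map inc)))

(into []  xf (range 10))      ;=> [2 4 6 8 10]   ; sequence in, vector out
(into #{} xf (range 10))      ;=> #{2 4 6 8 10}  ; same transform, set out
(transduce xf + 0 (range 10)) ;=> 30             ; same transform, plain number out
```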
The other way I think of this is “explicit stream fusion”: transducers work by passing an explicit continuation around and you call that continuation to add results to the output: but, crucially, you don’t have to know what the output type is, you just say “add this to it”.
That's interesting. But every such type must have the same "shape" - a sequence of things. So it's the same as bind, plus an optional conversion at the end.
Which is the same as Rust's `flat_map`, because in Rust when you operate with iterators you can end up collecting into another type. For example, `myvec.into_iter().flat_map(|x| ..).collect()` converts from a `Vec<T>` into a `SomeType<U>`, where `SomeType` is any type that represents a sequence, and `T` to `U` is the type conversion from the map.
> (A difference between fold and transducers is that transducers can work in infinite input types like core.async channels)
Ok so transducers are just lazy folds (I mean folds can work with infinite data in Haskell as well)
> it's the same as bind, plus an optional conversion at the end.
The mapcat transducer (flatMap in other languages) is equivalent to bind. The “optional conversion at the end” would defeat the whole point of transducers because it would force you to materialize an intermediate sequence before building up the result. The result type is a superset of monads, I’m fairly sure: it’s anything that can be the result of a two argument function with the signature `a -> b -> a`: there’s nothing in theory preventing using a transducer to sum up a list of integers.
Note that the Emacs Lisp variant is still under development. Once things look good, we'd like to circle back around to Scheme's SRFI-171 (mentioned elsewhere here) and possibly redo it to include more fundamental primitives.
Rich gave a talk on this, and there is an associated HN thread for the talk, but some of you may be interested in the original transducers presentation. You can view it here: https://www.youtube.com/watch?v=6mTbuzafcII
Great that the author mentioned Christophe Grand’s xforms library. It has transducers (and reducing functions) useful for hashmap processing such as x/by-key and x/reduce. Very useful for utilizing the transducer paradigm with a wider array of data types and problem spaces.
I don't have experience with Clojure, but this description makes transducers sound like a simple pipeline of functions transforming a stream along the way. This is quite common and readily available in any language; it's also how Unix piping works (for text/binary, but that's a stream of chars/bytes), and I'm wondering why the name "transducer" was required.
The ideas are similar, but transducers allow greater flexibility, composability, and testability. The pure idea of "pipelines" isn't anything new, though, as you pointed out.
When you have a Unix pipeline, let's say `cat file | grep -v '^#' | sed 's/\t//' | wc -l`, or something, each step must specify where the input and output come from, i.e., you can't "pull out" the grep, sed, and wc parts and then tell them that the input is coming from a network socket or web service instead. It _must_ come from a file descriptor (stdin, stdout). And you can't (easily) multi-thread just the sed piece. Or (easily) add a logger between cat and grep. Or a retry between two parts. Or have it read from a network socket instead. And it's not simple to combine these pipelines with other complex pipelines.
Anyway, take that concept, run with it for a while, and you get transducers.
For a JS/TS developer, the best explanation is: you have functions that take an iterable as input and return a generator. If you arrange your functions as higher-order functions (functions that return an `Iterable => Generator` function) you can pipe them, constructing arbitrary pipelines.
We use this technique in trading systems. It works very well for certain class of problems. It looks like this [0].
It's not, this is that Lisp thing where you can check the length of your argument list and do something completely different when you don't get enough arguments.
In this case `(map f)` notices that it was told to map a function but not told what to map it over, and so it decides to give you a new function. If this were a partially applied function the signature would be `[x] -> [y]` where `f: x -> y` would pick out the specifics.
But this is actually a totally different signature, isomorphic to `x -> [y]`, the signature of generators. Specifically `(map f)` generates what in Haskell would be `\x -> [f x]`.
However the type is not quite that straightforward for historical and compositional reasons; it is actually

```haskell
∀z. (y -> z -> z) -> x -> z -> z
```

with the implementation being

```haskell
\handle x -> handle (f x)
```
This is a sort of enhanced map that can do filtering because it uses concatMap (also known as >>=, "bind in the list monad") to combine. So for `(filter pred)` you would have the isomorphic versions

```haskell
\x -> if pred x then [x] else []
\handle x rest -> if pred x then handle x rest else rest
```
This also leads to an important nitpick for the article in question: a strictly better mental model of a transducer is not that it maps conveyor belts to conveyor belts, since that has more power than transducers do. (For instance, reverse is not a transducer.) Rather, it maps individual items on a conveyor belt to their own conveyor belts on the first conveyor belt, then mashes them all together into one effective conveyor belt.
So filter will either map an object to a singleton conveyor belt containing that thing, or an empty conveyor belt. You can implement `dupIf pred x = if pred x then [x, x] else [x]` as a transducer too, `handle x (handle x rest)`. A conveyor belt that either has one or two elements on it. You can potentially put an infinite conveyor belt inside your conveyor belts and make a chunk of the input unreachable, although Clojure is strict so I have the feeling this will just run out of memory?
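For comparison, a sketch of that `dupIf` idea as a Clojure transducer (the name `dup-if` is mine):

```clojure
(defn dup-if [pred]
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result x]
       (if (pred x)
         ;; emit the item twice, respecting early termination from downstream
         (let [r (rf result x)]
           (if (reduced? r) r (rf r x)))
         (rf result x))))))

;; (into [] (dup-if even?) [1 2 3 4]) ;=> [1 2 2 3 4 4]
```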
You probably are thinking of normal functions, and not partially applied ones (which are also just normal functions that we get a special way totally unrelated here). Also I don't think they can reproduce
Blammo! Our gnome is now packaging together incoming items into bundles of three, caching them in the interim while the bundle is not complete yet. But if we close the input prematurely, it will acknowledge and produce the incomplete bundle:
```clojure
(>!! b 4)
(>!! b 5)
(close! b)
; Value: [4 5]
```
They're like iteratees, but with awkward edge cases (particularly around error handling). If you've used Conduit you'll have already had the positive experiences other comments are talking about - realising how powerful and general-purpose the abstraction is and using it for everything.
Plain old functions & eager evaluation & a bit more awesome sauce.
Given transducers, we can compose mutually independent parts at will:
- Data source (sequence, stream, channel, socket etc.)
- Data sink (sequence, stream, channel, socket etc.)
- Data transformer (function of any value -> any other value)
- Data transformation process (mapping, filtering, reducing etc.)
- Some process control (we can transduce finite data (of course) as well as streams, and also have optional early termination in either case. I'm not sure about first-class support for other methods like backpressure.)
e.g. read numbers off a Kafka topic, FizzBuzz them, and send them to another Kafka topic, OR slurp numbers from file on disk, FizzBuzz them, and push into an in-memory queue. But each time, you don't have to rewrite your core fizzbuzz function, nor your `(map fizzbuzz)` definition.
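A rough sketch of that; the Kafka plumbing is left out, and `fizzbuzz` here is just an illustrative helper:

```clojure
(require '[clojure.core.async :as a])

(defn fizzbuzz [n]
  (cond (zero? (mod n 15)) "FizzBuzz"
        (zero? (mod n 3))  "Fizz"
        (zero? (mod n 5))  "Buzz"
        :else n))

(def xf (map fizzbuzz))

(into [] xf (range 1 16))     ; numbers slurped from an in-memory source
(def out-chan (a/chan 10 xf)) ; the exact same xf, attached to a channel
;; a Kafka consumer/producer pair would plug the same xf in at its edges
```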
I can't downvote, but might have, as the first sentence is an over-simplification and misunderstanding, particularly as laziness for collections has always been available in clojure.core. Clojure transducers offer an optimization orthogonal to collections, best summed up above with: "transducers allow you to define steps in collection processing _per item_ rather than having collection processing as a series of transformations of collections". Yes, transducers can be viewed as somewhat of an analog to the Haskell conduit library (as discussed here several years ago: https://hypirion.com/musings/haskell-transducers). However, I think the detractors coming from strongly typed languages are decidedly missing much of the generalization of the transducer model, particularly those conflating transducers exclusively with streams.
Thanks for this link. It seems to confirm things: "aren’t Conduits and Transducers then equivalent (isomorphic)? I am pretty sure they are."
I view this as a good sign. When two independent parties arrive at the same design it is usually an indication that they have discovered a universal and principled solution.
I consider the "conduit" library to be one of Haskell's "killer features", and sorely miss having something like it when working in other languages.
Maybe when Haskellers dismiss clojure transducers as being "just like conduit" it comes from a place of jealousy? I've seen several articles and discussions over the years of clojure transducers that take place outside of clojure communities and are aimed at the wider programming public, praising the benefits of it. But I've never seen conduit discussed outside of Haskell communities.
No clue what it means, but I'm convinced it can be written in 3 lines with a for loop in a way that 100% of people looking at it will understand it, and not 5%. Probably even in Clojure!
I'm not expert Clojurist, but my first guess of this is that it prints and pushes every non-nil value into a go channel xf, one-by-one, recursively, in a new thread, then return the channel, presumably so more work can be done and passed to it.
The number of exclamations in <!! means something, but I forget what, I think something about non/blocking.
https://github.com/metabase/metabase/blob/master/src/metabas...
(nb: I used to work for Metabase but currently do not. but open core is open core)