
C++14: Transducers - ingve
http://vitiy.info/cpp14-how-to-implement-transducers/
======
gue5t
As far as I can discern, transducers are exactly the same as iterator adaptors
in Rust: you compose transformations on a stream and get back a computation
that pumps one element through the entire set of transformations on each
iteration, i.e. it fuses the loops together. Then you can either run this
computation and collect results into a data structure of your choosing or
compose different stream transformations onto the end.

The C++ implementation of this notion looks like an inscrutable, complete
trainwreck, like the rest of the language.

~~~
llasram
Kind of. Rust iterator adapters are closer to the previous Clojure reducers
namespace and the Java 8 streams API. All three take a "collection" and return
a new "collection" which includes additional behavior/transformations when
ultimately iterated over.

Transducers separate out the transformation processes into free-standing
composable functions. This makes the transformations first-class values, and
makes it possible to apply such transformations to less collection-like
entities such as async channels.

I'm personally still not sure if it's a good idea or not, but it is an
interesting approach.
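
For readers who want something concrete, here is a minimal C++14 sketch of the idea, not the article's implementation and not Clojure's. It assumes a "step" is any callable of shape (accumulator, input) -> accumulator, and a transducer is a function from one step to another. The names mapping, filtering, compose, transduce and even_squares are made up for illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// mapping(f): a transducer that applies f to each input before passing it on.
template <typename F>
auto mapping(F f) {
    return [f](auto step) {
        return [f, step](auto acc, auto input) {
            return step(std::move(acc), f(input));
        };
    };
}

// filtering(pred): a transducer that drops inputs for which pred is false.
template <typename P>
auto filtering(P pred) {
    return [pred](auto step) {
        return [pred, step](auto acc, auto input) {
            if (pred(input)) {
                return step(std::move(acc), input);
            } else {
                return acc;  // skip this input, accumulator unchanged
            }
        };
    };
}

// compose(t1, t2): each input passes through t1 first, then t2.
template <typename T1, typename T2>
auto compose(T1 t1, T2 t2) {
    return [t1, t2](auto step) { return t1(t2(step)); };
}

// transduce: one loop over the inputs, no intermediate collections.
template <typename Xf, typename Step, typename Acc, typename Range>
Acc transduce(Xf xform, Step step, Acc init, const Range& inputs) {
    auto full_step = xform(step);
    for (const auto& x : inputs) {
        init = full_step(std::move(init), x);
    }
    return init;
}

// The transformation (xform) is a value that knows nothing about vectors;
// only the final step (push_back) and the driver (transduce) do.
std::vector<int> even_squares(const std::vector<int>& xs) {
    auto xform = compose(filtering([](int x) { return x % 2 == 0; }),
                         mapping([](int x) { return x * x; }));
    auto push_back = [](std::vector<int> acc, int x) {
        acc.push_back(x);
        return acc;
    };
    return transduce(xform, push_back, std::vector<int>{}, xs);
}

void demo() {
    assert((even_squares({1, 2, 3, 4, 5, 6}) == std::vector<int>{4, 16, 36}));
}
```

Note that xform is built before any collection is mentioned, which is the "first-class value" part; swapping push_back for a different step would reuse it unchanged.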

~~~
gue5t
I'm not sure you're characterizing Rust iterator adaptors right.

Iterator _adaptors_ take an iterator (which usually, but not always, comes
from a collection, e.g. "my_vec.into_iter()") and give you another iterator
back, without allocating an intermediate collection; the iterator itself
stores only the current state of the iteration process. You can apply iterator
adaptors to such iterators as the receiving end of a channel
([http://doc.rust-lang.org/std/sync/mpsc/struct.Receiver.html#...](http://doc.rust-lang.org/std/sync/mpsc/struct.Receiver.html#method.iter)), and you can pass
around the iterator and add other computations to the end of it before you
"run" the whole thing.

However, I'm not sure what you mean by the transformation processes being
_functions_. In Rust these processes (iterators) are values, and have methods
that produce a derived process by appending a step; for a concrete example:

    
    
        let fb_chars = "foobar".chars(); // fb_chars is an iterator
        // not_o is also an iterator; we got it by applying the filter iterator adaptor
        let not_o = fb_chars.filter(|&x| x != 'o');
        // run the iterator with a for loop; we could also have run it by
        // calling .collect() and obtaining a collection
        for i in not_o { println!("char: {}", i); }
    

What are the domain and range of the functions you mention in Clojure?

~~~
bad_user
I guess Rust's iterators are pretty standard, meaning that you keep calling
"next(): T" until there are no more elements, right?

Well, that's pretty cool and iterators are generic, but not generic enough. If
you look closely, their protocol has certain constraints. For example, the
next() call is synchronous. If the next element is not ready, well, tough
luck, the current thread needs to be blocked. Another constraint is that
iterators produce elements, right? Well, iterators are cool and useful, but
many operators like map(), filter() and flatMap() can be applied on things
that don't necessarily produce values. As a side note, this is why monads are
hard to understand, being a little more abstract than iterators.

What transducers from Clojure are doing is to define a more generic protocol
(more generic than that of Iterator) such that you can have defined operators,
like map and filter, operate on whatever you want, such that you can reuse the
implementation of those operators. You can also compose those operators,
before applying them to something concrete.
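
A rough C++14 sketch of that reuse, under the assumption that a step is (accumulator, input) -> accumulator. The same composed map+filter is driven once by a loop that pulls from a vector and once by a hypothetical PushSink whose producer calls push() whenever a value happens to be ready, so nobody blocks on next(). All names here are invented for the example.

```cpp
#include <cassert>
#include <utility>
#include <vector>

template <typename F>
auto mapping(F f) {
    return [f](auto step) {
        return [f, step](auto acc, auto input) {
            return step(std::move(acc), f(input));
        };
    };
}

template <typename P>
auto filtering(P pred) {
    return [pred](auto step) {
        return [pred, step](auto acc, auto input) {
            if (pred(input)) {
                return step(std::move(acc), input);
            } else {
                return acc;
            }
        };
    };
}

// Pull-style driver: walks a container and feeds each element to the step.
template <typename Xf, typename Step, typename Acc, typename Range>
Acc transduce(Xf xform, Step step, Acc init, const Range& inputs) {
    auto full_step = xform(step);
    for (const auto& x : inputs) {
        init = full_step(std::move(init), x);
    }
    return init;
}

// Push-style driver (hypothetical): values arrive one at a time via push().
template <typename Acc, typename Step>
class PushSink {
public:
    PushSink(Acc init, Step step)
        : acc_(std::move(init)), step_(std::move(step)) {}

    template <typename T>
    void push(T value) {
        acc_ = step_(std::move(acc_), std::move(value));
    }

    const Acc& result() const { return acc_; }

private:
    Acc acc_;
    Step step_;
};

template <typename Acc, typename Xf, typename Step>
auto make_push_sink(Acc init, Xf xform, Step step) {
    auto full_step = xform(step);
    return PushSink<Acc, decltype(full_step)>(std::move(init),
                                              std::move(full_step));
}

void demo() {
    // Keep positive numbers, multiply by 10. Defined once, used twice.
    auto xform = [](auto step) {
        return filtering([](int x) { return x > 0; })(
            mapping([](int x) { return x * 10; })(step));
    };
    auto sum = [](int acc, int x) { return acc + x; };

    std::vector<int> data{-1, 2, -3, 4};
    assert(transduce(xform, sum, 0, data) == 60);  // 20 + 40

    auto sink = make_push_sink(0, xform, sum);
    sink.push(5);
    sink.push(-7);  // filtered out
    sink.push(1);
    assert(sink.result() == 60);  // 50 + 10
}
```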

~~~
Veedrac
> Well, iterators are cool and useful, but many operators like map(), filter()
> and flatMap() can be applied on things that don't necessarily produce
> values.

You've lost me. You need to apply the function in `flatMap` to _something_ ,
and `flatMap` by definition will output zero or more values.

So what do you mean?

~~~
bad_user
The signature of flatMap is something like this:

    
    
        flatMap[M,A,B](cc: M[A], f: A => M[B]): M[B]
    

So given that you respect its signature, M[A] is not necessarily a producer of
values. It can be anything and it helps if you think about it not like a
collection or a container, but more like a _context_. For example M[A] could
be a function of type S => (A, S). And then you could map and flatMap on such
functions to produce other functions. This is the "state monad" btw.
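
To make that concrete in the thread's language, here is a small C++14 sketch where M[A] is "any callable S -> std::pair<A, S>". The helper names map_state and flat_map_state are made up for this example; it is only meant to show that map and flatMap can be defined on functions rather than on collections.

```cpp
#include <cassert>
#include <utility>

// map: given m : S -> (A, S) and f : A -> B, build S -> (B, S).
template <typename M, typename F>
auto map_state(M m, F f) {
    return [m, f](auto s) {
        auto r = m(std::move(s));  // r.first is the value, r.second the new state
        return std::make_pair(f(r.first), r.second);
    };
}

// flatMap: given m : S -> (A, S) and f : A -> (S -> (B, S)), build S -> (B, S).
// The state produced by m is threaded into the computation chosen by f.
template <typename M, typename F>
auto flat_map_state(M m, F f) {
    return [m, f](auto s) {
        auto r = m(std::move(s));
        return f(r.first)(r.second);
    };
}

void demo() {
    // A stateful computation: hand out the current counter, then increment it.
    auto next_id = [](int s) { return std::make_pair(s, s + 1); };

    // map changes the produced value and leaves the state threading alone.
    auto labelled = map_state(next_id, [](int id) { return id * 100; });
    auto r1 = labelled(7);
    assert(r1.first == 700 && r1.second == 8);

    // flatMap sequences two computations: draw two ids and add them.
    auto two_ids = flat_map_state(next_id, [next_id](int a) {
        return map_state(next_id, [a](int b) { return a + b; });
    });
    auto r2 = two_ids(10);  // draws 10, then 11; final state is 12
    assert(r2.first == 21 && r2.second == 12);
}
```

Nothing here "contains" values until an initial state is supplied, which is the sense in which M[A] is a context rather than a producer.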

------
rwmj
TL;DR: Higher order function features that have been available in functional
languages since the 1970s will soon be available in C++, with more verbose
syntax, explicit type decls everywhere, and (probably) enormous error messages
when you get things wrong. The "new" feature will interact badly with other
language features like exceptions or RAII and in any case you won't be allowed
to use them in the subset of C++ that your company permits you to use.

~~~
vvanders
I think the relevant bit is that we're seeing this without GC, which makes it
applicable to a whole set of problems that weren't previously relevant (which
is also why I think Rust is so interesting).

Yeah, we get it that functional languages came first, but rather than looking
down on C++, why not celebrate that functional constructs are making it into
places where they haven't been before?

~~~
rwmj
What are the problems where GC can't be used? I'm going to guess very tiny
embedded systems and hard real time. What are the places where C++ is being
used? Unfortunately a lot more than those, resulting in expensive, unreliable,
insecure software.

~~~
Guvante
To be fair soft-realtime systems don't interact well with GC's since random
multi-millisecond pauses break everything.

There is at least work towards making controlled GC pauses a possibility.

~~~
pjmlp
> To be fair soft-realtime systems don't interact well with GC's since random
> multi-millisecond pauses break everything.

Like these ones?

[http://www.atego.com/de/products/atego-
perc/](http://www.atego.com/de/products/atego-perc/)

Used for this?

[http://mil-embedded.com/news-id/?21468=](http://mil-embedded.com/news-
id/?21468=)

~~~
Guvante
Deterministic garbage collection is a completely different beast than what
most people call GC. It is not a nearly magical fairy that lets you have
infinite memory.

Also IIRC aviation has incredibly strict rules on freeing during flight, so
this likely switches to a fixed memory model for at least part of the time.

~~~
pjmlp
You aren't allowed to free at all in avionics.

In Ada, deallocation requires use of the unsafe package, and in SPARK dynamic
allocation isn't even possible.

However, the usage of GC-based systems by the military in their weapon control
systems shows that there are GC implementations that are able to cover certain
types of real-time use.

After all a GC pause during a battle situation is probably not something one
wishes for.

------
rburhum
I have been doing C++ for almost 2 decades and at a particular stage I was
able to say that I knew the language with a self-proclaimed score of "8 out of
10". Nowadays, I would give myself a "4.5".

The crazy thing is that, to me, some new features of the language feel like
they are being pushed because [x] language has them, more than out of an actual
_need_ to solve a particular problem that was too difficult to solve using only
the constructs available previously.

Two nights ago I was at a C++ Meetup and this guy was explaining how to solve
this problem using a crazy-complicated pattern with a new proposed C++
feature. Eventually I had to stop him and ask him why we could not simply do
this with a priority heap, a threadpool and some semaphores (simple shit). "Oh
yeah, you could do that, too, but this way is cooler".

~~~
meshko
The same thing is happening all over the industry. Things are growing
organically in all kinds of directions, with new technologies appearing at a
rate an order of magnitude higher than anything that was happening 10 years ago.
It is making it much harder to stay current, and the net benefit (so far)
appears to be negative (I still can't believe that node.js made this terrible
untyped language take over the world). I am hoping something good comes out of
this, but it is certainly not the nice and cozy industry we knew and loved
10-15-20 years ago. Oh well, it too will pass.

~~~
OmarIsmail
Practical javascript can now be typed very easily (TypeScript or Facebook's
flow). Any "serious" JS shop is going to be using typed-JS in the next 12
months.

~~~
meshko
The problem is right there in your response. Do I do TypeScript or Facebook's
flow (whatever that is, I don't know)? Or perhaps CoffeeScript? Or wait, that
one is not typed. Or is it? Lots of people like it though! Except the ones
that prefer EcmaScript 2015. Because it is the future. At least for the next 2
weeks when 2016 starts. But EcmaScript 2015 is not typed either. So much for
your promise of all serious JS shops doing typed JS in next 12 months. Also,
yeah, great, "serious" shops will be typed. What about "not serious" shops?
They don't count? And their users who have to suffer through crappy products
built using incomplete tools don't count? Oh, and you know what all these half
a dozen most popular and legion of less popular solutions to the problem of
"JavaScript sucks" have in common? They add yet another step into the
toolchain making the developer's life so much harder. It is nobody's fault
that we are where we are, but please, let's call it as it is and not pretend
that this pathetic situation is somehow progress or even acceptable.

------
mavam
We've been seeing similar attempts recently, all attempting to fill a long
overdue gap: the composability of algorithms in the standard library. At this
point, there exists a TS [1] for _ranges_ [2] along with an implementation
[3]. It will be a thin layer on top of the iterator-based standard library,
sometimes dubbed "STL2."

On the one hand, it's great to see new attempts to compose algorithms
mushrooming, but on the other hand it would be great to see more aligned efforts
to collaboratively push forward the TS and its implementation.

[1] A technical specification (TS) can be thought of as a self-contained mini
standard that vendors can implement, but do not have to in order to be
"standard-compliant". A TS is a promising candidate for future inclusion into
the standard, however.

[2]
[https://ericniebler.github.io/std/wg21/D4128.html](https://ericniebler.github.io/std/wg21/D4128.html)

[3]
[https://github.com/ericniebler/range-v3](https://github.com/ericniebler/range-v3)

~~~
fryguy
Are ranges essentially the IEnumerable<T> from c# for c++?

~~~
pjmlp
Basically yes.

------
spooningtamarin
A benchmark showing that the composition performs exactly the same in a for
loop as the unrolled functions would be nice.

~~~
mavam
The proposed Ranges TS has some nice benchmarks:
[https://ericniebler.github.io/std/wg21/D4128.html](https://ericniebler.github.io/std/wg21/D4128.html).

See Appendix 1 and 4.

(It's not exactly what you're asking for, but showcases that such claims often
hold true in practice.)

------
jokoon
To be honest, if I were to make a language, I would make something like C++
that breaks backward compatibility with C, but is much simpler and more
beginner-friendly.

No templates, but standard maps, sets, arrays. Both compiled and interpreted.
No virtual functions. Clean, simple syntax close to C, but with all the nicer
stuff C++ brought along, like lambdas.

C++ is industrial-strength, but sometimes it seems it's not taught because it's
too much of an expert's language, in favor of Java or C#, while C/C++ is the
language that should be taught.

I don't understand languages like rust and go which introduce a non trivial
new syntax. Why not keep the C flavor ? I don't see how people are going to
adopt languages like that. There should be a statically compiled language
which has the same philosophy of python.

~~~
pcwalton
> I don't understand languages like rust and go which introduce a non trivial
> new syntax. Why not keep the C flavor ? I don't see how people are going to
> adopt languages like that. There should be a statically compiled language
> which has the same philosophy of python.

I don't understand what having "the same philosophy of Python" has to do with
not keeping "the C flavor". The delta between Python's syntax and C's syntax
is basically equal to the delta between Go's syntax and C's syntax, or Rust's
syntax and C's syntax.

~~~
jokoon
I was talking about 2 different things. What is good about C is its
simplicity. Python is easy to learn too, and it has nice language facilities
and syntax sugar.

About the deltas, maybe you're right, but I disagree about the difficulty of
the languages. For example go function prototypes seem pretty heavy to fully
grasp and understand. Rust also has expert programming facilities and
paradigms.

------
jheriko
this is revolting code. period.

it does nothing special or novel whatsoever - except maybe to set a quality
bar for writing unmaintainable code.

the very first example given is simple, but demonstrates a thorough lack of
understanding for what constitutes readable code. if i had to code review this
from a junior i wouldn't let it go in without significant changes, and i'd
make time to teach them some of the most basic concepts about producing
maintainable and readable code... but only because it's easier than firing
them.

i am skeptical that the label 'transducer' has any solid value whatsoever
here. i can understand the concept... it's basically to encapsulate stages of
algorithms neatly. the fact that it manifests one way in Clojure says nothing
about the value of using exactly the same mechanism in C++ to achieve the
same...

normal classes, normal functions, and not being rubbish at writing code
combine to give a much better result for the same class of problems.

------
AndrewGaspar
I tried implementing transducers in MSVC about a year ago. I got pretty far in
implementing a lot of the basic set of transducers one would use in development.
However, I ultimately abandoned it because they weren't really a great fit
with C++. You could make it all work, but the meta template code is extremely
difficult to reason about, and MSVC specifically had a lot of issues
processing the code (pretty quickly bumped up against MSVC's type name size
restriction!). There are also a lot of issues to think about regarding
lifetime semantics - figuring out the right behavior was pretty difficult, and
debugging issues with it pretty frustrating, especially when the code was
optimized.

The pros were that, at least for my limited testing, performance was
essentially equivalent, for optimized code, between the transducer
implementation and the equivalent hand rolled code (e.g., doing a map+filter
transducer vs. a foreach with an if), and that using transducers offered
further optimizations if you applied them completely up and down the stack, by
allowing you to, say, early terminate a sequence that would otherwise have
been done in entirety behind some opaque function call. I don't really
consider code readability to be a benefit here since transducers are, somewhat
inherently, a heady subject.

You can see my work here:
[https://github.com/AndrewGaspar/transducers.hpp](https://github.com/AndrewGaspar/transducers.hpp)

I haven't done a formal write-up or anything on it, but check out the tests if
you want to see what consuming them would kind of look like.

EDIT: One of the things I'm proud of was rigging it, without need of dynamic
typing, such that a transducer could produce heterogeneous types. Obviously
your sink would need to be capable of accepting all these different types
being produced by the transducible process, but it allowed for some
interesting integration with <iostream> - say, creating a transducer that
inserts commas between integers for output (see
[https://github.com/AndrewGaspar/transducers.hpp/blob/master/...](https://github.com/AndrewGaspar/transducers.hpp/blob/master/include/transducers/interjecting.hpp)
and
[https://github.com/AndrewGaspar/transducers.hpp/blob/master/...](https://github.com/AndrewGaspar/transducers.hpp/blob/master/unit_tests/interjecting.cpp)).

EDIT 2: I notice that early termination was not shown in this blog post. I
implemented this by essentially introducing a known wrapping type
([https://github.com/AndrewGaspar/transducers.hpp/blob/master/...](https://github.com/AndrewGaspar/transducers.hpp/blob/master/include/transducers/escape_hatch.hpp),
sorry I know it's a bad name). The trick with this was I only wanted the
returned values of a transducible process to be wrapped by this struct (since
it has the overhead of one bool value, and a check to actually determine if it
could early terminate) if a single transducer in the chain can early
terminate. It also made each individual transducer slightly more complicated
to implement, unfortunately.
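
A simplified C++14 sketch of early termination with a wrapper type is below. It is not the escape_hatch.hpp code: MaybeDone, taking, mapping and transduce are names assumed for this example, and unlike the approach described above, this version wraps every step's result unconditionally rather than only when some transducer in the chain can terminate early.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical wrapper: the accumulator plus one bool meaning "stop now".
template <typename Acc>
struct MaybeDone {
    Acc value;
    bool done;
};

// In this sketch every step has shape (Acc, input) -> MaybeDone<Acc>.
template <typename F>
auto mapping(F f) {
    return [f](auto step) {
        // mutable: the downstream step may carry its own mutable state
        return [f, step](auto acc, const auto& input) mutable {
            return step(std::move(acc), f(input));
        };
    };
}

// taking(n): forwards the first n inputs, then asks the driver to stop.
inline auto taking(std::size_t n) {
    return [n](auto step) {
        std::size_t seen = 0;
        return [n, step, seen](auto acc, const auto& input) mutable {
            using Result = decltype(step(std::move(acc), input));
            if (seen >= n) {
                return Result{std::move(acc), true};
            }
            ++seen;
            auto r = step(std::move(acc), input);
            if (seen >= n) {
                r.done = true;  // this was the last input we wanted
            }
            return r;
        };
    };
}

// Driver that checks the flag after each element. `visited` (optional)
// reports how many inputs were actually looked at.
template <typename Xf, typename Step, typename Acc, typename Range>
Acc transduce(Xf xform, Step step, Acc init, const Range& inputs,
              std::size_t* visited = nullptr) {
    auto full_step = xform(step);
    std::size_t count = 0;
    for (const auto& x : inputs) {
        ++count;
        auto r = full_step(std::move(init), x);
        init = std::move(r.value);
        if (r.done) {
            break;  // the remaining inputs are never touched
        }
    }
    if (visited != nullptr) {
        *visited = count;
    }
    return init;
}

void demo() {
    auto push_back = [](std::vector<int> acc, int x) {
        acc.push_back(x);
        return MaybeDone<std::vector<int>>{std::move(acc), false};
    };
    auto xform = [](auto step) {
        return mapping([](int x) { return x * x; })(taking(3)(step));
    };
    std::vector<int> data{1, 2, 3, 4, 5, 6, 7, 8};
    std::size_t visited = 0;
    auto out = transduce(xform, push_back, std::vector<int>{}, data, &visited);
    assert((out == std::vector<int>{1, 4, 9}));
    assert(visited == 3);  // stopped after the third element
}
```

The cost mentioned above is visible here: each step returns one extra bool, the driver checks it on every element, and mapping has to be written against the wrapped protocol even though it never terminates early itself.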

------
xedarius
I think the best part of 'modern c++' is the fact that you don't have to use
any of it (I'll make a concession for move constructors, that just makes
sense).

