

Container Algorithms - signa11
http://ericniebler.com/2014/11/23/container-algorithms/

======
blt
For all of the STL's power, and all the useful stuff in <algorithm>, it's
still impossible to write something as elegant as a list comprehension in
Python. It's really exciting to see Eric's work taking us closer to this goal.

On the other hand, I always worry about all the magic implicit behavior in
C++, both in old and new language features. We're writing in C++, we care
about efficiency. Eric addresses this by requiring rvalue references in the
container algorithms, but in general it's pretty easy to write code that does
more work than you expected.

Since most undesired magic in C++ boils down to allocating heap memory
unexpectedly, it would be cool to add syntax that could tell the compiler that a
block is not allowed to allocate, like:

    
    
        noalloc {
            bigvec = std::move(bigvec) | cont::sort | cont::unique;
        }

------
repsilat
I don't understand why the lazy algorithms aren't enough. Can't we have an
assignment to vectors from lazy streams, or constructors from them?

Why can't the

    
    
        std::vector<int> ints =
            read_ints() | cont::sort | cont::unique;
    

from TFA be done "as lazily as possible" (i.e., sort eagerly and uniq
lazily)? Alternatively, something like

    
    
        std::vector<int> ints(read_ints() | cont::sort | cont::unique);
    

or even

    
    
        std::vector<int> ints;
        read_ints() | cont::sort | cont::unique > ints;
    

What reason could we have for making `read_ints` and `unique` eager by
default?

Performance-wise, the lazy filter works in constant space, will result in
fewer dcache misses if the data set is large, and (I _think_ ) provides more
opportunity for the optimizer to do clever things. The eager chain may have
better icache performance if the filter is very complicated.

And it's easy to go from lazy to eager -- just put everything in a vector. If
you want to go from eager to lazy, though, you're out of luck.
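
To make the "sort eagerly, uniq lazily" idea concrete, here is a minimal sketch in plain C++. `LazyUnique` and `to_vector` are hypothetical names made up for illustration, not anything from the article or from range-v3. The sort runs up front because it has to see all the data; the unique step then hands out one distinct value at a time without an extra buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of any library): sorts eagerly on
// construction, then yields distinct values one at a time.
class LazyUnique {
public:
    explicit LazyUnique(std::vector<int> data) : data_(std::move(data)) {
        std::sort(data_.begin(), data_.end());  // sorting cannot be lazy
    }

    // Writes the next distinct value to `out`; returns false when exhausted.
    bool next(int& out) {
        if (pos_ == data_.size()) return false;
        out = data_[pos_];
        // Skip over the run of duplicates of the value just produced.
        while (pos_ < data_.size() && data_[pos_] == out) ++pos_;
        return true;
    }

private:
    std::vector<int> data_;
    std::size_t pos_ = 0;
};

// Going from lazy to eager: drain the stream into a vector.
std::vector<int> to_vector(LazyUnique stream) {
    std::vector<int> result;
    int value = 0;
    while (stream.next(value)) result.push_back(value);
    return result;
}
```

A caller who only needs to iterate can loop on `next` directly; a caller who wants a container calls `to_vector(LazyUnique(std::move(values)))`. The reverse direction is not available once the eager result has been built.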

~~~
zamalek
I think it comes down to a "memory bug" vs. "performance bug" trade-off. If
you make it default-lazy you run the risk of having the stream evaluated
multiple times. If you make it default-eager you run the risk of O(N) memory
(as you said).
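
A rough sketch of the re-evaluation risk, using a made-up `LazyMap` type (hypothetical, not a real library class) whose function runs again on every traversal. The two helper functions only count how often the mapped function is called.

```cpp
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical lazy view: applies `fn` to each element on every traversal.
struct LazyMap {
    std::vector<int> source;
    std::function<int(int)> fn;

    // Each call walks the source and re-runs `fn` for every element.
    int sum() const {
        int total = 0;
        for (int x : source) total += fn(x);
        return total;
    }

    // Materialize once; later passes over the result do not call `fn`.
    std::vector<int> to_vector() const {
        std::vector<int> out;
        out.reserve(source.size());
        for (int x : source) out.push_back(fn(x));
        return out;
    }
};

// Traverses the lazy view twice and reports how often `fn` ran.
int calls_when_lazy() {
    int calls = 0;
    LazyMap squares{{1, 2, 3}, [&calls](int x) { ++calls; return x * x; }};
    squares.sum();
    squares.sum();
    return calls;
}

// Materializes first, then traverses the cached vector twice.
int calls_when_materialized() {
    int calls = 0;
    LazyMap squares{{1, 2, 3}, [&calls](int x) { ++calls; return x * x; }};
    const std::vector<int> cached = squares.to_vector();
    int first = std::accumulate(cached.begin(), cached.end(), 0);
    int second = std::accumulate(cached.begin(), cached.end(), 0);
    (void)first;
    (void)second;
    return calls;
}
```

With three elements, the lazy version runs the function on each of the two passes, while the materialized version runs it once per element and pays O(N) memory for the cache instead.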

I agree with your side of the trade-off, a performance bug (result: slow
application) is always preferable to a memory bug (result: failed malloc,
likely crash) in production.

To make things more desirable, you could always do something even simpler than
your example:

    
    
        auto ints = read_ints() | cont::sort | cont::unique | cont::to_vector;
    

That way fixing the performance bug would only need a one-liner.

Still, having the option is better than nothing.

------
untothebreach
As someone who knows C++, but doesn't use it every day, nor keep up with the
latest and greatest in the community, I was surprised to see the "pipe"
syntax. It looks like the author is using them like unix pipes, or like the
`|>` macro in Elixir. When did C++ get this? Is it just functions overloading
the bitwise-or operator, or is it 'official' syntax?

~~~
pmr_
This is achieved by overloading the operator|, although there is some special
hackery to use static function objects to omit the constructor calls.

The first major library to use this syntax was Boost.Range and it is now being
proposed as the syntax of standard ranges.
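
For anyone curious what that looks like, here is a stripped-down sketch of the idea. The `demo` namespace and everything in it are invented for illustration; the real libraries are far more general. Each "algorithm" is a stateless function object with one global instance, and `operator|` just forwards the left-hand side to it. The overload only accepts rvalues, in the spirit of the rvalue requirement mentioned for the article's container algorithms.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

namespace demo {

// Function object types; one global instance of each acts as the adaptor.
struct sort_fn {
    std::vector<int> operator()(std::vector<int> v) const {
        std::sort(v.begin(), v.end());
        return v;
    }
};

struct unique_fn {
    std::vector<int> operator()(std::vector<int> v) const {
        // Removes adjacent duplicates, so it expects sorted input.
        v.erase(std::unique(v.begin(), v.end()), v.end());
        return v;
    }
};

// Global instances: writing `demo::sort` names an object, not a call.
constexpr sort_fn sort{};
constexpr unique_fn unique{};

// `v | f` calls `f(v)`. Taking an rvalue reference means an lvalue
// vector has to be passed with std::move, so no hidden copy is made.
template <typename Fn>
std::vector<int> operator|(std::vector<int>&& v, const Fn& f) {
    return f(std::move(v));
}

}  // namespace demo
```

With this in place, `std::move(bigvec) | demo::sort | demo::unique` compiles and chains left to right, while `bigvec | demo::sort` on an lvalue does not compile. The operator is found through argument-dependent lookup because the function objects live in `demo`.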

~~~
eric_niebler
Actually, the authors of Boost.Range got the idea from an earlier range
library of mine, which is noted in the Acknowledgements section of the
Boost.Range documentation
[here](http://www.boost.org/libs/range/doc/html/range/history_ack.html):

> Eric Niebler contributed the Range Adaptor idea which is arguably the single
> biggest innovation in this library.

------
rtpg
It still surprises me to this day how much minutia is found in C++'s standard
libraries when it comes to data structures. I have a hard time imagining how
one could write a more full-featured set of containers and the like.

~~~
pmr_
Just look at boost and you will get an idea of what other containers are
useful. Here is a quick list off the top of my head:

- flat set/map
- multi_index
- static_vector (called SmallVector in LLVM)
- bimap
- intrusive containers
- various from Boost.Heap

