
Intel Threading Building Blocks - samjltaylor
https://www.threadingbuildingblocks.org
======
krapht
I'd be interested in reading about real-world experience with Intel TBB, and
comparisons to FastFlow, which seems to be the nearest competitor.

~~~
danieljh
From what I can see in the scientific community it seems like TBB does not get
the attention it deserves.

In my opinion this comes from two simple facts: 1/ it's easier to throw an
OpenMP #pragma on your loop after detecting the bottleneck with a profiler and
2/ you need to understand some TBB concepts like partitioning/splitting to
implement your own algorithms. That is, TBB integrates much more deeply into
the language (modern C++ with lambdas etc.).

I won't tell you much about TBB's concurrent containers (maps, queues,
vectors, ...), parallel algorithms (sort, scan, reduce, ...), or memory
allocators. Those are explained in detail in the documentation. What I want to
tell you about are TBB's Pipeline and FlowGraph features, because of how
powerful they are and how often they are simply ignored.

TBB's pipeline lets you build a pipeline of parallel or serial stages. On a
high level it really is function composition, with the benefit of TBB deciding
on the level of parallelism.

Here is an example of using TBB's pipeline to receive from the network,
deserialize the blob and merge the results concurrently and potentially in
parallel:

[https://github.com/daniel-j-h/DistributedSearch/blob/97224b1...](https://github.com/daniel-j-h/DistributedSearch/blob/97224b179fdc050dc219287616e8d3073e0e0a8c/Service.cc#L114)

You can find a quick explanation in this blog post:

[https://daniel-j-h.github.io/post/intuitive-monadic-bind-kle...](https://daniel-j-h.github.io/post/intuitive-monadic-bind-kleisli-composition/)

All you have to do is create your stages as functions: each stage takes its
input from the previous stage, processes it and moves ownership of the item on
to the next stage. Note: with C++11's move semantics you do not need to pass
raw pointers around or do the memory management yourself, as is done in TBB's
documentation!

As you can see, you only need the parallel_pipeline function (and
make_filter<In, Out> for when the stages are not passed directly as lambdas):

[https://www.threadingbuildingblocks.org/docs/help/reference/...](https://www.threadingbuildingblocks.org/docs/help/reference/algorithms/parallel_pipeline_func.htm)

TBB's pipeline is really powerful for scenarios where a linear processing
chain is needed, e.g. face detection with OpenCV where you have to 1/ grab a
frame 2/ do some histogram corrections 3/ apply a Gaussian blur 4/ apply a
face detection algorithm 5/ merge the face's position into a circular buffer
6/ average over this buffer to estimate the face position.

TBB's FlowGraph is for when your dependencies are more difficult to express as
a simple pipeline. In the OpenCV example, maybe you need to buffer 5 frames
for the face detection, or maybe you need to join inputs from a camera and a
video and react when a face is found in a camera frame.

An example of expressing arbitrary dependencies is here:

[https://www.threadingbuildingblocks.org/docs/help/reference/...](https://www.threadingbuildingblocks.org/docs/help/reference/flow_graph/dependency_flow_graph_example.htm)

And an example of joining two nodes:

[https://www.threadingbuildingblocks.org/docs/help/reference/...](https://www.threadingbuildingblocks.org/docs/help/reference/flow_graph/message_flow_graph_example.htm)

This is the documentation's starting point for FlowGraph:

[https://www.threadingbuildingblocks.org/docs/help/index.htm#...](https://www.threadingbuildingblocks.org/docs/help/index.htm#reference/flow_graph.htm)

There is also a book about TBB out there and it contains some additional
examples. But as with most of the documentation, those examples are not
written in modern C++ (C++11/C++14) and the book is a bit dated.

~~~
gonewest
That's a good answer. Here's one more example from the visual effects
industry. This is for parallel evaluation of potentially complex dependency
graphs in an interactive character engine.

[http://www.multithreadingandvfx.org/course_notes/Paralleleva...](http://www.multithreadingandvfx.org/course_notes/ParallelevaluationofcharacterrigsusingTBB.pdf)

~~~
vvanders
We looked at TBB at a previous gig for something similar. However, we never
ended up pursuing it for unrelated reasons.

