I worked on a C++ trading app in the past and our approach was "services" (actors) talking via message passing. Basically a bunch of long-lived threads, each with a queue (nowadays that could be something like boost::lockfree::queue ), running an event loop. Message passing worked by getting a pointer to the target service's queue via a service registry (singleton initialized at startup, you could also keep a copy per service though), then pushing a message to it. If you needed to do something with the result, then that would come back via another message sent via the same system.
It does mean if there's back-and-forth between services the code would be harder to follow as you'd have to read through a few event handlers to get the whole flow, but with well-designed service boundaries it worked quite well. I don't think futures would have been useful in our context as blocking any service for an extended time would have been very bad.
 If you don't want boost, you can wrap std::queue yourself, this is a simple example where you can replace the boost mutex etc. with std ones: https://www.justsoftwaresolutions.co.uk/threading/implementi...
Futures are somewhat useful if you spawn off a thread to do a single task and wait for it to finish, however, on many platforms, the thread creation overhead is high enough that you don't want to do that, you want to have existing worker threads being given work to do.
Once again, this is something Go does really well. Go routines are almost free, so you can use them with a futures model very easily.
The advantages of this data structure over a linked list queue are as follows:
* Faster; only requires a single Compare-And-Swap instruction of sizeof(void *) instead of several double-Compare-And-Swap instructions.
* Simpler than a full lockless queue.
* Adapted to bulk enqueue/dequeue operations. As pointers are stored in a table, a dequeue of several objects will not produce as many cache misses as in a linked queue. Also, a bulk dequeue of many objects does not cost more than a dequeue of a simple object.
* Size is fixed
* Having many rings costs more in terms of memory than a linked list queue. An empty ring contains at least N pointers.