

Aeron: High-Performance Open Source Message Transport [video] - enginous
http://www.infoq.com/presentations/aeron-messaging

======
danbruc
The described log buffer implementation does not seem wait-free to me. If one
of the writers fails all other subsequent writes will fail, too. Subsequent
writers can still advance the tail pointer and copy their messages into the
buffer but the reader will never see their messages because the failed writer
did not update the length field in its header.

So physically the operations of the subsequent writers complete once they
have updated the length fields in their headers, but logically they are not
complete until they become visible to the reader, and that requires updating
the length field in the header of the failed writer. The situation seems even
worse when the writer responsible for rotating the logs fails.

I did not look into the source code, they are probably dealing with failed
writers, but the presentation alone does not provide any hints to me how they
do it.

~~~
mjpt777
Can you expand on how you see the writer can "fail"? Also please relate this
to other algorithms that you see as wait-free but that don't have your
failure issue. The code is very simple and does not have external
dependencies. It is a three-step process: advance tail, copy in message,
apply header. Within the implementation, all threads make progress and
complete in a finite number of steps, and thus it is wait-free.

[https://github.com/real-logic/Aeron/blob/master/aeron-
client...](https://github.com/real-logic/Aeron/blob/master/aeron-
client/src/main/java/uk/co/real_logic/aeron/Publication.java#L156)
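
The three steps can be sketched as follows. This is a minimal single-process
illustration of the claim/copy/publish sequence only, not Aeron's actual
buffer layout; the 4-byte length header and all names here are assumptions.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch of the claim/copy/publish sequence (layout assumed, not
// Aeron's real format; real code uses ordered/atomic stores for step 3).
final class LogBufferSketch {
    static final int HEADER = 4;                     // 4-byte length prefix
    private final ByteBuffer buffer;
    private final AtomicLong tail = new AtomicLong();
    private int readOffset = 0;                      // single reader's cursor

    LogBufferSketch(int capacity) { buffer = ByteBuffer.allocate(capacity); }

    boolean offer(byte[] msg) {
        int record = HEADER + msg.length;
        long offset = tail.getAndAdd(record);        // 1. advance tail: one
                                                     //    atomic add, no retry
        if (offset + record > buffer.capacity()) return false; // real impl rotates
        for (int i = 0; i < msg.length; i++)         // 2. copy in the message
            buffer.put((int) offset + HEADER + i, msg[i]);
        buffer.putInt((int) offset, msg.length);     // 3. apply header: publish
        return true;                                 //    the length last
    }

    byte[] poll() {                                  // single reader
        if (readOffset + HEADER > buffer.capacity()) return null;
        int len = buffer.getInt(readOffset);
        if (len == 0) return null;                   // header not yet applied:
                                                     // the reader waits here
        byte[] msg = new byte[len];
        for (int i = 0; i < len; i++) msg[i] = buffer.get(readOffset + HEADER + i);
        readOffset += HEADER + len;
        return msg;
    }
}
```

A writer that dies between steps 2 and 3 leaves a zero length header, and
poll then stalls on that record indefinitely, which is the failure mode
being debated in this thread.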

~~~
danbruc
I will kill or at least pause the thread after it advanced the tail pointer
but before it updated the length field in the header. In other wait-free
algorithms all the other threads will usually help failed threads to complete
their work.

~~~
mjpt777
Can you point to a respected publication that states that is the definition of
a wait-free algorithm, i.e. that other threads must help and it must cope with
killing the thread externally?

By killing a thread I assume you mean using the OS to interrupt and terminate
the thread from the outside while it is mid-way through such a simple
algorithm, and yet expecting the algorithm to cope. Have you tried this on
any java.util.concurrent classes and reported the "bugs" you have found? I'd
be interested in the feedback you got. :-)

~~~
danbruc
This is the very definition of wait-free, including the case where some
concurrent operations never complete.

»A wait-free implementation of a concurrent data object is one that guarantees
that any process can complete any operation in a finite number of steps,
regardless of the execution speeds of the other processes.« (Wait-Free
Synchronization, Maurice Herlihy, 1991) [1]

As far as I can tell - I am definitely not an expert - there is no hard
requirement that concurrent operations have to assist, but it seems to be at
least a common solution: undoing the partially completed work of other
concurrent operations may mean that they cannot make progress, and then the
entire thing is no longer wait-free.

By the way, there is no requirement to explicitly kill the thread, you may as
well just run out of stack space, the scheduler may decide to not assign a new
time slice, cosmic rays may flip a bit in turn triggering a hardware exception
because of an unknown instruction or just buggy or malicious code messing with
the content of your address space. Every process can fail at any time.

[1]
[http://cs.brown.edu/~mph/Herlihy91/p124-herlihy.pdf](http://cs.brown.edu/~mph/Herlihy91/p124-herlihy.pdf)

~~~
mjpt777
Thanks for the link.

If you look at the implementation, no thread is ever stopped from completing.
If a producing thread is killed mid-operation by another malicious thread,
then the stream is broken. Other threads are never blocked: other writers can
add via a finite number of steps, and the stream will back-pressure when flow
control kicks in. The consumer is never blocked - it simply never sees more
messages. The algorithm meets the definition.

Unless I missed it, that paper does not state that the algorithm must cope
with a malicious thread to be wait-free.

Try applying your thinking to this other respected MPSC wait-free queue.

[http://www.1024cores.net/home/lock-free-
algorithms/queues/in...](http://www.1024cores.net/home/lock-free-
algorithms/queues/intrusive-mpsc-node-based-queue)

~~~
wflfof
Regarding applying the wait/lock/obstruction freedom definition to Vyukov's
MPSC implementation you reference.

It's clear that the MPSC design is _not_ wait-free, nor is it lock-free.
Dmitry identifies this fact in his initial postings about that design:
[https://groups.google.com/forum/#!topic/lock-
free/Vd9xuHrLgg...](https://groups.google.com/forum/#!topic/lock-
free/Vd9xuHrLggE)

""" Push function is blocking wrt consumer. I.e. if producer blocked in (*),
then consumer is blocked too. """

The window is between the XCHG and the update of next. If the producing
thread completes the XCHG but fails to update next, all threads will fail to
make progress - "progress" in this case being the producers transferring data
to the consumer. Put another way, in this algorithm a single misbehaving
thread can prevent all threads from making progress.
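
For reference, the push in question looks roughly like this in Java. This is
a sketch of Dmitry's intrusive MPSC queue under assumed names; the original
is C with a raw XCHG and recycles the stub node, which is omitted here.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of Dmitry Vyukov's intrusive MPSC queue (names assumed).
final class Node {
    volatile Node next;
}

final class MpscQueue {
    private final Node stub = new Node();
    private final AtomicReference<Node> head = new AtomicReference<>(stub);
    private Node tail = stub;                 // touched by the consumer only

    void push(Node node) {                    // any number of producers
        node.next = null;
        Node prev = head.getAndSet(node);     // the XCHG: always succeeds
        // If the producer dies right here, prev.next stays null and the
        // consumer can never reach `node` or anything pushed after it.
        prev.next = node;                     // publish to the consumer
    }

    Node pop() {                              // single consumer
        Node next = tail.next;
        if (next == null) return null;        // empty, or a producer mid-push
        tail = next;
        return next;
    }
}
```

Note that push always completes in a bounded number of its own steps; the
blocking is on the consumer side, which is exactly Dmitry's caveat quoted
above.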

Regarding the idea of "maliciousness", specifically a participant thread
being "killed mid-operation": coping with that is the very point of
wait-freedom, namely that all participants complete the algorithm in a
bounded number of their own steps, irrespective of the activities of other
threads.

The definitions of wait/lock/obstruction freedom are well specified. I suggest
the first half of _The Art of Multiprocessor Programming_ (the revised
edition!) by Herlihy and Shavit for a deep dive.

~~~
mjpt777
I've read the _The Art of Multiprocessor Programming_ and am happy to read it
again. Maybe I have misinterpreted.

So if "killed mid-operation" must be supported, then I don't see how many
algorithms can be said to make progress. Take for example the Lamport SPSC
queue[1]: if the producer gets killed mid-operation between steps 4 and 5 of
the push operation, then the data is in the queue but the consumer is blocked
from ever seeing it, by this line of reasoning. The Lamport SPSC queue is
considered wait-free by the concurrency community I know; I base my reasoning
on this. What if the producer pauses for a long time between steps 4 and 5
before continuing?
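
For concreteness, a Lamport-style SPSC ring buffer looks roughly like this
in Java. This is a sketch; the step numbers are my mapping onto the paper's
push, so treat them as assumptions, and lazySet stands in for an ordered
store.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a Lamport-style SPSC ring buffer (step numbers assumed).
final class LamportSpscQueue {
    private final Object[] buffer;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // consumer index
    private final AtomicLong tail = new AtomicLong(); // producer index

    LamportSpscQueue(int capacityPow2) {              // capacity must be 2^n
        buffer = new Object[capacityPow2];
        mask = capacityPow2 - 1;
    }

    boolean push(Object e) {                          // single producer
        long t = tail.get();
        if (t - head.get() == buffer.length) return false;  // full
        buffer[(int) t & mask] = e;                   // "step 4": write slot
        // A crash here leaves the element invisible: the push simply never
        // happened as far as the consumer is concerned, and the consumer is
        // not blocked from draining everything published before it.
        tail.lazySet(t + 1);                          // "step 5": publish index
        return true;
    }

    Object pop() {                                    // single consumer
        long h = head.get();
        if (h == tail.get()) return null;             // empty
        Object e = buffer[(int) h & mask];
        buffer[(int) h & mask] = null;
        head.lazySet(h + 1);
        return e;
    }
}
```

The design point under debate: a producer death between steps 4 and 5 hides
only its own unpublished element, whereas in the log buffer a dead writer
hides every record claimed after its own.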

However if to be wait-free an algorithm must allow all other threads to
continue using the data structure, not just continue making progress in other
ways by being non-blocking and completing in a finite number of steps, then I
stand corrected.

If wait-free must include coping with any thread being killed mid-operation,
is there a term for being lock-free while also having all threads never
block and complete their interaction with the algorithm in a finite number
of steps?

[1] - [http://arxiv.org/pdf/1012.1824.pdf](http://arxiv.org/pdf/1012.1824.pdf)

~~~
danbruc
I just had a look at the definitions again. The presented log buffer
implementation is neither wait-free, nor lock-free, nor obstruction-free,
because one failed writer will prevent the progress of every other thread
unconditionally; in consequence it is accurately described as a blocking
algorithm. To quote from Wikipedia: »[...] an algorithm is called non-
blocking if failure or suspension of any thread cannot cause failure or
suspension of another thread [...]« This is obviously not the case here,
because a failed writer will cause all subsequent writes to fail, i.e. the
writers will happily fill the buffer but the messages never really make it
into the buffer in a way that the reader can see them.

In case of the SPSC queue the requirement is that the writer must be able to
enqueue new items no matter what the reader does and the reader must be able
to dequeue all items for which the enqueue operation succeeded no matter what
the writer does afterwards. The presented queue implementation meets these
requirements. If the writer fails between steps 4 and 5, the write did not
succeed and it does not matter that the reader cannot see the item.

What you say about keeping the data structure usable for other threads seems
correct to me.

You must not confuse lock-free with not using a lock; these are two different
things. An algorithm using locks is a blocking algorithm, but using locks is
only sufficient for blocking, not necessary. If no process can block other
processes unconditionally, the algorithm is non-blocking; obstruction-free,
lock-free and wait-free are different progress guarantees for non-blocking
algorithms in ascending order of strength.

One last point: if you try to obtain a lock and fail to do so, you are
considered blocked, no matter whether you spin, suspend the thread or do
unrelated things to pass the time until retrying later.

~~~
mjpt777
When I re-read the definitions I can see what you take from it. I think it
comes down to what you consider as systemic progress. With respect to the
preciseness of the definitions I have likely misinterpreted this. Is the
system the algorithm itself or is it the system it lives in? I assumed the
latter which may well be a mistake.

Each thread under the algorithm can perform its actions in a finite number of
steps without ever blocking. This means the producers can continue to do
other work. The consumer can continue to consume from other log buffers
without being blocked and completes in a finite number of steps. If a
producer is killed mid-operation then no further progress can be made on that
log buffer. If this is considered blocking then the algorithm is blocking and
therefore not wait-free. It would take the producer being killed by another
malicious thread for this to happen.

What is clear is that this algorithm gives the best latency profile of all
the measured messaging systems and the highest throughput. I now have the
challenge of searching for a name that best describes its behaviour.

~~~
danbruc
Glad to see that we reached consensus. As mentioned before, wait-freedom has
nice progress guarantees but does not necessarily provide the best possible
latency or throughput because of the overhead associated with guaranteeing
progress for all live threads. I also finally realized that you are the
speaker, didn't make the connection before.

And something I wanted to mention before but forgot to - there is not only a
problem if the responsible writer fails while trying to rotate the buffers,
but also if another writer tries to write before the buffer rotation has
completed. There is really not much you can do in this case besides retrying
until you succeed. But this also means that a writer may have bad luck and
find that every time it tries, a buffer rotation - not necessarily the same
one - is in progress, causing the writer to starve.

~~~
mjpt777
While one thread is in the action of rotation, other producers return right
away from the offer. Starvation is only possible if, each time the same
thread retries, the buffer has already advanced to the next rotation. If
adding 100 byte messages to a 128 MB buffer, that is less than a one in a
million chance on each rotation. To have this continue, the probabilities
have to be multiplied for the number of times you expect it to happen, so the
odds of sustained starvation get crazy very very quickly ;-) Do you see it as
more likely than that?
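
A back-of-the-envelope check of those odds, using the sizes quoted above
(the record layout overhead is ignored, so treat the figures as rough):

```java
// Rough odds of a writer's offer landing on a buffer rotation.
final class RotationOdds {
    static long recordsPerRotation(long bufferBytes, int messageBytes) {
        return bufferBytes / messageBytes;            // records per buffer fill
    }

    public static void main(String[] args) {
        long records = recordsPerRotation(128L * 1024 * 1024, 100); // ~1.34 million
        double pOnce  = 1.0 / records;   // one offer hitting the rotation (~7.5e-7)
        double pTwice = pOnce * pOnce;   // the same thread hitting it twice in a row
        System.out.printf("records=%d pOnce=%.2e pTwice=%.2e%n",
                          records, pOnce, pTwice);
    }
}
```

Even two consecutive hits is already down in the one-in-a-trillion range,
which is the multiplication of probabilities described above.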

~~~
danbruc
Not really, all these things are pretty unlikely. But if you deploy enough
installations and run them with enough load, you are pretty much guaranteed
to see all the issues, even if only rarely. There will be someone simply
killing no-longer-needed threads, rendering a log buffer unusable from time
to time. Someone will occasionally send relatively huge messages, making the
starvation scenario more likely. The thread won't of course starve forever,
but just hitting the rotation twice will make for a pretty heavy outlier in
the latency chart.

And just in case I wasn't clear about that, I am looking at all of this from a
pretty theoretical point of view - I have at best a very rough idea what this
will be used for in the real world and what scenarios are important and likely
and what just doesn't matter. Actually maybe even the thing I just wrote is
nonsense because nobody in your target audience will simply kill threads, but
then again developers are notoriously good at bending and ignoring rules.

~~~
mjpt777
Thanks for the feedback. I'll adjust my terminology to be more precise as it
is the right thing to do. Doing things in the open is a great way to learn.
I'm not shy of airing my laundry :-)

------
anton_gogolev
Half-expected something related to Herman Miller's Aeron.

[http://www.hermanmiller.com/products/seating/performance-
wor...](http://www.hermanmiller.com/products/seating/performance-work-
chairs/aeron-chairs.html)

