
Tango: Distributed Data Structures Over a Shared Log (2013) - another
https://www.microsoft.com/en-us/research/publication/tango-distributed-data-structures-over-a-shared-log/
======
mankurt
Here is a nice summary of the Tango paper.
[http://muratbuffalo.blogspot.com/2014/09/paper-summary-tango-distributed-data.html](http://muratbuffalo.blogspot.com/2014/09/paper-summary-tango-distributed-data.html)

------
andr
Sounds like a low-level implementation of event sourcing/CQRS. Is there
anything similar that is usable right now? Perhaps built on Kafka?

~~~
morsch
We're using Eventuate[0], an event-sourcing framework with deep support for
cooperation via shared logs. It's based on the actor framework Akka; Akka
itself has akka-persistence[1], which is similar but different[2]. All of
these are usable right now.

Though it doesn't feature either implementation (the author does something
similar on top of Samza), I like this article[3] on the topic: turning the
database inside out really _is_ what we're doing.

[0]
[http://rbmhtechnology.github.io/eventuate/](http://rbmhtechnology.github.io/eventuate/)

[1]
[http://doc.akka.io/docs/akka/snapshot/scala/persistence.html](http://doc.akka.io/docs/akka/snapshot/scala/persistence.html)

[2]
[http://krasserm.github.io/2015/05/25/akka-persistence-eventuate-comparison/](http://krasserm.github.io/2015/05/25/akka-persistence-eventuate-comparison/)

[3]
[https://www.confluent.io/blog/turning-the-database-inside-out-with-apache-samza/](https://www.confluent.io/blog/turning-the-database-inside-out-with-apache-samza/)

------
rvenkatesh25
"the abstraction of a replicated, in-memory data structure (such as a map or a
tree) backed by a shared log"

If I read just this piece of text anywhere, the word popping up in my mind
would be ZooKeeper.
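
To make the quoted abstraction concrete, here is a minimal sketch, assuming an
in-process list stands in for the shared log. The names `SharedLog` and
`LogBackedMap` are hypothetical and are not Tango's API; a real shared log
would be replicated and durable. The idea shown is only that writes are
appended to the log and each in-memory view catches up by replaying it.

```python
class SharedLog:
    """In-process stand-in for a shared log (hypothetical, not replicated)."""

    def __init__(self):
        self._entries = []

    def append(self, entry):
        # Returns the position the entry was written at.
        self._entries.append(entry)
        return len(self._entries) - 1

    def read_from(self, position):
        # Returns a copy of all entries at or after the given position.
        return self._entries[position:]


class LogBackedMap:
    """In-memory map view; all mutations go through the shared log."""

    def __init__(self, log):
        self._log = log
        self._state = {}
        self._next = 0  # next log position this view has not yet applied

    def put(self, key, value):
        self._log.append(("put", key, value))

    def remove(self, key):
        self._log.append(("remove", key))

    def _sync(self):
        # Replay entries this view has not seen yet, in log order.
        for entry in self._log.read_from(self._next):
            if entry[0] == "put":
                self._state[entry[1]] = entry[2]
            else:
                self._state.pop(entry[1], None)
            self._next += 1

    def get(self, key, default=None):
        self._sync()
        return self._state.get(key, default)


# Two views over the same log converge because they replay the same entries.
log = SharedLog()
view_a = LogBackedMap(log)
view_b = LogBackedMap(log)
view_a.put("x", 1)
seen_by_b = view_b.get("x")
view_b.remove("x")
seen_by_a = view_a.get("x")
```

Reads sync before answering, so a view never returns state older than the log
tail at the time of the read. That is a simplification chosen for this sketch.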

~~~
noahdesu
Indeed, one of the prototype services built on Tango and evaluated in the
paper was a ZooKeeper clone.

~~~
jasonwatkinspdx
One of the more eye-opening aspects of the paper is just how little code it
took them to duplicate the ZooKeeper API atop Tango. Granted, there are some
caveats about a research project vs. an industry-ready codebase, but I still
interpret it as strong evidence that their approach is a good foundational
abstraction.

------
wavewash
A couple of my friends have been looking at this paper and created their own
visualization implementation:
[https://github.com/derekelkins/tangohs](https://github.com/derekelkins/tangohs)

------
GordonS
Maybe add '2013' to the title?

------
EGreg
Why do you need a shared log? Remember the CAP theorem. There's no need for
these bottlenecks. If you want to store that A happened after B, just have A
store a (hash of) B.

~~~
jamesblonde
That's a type of logical clock you're describing, one that orders just 2
events rather than giving a partial order over all events. Obviously, if you
do that with all events, you will have a logical clock. But the hash of the
previous event is not a good logical clock, as you cannot define higher-level
operations over the values, such as asking whether one event is 'newer' than
another.
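
A rough sketch of the difference, under stated assumptions: `ChainedEvent`,
`directly_follows`, and `LamportClock` are hypothetical names made up for
illustration, and the counter shown is a plain Lamport-style clock, not
anything taken from the Tango paper. A hash link can only confirm that one
event directly follows another, while counters can be compared with `<`.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainedEvent:
    payload: str
    prev_hash: str  # digest of the event this one follows ("" for the first)

    def digest(self):
        data = f"{self.prev_hash}|{self.payload}".encode()
        return hashlib.sha256(data).hexdigest()


def directly_follows(later, earlier):
    # The only question a bare hash link answers without walking the chain.
    return later.prev_hash == earlier.digest()


class LamportClock:
    """Lamport-style counter: values from causally related events compare with <."""

    def __init__(self):
        self.counter = 0

    def tick(self):
        # Local event: advance the counter.
        self.counter += 1
        return self.counter

    def observe(self, remote_counter):
        # Receiving an event: jump past whatever the sender had seen.
        self.counter = max(self.counter, remote_counter) + 1
        return self.counter


# Hash chain: B, then A after B, then C after A.
b = ChainedEvent("B", "")
a = ChainedEvent("A", b.digest())
c = ChainedEvent("C", a.digest())
a_follows_b = directly_follows(a, b)  # True: A stores the hash of B
c_follows_b = directly_follows(c, b)  # False, even though C came after B

# Counters: the same "A happened after B" relation, comparable directly.
p, q = LamportClock(), LamportClock()
t_b = p.tick()
t_a = q.observe(t_b)
a_is_newer = t_b < t_a
```

With only hashes, deciding that C is after B means fetching and walking the
intermediate events. Note the counter comparison is consistent with causality
but a smaller counter alone does not prove that one event caused another.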

