
Building DistributedLog: Twitter’s high-performance replicated log service - anand-s
https://blog.twitter.com/2015/building-distributedlog-twitter-s-high-performance-replicated-log-service
======
Cieplak
Blog post about the replication scheme chosen for Apache BookKeeper:

[http://fpj.me/2015/01/23/so-many-ways-of-replicating/](http://fpj.me/2015/01/23/so-many-ways-of-replicating/)

Critique of the chosen replication scheme:

[http://www.ioremap.net/2015/04/28/apache-bookkeeper-or-how-n...](http://www.ioremap.net/2015/04/28/apache-bookkeeper-or-how-not-to-design-replication-consistency-scheme/)

~~~
sijieg
Ah, I just happened to see those two blog posts together in the same place.
Both are really good explanations of Apache BookKeeper's replication scheme.

One thing to add to Flavio's blog post: readers of a log (ledger) agree on
LastAddConfirmed (LAC), which can be thought of as the 'commit' message found
in most consensus protocols. In a replicated log, committing means making data
visible to readers.

BookKeeper doesn't enforce a 'commit' the way other consensus protocols do.
Instead, it exposes the core elements of a consensus protocol as primitives and
lets applications decide when and how often to commit. Readers can use the API
(readLastConfirmed) to catch up to the latest committed data. Controlling when
to commit is how DistributedLog uses BookKeeper to tune end-to-end latency for
different types of workloads: for latency-sensitive workloads such as
databases, it commits aggressively; for analytics workloads, it commits
periodically, so that grouping entries yields benefits such as reduced
bandwidth through compression.
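
To make the commit/visibility idea concrete, here is a minimal sketch in
Python. It is a toy in-memory model, not BookKeeper's or DistributedLog's
actual API: the class ToyLog, the commit_every parameter, and the helper
visible() are all hypothetical names invented for illustration. It only shows
that readers see entries up to the LAC and that the writer chooses how often
the LAC advances.

```python
class ToyLog:
    """Toy in-memory log (hypothetical; not the BookKeeper API)."""

    def __init__(self, commit_every=1):
        self.entries = []
        self.lac = -1                     # last entry id visible to readers
        self.commit_every = commit_every  # assumed knob: entries per commit

    def append(self, data):
        """Append an entry; commit once enough entries are pending."""
        self.entries.append(data)
        entry_id = len(self.entries) - 1
        if entry_id - self.lac >= self.commit_every:
            self.commit()
        return entry_id

    def commit(self):
        """Advance the LAC so every appended entry becomes visible."""
        self.lac = len(self.entries) - 1

    def read_last_confirmed(self):
        return self.lac

    def read(self, first, last):
        """Return entries first..last; refuse to read past the LAC."""
        if last > self.lac:
            raise ValueError("entry not yet committed")
        return self.entries[first:last + 1]


def visible(log):
    """What a reader can see right now: everything up to the LAC."""
    lac = log.read_last_confirmed()
    return log.read(0, lac) if lac >= 0 else []


# Latency-sensitive style: commit after every entry.
aggressive = ToyLog(commit_every=1)
aggressive.append("a")
aggressive.append("b")

# Analytics style: commit once per group of three entries.
batched = ToyLog(commit_every=3)
batched.append("a")
batched.append("b")
```

With commit_every=1 a reader sees each entry as soon as it is appended; with
commit_every=3 the first two entries stay invisible until the third append
advances the LAC, which is the latency-for-grouping trade described above.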

------
ali_hao
A geo-replicated log sounds like a very ambitious and impressive project.
According to the blog post, DistributedLog just customizes the data placement
policy to support geo-replication. Does that mean it runs the same stack
within a datacenter as across multiple datacenters? What kind of performance
can it achieve? I am hoping Twitter will disclose more details, such as its
concerns and experience with geo-replicated logs, and a comparison with
Spanner and CockroachDB.

------
fintler
"Kafka addressed these durability concerns in version 0.8"

I wonder if they would have used Kafka if it had existed then as it does today.

