
This post is a bit confused about how Kafka replication works. Replication in Kafka is always synchronous in the sense that the cluster internally has a strong notion of which messages are committed and no uncommitted message is handed out to consumers. It is just that the client has the option of writing to Kafka without blocking while the servers commit the message.

This is described in more detail here: http://www.confluent.io/blog/distributed-consensus-reloaded-...

Apologies if I misrepresented Kafka's replication. So the system described in the blog post is the new 0.8.2+ stuff with min.insync.replicas taking over required.acks? Will review.

Thanks for clarifying!

I think there are two separate things:

1. Is there a principled notion of when a write is "committed", and a leader election algorithm that works with it to ensure committed writes aren't lost as long as the failure criteria are met? This has been true since we added replication to Kafka.

2. What is the recovery behavior when you have no available in-sync replicas? Previously we would aggressively elect any replica, even if it had incomplete data. Now that behavior is configurable.
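
To make the second point concrete, here is a rough toy model of that recovery choice in Python. It is only a sketch, not Kafka's actual election code: the function name `elect_leader`, its arguments, and the broker names are made up for illustration, and the `allow_unclean` flag stands in for the configurable behavior (as far as I know, the broker setting is `unclean.leader.election.enable`).

```python
from typing import Optional, Sequence


def elect_leader(
    live_in_sync: Sequence[str],
    live_out_of_sync: Sequence[str],
    allow_unclean: bool,
) -> Optional[str]:
    """Toy model of leader election for one partition (hypothetical helper).

    live_in_sync:     live replicas that have every committed message
    live_out_of_sync: live replicas that may be missing committed messages
    allow_unclean:    whether an out-of-sync replica may become leader
    """
    if live_in_sync:
        # Safe case: any in-sync replica has all committed writes.
        return live_in_sync[0]
    if allow_unclean and live_out_of_sync:
        # Availability over durability: committed writes may be lost.
        return live_out_of_sync[0]
    # Durability over availability: stay offline until an in-sync replica returns.
    return None
```

With `allow_unclean=False` and no live in-sync replica, the model returns `None`, which corresponds to the partition staying unavailable rather than losing committed data.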

There is a more detailed explanation here: http://blog.empathybox.com/post/62279088548/a-few-notes-on-k...

required.acks works together with min.insync.replicas.

Basically "required.acks" lets you choose between "no acks", "leader only" and "all in sync replicas", while min.insync.replicas lets you control what "all in sync replicas" actually means.
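
A small Python sketch of how the two settings interact, in case it helps. This is a toy model under my own assumptions, not the Kafka client API: `acks_needed`, the string values `"none"`, `"leader"`, `"all"`, and the `NotEnoughReplicas` class are illustrative names only.

```python
class NotEnoughReplicas(Exception):
    """Illustrative error: the in-sync set is smaller than the configured minimum."""


def acks_needed(acks: str, isr_size: int, min_insync_replicas: int = 1) -> int:
    """Return how many replica acknowledgements a write waits for (toy model).

    acks:                "none", "leader", or "all" (the three choices above)
    isr_size:            current number of in-sync replicas, leader included
    min_insync_replicas: lower bound on what "all in sync replicas" may mean
    """
    if acks == "none":
        return 0  # fire and forget: the producer does not wait at all
    if acks == "leader":
        return 1  # only the leader has to accept the write
    if acks == "all":
        if isr_size < min_insync_replicas:
            # "All in-sync replicas" has shrunk below the allowed minimum,
            # so the write is rejected instead of being weakly acknowledged.
            raise NotEnoughReplicas(
                f"{isr_size} in-sync replicas, need {min_insync_replicas}"
            )
        return isr_size
    raise ValueError(f"unknown acks setting: {acks!r}")
```

In this model the minimum only matters for `"all"`: with `min_insync_replicas=2` and a single in-sync replica, an `"all"` write fails, while `"leader"` and `"none"` writes still go through.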

Added a clarification to the relevant section. Apologies again for the confusion!
