
Exactly-Once Messaging in Kafka - bkirwi
http://ben.kirw.in/2014/11/28/kafka-patterns/
======
mjb
I'm the same 'mjb' from the linked lobste.rs thread. If anybody is interested
I expanded that comment into a blog post here:
[http://brooker.co.za/blog/2014/11/15/exactly-once.html](http://brooker.co.za/blog/2014/11/15/exactly-once.html)

The core problem here is making the act of writing down that you've done
something atomic with doing the actual thing. If you can solve that problem,
exactly-once processing is easy. If you can ignore that problem, for example
because doing the thing is idempotent, exactly-once processing is easy. In the
real world, however, it can be really difficult to solve these problems in
general and very specific (and often 'incorrect') solutions are used. OP's
post talks about a bunch of those, which is very useful.
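To make the atomicity point concrete, here is a minimal sketch, assuming the "thing" being done is an update to a local SQLite database. The effect and the record of progress are committed in one transaction, so a redelivered message is detected and skipped. The table names, the `apply_once` helper, and the message shape (stream, offset, account, amount) are all hypothetical and not taken from the linked posts.

```python
import sqlite3


def open_store():
    """Create an in-memory store holding both the state and the progress marker."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE totals (account TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    db.execute(
        "CREATE TABLE progress (stream TEXT PRIMARY KEY, next_offset INTEGER NOT NULL)"
    )
    return db


def apply_once(db, stream, offset, account, amount):
    """Apply one message. Return True if applied, False if it was a duplicate."""
    with db:  # commits on success, rolls back if an exception is raised
        row = db.execute(
            "SELECT next_offset FROM progress WHERE stream = ?", (stream,)
        ).fetchone()
        next_offset = row[0] if row else 0
        if offset < next_offset:
            return False  # already processed; redelivery is harmless

        # "Doing the thing": a non-idempotent increment.
        db.execute(
            "INSERT OR IGNORE INTO totals (account, balance) VALUES (?, 0)",
            (account,),
        )
        db.execute(
            "UPDATE totals SET balance = balance + ? WHERE account = ?",
            (amount, account),
        )
        # "Writing down that you've done it", in the same transaction.
        db.execute(
            "INSERT OR REPLACE INTO progress (stream, next_offset) VALUES (?, ?)",
            (stream, offset + 1),
        )
    return True


def balance(db, account):
    row = db.execute(
        "SELECT balance FROM totals WHERE account = ?", (account,)
    ).fetchone()
    return row[0] if row else 0
```

This only works because the state and the offset live in the same transactional store. If the side effect is an email or a call to another service, there is no shared transaction to lean on, which is where the problem gets hard.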

Very often, though, real-world systems settle for 'at least once' or 'at most
once' and find out-of-band ways to handle the missing or duplicate messages.
Whether this is practical or not depends on the message rate, and the cost of
getting it wrong.

~~~
bkirwi
Hey, thanks for the excellent writeup.

I enjoyed and totally endorse the content of your blog post, but I did find
the title / intro a bit misleading. Exactly-once is pretty much always what
people want: it's the easiest model for most folks to reason about, especially
when they're new to the whole distributed-systems thing. (If exactly-once
delivery were easy, you certainly wouldn't find people going out of their way
to build systems that dropped or duplicated messages.) Of course, you might
not _need_ exactly-once delivery to get what you want -- which is good,
because like you say, that's a hard / impossible problem to solve in general.

It feels to me like people in the stream processing universe have taken that
message a bit too much to heart -- things like the 'lambda architecture' treat
stream processing as "too hard to get right" and relegate it to approximate,
disposable calculations only. My post was partly an effort to push back on
this; as a community, I don't think we should give up without a fight.

------
fintler
Although indirectly related, the comments in
[https://issues.apache.org/jira/browse/SAMZA-390](https://issues.apache.org/jira/browse/SAMZA-390)
are also interesting reading (hi Ben!).

The Kafka/Samza ecosystem is coming alive with tons of neat ways to
interact with streaming data. For example, take a look at
[https://github.com/milinda/Freshet](https://github.com/milinda/Freshet)

~~~
bkirwi
Oh hi!

The Samza ecosystem seems to be a magnet for this sort of thing these days --
the community's really committed to taking full advantage of Kafka, and not
just papering over it so it looks just like any other queuing system. Really
excited to see where all this will lead.

~~~
Terr_
Speaking as a bystander who mainly reads blogs, my perception of Kafka is that
its marketing (for lack of a better term) has successfully positioned it with
"we can keep your arbitrarily large logs" and "they are pretty durable,
consumers can revisit them if they fall behind".

This offers a nice distinction from MQ-ish systems, where the emphasis seems
to be on a different set of benefits like "we can handle complex centralized
distribution logic for you" and "we help manage your synchronous calls".

Does that seem accurate in terms of how Kafka is evolving its future niche?
Personally, I'm interested in optimistic-locking when adding to a log.

~~~
bkirwi
Yes, Kafka certainly is good at those things, and I suspect Kafka will only
get better at them. But it's actually quite a simple bit of infrastructure at
heart[0] and this means it's useful for a large variety of other things, many
of which we're only figuring out now.

Re: optimistic locking... there are a couple proposals for this floating
around, and I'm not sure which / if one will get in. It certainly seems
consistent with the general mission, though.

[0] [http://engineering.linkedin.com/distributed-systems/log-what...](http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying)
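For readers unfamiliar with the idea, optimistic locking on a log can be sketched as a conditional append: the writer names the offset it expects to write at, and the append is rejected if someone else got there first. This is an in-memory toy, not Kafka's API and not any of the proposals mentioned; `Log`, `append_if`, and `ConflictError` are hypothetical names.

```python
import threading


class ConflictError(Exception):
    """Raised when the log has moved past the offset the writer expected."""


class Log:
    def __init__(self):
        self._entries = []
        self._lock = threading.Lock()

    def end_offset(self):
        """Offset the next appended entry would receive."""
        with self._lock:
            return len(self._entries)

    def read(self):
        """Return a copy of all entries."""
        with self._lock:
            return list(self._entries)

    def append_if(self, expected_offset, value):
        """Append value only if it would land at expected_offset."""
        with self._lock:
            if expected_offset != len(self._entries):
                raise ConflictError(
                    f"expected offset {expected_offset}, log is at {len(self._entries)}"
                )
            self._entries.append(value)
            return expected_offset


def append_with_retry(log, make_value, max_attempts=5):
    """Recompute the value from the current log state and retry on conflict."""
    for _ in range(max_attempts):
        offset = log.end_offset()
        try:
            return log.append_if(offset, make_value(log.read()))
        except ConflictError:
            continue  # another writer appended first; look again and retry
    raise ConflictError("gave up after repeated conflicts")
```

A writer that derives its next message from everything it has read so far can use this to be sure nothing slipped in between the read and the write.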

------
brandtg
Some caveats about Kafka replication are discussed here:
[https://aphyr.com/posts/293-call-me-maybe-kafka](https://aphyr.com/posts/293-call-me-maybe-kafka).

It is important to include these in the "exactly-once" discussion... (hard to
be exactly-once if messages are lost broker-side).

Also, great discussion from Coda Hale about "CA" systems here:
[http://codahale.com/you-cant-sacrifice-partition-tolerance/](http://codahale.com/you-cant-sacrifice-partition-tolerance/)

