
Apache Kafka 0.9 is released - nehanarkhede
http://www.confluent.io/blog/apache-kafka-0.9-is-released
======
felipesabino
I've been using kafka 0.8.2 for some time now together with Node.js for both
consumer and producer.

Although the producer side is quite simple to use and has more than one
option available, on the consumer side there is only one project that is
"maintained" and works [1][2]; all other options either only have a producer
available [3] or have not received a commit in years [4].

I am a bit disappointed about how little attention Node.js with Kafka has had
so far, as there are a lot of issues with keeping connections alive and
rebalancing that make it really hard to trust the system and automate
zero-downtime deploys.

Although I still hope all these changes in the new 0.9 consumer API solve
these issues, I am really happy about the decision to be backwards compatible,
making the transition/upgrade a much smoother process.

> To ensure a smooth upgrade paths for our users, the 0.8 producer and
> consumer clients will continue to work on an 0.9 Kafka cluster.

[1]
[https://cwiki.apache.org/confluence/display/KAFKA/Clients#Cl...](https://cwiki.apache.org/confluence/display/KAFKA/Clients#Clients-Node.js)

[2] [https://github.com/SOHU-Co/kafka-node/](https://github.com/SOHU-Co/kafka-node/)

[3] [https://github.com/sutoiku/node-kafka](https://github.com/sutoiku/node-kafka)

[4] [https://github.com/wurstmeister/node-kafka-0.8-plus](https://github.com/wurstmeister/node-kafka-0.8-plus)

~~~
nehanarkhede
Your critique is well received. The Apache Kafka project supports the Java
clients, and the non-Java clients will be developed and made available in a
federated manner. At Confluent, we are focused on providing first-class
non-Java clients that are API- and functionality-compatible with the Java
clients. Forthcoming releases of the Confluent Platform will include C/C++,
Python and Node.js clients. Stay tuned:
[http://www.confluent.io/developer#download](http://www.confluent.io/developer#download)

~~~
SEJeff
For those that don't realize it, Neha and Jay were two of the main developers
who wrote Kafka.

Thanks for the heads up!

------
jfim
Congrats to the Kafka team!

The biggest changes I can see in this release are SSL support, a new consumer
API (beta), quotas and Kafka Connect.

~~~
nehanarkhede
Thanks!

------
simonw
The worst thing about Kafka in my experience has been the consumer libraries
for languages like Python. That's not to say that they are terrible or
unusable, just that they don't have nearly as much polish as the core of Kafka
itself. I'm very much looking forward to new client libraries built against
the new consumer API.

~~~
czinck
I'd say the Python library I used was borderline unusable. We stopped using
Kafka (it was just a trial period and hadn't been rolled out to production
yet) because of limits in one of the most popular Python interfaces. The
interface worked well enough and the API was good, but they didn't (and the
bug tracker seemed to imply they wouldn't) support synchronizing reads across
processes for the same group. What's the point of a distributed synchronized
log if you can't do synchronized distributed reads of the log?

~~~
emmett9001
Sounds like old news, but if this is still an issue, PyKafka does allow
balanced reads across a consumer group.
[https://github.com/parsely/pykafka](https://github.com/parsely/pykafka)
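
For readers unfamiliar with what "balanced reads across a consumer group" means, here is a minimal sketch in plain Python. It is not PyKafka's or Kafka's actual assignment algorithm; `assign_partitions` and the worker names are hypothetical, and a simple round-robin split is assumed. The idea is only that each partition is owned by exactly one member of the group, and the split is recomputed when membership changes.

```python
def assign_partitions(partitions, consumers):
    """Round-robin split: every partition goes to exactly one group member.

    Hypothetical helper for illustration; real clients coordinate this
    through the cluster rather than computing it locally like this.
    """
    members = sorted(consumers)
    if not members:
        return {}
    assignment = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        owner = members[index % len(members)]
        assignment[owner].append(partition)
    return assignment


# Two processes share six partitions of one topic.
before = assign_partitions(range(6), ["worker-a", "worker-b"])

# A third process joins the group, so the split is recomputed (a "rebalance").
after = assign_partitions(range(6), ["worker-a", "worker-b", "worker-c"])
```

Because no partition has two owners, two processes in the same group never read the same messages at the same time, which is the property the comment above was missing.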

~~~
czinck
Yeah, it's no longer relevant for that project, but I like the ideas behind
Kafka and will probably use it again so I'll look at PyKafka before I look at
kafka-python in the future.

------
hoffcoder
I have been using Kafka 0.8.2 in a production setting for consuming real-time
event traffic from our caching layer for six months. The most difficult parts
of my experience were the occasional consumer lags that erupted without
warning or apparent cause in the high-level Java consumer APIs. A lot of
experimentation with their configuration proved futile, and we have now had to
create a feedback system that triggers alerts to change the group IDs of our
high-level consumers every time some consumers start lagging.

Otherwise the performance of Kafka has been impressive (giving a throughput of
up to 15,000 packets/sec to an 8-consumer pool), even though I have not had
the chance to compare it with any other such tool/library.

Nevertheless, I think this update is a long-awaited one, and Kafka Connect may
really be a good starting point for building more (and better) endpoints.

~~~
ora600
In this case, you will enjoy the new consumer in 0.9 a lot!

------
pcsanwald
Were the diagrams done with software, or hand drawn? If software, I'm curious
what package/style you used. The style looks very similar to Martin
Kleppmann's presentation at StrangeLoop; I assumed his were hand drawn, but
I'm realizing now this might be an omni style or something.

~~~
optimusclimb
This always comes up with Martin Kleppmann's diagrams. See this discussion:

[https://news.ycombinator.com/item?id=9613118](https://news.ycombinator.com/item?id=9613118)

The bottom comments seem to agree it was done using Paper.

~~~
twic
From the Kleppmann's mouth:

[https://twitter.com/martinkl/status/629169643710775296](https://twitter.com/martinkl/status/629169643710775296)

I saw him talk at a conference, and that was one of the questions someone
asked. He must be so fed up with it by now!

------
mixmastamyk
What is the use case for this product? Could you use it as a replicating
database across sites?

~~~
mikeatlas
"What is the use case for a horizontally scalable message broker?"

~~~
kasey_junk
That promises durability, at-least-once delivery and sequential consistency
(an important set of promises that puts it largely in a class by itself).
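
As a rough illustration of the at-least-once promise, here is a toy in-memory sketch in Python. It is not Kafka's API: `Log`, `consume`, and `flaky_handler` are hypothetical names, and a single unpartitioned log is assumed. The consumer only advances its committed offset after a record has been handled, so a crash between handling and committing causes a redelivery rather than a loss.

```python
class Log:
    """Toy append-only log; a record's offset is its index in the list."""

    def __init__(self):
        self.records = []

    def append(self, value):
        self.records.append(value)
        return len(self.records) - 1  # offset of the record just written

    def read(self, offset):
        return self.records[offset:]


def consume(log, committed, handler):
    """Handle records starting at `committed`; commit after each success.

    Returns the new committed offset. If the handler raises, the offset is
    not advanced, so that record is delivered again on the next call.
    """
    for value in log.read(committed):
        try:
            handler(value)
        except Exception:
            break  # simulated crash: stop without committing this record
        committed += 1
    return committed


log = Log()
for value in ["a", "b", "c"]:
    log.append(value)

seen = []
crashed = {"done": False}


def flaky_handler(value):
    seen.append(value)  # the side effect happens before the commit
    if value == "b" and not crashed["done"]:
        crashed["done"] = True
        raise RuntimeError("simulated crash before commit")


offset = consume(log, 0, flaky_handler)       # stops at "b"
offset = consume(log, offset, flaky_handler)  # resumes from "b"
```

After both calls, `seen` contains "b" twice: nothing was lost, but the handler has to tolerate duplicates, which is the usual trade-off of at-least-once delivery.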

------
timc3
Has anyone come across a similar message queue that implements offsets in a
comparable way?

------
erichmond
I'm always fascinated by the lack of discussion around distributed systems
tooling on HN. Anyway! Congrats!

~~~
mathnode
The only other beast of a similar nature I can think of that appears
occasionally is Onyx, which seems pretty cool.

Anyway, Kafka Connect will provide what most people are probably looking for
in Samza.

~~~
nehanarkhede
That observation is correct. Currently, people misuse stream processing
systems like Storm and Samza for data import/export. This is overkill.
Kafka Connect is focused on providing scalable and operational connectors to
various systems using Kafka as the underlying transport mechanism.

------
jedisct1
Shameless plug: if you need to send application logs to Kafka, consider
Flowgger:
[https://github.com/jedisct1/flowgger](https://github.com/jedisct1/flowgger)

------
delive
Congrats on the release!

@nehanarkhede was there a specific reason the new consumer is written in Java?
The previous consumers are all written in Scala.

------
AYBABTME
Support for multi-tenancy is pretty awesome; it will make it much easier to
support Kafka as a shared service within an organization.

