
Show HN: KafkaLite – A simple message broker inspired by Kafka - flashgordon
https://github.com/panyam/KafkaLite
======
sethammons
I feel the readme is lacking; I would hope to see something about its use
case, what makes it lite, etc. How is it like/unlike Kafka? It seems like it
is only available as a C library and does not expose a protocol. Is this just
a single instance of a producer/consumer and a transaction log with topics
available to the one running program? It would also be nice to see results
under the benchmarks.

~~~
pweissbrod
+1. FWIW it seems to be a Kafka-style, cursor-based messaging system without
the dependency on ZooKeeper (just what I gleaned from the readme)

------
wruza
Can you please explain what this library is for, how it is lite or competing,
and how Franz Kafka is related? I just spent a few minutes on the Readme and
understood nothing.

~~~
rdeboo
I imagine it is related to Apache Kafka
([http://kafka.apache.org/](http://kafka.apache.org/))

~~~
dinedal
I wish the readme also explained how it is related to Apache Kafka, since it
looks like a totally different beast, and appears to only share the name.

------
tveita
So it's a small in-process library for writing and reading append-only logs on
disk, that supports multiple threads concurrently reading and writing as a
pub/sub system.
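
To make that description concrete, here is a minimal sketch of an append-only
log that is read through a cursor. It is not KafkaLite's code: `log_append`
and `log_read` are hypothetical names, the length-prefixed record layout is an
assumption, and the locking needed for concurrent readers and writers is left
out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Append one length-prefixed record to the end of the log.
 * Returns the record's starting offset, or -1 on error.
 * (Hypothetical helper, not part of KafkaLite.) */
long log_append(FILE *log, const void *data, uint32_t len) {
    assert(log != NULL);
    if (fseek(log, 0, SEEK_END) != 0) return -1;
    long offset = ftell(log);
    if (offset < 0) return -1;
    if (fwrite(&len, sizeof len, 1, log) != 1) return -1;
    if (len > 0 && fwrite(data, 1, len, log) != len) return -1;
    if (fflush(log) != 0) return -1;
    return offset;
}

/* Read the record at *cursor into buf and advance the cursor past it.
 * Returns the record length, or -1 if there is no complete record at
 * the cursor or the record does not fit in buf. */
long log_read(FILE *log, long *cursor, void *buf, uint32_t cap) {
    assert(log != NULL && cursor != NULL);
    uint32_t len;
    if (fseek(log, *cursor, SEEK_SET) != 0) return -1;
    if (fread(&len, sizeof len, 1, log) != 1) return -1;
    if (len > cap) return -1;
    if (len > 0 && fread(buf, 1, len, log) != len) return -1;
    *cursor += (long)(sizeof len + len);
    return (long)len;
}
```

Each consumer keeps its own cursor, so the log itself stays stateless: reading
never modifies the file, and a slow consumer simply resumes from the offset it
last stored.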

It seems inspired by the Apache Kafka project in that it organizes logs into
named topics, but it is not networked, does not split topics into partitions
and segments, and does not do housekeeping like deletion of old logs.

It looks pretty new and immature, for instance
[https://github.com/panyam/KafkaLite/blob/master/src/kltopics...](https://github.com/panyam/KafkaLite/blob/master/src/kltopics.c#L39)
reads from a potentially uninitialized field and could randomly fail (but will
not propagate the error!). Have the tests been run with Valgrind?

Not bad, but it could definitely do with an introduction in the Readme
describing its purpose.

~~~
flashgordon
Very nice pick. Valgrind has been used extensively to test this (valgrind is
applied to all tests in the tests package). It looks like when I created my
topic array, I zeroed out the memory (with calloc), so the initial topics were
initialised to zero. But this would have resulted in uninitialised read
accesses for any new topics that were created beyond the original topic buffer
(with initial capacity 32). This has now been fixed (thanks again).
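
For anyone curious, the general shape of this kind of bug and its fix looks
roughly like the sketch below. It is not the actual KafkaLite patch:
`TopicSlot`, `TopicArray` and the function names are made up for illustration.
The point is that calloc zeroes the initial buffer, but realloc leaves the
newly added tail uninitialised unless it is cleared explicitly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical topic slot; not KafkaLite's real struct. */
typedef struct {
    int   initialised;  /* 0 until the topic has been set up */
    char *name;
} TopicSlot;

typedef struct {
    TopicSlot *items;
    size_t     capacity;
} TopicArray;

/* Allocate a zero-filled array. Returns 0 on success, -1 on failure. */
int topic_array_init(TopicArray *arr, size_t capacity) {
    assert(arr != NULL && capacity > 0);
    arr->items = calloc(capacity, sizeof(TopicSlot));
    if (arr->items == NULL) return -1;
    arr->capacity = capacity;
    return 0;
}

/* Grow to at least min_capacity, zeroing only the newly added tail.
 * Returns 0 on success, -1 on failure (the old buffer stays valid). */
int topic_array_grow(TopicArray *arr, size_t min_capacity) {
    assert(arr != NULL && arr->items != NULL);
    if (min_capacity <= arr->capacity) return 0;
    size_t new_cap = arr->capacity * 2;
    if (new_cap < min_capacity) new_cap = min_capacity;
    TopicSlot *grown = realloc(arr->items, new_cap * sizeof(TopicSlot));
    if (grown == NULL) return -1;
    /* realloc does not clear the extra space, so do it here. */
    memset(grown + arr->capacity, 0,
           (new_cap - arr->capacity) * sizeof(TopicSlot));
    arr->items = grown;
    arr->capacity = new_cap;
    return 0;
}

void topic_array_free(TopicArray *arr) {
    assert(arr != NULL);
    free(arr->items);
    arr->items = NULL;
    arr->capacity = 0;
}
```

Valgrind only reports an uninitialised read when a test actually touches the
grown region, which is probably why a suite that never creates more topics
than the initial capacity would stay clean.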

I chose to keep networking out of it for now as I felt it was something that
could be added by the client (eg by having a websocket listener publish events
as it saw fit). Also partitioning was not a high priority for now as creation
of new topics actually has fairly low overhead. Deletion and compaction of
logs was also not a high priority right now but is definitely something I
would like to look at (would love some help :) )

------
ddispaltro
Definitely seems like something to replace leveldb with vs kafka. Or maybe
more akin to
[http://chronicle.software/products/chronicle-queue/](http://chronicle.software/products/chronicle-queue/)
minus the TCP replication

------
flashgordon
Guys, firstly thank you very much for the exceptional feedback. And I apologise
for the missing details. I found that the more I added to this, the more I
felt I had to add (fear of publishing really). So I wanted to push one out so
I don't end up waiting for the perfect thing forever.

In short the purpose of this library was to allow any application to embed an
event broker with persistence so that it could consume events at its own pace.
The purpose of this library is NOT to replace or compete with GCD or
java.util.concurrent (which provide excellent primitives to do in-memory
queuing and processing of events).

The main use case that we wanted this for was message synchronisation.
Consider a mobile messaging application that allows users (on mobile clients)
to send and receive messages as well as do other CRUD ops on messages. To
synchronise these events across all clients (say users in a channel) we would
either have to keep a connection (eg websockets) open to the server, over
which the server pushes events to the client, or have the client poll for
change events periodically (of course the server would have to maintain its
own event log, but a Kafka event sprayer that publishes to websockets is
easily done).

In either case, when these events are received by the client they need to be
processed as soon as they are received or the client would have to refetch
them (harder to do in a non-polling situation as the state of what events have
been consumed is expensive to record and maintain). The problem with having to
process these events as soon as they are received is that there is no way to
prioritise the events and there is no way to defer processing to a time
convenient to the client. So with this library we have an append only log of
events onto which we persist the events as we receive them (and because we can
have multiple topics for each KafkaLite - KL - context) it allows us to impose
prioritisation on which events are to be processed when.
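
As a rough illustration of that idea, the sketch below keeps one append-only
list and one cursor per topic, and lets the consumer drain higher-priority
topics first whenever it decides to process events. Everything here is
hypothetical (`PriorityTopic`, `topic_publish`, `next_event`, the fixed
`MAX_EVENTS` limit) and kept in memory for brevity; it is not the KafkaLite
API, which persists the events to disk.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 64  /* arbitrary limit for this sketch */

/* One topic: an append-only list of events plus a consumer cursor. */
typedef struct {
    const char *events[MAX_EVENTS];
    size_t count;   /* number of events appended so far */
    size_t cursor;  /* number of events already consumed */
} PriorityTopic;

/* Append an event as soon as it arrives. Returns 0, or -1 if full. */
int topic_publish(PriorityTopic *t, const char *event) {
    assert(t != NULL && event != NULL);
    if (t->count == MAX_EVENTS) return -1;
    t->events[t->count++] = event;
    return 0;
}

/* topics[] is ordered from highest to lowest priority. Returns the next
 * unconsumed event from the highest-priority topic that has one, or
 * NULL when every topic is fully consumed. */
const char *next_event(PriorityTopic *topics, size_t n) {
    assert(topics != NULL);
    for (size_t i = 0; i < n; i++) {
        PriorityTopic *t = &topics[i];
        if (t->cursor < t->count) return t->events[t->cursor++];
    }
    return NULL;
}
```

Publishing and consuming are decoupled: the receive path only appends, and the
client calls `next_event` whenever it is convenient, so arrival order across
topics no longer dictates processing order.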

I have not built in replication as it can be built on top as and
when required. Also sharing of messages on a particular topic across processes
while doable was not an immediate priority (would love to hear other use cases
for this) so was eschewed. One other thing I would definitely like to do is
provide a few sample producers that are platform specific (eg websockets based
publishers, polling based publishers) but these seemed not generic enough (at
least as of now) so I have not had a chance to investigate these further.

I would also like to apologise for any confusion with the name (especially to
the gods of Apache Kafka!). As I had mentioned, this was only inspired by
Kafka's original design (ie a stateless, fast and append only log). This
library is written in C so as to be portable across any platform and not
having to incur the cost of bundling and invoking a JVM (clearly there is a lot
to be done for this to be useable on the server side) - Hence the "Lite".

Again I really appreciate the feedback. Even though you notice missing context
when you read others' content, with your own content you start with the bias
of assuming it is clear, as I did. Even more so, if any of you would like help
in using it in your own applications I would love to have a chat as to what
features we could add to improve the robustness and usefulness.

~~~
rakoo
> So with this library we have an append only log of events onto which we
> persist the events as we receive them

You could also use any member of the CouchDB family, such as PouchDB for the
browser (even as a lightweight desktop solution), or any member for the mobile
[0] [1] [2]

[0] [https://github.com/couchbase/couchbase-lite-net](https://github.com/couchbase/couchbase-lite-net)

[1] [https://github.com/couchbase/couchbase-lite-android](https://github.com/couchbase/couchbase-lite-android)

[2] [https://github.com/couchbase/couchbase-lite-ios](https://github.com/couchbase/couchbase-lite-ios)

What could be really interesting is another member as a library; I don't think
there exists any, but kafkalite could be the beginning of something...

~~~
flashgordon
Hey rakoo, thanks for sharing that. I actually came across this earlier. (Also
RocksDB/LevelDB are amazing as embedded key-value stores). My starting point
was to see what a pure log would give you and let the application worry about
what it did with the events (eg in the messaging case - update the messages in
the respective channels with the new content - the channel is just a table in
sqlite!) - essentially a way to decouple the processing of events from their
transport and persistence.

