
Siberite: A Simple LevelDB-Backed Message Queue in Go - Bogdanovich
https://github.com/bogdanovich/siberite
======
eis
Why would you choose a LSM Tree based storage mechanism for a message queue?

The only reason I can come up with is that it's a ready-to-use library
you can just plug in, which gives OK performance and some handy features,
since you can use the KV store for other things. But it doesn't scale well
and backups with LevelDB are not really easy either (close DB, copy all
files).

Message queues, when they are ordered (at least at the local node/queue
level), usually just need some kind of append-only log file. You don't do
random reads or writes into the middle of the queue; you only modify the head
and tail.

InfluxDB, which despite being a time series DB has write patterns similar to
a message queue's, learned this the hard way: they first tried an LSM Tree
database (LevelDB), then switched to a B+Tree (BoltDB/LMDB), but that also
doesn't scale once the DB gets big and the tree gains quite some depth. They
kindly did a nice writeup of their journey:
[https://influxdb.com/docs/v0.9/concepts/storage_engine.html](https://influxdb.com/docs/v0.9/concepts/storage_engine.html)

Why not keep it simple and use append-only files without complex structure
and management?

Check out Kafka for a better storage format for message queues of this kind.

PS: every message queue should first clearly explain what guarantees it
provides.

~~~
Bogdanovich
Yes, goleveldb was chosen because it's a ready-to-use library with decent
write and read performance and no external non-Go dependencies. It can also
be used to store multiple consumers' offsets in the future.

Regarding provided guarantees: with simple 'get work_queue' reads it provides
at-most-once delivery. With two-phase reliable reads ('get work_queue/open',
'get work_queue/close') it provides at-least-once delivery (although the
message is kept in memory on the server during a reliable read and will be
lost if you SIGKILL siberite; on SIGTERM and SIGINT siberite will gracefully
abort the read and save the message).

~~~
dwenzek
I'm puzzled by your mention of consumer offsets.

Indeed, either Siberite is a queue system whose purpose is to dispatch each
message to one and only one consumer for further processing, and which
requires the consumers to acknowledge fully processed messages;

or Siberite is a journal system (in the spirit of Kafka) whose purpose is to
replay the full log to any consumer asking for it, and which offers the
consumers a watermark mechanism to keep track of their progress.

In the former case, the queue system is responsible for what to do in case of
a missing or late acknowledgement (choosing between "at least once" or "at
most once" message delivery). In the latter case, the consumers are
responsible for maintaining an atomic view of message consumption and message
processing (for instance, using a transaction to persist an offset with a
state).

~~~
Bogdanovich
Right now it doesn't store any consumer offsets. And you can get either at-
most-once or at-least-once guarantees.

But I find the idea of multiple consumer groups per queue very interesting.
So basically you would still be able to fetch queue messages as you can now,
and that would delete dequeued items, but you would also be able to use
something like 'get queue_name:consumer_name', and it would create a consumer
group internally with a stored offset and serve messages using that offset.
In case of a reliable read failure, each consumer group would keep its own
queue of failed deliveries, check that queue, and serve those failed items
first. If the source queue head has changed and become larger than the
consumer group offset, then the consumer group offset would just start from
the source queue head.

This way you can get Kafka-like multiple consumer groups per queue as an
additional feature.

------
krat0sprakhar
This couldn't have come at a better time - I was actually looking for a
durable message queue written in Go. Is there any way to read more about the
architecture of this system? I find systems like these quite fascinating,
but taking the time to go through the code can sometimes be very time-
consuming. It would be awesome if more projects had a writeup as detailed as
cockroachdb[0]!

Aside: there used to be a site some time back that distributed compiled
binaries of Go code for all platforms. Is it still up, by any chance?

[0] -
[https://github.com/cockroachdb/cockroach#architecture](https://github.com/cockroachdb/cockroach#architecture)

~~~
Bogdanovich
It's really simple. Each queue is a separate leveldb database on disk.
Messages are stored as key/value using incremental ids. Head and tail of the
queue are kept in memory and get initialized on startup via db scan.

~~~
dave_ops
Why don't you just store the head and tail as K/V entries? You have a durable
K/V store at your disposal.

~~~
Bogdanovich
There is no benefit in that except faster startup time. As a downside, you'd
get a lot of head/tail key updates in the DB.

~~~
dave_ops
Fast start times are a valuable thing for a service component.

Stick about 10GB of small entries in it (should be enough to create all the
levels) and then see what happens.

Also, you could reserve the persisted [H|T] for controlled shutdown scenarios.
Basically anything that isn't complete system failure if you're properly
trapping signals.

~~~
Bogdanovich
I added some more benchmarks, including packing a queue with 200M small
64-byte messages (20Gb) and then consuming that queue. There is no slowdown
because of mass deletes.
[https://github.com/bogdanovich/siberite/blob/master/docs/benchmarks.md#queue-packing-and-unpacking-64-byte-message-size](https://github.com/bogdanovich/siberite/blob/master/docs/benchmarks.md#queue-packing-and-unpacking-64-byte-message-size)

------
xrstf
Sounds interesting. For my use cases, which involve few (< 10) messages/sec
and no clustering, would I gain anything by using Siberite over Beanstalk?

~~~
Bogdanovich
You can have large queue sizes (larger than RAM) and siberite would still
consume a small amount of resident memory, so you basically don't need a
separate server with a decent amount of memory for it. You can also benefit
from the two-phase reliable fetch - if your client gets disconnected without
confirming a message, the message will be served to another client (very
convenient if you use Amazon spot instances for your workers).

~~~
eis
Note that this also means that messages can be delivered more than once and/or
that the clients need to remember the messages that they processed. In some
setups that can be a showstopper.

~~~
Bogdanovich
Reliable fetch is a feature, not a protocol requirement. You can use the
simple 'get work_queue' command to just get a message, or use the 'get
work_queue/open' and 'get work_queue/close' two-phase fetch if you need a
reliable fetch. You can also use the 'get work_queue/close/open' command to
acknowledge the previous message and read a new one.

~~~
eis
OK, so you can switch between at-most-once and at-least-once guarantees.
While it's nice to have both options in a message queue, my point still
stands.

Each of these has trade-offs, and the way it is architected here, in the
at-least-once case you will have to either remember all the processed
messages or be prepared to process a message multiple times, whatever that
means in your specific use case.

------
clumsysmurf
Can you describe how the queue was represented as key/value?

~~~
Bogdanovich
Yes, as id/value pairs with an autoincrement key. Head and tail ids are kept
in memory and get initialized on startup via a leveldb database scan.

