
Too small to Kafka but too big to wait: Really simple streaming in Clojure - zonotope
https://dataissexy.wordpress.com/2018/11/05/too-small-to-kafka-but-too-big-to-wait-really-simple-streaming-in-clojure-queues-pubsub-activemq-rabbitmq/
======
alextheparrot
This post completely ignores fault tolerance. The reason Kafka requires you to
have three machines isn’t because it is somehow magical, but because that
allows it to effectively recover from machine-level failure without data loss
(depending on how your producers are configured, of course).

Saying this approach is even “a bit like Kafka” is incredibly weak - if he is
trying to do semi-durable message queueing, fine, but instead he consistently
attempts to pitch his solution as mostly on par with Kafka. In the end he has
created Kafka minus the utility.

~~~
jasebell
It does ignore fault tolerance, yes, I agree, because right now I don't need it.
When you're bootstrapping this stuff you want to be as lean as you can. The
plan is to move to Kafka, which I do use a lot btw, when there's a case to
move to that level of throughput. If I had my way now I'd use Kafka, my wallet
on the other hand disagrees.

I know the reason for needing three machines for a Kafka cluster. And I'm
certainly not pitching Durable Queue as a Kafka alternative unless, as in my
case, the throughput will be so low in the initial stages as not to warrant a
full Kafka cluster.

I could have run Kafka on a single node.....

The "a bit like Kafka" comparison is weak, yes, but the solution posted
presented a fairly quick way to establish a queue and decouple messaging from
the running application. That was the whole point.

~~~
vinceve
Ever looked at Azure Event Hubs? We’re using it as a Kafka alternative
because it’s actually way cheaper (for xx million messages with a throughput
of 4 MB per second it’s 75 €) than setting up a full Kafka cluster and keeping
it running. Or do you see disadvantages in that technology? I’m not trolling,
just curious.

~~~
jasebell
No, I haven't, but I appreciate the heads-up. It's been a few years since I
even looked at anything on Azure, in all honesty.

------
vemv
As [http://tech.trello.com/why-we-chose-kafka/](http://tech.trello.com/why-we-chose-kafka/)
points out, Redis Streams would be another alternative. It still seems
underdocumented, though, as it's a recent addition.

I created
[https://github.com/antirez/redis/issues/5582](https://github.com/antirez/redis/issues/5582)
asking what should be a basic, necessary, straightforward question. Still no
luck.

~~~
antirez
Hello vemv, we have time complexities in every stream command, and the reason
there is no latency documentation is that it follows the normal Redis
latency, that is, it works the way blocking lists or Pub/Sub do: as soon as
something is available for waiting consumers, it is immediately sent to the
socket of such consumers. So you can expect extremely low latencies without
any differences compared to other Redis commands. However, I'll try to reply
to your question in more detail in that issue. Btw, what is bad is that you
did not find your answers in the Streams intro doc. I'll update that too.

------
lkrubner
I like this essay but I think it is very strange that he doesn't link to the
talk that Zach Tellman gave in which Tellman talks about the origins of
Factual's Durable Queues:

[https://www.youtube.com/watch?v=1bNOO3xxMc0](https://www.youtube.com/watch?v=1bNOO3xxMc0)

Factual's Durable Queues are a fairly minimal queue effort that is useful in
cases where you need some minimal way of handling back pressure, because a
service downstream has failed. In his talk, Tellman talks about the fact that
he needed to write a lot of data to AWS S3, and sometimes S3 stops accepting
writes, so he needed an easy re-try mechanism.
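
The retry pattern described above can be sketched roughly as follows. This is a minimal illustration in Python, not Factual's durable-queue API: the `FileBackedQueue` class, its method names, and the JSON-lines file layout are all hypothetical, chosen only to show tasks surviving on disk while a downstream service is refusing writes.

```python
import json
import os


class FileBackedQueue:
    """Hypothetical minimal disk-backed queue (NOT the durable-queue API)."""

    def __init__(self, path):
        self.path = path
        self.pending = []
        # Reload anything left over from a previous run or a crash.
        if os.path.exists(path):
            with open(path) as f:
                self.pending = [json.loads(line) for line in f if line.strip()]

    def _persist(self):
        # Write to a temp file, then rename, so a crash mid-write
        # cannot leave a half-written queue file behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            for item in self.pending:
                f.write(json.dumps(item) + "\n")
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    def put(self, item):
        self.pending.append(item)
        self._persist()

    def drain(self, send):
        """Deliver tasks in order; stop at the first failure.

        Undelivered tasks stay on disk and are retried on the next call.
        Catching every Exception is a simplification for this sketch.
        """
        delivered = 0
        while self.pending:
            try:
                send(self.pending[0])
            except Exception:
                break  # downstream is unavailable; try again later
            self.pending.pop(0)
            self._persist()
            delivered += 1
        return delivered
```

A caller would `put` each task as it arrives and call `drain(upload)` periodically, where `upload` is whatever function writes to the downstream service. Rewriting the whole file on every change is only reasonable for small queues, which is the case being discussed here.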

I've used Durable Queues everywhere in my code and find them very handy.
They're useful when you need a very minimal queue that still has the backing
of a file system.

~~~
emidln
durable-queue was one of my favorite pieces of Clojure kit.

------
polskibus
I like to use Akka in places where I can't or don't want to introduce
operational complexity in the form of external messaging infrastructure.

------
cheriot
Comparing to Kafka and Kinesis (which aren't even in the same category) feels
like a straw man. The author needs durability, but one disk of one machine
satisfies that need? I'm skeptical.

~~~
ploxiln
I was under the impression that the original/core Kinesis is Kafka-as-a-
service (though big and tangential features have been added since).

~~~
cheriot
Ok, yeah, their marketing speak fooled me. I thought they were more like
Spark, BEAM, etc.

I still think the "Kafka or /tmp" framing of the article is a straw man,
though. There are plenty of options in between, depending on the requirements.

------
faragon
The good thing about people using overkill queue systems like Kafka or
RabbitMQ for small problems is that they usually have time to react and
correct the mistake after burning lots of engineering resources. The sooner
they make the mistake, the better.

~~~
dtech
Is something like RabbitMQ really overkill, though? When I started out with a
project that needed a durable queue and some inter-process communication, I
bought it as a SaaS. Within 90 minutes the application was happily
sending/consuming messages on a durable queue.

I mean, is PostgreSQL really overkill just because a CSV file can also do some
things?

~~~
majidazimi
The problem with RabbitMQ is that you can't rewind. It's good for messaging,
but if you need a log-like abstraction you need Kafka, which requires
ZooKeeper to start with. I would appreciate single-node message brokers (very
much like RabbitMQ) that don't do any fault tolerance and just focus on
providing a log abstraction.
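
To make the distinction concrete, here is a rough sketch of what a log abstraction with rewind means, as opposed to a queue that forgets a message once it is consumed. It is a toy in-memory model, not how Kafka or any real broker is implemented; the `SingleNodeLog` class and its `append`, `poll`, and `seek` methods are hypothetical names chosen for illustration.

```python
class SingleNodeLog:
    """Toy append-only log with per-consumer offsets (in-memory, hypothetical)."""

    def __init__(self):
        self._records = []   # records are never removed when they are read
        self._offsets = {}   # consumer name -> index of the next record to read

    def append(self, record):
        """Add a record and return its offset in the log."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer, max_records=10):
        """Return the next batch for this consumer and advance its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch

    def seek(self, consumer, offset):
        """Rewind (or fast-forward) a consumer to an arbitrary offset."""
        if not 0 <= offset <= len(self._records):
            raise ValueError("offset out of range")
        self._offsets[consumer] = offset
```

Because reading only moves an offset, each consumer can replay history independently by calling `seek`. The same property means that an accidental `seek` to an old offset redelivers everything after it.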

~~~
nitrogen
I've seen the ability to rewind cause problems in practice (accidental
rewinding causing redelivery of old messages). It's something to be used
cautiously, only if you really need it.

------
icholy
I've been using Redis Streams for a similar problem.

------
matchagaucho
If only AWS would allow Lambda methods to handle SQS FIFO queues.

That would be the serverless holy grail event solution that scales from small
to large.

~~~
lkschubert8
Azure Functions do have the ability to be triggered by Azure Storage queues.

------
mollusk
A lot of people have said this doesn’t stand up to Kafka/RabbitMQ, but how
does it compare to Amazon’s SQS?

