
Hermes – A message broker built on top of Kafka - qluml
https://allegro.tech/2019/05/hermes-1-0-released.html
======
lllr_finger
Can someone help me understand the value proposition for Hermes? The only
thing I can see is that it abstracts away producing to and consuming from
Kafka. The use cases provided answer why you'd use a message broker system,
but not why you'd want to do it over HTTP.

Edit: I understand HTTP is easier than Kafka, but is this something developers
really struggle with when adopting Kafka? My experience is that they struggle
with the nuances, behavior, and maintenance of Kafka/ZooKeeper more than
anything.

I also didn't see how it dealt with concepts like exactly once delivery - any
experiences in that area?
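
(For context on the "HTTP is easier" point: publishing through Hermes is a plain HTTP POST to its frontend. A minimal sketch below; the host and topic name are made up, and the `POST /topics/{name}` endpoint shape follows the Hermes docs, so verify against the current version.)

```python
import json
import urllib.request

def build_publish_request(frontend_url: str, topic: str, event: dict):
    """Build the POST for Hermes' publish endpoint (POST /topics/{topic})."""
    return urllib.request.Request(
        f"{frontend_url}/topics/{topic}",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_publish_request("http://hermes-frontend:8080",
                            "orders.order-created",
                            {"orderId": 42, "status": "CREATED"})
# urllib.request.urlopen(req)  # would actually send it to a live frontend
```

No Kafka client library, no broker discovery, no partitioner configuration: that abstraction is the value proposition being asked about.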

~~~
thanatos_dem
Exactly once delivery is not a thing, and confluent needs to stop openly lying
to people about it. At least once delivery and idempotence is not the same,
and has existed forever.

Calling it “exactly once” is marketing BS. It’s the same as Oracle claiming
for years to support serializable transactions when they didn’t, except that
one is technically possible, they just didn’t support it in actuality.
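
(The contract being described is this: at-least-once delivery on the wire, plus consumer-side idempotence, giving effectively-once *processing*. A minimal sketch, with an in-memory dedup set standing in for what would in practice be a durable store:)

```python
# At-least-once delivery means the broker may redeliver a message.
# Deduplicating on a message id makes the *effect* happen once.
seen_ids = set()          # in practice: a durable store, keyed per consumer
state = {"balance": 0}

def handle(message):
    if message["id"] in seen_ids:   # redelivered duplicate: skip
        return
    state["balance"] += message["amount"]
    seen_ids.add(message["id"])

for m in [{"id": "m1", "amount": 10},
          {"id": "m1", "amount": 10},   # duplicate redelivery
          {"id": "m2", "amount": 5}]:
    handle(m)

# state["balance"] == 15: the duplicate was delivered but not applied twice
```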

~~~
ryanworl
They do not use the phrase exactly-once delivery as far as I’ve seen, they say
“exactly-once semantics”. And this is referring to a specific set of Kafka
features that you previously had to build yourself if you used Kafka. I have
not seen any reference to them claiming their solution is somehow novel. In
fact, it just wraps up patterns people were doing anyway.

Oracle claiming to support serializable transactions is also false (as you
say), but calling it a lie is not the whole story. “ANSI serializable” and
actually serializable are not the same thing. Oracle is “ANSI SQL-92
serializable”, as in no named anomalies from the spec. There just happen to be
more anomalies like write skew which are not in the spec.
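
(Write skew, the anomaly mentioned above, in a minimal simulation: under snapshot isolation two transactions read overlapping data but write disjoint rows, so neither sees a conflict, yet a cross-row invariant breaks. The on-call-doctors scenario is a standard textbook example, not from the thread:)

```python
# Invariant: at least one doctor must stay on call.
db = {"alice": "on_call", "bob": "on_call"}

def go_off_call(snapshot, who):
    # Each transaction checks the invariant against its own snapshot.
    on_call = [d for d, s in snapshot.items() if s == "on_call"]
    if len(on_call) >= 2:
        return (who, "off_call")   # "safe to leave", it concludes
    return None

snapshot = dict(db)   # both transactions start from the same snapshot
writes = [go_off_call(snapshot, "alice"), go_off_call(snapshot, "bob")]
for w in writes:
    if w:
        db[w[0]] = w[1]   # disjoint rows, so no write-write conflict is detected

# Both commits succeed, and the invariant is now violated:
# db == {"alice": "off_call", "bob": "off_call"}
```

This is exactly the kind of anomaly that has no name in the SQL-92 spec, which is how "ANSI serializable" and actually serializable diverge.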

~~~
tiew9Vii
No, they do. I completely agree with the original poster on this. Confluent
over-egg the marketing; I think they may have changed to the phrase “exactly
once semantics” because Apache Pulsar labelled it that from the start and
people were calling out “exactly once”.

~~~
thanatos_dem
Even from their announcement blog post, they're intentionally mixing words -
[https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/](https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/)

They may never describe Kafka itself without the word "semantics", but here
are some other snippets:

- "I know what some of you are thinking. Exactly once delivery is impossible"
- "While some have outright said that exactly once delivery is probably
impossible!"

They mix their phrasing depending on what they're talking about, and whether
they are referring to Kafka directly or indirectly.

------
theomega
This sounds very interesting. Did anyone get from the homepage what kinds of
guarantees this offers? What if the HTTP endpoint where Hermes should push the
data to is down? Does it retry? If yes, for how long?

~~~
theomega
Answering my own question (RTFM):
[https://hermes-pubsub.readthedocs.io/en/latest/user/subscribing/](https://hermes-pubsub.readthedocs.io/en/latest/user/subscribing/)

You can configure how long it retries and with what strategy.
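
(A subscription's retry behaviour is set in its `subscriptionPolicy`. The fragment below is a sketch from memory of the linked docs; field names and units may differ in the current version, so treat it as illustrative only:)

```json
{
  "topicName": "orders.order-created",
  "name": "fraud-service",
  "endpoint": "http://fraud-service/events",
  "subscriptionPolicy": {
    "rate": 100,
    "messageTtl": 3600,
    "messageBackoff": 100,
    "retryClientErrors": false
  }
}
```

Roughly: how long a message is retried before being dropped, the backoff between attempts, and whether 4xx responses are retried at all.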

------
SkyRocknRoll
If any of you are looking for messaging and streaming under a single system,
pulsar.apache.org supports both, and it's a lot more reliable and scales
better than Kafka.

------
hestefisk
Very nice with a simple wrapper. That said, I’m wondering if 9 out of 10 use
cases could make do with something simpler, i.e. ZeroMQ, which scales really well.

------
twa927
Can someone provide actual high-level use cases for using Kafka? Preferably
use cases not handled by RabbitMQ.

I've seen a few talks about Kafka but they focused on the internals. My guess
is that Kafka is for large systems for which managing a multi-node RabbitMQ
cluster is too much trouble.

~~~
josephg
I’ve long had the inverse view - I’m not sure what good use cases there are
for Rabbitmq that couldn’t be handled better by a Kafka cluster.

One company I worked with used Kafka as their central source of truth across
the organisation. All events generated by users were thrown into a massive
Kafka cluster. Each team in the organisation cared about a different view into
that data (financials, marketing, fraud, what we display to that user on the
website, etc). Each team would ingest the same kafka queue and do different
things with it - often consuming certain events into their own Postgres
instance, or other things like that.

I used Kafka when I made my reddit r/place clone a few years ago because it
gives great read and write amplification. With Postgres as a central source of
truth, you can only handle thousands of writes per second. And reads will slow
down the instance. With Kafka you can handle about 2M/sec. And reads can
really easily be serviced from other machines - you can just have a bunch of
downstream Kafka instances consuming from the root, and serving your readers
in turn.

It may be that you can also solve all these problems with a well configured
rabbitmq cluster. But coming from a database world I find it more comfortable
to reason about architecture, performance and correctness with Kafka.
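
(The fan-out pattern described above comes down to this: one append-only log, and each team keeps its own read position, so every consumer group independently sees the full stream. This is how Kafka consumer groups behave; a toy simulation, with made-up events:)

```python
log = []                                  # the shared topic (append-only)
offsets = {"fraud": 0, "finance": 0}      # per-team read positions

def produce(event):
    log.append(event)

def consume(team):
    """Return everything this team hasn't seen yet, advancing its offset."""
    start = offsets[team]
    offsets[team] = len(log)
    return log[start:]

produce({"user": 1, "action": "purchase"})
produce({"user": 2, "action": "refund"})

fraud_view = consume("fraud")       # both events
finance_view = consume("finance")   # the same two events, independent cursor
```

Because reads only move a cursor, adding another team costs the producer nothing: that is the read amplification the comment refers to.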

~~~
nullwasamistake
Sounds like your org used Kafka for event sourcing. This is almost always a
bad idea, event sourcing and aggregate reconstruction is a nightmare IMO.

Kafka used as a pure FIFO cache for regular CRUD endpoints works fine

~~~
josephg
Yes; they did. It worked pretty well actually.

Why do you think it’s a bad idea? Most of the arguments against event sourcing
that I’ve read seem to be “yes but the tooling isn’t very good”. That might be
true, but maybe we solve that problem with more investment into event
sourcing; not less of it.

~~~
nullwasamistake
TLDR: the tooling is so bad it's basically impossible to run at scale. I worked
for a company that tried. Maybe on a small scale it's fine, but replay and
storage of past events take insane amounts of space at high event rates, to
the point that storage costs and replay times became a real problem (many
terabytes and days).

I also don't think it's a great idea in general. The event stream directly
replicates a DB commit log, and the aggregates replicate your tables. It's
building your own database.
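
("Aggregate reconstruction" in one fold: replaying the event log rebuilds current state, which is the same job a database does when replaying its commit log - the parent's point. Events here are made up:)

```python
events = [
    {"type": "deposit",  "account": "a1", "amount": 100},
    {"type": "withdraw", "account": "a1", "amount": 30},
    {"type": "deposit",  "account": "a2", "amount": 50},
]

def rebuild(events):
    """Fold the event log into per-account balances (the 'aggregate')."""
    balances = {}
    for e in events:
        sign = 1 if e["type"] == "deposit" else -1
        balances[e["account"]] = balances.get(e["account"], 0) + sign * e["amount"]
    return balances

rebuild(events)  # {"a1": 70, "a2": 50}
```

At a few events this is trivial; at billions, every rebuild means re-reading the whole log, which is where the "many terabytes and days" complaint comes from.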

------
DiseasedBadger
Increasingly, all technology news sounds like:

"X: a blazingly fast X built on top of {something I vaguely thought did X}"

~~~
dang
"Please don't post shallow dismissals, especially of other people's work. A
good critical comment teaches us something."

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

------
exabrial
Ironically: Hermes was the name of a JMS user interface/testing tool.

~~~
codeduck
It's also the name of a hilariously bad courier in the UK - so bad in fact
that most people call it Herpes instead.

~~~
vowelless
It’s also the name of a bureaucrat on a certain tv show.

~~~
chimen
curiously, nobody mentions the clothing designer :)

~~~
user5994461
thought it was a handbag designer.

