Hacker News | itestyourcode's comments

Is it the same dilemma if we can kill one to save more?


I agree with the statement that Pulsar and RabbitMQ are two different products. I understand Pulsar could be missing the key-based binding, exchange, and routing features found in RabbitMQ. (To a certain extent, Pulsar Functions and key-based routing might be able to provide routing features, but they are not baked in as first-class citizens the way they are in RabbitMQ.) Would you care to elaborate on which event-driven features you find missing from Pulsar?

(Another clarification: technically, the Pulsar client is a thick client. It allows the client and broker to co-manage flow control. It is more than a push-based queue; it does data streaming.)
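To illustrate what "co-managing flow control" can mean, here is a deliberately simplified sketch (toy classes, not Pulsar's actual protocol or API): the consumer grants the broker a number of permits, the broker pushes at most that many messages, and the client re-grants permits as the application drains its receive queue.

```python
from collections import deque

class Broker:
    """Toy broker: pushes messages only while it holds permits from the consumer."""
    def __init__(self, messages):
        self.pending = deque(messages)
        self.permits = 0

    def grant_permits(self, n, consumer):
        self.permits += n
        # Push-based delivery, but bounded by the permits the consumer granted.
        while self.permits > 0 and self.pending:
            consumer.queue.append(self.pending.popleft())
            self.permits -= 1

class Consumer:
    """Toy thick client: bounded receive queue, re-grants permits as it drains."""
    def __init__(self, broker, receiver_queue_size=5):
        self.queue = deque()
        self.broker = broker
        broker.grant_permits(receiver_queue_size, self)  # initial permit grant

    def receive(self):
        msg = self.queue.popleft()
        self.broker.grant_permits(1, self)  # one permit back per consumed message
        return msg

broker = Broker(messages=range(8))
consumer = Consumer(broker, receiver_queue_size=5)
print(len(consumer.queue))         # 5: broker pushed only as many as permitted
first = consumer.receive()
print(first, len(consumer.queue))  # 0 5: draining re-granted a permit, so one more arrived
```

The point of the sketch is the shape of the interaction, not the mechanics: neither side alone decides the delivery rate.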


Full disclosure: I am a co-founder of https://kesque.com . We provide Pulsar as a managed service with different tiers of SaaS plans, so my opinion may be biased.

For a new project, you should try to document the different aspects of your requirements. Is it data streaming, queuing, or both? What is the data retention policy? The message rate? How many consumers and producers? Any inbound or outbound integration with third-party destinations (e.g. S3, Flink)? Both Kafka and Pulsar have many features to offer, and it is not a simple task to pick one over the other. If you ask for guaranteed delivery, both will satisfy that requirement. A level-up question would be which can guarantee in-order delivery.
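On that in-order question: in partitioned systems, "in-order" usually means per-key order, because messages with the same key are routed to the same partition. A minimal sketch of the idea (the `route` function and event names are made up; real brokers use stable hashes such as Murmur3, not a byte sum):

```python
def route(key, num_partitions):
    # Deterministic key-to-partition mapping (illustrative only).
    return sum(key.encode()) % num_partitions

NUM_PARTITIONS = 3
partitions = {p: [] for p in range(NUM_PARTITIONS)}
events = [("user-a", 1), ("user-b", 1), ("user-a", 2), ("user-b", 2)]
for key, seq in events:
    partitions[route(key, NUM_PARTITIONS)].append((key, seq))

# Every message for a given key lands in the same partition, so each key's
# sequence is consumed in order; across keys there is no global ordering.
for p in range(NUM_PARTITIONS):
    print(p, partitions[p])
```

This is why "guaranteed in-order delivery" questions usually need to be refined into "in order per what key, and under what failure/retry behavior?"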

Managing Kafka and Pulsar requires knowledge. I do not think any of these durable messaging systems is maintenance-free (or that the industry is there yet). Any reliable distributed system is complex out of necessity. These systems more or less require a log consensus algorithm to achieve high availability: they all use either ZooKeeper or one of the Raft implementations, requiring multiple nodes to perform leader election. This is common to all distributed architectures (Kafka, Pulsar, CockroachDB, etcd...). I would argue Pulsar can be administratively simpler than Kafka because of the separation of the broker and BookKeeper (the data persistence layer). But this does not mean an operator without that knowledge can proficiently manage a cluster. We use Kubernetes/Helm to manage all of our Pulsar clusters. I would not credit Pulsar alone with low operational upkeep; it is the combination of Kubernetes, Helm, in-house tools, and engineering knowledge that lowers the operational cost.


I agree that shaving dozens or hundreds of milliseconds of latency is hardly noticeable to end users. But latency is an indicator of how well the system can perform and scale. High latency under normal load can reveal design or implementation flaws in the software (assuming it is running on reasonably modern hardware). Ultimately you want a system that can scale up while delivering consistent latency, so a low and consistent latency is a health meter that assures that. Within the same cluster (no cross-datacenter network I/O), Pulsar has a pretty impressive ~5 ms pub/sub latency for persistent topics (including being written to and acknowledged by non-volatile disk).
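One way to see why "consistent" matters as much as "low": two latency samples can share the same median while their tails differ wildly, and tail percentiles are what expose that. A minimal sketch with made-up numbers (nearest-rank percentile, samples in milliseconds):

```python
import math

def percentile(samples, p):
    # Nearest-rank percentile over the sorted samples.
    s = sorted(samples)
    k = max(0, math.ceil(len(s) * p / 100) - 1)
    return s[k]

# Two hypothetical latency profiles: same median, very different tails.
steady = [5, 5, 6, 5, 7, 5, 6, 5, 5, 6]
spiky  = [5, 5, 6, 5, 90, 5, 6, 5, 5, 80]

for name, samples in (("steady", steady), ("spiky", spiky)):
    print(name, "p50:", percentile(samples, 50), "p99:", percentile(samples, 99))
```

An average or median alone would rate both profiles as healthy; only the p99 flags the spiky one.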


I agree with everything you're saying about the value of reduced/smoothed out latencies and I suspect the parent agrees with that as well, but the parent was mostly teasing about the choice of "business insights" in lieu of all you've mentioned.


Pulsar does offer a standalone mode. It runs the vertical stack of Pulsar broker, BookKeeper, and ZooKeeper in one process, and it also ships as a single Docker image.
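For reference, the standalone image can be launched with something like the following (image tag and port mappings as per the upstream docs at the time; check the current release notes before relying on them):

```shell
# 6650 is the Pulsar binary protocol port, 8080 the HTTP admin/REST port.
docker run -it -p 6650:6650 -p 8080:8080 \
  apachepulsar/pulsar:latest bin/pulsar standalone
```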

However, the complexity you refer to is essential in a reliable messaging framework. Use of ZooKeeper, or any log consensus algorithm, requires multiple nodes (3 or more) to achieve the durability and high availability goals. It is complexity out of necessity; essential complexity, in other words.
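The "3 or more" figure falls out of simple majority-quorum arithmetic, which applies to ZooKeeper and Raft alike. A minimal sketch:

```python
def quorum(n):
    # Smallest majority of an n-node consensus group.
    return n // 2 + 1

def tolerated_failures(n):
    # Nodes that can fail while a majority can still be formed.
    return n - quorum(n)

for n in (1, 2, 3, 5):
    print(n, "quorum:", quorum(n), "tolerates:", tolerated_failures(n))
# A 2-node group tolerates zero failures, which is why 3 is the
# practical minimum for high availability; 5 tolerates two.
```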

There is another messaging framework called NATS (nats.io). It is not persistent, so it is architecturally simpler. You might want to investigate it.


