Pulsar – an open-source distributed pub-sub messaging platform (github.com)
155 points by dragonsh 21 days ago | hide | past | favorite | 58 comments

I've used Kafka a lot in the past and I've been looking at Pulsar. The broker being stateless is great. I've had issues with moving partitions around in Kafka.

That being said, I'm getting tired of managing these clunky memory-hungry JVM-based systems that rely on external dependencies like Apache Zookeeper and BookKeeper. They may be necessary if you're Yahoo, but I would argue that for the vast majority of companies, these complex systems create so many administration headaches that the net productivity impact is negative. I've spent days or weeks debugging Kafka issues. I almost feel like "big clunky heavy JVM enterprise software" has become synonymous with Apache projects.

If you don't absolutely need the messaging guarantees I would recommend looking into NATS (https://nats.io/) - it's a brokered system that's significantly more lightweight and super easy to deploy and play around with. You can get persistence and delivery guarantees with NATS streaming, but that part is optional and a bit more early stage.

> The broker being stateless is great. I've had issues with moving partitions around in Kafka.

Well, the Pulsar broker is (kinda) stateless, because brokers are essentially a caching layer in front of BookKeeper. But where's your data actually stored then? In BookKeeper bookies, which are stateful. Killing and replacing/restarting a BookKeeper node requires the same redistribution of data as in Kafka's case. (Additionally, BookKeeper needs a separate data recovery daemon to be run and operated, https://bookkeeper.apache.org/archives/docs/r4.4.0/bookieRec...)

So the comparison of 'Pulsar broker' vs. 'Kafka broker' is very misleading because, despite identical names, the respective brokers provide very different functionality. It's an apples-to-oranges comparison, like if you'd compare memcached (Pulsar broker) vs. Postgres (Kafka broker).

Brokers in Pulsar can be seamlessly added or removed.

Bookies can seamlessly be added.

Kafka, unless there has been a KIP since I stopped paying attention to it, doesn't do this.

I remember that adding brokers to Kafka and taking advantage of them on existing topics meant repartitioning, which, if I recall correctly, breaks the golden ordering contract that most devs bank on: the data written to a partition will always be in order within that partition.

> Bookies can seamlessly be added.

Kafka brokers can seamlessly be added, too.

> I remember adding brokers to Kafka and taking advantage of them on existing topics meant repartitioning which if I recall correctly breaks the golden ordering contract that most devs bank on.

Adding brokers to Kafka does not require repartitioning. It requires data rebalancing ('migrate' some data to the new brokers), which does not break any ordering contract. I suppose the words sound sufficiently similar that they are easy to be mixed up. :)

(For what it's worth, BookKeeper requires the same data rebalancing process.)
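To make the rebalancing-vs-repartitioning distinction concrete, here's a toy Python sketch. The hash function and partition counts are purely illustrative (Kafka actually uses a murmur2-based partitioner); the point is just which operation changes the key-to-partition mapping:

```python
# Toy model: repartitioning changes which partition a key maps to,
# while rebalancing only changes which broker hosts a partition.

def partition_for(key: str, num_partitions: int) -> int:
    # stand-in for a real partitioner, illustrative only
    return sum(key.encode()) % num_partitions

keys = ["alice", "bob", "carol"]

# Repartitioning 10 -> 20: some keys now map to a *different* partition,
# so per-key ordering across old and new data is no longer contiguous.
before = {k: partition_for(k, 10) for k in keys}
after = {k: partition_for(k, 20) for k in keys}
print(before, after)  # 'alice' lands on a different partition after the change

# Rebalancing: move partition 3 from broker 0 to newly added broker 2.
# No key-to-partition mapping changes; stored bytes just move "as is".
assignment = {0: [0, 3], 1: [1, 2]}
assignment[0].remove(3)
assignment.setdefault(2, []).append(3)
print(assignment)
```

Only the first operation can affect application semantics; the second is invisible to producers and consumers.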

> The data written to the partition will always be in the order for that partition itself.

Yes, for Kafka, all log segments that make up a topic-partition are always stored -- or, when rebalancing, moved -- in unison on the same broker. Or, brokers (plural) when we factor in replication. Kafka's approach has downsides but also upsides: data is always stored in a contiguous manner, and can thus also be read by consumers in a contiguous manner, which is very fast.

In comparison, BookKeeper has segmented storage, too. But here the segments -- called ledgers -- of the same topic-partition are spread across different BK bookies. Also, because of how BookKeeper's protocol works (https://bookkeeper.apache.org/docs/4.10.0/development/protoc...), what bookies store are actually not contiguous 'ledgers', but in fact 'fragments of ledgers' (see link). As mentioned elsewhere in this discussion, one downside of this approach is that BK suffers proverbially from data fragmentation. (Remember Windows 95 disk fragmentation? Quite similar.)

No approach is universally better than the other one. As often, design decisions were made to achieve different trade-offs.
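The fragment striping described above can be sketched in a few lines. This is a toy model of the placement idea only (real BookKeeper also has ack quorums, ensemble changes, etc.); the ensemble and quorum sizes are illustrative:

```python
# Toy sketch of BookKeeper-style striping: with an ensemble (E) larger
# than the write quorum (W), consecutive entries land on rotating subsets
# of bookies, so no single bookie holds a contiguous copy of the ledger.

ENSEMBLE = ["bookie-a", "bookie-b", "bookie-c", "bookie-d"]  # E = 4
WRITE_QUORUM = 2                                             # W = 2

def bookies_for_entry(entry_id: int) -> list[str]:
    # entry i goes to W bookies, chosen round-robin from the ensemble
    return [ENSEMBLE[(entry_id + j) % len(ENSEMBLE)] for j in range(WRITE_QUORUM)]

for entry in range(6):
    print(entry, bookies_for_entry(entry))
# Each bookie ends up with an interleaved subset of the ledger's entries,
# i.e. the 'fragments of ledgers' (and the fragmentation) mentioned above.
```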

Maybe we're both mixed up.

Let's say we have a cluster with a sort of "global" topic called "incoming-events" and it has 10 partitions.

We'll most likely eventually end up with a "hot" write partition because rarely do I see a perfect even distribution in event streams.

I'd like to seamlessly add capacity to remove this hot spot.

With Pulsar you're using BK, which means just spinning up more bookies that will take on new segments; rebalancing will then move some existing segments off the old ones.

With Kafka I don't know what the option is, aside from spinning up a larger and larger broker and rebalancing onto the bigger box. What I typically see at companies is repartitioning from 10 to 20 partitions to avoid expensive one-off boxes.

Nobody likes non-uniform resources, because they're a nightmare to manage. Imagine a k8s deployment with replicas: 10 where you have to handle a custom resource-allocation edge case on one pod, different from the other 9 brokers.

(Just assumed 10 partitions to 10 pods)

Both Kafka and Pulsar have this kind of bottleneck in your scenario -- say, one "hot" write partition.

If one Kafka broker or BookKeeper bookie node cannot keep up with the write load (e.g. network or disk too slow, CPU util too high), you must add more partitions. For Kafka, for the reasons you already mentioned. For BookKeeper (and Pulsar), because only a single ledger of a topic-partition is open for writes at any given time.
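The skewed-write scenario is easy to demonstrate with a toy simulation. The partitioner, key distribution, and numbers below are all illustrative, not taken from either system:

```python
import collections
import random
import zlib

# Toy demo of the "hot partition" problem: a skewed key distribution
# concentrates writes on one partition no matter how many brokers exist.
random.seed(7)

NUM_PARTITIONS = 10

def partition_for(key: str) -> int:
    # stand-in for a real partitioner (e.g. murmur2 in Kafka)
    return zlib.crc32(key.encode()) % NUM_PARTITIONS

# Half of all events come from one dominant key (say, one big tenant):
events = ["big-tenant"] * 5000 + [f"tenant-{random.randrange(1000)}" for _ in range(5000)]
loads = collections.Counter(partition_for(k) for k in events)

hot_partition, hot_load = loads.most_common(1)[0]
print(f"partition {hot_partition} takes {hot_load / len(events):.0%} of writes")
```

Adding brokers or bookies spreads partitions across more machines, but a single partition's write stream still lands on one node at a time, which is exactly the bottleneck described above.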

> Kafka brokers can seamlessly be added, too.

You can add as many as you want, but they take no traffic.

How do BookKeeper nodes handle addition, then? AFAIK, if you add new storage nodes, whether BookKeeper or Kafka, there has to be some repartitioning; otherwise, how will the new node be useful at all?

See my previous answer further up in this sub-thread. Neither Kafka nor BookKeeper require data repartitioning when adding new nodes. Instead, both require data rebalancing, which moves some data from existing nodes to the newly added nodes.

Think: repartitioning changes the logical layout of the data, which can impact app semantics depending on your application; whereas data (re)balancing just shuffles stored bytes around "as is" behind the scenes, without changing the data itself. The confusion probably stems from the two words sounding very similar.

For Kafka, you use tools like Confluent's Auto Data Balancer (https://docs.confluent.io/current/kafka/rebalancer/index.htm...) or LinkedIn's Cruise Control (https://github.com/linkedin/cruise-control) that automatically rebalance the data in your Kafka cluster in the background. Pulsar has its own toolset to achieve the same.

https://jack-vanlightly.com/blog/2018/10/2/understanding-how... goes into this.

> The data of a given topic is spread across multiple Bookies. The topic has been split into Ledgers and the Ledgers into Fragments and with striping, into calculatable subsets of fragment ensembles. When you need to grow your cluster, just add more Bookies and they’ll start getting written to when new fragments are created. No more Kafka-style rebalancing required. However, reads and writes now have to jump around a bit between Bookies.

If consumers are keeping up, there will be no reads to the BookKeeper layer as the Pulsar broker will serve from memory.

When reads need to go to BookKeeper there are caches there too, with read-aheads to populate the cache to avoid going back to disk regularly.

Even when having to go to disk, there are further optimizations in how data is laid out on disk to ensure as much sequential reading as possible.

Also note that the fragments aren't necessarily that small either.
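The read-ahead behavior is roughly this pattern (a generic sketch, not Pulsar's or BookKeeper's actual code; `READ_AHEAD` and the entry format are illustrative):

```python
# Generic read-ahead cache sketch: on a cache miss, fetch the requested
# entry plus the next few in a single round trip, so a consumer reading
# sequentially mostly hits the cache instead of going back to disk.

READ_AHEAD = 4
cache: dict[int, bytes] = {}
round_trips = 0

def read(entry_id: int) -> bytes:
    global round_trips
    if entry_id not in cache:
        round_trips += 1  # one trip to disk/bookie fetches a whole batch
        for i in range(entry_id, entry_id + READ_AHEAD):
            cache[i] = b"entry-%d" % i
    return cache[entry_id]

for e in range(8):  # a consumer reading sequentially
    read(e)
print(round_trips)  # 2 round trips for 8 sequential reads
```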

Ok, so it will sacrifice some throughput when a new node is added, as reads and writes need to jump around a bit.

Depending on your application architecture I'd look into nsq (https://nsq.io/) which works quite differently from most message brokers in that you're supposed to run one instance of the broker per server so you always have one local instance that you can publish to, no matter the current network conditions.

There’s also Liftbridge (https://liftbridge.io/) which is built to offer Kafka-like features on top of NATS. It just hit v1.0 and looks super promising.

Full disclosure: I worked on building a client library for it while it was in its early stages, and from my limited time with it, it was super easy to operate, very light on resource usage, and stupid-fast.

Is liftbridge open source?

>That being said, I'm getting tired of managing these clunky memory-hungry JVM-based systems that rely on external dependencies like Apache Zookeeper and BookKeeper.

One of my pet peeves with Kafka were CLI tools written in JVM languages. Slow JVM startup time was killing my focus.

> ... systems that rely on external dependencies like Apache Zookeeper and BookKeeper.

I would argue that building up on existing systems (especially things like a distributed consensus system like Zookeeper) is generally good practice. Being able to just point a bunch of self-contained JARs/WARs at a Zookeeper/etcd cluster and not worry about startup sequencing or cluster bootstrapping makes me happy.

Hasn't Kafka been talking about eliminating external zk forever? Are they making it an internal library or adopting a new consensus scheme?

Yes, you remember correctly. The Kafka community has already started the work on getting rid of ZooKeeper. See https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A... for details.

From my understanding, they’re going to use the brokers as their own metadata store, so the data that was in ZK goes in a topic instead, and I believe Raft for leadership?

KIP-500 is the proposal/PR (?) for ZK-free Kafka which I understand has been merged in and work is now progressing on follow up tasks.

Modern JVMs don't take that long to start. While not at raw-binary levels, they shouldn't be killing your focus, as they're on par with scripting languages.


That depends on the filesystem, memory pressure, etc. Most JVM-based apps have quite a bit of filesystem I/O at startup, and a surprising memory footprint. If the kernel's page cache is mostly full of dirty pages, that's a bunch of extra disk writes in order to free up RAM for the JVM.

These startup benchmarks are typically not run with the JVM and application jars being pulled from NFS or AFS on a machine with a lot of dirty pages in the page cache. Some small sh/bash app using curl or wget to hit a REST API is going to have a lot less network and disk I/O in that case, even if bash and curl are being pulled from NFS/AFS. Many companies run JVMs from NFS/AFS to simplify deployment and management.

Also, if the OP is logged into a server running the JVM-based CLI app, they may be running it on a server JVM and getting much more eager JITting.

Maybe it's not fair to the JVM that there are lots of circumstances where things are accidentally tuned poorly for the JVM. On the other hand, there is a lot to be said for tiny apps that are pretty resilient to poor conditions. Sometimes size still matters, even on big servers with plenty of RAM and cores.

Yep. I'm using containers and Kubernetes in production so I'm trying to stay far away from anything JVM-based. On top of being slow to start, memory hungry, and plagued by GC issues, JVM docker images are always huge. I don't want two layers of virtualization. If Docker and k8s become more mainstream that's probably bad news for a lot of JVM projects. Of course, Pulsar itself is quite old (2013?) so it was built well before any of the containerization stuff took off.

I’m not really sure what containers/k8s have to do with the JVM? Containers are just namespacing, not virtualization. If you’re running in the cloud, either way you slice it you’ll have two virtual machines: the hypervisor and the JVM.

I’d argue that being able to herd your JVM procs like cattle makes them good candidates for k8s because you can always just set resource limits so they get purged when the heap becomes too large.

I don't understand the relationship between containers, kubernetes and JVM.

I run a prod Pulsar cluster using helm charts. All containers & kubernetes, zero issue.

You probably just heard something about Kubernetes and never really used it. Pulsar has its own Helm chart; installation in this case is as trivial as for any other product. It might require 1-2 GB more memory, but is that really critical?

NATS itself has never been a stable product; it has been completely rewritten several times. If you want the same level of functionality, it's not on par, or you have to use yet another rewrite of it that's in deep alpha. Just try to do a basic negative acknowledgment on NATS :) it's not there

What I like about NATS is their dev-focused approach. The clients are shining.

Pulsar, by contrast, doesn't have decent Node.js support.

+1 for nats.

It is amazing how quickly it went from "what is nats" to "that feature that required a message broker is complete".

Good job nats team.

No love for MQTT as message bus in this space?

Seconding this comment, I've had nothing but complexity added for not much benefit in this space by needing to _also_ manage Zookeeper. Would love something small/stateless that isn't necessarily "Yahoo Production Grade" and can service hundreds or thousands of requests per second.

Will check out NATS.

It sounds like your gripe isn't actually with Java or even the memory footprint; it's that you don't want to manage Zookeeper in addition to managing Kafka.

Yeah, more or less. I'm not complaining about the JVM directly, but rather about what typical Java software, of which Kafka is one example, looks like, at least in my mind. Whenever I see the JVM I think: horrible user experience and tooling, bad memory footprint, slow to start and deploy, a complex codebase where it's impossible to debug issues, huge Docker images, complicated configuration and tuning options that don't work out of the box, dependency hell, and so on. It's the typical enterprise package for managers that don't want to get fired: nobody ever got fired for buying IBM.

When I see something like NATS - A few thousand lines of code and a 40MB image, that makes me happy. It's a lightweight thing that "just works."

That's why I stay away from anything JVM. And I acknowledge that I am totally biased and may be wrong in some cases, but that has been my experience.

For a drop-in Kafka replacement, check out: https://vectorized.io/redpanda/

I found Redis pub/sub fits most needs for a lightweight message-passing system, if you don't need persistence.

I’ll throw in another alternative to look at for those using Kubernetes: https://enmasse.io

Indeed, Apache Pulsar is a very good project, but as far as I'm concerned, I prefer RabbitMQ in a production env, and for little projects Redis is enough to fill this role.

As I understand it, Pulsar combines Kafka-like event processing with RabbitMQ-like message queueing in a single system. It came out of Yahoo, is JVM based, and is often combined with Flink or Spark for the streaming interface. At its heart is the BookKeeper engine, which uses RocksDB under the hood. The comparisons I’ve seen are with Kafka, and its advantage is a cleaner and more flexible cluster architecture.

Pulsar and RabbitMQ are fundamentally different products; viewing Pulsar as a push-based message queue misses a lot of the features that make it so suitable for event-driven architectures.

RabbitMQ simply does not like holding on to data. Performance craters with long queues. RabbitMQ is accurately a “message broker” and quite an excellent one at that. Tooling is fantastic, and the software (built on Erlang/OTP) is very reliable.

But it is not a “distributed log”. Pulsar (as Kafka) is built on a distributed ledger/log. People confuse ‘semantics’ of messaging with “message broker” so equate various products supporting ‘messaging semantics’.

To your point, and to doubters, simply try building an event sourcing system (complete with replays to recover) on RabbitMQ and see how that works out!

Not "currently" a distributed log, but we're working on adding that (in a Rabbity way). We're not trying to compete with the likes of Pulsar or Kafka though, we're just trying to round out RabbitMQ's functionality to ensure it remains the best swiss army knife of messaging - and log semantics (and performance) is now a dominant paradigm.

You're working on Rabbit team now? Interesting! Ironically enough, you probably have a lot to do with Pulsar gaining mindshare. Thanks for your blog; very informative.

I agree with the statement that Pulsar and RabbitMQ are two different products. I understand Pulsar could be missing the key-based binding, exchanging, and routing of RabbitMQ. (To a certain extent, Pulsar Functions and key-based routing might be able to provide routing features, but they are not baked in as first-class citizens the way they are in RabbitMQ.) Do you care to elaborate on which of the features that matter for event-driven architectures are missing from Pulsar?

(Another clarification: technically the Pulsar client is a thick client. It allows both the client and the broker to co-manage flow control. It is more than a push-based queue; it does data streaming.)

Recent discussion of Pulsar vs Kafka: https://news.ycombinator.com/item?id=23787958

Anyone using this? Looks interesting.

Yes. I use it in a setup where some streams are geo-replicated to other continents.

Most of the time, it just works.

Here is a performance test program I wrote to see what I could squeeze out of it:


Yes, I'm using it in production and we're transitioning entirely over to it from our existing system.

Running with official helm charts in AWS managed kubernetes and using EBS.

Currently passing about 20k/s in and 20k/s out.

Got any specific questions just let me know.

Not yet, but I'm reading the docs. It has 'once-only delivery', which is good. Still trying to find out if it does FIFO (I think it's hard to achieve this in a distributed system).

Yes it does FIFO.

Just like RabbitMQ, Apache Kafka, and many other distributed systems, writes go through an elected leader, which can ensure ordering guarantees.

Specifically with Apache Pulsar, each topic has an owner broker (leader) who accepts writes and serves readers.

It should be noted that Apache Pulsar supports shared subscriptions which allow two or more consumers to compete for the same messages, like having two consumers on a RabbitMQ queue. Here FIFO order cannot be guaranteed for all kinds of reasons.
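A toy round-robin dispatch shows why shared subscriptions give up FIFO (a sketch only; Pulsar's actual dispatcher is considerably more involved):

```python
import itertools

# Toy model of a shared subscription: the broker round-robins messages
# across competing consumers, so no single consumer sees the full
# FIFO stream of the topic.

messages = list(range(10))
consumers = {"c1": [], "c2": []}

for msg, name in zip(messages, itertools.cycle(consumers)):
    consumers[name].append(msg)

print(consumers)  # c1 gets [0, 2, 4, 6, 8], c2 gets [1, 3, 5, 7, 9]
# Each consumer sees an ordered *subset*, and because acks, redeliveries,
# and consumer speeds differ, overall processing order is not FIFO.
```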


Basically, they have the same ordering guarantee as Kafka: FIFO is guaranteed for messages from the same producer, within the same partition. If you need producer-level FIFO, then you can only use 1 partition.

There's no ordering guarantee for messages coming from different producers.

To add:

In a distributed system, an ordering guarantee by producer (and optionally a partition "key", which in the end is like an extension of the topic key) is pretty much all you need and all you can get.

If you have two producers, then how would one decide which one sent the message first? Go by some timestamp? Clocks are unreliable for these purposes, so it comes down to consensus. And letting the queue decide which one came first is equivalent. Once a message is acknowledged by the queue, the ordering cannot change anymore.

What do you mean by FIFO? Messages delivered in insertion order? With Kafka, this is easy: messages with the same key land in the same partition. I guess Pulsar offers something similar.
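That key-to-partition idea can be sketched in a few lines (a toy model; the crc32 hash and partition count are illustrative, not Kafka's or Pulsar's actual partitioner):

```python
import zlib

# Toy sketch of key-based partitioning: the same key always hashes to
# the same partition, and appends within a partition preserve send order.

NUM_PARTITIONS = 3
partitions = {p: [] for p in range(NUM_PARTITIONS)}

def send(key: str, seq: int) -> None:
    p = zlib.crc32(key.encode()) % NUM_PARTITIONS  # same key -> same partition
    partitions[p].append((key, seq))

# one producer sends an ordered sequence of events per key
for seq in range(5):
    for key in ("order-1", "order-2"):
        send(key, seq)

# within whichever partition a key landed in, its events are in send order
for log in partitions.values():
    for key in ("order-1", "order-2"):
        seqs = [s for k, s in log if k == key]
        assert seqs == sorted(seqs)
print("per-key FIFO holds")
```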

I've spent a lot of time poking it and running it through scenarios to get an understanding of it - haven't used it in production though. Might look at trialling it there in a year or so, once some of the early bugs are shaken out.

It has a lot of promise, but adopting it right now is going to require you spending a lot of time reading source code when you, like I did, find the docs are out of date, or have giant holes.

Things I liked:

Tiered storage - the ability to offload old data to S3 / Azure / HDFS and then have it transparently reloaded for a consumer is awesome (albeit with some delay, but their reasoning is that historical reads don't need millisecond latency).

Inter-cluster replication is a lot easier than running up a Kafka Connect cluster to run Mirror Maker 2, but there's a slight caveat that it doesn't work for chained clusters (A -> B -> C). A message from A won't be replicated to C by B. But Pulsar assumes A is going to replicate to both of them if they need it.

Schema registry baked in is nice. Pulsar's replication automatically replicating the schema to other clusters is verrrrry nice for consumers on a different cluster. It's doable with Confluent's Schema Registry, but it's another thing to manage.

Load balancing brokers - haven't seen it in action, but they attempt to redistribute partition ownership based on load.

The only downsides were that the docs were sparse or out of date in places, especially the BookKeeper ones. There was some configuration fun and there were odd errors with the BK command-line tools (and they could never tell me which bookie was the auditor), and there are two scripts that do very similar things, but not quite. That said, having an auditor process that automatically checks for under-replicated segments is nice, once it's working.

There are several tools for Kafka that do something similar, but it's nice to have it out of the box.

Can't really comment on its capacity to act as a message queue, but it has that too.

But yeah, I reckon I'll look again in a year.

On an interesting note, Pulsar is a very similar design to Twitter's pub-sub system: near-stateless, easily scalable brokers (another layer) with BK as the storage layer. They moved to Kafka, but then, they have Twitter-sized problems, so YMMV. https://blog.twitter.com/engineering/en_us/topics/insights/2...

What's up with github?


> Update - We have identified the source of elevated errors and are working on recovery.

> Jul 13, 05:53 UTC

Any other Publix fans here that thought this was referring to their sub sandwiches for a second?

Given the technical audience of hacker news, I can all but assure you that no one else here who reads “pub-sub” defaults to thinking “Publix Sub Sandwiches”

Publix makes great subs but I definitely wasn't thinking about them at first.
