
SurgeMQ: MQTT Message Queue at 750,000 MPS - signa11
http://zhen.org/blog/surgemq-mqtt-message-queue-750k-mps/
======
jonquark
It's good to see new MQTT servers. As one of the developers of IBM's product
in this space I'd like to mention it:
[http://www-03.ibm.com/software/products/en/messagesight](http://www-03.ibm.com/software/products/en/messagesight)

I'm biased and MessageSight has existed a lot longer so it's not fair to
compare them but if people want to see numbers our performance report is here:
[http://www-01.ibm.com/support/docview.wss?uid=swg24035590](http://www-01.ibm.com/support/docview.wss?uid=swg24035590)
(that report is for 1.1, the equivalent for 1.2 released a couple of weeks ago
isn't out yet).

Good luck and a fair wind to everyone in the MQTT space!

------
itchyouch
These numbers are great, but they are also highly dependent on hardware
architecture and strongly coupled to network architecture. Another big
contributor to messaging throughput is message size. One can claim 10M
messages/sec, but for 1-byte messages, 10M messages/sec is hardly impressive.

I can say that on today's hardware, achieving 3-7M msgs/sec of single-stream,
guaranteed, in-order messaging is possible. This is also with 120-byte
messages, which amounts to about 3-7 Gbps of sustained messaging in clusters
of 2 to 200+ machines. What do you need to achieve this?

* Kernel bypass network architecture e.g. Solarflare 10/40gbe

* 10gbe switches

* decoupling of the transport layer onto dedicated processor cores

* busy-polling processes

* business logic placed on dedicated cores.

* use of shared memory and/or memory mapping to transport messages between processes.

It all sounds fairly convoluted to throw up processes and burn a core on
rx/tx, but if you need to process millions of messages a second at fairly low
latency, this is about the limit of where it goes. Using a traditional
application with a socket tied to a single process will get you to around 1
million messages/sec with some very large buffers and efficient business
logic, but that will be about it. The next step would be to start sharding
streams, but that comes with its own complexities.
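To make the "burn a core on rx/tx" idea concrete, here is a minimal sketch of one piece of that architecture: a single-producer/single-consumer ring buffer where the consumer busy-polls instead of blocking. This is an illustration only; the class name, ring size, and use of Python threads are my own assumptions, not anyone's production messaging stack (which would use shared memory and pinned cores, not a scripting language).

```python
import threading

RING_SIZE = 1024  # illustrative; real systems size this to cache/NUMA layout

class SpscRing:
    """Toy single-producer/single-consumer ring buffer.

    Only the producer thread writes `tail`; only the consumer thread writes
    `head`. Each side merely reads the other's counter, so a stale read is
    always conservative (the producer may think the ring is fuller than it
    is, the consumer may think it is emptier).
    """
    def __init__(self):
        self.slots = [None] * RING_SIZE
        self.head = 0  # next slot the consumer will read
        self.tail = 0  # next slot the producer will write

    def push(self, msg):
        # Busy-wait while the ring is full -- implicit back-pressure
        # on the producer rather than dropping messages.
        while self.tail - self.head >= RING_SIZE:
            pass
        self.slots[self.tail % RING_SIZE] = msg
        self.tail += 1  # publish only after the slot is written

    def poll(self):
        # Non-blocking check; a busy-polling consumer spins on this
        # in a tight loop instead of sleeping in the kernel.
        if self.head == self.tail:
            return None
        msg = self.slots[self.head % RING_SIZE]
        self.head += 1
        return msg
```

The point of the sketch is the shape, not the speed: the transport (push) and the business logic (whatever consumes poll) are decoupled, each can be pinned to its own core, and the only coordination is two monotonic counters.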

~~~
jandrewrogers
I think you are overstating the difficulty of achieving good network
throughput on modern hardware. Architecture definitely matters but saturating
10GbE is pretty achievable without going to extraordinary lengths. While doing
kernel bypass for networking is more efficient (and preferable for other
reasons), it is not necessary for pushing a huge number of messages in my
experience.

The biggest bottleneck tends to be the computational cost of parsing the
protocol format. Nonetheless, I've seen GeoJSON documents moved over networks
at per node messaging rates similar to those in the article.

~~~
itchyouch
In my experience, just passing messages around with dummy logic should be
yielding numbers about 10x greater, at around a tenth of the latency.

With architectural tuning, the messages/sec in the various scenarios can be
increased 5x (~4M msgs/s) while decreasing the average latency of processing a
message to the theoretical minimums of around 5-6 microseconds to get from
wire to application memory and back out to wire.

Even with efficient protocol formats (mold64), one reaches ceilings too soon.
The fact that the author mentions that they are neither CPU-bound nor network-
I/O-bound leads me to believe that they are probably bound by memory I/O and
losing efficiency to context switching.

------
jacques_chester
There are some typos and broken links. I like the code, though; lots of work
has been done.

It's worth bearing in mind that messages per second is important, but it's
easy to get fixated on benchmark porn.

Different queueing systems have different guarantees, disciplines and
semantics. These affect user-facing behaviour, which tends to be more
important at first blush than throughput.

I like zillions of messages per second as much as the next fellow. But
frequently you need to worry about things like:

* Are messages delivered once and only once?

* Are messages delivered at least once?

* Can messages be dropped entirely under pressure? (aka best effort)

* If they drop under pressure, how is pressure measured? Is there a back-pressure mechanism?

* Can consumers and producers look into the queue, or is it totally opaque?

* Is queueing unordered, FIFO or prioritised?

* Is there a broker or no broker?

* Does the broker own independent, named queues (topics, routes etc) or do producers and consumers need to coordinate their connections?

* Is queueing durable or ephemeral?

* Is durability achieved by writing every message to disk first, or by replicating messages across servers?

* Is queueing partially/totally consistent across a group of servers or divided up for maximal throughput?

* Is message posting transactional?

* Is message receiving transactional?

* Do consumers block on receive or can they check for new messages?

* Do producers block on send or can they check for queue fullness?

And there's probably a bunch more I've forgotten.
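One way to see that these questions are independent axes rather than a single "fast vs. safe" dial is to sketch them as a configuration object. Everything below is hypothetical: the enum and field names are made up for illustration and match no real broker's API.

```python
from dataclasses import dataclass
from enum import Enum

class Delivery(Enum):
    AT_MOST_ONCE = 0   # best effort, may drop under pressure
    AT_LEAST_ONCE = 1  # guaranteed, but may redeliver duplicates
    EXACTLY_ONCE = 2   # once and only once

class Ordering(Enum):
    UNORDERED = "unordered"
    FIFO = "fifo"
    PRIORITY = "priority"

class Durability(Enum):
    EPHEMERAL = "ephemeral"    # messages lost on restart
    DISK = "disk"              # every message written to disk first
    REPLICATED = "replicated"  # copies held on other servers

@dataclass(frozen=True)
class QueueSemantics:
    delivery: Delivery
    ordering: Ordering
    durability: Durability
    blocking_consume: bool  # block on receive vs. poll for messages

# The two ends of the spectrum described above:
metrics_queue = QueueSemantics(Delivery.AT_MOST_ONCE, Ordering.UNORDERED,
                               Durability.EPHEMERAL, blocking_consume=False)
start_app_queue = QueueSemantics(Delivery.EXACTLY_ONCE, Ordering.FIFO,
                                 Durability.REPLICATED, blocking_consume=True)
```

Each axis can be varied on its own, and each choice moves the throughput number a benchmark reports, which is exactly why raw messages/sec figures are incomparable without this context.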

The thing is that answers to these questions will fundamentally change both
the functional and non-functional nature of your queueing system.

For example, a queue system giving best-effort, unordered, non-durable
behaviour is going to run a _lot_ faster. It also pushes a lot of work onto
the application programmer. On the other hand, once-and-only-once, durable,
consistent queues are a lot slower and screech to a halt under most partition
conditions. But they also fit what most application developers expect to
happen upon their first encounter with queueing systems.

I work on a section of Cloud Foundry in my day job, and other teams have seen
that different tasks require different queueing approaches.

For example, stuff like metrics is still useful under conditions of dropped
messages, out-of-order messages and so on, because what's interesting is the
statistics, not any one single measurement.

But a message like "start this app" requires much higher guarantees of
ordering, durability, delivery certainty. People get mad if your PaaS doesn't
actually run the application you asked it to run.

So, just remember: queues are not queues. You need to compare delivered apples
with lossy oranges.

As a note, the author observes that MQTT provides an option to select which
delivery semantics you prefer (at-least-once, at-most-once / best-effort,
once-and-only-once), but I can't see which one the benchmark is run for.

~~~
zhenjl
Author here. A bit of a pleasant surprise to wake up and see this on HN.

Thanks for the detailed response. Here are some answers that will hopefully
clarify things a bit.

* Are messages delivered once and only once?

A: MQTT allows QoS 0 (at most once), 1 (at least once), and 2 (exactly once.)
The performance numbers in the blog are for QoS 0.

However, SurgeMQ implements all three and there's unit tests for all three. I
just haven't done the performance tests for QoS 1 and 2.
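The behavioural difference between the QoS levels can be shown with a toy model. This is not SurgeMQ's code; the class names and the "lose the first ack" scenario are invented purely to illustrate why QoS 1 (resend until acknowledged) can produce duplicates, and why QoS 2 needs receiver-side deduplication on top.

```python
class FlakyReceiver:
    """Receives every message but 'loses' the first ack per message id."""
    def __init__(self):
        self.seen = []         # every delivery, duplicates included
        self._acked = set()

    def receive(self, msg_id, payload):
        self.seen.append((msg_id, payload))
        if msg_id in self._acked:
            return True        # the retransmit's ack gets through
        self._acked.add(msg_id)
        return False           # first ack is dropped on the wire

class AtLeastOnceSender:
    """Toy QoS 1 flow: keep resending until an ack arrives."""
    def __init__(self, receiver):
        self.receiver = receiver

    def publish(self, msg_id, payload):
        while not self.receiver.receive(msg_id, payload):
            pass  # no ack yet -- retransmit

def exactly_once_view(deliveries):
    """Toy QoS 2 effect: deduplicate by message id at the receiver."""
    seen_ids, unique = set(), []
    for msg_id, payload in deliveries:
        if msg_id not in seen_ids:
            seen_ids.add(msg_id)
            unique.append((msg_id, payload))
    return unique
```

With QoS 0 the sender would simply not retry, so a drop means a lost message; with QoS 1 the retry loop above means the subscriber can see the same message twice; the dedup step is the extra bookkeeping QoS 2's handshake (PUBREC/PUBREL/PUBCOMP in the real protocol) pays for.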

* Are messages delivered at least once?

SurgeMQ supports it though the numbers posted are for QoS 0 (at most once)

* Can messages be dropped entirely under pressure? (aka best effort)

No. Currently no messages are dropped.

* If they drop under pressure, how is pressure measured? Is there a back-pressure mechanism?

See above.

* Can consumers and producers look into the queue, or is it totally opaque?

Not sure what this means... sorry.

* Is queueing unordered, FIFO or prioritised?

Ordered, FIFO... the MQTT spec requires that messages from publishers be
delivered in the same order to the subscribers.

* Is there a broker or no broker?

Brokered

* Does the broker own independent, named queues (topics, routes etc) or do producers and consumers need to coordinate their connections?

The broker uses topics to route. A publisher publishes to a topic; subscribers
subscribe to multiple topics with optional wildcards.
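The wildcard matching described here follows the MQTT spec's rules: `+` matches exactly one topic level, `#` matches all remaining levels and must be the last element of the filter. Below is a sketch of that matching logic; it is my own illustration, not SurgeMQ's actual matcher, and it skips spec edge cases like `$`-prefixed topics.

```python
def topic_matches(filter_str, topic):
    """Return True if an MQTT topic filter matches a concrete topic.

    Levels are separated by '/'. '+' matches any single level;
    '#' matches the current level and everything below it.
    """
    f_parts = filter_str.split("/")
    t_parts = topic.split("/")
    for i, f in enumerate(f_parts):
        if f == "#":
            return True            # multi-level wildcard swallows the rest
        if i >= len(t_parts):
            return False           # filter is deeper than the topic
        if f != "+" and f != t_parts[i]:
            return False           # literal level mismatch
    return len(f_parts) == len(t_parts)  # no trailing topic levels left over
```

So a subscriber to `sensors/+/temp` sees `sensors/room1/temp` but not `sensors/room1/hum`, and `sensors/#` sees everything under `sensors`. A broker's routing table is essentially this check run efficiently (usually via a trie keyed on topic levels) against every subscription.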

* Is queueing durable or ephemeral?

Ephemeral currently. Though MQTT spec requires that any unack'ed QoS 1 and 2
messages be redelivered when the server restarts or client reconnects. So once
SurgeMQ meets that spec, it could be considered somewhat durable.

* Is durability achieved by writing every message to disk first, or by replicating messages across servers?

N/A

* Is queueing partially/totally consistent across a group of servers or divided up for maximal throughput?

Currently SurgeMQ is a single server w/ no clustering ability. However, MQTT
spec does mention the bridge capability that is a poor man's cluster. Not yet
implemented.

* Is message posting transactional?

QoS 1 and 2 starts to be more transactional. QoS 0 is strictly fire and
forget.

* Is message receiving transactional?

See above.

* Do consumers block on receive or can they check for new messages?

Block on receive.

* Do producers block on send or can they check for queue fullness?

Block on send.

~~~
jacques_chester
Great reply.

I wasn't expecting to make you fill out a survey. I was just trying to show
off scars I and others I know have accumulated over the years :)

~~~
zhenjl
Heh. No worries. Excellent list of questions. Like antirez, I am evernoting
(is that a verb now?) this for future reference.

------
curiously
How does this compare to RabbitMQ?

