ZeroMQ: Modern & Fast Networking Stack (igvita.com)
128 points by igrigorik on Sept 3, 2010 | 44 comments

I've played with ZeroMQ on some small projects, and I've been quite impressed by it. Basically, ZeroMQ was designed to send millions of small, mostly-asynchronous messages at extremely high speeds. If you feed it messages very rapidly, it sends the first message immediately, and queues the rest in userspace until (IIRC) the network card is ready to transmit another packet. Then it sends all the queued messages as a batch. So you get both high throughput and low latency.
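
The batching behaviour described here can be sketched roughly as follows. This is a toy model in Python, not ZeroMQ's actual implementation: `BatchingSender`, `transport_ready`, and the `wire` list are hypothetical names invented for the illustration, and the "transport" is simulated.

```python
from collections import deque


class BatchingSender:
    """Toy model: send immediately when idle, otherwise queue and flush as one batch."""

    def __init__(self):
        self.pending = deque()  # messages queued in userspace
        self.busy = False       # True while the simulated transport is transmitting
        self.wire = []          # each entry is one batch handed to the transport

    def send(self, msg):
        if self.busy:
            self.pending.append(msg)   # transport busy: just queue the message
        else:
            self.busy = True
            self.wire.append([msg])    # transport idle: send right away

    def transport_ready(self):
        """Called when the simulated transport can accept more data."""
        if self.pending:
            self.wire.append(list(self.pending))  # flush the whole queue as one batch
            self.pending.clear()
        else:
            self.busy = False


sender = BatchingSender()
for m in ("a", "b", "c", "d"):
    sender.send(m)
sender.transport_ready()
print(sender.wire)  # → [['a'], ['b', 'c', 'd']]
```

The first message goes out alone (low latency), and everything that arrived while the transport was busy goes out together (high throughput).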

There's some built-in support for common topologies (client/server, publish/subscribe, pipeline), and if you're willing to use the undocumented XREQ/XREP socket types, you can build more exotic topologies.

Most of the value in ZeroMQ is the actual C++ implementation, so other languages generally just wrap the C++ code. The zeromq API is tiny.

I haven't used it on a big production project yet, but I hope to do that soon.

Well, the idea that simply 'queueing the rest in userspace' is OK illustrates a pretty disconcerting lack of understanding in writing networking code. This is not an acceptable solution in the real world.

I want to correct that assumption right now - it's one of those places that higher level networking libraries often fall down - because simply 'fire and forget' is a leaky abstraction.

Let's say you're streaming a large file over TCP to a client. Generally, disk reads are far faster than you can send data over the internet. If you naively keep sending data, you'll quickly fill up the OS buffers - and begin to consume memory as it's queued on the client side. In an extreme case, if you're sending a multi-GB file, you'll probably end up with most of the file sitting in RAM (assuming you don't run out of RAM and start thrashing). And for what? You really only need to load additional data in from the file as fast as you can send it in the socket.

In a real-life case, streaming multiple files to multiple clients, you'll quickly run out of RAM with even moderate size files.

Some high level libraries provide an event to let you know that the OS buffers have been emptied out. For example, Node.js has a drain event. But this has to be handled explicitly, and isn't really highlighted in the docs. But if you write your code naively, you will definitely run into problems.
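
To make the point concrete, here is a minimal sketch in Python of reading from the file only as fast as the socket accepts data. It relies on a blocking `sendall()` for backpressure rather than an event like `drain`; `stream_file`, `drain_socket`, `demo`, and the 64 KiB chunk size are assumptions made for the example.

```python
import io
import socket
import threading

CHUNK_SIZE = 64 * 1024  # assumed chunk size for this sketch


def stream_file(fileobj, sock, chunk_size=CHUNK_SIZE):
    """Send fileobj over sock one chunk at a time; returns the number of bytes sent."""
    sent = 0
    while True:
        chunk = fileobj.read(chunk_size)  # at most one chunk is held in memory
        if not chunk:
            break
        sock.sendall(chunk)               # blocks while the OS send buffer is full
        sent += len(chunk)
    return sent


def drain_socket(sock, sink):
    """Reader side: collect data until the peer closes the connection."""
    while True:
        data = sock.recv(65536)
        if not data:
            break
        sink.append(data)


def demo(size=1_000_000):
    """Stream `size` bytes through a local socket pair and report both byte counts."""
    a, b = socket.socketpair()
    received = []
    reader = threading.Thread(target=drain_socket, args=(b, received))
    reader.start()
    sent = stream_file(io.BytesIO(b"x" * size), a)
    a.close()        # signals end of stream to the reader
    reader.join()
    b.close()
    return sent, sum(len(part) for part in received)
```

Because the next `read()` only happens after the previous chunk has been accepted, memory use stays at roughly one chunk per transfer no matter how large the file is.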

While it's great that many of these high level networking libraries have tried to abstract away many of the complexities of network programming, they're also a trap for the unwary. Just remember that they aren't a silver bullet that let you handle sending data over the internet as a simple function call.

Have you checked out the ZeroMQ documentation on flow control and data rate limiting? I can't speak for anyone else's use case, but there's a nice selection of blocking primitives and data rate limits that have proven sufficient for the smaller programs I've written.

It's hard for me to tell if your criticisms are based on real-world experience with the library, a review of the available documentation, or just my description of a single feature intended to reduce the cost of traversing the userspace/kernel barrier.

I'm definitely interested in real-world experience with ZeroMQ and any problems you've encountered. We may begin some larger-scale testing soon, and would love to know about any problems.

Instead of rate-limiting, there are two other options that solve the GP's concerns: the high water mark, which controls the maximum queue length, and the swap size, which saves messages to disk when the queue is full.


If you are streaming a large file over TCP, ZeroMQ, with its emphasis on small messages sent and received apparently atomically, would be a stunningly poor choice.

It's also wholly unsuitable as a floor wax or a dessert topping. :)

Does your criticism also apply to its intended use case? At a glance, it doesn't strike me as an unreasonable design choice, but I haven't thought about it deeply.

What would be the best way of sending large files server-to-server? This is a question that has been puzzling me.

Is there a library or a protocol that is tuned towards transferring multi-GB files?

Are you intending to have many of these transfers happening in parallel, or are you sending a single file and hoping to saturate your link? Do you have full control of both ends, or only the server? Are you streaming or is the whole file always transferred? Do you need encryption?

For simultaneous large files with no special needs, my instinct would be to just choose any lightweight HTTP server that supports sendfile(). Basically, pass it off to the kernel, and let it handle it. But maybe there are more subtle keys to good large file performance: http://www.psc.edu/networking/projects/tcptune/
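
As a rough illustration of the "pass it off to the kernel" idea, Python's standard library exposes this as `socket.sendfile()`, which uses `os.sendfile` where the platform supports it and falls back to plain `send()` otherwise. `send_whole_file` is a hypothetical helper name, and the sketch assumes a connected, blocking stream socket.

```python
import socket


def send_whole_file(path, sock):
    """Send the file at `path` over `sock`; returns the number of bytes sent."""
    with open(path, "rb") as f:       # the file must be opened in binary mode
        return sock.sendfile(f)       # lets the kernel copy file data to the socket
```

The application never loads the file contents into its own buffers, which is the main attraction for large transfers.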

For cases where you are controlling both ends, you might find useful suggestions here: http://serverfault.com/questions/111813/what-is-the-best-way... In particular, Tsunami and uftp look interesting.

0MQ is built on the good old principles the Internet stack (TCP, IP) is based upon. Thus it has tx/rx buffers. It doesn't even try to do fire-and-forget, because -- as you correctly say -- it's a leaky abstraction. Instead it aims for end-to-end reliability. And yes, moving it into the kernel would make it somehow more reliable. Watch for my talk at Linux Kongress (September 23rd), which deals with moving the functionality into the kernel, standardisation and integration with the Internet stack.

0MQ has what is called a high-water mark on (outgoing) queues. You can then decide whether to block if the queue gets full, or poll the socket to know when it's ready for output. This is also how TCP works. 0MQ abstracts the complexities of the network that you don't want to see, but it exposes those parts you still need to. It would make a very nice transport for file transfer, and I'll probably add such an example to the 0MQ Guide.
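
The two choices at the high-water mark (block, or find out that the queue is full and try later) can be modelled with a bounded queue. This sketch uses Python's `queue.Queue` as a stand-in and does not call the 0MQ API; `HWM`, `try_send`, and `blocking_send` are hypothetical names.

```python
import queue

HWM = 3  # hypothetical high-water mark for the outgoing queue

outgoing = queue.Queue(maxsize=HWM)


def try_send(msg):
    """Non-blocking send: returns False instead of growing the queue past the HWM."""
    try:
        outgoing.put_nowait(msg)
        return True
    except queue.Full:
        return False


def blocking_send(msg, timeout=None):
    """Blocking send: waits for room; raises queue.Full if `timeout` expires first."""
    outgoing.put(msg, timeout=timeout)


results = [try_send(i) for i in range(5)]
print(results)  # → [True, True, True, False, False]
```

Either way the sender learns that the consumer is falling behind, so memory use is capped by the mark rather than by how fast the producer runs.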

Try it, you'll be surprised, I guarantee it.

What OS are you using that has infinite sized TCP send buffers?

xrep/xreq really isn't well documented. I'm writing a learn by example series for ZeroMQ in Ruby, that already includes an XREP example. I'm planning on adding in XREQ this weekend.


Thanks, man. Very helpful indeed!

It looks like there's still no way for applications to detect when ZeroMQ encounters common networking problems. If my application can't differentiate between "no response received" and "network error", I'll end up re-implementing timeouts and error detection logic in the application protocol. In most situations that's a waste of time and adds extra weight to the system. No thanks.

In most cases 0MQ will recover silently (and usefully) from common networking problems. When a peer crashes, for example, and then comes back, its partners don't see the problem. Messages get queued, and then delivered. This works for the main socket types and transports (but not PAIR and inproc:)

This lets us do things like start clients and THEN start a server... the clients automatically connect when the server comes along. The server can go away, be replaced by another on the same endpoint, and clients will gracefully and invisibly start talking to the new server.

In some cases this is precisely what we want, in other cases it's not. If we need to detect dead peers, we add heartbeating as a message flow on top of 0MQ. Most larger 0MQ applications currently do this. Eventually 0MQ sockets may offer heartbeating, it seems a natural evolution. ("Seems" but is not necessarily.)
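
A heartbeat layer of the kind mentioned here boils down to remembering when each peer was last heard from. The following Python sketch shows only that bookkeeping; `HeartbeatMonitor`, its methods, and the timeout value are hypothetical, and sending the heartbeat messages themselves is left out.

```python
import time


class HeartbeatMonitor:
    """Tracks when each peer was last heard from and reports peers silent too long."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout   # seconds of silence before a peer counts as dead
        self.clock = clock       # injectable so the logic can be tested without sleeping
        self.last_seen = {}

    def on_message(self, peer):
        """Call for every message received from `peer`, heartbeat or otherwise."""
        self.last_seen[peer] = self.clock()

    def dead_peers(self):
        """Return the peers that have been silent for longer than the timeout."""
        now = self.clock()
        return sorted(p for p, seen in self.last_seen.items()
                      if now - seen > self.timeout)
```

An application would call `on_message` from its receive loop and check `dead_peers()` periodically, deciding for itself what to do about a peer that has gone quiet.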

Additionally, there are some patterns (like synchronous request-reply) that simply don't handle network issues properly. If your service dies while processing a request, your client, if it does a blocking read, will hang forever. There are work-arounds such as using non-blocking reads.
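
One shape such a work-around can take is a read that waits for a bounded time instead of forever. The sketch below uses a plain Python socket with `select`, not a 0MQ socket; `recv_with_timeout` is a hypothetical helper.

```python
import select
import socket


def recv_with_timeout(sock, timeout, bufsize=4096):
    """Return received bytes, or None if nothing arrived within `timeout` seconds."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None            # caller can retry, fail over, or give up
    return sock.recv(bufsize)
```

The client stays in control: a `None` result means "no reply yet", and the application decides how many times to retry before treating the service as gone.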

It would be a mistake to try to solve all challenges at once in any project. 0MQ is taking it step by step, starting with the core questions of how to design massively-scalable patterns correctly. Just doing that is worth gold, as you can see from people actually using 0MQ, who consistently tell us, "this makes our life orders of magnitude easier, thank you".

And as we solve the core problems properly, we'll continue to address other aspects, either in 0MQ core or in layers on top of that, and this process will continue ad infinitum.

That's the nature of networking. You can never differentiate between "network error" and "no response received". TCP is no better. You'll have to accept that or stick with a single box.

ZeroMQ is quickly becoming an even bigger hammer in the premature optimization planet of Newbo-Thumbia.


1) It's a networking library; no admin tools or other soft handle-bars, like user-space utilities.

2) It uses a binary protocol. Good luck debugging that with syslog.

It's a very powerful tool in the hands of a capable systems architect, who actually needs it. For the rest, it's pretty much like an adult male tiger; excellent to watch in its natural habitat from a safe distance, terrible pet idea for you and your fiancee (and not because you live in a studio apartment.)

It's got absolutely nothing to do with optimization, it just happens to be rather fast. As far as I know, there is no other tool out there, fast or slow, that offers zeromq's level of simplicity.

1) It's a networking library

Yes, but one that's easy to use. Versus BSD sockets, which have an annoying array of quirks when programming for multiple platforms.

Moreover, it's consistent whatever your transport. One thing I've found using zeromq is that, because it's so simple to use, it makes you want to think in terms of message passing. I use a queue to push messages between threads at practically no cost. Then later if I feel like it I can shunt the consumer to a different process, or a different host, or several different hosts with a load balancer. I could do the same with raw sockets, but I'd rather eat my own leg with a spoon.

2) It uses a binary protocol. Good luck debugging that with syslog.

It's just a packet. You can put whatever you want in it: XML, JSON, protocol buffers, msgpack, JPEG.... The wire protocol is trivial.

It also has jolly nice bindings for Python, Ruby, Lua etc, allowing you to bridge between C or C++ and your scripted components with zero hassle.

Go on, give it a go.

"As far as I know, there is no other tool out there, fast or slow, that offers zeromq's level of simplicity."

STOMP! Stomp is simpler than 0mq, but you have a central server instead of point to point.

Not really. Compare 0MQ's wire protocol:


and STOMP:


Is it? I thought this was a level of abstraction higher? The premature optimizer would want access to the raw sockets this is built on top of. I could be misunderstanding, though, I've never used it. The idea I have in my head is,

ZeroMQ: Send a message, other side either receives full message as a single payload (or nothing at all? or an error is raised? I'm not totally sure).

Sockets: write(), read() and friends. Other side has to make sure he has the whole message, otherwise read() again and wait, deal with errors on its own, etc.
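
To show what "make sure he has the whole message" means in practice with raw sockets, here is a minimal length-prefixed framing sketch in Python. The 4-byte big-endian prefix is a choice made for this example and is not ZeroMQ's wire format; `send_msg`, `recv_exact`, and `recv_msg` are hypothetical helpers.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian length prefix (chosen for this sketch)


def send_msg(sock, payload):
    """Send one message: length prefix followed by the payload bytes."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    """Keep calling recv() until exactly n bytes have arrived."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock):
    """Receive one whole message, however TCP happened to split it up."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

This is the bookkeeping a message-oriented library does for you: the receiver gets whole messages back instead of an arbitrary slice of the byte stream.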

I have seen higher-level libraries than Zero, I have seen middle-ware with better persistence than Zero, I have seen networking libraries better integrated with their languages than Zero. But I haven't seen a faster beast.

It's right there in their <title/>: "Fastest. Messaging. Ever".

The great majority of the people would just benefit from learning socket programming better, and sticking to something native to their tongue. Use the messaging infrastructure that integrates best with your development process, or opt for something with more padding, like Rabbit.

Call me weak, but I like infrastructure software whose guts I can peek, lsof, cat, grep, log, strace and diff. Stuff that I can take a snapshot of and reproduce. Network services should be transparent to the application code base. I wanna poke a hole in my firewall, give some hint to inetd, a few lines of cron and be done with it. I simply don't trust my own software enough to be sure I will not screw up the infrastructure.

I completely agree with the desire to be able to use Unix As It Should Be.

That said I am a little envious when I read that I can send("bunch of shit") and receive() and call it a day. I wouldn't call that premature optimization, nothing is going to beat raw sockets and domain specific knowledge.

That said, since you brought up AMQP (we use Rabbit, too, I'm a fan) have you seen this?: http://lists.zeromq.org/pipermail/openamq-dev/2010-March/001...

First, function level semantics and brevity are not exactly differentiators when most anyone can wrap the socket API with the class/parametric/duck-typed interface of their choice (what is github if not a pile of DWIM wrappers.) See how Twisted did wonders for network programming at the semantic level.

Also, I have already stated that it was a powerful tool in the hands of a "capable systems architect". Besides, I am a closet iMatix fan ;-)


All I wanted to point out is that ZeroMQ should not be treated as another "plugin" that people are encouraged to use willy nilly. I would use it in high-performance applications in financial trading, VOIP, scalable game servers, etc.

But these are not what most people are working on. Blog posts like these just encourage the ADHD bleeding-edge types to take another stab at their own thighs in the unjustified pursuit of "web scale". And the moment SmashingMag gets a hold of Zero, we will see the most gruesome newbie bloodshed since the battle of WordPress.

Lol. Actually I believe 0MQ (or rather, the principles it embodies) is something all serious developers need to learn. How to do concurrency properly, how to get above TCP sockets, how to design simple APIs, how to do open source properly, etc. It's hard to see that exposure to 0MQ would make anyone a worse developer.

As a programmer, using 0MQ is Better than Sex. You can quote me on that. It would be unethical to deny such fun to anyone. What beginners need, with any new technology, is a set of clear recipes that they can implement without having to invent everything. This is why books like Stevens' trilogy were so vital to getting TCP out of labs and into the grubby hands of ADHD bleeding-edge types.

We're working on that, with books like http://www.zeromq.org/docs:user-guide. That'll grow to cover all of 0MQ eventually.

Speaking of AMQP/RabbitMQ there's an integration project here: http://github.com/rabbitmq/rmq-0mq. The idea is that you can use RabbitMQ as a broker for 0MQ networks.

Suppose you wanted to build a private-messaging system for a website, where users can send PMs to each other. Would zeromq be a good candidate for that (as the backend)?

Would you need to use a broker for that?

No, it wouldn't be a good choice because it's not persistent in the physical, written-to-disk sense. It's an in-memory queue.

And no, you don't need a "broker" for that. The messaging middleware application of ZeroMQ, and other middleware, is not the same "messaging", in the human communication sense. The former can be used to implement the latter, but typically it's for inter-application messaging and communication.

In your case, I recommend you use a traditional RDBMS and do something like

   CREATE TYPE message_status AS ENUM ('draft', 'sent', 'new', 'read', 'archived', 'deleted');

   CREATE TABLE messages (                      -- table name is illustrative
      messageId   SERIAL,
      senderId    VARCHAR(20) NOT NULL,
      recipientId VARCHAR(20) NOT NULL,
      timestamp   TIMESTAMP DEFAULT NOW(),
      message     TEXT,
      status      message_status,           -- lifecycle state, see the enum above
      PRIMARY KEY (messageId)
   );

Then a message "send" is just a row insertion. To refresh the display in semi-real-time fashion, you can have some JavaScript on the client side poll your server periodically, assuming you don't have Comet or some other persistent connection.

If you have a persistent connection, then the code that inserts into the database also puts a ticket in some "refresh-display queue" in the following manner. All logged-in users are eligible for message status updates, i.e. those that have an active session will get a "you have new email" message. You will have an alerter process/thread/job that runs in the background and reads those message update tickets. It takes the recipient ID and updates that user's display, i.e. pushes.

Typically, you would use ZeroMQ in just that latter part; as a queue of updated-message statuses, and the queue is consumed by a background job that just notifies a bunch of remote users that they have new updates. JavaScript on the client side takes care of the display update/modification (typically a DOM operation, though you could also have a Flash or Java applet consume pushed data; those latter two can actually keep persistent connections, but watch out for session length or you might quickly run into file descriptor lossage.)
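
The insert-then-notify flow described above might look roughly like this in Python. It is a sketch under several assumptions: an in-memory SQLite table stands in for the RDBMS, `queue.Queue` stands in for the "refresh-display queue" (where ZeroMQ would sit), and the `notified` list stands in for actually pushing to a user's connection. All names are hypothetical.

```python
import queue
import sqlite3
import threading

db = sqlite3.connect(":memory:")  # stand-in for the real database
db.execute("CREATE TABLE messages ("
           "id INTEGER PRIMARY KEY, sender TEXT, recipient TEXT, body TEXT)")
tickets = queue.Queue()           # stand-in for the "refresh-display queue"
notified = []                     # stand-in for pushing to connected clients


def send_message(sender, recipient, body):
    """A 'send' is a row insertion plus a ticket for the alerter."""
    with db:  # commits on success
        db.execute("INSERT INTO messages (sender, recipient, body) VALUES (?, ?, ?)",
                   (sender, recipient, body))
    tickets.put(recipient)


def alerter():
    """Background job: consume tickets and notify each recipient."""
    while True:
        recipient = tickets.get()
        if recipient is None:       # sentinel value: shut down
            break
        notified.append(recipient)  # a real system would push over the user's connection


worker = threading.Thread(target=alerter)
worker.start()
send_message("alice", "bob", "hi")
send_message("bob", "alice", "hello")
tickets.put(None)
worker.join()
print(notified)  # → ['bob', 'alice']
```

The database remains the durable record; the queue only carries "go look" tickets, so losing one costs a delayed refresh rather than a lost message.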

The easy deployment, easy snooping and administrative tools are all big wins for STOMP on ActiveMQ, even if it doesn't hit millions of messages a second.

Yup. I'm sure every network programmer who's worth his salt has a version of a stackable abstraction layers library that runs off epoll or a variation thereof, and supports various transports from plain IP through domain sockets and to exotics like TLS over ICMP. It's like a rite of passage into the circle of network programming enlightenment :)

Then they'd better find a new rite of passage, because ZeroMQ is turning into a category killer.

So you are saying that the existence of a good library should discourage other people from writing their own? You are so missing the point.

Did the existence of TCP discourage people from writing their own implementation of OSI layer 4? Not really. But those who do so are a small -- I mean really small -- minority.

I don't understand why HLLs don't have some sort of standard abstraction on top of sockets. Is it because it's a fundamentally "leaky abstraction" or should it be done? Seems to me that most HLL attempts to abstract networking try to be too high level.

Plan 9 has dial[1] and friends, which kinda does to network connections what printf did for output. And, not surprisingly, something similar is in Go[2]. Of course, we're talking libraries here, so if you really want it in the language, then you're out of luck. Although any sufficiently advanced language should make this transparent enough anyway.

[1]: http://swtch.com/plan9port/man/man3/dial.html

[2]: http://golang.org/pkg/net/#Conn.Dial
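
For a feel of what a dial-style call gives you, here is a small Python imitation that accepts a Plan 9 style `net!host!service` string. It is only a sketch: this `dial` is a hypothetical helper, it handles `tcp` with a numeric port only, and it is not a port of the Plan 9 or Go functions.

```python
import socket


def dial(addr, timeout=None):
    """Connect using a 'net!host!service' string; only numeric-port 'tcp' is handled."""
    net, host, service = addr.split("!")
    if net != "tcp":
        raise ValueError("this sketch only supports tcp")
    # create_connection resolves the host and tries each resulting address in turn
    return socket.create_connection((host, int(service)), timeout=timeout)
```

The appeal is that the whole address, transport included, travels as one string, much as a format string does for printf.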

Java EE has something very similar to 0MQ in the form of JMS.

I'm just throwing out something random here, but I imagine the reason no one else really does this in standard libraries is because it's quite complicated and can easily lead to fragile protocols. Take AMQP, for example, it took something like 5 years to agree on their 1.0 standard.

JMS does cover some of the same ground, all messaging does. But it's basically an API wrapper around two ancient technologies, a queue system and a topic system. I think it might have been IBM MQSeries (which became WebSphere MQ) and some other product but can't find the details.

What JMS achieved was to somehow turn these two very different semantics into a single one, based around "destinations". That was very clever but also makes JMS weirdly complex to use because the queue (request-reply) pattern and the topic (publish-subscribe) pattern just don't work the same way. We designed AMQP originally by taking the JMS spec and reverse engineering that into a wire level protocol. Then around version 0.3 we threw out destinations and came up with a generic wiring model based on exchanges, bindings, queues. That was my summer holiday in 2005.

AMQP is BTW still some way away from a 1.0 standard and mainly it's been five years trying to get reliability working in the center of the network. That always seemed destined to failure, as I explained in a presentation to the working group in 2006: http://www.zeromq.org/whitepapers:switch-or-broker.

AMQP and JMS both focus on resources held in the center of the network. It's familiar to anyone using HTTP as their stack. Thin stupid clients talking to big smart servers. 0MQ turns the focus to smart clients working across a thin, stupid (but massive) network.

Both approaches have their value. We (iMatix and the 0MQ community) tend to believe we can do a lot more, faster and more cheaply, using the distributed approach.

and "While iMatix was the original designer of AMQP and has invested hugely in that protocol, we believe it is fundamentally flawed, and unfixable."

Java EE has something very similar to 0MQ in the form of JMS.

I've worked with JMS, and no. Maybe my implementation was flawed. (Weblogic) IIRC, you needed 5 different objects of 5 different classes just to do anything.

"Wouldn't it be nice if we could abstract some of the low-level details of different socket types, connection handling, framing, or even routing?"

Perhaps I'm missing something, but wouldn't this make ZeroMQ a Modern & Fast abstraction of a Networking Stack?

If you read the 0MQ Reference Manual at http://api.zeromq.org/zmq.html you'll see that it specifies the abstraction, while the code implements that abstraction.

This is yet another well-written post. I think Ilya has that rare talent for accurate and engaging technical writing. I'm consistently pleased with his work. Thanks.

Good article. Please consider changing the font-type and increasing the font-size.

Cmd+? We have the technology.

Thanks. I really didn't know that.
