ØMQ - The Guide (zeromq.org)
203 points by brudgers on May 31, 2015 | 66 comments

This is probably one of the finest documentations of all time. Doesn't just guide you through the features of ZeroMQ, but actually teaches you about the horrendously complicated task of programming distributed systems in an elegant way. Everybody should have a read!

The 0MQ guide is one of the best pieces of documentation ever written. I read it years ago and continue to refer to it periodically for both projects that I use 0MQ for, and projects I don't.

It's simply a rad guide. Highly recommended.

Thank you for this, I appreciate it. The Guide took a long time to write, around a year. It was the first ZeroMQ project where I merged all pull requests without code review. The results were amazing: most of the examples are now in many languages, a Rosetta stone of distributed programming. This was what convinced me that a "merge first, fix later" approach would work well more broadly (and it has).

I'm rewriting the Guide slowly, as the software has changed and we've found nicer, simpler ways to solve some of the harder problems.

There is a lot of new ZeroMQ-related material on my blog, at hintjens.com.

And thanks for the culture&empire writeup too :)

I find I compare all technical writing nowadays to the TeXbook. This has a similar style in that it is highly technical and personal at the same time.

I started 0MQ last year and the documentation was quite a pleasure to read and the whole thing has been great so far.

My general experience with 0mq in the past (with a past team) has not been great. While a nice abstraction library over sockets, sending it the wrong parameter is quite likely to crash in painful ways in rare instances. Maybe it's gotten better over the past year.

Yes, it's fast. But if you want a message bus and are going to want advanced features, I'd recommend a proper message bus.

It's going to abstract away some of the socket details, but you can still get bitten pretty easily - again, may have gotten better in the last several months, I don't know.

Also be careful with distros carrying older versions; for the same reasons, you probably want the latest release rather than whatever your distro ships.

The original codebase used asserts rather poorly, for error handling rather than internal sanity checks. Four years ago we had fixed most of these problems, and had a "futzing" project to try to crash the library. Over time we've removed the edge cases... the codebase did take a long time to become properly stable, one of the downsides of lots of new code written rapidly, and one reason we insist on small gradual patches these days. (There have been _very_ few new errors introduced since we switched to our decentralized contribution process in early 2011.)

Good to know. For what it's worth, the majority of my bad experiences were in 2013 to late 2014, and usually through the python bindings.

Usually calling the functions in different ways was able to dodge the problem. It's not always obvious what caused it to crash, but it was an "if it hurts when you do that" sort of thing. Often these types of errors happened only under heavy traffic and were hard to replicate except under stress or lots of repeated calls.

If you're not using the Python bindings, it might go better for most people.

Has the problem with req/rep over lossy connections been solved (http://lucumr.pocoo.org/2012/6/26/disconnects-are-good-for-y...)? My understanding was that you couldn't detect disconnects because ZMQ doesn't expose the internal state.

I've also worked with ZMQ in the past, and our experiences were a little better because we didn't use it as a message bus, but rather as a transport medium between modules in our proprietary video decoder software. It basically took away a lot of the boilerplate code you have to write around socket programming, which was really nice.

If you're looking for an AMQP-compliant message bus that can be federated and clustered, RabbitMQ is pretty damn amazing.

RabbitMQ is only amazing if your cluster is 100% available, you have no soft lockups, no high load, etc. Its cluster failure handling is absolute crap.

In our stack it's the buggiest, most unreliable piece of software we are running, and it's falling over (losing cluster state, hanging, crashing, etc) on an almost daily basis.

We used to run 8 nodes, and it was a nightmare recovering from partitions. Now we've reduced it to 2 per cluster, and it's still a nightmare.

Just recently I found out that nanomsg is the successor of ZeroMQ. Does anyone have an opinion about this new lib? I am curious how mature and stable it is.

Nano is excellent in many ways, though it's not a successor, that's just bombast from its author. ZeroMQ is far more than a library. It is a large community and hundreds of projects, e.g. independent ZeroMQ versions in Java, .Net, C++, Erlang, C. Nano is one library and bindings.

Take a look at the git repositories and the email list and you really see the significant differences. These aren't technical. They're why the ZeroMQ community merges pull requests rapidly, and products tend to be stable on master, while Nano has pull requests hanging for months, and isn't stable.

Nano should have been compatible with ZeroMQ, then it would have been a successor and the protocols could have evolved over time to become what Sustrik wanted. Instead it tried to define itself by one-upping ZeroMQ, missing many opportunities. ZeroMQ has so much stuff built on top. Nano? So little.

This is not accidental or random. The ZeroMQ community is welcoming to people, and every effort is welcome, unless it's really disruptive. Nano is just the wrong side of hostile, arrogant, and insecure and it shows in the arguments on the mailing list (which we just don't get on zeromq-dev).

Edit: by "not stable", I refer to its "0.5 beta" status, not any technical instability. The point being there is no guarantee of forward compatibility.

Small correction: There is a pure Go implementation of nanomsg (https://github.com/gdamore/mangos).

Ah, indeed, thank you. This is encouraging and I'd like to see more of this. We need spaces for experimentation and competition, and the more libraries attack this "NoBroker" space, the better. I do wish Nano would stand on its own merits rather than the cheap "why we're better than ZeroMQ" marketing. Also I'm somewhat disappointed that Sustrik isn't using Nano in his own new projects (like libmill). That's not a great vote of confidence.

> I do wish Nano would stand on its own merits rather than the cheap "why we're better than ZeroMQ" marketing.

IIRC, 0MQ launched with quite a lot of "why we're better than AMQP" marketing.

True, and perhaps Nano was inspired by this. However look carefully and you will see that we never, ever, criticized RabbitMQ. I know the team that built it, they are friends, and we both invested crazy amounts of time in trying to keep AMQP alive. RabbitMQ is an excellent product, built with cool technology. The protocol is not so good. I designed it, I know its weaknesses. It is way too complex, and strangled by a committee that could not innovate and preferred to bully and lie its way to a "1.0" version that broke everything and delivered so little it's barely a messaging protocol, let alone "advanced".

It is fun to compare products. The competition between RabbitMQ and ZeroMQ was deliberate, useful, and good for both projects. However for a breakaway project to claim "we're the successor" is downright silly IMO. Hostile and negative, and sets a terrible tone for a young project. And if you can't achieve that, what then?

I find this rather sad, and looking at Nano's commit history, another of those entirely predictable stories. Why would you want to supersede the ZeroMQ community? It's large and successful and friendly. Why not simply make a better ZMTP engine? Make it smaller, cleaner, compatible, and then over time, improve the protocols... simple and undramatic and guaranteed success.

But no, we need drama and argument and hostility and... no matter how good the code, the outcome is that contributors don't stick, the core developer gets burnt out, and the project dies.

It happened to Crossroads, and Nano appears to be really just the same, in C. The waste in time and effort is sad. We need projects like Nano. We need space for new experiments. We need choice and competition.

I've spoken to some Nano contributors who are replacing Sustrik (who seems to have abandoned the project), and things may improve. One hopes.

There's nothing as comprehensive as the ZMQ guide for documentation, but here[1] is a good place to get started. It seems fairly stable API wise at this point.

I haven't used it much, since there's no way to switch from ZMQ without losing wire-protocol compatibility, and I haven't started any new projects that need IPC recently.

1: http://tim.dysinger.net/posts/2013-09-16-getting-started-wit...

I built our entire monitoring infrastructure on top of nanomsg and nodejs; previously I used zmq. Over the last 3 months, I've had absolutely no issues with a bit over 200 servers communicating with each other.

Pretty cool stuff.

I used nanomsg to have 20+ servers listening to a queue from a single master.

I found that over time the memory consumption would rise horridly and I had to stop using it. I've just had another sanity-check of my code and don't spot an obvious leak, but I'm willing to believe it is there:


Otherwise though using nanomsg was a pleasure, and I have it in use elsewhere without problems.

it's actually not a successor, but a fork by one of zmq's core devs, Martin Sustrik.

there is a writeup in the docs on how nanomsg differs from zmq: http://nanomsg.org/documentation-zeromq.html

The HTML for this page is 2.7MB. Somebody on the train to FOSDEM was telling me that they use it as a test case for web browsers and HTML parsers.

That's kind of cool. It's a full, large book complete with examples with full syntax coloring. And still fits into less than 1/100th of a cent of disk space.

And the images add only another 750 KB. Rendered in Chrome it takes 98.7 MB (compare with ~250 MB for a typical G+ page on load, often growing to 2+ GB, just sitting there).

This is an exceptionally strong argument for well-coded, self-contained, static pages.

I've been thinking about using ZeroMQ for RPC, but the one problem it doesn't solve is discovery, i.e. finding a peer to talk to among a dynamic pool over which you'd like to load-balance. zbeacon looked promising until I realized it uses UDP broadcasts.

You can do it with ZooKeeper or Consul, but then you have to roll that glue for every app, or write a library which only you are going to end up using, and you'll end up doing it for every language you implement in. We write a lot of microservices, in all sorts of languages, and we'd like to move away from a centralized load-balancer, which is both a bottleneck and a SPoF.

(gRPC looks like a great start, but I'm puzzled why they chose to ignore/defer discovery. Plain one-on-one RPC is trivial; the difficult part is making it completely distributed and fault-tolerant, neither of which gRPC seems to address at all.)

Discovery is one of those problems with dozens of answers that depend on your context. For many cases, UDP broadcast/multicast is perfect. For others, a broker can help a lot (ZeroMQ has a broker, called Malamute, that is easy to use for discovery). For other cases, you need a mix of stable and unstable pieces, and so CZMQ has a zgossip class that does this.

It's an emergent problem. One good pattern that's emerging is clusters of clusters; e.g. each box supports a cloud of small services that talk to each other and to a single broker on the box, and the brokers on a network then discover each other opportunistically with a mix of broadcasts, manual assistance, and gossip.
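To make the broadcast/gossip idea a bit more concrete, here is a minimal sketch of just the bookkeeping side: a table of peers that have announced themselves recently, with silent peers expiring after a timeout. This is not zbeacon, zgossip, or Malamute code. The `PeerTable` class, its method names, and the 3-second TTL are all made up for illustration, and how the announcements actually arrive (UDP beacon, gossip, a broker) is deliberately left out.

```python
import time


class PeerTable:
    """Hypothetical helper: remembers peers that announced themselves recently."""

    def __init__(self, ttl_seconds=3.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so the expiry logic can be tested
        self._last_seen = {}        # endpoint string -> time of last announcement

    def announce(self, endpoint):
        """Record one announcement, e.g. a received beacon or gossip message."""
        self._last_seen[endpoint] = self.clock()

    def alive(self):
        """Return the endpoints heard from within the last ttl seconds, sorted."""
        now = self.clock()
        # Forget peers whose last announcement is older than the TTL.
        self._last_seen = {
            endpoint: seen
            for endpoint, seen in self._last_seen.items()
            if now - seen <= self.ttl
        }
        return sorted(self._last_seen)
```

A client would call `announce()` for every beacon it hears and pick a target from `alive()` before each request, so crashed peers drop out on their own once they stop announcing.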

While you can solve the problem of discovery by introducing your own broker, you will still have problems such as a single point of failure. This is because ZMQ was designed for a different purpose and you are trying to force it into an unrelated one.

There are many solutions to what you want to accomplish and each one has its fans. I personally like smartstack because it combines two very mature and battle tested technologies: Zookeeper and haproxy.

ZeroMQ sockets can connect or bind, and this is independent of the socket type. If you want to load balance requests against a dynamic number of responders, just have the responders connect their REP sockets to a set of bound REQ sockets.
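As a rough analogy for the "who binds is independent of who answers" point, here is a sketch using only plain TCP sockets from the Python standard library, not ZeroMQ. The requester is the side that listens, and the workers dial in and then wait to be handed a request. The function names, the `job-N` / `done:` message format, and the loopback setup are all assumptions made for the example. Real REQ/REP sockets add framing, reconnects, and round-robin distribution that this does not attempt.

```python
import socket
import threading


def requester(listener, n_workers, results):
    """Bound side: accept each worker that dials in, send one request, read one reply."""
    for i in range(n_workers):
        conn, _addr = listener.accept()
        conn.settimeout(5)
        with conn, conn.makefile("r") as reader:
            conn.sendall(f"job-{i}\n".encode())
            results.append(reader.readline().strip())


def worker(port):
    """Connecting side: dial out, wait for a request, answer it."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        with sock.makefile("r") as reader:
            request = reader.readline().strip()
            sock.sendall(f"done:{request}\n".encode())


def run(n_workers=3):
    """Start one listening requester and n_workers connecting workers on loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen()
    port = listener.getsockname()[1]

    results = []
    req_thread = threading.Thread(target=requester, args=(listener, n_workers, results))
    req_thread.start()
    workers = [threading.Thread(target=worker, args=(port,)) for _ in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    req_thread.join()
    listener.close()
    return results
```

Calling `run(3)` hands one job to each worker that connects. Workers can come and go without the requester knowing their addresses in advance, although the workers still need to know where the requester is listening.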

But you still need to know the IPs and ports of the endpoints to connect to. This is what I mean by discovery.

ZeroMQ merely solves exposure of sockets as queues. In all distributed systems, there must be a set of known hosts, or discovery becomes tedious. DNS is an easy way to solve this.

ZeroMQ is an abstraction that hides a lot of complexity for you in the name of convenience, and there's no reason why it couldn't abstract discovery, too. Instead you have to build it from scratch for every client app.

I guess I just see it as too easy to solve w/o adding custom complexity to ØMQ.

Ya this is a good read. I cut my teeth on this stuff. This guy here is good if you're working with scala. https://github.com/mDialog/scala-zeromq mDialog were actually bought by google so I guess it worked out well for them too: http://techcrunch.com/2014/06/19/google-acquires-mdialog/

It annoys me that ø is representing "zero"

We've stopped using "ØMQ" in text and now just use "ZeroMQ" in most places (like in the Guide). The original choice was to annoy people into remembering the name... it kind of worked though got tiresome after a few years. I still like the logo.

The worst part is that now you're left with people who don't quite know how to call it. Just looking at the thread, you can see variations like ØMQ, 0MQ, ZMQ and finally ZeroMQ all in both upper and lowercase.

On the topic of chars ...

Your Python examples of server and client use the unicode ellipsis (\xe2) which breaks Python2.
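For anyone hitting this: Python 2 treats source files as ASCII unless they carry an encoding declaration (PEP 263), and the ellipsis character U+2026 is three bytes in UTF-8, starting with `\xe2`. Below is a small sketch (written for Python 3) of how one might locate and replace such characters before running an example under Python 2. The helper names are invented, and the alternative fix is simply adding `# -*- coding: utf-8 -*-` as the first line of the file.

```python
def non_ascii_positions(source):
    """Return (line_number, column, character) for every non-ASCII character."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((lineno, col, ch))
    return hits


def asciify_ellipsis(source):
    """Replace the single-character ellipsis (U+2026) with three ASCII dots."""
    return source.replace("\u2026", "...")
```

Running `non_ascii_positions()` over an example file shows where the offending characters are, and `asciify_ellipsis()` handles the specific case mentioned here without touching anything else.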

It should be a slashed zero, but that does not have a unicode codepoint and the combining version renders poorly.

For what it's worth, it made the title a whole lot more visually appealing than "0" or "zero" when I posted it...and it's true to the page title.

Not least when the empty set symbol (∅) exists, and would be more appropriate. This is almost as bad as the metal umlaut.[0]

[0] https://en.wikipedia.org/wiki/Metal_umlaut

Honestly, it would be equally inappropriate. In both cases you're hijacking symbols that look like other symbols but mean completely different things.

Høh, I was gonna ask why it was used, but that (sort of) explains it. It seems like Americans like our Ø way more than our Æ or Å, I wonder why. Colbert used it correctly at least..

ØMQ pronounced correctly would have been a horrible name anyway though.

Hver gang. ("Every time.")

When would you use 0mq vs Google's grpc? What are the main differences between them? I rarely hear them mentioned as alternatives to each other, but I haven't been able to grasp the differences.

grpc is Google's flavor of RPC, with transport and semantics specified for you.

0MQ is an abstraction over raw sockets that lets you worry less about how different parts of a distributed application find and talk to each other.

There's no real direct comparison; you can implement request/reply patterns, like RPC, using 0MQ for the networking parts, but that's not all it's limited to.

As a disinterested third party, I know there is some criticism of ZeroMQ and I am interested in what that is about. When or why would you want to avoid ZeroMQ?

The fact that it's an LGPLv3-licensed library gives me pause about using it even in our backend infrastructure (never ever in code running on client devices). The v3 situation is so complicated that even the FSF had to come up with a matrix to explain how their own licenses play with each other. Not a very comforting situation:


This is pure licensing FUD.

First, as complicated as the licensing interactions around the GPL may be, the quite simple fact is that the (L)GPL's requirements only apply if you distribute code, so there is no more reason to take pause when using an LGPLv3 library in backend infrastructure than there would be if you were using a BSD-licensed library.

As for client devices, it is possible to distribute proprietary applications linked with LGPL libraries, though there are requirements you must follow. The amount of time it takes to figure out and comply with those requirements is minimal compared to the person-hours that went into developing ZeroMQ, which you are getting for free.

Edit: apparently ZeroMQ includes a static linking exception to the LGPL, so you don't even have to comply with the LGPL's most onerous restriction when linking with ZeroMQ! http://zeromq.org/area:licensing

Well it's FUD in the sense that I really, actually feel Fear Uncertainty and Doubt about using any code under GPLv3 license. There's good reason to [https://www.google.com/search?q=gplv3+for+commercial+use&oq=... ] worry. Large companies I've worked in before this (with teams of technically savvy lawyers on staff) have been known to promulgate an outright ban of such open source software in their production stack.

Now, nothing makes me happier than being proven wrong in this case (especially with the static linking exception AND this little nugget "It is the intention of the ZeroMQ community to move gradually towards the Mozilla Public License v2, as a replacement for the current LGPLv3 + static link exception" ).

So in a sense, my comment here has had a positive result not just for me but also for anyone else reading this thread and worried about the license. So on balance, I'd credit this subthread more for clearing up of FUD than the reverse ;-)

When Blizzard made Starcraft 2, I am sure they were very concerned by all the many software licenses out there, but then decided to use every tool available to make the game better, release it at an earlier date, and direct resources elsewhere than reinventing LGPLv3-licensed libraries.

The gaming industry is however a very competitive one, where every choice is critical. A company there could not survive if they took unnecessary costs. Yours might be different, so maybe the competition won't eat your lunch while you are busy reinventing wheels. Do a risk analysis and decide how much money and time it is worth, and ask yourself what happens if a competitor happens to be in the exact same spot and decided to instead go with the free alternative.

Is this outdated wrt security for zmq 4.0+?

[ed: I see now that the latest version is just a release-candidate -- I accidentally thought 4.0 was "out"]

Yes, it does not describe the current security mechanisms. I've written articles on my blog to explain those; they need to be reworked into a new Guide text.

Version 4.0 is stable; there's a pending 4.1 update.

Not familiar with 0mq, but I'm very familiar with MPI. Does anybody have any experience with how these compare?

any real reason you should use zeromq over rabbitmq?

The two approaches are very different. RabbitMQ centralizes and uses a fairly heavy asymmetric protocol (that I originally designed, so I know this pretty well), which makes the server a major hub, and clients rather stupid and slow.

There are things you just cannot do with AMQP and thus RabbitMQ. The main one, and why we originally started ZeroMQ, was multicast from publishers to subscribers. How else can you get message rates of millions per second?

ZeroMQ (and Nano and the many similar efforts that have come to life since we started) gives every process the same capability. Your laptop can talk directly to ten thousand peers, and do it rapidly.

We do have a broker, Malamute, which works nicely for various patterns. I use this in lots of projects. However not in the conventional sense of starting a process on a box somewhere. I use Malamute as an actor thread, to coordinate events and workload between other actor threads, in a single process.

ZeroMQ makes this kind of scale magic quite easy. Take a look at the CZMQ library and you'll see how wise use of ZeroMQ transforms even an old language like C. Thread-to-thread messages work the same way as process-to-process messages. You can't even think of such things using AMQP.

It does change the way you build distributed systems.

I want to use ZeroMQ but stuck with RabbitMQ because I am using Celery.

I hate Celery and I hate RabbitMQ because it was so difficult to get stuff working the way I wanted which makes me wonder if it would've been better if I just wrote my own simple job queue.

They're fundamentally different things. For your needs, how about rq?


ZeroMQ (in certain configurations) is broker-less if I recall correctly. The clients connect to each other, rather than going through a broker.

Last I checked, ZeroMQ is more of a "low-level" library/framework that provides easy paths to higher-level functionality comparable with what you'd expect from a standard message queuing system.

Anyone got a different take from it? I only did the tutorials quite a while back.

Yep, it's best to think of ZeroMQ as a networking library rather than an "MQ". It makes communication between processes easier than writing BSD sockets code by hand.

For example, the Mongrel2 web server uses ZeroMQ to drive backend handlers. You probably wouldn't dream of transporting HTTP requests and responses over a brokered message queue, but ZeroMQ is different since it's "just sockets".

Yes, ZeroMQ is brokerless. However, you can implement a broker with ZeroMQ. In addition, it should be noted that ZeroMQ buffers messages when a connection drops.

If microseconds (possibly nanoseconds) matter, you can't use a broker system like Rabbit. 0mq is a set of wrappers around TCP and UDP that allows you to do network programming at extreme performance levels generally not available without a whole lot of mucking about.

While this doesn't answer the question, one thing I've found (see my other comment for why I don't care for 0mq) is that Rabbit really doesn't like to be full as a queue.

It really requires that you can eat messages off as quick as you can put them on the queue, or else it slows down -- which means you can't pull them off as quickly, and then things basically get to "cascade failure" type problems.

Maybe others have other experiences - Rabbit is fast, but I'd probably evaluate other options if you want your queue to actually, well, queue.

zmq is not a message broker like rabbitmq. it's more like a concurrency library. it kinda feels like working with sockets, except it's not just sockets.

recommended read: http://zeromq.org/topics:omq-is-just-sockets

hn-thread: https://news.ycombinator.com/item?id=6739231

This is not entirely accurate. The difference is that ZeroMQ doesn't have a centralized broker, not that it doesn't allow you to work with sockets. Lots of ZeroMQ projects utilize sockets for communication. In fact, because there isn't a broker, ZeroMQ gives you more freedom as to the underlying communication medium.

In the Postface section one of the contributors writes:

"We investigated different solutions to find something suitable for our needs. We tried different message brokers (RabbitMQ, ActiveMQ Apollo, Kafka), but failed to reach a low and predictable latency with any of them."
