ZeroMQ is just sockets (zeromq.org)
201 points by bmaeser on Nov 15, 2013 | 75 comments



I like 0MQ a lot, but this is disingenuous. Let's break it down:

> portability

Sockets are just as portable, more so on UNIX descendants where one can rely on relatively consistent socket APIs. Beyond that, almost every single language and runtime (Python, Ruby, Java, OCaml ...) provides a portable socket API.

> message framing

Length-prefixed message framing winds up being 10-100 lines of code in almost any language/environment.

> super fast asynchronous I/O

Sockets have this.

> queuing

Sockets have buffers. The OS can use those buffers to implement flow control. This isn't the same as queueing, but the truth is that you rarely want blind background queueing of an indefinite number of messages that may or may not be delivered.
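
A minimal sketch of the buffer-based flow control described above, using only Python's standard library. The function name `fill_send_buffer` and the chunk size are illustrative assumptions; the amount buffered depends on the OS and its settings.

```python
import socket


def fill_send_buffer(chunk_size: int = 4096) -> int:
    """Write to a non-blocking socket until the kernel refuses more data.

    Returns the number of bytes the OS accepted before pushing back.
    """
    sender, receiver = socket.socketpair()
    sender.setblocking(False)
    queued = 0
    try:
        while True:
            try:
                queued += sender.send(b"x" * chunk_size)
            except BlockingIOError:
                # The kernel buffers are full and the peer is not reading:
                # this is the OS applying backpressure to the sender.
                return queued
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(f"kernel buffered {fill_send_buffer()} bytes before pushing back")
```

The sender is told to stop once the buffers fill up, rather than having messages queued without bound behind its back.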

> support for every bloody language anyone cares about

Just like sockets.

> huge community

I don't think you can get 'huger' than the community around sockets.

> price tag of zero

Seeing as socket libraries ship with everything, does that mean they have a time/resource cost of less than zero?

> mind-blowing performance

Also, sockets.

> protection from memory overflows

This has essentially nothing to do with a networking library. Plenty of environments have safe/efficient zero-copy chained byte buffer implementations/libraries.

> loads of internal consistency checks

Library correctness isn't a unique feature.

> patterns like pub/sub and request/reply, batching

Ah-ha! Here finally we get to the meat of it!

If you need QUEUES, including pub-sub, fanout, or any other QUEUE-based messaging structure, then 0MQ is better than sockets!

> and seamless support for inter-thread transport as well as TCP and multicast

Inter-thread transport of already-serialized messages at the transport protocol layer doesn't make a ton of sense from an efficiency perspective.

> ZEROMQ IS JUST SOCKETS

No, 0MQ is a lightweight network message queue protocol. It's not competing with sockets.


> Sockets are just as portable, more so on UNIX descendants where one can rely on relatively consistent socket APIs. Beyond that, almost every single language and runtime (Python, Ruby, Java, OCaml ...) provides a portable socket API.

You've got to be kidding me. The BSD socket API is only "portable" for basic things. Do any kind of advanced thing and you will notice the limitations of the "portability".

Want to write an evented server that handles a large number of sockets? Choose your favorite platform-specific API: epoll, kqueue, whatever Solaris is using, etc.

Error handling? Each platform behaves in a subtly different manner. See http://stackoverflow.com/questions/2974021/what-does-econnre... for an example.

Windows support? I hope you don't mind the #ifdefs and typedefs in code. The WinSock API is still OKish... it doesn't differ from the basic BSD socket API too much. But good luck trying to handle more than 1024 sockets in a non-blocking/async manner. I hope select() on Windows serves you well.

> Length-prefixed message framing winds up being 10-100 lines of code in almost any language/environment.

Only if you're writing blocking code. If your code is evented, good luck with writing 2-3 times more code. Oh, and don't you dare get that code wrong and introduce bugs. And of course you have to write this code every single time. And you didn't forget to unit test all that, did you?


Most of your argument is that each OS uses a different IO multiplexer. Lightweight abstractions over these multiplexers have been around for decades: feel free to use libuv if you want a fairly 'modern' one.
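
As one illustration of such a lightweight abstraction (not libuv itself, but Python's standard `selectors` module, swapped in here because it needs no third-party install), `DefaultSelector` picks the most efficient multiplexer available on the platform, such as epoll or kqueue. The helper name `echo_once` is hypothetical.

```python
import selectors
import socket


def echo_once(payload: bytes) -> bytes:
    """Echo one small message through a selector-driven readiness loop."""
    sel = selectors.DefaultSelector()  # epoll, kqueue, ... chosen per platform
    client, server = socket.socketpair()
    server.setblocking(False)
    sel.register(server, selectors.EVENT_READ)
    try:
        client.sendall(payload)
        # Wait until the OS reports that `server` has data to read.
        for key, _events in sel.select(timeout=1.0):
            data = key.fileobj.recv(4096)
            key.fileobj.sendall(data)  # echo back to the peer
        client.settimeout(1.0)
        return client.recv(4096)
    finally:
        sel.unregister(server)
        sel.close()
        client.close()
        server.close()


if __name__ == "__main__":
    print(echo_once(b"ping"))
```

The application code is the same on every platform; only the selector implementation behind `DefaultSelector` changes.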

That the contour of the API differs slightly means nothing. An example of true incompatibility would be, say, supporting UNIX-style mounts on Windows. If you wanted to support that cross-platform, either you or a library would have to directly implement the semantics of UNIX mounts, as opposed to just making a shim over what the OS already provides.


Whether lightweight abstractions exist is moot. The point of the grandparent was that sockets are portable. If you're going to use an extra library besides pure sockets anyway then why is that any better compared to using ZMQ?


> That the contour of the API differs slightly means nothing.

What? It means everything if you have to learn socket intricacies at different levels of abstraction on a per-platform basis. That is not what most people mean when they say an API or library is portable.


Hahaha so true. Last year I wrote some network code that had to use sockets on Windows, Linux, Xbox 360 and PlayStation 3. Each had many slight deviations in the low-level socket API, from different behaviours to different error codes.


>> Want to write an evented server that handles a large number of sockets?

What about what the parent said, use a multi-platform technology like Java, Python, Ruby, etc?


The socket APIs in Java, Python, Ruby etc don't do message framing for you. My point about having to write bug-free message frame parsing code still stands.


Original 0MQ author here. Pretty good analysis. The only real difference between 0MQ and sockets (i.e. traditional L4 transports such as TCP or UDP) is that it implements messaging patterns such as pub/sub or req/rep. You can think of it as an L5 layer designed to orchestrate communication between N endpoints (as opposed to 2 endpoints as is the case with TCP).


Perhaps that is how you saw it. However you are dramatically wrong. ZeroMQ v4 does full end-to-end encryption, supports protocols like TIPC, and (you knew this but chose to forget it for reasons I never understood) entirely changes how we write multithreaded applications.

The only plausible reason you could disregard the magic of using the same semantics for secure internet messaging and inter-thread messaging is that you don't write applications.


(FWIW, parent commentator more or less co-founded ZeroMQ with the grand-parent commentator; rumcajz left the project and is now working on http://nanomsg.org/.)


I agree that the full end-to-end encryption is an awesome feature, though it is very recent. And I didn't even know about the TIPC support.


If you don't mind me asking--why isn't there a bus primitive in 0MQ?

We're doing a simulation using it and we've ended up with a bunch of pub/subs instead of a shared bus.

It's not terrible, but is a little weird.


You might be interested in nanomsg (http://nanomsg.org/), the spiritual successor/spinoff of zeromq, by the original creator (the parent you replied to, Martin Sustrik). It features a BUS pattern too.


After working with 0MQ for a while I felt that this sort of summary was misleading.

I came to think of 0MQ as a multi-point data link abstraction (i.e. layer 2). The API abstracts over sockets, IPC message queues, and in-process message queues as the virtual layer-1 transports.

I say it is a layer-2 abstraction because 0MQ doesn't provide any mechanism for addressing or transparently routing over an internet of connected 0MQ networks. You can do source-based routing by explicitly naming the intermediate hops, but this is more like intelligent layer-2 bridging than traditional layer-3 routing. There is no concept of a layer-3 address or naming scheme of any kind, nor are there any of the important layer-4 features (retransmissions, flow control, out-of-order resequencing).

The message structure and fan-in/fan-out features are very useful but they are operating at a layer-2 level, not a 3, 4, 5, etc. layer, from my perspective.

It has been about a year since I spent time with 0MQ so perhaps it has evolved beyond what I experienced.


how does nanomsg compare?


I just found http://nanomsg.org/documentation-zeromq.html. If you haven't seen it, perhaps it can offer some comparisons for you.


TCP/IP does multiple endpoints as well (multicast, multipath, broadcast, server to many, iptables/pfw rules) etc.

Various queue control methods are available in TCP/IP also.

UDP is the only way to get decent performance if your application is designed around it. Why bother resending data if it is too old now to be useful, or if the data arrived via another route? TCP often has more variable latency than UDP, which is the main performance killer for certain types of apps. zeromq isn't multipath aware either.


For many people, getting the data from A to B as reliably as possible is more important than getting max performance. If that's something you need to worry about, you either end up implementing a worse TCP on top of UDP, or wondering why data disappears without notice.

Multicast is a real pain in the neck to deal with in your network infrastructure unless you only want it on one subnet, which restricts you just as much as traditional broadcast.

Multipath is something you either leave to the routers, handle with a protocol such as SCTP, hope multipath TCP will bring to your OS in the near future, or manage in the application.

It's unclear what you mean by server to many; in this context it sounds like what you'd use a zmq socket to fan out messages for.

The queue mechanisms in TCP/IP are for the transport layer, not for implementing application policy.


> Length-prefixed message framing winds up being 10-100 lines of code in almost any language/environment.

Over the years, I've worked on several different systems that wrote those "10-100" lines of message framing from scratch, and had to fix subtle, hard-to-track-down bugs in them. It's a conceptually simple thing that's very easy to get subtly wrong in edge cases: edge cases that are difficult or impossible to produce in development systems, but that do happen in production systems once you're running high volumes. An example of this is a socket pausing in the middle of sending the multi-byte length, and needing to fiddle with certain socket parameters that control heartbeat and other minutiae.

It's certainly simple to write something that works well, but also very simple to write something that works well but will fail in subtle ways under certain kinds of circumstances.


With (dirt simple) length-prefix framing, and blocking reads, pauses and such are non-issues. e.g. blocking read for 2 length bytes, blocking read for N bytes of payload. If your read fails for any reason, or you've timed out, then you give up and close the connection.
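
A minimal sketch of that scheme in Python, assuming a 2-byte big-endian length prefix. The names `recv_exact`, `send_frame`, `recv_frame` and `roundtrip` are illustrative, and a local `socketpair` stands in for a real connection.

```python
import socket
import struct

HEADER = struct.Struct("!H")  # 2-byte big-endian length prefix (assumed format)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Blocking read of exactly n bytes; raise if the peer closes early."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_frame(sock: socket.socket, payload: bytes) -> None:
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 2-byte length prefix")
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_frame(sock: socket.socket) -> bytes:
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def roundtrip(payload: bytes, timeout: float = 1.0) -> bytes:
    """Send one frame over a local socket pair and read it back."""
    a, b = socket.socketpair()
    try:
        b.settimeout(timeout)  # on timeout, recv raises and the caller gives up
        send_frame(a, payload)
        return recv_frame(b)
    finally:
        a.close()
        b.close()
```

Because `recv_exact` loops until it has every byte, a pause in the middle of the length prefix or the payload only delays the read; a timeout or early close surfaces as an exception, at which point the connection is dropped.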

Your application protocol needs to handle timeouts (some sort of retry, preferably with some notion of idempotency).

The problem with:

> > Length-prefixed message framing winds up being 10-100 lines of code in almost any language/environment.

is that it's not really a response to the ZMQ feature. With ZMQ you don't have to reimplement it for every application and platform.


Can you elaborate on why pausing on sending is particularly tricky when sending a multi-byte length header compared to say pauses on any other part of the payload?


It's particularly tricky when dealing with async code, where you can't simply say "block here till you have 4 bytes" and instead just get events that say "you received n bytes and here they are."


All networking code should work even if it receives 1 byte at a time. Use a buffer, and have some sort of abstraction responsible for packetizing the input. The output of that module is a fully formed frame ready for interpretation.
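
A minimal sketch of such a packetizing abstraction, assuming a 4-byte big-endian length prefix. `FrameDecoder` and `encode` are hypothetical names, not part of any library.

```python
import struct


class FrameDecoder:
    """Accumulates arbitrary byte chunks and returns complete frames."""

    HEADER = struct.Struct("!I")  # 4-byte big-endian length (assumed format)

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> list:
        """Add received bytes; return every frame completed by them."""
        self._buf += data
        frames = []
        while True:
            if len(self._buf) < self.HEADER.size:
                break  # the length prefix itself is still incomplete
            (length,) = self.HEADER.unpack_from(self._buf)
            end = self.HEADER.size + length
            if len(self._buf) < end:
                break  # the payload is still incomplete
            frames.append(bytes(self._buf[self.HEADER.size:end]))
            del self._buf[:end]
        return frames


def encode(payload: bytes) -> bytes:
    """Prefix a payload with its length."""
    return FrameDecoder.HEADER.pack(len(payload)) + payload


if __name__ == "__main__":
    decoder = FrameDecoder()
    stream = encode(b"hello") + encode(b"world")
    for i in range(len(stream)):  # worst case: one byte per event
        for frame in decoder.feed(stream[i:i + 1]):
            print(frame)
```

An event handler calls `feed` with whatever bytes arrived and only ever sees whole frames, whether the input came one byte at a time or several frames at once.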


0MQ is competing with sockets in the same way that SQLite is competing with fopen.


The only points of yours I agree with are the claims of 'price tag of zero' and 'mind-blowing performance'. The rest seem to be apples/oranges comparisons and being intentionally obtuse, e.g. as to what 'portable' and 'community' mean.

> Sockets are just as portable [...] almost every single language and runtime (Python, Ruby, Java, OCaml ...) provides a portable socket API.

It's not portability if you have to learn APIs at varying levels of abstraction for each language you need to port to.

> Length-prefixed message framing winds up being 10-100 lines of code in almost any language/environment.

That is work to do in every single language you wish to work in. You are acknowledging the value that ZeroMQ provides in not requiring you to carry out this work.

>> support for every bloody language anyone cares about

> Just like sockets.

Except with the aforementioned portability (APIs at a consistent level of abstraction), which is not something raw sockets provide.

>> loads of internal consistency checks

> Library correctness isn't a unique feature.

Not sure what that means. ZeroMQ's claimed value-add here is in providing correctness in various languages, that you would not get otherwise when using raw sockets.

>> huge community

> I don't think you can get 'huger' than the community around sockets.

People writing socket code in C don't go to SocketConf and meet people writing socket code in Python. They don't idle in #socket on IRC. They don't swap blog posts about the cool 'socket patterns' they wrote today. It's not a community.

>> and seamless support for inter-thread transport as well as TCP and multicast

> Inter-thread transport of already-serialized messages at the transport protocol layer doesn't make a ton of sense from an efficiency perspective.

It's a feature nonetheless. That you think it doesn't make sense 'from an efficiency perspective' does not invalidate that feature.

>> protection from memory overflows

> This has essentially nothing to do with a networking library. Plenty of environments have safe/efficient zero-copy chained byte buffer implementations/libraries.

But you don't get that with raw sockets in every environment, which is the point.


I actually thought this story was a link to:

http://hintjens.com/blog:42 ("A Web Server in 30 Lines of C")

before I clicked through. Salient quote:

"ØMQ Is Just Like BSD Sockets, But Better

The other essential ingredients of a creation myth are lies and deception. ØMQ is nothing at all like BSD sockets despite very insistent attempts from its early designers to make that. Yes, the API is vaguely socket-like. APIs are not the same as semantics. ØMQ patterns are weird and wonderful and delicate but they are not, and I'll repeat this, even marginally close to the BSD "stream of bytes from sender to recipient" pattern."


Indeed. The original "sockets on steroids" story was wrong, though sincere. Sockets are the API but the actual machine underneath is nothing like a BSD socket. ZeroMQ has in the last years moved away from the "it's a BSD socket" metaphor as it is a tediously limiting API in many ways, and created the unnecessary confusion that I wanted to pick on in this story.


I think you are missing the fact that 0MQ builds on top of sockets. Of course you can do all of the same things using just plain sockets but then you end up implementing parts of 0MQ. Like all libraries, 0MQ is a set of preimplemented features that you can just use in order to work at a higher level of abstraction and not worry about corner cases so much.


You do realize the page is humor, right?

A serious explanation of ZeroMQ takes 500 pages and several weeks to read.


> A serious explanation of ZeroMQ takes 500 pages and several weeks to read.

I hope this is not true. It doesn't speak well to ZeroMQ at all.


It's not true. I checked http://zguide.zeromq.org/page:all and it would only take 332 pages to print it all out. And to understand ZeroMQ you only need to read the first few pages and skim the rest. That is enough to see that ZeroMQ is a serious attempt at letting you use the same design patterns that you would use with Message Queuing over top of plain sockets, i.e. there is zero Message Queue system behind the curtain. 0MQ is useful because these design patterns make it easier to build correct and powerful applications that leverage network connectivity.

I have built applications that use both AMQP messaging with RabbitMQ and ZeroMQ for the parts where an MQ broker was not necessary or where it would impact performance.

The arguments swirling around ZeroMQ and sockets are the same ones that swirl around threading and higher abstractions. We now know that it is hard to write correct programs using threading unless you refrain from using locks and either have all state immutable or you use lock-free access techniques. There are many libraries (even for Android and iOS) that encapsulate threading with a task-oriented layer that communicates between tasks/threads using queues.
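
A minimal sketch of that queue-based style using Python's standard `queue` and `threading` modules. The function `run_pipeline` and the squaring task are illustrative assumptions; the application code takes no explicit locks, although `queue.Queue` synchronizes internally.

```python
import queue
import threading


def run_pipeline(items):
    """Square each item in a worker thread, communicating only via queues."""
    inbox = queue.Queue()
    outbox = queue.Queue()
    stop = object()  # sentinel telling the worker to exit

    def worker():
        while True:
            item = inbox.get()
            if item is stop:
                break
            outbox.put(item * item)

    thread = threading.Thread(target=worker)
    thread.start()
    for item in items:
        inbox.put(item)
    inbox.put(stop)
    thread.join()

    # The worker has exited, so the outbox can be drained safely.
    results = []
    while not outbox.empty():
        results.append(outbox.get())
    return results


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3]))
```

The two threads never touch the same mutable state directly; everything they share passes through a queue.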

Sometimes you have to make a decision NOT to do something that you CAN do, because of the greater good of the work.


It's no different than any significant technology that has few antecedents. Once you know the background and cast away assumptions and fallacies, it's much easier of course.


Read the ZeroMQ guide for yourself. It is IMHO one of the finest (and funniest) ever written.


It is not true, at all.


I don't find it that disingenuous. I recently used 0MQ on a project, and quite frankly it would have taken a lot more code to do what 0MQ provided out of the box.

0MQ implementation is built on Sockets. I can build the same features on top of Sockets, but why would I want to if it works well?

Side note: I'm interested in looking deeper in to nanomsg http://nanomsg.org/index.html -- which is a re-write of 0MQ by the original author.


It is worth looking at Sustrik's next project, http://nanomsg.org/, which is still in alpha: ZMQ rewritten. He has an awesome blog, http://250bpm.com/blog, where he mentions a lot of things that he decided to do differently in nanomsg and the reasons behind them. Check out http://250bpm.com/blog:23 (I hate ZMQ contexts) and http://250bpm.com/blog:22 (how we are doomed to rewrite protocols on top of existing protocols and solve the same problems again). His other blog posts are also of great quality.


The thing I most dislike about 0MQ is the overly conceptualized socket typing (ZMQ_DEALER etc), and the way the documentation tries to cram all the combinations into idiomatic 'patterns'.

All I want to be able to express when I send() is:

  1. Whether it's a Broadcast, Unicast or Anycast message
  2. Whether I'm sending globally, or targeting a
     subset of peers (subscribers to some filter).

Forgive me if these mappings are flaky, it's been a while since I used 0MQ.

  Target        Method       ~0MQ socket type

  Global        broadcast   -> ZMQ_PUSH
  Global        unicast     -> ZMQ_PAIR
  Global        anycast     -> ZMQ_DEALER
  
  Subscribers   broadcast   -> ZMQ_PUB/SUB
  Subscribers   unicast     -> ZMQ_REQ/REP
  Subscribers   anycast     -> ZMQ_ROUTER


Yes, the over-conceptualized abstractions can be annoying. However if you read the RFCs that specify the patterns, http://rfc.zeromq.org/spec:28, http://rfc.zeromq.org/spec:29, http://rfc.zeromq.org/spec:30, and http://rfc.zeromq.org/spec:31, you will see that each pattern encapsulates more than just the routing model.

When you send() you also need to express how exceptions are handled -- what happens when there are no peers, and what happens when their buffers overflow.
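
To make those two questions concrete, here is a toy in-process sketch, not ZeroMQ's API. The class `Publisher`, its `high_water_mark` parameter and its `dropped` counter are hypothetical. The policy it picks (discard when nobody is subscribed, drop for a subscriber whose bounded queue is full) loosely mirrors what the ZeroMQ documentation describes for PUB sockets; other socket types make different choices, such as blocking the sender.

```python
import queue


class Publisher:
    """Toy pub/sub with prefix filters and bounded per-subscriber queues."""

    def __init__(self, high_water_mark: int = 2) -> None:
        self._hwm = high_water_mark
        self._subs = []  # list of (prefix, queue) pairs
        self.dropped = 0

    def subscribe(self, prefix: bytes) -> queue.Queue:
        q = queue.Queue(maxsize=self._hwm)
        self._subs.append((prefix, q))
        return q

    def send(self, message: bytes) -> int:
        """Deliver to matching subscribers; return how many accepted it."""
        delivered = 0
        for prefix, q in self._subs:
            if not message.startswith(prefix):
                continue
            try:
                q.put_nowait(message)
                delivered += 1
            except queue.Full:
                # Slow subscriber: drop the message instead of blocking.
                self.dropped += 1
        return delivered


if __name__ == "__main__":
    pub = Publisher(high_water_mark=2)
    print(pub.send(b"a.1"))  # no peers yet, so nothing is delivered
    inbox = pub.subscribe(b"a.")
    for i in range(3):
        pub.send(b"a.%d" % i)
    print(inbox.qsize(), pub.dropped)
```

A routing model alone says nothing about either case; the policy has to live somewhere, and in 0MQ it is part of the socket type.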

Martin Sustrik did a fine job when he designed those patterns because they are (so far) watertight containers for rather tricky semantics.


Yeah, I'm sure there are loads of technical details... but I can't help feeling there are parallels with the overzealous use of 'patterns' in OOP.

When I started out with 0MQ I felt sucked into a world of combinatorial explosion where I always have to think about what type of socket is at the other end of a connect()... it just doesn't feel right to me.


These "patterns" exist because people like you want to gloss over the "loads of technical details". :)


Just the opposite.


The core of the "0MQ is just sockets" argument is that it doesn't do the main thing MQ systems are used for instead of, say, sockets: persistence and guaranteed delivery if some endpoints die.

All the arguments in the article are about the fact that it does messaging really well with lots of features, but it doesn't have the queue part at all.

Sure, you can build that yourself on top of 0MQ, but I'd say that building the 'guaranteed delivery' part properly is the hard part of an MQ system.


I may be nitpicking irrelevant details, but I just can't help it.

Why are they using the Norwegian/Nordic letter Ø (pronounced almost like "uh" in English) in their name, a letter most people in the world can't type, if they want to get traction?


The zeromq guide (http://zguide.zeromq.org/page:all#The-Zen-of-Zero) explains it thusly:

"The Ø in ØMQ is all about tradeoffs. On the one hand this strange name lowers ØMQ's visibility on Google and Twitter. On the other hand it annoys the heck out of some Danish folk who write us things like "ØMG røtfl", and "Ø is not a funny looking zero!" and "Rødgrød med Fløde!", which is apparently an insult that means "may your neighbours be the direct descendants of Grendel!" Seems like a fair trade.

Originally the zero in ØMQ was meant as "zero broker" and (as close to) "zero latency" (as possible). Since then, it has come to encompass different goals: zero administration, zero cost, zero waste. More generally, "zero" refers to the culture of minimalism that permeates the project. We add power by removing complexity rather than by exposing new functionality."


That's a fantastic insult.


Although that Danish phrase actually refers to a dessert (https://en.wikipedia.org/wiki/R%C3%B8dgr%C3%B8d)


I assume it means "zero with a line through it" which was used on older computers to distinguish zeroes from the letter O on low res screens.


Or empty set as noted by Bourbaki. I can't help but read it that way anyway.


That's always been my reading as well.


Actually, it long predates computers; handwritten and typeset materials as far back as the 12th century use it. As it turns out, it's frustrating when two common characters look so similar.


Probably unicode support reasons, they're obviously going for a slashed zero to look old school and/or avoid being called OhMQ but... http://en.wikipedia.org/wiki/Slashed_zero#Representation_in_...


Most people just type "zmq", and the name is "ZeroMQ", with the unwriteable shorthand acting as a kind of text logo.


Empty set / dot matrix zero, as others point out. Plus there is near reflectional symmetry across the midline of the M given the morphology of the Ø and the Q.


In that case they should use the unicode-character or HTML entity designated for empty sets[1], and not the letter Ø.

Believe it or not, even in the Nordic part of the world we do have empty sets, and we are able to distinguish them from Øs just fine :)

[1] http://en.wikipedia.org/wiki/%C3%98#Encoding
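
For reference, the characters being conflated are distinct code points, which Python's standard `unicodedata` module can show. The helper name `describe` is illustrative. A slashed zero, by contrast, is usually just how a particular font draws the ordinary digit.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for ch in ("\u00d8", "\u2205", "0"):
        print(ch, describe(ch))
```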

Anyway. If that's the underlying logic... How does communicating "empty set MQ" to your visitors work when you want to market the product with the name "zero MQ"? What do I search for when I remember "empty set MQ"?

To me this just doesn't add up. It looks like an attempt at doing "something cool" gone a bit off target.

Edit: The replies to my original comment here also seem to back up the point that this is not very clear or universal communication.


> What do I search for when I remember "empty set MQ"?

https://www.google.com/search?q=0mq


Calm down, it's just a style thing.

http://en.wikipedia.org/wiki/Slashed_zero


This may be a reference to 'What have the Romans ever done for us' in 'Life of Brian'. http://www.youtube.com/watch?v=9foi342LXQE


The important detail they miss out is that there's no broker and no (built in) persistence, and that sucks when parts of your critical messaging infrastructure go down and you lose messages you thought had been sent.


If you read the name as "zero message queue" is it surprising that it doesn't have persistence? There's no message queue.


> There's no message queue.

I believe all the socket types except REQ and REP do have incoming/outgoing queues. There's just no requirement for a broker to serve as an independent queue.


Yes, your applications do have queues for incoming/outgoing messages but there is no "message queue". A "message queue" is an external thing that provides some kind of service, but with ZeroMQ that external thing does not exist. If you sometimes need an external thing to provide delivery guarantees and persistence, then it is easy to connect RabbitMQ to accept/send ZeroMQ messages. Or just make your own shim process that translates to/from AMQP.


zmq is not simply a messaging framework. it's more like a concurrency library. and there are a lot of reliability and high availability patterns for zmq: http://zguide.zeromq.org/page:all#Chapter-Reliable-Request-R...


Yes, because 0mq is just sockets.


been looking into this recently. it seems like a lot of the bindings haven't been maintained in 2+ years, but the pure java implementation JeroMQ is very active and convenient to use (https://github.com/zeromq/jeromq)

the name reminds me of jero, the american-born enka singer (https://www.youtube.com/watch?v=ba9rKhVAz80)

i'm not entirely sure if i personally could come up with a good usage for zmq though, unless i were going through tons of data from sources like social media or scientific experimentation


The library API hasn't changed much in the past 2 years, so why would you expect the bindings to change? Just for the hell of it?

The latest release (version 4) has added a few new functions, so all of the bindings are undergoing some minor revision now to support the new calls.


For me, if there hasn't been activity for several years and I can't find it being widely used then I assume it's abandoned.


That's a fair assumption and accurate for a lot of the bindings. It is true that there are a lot of bindings for ZeroMQ and many were weekend projects for a single project. However it's normal and expected in a large community. To treat that as significant is like saying, "I find a lot of abandoned HTTP server projects, so this 'Web' thingy sure looks doubtful."


For me, if there hasn't been any activity for several years yet it's still commonly used, I assume it's stable.


i find that there are at least a couple dozen issues/defects that haven't been addressed, and it doesn't seem like they're going to be


Pretty good explanation. I really appreciate such presentation "by analogy", carefully comparing a technology with something very well known. It provides another angle from which understanding is improved.


I think of 0mq as being a really powerful prototyping tool. If you require guarantees or complex topologies, it will save you a good amount of work to get the thing running. I have never gotten low latency from it, but I know that others have with some effort.


And asserts. Sockets and asserts.


Yes, god forbid you'd stop your production chain just because you found an internal consistency error that could potentially wreak havoc with your data. Oh, wait, that's exactly what you should be doing. If you write C/C++ without asserts, you are driving fast and drunk without a seat belt.

Luckily it's a free world so you are more than welcome to take libzmq, strip out all the asserts, and launch your new improved fork! Why even debate this? Please remember us when you get rich and famous.


ZeroMQ markets itself as sockets on steroids, and this hypothetical exchange is to clear up any (hypothetical?) confusion stemming from their own messaging?



