
ZeroMQ: Modern & Fast Networking Stack - igrigorik
http://www.igvita.com/2010/09/03/zeromq-modern-fast-networking-stack/
======
ekidd
I've played with ZeroMQ on some small projects, and I've been quite impressed
by it. Basically, ZeroMQ was designed to send millions of small, mostly-
asynchronous messages at extremely high speeds. If you feed it messages very
rapidly, it sends the first message immediately, and queues the rest in
userspace until (IIRC) the network card is ready to transmit another packet.
Then it sends all the queued messages as a batch. So you get both high
throughput and low latency.
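
A rough Python sketch of that batching idea, for illustration only. This is not ZeroMQ's actual implementation; `BatchingSender`, `transmit` and `on_ready` are made-up names, and the "link is ready" signal is assumed to come from somewhere else.

```python
from collections import deque


class BatchingSender:
    """Toy model of opportunistic batching (not ZeroMQ's real code)."""

    def __init__(self, transmit):
        self._transmit = transmit   # callable taking a list of messages
        self._pending = deque()     # messages waiting in "userspace"
        self._busy = False          # True while a transmit is in flight

    def send(self, msg):
        if self._busy:
            # Link is busy: queue the message instead of sending it alone.
            self._pending.append(msg)
        else:
            # Link is idle: send right away for low latency.
            self._busy = True
            self._transmit([msg])

    def on_ready(self):
        """Call when the link can accept more data."""
        if self._pending:
            batch = list(self._pending)
            self._pending.clear()
            self._transmit(batch)   # one call carries the whole backlog
        else:
            self._busy = False
```

The first message goes out alone, and anything that arrives while the link is busy leaves in a single later call, which is where the throughput gain would come from.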

There's some built-in support for common topologies (client/server,
publish/subscribe, pipeline), and if you're willing to use the undocumented
XREQ/XREP socket types, you can build more exotic topologies.

Most of the value in ZeroMQ is the actual C++ implementation, so other
languages generally just wrap the C++ code. The zeromq API is tiny.

I haven't used it on a big production project yet, but I hope to do that soon.

~~~
sshumaker
Well, the idea that simply 'queueing the rest in userspace' is OK illustrates
a pretty disconcerting lack of understanding in writing networking code. This
is not an acceptable solution in the real world.

I want to correct that assumption right now - it's one of those places that
higher level networking libraries often fall down - because simply 'fire and
forget' is a leaky abstraction.

Let's say you're streaming a large file over TCP to a client. Generally, disk
reads are far faster than you can send data over the internet. If you naively
keep sending data, you'll quickly fill up the OS buffers - and begin to
consume memory as it's queued on the client side. In an extreme case, if
you're sending a multi-GB file, you'll probably end up with most of the file
sitting in RAM (assuming you don't run out of RAM and start thrashing). And for
what? You really only need to load additional data in from the file as fast as
you can send it in the socket.

In a real-life case, streaming multiple files to multiple clients, you'll
quickly run out of RAM with even moderate size files.
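
As a minimal sketch of the alternative described above (read from the file only as fast as the socket accepts data), something like the following would do. It assumes `send` blocks until the chunk has been accepted, as `socket.sendall` does; `stream_in_chunks` and the demo names are hypothetical.

```python
import io


def stream_in_chunks(src, send, chunk_size=64 * 1024):
    """Copy src to send() one chunk at a time.

    `send` is assumed to block until the chunk has been accepted
    (as socket.sendall does), so only one chunk is held at a time.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:          # empty read means end of file
            return total
        send(chunk)            # blocks here when the receiver is slow
        total += len(chunk)


# Example: an in-memory file stands in for the disk file, a list for the socket.
demo_sink = []
demo_total = stream_in_chunks(io.BytesIO(b"x" * 10), demo_sink.append, chunk_size=4)
```

Memory use stays at roughly one chunk per transfer, no matter how large the file is.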

Some high level libraries provide an event to let you know that the OS buffers
have been emptied out. For example, Node.js has a drain event. But this has to
be handled explicitly, and isn't really highlighted in the docs. If you write
your code naively, you will definitely run into problems.

While it's great that many of these high level networking libraries have tried
to abstract away many of the complexities of network programming, they're also
a trap for the unwary. Just remember that they aren't a silver bullet that let
you handle sending data over the internet as a simple function call.

~~~
ekidd
Have you checked out the ZeroMQ documentation on flow control and data rate
limiting? I can't speak for anyone else's use case, but there's a nice
selection of blocking primitives and data rate limits that have proven
sufficient for the smaller programs I've written.

It's hard for me to tell if your criticisms are based on real-world experience
with the library, a review of the available documentation, or just my
description of a single feature intended to reduce the cost of traversing the
userspace/kernel barrier.

I'm definitely interested in real-world experience with ZeroMQ and any
problems you've encountered. We may begin some larger-scale testing soon, and
would love to know about any problems.

~~~
ehsanul
Instead of rate-limiting, there are two other options that solve the GP's
concerns: the high water mark which controls the maximum queue length and the
swap size, which saves messages to disk when the queue is full.

<http://api.zeromq.org/zmq_setsockopt.html>
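
To make the high water mark idea concrete, here is a small Python sketch built on the standard library's `queue.Queue`. It is not the ZeroMQ API: `BoundedOutbox` is a hypothetical name, and it simply refuses messages at the limit, whereas what ZeroMQ does at the limit depends on the socket type. The swap-to-disk option is not modelled.

```python
import queue


class BoundedOutbox:
    """Outgoing queue capped at a high water mark (illustrative only)."""

    def __init__(self, high_water_mark):
        self._q = queue.Queue(maxsize=high_water_mark)

    def try_send(self, msg):
        """Queue msg; return False instead of growing past the limit."""
        try:
            self._q.put_nowait(msg)
            return True
        except queue.Full:
            return False

    def next_to_transmit(self):
        """Pop the oldest queued message, or None if the queue is empty."""
        try:
            return self._q.get_nowait()
        except queue.Empty:
            return None
```

With a cap in place, a slow consumer costs the sender a bounded amount of memory, and the sender finds out as soon as the limit is hit.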

------
mleonhard
It looks like there's still no way for applications to detect when ZeroMQ
encounters common networking problems. If my application can't differentiate
between "no response received" and "network error", I'll end up re-
implementing timeouts and error detection logic in the application protocol.
In most situations that's a waste of time and adds extra weight to the system.
No thanks.

~~~
phintjens
In most cases 0MQ will recover silently (and usefully) from common networking
problems. When a peer crashes, for example, and then comes back, its partners
don't see the problem. Messages get queued, and then delivered. This works for
the main socket types and transports (but not PAIR and inproc:)

This lets us do things like start clients and THEN start a server... the
clients automatically connect when the server comes along. The server can go
away, be replaced by another on the same endpoint, and clients will gracefully
and invisibly start talking to the new server.

In some cases this is precisely what we want, in other cases it's not. If we
need to detect dead peers, we add heartbeating as a message flow on top of
0MQ. Most larger 0MQ applications currently do this. Eventually 0MQ sockets
may offer heartbeating; it seems a natural evolution. ("Seems" but is not
necessarily.)
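
The bookkeeping side of application-level heartbeating can be sketched in a few lines of Python. This is an assumption about how one might track liveness, not something 0MQ provides; `PeerMonitor` and its methods are hypothetical, and sending the heartbeat messages themselves is left out.

```python
import time


class PeerMonitor:
    """Tracks when each peer was last heard from (illustrative only)."""

    def __init__(self, timeout, clock=time.monotonic):
        self._timeout = timeout     # seconds of silence before a peer is "dead"
        self._clock = clock         # injectable so tests need not sleep
        self._last_seen = {}

    def heartbeat(self, peer):
        """Record that a heartbeat (or any message) arrived from peer."""
        self._last_seen[peer] = self._clock()

    def dead_peers(self):
        """Return peers silent for longer than the timeout."""
        now = self._clock()
        return sorted(p for p, t in self._last_seen.items()
                      if now - t > self._timeout)
```

An application would call `heartbeat()` for every message it receives and check `dead_peers()` periodically to decide when to give up on a peer.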

Additionally there are some patterns (like synchronous request-reply) that
simply don't handle network issues properly. If your service dies while
processing a request, a client doing a blocking read will hang forever.
There are work-arounds such as using non-blocking reads.
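
One way to picture the work-around, sketched in Python with two `queue.Queue` objects standing in for the two directions of a connection (an assumption made for illustration; no 0MQ calls are used). It uses a bounded wait rather than a pure non-blocking poll, and `request_with_timeout` is a hypothetical name.

```python
import queue


def request_with_timeout(requests, replies, payload, timeout):
    """Send payload, then wait at most `timeout` seconds for a reply.

    `requests` and `replies` are queue.Queue objects standing in for the
    two directions of a connection. Returns the reply, or None on timeout.
    """
    requests.put(payload)
    try:
        return replies.get(timeout=timeout)
    except queue.Empty:
        return None     # the service never answered; caller can retry or fail
```

The client gets control back after the timeout and can retry, fail over to another server, or report an error instead of hanging.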

It would be a mistake to try to solve all challenges at once in any project.
0MQ is taking it step by step, starting with the core questions of how to
design massively-scalable patterns correctly. Just doing that is worth gold,
as you can see from people actually using 0MQ, who consistently tell us, "this
makes our life orders of magnitude easier, thank you".

And as we solve the core problems properly, we'll continue to address other
aspects, either in 0MQ core or in layers on top of that, and this process will
continue ad infinitum.

------
mahmud
ZeroMQ is quickly becoming an even bigger hammer in the premature optimization
planet of Newbo-Thumbia.

Edit:

1) It's a networking _library_ ; no admin tools or other soft handle-bars,
like user-space utilities.

2) It uses a binary protocol. Good luck debugging that with syslog.

It's a very powerful tool in the hands of a capable systems architect, who
actually needs it. For the rest, it's pretty much like an adult male tiger;
excellent to watch in its natural habitat from a safe distance, terrible pet
idea for you and your fiancee (and not because you live in a studio
apartment.)

~~~
dtf
It's got absolutely nothing to do with optimization, it just happens to be
rather fast. As far as I know, there is no other tool out there, fast or
slow, that offers zeromq's level of simplicity.

 _1) It's a networking library_

Yes, but one that's easy to use. Versus BSD sockets, which have an annoying
array of quirks when programming for multiple platforms.

Moreover, it's consistent whatever your transport. One thing I've found using
zeromq is that, because it's so simple to use, it makes you _want_ to think in
terms of message passing. I use a queue to push messages between threads at
practically no cost. Then later if I feel like it I can shunt the consumer to
a different process, or a different host, or several different hosts with a
load balancer. I could do the same with raw sockets, but I'd rather eat my own
leg with a spoon.

 _2) It uses a binary protocol. Good luck debugging that with syslog._

It's just a packet. You can put whatever you want in it: XML, JSON, protocol
buffers, msgpack, JPEG.... The wire protocol is trivial.

It also has jolly nice bindings for Python, Ruby, Lua etc, allowing you to
bridge between C or C++ and your scripted components with zero hassle.

Go on, give it a go.

~~~
aaronblohowiak
"As far as I know, there is know other tool out there, fast or slow, that
offers zeromq's level of simplicity."

STOMP! Stomp is simpler than 0mq, but you have a central server instead of
point to point.

~~~
sustrik
Not really. Compare 0MQ's wire protocol:

<http://rfc.zeromq.org/spec:2>

and STOMP:

<http://stomp.codehaus.org/Protocol>

------
eps
Yup. I'm sure every network programmer who's worth his salt has a version of
stackable abstraction layers library that runs on epoll or a variation
thereof, and supports various transports from plain IP through domain sockets
and to exotics like TLS over ICMP. It's like a rite of passage into the circle
of network programming enlightenment :)

~~~
stcredzero
I don't understand why HLLs don't have some sort of standard abstraction on top
of sockets. Is it because it's a fundamentally "leaky abstraction" or should
it be done? Seems to me that most HLL attempts to abstract networking try to
be too high level.

~~~
azim
Java EE has something very similar to 0MQ in the form of JMS.

I'm just throwing out something random here, but I imagine the reason no one
else really does this in standard libraries is because it's quite complicated
and can easily lead to fragile protocols. Take AMQP, for example: it took
something like 5 years to agree on their 1.0 standard.

~~~
phintjens
JMS does cover some of the same ground, all messaging does. But it's basically
an API wrapper around two ancient technologies, a queue system and a topic
system. I think it might have been IBM MQSeries (which became WebSphere MQ)
and some other product but can't find the details.

What JMS achieved was to somehow turn these two very different semantics into
a single one, based around "destinations". That was very clever but also makes
JMS weirdly complex to use because the queue (request-reply) pattern and the
topic (publish-subscribe) pattern just don't work the same way. We designed
AMQP originally by taking the JMS spec and reverse engineering that into a
wire level protocol. Then around version 0.3 we threw out destinations and
came up with a generic wiring model based on exchanges, bindings, queues. That
was my summer holiday in 2005.

AMQP is BTW still some way away from a 1.0 standard and mainly it's been five
years trying to get reliability working in the center of the network. That
always seemed destined to failure, as I explained in a presentation to the
working group in 2006: <http://www.zeromq.org/whitepapers:switch-or-broker>.

AMQP and JMS both focus on resources held in the center of the network. It's
familiar to anyone using HTTP as their stack. Thin stupid clients talking to
big smart servers. 0MQ turns the focus to smart clients working across a thin,
stupid (but massive) network.

Both approaches have their value. We (iMatix and the 0MQ community) tend to
believe we can do a lot more, faster and more cheaply, using the distributed
approach.

------
j_baker
"Wouldn't it be nice if we could abstract some of the low-level details of
different socket types, connection handling, framing, or even routing?"

Perhaps I'm missing something, but wouldn't this make ZeroMQ a Modern & Fast
_abstraction_ of a Networking Stack?

~~~
phintjens
If you read the 0MQ Reference Manual at <http://api.zeromq.org/zmq.html>
you'll see that it specifies the abstraction, while the code implements that
abstraction.

------
aaronblohowiak
This is yet another well-written post. I think Ilya has that rare talent for
accurate and engaging technical writing. I'm consistently pleased with his
work. Thanks.

------
simplegeek
Good article. Please consider changing the font-type and increasing the font-
size.

~~~
trevorcreech
Cmd+? We have the technology.

~~~
simplegeek
Thanks. I really didn't know that.

