
ZeroMQ instead of HTTP, for internal services - augustl
http://augustl.com/blog/2013/zeromq_instead_of_http/
======
tieTYT
To me it looks like a ZeroMQ client can be written in many languages, so you
don't need to know Clojure to use it. That said, the article has an extra
barrier to entry because more people understand OOP languages like Java than
Clojure.

I guess my point is, don't skim the article and think it's not for you because
you don't know Clojure. Here's a list of languages it supports (
[http://zguide.zeromq.org/page:all#Ask-and-Ye-Shall-Receive](http://zguide.zeromq.org/page:all#Ask-and-Ye-Shall-Receive) ):

C++ | C# | Clojure | CL | Delphi | Erlang | F# | Felix | Go | Haskell | Haxe |
Java | Lua | Node.js | Objective-C | Perl | PHP | Python | Q | Racket | Ruby |
Scala | Tcl | Ada | Basic | ooc

~~~
augustl
OP here, thanks for making that clear. I intentionally tried to keep the
Clojure code as generally easy to read as possible, and aside from the big
Clojure win (more specifically, functional programming win) towards the end,
the code is pretty generic ZeroMQ stuff. I suppose it's no coincidence that
the ZeroMQ docs have the code snippets available in many programming languages
:)

~~~
ajkjk
I have to confess that I, as a non-Clojure-speaker, found it much less
interesting after seeing that your example was in Clojure. Using less common
languages as examples makes it feel like it's in that big world of amusing
side projects instead of something trying to be mainstream. Not on purpose,
it's subconscious.

~~~
TeMPOraL
OTOH, Clojure gets more common as more people use it in examples.
Subconsciously, if you keep seeing this strange language in every other code
example, you start to think that maybe it's not so side-projecty after all.

------
plq
In my experience, ZeroMQ is very fragile in RPC-like applications, especially
because it tries to abstract away the "connection". This mindset is very
appropriate when you're doing multicast (and ZeroMQ _rocks_ when doing
multicast), but for unicast, I actually want to detect a disconnection or a
connection failure and handle it appropriately before my outgoing buffers are
choked to death. So, I'd evaluate other alternatives before settling on ZeroMQ
as a transport for internal RPC-type messaging.

If you are fine with having the whole message in memory before parsing (or
sending) it (HTTP is not that bad when it comes to transferring huge documents
over the network), writing a raw MessagePack document to a regular TCP stream
(or tucking it inside a UDP datagram) will do the trick just fine. The
MessagePack library does support parsing streams -- see e.g. the Python
example on its homepage ([http://msgpack.org](http://msgpack.org)).
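To make the framing concrete, here's a minimal sketch in Python. It uses the stdlib `struct` module for a 4-byte length prefix and `json` as a stand-in serializer (swap in `msgpack.packb`/`msgpack.unpackb` if you're using MessagePack -- the framing logic is the same):

```python
import io
import json
import struct

def write_message(stream, obj):
    """Serialize obj and prefix it with a 4-byte big-endian length."""
    payload = json.dumps(obj).encode("utf-8")   # stand-in for msgpack.packb(obj)
    stream.write(struct.pack(">I", len(payload)) + payload)

def read_message(stream):
    """Read exactly one length-prefixed message back off the stream."""
    (length,) = struct.unpack(">I", stream.read(4))
    return json.loads(stream.read(length))      # stand-in for msgpack.unpackb

# Demo over an in-memory stream; a socket's makefile("rwb") behaves the same.
buf = io.BytesIO()
write_message(buf, {"method": "convert", "id": 1})
buf.seek(0)
print(read_message(buf))  # {'method': 'convert', 'id': 1}
```

The same two functions work against a socket's file object, which is about all an internal RPC transport really needs.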

Disclosure: I'm just a happy MessagePack (and sometimes ZeroMQ) user. I work
on Spyne ([http://spyne.io](http://spyne.io)) so I just have experience with
some of the most popular protocols out there.

------
dclusin
I always enjoy reading discussions of internal application layer communication
protocols. If there was one thing I would like to add though or wish for, it
would be that people would go one level deeper in their thoughts and
discussions. Specifically I'm referring to the transport layer (TCP/IP). I
come from the financial services sector and we are heavy users of reliable
multicast as an internal transport.

The kinds of solutions, changes, and re-thinking it has allowed us to do with
regards to our architecture have just been tremendous. For example, if your
system is latency sensitive you can move a lot of your more intensive logging
to dedicated nodes. All they have to do is listen on the specific multicast
port. This allows one to reduce latency in the critical path because you don't
have to spend time logging inputs and outputs which ultimately ends up as some
form of disk write or other I/O. Obviously you can't get rid of all of the
logging/disk writes that your system must perform to maintain durability and
other availability guarantees that are required, but for most things I've
noticed that the requirements aren't always that stringent.
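For illustration, here's roughly what such a dedicated logging node looks like in Python with stdlib sockets only. The group address and port are made up, the demo runs over loopback, and a real deployment would use PGM or similar for reliability:

```python
import socket
import struct

GROUP, PORT = "224.1.1.7", 45007  # hypothetical group/port for the log channel

# A dedicated logging node simply joins the multicast group and consumes
# whatever flows past -- the latency-critical services never address it.
listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("", PORT))
membership = struct.pack("4s4s",
                         socket.inet_aton(GROUP),
                         socket.inet_aton("127.0.0.1"))  # loopback, for the demo
listener.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)
listener.settimeout(2.0)

# A service on the critical path publishes its log line exactly once;
# every subscribed node (loggers, auditors, replayers...) sees it.
publisher = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
publisher.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                     socket.inet_aton("127.0.0.1"))
publisher.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
publisher.sendto(b"order 42 accepted", (GROUP, PORT))

data, _ = listener.recvfrom(1024)
print(data)
```

The key property is that the publisher's cost is the same whether one node listens or ten.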

What's even cooler about multicast is that there is nothing preventing you
from re-implementing your existing point-to-point, sequential, guaranteed
message delivery protocols on top of it. A reasonable question to ask would
be: why would any sane person want to do this?

Because most of the time you know which sorts of requests are going to result
in calls to which services, and that set is manageable and finite. When using
point-to-point protocols you've essentially painted yourself into the
proverbial corner, because now somewhere in your architecture you're forced to
write fan-ins and fan-outs at the application layer that accumulate the
results of
queries to various services you utilize. These things are difficult to
implement, frequently a source of headache, and difficult to reason about in
the event of a failure. Why not use a transport protocol that gives you all of
that for free?

The real difficulty here is that in order to properly utilize one-to-many
communication internally you really have to have it on your mind since it will
require you to structure your software in a very specific way. It's certainly
not a silver bullet, and won't cure all that ails you. But now that we've been
using it for so long I'm constantly coming up with solutions or ideas that are
only possible because we utilize multicast as a transport that would be
otherwise too costly or tedious to implement using TCP/IP.

~~~
diroussel
What kind of multicast are you using?

ZeroMQ can do multicast using PGM -- [http://api.zeromq.org/2-1:zmq-pgm](http://api.zeromq.org/2-1:zmq-pgm)

~~~
dclusin
PGM :)

------
espeed
Ilya Grigorik has been advocating using ZeroMQ with SPDY for modern backend
RPC.

See "Building a Modern Web Stack"...

AirBnB Tech Talk:
[https://www.youtube.com/watch?v=ZxfEcqJ4MOM](https://www.youtube.com/watch?v=ZxfEcqJ4MOM)

Slides: [http://www.igvita.com/slides/2012/http2-for-fun-and-profit.pdf](http://www.igvita.com/slides/2012/http2-for-fun-and-profit.pdf)

Blog: [http://www.igvita.com/2012/01/18/building-a-modern-web-stack-for-the-realtime-web/](http://www.igvita.com/2012/01/18/building-a-modern-web-stack-for-the-realtime-web/)

~~~
espeed
BTW: I posted a ticket to add backend SPDY support to App Engine -- see Ilya's
comment:
[https://code.google.com/p/googleappengine/issues/detail?id=9398](https://code.google.com/p/googleappengine/issues/detail?id=9398)

------
DanWaterworth
The thing that really annoys me about ZeroMQ is that sockets aren't thread-
safe. This isn't such a big problem in C, but it caused some really weird and
difficult to track down problems in a Haskell program I wrote where I used a
PAIR socket but wrote and read in different threads. I ended up doing some
fairly unsightly socket gymnastics to get around this limitation.

Haskell programs are generally thread-safe by construction. It's very easy to
overlook thread-safety in foreign libraries, but this isn't just a problem in
Haskell; multi-threaded code is the default in a number of realms.

I now try to avoid ZMQ where I can. Modern libraries should be thread-safe.

~~~
jrn
Have you found any brokerless MQ alternatives? About to build a Haskell
project and 0MQ looks good. I'll keep in mind setting up two sockets, for push
and pull. The documentation says ZeroMQ contexts are thread-safe, whereas
passing a 0MQ socket off to another thread requires putting memory barriers
around it.

[http://api.zeromq.org/2-1:zmq](http://api.zeromq.org/2-1:zmq)
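For reference, the usual workaround is to share the (thread-safe) context but give each thread its own socket. A minimal sketch with pyzmq, assuming an inproc PUSH/PULL pair:

```python
import threading
import zmq

ctx = zmq.Context.instance()  # the context is thread-safe; sockets are not

# Bind before the other thread connects (inproc needs the bind to exist
# first on older libzmq versions).
pull = ctx.socket(zmq.PULL)
pull.bind("inproc://results")

def worker():
    # This socket is created, used, and closed entirely within one thread,
    # so the thread-safety rules are never violated.
    push = ctx.socket(zmq.PUSH)
    push.connect("inproc://results")
    push.send(b"done")
    push.close()

t = threading.Thread(target=worker)
t.start()
msg = pull.recv()
t.join()
print(msg)  # b'done'
```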

~~~
dowskitest
You should have a look at Nitro.

[http://gonitro.io/](http://gonitro.io/)

It's still young, but it grew out of some frustrations with using ZeroMQ in
large-scale deployments.

[http://docs.gonitro.io/#faq](http://docs.gonitro.io/#faq)

~~~
jrn
Yes very good. thanks.

------
greenail
It seems to me that ZeroMQ is very good if you are creating a messaging
network with load balancing, fanout, and fanin; however, I don't think what
most people are using HTTP services for matches ZeroMQ's full capabilities.
That being said, it is very easy to get started with ZeroMQ. The whole idea of
making everything messaging oriented makes sense, especially if you think
message passing is a good way to simplify complex systems that need high
levels of concurrency. The issues with threading, forking, and contexts make
some things tough with ZeroMQ; see
[http://250bpm.com/blog:23](http://250bpm.com/blog:23). I stopped hacking on
ZeroMQ when I was unable to troubleshoot some hard-to-reproduce crashes.

------
blakesmith
Thrift has been my backend RPC tool of choice both at work and on personal
projects. It's been terrific for building composable backend services with
tightly integrated language support.

Pros:

- Explicit contracts through the IDL. I despise implicit contracts in service oriented code.
- Code generation for many languages.
- Server and client interfaces that don't require as much ceremony as HTTP.
- Fast binary protocol.

Cons:

- Sometimes the generated code isn't exactly what you want (can feel a bit Java centric).
- Binary protocol isn't human readable, so it can be harder to debug.
- Stream oriented calls aren't very feasible without some sort of home-grown chunking solution. This is why tools like Cassandra tell you that the whole request must be able to be held in memory.

Thrift is definitely worth a look if you're shopping for an RPC tool:
[http://diwakergupta.github.io/thrift-missing-guide/](http://diwakergupta.github.io/thrift-missing-guide/)

~~~
lmm
Can't you use framed transport to handle streaming better?

(my previous company gave up on thrift when we discovered it couldn't do
recursive data structures (i.e. trees), and made our own interface -> service
layer on top of protocol buffers instead).

~~~
blakesmith
As far as I know, you still have to read the entire response into memory for
parsing. You could write your own streaming parser, but as far as I know the
main compiled bindings don't support streaming response handling. I would love
to be proven wrong though!

------
bluesmoon
I really love ZeroMQ. We used it at LogNormal for all of the back end RPC. We
did run into one issue though. If your 0MQ servers (the ones that call bind)
run on a virtual host, as do your clients (the ones that call connect), and if
that virtual host reboots, then the clients do not automatically reconnect to
the server. I didn't quite have the time to investigate this, but from what I
could tell, it only happens with virtual hosts across a reboot. It may have to
do with iptables rules that we have in place (we have an IP whitelist), but I
can't be certain of this.

~~~
augustl
I've never come across this. I'm able to kill my server (the one with the REP
sockets), submit a form on the web page, have the HTTP request in the browser
(which under the hood proxies to ZeroMQ) hang, start up the server, and have
the request in the browser get replied to almost immediately. But I've only
been running ZeroMQ in topologies with two nodes at most, no advanced routing
etc.

~~~
bluesmoon
when you say kill the server, do you mean kill the server process or kill the
box that the server runs on? I have no problem with the former, but I have a
problem with the latter, and I can't tell why. It's pretty consistent.

~~~
rumcajz
It's a TCP issue. When the other end of the connection goes away silently
(power goes off, cable is cut, etc.) TCP doesn't inform ZeroMQ that the
connection is broken, so ZeroMQ uses it further without trying to handle the
error.

~~~
jzwinck
It's only an issue with TCP if you forget to enable keep-alive
([http://www.gnugk.org/keepalive.html](http://www.gnugk.org/keepalive.html)).
Keep-alive solves this problem, and it solves other problems too, like when
your network equipment decides a connection is no longer needed and disables
it. Or, you can implement application-level heartbeats -- whenever you send
one, you have the chance to recognize that your peer has gone away.
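In Python, enabling keep-alive on a plain socket looks like this (the tuning constants are Linux-specific and the chosen intervals are just an example); newer ZeroMQ versions also expose similar per-socket options (ZMQ_TCP_KEEPALIVE and friends):

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# Linux-specific tuning: first probe after 30s of idleness, then every 5s,
# declare the peer dead after 3 missed probes (~45s total instead of hours).
if hasattr(socket, "TCP_KEEPIDLE"):
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 30)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 5)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)

print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
```

Without the tuning, the kernel defaults (two hours before the first probe on Linux) are usually too slow to be useful for detecting a dead peer.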

------
JulianMorrison
REQ and REP have an evil, evil failure mode if they get out of sync, as I
recall. The lockstep of request and reply is rather easy to break.

~~~
jacob019
At JacobsParts we experienced this. Every few weeks our central API would just
stop responding. We were using a threaded Python ZeroMQ server and the REP
"sockets" would get stuck, even after the client was gone. Enough stuck
sockets and all threads were hung. Now we use Python gevent for an
asynchronous server and ROUTER/DEALER instead of REP/REQ. Actually the
clients can use either DEALER or REQ depending on whether they are
asynchronous or threaded. Anyway, the results have been fantastic and stable
for a year now. When there are many small requests the performance benefit of
ZeroMQ is just amazing. You can hardly tell the difference between a remote
call and a local function call!

~~~
augustl
I'm curious, did you have some sort of timeout mechanic for your REP sockets?
I don't, but I've never had problems either because I don't have that much
traffic.

~~~
jacob019
AFAIK ZeroMQ doesn't support any kind of timeout on REP sockets. It could be
hacked in with signals or a watchdog thread in a multiprocessing setup, but
that's ugly. If you're going to all that trouble it seems much cleaner to just
move to DEALER/ROUTER.

~~~
augustl
I do like the way REQ/REP keeps track of the requests and has RPC semantics,
though. Instead of blocking, you could always poll and time out.
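A sketch of that polling approach with pyzmq, using an inproc echo server as a stand-in for the real service:

```python
import zmq

ctx = zmq.Context.instance()

# An inproc echo server stands in for the real REP service.
rep = ctx.socket(zmq.REP)
rep.bind("inproc://rpc")
req = ctx.socket(zmq.REQ)
req.connect("inproc://rpc")

req.send(b"ping")
rep.send(rep.recv())  # the "server" echoes the request back

poller = zmq.Poller()
poller.register(req, zmq.POLLIN)
if dict(poller.poll(timeout=1000)):  # wait at most 1000 ms for the reply
    reply = req.recv()
else:
    # Timed out: the REQ socket is now stuck mid-conversation and must be
    # closed and recreated before it can send again.
    reply = None
print(reply)  # b'ping'
```

The caveat in the else-branch is exactly the REQ/REP lockstep problem discussed above: a timed-out REQ socket can't simply be reused.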

------
UK-AL
Has anyone used an ESB (BizTalk, WebSphere) rather than a pure messaging
solution? What's the difference?

~~~
lmm
Expensive consultants, low-quality error reporting, and a bunch of useless
tools that make management feel like they can program (and then leave you to
fix their crap)?

I presume there must be some upside, but I never saw it.

------
benmccann
I've been using Avro because of its built-in support for authentication via
SASL. Does anyone know what the author is referring to when he says work is in
progress for providing encrypted transports in ZeroMQ? Is there an issue that
I can follow somewhere for this?

~~~
gislik
CurveZMQ - authentication and encryption library

[https://github.com/zeromq/curvezmq](https://github.com/zeromq/curvezmq)

CurveZMQ was recently announced by Pieter Hintjens (one of the 0mq
contributors) on his blog:

[http://hintjens.com/blog:45](http://hintjens.com/blog:45)

------
fulafel
HTTP does have pipelining since the 90s.

~~~
augustl
I did look up HTTP pipelining before rolling with this; it is severely
limited. It can only really do sequential requests concurrently, not an
arbitrary mishmash of requests and replies like ZeroMQ can. Also, pipelining
should apparently only be used for idempotent requests.

My only source for this is the wikipedia article [1] though, do you have some
additional info about this?

[1]
[http://en.wikipedia.org/wiki/HTTP_pipelining](http://en.wikipedia.org/wiki/HTTP_pipelining)

~~~
pfraze
[http://chimera.labs.oreilly.com/books/1230000000545/ch11.html#HTTP_PIPELINING](http://chimera.labs.oreilly.com/books/1230000000545/ch11.html#HTTP_PIPELINING)

------
netghost
Does anyone have any experience using ZeroMQ with larger messages? For
instance if you need to send messages that are several megabytes for rpc style
calls.

I just remember seeing anecdotal comments about it not working well, but never
really dug into whether that was sound.

~~~
Figs
I've used it to stream an uncompressed video feed (essentially a special
purpose webcam) over LAN as a quick and dirty solution for a research project.
It worked out ok after I set a high water mark and forced it to drop data when
it couldn't keep up. Before I figured that out, messages tended to pile up and
crash my process by exhausting memory.

Can't say I'd recommend it, but it can be done, depending on the needs of your
project.

------
_pmf_
I don't follow the reasoning; wouldn't it be more sensible to compare ZeroMQ
vs. TCP instead of ZeroMQ vs. HTTP? If you can so easily replace HTTP, why the
fuck did you choose it in the first place instead of a plain TCP protocol?

~~~
augustl
ZeroMQ is not really at the TCP level. ZeroMQ is about sending messages to and
from various semantic endpoints, such as REP and REQ sockets, specifically
made for doing RPC. If I was using TCP, I'd have to do a _lot_ of work myself
(keeping the connection up, throttling locally when it's not up, keeping track
of which messages go where, etc).
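A small pyzmq sketch of what you get for free: the client can connect and send before the server is even up, and ZeroMQ queues the message locally and manages the TCP connection behind the scenes (the port number is arbitrary):

```python
import zmq

ctx = zmq.Context.instance()

# The client connects and sends before anything is listening; ZeroMQ queues
# the message and keeps retrying the TCP connection in the background.
req = ctx.socket(zmq.REQ)
req.connect("tcp://127.0.0.1:5557")  # nothing bound here yet
req.send(b"hello")

# Now the "server" comes up; the queued request is delivered on connect.
rep = ctx.socket(zmq.REP)
rep.bind("tcp://127.0.0.1:5557")
request = rep.recv()
rep.send(b"world")
reply = req.recv()
print(request, reply)  # b'hello' b'world'
```

Doing the same over raw TCP means writing your own reconnect loop, outgoing queue, and framing.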

------
blt
ZeroMQ is an impressive project and I like reading Martin's (passionate!) blog
posts. Haven't really needed it though. I hope I'll get a chance to use it to
build a distributed high-performance system one day.

------
krakensden
On a related note, it looks like Crossroads I/O is dead.

~~~
jallmann
Crossroads I/O has been superseded by nanomsg:
[https://github.com/250bpm/nanomsg](https://github.com/250bpm/nanomsg)

It's still pre-alpha but has a ZeroMQ compat layer already.

~~~
JulianMorrison
Why dead, why superseded?

~~~
jallmann
IIRC: Crossroads I/O was forked from ZeroMQ by its original author, who
subsequently decided to start from a clean slate with nanomsg. His blog has
some really good insights on various design decisions and "lessons learned"
from ZeroMQ: [http://250bpm.com/blog](http://250bpm.com/blog)

------
sauravc
HTTP works fine for me. Seems like a nice article though.

~~~
augustl
How do you manage connections? Do you use keep-alive or not? If so, do you
have some sort of connection pool mechanism in order to be able to perform
requests in parallel? These two problems are the main reason why I switched to
a MQ instead of using HTTP.

(Hopefully I managed to not make this seem like flaming, I'm genuinely
curious.)

~~~
tszming
Have you considered/tried using a decent HTTP client + HAProxy?

~~~
augustl
I have, but the fundamental limitation of sequential req/rep on a connection
with HTTP pushed me to find alternatives.

Do you have any recommendations for decent HTTP clients?

------
swah
What about interoperability with external services? With HTTP, you could just
open your APIs to the world, when it makes sense.

~~~
augustl
OP here. In the blog post, I use a traditional HTTP routing library on the
server, and I return HTTP-like responses. Since I didn't reinvent that part,
it's actually pretty trivial to either use the same code and package it into a
war file, or just fire up a Jetty in-process if you only have one server
instance.

------
jrn
Any experience with json-rpc? I'm trying to decide between the two.

~~~
rouille
I'm actually doing json-rpc over ZeroMQ, using a scheme similar to the one
described above, to do file conversion tasks.

A ZeroMQ proxy using a ROUTER/DEALER pair with a bunch of REP sockets in the
background.

The clients use a simple REQ socket.

All in plain old C using ZeroMQ and jansson for JSON while conforming to the
json-rpc 2.0 spec.
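For anyone curious, that topology is only a few lines with pyzmq; this is a hedged Python sketch (the C version is structurally the same), with a made-up "double" method standing in for a real conversion task:

```python
import json
import threading
import zmq

ctx = zmq.Context.instance()

# Front end for clients, back end for a pool of REP workers; zmq.proxy
# shuttles messages between the two.
router = ctx.socket(zmq.ROUTER)
router.bind("inproc://frontend")
dealer = ctx.socket(zmq.DEALER)
dealer.bind("inproc://backend")
threading.Thread(target=zmq.proxy, args=(router, dealer), daemon=True).start()

def worker():
    rep = ctx.socket(zmq.REP)
    rep.connect("inproc://backend")
    while True:
        call = json.loads(rep.recv())
        rep.send_json({"jsonrpc": "2.0",
                       "result": call["params"][0] * 2,
                       "id": call["id"]})

threading.Thread(target=worker, daemon=True).start()

client = ctx.socket(zmq.REQ)
client.connect("inproc://frontend")
client.send_json({"jsonrpc": "2.0", "method": "double", "params": [21], "id": 1})
print(client.recv_json()["result"])
```

Adding capacity is then just a matter of starting more worker threads (or processes, with tcp instead of inproc) connected to the back end.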

~~~
swah
If I had to do that, I'd probably go with a queue on Redis or Postgres. "Cogs
bad".

What are the advantages of ZeroMQ when you're still on one computer?

------
thrownaway2424
I see a lot of bare IP addresses and whatnot in ZeroMQ examples. Is it easy to
integrate with a naming system like Zookeeper?

~~~
kinofcain
The ZeroMQ guide has a number of sections on discovery:
[http://zguide.zeromq.org/page:all#Discovery](http://zguide.zeromq.org/page:all#Discovery)
However, you could also use something like ZooKeeper.

The guide is a fascinating read. Even if you never write anything using
ZeroMQ, it's a very useful intro to designing concurrent message passing
systems.

~~~
thrownaway2424
What I got from reading that is the author has no idea how to solve the
problem of many frontends and many backends of many services on a large scale
(same feeling I got from reading the section about heartbeating.)

~~~
diroussel
Interesting. Can you enlighten us with more detail?

~~~
thrownaway2424
The author lays out a dozen different, terrible ways to discover services,
only one of which is even remotely suitable. I associate that kind of stream-
of-consciousness documentation with people who are making it up as they go
along.

Anyway, the proposed solution of a full-mesh FE-to-BE heartbeat network over
UDP, switching to TCP in case of idleness, is not going to scale. That just
guarantees that you are going to run out of file descriptors in case of a
packet loss event, as everyone upgrades their heartbeat protocol to TCP.

~~~
diroussel
What, in your opinion, is the correct way to implement heartbeats in such a
message oriented system?

------
dschiptsov
HTTP is a bad protocol for internal services due to its arbitrary-length text
headers (big parsing overhead) and its lack of multiplexing of packets per
connection by default. Even FastCGI is a _much_ better choice.

Erlang's messaging protocol, Protocol Buffers, MessagePack and other careful
serialization solutions try to reduce parsing and communication overheads.

ZeroMQ seems like a natural choice for the core of so-called messaging
middleware because, well, that's what it was designed for.)

And, of course, avoiding the JVM leads to a great reduction in wasted
resources.

