
Ask HN: Has anyone had negative experiences with zeromq? - jrussbowman
I've been looking at zeromq. Originally I was thinking about just using Python Tornado and it to form an http caching reverse proxy, but after reading the guide I'm getting really interested in using it as the basis of a scaling architecture.

I've been doing a lot of reading on it, and am now curious whether anyone has used it and encountered issues they could share?
======
phintjens
Before I opened the 0MQ Pandora box I'd given up coding and was happily
writing my autobiography. 0MQ seemed like a pleasant way to spend a weekend.
But before I knew it, I'd lost control. Days, weeks, months have passed now,
all I think about are more subtle and perfect messaging patterns. They flash
before my eyes. Weird names and topologies. It doesn't get better, but worse.
Soon I'll be coding all nighters, my wife will leave me, my kids will forget
me, and all I'll be doing is programming, motherfucker.

Seriously, 0MQ has made network programming fun (again) in a bad, addictive,
way. Any design I can think of turns into real working code in a few hours,
sometimes days. And I'm using C, a language that isn't normally fun to work
in.

Right now, it's multithreaded clients and servers for resilient shared
distributed hash maps. Tomorrow, network-wide logging. After that, another
message broker. And so on.

Yes, it's a negative experience. I'd like my old lazy life back.

For the love of god, don't try it.

~~~
jrussbowman
I've already been there, working with all the APIs I use on unscatter.com.
Fortunately, with 2 kids I have to stop, though I have had some very tired
weeks when I can't sleep because I am up all night coding.

------
m0th87
I have, and it amazes me that everyone speaks so positively of it. I'm not
going to argue it's a bad library, or it's not worth investing in, but it's
not without its problems. And I certainly don't think it's applicable to all
distributed computation use cases.

This post on SO outlines some of the troubles I've had:
<http://stackoverflow.com/questions/4870814/is-zeromq-production-ready/5084150#5084150>

A couple of other issues. First, it's really easy to game a ZeroMQ socket.
Send it the wrong data, and your application will just fall flat on its face.
So you can't run it anywhere that has untrusted computers (e.g. over the
Internet).

Another issue I've had is a race condition that occurs when you call recv()
before anything is in the queue. The method will continue to block even after
it receives something. This is a _big_ deal because it requires some
workarounds with bad performance. But I wasn't able to get the jzmq dev team
to reproduce it, so it must be something restricted to either OS X or just my
system.
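
One generic way to avoid sitting in a blocking recv() with no way out is to poll with a timeout first. The sketch below uses plain sockets from the Python standard library rather than jzmq, so it only shows the shape of such a workaround; it does not reproduce the race, and whether it would sidestep it is untested. `recv_with_timeout` is a made-up helper name.

```python
import select
import socket


def recv_with_timeout(sock, timeout_s, bufsize=4096):
    """Return data from sock, or None if nothing arrives within timeout_s.

    Hypothetical helper: it polls for readability first, so the caller
    only calls recv() once data is known to be waiting.
    """
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        return None
    return sock.recv(bufsize)


# A connected pair of local sockets stands in for the two peers.
left, right = socket.socketpair()

nothing_yet = recv_with_timeout(right, 0.05)   # queue is empty: times out
left.sendall(b"hello")
message = recv_with_timeout(right, 1.0)        # data is waiting: returns it

left.close()
right.close()
```

The cost is the one mentioned above: a short timeout means extra wakeups, a long one means slower reaction to shutdown or failure.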

FWIW, I think most of the issues are restricted to jzmq, because there's a
good deal more complexity running around in that project to overcome the Java
<-> C bridge.

The reason I continue to use it anyway is because:

1) It's absurdly useful when it works.

2) The dev team is very responsive.

3) Bugs do get fixed if they can reproduce it. I already have had one issue
resolved: <https://github.com/zeromq/jzmq/issues/closed#issue/31>

As for Tornado, I am in _love_ with that technology. Rather than a framework,
it's more like a set of libraries for HTTP communication. That has huge
implications, and it feels much more pleasant for me to work with than, say,
Django.

~~~
jrussbowman
Thank you, this is the kind of information I was looking for

------
kordless
Loggly uses 0MQ extensively and we'd be happy to sit down with you and chat
about it. We were the ones that paid the 0MQ guys to bake disk persistence
back into the new version. Also, as someone else mentioned, Zed knows it
pretty well and he's awesome about taking time to teach what he knows to
others.

~~~
jrussbowman
I may take you up on that at some point. I tend to learn by jumping in the deep
end, so I'll likely write code first but if I run into any questions I'll keep
your offer to chat about it in mind.

Everything I'm doing is open source, I'm planning on using the Apache 2.0
license. The Github repo (without any code yet) is here -
<https://github.com/joerussbowman/Scale0>

The README gives an overview of what I'm attempting to accomplish.

------
obiterdictum
I can't say I have a lot of negative experience, but after evaluating it, I've
come to the conclusion that it's not the right tool for the job for us. We
develop trading systems and I wanted a decent messaging framework for internal
non-speed-critical communication between apps. Disclaimer: I had limited time
to evaluate it, so I may have some misconceptions about ZMQ, you have been
warned.

1. Extensive use of asserts in release builds terrifies me. They're meant to
check for conditions that shouldn't happen, but I see users complaining about
their apps aborting with assertion failures on ZMQ mailing lists and it comes
up fairly frequently in Google. There are a fair number of asserts on error
codes returned from system calls. I don't want a critical process to crash
because I've used the library the wrong way in a completely different part of
the application.

2. Only in 2.1 did they fix the problem where some messages would not be
flushed and would be lost if you terminated the process too early. This seems
like a fairly common bug for younger projects. The recommended workaround
is... calling "sleep" before you exit, which is one of the deadly sins of
multithreaded programming! This and the point above convince me that ZMQ isn't
mature enough for me to be comfortable with.

3. Transparent reconnection is good, but some of our applications need to
quickly detect that other nodes in the system are missing, which forces me to
implement an out-of-band heartbeat mechanism.

4. Threading model seems a bit awkward to me (last I checked). First of all,
let me state that I personally believe that a library starting threads behind
your back is a Bad Thing (unless it's a framework). ZMQ uses a sender thread
that you queue your messages into, yet it forces you to dispatch your receive
loop by either blocking read or zmq_poll. If it already starts threads by
itself, why not provide a callback?

5. Not really a problem, but a missing feature: no way to demultiplex
messages from a stream of messages, so you have to implement it yourself. You
can subscribe to a subset of messages on a socket, but can't subscribe to
multiple subsets from a single socket.
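
To make point 3 concrete, here is a minimal sketch of the bookkeeping behind an out-of-band heartbeat: each peer is expected to send something periodically, and a peer counts as missing once it has been silent for longer than a timeout. It is plain Python with no ZeroMQ calls; `HeartbeatMonitor`, the peer names and the 3-second timeout are all invented for illustration.

```python
import time


class HeartbeatMonitor:
    """Tracks when each peer was last heard from (hypothetical helper)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so tests can fake time
        self.last_seen = {}

    def beat(self, peer):
        """Record a heartbeat (or any other message) received from peer."""
        self.last_seen[peer] = self.clock()

    def missing(self):
        """Return the sorted list of peers silent for longer than timeout_s."""
        now = self.clock()
        return sorted(p for p, t in self.last_seen.items()
                      if now - t > self.timeout_s)


# A fake clock keeps the example deterministic.
fake_now = [0.0]
monitor = HeartbeatMonitor(timeout_s=3.0, clock=lambda: fake_now[0])

monitor.beat("pricing")
monitor.beat("orders")
fake_now[0] = 2.0
monitor.beat("orders")        # "orders" keeps beating, "pricing" goes quiet
fake_now[0] = 4.0
gone = monitor.missing()      # only "pricing" has exceeded the timeout
```

In a real system `beat` would be called from the receive loop and `missing` from a timer; how the heartbeats travel (a dedicated socket, piggybacked on normal traffic) is a separate design choice.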

~~~
phintjens
These are good points. Let me answer them.

1. The 0MQ devs originally got asserts backwards, using them to validate
external input (e.g. on sockets) instead of internal consistency. We've been
fixing this for a year or two now, and it's pretty good. You'll get assertion
failures if you e.g. use sockets from multiple threads. Not so much if you
pass bad stuff onto sockets.

2. 2.1 was a great step forwards, and the use of "sleep" was in toy examples.
Real networking apps tend to run forever, so this message loss at exit wasn't
a big deal. You're right that the product is still young.

3. Totally agreed, this lack of peer presence detection is annoying, and the
source of some debate on the lists.

4. Threading model works fine for me, I've used it extensively. A usable
reactor is a hundred lines of code, no more. See the libzapi zloop reactor, in
C, for example.

5. Demultiplexing sounds like useful functionality but should probably sit
above sockets.
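
As a rough idea of what "above sockets" could look like for point 5, here is a small dispatcher that routes each received message to handlers by prefix. It is plain Python with no 0MQ calls; `Demultiplexer`, the `TRADE.`/`QUOTE.` prefixes and the sample messages are invented for illustration, and in a real program the messages would come from a socket's receive loop.

```python
class Demultiplexer:
    """Routes each message to every handler whose prefix it starts with.

    Hypothetical layer that would sit above a single receiving socket.
    """

    def __init__(self):
        self.routes = []   # list of (prefix, handler) pairs

    def subscribe(self, prefix, handler):
        """Register handler for messages that begin with prefix."""
        self.routes.append((prefix, handler))

    def dispatch(self, message):
        """Deliver message to matching handlers; return how many matched."""
        matched = 0
        for prefix, handler in self.routes:
            if message.startswith(prefix):
                handler(message)
                matched += 1
        return matched


trades, quotes = [], []
demux = Demultiplexer()
demux.subscribe(b"TRADE.", trades.append)
demux.subscribe(b"QUOTE.", quotes.append)

# Stand-in for messages read off one socket.
for msg in [b"TRADE.AAPL 100", b"QUOTE.AAPL 1.23", b"NEWS.x", b"TRADE.IBM 5"]:
    demux.dispatch(msg)
```

Messages that match no prefix are simply dropped here; a real layer would probably want a default handler or a counter for them.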

------
msutherl
ZeroMQ is used in the lubyk (formerly rubyk) project: <http://lubyk.org/en>

~~~
zedshaw
Ooooh, that looks sexay. I will play with this now.

~~~
msutherl
Also check out LuaAV, which is more mature and has some differences of
philosophy: <http://lua-av.mat.ucsb.edu/blog/>

------
timf
> "using Python Tornado and it to form an http caching reverse proxy"

It sounds like you should investigate <http://mongrel2.org/home>

~~~
jrussbowman
I have. And honestly, if I were in "build a product and get something released
because I'm building a business" mode, I'd be using it, I think. It's a pretty
good fit for what I was looking for when I started.

Right now, though, I'm thinking learning experience with zeromq, and also I'm
seeing how I can build something that can be used to scale an http application
beyond a single datacenter/cloud and more. I actually find that pretty
exciting and since I have young children and a good paying job, I'm still
treating the product I'm working on as a hobby rather than a business.

~~~
zedshaw
Yes, definitely do this. Don't let people try to convince you that you
shouldn't reinvent the wheel. Typically they just have some wheel they've
reinvented that they want you to use. Instead, you should implement as many
things as you can to learn how they work, and then use this knowledge to
select tools and avoid bullshit and marketing choices.

And who knows, maybe you'll do something better. That's progress.

Also check out gevent and eventlet to see the differences with those systems
as well. I just submitted a patch to eventlet to give it better zeromq
support.

~~~
jrussbowman
And that's coming from the guy who wrote the mongrel2 wheel :)

------
chuhnk
I think Zed Shaw would also vouch for the brilliance of 0mq. He uses it in
many places including mongrel2. The one thing he did mention in his pycon
presentation was not to expose it to the internet as there are some assertions
in the code which cause it to blow up on protocol errors.

------
pshc
I'm kind of in the same boat. I'm evaluating it right now to see if it'd be
suitable for iOS<-->server comms, but it seems more like something you'd use
behind the server gateway.

Thing is 0MQ gives you transparent auto-reconnection--but I want to indicate
with a spinner when that's happening--and it makes request-reply synchronous--
but I already do everything asynchronously in the client anyway. Hmm.

------
docmarionum1
I've used it and it works great. The only problem I remember encountering was
getting it to work with a virtualenv, but that was probably just inexperience
on my part.

------
kemiller
It sure seems amazing for what it is. I do wish people would stop comparing it
to message brokers -- solving for entirely different problems as far as I can
tell.

