
NSQ – A realtime distributed messaging platform - kowalchuk
http://nsq.io/
======
joshrotenberg
Having played with NSQ off and on for the past few months, and having gone
deeper with it over the past week or so in preparation for rolling out a
production service, here are a few things that have really impressed me, in no
particular order:

It's super easy to run. A few command line params, if that, and I've got a
local nsqd running that I can develop and test against. Great for those
offline coding sessions on BART.

The tools it ships with are very handy, specifically to_nsq and nsq_tail.
Again, both make developing stuff really easy because I can get stuff in and
out with fast, native command line tools or http. In fact, when I was first
playing around I built an entire prototype based on a few simple shell scripts
and the provided tools, just to see what would happen.

The nsqadmin and nsq_stat tools give you a lot of visibility into what your
producers and consumers are doing, how well things are performing, who is
connected to what, etc. And again, part of the distribution and very easy to
run locally.

The Go producer and consumer API is clean and easy to build around, pretty
well documented with a lot of examples in the NSQ apps themselves.

So far very stable. I haven't seen any crashes on the server or
producer/consumer side.

Helpful and friendly people on a low noise freenode channel.

~~~
matticakes
Thanks!

This is something that I've always tried to stress when talking about NSQ...

The "message queue" is the most boring and uninteresting aspect of the system.

It's the combination of out-of-the-box tooling and conceptually simple
primitives that really differentiate it from other systems.

~~~
ryanjshaw
The Features page says it is "horizontally scalable (no brokers, seamlessly
add more nodes to the cluster)", but the Design page says you need to run a
'nsqd' daemon - which is a message broker - i.e. you cannot embed the daemon
into your application?

~~~
joshrotenberg
You can embed nsqd (in Go):
[https://gist.github.com/joshrotenberg/ad49d39dbee8b48789d9](https://gist.github.com/joshrotenberg/ad49d39dbee8b48789d9)

That said, for a larger deployment you'd still probably want to run the
lookupd as well, at which point, at least for a typical architecture, running
nsqd standalone is probably fine. Embedding is darn handy for testing, though.

------
makmanalp
Previous discussion on bitly blog / reddit with some comparisons to zmq:
[http://www.reddit.com/r/programming/comments/117dsj/nsq_real...](http://www.reddit.com/r/programming/comments/117dsj/nsq_realtime_distributed_message_processing_at/)

And also a great article that compares / contrasts NSQ along with a lot of
other alternatives:
[http://www.bravenewgeek.com/dissecting-message-queues/](http://www.bravenewgeek.com/dissecting-message-queues/)

~~~
matticakes
(project author)

I appreciate all the work that Tyler has put into his series of blog posts on
messaging systems (and NSQ bug reports!) but I think this particular article
is one that misses the mark [1].

While raw performance is important, I think comparing the guarantees and
related operational and development semantics is far more interesting and
useful.

These systems vary a great deal in the interfaces and functionality they
expose to users building on top of them and operators dealing with them when
they're deployed in production (and inevitably break!). _This_ is ultimately
what matters most.

P.S. single node (localhost) performance is essentially useless. Most of these
tools are designed to be deployed as the backbone of large distributed systems
(NSQ in particular), so scalability and performance in that context are a
better indication of real-world performance.

[1] He's since posted this follow-up that outlines many of the same things
[http://www.bravenewgeek.com/benchmark-responsibly/](http://www.bravenewgeek.com/benchmark-responsibly/)

------
chuhnk
We are using NSQ at Hailo for sending and receiving billions of messages every
day. It scales incredibly well and is extremely reliable. We'd be happy to
speak more about this with anyone who's interested.

~~~
nivertech
Can one use NSQ as a replacement for Kafka?

~~~
walls
Kafka offers persistence, which seems extremely rare in distributed queues.
(Is it even available outside of Kafka?)

On that note, what is everyone using these queues for that they can ignore
durability?

~~~
sytringy05
The standard JMS spec supports durable queues and topics. Generally a
nightmare to manage though - trying to track down an unconsumed message on a
clustered durable topic with lots of subscribers is like trying to find a
needle in a stack of needles.

------
ibejoeb
I'm glad these new projects are coming on, and this one seems to be very
forthright about its limitations, but I'm just putting this in the bucket with
all of the other messaging systems that provide a minimum feature set.

I haven't seen much on the market recently that offers things like (first-
class) persistence, guaranteed ordering, guaranteed delivery, or any of the
other more complex distribution patterns. That's why I'm still using ActiveMQ.
I'm pretty happy with it, so I suppose I'm not really looking for a
replacement, but I still wonder what benefit people are getting from these
newer systems.

~~~
matticakes
(project author)

NSQ is as much about what it doesn't do as it is what it does. To a certain
extent this mirrors, and was inspired by, the language's philosophy (Go) [1].

Also, NSQ was designed to replace an existing home-grown system deployed at
scale. This dictated a lot of the initial requirements (and in certain cases
excluded off-the-shelf tools).

When we left the experimental phase we realized we had built something that
was useful to others, and it turns out that despite not having the features
you've identified it can be incredibly effective in lots of use cases that
don't need stronger guarantees.

[1] If I'm being honest, NSQ was a vehicle for adoption of Go at bitly as well
as the project we used to learn the language. This was a huge risk at the time
(almost 3 years ago) but one that has certainly paid off.

~~~
ibejoeb
Yeah, don't get me wrong, it looks good for what it is, and certainly some of
the best products come from skunkworks and hobby.

Is there anywhere I can read about the specific problem at bitly you built it
to facilitate? I mean, I've seen a lot of these systems where there are
integration and timing concerns, so some kind of message bus is useful. But if
they do get to a point where things need to be distributed, sending messages
without order and delivery guarantees just seems to be passing the buck to
downstream systems to not mess up.

~~~
matticakes
Yea, in the original announcement blog post (which became the design doc):

[http://nsq.io/overview/design.html](http://nsq.io/overview/design.html)

------
discardorama
I don't think I had heard about NSQ till now; how is it better than, say, a
pub-sub queue on a beefy Redis server? (Or a cluster of servers, if you wish).
Nothing beats Redis' simplicity, AFAICT.

~~~
JulianMorrison
Redis pub/sub is ephemeral. If you aren't connected and listening, you missed
it.

~~~
discardorama
My mistake: I should have said Redis queues.

~~~
JulianMorrison
NSQ's improvements over Redis queues are:

\- NSQ is about as simple as Redis. Like Redis, it's one binary (and an
optional second, if you want a dynamically resizable cluster)

\- simplicity of clustering, just boot more servers

\- messages aren't lost if the worker process takes one and then dies, they
will be retried.

\- optional overflow to disk, queues do not have to fit in RAM

\- optionally always write to disk, for reliability

\- a message on one "topic" can be queued automatically in several "channels",
and the sender doesn't need to know. So messages can be diverted to
monitoring, for example.

------
simi_
NATS is a simpler, faster alternative: [http://nats.io/](http://nats.io/)

~~~
clintonb11
Gotta love those benchmarks for 4k payloads... those are some big messages
right there.

~~~
simi_
Our payloads are generally exactly 20B. We don't use the MQ to transport data.

------
alexatkeplar
It looks like there isn't any kind of user-definable shard/partition key
available within the topics - a message within a topic could go to any client
subscribing to a channel for that topic? Is that correct?

That's obviously fine for AWS Lambda-style single event processing (maps,
filters, sinks), but isn't that going to make multiple event processing
(reduces, sorts) really difficult? It rules out using in-process memory or
local KV storage for maintaining the necessary state across the aggregation
window - you're only left with pounding some remote database, not a great
strategy as per
[http://radar.oreilly.com/2014/07/why-local-state-is-a-fundam...](http://radar.oreilly.com/2014/07/why-local-state-is-a-fundamental-primitive-in-stream-processing.html)

~~~
ploxiln
For the applications I've dealt with, it's an upside that workers
distributed across a number of servers can process messages from producers
distributed across a number of servers, with no user-defined partitioning.
Kafka, I think, requires you to define partitions to have multiple workers
consume a stream. In the NSQ case, if you need more throughput, just spin it
up.

Somewhat on a tangent, one of the goals of the design was to avoid any servers
in the middle of the "workers" (consumers) and producers, and to make sure
that any consuming or producing server dying or experiencing a network
partition doesn't slow down any producers or consumers who can still reach
each other.

You're right, this isn't really focused on replacing map-reduce things like
some other message streaming systems are. For one, I can't imagine a "sorting"
step using nsq. It's still used for somewhat hefty data-processing tasks, in
multiple stages. Each stage has as many servers spun up as needed, consuming
messages directly from the previous stage, and publishing to nsqd on
localhost. They might consume and acknowledge multiple messages to produce one
new message.

For reduced inter-server hops before hitting the database, you can run workers
on the same host as producers. So you might have X hosts producing messages
(and running their own nsqd locally, of course), and have Y consumers on each
host, configured to consume messages from localhost instead of using
nsqlookupd to find all sources, and those consumers would make direct requests
to the appropriate database host. With this setup you can still use nsqadmin
to monitor the queue levels on all producer hosts, and you can still be
archiving messages to disk on some other remote hosts.

~~~
alexatkeplar
Thanks, that's a very helpful explanation. (A partition key is optional in
Kafka - the DefaultPartitioner just computes a random partition regardless of
key.)

I am having trouble visualizing this in NSQ:

> [A server] might consume and acknowledge multiple messages to produce one
> new message

I can imagine a server consuming multiple messages and performing a single
write to e.g. ElasticSearch or Cassandra - in this case it doesn't matter that
the batched write is a random subset of messages. But I can't imagine
consuming multiple messages and emitting another message - at least not
without being able to reason about what partition of data was present on that
server. For example, I can't detect an abandoned shopping cart because a
single shopper's events will be scattered across all servers.

Maybe I'm thinking of this wrong - can you give a live use case for consuming
multiple events and emitting a single message in NSQ?
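
For readers unfamiliar with what a partition key buys you, here is a small Go sketch of keyed routing. It does not use any NSQ or Kafka API; `workerFor`, the shopper IDs, and the worker count are all hypothetical. It only illustrates why hashing a key keeps one shopper's events on one worker, which is the property the comment above says NSQ does not provide.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// workerFor maps a partition key (here a hypothetical shopper ID) to one of
// n workers, so every event with the same key lands on the same worker.
func workerFor(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on an fnv hash never returns an error
	return int(h.Sum32() % uint32(n))
}

func main() {
	events := []struct{ shopper, action string }{
		{"alice", "add_to_cart"},
		{"bob", "add_to_cart"},
		{"alice", "checkout"},
	}

	const workers = 4
	// Per-worker local state: shopper -> actions seen so far.
	state := make([]map[string][]string, workers)
	for i := range state {
		state[i] = make(map[string][]string)
	}

	for _, e := range events {
		w := workerFor(e.shopper, workers)
		state[w][e.shopper] = append(state[w][e.shopper], e.action)
	}

	// Both of alice's events sit in the local state of a single worker.
	w := workerFor("alice", workers)
	fmt.Println(len(state[w]["alice"])) // 2
}
```

Without a key, each event could reach any worker, so the per-shopper slice would be split across processes and an aggregation like "cart abandoned" would need a shared datastore.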

~~~
ploxiln
You're right, it can't do some calculation across all data for a single user
or session or anything like that.

I suppose I was thinking that it might reduce "just a little", just adding
stuff together as appropriate, but that wouldn't really be useful.

In practice, the way multiple events turn into a single event is that some are
just thrown out, according to some static filter rule, or after checking a
datastore.

NSQ is typically used in places where new data never stops flowing.
So sometimes for this sort of thing, it updates a value in some datastore, and
creates a new message for the next stage with the latest value, perhaps only
if the latest value reaches some threshold, in a scheme where the value
decays.
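
A rough Go sketch of that last pattern, under assumptions: `DecayingValue` stands in for the value kept in a datastore, the half-life and threshold are invented numbers, and the `fmt.Printf` call marks where a real stage would publish a message for the next stage. The comment above does not say which decay function is used; exponential decay is just one plausible choice.

```go
package main

import (
	"fmt"
	"math"
)

// DecayingValue is a hypothetical stand-in for a value kept in a datastore.
// It halves every halfLife seconds when no new events arrive.
type DecayingValue struct {
	value    float64
	lastSeen float64 // time of the last update, in seconds
	halfLife float64 // in seconds
}

// Add decays the stored value for the time elapsed since the last update,
// adds weight for the new event, and returns the latest value.
func (d *DecayingValue) Add(weight, now float64) float64 {
	elapsed := now - d.lastSeen
	if elapsed > 0 {
		d.value *= math.Pow(0.5, elapsed/d.halfLife)
	}
	d.value += weight
	d.lastSeen = now
	return d.value
}

func main() {
	const threshold = 3.0 // hypothetical cutoff for notifying the next stage
	score := &DecayingValue{halfLife: 60}

	for i := 0; i < 6; i++ {
		now := float64(i * 10) // one event every 10 seconds
		latest := score.Add(1, now)
		if latest >= threshold {
			// In a real pipeline this would be a publish to the next topic.
			fmt.Printf("t=%.0fs value=%.2f -> emit\n", now, latest)
		}
	}
}
```

Because each worker only updates and reads one shared value per key, it does not matter which worker receives which event.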

------
j_baker
I've noticed that a lot of message queuing systems have a "delivery at least
once" property. What exactly does the possibility that a message could be
delivered more than once buy you?

~~~
tel
You can have "at most once", "exactly once", or "at least once" [0]. Each has
a different tradeoff in terms of implementation complexity, speed, and other
availability/consistency concerns.

For different applications, different delivery QoSes are tolerable. For
instance, if your messages are idempotent (either by design or nature) then
"at least once" is equivalent to "exactly once" and may be much faster/easier.

Generally, "exactly once" demands the least of the application and the most of
the broker.

[0] I suppose you could have "any number of times" as well, but that's no
guarantee at all.
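
As a sketch of the idempotence point, the Go snippet below wraps a handler so that a redelivered message has no further effect. The `Message` type, its `ID` field, and `Deduper` are hypothetical; it assumes the producer attaches a unique ID to each message, and it is not tied to any particular broker's client library.

```go
package main

import "fmt"

// Message is a hypothetical message carrying a unique ID set by the producer.
type Message struct {
	ID   string
	Body string
}

// Deduper makes a handler idempotent by remembering which IDs were applied.
type Deduper struct {
	seen map[string]bool
}

// NewDeduper returns a Deduper with an empty set of seen IDs.
func NewDeduper() *Deduper {
	return &Deduper{seen: make(map[string]bool)}
}

// Handle runs apply only the first time a given message ID is seen and
// reports whether the message was applied.
func (d *Deduper) Handle(m Message, apply func(Message)) bool {
	if d.seen[m.ID] {
		return false // redelivery: already applied, safe to ignore
	}
	apply(m)
	d.seen[m.ID] = true
	return true
}

func main() {
	total := 0
	d := NewDeduper()

	// Message "a" is delivered twice, as "at least once" delivery allows.
	deliveries := []Message{{"a", "+1"}, {"b", "+1"}, {"a", "+1"}}
	for _, m := range deliveries {
		d.Handle(m, func(Message) { total++ })
	}
	fmt.Println(total) // 2
}
```

The in-memory set here grows without bound and is lost on restart; a real consumer would more likely keep seen IDs in a datastore with an expiry, or make the operation itself idempotent (for example, "set value to X" rather than "add 1").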

------
deathflute
Can somebody comment on the advantages of using this over zeromq/nanomsg?

Thanks!

~~~
matticakes
NSQ is a full-featured messaging platform out-of-the-box, whereas zeromq and
nanomsg are lower level libraries that you could use to build (the same)
functionality.

------
kitd
Looks very interesting, though their protocol seems very MQTT-like. I'd be
interested whether they considered (and rejected) other protocols before
writing their own.

~~~
jallmann
Here is the HN thread in which NSQ was announced, and similar questions were
asked.

[https://news.ycombinator.com/item?id=4631994](https://news.ycombinator.com/item?id=4631994)

Although I don't know how the system has evolved since then.

------
foobarqux
Could you use NSQ to create a P2P messaging system?

~~~
joelthelion
I think you would need to add good firewall traversal technology (UPnP, etc.).

------
jsilence
How does this relate to mqtt?

~~~
cpitman
MQTT is a wire protocol for connecting a message sender and receiver, but
leaves the actual messaging architecture (brokers, P2P, etc) unspecified. NSQ
appears to include its own proprietary wire format, as well as a full
in-memory messaging architecture.

~~~
Zaheer
Just want to add the NSQ wire format looks pretty efficient as well. Seems to
be defined at the binary level which was one of the biggest wins for MQTT in
terms of memory footprint. Although would be nice if a standard protocol was
used instead of creating their own.

~~~
sytringy05
Binary formats have their drawbacks - the guy that designed the AMQP protocol
called the wire protocol he designed for it an "expert mistake". Admittedly he
was designing a spec rather than a product but worth understanding.

Source is a long but very interesting read about messaging, amqp, zeromq and
others:
[http://www.imatix.com/articles:whats-wrong-with-amqp](http://www.imatix.com/articles:whats-wrong-with-amqp)

------
curiously
This looks interesting but I'm hesitant to switch from RabbitMQ & Celery to
this.

~~~
caffeineninja
We use it at Life360, formerly on RabbitMQ. No real issues.

~~~
clintonb11
I would also like to know why you made the switch in the first place. Was it
the more-easily-distributed nature of nsq?

------
seanieb
What is the core problem this solves?

