
Alchemy Micro-Service Framework: Using RabbitMQ Instead of HTTP - grahar64
https://github.com/LoyaltyNZ/alchemy-framework
======
atombender
As someone who has used RabbitMQ in production for many years, I'd suggest you
consider using NATS [1] for RPC instead.

RabbitMQ's high availability support is, frankly, terrible [2]. It's a single
point of failure no matter how you turn it, because it cannot merge
conflicting queues that result from a split-brain situation. Partitions can
happen not just on network outage, but also in high-load situations.

NATS is also a _lot_ faster [3], and its client network protocol is so simple
that you can implement a client in a couple hundred lines in any language.
Compare to AMQP, which is complex, often implemented wrong, and requires a lot
of setup (at the very least: declare exchanges, declare queues, then bind
them) on the client side. NATS does topic-based pub/sub out of the box, no
schema required.
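
To give a feel for how small the NATS wire protocol is, here is a minimal sketch in Python of framing the text commands a client sends and receives. The function names are made up for illustration; only the `PUB`, `SUB` and `MSG` line formats come from the NATS protocol, and a real client would also need to handle `CONNECT`, `PING`/`PONG` and error replies.

```python
from typing import Optional


def encode_pub(subject: str, payload: bytes, reply_to: Optional[str] = None) -> bytes:
    """Frame a publish: PUB <subject> [reply-to] <#bytes>\\r\\n<payload>\\r\\n"""
    if reply_to:
        head = f"PUB {subject} {reply_to} {len(payload)}"
    else:
        head = f"PUB {subject} {len(payload)}"
    return head.encode("ascii") + b"\r\n" + payload + b"\r\n"


def encode_sub(subject: str, sid: str, queue_group: Optional[str] = None) -> bytes:
    """Frame a subscription: SUB <subject> [queue group] <sid>\\r\\n"""
    if queue_group:
        return f"SUB {subject} {queue_group} {sid}\r\n".encode("ascii")
    return f"SUB {subject} {sid}\r\n".encode("ascii")


def parse_msg_header(line: bytes) -> dict:
    """Parse a delivery header: MSG <subject> <sid> [reply-to] <#bytes>"""
    parts = line.strip().decode("ascii").split()
    if not parts or parts[0] != "MSG" or len(parts) not in (4, 5):
        raise ValueError(f"not a MSG header: {line!r}")
    reply_to = parts[3] if len(parts) == 5 else None
    return {
        "subject": parts[1],
        "sid": parts[2],
        "reply_to": reply_to,
        "size": int(parts[-1]),  # number of payload bytes that follow
    }
```

For RPC, the requester publishes with a `reply_to` inbox subject it has subscribed to, and the responder publishes its answer to that subject.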

(Re performance, relying on ACK/NACK with RPC is a bad idea. The better
solution is to move retrying into the client side and rely on timeouts, and of
course error replies.)
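
A minimal sketch of what client-side retrying with timeouts could look like, in Python. Everything here is hypothetical: `send` stands for whatever transport call publishes the request and waits up to `timeout` seconds for a reply, and `RpcTimeout` / `RpcError` are illustrative exception names, not part of any library. Note that blind retries are only safe when the request is idempotent.

```python
import time
from typing import Callable


class RpcTimeout(Exception):
    """No reply arrived within the timeout (hypothetical name)."""


class RpcError(Exception):
    """The service answered with an error reply (hypothetical name)."""


def call_with_retry(
    send: Callable[[bytes, float], bytes],
    request: bytes,
    timeout: float = 1.0,
    attempts: int = 3,
    backoff: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Retry on timeout only; an error reply (RpcError) propagates immediately."""
    last_error = None
    for attempt in range(attempts):
        try:
            return send(request, timeout)
        except RpcTimeout as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Exponential backoff between attempts: 0.1s, 0.2s, 0.4s, ...
                sleep(backoff * 2 ** attempt)
    raise RpcTimeout(f"no reply after {attempts} attempts") from last_error
```

The broker never has to track acknowledgements in this scheme: a lost request simply looks like a timeout to the caller, which decides whether to try again.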

RabbitMQ is one of the better message queue implementations for scenarios
where you need the bigger features it provides: durability (on-disk
persistence), transactions, cross-data center replication (shovel/federation
plugins), hierarchical topologies and so on.

[1] [http://nats.io](http://nats.io)

[2] [https://aphyr.com/posts/315-jepsen-rabbitmq](https://aphyr.com/posts/315-jepsen-rabbitmq)

[3] [http://bravenewgeek.com/dissecting-message-queues/](http://bravenewgeek.com/dissecting-message-queues/)

~~~
abrookewood
I spoke to Kyle Kingsbury (Aphyr) about 9 months ago and at that time, the
only queue or pub/sub system he thought was relatively safe from partition
errors was Kafka. Not sure if his position has changed recently.

~~~
lobster_johnson
Indeed, though this is irrelevant to this particular use case.

NATS doesn't have replication, sharding or total ordering. Consistency is a
challenge for clustered messaging brokers that need this.

With NATS, queues are effectively sharded by node. If a node dies, its
messages are lost. Incoming messages to the live nodes will still go to
connected subscribers, and subscribers are expected to reconnect to the pool
of available nodes. Once a previously dead node rejoins, it will start
receiving messages.

NATS in this case replaces something like HAProxy; a simple in-memory router
of requests to backends.

~~~
abrookewood
Ahh .. OK, that makes it clearer. Thanks.

------
olivemonkey
IMO using Rabbit to connect "microservices" is an emerging anti-pattern.
It often introduces a single point of failure to your "distributed" system and
has problems with network partitions. If critical functionality stops working
when Rabbit is down, you're probably doing it wrong.

~~~
calpaterson
Most real world microservice projects (I've worked on several) already have
many single points of failure. Often there is one service that needs to be up
for the system to be up (such as the one that processes your customers'
orders), you don't realise some VMs are sharing a physical disk, or everything
is dependent on a single router somewhere you've never heard of that will one
day run out of memory and drop TCP connections. This is not to mention the
risks posed to availability by third-party tracking software that pushes
changes that break web forms (#1 cause of long outages in my experience).

Message brokers like RabbitMQ give you a lot of benefit and introduce only a
small number of failure modes. You can obviate tricky service discovery boot
orders, do RPC without caring about whether your server could be
restarted and of course you get a good implementation of pub-sub too. If you
stay away from poorly considered high availability schemes I am absolutely
fine with recommending it for intra-service communication.

~~~
pm90
Great comment. One thing I don't like very much about microservices is simply
that my service often will have some such hard dependency. e.g. if the logging
service is down, I lose all logs-based statistics. If the authentication
microservice is down, I'm screwed. The parent comment made the excellent point
about SPOF, but it seems like for microservices to work correctly, there will
always be some SPOFs.

Maybe I'm being too pessimistic. I use microservices, but without significant
engineering rigor, I think it's a recipe for disaster.

~~~
zeisss
> If the authentication microservice is down, I'm screwed.

Well, make sure your service is not dependent on those services then. Use
signed tokens to only rely on the authenticator for logins. Everything
afterwards can work it out by itself.
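
A rough sketch of the signed-token idea in Python, assuming a shared secret that the authenticator and every other service hold (an asymmetric signature would avoid sharing the secret, but is not shown). The token layout and the function names are invented for illustration; in practice a standard format such as JWT would be the usual choice.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _sign(secret: bytes, body: bytes) -> bytes:
    return base64.urlsafe_b64encode(hmac.new(secret, body, hashlib.sha256).digest())


def issue_token(secret: bytes, user: str, ttl: int, now: Optional[float] = None) -> str:
    """Called by the authenticator at login: claims plus an HMAC signature."""
    now = time.time() if now is None else now
    claims = {"sub": user, "exp": int(now) + ttl}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode("utf-8"))
    return (body + b"." + _sign(secret, body)).decode("ascii")


def verify_token(secret: bytes, token: str, now: Optional[float] = None) -> Optional[dict]:
    """Called by any service: no round trip to the authenticator is needed."""
    now = time.time() if now is None else now
    try:
        body, sig = token.encode("ascii").split(b".")
    except (ValueError, UnicodeEncodeError):
        return None  # malformed token
    if not hmac.compare_digest(sig, _sign(secret, body)):
        return None  # signature mismatch: forged or tampered
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims if claims["exp"] > now else None  # reject expired tokens
```

The trade-off is that a token stays valid until it expires, so revocation needs a short `ttl` or some extra mechanism.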

Logging: keep your stuff in a queue or logfile until the log service is back
up.
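
The logging suggestion can be sketched the same way. This is a toy in-memory version in Python, assuming a hypothetical `ship` callable that sends one line to the log service and raises `ConnectionError` while the service is down; a logfile would survive process restarts, which this buffer does not.

```python
from collections import deque
from typing import Callable, Deque


class BufferedLogger:
    """Queue log lines locally and ship them in order once the service is reachable."""

    def __init__(self, ship: Callable[[str], None], max_buffered: int = 10000):
        self._ship = ship
        # Bounded buffer: when full, the oldest lines are dropped first.
        self._pending: Deque[str] = deque(maxlen=max_buffered)

    def log(self, line: str) -> None:
        self._pending.append(line)
        self.flush()

    def flush(self) -> int:
        """Try to ship everything pending; stop at the first failure."""
        sent = 0
        while self._pending:
            try:
                self._ship(self._pending[0])
            except ConnectionError:
                break  # service still down, keep the rest for later
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._pending)
```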

------
eddd
Looks pretty cool, but doing RPC over RMQ is quite expensive. I would rather
see good abstractions over ZMQ - that would be really cool.

Also as many people here pointed out, RMQ is a single point of failure which
might be acceptable for some cases. To me, the problem is that by design this
architecture has a bottleneck which is really hard to get rid of.

------
mugsie
OpenStack has had this as a common design pattern across a lot of the services
for a while.

It works quite well for us now - there was a period 2-3 years ago where
RabbitMQ was the bane of my existence, and the cause of many a page, but newer
versions have been fine.

We do assume that everything in the queue can go away, and only in flight
calls will fail - and (at least in the service I work on) we do retries if
something falls through the cracks.

There is a shared library we use for it -
[http://docs.openstack.org/developer/oslo.messaging/](http://docs.openstack.org/developer/oslo.messaging/)

It also has drivers for ZMQ, and QPID afaik.

~~~
pm90
Yes, I have worked with the OpenStack messaging queue, and it's really amazing
how it just works. It's also easy to grab analytics about service usage by
querying the queue directly, which is amazing.

------
eeZi
How does this compare to Autobahn / crossbar.io?

[http://autobahn.ws/](http://autobahn.ws/)

[http://crossbar.io/](http://crossbar.io/)

I really like the service discovery in Crossbar ("routed RPC").

~~~
grahar64
I had not heard of these before, but they look really interesting. I think
they are pretty similar, as it is a common pattern and a straightforward
solution to many problems.

------
nitrogen
This looks cool. I wrote an unreleased, less developed RabbitMQ-based
microframework that combines RPC and one-way event streams to integrate
an ecommerce site in one datacenter with a back office in another. At another
company I used ZeroMQ to make a job processing cluster.

Comments by others suggesting the use of HTTP are kind of missing the point of
message queues. The latency added by HTTP(S) made interactive RPC sluggish,
and made batch back office updates take hours, versus minutes with AMQP.

If I had Alchemy two years ago I might have used this instead of rolling my
own.

------
kitwalker12
This is helpful. We already use RabbitMQ for our Golang services. This would
ease the task of setting up publishers and subscribers in the Node APIs we're
building.

------
ajmurmann
I've been working on something similar but decided to go with Kafka as the
message bus because I want the messages to persist. This allows for more error
recovery solutions and auditing. Still having all that data around also allows
me to come up with new ideas for using that data after the fact.

------
jhgg
I feel like this is a bit silly. Feel like one could just speak HTTP or
Thrift, and use etcd or zookeeper for discovery. RabbitMQ just complicates the
stack. Also, its HA options are generally... disappointing.

------
sunnya
Check out [http://restbus.org](http://restbus.org) , a similar project for
.NET.

------
swang
What are you using for general load balancing/availability? A consistent
hashring?

~~~
grahar64
I do not understand the question. In production we use an ELB hitting multiple
routers, on multiple nodes. Not sure how hashring applies to load balancing.

------
philip142au
Use AKKA

------
jorgecurio
as someone who's used RabbitMQ in a SaaS app to run millions of tasks, heed my
warning, RabbitMQ is great if you are low on memory but the overhead was too
much to deal with. Celery also contributed to this but if I could travel back
in time, I would stop myself from using RabbitMQ. Instead, Redis is a much
better alternative for micro-service.

~~~
grahar64
Redis pub/sub (which I assume you are talking about as an alternative) is
fine, but you have to be listening when the message is sent otherwise you miss
it.

Another problem is working out whether there are any listeners where you sent
the message, which RabbitMQ deals with through return queues. This is important
for 404s when a client is trying to talk to a service that does not exist. In
Redis you would just post to a pub/sub queue, wait while nothing happens, and
time out, rather than knowing immediately through a return queue.

~~~
lyha
Actually Redis returns the number of consumers that were subscribed during
your publish and received the message, so you can detect that no service is
available and return your 404 status. The real limitation for RPC over Redis
pub/sub to me is that you can only broadcast your messages to all consumers,
and not have them load-balanced between the subscribed applications. So you
end up following a discover/unicast call pattern where Redis doesn't bring
much to the table for the actual RPC call. Maybe that's an opportunity for a
cool addition to redis...
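
As a small illustration of the point about the subscriber count, here is a Python sketch. `publish_request` and the 404/202 mapping are hypothetical; the only thing assumed of `client` is a `publish(channel, message)` method that returns the number of receivers, as the Redis `PUBLISH` command does. `FakePubSubClient` is an in-memory stand-in so the example runs without a server, and it also shows the broadcast behaviour: every subscriber gets every message.

```python
class FakePubSubClient:
    """In-memory stand-in for a Redis-like client (illustration only)."""

    def __init__(self):
        self.subscribers = {}  # channel -> list of callbacks

    def subscribe(self, channel, callback):
        self.subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel, message):
        callbacks = self.subscribers.get(channel, [])
        for callback in callbacks:
            callback(message)  # broadcast, no load balancing
        return len(callbacks)  # how many subscribers received it


def publish_request(client, channel, message):
    """Return 404 right away if nobody is listening, else 202 (accepted)."""
    receivers = client.publish(channel, message)
    return 404 if receivers == 0 else 202
```

The count only says that a subscriber was connected at publish time, not that it will answer, so a reply timeout is still needed.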

~~~
grahar64
Cheers, I didn't know that about Redis. I actually really like Redis and its
simplicity. It would be cool to get more options for passing messages over
Redis.

