
Show HN: Amqphosting, Managed RabbitMQ service - RabbitmqGuy
https://www.amqphosting.com
======
zedpm
My first question when I saw this was "How do you handle network partitions?",
since RabbitMQ's partition handling is, uh, suboptimal. I read bullet points
until I found this:

> With our RabbitMQ servers, you wont have to deal with message loss in the
> event of a network partition.

Reading on, I found that your answer to partition tolerance is to avoid the
possibility of partitions by not supporting clustering at all. So that kind of
rules out high availability, practically speaking. Shovel and federation are
poor options.

As someone who is actively looking for highly available AMQP without message
loss, I have to say that I'm not going to pay someone else for a poor solution
to the problem. A managed service has to solve the hard problems to be
compelling. I can run my own single instance and hope it doesn't go down at a
bad time, which is all you're offering.

I know this is all very negative, and I regret that, but I'm part of your
target market and you need to know what your offering looks like from my
perspective. A managed service can't sidestep the difficult problems of
operating its core technology.

~~~
lobster_johnson
Indeed. In my opinion RabbitMQ is essentially useless in clustered mode.

When Rabbit recovers from a network partition and has to decide between
multiple potential master versions of a queue, it picks the largest one to
become the new master, and discards the others. It's rather mind-boggling that
it can't merge them instead; after all, if your application is capable of
handling duplicate deliveries, then merging (which would potentially result in
previously ACKed messages becoming visible again) would be a perfectly
acceptable solution.
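To make the merge idea concrete, here is a minimal sketch of what it could look like. This is not something RabbitMQ provides; `merge_queue_copies` and `IdempotentConsumer` are hypothetical names, and the sketch assumes every message carries an application-assigned ID that survives the partition.

```python
from collections import OrderedDict


def merge_queue_copies(*copies):
    """Union divergent copies of a queue, keeping the first copy of each ID.

    Each copy is a list of (message_id, body) pairs. Hypothetical helper:
    RabbitMQ itself keeps one copy and discards the rest.
    """
    merged = OrderedDict()
    for copy in copies:
        for msg_id, body in copy:
            merged.setdefault(msg_id, body)
    return list(merged.items())


class IdempotentConsumer:
    """Consumer that tolerates duplicate deliveries by remembering IDs."""

    def __init__(self):
        self.seen = set()
        self.processed = []

    def handle(self, msg_id, body):
        # A merge may resurface messages that were already ACKed on one
        # side of the partition, so duplicates are skipped here.
        if msg_id in self.seen:
            return False
        self.seen.add(msg_id)
        self.processed.append(body)
        return True


# Two sides of a partition that diverged after "m1" was published.
side_a = [("m1", "create order"), ("m2", "charge card")]
side_b = [("m1", "create order"), ("m3", "send email")]
merged = merge_queue_copies(side_a, side_b)

consumer = IdempotentConsumer()
consumer.handle("m1", "create order")  # handled before the partition healed
results = [consumer.handle(msg_id, body) for msg_id, body in merged]
```

In this toy run "m1" comes back after the merge and is skipped, while "m2" and "m3" from both sides are delivered, so nothing is lost.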

The only way to make it non-lossy is to turn off HA recovery and manually
handle network partitions, but it turns out that's not practically feasible,
because there are no tools to work with Rabbit queues at a low level; the only
way to recover is to discard one or more nodes.

We've also found Rabbit's clustering to be very flaky in general, besides the
lack of partition tolerance. We recently had a Rabbit crash where one Rabbit
node (not the machine itself) went down, and things got really stuck; the only
way to recover was to stop all the nodes, then start them again. After we did
that, all the queues were empty. We've also had instances where bindings
suddenly go missing, or the bindings are there but attempting to declare them
from a client fails with a "bindings already exist" error. And many other
weird errors.

The last year or so, after having to endure all of these issues, we've decided
to ditch clustering altogether and run a single node. That's risky, but
ironically it's a _lot_ more stable than our previous three-node cluster.

In my opinion, Pivotal really needs to redesign RabbitMQ's clustering.

Has anyone successfully moved off Rabbit? ActiveMQ, NSQ? Disque [1] looked
promising, but seems dead (last commit was 18 months ago) at this point.

[1] [https://github.com/antirez/disque](https://github.com/antirez/disque)

~~~
xerxes901
We run our RabbitMQ cluster with pause_minority as the partition handling
strategy. This should eliminate most message loss on partition, no?
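For reference, this strategy is selected with `cluster_partition_handling = pause_minority` in `rabbitmq.conf`. As the RabbitMQ partitions guide describes it, a node pauses itself when it finds it can see no more than half of the cluster. The sketch below only illustrates that rule; `should_pause` is a hypothetical name, not RabbitMQ code.

```python
def should_pause(visible_nodes: int, cluster_size: int) -> bool:
    """Return True if a node should pause under a pause_minority-style rule.

    visible_nodes counts the node itself plus every peer it can still see.
    A node that sees half the cluster or less treats itself as the minority.
    """
    return visible_nodes * 2 <= cluster_size


# Three-node cluster split 2/1: the lone node pauses, the pair keeps serving.
lone_node_pauses = should_pause(1, 3)
pair_pauses = should_pause(2, 3)

# Four-node cluster split 2/2: neither half is a majority, so both pause.
even_split_pauses = should_pause(2, 4)
```

The even split is the reason odd cluster sizes are usually recommended with this mode: with an even number of nodes, a clean half-and-half partition pauses everything.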

~~~
lobster_johnson
Yes. A bonus is that if you use it in combination with a health-check-capable
proxy (HAProxy, Kubernetes), clients can be routed to any non-paused node
automatically. In fact, you'll need that, since a paused node will close its
ports.
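Since a paused node closes its ports, even a plain TCP connect check is enough to tell paused nodes from serving ones. A minimal sketch of such a check, assuming the default AMQP port 5672; the function names and hostnames are made up for illustration, and a real proxy would run its own built-in check instead:

```python
import socket


def is_accepting_connections(host, port=5672, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the node as unavailable.
        return False


def reachable_nodes(nodes, timeout=1.0):
    """Filter a list of (host, port) pairs down to those accepting connections."""
    return [
        (host, port)
        for host, port in nodes
        if is_accepting_connections(host, port, timeout)
    ]


# Example with hypothetical hostnames:
# reachable_nodes([("rabbit-1.internal", 5672), ("rabbit-2.internal", 5672)])
```

This only shows that the port is open, not that the broker is healthy, so it is a floor for a health check rather than a complete one.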

That said, the devil is in the details; I'd be interested to know if RabbitMQ
is capable of reliably _detecting_ that it's in a minority. We've had issues
where nodes are having issues talking to each other, but the problem is not
consistent on both ends (e.g. A can talk to B, but B can't talk to A).

------
esdott
Why not support TLS/amqps at all pricing levels? That's a huge turn-off for me,
especially since you only have it at your highest pricing level. I'd also make
that clear on your comparison page as it seems like you support amqps at the
$55 level but do not on your pricing page. Good luck! (seriously, no sarcasm)

~~~
RabbitmqGuy
> as it seems like you support amqps at the $55 level but do not on your
> pricing page.

Hi, sorry about that. We'll fix the comparison page.

> Why not support TLS/amqps at all pricing levels?

When we started out, we were offering AMQPS/TLS for all plans through
certificates from letsencrypt[1]. However, it became hard to manage
certificate renewals, since we had to renew them on one machine and scp them
to the respective RabbitMQ servers. This was too labour-intensive to be worth
it. However, we still plan to roll out TLS for all plans at no extra
cost.

[1] [https://letsencrypt.org/](https://letsencrypt.org/)

------
jv22222
Great idea. Nice to see a managed service for this.

When we used RabbitMQ, we were never able to restart a fully loaded
instance; it always just seemed to hang.

------
kevinsimper
Looks nice. Is it a single side project that you are launching? What is your
experience with RabbitMQ vs. SQS or something like it? :)

~~~
RabbitmqGuy
Hi.

It started out as a side project but it has grown to a point where we are now
doing it full time.

