I used ActiveMQ in a large production environment for two years. It's by a large margin the worst piece of software I've used professionally. You wouldn't believe the number of very serious problems we had with it (including some mentioned here, like negative queue lengths, but mostly broker crashes, missing or wrong documentation, outright broken features, serious threading issues, and poor performance). If anyone's not convinced, just peek at the source code. It's frightening.
We ended up writing a new queuing system from scratch in a month, and in its first week it was already more stable, performant, and bug-free than ActiveMQ.
This is refreshing to read. I worked with ActiveMQ for about 2 years, and had nothing but problems with it. We probably lost thousands of dollars in sales because of lockups, corruptions, etc. I always assumed that we were "just doing something wrong" (which we probably were in many cases), but we were basically using out-of-the-box configurations and Spring for the JMS support. I'm glad to hear that I'm not the only one that has had problems with it. I assumed it should be pretty stable since it's an Apache project, but I have to agree that the source code is almost incomprehensible.
The thing we found is that when it crashed under load, it could take over 24 hours to consistency check and rebuild its queues on-disk. Insane for a highly-available environment.
This is why you should NEVER let an architect choose your software - ALWAYS the sysadmin who will actually be on-call to support it should have the final say.
Why'd you write one from scratch instead of just using JBossMQ or HornetQ or something? ActiveMQ is far from being the only open-source message queuing product, even if you only consider JMS compliant ones.
This was two years ago. JBossMQ was being rewritten and the new version was too immature. I don't remember HornetQ. We investigated RabbitMQ, which seemed good but still a bit young to bet our company on. We also needed some interesting features that none of these provided. In the end writing our own was the best decision we made my whole time at that company -- it paid dividends the rest of my time there.
I just found a thread I started about ActiveMQ and scaling:
Note the last message there. 36,000 threads when using NIO! And these answers were from the people commercially supporting ActiveMQ (FuseSource). I never did find out (on that thread or elsewhere) whether ActiveMQ has ever been used in a situation that large. I suspect it hasn't.
This was two years ago. JBossMQ was being rewritten and the new version was too immature.
Ah, ok... gotcha. Didn't realize the time-frame.
I don't remember HornetQ.
It's fairly new... basically the successor to JBossMQ. It is reputed to be blazing fast though. I've been experimenting with it, but haven't used it in anger.
We investigated RabbitMQ, which seemed good but still a bit young to bet our company on.
Hi - this is pretty damning. We're currently building an application that uses ActiveMQ but I could switch to a different broker if I thought it was justified. Do you remember which version you were using?
We've had nothing but problems with ActiveMQ and we've been using it for 4-5 years now. We've been slow on making the decision to move to a different open source implementation because it is difficult to test this kind of software. I mean, ActiveMQ appears to work fine in tests, but in production, lock-ups and storage leaks happen on a monthly basis. We need simple functionality, persistent queues in their simplest form. HornetQ seems like a good candidate. I'd appreciate it if anyone could share their experience with it.
It's surprisingly easy to write a messaging system that meets your requirements. Having a messaging system that meets every use case is extremely difficult. ActiveMQ is deliberately highly configurable to try and address this. ActiveMQ is now stable and performant, with a high degree of fault tolerance built in. The things you should care about, like what happens if you lose the network, a disk, etc., are automatically taken care of with ActiveMQ.
I had this exact experience about four years ago. Spent way too long dealing with their buggy code, wrote our own in a few weeks and everything was much easier from then on.
"We ended up writing a new queuing system from scratch" <-- If it's good, why not open-source it? I'm sure there are many others who have use for a good queuing system.
I see lots of questions here about which queuing implementations one ought to consider. Some things to think about:
1. Do you need transactionality? What if your consumer experiences an error after it takes a message off the queue but before it finishes up processing it? Is it ok for that message to disappear forever? The common use case to consider is a consumer which takes a message from a queue, does some processing, and updates a database. To avoid distributed transactions, people often write these consumers so that they check first to see if a message has already been processed. This way, one simply ensures that his DB transaction commits before committing to the consumption of the message.
2. What level of message durability is required? Is it ok for all enqueued messages to disappear when a power cord is inadvertently pulled? I used ActiveMQ to populate a user behavior data warehouse. As users did stuff on a website, the application servers would enqueue observations (went to the product detail page, etc.) and these observations would eventually wind-up in a big Oracle DB. A delay of a day or so didn't matter, but we didn't want to lose more than a fraction of a percent of our observations. So, the queue was in effect a temporary system of record, and we had to allow for reboots, power outages, etc.
3. High Availability. Many modern queueing implementations can be deployed in redundant, scalable "meshes", ActiveMQ included. I haven't kept up on the feature sets of RabbitMQ, JBoss's offering, etc. or I would comment more here.
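To make point 1 concrete, here is a minimal sketch of a consumer that checks whether a message has already been processed and only acknowledges it after the DB transaction commits. It uses an in-memory SQLite database, and the table names, the handle/consume functions, and the ack callback are all made up for illustration; a real broker client would supply its own acknowledge call.

```python
import sqlite3

def handle(conn, msg_id, payload):
    """Process one message at most once; return True if work was done."""
    try:
        with conn:  # one DB transaction: commits on success, rolls back on error
            # The primary key on msg_id rejects a message we have already handled.
            conn.execute("INSERT INTO processed (msg_id) VALUES (?)", (msg_id,))
            conn.execute("INSERT INTO observations (payload) VALUES (?)", (payload,))
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery: already processed, safe to ack again

def consume(conn, messages, ack):
    for msg_id, payload in messages:
        handle(conn, msg_id, payload)
        ack(msg_id)  # acknowledge only after the DB transaction has committed

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed (msg_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE observations (payload TEXT)")

acked = []
# "m1" is delivered twice, as would happen if the consumer crashed before acking.
deliveries = [("m1", "viewed product page"),
              ("m1", "viewed product page"),
              ("m2", "added to cart")]
consume(conn, deliveries, acked.append)
rows = conn.execute("SELECT COUNT(*) FROM observations").fetchone()[0]
print(rows, acked)
```

If the consumer dies between the commit and the ack, the broker redelivers the message, the primary key rejects it, and the second attempt simply acks without doing the work again.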
There are obviously a bunch of other considerations, but these are often the ones which are ignored when comparing ZeroMQ to ActiveMQ, etc. There's no right or wrong answer. "Lighter" implementations are often appropriate for messages where, if the shit hits the fan with your environment, the messages are useless anyway. IIRC, eBay builds pages by firing off a bunch of async requests for page parts, waits a maximum of N milliseconds, and then renders the page based on which pieces made it back on time. There's no value in persistence or transactionality, but the mesh sure better scale up.
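The "fire off a bunch of async requests, wait N milliseconds, render what came back" pattern can be sketched roughly like this with asyncio. This is only an illustration of the idea, not how eBay actually does it; fetch_part, build_page, and the delays are invented stand-ins for real backend calls.

```python
import asyncio

async def fetch_part(name, delay):
    # Stand-in for an async request to a backend that renders one page part.
    await asyncio.sleep(delay)
    return name, f"<div>{name}</div>"

async def build_page(parts, deadline_s):
    tasks = [asyncio.create_task(fetch_part(n, d)) for n, d in parts.items()]
    # Wait for everything, but never longer than the deadline.
    done, pending = await asyncio.wait(tasks, timeout=deadline_s)
    for t in pending:
        t.cancel()  # late parts are dropped, not retried or persisted
    ready = dict(t.result() for t in done)
    # Keep the page layout order, skipping anything that missed the deadline.
    return "".join(ready[n] for n in parts if n in ready)

page = asyncio.run(
    build_page({"header": 0.01, "listing": 0.02, "ads": 5.0}, deadline_s=0.2)
)
print(page)
```

Nothing here is durable or transactional, which is the point: a page part that arrives late has no value, so there is nothing worth persisting.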
The thing I've found about message queues is that it's pretty unimportant what platform they're based on. I used to be tempted to go with ActiveMQ "because it runs on the JVM", but that turns out to be pretty irrelevant.
No matter what your queue provider, you're most likely just using TCP sockets anyway, so don't think you need to be using Erlang just to take advantage of RabbitMQ.
Yes, this means the MQ can basically be thought of as a black box. But that's also kind of the point.
What I've seen with RabbitMQ is that the memory usage can grow and grow. If it runs out of either committed RAM or address space, it crashes. I suspect this may be a feature of the open source Erlang VM that's improved in the pay-for-support version. Even though RabbitMQ pays the performance cost of bouncing everything off the disk, it's not quite recoverable from that type of crash.
I'm sure my information is old, perhaps these things are improved in newer versions. Still, the behavior of the Erlang VM itself was relevant.
As a C++ programmer, I wouldn't feel comfortable abstracting away memory management for this type of server which is both so memory intensive and requires high reliability.
But of course many people swear by highly reliable JVM based stuff, so perhaps I'm wrong about it. :-)
Actually Erlang/OTP provides exactly the type of abstraction layer that's extremely useful for RabbitMQ's use case. It's specifically designed for consistent latency, predictable memory usage, and resilience under failure conditions. You know, basically the things that make an MQ worth using.
It was true that previously RabbitMQ could run out of memory and be unable to take more messages, but that changed in 2.0 and it will now swap messages to disk. Of course, then you pay the consequences of slow disks and can always run out of disk space, but nothing is invincible.
What I've seen with RabbitMQ is that the memory usage can grow and grow
Prior to 2.0, the contents of Rabbit's queues were always in RAM. Even durable queues with persistent messages always had their contents mirrored in RAM. As of 2.0, Rabbit can store queue contents on disk or in RAM, so as it faces memory pressure, it will send data to disk. If the disk is too slow for it to relieve its pressure, it will block connections that have sent data in the past until it can free up enough memory to continue.
Last I checked, Rabbit does by default keep the associative map between queues and messages in RAM, so its memory usage isn't technically bounded (it's around a dozen bytes per message, IIRC). If that's a problem, the toke plugin uses Tokyo Cabinet to store the associations, which I believe allows Rabbit to have entirely bounded memory use. I also haven't checked on that for a while, so it may be that toke is installed by default now.
I've used RabbitMQ 2.4 in production with queue lengths of several hundred thousand. Each queue item can be as large as 100 KB. RabbitMQ's memory usage never went beyond ~150 MB.
"a feature of the open source Erlang VM that's improved in the pay-for-support version"
Erlang was created by a telco in the 1990s and since then has been battered in production by serious users. It's real software. The version that Ericsson provide as open source is not crippleware as you seem to imply.
"RabbitMQ pays the performance cost of bouncing everything off the disk"
No it doesn't.
RabbitMQ only uses the disk if you tell it to do so, e.g. if you require messages to be persisted when they cannot be delivered immediately.
"memory usage can grow and grow"
If you stuff data into a messaging server without draining it on the consumer side, then memory usage will grow. This will also happen if you write your messaging system in C++.
There are two solutions to this problem:
- flow control, where you tell producers to back off
- paging to disk, where you flush data from memory when it is on the disk
Neither of these is trivial to implement which is why there is a big gap between toy messaging systems and serious products.
As others point out on this page, RabbitMQ has support for both these features. In particular a lot of memory management capability has been added since 2.0.
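For anyone who hasn't seen flow control in its simplest form, here is a toy in-process sketch using a bounded queue from the Python standard library. It says nothing about how RabbitMQ implements it (the limit of 3 and the function names are arbitrary); it only shows the principle that a full buffer makes the producer wait instead of letting memory grow.

```python
import queue
import threading
import time

buf = queue.Queue(maxsize=3)   # at most 3 undelivered messages held in memory
max_depth = 0                  # deepest the queue ever got, for inspection

def producer(n):
    global max_depth
    for i in range(n):
        buf.put(i)             # blocks while the queue is full: the "back off" signal
        max_depth = max(max_depth, buf.qsize())
    buf.put(None)              # sentinel: no more messages

def consumer(out):
    while True:
        item = buf.get()
        if item is None:
            break
        time.sleep(0.001)      # simulate a consumer slower than the producer
        out.append(item)

received = []
t_prod = threading.Thread(target=producer, args=(20,))
t_cons = threading.Thread(target=consumer, args=(received,))
t_prod.start()
t_cons.start()
t_prod.join()
t_cons.join()
print(received == list(range(20)), max_depth <= 3)
```

Doing the same thing across a network, with many producers and without deadlocking, is where the non-trivial part comes in.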
It's absolutely incredible that it can take so much effort just to make it run. ActiveMQ seems shiny and interesting on the outside, but the prolific number of war stories took it off my list.
In brilliant contrast, RabbitMQ is it-just-works software. It's a piece of infrastructure that works so well it practically just fades into the background. I couldn't imagine having to worry about a service in which the chief goal is to help alleviate reliability concerns.
The Reddit guys think differently: "[...] (rabbitmq) died, which added about an hour to the downtime. It dies like this pretty often at 2am or at other especially bad times. Usually it doesn't cause any data-loss, just sleep-loss (its queues are persisted and the apps just build up their own queues until it comes back up), but in this case it decided to crash in a way that corrupted its database of persisted queues beyond repair. rabbitmq accounts for the only unrecoverable data-loss incurred, which was about 400 votes. [...] Coincidentally, rabbitmq crashed twice more that day and a few more times into the weekend. [...] Things have improved thus far, but replacing rabbitmq is at the top end of our extremely long list of things to do."
"Crashed"... I'm glad they're using such specific terms. I give them a lot of slack because they run that shop with a skeleton crew, but they sure do run into a lot of issues with perfectly good software, have Twitter levels of performance & availability, and make some very odd technical decisions.
Yikes. If you're smart, you'll do your diligence and not rely on one person's experience with ActiveMQ. Remember, it is an Apache project - so if something ain't there, feel free to fill it in (re: documentation).
I can't believe the ActiveMQ/FuseSource guys weren't willing to work it out with David either. Perhaps there's a bit of open source/commercial software wrangling behind this story. Won't be the first/last time.
So, what might you recommend instead? I'm using rabbitmq on a couple of projects, and have been a bit underwhelmed by the lack of management tools that come with it. I figured something as basic as getting a list of what's in a queue and being able to then remove that item from a queue would not require me to write custom code, but I seem to have been wrong.
The RabbitMQ management plugin does provide an excellent command line interface. You download it along with its docs right from the management UI (/cli).
Btw, I'm using RabbitMQ, and I love it. My needs do not include high load or high availability though so I can't speak for that. So what's nice about it? AMQP (you automatically get lots of tools, docs, "expectations", etc), very friendly and active mailing lists, small & clean code base, small footprint, simple, active (more features are always being added).
What I don't like about RabbitMQ? While it's FLOSS inside out, its development isn't exactly a "community" work. For example, I can't report an issue, attach a patch, and receive a reviewer/committer feedback about it and possibly get it in. In fact, I can't even report an issue into their issue tracking system -- I just have the mailing list. That said, I believe they said they are going to fix that part "soon".
But it's true that the community has been much more involved in clients like Pika, for example.
And yes we are planning to open up the bug tracker.
A piece of advice for anyone doing an open source project - start with an open tracker, because opening up a previously closed tracker is a royal pain in the butt.
Does anyone have recommendations for resources on messaging/queueing basics? I'm looking for something on the basic patterns (pubsub, broadcast... ?) and basic considerations to be aware of.
Thanks. I've tried reading the zeromq guide before, but it was confusing because it used concepts that I wasn't familiar with and it didn't define the concepts. Also, its examples are in C, and I don't know C. The RabbitMQ one looks helpful, though
I would suggest getting the "Camel in Action" book from Manning. I always find that it's better to learn something that has real implementation behind it - you see the theory, now here's the code to do it.
You mean there's a correct configuration? Also, Tomcat, Jetty, PostgreSQL, and Nginx don't "install correctly" out of the box as described here. Tomcat clusters set themselves up? Postgres? Hell, I think default shared buffers allocation is 32 MB...
> Anything that streams bytes
I'm not sure what the author is trying to do. Doesn't sound like queuing to me...
The documentation is terrible, so I'll give that up. It is unfortunate that there is so much experimentation involved in setting it up. ActiveMQ is immensely versatile and configurable, and it's sort of necessarily complex to get it perfect for your task. It's not a turn-key software; it's a systems architecture component.
The part about Tomcat certainly isn't true, unless the author is using a different distribution than what you get from the project's website. The Tomcat startup scripts do not set any heap size parameters; until recently the default Sun JVM behavior was to set a max heap size of only 64MB. This logic changed a few versions ago to use a fractional amount of the total memory size on the machine, but for dedicated hosts this is still rarely the setup you want.
Perhaps the author is using a packaged version of Tomcat from his OS distributor.
Yeah, I don't know what else it could be. It's up to the distribution maintainers then. It's a straw man. ActiveMQ does just fine. I have a very high volume implementation that I haven't touched in almost 2 years.
That it's not production ready is a pretty big claim. I don't want my clients reading this and getting all jittery on me.
Well, if every piece of enterprise software worked right out of the box, consultants would be out of business.
I have to admit that I'm not a big user of ActiveMQ, but I've used a closed-source equivalent from a slightly bigger vendor (IBM) for many years. It also is not configured correctly out of the box for most enterprises. When new features or platforms come out they are often buggy. The documentation though is usually decent until you really want to dig into the internals and understand it.
Ah, dear old WebSphere MQ. Generally awful, bad high-availability story unless you invest in heavy-handed clustering at the OS level. Great for enterprise integration if there's a mainframe around, I suppose. Documentation: quantity over quality. They do have a lot of documentation to show for the cost of the product. Written in C++, so more memory-efficient than messaging middleware written in Java.
Excuse my ranting -- I'm a middleware system administrator and WMQ is the necessary evil at my workplace.
Our Tomcats definitely didn't check for file handles, or run out of the box. The same goes for our NGINX. Can't talk about PostgreSQL, but MySQL certainly isn't delivered in a high volume configuration either.
Yes, streaming doesn't sound like queueing, but both JMS (and many vendors implement JMS) and AMQP support some form of streaming. Of course, in combination with high availability, YMMV.
We use ActiveMQ in production. I couldn't agree more that it is not ready for prime time. We've had a pretty bad experience overall. Man I wish we used Amazon SQS...
We have had incidents where its Master/Slave replication continues on failure when configured to shut down the master and slave brokers. It doesn't achieve the consistency it lists as a feature, and provides zero visibility of progress. I feel many of the features the project lists are incomplete, and are listed as a grasp at straws to have the bigger list.
Due to the poor quality of the replication, we had to invest a tonne of time to implement replication at the block device level with DRBD+GFS2, complicating the system and (benchmarks pending) likely decreasing performance, all to achieve a 'feature' ActiveMQ boasts.
I would also challenge a project to have worse, more incomplete documentation.
At the same time, I thank the ActiveMQ team for their contributions and ask the community to not turn its back on this potentially decent solution. The ideas are solid. It isn't production ready, I feel, but that just means it needs some love.
I've now been involved in two projects where I would have needed a queue with the following characteristics:
- simple pub-sub, queue-like: m producers, n consumers
- decent performance (order of hundreds of requests per second)
- high availability for message persistence (I don't want to lose messages)
- no strict FIFO needed, just some kind of lesser fairness (newer items should not block older ones from passing through)
ActiveMQ with active-backup setup over shared disk mount is the current choice, and the start-up is really slow if the queue has a lot of data.
RabbitMQ does not persist messages in HA fashion, so I've ignored it so far. Maybe HornetQ needs some attention.
I see a lot of flexibility and feature-richness in the queue landscape and it perplexes me that getting this simple combination of basics right is so difficult.
RabbitMQ itself doesn't have HA built-in, but warm spare HA is rather simple and well-documented, built on-top of Pacemaker and DRBD. It works quite well.
As a counter point, I have used ActiveMQ for a pub/sub architecture and it worked great (millions of messages a day, which was benchmarked as only about 10% of potential on a single broker). We did treat it as an API rather than a deployable component though. We wrapped the broker in our own service architecture (you can instantiate a broker like any POJO) and disabled persistency. Queues were used only very lightly. So maybe we dodged a bullet there... :-).
There is a lot of buzz about open source queue systems. I have tried RabbitMQ, for example, with very bad results. Let me explain... one interesting concept about queues is contention, and this is where RabbitMQ and others are behind. The strange thing is that contention is an old concept from the old mainframes.
For example, the last time I checked, you can't block a producer in RabbitMQ based on the number of messages on the queue.
If you have any questions or problems involving RabbitMQ please email us (info@rabbitmq.com) or post to the mailing list.
"one interesting concept about queues is contention and this is where RabbitMQ and others are behind"
Can you explain what this means? What kind of contention are you talking about. You say "RabbitMQ and others" - which others? Who implements this feature and what does it look like?
"you can't block a produced in RabbitMQ based on the number of messages on the queue"
Yes you can.
Well - it depends on your use case. RabbitMQ enables you to determine queue length, and supports flow control.
- RabbitMQ has a memory based flow control based on the total memory you have in the machine. I know I can develop something over RabbitMQ to accomplish what I want (limits based on number of messages) but I prefer to have the support within RabbitMQ.
- WebLogic JMS can block producers based on number of messages as a flow control method.
I don't know what WebLogic can do for flow control exactly. Please email rabbitmq-discuss to get a definitive answer and discussion on what you can do with Rabbit, which supports various mechanisms.
I had a lot of trouble with the network-of-brokers feature in ActiveMQ a few years ago, which in the end forced us to switch to another solution.
In a later, simpler, project I had issues with ActiveMQ locking up after a certain number of messages, which was solved with an upgrade.
On a whole, I think the quality of ActiveMQ is lower than other open source projects with similar brand recognition, e.g. other popular Apache projects.
sorta reminds me of the time I built something in 2001 which was a reactive system that was sort of a web crawler but also a bit more... seems demented looking back, but I used qmail as a message queue system for it.
Yes - I've been seriously considering XMPP for this purpose. I haven't seen any serious reason not to consider it other than some people's dislike of IM protocols.
BTW - David Pollak didn't talk to any of the ActiveMQ team at FuseSource - else we would have helped him out - he didn't speak to any of the execs either. It's a shame he doesn't allow comments on his blog.
I'm not too familiar with the message queue use-case, but perhaps this is something you could do with ScalienDB, which supports replication and failover. Disclaimer: I wrote it.
A message queue is like a database except people can open a socket and say "notify me when there is a new value for this key" -- and you can choose to let multiple people get in line for such updates and then whether you want to let everyone know or just the first person and so on. you can implement all of this on any db with polling and locks, but that is not an efficient way to do things.
a lot of what is listed here are all standard caveats of things one needs to consider in just about any heavy enterprise-java environment.
i think you could sum this up that he simply wasn't working in a situation that warranted the division-of-labor and configuration flexibility that the product offers.
activemq should offer higher level guarantees about reliability. zeromq is simpler. you can argue that activemq has things you will probably end up implementing yourself on top of zeromq, or that zeromq has less to get wrong...
another way of saying the same thing, which illustrates the cultural differences:
- activemq is intended to be used in "the enterprise". it tries to implement a logical ideal, which is a reliable infrastructure that services can use without being coupled to each other - without worrying about whether messages were received, or exactly who they go to. to reduce complexity it uses a central broker (so you send messages to a central "hub").
- zeromq is intended (imho) for programmers that want to wire things together. it's less concerned with abstractions and more with providing something clear, simple and flexible that can be understood and used well. to reduce latency it uses direct connections between peers.
from that viewpoint, you can see that the two are both orthogonal and yet similar... (disclaimer: i haven't used either, but i used to work on an ESB so have a vague grasp of what's going on. please someone correct me if this is wrong - i might as well learn as i lose karma ;)
ps rabbitmq is somewhere in the middle and was (i think) originally more performance-motivated (i believe it's used in finance for example - when speed might be critical).
RabbitMQ's main motivation has been to make it easier to join systems together, scale your applications and manage complex environments. That is what messaging is for. Back in 2006, we felt there was a need for a good, stable and scalable open source licensed product that could compete with the incumbents.
Notice that I did not mention performance. RabbitMQ has good performance and it is used quite a lot in finance, but the majority of users are what you might categorise as "anyone using MySQL or Postgres".
Re activemq vs zeromq, I recommend reading "broker vs brokerless" on our blog.
"it uses direct connections between peers" -- this is kind of true.. the guide suggests that ultimately you will want a broker (what it calls "devices") to intermediate producer-consumer interactions.
Activemq is a program that you can install and run, of a type often called message-oriented middleware. Zeromq is a library that provides abstractions on sockets. Basically they are unrelated.
zeromq does not persist messages so you can lose them and need to adjust your infrastructure around this. If you are used to network programming that should be fine, just build persistence in to your protocol where you need it, ie reply after you have committed to persistent storage.
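As a rough sketch of "reply after you have committed to persistent storage": the handler below appends the message to a log file, forces it to disk with fsync, and only then returns the acknowledgement. There is no zeromq in here on purpose, to keep it self-contained; persist_then_ack, replay, and the length-prefixed log format are made up, and in practice the returned ack is what you would send back over the socket.

```python
import os
import tempfile

def persist_then_ack(log_path, message: bytes) -> bytes:
    """Append message to a log, force it to disk, and only then return an ack."""
    with open(log_path, "ab") as f:
        f.write(len(message).to_bytes(4, "big") + message)  # length-prefixed record
        f.flush()
        os.fsync(f.fileno())   # the reply must not be sent before this returns
    return b"ACK"

def replay(log_path):
    """Read back every stored message, e.g. after a restart."""
    out = []
    with open(log_path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break
            out.append(f.read(int.from_bytes(header, "big")))
    return out

log = os.path.join(tempfile.mkdtemp(), "queue.log")
acks = [persist_then_ack(log, m) for m in (b"hello", b"world")]
stored = replay(log)
print(acks, stored)
```

A sender that never sees the ack has to assume the message was lost and resend it, so the receiving side still needs to cope with duplicates.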
It persists messages in one direction. Imagine you have a publish subscribe situation where you have a few apps subscribed to a publisher. If one of those apps goes down, the publisher will make a note and store the messages until it comes back. When the app comes back, it will get all the messages it missed. It's really a wonderful thing if you were going to implement peer-to-peer pub/sub yourself.
It's also nice in applications like Mongrel2. Make an HTTP request in your browser, start up your web server, and the request completes! Great for development, where sometimes you switch to the browser faster than your app can restart after a change. With Mongrel2 (and thanks to 0MQ), you don't even notice the race condition.
ZeroMQ can persist messages to disk in high water mark situations, so once it hits memory pressure it can put it to disk, but it won't read from disk on crash...
We ended up writing a new queuing system from scratch in a month, and in its first week it was already more stable, performant, and bug-free than ActiveMQ.