He suggests the following message queues:

http://www.rabbitmq.com/ - first version 8th February 2007, but the first version not to have "alpha" or "beta" status was 1.5.1 released 21st January 2009

http://memcachedb.org/memcacheq/ - version 0.1.1 released 26th November 2008

http://www.ejabberd.im/ - not really a message queue

http://xph.us/software/beanstalkd/ - first public release 11th December 2007, hit 1.0 28th May 2008

http://activemq.apache.org/ - not sure when it was first released but the mailing list goes back to December 2005

The first public release of Starling (Twitter's first custom message queue) was 10th January 2008. Presumably they had it running internally for a while before they released it.

From this, we can see that when they built their own, pretty much the only realistic open source option was ActiveMQ, which can hardly be described as a light-weight solution (not to mention it still doesn't have a stellar reputation under high loads). When the alternatives aren't rock solid yet, rolling your own (where at least you understand all of the code and how it works) seems like a perfectly practical alternative.

While it is possible that Starling had been running internally before its release, this does not excuse overlooking RabbitMQ. A software package that had proven itself in real-world scaling and been designed by people with real experience in the problem domain (cf. the financial services world) is going to be much better at "alpha" or "beta" quality than Starling is going to be even after the Twitter devs hammer at it for a couple of years. The Twitter devs were starting from scratch, writing something that other people out there actually had some experience with, and decided not to take an existing solution and fix/adapt it to their needs.

I take issue with the idea that the financial services world has real experience in Twitter's problem domain. My experience with the financial services world is significant technically, but casual in a career sense. That said:

I think hi-fi devs make lots of stupid decisions in the name of performance. In the few cases where their actual outcomes match up to their posturing, it's because their code is obsessively cobbled around one specific use case they've been working on since 1989.

Have you ever read an order management system, or looked at Tibco Rendezvous on the wire?

Most of the hi-fi companies adopting MQ are built around straight AMQ, and bare-metal performance was out the window long before they bolted their crappy WebSphere app onto it. What these companies are looking for is predictability, not performance, and their problem sets are much simpler and more stable than Twitter's.

RabbitMQ was less than a year old, and significantly more fully featured than what Twitter needed. Fixing bugs in that would be a whole lot harder than fixing bugs in 1500 lines of code they wrote in a language they knew.

Yes, but what's the rationale for doing it now in Scala?

At a guess, a few reasons. Firstly, they had everything else written against a message queue with particular behaviour - so better to upgrade that queue than switch to a completely new one and have to rewrite everything that interacts with it.

Secondly, after running a custom message queue for well over a year they know EXACTLY what they need from one, so writing their own still makes sense.

Thirdly, if they're going to start moving other core bits of Twitter infrastructure to Scala it makes sense to try it out with an important piece of the puzzle that they thoroughly understand first.

And finally, Twitter's core competency is delivering messages. As such, it's really not so extreme to use their own software to do that - they work at a high enough scale that they need to be experts in whatever solution they are using.

As Douglas Crockford once said, "The good thing about reinventing the wheel is that you can get a round one." The fact that Twitter's reliability over the past 6-12 months has been a huge improvement, despite the enormous growth the service has seen (it's mentioned in the mainstream media all the time), would suggest that their decision to roll their own message queue paid off.

It's also worth noting that the one queue Arcieri proposed that might have been interface-compatible, MemcacheQ, is a single-developer side project written in C. Arcieri is "completely confused" by the fact that Twitter didn't adopt this as the core of their service.

This is a straw-man argument. MemcacheQ is a straightforward mash-up of two very stable software stacks: memcached and BerkeleyDB. The lines of code to accomplish it are trivial. So what?

It's 4000 lines of C code, not counting headers. Try again.

No. The bdb.c file is 800 lines of trivial, near-BDB-example-level code. The rest of it is from memcached core. Either way it's solid. Read the code.
