It would be nice to know why RabbitMQ/ZeroMQ didn't fit the bill [scale/speed?]. I can somewhat understand not relying on SQS, but there's already a reliance on EC2 anyway, right?
I thought the same thing about Rabbit. PinLater's ACK system feels like a questionable reinvention of AMQP. I'm probably missing something.
Rabbit's advertised throughput doesn't reach 100,000/sec, but you should be able to shard your way around that. Or use Apache Kafka, which has sharding (partitioning) built in.
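For what it's worth, here's a minimal sketch of what I mean by client-side sharding, in Python with pika (queue names and shard count are made up; each queue could live on its own Rabbit node):

    import hashlib
    import pika

    N_SHARDS = 16

    def shard_for(job_id):
        # Hash the job id to pick a queue deterministically.
        digest = hashlib.md5(job_id.encode()).hexdigest()
        return "jobs.%02d" % (int(digest, 16) % N_SHARDS)

    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    ch = conn.channel()
    for i in range(N_SHARDS):
        ch.queue_declare(queue="jobs.%02d" % i)
    ch.basic_publish(exchange="", routing_key=shard_for("pin-123"), body=b"resize")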
Precisely; this is the same question I had when I skimmed the post. It would be great if the author or someone from the team could speak to the various trade-offs and why they didn't go with Kafka.
Looks like they needed built-in support for scheduling jobs to execute at a specific time in the future. RabbitMQ does not have that (as far as I'm aware) without horrible hacks involving TTLs and dead-letter exchanges, which in practice means one delay queue per delay interval.
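The usual workaround looks roughly like this (queue names are made up): publish into a "delay" queue whose messages expire after a TTL and get dead-lettered into the real work queue. The catch is that one delay queue only buys you one delay value, so arbitrary schedule times need a queue per interval.

    import json
    import pika

    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    ch = conn.channel()

    ch.queue_declare(queue="work")
    ch.queue_declare(queue="delay_60s", arguments={
        "x-message-ttl": 60000,              # hold for 60 seconds...
        "x-dead-letter-exchange": "",        # ...then route via default exchange
        "x-dead-letter-routing-key": "work", # ...into the real work queue
    })
    ch.basic_publish(exchange="", routing_key="delay_60s",
                     body=json.dumps({"job": "resize", "pin": 123}))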
I'd be interested if anybody could compare PinLater and Celery. It feels like there's a bit of NIH involved, unless the 100K/sec rate is something Celery can't reach and PinLater can.
I have yet to dig deeper, but the fact that it uses Thrift and already has worker implementations in C++ and Java is a huge plus. Celery locks you into Python and makes it non-trivial to split your enqueue code from your worker code. On the other hand, since Celery is Python all the way down, it lets you orchestrate the execution of dependent tasks. In my experience those features contribute to a more complex codebase that is harder to maintain, test and understand.
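To be concrete about the orchestration point, here's roughly what Celery's canvas chaining looks like (broker URL and task bodies are placeholders):

    from celery import Celery, chain

    app = Celery("pins", broker="amqp://localhost")

    @app.task
    def fetch(url):
        return "<html>...</html>"

    @app.task
    def parse(html):
        return {"links": []}

    @app.task
    def store(doc):
        return True

    # Each task's return value becomes the next task's first argument.
    chain(fetch.s("https://example.com"), parse.s(), store.s()).delay()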
Celery is flexible about message backends, but it prefers RabbitMQ, which comes with its own operational overhead and does not have real horizontal scaling built in (it has failover techniques). RabbitMQ is great for mission-critical messaging that must survive system restarts.
My biggest worry is the reliance on Redis or MySQL. What happens if workers stop consuming tasks? Wouldn't that balloon Redis memory consumption pretty quickly? I suppose memory consumption is a function of enqueue rate and task byte size. MySQL, on the other hand, feels like the wrong tool for this job.
Celery can very well handle 100K tasks/sec. Celery is not locked into Python, but you're right that it's not a drop-in solution for other languages, as it doesn't support them natively (yet; that will change soon).
You can write simple workers for Celery in other languages too, and there is one in active development for Java. But then Redis is not really a good choice; AMQP is more convenient for interoperability.
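Roughly what a non-Python worker has to deal with: Celery's (v1) wire format is just a JSON body on the broker, so any language that can speak AMQP and parse JSON can act as a worker. A hedged sketch, with made-up field values:

    message = {
        "id": "4cc7438e-afd4-4f8f-a2f3-f46567e7ca77",
        "task": "tasks.add",   # dotted name the worker has registered
        "args": [2, 2],
        "kwargs": {},
        "retries": 0,
        "eta": None,           # or an ISO-8601 timestamp for delayed execution
    }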
I have no idea whether Pinterest evaluated Celery; they have not been in contact. If they did, there are several pitfalls they could have hit at this scale, but I'm pretty confident it would have been more than suitable. It could have been a benefit for all of us, and chances are they have underestimated the effort needed to maintain a custom solution.
Thanks all for the comments about the PinLater post - great to see this discussion! Wanted to clarify a few things:
* Yes, we did evaluate a few open source solutions before deciding to build in-house. We couldn't at the time find any solution that met all the requirements that we had: scale, scheduling jobs ahead of time, transactional ACKs, priority queues, configurable retry policies, great manageability (e.g. ability to rate limit a queue online, inspect running jobs, review failed job stack traces). [update: also support for clients/jobs in multiple languages: Python, Java, Go, C++]
* Note that we were building a v2 system. Unless the new system improved substantially on the existing one along multiple dimensions, it would not be worth the cost of a large-scale migration of tasks. We had to meet all the requirements; we could not compromise by dropping a few.
* Sorry I didn't mention explicitly in the post, but we do plan to open source PinLater, hopefully later this year.
Gearman is a very simple queue that is even easier than RabbitMQ (in my opinion). You create worker threads and submit jobs to the server. The server dispatches each job to a worker thread, handles re-queuing it if it fails, etc.
Basically these guys 'invented' the queue.
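For reference, a minimal Gearman worker in Python, assuming a gearmand server on localhost:4730 and the "gearman" client library (the "reverse" task name is made up):

    import gearman

    def reverse(worker, job):
        # Callback receives the worker and the job; return value is the result.
        return job.data[::-1]

    worker = gearman.GearmanWorker(["localhost:4730"])
    worker.register_task("reverse", reverse)
    worker.work()  # blocks; gearmand re-queues a job if the connection drops

    # Submitting from anywhere:
    client = gearman.GearmanClient(["localhost:4730"])
    print(client.submit_job("reverse", "hello").result)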
Also, the problem of acknowledging success without blocking can be solved without queues, by programming in async environments like Node.js.
Kind of. It only re-queues if you drop the socket, I believe. Sadly, just sending an error status will not get the message re-queued. I've looked into this, and everyone seems to be doing that part manually.
Sounds like a perfect fit for beanstalkd[1]. It has scheduled jobs, explicit ACKs, priorities, and some other nice features (time-to-run, write-ahead logging, job burying/kicking, queue pausing).
We have been using it for years to process over a million jobs per day without a single issue.
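For anyone curious, the basic flow with the beanstalkc client looks roughly like this (the job payload and the process() handler are made up):

    import beanstalkc

    q = beanstalkc.Connection(host="localhost", port=11300)

    # priority: lower runs first; delay: scheduled execution in seconds;
    # ttr: time-to-run before the job is handed back out to another worker.
    q.put('{"job": "send_email", "user": 42}', priority=100, delay=3600, ttr=120)

    job = q.reserve()      # blocks until a job is ready
    try:
        process(job.body)  # hypothetical handler
        job.delete()       # explicit ACK: success
    except Exception:
        job.bury()         # park for inspection; kicking returns it to the queue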
I was waiting for the big reveal of the solution, but it was this:
"While traditional messaging systems tend to use one of the models described above ("broker" model in most cases) ØMQ is more of a framework that allows you to use any of the models or even combine different models to get the optimal performance/functionality/reliability ratio."
Brokers are needed if you want queues to persist across consumers. ZeroMQ is a multicast system; if no listeners are running when an event is produced, the event is thrown away. Great for real-time RPC, not great for a task-scheduling system.
A broker is not suitable for all systems, but for this type of system going broker-less will quickly get complicated, and you're likely to end up with something resembling a broker anyway.
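To illustrate the fire-and-forget point, a minimal pyzmq PUB/SUB sketch (the port is arbitrary): there is no persistence and no ACK, so anything sent while no subscriber is connected is silently dropped.

    import time
    import zmq

    ctx = zmq.Context()
    pub = ctx.socket(zmq.PUB)
    pub.bind("tcp://*:5556")

    pub.send_string("job.created 42")  # lost: no subscriber connected yet
    time.sleep(1.0)                    # suppose a subscriber connects here
    pub.send_string("job.created 43")  # delivered only to whoever is connected now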
It doesn't look that way: it's not mentioned in the blog post, nor on Pinterest's GitHub page.
Therefore, I don't really understand the purpose of this blog entry. Yay, they built something. What's the use if it's not published and usable by others, too?