
A Celery-like Python Task Queue in 55 Lines of Code - jknupp
http://www.jeffknupp.com/blog/2014/02/11/a-celerylike-python-task-queue-in-55-lines-of-code/
======
m0th87
For an alternative, check out RQ: [http://python-rq.org/](http://python-rq.org/)

We use it in production and it's been rock-solid. The documentation is sparse
but the source is easy to follow.
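
To give a flavor of the API, here's a minimal sketch based on RQ's documented
usage (the `count_words` function and `mymodule` are illustrative; enqueued
functions have to live in a module the worker processes can import):

    from redis import Redis
    from rq import Queue

    # count_words is defined in mymodule.py so that `rqworker`
    # processes can import it when they execute the job.
    from mymodule import count_words

    q = Queue(connection=Redis())
    job = q.enqueue(count_words, 'http://python-rq.org/')
    print(job.id)  # the job sits in Redis until a worker picks it up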

~~~
marban
It's great, but as far as I can remember it doesn't support Python 3. Also,
it'd be nice to be able to use it with Mongo instead of Redis.

~~~
jmagnusson
Actually, Py3 support landed 6 months ago:
[https://github.com/nvie/rq/pull/239](https://github.com/nvie/rq/pull/239)

------
agentultra
I'd recommend looking at alternative serialization formats. Pickle is a
security risk that programmers writing distributed systems in Python should be
educated about.

~~~
jonesetc
I understand the risk is basically that you're evaling when unpickling.
What formats are safe, then?

~~~
michaelmior
Pickle doesn't really use eval, but there is still the potential for users to
execute arbitrary code [1]. JSON, YAML (loaded with `safe_load`), MessagePack,
etc. are safe in this respect (assuming a well-implemented parsing library)
because all the parser does is convert the data into simple data structures.

[1] [http://lincolnloop.com/blog/playing-pickle-security/](http://lincolnloop.com/blog/playing-pickle-security/)

~~~
jonesetc
I was just using "eval" loosely. What I specifically meant is that __init__ is
run for classes that define a .__getinitargs__ method [1]. And I guess JSON et
al. is the reasonable answer I should have expected. I was hoping for
something that mimicked the functionality of pickle but signed the data so
that it would be safe to use across a network.

[1] [http://docs.python.org/2/library/pickle.html#object.__getinitargs__](http://docs.python.org/2/library/pickle.html#object.__getinitargs__)
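
That pattern does exist: sign the pickled bytes with an HMAC over a shared
secret and verify before unpickling. A rough sketch (the key handling is
illustrative, and this only protects against tampering in transit, not
against a leaked key):

    import hashlib
    import hmac
    import pickle

    SECRET = b'shared-secret-key'  # illustrative; distribute out of band

    def dumps_signed(obj):
        payload = pickle.dumps(obj)
        sig = hmac.new(SECRET, payload, hashlib.sha256).digest()
        return sig + payload

    def loads_signed(blob):
        sig, payload = blob[:32], blob[32:]
        expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
        if not hmac.compare_digest(sig, expected):  # constant-time compare
            raise ValueError('bad signature; refusing to unpickle')
        return pickle.loads(payload)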

------
cschmidt

        Having a way to pickle code objects and their dependencies is a huge win,
        and I'm angry I hadn't heard of PiCloud earlier.

That's a nice use of the cloud library, without using the PiCloud service.
Unfortunately, the PiCloud service itself is shutting down on February 25th
(or thereabouts).
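
The serialization piece is still usable on its own. Something like this
sketch, assuming the library's cloudpickle module is importable as a
standalone package:

    import pickle
    import cloudpickle

    adder = lambda x: x + 10
    blob = cloudpickle.dumps(adder)  # stdlib pickle.dumps would raise here
    restored = pickle.loads(blob)    # plain pickle can load the blob back
    print(restored(5))               # -> 15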

~~~
jknupp
Ah, I'm sorry to hear that. Hopefully they'll make their library code more
readily available before closing their doors.

~~~
caidan
Looks like the PiCloud team is joining Dropbox, but according to their blog
post, "The PiCloud Platform will continue as an open source project operated
by an independent service, Multyvac
([http://www.multyvac.com/](http://www.multyvac.com/))."

Source: [http://blog.picloud.com/2013/11/17/picloud-has-joined-dropbox/](http://blog.picloud.com/2013/11/17/picloud-has-joined-dropbox/)

~~~
cschmidt
Yes, the Multyvac launch has been delayed until February 19.

They are supporting much of the PiCloud functionality, but not function
publishing, which I used quite a lot. (That was a way you could "publish" a
function to PiCloud, and then call it from a RESTful interface. It was a nice
way to decouple my computational code from my website, which has very
different dependencies.)

I fear it is more oriented toward the use case of long-running scientific
jobs rather than short, Celery-like tasks. I hope for the best.

------
tonymillion
Although Celery can use it, why is Amazon SQS treated as a second-class
citizen in Python background worker systems?

I've yet to find a background worker pool that plays nicely (properly) with
SQS.

------
dangayle
Thanks, Jeff. As someone else mentioned, I love these little projects that
demonstrate the basics of what the big projects actually do. It makes it much
easier to understand the big picture.

~~~
dmunoz
Absolutely. I'm always pleased when documentation includes some pseudocode for
what the system generally does, without the overhead of configuration,
exceptional control flow, etc. It's not always possible with large systems,
but it makes it a lot easier to see the forest, not the trees, even in
mid-sized code bases.

------
est
[http://docs.python.org/2/library/multiprocessing.html#sharing-state-between-processes](http://docs.python.org/2/library/multiprocessing.html#sharing-state-between-processes)

Why doesn't anyone build a Celery _and_ Redis alternative using this?
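
A minimal sketch of what that gives you locally: a worker pool fed by a
shared Queue, with no broker at all (but also no persistence or network
distribution, which is what Redis adds):

    from multiprocessing import Process, Queue

    def square(x):
        return x * x

    def worker(tasks, results):
        # Drain (func, args) pairs until the 'STOP' sentinel appears.
        for func, args in iter(tasks.get, 'STOP'):
            results.put(func(*args))

    if __name__ == '__main__':
        tasks, results = Queue(), Queue()
        for i in range(10):
            tasks.put((square, (i,)))
        procs = [Process(target=worker, args=(tasks, results))
                 for _ in range(4)]
        for p in procs:
            p.start()
        print(sorted(results.get() for _ in range(10)))
        for p in procs:
            tasks.put('STOP')
        for p in procs:
            p.join()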

~~~
SaberTail
Celery uses `billiard`, which is a fork of multiprocessing.

That doesn't help with communicating with a distributed worker pool, though.

------
thruflo
I scratched an itch in this space and created a web hook task queue in
Python. I wrote it up here: [http://ntorque.com](http://ntorque.com). Would
love to know if the rationale makes sense...

------
jbaiter
Are there any non-distributed task queues for Python? I need something like
this for a tiny web application that just needs a persistent queue for
background tasks, so tasks can resume in case the application
crashes/restarts. Installing Redis or even ZeroMQ seems excessive to me,
given that the application runs on a Raspberry Pi and serves at most 5 users
at a time.
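
For something that small, one option is a table in SQLite as the queue: it
survives restarts and needs no extra services. A rough sketch (the schema and
function names are illustrative):

    import sqlite3

    db = sqlite3.connect('tasks.db')
    db.execute('CREATE TABLE IF NOT EXISTS tasks '
               '(id INTEGER PRIMARY KEY, payload TEXT, done INTEGER DEFAULT 0)')

    def enqueue(payload):
        with db:  # commits on success, rolls back on error
            db.execute('INSERT INTO tasks (payload) VALUES (?)', (payload,))

    def next_task():
        row = db.execute('SELECT id, payload FROM tasks '
                         'WHERE done = 0 ORDER BY id LIMIT 1').fetchone()
        return row  # None when the queue is empty

    def mark_done(task_id):
        with db:
            db.execute('UPDATE tasks SET done = 1 WHERE id = ?', (task_id,))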

~~~
batbomb
My suggestion is to just use RabbitMQ. It's written in Erlang and uses a
BerkeleyDB-like backend for message storage. It's non-distributed and
"durable", with optionally persistent or non-persistent messages, and it has
a web interface for examining messages. My second suggestion is to use JSON
for your messaging format, although for basic tasks it's possible to put all
the info you need in the headers.
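
A sketch of that setup with the pika client (the queue name and message body
are illustrative): declare a durable queue and publish JSON messages marked
persistent so they survive a broker restart:

    import json
    import pika

    conn = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
    channel = conn.channel()
    channel.queue_declare(queue='tasks', durable=True)  # queue survives restarts
    channel.basic_publish(
        exchange='',
        routing_key='tasks',
        body=json.dumps({'task': 'resize_image', 'args': [42]}),
        properties=pika.BasicProperties(delivery_mode=2),  # persistent message
    )
    conn.close()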

~~~
ris
I have personally found RabbitMQ to be one of the most bonkers, overengineered,
and painful-to-manage services I've ever dealt with.

------
rch
I like these one-off projects that Jeff is doing, but it would be particularly
instructive to see one, or a combination, make it to 'real' status.

~~~
jknupp
Check out sandman: www.sandman.io or www.github.com/jeffknupp/sandman

