
Three quick tips from two years with Celery - taylorhughes
https://library.launchkit.io/three-quick-tips-from-two-years-with-celery-c05ff9d7f9eb
======
bjt
In my experience Celery brings more cost in complexity than value. It's got
several abstraction layers to make a variety of backends (Redis, Mongo,
relational DBs) all look more or less like AMQP.

My life got better when I stopped using Celery and instead started using Redis
or RabbitMQ directly. It's not that hard.
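
One way to read "use Redis directly": a bare-bones queue built on LPUSH/BRPOP. This is a sketch of the pattern, not the commenter's code; `client` is anything with the redis-py `lpush`/`brpop` methods, and the queue key and JSON message shape are assumptions.

```python
import json

QUEUE_KEY = "tasks"  # hypothetical Redis list name

def enqueue(client, task_name, **kwargs):
    """Producer: LPUSH one JSON-encoded task onto the list."""
    client.lpush(QUEUE_KEY, json.dumps({"task": task_name, "kwargs": kwargs}))

def run_worker(client, handlers):
    """Consumer: block on BRPOP until a task arrives, then dispatch it."""
    while True:
        _, raw = client.brpop(QUEUE_KEY)
        msg = json.loads(raw)
        handlers[msg["task"]](**msg["kwargs"])
```

The app calls `enqueue(redis.Redis(), ...)`; `run_worker` runs in a separate process. You give up Celery's retries and result backend, which is exactly the trade-off being argued for here.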

~~~
1_player
I have been working on a Django application with two periodic background tasks
to synchronize some data with some hardware, and I was using Celery+Redis to
handle the scheduling. The solutions for running cron-like tasks in Python are
not very mature; everybody seems to be using Celery.

Once deployed in production I spent countless evenings debugging why the tasks
weren't running, why tasks weren't terminated on timeout, and why the task
queue was getting bigger and bigger. Half of the problems were configuration
issues, due to the _bad_ and confusing documentation; the other half came from
the sheer amount of complexity Celery introduces, which often left me reading
through its source code to understand what was going on. And Python, with all
its magic methods and abstractions, made that very hard.

One day I just rewrote everything with three Django management commands: two
for the background tasks and one for the scheduling (with multiprocessing). I
haven't heard from the client since.
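
The commenter's code isn't shown; as a rough stdlib sketch of what the scheduling command's core could look like (using subprocess rather than multiprocessing, with made-up command names and interval):

```python
import subprocess
import sys
import time

COMMANDS = ["sync_hardware_a", "sync_hardware_b"]  # hypothetical command names
INTERVAL = 300                                     # seconds between runs (assumed)

def run_once(argvs):
    """Start one child process per command line and wait for all of them."""
    procs = [subprocess.Popen(argv) for argv in argvs]
    return [p.wait() for p in procs]

def main():
    while True:
        run_once([[sys.executable, "manage.py", name] for name in COMMANDS])
        time.sleep(INTERVAL)
```

Each task runs in its own process, so a crash or hang in one sync cycle cannot corrupt the scheduler loop itself.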

------
andy_ppp
I've found a few problems with celery myself. It has an absolutely amazing
feature set and seems like a clever and very clean abstraction.

However, it starts hiding exceptions within your tasks if you use JSON for
messages, and sometimes it does things you weren't expecting, like randomly
not passing information between dependent tasks. I've also found that gevent
and Celery don't play well together, even if you disable gevent for the
worker.

Maybe you eventually realise that writing your own tasks for RabbitMQ and your
own workers is simpler than relying on someone else's complex code.

Celery will 100% do what you want, but it's definitely been a steeper learning
curve than it should be, with lots of gotchas and many hours of hunting around
to deal with obscure issues.
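
For reference, a hand-rolled RabbitMQ consumer along those lines is quite short. This sketch assumes the pika client, a durable `tasks` queue, and a JSON message format, none of which come from the comment:

```python
import json

def dispatch(body, handlers):
    """Decode one message and call its handler; exceptions propagate visibly."""
    msg = json.loads(body)
    return handlers[msg["task"]](**msg.get("kwargs", {}))

def run_worker(handlers, queue="tasks"):
    import pika  # pip install pika; imported here so dispatch() stays broker-free

    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    ch = conn.channel()
    ch.queue_declare(queue=queue, durable=True)

    def on_message(channel, method, properties, body):
        dispatch(body, handlers)
        channel.basic_ack(delivery_tag=method.delivery_tag)  # ack only on success

    ch.basic_qos(prefetch_count=1)  # one unacked message per worker at a time
    ch.basic_consume(queue=queue, on_message_callback=on_message)
    ch.start_consuming()
```

With acks sent only after the handler returns, a failed task stays on the broker instead of being silently dropped, which addresses the hidden-exception complaint directly.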

I do wonder if there is a reasonable heuristic, apart from experience, to
learn when a piece of software may be too complex for at least me to use well.

------
rizwan
Forgive me for asking, but how is "no default timeout" a sensible default for
any technology?
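
For what it's worth, Celery does let you opt in; a sketch of the settings involved (Celery 4+ lowercase names, values illustrative):

```python
# celeryconfig.py -- opting in to the timeouts Celery doesn't set by default
task_soft_time_limit = 60  # SoftTimeLimitExceeded is raised inside the task
task_time_limit = 90       # the worker process is killed and replaced after this
```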

------
OrangeTux
And do not use Redis as a broker under high load. It will randomly crash or
refuse to put workers to work while the queue fills up with tasks.

~~~
mooted1
At Uber we use Redis as a broker for 10k+ requests per second (pushing 50k on
some days), with up to several million items backed up. This is a single Redis
instance with replication turned on and RDB snapshots for durability.

Curious how our usage patterns differ to cause what you're seeing.

~~~
andy_ppp
Is that with Celery or are you using your own queue management?

------
simonpantzare
-Ofair disables task prefetching which will affect throughput negatively if most of your tasks run for a short time ([http://celery.readthedocs.org/en/latest/userguide/optimizing...](http://celery.readthedocs.org/en/latest/userguide/optimizing.html#prefetch-limits)).

~~~
mlissner
yeah, I don't get this suggestion at all. They say that:

> This option comes with a coordination penalty, but results in a much more
> predictable behavior

In general, I'd much rather have the greater throughput than predictable
behavior. I'm not watching my queue, so having it be predictable isn't a very
high priority.

~~~
taylorhughes
Ofair will increase throughput in a queue where you have mixed-length tasks,
because in a default configuration long tasks will block the queue from
starting new tasks. I believe this is because the main process distributes
some number of tasks, waits for them to finish, then distributes another set
of tasks, and so on. (Would love more clarity on this — I haven't dug into it
too much.)

Assuming most Celery task queue work involves network requests that can be
unpredictable, I think Ofair is a more sensible/understandable default.
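
For anyone wanting to experiment with both knobs, the relevant settings look roughly like this (Celery 4+ names; the value shown is illustrative, not the article's recommendation):

```python
# celeryconfig.py -- prefetching is a separate knob from the -Ofair flag
worker_prefetch_multiplier = 1  # default is 4: messages reserved per process

# -Ofair itself is passed on the worker command line:
#   celery -A proj worker -Ofair
```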

------
humbertomn
For the retry part, I prefer to use a database table + cron task to do it:
storing failed attempts and making x new attempts at predefined dates and
times, rather than keeping them permanently on a Celery queue.
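
The comment leaves the schema and retry policy open; as one concrete sketch of the idea, using sqlite3 from the stdlib, with the attempt cap and backoff invented for illustration:

```python
import sqlite3
import time

SCHEMA = """CREATE TABLE IF NOT EXISTS retries (
    id INTEGER PRIMARY KEY,
    task TEXT NOT NULL,
    payload TEXT NOT NULL,
    attempts INTEGER NOT NULL DEFAULT 0,
    next_attempt_at REAL NOT NULL
)"""

MAX_ATTEMPTS = 5  # the "x new attempts" from the comment; value assumed
BACKOFF = 3600    # retry an hour later (assumed policy)

def record_failure(db, task, payload):
    """Called when a task fails: persist it for a later attempt."""
    db.execute(
        "INSERT INTO retries (task, payload, next_attempt_at) VALUES (?, ?, ?)",
        (task, payload, time.time() + BACKOFF))

def retry_due(db, handlers, now=None):
    """Run from cron: re-attempt every row whose scheduled time has come."""
    now = time.time() if now is None else now
    rows = db.execute(
        "SELECT id, task, payload, attempts FROM retries "
        "WHERE next_attempt_at <= ?", (now,)).fetchall()
    for rid, task, payload, attempts in rows:
        try:
            handlers[task](payload)
            db.execute("DELETE FROM retries WHERE id = ?", (rid,))
        except Exception:
            if attempts + 1 >= MAX_ATTEMPTS:
                db.execute("DELETE FROM retries WHERE id = ?", (rid,))  # give up
            else:
                db.execute(
                    "UPDATE retries SET attempts = attempts + 1, "
                    "next_attempt_at = ? WHERE id = ?", (now + BACKOFF, rid))
```

The appeal is that the retry state is inspectable with plain SQL, instead of living opaquely inside a broker queue.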

