
Celery 4.0 - mlissner
http://docs.celeryproject.org/en/latest/whatsnew-4.0.html
======
parhamn
Celery is one of those things in Python that you can't (sometimes
unfortunately) live without. Earlier versions of Celery had some difficult
bugs and inconsistencies that made it feel like a very tough tool to work with,
requiring a lot of developer diligence and operational experience to make
sure your system didn't break. Things like message memory explosion (multi-pass
deserialization), poor defaults, difficulty debugging and tracking
exceptions, weak monitoring tools like 'flower', etc. all led to this.
The problems were exacerbated by the fact that simple async operations in Python
(which are easily handled in more concurrent languages with a simple
go func() { ... }()) end up requiring a heavy distributed solution like Celery
(or the lighter RQ worker), which creates a whole host of issues.

I imagine that as native async tooling improves in Python 3.x (async/await,
aiohttp, and other tools), the use of Celery for trivially concurrent things
will decrease and Celery's usage will focus on more complex workflows (chords,
fanouts, map/reduce).

Looks like many concerns were tackled here (thanks, Celery team) and I'm
looking forward to playing around with this release.

~~~
andybak
I've managed to live without it. For low- and medium-traffic sites it's hugely
over-engineered. For all the sites I manage, I run a single cron task that
triggers a range of background jobs. It's an approach that has worked very
well for nearly 10 years.

I understand there's a genuine use case for Celery, but like many technologies,
people are told "if you need a task queue, use this" when there are much
simpler solutions that are more than good enough for most cases.
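The cron approach described above can be sketched in a few lines. This is a minimal, hypothetical job runner (all names are illustrative, not from any real project): cron invokes the script on a schedule, and the script runs every registered job.

```python
import traceback

# Hypothetical registry of background jobs. Cron runs this one script on a
# schedule, e.g.:  */5 * * * *  /usr/bin/python /srv/app/run_jobs.py
JOBS = []

def job(func):
    """Register a function to run on each cron tick."""
    JOBS.append(func)
    return func

@job
def send_queued_emails():
    return "emails sent"

@job
def expire_stale_sessions():
    return "sessions expired"

def run_all():
    """Run every registered job; one failure shouldn't stop the rest."""
    results = []
    for func in JOBS:
        try:
            results.append((func.__name__, func()))
        except Exception:
            traceback.print_exc()
    return results

if __name__ == "__main__":
    run_all()
```

No broker, no workers, no serialization format to debug; the trade-off is that jobs run at cron's granularity and on a single machine.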

~~~
asksol
Celery was never meant as a replacement for cron; it was simply a nice bonus
that fits the messaging pattern well. Writing a task queue is actually very
simple using, for example, Redis, but that doesn't necessarily mean Celery is
over-engineered IMHO. It's very easy to forget the support required once your
system is in production.

Disclaimer: I'm a contributor
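A task queue over Redis really is only a few lines. A minimal sketch, assuming `client` is anything with `rpush`/`blpop` methods (a redis-py connection in practice; none of these function names come from Celery itself):

```python
import json

def enqueue(client, queue, task_name, *args):
    """Push a JSON-serialized task description onto a named list."""
    client.rpush(queue, json.dumps({"task": task_name, "args": list(args)}))

def worker_step(client, queue, handlers, timeout=1):
    """Block for one message and dispatch it to the matching handler."""
    item = client.blpop(queue, timeout=timeout)
    if item is None:
        return None  # timed out, nothing queued
    _key, raw = item
    msg = json.loads(raw)
    return handlers[msg["task"]](*msg["args"])
```

What this sketch omits is exactly the "support required in production" part: retries, acknowledgements, result storage, rate limits, monitoring, worker supervision.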

~~~
andybak
I'm not saying Celery is over-engineered in general. It's just over-engineered
in the context I've often seen it recommended in, i.e. for people learning or
people with fairly modest requirements.

~~~
collyw
Yep, I first asked on Stack Overflow for the best way to achieve a background
task in a Django app, and the answers all said Celery. Considering the task
ran about once a week, I ended up with a very over-engineered solution.

It's useful to know Celery, and it gets used in a proper context in my current
work, so I guess learning it wasn't a waste.

------
ceronman
"Starting from Celery 5.0 only Python 3.5+ will be supported."

This is a small detail, but I'm glad that one more big Python project is
dropping 2.x support. The 2/3 split is not good for the community.

~~~
alexhayes
Agreed!

------
ben_jones
IMO the secret ingredient to scaling almost any Python web application, or
even implementing it in the first place. It's a shame they had to curtail some
features due to lack of funding.

Edit: Apparently they curtailed some features for simplicity as well.

~~~
asksol
Thank you for the kind words, it's so very appreciated :)

I have merged many features, like broker transports, result backends, etc.,
and while the initial contributions were great, they end up being unmaintained,
with issues that nobody fixes.

If there's any feature that you really want back, chances are the problems
with that feature are not super difficult to fix, so please reach out!

~~~
ergo14
asksol, we use Celery at AppEnlight extensively and love it. Thank you for
your great work.

------
stuaxo
On my last project I had to use Celery. In the end, I had to look inside the
source... having done the same with other Python + Django projects, I was
apprehensive.

I needn't have been; it was a pleasant surprise. It's great that 4.0 has come
out.

Almost every Django developer will touch Celery at some time. If your
organisation can support development, please do; it's not just an important
piece of software, but well put together too.

------
cwyers
> Nowadays it’s easy to use the requests module to write webhook tasks
> manually. We would love to use requests but we are simply unable to as
> there’s a very vocal ‘anti-dependency’ mob in the Python community

I'm not a heavy Python user, and I've never heard this before. It sounds...
less than good.
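For context, the "webhook task" the release notes mention is just an HTTP POST fired from a worker. Since the quote is about avoiding the requests dependency, here is a dependency-free sketch using only the stdlib (the URL and payload fields are made up for illustration):

```python
import json
import urllib.request

def build_webhook_request(url, event, payload):
    """Build a JSON POST for a webhook delivery."""
    body = json.dumps({"event": event, "data": payload}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def deliver(req):
    # The network call; in a task queue this is the part you'd retry on failure.
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

requests makes the same thing shorter and handles redirects, connection pooling, and encodings for you, which is presumably why the Celery docs would prefer it.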

~~~
ryankask
The author may be referring to libraries themselves having dependencies. These
attitudes are changing. For example, last week the Django community published
a draft proposal that declares that "Django can have dependencies". The
"Background" section is a good read and helps explain the origins of these
attitudes:
[https://github.com/django/deps/blob/master/draft/0007-dependency-policy.rst#background-and-motivation](https://github.com/django/deps/blob/master/draft/0007-dependency-policy.rst#background-and-motivation)

------
fuhrysteve
Congratulations to @asksol and the rest of the team on sealing the deal on
4.0! I've been waiting for a number of the features now in 4.0 for a long
time. I know you guys have been busting your asses for a long time, juggling
some complicated dependencies along the way.

Some time ago I built a little Celery add-on library as a sort of experimental
way to solve the problem of having dynamic celery beat scheduled tasks. I
never ended up deploying it anywhere, for a few different reasons:
[https://github.com/fuhrysteve/CeleryStore](https://github.com/fuhrysteve/CeleryStore)

I really like the concept behind the old djcelery project. But I don't use
Django much these days, and I'd like for it to be more compatible with tools
I've become more familiar with (SQLAlchemy, etc.).

Do you have any advice for how to approach this? I know 4.0 introduced some
new abilities to add beat entries via API.
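The core of what a dynamic beat store has to do can be sketched without Celery at all; this toy class (not Celery's API, just the shape of the problem) keeps interval entries that can be added and removed at runtime, which is what a SQLAlchemy-backed store would persist:

```python
import time

class DynamicSchedule:
    """Toy stand-in for a beat schedule backed by a mutable store (a DB via
    SQLAlchemy, etc.); entries can be added and removed while running."""

    def __init__(self):
        self.entries = {}  # name -> (interval_seconds, last_run_timestamp)

    def add(self, name, interval):
        self.entries[name] = (interval, None)

    def remove(self, name):
        self.entries.pop(name, None)

    def due(self, now=None):
        """Return the names due to run now, and mark them as run."""
        now = time.time() if now is None else now
        ready = []
        for name, (interval, last) in self.entries.items():
            if last is None or now - last >= interval:
                ready.append(name)
                self.entries[name] = (interval, now)
        return ready
```

The hard parts a real implementation adds on top are persistence, clock drift, and making sure only one beat process marks an entry as run.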

------
welder
In case you also run into the removal of error emails[1], I've had to switch
back to using a decorator[2] to catch unhandled exceptions and generate error
emails.

[1] CELERY_SEND_TASK_ERROR_EMAILS config removed
[http://docs.celeryproject.org/en/latest/whatsnew-4.0.html#features-removed-for-simplicity](http://docs.celeryproject.org/en/latest/whatsnew-4.0.html#features-removed-for-simplicity)

[2]
[https://gist.github.com/alanhamlett/dc8cdd4721ea63053f14#file-tasks-py-L14](https://gist.github.com/alanhamlett/dc8cdd4721ea63053f14#file-tasks-py-L14)
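The decorator pattern is simple enough to sketch generically. This is not the gist above, just a minimal illustration; the `send_mail` callable is injected so production code could plug in smtplib or anything else:

```python
import functools
import traceback

def email_errors(send_mail):
    """Decorator sketch: report unhandled task exceptions via `send_mail`,
    then re-raise so the task still registers as failed."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                send_mail(
                    subject=f"Task {func.__name__} failed: {exc!r}",
                    body=traceback.format_exc(),
                )
                raise
        return wrapper
    return decorator
```

Re-raising matters: swallowing the exception would make Celery mark the task as succeeded and skip any retry policy.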

------
kkirsche
Just started using this for the first time. It's pretty nice. I'm still trying
to figure out how to really use it with Flask/Connexion cleanly and not in the
top-level module, but it's made my life much simpler for handling long-running
tasks (e.g. when a 202 status response is needed). Great project. Sad to see
SQLAlchemy was removed from the brokers, though I understand why.
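The 202-response pattern mentioned above reduces to a tiny protocol, sketched here without Flask or Celery (all names are illustrative): the submit endpoint returns a task id immediately with HTTP 202, a worker completes the job, and the client polls a status endpoint.

```python
import uuid

# In-memory stand-in for a result backend.
TASKS = {}

def submit(func, *args):
    """What a view does before returning 202: record the job, hand back an id."""
    task_id = str(uuid.uuid4())
    TASKS[task_id] = {"status": "PENDING", "result": None, "job": (func, args)}
    return task_id

def run_pending():
    """Stand-in for the worker process draining the queue."""
    for info in TASKS.values():
        if info["status"] == "PENDING":
            func, args = info.pop("job")
            info["result"] = func(*args)
            info["status"] = "SUCCESS"

def status(task_id):
    """What the polling endpoint returns."""
    return TASKS.get(task_id, {"status": "UNKNOWN", "result": None})
```

With Celery, `submit` is `task.delay(...)` returning an `AsyncResult` id, and the result backend replaces the `TASKS` dict.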

~~~
brianwawok
You don't want to use a DB as a broker.

For AWS there's SQS, which is awesome. Now if someone would write a Google
Cloud Pub/Sub transport, all would be well in the world.

~~~
pdpi
Hum, Google Pub/Sub seems roughly equivalent to SQS to me?

[https://cloud.google.com/pubsub/docs/overview](https://cloud.google.com/pubsub/docs/overview)

~~~
brianwawok
Yes, but to my knowledge no one has written the Kombu transport for it yet. To
use a new broker with Celery you have to write the code for how to send and
receive messages.
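At its core, the contract such a transport has to fulfill is small: put a serialized message on a named queue, and get one back. The sketch below is a toy in-memory model of that contract, not Kombu's actual interface (the real one, under `kombu.transport.virtual`, is considerably richer):

```python
import collections
import json

class InMemoryChannel:
    """Toy model of the core broker-transport contract: deliver a message
    to a named queue, and fetch one back in FIFO order."""

    def __init__(self):
        self.queues = collections.defaultdict(collections.deque)

    def put(self, queue, message):
        # Messages cross the wire serialized; JSON stands in for the real codec.
        self.queues[queue].append(json.dumps(message))

    def get(self, queue):
        if not self.queues[queue]:
            return None
        return json.loads(self.queues[queue].popleft())
```

A real Pub/Sub transport would map `put`/`get` onto publish and pull calls, plus acknowledgement handling, which is where most of the work lives.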

------
sandGorgon
Has anybody compared Celery 4 to RQ (which started off as a much simpler, more
performant alternative to Celery)?

Would love to see how they stack up after this release.

~~~
asadjb
We recently moved one of our services from RQ to Celery. We are using an older
version of Celery, but this comment should still apply. While RQ is a _great_
way to go at the start, if you are handling a large number of messages you'll
run into memory bottlenecks. This isn't a problem with RQ itself, but with
using Redis as a broker. So anyone considering RQ vs Celery should keep this
in mind.

The reason we switched to Celery was the volume of messages we were handling.
Since RQ relies on Redis, _all_ your messages need to fit in memory. While RQ
was great and simple to set up at the start, as we grew we were consistently
dealing with RQ breaking because Redis was full and stopped accepting any
write operations.

We moved to Celery because it could use RabbitMQ as a broker. RabbitMQ
offloads most messages to the disk which has nicely taken care of the memory
limitation issues.

With Rq we would get stuck after 10K messages (our messages included images so
individual message size was large). With RabbitMQ I've seen the queue grow to
about 120K without so much as a single hiccup.

~~~
sandGorgon
That's very insightful. Quick question, since you are doing this in production:
how are you serializing images into a message? Base64 or something else?

~~~
zo1
Not the person you're asking, but from the experiences I've had with Celery +
other messaging queues: Don't pass around large binary blobs if you can avoid
it. Whether that is an image or something else is irrelevant.

You can do things such as passing a database primary key, GUID, or file path
to the raw data on disk. Obviously, you will also need to engineer around that
if you've got a distributed system. The tangential benefit is that you're not
using a "messaging" queue or system for persisting or semi-persisting your
image data. That's a big no-no, as such systems are transient in nature, which
often doesn't align with binary or image processing.

Base64 is for when you want to pass binary around in a text-based format, e.g.
XML or JSON. But do keep in mind that because the encoding emits 4 output
characters for every 3 input bytes, converting binary data to base64 increases
the payload size by about 33%.
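The overhead is easy to check with the stdlib:

```python
import base64

# Base64 maps each 3 bytes of input to 4 output characters, so the size
# overhead is about 33% (plus up to 2 bytes of '=' padding).
raw = bytes(range(256)) * 300           # 76,800 bytes of "binary" data
encoded = base64.b64encode(raw)
overhead = len(encoded) / len(raw) - 1
print(f"{overhead:.0%}")                # about 33%
```

That overhead compounds the memory pressure described above, which is another reason to pass references to blobs rather than the blobs themselves.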

------
fuxi
I have been using Celery for years now and I love it. I was excited to see 4.0
with SQS support. However, after giving it a spin I ran back to 3.x.

Unfortunately, version 4.0.0 has too many bugs for now. I will need to check
back in a few months. Better tests and better mocks would have prevented the
issues.

------
tschellenbach
Many thanks Asksol, beautiful software. Have been a happy user for many years
now. Hope to try out 4.0 soon!

------
ledil
Can you suggest a monitoring tool for Celery? I want to view what tasks are
currently running... is flower the only solution? thx

~~~
alexhayes
Have you tried running 'celery events' on the command line? I find this
sufficient in most cases.

~~~
ledil
Do I need to enable something to activate celery events?

------
gcb0
A RabbitMQ client? What's so awesome about it that everyone here says it is
essential for Python web projects? Honest question; I've never used Python for
that.

~~~
pandler
I haven't used Django in a while, but if I'm not mistaken, it's basically the
easiest way to add any kind of asynchronous behavior to a Python web server.
There are a few other options, but I don't think any of them are as mature as
Celery.

~~~
zo1
The Python Tornado web server does have the ability to do async tasks.
However, they'd better be non-blocking; otherwise your nice async web server
is going to stop serving requests while your task is processing.

------
logronoide
Worst experience with a python toolkit ever. I hope new versions fix all bugs
and issues.

~~~
alexhayes
Of course, all other software is bug free...

Honestly this is a complex problem area and I think the Celery developers have
done an excellent job of making it pretty trivial to get up and running while
providing lots of flexibility for more advanced users! Not an easy feat.

Are there bugs? Of course, but I've never come up against one that I couldn't
work around. Is that annoying? Sometimes, but that's software development.

