
Why do we need Flask, Celery, and Redis? (2019) - feross
https://ljvmiranda921.github.io/notebook/2019/11/08/flask-redis-celery-mcdo/
======
sandGorgon
I see a lot of comments that are talking about how python does not have a go-
like concurrency story.

Fyi - Python ASGI frameworks like FastAPI/Starlette offer the same developer
experience as Go. They also compete on TechEmpower benchmarks, and are used in
production by Uber, Microsoft, etc.

A queue-based system is used for a very different tradeoff of persistence vs
concurrency. It's similar to saying that the use case for Kafka doesn't exist
because Go can do concurrency.

Running "python -m asyncio" launches a natively async REPL.
[https://www.integralist.co.uk/posts/python-asyncio/#running-async-code-in-the-repl](https://www.integralist.co.uk/posts/python-asyncio/#running-async-code-in-the-repl)
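The same kind of async code also runs in a normal script via `asyncio.run`. A minimal sketch (`fetch` here just simulates an I/O-bound call):

```python
import asyncio

async def fetch(name, delay):
    # Simulate an I/O-bound call (e.g. an HTTP request)
    await asyncio.sleep(delay)
    return f"{name} done"

async def main():
    # Both "requests" run concurrently: total time ~= max(delay), not the sum
    return await asyncio.gather(fetch("a", 0.1), fetch("b", 0.1))

results = asyncio.run(main())
print(results)  # ['a done', 'b done']
```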

Go play with it ;)

~~~
pinkbeanz
I think a hard part with lots of these “what do I use x for” examples is that
they start with the tool and then discuss the problem that it solves. I find it
more helpful to start with a problem, and discuss the various tools that
address it, in different ways.

Forget email; say you have an app that scans links in comments for
maliciousness. You rely on an internal API for checking against a known
blacklist, which follows shortened links first, and an external API from a
third party. You want the comment to appear to submit instantly to the poster,
but are comfortable waiting for it to appear for everyone else. What are your
options?

You could certainly use message queues and workers. If you’re cloud native
maybe you leverage lambdas. Maybe you spin up an independent service that does
the processing and inserting into the database in the background, and all you
need to do is send a simple HTTP request on an internal network.

Your solution depends on your throughput requirements, the size of your team
and their engineering capabilities, and what existing solutions you have in place.
Everything has its pros and cons. Pretending that celery/redis is useless and
would be solved if everyone just used Java ignores the fact that celery and
redis are widely popular and drive many successful applications and use cases.

~~~
StavrosK
While I agree with the rest of your comment, the sentence "if you’re cloud
native maybe you leverage lambdas" made me irrationally angry.

~~~
CalRobert
Can you explain why? I use lambdas often and they seem to solve the problems
they're meant for well.

~~~
StavrosK
It wasn't the lambdas, it was the combination of "cloud-native", which is a
very salesmany term, and "leverage", which is my pet hate word. It's exactly
as useful as "use", only much more pretentious. I'm just easily triggered with
language :P

More off-topic (or, rather, on-topic), I find lambdas great for things like a
static website that needs a few functions. I especially like how Netlify uses
them, they seem to fit that purpose exactly.

~~~
normalnorm
> I'm just easily triggered with language :P

Me too! It makes me irrationally angry when people regurgitate linguistic
clichés. I was already mad with:

"python does not have a go-like concurrency story"

when it would be enough (and 1000x less cringe) to say:

"python does not have go-like concurrency"

I think these mindless clichés make language really ugly and dysfunctional,
and even worse they are thought-stoppers, because they make the
reader/listener feel like something smart is being said, because they
recognize the "in-group" lingo. In my experience, people get really offended
when you point this out. It's kind of an HN taboo to discuss this. Which is
also interesting in itself.

Going forward we should pay more attention to our communication use cases.
Btw: I wonder if we can stack several of these clichés. For example:
"leverage" \+ "use case" = "leverage case".

~~~
StavrosK
I agree, and I think Orwell's "Politics and the English Language" is spot on here.
I try to use simpler language whenever possible; I agree that people think
using longer words makes them sound smart, but it's just worse for
communication.

I've found it's a taboo to discuss anything even slightly personal. People are
averse to feeling bad, so criticism needs to be extremely subtle in order to
not offend.

> Btw: I wonder if we can stack several of these clichés. For example:
> "leverage" \+ "use case" = "leverage case".

I hate you for even thinking of this.

~~~
rumanator
> I've found it's a taboo to discuss anything even slightly personal. People
> are averse to feeling bad, so criticism needs to be extremely subtle in
> order to not offend.

The association you made between "discussing anything even slightly
personal" and "criticism needs to be extremely subtle" makes it sound as if
your problem isn't language or Orwellian discourse, but the way you
subconsciously link discussing personal matters with harshly criticising those
you speak with for no good reason.

If your personal conversations boil down to appeasing your own personal need to
criticise others, then I'm sorry to break it to you, but your problem isn't
language.

~~~
StavrosK
You just misconstrued my saying "personal" and clearly meaning "personal
criticism" as meaning personal things in general and then criticized me on
that straw man. I don't hold that opinion at all.

You also went to "Orwellian discourse", which has a specific meaning, from a
text by Orwell I mentioned. It seems to me like you got personally offended,
interpreted my comment in the most uncharitable way, and chose to lash out at
me instead, and I'm not sure why. I wasn't even talking about anyone
specifically.

~~~
rumanator
You were the one associating discussing remotely personal stuff with
criticising others, and if that was not bad enough, your personal take was that
you felt the need to keep criticising others, resorting to subtlety just so you
could keep shoveling criticism without sparking the reactions you're getting
for doing the thing you want to do to others.

I repeat, your problem is not language. Your problem is that you manifest a
need to criticize others. That problem is all on you.

~~~
heyoni
That is quite a stretch of the imagination, and I sure didn’t read it that
way. I may be wrong, but here’s one fun example from this comment section that
I wanted to “respond” to and demand some clarification on.

[https://news.ycombinator.com/item?id=22911497](https://news.ycombinator.com/item?id=22911497)

PS: I work with like 65-70% of that stack daily

------
waterside81
What's interesting is that McDonald's wait times have actually gone up since
they moved to kiosk ordering, mobile app ordering, Uber eats etc. They've
increased their ability to take orders, but haven't been able to keep up on
the supply side. The old way was almost better in that it introduced a natural
bottleneck so while it took longer to place your order, once you did, the
queue in front of you was shorter.

[https://www.businessinsider.com/mcdonalds-spending-millions-on-drive-thru-2019-10](https://www.businessinsider.com/mcdonalds-spending-millions-on-drive-thru-2019-10)

~~~
skrebbel
This is intentional. In fact, I've seen many McDonald'ses that were
redecorated such that you can't see the screen with the queued/ready orders
from where the kiosks are. This way, you're not discouraged from ordering if
you feel like the wait will be too long.

This is also why McDonald's introduced table service, which is only in
restaurants that have a layout where it's impossible to hide how many people
are waiting. It costs manpower to deliver food to tables, but the additional
orders are worth it.

McD's don't mind if you have to wait, they mind if you leave before you order.
"Busy-looking queue" is a much more frequent problem than "totally-packed-
restaurant".

Source: I like hamburgers + I geek out over stuff like this. I.e., Just
Guessing.

~~~
arrty88
Yet we are willing to wait 20 min for Chipotle, Five Guys, or Shake Shack.

~~~
darkerside
?

Chipotle is about as fast as it gets

~~~
dillonmckay
Not when the person in front of you is getting 6 different orders.

------
1337shadow
Actually with uWSGI, be it on Flask, Django or anything else: I need neither
Celery nor Redis. uWSGI has a built-in celery-ish spooler, cron-ish task
scheduler, memcache-ish key-value store, along with _plenty_ of other toys
that I love from the deepest of my heart ... And I've been in this (great)
situation for years, not planning to move out to more complicated stacks. I
would highly recommend uWSGI over Celery and Redis, which I used in the
past, prior to doing it all in uWSGI, unless you have a really good reason,
which I'm eager to read about. And now that uWSGI supports a lot of languages,
even if I have some PHP or whatever to deploy, I'll go for uWSGI, one of the
most beautiful pieces of software I have the chance to use.
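For reference, those features are switched on from the uWSGI config itself. A sketch (the module name, paths and values here are illustrative; see the uWSGI docs for the exact option semantics):

```ini
[uwsgi]
module = app:application
processes = 4

; celery-ish spooler: tasks are files dropped into this directory
spooler = ./spool
spooler-processes = 2

; cron-ish scheduler: minute hour day month weekday command (-1 = any)
cron = 0 -1 -1 -1 -1 python manage.py cleanup

; memcache-ish key-value store
cache2 = name=default,items=1000
```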

~~~
d33
Could you elaborate, e.g. link to documentation, subprojects or example code?
I'd love to get rid of Celery because of how difficult it is for me to tweak
it for good performance.

~~~
darkerside
Would love to see more detail as well, because I'm a bit skeptical. From the
uWSGI documentation, it looks like to coordinate cron across anything more than
a single-server setup, you'd need to configure a Legion, which means you're then
integrating uWSGI's orchestration framework with whatever you're already using
(k8s, ECS, etc.).

I like minimalism, but sometimes batteries are included for a reason.

~~~
GlennS
I suspect they are using a single server. I found uwsgi's mules and cache2
very useful in that situation.

If someone finds that Redis and Celery are more complication than they need
for a given task, then I think they're probably not using an orchestration
framework.

------
ljvmiranda
Hi, author here! Pleasantly surprised I saw this on HN, thanks for posting
feross! Sorry for the Mcdonalds analogy, it's just that it's really near our
office and I got that insight while ordering McNuggets! Didn't expect it will
cause some divide

Agreed, McDonald's has definitely upped their ordering game recently. Thank you,
and I appreciate all the helpful comments!

~~~
throwaway888abc
Hey, nice article format for actual humans, with an easy-to-digest flow. Will
share with devs. Great work! Thanks

------
andybak
I wish there was a paragraph up the top that made two points:

1\. Quite often you don't (I've built dozens of websites without needing
Celery)

2\. Even if you think you do there's often a much simpler solution that is
enough for most needs (Use cron, spawn a process etc)

Celery is a big, heavy lump of code to add to most websites, and it increases
deployment complexity.

~~~
danpalmer
It’s worth scoping out what your site will need to do up front to some extent.
Spending a couple of days setting up a basic background processing system
(whether that’s celery, rq, or a home grown system) makes it easy to make
great engineering decisions later down the line.

Questions like "should I send this email in-line in the web request?" get a
very easy answer: no, just queue it for later. Sure, sending an email is
probably fine to do in-line for now, but months in you may realise that things
are slow, that you're sending emails and then rolling back transactions, or
committing the transaction but losing the email that needed to be sent, or all
manner of other annoying edge cases. Queues don't solve everything, but they
can be an OK answer to a lot of stuff for a long time.
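The "just queue it" answer can start as small as an in-process worker while you grow into Celery or rq. A toy sketch of the pattern (stdlib only; `send_email` is a stand-in for a real mailer):

```python
import queue
import threading

email_queue = queue.Queue()
sent = []  # stand-in for the real side effect, kept for demonstration

def send_email(to, subject):
    # Stand-in for smtplib / an email API call
    sent.append((to, subject))

def worker():
    while True:
        job = email_queue.get()
        if job is None:  # sentinel: shut down gracefully
            break
        send_email(*job)
        email_queue.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()

# In the web request handler: enqueue and return immediately
email_queue.put(("user@example.com", "Welcome!"))

email_queue.join()     # wait for the queue to drain (only needed in this demo)
email_queue.put(None)  # graceful shutdown
t.join()
print(sent)  # [('user@example.com', 'Welcome!')]
```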

For basic sites, yeah maybe not necessary, but a reliable background
processing system has always been a significant accelerator in my projects.

~~~
1337shadow
Same in my experience: emails should always be sent in a background process.
Luckily for me, using uWSGI to deploy anything in any language: it has a nice
little built-in spooler that lets me spool emails without adding a single new
piece of software to my stack: not even having to start another process.

------
procinct
One aspect of this setup I’ve never been able to understand is how the
application then gets the result from the worker. If it’s polling the backend
for status, doesn’t that defeat the purpose of having a worker to begin with?
Or does this setup only work for tasks where the backend doesn’t need to be
notified about the result, so the front end can just poll for the result via
the application?
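For what it's worth, the usual contract is that the application polls a result *store* rather than the worker itself, so web processes never block. A toy version with stdlib futures (the `submit`/`status` names are invented stand-ins for HTTP endpoints; in Celery the role of the `results` dict is played by the result backend, e.g. Redis):

```python
import concurrent.futures
import time
import uuid

executor = concurrent.futures.ThreadPoolExecutor(max_workers=2)
results = {}  # stand-in for a result backend

def slow_task(x):
    time.sleep(0.1)
    return x * 2

def submit(x):
    # "POST /tasks" -> hand back a task id immediately
    task_id = str(uuid.uuid4())
    results[task_id] = executor.submit(slow_task, x)
    return task_id

def status(task_id):
    # "GET /tasks/<id>" -> cheap lookup, never blocks the web process
    future = results[task_id]
    return future.result() if future.done() else "PENDING"

tid = submit(21)
# Right after submit, status(tid) is most likely "PENDING"
executor.shutdown(wait=True)
print(status(tid))  # 42
```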

~~~
adamcharnock
You’re spot on really. Having the front end wait for a background task to
complete broadly defeats the purpose. There are some caveats though: if you’re
using an async/threaded web server then it may not matter that you have a
pending request hanging around as your web server is free to continue serving
other requests. It also may be that you need to run the task on specialist
hardware for some reason.

Really though, I think a lot of people use celery for offloading things like
email sending and API calls which, IMHO, isn’t really worth the complexity
(especially as SMTP is basically a queue anyway). Of course, YMMV depending on
your use case.

However, I find it is often more worthwhile for:

1\. Tasks which take a long time to run

2\. Tasks which need to happen on a schedule, rather than in response to a
user request.

There is an option 3 too, which is for inter-software communication. Eg events
or RPCs, but I found Celery to be very much a square-peg-round-hole for this,
which is why I developed Lightbus
([http://lightbus.org](http://lightbus.org)). Lightbus also supports
background tasks and scheduled tasks. /plug

~~~
doliveira
It's not just about the time the operation takes, it's about reliability. Even
if sending an email synchronously doesn't usually take more than a few
milliseconds, you still need to handle cases like servers failing in the
middle of the request, temporary upstream unavailability, an expired API key,
account limits being reached, etc...

Honestly, I think that almost anything that doesn't depend directly on the
current state of your infrastructure should be done asynchronously. I've had a
lot of issues with systems that start out doing everything synchronously:
you'll probably need to refactor them to be asynchronous in emergency mode
during a crisis.

~~~
marcosdumay
For email, that's why you set a relay within your control, that will accept
messages without a fuss and send them around following SMTP conventions.

But anyway, how is your application supposed to respond after any of those
failures? Is it just supposed to ignore the failure and carry on as if
nothing happened, leaving your users in the dark? Is it supposed to reliably
log every task so that it can retry anything that fails and, in the worst case,
feed failures into some monitoring system/process? Or is it supposed to inform
the user of any success or failure before the user can move on?

Queue software is only a good match for the first. For the second you will
need to roll your own interface with the monitoring system anyway, so it's
much easier to roll your own queues and get control of everything. The third
is best done synchronously, regardless of the nature of the process or how
long it takes. But the funny thing is, I have never seen the first situation
in the wild.

------
nickjj
If anyone is looking for another Celery post that goes over common web
development use cases for using Celery there's:
[https://nickjanetakis.com/blog/4-use-cases-for-when-to-use-celery-in-a-flask-application](https://nickjanetakis.com/blog/4-use-cases-for-when-to-use-celery-in-a-flask-application)

The above post walks through sending emails out with and without using Celery,
making third party API calls, executing long running tasks and firing off
periodic tasks on a schedule to replace cron jobs.

There's links to code examples too in an example Flask app which happens to
use Docker as well.

------
doteka
They mostly need Celery and Redis because in the Python world concurrency was
an afterthought. In most other languages you can get away with just running
tasks in the background for a really long time before you need to spin up a
distributed task queue. In Python, I’ve seen Celery setups on a single machine.

~~~
mattbillenstein
You still should use some sort of work queue - your application process may
need to restart (deploys), or for a period of time work could overflow the
amount of available resources (bursts), so having some place to put a task
before it gets processed is useful regardless of the underlying concurrency
primitives of the language.

~~~
acjohnson55
It may not make sense to retain jobs across deployments. What if the contract
of the job is changed by the code being deployed? Might be easier to keep it
all in-process, letting queues drain in a graceful shutdown.

~~~
mattbillenstein
I haven't found that to be the case typically -- you could always serialize
some information into the task to check for things like this.

Also consider what happens if the machine running that process just disappears
and the process dies. Putting work into a task queue stores it durably until it
can be processed, so that it's not lost in the typical "that machine/instance
died" scenario.
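The durability point is the crux: a broker persists the task before anyone starts working on it. A toy file-backed illustration of the idea (stdlib only; no locking or per-message acks, so not production-grade):

```python
import json
import os
import tempfile

class DurableQueue:
    """Append tasks to disk first, so a dead worker doesn't lose them."""

    def __init__(self, path):
        self.path = path

    def put(self, task):
        # The task hits disk before we return: a crash can't lose it
        with open(self.path, "a") as f:
            f.write(json.dumps(task) + "\n")

    def drain(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            tasks = [json.loads(line) for line in f]
        os.remove(self.path)  # real brokers ack per message, not per file
        return tasks

path = os.path.join(tempfile.mkdtemp(), "queue.jsonl")
q = DurableQueue(path)
q.put({"kind": "send_email", "to": "user@example.com"})
# ... the process restarts here; the task is still on disk ...
pending = DurableQueue(path).drain()
print(pending)  # [{'kind': 'send_email', 'to': 'user@example.com'}]
```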

------
trboyden
Excellent craftsmanship of a helpful blog. Very reminiscent of the style used
by the Head Rush Ajax
([http://shop.oreilly.com/product/9780596102258.do](http://shop.oreilly.com/product/9780596102258.do))
book O'Reilly published back in 2006, and the rest of the Head First series.

------
memco
I think it’s really important to understand task queues and workers, but my
experience working with these particular tools isn’t exactly fun. I inherited
a system built on Celery, RabbitMQ and Nameko, and I’d be interested to hear how
people set up their systems to debug and test new tasks. I’m currently using a
manually added rdb.set_trace to telnet in to a debugging session, but I’d
prefer to use an IDE so I can modify code while debugging. Anyone have any
tips?

One thing I would caution against is putting the business logic in the
task logic. This is obviously up to whoever sets up the tasks, but it seems like
most of the tutorials don’t mention how painful this can make testing,
especially once you start making chains, chords and subtasks.

~~~
rukittenme
For testing you can set "CELERY_ALWAYS_EAGER" to "True" in your config.
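For reference, a sketch of that config (since Celery 4 the lowercase `task_always_eager` spelling is preferred, though the old uppercase names are still recognized):

```python
# Tasks run inline in the calling process, synchronously,
# so tests need no broker and no worker.
CELERY_ALWAYS_EAGER = True
# Make eager tasks re-raise exceptions instead of swallowing them:
# handy in tests.
CELERY_EAGER_PROPAGATES_EXCEPTIONS = True
```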

------
dralley
Does anyone have enough experience with alternatives to Celery to give a good
comparison of Celery vs. Dramatiq vs. RQ?

------
timwis
Cool article! But why do you need a database _and_ a message queue? I would
think the message queue is the main thing, and a database is only necessary if
you want long-term persistence. Or you could just use a database as a message
queue.

------
bryanrasmussen
I think the McDonald's order-by-touch-screen also has the benefit of seeming
to take less time, because you are not standing in line behind someone. Even if
it takes the same amount of time, psychologically it seems shorter.

------
kantye
Very noice! I am not normally a huge fan of stick figures (a la
waitbutwhy), but this was very pleasant to read :D And noice work picking an
example I can relate to

------
nurettin
Celery is good for distributed and persistent message queues which can be
monitored. If you just need multiprocessing, use a multiprocessing pool, it
comes with python.

------
quezzle
I only use celery for sending out emails. It’s overkill.

I wonder how many other people have celery just for email.

------
tchaffee
McDonald's is very often a love-it-or-hate-it divide, as you can already see
from some of the comments here. You could prevent that distraction by using a
generic takeout restaurant and asking people to imagine their favorite. The
article would resonate with a larger audience, and the comments would be higher
quality.

~~~
ljvmiranda
Hi, author here. I definitely didn't intend or expect that! I'm mostly drawing
from my own experience and that insight while ordering food inside McDo.

~~~
tchaffee
As someone who also does technical writing, I agree you should draw from your
own experience.

It can be hard to find the right analogy. If the subject of the analogy is
enough of a hot topic, it will get attention itself. As you can see, you've
got comments in here even talking about how McDonald's isn't faster with their
new system, or how they could have better optimized for customers, links to
articles about McDonald's business model etc. Some of the earlier negative
comments about McDonald's were deleted - probably due to downvotes. Since my
advice to other writers was sincere and I believe useful, I'm keeping my
comment.

For sure I'm not expecting you to change your article. Just hoping that my tip
might help you with future technical writing. If not, no worries.

------
alanfranz
And, of course, you need multiple separate components because Python/Flask has
no central "application" concept; there are multiple, stateless processes.

Had you got e.g. a Java app running with a multithread application server
model, you could serve and process everything within a single process. No
Celery, no Redis, no MQ.

This doesn't mean that the above stack has no use. But, whenever picking a
tech, you should understand the use case. The "simpler" Python/Flask solution
has an increased complexity when the task at hand is not simple anymore.

~~~
icebraining
> Python/Flask has no central "application" concept,

They do, it's a WSGI application. Flask has been multithreaded for many years.
Python also has multi-process queues that don't need an extra process.

~~~
alanfranz
WSGI is purely an interface between a webserver and Python. What does WSGI have
to do with state?

The application server model makes it so there's a running application, with a
state, and which exposes an HTTP endpoint.

> Flask has been multithreaded for many years

So? You run a blocking thread to perform a long-running task in Flask? With
Python? Try it, then report what you find.

Flask/Django are mostly designed to work with a stateless approach. Nothing
wrong with that, but it's got drawbacks.

> multi-process queues that don't need an extra process.

More processes usually imply more complexity. And still, since you don't have
a central application with a state, you NEED an extra piece to manage the
result from the queue.

~~~
icebraining
> So? You run a blocking thread to perform a long-running task in Flask?
> With Python? Try it, then report what you find.

I did it for years. It works just fine. The GIL is essentially like running an
app on a single core, which works just fine for many use cases. CPU cores are
quite powerful.

> More processes usually imply more complexity.

Right, but I'm only saying you _can_ have more processes without requiring
Redis.

> And still, since you don't have a central application with a state, you NEED
> an extra piece to manage the result from the queue.

A regular thread can do that.

------
pupdogg
With all due respect, my recent experiences ordering at McDonald's have been
nothing short of horrible! As nice as the kiosk is, it has taken the
accountability factor for your order out of the picture. I vividly recall
getting blank stares from employees when asked "how much longer until my order
is complete?". My past few visits (11 to be specific from 11/19 thru 2/20)
have yielded 8 minutes of wait time on average. This is ordering inside the
facility and at 5 different locations. Last I recall, it used to be a lot
faster...I think between 1-3 mins tops! I can't say the same for the drive-thru
though...it seems like orders from the drive-thru are always tagged with a
higher priority. I remember this from when I was in my teens in the late 90s.
Though it looked like a mainframe system, McDonald's did have a countdown
timer that would initiate on orders. I'm all for tech and automating
queues...but humans are complex beings and until the gap is bridged, I think
we have a lot more room for improvement!

~~~
jessaustin
Drive-through is higher priority at most restaurants, because the customer can
drive away after ordering but before paying.

