
Does anyone have experience running Flask at scale? I used it once, for a small internal reporting service, and was shocked to find that it was designed to process only a single request at a time, per process. Is this a problem in the real world, or do decent caching rules and query design render it a non issue?

What do flask apps do when they want a live connection to the client, or need to serve a heavy (slow) request? Communicate with a Node websocket server over a queue, and share a database?

I don't mean to disparage Flask. Their goal is to make it simple to stand up a site with minimal boilerplate or bloat, and they succeeded at that.




You would typically use something like nginx as a proxy, in conjunction with uwsgi for managing a number of workers, and then offload slow operations to a task queue via redis or something similar. Caching obviously helps if it's applicable and it's also easy to expand to multiple servers with a load balancer. Websockets are a little bit more complicated but definitely possible.
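
A minimal sketch of the task-queue part, assuming Celery with a local Redis broker (the module and task names here are illustrative, not from any particular setup):

  # tasks.py
  from celery import Celery

  queue = Celery("tasks", broker="redis://localhost:6379/0")

  @queue.task
  def generate_report(report_id):
      # the slow work runs in a Celery worker process, not the web process
      ...

  # in the Flask view, enqueue and return immediately:
  #   generate_report.delay(report_id)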

As a side note, the synchronous request processing is more a consequence of Python rather than flask itself. I've personally found that I build more scalable things in Python, compared to something like node, because it lends itself to scalable architecture decisions. You can do a lot of things in node that are super convenient when you have a single server but that require major changes when you expand.


> I've personally found that I build more scalable things in Python, compared to something like node, because it lends itself to scalable architecture decisions. You can do a lot of things in node that are super convenient when you have a single server but that require major changes when you expand.

That's one of the major plus points of PHP too.



So, move expensive operations outside of the web processes. Looks like the Flask community has worked out good ways of doing that. Seems legit, answers my question :)


If you want async, use gevent or another async implementation and you'll get ~the same as node. If you want parallelism, you'll need multiple processes. Every framework in whatever language should move expensive operations outside of request-response.
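
A gunicorn config file (which is itself Python) can pair the two; a sketch, with myapp:app standing in for your actual module and callable:

  # gunicorn_conf.py
  import multiprocessing

  bind = "127.0.0.1:8000"
  workers = multiprocessing.cpu_count()  # parallelism: one worker process per core
  worker_class = "gevent"                # async: cooperative greenlets within each worker

  # run with: gunicorn -c gunicorn_conf.py myapp:app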


I ran a small site that was written in Flask and received a couple of million requests a day without any problems.

I used Nginx as an HTTP proxy and gunicorn for WSGI. It worked decently well, although it was pretty resource intensive (CPU, RAM).

Today you can use something like Caddy[0] and Gunicorn[1].

[0] https://caddyserver.com

[1] http://gunicorn.org
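
Either way, the Flask side stays tiny; a sketch (the module and route are illustrative):

  # app.py
  from flask import Flask

  app = Flask(__name__)

  @app.route("/")
  def index():
      return "hello"

  # gunicorn loads the module-level "app" object:
  #   gunicorn --workers 4 --bind 127.0.0.1:8000 app:app
  # with Caddy or Nginx in front, reverse-proxying port 80 to 8000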


We use Flask backed by apache mod_wsgi running in containers or VMs at work for our microservices and it's quite performant. I've run stress tests with this setup using the siege load-testing tool on FreeBSD (awesome btw) at 1000+ concurrency and it ran without a hiccup.

I can't say at what level it would break, but I'm confident it would perform as well as Django. It's designed to be thread safe, so assuming you're following the 12-factor app principles and using the process model with attached storage for persistence running elsewhere, you can run multiple threads (apache vhost multi-threading notwithstanding); see the sketch below.
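
A sketch of what that looks like in practice, assuming config from the environment and a per-request handle to an external store (Redis here is just an example backing service):

  import os
  import redis
  from flask import Flask, g

  app = Flask(__name__)
  REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0")

  def get_store():
      # keep the handle on flask.g, per request, so threads never share mutable state
      if "store" not in g:
          g.store = redis.Redis.from_url(REDIS_URL)
      return g.store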


Wait, Caddy supports WSGI now, or you're replacing nginx with Caddy and using gunicorn for WSGI?


Using gunicorn (even with Nginx).


Flask's built-in web server is for debug/development use. Run your Flask apps under gunicorn, twisted web, or any of the other supported servers in production.


Yeah, I was running it under gunicorn. But that still leaves you with a small number of concurrent connections. If you want to have long connections, for something like long polling or websockets, then being limited to one request per CPU core seems a little sketchy.

Mind you, it could just be that I take high concurrency for granted. I build most web stuff on Node or Clojure, but now that I think about it, apps that require long quiet connections are actually not the norm.


> for something like long polling or web sockets

Are you imagining this scenario, or are you actually using Flask for long polling? What does your Flask websocket setup look like?

FYI, Flask internals do not stop you from using thousands of threads or greenlets for concurrency. And the web request-response model is embarrassingly parallel on multi-core: just spawn one worker per core.

For a simple API service, if you cannot handle 3K rps per Flask instance, you are doing it wrong.


Python does high concurrency just fine; this example just isn't using it.

  from gevent import monkey
  monkey.patch_all()  # patch the stdlib before other imports so blocking calls yield to greenlets


I have used Flask at significant scale for REST api requests.

I haven't done web sockets with it - what work I've done with sockets has been in Node.

For building REST apis, it doesn't get easier, IMO. It's very straightforward, it scales well, and its simplicity makes troubleshooting a reasonable task.

Its appropriateness for slow requests may be questionable, but before spending too much time on a more robust solution it's worth looking into why requests are slow in the first place. Caching, message queues, etc. are easy solutions to implement. Data store optimization is generally a quick and easy win that should be done regardless. When it gets to the point where Python is the limiting factor, it's easy to replace because the client-facing front end is generally a proxy.
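
For the caching piece, something like the Flask-Caching extension keeps it to roughly one decorator per view; a sketch (the route and timeout are illustrative):

  import time
  from flask import Flask
  from flask_caching import Cache

  app = Flask(__name__)
  cache = Cache(app, config={"CACHE_TYPE": "SimpleCache"})

  @app.route("/report")
  @cache.cached(timeout=60)  # recompute at most once a minute
  def report():
      time.sleep(2)  # stand-in for a slow query or computation
      return "report"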


I run some Flask microservices and the key has been to use gunicorn and nginx. It scales up very well. It's not as quick as some other Go microservices I have, but Python has an advantage in terms of libraries for my use case. Flask is simple and that helps to keep things under control.


IIRC, Flask deploys straight to AWS's Elastic Beanstalk (gunicorn) with minimal configuration; I even think it's given as an example in EB's docs. I've deployed an EB-hosted Flask instance (2 EC2 instances behind a load balancer) in production with zero issues.


Nobody ever uses the built-in Flask server for production. The common deployment pattern is to load the Flask app into a WSGI or asyncio server, which will then handle requests with scalable threading/process models.

Look into uwsgi or gunicorn, and you'll never look back :)


We've made the mistake of using Flask in production...now we're trying to switch to aiohttp as fast as possible.

WSGI is partly to blame.

A web technology derived from '93 has no place in 2016.


The problem is not Flask. Try to understand the technologies you are working with. If you cannot make Flask work for you, chances are you won't be able to make it work with aiohttp/whatever.

Flask is used in production in sites serving thousands of requests/second.


Can you say more? I'm interested.


Async flask can do quite a bit of work with a single thread.


Use gunicorn for a simple solution.


It is supposed to process a single request at a time. If you want more, instantiate several workers.



