

Deploying Python Without Downtime - philipcristiano
http://philipcristiano.com/2013/06/27/python-gunicorn-deployment.html

======
jonny5532
A neat trick with uWSGI is to dynamically set the number of worker processes
to 0, which causes incoming requests to hang whilst waiting to be processed by
no-longer-extant workers.

As long as you can apply DB migrations before the waiting requests time out
(and then set the number of workers back to something sensible) you can
perform quite major upgrades without even dropping connections.

~~~
otterley
Does the master process actually accept() connections while the workers are
being restarted? If not, clients will eventually fail to connect after the
accept queue length has been reached. This can happen very quickly as accept
queues tend to be pretty short (in the hundreds, if not less).
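
To illustrate the accept queue being discussed, here is a minimal Python sketch, not anything taken from uWSGI or Gunicorn. It shows the kernel completing a connection on a listening socket before any process has called accept(). The function name and the backlog of 128 are assumptions chosen for the demo; the effective queue limit also depends on OS settings.

```python
import socket


def connect_without_accept(backlog=128):
    """Show that a client can connect and send before accept() is called."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(backlog)         # backlog is a hint for the pending-connection queue
    port = server.getsockname()[1]

    # Nobody is calling accept() yet, but the connection still completes
    # and the payload sits in the kernel's buffers.
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    client.sendall(b"ping")

    # Later (for example once workers are back) the queued connection is accepted.
    server.settimeout(5)
    conn, _ = server.accept()
    data = conn.recv(4)

    for sock in (conn, client, server):
        sock.close()
    return data
```

Once the queue is full, further clients are no longer queued, which is the failure mode described above.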

~~~
philipcristiano
It doesn't accept. I'm not sure what the queue length is; either 100 or 1k, as
a guess.

------
dkuebric
One thing worth noting about these rolling restarts that I didn't see in your
post: if the new code isn't completely backwards-compatible, you can end up
with bad states from having a mix of workers running. This negates a lot of
the value of the rolling restart because it creates other failure modes.

For example, if you introduce a new AJAX endpoint in the release, and a client
hits a new worker that generates an HTML page calling it, but 90% of your
gunicorn workers are still serving the old version of the app, there's a 90%
chance that the request will 404.
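
The arithmetic behind that figure can be sketched as below, assuming requests are spread uniformly across workers (real load balancing may differ). The function name is made up for illustration.

```python
def stale_hit_probability(old_workers, new_workers):
    """Chance that a follow-up request lands on a worker still running old code.

    Assumes every worker is equally likely to receive the request.
    """
    total = old_workers + new_workers
    if total == 0:
        raise ValueError("need at least one worker")
    return old_workers / total
```

With 9 old workers and 1 new one, `stale_hit_probability(9, 1)` gives 0.9, and the figure drops as the rolling restart progresses.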

~~~
philipcristiano
That's definitely an issue, and you have to weigh the time it takes to work
around it against how long you can tolerate the service being unreachable.

A new AJAX endpoint is pretty simple to work around with 2 releases, one to
add the endpoint and another to use it. Most changes probably aren't that
fortunate.

------
buttscicles
Rather than killing Gunicorn's child processes, I prefer to send SIGHUP to the
master process.[1]

It's as simple as

`pkill -f --signal HUP "gunicorn: master \\[procname\\]"`

[1] https://gunicorn-docs.readthedocs.org/en/latest/faq.html?highlight=HUP#how-do-i-reload-my-application-in-gunicorn
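
The same reload can be triggered from a Python deploy script. This is a rough sketch that assumes Gunicorn was started with its `--pid` option so the master's PID is in a file; the helper names and the pidfile path in the usage note are hypothetical.

```python
import os
import signal


def read_pid(pidfile):
    """Read an integer PID from a pidfile."""
    with open(pidfile) as f:
        return int(f.read().strip())


def reload_master(pid):
    """Send SIGHUP to the given process (intended here: the Gunicorn master)."""
    os.kill(pid, signal.SIGHUP)
```

Usage would look like `reload_master(read_pid("/var/run/gunicorn.pid"))`, with the path adjusted to wherever your pidfile actually lives.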

~~~
philipcristiano
The issue with that is that it tends to stop all the workers and then start
them again, rather than restarting them one by one. Since some applications
can take a minute to start, the socket stays open but no connections are
accepted until the workers are running again.
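
A one-by-one restart could be sketched roughly as follows. This assumes the master respawns any worker that exits and that a fixed pause is long enough for the replacement to boot; both are assumptions to check against your own setup. The function and its `kill`/`sleep` parameters are hypothetical and exist only to keep the sketch testable.

```python
import os
import signal
import time


def rolling_restart(worker_pids, pause=60.0, kill=os.kill, sleep=time.sleep):
    """Stop workers one at a time, pausing after each one.

    The pause is a crude stand-in for "wait until the replacement worker
    is serving requests"; a real script would check readiness instead.
    """
    for pid in worker_pids:
        kill(pid, signal.SIGTERM)  # ask this worker to shut down
        sleep(pause)               # give the master time to start a replacement
```

Only one worker is down at any moment, so the remaining workers keep accepting connections while each replacement boots.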

