Ask HN: Django Deploy Recommendations
109 points by juanefren on Oct 23, 2010 | 38 comments
I have an app that uses Django (Python) and PostgreSQL.

Right now it is running under Linux/Apache/mod_wsgi; however, I have read about Nginx and Gunicorn (also uwsgi and others) as a better option...

I would like to read your recommendations (advantages and disadvantages) about deploying Django.

PS: I am mainly a developer, so I lack a lot of sysadmin knowledge.

I recommend nginx in front of gunicorn. This way nginx can sit up front and do what it does best: serve static media files and buffer requests and responses.

Set up gunicorn under some kind of process manager. I use daemontools, but runit, upstart, monit, and others can work very well too.

For updating code on the server, I'm a big fan of keeping it simple, and to me that means writing a small shell script that ssh's into your machine and runs the proper commands to update the code and send a HUP signal to gunicorn.
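
A minimal sketch of such an update script, written in Python rather than shell so it matches the other code in this thread. The host name, application directory, and pidfile path are placeholders, and it assumes gunicorn was started with a pidfile; the function only builds the ssh command so you can inspect it before running it.

```python
import shlex


def build_deploy_command(host, app_dir, pidfile):
    """Build an ssh command that updates the code and HUPs gunicorn.

    All three arguments are hypothetical placeholders for your own setup.
    """
    remote = " && ".join([
        "cd " + shlex.quote(app_dir),
        "git pull origin master",
        # gunicorn's master process gracefully reloads its workers on SIGHUP
        "kill -HUP $(cat " + shlex.quote(pidfile) + ")",
    ])
    return ["ssh", host, remote]
```

To actually deploy, pass the returned list to subprocess.check_call.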

You can even trigger this script from a git hook so that your code is updated automatically: a post-commit hook runs after every local commit, while a post-receive hook on the remote runs every time you push. If you have a robust test suite, you can set up a Hudson instance to run this command when all tests pass.

If you plan to have long-lived connections (comet, many requests out to third party services), then make sure you set up gunicorn to use Eventlet workers. What that will do is transform your code into asynchronous evented code using coroutines. But you most likely won't have to worry about that.
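
Gunicorn's config file is plain Python, so selecting the Eventlet worker can be sketched as below. The file name, bind address, worker count, and pidfile path are assumptions for illustration only, and the eventlet package has to be installed for this worker class to load.

```python
# gunicorn.conf.py (hypothetical file name; all values are placeholders)
bind = "127.0.0.1:8000"        # nginx would proxy to this address
workers = 3                    # number of worker processes
worker_class = "eventlet"      # use Eventlet workers instead of the default sync workers
pidfile = "/var/run/gunicorn/myapp.pid"  # lets a deploy script find the master process
```

Pass the file to gunicorn with its -c option.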

supervisord is also quite nice for managing processes. You can restart individual processes through the supervisorctl command, e.g. "supervisorctl restart my-gunicorn-process". This can easily be rolled into a fabric command.

Beautiful summary of a nice setup indeed. Although, I would run it in a virtual environment; don't you think that's a good idea? By the way, Eventlet workers sound nice. How do they compare to using Twisted for async I/O?

Yeah virtualenv is incredible, I use it for everything. Eventlet is comparable to Twisted, but with a slightly different ideology. You really have to try them both and decide which one fits better to your brain.

For updating code, you use Fabric! Nothing else compares.

Self plug: kraftwerk at www.kraftwerk-wsgi.com runs exactly this setup, but automated (even cloud server setup).

Another great tool that hasn't been mentioned yet is Fabric. Combined with a VCS it makes pushing changes to the server an absolute breeze. With the simple fabfile shown below I only need to do "fab update" to push all of the latest code to the server:

  #!/usr/bin/env python
  from fabric.api import cd, env, local, sudo

  env.hosts = ['example.com']

  def update():
      # Run the checks locally before touching the server
      local("./manage.py validate")
      local("./manage.py test core")
      with cd("/var/www/example"):
          sudo("git pull origin master", user="www-data")
          sudo("./manage.py migrate --all", user="www-data")
          # Touching the WSGI script makes mod_wsgi reload the app
          sudo("touch apache/django.wsgi", user="www-data")

Here's what I've found that helps tremendously:

-Keep your software in a DVCS of some sort. I find Git and Mercurial to be great.

-Use virtualenv to abstract away from your current environment's Python distribution. Start clean and download the packages you need.

-Create a requirements file listing the packages that you need for your program. Put it in the same format that the 'pip freeze' command outputs, so that installing a new environment is quick and easy.

-Set up a local configuration file (not managed by version control) and a base configuration file with all the settings that are immutable. Import the local config file within settings.py so as to avoid local setting conflicts.

-In production, set up an nginx frontend to serve static files, and route all the Django urls to an Apache backend (or Tornado, or whichever App server you may want to use).

-I haven't tried anything else, but WSGI is super intuitive and easy to use.

-If you need it, try using Django-South for versioning database schemas. Do take into account that it has a bit of a learning curve.

-You don't need to put your python files in /var/www, any directory will do.
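
For the local-configuration point in the list above, one possible way to layer an optional, unversioned module over the base settings is sketched here. The helper name load_overrides and the module name settings_local are hypothetical; many projects simply use a try/except around "from settings_local import *" at the bottom of settings.py instead.

```python
import importlib


def load_overrides(module_name, namespace):
    """Copy UPPERCASE names from an optional module into namespace.

    Returns True if the module was found, False if it does not exist.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # No local file on this machine: keep the base settings as they are
        return False
    for name in dir(module):
        if name.isupper():  # Django settings are uppercase by convention
            namespace[name] = getattr(module, name)
    return True
```

At the end of settings.py this could be called as load_overrides("settings_local", globals()).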

These are some excellent points, the only things we do differently are:

- We keep a local settings file for each environment, and those /are/ versioned. We have a /settings_local directory which contains each of the variants (localdev/dev/staging/df/live). The appropriate one is symlinked to /settings_local.py, which in turn is imported into settings.py.

- We bypass Apache entirely and just plug Nginx FCGI into Django directly.

- We have a separate pip requirements file for each environment (also kept in source control)

- We use Puppet to configure our systems. Personally though, I have found Puppet to have an exceptionally steep learning curve, so you may want to shop around.

I hope that helps!

I use fabric for deployment, testing, log maintenance, database backups. For almost everything on remote machines.

The learning curve is not steep.

+1 for fabric.

If you haven't used fabric, give it a try. It's an extremely simple API for performing remote management, deployment, etc.

I've tried numerous tools but nothing compares to fabric. It is just so easy and it always works. I can't imagine not seeing a fabfile.py in my project's deploy/ dir anymore, it just wouldn't be right.

> - We keep a local settings file for each environment, and those /are/ versioned. We have a /settings_local directory which contains each of the variants (localdev/dev/staging/df/live). The appropriate one is symlinked to /settings_local.py, which in turn is imported into settings.py.

I didn't read your comment (thoroughly) before posting my comment above, so I was parroting your point about organising project settings, however there are a few differences between our approaches. And I'd be interested to hear what you think about them. I don't work that actively with django at the moment, and never really had the opportunity to use my above system in a commercial project so maybe (probably) there are some gotchas I haven't thought about!

> -Set up a local configuration file (not managed by version control) and a base configuration file with all the settings that are immutable. Import the local config file within settings.py so as to avoid local setting conflicts.

Have you tried keeping your settings in a package rather than in a module? Most introductions to django use settings.py to keep things simple (works out of the box using manage.py), however there is another (better) way!

You can store your settings in a package instead: you have a settings directory, and inside it modules that each represent a configuration. So you could have settings/development.py rather than settings.py; you then just point the DJANGO_SETTINGS_MODULE environment variable at the correct configuration on each machine.

There are a few perks to managing settings this way. First, you can extend existing settings: for example, if you want some default settings that all configurations share, you can have a settings/shared.py and then do `from .shared import *` in each of your configurations. That gives you access to all the existing settings, so you don't have to repeat yourself if you want to, say, add some middleware or an application (think debug toolbar).

And another benefit is that you are able to manage your settings through your version control system. Which I understand is not always ideal, however my guess is for most private projects this will be the best way of doing things! It also just seems to me like a much more pythonic way of organising your settings. (You can also do this for urls, however there's not as much benefit there, given that urls probably won't change from machine to machine that often.)
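
A rough sketch of the extend-the-shared-settings idea follows. The two files are collapsed into one block so it runs standalone; in a real project the second half would live in settings/development.py and start with a star import from the shared module. The file names and the specific apps are assumptions for illustration.

```python
# --- settings/shared.py (hypothetical): defaults every configuration uses ---
DEBUG = False
INSTALLED_APPS = (
    "django.contrib.auth",
    "django.contrib.contenttypes",
)
MIDDLEWARE_CLASSES = (
    "django.middleware.common.CommonMiddleware",
)

# --- settings/development.py (hypothetical) ---
# In the real file the first line would be: from .shared import *
DEBUG = True
# Extend the shared tuples instead of repeating them
INSTALLED_APPS = INSTALLED_APPS + ("debug_toolbar",)
MIDDLEWARE_CLASSES = MIDDLEWARE_CLASSES + (
    "debug_toolbar.middleware.DebugToolbarMiddleware",
)
```

Each machine then selects its configuration by setting DJANGO_SETTINGS_MODULE to something like myproject.settings.development, where myproject is a placeholder.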

I have recently moved to a settings package like this and it has opened up a lot of doors in terms of workflow.

I run nginx as a static file server and reverse proxy to uwsgi (via a local socket). I find the setup to be simpler than Apache/mod_wsgi, though both approaches meet my performance needs without any trouble. I run PostgreSQL as my database. (This is on a reasonably low-traffic but fairly data-heavy site. It all runs comfortably on a Linode 512.)

I'll just reiterate what Daishiman said regarding DVCS (like git or mercurial), virtualenv, and South. Use them! All three will make your life much easier 6 months from now.

http://www.djangy.com - in private beta, by wednesday we're hoping to invite several hundred more users.

if you're in a time crunch, email me (dave@djangy.com)

EDIT: email

Having had a quick browse it looks like Heroku for Django, which would be absolutely awesome!

Are you using EC2 behind the scenes, or something else?

You mention database support, but I couldn't see anything about specific vendors. Will there be support for spatial extensions? I'm specifically thinking about Postgres and PostGIS.

Looks like a great project though - I've signed myself up for a beta invite!

Thanks, we've been getting GREAT feedback and iterating rapidly.

Yes, we're using EC2 on the backend, and it's distributed and scalable. We currently give each app a MySQL database, but we've been getting a ton of requests for other database support, so it's something we're looking into. Unfortunately there are quite a few other requirements to plow through before we get to that particular feature.

Look for updates this week.

Is this the project that was on HN a while ago asking who was interested?

If I remember correctly people were giving the OP a hard time about using sqlite and not turning off debugging in a live environment. This project looks awesome and I'd really love an invite to the beta.

Yes it is the same project, I just looked it up. I was curious about this myself because I remembered being excited before the debug issue came up. Looking back at the thread, he owned up to it and assured that the actual project doesn't make such mistakes. The site does look a lot nicer than I recall, so I'm going to check it out.


The site looked pretty pathetic when they first had it up. It's beautiful now and this sounds like it's going to be pretty legit. I'm very excited.

nginx in front of gunicorn is fast, simple, efficient with resources, and reasonably scalable. Today, there's nothing else I would recommend.

Of course, use [d]vcs, setup.py, virtualenv, pip, and south (and a reasonable test suite!). Using these along with gunicorn as mentioned above has streamlined my deployment to a "one-click" process that leverages all of these packages to create an isolated, repeatable, rock-solid [re-]deployment with nothing more than a generic Makefile.

I've been doing Django development and deployment for over four years now and it is the closest thing to a "silver bullet" I've found.

I was in the same boat a few months ago and found this PyCon workshop by Jacob Kaplan-Moss extremely helpful



The way I usually deploy is nginx => Django FastCGI via flup. Make sure to use the prefork method instead of the threaded one. On top of that I run each Django project inside its own virtualenv [1], where virtualenvwrapper [2] simplifies the process. Since all my servers are Debian-based, I use my own creation called flint [3] to automatically start/stop the FastCGI processes on boot/shutdown. For deploying code I use fabric [4] and then simply $ flint restart_all or $ flint restart myproject to reload the code.

[1] virtualenv - http://virtualenv.openplans.org/

[2] virtualenvwrapper - http://www.doughellmann.com/projects/virtualenvwrapper/

[3] flint - http://igorpartola.com/projects/flint

[4] fabric - http://docs.fabfile.org/0.9.2/


Buildout. It's like virtualenv, but it also handles downloading the correct required package versions. It gives you an isolated shell and Django manager in your project directory. You can then distribute your project to different people and servers and ensure the correct version of Python, Django, etc. is used.

FWIW, I don't believe Jacob recommends zc.buildout any longer. virtualenv + pip is easier and more Pythonic instead of Zope-like.

He uses Buildout in conjunction with Fabric as recently as http://github.com/jacobian/django-deployment-workshop also linked on this page. I think a fair assessment is that buildout is much more powerful but also more complicated than using pip alone. It greatly reduces boilerplate for spinning up new sites for me now, but I could also see myself switching to creating some snapshots of pip only skeleton projects in the future.

Maybe try http://cloudsilverlining.org (http://cloudsilverlining.org/django-quickstart.html)

Anyway, Apache/mod_wsgi is perfectly fine and I wouldn't bother tweaking out that part unless you have an issue with it.

I'd strongly recommend using a cloud provider so you can test your deployment process. You can do it without a cloud provider, but you probably won't because it won't be trivially easy to create new servers.

Use varnish in front of apache. Why not nginx or lighttpd? Because apache has market share: there are modules for nearly everything. Going farther off the beaten path will just turn into a headache, as you'll eventually want or need functionality offered by an application with a larger market share.

We're running nginx proxying to Apache/mod_wsgi on RHEL. I've heard from the Parsely guys that they got their memory usage waaaay down by skipping Apache and just using the nginx equivalent of mod_python - I seem to remember that they weren't sure how well it would scale, though.

Most of the recommendations here are good. Par for the course in Django deployments and administration is virtualenv, Fabric, pip, and per-server settings files.

On top of those, we use Apache, mod_wsgi, Fabric (in a slightly-weird way which I'll get into below), mod_xsendfile, Mercurial, and a home-grown migration library.

Serving static files via nginx or Apache is fine, but generally requires that you copy them out of your various pluggable apps and into some static docroot on every deploy. We use mod_xsendfile instead (with another Django helper app, which I'm hoping to get onto Bitbucket in the next week or two) to directly serve static assets out of the 'media' directory of each installed Django app.
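
The helper app mentioned above is not shown, so the following is only a guess at the core idea: the application resolves the requested asset inside an app's 'media' directory and hands the file path to Apache through the X-Sendfile response header that mod_xsendfile looks for. The function name and arguments are hypothetical.

```python
import os


def xsendfile_headers(media_root, relative_path):
    """Return response headers asking mod_xsendfile to serve a file.

    Raises ValueError if relative_path would escape media_root.
    """
    root = os.path.realpath(media_root)
    target = os.path.realpath(os.path.join(root, relative_path))
    # Reject "../" tricks that point outside the app's media directory
    if not target.startswith(root + os.sep):
        raise ValueError("path escapes media root")
    # Apache replaces the (empty) response body with the file's contents
    return {"X-Sendfile": target}
```

A Django view would copy these headers onto an otherwise empty response and let Apache do the actual file transfer.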

Our use of Fabric is slightly non-standard, too, as I mentioned; instead of writing a single 'fabfile', we have a collection of development, testing, and deployment commands which use Fabric as a very high-level Python API for distributed command execution.

Regardless of the stack you choose, as a non-sysadmin, there are a few habits and practices I'd strongly recommend you keep in mind to avoid getting yourself into a painful place later:

First, set up a staging environment that looks as much like your production setup as possible. It can be on the same server, or in a local virtual machine, or (even better) on a spare server that can be pressed into duty if the primary ever goes down.

You should always have a recent dump of your production database loaded into this environment, and the ability to pull a more up-to-date snapshot in quickly. (This will help with recovering from major "whoopsies" in production, too, and force you to continually test your backups.)

Second, keep copious, detailed notes on everything you do while deploying, updating, or troubleshooting production issues. I'm literally talking about stuff like this:

  Created uploads directory:
  mkdir /var/www/uploads
  chown www /var/www/uploads
  chmod 755 /var/www/uploads

  Configured upload directory in Django settings:
  echo "UPLOAD_FILE_DIRECTORY = '/var/www/uploads'" >> /usr/local/deploy/myapp/src/myapp/settings.py

Some sysadmins I've worked with literally copy their entire .bash_history file for each session into a running log, though I find that tends to end up with a lot of noise ('cd ..; cd ..; pwd; ls; etc.') that doesn't help when you're trying to triage an issue.

I like to use simple text files (backed up in Dropbox, of course) for these notes, but a wiki is fine if that's your preference. It may seem like pointless duplication of effort at first, but grepping a directory full of notes to see what you changed is a much more reliable triage technique than counting on yourself to remember a bunch of details. This goes double when you're panicking in the middle of a production outage.

Beyond that, everything else is gravy. If you're using Apache/mod_wsgi now, I'd recommend you keep using it until you hit a real scaling limit, or have spare cycles to try out a secondary hosting setup post-launch.

(In case anyone's interested in that migration tool, it's on Bitbucket: https://bitbucket.org/rcoder/finch/overview )

You really should use nginx to serve static files and route everything else to apache. The setup is dead simple.

Actually, I had dinner with Graham Dumpleton (mod_wsgi author) recently when he was in San Francisco, and he was speculating about the benefits of having nginx handle all the requests and proxy the Django requests to Apache. In this setup, instead of a memory-expensive Apache thread/process staying open while input and output stream to and from the client, a relatively inexpensive nginx thread handles the request or response, buffering it and passing it on to the server or client only when the data transfer is complete.

Know of any tutorials appropriate for a Linux server noob (I'm comfortable with the Linux command line but haven't done much web deployment other than basic one-click xampp stuff)?

It's quite easy. Install nginx, update the config file, generally found in '/etc/nginx/sites-enabled/'

There are plenty of good nginx config examples out there, but to use it as a reverse proxy you want to do something like: location / { proxy_pass; }

You also need to change your ports.conf file for apache to make it run on a different port if you want nginx running on port 80.

linode's library and slicehost's tutorials are great for starters.

use apache mod wsgi for django and nginx as a reverse proxy for serving static content.

this way you get the apache you know and love as well as a minimal memory footprint.
