How to Write Dockerfiles for Python Web Apps (hasura.io)
44 points by praveenweb on Feb 27, 2018 | 22 comments

    Using an appropriate base image (debian for dev, alpine for production).
No, you don't. It's nonsensical to use a "convenient" development environment, and then make the production application work in something more bare-bones/stripped down/different. I've never used Alpine Linux, but even I know there are a LOT of subtle differences between it and Debian. Nontrivial stuff that matters, like C library implementation (GNU libc vs. musl) and init system (systemd vs. sysv init). You don't want that, because it opens up a multiverse full of potential, exclusively bad surprises that you'll be unprepared to debug.
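
A rough way to see which C library an image actually ships is to ask Python itself. This is only a sketch: `describe_runtime` is a made-up helper name, and `platform.libc_ver()` usually reports glibc on Debian-based images but tends to return empty strings on musl-based ones such as Alpine, so an empty result is best read as "not glibc" rather than proof of musl.

```python
import platform


def describe_runtime():
    """Collect a few facts that commonly differ between base images."""
    libc_name, libc_version = platform.libc_ver()
    return {
        "python": platform.python_version(),
        "machine": platform.machine(),
        # Empty strings usually mean "not glibc" (for example musl on Alpine).
        "libc": libc_name or "unknown (possibly musl)",
        "libc_version": libc_version or "unknown",
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

Running the same script in the dev image and in the prod image makes any mismatch visible before it turns into a surprise.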

Sort of defeats the whole point of a single consistent environment if you are going to change the base container itself.

I agree on the base image point. Although if we have Python apps in the image, it should not matter (both 2-alpine3.7 and 3-alpine3.8 work ok). Systemd vs sysv is not an issue in Docker containers since it's (mostly) not used. If you are using an init system, you are probably using tini (included in recent Docker) or supervisord (usually used for older/legacy C/C++ apps).

I can attest to that. We once had the issue that a python app running through gunicorn segfaulted with alpine while it worked fine with a centos image.

> Using an appropriate base image (debian for dev, alpine for production).

Seems to me like this is a good way to introduce subtle differences between dev and prod.

I'd just use the same base images for everything. Consistency is king.

I don't think alpine solves the process reaping problem. The only base image suitable for production I found so far is: https://github.com/phusion/baseimage-docker

Process reaping and signal forwarding are solved by using tini. You can call it from inside the container, by installing it and then changing the entrypoint/cmd. You can also specify it in your docker run/docker create command by adding "--init". I haven't seen the phusion base images in production in almost a year. Usually when an organization gets more experience in building images, they use better solutions than phusion images.
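
For anyone unsure what "reaping" means here: PID 1 inherits orphaned processes and has to wait on them, otherwise they linger as zombies. The Python below is a rough, Unix-only illustration of that one duty. It is not how tini is implemented, and `reap_children` and `wait_for_child` are hypothetical helper names.

```python
import os
import time


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, exit_code) tuples; exit_code is None when the
    child was killed by a signal instead of exiting normally.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none of them has exited yet
        code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
        reaped.append((pid, code))
    return reaped


def wait_for_child(pid, timeout=5.0):
    """Poll reap_children() until the given pid shows up or time runs out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        for reaped_pid, code in reap_children():
            if reaped_pid == pid:
                return code
        time.sleep(0.05)
    return None
```

A real init would call something like `reap_children()` from a SIGCHLD handler and also forward signals such as SIGTERM to the main process; using tini or `--init` gives you both without writing any of this.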


Don't use Phusion base images. If you need ssh, logrotate, etc., use lxd or a vm.

I am not currently running Docker in prod (but I am about to). You, sir, pointed me in a good direction. Thank you!

Correct me if I'm wrong, but wouldn't just running a container with --init also reap zombie processes generically?

If you are not concerned about the size of your production image and if your modules don't compile with alpine builds, you can stick to the same debian base image.

Or you can use alpine for dev if you are not concerned about debugging too much.

I would say it boils down to use cases and priorities.

Especially these two, with alpine's different C compiler for everything. Perhaps dev on a debian base and production on one of those single-binary-only containers.

One thing that’s missing from the RUN command is ‘-u’, for unbuffered output (essential when using the default loggers).
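
To make the effect of `-u` concrete: when stdout is a pipe (as it is under Docker), Python buffers it unless `-u` or `PYTHONUNBUFFERED` is set. The sketch below assumes Python 3.7+, where the text layer exposes `write_through`; `stdout_write_through` is a hypothetical helper name used only for this illustration.

```python
import os
import subprocess
import sys

# One-liner run in a child interpreter whose stdout is a pipe.
CHECK = "import sys; print(sys.stdout.write_through)"


def stdout_write_through(interpreter_args=(), unbuffered_env=False):
    """Report whether a child interpreter writes stdout straight through."""
    env = dict(os.environ)
    env.pop("PYTHONUNBUFFERED", None)  # start from a known state
    if unbuffered_env:
        env["PYTHONUNBUFFERED"] = "1"
    result = subprocess.run(
        [sys.executable, *interpreter_args, "-c", CHECK],
        capture_output=True,
        text=True,
        env=env,
        check=True,
    )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    print("plain python:      ", stdout_write_through())
    print("python -u:         ", stdout_write_through(["-u"]))
    print("PYTHONUNBUFFERED=1:", stdout_write_through(unbuffered_env=True))
```

In a Dockerfile, setting the environment variable is an alternative to passing the flag on the command line; either way log lines reach `docker logs` as they are written.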

But there are simpler ways to build dev images using build stages... I maintain two for my own use, one for web dev and another for ML/back-end stuff:

- https://github.com/rcarmo/alpine-python

- https://github.com/rcarmo/ubuntu-python (because musl doesn’t play well with some binary dependencies)

I also tend to rely on uwsgi or sanic rather than gunicorn (which feels a bit dated to me) but those boil down to personal preference.

+1 for `-u`; this one caught me out when I first used Python services in Docker.

Just curious, why do you need gunicorn for auto-restarting the server inside a docker container, if you can just restart the container? It's a pretty lightweight operation for config changes, imo.

Yes, you can restart the container too, but it will not be real-time. Sometimes when you restart, the Docker daemon might take longer, and you might have to wait to check your update.

Restarting the container is manual work. I prefer restarts being taken care of automatically by watch tools or the server.
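
As a sketch of the "let the server do it" approach: gunicorn config files are plain Python, and `reload` is the setting behind its `--reload` flag. The `APP_ENV` variable name, the port, and the worker counts are assumptions made up for this example, not something taken from the article.

```python
# gunicorn.conf.py (hypothetical file name; gunicorn configs are plain Python)
import os

# APP_ENV is an assumed variable name; use whatever your setup defines.
_is_dev = os.environ.get("APP_ENV", "production") == "development"

bind = "0.0.0.0:8000"          # listen on all interfaces inside the container
workers = 1 if _is_dev else 2  # assumption: tune this for your workload
reload = _is_dev               # restart workers when source files change
accesslog = "-"                # "-" sends the access log to stdout
```

It would be started with something like `gunicorn -c gunicorn.conf.py app:app`, where `app:app` is a placeholder for your module and callable. Reloading only helps in a container if the source directory is mounted into it, so edits on the host are visible inside.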

Another solution could be using Docker in swarm mode to let Docker automatically restart containers.

    COPY src/requirements.txt ./
    RUN pip install -r requirements.txt
This has always confounded me as we always put `-e .` in our requirements.txt. Has this fallen out of favour? It doesn't appear the author's application needs to be installed like most pyramid or click applications do.

I haven't really looked at Pyramid in years (I do mostly Django). Why does it need `-e .`? In the projects I've worked on, that would be a really weird thing to do.

`-e` in a requirements.txt is always a bit of a hack. If you really have to do it, it's highly recommended to pin it to a particular sha1 or tag. But I guess my bias is from spending a lot of time and effort trying to get python projects to be as close as possible to reproducible.
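
One way to enforce the pinning advice is a small check in CI. The sketch below flags editable VCS lines that carry no `@<ref>` at all; it cannot tell a tag or sha1 from a moving branch name, so it is a minimum bar only. `unpinned_editables` is a hypothetical helper, and the example.com URLs in the usage are placeholders.

```python
from urllib.parse import urlsplit

# URL prefixes pip uses for version-control requirements.
VCS_PREFIXES = ("git+", "hg+", "svn+", "bzr+")


def unpinned_editables(requirements_text):
    """Return editable VCS requirement lines that have no @<ref> pin."""
    problems = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.strip()
        if line.startswith("--editable"):
            target = line[len("--editable"):].lstrip(" =")
        elif line.startswith("-e"):
            target = line[2:].strip()
        else:
            continue  # not an editable requirement
        if not target.startswith(VCS_PREFIXES):
            continue  # local paths such as "-e ." have nothing to pin
        # The ref lives in the path part, e.g. /team/lib.git@v1.2.0
        if "@" not in urlsplit(target).path:
            problems.append(line)
    return problems
```

For example, `-e git+https://example.com/team/lib.git#egg=lib` would be reported, while the same line with `@v1.2.0` before `#egg=lib` would pass.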

Out of curiosity, why gunicorn instead of uwsgi?

Apart from gunicorn being easy to configure, I just happened to use gunicorn first. I'm sure the same Dockerfile can be applied with uwsgi. Are there any benefits to using uwsgi over gunicorn? Open to exploring.
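
The reason swapping servers is plausible is that both speak WSGI, so the application side stays the same. A minimal sketch, assuming the file is saved as `app.py` (a made-up name) and the callable is called `application`:

```python
def application(environ, start_response):
    """Smallest useful WSGI app; any WSGI server can call this."""
    body = b"hello from a WSGI app\n"
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

With that file, `gunicorn app:application` and `uwsgi --http :8000 --wsgi-file app.py` should both serve it, so only the final command in the Dockerfile would need to change.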

I can't speak for the author but in my experience gunicorn was simply easier than uwsgi.

When I ran into problems configuring uwsgi, it was painful. This was a few years ago, so maybe things have gotten better.
