Lessons from Building a Node App in Docker (jdlm.info)
134 points by JohnHammersley on April 21, 2016 | 28 comments

The article should mention why you would like to run something in docker. What many forget is that when putting stuff in a container, you create future work for yourself to manage not only your own stuff, but also all the dependencies in the container. If you're just after isolation, that could be accomplished with Linux name-spaces and apparmor.

(Author here.) That's a good question. In this case, the main benefit I was after was easy setup of consistent development environments. We have used (and still use) vagrant + ansible for the same purpose, but Dockerfiles are a lot simpler, and people have been less afraid of changing them vs. the slightly crazy ansible playbooks.

I think it's worked pretty well so far, though we have had a few difficulties:

- Particularly in the early stages of development, we've been changing dependencies a lot, and that requires a lot of image rebuilds. They are very consistent, but also a bit tedious.

- For those of us on macs, using docker machine for development hasn't been all that great, because inotify doesn't work for automatic code reloading (watchify, nodemon, etc.). However, they're hard at work on that with the new Docker for Mac.

I'm hopeful that technologies like kubernetes will make it easier to deploy these containers, too, but I haven't really got there yet. Maybe another article some day!

It's not an either/or scenario. Unless you have a trivial environment, you still need something like Ansible to configure the application in question. Things like setting up monitoring, log transports, certificates, databases etc. need to be orchestrated outside the container.

Check out docker-osx-dev which uses rsync to more efficiently synchronize container volumes with your local files and enable watchers.

That seems to work only w/ boot2docker. Is it still applicable in the docker-machine era?

I could think of a few reasons why you would like to run something in a Docker container.

- OS packaging is tedious to say the least, and "git clone and pull dependencies on production systems" processes are generally considered messy (if not evil, especially when pulling from repositories hosted on the internet); Docker solves both of these issues by offering a generic interface for shipping and deploying isolated instances of your application. You don't get that with just namespaces and apparmor, you need an API for that (which is really what makes Docker so useful)

- Docker (potentially with the help of its ecosystem) can provide a uniform interface to a couple of the most important operational aspects of an application: logging and monitoring. Especially for heterogeneous or simply large environments, this is a big win.

- When container deployment orchestration matures more, it is much easier to manage and auto-scale your application in a large scale setting since you don't need to reinvent that wheel for every specific stack out there. It will come.

- It makes setting up and understanding development environments easier. Similar to Vagrant, Docker Compose lets you describe your architecture in a config file and easily set up a full stack for you. Especially in companies supporting or developing for a complicated stack, that's very useful. It also probably makes on-boarding of new developers a lot easier.

Still you're making a valid point: you need to maintain those containers. Just like you need to maintain your application's dependencies. Let's be honest though, in many ecosystems that problem already exists: take your typical JEE application that once built, hardly ever upgrades its dependency list anymore (no one even monitors what security holes are being found in all those jars, in many cases). But yes, you should. That problem is really not solved with Docker containers. Most images/containers will be as thin as possible (as will the host OS, preferably), but the concern remains valid.

Because it's still shiny enough...?

you can scale it up on a generic compute farm like kubernetes?

I think it's brilliant that this gives security more prominence with setting up the unprivileged user. Pretty much every Docker post / article I've seen tends to skip over details like that.

What exactly do you achieve with this? It's running in a container. What's a hacker to do? Screw up the app in the container, which they could do with the app user anyway?

(Author here.) So far as I can tell, it's not that there are known, specific things that one can do to break out of a docker container as root; it's just that the space of possible things you can do is larger, so there is more surface area for you to attack. So, following the principle of least privilege [1], you should avoid running as root, if you reasonably can, and in most cases it's not that hard to do.

[1] https://en.wikipedia.org/wiki/Principle_of_least_privilege

Running as root within a container means you're still running as root on the host as well for the underlying process. If there's a security issue with containerization, you'll end up with root on the host.

Running as a non-root user in the container is an extra level of protection and follows the principle of least privilege.

Docker is very clear about this in their documentation: Don't run applications as root.
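
For anyone who wants to see what this looks like in practice, here is a minimal sketch rather than the article's exact setup. It writes out a Dockerfile that creates an unprivileged user and switches to it before installing dependencies and starting the app. The user name `app`, the file name `Dockerfile.nonroot`, the entry point `index.js`, and the Debian-based `node:4.3.2` base image are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: generate a Dockerfile that runs the app as a non-root user.
# "app", "index.js" and Dockerfile.nonroot are hypothetical names.
cat > Dockerfile.nonroot <<'EOF'
FROM node:4.3.2

# Create a dedicated user and group with a home directory and no login shell.
RUN useradd --user-group --create-home --shell /bin/false app

ENV HOME=/home/app

# COPY runs as root, so hand the files over to the unprivileged user.
COPY . $HOME/src
RUN chown -R app:app $HOME/src

# Every later instruction, and the container's main process, runs as "app".
USER app
WORKDIR $HOME/src
RUN npm install

CMD ["node", "index.js"]
EOF
```

Building with `docker build -f Dockerfile.nonroot .` should then give an image whose process is not root, either inside the container or on the host.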

This is incredibly clear and well-written. Great article!

If your NPM installs are mysteriously slow in Docker, try adding this line before 'npm install':

RUN npm config set registry https://registry.npmjs.org/

I don't know why it works, but it does.

Can anyone tell me what this does? This looks like a configuration file in JSON, but why do these settings make it work in docker? If my docker image size is different, which parameter in this would I change?

Well, this just tells the npm client to download the modules from the official registry, which is "https://registry.npmjs.org". Extra info: If you want even faster downloads within an organization (or wherever recurring npm installs might occur), I would recommend setting up a local lazy mirror. PS: I maintain one such lazy npm mirror app: https://github.com/bhanuc/alphonso. It only stores the modules that have been requested and makes subsequent installs much faster.

https://imagelayers.io/?images=node:4.3.2 - I wonder how small we could get an image that's still capable of running node and having an extra user (Buildroot is root-only by default)

If the image size is your primary concern, there are many alpine linux images which excel at this, for example: https://github.com/mhart/alpine-node

mhart's images are great, but I have found one issue with them. They do not contain the dependencies needed to build node-gyp based packages, most critically the bcrypt package. I have fixed this issue, and I push and maintain up-to-date node images that are built on alpine and include make, g++, etc. as needed. https://github.com/stackci/node

Should g++/make really be included in a container that's used in production though?

If you use the bcrypt node package you need g++/make to compile native extensions so there's really no way to avoid it. You can remove it after you run `npm install` in your own dockerfile, but it won't help container size at all.
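
The size point holds when the removal happens in a later Dockerfile instruction, because each instruction adds its own layer and the compilers stay in the earlier one. If the install, the build, and the cleanup all happen in a single RUN, the toolchain never lands in a committed layer. A rough sketch, assuming an Alpine-based image: the `mhart/alpine-node:4` tag, the package list, the `.build-deps` label, and `index.js` are illustrative assumptions, not tested against any particular image version.

```shell
#!/bin/sh
# Sketch only: generate a Dockerfile that installs build tools, compiles
# native modules, and removes the tools again within one layer.
cat > Dockerfile.alpine <<'EOF'
FROM mhart/alpine-node:4

WORKDIR /src
COPY package.json ./

# One RUN instruction: the toolchain is added, used by npm install for
# node-gyp builds, and deleted before the layer is committed.
RUN apk add --no-cache --virtual .build-deps make gcc g++ python \
 && npm install --production \
 && apk del .build-deps

# Assumes node_modules is listed in .dockerignore so it is not overwritten.
COPY . .

CMD ["node", "index.js"]
EOF
```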

I'm solving this problem by having a `dev` container with make, gcc, g++, python, and nodemon included and which is then used to install node_modules with `ENV NODE_ENV dev` set. When I'm ready to deploy, the node_modules gets deleted, installed in production mode and a production container gets built. For dev container, node_modules is a volume, and it gets copied into production container so it's self-contained.

Any reason why you wouldn't use some kind of supervision for your process when running it in production?

It depends whether you want restarts handled from within your container or from the outside (e.g. orchestration framework).

With newer docker (1.2 onwards - https://docs.docker.com/engine/admin/host_integration/) you can have a container restart policy which handles a lot of the simple cases where previously one might have used supervisor.

Baking restart policy into the container can be convenient (and was considered standard practice before restart policies), but has the downside that it's a bit less flexible in terms of how your container works in different environments.

Docker itself can restart containers when the parent process inside the container exits for any reason. That behavior is not enabled by default but it can trivially be enabled for a container (started via "docker run", or in the compose yml file similarly).
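
As a concrete illustration of enabling that behavior from the command line, here is a small sketch. The container name `web` and the image name `my-node-app` are made up; `--restart=on-failure:5` asks Docker to restart the container after a non-zero exit, at most five times.

```shell
#!/bin/sh
# Sketch only: start a container with a restart policy instead of running a
# supervisor inside it. "web" and "my-node-app" are hypothetical names.
start_app() {
  docker run --detach --name web --restart=on-failure:5 my-node-app
}

# Usage (needs a running Docker daemon and the image to exist):
#   start_app
```

Using `--restart=always` instead would also bring the container back after a clean exit or a daemon restart, which is closer to what a process supervisor does.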


The last time I tried to use that, it failed when coupling containers, i.e. nginx requires my web app, which requires the database. Every process but the database failed because they all depended on each other. I ended up using shell scripts with an arbitrary wait of 2 seconds after each process. That fixed the problem, so the containers came up after a reboot. Did docker fix that yet?
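
One alternative to a fixed sleep is to poll the dependency until it answers and only then start the dependent process. This is a generic sketch, not something Docker provides: the `retry` helper is hypothetical, and the host name `db`, port 5432, and the use of `nc -z` in the usage comment are assumptions about the environment.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, sleeping DELAY_SECONDS between attempts.
# Returns 0 on the first success, or 1 after MAX_ATTEMPTS failures.
retry() {
  max="$1"
  delay="$2"
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage in a container entrypoint (assumes nc is installed and the database
# is reachable as "db" on port 5432):
#   retry 30 1 nc -z db 5432 && exec node index.js
```

Compared with a fixed 2-second wait, this starts as soon as the dependency is ready and fails loudly if it never comes up.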

I think the implementation of links between containers did change significantly in the past year and a half.

But that "links break when containers restart or are replaced" aspect was a deal-breaker for me when I started using docker "near production". I just use --net=host ... so I don't use docker network-related functionality at all. For my purposes server-level firewall settings are fine.
