Ask HN: Why use Docker and what are the business cases for using it?
122 points by dmarg on June 30, 2015 | 131 comments
I recently did a very basic tutorial on Docker. Afterward, when talking to the tech lead of the project I am working on, he asked me "What are the business cases for using Docker, how could it save us money, how could it make us more efficient?" I told him I would do some research to find the best answers and figured I would turn to you all here at HackerNews for some advice.

Thanks in advance!

Just so you know, I currently work on a project at my company that uses AWS, Codeship and BitBucket for development and production. AWS and Elasticbeanstalk host our application, Codeship runs tests on changes to branches in BitBucket and then pushes the code that passes on certain branches to different AWS environments for Development or Production.

Does Docker provide anything useful to someone who develops on OS X and deploys to Linux VMs (Digital Ocean, Linode, AWS)?

Current setup is something like:

- Develop locally (OS X)

- Test deploy to local Vagrant Linux VM (provisioned by Ansible)

- Deploy to staging/live Linux VM w/ Ansible (or Fabric if I'm being lazy)

I've been following the Docker hype for some time now, but every time I look into it, I can't find any info on how Docker could make my life simpler or easier. If anything, it would just add another complex abstraction layer to have to deal with.

What am I missing?

You can build a docker image in a vagrant VM on OSX and deploy that exact docker image to production.

If your vagrant image resembles production, it's probably fine, but there's a level of confidence to be reached from deploying the exact entire self-contained binary, shipping it through QA and staging, and eventually promoting it to production.

This should be made into an answer to the OP's question, because this is the use case that I think most people will wind up moving to docker for, if they do.

This is also a use case for not developing on OS X, which doesn't have anything to do with production anyway.

Can you clarify? Do you mean that one should only develop upon the platform that will be used in production?

Yes, the previous commenter is a purist with no battery life on their laptop.

Further, if you're in the middle of upgrading your production OS, does this mean that you need two developer machines?


An alternative to Docker is Packer (https://packer.io/). It will create and register an AMI for EC2 so you don't have to run any configuration scripts once it is deployed. It's made by the same company that makes Vagrant and uses a similar setup. If you run Ansible to provision the Vagrant VM, I believe you can use the same script to provision the Packer image.

I would not call this an alternative. I think an alternative development workflow involves using packer as well as a number of other top level tools (vagrant for example) and using a number of other languages and DSLs. What you end up with is a VM image, not a container image, so you miss out on some of the container-vm abstractions that make docker and other container technologies so appealing.

Agreed, but a docker image is lighter in weight, takes less time to deploy, and consumes fewer resources on the target server.

My current workflow is similarly VM oriented and I've just started playing with docker, so an experienced user can feel free to chime in.

Docker is more about process isolation using LXC than having a whole VM for a particular app. You still need some operating system underneath your docker deploys, so it feels redundant in a world where I already have a VM for each project. However I could see it shining at hosting a half dozen to few dozen services from the same machine.

One thing I don't like aesthetically is the tendency to expose local folders to the container. I prefer complete encapsulation.

Docker doesn't expose local folders to the container unless you specifically and verbosely instruct it to do so. Unless you're thinking of the ADD and COPY instructions, which are entirely optional (or you can, IIRC, use ADD with a URL).

If you are referring to docker-machine (née boot2docker) sharing your home directory into the VirtualBox, that's a necessity brought on by the fact that you are not developing on Linux and Docker is (currently) Linux only, so something has to give in that relationship. It is, IIRC, also more-or-less optional, with the tremendous downside that you must scp/rsync your workspace into VirtualBox by hand, before ssh-ing in to execute `docker build`.

There are many ways of using Docker and obviously different companies could come up with their own business cases for adopting the technology. So let me focus on one scenario and we can talk about whether it makes sense for your environment.

Software engineering is often difficult because programmers have to deal with inconsistent environments for development and for production execution of their products. Due to mismatches between these environments, developers often find bugs that surface in one environment but not in another.

Hardware-based virtualization (VMware, Hyper-V, etc.) helped with the inconsistency issue because it enabled developers to create dev/test environments that could later be replicated into production. However, this category of virtualization requires more computational resources (esp. storage) than operating system virtualization like the LXC used by Docker.

In addition to requiring fewer resources than hardware virtualization, Docker defined a convenient container specification format (Dockerfile) and a way to share these specifications (DockerHub). When used together, these Docker technologies accelerate the process of defining a consistent environment for both development and production in your application. Dockerfiles are easy to maintain and help reduce the need for a dedicated operations team for your application. In buzzword speak, your team can become more "DevOps".

Docker, by virtue of relying on a thinner virtualization layer than hardware hypervisors like VMware, also has higher performance for I/O-intensive operations. When used on bare metal hardware, you are going to be able to get better performance for many databases in Docker containers compared to databases running in a virtual machine guest.

So to recap, Docker can help you

- maintain consistent dev/test/prod environments

- use less resources than hardware virtualization

- free up the time your team spends on dev to ops handoffs

- improve your app I/O performance compared to running in a hardware virtualization based virtual machine guest

However, if you are using AWS, note that the Docker container service available from Amazon doesn't actually give you Docker on bare metal. That's because Docker containers run in AWS virtual machine guests virtualized by the Xen hypervisor. So with AWS you are paying a penalty in resources and performance when using Docker.

Great, but what are the benefits of running Docker in AWS? You are still running VMs and you are being charged for running them. With Docker you are simply adding yet another layer of complexity, because now you have to run beefier VMs and you have a problem with network communication between containers running on different hosts, so you will most likely need to use an overlay network. You also decrease resiliency, because now when AWS terminates a single VM, all apps running on that node suddenly disappear.

I also don't get the argument about running the same container in dev/test/prod. For example, my company is working on going Docker, and one of the problems with these environments is that the app running in each has a different configuration. So the idea to solve it is to create three different versions of the same container. Genius! But now are you really running the same thing in dev/test/prod? How is it different from what we did in the past? Especially since before Docker, through our continuous delivery, we actually were using the exact same artifact on machines set up with chef that were configured the same way as in prod, while with Docker we now plan to use three different containers.

>what are the benefits of running Docker in AWS?

I don't see benefits to running Docker in AWS. In my opinion, AWS implemented its Docker-based Container Service very poorly. I advise my customers against using AWS when they want to use Docker. There are many bare-metal-as-a-service providers out in the marketplace.

>the argument about running the same container in dev/test/prod

Is this issue really caused by Docker because you said that you had consistent environments when built by chef?

I installed the Amazon Linux AMI (Amazon Machine Image) with docker and Python 3.4 for a recent project. I know ElasticBeanstalk fairly well, but configuring this was a big headache. WASTED A LOT OF TIME.

Instead of making life easy, it just added the unnecessary burden of learning Docker for future project members. The documentation is poor; I had to hunt for hours for solutions to simple queries.

On AWS, I would suggest you stick with the basic Linux flavour that you know. Use their Docker build only IF you know docker very well.

> In my opinion, AWS implemented its Docker-based Container Service very poorly.

We looked into using it earlier this year.

The web UI was flat-out busted in several ways - they only listed the first 100 security groups, and we have an order of magnitude more than that. The command line interface was poorly documented, and was missing some of the functionality.

It was a total waste of a week; I wonder if they've fixed any of that.

Would you mind elaborating on some of the issues you see with Elastic Container Service? I'd hoped it would be something like an AWS-specific Mesos, but I haven't looked into it closely.

Can you name a few bare metal as a service providers?

Rackspace, SoftLayer/Bluemix

Your environment-specific config shouldn't be in the container, but described in the environment itself, whether through ENV vars exposed to the container, or a mounted volume of config files. This is a solved problem, in my (admittedly limited, compared to some other commenters) experience.
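To make that concrete, here is a minimal Python sketch of an app reading its environment-specific settings at startup rather than having them baked into the image. The variable names (`APP_ENV`, `DATABASE_URL`, `APP_DEBUG`) and the `Settings` class are made up for illustration; they are not from any particular project.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    env: str
    database_url: str
    debug: bool


def load_settings(environ=None):
    """Build settings from environment variables (hypothetical names)."""
    environ = os.environ if environ is None else environ
    return Settings(
        env=environ.get("APP_ENV", "dev"),
        # Required value: raises KeyError early if the environment lacks it.
        database_url=environ["DATABASE_URL"],
        debug=environ.get("APP_DEBUG", "0") == "1",
    )


if __name__ == "__main__":
    # Passing a dict stands in for what the container runtime would inject.
    print(load_settings({"DATABASE_URL": "mysql://localhost/dev"}))
```

The same image can then run in dev, test, and prod; only the injected variables differ.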

Robust composition. Instead of needing a separate database, message queue, app server VM, you can run all 3 containers together on one omnibus machine, or separated as scalability needs demand.

Yes, you can achieve something similar using multiple layers in OpsWorks or different deployment schemes with normal Chef, but IMO containers make provisioning and deploying combinations of components easier than most other provisioning and deployment solutions. There's less opportunity for unexpected version collisions and because the network infrastructure is virtual, you can move containers between underlying VMs, allowing capacity planning without substantial reconfiguration.

Check out https://medium.com/@hyperhq/docker-hyper-and-the-end-of-gues...

I think this is the way for Container-as-a-Service to go, instead of the current form of AWS ECS.

@takeda, how do you create 3 different containers in Docker for dev/test/prod? What parameters are you changing in the three stages?

> - maintain consistent dev/test/prod environments

We're pretty heavily invested in Docker. To this point, it's really nice knowing that all the developers are operating in consistent environments.

An additional bonus is that using Docker makes it really easy to propagate infrastructure changes to the rest of your team for use in development. And, more importantly, know those infrastructure changes are consistent across local dev setups.

As an example, I recently incorporated Sphinx search into a project we were working on. I didn't want to require devs to install Sphinx on their own machines and get it up and running for search to function properly. I also didn't want devs to have the overhead of running Sphinx for search on their local boxes unless they were actively working on something related to search. Basically, I wanted search to be optionally configured to run on startup.

I used a Dockerfile to set up Sphinx in its own container, pushed the Dockerfile to a Tools source repository we use, and then incorporated the build and running of that container into our startup scripts (just some simple orchestration stuff for the local machine written in bash). Now, if a dev wants a containerized search mechanism they run a simple bash script to build that container, then run another command to spin up our dockerized web app with a connection to a running sphinx search container. We do this for all of our services: mysql, redis, sphinx, the web app itself, and anything else we might need.

As an added bonus, all the Docker CLI work and orchestration of our application is easily hidden behind a shell script. If a dev wants to run the webapp, they simply run: app server dev. If they want the app with search they run: app server dev search from the command line. Developers never need to know what's going on under the hood to get their job done. From their perspective, it just works.

I'd be interested to know if and how people are integrating Docker into their edit-compile-test development cycle?

For me personally, the time it takes from performing an edit to seeing your change "live" on a developer's machine is extremely important. To this end we've spent some effort in making our own services and dependencies run natively on both OS X and Linux to minimize the turnaround time (currently at around 4 seconds).

Docker fixes the "works on my machine" problem, which is worth tackling but not something you run into too often, but (in my admittedly limited experience) introduces pain into the developer's workflow. Right now, I'm leaning towards enforcing identical environments in CI testing and production via Docker images, but not necessarily extending that to the developer's machines. Developers can still download images of dependencies, just not for the thing they're actually developing. I'd love to hear alternative takes.

We do a really simple trick. We have one Dockerfile, which we use for dev and for production, every configuration difference is handled through environment variables (which is easy to manage with docker-compose, Ansible, etc.), so you can run the same image everywhere and change the environment configuration to switch between dev/test/prod.

So to solve the live reload problem we do only one thing in development which we don't do anywhere else: the Dockerfile copies all the current source files into the image, but we run the container with a mounted volume of the source code, which in dev maps to exactly the same path where the Dockerfile copies the src.

This way when you create the image and start the dev container it won't change anything, but as the src is mounted on top of the included src in the image the regular (Rails) auto reload works perfectly.

For test and prod we just simply don't do any volume mounting at all, and use the baked in src directly.

Easy and fast everywhere, all you have to control is how and when the images are created (ex: we do a full clean checkout in a new directory before building a test image, to ensure it only contains committed code. EDIT: by test I mean for local testing, while you're working on revisions, a CI server does the automatic testing).
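As a rough illustration of that setup, this Python sketch builds the `docker run` command line for each case. The helper name, the image tag, and the `/app` path are hypothetical; the only Docker features it relies on are the standard `-e` (environment variable) and `-v` (volume mount) flags.

```python
import os


def docker_run_args(image, env, src_dir=None, container_src="/app"):
    """Return the argv for `docker run`; mount src_dir only when given (dev)."""
    args = ["docker", "run", "--rm"]
    # Environment-specific configuration is passed in, not baked into the image.
    for key, value in sorted(env.items()):
        args += ["-e", f"{key}={value}"]
    if src_dir is not None:
        # In dev, the host source tree shadows the copy baked into the image.
        args += ["-v", f"{os.path.abspath(src_dir)}:{container_src}"]
    args.append(image)
    return args


if __name__ == "__main__":
    print(docker_run_args("myapp:latest", {"RAILS_ENV": "development"}, src_dir="."))
    print(docker_run_args("myapp:latest", {"RAILS_ENV": "production"}))
```

For test and prod the `src_dir` argument is simply left out, so the container uses the source that was copied in at build time.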

I spent a lot of time thinking about this and working on it. My github account hasn't been active very much over the past year, but the docker fundamentals are there.

I would love to find the time to modernize in the context of machine/swarm/compose.

One of the thoughts[1] was using docker volumes as sophisticated pipes to other docker processes, allowing for a different kind of blog/publish/editor workflow while being agnostic in tools (blog engine, editor [web or some other mechanism]) for how the content is actually created.

Needless to say, there are many aspects of Docker I find really powerful, and I particularly delight in taking an existing workflow and modernizing it in that context.

Long winded context, but, suffice to say, a 4sec turnaround time would be way too long for me personally. I know first hand as well as through all of the talks on Docker out there it is possible to get near realtime updates of your app in local dev.

1: https://github.com/keeb/blog

To tell all the cool kids you are using it.

(Ok, I know there are real use cases for Docker, but I see a lot of hype as well. People telling my mathematician friend that she needs to use docker at the start of her project - it is likely to be a one off graph she needs to produce for a research paper).

There is a big push for reproducibility in science. If your friend can package the process for building that graph in a Dockerfile, it is more likely that readers of her paper will be able to reproduce her results.

or, you know, publish the formula, so readers can reproduce in whatever language / system they want.

Reproducibility is a big push.... but not like you are suggesting. Shipping a dockerfile is the equivalent of saying "This works, if you use this flask, this pipette, this GCMS and this piece of litmus paper"

Docker is not the only solution to problems. It solves some, but you can't tack it on to everything.

Why not both? I am not in academia but I was under the impression that some academics might be publishing 'questionable' results that cannot be reproduced at all in order to get their paper count up for tenure review. Not to mention puff-pieces from industry that basically serve as PR in peer reviewed journals without furthering their discipline.

So shipping working code (even if it comes with a required pipette) might be a nice requirement for a peer reviewed publication to take on in order to keep their journal relevant. Shipping in Docker or similar guarantees reproducibility.

If the code is crap, and only works on one particular data set, then putting it in a docker container ain't going to help.

This is an area where a lot of companies are focusing in terms of data science.

As you noted, reproducibility is a huge issue in the scientific community (according to docker users/vendors I've spoken to), to the extent that there are a number of funded startups trying to solve this problem (some using Docker).

What has also been very surprising is the big companies that have read-only copies of analytic data they want to run computations on - sandboxing the data scientists' scripts in a container has helped them tremendously in terms of supporting the execution.

One business use case is to stop threads between the Ops and the Dev team about missing JAR files on a production system.

If Docker was fixing this only, I would still use it. There is nothing better than a single binary deployment that is byte to byte the same as it was running on a dev laptop and a QA env.

Do you mean like a war file?

(Yes, I'm being hyperbolic, unless you want to have your entire application running from a servlet container with nothing in front of it. And I agree, a single deployment object is just about the only way to keep your sanity. But still, this isn't the first thing that's proclaimed that advantage.)

No, I do not mean like a war file. I am talking about an average Java developer who does not understand how the classpath works, how the environment works, and what assumptions are built into the code, and who blames everything on other people. For a long time they could get away with this; in the post-Docker world they cannot.

Not sure why you still have this problem. In the Java world, the Maven build system solved this a long, long time ago. There is even a plug-in that builds an uber jar that can be invoked with the JVM as the only dependency.

Still have problems with Java versions, Java security policies (encryption strength, etc, that you manually have to add files to your JDK/JRE to work), external dependencies, etc.

We use Java, Maven, JBoss + Ear/War deployments. And "Works on my Box" is one of the most frustrating problems we have. This is one of the reasons we're pushing toward Docker.

I do not have this problem, because I understand this very well. I said an average Java dev has this problem; that is based on my experience working with Java devs in San Francisco for a few years.

Is that really where we're at? That an "average Java developer" doesn't understand the classpath or environment?

I've certainly worked with a lot of folks who didn't, but I guess I've just always hoped that experience was unique to me.

Oh boy, if I could publish all of the things I went through. I was hoping so too, until I had been at several companies and worked with several developers; that was when I started to think this is not an isolated problem but rather the average.

I absolutely love it (Docker + docker-compose) for creating homogenous local development environments. And if your app needs a MongoDB, Elasticsearch or anything else, it's as easy as adding one line to your overall docker-compose config file to link those services to your app. No need to pollute your development machine, you can just have anything running in Docker containers and share them across your team.

I've created several repos on Github for that matter. Here's for example some boilerplate for running Node.js, Express, Nginx, Redis and Grunt/Gulp inside Docker containers:


I love docker-compose, it is to services what npm is to packages.

That said, I'm only using it in development. For production there's so many options I don't even know where to begin: http://stackoverflow.com/a/18287169/3557327

In my experience, if you start hearing: "I don't know what's wrong, it works fine on my localhost!" a lot, then it may be time to think about Docker.

In more general terms: the more complex the environment, the more moving pieces, the more developers on the team, the more servers in production, the more likely there's going to be a discrepancy between what Developer-A has running on their machine and what Developer-B has running on theirs. Docker helps keep everybody on the same page.

For me personally: I'm a dev on a number of projects, and Docker helps me keep my dependencies straight. I no longer have to change things around locally just to work on Project-A, I just get their latest Docker image, and I'm good to go.

This seems pretty fine, but it's not a Docker-specific argument. The same argument fits a Vagrant workflow.

How is this different from sharing a plain old VM image and doing the same thing? Is there any particular advantage that a Docker image brings?

Docker images use the union file system. That means when a new version of that image is available, you often only pull a few megabytes from the hub because it already has the OS.

A docker container is also way more lightweight. Instead of running a full OS, you only run one process, so your docker container starts within seconds.
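A toy model of why the pull is small, sketched in Python: treat an image as an ordered list of layer IDs and download only the ones not already cached. This is a simplification for illustration, not how the Docker client is actually implemented, and the layer names are invented.

```python
def layers_to_pull(image_layers, local_layers):
    """Return the layers of an image that are missing locally, in order."""
    cached = set(local_layers)
    return [layer for layer in image_layers if layer not in cached]


if __name__ == "__main__":
    # Hypothetical layer IDs: base OS, runtime, then two app versions.
    already_have = ["os-base", "runtime", "app-v1"]
    new_image = ["os-base", "runtime", "app-v2"]
    # Only the changed application layer needs to be fetched.
    print(layers_to_pull(new_image, already_have))
```

With a plain VM image there is no such sharing, so a new version generally means transferring the whole image again.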

Is this more like a Solaris zone or a Linux chroot-ed env?

A lot like Linux chroot, with some additional features, restrictions, and a mechanism to share them easily.

For one, speed. Provisioning a VM image takes a lot longer than a container.

Secondly, composition. Can you provision and link the 10 instances that make up your application? Web servers, app servers, proxies, caches, databases, hadoop etc.

I think this second ability is the truly compelling one for me. The features that allow this (swarm and compose for docker, lots of other competing orchestration stuff) are still pretty young, but it would still push me in the direction of containerization over virtualization.

Makes system administration somewhat easier since the host OS can stay the same and docker containers change, while giving developers more control. I have total control of which version of packages is installed, which OS I'm using. I don't have to create a ticket, argue over the ask, and get approval just to change a web server timeout. It sort of usurps the sys admin role, though, which might be a negative. I can move my container anywhere that's running Docker and all packages are there inside of the container. If you spend a lot of time setting up new boxes, that's a plus. Before, I had to dump all packages, figure out which ones were missing, then install all of them, and the host OS had to be exactly the same. Now I know it's exactly the same, all the time, anywhere.

My only warning is that using anything but Ubuntu for your build host is going to take way too long and you're going to be waiting hours for it to complete if you don't have any layers cached.

It's very useful in situations where you need a reproducible deployment and also need high performance and direct access to hardware. We run simulations that require a lot of setup, and we tried with VMs at first, but they were too slow and the GL driver inside of the VM didn't implement all the extensions we needed. Docker worked perfectly in this case, though we have to run it in privileged mode.

I never figured it out, we could do everything we need in production with LXC, puppet, carton and perlbrew. Combined with a vagrant box for dev, we have no issues.

Although I do use docker to run deluge in a container with openvpn on my home pc, but only because someone else had gone to the trouble of writing the dockerfile and getting it to work.

It seems to break a lot though, when I have time I'm going to get rid of it and replace it with a systemd-nspawn container, because there's less handwaving involved and I can get it to work correctly.

> I never figured it out, we could do everything we need in production with LXC, puppet, carton and perlbrew. Combined with a vagrant box for dev, we have no issues.

Would you feel differently if you hadn't already ramped up on all those technologies?

What I mean is that since docker sort of overlaps with much of what people are doing with configuration management, containers, build/packaging systems, and virtualization, it has the potential to reduce the complexity of the stack. If someone's starting fresh, they can potentially avoid having to learn to use and deploy some or all of those different components and just go with docker.

> Although I do use docker to run deluge in a container with openvpn on my home pc, but only because someone else had gone to the trouble of writing the dockerfile and getting it to work.

That's actually pretty cool.

Could you post a link to the Dockerfile for the deluge/openvpn container?

I have a VM setup with transmission/openvpn since I was having issues getting the VPN to work with Docker's networking and it was just easier to use a standard VM instead.

Sure, this is the one I used, with privateinternetaccess.com: https://github.com/jbogatay/docker-piavpn

I'm sure you could easily change it to work with other VPN providers

Something worth noting is that more and more PaaS (Platform as a Service) offerings are using Docker, so sometimes you're not making the decision as a developer to use Docker; you're just forced to use it.

I'm saying this because I know docker solves a lot of pain on the devops side, but on the "software" side it's been painful all the time I've touched it. I.e. practically speaking, it makes releasing much slower, sometimes I'm forced to do a hard reset on the container rather than just reload nginx, etc.

My suggestion is to go with what's simpler for your stack. If you're struggling with having to manage and deploy new configured/secure ec2 instances every day, then it might be worth looking into docker.

> Something worth noting is that more and more PaaS (Platform as a service) are using docker

To expand on this:

* Heroku have introduced Docker-based tools to run their buildpacks outside of their staging servers,

* Cloud Foundry has, in public beta, the Diego scheduler, which can accept and manage Docker images,

* OpenShift 3 uses Docker and Kubernetes as its core components.

Disclaimer: I work for Pivotal, who founded Cloud Foundry.

I can give you two first-hand examples, which also revolve around the aspects osipov mentioned:

#1 is in my dayjob:

We use docker in combination with vagrant for spinning up a test environment consisting of several containers with different components (both our own products and 3rd party) to run integration tests against them.

The main reasons for this approach are:

- We can supply containers with the installed products, which saves time on automated runs since you don't have to run a setup every time

- We can provide certain customizations which are only needed for testing and which we don't want to ship to customers without doing all the steps needed for that over and over again

- We have exact control over, and versioning of, what the environment looks like.

- Resources are better distributed than in hardware virtualization environments

#2 is a pet project of mine, a backend for a mobile app. There are still big parts missing, but at the moment it consists of a backend application which exposes a REST API running on equinox plus a database (in its own container).

The reasons I see for using docker here:

- I have control and versioning of the environment

- I can test on my laptop in the same components as prod, but scaled down (by just spinning up the database and one backend container)

- Since more and more cloud providers are supporting Docker (I am currently having an eye on AWS and DigitalOcean, haven't decided yet), switching the provider in the future will be easier compared to having, say a proprietary image or whatever.

- If I ever scale up the project and onboard new team members, the entry barrier for (at least) helping in Ops will be lower than if they had to learn each individual technology before getting even basic knowledge of the project.

My apologies if I'm hijacking the original poster.

Does Docker handle multi-environment configuration management? For example: qa, stage and live have the same config files, but different values.

Currently we're using Ansible and we set variables for a specific environment, then we feed those variables into config files based on where we're deploying to (config files are not duplicated, only variables that feed into config files.)

All of our configuration is in git and we can quickly see and change it.

How does Docker handle this?

This is not necessarily a part of the Docker specification, but here's a best practice followed by many apps running in Docker containers: http://12factor.net/config

This. Our most recent project was engineered to leverage Docker and Ansible in this manner.

We have a single playbook to deploy everything, i.e. deploy multiple (micro)services, heavily using docker images pulled from a private registry.

With a single playbook, we have multiple Ansible inventory / hosts file for each environment: QA, prod. Sensitive information / secrets are stored in Ansible-vault groupvar files. QA people have ssh access to their own machines, while Prod ops have their own separate ssh access and machines.

The playbook was refactored to heavily use roles, wherein config template files are dynamically setup using inputs from inventory vars and groupvars.
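For readers who haven't used Ansible, the idea of one template fed by per-environment variables can be sketched in plain Python. This is a stand-in only: Ansible renders templates with Jinja2, whereas this uses the standard library's `string.Template`, and the hostnames and keys are invented.

```python
from string import Template

# One config template shared by every environment.
CONFIG_TEMPLATE = Template("db_host=$db_host\nlog_level=$log_level\n")

# Per-environment variables, playing the role of inventory vars / groupvars.
ENVIRONMENTS = {
    "qa": {"db_host": "qa-db.internal", "log_level": "debug"},
    "prod": {"db_host": "prod-db.internal", "log_level": "warning"},
}


def render_config(env_name):
    """Fill the shared template with one environment's values."""
    # substitute() raises KeyError if a variable is missing for that environment.
    return CONFIG_TEMPLATE.substitute(ENVIRONMENTS[env_name])


if __name__ == "__main__":
    print(render_config("qa"))
```

The template is never duplicated; adding an environment means adding one more set of variables.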

The roles are also topology independent, meaning a QA project cluster can actually be a single big VM with mocked DBs, while the Prod cluster can be spread across multiple machines.

Docker helped simplify the code deployment. Prior to deployment, docker images are built and tested by Jenkins before being pushed to the registry.

One way I've seen many people tackle this problem is to have the Dockerfile/image built in a more generic way, then the end of the Dockerfile kicks off an Ansible playbook (or some other lite CM tool) that will configure everything for the proper environment (e.g. change configuration and kick off a service, something along those lines).

Some will even go as far as using a CM tool to do the entire internal Dockerfile build, and the Dockerfile is just a wrapper around the CM tool. This does require more bloat inside the Docker image, as you need to have your CM tool or whatever other supporting files/scripts installed in the image, but it does make more complex scenarios much simpler.

> you need to have your CM tool or whatever other supporting files/scripts installed in the image

This pattern is maybe even more helpful than harmful, for making your dev environment more closely match production, when your final deploy target is not a docker container.

(You are obviously going to want to see those build scripts running in test, if not earlier; certainly once, before they should kick off in a production environment.) You could do more individual steps in the docker file, just like you could store your token credentials and database handles in the git repository. Neither way is "completely wrong" but there is a trade-off.

It sounds like you're reinventing a PaaS, which is a road many people go down when they build their devops environment from the ground up.

In the long run it's a bad idea: you wind up with a snowflake PaaS that only you maintain and only you can understand or extend. The amount of engineering effort behind Heroku, Cloud Foundry or OpenShift is enormous and you can get support on a high level.

I'm biased, because I work for Pivotal (who founded Cloud Foundry), but in my view rolling your own PaaS is a strategic error at this point.

Absolutely. This situation is handled by using Environment variables. Depending on the environment, I use environment variables to point services at different places (ie dedicated production database server vs my micro dev mysql container).
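A minimal sketch of what that looks like from the application side, in Python. The variable name DATABASE_URL and both URLs are assumptions for illustration, not something from the setup described above.

```python
import os


def database_url(environ=os.environ) -> str:
    """Return the database URL for the current environment.

    DATABASE_URL is a hypothetical variable name. The fallback points at a
    local dev MySQL container, so the same image runs unchanged on a laptop.
    """
    return environ.get("DATABASE_URL", "mysql://root@localhost:3306/dev")


if __name__ == "__main__":
    print(database_url())
```

In production the value would be injected when the container starts, for example with something like `docker run -e DATABASE_URL=mysql://app@db.prod:3306/app myimage` (image name and URL are placeholders).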

Remember that environment variables are visible to processes outside the container (i.e. users) if they run as the same user or a more privileged one. They are not a great place to store passwords or any other confidential information.

The environment of a process is only available to root or the same uid.

    vagrant@monitor:/proc$ sudo -u nginx cat 1779/environ
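If you want to check this on your own box, here is a rough Python sketch (Linux only) that tries to read another process's environment from /proc. The helper names are mine; the only thing taken from the system is the NUL-separated KEY=VALUE format of /proc/<pid>/environ.

```python
def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE format used by /proc/<pid>/environ."""
    result = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            result[key.decode(errors="replace")] = value.decode(errors="replace")
    return result


def read_environ(pid: int):
    """Return the environment of process `pid`, or None if it cannot be read.

    Reading generally succeeds only for root or for the uid that owns the
    process; otherwise the open() call fails with PermissionError.
    """
    try:
        with open(f"/proc/{pid}/environ", "rb") as handle:
            return parse_environ(handle.read())
    except (PermissionError, FileNotFoundError):
        return None
```

Calling `read_environ(pid)` as an unrelated unprivileged user should give None, while the same call as root or as the owning user returns the variables, which is exactly the exposure being argued about here.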

It's not uncommon to allow users to sudo up to particular system users for commands, nor is it uncommon for compromised programs to give the attacker a shell as the user of the compromised program.

Anything owned by that user is vulnerable. This is a common problem, typically resolved by reading a config file as root and then dropping to a lower-privilege user. For example, you wouldn't want anyone who could become the nginx user to get the SSL key, or the password to your S3 bucket, or...

You do not give the nginx user sudo ability, and any user who has sudo is root, and should be treated as such.

Side question. I'm well aware of the benefits of docker but, has anybody measured performance degradation due to lack of machine specialization? Back in the web 1.0 days it was common knowledge that you start in 1 server, then you split into 1 app server and 1 database server and you can get 4x the capacity. Did we lose all that with the docker way? Is it not so relevant anymore with modern multicore CPUs?

Docker doesn't force you to put your app and your db on the same image. That is up to you. Most have "App" images and "DB" images separate.

If we want to get really specific, it's also common to see the "DB" image split up between the image of the disk where the data is actually persisted, and the image of the actual DB process. This makes it easy to play around with your data under different versions of your DB.

If you're familiar with Java, think of a docker container like a WAR or EAR, except it can contain ANY dependency, not just Java code. Database, binaries, cache server, you name it. The implementation is vastly different, but the effect is a deployment artifact that can be configured at build time, and easily deployed to multiple servers.

Codeship have a great series on Docker for Continuous delivery on their blog: http://blog.codeship.com/

That said I've paged the founders to this thread, they can make the case much more effectively than I can. (disclosure: I don't work for Codeship).

Besides just actually running software, I also find it really neat when projects use docker to build their entire application. It provides an effective means of documenting all of your dependencies and making reproducible builds.

Take docker-compose, for example. You can just check the code out, run a single script that builds the project for your environment, and everything is pretty much self-contained in the dockerfile (https://github.com/docker/compose/blob/master/Dockerfile). You don't have to clog up your host computer with deps and you get an executable plopped into an output bin folder.

Additionally, the steps in the dockerfile get cached so subsequent builds are really fast.

A docker file is a pretty poor way of providing reproducible builds though.

First off there's the FROM line, which can contain whatever opaque image you feel like that already has dependencies inside it, and who knows how they got there or what will happen when it needs to be updated.

Then there's the fact that it's like a script but worse: every line creates a new image, and docker will try to cache the results after each line, but that cache can work against you if you're not really careful. (Imagine if build systems like make worked that way: no dependency tree, just refusing to execute the first half of your makefile because, well, it worked last time, so why do it again?)

And in practice, you get to find out how many people just put an "apt-get update" in their docker file too. Now our backwards compatibility is really just equal to Debian's. Hope there are no backports repos in there or anything that would give a non-backwards-compatible package!

It's certainly possible to use Dockerfiles to create reproducible builds, but it's literally no better than a shell script at doing that. You have all the rope you need to hang yourself and then some.
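The caching complaint is easier to see with a toy model. The Python sketch below is a deliberately simplified imitation of a per-line build cache, not Docker's actual implementation (the real thing also checksums files for ADD/COPY, for instance). The point it illustrates: a step is skipped whenever its text and its parent are unchanged, regardless of whether the outside world changed.

```python
import hashlib


def layer_key(parent_key: str, instruction: str) -> str:
    """Cache key of a layer: depends only on the parent key and the instruction text."""
    return hashlib.sha256(f"{parent_key}\n{instruction}".encode()).hexdigest()


def build(instructions, cache):
    """Walk the instructions in order and return the ones that actually had to run."""
    executed = []
    parent = ""
    for instruction in instructions:
        key = layer_key(parent, instruction)
        if key not in cache:
            # Cache miss: "run" the step and remember the resulting layer.
            cache.add(key)
            executed.append(instruction)
        parent = key
    return executed


if __name__ == "__main__":
    cache = set()
    steps = ["FROM debian:8", "RUN apt-get update", "RUN apt-get install -y nginx"]
    print(build(steps, cache))  # first build: every step runs
    print(build(steps, cache))  # second build: nothing runs, even if the apt repos changed
```

In this model the second build executes nothing, so a weeks-old "apt-get update" layer keeps being reused until someone edits that line or an earlier one.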

>making reproducible builds.

Docker builds actually aren't reproducible. There are many sources of non-determinism that Docker cannot address. Do you use the base images from DockerHub as-is or do you run 'apt-get upgrade' or whatever for security patches? If you do, the result you get from building that image (as opposed to using what's in a cache) is different depending on the time it was built. The same goes for any Dockerfiles that compile from source. Hell, just extracting a source tarball results in a different hash of the source tree because of the timestamps on the files. You and I have little hope of building the same image and getting the same exact result.

Build reproducibility is a very interesting topic with some unsolved issues, but Docker isn't helping with it. See https://reproducible.debian.net for a good resource about build reproducibility.
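The timestamp point is easy to demonstrate. This small Python sketch packs the same file contents into an in-memory tar archive twice with different modification times and hashes the result. The file name and contents are invented for the example.

```python
import hashlib
import io
import tarfile


def tar_digest(files: dict, mtime: int) -> str:
    """SHA-256 of an in-memory tar archive whose members all carry the given mtime."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in sorted(files):
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = mtime
            archive.addfile(info, io.BytesIO(files[name]))
    return hashlib.sha256(buffer.getvalue()).hexdigest()


if __name__ == "__main__":
    source = {"main.c": b"int main(void) { return 0; }\n"}
    print(tar_digest(source, 1000) == tar_digest(source, 2000))  # same bytes, different mtime
    print(tar_digest(source, 0) == tar_digest(source, 0))        # mtime pinned to a fixed value
```

Identical contents with different timestamps hash differently, while pinning the timestamp to a fixed value makes the digest stable. Normalizing metadata like this is one of the standard tricks in reproducible-build work.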

Don't know why you were downvoted. Docker doesn't give you reproducible builds because you're still running in a raw host OS environment with all its state, just with the subsystems partitioned into their own namespaces. A Docker image is more akin to a snapshot than to a reproducible build.

Docker, or containers in general? I'd really like to hear about Docker specifically, but most of the answers so far seem to relate to containers in general, rather than Docker specifically.

What are the business cases for using Docker over some other container-based solution?

We have a shitload of servers running CentOS for historical reasons. We can't change the distribution because all the services running on these servers are tied to the quirks and special cases of this distribution. So we need to live with CentOS.

Some of our newer services need an up-to-date version of glibc and a lot of other dependencies CentOS can't provide. So we use docker to boot up Ubuntu 14.04 containers and run the services with special needs in them.

Another great thing is isolating scripts we don't trust. We allow our customers to run scripts of all kinds on our servers, inside Docker containers, so the customers can't mess with the host system.

>running CentOS for historical reasons

Is CentOS not the state-of-the-art Linux distro to run for servers (besides RHEL for support)?

Maybe safe-of-the-art. It is stable (and old) which is the way many people like their servers. But, it certainly isn't the latest and greatest.

The entire point of RHEL/CentOS is that it isn't bleeding edge; it will certainly be modern, though. I think it's rather unfair to call it "old"; the latest release was in March.

Fair enough. But the mentality that picks stable over up-to-date tends to never upgrade. I'm stuck supporting rhel5.5, our "new" systems are 6.5

If you are hosting gameservers, it just does not fit.

This is what we do. Existing stable CentOS/RHEL environment that we can deploy other OS containers to (specifically Ubuntu 14.04) without having to stand up entire new VMs or metal.

our use case for docker is the following:

We're a webshop, and recently we've standardized our stack on symfony2/nginx/postgresql, so all our websites use that. But besides that we have some sites that we maintain that need to run on old versions of php/centos.

As we have only 1 server internally for pre-staging environments, docker helps us save a lot of memory/cpu compared to what we had before (virtualbox, yes...), without needing a lot of machines to set up (as with openstack).

Also we don't really have anyone dedicated to sysadmin work, so the less time we need to spend on server administration, the better we feel. So we have a set of 3 containers (for symfony+php_fpm / postgresql / nginx) that are already tuned to meet our needs, with an ansible playbook https://github.com/allan-simon/ansible-docker-symfony2-vagra... , that we reuse for every new project we have. That way the developers get a working stack without needing to reinvent the wheel. They don't even need any knowledge of system administration ("run this ansible command, done"!), and there is no risk of breaking other services.

Also the reproducibility and the stateless properties of docker containers have helped to nearly eliminate the class of bugs "the guy who does not work in our company anymore made a tweak one day on the server to solve this business critical bug, but nobody knows what it was, and now we need to redeploy and the bug has reappeared".

We have been using docker for about a year at Demandforce, Intuit. We have had a mostly positive experience with it.

Pros:

- We don't see any environment related issues because the docker image is the same in every environment

- Easy onboarding to teams that use docker because you don't need to set up anything new. This is especially useful if your company encourages developers to work across teams

- Ops can build infrastructure around this and be sure that every team builds and runs code in the same way

- If your application is complex, docker-compose makes it extremely simple to set up your dev environment

- The community is moving towards docker, and it doesn't hurt your resume if you have production docker experience

Cons:

- For an extremely simple application (that you think will remain simple over its lifetime), it might be more overhead to use docker than not to use it

- Even though we’ve been using boot2docker and vagrant to set up docker on Mac OS X, it hasn't worked seamlessly. When you get on and off a VPN, for example, boot2docker has constantly messed things up. If you can get your dev setup right, docker works well. If not, it can be a pain sometimes on OS X

- Although it's easy to build docker images for most of the open source software out there (if docker images don't already exist), it can be a pain to do that for enterprise software. Try using docker with oracle db. You might get it to work. You won't have fun with it!

I would keep an eye on this project: http://www.opencontainers.org/

Heard about this, and it seems like everyone and their mother is signing on. This is one of the reasons why I asked the main question: I want to fully understand what the business case is for using Docker.

I'm a sysadmin for a private K-8 school. I use docker because of the ease of deploying and upgrading a large number of tools on a limited number of servers.

I've used puppet for several years to manage our infrastructure, and puppet is still managing all our staff and student laptops, but for servers, I've switched everything to CoreOS + Docker.

Can people elaborate on when it would be better to use a virtual machine and when it would be beneficial to use a container?

Containers make sense if you have your own hardware (e.g. a data center) running your own code that you trust. By going with containers you can pack more applications onto the same hardware (less overhead), therefore your costs are lower.

If you're running in AWS, you use VMs anyway, so the overhead is there no matter what (and also is not your concern, because you pay for the VMs). By adding Docker there you're basically adding one extra layer on top of it, so from the infrastructure point of view you're making things even more complex.

Virtual machines have a much higher level of isolation than the LXC used currently for containers. In a container all it takes to get access to the whole system is a privilege escalation exploit. Such exploits are fairly common.

I talk about this a bit here:


this was a year ago, so a little out of date. I now work for another company that is into Docker.

I also have various bits on my blog:


check out the jenkins ones, for example:



In your Jenkins example, why use docker? Why not ask the devs to directly install Jenkins on their box?

Because the builds would not be contained. There were many dependencies that needed to be installed on each box.

Using Docker meant that the slaves could be built from scratch as well.

See here: http://zwischenzugs.tk/index.php/2014/11/08/taming-slaves-wi...

We at PeachDish use Docker and Bitbucket to scale our BeanStalk environment. Docker has helped us deploy test sites much more easily, as one can be assured that everything needed to run the app is in the Docker image. It helps us build consistent environments for testing.

The simplest case is that it can serve as multiple staging environments when you have loads of apps in a single code base (often the case for startups still looking for product-market fit). With docker, shipping speed = production speed. Without docker you are slowed down by 2x or more: either you set up a lot of staging environments, which is costly, or you use one single testing environment and let each of your tech people wait for hours, wasting time, while QA tries to test one branch and some other idiot deploys another branch on staging. While there are more complex/useful cases, this is one simple biz value I get out of it for my team.

Mostly answered already, but another use case is getting engineers up to speed on environments that they're not familiar with. For example, I work a lot with Rails apps, and sometimes we'll need a dev to come in and work some css/html magic or something like that. Well, getting them set up with rvm, ruby, rails, bundler, mysql, elasticsearch and working through compilation errors on OSX can be a real nightmare, especially if you set up your env 2 years ago and have just been updating it incrementally since.

When we dockerize the app, they can be set up with the docker image and ready to go in under an hour, without having to screw with their system ... much.

Here's a really simple example of a project that I had set up using docker.

For my website I have it set up with continuous integration to run my tests when I merge into master and build a docker image which it then pushes to the docker registry. I then have it ssh into my host server, pull that image, then run the new container and remove the old one.

Boom, I've just deployed my website by simply merging a PR into my master!

This is just a simple use case, and I probably wouldn't suggest deploying a production ready site like this, but it's really cool! It's really simple to just pass around images and have things up and running on local dev machines too.
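Roughly, the host-side part of that flow boils down to a handful of docker commands. Here is a hedged Python sketch that only builds the command list (a dry run, nothing is executed). The image and container names are placeholders, and this version stops and removes the old container before starting the new one so the container name can be reused, which is a slightly different order from what I described above.

```python
def deploy_commands(image: str, name: str) -> list:
    """Commands a CI job could run on the host over ssh, in order."""
    return [
        ["docker", "pull", image],                       # fetch the freshly pushed image
        ["docker", "stop", name],                        # stop the old container
        ["docker", "rm", name],                          # remove it so the name is free
        ["docker", "run", "-d", "--name", name, image],  # start the new one detached
    ]


if __name__ == "__main__":
    for command in deploy_commands("example/website:latest", "website"):
        print(" ".join(command))
```

Feeding each list to `subprocess.run` (or pasting the printed lines into an ssh session) would perform the actual deploy, with a short gap of downtime between stop and run.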

This is not something that is unique to docker though. I could just as easily set up a hook to deploy our site without docker, but it's not something I need.

I prefer to manually deploy, it's only one command, and I can make sure it's all worked correctly.

That said, I will be moving that command to a chat command, as I like that. But even then, it'll be a manually triggered command.

I have auto deploy setup for CI testing, as in that case I do want to know that those branches are ready for deployment to prod, when I want to.

Ahh that's true, I guess what I was getting at was that it makes the deploys much quicker and easier (at least in my experience).

I was also playing around with a chat command to spin up PRs/branches that haven't been merged into master yet, onto a temp container for viewing! Dunno, maybe I'm just not as familiar with other tools that can do similar things as easily.

In our organization we have seen tremendous value using docker for our CI server (bamboo) and for infrastructure deployment. On the CI server side of things... Our CI server, a.k.a. the "host machine" in docker speak, has a MySQL database that we did not want used by all of our builds; we wanted to build our application from the ground up in isolation, using its own MySQL instance. Additionally there is a lot of other peripheral software that we did not want to install on our host machine but that was critical for our unit tests to run. This is where docker is valuable to us from a build perspective. It allows us to isolate the application for unit testing and not get caught up with the possibility of other running software on the same machine affecting the builds. That frees us from debugging CI server issues, which saves the company money because developers are not wasting time on them. It's also easy to tell CI server related issues apart from real test failures, because a developer can just run the same tests using the same container on their local machine, so it creates a consistent environment.

On the infrastructure deployment side of things... Before our "dockerized" infrastructure we were managing about 7 different AMIs for all our servers, and it was becoming a pain in the butt: whenever our application called for new software we had to install it, create a new AMI, then re-deploy said AMI. If you have experience with AWS and you have done this enough times, I'm sure you have at one point or another faced long wait times for your AMI to be created before you can re-deploy with it. This is time wasted on the application deployment side of things, but also on the personnel side of things while you wait for that damn thing to be created so you can re-deploy. Time is money, and time spent waiting for resources to be available or for AMIs to be created is money taken away from the business. Additionally, though it is still in its infancy, we are using docker-compose (https://docs.docker.com/compose/), which offers some really nice ways of defining your container infrastructure within a single machine. I highly recommend looking into this for further efficiency.

To get some additional viewpoints on containerization, you could also take a look at what has been said about similar, preceding technologies:

* Solaris Zones, see also SmartOS Zones based on that

* FreeBSD jails

I consider it as a replacement for deploying applications as a virtual machine. That is, if I want to host compute and let anyone run any random program in a reproducible manner on my server, I could let them run it in docker and be done with it. So as an IT admin, I find this a useful alternative to letting people run arbitrary programs and adding restrictions around them. I think this is more of an IT-ops tool than something a developer would want to spend time with.

Say you're running on CentOS 6.6 (or the equivalent RHEL) and you want to run some software that won't work because you need a newer library than is installed (this happened to me recently when trying to install Transmission).

You have two choices:

1. Upgrade to CentOS 7.x.

2. Use Docker and install the software into a container using a newer OS (CentOS 7.x or a newer Debian).

#1 is very expensive and sometimes impossible (if you need to be on CentOS 6.x for compatibility reasons).

#2 is very cheap.

There's one of your business cases right there.

This is our current business case. Multiple CentOS/RHEL 6 systems in a global environment and we want to run an application that requires Ubuntu and newer libraries.

Instead of spinning up new VMs in each environment for one new application, we can instead run an Ubuntu container with the application within the existing environment. This brings with it all the other benefits such as continuous delivery and orchestration that we didn't have before.

Once the platform is established there is no limit what we can run within a repeatable and consistent environment.

Isn't that a business case for containers, rather than Docker specifically? If you want to install an entire OS into a container, LXD is more suitable, surely?

For my project, I wanted an automated and easy code-to-deployment setup, so here's what I have working: Bitbucket -> Private Docker Repository -> Tutum (https://www.tutum.co/) -> Live app

It's working pretty well. Once you check in your code, docker repository hooked to bitbucket will build the docker image and then you can start deployment from Tutum with one click.

The only truly compelling case I've seen for Docker is Amazon's ECS, which takes a cluster of EC2 machines and will automatically distribute containers among them wherever there is capacity, according to the declared resource needs of a given container. The ability to waste less of your EC2 resources is a very clear business win.

Everything else is still nice, but it's basically "dev environments suck less".
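To illustrate the "place containers wherever there is capacity" idea, here is a toy first-fit scheduler in Python. To be clear, this is not how ECS actually decides placement (I don't know its internals); it's just the simplest algorithm that shows why declared CPU/memory needs let you pack machines more tightly. All host names and numbers are invented.

```python
def place(containers, hosts):
    """First-fit placement by declared resource needs.

    containers: list of (name, cpu, mem) tuples.
    hosts: dict mapping host name -> (free_cpu, free_mem).
    Returns a dict mapping container name -> host name, or None if nothing fits.
    """
    free = {host: list(capacity) for host, capacity in hosts.items()}
    placement = {}
    for name, cpu, mem in containers:
        placement[name] = None
        for host, (free_cpu, free_mem) in free.items():
            if cpu <= free_cpu and mem <= free_mem:
                # Reserve the resources on the first host that has room.
                free[host][0] -= cpu
                free[host][1] -= mem
                placement[name] = host
                break
    return placement


if __name__ == "__main__":
    hosts = {"a": (1024, 2048), "b": (1024, 2048)}
    containers = [("web", 512, 1024), ("db", 1024, 1024), ("cache", 512, 512)]
    print(place(containers, hosts))
```

Here "web" and "cache" end up sharing host "a" while "db" takes host "b", so two machines carry three services without anyone assigning them by hand.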

FYI this is very similar to Mesos + Marathon. However, both feel way too verbose and painful to use. I'm very interested in seeing how Docker Swarm plays out.

This ecosystem is still so raw.

> However, both feel way too verbose and painful to use

Try Lattice: http://lattice.cf/

Disclaimer, I work for Pivotal, which developed Lattice based on Cloud Foundry components.

I've been using Docker to build a catalogue of similar types of software. Using Docker allows all the software to have the same interface which makes it easier to compare like-for-like. Here's the site - http://nucleotid.es/

Have you ever used provisioning software like Chef to prepare a server to run your software? Have you ever used that in conjunction with Vagrant in order to test out your provisioning and software deployment locally? Docker replaces (or can replace) all of that.

Docker does NOT replace configuration management tools like Chef, Puppet and Ansible. Those are still necessary for preparing the host machine which Docker containers will run on. Where Docker does alleviate/reallocate some things is in the configuration of the containers that run on those hosts. Instead of configuring the host for Ruby/Python/etc. you would move that configuration to your Dockerfile. But I think CM tools also have support for generating Dockerfiles, so there's that too.

> Chef, Puppet and Ansible. Those are still necessary for preparing the host machine

In many cases now, they are not. Docker containers can run on CoreOS, whose machines are designed to be configured entirely from a cloud-config file and organized in clusters.

With Deis for example, you can build and orchestrate your scalable web service in Docker containers without even writing a Dockerfile, or necessarily knowing anything about how the Docker container is built. The builder creates slugs with the necessary binaries to host your service, and you tell the container how to run itself with a (often one-line) Procfile.

I would still want chef scripts for my database server, but for things that can live in a container on a Fleet cluster, I most certainly do not use Chef, but I absolutely do get reproducible hands-off builds for my runtime environment, and without spending time individually preparing the host machines.

> Docker does NOT replace configuration management tools

You're right, for now, but hopefully not forever.

Docker, the company, has got an insane valuation, so it is trying to monetize anything and everything in sight to validate that valuation. But trying to give away tools to developers is an uphill battle on a good day, never mind selling to developers. So Docker is pursuing the end of the market that's likely to pay off, the data center, and that's why there's such a huge push behind putting Docker into production.

Now, my experience with current CM tools is that it's still easier to boot a VM and tinker with it to get it working 'normally' (ie, bash + editing config files directly), and then throw that VM away and play with my chosen CM tool to get it working there instead.

Dockerfiles do a good job of bridging that divide, out of the box. On a fresh machine, I install the base OS, and then install docker, and I'm up and running Dockerfiles, albeit with a cold (docker build) cache.

Hopefully the next generation of CM tools can blend both, so setting up a local target is as easy as 'docker build' and ongoing maintenance of deployed machines is as easy as 'puppet agent --onetime'.

I have not, as I am fairly new to this field but that makes sense in a way if it can minimize the use of other tools to just use the one tool.

Hi, you can have a look at Cloud 66 (http://www.cloud66.com/how-it-works), a full-stack container management service for production. Cloud 66 uses Docker technology and integrates 11 open source tools to cover end-to-end stack deployment on any cloud provider or on your own server, with one click.

You can compare different Docker solutions (http://www.cloud66.com/compare) and read how Cloud 66 used Docker (http://blog.cloud66.com/docker-in-production-from-a-to-z/). (Disclaimer: I do work at Cloud 66.)

This application uses it to spin up replicable instances of genome processing pipelines https://arvados.org

Do people use Docker in conjunction with Vagrant now? Or is Docker used as a replacement for Vagrant for a homogenous development environment?

We use vagrant on our dev machines to spin up a CoreOS cluster. Could use something like boot2docker but we prefer the dev environment to mirror production as close as possible.

(disclosure: I do work for Codeship :P )

There are a lot of really great reasons to use Docker, or any container technology for CI.

First off, containers give you a standard interface to a reproducible build. This means you can run something in a container and expect it to behave the same way as something a co-worker runs on their workstation, or something run in the staging or production environments. For CI this is an absolute necessity. Rather than running tests locally and expecting a CI server that closely tracks the production/staging environments to catch issues with different versions of the OS or libraries, you can expect any build that passes locally to also pass on CI. This cuts down on a lot of potential back and forth. The only shared dependency between CI/local/prod/staging is docker itself.

Another benefit is (almost) complete isolation. This means rather than having different vm images tracking different projects, you can have a single vm image with docker, and have each container running on the vm for any version of any build across your system. From a CI perspective you can abstract most of the complex configuration for your applications into "docker build -t myapp_test -f Dockerfile.test . && docker run myapp_test".

Containers use a differential filesystem, so N running containers for an application will take up 1 × the size of the container image + N × the average space of changes made in the running containers on top of that base image. This makes larger images highly space efficient without having to worry about different instances treading on the same folders.
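As a back-of-the-envelope check of that formula, here is a tiny Python sketch comparing layered storage with N full copies. The sizes in the example are made-up numbers, not measurements.

```python
def disk_usage_mb(image_mb: float, containers: int, avg_changes_mb: float) -> dict:
    """Compare a shared base image plus per-container diffs against N full copies."""
    layered = image_mb + containers * avg_changes_mb
    full_copies = containers * (image_mb + avg_changes_mb)
    return {"layered": layered, "full_copies": full_copies}


if __name__ == "__main__":
    # Hypothetical: an 800 MB image, 10 containers, 5 MB of changes each.
    print(disk_usage_mb(800, 10, 5))
```

With those numbers the layered layout needs 850 MB against 8050 MB for ten independent copies, and the gap grows with every extra container.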

The line between dev and ops blurs a little (devops), but responsibilities stay clear. Ops becomes responsible for maintaining the docker infrastructure, and dev is responsible for everything inside the container boundary: the container image, installed packages, code compilation, and how the containers interact. A container mantra is "no more 'well it worked on MY machine'". If it works for the dev, it really will work in prod.

Besides this, there are a number of benefits around speed, accessibility, debugging, standardization, the list goes on. There are also a ton of great and varied Docker CI solutions out there, from specific Docker based CI like us (Codeship), Shippable, Drone and CircleCI, to standard solutions like Jenkins via plugins. Many hosting solutions are supporting docker redeploy hooks for CI purposes. The standardized nature of containers makes it trivial for vendors to provide integrations. Even if you don't use docker yourselves, this is certainly a great space to watch.

Technically you can use docker for CI/CD without using it for deploying your app. When you do this you lose some of the benefits listed, but not all. You lose the cohesion between CI/local and prod, but you still gain a whole lot in terms of speed and complexity within your CI infrastructure.

Thomas Shaw did a great talk at Dockercon on introducing Docker to Demonware for CI across a variety of projects. I don't think the video is up yet, but it's well worth a watch if you're thinking of bringing it into your company. In the meantime we wrote a blog post on his talk: http://blog.codeship.com/dockercon-2015-using-docker-to-driv....

We are just starting a beta for our new CI flow which follows the container paradigm very closely. It allows you to build docker compose stacks for your various application images, and run your CI/CD pipeline locally, exactly as it would get run on our hosted platform.

If anyone is interested in joining our beta, just drop me an email: brendan at codeship.com.

I honestly don't know how much those things cost (I have heard some people say AWS is not cheap, but compared to buying your own hardware maybe all of this stuff is very cheap). The point of asking is, my company has not found a clear place to use Docker directly, but we do use it indirectly through the Deis project, and CoreOS.

My experience with Deis has been wonderful. If you ever looked at Heroku but got to the pricing page and didn't look any further, Deis has the same workflow (and much of the same stack, Cedar) as Heroku. The whole thing is built on docker containers, and designed with failover in mind.

I see that Codeship costs a fair amount of money on the higher end of usage; for the cost of a few months on their enterprise subscription, you could probably build your own CI cluster on Deis. CoreOS also targets AWS, and I don't have any idea what your AWS environment looks like, but you could likely build a Deis cluster on AWS just as easily as you could on your own hardware, if not easier.

I try not to think of Docker as an end so much as a means. For me, it doesn't even matter that it's using Docker under the hood, but if you have containerized your application, Deis can work on already built images just as easily as Heroku works on git repositories.

I can't use Heroku for serious things because it costs too much, and we're small potatoes. But I've got plenty of hardware lying around, and some slightly bigger iron that, if I'm being honest, is probably underutilized. I say that knowing that it hosts multiple kernels using virtualization, and the only reason those different tasks run on different machines is to keep them nominally isolated for increased maintainability.

Containerization is "virtualization lite." If I can take those services and jobs that all run on their own virtual machines and make them all run on Deis instead (or even just the ones that don't maintain any internal state of their own), I will gain a resource boost by not having to virtualize all of that separate virtual hardware and individual Linux kernels anymore. The marginal cost of another container is lower than a full virtual machine. If it fits into CI, the maintenance cost is lower too, because that's one less individual system that needs to get apt-get upgrade. If we were better at adopting things like chef, this might not be an argument, but for us it still is.

I inherited a lot of legacy stuff. Your situation might not be anything like mine. If you are already drinking the CI kool-aid, you might not honestly have much to gain from Docker that would compel you to invest time and effort into using it to host your apps.

If Deis looks a little complicated, you might check out Dokku. Your laptop probably doesn't have enough power to spin up a whole Deis cluster, but you can still get "almost like the Heroku experience" using Dokku, with Docker under the hood providing support. I'm not going to promise you that it will cut your AWS bills in half, but if you did drink the kool-aid, it might be worth checking out just how much of your currently required development infrastructure and outsourced hosting needs can go away when you add containerization to your developer toolbelt.

Well, not exactly a business case, but it's obviously a win for developers.

I shed a single tear when I realized I could just fire up flask + nginx + uwsgi within seconds after installing docker.
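For anyone who hasn't seen what sits at the bottom of a stack like that, here is a rough sketch of the kind of app a uwsgi container serves. It swaps Flask for a plain standard-library WSGI callable so it runs with nothing installed; the file name app.py and the response text are made up for illustration, not taken from any real setup.

```python
# app.py: a hypothetical stand-in for the Flask app mentioned above.
# Standard library only; the name and response body are illustrative assumptions.

def application(environ, start_response):
    """Minimal WSGI callable.

    'application' is the callable name uWSGI looks for by default.
    """
    body = b"Hello from a container\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # Report the status line and headers, then return the body as an iterable of bytes.
    start_response("200 OK", headers)
    return [body]
```

A Flask app object is itself a WSGI callable, so the real thing slots into the same place; nginx would then sit in front and proxy requests to uwsgi.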

From a business perspective, it's a little tricky. I guess it can help if you need to offer an onsite version of your SaaS app and the enterprise client has strict rules about being on site.

What would really make docker kickass is if they had a way to encrypt all the source code somehow and protect it.

Maybe you have looked already and it wasn't useful to you, but the Docker website has some pretty good marketing that explains its usefulness: https://www.docker.com/whatisdocker

Why Use Docker: "How does this help you build better software? When your app is in Docker containers, you don’t have to worry about setting up and maintaining different environments or different tooling for each language. Focus on creating new features, fixing issues and shipping software."

Business Case: "...With Docker, you can easily take copies of your live environment and run on any new endpoint running Docker..."

Yeah, I looked at the Docker website. I feel that Docker is super good at marketing and wanted to get some other opinions.

Here is CoreOS's opinion on Docker:

