Really enjoyed the write-up, Lyst team! Thanks for the info.
As a member of another large-ish Python dev shop with a number of dependencies, one of the things that has kept us from moving fully in the direction of Docker is total deployment time. Since Docker is a 1-process/1-container model, and we run a number of processes on a single host, our images end up carrying large dependency sets. That means the network transfer into a newly launched host would, in theory, be very large. How did Lyst deal with that? Or was the flexibility plus a slight slowdown simply preferable to something like shipping pre-built AMIs?
I'm not entirely sure I understand what you're asking though. We've so far only Dockerized the main Python application, and not many of the supporting applications. We don't think we'd go down the route of running Postgres, ElasticSearch, Redis etc under Docker.
To give you an example of times taken, Pull Requests take about 10 minutes to test, a full master build (including uploading build artifacts) takes around 15 minutes, and a deployment to 8 Dockerized web hosts is about 5 minutes, including a smoke test on each host.
EDIT: Just realised what you're asking. If you're running many processes on each host then yes, it can take a while to deploy. We deploy our Celery workers in separate Docker containers, and stopping and starting those can take a while. For the website, though, we run uWSGI in forked mode, so it's only one container per host. This will probably change in the future; we're evaluating Mesos + Marathon for a more PaaS-like deployment where we can say "14 web processes please!"
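For the curious, "14 web processes please" in Marathon terms is roughly an app definition like the one below. This is only a sketch; the id, image name, port, and resource figures are entirely made up:

```json
{
  "id": "/web",
  "instances": 14,
  "cpus": 0.5,
  "mem": 512,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "example-registry/web:latest",
      "network": "BRIDGE",
      "portMappings": [{"containerPort": 8000, "hostPort": 0}]
    }
  }
}
```

Scaling up or down is then just a matter of changing "instances" and letting Marathon start or stop containers to match.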
Hope that answers your question.
Interesting that many who use docker "for real" have the same problems we did. This is why we ended up building this:
There's a long thread on this here:
(Disclaimer: wrote Ansible)
I also tried using Ansible to deploy containers and ran into both of these bugs:
Do people actually use Ansible to control Docker in practice? It would be great to have something that was a step up from building random Python scripts from scratch :)
(Disclaimer: I'm not very good at Ansible.)
curl https://raw.githubusercontent.com/ianmiell/shutit/master/library/osquery/dockerfile/Dockerfile | docker build --no-cache -
This meant that a) I could reuse existing scripts to provision a normal server, and b) Docker could be used to test that the Ansible scripts were actually capturing everything required.
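That reuse pattern can be sketched roughly like this; the base image, package name, and playbook path are just placeholders:

```dockerfile
# Hypothetical sketch: bake an existing Ansible playbook into an image
# so the same provisioning scripts work for servers and containers.
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y ansible
COPY playbook.yml /tmp/playbook.yml
# Run the playbook against the container itself via the local connection.
RUN ansible-playbook -i "localhost," -c local /tmp/playbook.yml
```

Building with --no-cache, as in the command above, forces the playbook to actually run each time rather than being served from a cached layer.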
The cache point was a bit of a pain though, I confess.
As a longtime Ansible user who's just starting out dockerizing our apps, I find that there's significant refactoring needed when trying to reuse existing playbooks & roles to provision Docker containers.
In the current model we have a single playbook driven by an EC2 dynamic inventory, with each server definition under "roles/" (with some reuse between them, obviously). All in a single repo.
With Docker it seems like this layout needs to be turned completely on its head: a Dockerfile (unlike an Ansible playbook) can only define a single container type, and to make things even harder there can only be one per directory (something similar to "make -f" would have been useful).
So the logical directory layout in this situation is probably one directory + Dockerfile per container, with a corresponding Ansible playbook + role in each. That's awful for code reuse, though, and means quite a bit of refactoring.
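One possible layout, with purely illustrative names, might keep shared roles at the top level and copy or symlink them into each build context:

```
containers/
  web/
    Dockerfile
    playbook.yml
  worker/
    Dockerfile
    playbook.yml
roles/            # shared Ansible roles, pulled into each build context
```

Not elegant, but it at least keeps the shared roles in one place instead of duplicating them per container.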
Has anyone managed to do it more elegantly?
Make was the perfect tool for the task at the beginning, but it's getting a little unwieldy and will probably need rewriting in something else soon, most likely Python.
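A rough sketch of what that Python rewrite might look like, assuming a one-directory-per-image layout; the containers/ paths, image tags, and build_command helper are all hypothetical:

```python
#!/usr/bin/env python
"""Sketch of a Python replacement for a Makefile that builds Docker images."""
import subprocess


def build_command(tag, path):
    # Construct the `docker build` invocation for one image directory.
    return ["docker", "build", "-t", tag, path]


def build_all(images):
    # images maps an image tag to the directory containing its Dockerfile.
    for tag, path in sorted(images.items()):
        subprocess.check_call(build_command(tag, path))


if __name__ == "__main__":
    build_all({
        "myapp/web": "containers/web",
        "myapp/worker": "containers/worker",
    })
```

From here it's easy to add the things Make struggles with, like parallel builds or per-image push logic.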
"How Docker Got The Apple Community To Love It, Despite Horrible Performance" <- will someone please write that post, since that's what I really want to know.
tl;dr: boot2docker should expose ways to leverage faster VM configurations.
The advantage of doing it in Docker proper is that it works in all distros without outside dependencies, instead of just boot2docker.
Note: we are looking for help on this, if you feel willing and able. Even better if you can invest some recurring time to maintain and help others contribute.
Even with sub-par performance in development, that doesn't mean it's the same in production. Your local desktop isn't going to run a full database instance with the same throughput as a dedicated cluster of servers either, so should we all abandon Cassandra (or whatever) too?