Docker at Lyst (lyst.com)
91 points by Peroni on Dec 8, 2014 | 23 comments

I'll repost my question from the blog here (And thanks again Lyst team!):

Really enjoyed the write-up, Lyst team! Thanks for the info.

As a member of another large-ish Python dev shop with a number of dependencies, one of the things that has kept us from moving fully in the direction of Docker is total deployment time. Since Docker follows a 1-process/1-container model and we run a number of processes on a single host, the total size of our images and their dependencies can be large. This means the network transfer into a newly launched host would theoretically be very large. How did Lyst deal with that? Or was the flexibility and slight slowness just preferable to something like shipping pre-built AMIs?

We (hi, I work at Lyst too!) see Docker images as fairly similar to pre-built AMIs conceptually, just a lot more lightweight. We bake our Docker host AMIs relatively regularly and use them in the ASG that runs the Dockerized website. Building AMIs (and booting new instances from them on each deploy) takes a lot longer than building a new Docker image and launching containers from it.

I'm not entirely sure I understand what you're asking though. We've so far only Dockerized the main Python application, and not many of the supporting applications. We don't think we'd go down the route of running Postgres, ElasticSearch, Redis etc under Docker.

To give you an example of times taken, Pull Requests take about 10 minutes to test, a full master build (including uploading build artifacts) takes around 15 minutes, and a deployment to 8 Dockerized web hosts is about 5 minutes, including a smoke test on each host.

EDIT: Just realised what you're asking. If you're running many processes on each host then yes, it can take a while to deploy. We deploy our Celery workers in separate Docker containers, and these can take a while to stop and start. For the website, though, we run uWSGI in forked mode, so it's only one container per host. This will probably change in the future: we're evaluating Mesos + Marathon for a more PaaS-like deployment where we can say "14 web processes please!"

Hope that answers your question.

Thanks so much for the answer!

"Dockerfiles are great for simple cases but need a lot of external help to use in more complex situations. Even simple variable expansion would be helpful."

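For what it's worth, here is a minimal sketch of the kind of variable expansion the quote asks for, done outside Docker: render the Dockerfile from a template with Python's `string.Template` before building. The template text, the variable names (`BASE_IMAGE`, `APP_ENV`) and the `render_dockerfile` helper are all made up for illustration; nothing here is taken from the Lyst write-up.

```python
from string import Template

# Hypothetical template; $BASE_IMAGE and $APP_ENV are illustrative placeholders.
DOCKERFILE_TEMPLATE = """\
FROM $BASE_IMAGE
ENV APP_ENV $APP_ENV
COPY . /app
CMD ["python", "/app/main.py"]
"""


def render_dockerfile(template: str, variables: dict) -> str:
    """Expand $NAME placeholders; raises KeyError if a variable is missing."""
    return Template(template).substitute(variables)


if __name__ == "__main__":
    rendered = render_dockerfile(
        DOCKERFILE_TEMPLATE,
        {"BASE_IMAGE": "python:2.7", "APP_ENV": "staging"},
    )
    print(rendered)
```

The rendered text would be written out as a Dockerfile and built as usual. One caveat, assuming the usual layer-cache behaviour: changing an expanded value changes that instruction, so the cache is invalidated from that line onwards.
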
Interesting that many who use docker "for real" have the same problems we did. This is why we ended up building this:


There's a long thread on this here:


Here's a nice way to do Docker builds with Ansible, which allows using the template engine and all of that good stuff:


(Disclaimer: wrote Ansible)

I attempted to do something like this when figuring out our build problems but it didn't really work for me. It doesn't solve the problem of actually getting configuration into the build very well because it still relies on Dockerfiles and has the added side effect of making the cache mostly useless. Maybe I was just doing it wrong?

I also attempted to try Ansible for deploying containers and had issues with both of these bugs:

https://github.com/ansible/ansible-modules-core/issues/27 https://github.com/ansible/ansible-modules-core/issues/188

Do people actually use Ansible to control Docker in practice? It would be great to have something that was a step up from building random Python scripts from scratch :)

(Disclaimer: I'm not very good at Ansible.)

Our experience was that Dockerfiles were almost completely useless except for the most trivial of cases. We effectively ditched them, spitting out Dockerfiles that implant the ShutIt functionality and no more, e.g.:

  curl https://raw.githubusercontent.com/ianmiell/shutit/master/library/osquery/dockerfile/Dockerfile | docker build --no-cache -

For my purposes, I wrote a makefile which starts up a container with ssh, then provisions it with ansible (as you would with any normal remote host).

This meant a) I could reuse existing scripts that provision a normal server, and b) Docker could be used to test that the Ansible scripts were actually capturing everything required.

The cache point was a bit of a pain though, I confess.

Very nice, Michael!

As a longtime Ansible user who's just starting out dockerizing our apps, I find that there's significant refactoring needed when trying to reuse existing playbooks & roles to provision Docker containers.

In the current model we have a single playbook driven by an EC2 dynamic inventory, and each server definition under "roles/" (with some reuse between them, obviously). All in single repo.

With Docker it seems like this layout needs to be turned completely on its head, because a Dockerfile (unlike an Ansible playbook) can only define a single container type, and to make things even harder there can only be one per directory (something similar to "make -f" would have been useful)

So, the logical directory layout in this situation is probably one directory + Dockerfile per container, with a corresponding Ansible playbook + role in each. That's awful for code reuse and quite a bit of refactoring though.

Has anyone managed to do it more elegantly?

Thanks! I may use those as templates to improve this:


Hello Lysters from the side of the square to your left! Can you give an example of what you replace in your Dockerfiles using Makefiles? We haven't needed to modify ours dynamically.

waves. We don't change anything in the Dockerfiles, it's more that we use the dependency logic in Make to ensure that x is done before y and before z. We also use it to actually run a lot of our containers with the ability to switch in configs for production-like databases if the developer needs it.

Make was the perfect tool for the task at the beginning, but it's getting a little unwieldy and will probably need rewriting in something else soon, probably Python.
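
As a rough idea of what a Make-to-Python rewrite might start from, here is a sketch of the "x before y before z" ordering using `graphlib.TopologicalSorter` from the standard library (Python 3.9+). The step names and the `build_order` / `run_in_order` helpers are hypothetical, not Lyst's actual targets.

```python
from graphlib import TopologicalSorter

# Hypothetical build steps: each key maps to the set of steps that must finish first.
STEPS = {
    "build_image": set(),
    "run_tests": {"build_image"},
    "push_image": {"run_tests"},
    "deploy": {"push_image"},
}


def build_order(steps: dict) -> list:
    """Return step names so that every prerequisite comes before its dependents."""
    return list(TopologicalSorter(steps).static_order())


def run_in_order(steps: dict, actions: dict) -> list:
    """Call the action for each step in dependency order; return the order used."""
    order = build_order(steps)
    for name in order:
        actions[name]()
    return order


if __name__ == "__main__":
    log = []
    # Stand-in actions that only record their own name.
    actions = {name: (lambda n=name: log.append(n)) for name in STEPS}
    run_in_order(STEPS, actions)
    print(log)
```

Unlike Make, this sketch always runs every step; it has no notion of up-to-date targets, so skipping unchanged work would have to be added separately.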

It is amazing how many in this community are able to just dismiss this with "Oh yeah, boot2docker performance is still awful." If Docker were a product by Google or Microsoft, we'd have pitchforks out by this point. Yet somehow Docker is still able to skate by on having incredibly subpar Mac OS X performance.

"How Docker Got The Apple Community To Love It, Despite Horrible Performance" <- will someone please write that post, since that's what I really want to know.

If I understand correctly, Docker's performance on OS X is a function of running in a Linux virtual machine, because Docker requires Linux. Is there something about Docker that you feel like inhibits performance beyond the VM itself?

One area of possible improvement is that boot2docker's default mechanism for sharing volumes with the Mac is VirtualBox Guest Additions, which are notoriously slow. As for performance, to my knowledge it's simply a matter of the underlying VM config.

tldr: boot2docker should expose ways to leverage faster vm configurations.

I'd love for it to work over NFS. It should be easily doable, but I guess it means adding another dependency to the boot2docker ISO.

We're mostly looking at supporting remote volume sync in docker itself, with a pluggable transport - could be smb, nfs, periodic rsync, sftp... bradfitz started an awesome contribution with a full remote-fuse-protocol sync :) pretty awesome but we haven't been able to merge it yet because the volume system needs to be made sufficiently clean and modular first.

The advantage of doing it in Docker proper is that it works in all distros without outside dependencies, instead of just boot2docker.

Note: we are looking for help on this, if you feel willing and able. Even better if you can invest some recurring time to maintain and help others contribute.

It really has been one of the worst parts of deploying Docker in development, actually. I would have gone into more depth on that but every error feels like a personal suffering. We have quite a few developers, and many are off site and have odd hardware. You'd think a VM would hide this, but integrating with VirtualBox has an ongoing and non-trivial internal support cost for us.

I gave up on boot2docker, and just use an Ubuntu image with docker as my virtual dev environment.

Even with sub-par performance in development, that doesn't mean it's the same in production... your local desktop isn't going to run a full database instance with equal throughput to the dedicated cluster of servers either, so we should all abandon cassandra (or whatever) too?

Do you run OSX in production?

No, but 99% of our devs run MacBooks.

Are Lyst and Lyft friends?

We haven't yet had the pleasure.
