Fig: Fast, isolated development environments using Docker (orchardup.github.io)
175 points by andrewgodwin on Jan 27, 2014 | 44 comments

I'm not involved with this project, but there is some confusion in this thread, so maybe I can share my point of view:

Docker is a tool to run processes with some isolation and, that's the big selling point, nicely packaged with "all" their dependencies as images.

To understand "all" their dependencies, think of the C dependencies of, e.g., a Python or Ruby app. That's not the kind of dependency that, e.g., virtualenv can solve properly. Think also of assets or configuration files.

So instead of running `./app.py` freshly downloaded from some Git <repo>, you would run `docker run <repo> ./app.py`. In the former case, you would need to take care of, say, the C dependencies. In the latter case, they are packaged in the image that Docker will download from <repo> prior to running the `./app.py` process in it. (Note that the two <repo> are not the same thing. One is a Git repo, the other is a Docker repo.)

So really, at this point, that's what Docker is about: running processes. Docker also offers quite a rich API for running those processes: shared volumes (directories) between containers (i.e. running images), port forwarding from the host to the container, log display, and so on.
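To make those options a bit more concrete, here is a minimal Python sketch that only builds the argument list for a `docker run` call; it does not invoke Docker. The helper name `docker_run_argv`, the image name `example/app` and the paths are made up for illustration. `-p` and `-v` are the port-publishing and volume flags of `docker run`.

```python
def docker_run_argv(image, command, ports=None, volumes=None):
    """Build the argv for `docker run` (hypothetical helper, nothing is executed)."""
    argv = ["docker", "run"]
    # -p HOST_PORT:CONTAINER_PORT publishes a container port on the host
    for host_port, container_port in (ports or {}).items():
        argv += ["-p", f"{host_port}:{container_port}"]
    # -v HOST_DIR:CONTAINER_DIR shares a host directory with the container
    for host_dir, container_dir in (volumes or {}).items():
        argv += ["-v", f"{host_dir}:{container_dir}"]
    return argv + [image] + list(command)


argv = docker_run_argv(
    "example/app",                    # assumed image name
    ["./app.py"],
    ports={8000: 8000},
    volumes={"/srv/data": "/data"},   # assumed paths
)
print(" ".join(argv))
# prints: docker run -p 8000:8000 -v /srv/data:/data example/app ./app.py
```

The resulting list could be handed to `subprocess.run` on a machine with Docker installed; the sketch stops short of that on purpose.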

But that's it: as of now, Docker remains at the process level. While it provides options to orchestrate multiple containers to create a single "app", it doesn't address the management of such a group of containers as a single entity.

And that's where tools such as Fig come in: talking about a group of containers as a single entity. Think "run an app" (i.e. "run an orchestrated cluster of containers") instead of "run a container".

Now I think that Fig falls short of that goal (I haven't played with it; that's just from a glance at its documentation). Abstracting over the command-line arguments of Docker by wrapping them in a YAML file is the easy part (i.e. launching a few containers). The hard part is managing the cluster the way Docker manages containers: displaying aggregated logs, replacing a particular container with a new version, moving a container to a different host (and thus abstracting the networking between different hosts), and so on.

This is not a negative critique of Fig. Many people are working on that problem. For instance I solve that very problem with ad-hoc bash scripts. Doing so we are just exploring the design space.

I believe that Docker itself will provide that next level in the future; it is just that people need the features quickly.


Docker -> processes

Fig (and certainly Docker in the future) -> clusters (or formations) of processes

So, simulate a cluster of machines instead of a single machine? Seems like a good thing. "Tiny datacenter in a box."

The question is whether it's close enough to production to be useful; any testing environment simulates some things well and others poorly. Load testing would be right out, I'd presume, but it might be useful for testing some machine failures.

"Tiny datacenter in a box."

That's been built into [Open]Solaris for years. You can define the network topology too.

I've only recently been playing around with Solaris derivatives, and am pretty impressed at how far it is with some of this stuff. My recent favorite is discovering 'ppriv', which lets you drop processes' privileges on a process-by-process basis without even starting up a new container/zone to encapsulate them. E.g. you can run a process with no network access or with no ability to fork, or with no ability to read/write files (or all of the above). Super-handy for running untrusted code as a stdin->stdout filter without worrying about it causing other mischief, and not having to encapsulate it in a zone/jail/container just to run one process.

FreeBSD's 'capsicum' [1] also looks promising at the OS API level as a similar initiative to write code with minimal privileges, but afaict you can't use it on the command line to run unmodified code with restricted privileges, at least not yet.

[1] http://www.cl.cam.ac.uk/research/security/capsicum/

Writing a command line wrapper should be relatively simple for capsicum. Designing the interface might need some work. I think the idea mainly has been to get code to sandbox itself but I can see a use case.

Yeah, for the base system that approach makes sense to me (build privilege-dropping into the code), but sometimes I just want to sandbox an existing binary. One recent example where it's come up is a student AI competition, where their submissions aren't supposed to do anything but read/write stdin/stdout, and it'd be nice to be able to enforce that externally by just lowering the process's privileges.

You can package all of your app's dependencies in one image and have a script in the container start them.

Yes you can but I don't think that's the philosophy behind Docker.

To expand on the "dependencies" idea of my previous post, although you technically can put a process supervisor, a web server, an application server, and a database in the same container, this is not the best practice. It makes your app simpler to distribute (a single image, no orchestration) but harder to evolve (e.g. move the database to its own physical server, or replicate it and put them behind a connection pooler).

For instance, if you have a tool to manage a cluster of containers, you will be able to manage the logs of the different processes/containers in a repeatable way.

But sure, if you know you don't need the added flexibility, you can put everything you want in the same image.

There are several use-cases for docker, and they could use different docker images and containers. Right now there's no easy way to distinguish all-in-one images from one-process images.

Seems like the "docker way" is the one-process image. But one use-case that I find entertaining is to use docker as a super simple way of trying out software. For example, I ran Wordpress for 10 minutes just to check it out. In that case it makes sense to have everything in one container, as it makes it much easier to run. But in production it might not be a good idea, especially if the app is not totally self-contained.

Great explanation, thanks for posting. It cleared up a few things for me.

Thank you that was very helpful.

I'd love to use this... but who has time to learn yet another configuration and provisioning management tool? I mean, I can make the time, and will, but since this is just another docker management tool, let's use this moment to pick on it a little bit.

What this needs is the ability to be pointed at a working VM (let's say, Ubuntu 13.10 server) and then just figure out what's different about it, compared to the distro release.

Something like the blueprint tool, in fact.

When I first looked at the Dockerfile format, my thought was, hey, another provisioning file format to learn. I guess you could just call chef/puppet/ansible/etc in your Dockerfile and call it a day though? I have not heavily used any of these tools so my perception of their overlap might be off.

I did exactly that with Puppet as an experiment: http://kartar.net/2013/12/building-puppet-apps-inside-docker.... Works reasonably well. I will say though that the Dockerfile syntax is, IMHO, much easier to use than Puppet/Chef/etc.

Disclaimer: I work at Docker and previously worked at Puppet Labs.

I've tried this and it is not that easy: docker containers are meant to run one and only one process. So, for instance, my puppet run started an upstart job, and this crashed the docker build process.

I'm a docker newbie, though.

The Dockerfile shouldn't run any processes that need to persist between 2 stages, as each stage will be created in a new instance. You can either do fancy one-liner bash scripts (my favorite) to configure simple service / app startup scripts, or include a configuration file from elsewhere using the `ADD` directive. The only non-ephemeral command should be the final `CMD` or `ENTRYPOINT`, as far as I can work out.
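A rough way to picture that layout, as a Python sketch that just renders Dockerfile text and builds nothing. The helper `render_dockerfile`, the nginx example and the file names are assumptions for illustration; `FROM`, `RUN`, `ADD` and `CMD` are real Dockerfile instructions.

```python
def render_dockerfile(base, run_steps, add_files, cmd):
    """Render a Dockerfile as text (hypothetical helper, nothing is built)."""
    lines = [f"FROM {base}"]
    # Each RUN has to finish; a process it starts does not survive into the next step
    lines += [f"RUN {step}" for step in run_steps]
    # ADD copies a file from the build context into the image
    lines += [f"ADD {src} {dest}" for src, dest in add_files]
    # Only the final CMD names the long-running process of the container
    lines.append(f"CMD {cmd}")
    return "\n".join(lines)


dockerfile = render_dockerfile(
    "ubuntu:13.10",
    run_steps=["apt-get update && apt-get install -y nginx"],
    add_files=[("nginx.conf", "/etc/nginx/nginx.conf")],   # assumed file names
    cmd='["nginx", "-g", "daemon off;"]',
)
print(dockerfile)
```

The point is only the shape: setup steps that terminate, configuration dropped in as files, and a single foreground process at the end.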

I too am just learning this stuff, but that should hopefully help you out!

I don't have a ton of experience with any of them, but fig.yaml looks dramatically simpler and easier to learn than Chef or Puppet. (It also solves a much narrower problem, but I think the point remains.)

What this needs is the ability to be pointed at a working VM (let's say, Ubuntu 13.10 server) and then just figure out what's different about it, compared to the distro release.

It sounds like you want Blueprint (https://github.com/devstructure/blueprint). But careful what you ask for... I found this to not actually be a very useful approach in practice.

> I found this to not actually be a very useful approach in practice.

Could you expand on this, please? I'm curious to know what the problems were (just so I know what I'm letting myself in for)

(Disclaimer: I haven't found any solution I really like... so the problem may be me.)

It turns out that installing and configuring services on a server touches many files and only some of them are important. Even the basic assumption that Ubuntu is the same everywhere wasn't quite right. Linode has some of its own packages installed and I think they tweaked the kernel. Running it in VMWare, you probably have the guest additions installed, etc. These things aren't important, but Blueprint doesn't know that. So I ended up with this massive number of changed files and the tooling for filtering through them to get just the important bits wasn't so hot (or at least it wasn't a year ago).
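The noise problem described above can be sketched in a few lines of Python. This is not how Blueprint works internally; it is a toy comparison of two made-up file snapshots (path mapped to a content hash), with a hypothetical `ignore_prefixes` filter standing in for the tooling that separates important changes from incidental ones.

```python
def changed_files(baseline, current, ignore_prefixes=()):
    """Return paths that were added or modified relative to a baseline.

    `baseline` and `current` map a file path to a content hash. Paths under
    any of `ignore_prefixes` are treated as noise and skipped.
    """
    changed = []
    for path, digest in sorted(current.items()):
        if path.startswith(tuple(ignore_prefixes)):
            continue
        if baseline.get(path) != digest:
            changed.append(path)
    return changed


# Made-up snapshots: a stock install versus a configured server
baseline = {"/etc/hosts": "a1", "/etc/nginx/nginx.conf": "b1"}
current = {
    "/etc/hosts": "a1",
    "/etc/nginx/nginx.conf": "b2",       # edited by hand: the part we care about
    "/etc/vmware-tools/config": "c1",    # added by hypervisor tooling: noise
}
print(changed_files(baseline, current))
# prints: ['/etc/nginx/nginx.conf', '/etc/vmware-tools/config']
print(changed_files(baseline, current, ignore_prefixes=["/etc/vmware-tools/"]))
# prints: ['/etc/nginx/nginx.conf']
```

The hard part in practice is knowing what belongs in the ignore list, which differs per provider and per hypervisor.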

I've been using fig on some side projects. It's incredibly exciting how easy it makes configuring what could be a quite involved development environment. Installing redis is a 3-line addition to a fig.yml file (https://github.com/orchardup/fig-rails-example/blob/master/f...). It also has amazing potential for an agnostic development environment across teams.

Is this meant to be a next-generation Vagrant? What advantages does it have over vagrant-lxc?

A few things:

- System configuration is managed for you using Docker (you don't need to figure out how to hook up Puppet/Chef/shell scripts)

- You can aggregate log output from all of your containers

- You can model your application as a collection of services - starting, stopping, scaling them etc

- You can ship exactly the same Docker image you use in development to production

More importantly, all this stuff works out of the box by default. Some of these things are possible with Vagrant, but you need to learn and piece together other tools (Puppet, Foreman, etc) to get it all working.
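For the log aggregation point, a minimal Python sketch of the idea (not Fig's actual implementation): merge already-sorted per-service log lines into a single stream and prefix each line with its service name. The service names and log lines are made up.

```python
import heapq


def aggregate_logs(logs_by_service):
    """Merge per-service log lines into one stream ordered by timestamp.

    `logs_by_service` maps a service name to a list of (timestamp, message)
    tuples that is already sorted by timestamp.
    """
    streams = [
        [(timestamp, name, message) for timestamp, message in lines]
        for name, lines in logs_by_service.items()
    ]
    # heapq.merge lazily interleaves the sorted streams
    for timestamp, name, message in heapq.merge(*streams):
        yield f"{name:<4} | {message}"


# Made-up log lines for two hypothetical services
logs = {
    "web": [(1, "listening on :8000"), (4, "GET / 200")],
    "db": [(2, "ready to accept connections"), (3, "connection from web")],
}
for line in aggregate_logs(logs):
    print(line)
```

A real tool would follow live output from each container rather than merge finished lists, but the interleave-and-prefix shape is the same.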

This is really cool! I am going to use it! :D

I don't think so. You might run fig as part of your Vagrantfile to bootstrap the rest of your stack.

Well, if you are on OSX these will run inside of a vagrant machine, since OSX isn't supported yet. (docker-osx runs vagrant for you)

Docker isn't supposed to replace vagrant; it's supposed to supplement it, at least on development machines.

There's also Boot2Docker which I've been using on OSX - https://github.com/steeve/boot2docker.

I just spent a few days dockerizing my development environment, and now I'm able to recreate the complete environment with one command. It took less than 100 lines of bash and Dockerfiles.

So I put those files in VCS so the next guy can just clone the repo, run `make devel` and get the app running, ready to code on.

So unless you want to use Docker for deployment, don't split the app into multiple containers: you get more moving parts to integrate and no gain. Instead, use supervisord and run all processes in a single container.

There are a few hacky parts (how to inject ssh keys into the container), but so far it's really cool.

I also wrote a few wrapper scripts around lxc-attach so I can run `./build/container/command.sh tail -n 20 -f /path/to/log/somewhere/on/container`.

I can't share any code, but I'm happy to answer questions at [HNusername]@gmail.com

I'm an employee of Docker, Inc. although my thoughts are my own, etc...

I must say that this is great. I've been advocating this sort of usage of Docker for a while as most still think of Docker or containers as individual units. I'm happy to see others adopting the viewpoint of using container groups.

However, it is something I do hope to eventually see supported within Docker itself.

Also, recently, I've been telling others how since October you could do this exact same thing using OpenStack Heat. Using Heat and Docker is similar to Fig, the configuration syntax is quite similar even, but it requires the heavy Heat service and an OpenStack cloud. That means that for most people, it isn't even an option. It's great that Fig now provides a solid lightweight alternative.

As 'thu' has said already, people want and need these features quickly and I expect in the next year we'll see serious interest growing around using these solutions and solving these problems.

Can anyone explain, in a nutshell, what features this tool provides beyond just using Docker itself?

In essence it's letting you store the options you'd pass to "docker run" in a configuration file as a collection of services. You can then start your whole application using one command instead of starting the individual containers and linking them up.
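Roughly, the translation looks like the Python sketch below. The field names in `SERVICES` are assumptions loosely modelled on the idea of a fig.yml, not Fig's actual schema, and the image names are made up. The flags `-d`, `--name`, `-p` and `--link` are real `docker run` options; nothing is executed here.

```python
# Hypothetical service description; field names are assumptions, not Fig's schema
SERVICES = {
    "db": {"image": "example/postgres"},
    "web": {
        "image": "example/web",
        "ports": ["8000:8000"],
        "links": ["db"],
    },
}


def run_command(name, service):
    """Translate one service entry into a `docker run` argv (nothing is executed)."""
    argv = ["docker", "run", "-d", "--name", name]
    for port in service.get("ports", []):
        argv += ["-p", port]
    for link in service.get("links", []):
        # --link NAME:ALIAS makes the named container reachable from this one
        argv += ["--link", f"{link}:{link}"]
    return argv + [service["image"]]


for name, service in SERVICES.items():
    print(" ".join(run_command(name, service)))
# prints:
# docker run -d --name db example/postgres
# docker run -d --name web -p 8000:8000 --link db:db example/web
```

So "start my database" boils down to looking up one entry and running the command it expands to.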

Services can also be controlled as if they were whole units – you can say "start my database" instead of "start this image with these ports, these volumes, etc".

I've been trying to learn Docker and one of the things that has tripped me up is figuring out how to connect the different pieces and the mechanics of a deployment. Like, I need some process to reliably set up the DB container, create a user account on the DB, import some SQL... and then run the Web container, link it to the DB container, feed it the DB credentials, and so on.
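One piece of that, the "DB first, then web" ordering, can be sketched in Python as a small dependency sort. This is only an illustration of the ordering problem under made-up service names; it says nothing about creating DB users or importing SQL, and it is not taken from Fig.

```python
def start_order(links):
    """Return service names so that every service comes after the ones it links to."""
    order, visiting = [], set()

    def visit(name):
        if name in order:
            return
        if name in visiting:
            raise ValueError(f"circular link involving {name!r}")
        visiting.add(name)
        for dependency in links.get(name, []):
            visit(dependency)
        visiting.discard(name)
        order.append(name)

    for name in links:
        visit(name)
    return order


# Hypothetical app: the web container links to a db and a cache container
links = {"web": ["db", "cache"], "db": [], "cache": []}
print(start_order(links))
# prints: ['db', 'cache', 'web']
```

Starting containers in that order means each `--link` target already exists by the time the dependent container is launched.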

I'm not sure fig totally solves this, but I'm pretty sure it's in the ballpark.

After a quick glance, I'd say fig.yml is the magic here. It orchestrates building and linking all the docker containers.

Looks interesting. I know that Docker is all about the single process model, but there are some images I've been meaning to play with that align themselves more with the single app (including dependencies) model (which fig also seems to attempt to solve).



Hi, I work at Phusion. The idea of these images is not to run both your app and your database in the same container. We agree that the Docker philosophy would be to run those in separate containers, and link them.

The idea is to make sure your app runs in an environment that's actually a fully functioning, valid operating system. This means that besides your app's process there are also the supporting processes of the operating system (ubuntu in this case).

This enables for example in-container build scripts and monitoring.

Good to know the use case, thanks. I would still have run the db in a different docker container. Just nice to have a bit more of the os available to your process.

I haven't actually deployed anything with docker yet, just used it on my local machine (so I could run neo4j in isolation). It's a slightly different way of thinking from how we've traditionally managed these things, so it's going to take a little time to find the best patterns to work with them. I'm sure some people have already figured it out; I just haven't had the time to dedicate to it yet.

This looks great. It looks a lot like what I've scraped together using a fabfile and Python dicts for configuration, but much more formal. I'm excited to try it out.

See also libvirt-sandbox (in Fedora for more than a year) which lets you use either KVM or LXC to sandbox apps:


So, why is this development only? What's missing to make it production ready?

Docker itself isn't production ready.

This is exactly what I have been working on for the past few days, just in a way nicer package. Think I will ditch my current work and use this instead. Thanks.

Can't help but think about the Framework Interoperability Group whenever I read FIG. This looks awesome though.

How does Docker handle ABI incompatibility?

For example, EC2 disabled some of their extended instruction sets to ensure uniformity, but I am not sure how long this will last. Then we will have to deal with Docker deployment problems.

I propose we dig deep into our Gentoo roots and build the dependencies on demand.

Can someone explain to me the application of Fig and Docker? Also, how do they differ?

One application I thought of is deploying to a client. You just get them to use the instance and there's zero configuration needed. But then, if you need to make updates to the code base, how do you push the code changes to all the deployed fig/docker instances that are already running?

Ideally you run a new build and test it before deploying. If you wanted to, you could include a git pull in your startup script or run the pull over ssh in the container, although you would need a trick to load the environment vars passed by docker.
