Docker interactive tutorial (docker.io)
192 points by dhrp on Aug 16, 2013 | 47 comments

I never understood what docker is or who is supposed to use it. Can somebody enlighten me?

Regarding who is supposed to use it... If you like deploying your apps to a Platform-as-a-Service (PaaS) like Heroku, Google AppEngine, or DotCloud, but want to run them on your own infrastructure, then Docker could be for you!

It's designed to be a key building block for anyone who wants to create their own PaaS-like service or environment. Why would you need your own PaaS? You might have unique security requirements forcing you to run an app in-house, or you might not want to pay the high prices of a commercial PaaS if your app gets popular.

So I'm used to configuring stuff by hand, and now I see a lot of buzz around docker, vagrant, chef, puppet, salt and they seem like glorified shell scripts. Am I missing something?

Vagrant is a little bit different - it takes the process of booting up a new VM instance (think VirtualBox/VMWare) and automates much of the process of:

1. Choosing OS / flavor / architecture (Debian 7 x64),

2. Setting up networking (private network with its own IP address),

3. Setting up SSH access ($ vagrant ssh),

4. Kicking off provisioning (Chef/Puppet/Salt/Ansible/Bash)

With Vagrant, if you're playing around in your VM and accidentally screw something up and can't recover [0], you simply "$ vagrant destroy" and "$ vagrant up" and you'll have a working VM again.

Couple Vagrant with Puppet, and if you destroy/up you'll have your VM go through the process of installing all software and settings and everything for you, meaning your full, working environment is back up and running within minutes, just as it was before you screwed it up.

[0] I experienced this several times when my apt-get stuff would screw up and I couldn't install/remove/purge anything. Google and #debian never let me recover.

It's basically what you get when you start your deployment toolchain as simple shell scripts, then make them evolve, add features, and at some point, rewrite from scratch with a "real" programming language. I don't know if that matches your definition of "glorified" though :-)

I'm reading this python example (http://docs.docker.io/en/latest/examples/python_web_app/) and it's a wrapper for running bash scripts with arcane syntax layered on top.

These tools let you focus on the bigger picture instead of the nitty-gritty.

Docker and Vagrant are convenient for building (and distributing) consistent environments on top of (potentially wildly) different software or hardware platforms. Docker has a ton of other tricks up its sleeve (one example is "layers:" http://docs.docker.io/en/latest/terms/layer/).

Chef/Puppet/Salt/et al. are configuration management tools which enable programmatic definition of infrastructure. The key point is: you define how the system should look, not the specific steps on how to get there. Abstraction layers (management of files/users/packages/services/etc.) provided by the CM tool shield you from a nightmare of permutations, corner cases, and platform-specific options; it'll just enforce a given configuration regardless of local changes.
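
To make the "define how the system should look, not the steps" idea concrete, here is a minimal Python sketch. It is not Chef, Puppet, or Salt code: the System class and the converge function are made-up names for illustration, and a real tool handles far more resource types and platforms.

```python
from dataclasses import dataclass, field


@dataclass
class System:
    """Toy stand-in for a machine (hypothetical, for illustration only)."""
    packages: set = field(default_factory=set)
    services: set = field(default_factory=set)


def converge(system, packages, services):
    """Bring `system` to the desired state and return the actions taken.

    Only the gap between current and desired state is acted on, so a
    second run with nothing changed performs no actions.
    """
    actions = []
    for pkg in sorted(packages - system.packages):
        system.packages.add(pkg)
        actions.append(f"install {pkg}")
    for svc in sorted(services - system.services):
        system.services.add(svc)
        actions.append(f"start {svc}")
    return actions


box = System(packages={"openssl"})
wanted = dict(packages={"nginx", "openssl"}, services={"nginx"})

first = converge(box, **wanted)    # installs and starts what is missing
second = converge(box, **wanted)   # nothing left to do
box.services.discard("nginx")      # simulate a local change (drift)
third = converge(box, **wanted)    # drift is corrected

print(first, second, third)
```

The same desired state is applied every time; what differs between runs is only how far the machine has drifted from it.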

Having a central source of truth for systems configuration is a gamechanger in itself; your configuration directives or applications can query (or update) your configuration management database, which enables some very cool automation with very little effort. Then there's the community: for any given stack, there's probably a well-documented, well-tested Chef cookbook or Puppet manifest to build it, instantly plugging you into a rich experience base.

Yes, one could script all of this from scratch, but I'm not sure why one would.

The big difference with just scripts is that you can "freeze" the state of your deployment with strong guarantees that 1) it will not change and 2) it will behave the same way across machines.

Another advantage of containers is that deployment can be made atomic: either container A is running, or container B. You can't end up with a half-finished upgrade which leaves your server in an undefined, broken state. That property becomes very important when you deploy to a large number of servers.
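
As a rough analogy for the "either A or B, never half of each" property, here is a small Python sketch that flips a "current" symlink between two release directories in a single rename. This is not how Docker switches containers; the directory names and the activate helper are assumptions made up for the example, and it relies on POSIX rename semantics.

```python
import os
import tempfile


def activate(release_dir, current_link):
    """Point `current_link` at `release_dir` in one atomic step."""
    tmp_link = current_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(release_dir, tmp_link)
    # Renaming over the old link replaces it atomically on POSIX
    # filesystems: readers see the old target or the new one.
    os.replace(tmp_link, current_link)


root = tempfile.mkdtemp()          # scratch directory for the demo
release_a = os.path.join(root, "release-a")
release_b = os.path.join(root, "release-b")
os.mkdir(release_a)
os.mkdir(release_b)
current = os.path.join(root, "current")

activate(release_a, current)
before = os.readlink(current)
activate(release_b, current)
after = os.readlink(current)

print(before.endswith("release-a"), after.endswith("release-b"))
```

At no point does "current" refer to a partially updated release, which is the property described above.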

Docker is a way to bundle an application inside a filesystem and then run the application inside of an LXC container.

Containers are an isolation mechanism. They aren't virtual machines. More of a distant cousin to chroot or a jail.

The target audiences are developers and operations. It's a packaging/deployment tool.

It depends. Do you mind me asking, what is your background/profession? And have you deployed a virtual machine before to run an application?

I'm a coder and owner of a small SaaS company.

What I do to run applications is this: I fire up a vm or dedicated server on some provider and run my stuff on it. I use Amazon and a couple of other providers.

Ok. That's great. Then let me try to give you some examples #:

Imagine you have set up your SaaS to run from some containers (1 container with your web app, 1 container with your worker, 1 container with your queue) and a database somewhere.

Now let's walk through a scenario for a significant new release of your web app:

1) Package your new web app, launch it for testing (on the same host, cheap) to point to a testing database.

2) Fails? Rebuild, test immediately.

3) Happy? Now relaunch your container to connect to the production database.

4) Everything works completely? Now re-route your traffic to the already warmed-up container. Chances of failure? < 0.1%

Some other ideas:

- Package your worker. Run it once (on the same host). More load? Run it multiple times, or run it on multiple servers. It is so much quicker and cheaper than spinning up virtual machines.

- So your developer made some changes. He packages it and you run it. It fails. You now just save the entire container, including the last state, logs and everything exactly as you crashed it, and hand that back to him.

Hope it helps.

These examples are based on intended use, because right now the whole development is still moving so fast that production deployments are not yet recommended.

Sorry, I don't even understand your first sentence. "Imagine you have setup your SaaS to run from some containers". What is a container? What is a "Worker"? What do you mean by "queue"?

Then you say "Package your new web app, Launch it for testing". I never package my web app. It just runs and runs and runs. And my customers use it. I develop it on another machine, and from time to time I push updates from the development machine to the production machine. Everything seems fine to me. Am I having a problem I don't know about?

A worker is usually something that does background processing. For example a user uploads an image and you need to convert it to multiple sizes. You can either do this on your application server within the scope of the request, or set up a task queue.

Here's an oversimplification of the latter: You have an application server, a queue (something like 0MQ, or redis) and one or multiple workers. When the image is posted, you add a job to the queue, asking for the image to be processed. The worker polls the queue asking if there are any jobs, and if there are, it executes those.
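
A minimal sketch of that pattern in Python, using only the standard library. An in-process queue.Queue stands in for something like redis, a thread stands in for a separate worker process, and the "resize" step is a placeholder string rather than real image processing; all names here are made up for the example.

```python
import queue
import threading

jobs = queue.Queue()
results = []


def worker():
    """Pull jobs off the queue until a None sentinel arrives."""
    while True:
        job = jobs.get()           # blocks until a job is available
        if job is None:
            jobs.task_done()
            break
        image, sizes = job
        # Placeholder for the real image conversion work.
        results.extend(f"{image}@{size}px" for size in sizes)
        jobs.task_done()


thread = threading.Thread(target=worker)
thread.start()

# The "application server" side: enqueue the job and return immediately.
jobs.put(("cat.jpg", [64, 256]))
jobs.put(("dog.jpg", [64]))
jobs.put(None)                     # tell the worker to stop

jobs.join()
thread.join()
print(results)
```

Here the worker blocks on the queue instead of polling it on a timer, but the division of labour is the same: the request handler only enqueues, and the slow work happens elsewhere.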

> I never package my web app. It just runs and runs and runs. And my customers use it.

So how do you deploy a new version of your app?

> So how do you deploy a new version of your app?

I push the new version to the server.

Honestly, playing with this tutorial for 5 minutes answered that question for me.

There is a lot of good, introductory material (slideshow, white paper, summary) here: https://www.docker.io/about/

The creator here. Please let me know if you have any questions or comments!

Hi dhrp, I assume you work for Dotcloud?

Nice job so far. Where do you see this heading to? What's your view of what will happen in the virtualization world?


Hi. Thanks! My personal take on where Docker is heading to? There are so many things people can do with this, it's hard to summarize.

What I personally care most about? As a designer ex-entrepreneur and front-end developer, the thing that gets me going most is the idea that I'm able to "just run" an application. No more difficult than from the Mac store. For example Trac (a wiki system), Wordpress, Django apps, Mailservers, torrent-servers. Basic stuff which just makes it easier for me to deploy my creations, and those of others.

Absolutely love docker, hope to see it mature even more :)

I am currently playing around with it and building a messaging platform playground. One "pain" so far is that docker's IPAddress assignment is not very flexible. Will it be possible to assign IP addresses to containers (e.g. from "docker run")? Or have a better control what IPAddresses are used (like giving a network range on docker -d)?

If i am not mistaken docker saves changes in containers through aufs and keeps those changes as separate images on disk, right? I'm currently working with containers which keep their state on the host OS (by mount bindings) and thus, i don't want to keep old images of not-running containers. Will there be some switch to disable that or clean up old ones? Maybe i am misinterpreting something, but i'm new to docker ;)

Anyway, keep up the great work, i am very impressed with docker, kudos!!

Currently using the unionize.sh script works very well for me.


You have to run ./unionize <bridge> <container sha1> <ip address> after starting the container, but that brings up a new interface inside the container with that IP and connects it to the bridge.

This is useful for having private IPs between containers of an application, for accessing databases or similar.

I was starting with unionize as well, but the fact is that docker looks up available IP addresses on its own. So you can give docker -d the -b parameter and pass an existing bridge and it will go through that bridge's IP space and assign IP addresses already. Also that way the IP address shows in "docker inspect" which it doesn't with unionize.sh (i think).

The problem here is that the built-in IP Address allocator is rather stupid and doesn't even try to ping an address before assigning it. I got it to interfere with my network heavily when it assigned my gateway's IP Address to a container ;)

It'd be nice if the whole IP Address allocation was more pluggable or configurable. Right now it's some code deeply tied into the whole system (i think) and i fear i don't have the Go skills to change that myself :(

(for example, i think i would have been able to write a little bit of Go to assign IP Addresses the way i want to, if the system would be more pluggable)
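
To illustrate what "pluggable" could mean here, a small Python sketch of an allocator that asks a caller-supplied check before handing out an address. This is not Docker's allocator or its API: the allocate function, the is_free callback, and the 10.0.3.0/29 range are all assumptions for the example. A real check might ping the address or consult a lease table.

```python
import ipaddress


def allocate(network, is_free):
    """Return the first host address in `network` that `is_free` accepts."""
    for host in ipaddress.ip_network(network).hosts():
        candidate = str(host)
        if is_free(candidate):
            return candidate
    raise RuntimeError(f"no free addresses left in {network}")


reserved = {"10.0.3.1"}            # e.g. the gateway's address


def not_reserved(address):
    """Hypothetical policy: anything not already reserved is free."""
    return address not in reserved


first = allocate("10.0.3.0/29", not_reserved)
reserved.add(first)
second = allocate("10.0.3.0/29", not_reserved)

print(first, second)
```

Because the policy is passed in as a function, swapping "not in my reserved set" for "does not answer a ping" would not require touching the allocation loop.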

Oh, and yes, I do work for dotCloud.

What kind of security issues/misconfigurations do people typically have/need to watch out for? How does this compare to alternatives?

Haha, I had a good laugh upon hearing that Docker cannot run inside of itself. Is this an LXC limitation? Clearly the solution was to run Docker inside of KVM inside of Docker.

But seriously, nice job. I haven't used docker yet because I want to play around with the standard lxc utilities first. But this is pretty awesome.

It's not an LXC limitation, but since this use case hasn't gotten much love, it involves jimmying things all over the place.


Thank you for posting this answer. Indeed, I think the short answer is: It is not impossible, but there are limitations.

Actually, I've run Docker within QEMU within Docker (using v9fs so that QEMU could use the container's FS as root FS). Works, but painfully slow and not very resource-efficient :-)

I will add that Docker is getting an architecture upgrade, and in the future will support nesting :)

Yeah, I was kidding about Docker in KVM in Docker. I know it would be a lot of overhead.

Would there be a useful reason to actually do this (run Docker in Docker), or is it more just a novelty?

If anything, it's needed for the development of docker itself. We already build docker with docker (https://github.com/dotcloud/docker/blob/master/Dockerfile), but we can't yet test docker with docker because of the nesting problem.

I'd like to give each tenant a container and let them run Docker app containers within that.

8 steps, 15-20 minutes, and you'll get an initial understanding of a much better way to do devops/deployment, including scalability. Thanks Docker!

I've been playing a bit with Docker and found I couldn't get the kernel option to work (the linux-aufs_friendly kernel just wouldn't work on my Arch setup). However, that forced me to the option of using vagrant to set up a vm with docker configured and I found that great - I recommend this option over playing with your kernel.

I'm a long time virtualbox user but had never played with headless vms and never realized how easy that is with vagrant. Additionally, using vagrant to get a coreos vm running, with docker all set up, was pretty cool. So far I'm finding the payload of the vms rather heavy (with the os overhead), but I haven't really got down to setting up individual docker app containers. I'm looking forward to that, and even more to what could lie ahead for this space: vagrant, coreos, docker, chef/puppet all look to be making for a very promising convergence.

Probably it won't help you, but Docker is working on my Arch with linux-aufs_friendly kernel.

I suspect it must have something to do with my set up. My first suspicion was my nvidia drivers (304xx) but I recall not even being able to boot in console mode. I'll give it another shot later but for the moment vagrant's doing a fine job putting it all together.

Enjoyed it, more would be welcome, only if you have more time to give of course.

ps: especially on committing and layering.

Yeah. The committing and layering stuff is not the easiest to grasp. There is actually a section on the docs that does a decent job explaining these concepts but it is a bit hidden: http://docs.docker.io/en/latest/terms/
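
For a rough mental model of layers and commits, here is a toy Python sketch using collections.ChainMap. It is only an analogy, not how aufs or Docker store anything: paths are plain dictionary keys, file contents are strings, and deletions are ignored entirely.

```python
from collections import ChainMap

# A read-only base "image": path -> contents.
base = {"/bin/sh": "shell v1", "/etc/motd": "hello"}

# A "container": an empty writable layer stacked on top of the base.
container = ChainMap({}, base)

# Writes land in the top layer only; the base is never modified.
container["/etc/motd"] = "patched"
container["/app/run.py"] = "print('hi')"

# "Commit": freeze the top layer so it can be reused as a new image layer.
committed = dict(container.maps[0])

# A new container stacked on the committed layer and the base.
child = ChainMap({}, committed, base)

print(base["/etc/motd"], child["/etc/motd"], sorted(committed))
```

Lookups walk the layers from top to bottom, so the child sees the patched file while the base image still holds the original.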

Well thanks I plan on using docker to deploy a jvm app soon, this should help.

This should help a bit more with your specific case:


Why is the docker hype machine cranked up to 11?

Hi seiji, I'm the author of Docker.

There is hype around docker, for sure. The optimist in me likes to think that it's because people find the project useful and are excited about the possibilities of containers in general - which I believe are huge.

Me and the dotCloud team have been working on container technology since 2008. For a long time it felt like preaching in the desert - mostly because it required exotic patches to the kernel which made widespread adoption difficult. So it's very rewarding to see more people adopt containers, and of course it's great to be on the right side of the hype for a change. But if we hadn't been at the right place at the right time, someone else would have done it instead. Containers are just too important and useful to not blow up.

Could you share some thoughts (or links) on the differences between LXC and linux-vserver? I played a little bit with that, and it seemed to promise "true jails" for Linux -- and LXC seems to be a natural successor -- any comment of what we've gained/lost from the "transition"?

I'd also love to hear what people think about the relationship between freebsd+jails, solaris+zones and Linux+LXC/docker and/or if it would make sense to modularize the back-end so that "docker" (as in the daemon/management tools) would work for maintaining jails and/or zones as well?

It'd be fun to be able to run docker+LXC under Debian GNU/Linux, and docker+jails under Debian GNU/kFreeBSD (and ditto for the Debian-like/Ubuntu-based solaris distros)... Maybe not useful, but interesting...

> Could you share some thoughts (or links) on the differences between LXC and linux-vserver?

vserver, lxc and openvz were 3 competing projects to add process-level isolation to the linux kernel. We used all 3 of them extensively (the ancestor of docker was based on vserver, then ported to openvz, then finally to lxc). They all had pros and cons, but in the end the only meaningful difference is that lxc got merged upstream, and the others didn't.

> would make sense to modularize the back-end so that "docker" (as in the daemon/management tools) would work for maintaining jails and/or zones as well?

Absolutely. That is the goal, and starting with 0.8 Docker's architecture will be modular enough to support it.

See this blog post for details: http://blog.docker.io/2013/08/getting-to-docker-1-0/

Ditto, I'd like to hear a bit more about the security related strengths and weaknesses of docker and its alternatives.
