

Getting Started with Docker - fideloper
http://serversforhackers.com/articles/2014/03/20/getting-started-with-docker/
I saw that people were looking for better getting-started docs for Docker, so I put together the post I wish I'd found when I was digging into it.
======
Xdes
This skips over the hard part: managing Docker containers. Poking a hole
directly through to the container is a leaky abstraction. A reverse proxy like
HAProxy or Varnish should sit in front of the container.

Once you have the reverse proxy set up, the next problem that arises is routing
to containers based on the domain. Now your HAProxy or Varnish config is going
to get bloated, and every time you deploy a container the config needs to be
modified and reloaded. By this time you might be looking at Chef or Puppet for
automating the config generation.
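To illustrate, domain-based routing in HAProxy looks roughly like this (the hostnames, backend names, and published ports are all made up):

```
frontend http-in
    bind *:80
    acl host_app1 hdr(host) -i app1.example.com
    acl host_app2 hdr(host) -i app2.example.com
    use_backend app1_backend if host_app1
    use_backend app2_backend if host_app2

backend app1_backend
    # points at whatever port `docker run -p` published on the host
    server app1 127.0.0.1:49153

backend app2_backend
    server app2 127.0.0.1:49154
```

Every new container means another `server` line (or a whole new backend) plus a reload, which is exactly where the config-generation pain comes from.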

Chef and Puppet are not simple to learn. They have their own set of quirks
(like unreliable tooling support on Windows). I'm in the process of conquering
this, but I hope one day there will be a simpler way.

~~~
tomgruner
This is a great point. The initial Docker examples make everything seem easy,
but we blew way past our estimated time integrating Docker into our
workflow because of the points you mention. I am still happy with the choice
to use Docker, though, and our team will be better at server administration in
the future.

One thing about this getting started guide is that it recommends the Phusion
base image which boots init. That seems to go against the best practices
outlined in a recent article by Michael Crosby -
[http://crosbymichael.com/dockerfile-best-practices-take-2.ht...](http://crosbymichael.com/dockerfile-best-practices-take-2.html)

~~~
fideloper
Nice, hadn't read those. Thanks! I was wondering about that, but I still need a
solution for logging, cron jobs, and the like (perhaps running those on the
host machine is the answer).

~~~
tomgruner
I am still finding good solutions for those too, and trying to add some
concepts to my toolbox like orchestration, service discovery, proxies, data
containers, ambassador containers, and so on. It's hard for me to wrap my head
around the different recommended ways to use docker compared to my initial
expectations.

------
izietto
For non-PaaS use cases (for example, a development server with a bunch of
projects) I find schroot (1) simpler and more productive. For example, you can
use the normal `service start` / `service stop` instead of manually writing init
scripts, and you don't get stuck on sharing directories, which I found
extremely tricky with Docker (for example, I couldn't correctly start MySQL
under supervisor while sharing the MySQL data directory). But Docker is in
early development, so I think it will become easier in the future.
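For instance, running a service inside a named chroot is just (the chroot name `dev` here is made up):

```shell
# Start a service inside the chroot using the distro's normal init scripts
schroot -c dev -u root -- service mysql start

# Run a one-off command in the same chroot
schroot -c dev -- mysql -e 'SHOW DATABASES;'
```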

1: [https://wiki.debian.org/Schroot](https://wiki.debian.org/Schroot)

~~~
fideloper
I've had the same issue with MySQL. It's an issue of timing: you install
MySQL, and the MySQL data directory gets its default databases. Then you
share the directory with Docker, and the data directory is wiped out (the
files don't exist on the host machine, after all). Getting it right in an
automated way is a Hard Problem™.

As of now, I'm keeping data persisted within the Container, which I don't
necessarily like. I would _love_ to hear a good solution on that.

~~~
tonyhb
Here's a dockerfile setup I wrote for Postgres which uses a 'data container'
for the entire Postgres database:
[https://github.com/codelittinc/dockerfiles/tree/master/postg...](https://github.com/codelittinc/dockerfiles/tree/master/postgres)

The gist of it is that you explicitly tell your DB container that there will
be a shared directory on the container at runtime. This allows you to chown
the directory before the data container is added.

Then, when you're running, use --volumes-from `$data-container-name` and it'll
work. Want an article on it?
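A rough sketch of the pattern (the names and image are placeholders, not the exact repo contents):

```shell
# A data-only container: declares the volume, runs a no-op, and exits
docker run -v /var/lib/postgresql --name pg_data busybox true

# The real database container mounts the data container's volume
docker run -d --name pg --volumes-from pg_data my/postgres
```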

~~~
fideloper
Awesome, thanks. I can figure it out from that example most likely, but an
article would never hurt :D

------
robszumski
CoreOS experience designer here. I'm looking for testers to check out the
general platform and test some of our new features. All skill levels are fine
– new to docker & CoreOS, new to CoreOS only, etc. I'm happy to work with your
schedule and make it as quick or involved as you're comfortable with. Anything
from emailing a few thoughts to Skype to hanging out in our office in SF for
the day.

Email: rob.szumski@coreos.com

------
markbnj
I've been using docker for a couple of months, but we have only just begun
experimenting with actual deployment in a test environment on ec2. Right now
we use it primarily as configuration/dependency management. We're a small team
and it seems to make setup easier, at least so far. Two examples: the first is
a log sink container, in which we run redis + logstash. The container exposes
the redis and es/kibana ports, and the run command maps these to the host
instance. Setting up a new log server means launching an instance, and then
pulling and starting the container. The second example is elasticsearch. We
have a container set up to have cluster and host data injected into it by the
run command, so we pull the container, start it, and it joins the designated
cluster. The thing I like about this is the declarative specification of the
dependencies, and the ease of spinning up a new instance. As I say, just
experimenting so far, and I don't know how optimal all of this is yet, so
would love any feedback.

One last quick thought on internal discovery. A method we're playing with on
ec2 is to use tags. On startup a container can use a python script and boto to
pull the list of running instances within a region that have certain tags and
tag values. So we can tag an instance as an es cluster member, for example,
and our indexer script can find all the running es nodes and choose one to
connect to. We can use other tags to specify exposed ports and other
information. Again, just messing around and still not sure of the optimal
approach for our small group, but these are some interesting possibilities.
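Sketching the selection side of that (the boto API call itself is elided; the field names here are made-up stand-ins for whatever boto returns):

```python
import random

def find_tagged_instances(instances, tag_key, tag_value):
    """Return running instances whose tags match tag_key=tag_value.

    `instances` is a list of dicts shaped roughly like the instance
    metadata boto hands back (field names are invented for this sketch).
    """
    return [
        inst for inst in instances
        if inst.get("state") == "running"
        and inst.get("tags", {}).get(tag_key) == tag_value
    ]

def pick_node(instances, tag_key, tag_value):
    """Choose one matching node at random, as an indexer script might."""
    matches = find_tagged_instances(instances, tag_key, tag_value)
    return random.choice(matches) if matches else None
```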

------
tonyhb
This is a copy and improvement of the article I wrote last month, even down to
the breakdown of "What's that command doing?" with `docker run -t -i ubuntu
/bin/bash`.

Glad it was useful enough to spur an improved article, at least.

[http://tonyhb.com/unsuck-your-vagrant-developing-in-one-vm-w...](http://tonyhb.com/unsuck-your-vagrant-developing-in-one-vm-with-vagrant-and-docker)

~~~
fideloper
I've been sitting on my original since December.
[https://www.dropbox.com/s/lf0qi70vlglasgv/Screenshot%202014-...](https://www.dropbox.com/s/lf0qi70vlglasgv/Screenshot%202014-03-21%2010.40.07.png)

~~~
tonyhb
Hey, no problem at all! Just surprised to see how similar we were in our
styles. Happy that there are more resources popping up now!

------
yblu
Can someone tell me what the point of this is? (I'd seriously love to know;
I'm not criticizing it.) Why would I need Docker containers to install stuff
into, instead of just installing it directly on the host?

Let's say I develop a new web app: I would install NodeJS, PostgreSQL and such
on my machine. Before I deploy the app for the first time, I'll install them
on the necessary servers. Now, it looks like I would need to do the same, with
the added step of building Docker containers.

I think I must be missing something important here, because the number of
GitHub stars for Docker is impressive, and that's usually a good indication of
a project's usefulness.

~~~
robszumski
Docker containers let you isolate the entire environment for your app. Let's
say you're running an app on CoreOS in a container that needs Python 1.2.3.

On your laptop you can build and test the new version of the app that needs
Python 1.2.4. Once you decide that's ready to go, you can push the new
container onto the same CoreOS machine, so it's running both containers.
Without containers, running two versions of Python on the same box isn't
straightforward. If you had a Chef script that updated to 1.2.4, you could
break every other app on the box.

Containers also let you do some cool things like sign and verify a container
before it's launched on the box. It should be bit-for-bit the same on your
laptop as it is on the remote machine. Containers also boot within seconds,
much faster than a VM. There have been a few tech demos going around that
spin up a new container with a web server to service every single web
request, just to show how fast you can boot them. 300ms is pretty long for a
web request, but it's the idea that counts.
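Sketched as commands (the image tags are made up):

```shell
# The existing app keeps running against its bundled Python 1.2.3...
docker run -d --name app-old myapp:python-1.2.3

# ...while the new build runs beside it with Python 1.2.4
docker run -d --name app-new myapp:python-1.2.4
```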

~~~
yblu
Thanks, this is good info. For NodeJS and Rails I use nvm and rvm, so I
haven't had any problems with multiple environments. But yeah, I see your
point that Docker can help in such scenarios, or when there's no equivalent of
nvm/rvm.

~~~
e12e
Having a dependency on conflicting libc versions is a classic problem that can
be difficult to solve without some form of container (a VM, chroot, or
something "in between" like LXC/jails). Another is a dependency on a different
kernel (either a different major version of the same OS, or a different kernel
entirely, like FreeBSD), which Docker (by design) doesn't solve.

I don't use Ruby much, but it doesn't strike me as very easy to work with or
very reliable for production deployment. That might just be me, though. How
well does it handle dependencies on conflicting modules with parts written
in C?

Perhaps the most important point is that (when it makes sense) a Docker setup
might allow easier horizontal scale-out and/or redundancy.

All that said, keeping things simple is generally a good thing. But sometimes
adding complexity in one area makes the overall system less complex.

------
njharman
> with Macintosh's kernel

I misread that as "Microsoft's..." and got excited, since I run a build farm
that's 70% Windows and wish I could use Docker, but it's not worth running two
systems (containers and VMs).

Also, isn't that completely wrong? Macintosh is not an OS or a company. It was
one of Apple's product lines, long ago.

------
zobzu
VMs CAN share binaries/libs/etc (otherwise called files).

Also, VMs CAN "share" memory, i.e. VMs can dedup memory between themselves. On
Linux, at least.

Not saying docker/lxc and all things namespaces are bad at all; just setting
things straight. VMs can do this :)

Check out KSM for memory "sharing", and any overlay-style file system mounted
by VMs (that one works exactly the same way as when you use
namespaces/docker/lxc, in fact).
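For the curious, KSM on a Linux KVM host is driven through sysfs (these are the standard kernel paths):

```shell
# Enable kernel samepage merging so identical pages across VMs get deduped
echo 1 > /sys/kernel/mm/ksm/run

# See how many pages are currently being shared
cat /sys/kernel/mm/ksm/pages_sharing
```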

------
arianvanp
Shouldn't "setting up a correct init process" be part of every "getting
started with Docker" guide?
[http://phusion.github.io/baseimage-docker/](http://phusion.github.io/baseimage-docker/)

~~~
bjt
No. That guide assumes you're running a bunch of processes in the container
(or even a full system). That's not the case at all when you're doing an
"application container" that doesn't need its own cron daemon, ssh daemon,
etc.

Containers can be much leaner than the kind discussed there.

~~~
FooBarWidget
I am one of the authors behind that guide. It does _not_ assume that
you're running "a bunch of processes" or "even a full system" in the
container. Even if you're only running 1 process, it's still highly relevant.

The main point is the Unix process model and how zombie processes work. Things
are just not set up properly if your system doesn't handle that correctly. And
short of using specialized tools (e.g. the my_init system used in
baseimage-docker), it just _isn't_ set up properly.
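To make the zombie point concrete, here's a toy sketch of the reaping loop an init-like pid 1 has to run (illustrative only, not the actual my_init code):

```python
import os

def spawn(argv):
    """Fork and exec a child; returns the child's pid (parent side only)."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # never fall through into the parent's code
    return pid

def reap_all():
    """Wait on any child (pid -1), the way an init process must, so
    exited children don't linger as zombies. Returns the pids reaped."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, 0)  # block until a child exits
        except ChildProcessError:
            break  # no children left to reap
        reaped.append(pid)
    return reaped
```

An ordinary application sitting at pid 1 never runs this loop, which is how orphaned grandchildren pile up as zombies inside a container.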

One of the Docker authors, shykes, stated: "In short, regular applications
don't expect to be pid 1, and generally speaking they shouldn't."

The other point is to be able to login to the container to perform one-off
sysadmin and debugging work. There _was_ lxc-attach for that, but now that
Docker supports multiple backends, SSH is the only portable solution that
works no matter which Docker backend you use.

------
calgaryeng
I wish that people would stop writing tutorials on "getting started" with
Docker, and actually start writing up examples of how to work with multiple
containers, hosts, and linking.

That's the part that I (and I'm sure other beginners) get totally stuck on.
Anyone can do docker commit/pull.
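For what it's worth, the single-host linking piece alone is short; a sketch with made-up names:

```shell
# Start a database container by name
docker run -d --name db postgres

# Link it into the app container; docker injects DB_PORT_* environment
# variables (for the alias "db") into the app's environment
docker run -d --name web --link db:db my/webapp
```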

~~~
fideloper
I actually haven't found many getting started articles, which is why I wrote
this. However I fully plan on writing up more interesting stuff.

------
netcraft
This is the first time I've heard of CoreOS; it seems to be custom-built for
containers like Docker. Are there downsides to doing system updates this way
and not having a package manager, just relying on containers for everything?
It seems great in concept.

~~~
barrkel
I don't know how well this works as soon as you have a single file that needs
multiple edits to support multiple images.

------
ilovecookies
Well good morning hackers.. This has been around for ages...

[http://www.xenproject.org/](http://www.xenproject.org/)

------
pg_fukd_mydog
Would it be better to use FreeBSD and their Jails mechanism for all of this?

~~~
e12e
Joyent would probably claim that KVM+ZFS is best. But if you don't have
kernel support for jails, then no, using jails isn't better; it's not an
option. Oracle would probably claim Solaris Zones are better (and arguably,
they'd be right).

Jails are (as far as I can tell) great -- but not so great that FreeBSD didn't
include a new HVM-assisted hypervisor, bhyve (the BSD hypervisor), in
FreeBSD 10.

LXC in many ways _is_ *bsd jails for Linux.

