
Docker for Beginners - vegasbrianc
http://prakhar.me/docker-curriculum/
======
krat0sprakhar
Hi everyone,

Author here. There are a bunch of awesome[0] tutorials on Docker so why
another one? Well, my motivation was to have a guide (for myself and for
others) on how to deploy dockerized apps on the cloud. So in this tutorial,
apart from giving an intro to docker, I demonstrate how to use Elastic
Beanstalk for single-container and ECS for multi-container deployments.

Here are the two apps we deploy on AWS as examples:

1. Catnip - A simple Flask app (single container):
[http://catnip.elasticbeanstalk.com/](http://catnip.elasticbeanstalk.com/)

2. Foodtrucks - A simple app to discover food trucks in SF (Flask +
Elasticsearch, multi-container):
[http://sf-foodtrucks.xyz/](http://sf-foodtrucks.xyz/)

I'm new to Docker myself so I'm sure I've made mistakes. Let me know if you
have ideas on how to improve this!

[0] - [http://docker.atbaker.me/](http://docker.atbaker.me/)

~~~
alainchabat
Awesome! I will try ECS :D Two questions about ECS: I have a few scripts I'd
like to run on ECS. I created many images with the same dependencies; only the
script code changes. I was thinking of attaching the script with a volume. Is
that a good way to do it? Does ECS handle volumes well?
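For reference, the pattern I have in mind, roughly (the image name and paths
here are made up):

```shell
# One generic image for every job; the script itself is bind-mounted in
# as a read-only host volume instead of being baked into a separate image.
docker run --rm \
  -v /opt/jobs/job-a.sh:/job.sh:ro \
  myorg/runner /job.sh
```

On ECS this would, if I understand the docs correctly, map to a `volumes`
entry plus a matching `mountPoints` entry in the task definition.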

------
fideloper
ECS is really the only way I'd go with Docker - So far in my experience,
Docker in production is _really_ a lot harder than the casual developer /
devops person realizes based on the "getting started with docker" tutorials.

My conclusion with Docker is that, in general™, you really need to have a
justifiable reason to go whole-hog into Docker, especially if you're not on
AWS / considering ECS.

I'm glad the article covers ECS, as it makes a lot of the scheduling / config
issues simpler!

~~~
gtaylor
> ECS is really the only way I'd go with Docker

ECS is AWS-specific, which is perfectly fine for some. But Kubernetes has been
amazing for us. It abstracts many of the differences between AWS/Google Cloud,
it's open source, and is far more powerful and flexible.

The only issue right now is that setting the cluster up involves running some
shell scripts (yuck). We use Google Container Engine (hosted Kubernetes on
Google Cloud), so we don't have to deal with that, but the option is there
should we ever need to go multi-cloud.

Figured I'd toss that out there for anyone struggling with ECS (it can be a
bit rigid) or keeping an eye on things beyond AWS. Kubernetes is still young
and rough in areas, but it is a nice, opinionated way to orchestrate
containers.

~~~
merb
Wow, so instead of running my configuration management stuff I now need to
either:

- be on AWS
- be on GCloud
- be on any other cloud that has Docker support out of the box

OR: I need Docker plus a configuration management tool to set up my
environment, or I need to somehow manage my CoreOS configs. Oh, and finally,
don't forget to run a network over a network, because it's Docker-ish; for
most clouds this means we run a network on a network which runs a network.

Docker adds so much complexity. People just don't see this upfront and sink
most of their time into this stuff, but there are easier ways to deploy.

YES, if you are really big and your servers need to scale way beyond the norm,
then you probably need it, since configuration management won't help you and
setting up servers, even in the cloud, takes some time _looking at some
Netflix articles_. However, I just don't get why people spend their time on
Docker when there are other things to do in their programs.

~~~
pbecotte
But it doesn't really; it is just another tool to learn, like your
configuration management software. Honestly, learning Docker to a high level
was WAY easier than getting my Chef knowledge to a mediocre level. Further, it
has other benefits: it is MUCH easier to set up well-isolated dev environments
with Docker than with Chef. Since we switched, we can usually have someone
running the dev environment and tests within an hour or so of unboxing their
laptop. With Chef, running the whole thing from scratch took that long by
itself (after you installed all the software) and usually failed. As for the
idea that it is too complicated for something simple: I built a sample
blogging app that you can deploy to a DigitalOcean or EC2 machine with 3 or 4
commands
([https://github.com/pbecotte/devblog](https://github.com/pbecotte/devblog)).
The system we have built at work has 8 separate services between a bunch of
data stores, background workers, and a couple of different pieces of our app.
Docker allows you to run the entire environment on your local machine and then
deploy that same setup to our cluster, without any differences to account for
when making all those apps run on one VM for your local environment (which can
get bad when, for example, one service requires a different version of a Ruby
gem than another).

~~~
merb
How do you do the same thing without a cloud environment?

Do you install the software manually? Do you configure your network manually?
Do you configure your OS manually? Do you install the Docker daemon manually?

What about installations behind firewalls? Or code that shouldn't go to the
Docker registry, since it shouldn't be pushed over the internet?

etc...

Why do you have 8 separate services anyway? How many people does your company
have? For 8 services you should have at least 8 * 3 people.

------
fpoling
The weak part of Docker is volume/data management, and it is interesting that
the article almost never mentions it. There are solutions that attach network
storage to containers, but that just adds latency to the application. If one
wants access to fast local storage, then Docker requires significant extra
effort to set things up.

~~~
stephenitis
I think you can do exactly what you are asking for with the host machine's
local storage in Docker today. Where do these docs fall short for you?

[https://docs.docker.com/engine/userguide/dockervolumes/](https://docs.docker.com/engine/userguide/dockervolumes/)

I would start with the use case, check out what came out in Docker 1.9, and
then read up on what the various volume plugins provide.

(I work for ClusterHQ, and we read and hear a lot of feedback around this
topic.)
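For example, a quick sketch with the named volumes that landed in 1.9 (the
volume name and image here are just illustrative choices):

```shell
# Create a named volume; Docker manages its on-host location.
docker volume create --name esdata

# Mount it into a container; data written there survives container removal.
docker run -d \
  -v esdata:/usr/share/elasticsearch/data \
  elasticsearch

# Inspect where the data actually lives on the host.
docker volume inspect esdata
```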

~~~
fpoling
My point is that most Docker tutorials and documentation underestimate the
complexity of persistent data management. Docker is not going to solve that.
However, once one takes care of the data, Docker is really nice for service
isolation and deployment.

~~~
musha68k
The guys over at the great
[http://www.thecloudcast.net](http://www.thecloudcast.net) have been following
this topic regularly of late. It's a treasure trove of devops information;
anyone interested should subscribe, IMHO.

------
dijit
I still don't know how to give Docker an IP address or a "physical" network
card on my network.

I was able to do this very easily with Solaris zones, and even BSD jails, but
every installation of Docker that is in any way integrated into packages seems
to be unable to do this.

Perhaps I'm simply not using the right Google search terms.

~~~
general_failure
Is your intention to expose the ports of a Docker container? If so, search for
"exposing ports"; the docker run options for this are -p and -P.

Or is it to create a subnet with a custom IP range for the Docker daemon's
bridge network? Use the -b option to the Docker daemon.

[https://docs.docker.com/engine/userguide/networking/](https://docs.docker.com/engine/userguide/networking/)
is a good place to start. If you have a specific question, feel free to ask
and I can try to answer.
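Rough examples of both, assuming a running daemon (the nginx image and bridge
name are arbitrary choices):

```shell
# Publish one container port to the host: host port 8080 -> container port 80.
docker run -d -p 8080:80 nginx

# Publish all EXPOSEd ports to random high-numbered host ports.
docker run -d -P nginx

# Point the daemon's bridge network at a custom bridge (a daemon flag,
# not a docker run flag); assumes mybridge0 was already created on the host.
docker daemon -b mybridge0
```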

~~~
dijit
No, that's not my intention. My intention is to not have to deal with iptables
mangling packets and adding netfilter tags to everything.

Not to mention port collisions with things that must run on predefined ports
(think SMTP, or pesky applications that keep redirecting you back to port 80).

I'm looking to expose an IP, similar to a bridged/open network in KVM.

~~~
jdoliner
Would --network=host do this? It makes the container use the host's networking
stack, so there's no funny business.
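Something like this, as an untested sketch:

```shell
# The container shares the host's network namespace: no NAT, no bridge,
# no -p mappings; processes bind directly to the host's interfaces.
docker run --rm --network=host alpine ip addr
```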

~~~
takee
Yes, it would help here, but it would also expose new security loopholes.

------
nemothekid
While the technical tutorial is good, I can't say anyone should choose Docker
for the reasons outlined in the "Why should I use it?" section. "Because it's
popular" will only give you a headache when you are trying to deploy your
simple Rails app.

~~~
krat0sprakhar
Great point! While the _popular_ aspect is true to some extent, I've found
that for my side projects (I'm still a student) Docker has vastly decreased
the pain of deploying my projects. I used to have "but it works on my machine"
moments, but now using the same container in production is super awesome!

I'm sure Docker is not a panacea to all your infrastructure problems but it
surely is a worthwhile tool to learn :)

~~~
nemothekid
> _I've found that for my side projects (I'm still a student) Docker has
> vastly decreased the pain in deploying my projects. I used to have "but it
> works on my machine" moments but now using the same container in production
> is super awesome!_

That's a far stronger reason to use Docker - I'm sure people have wrestled
with this issue when trying to deploy with Capistrano/Grunt/Ansible as well.

~~~
k__
But "It's popular" seems to be one of the main reasons.

Otherwise more people would use Nix.

~~~
tayo42
How does Nix offer any of the advantages that Docker does? Isn't it just
another way to do config management? Plus the added complexity of using a very
niche OS. It seems like it's not popular for a reason...

~~~
icebraining
Nix is a package manager with isolation, not a config manager. And it runs on
any Linux distro, or even Mac OS X; you don't need to use NixOS.

[https://nixos.org/nix/](https://nixos.org/nix/)

------
urs2102
Prakhar, this is phenomenal. I'm a Columbia student as well and actually just
came across your blog with this article. Hope to meet up at some point!

This is definitely a great start to Docker and I like how you provided an
application to allow the reader to just work through the process of deploying
something. Will definitely recommend this, as Docker is something easier shown
than explained. Great stuff man!

~~~
krat0sprakhar
Thanks Uday! I'm planning to hold a workshop in the coming months on campus
where I'll be going over this stuff in greater detail. Feel free to join in!

PS: If you do come to the workshop, drop by and say Hi!

------
MichaelBurge
I'm always surprised to see a comparison to virtual machines. Containers seem
more like an enhanced chroot jail than a CPU emulator, and I've always used
those for the similar purpose of isolating a tricky build environment.

They even have some of the same restrictions (Docker needs root, as does
chroot; they both work by making system calls lie to the process).

Whenever I hear a comparison with VMs, I wonder for a second, "Wait, is there
some clever way to invoke the virtualization instructions without evicting
references to the OS from the CPU's context to provide isolation without a
separate guest OS?"

------
siquick
As a Docker noob, it would be so useful if there were a list of Docker use
cases that someone could link to.

~~~
krat0sprakhar
At the risk of oversimplifying: you can use Docker wherever you might want to
use a full-fledged VM. You can use it to run a job, host a webserver, or any
other scenario you can think of. What makes Docker (rather, containers)
different from a VM is the way it packages all your application's dependencies
into an isolated sandbox.

So let's say you have a Python app that relies on a dependency that needs C
bindings (e.g. ImageMagick). Instead of running `./app.py` freshly downloaded
from some Git <repo>, you would run `docker run <repo> ./app.py`. In the
former case, you would need to take care of, say, the C dependencies yourself.
In the latter case, they are packaged in the image that Docker downloads from
<repo> prior to running the ./app.py process in it. (Note that the two <repo>s
are not the same thing. One is a Git repo; the other is a Docker repo, called
an image.)

Think of the process of building a container as taking a snapshot of the
entire OS (like a VM image) but without the high overhead of running such
images.
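To make the "packaged in the image" part concrete, a minimal hypothetical
Dockerfile for such an app might look like this (the base image and file names
are illustrative):

```dockerfile
FROM python:2.7
# Bake the C-level dependency into the image once, at build time,
# so no user of the image ever has to install it by hand.
RUN apt-get update && apt-get install -y imagemagick
WORKDIR /app
COPY requirements.txt /app/
RUN pip install -r requirements.txt
COPY . /app
CMD ["./app.py"]
```

Whoever does `docker run <repo> ./app.py` then gets ImageMagick along with the
image, with no install step of their own.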

Feel free to reach out to me if you need more clarification!

------
musha68k
Docker does great marketing, but to my knowledge I can't yet use the toolbox
to effortlessly (without Vagrant) set up a non-trivial microservice
_development_ flow on my Mac.

On the deployment side there are also still lots of inconsistencies between
the different tools but I can see it becoming the go-to pragmatic way to
bootstrap your private "cloud" very soon.

Again deployment is only part of the solution and I wish Docker would hire
more good people to work on different development "best-practices" for Linux
_as well as OSX_.

If Docker eventually wants to become a major service provider they should
complement that kind of tooling with the same vigorous level of documentation
and blogging as Heroku, Digital Ocean, Codeship and the likes come up with
consistently.

~~~
fpoling
On my Mac I run Docker in a CoreOS VM, the same OS I use in production, and
use lsyncd to transparently synchronize all the files that I touch in the
editor into the VM. Works like a charm.

~~~
musha68k
I bet you would help not only me but lots of people if you'd share that
workflow.

Which hypervisor are you using?

Is Vagrant still part of that setup (a tool that should become obsolete with a
pure docker development approach IMHO)?

I didn't find anything like that in either the Docker or CoreOS official docs,
unfortunately. Both of those companies are fighting for territory in this
super lucrative space; one of them _should_ provide it without relying on the
greater community (like one of you or me doing what's arguably their job for
free).

Edit: I'm sorry if this comes off as the typical abrasive comment, but I've
been working with both of these technologies for more than a year now, and
mind you, not alone. A couple of good developers and sysadmins I work with are
also still trying to figure out how to go about all of these issues, and what
I see is a big disconnect between what gets advertised vs. where we are at
this point. As somebody who did FreeBSD jails and OpenVZ container system
administration, as well as general "distributed systems" development, for many
years, I also admit to painfully missing the amazing simplicity of creating a
monolith and effortlessly deploying it to Heroku. It's something I've become
used to over the last couple of years and I miss it, even though I find the
promise of the current "microservice" trend very interesting as well.

~~~
fpoling
The setup is straightforward.

I use VirtualBox to run CoreOS, but the VM can run anything as long as it
comes with a recent Docker and does not require a lot of maintenance. Then I
run lsyncd to synchronize files from the host transparently into that VM and
edit whatever files I need in Emacs on the host. When I need to run a docker
command, I do it either from Emacs, by prefixing the command with ssh
local-vm-name, or from an ssh session in a terminal.

To test things I use, for example, another VM whose /etc/hosts points my
production domains to the VM with Docker. Another useful trick is to expose,
during development, say, the PHP/JS code that I edit directly into the
container for quick testing feedback. For that I can run the container with an
extra host volume mount that overrides the software tree in the image with one
that comes from the Docker VM, where lsyncd copies all the changes I made in
the editor.
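Roughly, the moving parts sketched as commands (the hostnames, images, and
paths here are invented):

```shell
# Continuously mirror the project tree from the Mac into the CoreOS VM.
lsyncd -rsyncssh ~/projects/myapp core@local-vm-name /home/core/myapp

# Run docker commands inside the VM over ssh (from Emacs or a terminal).
ssh core@local-vm-name docker ps

# For quick edit/test feedback, override the code baked into an image
# with the synced tree via a host volume mount.
ssh core@local-vm-name docker run -d \
  -v /home/core/myapp:/var/www/myapp \
  myapp-image
```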

~~~
musha68k
Thanks for answering. Do I understand correctly that you develop directly on
CoreOS with Emacs? If so, I think your workflow might be more of an outlier,
as I reckon most people tend to work on OSX (possibly even running an IDE of
some sort).

I think a lot of confusion comes from unclear definitions, so I'll try my best
here; it would be great if you could chip in once again...

# Premise

1. our base operating system is Mac OSX, which we will simply call _osx_

2. on top of _osx_ we run a _hypervisor_ (e.g. xhyve, vmware, virtualbox)

3. on top of our _hypervisor_ we run a virtual machine with CoreOS as our
_docker_engine_ (with or without the help of vagrant)

4. on top of our _docker_engine_ we run an arbitrary set of _docker_container_
instances, most of which are built from a "bespoke" _docker_image_ of our own;
a certain "microservice" in development

# Questions

How do we:

a) map a ("microservice") project's source code directory located on our _osx_
file system onto the currently instantiated development _docker_image_ for our
current project's development _docker_container_

 _osx_ => _hypervisor_ => _docker_engine_ => _docker_container_

b) and pass any file changes from _osx_ down to the _docker_container_ level
as well (i.e. inotify, lsyncd, etc)

BTW, if we wanted to make this an even more helpful effort, why not make it a
proper gist?

[https://gist.github.com/musha68k/399c66374ca54c665fd5](https://gist.github.com/musha68k/399c66374ca54c665fd5)

~~~
fpoling
I do not run Emacs in the VM; I edit files on the host! lsyncd just
transparently copies all my changes into the VM; see the details in the gist.
In another setup this also worked with Eclipse, when lsyncd synchronized
compiled classes into the VM.

------
yeukhon
Neat article. I didn't know you can now run Docker on EB!

I recommend starting with Docker Machine
([https://docs.docker.com/machine/](https://docs.docker.com/machine/)).

Another well-written beginner tutorial (but with a few hardcoded URLs):
[http://stackengine.com/docker-101-01-docker-development-environments/](http://stackengine.com/docker-101-01-docker-development-environments/)

------
is_this_tinder
awesome - very well written m8

------
systemz
I still prefer using automation like Ansible or SaltStack over Docker. You
can't run it on the very popular OpenVZ VPSes, and it's another useless layer
of abstraction with security holes in it.

