Author here. There are a bunch of awesome tutorials on Docker so why another one? Well, my motivation was to have a guide (for myself and for others) on how to deploy dockerized apps on the cloud. So in this tutorial, apart from giving an intro to docker, I demonstrate how to use Elastic Beanstalk for single-container and ECS for multi-container deployments.
Here are the two example apps we deploy on AWS:
1. Catnip - A simple flask app (single-container): http://catnip.elasticbeanstalk.com/
2. Foodtrucks - A simple app to discover foodtrucks in SF (Flask + Elasticsearch in multi-containers): http://sf-foodtrucks.xyz/
I'm new to Docker myself so I'm sure I've made mistakes. Let me know if you have ideas on how to improve this!
 - http://docker.atbaker.me/
I learned a lot about how to use Elastic Beanstalk and the ECS cli from this guide.
My conclusion with Docker is that, in general™, you really need to have a justifiable reason to go whole-hog into Docker, especially if you're not on AWS / considering ECS.
I'm glad the article covers ECS, as it makes a lot of the scheduling / config issues simpler!
ECS is AWS-specific, which is perfectly fine for some. But Kubernetes has been amazing for us. It abstracts many of the differences between AWS/Google Cloud, it's open source, and is far more powerful and flexible.
The only issue right now is that setting the cluster up involves running some shell scripts (yuck). We use Google Container Engine (hosted Kubernetes on Google Cloud), so we don't have to deal with that, but the option is there should we ever need to go multi-cloud.
Figured I'd toss that out there for anyone struggling with ECS (it can be a bit rigid) or keeping an eye on things beyond AWS. Kubernetes is still young and rough in areas, but it is a nice, opinionated way to orchestrate containers.
I need docker + a configuration management to setup my environment, or I need to somehow manage my coreos configs.
Oh and finally don't forget to run a network over a network cause it's dockerish so for most clouds this means we run a network on a network which runs a network.
Docker adds so much complexity. People just don't see this up front and spend most of their time on this stuff, but there are easier ways to deploy.
Yes, if you are really big and your servers need to scale way beyond what most do, then you probably need it, since configuration management won't help you and, judging by some Netflix articles, setting up servers takes some time even in the cloud.
however i just don't get it why people use their time for docker when there are other things to do in their programs.
Do you install the software manually?
Do you configure your network manually?
Do you configure your os manually?
Do you install the docker daemon manually?
What about installations behind firewalls?
Or about code that doesn't belong in a Docker registry because it shouldn't be pushed over the internet?
Why do you have 8 separate services anyway?
How many people does your company have? For 8 services you should have at least 8 * 3 people.
No. You need a networked computer with a quasi-recent kernel. There are no requirements to use a cloud provider or to use AWS/GCP.
> OR: I need docker + a configuration management to setup my environment, or I need to somehow manage my coreos configs. Oh and finally don't forget to run a network over a network cause it's dockerish so for most clouds this means we run a network on a network which runs a network.
This sounds like aimless rambling. You don't have to use any of this. You can, of course, but I can do some nasty stuff without Docker as well. Like I mentioned with Container Engine, setting up a Kubernetes cluster is like two mouse clicks. You can get as simple as ECS, or you can roll your own from the ground up. You have the option to pick a point on a spectrum. With ECS, it's Amazon's way or the highway.
> however i just don't get it why people use their time for docker when there are other things to do in their programs.
Because it saves us loads of time developing, testing, and deploying our systems. There is more initial setup work, but after that we are deploying images seamlessly, have a great rolling upgrade and rollback story out of the box, and get a lot of other bonuses like service discovery, auto-scaling (vertically and horizontally), and much better (higher) resource usage levels. And we're far from a mega-corp.
To me, it sounds like you may have skimmed some, perhaps even played a bit. But you ran into a snag, threw up your hands, and have summarily dismissed an entire ecosystem after your experience(s). I see enough un-informed or 100% incorrect things above to think you're missing some details.
Question: what benefits do you get with Docker that you don't get without it?
Without it I have cgroups, and my software is versioned, e.g.:
programm-X.X.X.jar / programm-X.X.X.rpm / whatever.
Why do you get better / higher resource usage I mean you use an additional layer you could do the same with just CGroups.
How do you do service discovery? Etcd could be installed without docker?! How do you auto-scaling without a cloud? I mean even without docker auto-scaling is trivial.
Not to be dismissive, but why would we want to use cgroups? Docker works great for us. There are indeed tons of alternatives to everything. Why cgroups? Why not BSD and jails? Why not build on unikernels and VM images?
> Why do you get better / higher resource usage I mean you use an additional layer you could do the same with just CGroups.
Kubernetes (and most other Docker orchestration systems) come with a scheduler. Containers are allocated based on desired characteristics to the machines with the most spare capacity. You can cut down on idle capacity by intelligently spreading the load without having to think very much about it.
You could do the same with cgroups, but you're going to need to write an orchestration system, a scheduler, and you probably won't have the massive communities that some of the alternatives have already built.
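To make the scheduling idea above concrete, here is a toy sketch of "place each container on the machine with the most spare capacity". This is not how Kubernetes actually schedules (its scheduler weighs many more factors); the `Node` class, the `schedule` function, and the CPU numbers are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical machine with a fixed CPU capacity (in cores)."""
    name: str
    capacity: float
    used: float = 0.0
    containers: list = field(default_factory=list)

    @property
    def spare(self) -> float:
        return self.capacity - self.used


def schedule(nodes, container, demand):
    """Place `container` on the fitting node with the most spare capacity."""
    candidates = [n for n in nodes if n.spare >= demand]
    if not candidates:
        raise RuntimeError(f"no node can fit {container!r} (needs {demand})")
    target = max(candidates, key=lambda n: n.spare)
    target.used += demand
    target.containers.append(container)
    return target.name


# Illustrative cluster: two nodes with 4 and 9 cores, four 2-core containers.
nodes = [Node("a", 4.0), Node("b", 9.0)]
placements = [schedule(nodes, f"web-{i}", 2.0) for i in range(4)]
print(placements)
```

Even this greedy version shows why you stop thinking about individual machines: you declare what a container needs and the placement falls out of the current load.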
> How do you do service discovery? Etcd could be installed without docker?!
Yes, it can be installed without Docker if you'd like. We run some things in Docker, and others outside ourselves. It's no silver bullet.
> How do you auto-scaling without a cloud? I mean even without docker auto-scaling is trivial.
Docker is just one piece of the puzzle, which is something you seem to be missing. In our case, Kubernetes brings it all together into an easy-to-use package.
Life has few absolutes. It's great if your cgroup deploy is working for you; it's just not a great fit for everyone. The comparison between Docker and cgroups is a bit apples to oranges, too. In reality, you need to compare a lot more than just the tech; the Docker ecosystem is the biggest appeal to us.
I haven't used their cluster provisioning in a while, but back when I did, it made a lot of modifications to AWS before falling over, at which point either I need to undo those changes or the scripts need to be smart enough to resume where they left off. Shell scripts are not well suited to that purpose, in my experience.
Yes, I am aware of their salt stack procedure, but it doesn't hold a candle to the simplicity of Kubernetes on CoreOS, which unlike the 23+ directories worth of salt things to read is something you would probably fit on 3 A4 sheets of paper: https://github.com/coreos/coreos-kubernetes/tree/master/mult... I provisioned a Vagrant cluster using the Kubernetes salt mechanism just for this comment, and the "Cockpit" username and password didn't work after it was finished. Which salt config contains that information? Beats me. I'm thankful that `vagrant ssh master` did as expected.
I recognize that the previous paragraph is just _my_ experience and _my_ preferences, but the anecdote is the reason why (in my experience) one must understand what the "magic" deployment process is doing, and only after that voluntarily cede control to the scripts merely as a labor-saving tactic. It does me no good to have a cluster brought to life that I then have zero idea how to maintain.
There are similarities to the ECS CLI, but we are going for the whole "batteries included" approach.
Builds, Logs, cluster and app scaling, encrypted environment, and more.
I have not tried it myself personally but I have seen a blog post comparing it to Kubernetes: https://railsadventures.wordpress.com/2015/12/06/why-we-chos... It sounds like ECS is not really there either.
I would first start with the use case, check out what came out in Docker 1.9, and then read up on what the various volume plugins provide.
(I work for ClusterHQ, and we also read and hear a lot of feedback around the topic.)
I was able to do this very easily with solaris zones, and even BSD jails, but every installation of docker that is any way integrated into packages seems to be unable to do this.
perhaps I'm simply not using the right google search terms.
pipework mybr01 mycontainername email@example.com
Weave is supposed to make this easy too (and offers some simplicity over Docker 1.9's requirement for a clustered key-value store), but I got frustrated at their implicit iptables NAT rule for networks created by default and haven't worked out how to stop that.
Or to create a subnet with a custom IP for the Docker daemon's bridge network? Use the -b option to the docker daemon.
https://docs.docker.com/engine/userguide/networking/ is a good place to start. If you have a specific question, feel free to ask and I can try to answer.
not to mention port collisions with things that must run on predefined ports (think SMTP or pesky applications that keep redirecting you back to port 80)
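A rough sketch of the port collision problem: when every container is published through the host's single IP, two services that both insist on the same host port cannot coexist. The container names and ports below are hypothetical, and `find_port_collisions` is just an illustrative helper, not part of any Docker tooling.

```python
from collections import defaultdict


def find_port_collisions(mappings):
    """Return {host_port: [container names]} for host ports claimed more than once.

    `mappings` is a list of (container_name, host_port) pairs.
    """
    claims = defaultdict(list)
    for name, host_port in mappings:
        claims[host_port].append(name)
    return {port: names for port, names in claims.items() if len(names) > 1}


# Hypothetical services that all want fixed, well-known host ports.
wanted = [("mail", 25), ("web", 80), ("legacy-app", 80)]
collisions = find_port_collisions(wanted)
print(collisions)
```

With one routable IP per container, the same list would produce no conflicts, because each container would own its own port 80.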
I'm looking to expose 'an IP' similar to a bridged/open network in KVM.
Docker seems really intent on NATing containers behind the host, which IMO is not acceptable from a security perspective when I want to firewall outbound access based on a container's role.
Kubernetes is another option which creates a unique IP per container pod on the same network as the host. Not as flexible as a vxlan approach where containers can be micro-segmented into specific networks, but more like BSD jails that you're used to.
Example Docker images for iPerf client and server:
Edit to clarify: You can deploy each of the above images on different hosts located on different providers. There's no need for the hosts to have visibility between each other at all.
The idea is that your Docker containers will connect to a central server (443/TCP outgoing traffic) and have a virtual private network among them through this "hub", so they've got full access to each other. This is a layer 2 network and in my offering I have DHCP running by default to simplify things (100.64.0.0/24... naughty, I know :)). The communications are encrypted, so effectively you've got a sort of Virtual Private Network between your containers.
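For anyone wondering about the "naughty" remark: 100.64.0.0/24 sits inside 100.64.0.0/10, the shared address space that RFC 6598 reserves for carrier-grade NAT, which is presumably what the comment is alluding to. A quick check with Python's standard `ipaddress` module (the variable names are just for illustration):

```python
import ipaddress

# RFC 6598 shared address space, intended for carrier-grade NAT.
SHARED_ADDRESS_SPACE = ipaddress.ip_network("100.64.0.0/10")

# The DHCP range mentioned in the comment above.
overlay = ipaddress.ip_network("100.64.0.0/24")

in_shared_space = overlay.subnet_of(SHARED_ADDRESS_SPACE)
usable_hosts = overlay.num_addresses - 2  # minus network and broadcast addresses

print(in_shared_space, usable_hosts)
```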
As I said, I haven't tried it with Docker yet, but it's worth a shot. My service simplifies the process of getting up and running.
It's a networking solution that uses BGP to connect containers across multiple hosts, and so can very easily integrate with existing infrastructure.
I'm sure Docker is not a panacea to all your infrastructure problems but it surely is a worthwhile tool to learn :)
That's a far stronger reason to use Docker - as I'm sure people have wrestled with this issue when trying to use Capistrano/Grunt/Ansible to deploy as well.
Otherwise more people would use Nix.
This is definitely a great start to Docker and I like how you provided an application to allow the reader to just work through the process of deploying something. Will definitely recommend this, as Docker is something easier shown than explained. Great stuff man!
PS: If you do come to the workshop, drop by and say Hi!
They even have some of the same restrictions (Docker needs root, as does chroot; they both work by making system calls lie to the process).
Whenever I hear a comparison with VMs, I wonder for a second, "Wait, is there some clever way to invoke the virtualization instructions without evicting references to the OS from the CPU's context to provide isolation without a separate guest OS?"
So let's say you have a Python app that relies on a dependency that needs C bindings (e.g. ImageMagick). Instead of running `./app.py` freshly downloaded from some Git <repo>, you would run `docker run <repo> ./app.py`. In the former case, you would need to take care of, say, the C dependencies. In the second case, they are packaged in the image that Docker will download from <repo> prior to running the ./app.py process in it. (Note that the two <repo> are not the same things. One is a Git repo, the other is a Docker repo - called an Image.)
Think of the process of building a container as taking a snapshot of the entire OS (as with VM images), but without the high overhead of running these images.
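If it helps, here is the same comparison as a tiny Python sketch that only builds the command line, without running anything. The image name `example/app` is a placeholder I made up, and `docker_run_argv` is a hypothetical helper; only the `docker run IMAGE COMMAND` shape is the real CLI form.

```python
import shlex


def docker_run_argv(image, command):
    """Build the argv for running `command` inside a container from `image`."""
    return ["docker", "run", image, *command]


# "example/app" is a hypothetical image that bundles the C dependencies.
argv = docker_run_argv("example/app", ["./app.py"])
printable = shlex.join(argv)
print(printable)
```

Passing `argv` to `subprocess.run` would execute it on a machine with Docker installed; the point is that the dependency handling lives in the image, not in the command.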
Feel free to reach out to me if you need more clarification!
On the deployment side there are also still lots of inconsistencies between the different tools but I can see it becoming the go-to pragmatic way to bootstrap your private "cloud" very soon.
Again deployment is only part of the solution and I wish Docker would hire more good people to work on different development "best-practices" for Linux as well as OSX.
If Docker eventually wants to become a major service provider, they should complement that kind of tooling with the same vigorous level of documentation and blogging that Heroku, Digital Ocean, Codeship and the like come up with consistently.
Which hypervisor are you using?
Is Vagrant still part of that setup (a tool that should become obsolete with a pure docker development approach IMHO)?
I didn't find anything like that in either the Docker or the CoreOS official docs, unfortunately. Both of those companies are fighting for territory in this super lucrative space; one of them should provide it without relying on the greater community (like one of you or me doing what's arguably their job for free).
Edit: I'm sorry if this comes off as the typical abrasive comment, but I've been working with both those technologies for more than a year now and, mind you, not alone. A couple of good developers and sysadmins I work with are also still trying to figure out how to go about all of these issues, and what I see is a big disconnect between what gets advertised vs. where we are at this point. As somebody who did FreeBSD jails and OpenVZ container system administration as well as general "distributed systems" development myself for many years, I also admit to painfully missing the amazing simplicity of creating a monolith and effortlessly deploying it to Heroku. It's something I've become used to over the last couple of years and I miss it, even though I myself find the promise of the current "microservice" trend very interesting as well.
I use VirtualBox to run CoreOS, but the VM can run anything as long as it comes with a recent Docker and does not require a lot of maintenance. Then I run lsyncd to synchronize files from the host transparently into that VM and edit whatever files I need in Emacs on the host. When I need to run a docker command, I do it either from Emacs by prefixing the command with ssh local-vm-name or from an ssh session in a terminal.
To test things I use, for example, another VM whose /etc/hosts points my production domains at the VM with Docker. Another useful thing during development is to expose, say, PHP/JS code that I edit directly into the container for quick testing feedback. For that I can run the container with an extra host volume mount that overrides the software tree in the image with one that comes from the Docker VM, where lsyncd copies all the changes I make in the editor.
I think a lot of confusion comes from unclear definitions, so I'll try my best here. It would be great if you could chip in once again...
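A minimal sketch of that volume override, again only constructing the command. The paths and the image name are assumptions for illustration (the source directory is wherever lsyncd lands your files inside the Docker VM); the `-v HOST_DIR:CONTAINER_DIR` bind-mount form and `--rm` are standard `docker run` options.

```python
import shlex
from pathlib import PurePosixPath


def dev_run_argv(image, vm_source_dir, container_source_dir):
    """Build a `docker run` argv that bind-mounts the synced source tree
    over the copy of the code baked into the image."""
    src = PurePosixPath(vm_source_dir)
    dst = PurePosixPath(container_source_dir)
    if not (src.is_absolute() and dst.is_absolute()):
        raise ValueError("bind mounts need absolute paths on both sides")
    return ["docker", "run", "--rm", "-v", f"{src}:{dst}", image]


# Hypothetical values: lsyncd target in the VM, web root in the image.
argv = dev_run_argv("example/php-app", "/home/core/src/app", "/var/www/html")
printable = shlex.join(argv)
print(printable)
```

Dropping the `-v` pair gives you the image exactly as it would ship, which makes it easy to switch between "live edit" and "as built" runs.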
1. our base operating system is Mac OSX which we will simply call osx
2. on top of osx we run a hypervisor (e.g. xhyve, vmware, virtualbox)
3. on top of our hypervisor we are running a virtual machine with CoreOS as our docker_engine (with the help of or without vagrant)
4. on top of our docker_engine we run an arbitrary set of docker_container instances most of which are comprised of a "bespoke" docker_image of our own; a certain "microservice" in development
How do we:
a) map a ("microservice") project's source code directory located on our osx file system onto the currently instantiated development docker_image for our current project's development docker_container
osx => hypervisor => docker_engine => docker_container
b) and pass any file changes from osx down to the docker_container level as well (i.e. inotify, lsyncd, etc)
BTW, if we want to make this an even more helpful effort, why not make this a proper gist?
By itself, containerization is great already, without involving the devs at all.
What Docker attempts to provide is a framework that lets devs and sysadmins use the same tools. Ideally, if you can get your devs to use this framework, you will unlock the next step: you no longer deliver sources or packages to production, but rather entire Docker images, ready to use and with clearly identified interfaces with other systems. It's a dream come true for IT sysadmins, who can focus on their own problems: monitoring, logging, security, resource management, network architecture, etc.
And that's what's great about the Docker effort: it's the devs trying to be the best wingmen in the world with their sysadmin pals.
It's a set of tooling and an ecosystem and not just the container technology.
I recommend starting with Docker Machine (https://docs.docker.com/machine/)
Another well-written beginner tutorial (but with a few hardcoded urls): http://stackengine.com/docker-101-01-docker-development-envi...