
Docker, Containers, and the Future of Application Delivery - bdon
http://www.slideshare.net/dotCloud/why-docker2bisv4
======
jacques_chester
I feel like the long-term architectural implications of virtual machines and
now containers haven't quite sunk in. I'm not talking about the
_administrative_ advantages, which I think everyone is across these days. I
mean the implications for the design of new applications.

As far as I am aware, folk aren't really writing distributable applications
that target the VM on up. You can get preconfigured stacks, or you can get
standalone apps that you install in your environment.

But nobody's said: "Hey, if we control the app design from the OS up, we can
make it much more intelligent, robust and at the same time sweep away a lot of
unnecessary inner platform nonsense".

In terms of the slides, my approach is to reduce the NxN matrix by eliminating
a lot of the choices. Why write your blog engine to support 5 different web
servers when you can select and bundle the web server? Repeat for other
components.
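
Concretely, that "select and bundle" approach can be sketched as a Dockerfile
(the base image, paths and config file names here are hypothetical, just for
illustration):

```dockerfile
# Hypothetical sketch: a blog engine whose author picks nginx once and
# ships it, instead of supporting five different web servers.
FROM ubuntu:12.04

# The app author chooses the exact server and runtime versions...
RUN apt-get update && apt-get install -y nginx ruby

# ...and ships the app configured for that one server only.
ADD . /srv/blog
ADD nginx.conf /etc/nginx/nginx.conf

EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
```

The NxN matrix collapses because the only supported combination is the one
baked into the image.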

It gets better. Why write a thin, poorly-featured database abstraction layer
when you can take serious advantage of a particular database's features?

You can't do this if you write under old shared-hosting assumptions. You _can_
do this if you target the VM or container as the unit of design and
deployment.

Yes, this is one of my bonnet-bees, since at least 2008:
[http://clubtroppo.com.au/2008/07/10/shared-hosting-is-doomed...](http://clubtroppo.com.au/2008/07/10/shared-hosting-is-doomed-and-i-have-the-graphs-to-prove-it/)

~~~
zdw
But then, you get to take responsibility for that entire stack. This is a bad
thing.

Remember PHP register globals? Rails 2.x? Perl 4? I'd bet a lot of lazy devs
would still be using those if they could just wrap it all up into a container
and say "run this!" That's what commercial products do. And they're much worse
for security as a result.

Fundamentally, I'd say the solution is to automate testing and installation.
Make it extremely easy for a dev to test app A against a matrix of language
implementations B, C, D, databases E, F, G, and OS platforms H, I, J. Make it
easy to make packages that install natively on each platform, with the built-
in package management tools. FPM and similar help with this. Nearly every
platform will allow you to create your own package repos. Better tools =
better code = more flexibility = less ecosystem dependency

Containerization as a logical separation for security (ala chroot/jails before
it) makes sense, but doing it so you can shove your whole OS fork in there and
then fail to maintain it seems foolhardy and myopic.

~~~
jacques_chester
What I notice about all your examples is that they are stack problems that an
_application_ could not, on its own, have fixed. In the current model, each
part of the stack has an independent lifecycle, creating shear points and
hidden security flaws.

If the application can control the whole stack, then the application author
can fix it.

Automating test and install just puts you back where you started: with a
gigantic test matrix that will impose non-trivial drag on the whole
application's development.

And it's not necessary. It's just ... not. necessary.

~~~
jrochkind1
> If the application can control the whole stack, then the application author
> can fix it.

You are right, but the other point is that it becomes the application authors
responsibility to fix it.

If you're bundling apache httpd with your app, and there's a security flaw and
a new version released, it becomes your responsibility to release a new
version of your app with the new version of httpd.

If there are 1000 apps doing this, that's 1000 apps that need to release a new
version. Instead of the current common situation, where you just count on the
OS-specific package manager to release a new version.

Dozens of copies of httpd floating around packaged in application-delivered
VMs is dozens of different upgrades the owner needs to make, after dozens of
different app authors provide new versions. (And what if one app author
doesn't? Because they are slow or too busy or no longer around? And how does
the owner keep track of which of these app-delivered VM's even needs an
upgrade?)

~~~
jacques_chester
You're describing what you see as the advantages of the shared hosting
scenario and in the blog post I linked, I explain why I think that business
will be progressively squeezed out by VPSes and SaaS.

In any case, there's no difference in _kind_ between relying on an upstream
app developer and an upstream distribution. You still need to trigger the
updates.

And you might have noticed that stuff is left alone to bitrot anyhow.

------
boothead
I think docker or similar is a great step forward... One question in my mind
though: what about the databases?

Say you have a web app and a reporting app that use the same database (and
probably a communications framework - a zmq server, rabbitmq, etc. - in the middle
there as well). How does docker deal with the following:

* The data? I can see postgres, redis or whatever being packaged up into containers, but what about the data that they use? Will there be attachable storage? Will you share some exposed resource on the host? Will the data be another container on top of the database app container?

* Routing. How do you tell your reporting/web app containers "this is where your message bus and database live?"

* Coordination. I'm used to using something like supervisord to control my processes - what's the equivalent in docker land? Replace the scripts in supervisord with the equivalent dockerized apps? A docker file for the host specifying what to run? How do you know if your app that you've run inside docker has crashed?

* Or do you just package the whole lot above up into one container?

 _edit_ actually that was more than one :-)
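
One way Docker's primitives map onto those three questions can be sketched in
a single Dockerfile (the package names, paths and environment variables below
are made up for illustration):

```dockerfile
# Hypothetical sketch of one way to answer the data/routing/coordination
# questions with Docker's own primitives.
FROM ubuntu:12.04
RUN apt-get update && apt-get install -y postgresql supervisor

# Data: declare a volume, so the data lives outside the container's
# own filesystem and can be attached from the host
# (docker run -v /host/pgdata:/var/lib/postgresql/data ...).
VOLUME /var/lib/postgresql/data

# Routing: pass service locations in at run time instead of baking
# them in (docker run -e DB_HOST=10.0.0.5 ...); the app reads them
# from the environment.
ENV DB_HOST localhost
ENV MQ_HOST localhost

# Coordination: run supervisord *inside* the container, supervising
# the processes just as it would on a plain host.
CMD ["/usr/bin/supervisord", "-n"]
```

Whether you put the whole lot in one container or one service per container is
then a packaging choice, not something the tooling forces on you.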

~~~
vidarh
> * Coordination. I'm used to using something like supervisord to control my
> processes - what's the equivalent in docker land?

The question doesn't really make sense: The equivalent is supervisord running
inside the lxc container.

> * Routing. How do you tell your reporting/web app containers "this is where
> your message bus and database live?"

How do you tell them in a cluster? This is a problem anyone who's ever needed
to scale a system beyond a single server has already had to deal with, whether
or not the application is packaged up in a container, and there's a plethora of
solutions, ranging from hardcoding it in config files, to LDAP or DNS based
approaches, to things like Zookeeper or Doozer, to keeping all the config data
in a database server (and hardcoding the dsn for that), to rsyncing files with
all the config data to all the servers, and lots more.
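
The simplest of those options - configuration by environment variable - can be
sketched as a tiny startup wrapper (the variable names and the app path are
hypothetical):

```shell
#!/bin/sh
# Hypothetical startup wrapper: take service locations from the
# environment (set by whatever launches the container), with local
# defaults for development.
DB_HOST="${DB_HOST:-localhost}"
DB_PORT="${DB_PORT:-5432}"
MQ_URL="${MQ_URL:-amqp://localhost:5672}"

echo "db: ${DB_HOST}:${DB_PORT}"
echo "bus: ${MQ_URL}"
# exec /srv/app/server   # hand off to the real app (path is made up)
```

The fancier approaches (DNS, Zookeeper, etc.) just replace where those values
come from; the application's contract stays the same.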

~~~
johnbellone
I'm not sure there's going to be a bullet-proof solution to this. I believe
that Docker is the right step forward, especially for the application
containers. What I have been tinkering with is the idea of having smaller,
configurable pieces of infrastructure and then providing a simple tool on top
of that (e.g. 'heroku' CLI).

Once you are past the procurement and provisioning steps you really need a way
to describe how to wire everything together. I definitely haven't solved it
yet but I sure hope to! :)

~~~
vidarh
Take a look at "juju". Canonical is doing a bunch of stuff in this area. Juju
does service orchestration across containers. I don't particularly like how
they've done it, but it shows a fairly typical approach: scripts that expose a
standard interface for associating one service with another, coupled with a
tool that lets you mutate a description of your entire environment and
translates that into calls to the scripts of the various containers/VMs that
should be "connected" in some way, to add or remove the relations between them.

~~~
vosper
What don't you like about Juju's approach, and what do you think the right way
would be?

~~~
vidarh
To be fair, it's been a while since I've looked at it, so it could have
matured quite a bit. I should give it more of a chance. My impression was
probably also coloured by not liking Python... Other than that, the main
objection I had was that writing charms seemed over-complicated (which might
also have been down to the Python examples..), and that there seemed to be too
much "magic" in the examples. But I looked right after it was released, so
it's definitely time for another look.

(EDIT: It actually does look vastly improved over last time; not least the
documentation)

Specifically, I run a private cloud at work across a few dozen servers and a
bit over a hundred VMs currently, and we very much need control over which
physical servers we deploy to because the age and capabilities vary a lot -
ranging from ancient single CPU servers to 32 core monstrosities stuffed full
of SSDs. They're also in two data centres.

When I last looked at juju it seemed to lack a simple way to specify machines
or data centres. I just looked at the site again and it has a "set-constraint"
command now that seems to do that.

The second issue is/was deployment. OpenStack or EC2 used to be the only
options for deploying other than locally. Local deployment was possible via
LXC. EC2 is ludicrously expensive, and OpenStack is ridiculous overkill for us
compared to our current system (which is OpenVZ - our stack predates LXC -
managed via a few hundred lines of Ruby script).

I don't know if that has changed (will look, though, when I get time away from
an annoying IP renumbering exercise...), but we'd need either a lighter option
("bare" LXC or something like Docker on each host would both be ok) or an easy
way to hook in our own provisioning script.

(EDIT: I see they've added support for deployment via MAAS at least, which is
great)

------
jwilliams
I like Docker and I use it in dev/sandpit quite a bit lately... but I must
admit, I don't quite follow the metaphor being pushed here.

Standard workloads don't require containers. You can have standard workloads
on your physical or virtualised hardware. The choice between these is going to
depend on a bunch of factors.

I don't see why they need to be bound together, as appears to be the case with
Docker? Plus I don't see how that provides any special leverage (as opposed to
having the choice).

~~~
mateuszf
It has great advantages by doing it this way.

1\. For developer: \- your application works exactly the same way on your
development, test and production environment because of using exactly the same
os/libs/configuration \- very fast snapshot/restore simplifies automatic tests
and makes them practical

2\. For system admin: \- configure your system once in order to make docker
work, run any docker image (program) with one, the same command \- run many
applications which do not influence each other in a quite safe way \- download
and run application without worrying about its dependencies \- very low
footprint in terms of disk/memory/cpu usage in comparison with standard
virtualization

~~~
jwilliams
1. Agree that's handy in dev. For development a container is probably a good
choice - in other environments virtualization or physical hardware may make
sense.

"More consistent" dev/test/prod is great, but I don't need nor want identical.
My needs in dev are different from production - e.g. Everything from basic
settings, URLs, networking, reloading behaviour to performance tuning. My
needs for my dev DB are quite different from those in production.

2. Agree, but this isn't unique to containers (which is probably what I was
aiming at).

Low footprint is somewhat irrelevant to "configure once". You can manage this
on physical hardware if needs be.

Low footprint is great when you're resource constrained, somewhat irrelevant
at scale. Plus a VM gives you other features/tradeoffs in return for that
cost. Different circumstances will tend to favour one over the other.

I find Docker useful. Containers and Standardised Workloads are great. I see
the utility in both, my argument is I see them as orthogonal. Particularly
referring to the "alternatives" at the end of the deck, which seemed
unnecessary distinctions for me.

~~~
vidarh
> My needs in dev are different from production - e.g. Everything from basic
> settings, URLs, networking, reloading behaviour to performance tuning. My
> needs for my dev DB are quite different from those in production.

But you'll need a lot of things to be configured identically, and you'll need
it to be reproducible (say for when a server fails). That's the point - nobody
forces you to have just one identical script.

The containers vs. vm's distinction otherwise makes a difference for one
reason only: Because containers allow higher density, it is acceptable to
employ them in situations where the cost of vm's would be too high.

This is how it hangs together with standardised workloads: They're lightweight
enough that standardised workloads becomes viable for many more use cases.

(In fact, on Linux, LXC is built on top of cgroups/namespaces which are far
more flexible than that: Instead of building full containers, you can run
individual applications in their own containers, or you can have servers with
built-in cgroups support that could e.g. do stuff like take an incoming
connection, authenticate the user, look that user up and see "ok, this user's
connection is allowed to use 5% CPU and 10% of disk IO and 100MB memory",
fork() and assign the forked process to cgroups accordingly, while at the same
time isolating him so the process in question can only see a certain
directory, can't see other processes etc.. Heck, you could even give the
process its own network, and firewall it so it can only connect to certain
specific network resources. )
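
The per-connection scenario above can be sketched against the raw cgroup (v1)
filesystem. The group name and limits are made up, and the mkdir/echo lines
would need root plus a mounted cgroupfs, so they're shown commented out:

```shell
#!/bin/sh
# Hypothetical sketch: give one authenticated connection ~5% of a CPU
# and 100MB of memory via cgroup v1 control files.
CG=conn-user123
CPU_QUOTA=5000        # 5000us of every 100000us period = ~5% of one CPU
MEM_LIMIT=104857600   # 100MB

# Create the group and set the quotas (root required):
# mkdir /sys/fs/cgroup/cpu/$CG /sys/fs/cgroup/memory/$CG
# echo $CPU_QUOTA > /sys/fs/cgroup/cpu/$CG/cpu.cfs_quota_us
# echo $MEM_LIMIT > /sys/fs/cgroup/memory/$CG/memory.limit_in_bytes

# After fork(), the server would move the child process into the group:
# echo $CHILD_PID > /sys/fs/cgroup/cpu/$CG/tasks

echo "would cap $CG at ${CPU_QUOTA}us/100000us CPU, ${MEM_LIMIT} bytes"
```

Disk IO would be handled analogously through the blkio controller, and the
isolation parts (own directory, own network) through namespaces.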

------
secstate
I love Docker as much as the next pseudo-sysadmin (dev forced to do admin
work), but I think the analogy with shipping containers is a bit of a stretch.
LXC works only on Linux. It's not gonna help you with the BSDs, OpenSolaris or
Windows.

Additionally, a lot of this feels like Docker is taking credit for LXC. Docker
is a brilliant abstraction of LXC's obscure native interface, but LXC came
before Docker and does most of the heavy lifting.

~~~
stfp
To your second point - I think you're right, but I think that pattern - the
productization of existing technology - is a very powerful, positive thing.
And picking the right abstraction, so that it feels - as you put it -
brilliant (a view I share totally), is not trivial.

Also FWIW, on giving credit:
[https://twitter.com/getdocker/status/357983297114079233](https://twitter.com/getdocker/status/357983297114079233)

~~~
secstate
For sure, I didn't mean to downplay how important tools like Docker are, just
that it may not be as revolutionary as the shipping container.

I think some other comments above make a better argument about how the full
power of containers has yet to be realized.

------
kraemate
I don't get the hype. BSD and Solaris have had proper containers for years
now. Just because Linux hasn't had any all these years and LXC is only now
maturing, it doesn't follow that containers are the next big thing.

~~~
FooBarWidget
I also think it's strange. FreeBSD jails are essentially the same thing. When
I describe Docker, I would call it "FreeBSD jails for Linux" rather than
"lightweight virtual machine" or even "iPhone apps-style isolation".

But I do get the hype. BSD and Solaris, although they have many interesting
benefits, just aren't that popular for a variety of reasons. I won't turn this
into a discussion of why they aren't popular, but it's a fact that Linux is
much more popular. Most people aren't going to switch even if BSD and Solaris
provide more benefits. But now that an important feature has become available
for Linux, I can see why people get excited.

It's like Node.js. Servers with evented I/O have existed for years, if not
decades. But the fact that Node.js is JavaScript made it mainstream and that's
why there's hype.

~~~
justincormack
Actually the Linux "container" implementation (namespacing) is based on Plan
9, and is much more modular than FreeBSD jails. Not that lxc/docker use that;
they just use it as a jail...

------
GoNB
Unrelated: Why are people familiar with Hacker News (e.g. the Docker folks)
still using slideshare.net? That site feels like it's stuck around 2005. I
thought everyone here knew about [http://slid.es](http://slid.es) by now --
its UX is far superior.

~~~
plainOldText
I think [http://speakerdeck.com/](http://speakerdeck.com/) is even better. IMO
the UI is very cool and the entire site very easy to navigate; I actually find
it hard to believe it's not a very popular slides sharing platform.

~~~
ludwigvan
Weird, isn't it? Especially considering it was acquired by GitHub, where
almost everyone here has an account. Maybe GitHub is not advertising it enough.

------
Aqueous
It'd be nice to get a suitable cross-platform container format that would let
you create a deployable container across multiple OSes.

Process control groups exist in Darwin/Mac OS X - I wish you could sandbox
packages with private network namespaces and filesystems as well.

~~~
est
I don't understand - binaries/libs usually don't work across platforms.

~~~
Aqueous
Right. You could have a container which has universal dylibs and static libs
for x86_64/i386 compiled for Darwin in one directory and ELF shared objects
and static libraries compiled for x86_64/i386 for Linux, logic to detect the
platform, and the main application binaries compiled for multiple platforms.
And why not throw Windows in there too?

This would create a universal container, assuming all major OSes acquire
facilities for process control groups, namespaces and chroot.

Disk space is no longer a consideration. The containers can be as big as we
want - why not make them run natively everywhere?

~~~
vidarh
Why bother? It'd be far simpler, and more resource efficient, to run whatever
the user prefers of Xen/VirtualBox/VMware or "bare" Linux as the base and not
have to create monstrous franken-containers.

~~~
Aqueous
Well, speaking from personal experience, I develop on Mac OS X and deploy to
Linux. It would be helpful to be able to run the same container on both for
testing purposes.

------
m_mueller
There is one fallacy here: the 'thin' mobile client as the future. Even
though the world has moved towards thin clients in the form of web clients, I
firmly believe that the tides will turn - either by the browser becoming a
thick client itself (with local storage, lots of local logic) or by
reintroducing native clients - for which containerization could play an
essential role.

Imagine if we had a standardized container format on top of Linux, the BSDs
(including OSX) and Windows with Cygwin - hello, cross-platform client
applications with a single installer. Apple and Microsoft could integrate
container technology in their AppStores and we could submit the same package
_everywhere_.

~~~
kstaken
The browser is already a thick client with local storage and lots of local
logic. The difference is that the entire client is shipped to the browser on
demand.

~~~
m_mueller
Ok, that's an interesting point. I'd argue that local storage hasn't really
taken hold yet, i.e. the large majority of sites and apps don't work offline,
but yeah, it already stretches the notion of a thin client (hence my beef
with using this term for where we're headed).

------
jwheeler79
I still don't get it.

~~~
general_failure
Same here. Why can't people talk with concrete examples instead of this
high-level vague stuff? This thing is for devs - explain it like so.

The analogies are pointless; they don't convey anything to me.

~~~
nickstinemates
All I can do is point to my real, practical experience in using Docker.

[http://tryrethink.info](http://tryrethink.info) - over 3,000 happy customers
(unique, sandboxed instances) served in 24 hours. With about a day of effort
total.

[http://nick.stinemat.es](http://nick.stinemat.es) - my blog, which is pretty
Docker focused because that's what I've been working on since I decided to
start improving my writing.

This covers basic continuous integration and deployment of an application to
pretty non-trivial volume.

~~~
jwheeler79
Would you say it's one of those things you feel the warm fuzzy feeling with
after using it? I've been wanting to give it a try with all the hype
surrounding it, but I don't see what it's going to do that provisioning a $10
Digital Ocean server wouldn't (albeit with a little bit of hassle).

~~~
nickstinemates
I could literally type for days about the benefits of docker over a $10
digital ocean server, but no one would read it.

What I will say is this. If you're writing a trivial application that you and
only you will ever need to work with, in an environment completely controlled
by you, and you have a recipe that works - you're right, Docker probably isn't
for you.

If you, like me, work with a huge product suite with many buildtime and
runtime dependencies (services and applications), with many different runtime
configurations, where even automated installation can take up to 15-20 minutes
because of the sheer amount of work that's going on, there's a massive
_massive_ amount of efficiency to be gained in the dev/test/release/packaging
process, let alone the _massive_ amount of efficiency to be gained by the ops
team in working in foreign environments.

There are certainly lots of other use cases (PaaS/SaaS are easy,) and those
are valuable business building tools, but less interesting to me personally.

------
megaman821
How does Docker handle deploying to machines with wildly varying capabilities?
Every machine you deploy on may have different configurations and performance
tunings. Is there a Docker container that can run a DB well on a EC2 small
instance and 16 core 128GB RAM dedicated server?

------
tlrobinson
Today I found out about a similar-ish project called TestLab, "A toolkit for
building virtual computer labs"
[https://github.com/zpatten/testlab](https://github.com/zpatten/testlab)

------
ballard
Enterprise people have already moved in on this with opaque application
volumes. NDAs, so don't bother asking for details.

------
eterm
About a minute after loading this website a setup.exe downloaded which my
antivirus promptly found and got rid of.

Anyone else experience this?

------
dschiptsov
over-hyped FreeBSD's jails?)

~~~
kstaken
Yep - jails, containers, Solaris zones, all the way back to IBM mainframes,
have been similar core technologies. Docker itself isn't containerization;
Docker builds on Linux containers but adds a layer that dramatically
simplifies the build, distribution and execution of containers, and that's
what makes it game changing.

docker run github.com/some/project

That command will clone, build and execute a container based on the contents
of a repository by simply dropping a Dockerfile in the root of the project.
That's a fundamentally different level of usage beyond any existing container
technology.
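
The Dockerfile that workflow expects at the repository root can be as small as
this (the base image, package and entry point are hypothetical):

```dockerfile
# Hypothetical minimal Dockerfile at the root of the project.
FROM ubuntu:12.04
RUN apt-get update && apt-get install -y python
ADD . /srv/app
CMD ["python", "/srv/app/server.py"]
```

Everything else - pulling the base image, layering the build steps, caching -
is handled by Docker.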

