
Why Always Docker? - Fizzadar
http://pointlessramblings.com/posts/Why_Always_Docker/
======
melted
To understand scenarios under which Docker/Kubernetes/LXC are useful, you need
to understand the environment in which Linux containers originated. They were
born in Google data centers, where engineers may routinely need to reliably
deploy tens of thousands of preemptible nodes and tie them together into a
service by providing health checks, monitoring, endpoint enumeration, resource
limits, isolation, and so on. Docker/Kubernetes let you do exactly that. At
Google, though, Borg is run by SREs, and engineers don't have to worry about
managing it, so it is quite economical to spin up even single task jobs there.
Google also makes deployment much easier by using static linking, and by
structuring its build system outputs in such a way that they can either be
easily deployed by copying them over or be packaged into an official,
versioned deployment package (+providing command line flags through Borg
config files).
When tasks/jobs go away (get preempted or killed -- servers in this
environment usually do not support orderly shutdown), whatever they wrote to
local disk gets cleaned out, including binaries and data files. Persistent
data is written to persistent, distributed storage backends, where it belongs.

As you can see, most parts of this picture map nearly exactly to how
Kubernetes/Docker are supposed to be used. Used in this way to manage large
deployments, containers provide an unbeatable value proposition.

~~~
jacques_chester
Put another way: Google has built several generations of internal PaaSes.

This unblocks continuous deployment at the final step, so latency from idea to
production falls from years/months/weeks to hours. Or minutes.

Docker's had an interesting life: they built a PaaS, discarded the PaaS, now
they're building a PaaS. Because that's what most developers actually need for
their daily lives.

It turns out that tinkering with V8s is a lot of fun for a lot of people, but
most drivers just want to know how to turn on the car and have the same basic
interface work for any workload: wheel, accelerator, brake.

Disclaimer: I work for Pivotal, which is the majority donor of engineering
effort to Cloud Foundry, a PaaS inspired in part by Google's experiences.

~~~
melted
I wouldn't quite go as far as to call Borg a PaaS. It's at a somewhat lower
level, closer to IaaS. You get a known quantity from it in the form of a
stable, tuned, stripped down underlying Linux image, plus a relatively small
set of services, but you can deploy pretty much whatever the hell you want
without a lot of constraints a "true" PaaS would force you to accept.

~~~
jacques_chester
My understanding is that BOSH is the closest to this from the Cloud Foundry
ecosystem.

------
willejs
This raises some interesting points, and I agree in part.

I think Docker makes sense in some cases: single compiled binaries that
adhere to 12-factor app standards, I'm all for putting in a Docker container.
I would then run them on a PaaS, if the reasoning is sound. I am currently
working on a project that does exactly this.

However, shoehorning something like a php-fpm & nginx stack in there, or
anything else that doesn't fit the aforementioned spec, seems, and mostly is,
complicated. Doing something like this has caveats, becomes confusing and,
when you look into it, crazy.

I am fed up with the hype and with people thinking that Docker is the silver
bullet that solves all problems and should be used for everything.

Ultimately, I feel like a lot of people don't understand how docker works
under the hood, and what it takes to deploy and operate applications in docker
containers in production. The result of this is mostly scary.

I feel like I don't have to go into details about security, entry point
scripts, gosu, multiple processes, logging, sketchy build processes, mounting
config volumes, persistent storage, layer caches, networking, links, SDN and
more. These are some of the things you have to work with, around, or avoid
with Docker; they are the issues people are not aware of, or don't yet
understand.

~~~
magicmu
I think the hype that you (and the author) are talking about exposes an
interesting problem in products that target developers. Overwhelming
popularity is generally awesome for a business / product, but in this case it
seems that over-use may actually be diluting the core value-prop of Docker. If
people were silly, which sometimes happens, I could see a snowball effect
where Docker is generalized to the point of being wholly un-useful, which
would be terrible for the product. Not that that will happen, but it's an
interesting thought.

~~~
ddw
And often tools that the developer masses love optimize for getting started
quickly ("I can spin up an instance of Elasticsearch in one line!") instead of
what is sustainable.

------
sz4kerto
"These kind of systems have their own configs, be it elasticsearch.yml or
my.cnf. The Dockerfile format is completely fucking useless at this kind of
thing."

A solution I like: use Docker and mount the config as a volume, or add it to
the image in an additional build step. (I.e. have a my-app:2.3.4-base and
then, when moving to prod, create a new image my-app:2.3.4-prod.) The reason
why 'Docker in production' is inevitable (as I see it) is that it makes it
trivial to iterate on your whole setup, not just your application code. If you
work with gcc version X and Java version Y, and then you change and test with
new versions, you want to version control those changes and update them in
production easily, within your normal development flow.

(By inevitable, I mean that it's going to happen. Images are the new
packages.)
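
A minimal sketch of that base/prod split (image names, package choices and
file paths here are illustrative, not taken from the comment):

```dockerfile
# Dockerfile for my-app:2.3.4-base -- pins the runtime and the app itself
FROM debian:jessie
RUN apt-get update && apt-get install -y openjdk-7-jre-headless
COPY my-app.jar /opt/my-app/my-app.jar

# Dockerfile for my-app:2.3.4-prod -- layers only the prod config on top
FROM my-app:2.3.4-base
COPY config/prod/my.cnf /etc/mysql/my.cnf
```

Alternatively, skip the prod image entirely and mount the config at run time,
e.g. `docker run -v $(pwd)/config/prod:/etc/mysql my-app:2.3.4-base`.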

~~~
ownagefool
I don't really get it to be honest.

Sure, the Dockerfile format is simple but if you need to do anything
complicated, you just call another script that does it. I don't really see how
it harms you?

Also, I don't really understand why he wouldn't run his private registry in
kubernetes if he has such a stack. I'd pretty much run everything in it.

------
markbnj
>> ... and I need none of Dockers scaling properties, so I'll run it direct on
hardware.

What is not "direct on hardware" about Docker containers? There's a bit of a
misunderstanding here, and I wouldn't nitpick on it if I didn't think it
betrayed something about the author's point. In some way or other he sees
Docker as additional overhead, like a VM. While there obviously is _some_
overhead this isn't an accurate picture. As for the overall point of doing
everything in a container, according to various sources that is exactly what
Google does now, for example. The reason is that containers capture
dependencies, and they make for much more fluid and manageable systems. As
with most changes of this magnitude there are waves of adulation and
revulsion, but overall I think this is the new world.

On the elasticsearch point: you can use environment variables inside the
elasticsearch.yml file, and you can set environment variables inside a
container when you execute it so there is a complete pathway to pipe
configuration information into the container. There are really only two things
that cause an issue: discovery and disk volumes. Discovery is a problem
because es uses udp multicast by default, but there are plugins that
substitute other mechanisms for listing cluster members. On kubernetes/GKE we
use a fabric8 plugin for this. Disk volumes are an issue just because most
container platforms don't yet deal well with them. We had to roll our own
solution for dynamically attaching replication controllers to GCE persistent
disks, but there are some better solutions in the release pipe.
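
As a rough illustration of the env-var pathway described above (the variable
names are made up; Elasticsearch resolves `${...}` placeholders in its config
file from the environment):

```yaml
# elasticsearch.yml -- placeholders filled from the container's environment
cluster.name: ${ES_CLUSTER_NAME}
node.name: ${ES_NODE_NAME}
```

The values are then supplied at launch, e.g.
`docker run -e ES_CLUSTER_NAME=prod-es -e ES_NODE_NAME=es-1 ...`.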

------
zwischenzug
I have some sympathy with a 'Why Docker' rant, but recently I've had
experiences which have modified my view.

The separation of code and data has made reasoning about my DB upgrades
(postgres, mostly) much easier.

The 'Docker is great for dev, not prod' view is also one I used to favour, but
it's inevitably true that what begins in dev does not stay in dev.

Finally, the Dockerfile limitations led me to create my own CM tool (ShutIt)
so that I could configure my stateless and complex environments into code that
could easily be understood and changed by the casual dev.

------
bitcointicker
Personally the best use case I have for Docker at the moment is testing Chef
cookbooks with the Test Kitchen Docker driver -
[https://github.com/portertech/kitchen-docker](https://github.com/portertech/kitchen-docker)

I can write my cookbooks and almost instantly test them inside a container.
You can even test on multiple platforms (Debian, RHEL, etc.) in parallel. You
can perform integration testing using serverspec once the container has
converged to the required state -
[http://serverspec.org/](http://serverspec.org/)

~~~
bazfoo
I found myself doing the same thing for Ansible.

The problem I ran into was where I wanted to test service restarting in a
systemd based environment. Older releases using sysvinit work perfectly fine.

~~~
arianvanp
This is why you should check out systemd-nspawn. It was designed especially
for this use case.

Also. If you're on upstart, give lxc a shot. We currently test our ansible
scripts by deploying to lxc by giving each container a static IP in a bridged
network to simulate our production environment. Just swap ansible inventory
files. Works like a charm.
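
For reference, the static-IP setup described above looks roughly like this in
a legacy LXC 1.x container config (container name, bridge and addresses are
examples):

```
# /var/lib/lxc/web01/config
lxc.network.type = veth
lxc.network.link = br0
lxc.network.ipv4 = 10.0.3.11/24
lxc.network.ipv4.gateway = 10.0.3.1
```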

------
meirelles
For my use case Chef makes much more sense. I like Docker; it's actually a
very important tool for my development environment and testing. But with many
moving parts in production, some of them needing to persist data, it would be
hell to split and manage so many app containers. I can't see how Docker would
help me save time. For production, LXC/KVM/nothing + Chef is usually better
for me.

~~~
bitcointicker
You can use chef and docker together, if you really want to. Containers do
provide some benefits as others have mentioned in this thread ( Packaging,
avoiding conflicts, maybe even as a chroot on steroids for isolation
purposes).

You could have a server managed by chef which installs docker, pulls down a
number of containers and then launches them, hooking them together if
required. If random ports are used, chef can capture these and then hook into
a load balancer to register the containers.

You can even have chef build containers from a Dockerfile, to make sure they
have the latest updates, tag the image and then launch them.

So many options it often makes your head spin :-)

~~~
meirelles
Yes, I agree with you. There are many other good uses for Docker. But I found
LXC easier, as it's possible to assign a public IP and let Chef manage the
iptables/service discovery exactly like on a VM/bare metal. Docker drops
almost all capabilities, which is great for security, but it isn't possible
for a container to manage its own isolated iptables.

------
KirinDave
> "The Dockerfile format is completely fucking useless at this kind of thing."

Right... which is why we have Docker Compose. The point of the image is to
provide the code and the harness for launching it WITHOUT those assumptions.

> "But wait - how do we configure these services for multiple environments
> (test/prod clusters)? They don't read our ENVvars, nor do they know of our
> internal service discovery tools."

This is why docker containers are composed out of other containers. You use an
elasticsearch container as a basis and extend it out with your tools to make
your unique flavor of deployable search unit. This is not a new technique to
anyone, as even the es docker image itself is built off another base image.
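
A sketch of that composition (the tag and file paths are illustrative, not
taken from the official image):

```dockerfile
# Our "unique flavor" of deployable search unit, built on the official image
FROM elasticsearch:1.7
# Bake in site-specific config and a wrapper entrypoint
COPY config/elasticsearch.yml /usr/share/elasticsearch/config/elasticsearch.yml
COPY our-entrypoint.sh /our-entrypoint.sh
ENTRYPOINT ["/our-entrypoint.sh"]
```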

I get the impression the writer of this has yet to really internalize what
docker containers are.

> "Tools like pyinfra and Ansible are much more suitable for this kind of work
> (and don't install useless crap to generate a config file)."

Are they though? This is said without really any justification. To me, I'd
100% rather do it via Docker. Next to actually locking everything into one big
solid lump via Nix, Docker actually gives you reproducible and reusable chunks
of code with nearly infinite and modular configurability, without any care
about installations stepping over one another or even library conflicts.

Sure, things like an unprunable stale image cache filling up small disks is
annoying. But the alternative is a continuous and inscrutable agglomeration of
code and configuration files onto a box, eventually leading to total disaster.

But kinda typical of someone who wants to run Go. If you're building Go you've
already accepted that you'll never ship the same executable twice.

~~~
jacques_chester
> _This is why docker containers are composed out of other containers. You use
> an elasticsearch container as a basis and extend it out_

My limited experience is that this recreated all the worst properties of
single-inheritance subclassing. In particular, a lot of subclassing for
construction.

~~~
KirinDave
Docker containers shouldn't have substantial subclassing. In fact, for
production work you should remake the image from scratch for security.

The benefit is the triviality and the orthogonality. Docker makes system
components that can't interfere with one another and that can cleanly mesh
with each other via simple contracts. As a means of retrofitting older
software models into a new style of system assembly, it's excellent.

------
godzillabrennus
Docker seems to be everywhere these days.

Mostly I see it in dev environments and not production though.

I'm also waiting for Cal Leeming to post his annual update on Docker. Last
year's was memorable:
[http://iops.io/blog/docker-hype/](http://iops.io/blog/docker-hype/)

~~~
sleepycal
It's actually coming in about a week or two. I wanted to do it on the 17th
(exactly one year after) but needed more time to work on it. No spoilers, I
don't want to ruin the surprise :)

------
Annatar
Why use Docker, when the payload can be packaged into an OS package and run
inside of a SmartOS zone, which is a fully functional UNIX system, yet
completely isolated and running at the speed of bare metal? It makes no sense
to use Docker for anything if I can do configuration management and payload
deployment with OS packages inside of a SmartOS zone.

[https://youtu.be/0T2XFSALOaU?t=1245](https://youtu.be/0T2XFSALOaU?t=1245)

~~~
j_mcnally
Wait.... are you saying docker is slower than bare metal? Have you used
docker?

~~~
Annatar
I'm saying that a lot of people end up running Docker in a VM... why?

I'm also saying that dumping a bunch of files from a developer's laptop into a
Docker image is going to be a nightmare in terms of lifecycle management (how
about a subsystem rollback or upgrade inside of that image?)

And finally, I'm saying I see no point to Docker if I can just make OS
packages and run them inside of zones. With zones, I have a fully functional
UNIX server in complete isolation and security; with Docker, I have a
reinvented init which isn't really init, and if I want SSH and all the other
things one normally expects of a system, I have to engineer them myself. Why
would I use Docker if I can use zones in SmartOS? What does Docker buy me?

~~~
azernik
a) from a quick look at SmartOS, it looks like yet another implementation of
containerization, with an option to run a full KVM if you want. And it has to
run as a full OpenSolaris-based system image, instead of just being a binary
installable on a Linux system (much more familiar to most developers)

b) "dumping a bunch of files from a developer's laptop into a Docker image"...
I'm sorry, what? I have no idea what workflow you're referring to here.

WRT your specific gripes about subsystem rollback - the usual Docker best
practice is to have each container run only a single subsystem, and to have
images be generated by checked-in Dockerfiles based only on checked-in
resources. If you need to upgrade or downgrade, you spin up a new container
running a different image, fail over to it, and kill the old one.

Once a container starts running it is immutable. Any of the features of a
running container can be inferred just from looking at the Dockerfile(s) that
built it and the connections it has to storage volumes, other containers, and
the external network.

~~~
Annatar
> from a quick look at SmartOS, it looks like yet another implementation of
> containerization

It is the first ever implementation of true containers (zones were released in
2005), and it is modeled on BSD jails.

What is or is not familiar to most developers is irrelevant to me when I am
engineering a solution, because my focus is on encapsulation, stability and
lifecycle management. What others are familiar with is irrelevant in that
case, especially since correctness of operation and data integrity are
priority, with everything else taking a back seat to those.

> WRT your specific gripes about subsystem rollback - the usual Docker best
> practice is to have each container run only a single subsystem

But it doesn't have to be:
[http://phusion.github.io/baseimage-docker/](http://phusion.github.io/baseimage-docker/)

Besides, if there is an issue, and one were to follow the practice of running
only one service inside of a Docker image, one could not SSH in to
troubleshoot the image. With Solaris zones on SmartOS, it is completely
unnecessary to run a single service or process inside of a zone, because
zones offer full isolation. I see no sense in opting for a harder approach
with Docker, especially when that approach offers neither full isolation nor
security.

> If you need to upgrade or downgrade, you spin up a new container running a
> different image, fail over to it, and kill the old one

Which I imagine means that I have to build a whole new image, presumably based
on the old image, then deploy an entire image (what if it is Oracle database
software, which is anywhere from 800 MB to 2.5 GB, not counting the
database?). It is much cheaper and faster to just rebuild the affected package
and upgrade it in place inside of a zone than to respin an entire image,
especially if that image is several gigabytes.

~~~
tra3
I want to discuss your last point. With Docker, you are free to either modify
the image or the running container. An image is a "template" for a container
and in the scenario you describe, the ideal solution is to create a new image
because it can be potentially running on multiple nodes. However, nothing
prevents you from accessing the container (no SSH required) and modifying the
container in place. Although I do believe it is discouraged.

Thanks for the SmartOS reference, it looks very interesting.

------
willcodeforfoo
It's a good question, especially in the age of small static binaries with no
external dependencies anyway.

Even if the isolation isn't of much value, Docker is still useful as transport
and storage. Getting back to the shipping container metaphor, it's easier to
move things around if they are all the same. And Docker containers are a
pretty good way to do that with code.

~~~
sz4kerto
I don't know if this is the age of small binaries. Maybe in some industries.
The artifacts we're deploying -- partly because of various constraints,
partly because of the weight of legacy -- are between 25 and 50 MB. Oh, and
they run inside of a Java app server that's also 100-200 MB. Ah, and that
runs on a Java VM. The integration tests require a running Firefox, Chrome,
V8, JVM, databases, whatever.

(No, I can't replace these with a couple of command-line Unix tools just yet.)

~~~
lobster_johnson
I suspect the parent is mostly referring to Go.

~~~
auvrw
i do wonder what aspects of Go make it good for the container use-case. both
Docker and the other container system mentioned at the top of the article are
written in Go.

~~~
jacques_chester
> _i do wonder what aspects of Go make it good for the container use-case._

Same as JARs or C/C++ binaries. You can ship the compiled product to the
target runtime and expect it to launch and run as-is.

Languages with an interpreted nature require containers to also ship an
additional runtime, plus a dependencies mechanism.

~~~
techdragon
Go is a step further than these, though: since it enables static binaries,
you can run Go programs in Docker with nothing else in their containers. Just
one file, the Go binary.

Which is amazing and frustrating since it exposes the inability of other
languages to operate in such a simple environment. Even languages like Rust, C
and C++ aren't able to do this reliably all the time, with the results being
highly dependent on libraries and platforms of choice.
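
A sketch of what that looks like in practice (assuming the binary has no cgo
dependencies; names are illustrative):

```dockerfile
# Build on the host or in a builder container with CGO disabled,
# so the resulting binary is fully static:
#   CGO_ENABLED=0 go build -o app .
# The image then contains nothing but that one file:
FROM scratch
COPY app /app
ENTRYPOINT ["/app"]
```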

~~~
auvrw
thanks for the replies ;-) ... tbh, Go vs. Rust for the "i wanna write a
Docker-thing!" use case was pretty much the question i had in mind.

the Rust ppl are looking toward static linking [links]...

[https://internals.rust-lang.org/t/static-binary-support-in-r...](https://internals.rust-lang.org/t/static-binary-support-in-rust/2011/48)
[https://github.com/rust-lang/rust-buildbot/issues/24](https://github.com/rust-lang/rust-buildbot/issues/24)

------
zeta0134
I use docker to build a particularly complex (at least to me) NDS project. I
do this because I regularly develop on Windows or Linux, so do my friends, and
none of us have simple working arrangements. The toolkits for NDS are a bit of
a pain to set up, and I want the source for my project to be usable by anyone
in the community, regardless of what platform they use, or what changes happen
to the development tools over time.

Thus, docker. It lets me figure out compile-time libraries and dependencies
once, ever, on one platform. (Debian base.) Then magically everyone else on
the team can just run the build script, which calls into the docker image
(building it first if needed) and voilà, project built. It's not _nearly_ as
efficient for compiling and making frequent changes, but in our case the lack
of complex setup and of differences between build environments is worth the
extra overhead.

I think there's something to be said for Docker as a development tool in
general; it's nice to be able to play around with development libraries
without (a) cluttering up my main machine's list of installed packages, or (b)
spinning up a virtual machine and sapping my workstation's RAM.

------
MichaelBurge
If it's a single Go binary, I imagine you can just compile it using the
Makefile.

I googled and found this:
[https://github.com/docker/distribution](https://github.com/docker/distribution)

It has a Dockerfile that just calls make. Everyone uses the usual unix tools
to build software - Docker makes some sense for deployment, but it's not
really suitable for development (what if I need to add profiling? DWARF
debugging information? Tweak the optimization settings? Disassemble one of the
object files? Attach gdb to a process? Strace the process to understand it?).
So there'll always be the basic build instructions, and the Dockerfile will
probably wrap them: adding Docker is way more abstraction than I'm willing to
deal with when debugging a tricky problem.
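
That wrapping pattern is roughly the following (repository path, base image
and make target are hypothetical):

```dockerfile
# The Dockerfile stays a thin wrapper; the Makefile remains the single
# source of truth, usable directly for profiling/gdb/strace workflows.
FROM golang:1.5
COPY . /go/src/github.com/example/myproject
WORKDIR /go/src/github.com/example/myproject
RUN make
CMD ["bin/myproject"]
```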

------
j_mcnally
I for one am excited to get to the point where it doesn't make sense to
dockerize everything.

Docker is like violence: if it's not working, you aren't using enough.

------
olalonde
> These kind of systems have their own configs, be it elasticsearch.yml or
> my.cnf. The Dockerfile format is completely fucking useless at this kind of
> thing.

confd is meant to solve this problem [0]. We use it at work to keep our
bitcoind server configuration in sync with etcd [1]. Deis (the PaaS) also
relies heavily on it, to generate nginx configuration files for example [2].

[0]
[https://github.com/kelseyhightower/confd](https://github.com/kelseyhightower/confd)

[1] [https://github.com/olalonde/coreos-bitcoind](https://github.com/olalonde/coreos-bitcoind)

[2]
[https://github.com/deis/deis/tree/master/router/rootfs/etc/c...](https://github.com/deis/deis/tree/master/router/rootfs/etc/confd)
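
For a rough idea of how confd wires this up, a minimal template resource
might look like this (paths, keys and commands are illustrative):

```toml
# /etc/confd/conf.d/nginx.toml
[template]
src        = "nginx.conf.tmpl"
dest       = "/etc/nginx/nginx.conf"
keys       = ["/services/web"]
reload_cmd = "nginx -s reload"
```

confd watches the listed etcd keys and re-renders nginx.conf from the
template whenever they change.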

~~~
SevereOverfl0w
I've always felt like config should be "COPY"'d into an image. Etcd/Confd
looks really neat, in principle, but I feel like it's asking for trouble in
terms of "immutable containers."

I'm not overly familiar with the system though, so I may be misunderstanding.

------
dragonsh
Real things get lost in the hype cycle; as it is said, the right tool for the
right job.

If you are looking for a lightweight container VM, use LXD. This lets you use
SaltStack, Ansible, Chef, Puppet, etc. CM tools for system management. The
same configuration can run on bare metal, a Vagrant-based VM on the desktop,
or cloud services like AWS, Azure, Google or many others.

If you are looking for application containers running a single daemon, use
Docker (I am not using the term process, since many daemons fork multiple
processes and Docker still calls it a single process).

Docker by default doesn't yet support unprivileged containers, which poses
security risks on multi-tenant systems, so it can only be used with the added
overhead of a virtual machine on AWS, Google, Azure, etc. But it's still good
for continuous integration and development, given that the hype has resulted
in the integration of many tools around it.

------
euroclydon
If the only instructions or working distribution for a piece of software is a
Docker image and you're not into Docker, then that is probably not an OSS
project you should use.

I learned this the hard way with Bosun. I should have just avoided it totally,
and saved a bunch of time.

------
pekk
Despite its problems, it is an approximation to a packaging standard which
provides enough isolation to manage dependencies successfully.

Does anyone not remember what it was like to fight shared library versioning
conflicts? Do you want to be handling the GitHub issues attached to people
screwing up that kind of thing in 2017 because their distribution or OS X
package manager randomly changed?

~~~
davexunit
>Despite its problems, it is an approximation to a packaging standard which
provides enough isolation to manage dependencies successfully.

Docker is ultimately a non-solution that papers over the problems of
traditional system package managers, language-specific package managers, and
the myriad of software (mostly Java) that no one actually knows how to build
from source. Containers do not compose. There are many runtime environments to
consider, and Docker can't handle anything but containers. You need to use
some other software to manage the host system at the very least. Furthermore,
the container images have no useful provenance for users to inspect. It's a
security nightmare.

Functional package management is the real solution here. Software like GNU
Guix and Nix solve real problems. They remove global state (/usr), enable
reproducible builds, allow unprivileged package management, support
transactional upgrades and roll backs, deduplicate software system-wide for
all users, handle full-system configuration in a declarative way, eliminate
the need to trust any particular provider of binaries, and more.

>Does anyone not remember what it was like to fight shared library versioning
conflicts?

Using Docker to solve this problem is like using a sledgehammer to drive a
nail.

~~~
eropple
_> Docker is ultimately a non-solution that papers over the problems of
traditional system package managers, language-specific package managers, and
the myriad of software (mostly Java) that no one actually knows how to build
from source._

It's interesting that you say this -- I would say almost the exact opposite
here: most of the Java applications are the best-behaved ones on systems I
deal with, both in terms of execution (it's in the Maven repo) and in terms
of process control (cgroups are nice for, like, a Ruby app, but here I've got
-Xmx). The poorly behaved ones seem to be the ones that don't use standard
tools, and maybe I've been lucky but for me that's all third-party and mostly
open-source stuff; a lot of new-hotness stuff (Kafka, _I am looking at you_,
I love you but you are a pain in the behind to deploy) can't just be run
straight out of the Maven repository with a bash script or whatever.

About the only place I use containerization at all (and I don't use Docker,
for reasons I've described elsewhere around here) is for Ruby or Python
applications where otherwise I do end up with _stuff_ thrown all over the
place and multiple versions of the runtime fighting for supremacy. I'd love to
use Guix/Nix, but it's a hard fight to win in a corporate environment.

 _> Using Docker to solve this problem is like using a sledgehammer to drive a
nail._

I wish to upvote this eleven times. I can but do so only once.

~~~
thinkpad20
> I'd love to use Guix/Nix, but it's a hard fight to win in a corporate
> environment.

I don't know how large or flexible your organization is, but I've been driving
hard at my company (which is a bit shy of 100 employees) for using nix, and
it's working. When I started advocating for it there was no small amount of
skepticism, but we started off just using it for a very small and specific use
case, and from there it has slowly but steadily gained acceptance from other
developers (most of whom have no particular interest in FP) as a real solution
to innumerable problems that we have w.r.t. package management. If you're
interested in Guix/Nix, I'd encourage trying to get permission to use it to
solve a specific problem.

~~~
eropple
I'm a consultant. I can push for a lot if it's conventional, but I have to
pick my battles. That's not one I feel I can win.

------
castell
Run a Linux application in a container like a FreeBSD jail or Sandboxie --
that's what I want. I don't need the management overhead.

Give me a Docker "light" or a good tutorial for LXC(?).

------
NamPNQ
> how do we configure these services for multiple environments (test/prod
> clusters)?

Docker has config via ENV variables and VOLUMEs; just research them.
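
A minimal illustration of the two mechanisms (base image and paths are
examples):

```dockerfile
FROM elasticsearch:1.7
# Default, overridable at run time with `docker run -e ES_CLUSTER_NAME=prod ...`
ENV ES_CLUSTER_NAME dev-cluster
# Persistent data lives outside the container's writable layer
VOLUME /usr/share/elasticsearch/data
```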

------
dschiptsov
The same reason as with Why Always Java - availability bias, self-serving
bias, attribution error and related mass hysteria.

