
Your Docker image might be broken without you knowing it - jballanc
http://phusion.github.io/baseimage-docker/
======
xal
The comments about the init process are true. It makes sense to run a proper
PID1 system here such as runit.

I'd take issue with the rest of the post, though. The problem is that Phusion
makes the common mistake of thinking of containers as faster VMs. That's fine;
it's where almost everyone starts when first looking at the Docker paradigm.

A good rule of thumb is: If you feel like your container should have Cron[1]
or SSH[2], you are trying to build a VM not a container.

VMs are something you run a few of on a particular computer. Containers are
something you will run thousands or tens of thousands of on a single server.
They are a lot more lightweight, and loading them up with VM cruft doesn't
help there.

[1] Cron: use the cron of the outer machine with docker run

[2] SSH: use lxc-attach
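
For [1], a rough sketch of what that host-side crontab entry might look like
(the image name and script path are hypothetical):

    # host crontab: the host's cron starts a short-lived container at 3am
    0 3 * * * /usr/bin/docker run --rm myapp-image /usr/local/bin/nightly-cleanup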

~~~
FooBarWidget
I disagree with running cron outside the Docker container. One of the reasons
for using Docker is to lower deployment pain. The moment you use cron on the
host machine you've introduced yet another moving part, and yet another
dependency that must be installed on the host.

I also disagree that there is a mistake here involving thinking of containers
as faster VMs. Yes, you can think of them as applications, but the fact
remains that there is a whole operating system running inside the container,
and that many apps rely on cron and other such services. Given that crond is
so small and lightweight, and that a lot of people don't know what depends on
cron and what doesn't, I think it's better to turn it on by default. If you
know for sure that you don't use cron, you can still turn it off.

Remember, the goal of baseimage-docker is to provide a base image that is
correct for most people, especially people who are not intimately familiar
with the Unix system model.

Lxc-attach, although it works, has several problems:

* You are doing things outside Docker, so you won't be able to track them (logs, attach, etc). Also, Docker might use LXC right now, but there is no guarantee it will do so forever. For example, what if you're using Docker on OS X? No lxc-attach there.

* It does not allow you to limit access. What if you want to give a person only access to a specific container? You can do that with SSH through the use of keys.

* lxc-attach has caveats with --elevated-privileges, documented in the man page.

~~~
zimbatm
Just run another container that runs cron.

Regarding lxc-attach: this is a command that Docker should expose, to allow
this kind of operation. Even with baseimage-docker you won't be able to track
sessions, because all you can do is attach to or log sshd. Also, how do you
manage all those SSH keys and port mappings on a single host?

------
nailer
Fascinating. My first inclination, when I started running Docker, was to run
/sbin/init and launch a full systemd and all services.

I even asked on ServerFault (ie, StackOverflow for servers) about it and was
told, quite aggressively, that running a full OS is wrong:

[http://serverfault.com/questions/573378/how-can-i-persistent...](http://serverfault.com/questions/573378/how-can-i-persistently-run-a-docker-container-without-specifying-a-command)

Addressed individually:

1. Reaping orphans inside the container.

Yup. If your app's parent process crashes, its child processes may now be
orphans. However, in this case your monitoring should also restart the entire
container.

2. Logging.

Assuming you run your Docker image from a .service file (the CoreOS
standard), systemd-journald on the host will log everything as coming from
whatever your unit (.service) name is. So if you `systemctl start myapp`,
output and errors will show up in `journalctl -u myapp` in the parent OS. (A
minimal unit sketch follows at the end of this list.)

3. Scheduled tasks.

For things like logrotate, it really depends whether you're handling logs
inside or outside the container. Again, I'd use systemd-journald in CoreOS,
rather than individual containers, for logs, so they'd be rotated in CoreOS.
For other scheduled tasks it depends.

4. SSHd

It depends. SSH isn't the only way to access a container; you can run
`lxc-attach` or similar from the host to go directly to a container.
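
For item 2, a minimal unit sketch, assuming a hypothetical image and
container named myapp-image/myapp:

    # /etc/systemd/system/myapp.service (sketch; names are hypothetical)
    [Unit]
    Description=My app container
    Requires=docker.service
    After=docker.service

    [Service]
    ExecStart=/usr/bin/docker run --rm --name myapp myapp-image
    ExecStop=/usr/bin/docker stop myapp

    [Install]
    WantedBy=multi-user.target

Everything the container writes to stdout/stderr then lands in the journal
under the unit's name.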

I do mention CoreOS here because that's what I use, but RHEL 7 beta, recent
Fedoras, and upcoming Debian/Ubuntus would all operate similarly.

~~~
FooBarWidget
Regarding reaping orphans: orphans do not necessarily imply that something
crashed or that something went wrong. Orphans are a perfectly normal part of
system operation. Consider an app that daemonizes by double forking. Double
forking is a common technique with the intention of making the process become
adopted by init. It expects PID 1 to be able to handle that correctly.
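
For the unfamiliar, a minimal sketch of the double fork in Python (the sleep
stands in for real daemon work):

    import os, time

    # double fork: the grandchild is adopted by PID 1, which must
    # eventually reap it, or it lingers as a zombie after it exits
    pid = os.fork()
    if pid == 0:                  # first child
        os.setsid()               # detach from the controlling terminal
        if os.fork() > 0:
            os._exit(0)           # first child exits, orphaning the grandchild
        time.sleep(60)            # grandchild: the actual daemon
        os._exit(0)
    os.waitpid(pid, 0)            # the original parent reaps only the first child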

Regarding logging: that only holds for output that is written to the terminal.
There is a lot of software out there that never logs to the terminal and logs
_only_ to syslog.

As for all the other stuff: it is open to debate whether they should be
handled inside or outside the container. The right solution probably varies
on a case-by-case basis. Baseimage-docker's point is not that everyone _must_
do things the way it does. It's to provide a good and useful default for
people who have no idea that these issues exist. If you are aware of these
issues and you think that it makes sense to do things another way, go ahead.

~~~
nailer
> Consider an app that daemonizes by double forking. Double forking is a
> common technique with the intention of making the process become adopted by init.

But that's daemonization - if you don't daemonize (because your container is
itself a daemon) you won't need it.

Ack re: syslog. The stuff I'm writing now uses stdout, and I let systemd do
the work. If I had some proprietary app that needed a local syslog then I'd
run that in the container. But I'd more likely ask the vendor if I could just
disable the syslog requirement and either get stdout or journald for things
like multiple field support etc.

The reason I'm writing is because the post gives the impression that everyone
is likely to be 'doing it wrong'. There are definitely some things to
consider, but I don't think that impression is accurate.

~~~
FooBarWidget
Correct, but how do you know your app never spawns anything that may
daemonize? If you wrote your app yourself and know all your dependencies and
how everything behaves, then fine. But what if you're simply packaging someone
else's app? What if you're building a database server container? Have you read
every single source line to be sure that it will never behave that way and
leave behind zombie processes?

 _That_ is what baseimage-docker is about. It's about providing a sane, safe
default that behaves correctly. There are also other ways to make your system
behave correctly of course.

As for "giving the impression that everything else is wrong": that is not the
intent of the article. The title is only meant to catch the reader's attention
and encourage the reader to continue reading and understand the edge cases.
The title says other things MIGHT be wrong; it does not say that they ARE
wrong. So let me make it clear: any solution which solves the edge cases and
issues described in the article is correct. Baseimage-docker is _one_
solution, the one provided by us. It is not the only possible solution.

~~~
nailer
Good point, and accepted re: the database example.

------
markbnj
I've only been working with Docker for a couple of months, and I find this
discussion really interesting. The goal of trying to get containers to behave
more like a full system across various lifecycle events is somewhat orthogonal
to my own aims, which have been to get my containers as close to stateless as
I can.

Like some other posters here I view containers less as a lightweight VM, and
more as a process sandbox. In the context of a scalable architecture I would
like a container to represent a single abstract component, which can be spun
up (perhaps in response to autoscaling events), grabs its config, connects to
the appropriate resources, streams its logs/events out to sinks, reads and
writes files from external volumes, and runs until it faults or you shut it
down.

Ideally there would be nothing inside the container at shutdown that you care
about. After shutdown the container, and potentially the instance it was
running on, disappear. Spinning up another one is a matter of launching a new
container from a reference image.

So far, in cases where I have needed daemons running in the container, I have
pointed my CMD at a launch script that starts the appropriate services and
then launches the application components, typically using supervisord. That
has worked fine, but I admit to not having understood the PID1 issue well
enough up to this point.
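
A sketch of the kind of launch script described above (the service name is
hypothetical; `-n` keeps supervisord in the foreground):

    #!/bin/sh
    # launch.sh -- started via CMD ["/launch.sh"] in the Dockerfile
    service rsyslog start
    # exec so supervisord replaces this shell as the container's main process
    exec /usr/bin/supervisord -n -c /etc/supervisor/supervisord.conf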

~~~
FooBarWidget
Baseimage-docker does not imply that your container becomes stateful. Using
services like cron and SSH does not imply statefulness.

I also think that the container should be as stateless as possible. When state
is necessary, it can be saved in a bind-mounted directory.
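
For example (paths and image name are hypothetical):

    # state lives on the host; the container itself stays disposable
    docker run -v /srv/myapp/data:/data myapp-image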

The point of baseimage-docker is to ensure that the system works _correctly_.
See its description of the role of PID 1. It has nothing to do with the
statefulness discussion.

~~~
markbnj
Agreed, and a fair point. Statefulness is not a consequence of using
Baseimage-docker, and I didn't mean to suggest it was. A clearer way to put it
is perhaps to say that aiming for a container that is as simple and stateless
as possible makes the "problems" outlined in the OP seem less compelling to
me.

Take the syslog example. If I am starting processes that log to syslog, and I
want syslog running because I care about those messages, then I should be
taking steps to ship those messages out of the container, otherwise I am
creating state that has to be preserved across system lifecycle events to be
of any use. If I am pursuing a stateless container then I will not be blindly
running things that create state without deciding how to handle it. Along
those lines if you are pursuing this kind of design you want to have a good
handle on everything that's running, and what state it produces. I don't know
everything that's running and producing output in a full Linux installation.
I'm sure I could figure it out, but it seems to me that Docker's minimalistic
approach makes it easier to draw lines around this stuff.

The OP implied that you could design what looks like a solid container and
that it might yet be broken in ways that aren't obvious. I'm very eager to
know if that's really the case, as I am considering deploying some production
components using the tool.

So far the system services argument doesn't seem very compelling to me. I
haven't run into any issues launching services from scripts at container
start. Examples would be logstash, redis, supervisord, etc. It could be very
convenient to have an image already configured with a proper init system, but
I am not sure that it is fixing anything that is broken.

I don't have enough experience to get deeply into the PID1 issue. All I can
say is that I haven't run into any problems. I can't say, for example,
whether everything is shutting down cleanly in all cases, but the way I build
my containers I don't care that much. Unless I go back in for specific
diagnostic reasons, a container only gets started once.

~~~
FooBarWidget
Correct; I fully agree with what you said about syslog. But that's not the
problem that baseimage-docker is trying to solve. Suppose that you're building
a Docker container, and something fails. Nothing on stdout and stderr. You
decide to look in /var/log/syslog, but nothing there either. You scratch your
head. If only you knew that /var/log/syslog only works if the syslog daemon is
running. _That_ sort of thing is what baseimage-docker solves. Whether you
want to ship logs outside the container is up to you.

Right now I am building a web app in a Docker container. The web app is
written in Rails, hosted by Nginx and Phusion Passenger. To make setup as easy
as possible for users, the container also contains PostgreSQL. I run
Nginx+Passenger and PostgreSQL at the same time by running both under runit.
The init system in baseimage-docker ensures that a 'docker stop' properly
shuts down both Nginx and PostgreSQL.
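
Concretely, runit supervises anything that has a `run` script in a service
directory. A sketch of the two scripts such a setup might use (the PostgreSQL
paths are hypothetical and version-dependent):

    #!/bin/sh
    # /etc/service/nginx/run -- runit requires the daemon to stay in the foreground
    exec /usr/sbin/nginx -g "daemon off;"

    #!/bin/sh
    # /etc/service/postgresql/run -- chpst is runit's privilege-dropping helper
    exec chpst -u postgres /usr/lib/postgresql/9.3/bin/postgres \
        -D /var/lib/postgresql/9.3/main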

------
josh-wrale
Cross-distro support notwithstanding, why not just skip Docker, LXC, and VMs?
Instead, use cgroups on bare metal to make processes behave. On that note,
forget bridging; use SR-IOV virtual functions with VLANs for QoS and _Profit_.

Edit: It seems this comment has been voted down. I think perhaps this is seen
as irrelevant, but I would disagree, because Docker uses LXC and masks its
function in much the same way as LXC uses cgroups and masks their function.
cgroups can be used to achieve similar goals without these many layers of
abstraction. In this way, I believe this comment to be relevant to the
discussion of full vs. application containers on Linux. There are certainly
many reasons for using containers, but one of the leading reasons is process
limits (e.g. RAM, network namespace). Limiting process usage of those
resources using only cgroups is quite easy in comparison to all Phusion has
gone through here to achieve something with similar (though admittedly
different) aims. Example: [http://www.andrewklau.com//controlling-glusterfsd-cpu-outbre...](http://www.andrewklau.com//controlling-glusterfsd-cpu-outbreaks-with-cgroups/)

Edit 2: I would also appreciate constructive criticism. That is, I've been
downvoted without useful feedback. Specific feedback as to what is wrong with
my comment would enable me to contribute more constructively to this
discussion. Without such feedback, I believe the downvote can be seen as a
simple and tribal "go away".

~~~
pekk
Constructive criticism, about how this sounds: it isn't clear that what you
propose is actually more valuable than using Docker. It sounds like it's
complex and requires a lot of manual intervention. It doesn't sound like your
alternative covers Docker's use cases.

Your idea may need to be more fleshed out, but at a minimum it needs to be
explained in a way that makes it clear why most users of Docker would see a
significant benefit to use your approach instead.

------
philips
You really should not run ssh in your containers. If you have a ton of
containers, then key management and security updates of SSH will be a pain.
There are two tools that can easily help out:

- nsenter lets you pick and choose which namespaces you enter. Say the host OS
has tcpdump but your container doesn't. Then you can use nsenter to enter the
network namespace but not the mount namespace:

    sudo nsenter -t 772 -n tcpdump -i lo

- lxc-attach will let you run a command inside of an existing container. This
is lxc-specific, I believe, and probably not a great long-term solution. But
most people have it installed.

------
ewindisch
I disagree with the premise that using Docker to run individual processes is
"wrong". Phusion is doing a disservice by suggesting so. There ARE use cases
where such a base image is useful, but I believe these should be the uncommon
case, not the common one. And even then, if running multiple processes in a
container is needed, it's preferable to use Docker-in-Docker.

I suppose part of the problem is that the two benefits of Docker and
containerization are frequently confused. Docker provides portability and
build bundling, but it ALSO provides loose process isolation. You should want
to take advantage of that process isolation, and by doing so, should want to
run SSH or cron in their own containers, not in a single container with your
application process. If your application has multiple processes, each should
have its own container. These containers can be linked and share volumes,
devices, namespaces, etc. Granted, some of the functionality one might desire
for this model is still missing or in development, but much of it is there
already, and that's the model I'd aspire for Docker to follow.

It might also be to some degree a matter of legacy versus green-field
applications. For instance, I've been deploying OpenStack's 'devstack'
developer environment (which forks dozens of binaries) inside of a single
Docker container. In this case, the Phusion base-image might make sense.
However, the proper way of using Docker would be to run dozens of containers,
each running a single service.

The reason I don't do this is because the OpenStack development/testing tools
provide this forking and enforce this model, using 'screen' as a pseudo-init
process. From the Docker perspective, this is a legacy application. I could
and probably will change those development tools to create multiple
containers, but until then, it's easiest to stick to a single container.

~~~
tinco

        I disagree with the premise that using Docker to run individual processes is "wrong". Phusion is doing a disservice by suggesting so.
    

This is not the premise of the article. The premise is that someone goes 'from
ubuntu; apt-get install memcache; cmd ["memcached"]' and thinks everything is
going to be all right, when in reality they've just set up a rather buggy
system.

If you're absolutely certain your app is going to be fine running as the sole
(PID1) process in the container, then this article has no problem with that.
It just says that if you're going to run something you've got from apt-get,
then chances are, your system is going to have to be a little more like a
Debian system.

------
zimbatm
It will work, but things are addressed at the wrong level, in my opinion.

syslog: each container now has its own logs to handle. If you want them to be
persistent/forwarded, it might be better if all containers could share the
/dev/log device of the host (not sure of the implications, though).

ssh: lxc-attach. Docker should expose that.

zombies: it's a bug in the program to not wait(2) on child processes.

cron: make a separate container that runs cron.

init crashes: a bug in the program again. It's possible to use the host's init
system to restart a container if necessary.

~~~
FooBarWidget
lxc-attach: see
[https://news.ycombinator.com/item?id=7258242](https://news.ycombinator.com/item?id=7258242)
about why I think SSH is more appropriate.

Zombies: this is not about child processes created by the program. It's about
child processes created by child processes! For example what if your app
spawns another app that daemonizes by double forking? Your PID 1 _has_ to reap
all _adopted_ child processes, not just the ones it spawned.

~~~
zimbatm
Then it's a bug in the child process. Turtles all the way down. Also, double-
forking is a hack that should burn in hell.

EDIT to the reply below: It's still a design issue, but I agree that it's not
always practical to change existing software. A small PID 1 wrapper that reaps
zombie processes while running the target program would be a good middle
ground.
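
A minimal sketch of such a wrapper in Python (not my_init itself, just the
core idea): fork and exec the target, forward termination signals, and reap
every child that gets reparented onto PID 1:

    #!/usr/bin/env python
    # pid1.py COMMAND [ARGS...] -- minimal reaping-init sketch
    import os, sys, signal

    child = os.fork()
    if child == 0:
        os.execvp(sys.argv[1], sys.argv[1:])    # become the target program

    def forward(signum, frame):
        os.kill(child, signum)                  # pass SIGTERM/SIGINT through
    signal.signal(signal.SIGTERM, forward)
    signal.signal(signal.SIGINT, forward)

    # as PID 1 we inherit every orphan; reap until the target itself exits
    while True:
        try:
            pid, status = os.wait()
        except OSError:                         # interrupted by a signal
            continue
        if pid == child:
            sys.exit(os.WEXITSTATUS(status))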

~~~
FooBarWidget
It's not. Most apps rightfully expect that they're _not_ PID 1, and that the
real PID 1 takes care of that sort of stuff. Only in a container does it
happen often that your totally-not-designed-to-be-a-PID-1 app, actually is PID
1.

What if you're creating a PostgreSQL container, and your init script spawns a
daemon, after which it exec()s the PostgreSQL server process as PID 1? The
daemon then spawns a few processes that fork a few times. PostgreSQL only
waitpid()s on its own postmaster worker processes and so those other processes
become zombies. Are you telling me that PostgreSQL is broken and that you have
to patch PostgreSQL?

I think using a proper init system, and running PostgreSQL under it, is a much
saner view on things. The small wrapper that you mentioned is exactly the
/sbin/my_init provided by baseimage-docker.

------
thu
It may be a matter of opinion, but advocating running cron, sshd, and so on in
your containers (let alone in every single one, by providing a base image that
does exactly that) seems plain wrong.

Let's take an example. You have Nginx, a web app, and a database. You can put
everything in the same container or not. If you choose to put everything in
different containers, you will be able to use tools at the Docker level to
manage them (e.g. replace one of those processes).

And the fundamental idea is that we expect to have plenty of Docker images
around that you can pick and play with, and those Docker-level tools will be
able to manage all those things.

Now if you put everything in the same container, you're back to square one,
reinventing the tools to manage those individual processes. You can say that
you don't need to re-invent anything, because you're used to full-fledged
operating systems. Still, if you have a nice story for deploying containers on
multiple hosts, for sending logs across those hosts, and so on, the road will
be more straightforward when you decide to use multiple hosts.

This is about uniformity. I want processes (and containers around them), and
hosts, that's it. I don't want additional levels. I don't want processes,
arbitrarily grouped inside some VMs (or containers), and hosts. Two levels
instead of three.

~~~
FooBarWidget
Right, cron and sshd are open for debate, but at the very least you have to
make your PID 1 behave correctly by reaping adopted child processes. That is a
major part of baseimage-docker.

Baseimage-docker is not advocating putting everything in the same container.
It's advocating putting all the _necessary, low-level services_ in the same
container. What if your app happens to use a library that needs to schedule
things to run periodically using cron? To me it doesn't make sense to split
that cron job to another container. The app might physically consist of
multiple processes and components, but I think it should logically behave as a
single unit.

For stuff like Nginx and the database, it's not so clear what the right thing
to do is. It depends on your use case. I don't think that putting those major
services in the same container is always correct (though it might be), but I
also don't think that splitting them out into separate Docker containers is
always the right thing to do.

You say that putting stuff in the same container puts us back to square one. I
think splitting them puts us back to square one. Your base OS _already_ runs
all your processes as separate units. You have to worry about each one of them
separately, resulting in lots of moving parts that all increase deployment
complexity. The beauty of Docker should be that you can group things. If you
don't group things, then why would you be using Docker? You might as well
apt-get install your app and have it run as a normal daemon.

One use case where it really really makes sense to put everything in the same
container: when distributing an app to end users who have little to no system
administration knowledge. For example, what if you want to distribute the
Discourse forum software? It depends on Rails, Nginx and PostgreSQL. Users are
already having a lot of trouble installing Ruby, running 'bundle install',
setting up Nginx and setting up PostgreSQL. Imagine if they can just 'docker
run discourse' and it immediately listens on port 80, or whatever port they
prefer, with the database and everything already taken care of for them.

~~~
thu
I guess we both understand things well enough to know that the lines to be
drawn are not rigid. That being said, here is my take on what you say.

An app should logically behave as a single unit. I would say that's true, and
that unit is a cluster of containers. Docker is not yet ready as a tool to
manage clusters of containers, but I believe it will. In the meantime tools
like Fig or Gaudi are exploring the design space.

You say that having everything separate is back to square one, because you
have to manage things separately. My opinion is to develop tools to manage
clusters of containers, not to cram things to fit in a single one (I'm not
trying to be harsh; sorry if it sounds like it). If you use Docker to group
things (at the container level, instead of at the cluster-of-containers
level), what should we do if I want to share something with you (a program)?
I can be nice and provide a Dockerfile, but you would still have to put it in
your existing "logical single unit", thus losing the benefits of, e.g.,
dependency isolation.

The distribution case for end users is a good one, where the limits will
depend on what you really want. For instance, if you don't expect people to
extend your app by adding additional processes, why not. But I think it is
still a workaround for the cluster-level tool I keep talking about.

I am using Docker to create a cluster of containers (for
[https://reesd.com](https://reesd.com)). Since the infamous cluster-level tool
of my dreams doesn't exist yet, I'm still relying on Bash scripts (because I
feel like exploring my possibilities and don't want to start writing a
solidified tool). The script is pretty simple: a bunch of `docker run -d`,
saving container IDs and IPs around (this could be replaced by `docker
inspect` and such).

That script is written so it can be run next to itself multiple times, so I
can have multiple instances of the whole Reesd service on my laptop. To deploy
it, I run the same script, possibly next to the live one. I have additional
scripts to e.g. replace one specific container (say, the web app). So really,
when talking about uniformity, I want to be able to run Reesd on my laptop, or
on multiple machines, and possibly side by side, using the same Docker
features.

A possibility that I haven't tried regarding your last paragraph is the
docker-in-docker feature.

~~~
FooBarWidget
Very interesting viewpoint. Yes, if Docker performs cluster management right
then that would change a lot of things. I see that the CoreOS guys released
Fleet today, possibly in response to this article. I'll have a look at this
later.
[https://news.ycombinator.com/item?id=7260596](https://news.ycombinator.com/item?id=7260596)

------
hrjet
Why not just use

CMD ["/sbin/init"]

and start your app through an init.d script?

The article says "upstart" is designed to be run on real hardware and not a
virtualised system. If that is true, then perhaps there is value in
baseimage-docker, but details are lacking.

~~~
FooBarWidget
So why don't you try it and see whether it works?

One of the things /sbin/init does is check and mount your filesystems. But you
can't do that in an unprivileged Docker container, because you don't have
direct hardware access. This is only one example of where things go wrong. The
entire init process is _full_ of code that assumes there is direct hardware
access.

Even when your container is started with -privileged, you still can't do that.
The host OS is already controlling the hardware.

Also, /sbin/init usually does not like having SIGTERM sent to it, which is
what 'docker stop' does. Depending on the implementation, /sbin/init either
terminates uncleanly (causing the entire container to be killed uncleanly) or
ignores the signal outright (causing the 'docker stop' timeout to kick in,
also causing the container to be killed uncleanly).

~~~
philips
It depends on the init system, however.

systemd makes an effort to ensure that running /sbin/init inside of a
container works and can be detected by the software and services underneath
it[1]. In general this means that if you take a copy of Arch or Fedora and try
to run it inside of a container it works properly without any hacks.

For your own services you can also start to do the right thing by using the
virtualization detection code[2] that is built in. The most immediately useful
directive is ConditionVirtualization, with the values container and
!container. With these directives you can tell your services to run or not run
depending on whether you are in a container or on real hardware.
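
For example, a sketch of a unit that should only run on real hardware (the
unit name and command are hypothetical):

    # /etc/systemd/system/hwclock-sync.service (hypothetical example)
    [Unit]
    Description=Hardware-only task, skipped inside containers
    ConditionVirtualization=!container

    [Service]
    ExecStart=/sbin/hwclock --systohc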

[1]:
[http://www.freedesktop.org/wiki/Software/systemd/ContainerIn...](http://www.freedesktop.org/wiki/Software/systemd/ContainerInterface/)
[2]
[http://www.freedesktop.org/software/systemd/man/systemd.unit...](http://www.freedesktop.org/software/systemd/man/systemd.unit.html#ConditionPathExists=)

~~~
mst
ConditionVirtualization=container seems like a "with great power comes great
ability to screw up in subtle and horrible ways" sort of feature, wherein when
you need it, you really need it, but most of the time, a different approach
will be vastly preferable.

~~~
philips
Absolutely. This was in the context of the parent talking about doing things
without certain privileges or skipping unnecessary steps.

------
tomgruner
Docker is a container for running processes, or a process. Containers should
be disposable and transient. I have begun to think of it in terms similar to
OOP: images are your classes, containers are your class instances. When you
are done with an instance, you discard it and make a new instance. So don't go
shoving all kinds of crap into the instance, like crons and sshd, that doesn't
belong there.

Most devs working in interpreted languages don't expect to manage memory by
hand, and Docker containers similarly shouldn't need to worry about child
processes being stopped: containers should just be disposed of, and you make a
new container from your image. Keeping containers around would be like trying
to pickle a Python class instance perpetually when it has references to who
knows what. Just make a new instance when you need it, and just make a new
container when you need one.

I use named containers and a Makefile that stops and deletes existing
containers with the same name before starting a new one.
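
A sketch of that Makefile pattern (names are hypothetical; the leading `-`
tells make to keep going if no old container exists, and recipe lines must be
indented with real tabs):

    NAME = myapp

    run:
    	-docker stop $(NAME)
    	-docker rm $(NAME)
    	docker run -d --name $(NAME) myapp-image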

~~~
FooBarWidget
To me, that does not make any sense. Your _program executable_ is already like
a class. A normal _process_ is already like an instance of your class. If all
you want is OOP, then why are you using Docker? Your Unix system has been
doing that for 30+ years!

~~~
tomgruner
The benefit that Docker brings me is the ability to share containers with my
team and to deploy those same containers to the server. The reason I started
thinking of it in terms of programming, like OOP, was to help me get my head
around how to correctly use Docker on a server. As I start to understand
Docker, finding the correct place for configuration, data, crons, debugging,
and execution are all important, and the closest paradigm that I can easily
apply is OOP, even though it is not a perfect match. OOP can also help
visualize how one image can inherit from another image and override both
methods and configuration of that image, so that instances will behave in a
different manner.

------
damm
I think there are a lot of assumptions being made here, especially about your
base image.

The Ubuntu base image (or how it's built) can be found at
[https://github.com/tianon/docker-brew-ubuntu](https://github.com/tianon/docker-brew-ubuntu)

Some excellent examples of how to use it with /sbin/init can be found at
[https://github.com/tianon/dockerfiles/tree/master/sbin-init/...](https://github.com/tianon/dockerfiles/tree/master/sbin-init/ubuntu/upstart)

Not everyone who uses Docker uses cron, nor considers their containers
long-term; many treat them as short-term process containers.

Docker is growing, and how we use Docker will change, so be flexible and
realize that what you considered useful yesterday may not be required
tomorrow. We will have to re-learn best practices and keep learning after
that.

Note, the Ubuntu image isn't made by Ubuntu. Maybe Phusion should host their
own Ubuntu image, just for good measure.

------
rschmitty
Is there a "explain Docker to me like I'm 5" post?

This seems like the old "I have problems with managing everything I need for
my app so I'll just run docker containers. Now I have 2 problems"

~~~
tinco
The Linux kernel has some features that make it possible to isolate a process
from all (most) system resources, without actually running it inside a VM.

Docker is a tool that makes it easy to launch such isolated processes. In a
small file you specify what the filesystem environment for the new process
should be and what process to run, and off it goes.
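
That small file is a Dockerfile. A minimal, hypothetical sketch:

    # Dockerfile: the filesystem environment plus the process to run
    FROM ubuntu
    ADD myapp /usr/local/bin/myapp
    CMD ["/usr/local/bin/myapp"]

Build it with `docker build -t myapp .` and launch the isolated process with
`docker run myapp`.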

In theory this could make provisioning easier: each application comes in a
Docker container that satisfies its own system-level dependencies, and
satisfies its service-level dependencies over connections to other containers
or external hosts.

------
brokenparser
So if you run anything other than Ubuntu inside Docker, this is useless
because the steps to build your own aren't outlined.

I find Docker to be horribly counter-intuitive and ass-backwards anyway, so
not much harm done there as people are in general better off with something
else entirely (plain lxc, libvirt, virtualbox, xen, openvz...). I recommend to
steer away from it at least until 1.0 is out.

EDIT: I put it in my .plan to build a better BusyBox image aimed at running
statically compiled programs with minimal baggage, but I'm not sure when I'll
get a round tuit*

*: [http://i.ebayimg.com/00/s/NDgwWDY0MA==/z/z-4AAOxyUrZSr82N/$_...](http://i.ebayimg.com/00/s/NDgwWDY0MA==/z/z-4AAOxyUrZSr82N/$_12.JPG)

~~~
FooBarWidget
Why do you think it isn't outlined? The website explains exactly what the
modifications are, what they do, and what they are good for. The Dockerfile is
on Github for everyone to see. The website makes explicit mention that the
init system is /sbin/my_init, for which the full source is available on
Github. It's trivial to take the my_init script and integrate it into your
non-Ubuntu container. You can even write your own init system based on the
website description if you so choose.

------
pini42
I think this is not related to Docker itself, but to the fact that it is using
general-purpose Linux distributions. I'm pretty sure that very soon we will
see an explosion of new distros addressing exactly these problems, built
explicitly for running inside containers.

~~~
krakensden
I used to think that that was what CoreOS was, but I have since become
confused.

~~~
shill
CoreOS is designed to host containers, not to be a base container image.

------
tel
How does this play with the CoreOS premise, where each Docker container should
host a single process, managed intelligently through something like systemd?

Under this model I'd expect that systemd's cgroup-based process tracking
should help with zombie processes and generally take over many of the services
that baseimage-docker is suggesting here. As others have mentioned in this
thread, there's a fairly large difference of opinion between running
containers like fast VMs or like thin layers around single processes; does
baseimage-docker make sense only in the former?

~~~
tinco
baseimage-docker is meant to make it easier to create a correct environment
for the processes you run in it, so it perhaps makes the container more like a
fast VM.

From what we've seen, the CoreOS people, and perhaps the Docker people as
well, like to see Docker more as a thin layer around processes, managed by
external services.

------
DanHulton
Off-topic, but I thought I'd screwed up my DNS for a moment: this article
redirected to the silly side project I've been working on, ipaidthemost.com.

I guess we borrowed the same template?

------
krakensden
I'm pretty suspicious of using runit instead of Upstart: nobody tests Ubuntu
with runit, and you're liable to get in trouble if you depend on some other
service running on the machine. Although clearly it works well enough for
them.

I also sort of suspect that the closer you are to running a full distribution
in your containers, the less benefit you're getting from the containers.

~~~
FooBarWidget
Baseimage-docker uses runit exactly to _not_ run a full distribution in the
container. Upstart tries to boot a full Ubuntu. A full Ubuntu is not necessary
inside the container. Therefore, baseimage-docker provides a custom init
system that boots only the minimal subset of Ubuntu that is necessary for it
to run correctly in Docker.

------
akerl_
I was super stoked to read this, and went diving to borrow some of their work
for my own Docker usage. However, I'm confused by their choice of Python for
the my_init script. The site claims they chose runit because it is more
lightweight than supervisord, a Python tool of similar merit. Making the init
process depend on Python seems to negate that advantage.

~~~
FooBarWidget
It's not only Python that makes supervisord relatively heavy compared to
runit. It's also the amount of code in supervisord (and its dependencies).
my_init is only a single file, less than 300 lines, with minimal dependencies.

Baseimage-docker is also in a "minimum viable product" phase. We're still
trying to tweak things until they're right. For example, my_init recently
received some features which are important in certain use cases; features
which would have been much slower to implement in C.

In the future we may optimize things by rewriting my_init in C. Right now it's
laziness on our part.

~~~
akerl_
Based on the points raised in your article, I ended up poking my own Docker
images to get runit working on them. It looks like runit is A) pretty snazzy,
and B) capable of being the init daemon. Am I missing some trade-off or issue
there, that prevents it from being PID1 in your scenario?

~~~
FooBarWidget
Yes. Runit does not correctly reap adopted child processes, which is why we
run runit under my_init. My_init (or an alternative that behaves like my_init)
is absolutely necessary for correct operation.

------
peterwwillis
They just described implementing an OpenVZ VM.

------
jaybuff
"Note that the shell script must run the daemon without letting it
daemonize/fork it. Usually, daemons provide a command line flag or a config
file option for that."

fghack is an anti-backgrounding tool.
[http://cr.yp.to/daemontools/fghack.html](http://cr.yp.to/daemontools/fghack.html)

------
willvarfar
I've been trying to get some tools to run in a Docker container for a few days
now. So far the problems have been that there isn't a convincing HOME folder
and user, and that the locale isn't set (this only explodes if there are
unicode filenames, but there are plenty of those, e.g. for SSL certs).

Does this script sort out those kinds of things?

~~~
FooBarWidget
No, you have to take care of environment variables yourself. Unfortunately,
Docker does not inherit environment variables from the base image, so in every
Dockerfile you have to ensure that the right environment variables are set.

~~~
ARothfusz
That's not "unfortunate". It is by design: you use containers to minimize
dependencies on the host.

~~~
FooBarWidget
What does minimizing dependencies on the host have to do with not inheriting
environment variables from the base image?

Note: I'm not talking about inheriting environment variables from the _host_!

------
the_mitsuhiko
I don't understand the PID1 case. You are running a single process, why do you
have to collect zombies?

In fact, I understand none of these points. This seems all very hard to relate
to. These are containers and not VMs. Most of that stuff should run in a
separate container.

~~~
FooBarWidget
Your single process might spawn child processes that double fork, resulting in
zombies. Unless you've read every single line of source code in the app, plus
every single line in all its dependencies (and all dependencies of all
dependencies), you really can't be sure that that _won't_ happen. And when it
does happen, your system is not behaving correctly.

And what if your single process spawns a child process that encounters an
error, and logs only to syslog? If your syslog daemon is not running, you will
never know that there has been an error. Again, if you've read every single
line and know that this does not happen, then that's fine. But the point of
baseimage-docker is to provide a good and safe default so that these edge
cases are already taken care of for you.

------
kapilvt
Just use the ubuntu-upstart stackbrew image. It's compatible with all the
packages, etc.

