
Systemd vs. Docker - tomaac
https://lwn.net/Articles/676831/
======
js2
_One of these is killing "zombie" processes that have been abandoned by their
calling session._

That's funny terminology, isn't it? Killing a process usually means sending it
a signal, typically TERM or KILL, that causes it to exit. But a zombie process
is one that has already exited, but hasn't been waited for by its parent,
where its parent is either the process that spawned it, or if that process has
died, the process with PID 1. This is usually referred to as reaping the
zombie process, not killing it. AFAIK, a signal sent to a zombie process is
simply ignored.
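
To make the distinction concrete, here is a minimal Linux-only sketch (it reads /proc, and the helper names `proc_state` and `make_zombie` are made up for illustration). It forks a child that exits at once, sends the resulting zombie a SIGKILL, and then reaps it with waitpid:

```python
import os
import signal
import time


def proc_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state field comes right after the parenthesised command name.
    return data[data.rindex(")") + 2]


def make_zombie(exit_code=7):
    """Fork a child that exits at once; return its PID once it is a zombie."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)      # child: exit immediately, nobody has waited yet
    while proc_state(pid) != "Z":
        time.sleep(0.01)         # parent: poll until the child shows up as a zombie
    return pid


if __name__ == "__main__":
    pid = make_zombie()
    os.kill(pid, signal.SIGKILL)             # the call succeeds, the zombie stays
    print("state after SIGKILL:", proc_state(pid))
    _, status = os.waitpid(pid, 0)           # reaping: collect the exit status
    print("exit code seen by parent:", os.WEXITSTATUS(status))
    print("still in /proc:", os.path.exists(f"/proc/{pid}"))
```

The exit status the parent collects is the one the child exited with, not anything related to the later SIGKILL, which is why "reaping" is the more accurate word.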

Or do the quotes around zombie imply a different meaning, such as "zombie-
like"?

~~~
masklinn
No, it's a zombie in the normal sense, the killing here is not sending it a
signal but reaping zombie processes (in the sense of personified death reaping
souls) by waiting on them.

Things would probably be clearer if the quotes were around "killing" rather
than "zombie", mayhaps the interviewer/writer was unfamiliar with the
terminology.

~~~
ibotty
I strongly doubt that! Josh knows the terminology. It was surely just an
oversight.

------
thwarted
_Poettering says that PID 1 has special requirements. One of these is killing
"zombie" processes that have been abandoned by their calling session. This is
a real problem for Docker since the application runs as PID 1 and does not
handle the zombie processes. For example, containers running the Oracle
database can end up with thousands of zombie processes._

Why does Poettering keep claiming this when he's the one who submitted the
patch that adds the PR_SET_CHILD_SUBREAPER prctl(2) [0] functionality?

[0] [http://man7.org/linux/man-pages/man2/prctl.2.html](http://man7.org/linux/man-pages/man2/prctl.2.html)
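
For reference, a rough sketch of what using that prctl looks like from a process that is not PID 1. It assumes Linux 3.4 or later and calls prctl(2) through ctypes; the constants are the values from <linux/prctl.h>, and the helper names (`become_subreaper`, `is_subreaper`, `orphan_grandchild`) are invented for the example:

```python
import ctypes
import os

PR_SET_CHILD_SUBREAPER = 36   # values from <linux/prctl.h>
PR_GET_CHILD_SUBREAPER = 37

libc = ctypes.CDLL(None, use_errno=True)


def become_subreaper():
    """Ask the kernel to reparent orphaned descendants to this process."""
    if libc.prctl(PR_SET_CHILD_SUBREAPER, ctypes.c_ulong(1)) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def is_subreaper():
    """Read back the subreaper flag of the calling process."""
    flag = ctypes.c_int(0)
    if libc.prctl(PR_GET_CHILD_SUBREAPER, ctypes.byref(flag)) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return bool(flag.value)


def orphan_grandchild(exit_code=3):
    """Fork a child that forks a grandchild and exits; return the grandchild PID."""
    r, w = os.pipe()
    child = os.fork()
    if child == 0:
        os.close(r)
        grandchild = os.fork()
        if grandchild == 0:
            os._exit(exit_code)                  # grandchild: about to be orphaned
        os.write(w, str(grandchild).encode())    # tell the top process its PID
        os._exit(0)                              # child exits without waiting
    os.close(w)
    gpid = int(os.read(r, 32))
    os.close(r)
    os.waitpid(child, 0)                         # the intermediate child is gone
    return gpid


if __name__ == "__main__":
    become_subreaper()
    gpid = orphan_grandchild()
    # Without the subreaper flag this waitpid would fail with ECHILD,
    # because the orphan would have been handed to PID 1 instead.
    pid, status = os.waitpid(gpid, 0)
    print("reaped orphan", pid, "exit code", os.WEXITSTATUS(status))
```

Note that this only moves the adoption point: the subreaper still has to call wait, so it does not help an application that never waits for children at all.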

~~~
pas
I guess he's saying, that you can't just take any random binary and run it in
a Docker container, because if that binary spawns a lot of children but does
not wait for them, then you'll have a lot of zombies.

Docker could run a minimal pid1 in each container to address this. Though if
this had been a big issue I guess this would have been already fixed.

Naturally, a proof of concept of the problem would be great. (Let's say a
Dockerfile.)

~~~
vidarh
It has been a reasonably big issue. E.g. I kept seeing zombies with Consul for
a while until we realised that _every single_ Consul Docker container on
Dockerhub just had Consul run as pid 1 in the container (this is a while ago,
no idea if that's still the case), without realising that Consul health checks
then could end up as zombies if you weren't very careful about how you wrote
them (e.g. typical example: Spawning curl from a shell script, with a timeout
on the health check that was shorter than any timeouts on the curl requests).

It's usually fairly simple to fix (e.g. for Consul above, I raised it with the
Consul guys and they said they'd look at adding waiting on children to it as a
precaution - it's just a couple of lines -, but people building containers
could also introduce a minimal init, or you can write your health checks to
guard against it), but it happens all over the place, and people are often
unaware and so not on the lookout for it and it may not be immediately
obvious.
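
As an illustration of the "guard against it" option, here is a hedged sketch of a health check that enforces its own deadline and always waits for the process it started. `run_check` is a made-up helper, not anything Consul ships; the idea is to run the probe command directly rather than through a shell, with a deadline shorter than the outer health-check timeout:

```python
import subprocess
import sys


def run_check(cmd, deadline_s):
    """Run cmd directly; return True only if it exits 0 within deadline_s seconds."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=deadline_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child and waited for it,
        # so nothing is left behind for PID 1 to adopt.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    quick = [sys.executable, "-c", "raise SystemExit(0)"]
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print("quick probe healthy:", run_check(quick, 10))
    print("slow probe healthy:", run_check(slow, 0.5))
```

In a real check the command would be something like curl with its own `--max-time` set below the check timeout. The guard only covers the process started directly; if that process spawns further children, they can still be orphaned.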

The reason I raised it as an issue for Consul, for example, even though it
wasn't really their fault, but an issue with the _containers_ , is that people
need to be aware of the problem when packaging the containers, need to be
aware that a given application may spawn children, and that they may not wait
for them. Even a lot of people aware of the zombie issue end up packaging
software that they didn't realise was spawning child processes that could
end up as zombies (in this case, it took running it in a container without a
proper pid 1, using health checks which not everyone will do, _and_ writing
the health checks in a particular way in order to notice the effects).

Thankfully there are a number of tiny little inits. E.g. there's suckless
sinit [1], Tini[2] , and here's a tiny little proof of concept Go init [3] I
wrote (though frankly, suckless or Tini compiled with musl will give you a
much smaller binary) as what little you actually _need_ to do is very trivial.

[1] [http://git.suckless.org/sinit](http://git.suckless.org/sinit)

[2] [https://github.com/krallin/tini](https://github.com/krallin/tini)

[3]
[https://gist.github.com/vidarh/91a110792c86d6c3bb41](https://gist.github.com/vidarh/91a110792c86d6c3bb41)
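
To show how little is involved, here is a rough Python sketch of the reaping part of such an init. It is not the code from any of the projects linked above, and `run_as_init` is a hypothetical name. It starts the real program as a child, then waits for every child that exits, including orphans the kernel hands to PID 1, until the main program itself is done. Real inits such as Tini also forward signals to the child, which this sketch leaves out:

```python
import os
import sys


def run_as_init(argv):
    """Start argv as a child, reap every child that exits, return argv's exit code."""
    main_pid = os.fork()
    if main_pid == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)              # only reached if exec failed

    while True:
        pid, status = os.wait()        # blocks until any child, own or adopted, exits
        if pid != main_pid:
            continue                   # an adopted orphan: reaped, nothing else to do
        if os.WIFEXITED(status):
            return os.WEXITSTATUS(status)
        return 128 + os.WTERMSIG(status)   # shell-style code for death by signal


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(run_as_init(sys.argv[1:]))
```

Saved under some name such as `mini_init.py` (the file name is arbitrary), it would be used as the container entrypoint in front of the real command, so that the application no longer runs as PID 1.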

~~~
pas
Seeing how even the trivial pid1 "scripts" solve the problem, it's truly
baffling why Docker doesn't have a --with-reaper flag.

Also thanks for the Consul example, makes it much-much easier to see the issue
and argue for a general solution. (So not every random
app/project/service/daemon has to implement pid1 functionality.)

~~~
masklinn
> Seeing how even the trivial pid1 "scripts" solve the problem, it's truly
> baffling why Docker doesn't have a --with-reaper flag.

That doesn't fix the issue since you need to know about the issue and accept
that it exists, at that point you can just as easily use one of the micro-
inits available.

The alternative is to enable it by default, but now you've broken BC for the
weirdo who actually expects orphan processes to be adopted by the root process
they're starting.

~~~
shykes
Yes, the problem is that we would need to change the default behavior of
Docker, which many people and scripts expect to be stable. It's a case of
interface stability vs. functionality improvement. So far interface stability
has won. I personally think it would be better to change the default, but
anything that breaks an interface, even a subtle implicit one, has the burden
of arguing a solution, thinking through migration issues, submitting
patches... So far I have seen a lot of drive-by criticisms and dismissal of
the need to even discuss the tradeoff (see for example this lovely fellow:
[https://lwn.net/Articles/677419/](https://lwn.net/Articles/677419/)). But I
have not seen anyone stepping up to do the work.

We all pick our battles - including me of course!

------
storrgie
why not systemd-nspawn (zoidberg voice)

Seems like the way Fedora is packaging systemd for 24 is going to move
systemd-nspawn to a level of maturity that will likely surpass some of the
clunky issues folks have with running docker.

------
pmoriarty
CoreOS's Rocket is built around systemd?

That alone disqualifies it for me right there.

~~~
bryanlarsen
Why the systemd hate? Because it's a big monolithic project that takes over
your system? You do realize that Docker is much more monolithic and
opinionated than systemd, right?

~~~
kordless
When you ask questions, especially leading ones, it causes a good deal of
confusion around the topic at hand. The reasons behind this are complex, but
they have something to do with our tendency to double bind each other.

Someone has the right to say why something is "disqualified" for them, even if
it is devoid of context. What is awesome here is that the leading expert for
this topic is replying directly to the negative (empty) opinion and actually
presents a (rich) alternate opinion.

How does you asking unanswerable questions contribute to resolving the
conversation to something we can all learn from?

~~~
GFK_of_xmaspast
So in other words "questions are a burden and answers are a prison for
oneself".

~~~
kordless
Regardless of their nature, questions are definitely a burden. However, I
think the way some questions are put can cause a disproportionate amount of
burden when they contain hidden meanings or agendas.

If someone is having issues being direct and uses techniques to "hide" how they
feel about something in a question, they effectively load the question with
intent. I think sometimes those questions can be viral in nature, causing
angry memes like what they mention in "This Video Will Make You Angry":
[https://www.youtube.com/watch?v=rE3j_RHkqJc](https://www.youtube.com/watch?v=rE3j_RHkqJc)

Logic would dictate that we should learn to avoid questions which cause
excessive amounts of processing with little return in their answers. A simple
way to filter on these is to ask if the question conflicts itself when
answered in a given way.

------
atemerev
I don't always run containerized applications, but when I do, I prefer them
completely systemd-free, thank you.

Sometimes I wonder if systemd is actually a part of big plan of moving
everyone to microservices and containers and maybe even unikernels — anything,
just anything without this abomination.

~~~
bryanlarsen
Can you explain your position to me? I can understand somebody who dislikes
systemd and dislikes docker. I can understand somebody who likes both systemd
and docker. But disliking systemd but liking docker? That I don't understand.
Any effective criticism of systemd that I've heard generally can also be
applied to docker.

Like yours: "I wonder if systemd is actually a part of big plan of moving
everyone to microservices and containers and maybe even unikernels" works even
better if you replace systemd with docker.

~~~
atemerev
Docker is just a toolkit for composing and networking layered OS images. It
improves isolation of things and adheres to simple principles (immutable
containers, restarting instead of attempting to recover, etc.) It structures
things better. Inter-container communication is deliberately simple (env
variables and, recently, networking).

Systemd spits on isolation, it embraces integration of everything.
Supervision, logging, communication, IO, configuration, state management —
everything goes through systemd. Everything is binary and opaque. Docker is
transparent.

~~~
bryanlarsen
Your criticism of systemd still applies to docker. "Supervision, logging,
communication, IO, configuration, state management — everything goes through
docker"

If I use systemd I have to type 'journalctl' to get at my logs, or I can use
a plugin to move it somewhere else. If I use docker I have to type 'docker
logs', or I can use a plugin to move it somewhere else. etc. etc.

P.S. Agree completely with your praise of Docker. I'm firmly in the 'love both
systemd and docker and wish they got along' camp.

~~~
qwertyuiop924
Wrong. Docker manages containers, and only containers. 'docker logs' shows you
the logs from your containers. Docker never tried to make me run my non-
container logs through 'docker logs'. You know what else docker never tried to
be? cron. Or udev. Or consolekit. Or init. It just tries to manage your
containers.

~~~
Etzos
I think this is a little bit off. You're looking at them from two different
perspectives.

Docker wants to manage your containers. And in that regard it is one
monolithic daemon that manages _everything_ about your containers.

Systemd wants to manage your computer and things related to init. It is a
bunch of modular, but strongly integrated, pieces that manage _everything_
about your init and process management.

~~~
qwertyuiop924
>>It is a bunch of modular, but strongly integrated, pieces that manage
everything about your init and process management.

OH REALLY? well, can I just run systemd-udevd, without systemd? how about
journald? No? well then, if everything depends on one massive daemon, it isn't
very modular, is it.

~~~
JdeBP
The correct answer is actually "Yes, you can.".

~~~
qwertyuiop924
Not soon. You can't run journald without systemd now, and you won't be able to
run systemd-udevd without systemd as soon as kdbus gets merged, which the
systemd devs are pushing for heavily.

~~~
JdeBP
Your information is out of date with respect to systemd-udevd, per
[https://news.ycombinator.com/item?id=10518933](https://news.ycombinator.com/item?id=10518933)
and others; and your assertion about journald is simply wrong unless something
has changed _very_ recently.

