
Daemon Showdown: Upstart vs. Runit vs. Systemd vs. Circus vs. God - pjscott
http://tech.cueup.com/blog/2013/03/08/running-daemons/
======
pjscott
Hi, author of the post here. The thing that really surprised me, when writing
this, was how difficult it was to find the edge of the topic: where does
process supervision end, and server monitoring begin? If most of these
programs include something for handling logs, are they then also logging
software? Where's the boundary between watching processes and gathering
metrics on them? Good software is canonically supposed to do one thing well,
but it seems like nobody can agree what handles what here. I even started
writing sections on logging and server monitoring, before sanity prevailed and
I chopped them out.

It's maddening that we have no clear separation of responsibilities for this
stuff. And so we end up with things like Upstart's half-hearted logging,
because it's not clear where Upstart's job stops.

~~~
jacques_chester
The tangled world of service monitoring, process monitoring, process
launching, configuration management etc etc et bloody c is waiting for a ZFS-
style collapsing of multiple fighting tools into a single layer.

I noticed this most of all when I was writing puppet manifests that ... create
Upstart scripts.

Why do I have one tool that configures and supervises the bits at rest and a
completely different tool that configures and supervises the same bits in
flight?

~~~
stfp
This!

Curious for people's thoughts on whether Solaris's SMF (or its general model)
fits this bill.

------
mrweasel
I still prefer Bernstein's daemontools. I first used them when I had to deploy
qmail and now I use them to manage Python web applications. The default logging
option is a bit weird, but the simplicity is a real winner for me.

I don't like Upstart. I don't agree that it's "admirably simple to use"; it's
needlessly complex. For what I do, I don't need or want to care about
runlevels, I always want respawning and I don't like the syntax of the
configuration.

systemd seems like a really complex solution to a very simple problem, but I
never used it.

Supervisord, while well documented and popular, again seems to complicate
what I want to do. Again I dislike the configuration, and I fail to see why a
shell script isn't better.

Runit I never used, because why would I replace daemontools with what was,
when I first read about it, just a daemontools clone made by people who
didn't like Bernstein's licensing? I guess it's what I would most likely go
for if daemontools were not available.

The rest I never heard of.

~~~
pjscott
Runit is very similar in design to Daemontools, but it seems generally more
featureful and better maintained. If you like one, chances are you'll like the
other. (Also, daemontools' licensing is no longer an issue; it's been public
domain since 2007.)

I do take issue with your characterization of Upstart and Systemd as
needlessly complex; almost all of that complexity is for their role as init.d
replacements, and doesn't need to affect you if you just want to get a few
programs running. Their big selling point, for me, is that one of them is
usually the default and the config files are quite easy to write if you're not
doing anything elaborate.

~~~
sigil
As a diehard daemontools user this comment made me take another look at runit.
Just to be clear, runit is one of several forks of daemontools. Another one is
daemontools-encore [1].

There are a couple nice runit features that stock daemontools lacks.

The ability to run each service in a separate process session with "runsvdir
-P" is a big one. If your ./run file consists of a pipeline, bouncing the
service with TERM will produce orphans otherwise. (I have some patches to
daemontools-encore that do this on a per-service basis.)

The svlogd program from runit also looks nicer than multilog. I've tried to
like tai64, I really have, but I'd rather just have human AND machine readable
timestamps from the get go.

I like that service dependencies aren't statically declared in runit, but
rather something you can block on in your ./run file.
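
A minimal sketch of what blocking on a dependency inside ./run could look like. The `wait_for` helper and the service names `postgresql` and `myapp` are hypothetical. The commented usage assumes runit's `sv start` returns non-zero when the service is not up, so check that against your installed version.

```shell
#!/bin/sh
# wait_for TRIES CMD [ARGS...]
# Retry CMD once per second until it succeeds or TRIES attempts are used up.
# Returns 0 on success, 1 if the dependency never became ready.
wait_for() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        tries=$((tries - 1))
        # Only pause if another attempt is coming.
        [ "$tries" -gt 0 ] && sleep 1
    done
    return 1
}

# Hypothetical use at the top of a ./run file (names are made up):
#   wait_for 10 sv start postgresql || exit 1
#   exec myapp --foreground
```

If the wait fails, the run script simply exits and the supervisor is expected to run it again, so the dependency never has to be declared anywhere else.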

[1] <http://untroubled.org/daemontools-encore/>

~~~
stock_toaster
runit also has man pages.

~~~
sigil
daemontools also has manpages, thanks to Gerrit Pape, the author of runit ;)

<http://smarden.org/pape/djb/manpages/>

The manpages are installed by default in every system I use -- debian, ubuntu,
MacPorts, and FreeBSD ports.

~~~
stock_toaster
Indeed. They are not part of the standard daemontools source though, which is
why I thought it worth mentioning.

------
justsee
"Process Supervision: Solved Problem" [0] is a great rundown too (from one of
the Chef Opscode guys).

His recommendation is to use runit (based on djb's daemontools).

He recommends against using Bluepill, God, Foreman, supervisor and others.

[0] [http://jtimberman.housepub.org/blog/2012/12/29/process-super...](http://jtimberman.housepub.org/blog/2012/12/29/process-supervision-solved-problem/)

~~~
pjscott
If you do what that guy says and use runit, you'll be happy. I use runit, and
I am happy with it.

However, I also think you'll be happy with Upstart or Systemd, even if you
find them over-engineered and inelegant. They'll do what they need to with
minimal configuration, and you're probably already using one of them behind
the scenes. Why not use one of them, if you have it sitting right there,
already installed and configured?

------
sigil
"The one caveat to be aware of is that Runit expects daemons not to fork."

Nitpick: it's fine if your daemon forks, it just shouldn't background itself,
which is different.

A nice succinct implementation of daemonization lives in the BSD sources:
[https://github.com/DragonFlyBSD/DragonFlyBSD/blob/master/lib...](https://github.com/DragonFlyBSD/DragonFlyBSD/blob/master/lib/libc/gen/daemon.c)

But don't do this in your code ;) _Do_ use a service manager to do this for
you. Programs which lack a foreground option are harder to test and interact
with due to the action-at-a-distance -- one of my chief objections with
init.d-style service management over what I'll call daemontools-style service
management. Nearly all popular daemons will have an option to stay in
foreground for this reason. The only exceptional case that comes to mind is
nginx, which does have a foreground option, but you'll lose zero-downtime
upgrades if you use it (due to some extreme cleverness).
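
To make the "stay in the foreground" point concrete, here is a small sketch showing why daemontools-style run scripts usually end in `exec`: the program replaces the wrapper shell and keeps its PID, so the supervisor signals the real process. The function name is made up for illustration, and the `myapp --foreground` flag in the comment is hypothetical.

```shell
#!/bin/sh
# Print a shell's PID, then replace that shell with a child via exec and
# print the PID again. With exec, both lines show the same number.
pid_before_and_after_exec() {
    sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}

# Hypothetical ./run file built on the same idea:
#   #!/bin/sh
#   exec 2>&1                  # send stderr to the same place as stdout
#   exec myapp --foreground    # flag name is an assumption
```

A daemon that backgrounds itself breaks this, because the process the supervisor is watching exits immediately while the real work continues in a detached process.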

------
gingerlime
Like others have already said, monit also deserves a mention. Whilst it can
start the process for you, it probably doesn't really fall under the same
category. It's more about making sure things are running, do not consume too
much resources etc.

However, the flexibility of monit beyond process monitoring is really great.
Anything from filesystem usage, cpu, memory to monitoring tcp ports, file
checksums, you name it. I'm yet to find something you can't do with this
little toy, and it's rock solid. I even use it to pull graphite stats[1] and
report when certain thresholds are reached, like the percentage of 500s
compared to other response codes in my nginx logs.

I also personally like its configuration syntax better than most yaml-like
DSLs. It feels almost like writing natural text.

[1]<http://blog.gingerlime.com/2013/graphite-alerts-with-monit/>

~~~
bdcravens
+1 to monit. I have an app that does a lot of scraping via Selenium, headless
using xvfb and Firefox. If the Selenium scripts fail, Firefox never closes. On
an EC2 micro instance, it doesn't take long before 40MB+ FF processes kill
other things, like Tomcat (which runs my queue processor). I use monit to
watch the root URL, and when it fails, it fires off a handful of scripts that
restart the various pieces of the application.

------
rubyrescue
highly recommend runit + monit. The beauty of runit is the ./run script is
easy to run by hand to test. The beauty of monit is it can watch things like
http urls and ports, and if something goes wrong, monit executes sv down
myservice; sv up myservice. (sv is the actual launcher-thingy of runit)

If you use monit without runit, you end up in this weird world where monit
starts and stops things in a very odd environment and it's flakier,
environment variables are missing, etc. Also monit takes a long time to notice
things are down, and doesn't start things right away on boot.

Finally if you don't boot your services with monit, you booted them with
something else meaning that when a restart occurs, you're starting a service
in a different environment than you booted in. weird.

so, let monit check health of things, let runit start and stop things.
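
The split described here, one tool checking health and another owning start and stop, can be sketched in plain shell. This is not how monit works internally; `bounce_if_unhealthy`, the URL, and the service name are all hypothetical, and the commented usage assumes `curl` and runit's `sv restart` are available.

```shell
#!/bin/sh
# bounce_if_unhealthy CHECK_CMD RESTART_CMD
# Run CHECK_CMD; if it fails, run RESTART_CMD. Both are passed as strings
# and run through sh -c. Prints which branch was taken.
bounce_if_unhealthy() {
    check=$1
    restart=$2
    if sh -c "$check" >/dev/null 2>&1; then
        echo "healthy"
    else
        echo "restarting"
        sh -c "$restart"
    fi
}

# Hypothetical use from cron or a monitoring hook:
#   bounce_if_unhealthy 'curl -fsS http://localhost:8080/ping' 'sv restart myservice'
```

Because the restart goes through the supervisor, the service comes back in the same environment it was booted in.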

~~~
sandGorgon
are you using ruby with runit?

which server (mongrel, thin, etc) do you recommend for playing nice with runit
- which the OP mentioned has a problem with fork.

~~~
rubyrescue
passenger

------
nasalgoat
This didn't really seem like a comparison, more of a description. I was
hoping for a bit more.

We're currently using supervisord but we've had to hack it to support more
than 200 workers, and I'd like something with the ability to change the number
of workers dynamically.

~~~
encoderer
We use Supervisor integrated with Zookeeper to dynamically adjust the number
of worker processes.

------
rektide
Systemd has socket activation, where it opens sockets and passes them to
applications.

But the maintainers seem to have sub-zero interest in passing said listening
sockets to multiple applications, which I'm sorry to see not be available. I'm
thankful for this article, because it discusses Supervisord and Circus's
willingness to do some pooled program handling.

------
contingencies
Amazed to see so many simple responses in this thread given that the single-
system paradigm is sort-of heading out of date now (manual Unix systems
administration is arguably a dying horse). For any serious (ie. highly
available) systems, a major option that _should_ be under discussion is
Pacemaker/Corosync. <http://clusterlabs.org/>

It's a pain to learn, easy to screw up, non-trivial to test exhaustively, but
provides highly flexible daemon migration and monitoring functionality you
aren't likely to find anywhere else, for free, and for any imaginable service.
(Actually you can monitor anything with it, including hardware. Check it out.)

~~~
eikenberry
These projects seem to do a very poor job of presenting themselves. I spent a
good deal of time reading about each and have yet to have a really good grasp
of what they are for. They seem to be about maintaining a cluster of hardware
servers to run arbitrary services with configuration and fail over.

Do you know of some decent article, blog post, video, whatever that presents
these products, what they are good for and a typical use case?

~~~
contingencies
I agree wholeheartedly about the failure to communicate clearly. At least one
contributing factor to the present situation is the lucrative consulting that
exists around these solution types and helps to fund their development. Simply
put, _if it was that easy, everyone would be doing it (and we'd be out of a
job)_. However, to be fair, things change fast and there does exist a lot of
good documentation - just not necessarily perfectly up to date for your
scenario. Your assumption is perfectly correct. Have a look at
<http://www.linbit.com/en/downloads/tech-guides> or
<http://clusterlabs.org/doc/> or try #linux-ha or #linux-cluster on freenode.

------
jhawthorn
I wouldn't use god. It's had issues crashing or misbehaving in the past. I've
found monit very stable if a little awkward to configure correctly.

Today I would use systemd with monit for additional monitoring. systemd
ensures that the same environment is used regardless of how it is invoked, and
the units are extremely simple. Monit can ensure a running server is behaving
(memory limits, CPU limits, testing HTTP) and rely on systemctl for restarting
processes.

------
antihero
One thing I always found annoying is, say I am on a shared host like
WebFaction, and I use supervisord to manage my uWSGI (or whatever) instances.
What do I do to ensure supervisord itself starts? Current approach is a script
run by cron that greps ps for supervisord and if not found, launches it. But
that seems rather flakey and unreliable.

Suggestions?

------
mappu
Poor man's solution:

    
    
        echo 'pidof myapp > /dev/null || su appuser -c "myapp --arguments" &' > ~/check_services.sh
        echo 'exit 0' >> ~/check_services.sh
        chmod 700 ~/check_services.sh
        #Add ~/check_services.sh & to /etc/rc.local for startup
        #Add */1 * * * * /root/check_services.sh to crontab

~~~
stfp
Interesting hack, sure, but "poor man solutions" would be interesting if "rich
man solutions" cost too much. That is not the case here. For someone that can
understand this, runit will take < 10 minutes to learn.

------
slurgfest
I would really like to hear from more people who use Circus in production. It
seems interesting.

~~~
adambratt
Tried it and I can tell you it's not production ready. It would randomly go
rogue on our production servers and would start falsely detecting a worker
crash and start launching more.

It took us a long time to trace it down and we quickly switched to runit and
supervisor on a few servers. All of our problems went away.

~~~
tarekmoz
Circus author here.

I am interested in any form of feedback on the issues you had. Falsely
detecting a worker crash sounds very weird and improbable because Circus uses
the system PID list to check on processes - so I wonder what happens in your
case.

------
amalag
Monit is another standard program, but I think these things being
compared are really not similar. Aren't upstart and systemd replacements for
init.d runlevel scripts? While God & Monit make sure programs are running. I
suppose there is overlap.

~~~
pjscott
There's no clear delineation here. Upstart and Systemd are init.d
replacements, both of which can make sure that programs are up and running,
and incorporate some basic process monitoring. Runit is similar, and _can_
replace init.d, but will happily run as just another process. God can't
replace init.d, but it can easily handle the daemonize-and-keep-running
functionality that's at the core of what we want from all this stuff. In
addition, God can do some health checking, for things like memory and CPU
consumption. IIRC Monit doesn't actually start other programs under itself,
but it can be configured to watch running programs, and kick them if
necessary.

If you have a clean and useful taxonomy for this stuff, I'd be interested in
hearing it.

~~~
amalag
Monit will start up a program if it's not running according to its pid file.
In other words if the pid file doesn't exist it will start the program. I use
it in deploys by killing a program and letting Monit start it again with the
new version.
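
A rough sketch of a pid-file liveness check of the kind described above. It is an illustration in shell, not Monit's actual logic; the `pid_alive` name and the paths in the usage comment are hypothetical.

```shell
#!/bin/sh
# pid_alive PIDFILE
# Succeeds only if PIDFILE is readable, holds a number, and a process with
# that PID can be signalled. kill -0 sends no signal; it only tests that the
# process exists and that we are allowed to signal it.
pid_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 1 ;;   # empty or not purely numeric
    esac
    kill -0 "$pid" 2>/dev/null
}

# Hypothetical use:
#   pid_alive /var/run/myapp.pid || /etc/init.d/myapp start
```

One known weakness of this approach is a stale pid file whose number has been reused by an unrelated process, which the check above cannot detect.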

------
ballard
1\. Keep services running: Runit, even with daemontools differences, is hard
to beat.

2\. Hard(-ish) resource limits and accounting: LXC w/ cgroups. Almost as good
as full paravirtualization (Xen). There are still some issues with limiting
resource contention impact between cgroups.

3\. Softer resource limits, rogue app restarter: We've heavily modified
bluepill because it seemed to lack insight on the needs and challenges of
large-scale production ops. Specifically, we've added optional total child
process limits (a few issues reported, fixed and even submitted a pull
request). It might be useful to add max # of processes, nic bandwidth, iops
and couple other checks.

Also worth considering:

4\. Status monitoring: Icinga

5\. Performance: collectd

6\. Entropy injection: Chaos Monkey

------
__david__
I'll throw my hat in the ring here and point out daemon-manager:
<http://porkrind.org/daemon-manager/>

It was designed to be simple (creating a new daemon conf file is trivial), low
over-head (it's very init-ish in function so it needs to not take a lot of
memory or processor time), stable (can't crash or it loses track of everything
that it launched), and secure (lets you launch your daemons as different users
--usually at a lower security level).

I've been using it myself on almost every server I administrate (I wrote it to
scratch my own itch) for a couple years now and it's been very stable.

------
jokull
runit user here. To someone without a strong understanding of UNIX, the runit
documentation will seem a little low level. That being said, the
learning curve is steep but short. What’s missing is a blog post that explains
UNIX from the point of view of runit. Starting & restarting and logging. It’s
actually a good place to start learning more about UNIX.

Zed Shaw has also started a project of his own to tackle these things. Not
sure how it handles multiple unrelated things. Seems to be a good idea to
document in understandable Python code what the requirements are to daemonize
a process correctly.

------
andrewflnr
When I saw the title "vs. God", I thought it was going to include something
like "Tup vs. the Eye of Mordor", a whimsical performance comparison between
the Tup build system and "The Eye of Mordor" in the guise of a simple script
that does no dependency checking because it already knows what has changed:
<http://gittup.org/tup/tup_vs_mordor.html>

------
dlgtho
"[Upstart] currently has no way to adjust the maximum number of file
descriptors, or limit memory usage"

That's weird, because that's one of the things I like about Upstart: the
ability to set pam limits and others in the same file. Upstart seems to
support the same options as Runit memory-wise: stack, data, memlock ...

    limit nofile 10000 10000
    limit nproc 1024 1024
    nice 3
    chroot /var/roots/mychroot

[0] <http://upstart.ubuntu.com/wiki/Stanzas#limit>
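
For comparison, a similar file-descriptor limit can be approximated in a plain run script with the shell's `ulimit` builtin. This is a sketch assuming a shell whose `ulimit` accepts `-n` (bash and dash do); `run_with_nofile` and `myapp` are hypothetical names, and raising a limit above the current hard limit generally needs root.

```shell
#!/bin/sh
# run_with_nofile LIMIT CMD [ARGS...]
# Set the open-file limit inside a subshell, then exec CMD so the limit
# applies only to that command and its children, not to the caller.
run_with_nofile() {
    limit=$1
    shift
    ( ulimit -n "$limit" && exec "$@" )
}

# Hypothetical use in a ./run file:
#   run_with_nofile 10000 myapp --foreground
```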

------
sdfjkl
My favorite remains BSD style (rc.conf+rc.d/) init - which for some reason
receives no mention here. Arch Linux used to have it too, but then
inexplicably dumped it.

[http://www.freebsd.org/cgi/man.cgi?query=rc&sektion=...](http://www.freebsd.org/cgi/man.cgi?query=rc&sektion=8)

~~~
antihero
Because having long bash scripts and every process implementing its own
daemonisation in weird and wonderful ways is redundant and inconsistent, and
makes making something a daemon frustrating.

~~~
Sssnake
>Because having long bash scripts

Has nothing to do with BSD style init? No part of the system should be either
long or bash.

>its own daemonisation in weird and wonderful ways is redundant and
inconsistent

Calling daemon() is not that hard.

>and makes making something a daemon frustrating

That doesn't even make sense. People writing shitty software that should be a
daemon but isn't makes that "frustrating".

------
misiti3780
I have been using supervisor with my django apps for the past 2 years. I have
no real complaints, and actually like how you can easily add new processes
with a single conf file and then type

supervisorctl update to start the new daemon without having to restart the
whole supervisor process over again ...

~~~
pjscott
Just FYI, that feature is shared by all the alternatives mentioned in the
article. (The exception would be runit, which has one supervisor process per
service, thus making the issue moot.)

------
leknarf
We're using Bluepill, which has a friendly DSL to define CPU and RAM usage
limitations. This can be quite handy if you need to occasionally restart a
process with memory leaks or the like.

I believe God and Monit have similar capabilities, although I haven't used
them personally.

------
chanon
I'm extremely happy with supervisord, but now wondering how Runit is better.

------
afita
Don't forget about s6 which seems pretty interesting as well (I admit I
haven't used it yet). <http://www.skarnet.org/software/s6/>

------
tel
Anyone have any experience with Bump's Angel?

~~~
jamwt
Angel author here. It works very well for us and some others, but it must be
said it is not as featureful as most of these. For example, it provides no
assistance for log rotation (you'd need to do this via a shell pipeline with
multilog or similar) and does not have a "control interface" (zmq socket etc)
of any kind.

~~~
pjscott
I think that logging should be separate here, so I'm actually okay with built-
in logging support being minimal to nonexistent. If you use multilog (from
daemontools) it will handle logging to rotated files; if you use svlogd (from
runit), it can do the same, and also includes syslog support. If you just log
to syslog in the first place, and your syslogd is at all modern, it can also
handle writing things to log files, here or on a remote machine, which is cool
too if you're okay with configuring it.

Or you can do something even more radical, like using fluentd[1], which looks
really useful and well-designed. I like having this decision decoupled from
the process manager. So, I would not count that as a strike against Angel.

[1] <http://fluentd.org>

~~~
jamwt
Yeah, I agree. To this end, I just added support for specifying a logger
process, and angel will now pipe stdout/stderr to it à la daemontools.
Definitely don't want angel in the business of being a log rotator.
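
The daemontools-style arrangement, where the supervised program writes to stdout/stderr and a separate logger process reads the pipe, can be sketched like this. It is not Angel's implementation; `stamp_lines` is a hypothetical stand-in for a real logger such as multilog or svlogd, and `run_logged` is likewise a made-up name.

```shell
#!/bin/sh
# stamp_lines: stand-in logger. Prefixes every line read from stdin with a
# UTC timestamp that is readable by both humans and machines.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$line"
    done
}

# run_logged CMD [ARGS...]
# Merge the command's stderr into its stdout and pipe both to the logger.
run_logged() {
    "$@" 2>&1 | stamp_lines
}

# Hypothetical use:
#   run_logged myapp --foreground
```

Keeping the logger as its own process means rotation and shipping can be swapped out without touching the process manager.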

------
pendexgabo
what about <http://libslack.org/daemon/> ?

------
Sssnake
I'm amazed at how far the linux world has fallen in the last decade. It is now
"normal" for people to run an extra layer of buggy crap software to monitor
their buggy crap software and restart it when it crashes. Rather than the long
standing unix solution of not using buggy crap software that crashes
constantly in the first place.

~~~
stfp
All that unix software was so great! Apache 1! Woo! bind! Yeah! sendmail!
Nice! That's pretty much it! It never crashed and was so secure! Happy times!

Trololol.

------
donpdonp
I was hoping the daemon from Daniel Suarez's Daemon novel had been located.

