

Ask HN: keeping services up and running? - nprincigalli

Services are born into a life of sweat and suffering, but some can't stand the beating or the garbage being thrown at them, and they die on you. Hence the question:

What are you using to keep them always up and running?

I used djb's daemontools some 6 years ago, and it was okay back then, but I wonder what is being used by those setting up their gear now. I've also compiled this quick list of players in this space after some googling and asking around:

  * monit http://mmonit.com/monit/
  * supervisord http://supervisord.org/
  * daemonize http://bmc.github.com/daemonize/
  * runit http://smarden.sunsite.dk/runit/
  * perp http://b0llix.net/perp/
  * launchd http://launchd.macosforge.org/
  * DJB's daemontools http://cr.yp.to/daemontools.html

Pointers to alternatives, too, are greatly appreciated!

Thank you!
======
nixme
Dustin Sallings says, "_Don’t start programs, run programs._"

See: <http://dustin.github.com/2010/02/28/running-processes.html>

------
aphyr
I use init. Most of my services manage their own daemonization and forked
workers. It's easier to design a correct system that detects failure, issues
alerts, and restarts as necessary than to build unkillable workers. Coupled
with some basic init scripts, it works pretty well.

I like monit, but it has a weird habit of running super-slowly, failing to
restart services, or otherwise flaking out.

------
_delirium
This post by patio11 is a more in-depth "how to keep everything up" post
rather than a rundown of specific tools (though it mentions tools also), but I
thought it was quite informative:
[http://www.kalzumeus.com/2010/04/20/building-highly-reliable...](http://www.kalzumeus.com/2010/04/20/building-highly-reliable-websites-for-small-companies/)

------
njl
I've used both daemontools and monit in anger. For a single server that I'm
not particularly concerned about, monit has been more than sufficient and
convenient. The web interface to see what's going on isn't bad either. On the
other hand, it can send me annoying blizzards of emails when things break.

For the next project, I'm going back to daemontools, for a multitude of
reasons. Between the /service directory and the svc command, automating stuff
is ridiculously easy. I can get the active health check stuff by writing a
ten-line script that does a much better check on a server than the generic
checks monit provides. I can get emails when bad stuff happens with logcheck.
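
A health check of the kind described above might look like the following sketch. It is not the script njl refers to, which is not shown here. It assumes the service answers HTTP, and the function names, the probe URL, and the `/service/myapp` path are all hypothetical. The only daemontools call it makes is `svc -t`, which sends TERM so that supervise starts the service again.

```python
import http.client
import subprocess
import urllib.error
import urllib.request


def is_healthy(url, timeout=5.0):
    """Return True only if the URL answers with a 2xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Refused connection, timeout, HTTP error status, or malformed URL.
        return False


def check_and_restart(url, service_dir, probe=is_healthy, run=subprocess.run):
    """Probe the service; on failure ask daemontools to restart it.

    Returns True if a restart was requested, False if the probe passed.
    """
    if probe(url):
        return False
    # `svc -t` sends TERM; supervise then brings the service back up.
    run(["svc", "-t", service_dir], check=False)
    return True
```

Calling `check_and_restart("http://127.0.0.1:8080/health", "/service/myapp")` from a periodic job would be one way to wire it up; both arguments are placeholders. Passing `probe` and `run` as parameters keeps the restart logic testable without a live service.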

Daemontools is just so goddamn unixy.

------
kineticac
I used to use Monit a lot, but recently started going simpler. We run our
entire backend / platform on Heroku, and all our processes are actually
DelayedJob workers that run constantly. Using a begin/rescue/ensure we can
make sure the "process" here keeps going. It's a little harder to work in a
cloud environment, but doing this has saved us from needing any type of
sysadmin work at all.
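
The begin/rescue/ensure pattern mentioned above is Ruby; a rough Python analogue uses try/except/finally. This is only a sketch of the idea, not the poster's DelayedJob code. `run_forever`, `job`, `should_stop`, and `cleanup` are hypothetical names introduced for illustration.

```python
import logging

log = logging.getLogger("worker")


def run_forever(job, should_stop, cleanup=lambda: None):
    """Call job() repeatedly, surviving any exception it raises.

    Returns the number of failed iterations once should_stop() is true.
    """
    failures = 0
    while not should_stop():
        try:
            job()
        except Exception:
            # Counterpart of `rescue`: record the failure and keep looping.
            failures += 1
            log.exception("job failed; continuing")
        finally:
            # Counterpart of `ensure`: runs whether or not job() raised.
            cleanup()
    return failures
```

Note that this only protects against exceptions raised inside the loop. If the whole process is killed, something outside it still has to start it again.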

It's great using tools like Heroku, who then teams up with awesome other
services and then provides great tools to really make life easy as a
developer.

------
skorgu
Bear in mind that at least supervisord and daemontools expect to be the parent
of a running, foregrounded process. Monit expects processes to run in the
background and generate a pid file. I'm not very familiar with the others.
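
To illustrate the "parent of a foregrounded process" model, here is a minimal sketch in Python. It is not how supervisord or daemontools are implemented; `supervise` and its parameters are hypothetical, and the restart count is capped only so the example terminates (a real supervisor would keep going).

```python
import subprocess
import time


def supervise(argv, max_restarts=3, backoff_seconds=1.0):
    """Run argv as a foreground child and restart it after a non-zero exit.

    Returns the list of exit codes observed, in order.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):
        child = subprocess.Popen(argv)
        # Because we are the direct parent, wait() tells us exactly when
        # the child exits and with what status; no pid file is needed.
        code = child.wait()
        exit_codes.append(code)
        if code == 0:
            break  # clean exit: treat as intentional and stop
        time.sleep(backoff_seconds)  # avoid a tight crash loop
    return exit_codes
```

This only works if the child stays in the foreground. A program that daemonizes itself forks and exits immediately, so the parent sees a clean exit and loses track of the real process, which is why the pid file approach exists.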

I really enjoy supervisord personally; it feels similar to daemontools in
execution but has a somewhat friendlier interface all around.

~~~
jubbam
I second this. supervisord is a pleasure to use: pretty straightforward and
quick to get up to speed with.

------
msisk6
Monit seems to be the currently preferred solution in the Rails world. I use
it on many different sites with no problems.

I've also used daemontools in the past and never had an issue with it, either.

I've heard good things about god, but I've not used it myself. (Now there's a
sentence that could be taken out of context.)

------
damienfir
I use supervisord to run my python webservers. Works pretty well.

------
vimalg2
I pretty much use Monit for everything. (Though I wasn't aware of
supervisord.)

I'm always looking forward to infrastructure/scaling/systems-engineering posts
like these.

------
aditya
god is nice too: <http://god.rubyforge.org/>

~~~
there
i've had terrible experiences with god. after a few days of running, it would
routinely be the one process eating up the most cpu and lots of memory, yet
its only reason for running was to watch out for other processes eating cpu
and memory.

i didn't need millisecond accuracy in detecting rogue processes, so i just
switched to a small, custom script that runs from cron every minute.
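
The script itself is not shown in this thread. As a minimal sketch of what a cron-driven check could look like, the following assumes the watched service writes a pid file and that the host is POSIX (signal 0 behaves differently on Windows). It checks only liveness, not CPU or memory use, and the function names are hypothetical.

```python
import os


def read_pid(pid_file):
    """Return the pid stored in pid_file, or None if missing or unparsable."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def pid_is_running(pid):
    """Return True if a process with this pid currently exists (POSIX only)."""
    if pid <= 0:
        return False  # 0 and negatives address process groups, not one process
    try:
        os.kill(pid, 0)  # signal 0: existence check, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def needs_restart(pid_file):
    """True when the pid file is missing, malformed, or points at a dead pid."""
    pid = read_pid(pid_file)
    return pid is None or not pid_is_running(pid)
```

A cron entry running once a minute could call `needs_restart` on the service's pid file and invoke the service's own start command when it returns True. A stale pid that has been reused by an unrelated process would fool this check, which is one known weakness of pid files.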

~~~
kineticac
God has posed many problems for me in the past. I saw all the same problems
"there" saw. It would slow down our entire slice on slicehost to a standstill
after a few days of running. It was using more resources than everything else
combined. I'm not sure how it is now, but we switched to Monit after God and
it's been awesome.

I wrote a quick snippet on how we set up the basics of Monit, and also talked
about God:
[http://artchang.com/?sort=&search=monit](http://artchang.com/?sort=&search=monit)

------
bensummers
SMF on Solaris / OpenSolaris.

------
jacquesm
monit works wonders for me.

I got it through a post much like this one.

