

Shinken: A Pythonic Nagios - rphlx
http://www.shinken-monitoring.org/features

======
natch
I read the "Why do this project" section on the site, and am not convinced.
Nagios is a very actively maintained project, and did have major features
added about a year and a half ago. It is totally suited for use in large
installations, especially with the help of existing third party enhancements
like GroundWork Monitor. And on the non-issue of wanting a flexible Nagios, a
lack of flexibility is NOT one of Nagios' problems. It is insanely flexible,
and has tons of add-ons and plugins. Also being in C, it performs very well,
especially the current version. I've looked at the code, and it's not bad -
maybe the Shinken authors are just intimidated by C.

------
ghshephard
Is there anyone out there who uses Nagios (and by use, I mean have more than a
hundred services being actively monitored) that finds Nagios lacking? Nagios
is basically an engine for state tracking, notification, and cmd dispatch.
Ironically, the base set of Nagios packages isn't capable of doing any
monitoring - you need to add plugins to fill out the command library to do
that.

Nagios, to me, is like cvs, jira, cacti, sendmail, bind - just one of those
sysadmin toolkit apps that works pretty well, and I've never been inspired to
invest the time in learning the alternatives - never really needed them.
(Yes - I know git/subversion blow cvs away, and postfix is probably the right
MTA to use. I think bind is actually the only one on my list that still has
mindshare. Issue tracking (jira) is so all over the place that I'm guessing
there is no "one popular" system out there.)

------
jobenjo
We've been using a different Nagios alternative, Zabbix, for quite a while and
I'm always surprised how few people have heard of it. The UI takes getting
used to, but it's full featured and dependable.

~~~
riffraff
Happy Zabbix user here. It's true, the UI is not especially obvious, but it
works quite well, and with the graphing capabilities it is also a reasonable
replacement for cacti.

------
sharms
Does anyone have experience with running large scale deployments of this? I am
not a fan of Nagios itself (last I looked at the code it was a horrible Perl
mess), but the idea itself is great.

~~~
viraptor
Not very likely, as the project itself only got a wiki and mailing list in January... But
the architecture seems better than Nagios tbh. More distributed and with more
options for redundancy.

Also the roadmap doesn't look great so far:
<http://sourceforge.net/apps/trac/shinken/report/2>

But if someone started writing a drop-in Nagios replacement, I hope they know
what the problems are and how to solve them.

------
jbellis
Isn't that the same thing Zenoss is trying to do?

/not really familiar w/ Zenoss but I know their founder spoke at PyCon a
couple years ago

------
rubyrescue
interesting. i'm working on erlmon, an Erlang and Lua driven monitoring tool,
and there are some good ideas in this tool that we may borrow.

~~~
sant0sk1
Sounds cool. How far along are you? Have anything online yet?

~~~
rubyrescue
up on github now. it works but we're still working out config file format, and
it's just now able to send alerts. lot of work to do before we could even call
it alpha...

<http://github.com/darrikmazey/erlmon/tree/master/erlmon-1.0/>

------
olefoo
This looks promising, especially since you can use existing nagios-plugins.

Anything that improves Nagios configuration is a good thing.

~~~
ghshephard
I've always thought that Nagios configuration was pretty good. For small
organizations, it's simple enough with its object/template infrastructure,
particularly now that you can create custom properties and pass arguments to
nrpe, that most sysadmins can get a monitored environment up in 30 minutes,
and then add devices on the order of 2-3 minutes per system (inheriting all
the monitors for that object class).

When organizations get larger, the configs are straightforward enough that
machine generation (which is absolutely necessary when you're monitoring 10s of
thousands of services) is also relatively straightforward.
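
To make the machine-generation point concrete, here is a minimal Python sketch that renders Nagios `define host` / `define service` blocks from an inventory. The inventory layout, the role-to-checks mapping, and the template names `generic-host` and `generic-service` are assumptions for illustration and would need to match whatever templates and commands a given installation actually defines.

```python
def render_object(kind, directives):
    """Render one Nagios object definition block as text."""
    width = max(len(key) for key in directives)
    lines = [f"define {kind} {{"]
    for key, value in directives.items():
        lines.append(f"    {key.ljust(width)}  {value}")
    lines.append("}")
    return "\n".join(lines)


def render_inventory(hosts, checks_by_role):
    """Emit one host block per host, plus one service block per check
    assigned to that host's role (hypothetical inventory layout)."""
    blocks = []
    for host in hosts:
        blocks.append(render_object("host", {
            "use": "generic-host",          # assumed template name
            "host_name": host["name"],
            "address": host["address"],
        }))
        for description, command in checks_by_role.get(host["role"], []):
            blocks.append(render_object("service", {
                "use": "generic-service",   # assumed template name
                "host_name": host["name"],
                "service_description": description,
                "check_command": command,
            }))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Illustrative data only; addresses are from the documentation range.
    hosts = [
        {"name": "web01", "address": "192.0.2.10", "role": "web"},
        {"name": "db01", "address": "192.0.2.20", "role": "db"},
    ]
    checks_by_role = {
        "web": [("HTTP", "check_http"), ("Load", "check_nrpe!check_load")],
        "db": [("Load", "check_nrpe!check_load")],
    }
    print(render_inventory(hosts, checks_by_role))
```

Because the per-host detail is only a name, an address, and a template reference, the generator stays small; the shared settings live in the templates rather than in the generated output.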

My company is dropping north of $1mm on ITIL practices, ticketing, trending,
and KPI generators to depart from our current Jira/Cacti/Nagios world, but
Nagios is one of the products that we're continuing to use for the foreseeable
future. (Eventually that will change - I've discovered that billion-dollar
corporations would rather spend $500K / framework and then start adding
developers (at $150K fully loaded) to extend that platform with proprietary
in-house developed systems, rather than simply go buy an open-source product
for $5K and do the same thing.)

I'm not bitter though, really.

~~~
olefoo
It may just be that the Nagios developers have a different cognitive style
than I do, because I find the configuration language irritating in the
extreme. So much so that I end up using templates to script configuration as
much as possible. Somehow monit has always felt more comfortable in my
hand.

~~~
ghshephard
Interesting. About the only "Nagios Annoyance" that I have (and I don't really
see a good way around it) is that you don't specify the commands you are
running for a service check, but have to reference your command library (which
in turn has the name and calling order of the actual parameters).

I realize that there is no way to update the command/parameters you are
calling in one place without doing this, but the cmd entry usually maps so
_completely_ to the command that I'm calling - parameters, command name,
etc. - that its addition has never really seemed to be that critical, and it
always results in one more "lookup" when trying to figure out what a service
check is doing.
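
The extra "lookup" described above can be sketched in a few lines of Python. This is only a model of the indirection, not Nagios code: the command name `check_http_port`, the plugin directory, and the host address are made up for illustration, while `$USER1$`, `$HOSTADDRESS$` and `$ARGn$` follow the standard Nagios macro naming and `!` is the separator a service's `check_command` uses for its arguments.

```python
def resolve_check(check_command, commands, macros):
    """Expand a service's check_command into the command line that runs.

    check_command: e.g. "name!arg1!arg2", as written in a service definition
    commands:      mapping of command_name -> command_line (the command library)
    macros:        host/resource macro values, keyed without the dollar signs
    """
    name, *args = check_command.split("!")
    line = commands[name]                      # the extra lookup
    values = dict(macros)
    for position, arg in enumerate(args, start=1):
        values[f"ARG{position}"] = arg         # positional $ARG1$, $ARG2$, ...
    for macro, value in values.items():
        line = line.replace(f"${macro}$", value)
    return line


if __name__ == "__main__":
    # Hypothetical command library entry and macro values.
    commands = {
        "check_http_port": "$USER1$/check_http -H $HOSTADDRESS$ -p $ARG1$",
    }
    macros = {
        "USER1": "/usr/local/nagios/libexec",
        "HOSTADDRESS": "192.0.2.10",
    }
    print(resolve_check("check_http_port!8080", commands, macros))
```

Reading a service definition means doing this substitution in your head: the service only says `check_http_port!8080`, and the flags and argument order live in the command entry.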

