
Watchy: a distributed system monitoring solution - tbrock
http://redbrain.github.io/watchy/
======
tlrobinson
By "distributed" they seem to mean "writes to a shared database"?

~~~
opendais
That is what it looks like to me too. Although it looks like it can operate
against a Mongo cluster, so it's distributed.

~~~
xorcist
So not distributed then.

------
dkhenry
I really like that people are valuing monitoring and making tools to do it,
but I mean how is this better than SNMP? It doesn't look like it gives any
more information than the prTable of the UCD-SNMP-MIB
([http://www.net-snmp.org/docs/mibs/ucdavis.html](http://www.net-snmp.org/docs/mibs/ucdavis.html))

Also, if you used SNMP instead of rolling your own agent, you would get security
and encryption (SNMPv3) and the ability to use any of a number of existing
monitoring tools.

------
spo81rty
Monitoring background services can be difficult due to changing process ids.
If anyone is looking for a commercial product that does this, I would
recommend checking out Stackify. We do server monitoring, app monitoring,
errors, logs, metrics, etc., all in one app.
[http://www.stackify.com](http://www.stackify.com)

~~~
xorcist
That's a very strange comment. I've never had that problem across several
years, platforms and monitoring systems.

Perhaps for a fast respawning process I can see potential problems monitoring
it, but that is usually indicative of a problem in itself and something you
generally watch for.

~~~
spo81rty
A good example of this is Windows performance counters since they are process
ID based. That is more complicated than knowing if a service named X is
running. It's tough if you want to know more than just if the app is running.

~~~
xorcist
But in which monitoring system is that a problem? I've just never seen it.

------
gingerlime
UDP might be a rather poor choice security-wise, especially if you rely on
this monitoring to alert you if something goes down.

It's much easier to spoof a UDP packet from an arbitrary source-IP address
than it is with TCP.

It gives attackers a fairly easy opportunity to fake all your service
metrics and make it look like everything is up (or down), etc.

~~~
cbsmith
TCP's spoof protections are fairly limited and weak. If you are worried about
being spoofed, address it at layer-7. I wouldn't let spoofing concerns drive
my choice of layer-3 protocol.

~~~
mobiplayer
I personally hope you chose IP as a layer-3 protocol.

You might want to decide between TCP and UDP for layer-4, though, but as
gingerlime pointed out it is stupidly easy to successfully spoof UDP comms,
there's not even need to have a MITM situation. Same goes for ICMP.

~~~
cbsmith
> I personally hope you chose IP as a layer-3 protocol.

ROTFL! Whoopsie. ;-)

> You might want to decide between TCP and UDP for layer-4, though, but as
> gingerlime pointed out it is stupidly easy to successfully spoof UDP comms,
> there's not even need to have a MITM situation. Same goes for ICMP.

Yes, but TCP's protections are quite weak. If you are really concerned about
spoofing, you are much better off handling that with cryptographic solutions.
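
As a rough illustration of handling this cryptographically, here is a minimal sketch of authenticating each UDP datagram with an HMAC. Everything in it is an assumption made for illustration: the pre-shared key, the 8-byte timestamp header, and the `sign` / `verify` names are hypothetical and are not taken from Watchy.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional

TAG_LEN = 32  # size of a SHA-256 digest in bytes
HEADER_LEN = 8  # unsigned 64-bit timestamp, network byte order


def sign(key: bytes, payload: bytes, timestamp: Optional[int] = None) -> bytes:
    """Return timestamp + payload + HMAC tag, ready to send as one datagram."""
    ts = int(time.time()) if timestamp is None else timestamp
    header = struct.pack("!Q", ts)
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


def verify(key: bytes, datagram: bytes, max_age: int = 30,
           now: Optional[int] = None) -> Optional[bytes]:
    """Return the payload if the tag is valid and the timestamp is recent, else None."""
    if len(datagram) < HEADER_LEN + TAG_LEN:
        return None
    body, tag = datagram[:-TAG_LEN], datagram[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(tag, expected):
        return None
    (ts,) = struct.unpack("!Q", body[:HEADER_LEN])
    current = int(time.time()) if now is None else now
    # Reject stale (or far-future) datagrams to narrow the replay window.
    if abs(current - ts) > max_age:
        return None
    return body[HEADER_LEN:]
```

A receiver would call `verify(key, data)` on whatever `recvfrom` returns and drop anything that comes back as `None`. This only gives authenticity, not confidentiality, and the timestamp window narrows replay attacks rather than eliminating them.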

------
graycat
What about false alarm rate? Do we know what it is? Can we adjust it? Since
Watchy is for distributed systems, can it do well detecting problems that have
'distributed' causes?

------
8ig8
If you're looking for really basic website monitoring and notifications, Vigil
is a handy iOS app.

[http://vigil-app.com](http://vigil-app.com)

~~~
marykaichini
Is it only for smartphones? Do you have a regular (desktop or web) version of
the tool? I am in search of a website monitoring tool and I am ready to listen
to any options. At the moment I like Anturis a lot, which is an all-in-one
monitoring tool, but they have only just developed an iOS app and it is still
fresh. I am ready to consider anything else.

------
nopaste7
How does it compare to nagios/icinga?

~~~
kbaker
At least looking at its current feature set, it looks much more like it wants
to be Logstash or Apache Flume, but with an included frontend. They note the
deep inspiration from statsd as well.

Also all communication is over UDP at the moment, which could be bad for poor
quality links, but probably OK for most servers.

The main difference from just a log shipper is that it includes a frontend web
service to look at data, and a library to link into the app to deliver stats,
instead of using a more standard protocol like syslog / collectd / ganglia,
etc. and having the server parse / ship it.

So it's more of an integrated solution than combining different logging
pieces.
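
For anyone unfamiliar with the statsd approach mentioned above, here is a minimal sketch of fire-and-forget metric delivery over UDP. It uses the plain statsd line format (`name:value|type`). Whether Watchy uses this wire format is not something I have checked, so treat the format and the `format_metric` / `parse_metric` / `send_metric` names as assumptions for illustration only.

```python
import socket
from typing import Tuple


def format_metric(name: str, value: float, kind: str = "g") -> bytes:
    """Encode one metric in statsd line format, e.g. b'cpu.load:0.5|g'."""
    return f"{name}:{value}|{kind}".encode("ascii")


def parse_metric(line: bytes) -> Tuple[str, float, str]:
    """Decode a statsd-style line back into (name, value, kind)."""
    text = line.decode("ascii")
    name, rest = text.split(":", 1)
    value, kind = rest.split("|", 1)
    return name, float(value), kind


def send_metric(host: str, port: int, name: str, value: float,
                kind: str = "g") -> None:
    """Send one metric as a single UDP datagram; no ack, no retry."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_metric(name, value, kind), (host, port))
```

The sender never learns whether the datagram arrived, which is why lossy links are a concern: a dropped packet is simply a missing data point.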

