

The Virtues of Monitoring - roidrage
http://www.paperplanes.de/2011/1/5/the_virtues_of_monitoring.html

======
josephruscio
This is a really good introduction to the different levels of infrastructure
monitoring and their various pros/cons.

<http://librato.com> (full disclosure: I hack there) is a startup with a new
kind of entry in the process-level monitoring/management space. Would love any
feedback the infrastructure-minded part of the community here might have.

------
Uchikoma
Monitoring is one of the most important things.

Rule of thumb: Think of everything you wish you had monitored after a crisis
has happened (which might be "everything").

~~~
riffraff
The problem is that you will probably think of the stuff you wish you had
monitored only _after_ the crisis has happened. People are built in a way
that makes it hard for us to think of what may go wrong.

So another simple rule I learned over time is to trust and understand the
defaults, plugins, knobs, and metrics that come with well-known monitoring
systems ("why the hell should I monitor _that_?"). This way you use the
experience of other people as a backup for your own.

~~~
ojilles
How about starting with the application/business metrics first (as those are
presumably easier to articulate)? Then, as things fail over time, move down
the stack (infra/system) to get earlier warnings.

