
Open-Sourcing Rearview: Real-Time Monitoring With Graphite - wastedbrains
https://techblog.livingsocial.com/blog/2013/09/30/open-sourcing-rearview-real-time-monitoring-with-graphite/
======
foz
At my company we've been using Graphite and StatsD for nearly two years now,
and we rely on them heavily for tracking performance and troubleshooting issues.
We rely on Icinga, Pingdom, NewRelic and other tools to alert us to problems.

Often, when things have gone really wrong (DoS, internal network issues, app
errors, disk full) the affected machine(s) stop reporting to graphite (or
under-report data). We get alerted by monitoring the services, not the stats.

Being alerted about low or unusual values might be helpful in some cases, but
in my experience it would be too noisy. Usually when something bad happens, we
investigate with Graphite and analytics tools anyway to understand the impact
on traffic and KPIs.

I could see Rearview being useful for some cases, but not as a replacement for
real monitoring and alerting tools.

~~~
SEJeff
In my currently non-existent free time, I'm a Graphite co-maintainer (check
github). If you have any improvements or suggestions, please feel free to send
us pull requests. The current pull requests are a bit of a mess, but I blame
myself and will be getting around to merging a ton of them "real soon now TM".

~~~
foz
Thank you for your work on Graphite. For all its UI strangeness and quirks,
it is a great solution that a lot of people love (myself included).

I'll peek at the pull requests and see if my company might be able to
contribute some help.

------
jwatte
Server side graphs didn't work out for all our monitoring use, so we don't use
graphite. You should make a version that works with istatd :-)
[https://github.com/imvu-open/istatd](https://github.com/imvu-open/istatd)

~~~
sakers
I'll definitely check that out! I'm also thinking we may need to add support
for a time-series database as Graphite does have its limitations.

------
dekz
This looks really polished and definitely a great idea. I can see why you
chose Ruby for the scripting of the monitors, being able to evaluate that code
in a predefined binding can be quite powerful, especially with the aid of
helpers being pre-defined as well.
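
A minimal sketch of the idea described above: a monitor script is evaluated in a binding whose `self` carries pre-defined helpers. Everything here is hypothetical and for illustration only (`MonitorContext`, `average`, `alert`, `run_monitor`, and the metric name are assumptions, not Rearview's actual API). Plain `eval` is also not a sandbox, so untrusted scripts would need real isolation.

```ruby
# Hypothetical context object: its methods are the helpers a monitor script may call.
class MonitorContext
  attr_reader :alerts

  def initialize(series)
    @series = series # hash of metric name => array of samples (nil = missing)
    @alerts = []
  end

  # Helper: mean of a named series, ignoring missing (nil) samples.
  def average(name)
    values = @series.fetch(name).compact
    return nil if values.empty?
    values.sum.to_f / values.size
  end

  # Helper: record an alert message.
  def alert(message)
    @alerts << message
  end

  # Expose a binding in which self is this context object.
  def context_binding
    binding
  end
end

# Evaluate a monitor script against the given series and return its alerts.
def run_monitor(script, series)
  context = MonitorContext.new(series)
  eval(script, context.context_binding)
  context.alerts
end

# Single-quoted heredoc so "#{avg}" is interpolated when the script runs, not here.
EXAMPLE_SCRIPT = <<~'RUBY'
  avg = average("web.latency_ms")
  alert("latency high: #{avg}") if avg && avg > 200
RUBY
```

Calling `run_monitor(EXAMPLE_SCRIPT, { "web.latency_ms" => [150, 250, nil, 350] })` returns the list of alerts the script raised. The script itself only sees the helpers and whatever locals it defines.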

Why not a full ruby stack, or was the "live" scripting done after the initial
inception?

~~~
sakers
We have always used Ruby for the scripting (we're predominately a Ruby shop, so
this was key for future adoption). The very first MVP for this tool was
individual Ruby scripts running against Graphite and being scheduled via cron.
The first real backend scheduler was built in Scala, but for various reasons
we've converted to Rails/Puma/Celluloid running in a VM using JRuby. The
monitors themselves run in an MRI sandbox for security purposes.
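
As a rough sketch of what one of those early cron-driven scripts might have looked like: Graphite's render API can return JSON (`format=json`) as a list of targets, each with `[value, timestamp]` datapoints, where a `null` value means no data was reported. The function names, statuses, threshold, and sample metric below are assumptions for illustration, not Rearview's code. The HTTP fetch is left out so the sketch runs without a Graphite server.

```ruby
require "json"

# Parse a Graphite render response (format=json) into
# { target name => array of values }, keeping nil for missing datapoints.
def parse_render_json(body)
  JSON.parse(body).each_with_object({}) do |series, out|
    out[series["target"]] = series["datapoints"].map(&:first)
  end
end

# Hypothetical check: report :no_data when every sample is missing,
# :failed when the latest reported value exceeds the threshold, else :ok.
def check_series(values, threshold:)
  present = values.compact
  return [:no_data, nil] if present.empty?
  latest = present.last
  latest > threshold ? [:failed, latest] : [:ok, latest]
end

# Sample payload in the shape Graphite returns; the metric name is made up.
SAMPLE_BODY = <<~JSON
  [{"target": "stats.web.errors",
    "datapoints": [[1.0, 1380000000], [null, 1380000060], [7.0, 1380000120]]}]
JSON

parse_render_json(SAMPLE_BODY).each do |target, values|
  status, latest = check_series(values, threshold: 5)
  puts "#{target}: #{status} (latest=#{latest.inspect})"
end
```

Treating an all-`null` series as its own `:no_data` status matters here, because a host that stops reporting would otherwise look the same as a quiet one.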

------
fit2rule
I'm not sure I'm ready to abandon a custom monitoring environment consisting
of a shell environment, screen, ssh certs, lugubrious quantities of /proc/,
and a fair bit of gnuplot. Seems to me that's all you need? Why commit to a
Ruby install for an operator console?

~~~
sakers
I'm not sure lugubrious means what you intended it to mean. :) At any rate,
see my reply here
[https://news.ycombinator.com/item?id=6646402](https://news.ycombinator.com/item?id=6646402)
for a sampling of things Rearview brings to the table. The tl;dr is that it's
not a NOC tool, it's more for process monitoring whether that be application
processes, engineering processes, or business processes. It also does provide
a central location for anyone to see the state and history of an application
or business unit.

~~~
fit2rule
>The tl;dr is that it's not a NOC tool, it's more for process monitoring
whether that be application processes, engineering processes, or business
processes. It also does provide a central location for anyone to see the state
and history of an application or business unit.

Ah. I've usually just used email for that. :)

------
mh-
this looks fantastic. thanks for releasing it.

the UI is quite polished

~~~
sakers
You're welcome! We did a second feature release of our Ruby version today that
has even more UI goodness. Basically we've added the ability to group
categories of monitors under one dashboard. You can then switch between
categories using carousel controls or directly from a drop-down. We're hoping to
open source this version soon and crossing our fingers that the Ruby version
will see more collaboration from outside developers.

~~~
mh-
sounds really cool. looking forward to trying this out

