This seems to be basically a collector that sends data to Datadog (a non-open-source service that costs $15/host/month). I recommend setting up Graphite+statsd instead (a truly excellent open-source realtime metrics setup).
> but optimizes for shortest time to app metrics viz.
I'm not sure I understand what "shortest time to app metrics viz" means. Are you referring to the time Pup takes to create charts? Or the finest granularity (minutes, seconds, etc.) it can handle?
Graphite defaults to reporting at 1-minute intervals, and can report at a finer granularity (1 sec. intervals) if you tweak a few settings and set up a cluster. I'm just wondering how Pup differs in this respect.
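(For reference, the granularity knob lives in carbon's storage-schemas.conf; here's a sketch of a 1-second retention policy, where the pattern and retention windows are only examples to adjust for your own metrics:)

    [stats]
    pattern = ^stats\.
    retentions = 1s:6h,1m:7d,10m:5y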
I meant it as a shortest time from "I wish I could see my app metrics" to actually seeing them. In other words, ease of set-up, lack of knobs, having metrics appear automatically.
By default, pup aggregates statsd data in 10s intervals, and the UI refreshes it continuously.
Excellent project! If I understand this right, it's an addition/replacement for StatsD that adds the graphing normally done in Graphite. I've tried to get the whole Graphite/StatsD stack setup before, and I could never get Graphite together quite right.
Somewhat off-topic, but what value is there in using statsd rather than talking to carbon, the graphite backend, directly? Statsd receives metrics over udp; carbon-cache can receive them over tcp, udp, or amqp. Statsd aggregates metrics and flushes them at set intervals; if you want this, carbon-aggregator can do it.
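For comparison, talking to carbon directly is just writing "metric.path value timestamp" lines to its plaintext listener; a rough Python sketch (host and port are the usual defaults, adjust for your setup):

    import socket
    import time

    def send_to_carbon(path, value, host="localhost", port=2003):
        # carbon-cache's plaintext protocol: one "path value timestamp" line per metric
        line = "%s %s %d\n" % (path, value, int(time.time()))
        sock = socket.create_connection((host, port))
        sock.sendall(line.encode("ascii"))
        sock.close()

    send_to_carbon("app.requests.count", 42)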
Other, more experienced people can speak to this, but I'd argue that statsd has a useful ontology for classifying metrics based on what you want them to track.
If you're watching a number go up, use a statsd.Counter. Want to track a level that rises and falls, like queue depth? Use statsd.Gauges. Want to record times for request fulfillment? Use statsd.Timers.
Statsd isn't necessarily better or worse than carbon-cache via UDP, but it provides a handy solution for the above use cases.
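To make that concrete, here's roughly what the three types look like on the statsd wire protocol, using a raw UDP socket rather than any particular client library (metric names and port are just examples):

    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    addr = ("localhost", 8125)  # statsd's default UDP port

    sock.sendto(b"signups:1|c", addr)          # counter: a number that only goes up
    sock.sendto(b"queue.depth:42|g", addr)     # gauge: a level that rises and falls
    sock.sendto(b"request.time:320|ms", addr)  # timer: how long something took, in ms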
That's what it looks like. I'm not sure I really care much for it, other than the fact that it no longer appears to use the Graphite URL api. Personally, I'd rather have graphite-web running and then use an alternative frontend like Descartes[1] or Graphiti[2].
Still, it's pretty good that there is an alternative frontend, if nothing else but to expand the graphite ecosystem.
One thing I find disappointing is that you can't delete servers once they've been added. Especially since they count towards your limit. Unless I'm missing something?
They'll disappear automatically when you stop sending data about them (within a few hours) - we've done that for our customers who have elastic / rolling servers on AWS for ex.
If that doesn't cut it for you, we're happy to help - let us know what you're trying to do!
That's kind of cool :) But what about the case where I have an important server that's down for a few hours? Do I lose all the data previously collected?
Nope - all the data stays there, and can be shown on dashboards. We just don't show the hosts that don't submit data in the host lists, and don't count them towards the total number of hosts charged for. Does that answer your question?
Does it only integrate with statsd, or could it aggregate metrics from collectd, for example? This could be an alternative to Graphite for graphing metrics, because Graphite is a pain in the ass to set up (no proper installer), comes with a lot of bloat, and is in a semi-broken state regarding features right now.
Pup will graph all the metrics Datadog does - it already collects OS metrics, and comes with a set of plug-ins for connecting to common apps, and it wouldn't be very difficult to connect collectd to it.
Pup's feature-set is limited today, though, compared to Graphite. You may also check out the full http://datadoghq.com service for more. It's SaaS, and free up to 5 servers.
So:
- It doesn't need to run on the same server as the application, and you can have any number of apps reporting to it. The only limitation here is that apps communicate with Pup over UDP.
- Our goal with pup was to make it super-easy to see app metrics. It is therefore much narrower in scope than services like New Relic or projects like Graphite. It's open-source, though, and you can take it anywhere you want.
- We ourselves operate a service that can consume data from Pup and other sources, and provides metrics + events aggregation and correlation, fancy graphing, alerting, etc. You can check it out at http://datadoghq.com
- DogStatsD is running on my server and is collecting data.
- I can use "dogstatsd-ruby" to collect data from my (e.g.) Rails app.
- DogStatsD reports stats to your service, Datadog HQ (optional).
- Pup is a small version of Datadog HQ that runs locally on my server and connects directly to DogStatsD.
- If I feel Pup doesn't satisfy my needs, I can switch from Pup to Datadog HQ, get more features, and give you my money? :)
In a nutshell, yes. You can also contribute some of the features you need to pup if you are so inclined.
We're on a mission to provide monitoring that doesn't suck, and we believe that making it easy and rewarding to instrument your app is an important step on the way.
We designed pup to be first and foremost accessible to developers, but it will work just the same on production systems.
Once you get addicted to metrics and want more aggregation / graphing / alerting / analysis capabilities, there's a number of open-source components you can pipe your statsd data into. Or you can use our own http://datadoghq.com service for that.
Statsd accepts data over UDP, so you can run it on whichever server you like, and the logging is asynchronous and takes a few microseconds (last time I counted). I haven't had a chance to use it in production extensively, but I tested it a bit and liked it very much.
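For example, a minimal fire-and-forget wrapper might look like the sketch below (the statsd hostname is hypothetical, and errors are deliberately swallowed so instrumentation can never block or crash the app):

    import socket

    STATSD_ADDR = ("metrics.internal", 8125)  # hypothetical statsd host
    _sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def incr(name):
        # One UDP datagram, no reply expected, failures ignored.
        try:
            _sock.sendto(("%s:1|c" % name).encode("ascii"), STATSD_ADDR)
        except socket.error:
            pass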
This looks potentially interesting, but I can't really evaluate it without thorough and accurate documentation. It would be nice if there were more than just instructions for downloading it and a trivial use case. Without it, your work's not done.
I was able to get it running against an existing statsd installation trivially, and explore it to my heart's content on my existing metrics. And I can look at the source - what more do you want? That was the easiest 3rd-party dashboard setup I've ever experienced.
If your needs are trivial, then it's probably adequate. Not everyone's needs are, though. I'm trying to evaluate it in the context of needing an enterprise-grade dashboard solution, with tens of thousands of metrics across thousands of hosts, and robust filtering/aggregation capability.
To my knowledge nobody has really solved this problem yet in a satisfactory way, outside some closed-source solutions at big internet companies.
otterley - for that, you can take a look at http://datadoghq.com. We do all of the above.
Our service isn't free beyond 5 hosts, but it is quite a bit cheaper than rolling and running your own or flying blind and facing the consequences. It's a hard problem, and we're on a mission to solve it for companies that don't start with goo* or end in *ook.
I looked at Datadog, but they still suffer from the same malady everyone else does: inspired by Cacti and Graphite, they use a flat namespace for metric names (e.g. interfaces.eth0.pkts) and host is the only supported dimension you can query against. Unfortunately, none of the monitoring and analytics startups I'm aware of have support for arbitrary dimensions you can query or aggregate against (host, interface, disk, datacenter, etc.).
Without some understanding of the dimensions of the data, it is very difficult to compose dashboards or aggregation rules that have arbitrary filters or can update automatically when new components are added.
I really ought to elaborate in a blog post someday :)
otterley - Don't walk away yet... we do support arbitrary dimensions!
You can attach any arbitrary set of tags to metrics or events - on a per-datapoint basis - and slice / dice / alert based on those tags. Datadog will automatically tag your points by chef role or AWS availability-zone, for example, and you can add any other tag you want. Tags don't have to be tied to a host; they can also relate to a specific volume, mysql index, etc.
(Note to HN, this is a feature of datadoghq - pup will gladly collect and filter on tags, but won't aggregate them, yet)
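(If you're curious what that looks like on the wire, dogstatsd extends the statsd format with a trailing "|#..." list of tags on each datapoint; the metric and tag names below are only examples:)

    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    addr = ("localhost", 8125)  # dogstatsd listens on the usual statsd UDP port

    # Same statsd payload as before, plus per-datapoint tags after "|#"
    sock.sendto(b"app.checkout.latency:87|ms|#role:web,availability-zone:us-east-1a", addr)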
But tags aren't dimensions. Tags are merely a list of dimensionless values and completely lack the context that's essential for dashboard construction and value aggregation.
Suppose I apply the tags "ord", "foo.example.com" and "bar.example.org" to a metric. How does the dashboard builder or the aggregator know what they are? They're values without the fields.
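To sketch the difference:

    # Bare tags: values with no field names - a dashboard builder can't tell
    # a datacenter from a hostname.
    tags = ["ord", "foo.example.com", "bar.example.org"]

    # Dimensions: key/value pairs you can group by or filter on without guessing.
    dimensions = {"datacenter": "ord", "host": "foo.example.com"}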
Pup will work with any existing statsd client - it just replaces the node.js statsd server. In other words, you can keep all your code and apps untouched (FYI I'm not the author, but a Datadog co-founder)
A no-hassle, real-time display of custom metrics for developers: 1 command to run and you can watch your app in action, as long as you instrument it with a statsd client (pretty much standard these days).
So, no logs to parse to get any app metric, everything already graphed for you in real-time, quasi-nil setup.
We chose statsd because it means that you don't have to change your code to get it to production. You can then rely on pup for display (but no historical data), or on your internal tool chain (e.g. graphite et al.) or give our service (http://www.datadoghq.com) a try if you don't want to set your own stuff up.