

All The Metrics! Or How You Too Can Graph Everything. - pmoriarty
http://sysadvent.blogspot.com/2011/12/day-23-all-metrics-or-how-you-too-can.html

======
SEJeff
Graphite co-maintainer here. We are working through the issues to finish up
the 0.9.13 release now, the last of 0.9.x. There are some exciting features
coming up in the master branch (future 0.10) as we slowly merge megacarbon
into master. We are also slowly upping the test coverage one project at a time
(I spent some considerable effort on whisper over the Christmas break) to give
us more confidence when merging contributions from our huge user community.

So TL;DNR: if you've got graphite questions, feel free to ask here :)

~~~
pmoriarty
How would you respond to the criticisms of graphite in a previous graphite-
themed HN thread? [1]

Some examples:

    
    
      "FWIW I've found influxdb considerably easier to install and manage
      than graphite (graphite doesn't play well with virtualenv, which
      makes dependency management horrible, compared to influxdb's single
      static binary)
    
      Also, I can see logging dictionaries being much more efficient and
      useful than logging single values -- with graphite if you want to
      track page hits per section of your site (of which you have 10) per
      user (100) per browser (5), you end up with 5000 individual metrics,
      and you need to have thought of them in advance. With influxdb you
      can log {"section": "front page", "user": "bob", "browser":
      "firefox", "hits": 1} as a single metric and then use an SQL-like
      query to filter by section / user / browser (or any combination of
      those) as and when you want to."[2]
    
      "I've spent the last week working on upgrading our Graphite
      system. I ultimately killed it and went with InfluxDB. The ease of
      installation and cluster creation were clear winners.
    
      Additionally the storage options for Influx trump Graphite across
      the board. I tried writing a custom backend and it went nowhere. The
      docs and code are terrible. I also noticed that Ceres hasn't had a
      commit in a year - kind of disheartening."[3]
    
      "I haven't had the best experience with Graphite. Namely, our main
      systems practically never crash but Graphite does fall over every
      few months. Seriously, Graphite is less reliable than the systems we
      use it to monitor."[4]
    
      "We've had the same problems."[5]
    
      "My biggest problem with Graphite was that it managed to grind an
      expensive large RAID array into the ground with a relatively small
      number (in my eyes) of metrics. We had the realisation that we'd
      waste a tremendous amount of hardware or have to cut down
      drastically on our data collection if we were to roll out Graphite
      across the board.
    
      (And yes, we had crashes too)
    
      The reason for the disk grinding was simple: The whisper storage
      system is ridiculously inefficient as it does tiny writes all over
      the places, and an excessive number of system calls to boot."[6]
    
    

[1] - [https://news.ycombinator.com/item?id=8739208](https://news.ycombinator.com/item?id=8739208)

[2] - [https://news.ycombinator.com/item?id=8739784](https://news.ycombinator.com/item?id=8739784)

[3] - [https://news.ycombinator.com/item?id=8740058](https://news.ycombinator.com/item?id=8740058)

[4] - [https://news.ycombinator.com/item?id=8739391](https://news.ycombinator.com/item?id=8739391)

[5] - [https://news.ycombinator.com/item?id=8742242](https://news.ycombinator.com/item?id=8742242)

[6] - [https://news.ycombinator.com/item?id=8739465](https://news.ycombinator.com/item?id=8739465)

~~~
SEJeff
Ooooohhh a hard one. I better get an upvote for this (joking)!

So WRT installation, it is a PITA to install graphite as it follows an
antipattern of hardcoding /opt into setup.cfg, which we'll be changing in the
0.10 release (to make installing via normal python tools like virtualenv and
pip work). That being said, there are really 3 main components to graphite:

1. Whisper, the "improved" RRD.
2. Carbon, the relay and caching daemon that writes out metrics (as whisper by
   default).
3. Graphite, which is simply a webui for reading data from carbon and creating
   graphs, or returning JSON.

The only part that is actually interesting is Graphite, whereas carbon and
whisper can be thought of more as implementation details from when Chris Davis
first wrote graphite. There is a large collection of tools that work with
carbon's super simple text-based line protocol. Additionally, there are tons
of tools that work with the json or png graph data that graphite-web returns.
As an ecosystem, we've made it so that swapping out for a different backend is
trivial.
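
For anyone who hasn't seen it, carbon's plaintext protocol is one line per
datapoint: metric path, value, and Unix timestamp, separated by spaces. A
minimal sketch in Python follows. The metric name, host, and port are
assumptions (2003 is the usual default for carbon's plaintext listener, but
check your own carbon.conf), and send_metric is a hypothetical helper, not part
of any graphite library.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    # One line of the plaintext protocol: "<path> <value> <timestamp>" plus newline.
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line, host="localhost", port=2003):
    # Hypothetical helper: host and port are assumptions about your setup.
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Build (but do not send) a datapoint for a made-up metric name.
line = format_metric("web.frontpage.hits", 1, timestamp=1420070400)
print(line, end="")
```

Calling send_metric(line) would push that datapoint to a carbon listener if one
is running at the assumed address.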

""" Also, I can see logging dictionaries being much more efficient and useful
than logging single values -- with graphite if you want to track page hits per
section of your site (of which you have 10) per user (100) per browser (5),
you end up with 5000 individual metrics, and you need to have thought of them
in advance. With influxdb you can log {"section": "front page", "user": "bob",
"browser": "firefox", "hits": 1} as a single metric and then use an SQL-like
query to filter by section / user / browser (or any combination of those) as
and when you want to."[2] """

This is precisely what statsd is for. If you're not familiar with it, you
really should look into it.
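
One way to read that suggestion, as a rough sketch: turn the dictionary from
the quoted comment into a dotted counter name and send it to statsd, which
aggregates the counts and flushes them to carbon. The naming scheme
(prefix.section.user.browser), the sanitize helper, and the host/port defaults
are all assumptions for illustration; 8125/UDP is statsd's usual listener.

```python
import socket


def sanitize(part):
    # Dots separate path components in Graphite, so keep them out of values.
    return str(part).strip().lower().replace(" ", "_").replace(".", "_")


def hit_to_statsd(event, prefix="hits"):
    # Hypothetical naming scheme: <prefix>.<section>.<user>.<browser>
    parts = [prefix] + [sanitize(event[key]) for key in ("section", "user", "browser")]
    # statsd counter wire format: "<bucket>:<value>|c"
    return f"{'.'.join(parts)}:{event['hits']}|c"


def send_statsd(payload, host="localhost", port=8125):
    # Hypothetical helper: fire-and-forget UDP datagram to an assumed address.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload.encode("ascii"), (host, port))
    finally:
        sock.close()


event = {"section": "front page", "user": "bob", "browser": "firefox", "hits": 1}
print(hit_to_statsd(event))
```

With names laid out like this, a wildcard target such as
hits.front_page.*.firefox can select across users at query time, so the
combinations do not have to be enumerated up front.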

Now regarding "influx vs graphite", I honestly don't see influx as a
competitor nor will I likely ever see it as a competitor. Sure you can do some
aggregations and apply some functions to your data stored in influx, awesome,
but Influx can be used as a backend for graphite[1]. So many people don't
realize that writing a pluggable backend for graphite is pretty simple. For
super scalable backends, I'm somewhat partial to the Cyanite[2] backend which
stores metrics in Cassandra, but other people swear by opentsdb + graphite[3].

One thing I will say is that the carbon relay is not great software. At around
100k metrics per second, even tuned on excellent hardware, twisted (python)
falls over and it just eats it. I've been meaning to take a shot at rewriting
it in golang, but I don't have the free time to do that just yet. For a faster
relay/aggregator, may I point you to the C version[4], which is super
performant.

Now for anyone who says that whisper is inefficient because it does tiny
writes, perhaps they should understand the software they are attempting to use
before deploying it. Whisper is more or less RRD, but it allows backfilling old
data. Lots and lots of tiny writes are how the software it replaced worked and
how it is meant to work. That being said, there have been several pull
requests to batch the writes so that where possible, it has less of an IO
penalty. However, you fundamentally need to understand how to troubleshoot a
Linux system and tune a system / hardware for the application running on it. I
can't stress that enough.
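
As a rough illustration of why the writes are tiny: a round-robin file is a
fixed-size ring of small (timestamp, value) slots, one file per metric, and
each datapoint overwrites one slot. The sketch below is not whisper's code.
The struct formats are what I understand whisper.py to use and should be
treated as assumptions, the retention schedule is made up, and slot_index is a
generic round-robin calculation (whisper itself anchors on the first stored
point).

```python
import struct

# Struct formats as understood from whisper.py; treat these as assumptions.
METADATA_SIZE = struct.calcsize("!2LfL")    # file header
ARCHIVE_INFO_SIZE = struct.calcsize("!3L")  # one entry per archive
POINT_SIZE = struct.calcsize("!Ld")         # one (timestamp, value) pair


def file_size(archives):
    # archives is a list of (seconds_per_point, points) tuples.
    header = METADATA_SIZE + ARCHIVE_INFO_SIZE * len(archives)
    return header + sum(points * POINT_SIZE for _, points in archives)


def slot_index(timestamp, seconds_per_point, points):
    # Generic round-robin slot: old data is overwritten once the ring wraps.
    return (timestamp // seconds_per_point) % points


# Hypothetical retention: 10s resolution for 6 hours, then 1m for 7 days.
archives = [(10, 2160), (60, 10080)]
print(file_size(archives))   # bytes on disk for a single metric
print(POINT_SIZE)            # bytes per stored datapoint
print(5000 / 10)             # datapoints per second for 5000 metrics at 10s
```

Under those assumptions, 5000 metrics reporting every 10 seconds means roughly
500 separate small writes per second, each landing in a different file, which
is the access pattern the storage has to be tuned for.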

Regarding other choices, I think quite highly of a drop in graphite-web
compatible rewrite in flask named graphite-api[5] from one of our excellent
contributors who works for exoscale. Graphite api is interesting in that it is
basically just the good parts of graphite-web, namely the functions.py that do
all of the transformations. For a dashboard with that, I also suggest you look
into grafana[6], which also supports influxdb natively, but again, graphite
has a ton more options for transforming the data so they are overlapping, but
not competitors.
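
To make the "functions" point a bit more concrete: the render endpoint takes a
target expression (optionally wrapped in functions such as sumSeries) and can
return JSON instead of a PNG. The sketch below assumes a server at a made-up
base URL and a hypothetical metric namespace, and the sample response is
hand-written to show the shape of the JSON, not real data.

```python
import json
from urllib.parse import urlencode


def render_url(base, target, since="-1h"):
    # Build a render query asking for JSON output over a relative time range.
    query = urlencode({"target": target, "from": since, "format": "json"})
    return f"{base}/render?{query}"


def non_null_points(payload):
    # Each series looks like {"target": ..., "datapoints": [[value, timestamp], ...]}.
    series = json.loads(payload)
    return {
        s["target"]: [(ts, value) for value, ts in s["datapoints"] if value is not None]
        for s in series
    }


# Hypothetical server and metric names.
url = render_url("http://graphite.example.com", "sumSeries(web.*.hits)")
print(url)

# Hand-written sample response, for illustration only.
sample = '[{"target": "sumSeries(web.*.hits)", "datapoints": [[12.0, 1420070400], [null, 1420070460]]}]'
print(non_null_points(sample))
```

Empty intervals come back as null, so a consumer usually has to filter or fill
them before doing arithmetic on the values.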

Also please note that unlike Influx or many of the "competitors" of graphite,
we're a small handful of developers spread out over 2-3 continents who don't
have a ton of free time. We work on Graphite for fun and try to make things
better for the community. There are plenty of good tools != to graphite to
use, this is a good thing! #monitoringlove

Does this answer most of your questions?

[1] [https://github.com/vimeo/graphite-influxdb](https://github.com/vimeo/graphite-influxdb)

[2] [https://github.com/brutasse/graphite-cyanite](https://github.com/brutasse/graphite-cyanite) and
[https://github.com/pyr/cyanite](https://github.com/pyr/cyanite)

[3] [https://github.com/mikebryant/graphite-opentsdb-finder](https://github.com/mikebryant/graphite-opentsdb-finder)

[4] [https://github.com/grobian/carbon-c-relay](https://github.com/grobian/carbon-c-relay)

[5] [https://github.com/brutasse/graphite-api](https://github.com/brutasse/graphite-api)

[6] [http://grafana.org/](http://grafana.org/)

~~~
pmoriarty
Fantastic answer! Thank you for taking the time to answer in such detail.

You've definitely got my upvote, though you deserve far more.

~~~
SEJeff
Thanks! I should also point out the Synthesize project, which makes setting up
the entire graphite stack trivial. It was written by another one of the
Graphite co-maintainers, the famous Jason Dixon aka obfuscurity:

[https://github.com/obfuscurity/synthesize](https://github.com/obfuscurity/synthesize)

