
Time series database Graphite seems to be falling into disfavor - js2
https://www.vividcortex.com/blog/2015/11/05/nobody-loves-graphite-anymore/
======
dmourati
Obligatory response from Graphite contributor and author Jason Dixon who is
shouted out in the vividcortex intro.

[http://obfuscurity.com/2015/11/Everybody-Loves-Graphite](http://obfuscurity.com/2015/11/Everybody-Loves-Graphite)

Personal response:

I've used Graphite, OpenTSDB, Ganglia, Cacti, and a bunch more solutions.
Recently, work transitioned from OpenTSDB to a hosted solution from a startup
called Wavefront.

[https://www.wavefront.com/](https://www.wavefront.com/)

This has been a smash hit.

Scale matters. If you are a small shop and can work around Graphite's
atrocious UI shortcomings, by all means go for it. As you grow larger,
OpenTSDB starts to look attractive. We went too far down that path but were
able to transition quickly (< 3 months) to Wavefront and haven't looked back.

~~~
duaneb
A closed silo seems like a distinct step backwards compared to a stack you can
maintain yourself.

~~~
dmourati
Read a little more into the second part of your sentence, especially the word
"maintain."

------
chaotic-good
I've been building my own time-series database for some time.

[https://github.com/akumuli/Akumuli](https://github.com/akumuli/Akumuli)

It's a standalone solution with no dependencies on other services or
databases. It can handle more than a million data points per second and uses a
fixed amount of memory, storing data on disk without Graphite's shortcomings
(everything is compressed, and timestamps are stored with nanosecond
precision).

I'm focused mostly on real-time time-series analysis. Akumuli has a built-in
anomaly detector: EWMA, Holt-Winters, and sketch-based methods. It can
generate different time-series representations, like SAX and PAA. DTW and
correlation search are in progress now.
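Since the comment is short on detail, here is a minimal sketch of the EWMA
method (not Akumuli's actual implementation; `alpha` and the 3-sigma threshold
are arbitrary illustration values):

```python
def ewma_anomalies(points, alpha=0.3, threshold=3.0):
    """Flag points that deviate too far from the running EWMA."""
    mean = None
    var = 0.0
    flagged = []
    for t, x in points:
        if mean is None:
            mean = x
            continue
        dev = x - mean  # deviation from the current forecast
        if var > 0 and abs(dev) > threshold * var ** 0.5:
            flagged.append((t, x))
        # update the exponentially weighted mean and variance
        mean = alpha * x + (1 - alpha) * mean
        var = alpha * dev * dev + (1 - alpha) * var
    return flagged

spike = ewma_anomalies([(0, 10.0), (1, 11.0), (2, 9.0), (3, 10.0),
                        (4, 11.0), (5, 9.0), (6, 10.0), (7, 11.0),
                        (8, 9.0), (9, 10.0), (10, 100.0)])
print(spike)  # only the jump to 100 is flagged, not the normal jitter
```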

------
gtrubetskoy
I've been in search of a good time-series database solution for some time now,
and have pretty much given up on it and am in the process of rolling my own:
[https://github.com/grisha/timeriver](https://github.com/grisha/timeriver)

My issue with the present state of TS isn't volume. I was looking to use TS
outside of the DevOps world. Everyday things like your heart rate over time,
the price of gas at the nearest station, the number of people in line at your
coffee shop, etc. All of these are interesting, yet
Graphite/RRDTool/InfluxDB/etc. did not seem like appropriate storage, because
to use it with your other data (which is most likely in a relational DB of
some kind) you need to export/import it, and who wants that?

I call this problem "data seclusion". When data exists in some kind of an
incompatible format (e.g. Whisper files), it will end up ignored because of
that extra conversion step necessary to link it with your other data. Data in
Graphite and such is mostly good for generating charts, but TS analysis is so
much more than that, even at its simplest.

I think that the good old relational database is fine storage for TS and we
gave up on it way too early, especially given what's new in PostgreSQL. Making
it horizontally scalable, distributed, using consensus protocols, etc - these
are not _time series_ problems, these are _database_ problems and we do not
yet have a good solution for these. (We have many that "kind of" work, support
some features but not others e.g. Cassandra). Projects like InfluxDB are mired
in solving the wrong problem which will eventually get solved at the DB level.
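A sketch of that argument: a time series in a plain relational table, with
the stdlib sqlite3 module standing in for PostgreSQL. The table and series
names are made up:

```python
import sqlite3

# sqlite3 stands in for PostgreSQL here; schema is illustrative only.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE ts (
        series TEXT NOT NULL,     -- e.g. 'heart_rate'
        t      INTEGER NOT NULL,  -- unix timestamp
        value  REAL NOT NULL,
        PRIMARY KEY (series, t)
    )
""")
db.executemany(
    "INSERT INTO ts VALUES (?, ?, ?)",
    [("heart_rate", 1000, 62.0),
     ("heart_rate", 1060, 71.0),
     ("heart_rate", 1120, 69.0)],
)

# The payoff: ordinary SQL over the same store as the rest of your
# data, with no export/import step.
row = db.execute(
    "SELECT avg(value) FROM ts WHERE series = ?", ("heart_rate",)
).fetchone()
print(row[0])
```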

More thoughts on the subject: [http://grisha.org/blog/2015/03/28/on-time-series/](http://grisha.org/blog/2015/03/28/on-time-series/), [http://grisha.org/blog/2015/09/23/storing-time-series-in-pos...](http://grisha.org/blog/2015/09/23/storing-time-series-in-postgresql-efficiently/) and [http://grisha.org/blog/2015/05/04/recording-time-series/](http://grisha.org/blog/2015/05/04/recording-time-series/)

~~~
cbsmith
I highly recommend looking at Keogh's work, particularly the iSAX and iSAX2
stuff:
[http://www.cs.ucr.edu/~eamonn/SAX.htm](http://www.cs.ucr.edu/~eamonn/SAX.htm)

While it is used for machine learning, it actually makes a lot of sense to
follow a similar approach for more general time series applications.

~~~
chaotic-good
SAX requires a different type of storage. You can store SAXified time series
in Elasticsearch or Solr, but a time-series database isn't a good fit for
this. Time-series databases should be able to generate the SAX representation
themselves.
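For context, a minimal sketch of the SAX transform itself: z-normalize,
average into PAA segments, then map each segment to a letter via approximate
Gaussian breakpoints (this is the textbook method, not Akumuli's code):

```python
def sax(series, segments=4, breakpoints=(-0.67, 0.0, 0.67)):
    """z-normalize, reduce with PAA, then map segments to symbols."""
    n = len(series)
    mean = sum(series) / n
    std = (sum((x - mean) ** 2 for x in series) / n) ** 0.5 or 1.0
    z = [(x - mean) / std for x in series]
    # PAA: mean of each equal-sized segment (n must divide evenly here)
    step = n // segments
    paa = [sum(z[i * step:(i + 1) * step]) / step for i in range(segments)]
    # each PAA value becomes a letter: 'a' below the lowest breakpoint, etc.
    alphabet = "abcd"
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)

word = sax([1, 1, 2, 2, 8, 8, 9, 9])
print(word)  # low-low-high-high shape -> "aadd"
```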

~~~
cbsmith
> You can store SAXified time series in ElasticSearch, or Solr but time-series
> database doesn't fit for this.

That's making some presumptions about the time-series database use case. SAX
is convenient for storing and retrieving the data as well as identifying
trends or recurring behaviour. What more do you need?

~~~
chaotic-good
You can't retrieve the original time-series data from SAX storage because of
normalization. To query time-series data by content (approximate 1-NN, motif
discovery, etc.) you need an inverted index.

~~~
cbsmith
The normalized data is the index. It can still point to the raw data. The
nice thing is that the index organizes the data in a way that makes it easily
(losslessly) compressible.

~~~
chaotic-good
> It can still be pointing to the raw data.

Yep. Each SAX word should be mapped to a list of seriesid:timestamp pairs.
This list is often referred to as a postings list in information retrieval.
The resulting data structure is an inverted index. The SAX and iSAX papers
describe an inverted-index variant (a really bad one) based on a folder
structure, but one can use existing IR tools for this.
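A minimal sketch of that structure, using a plain dict instead of the papers'
folder layout; all names here are illustrative:

```python
from collections import defaultdict

# Each SAX word maps to a postings list of (series_id, timestamp) pairs.
index = defaultdict(list)

def add_window(word, series_id, timestamp):
    """Record that series_id's window at `timestamp` reduced to `word`."""
    index[word].append((series_id, timestamp))

def lookup(word):
    """Exact-match query: all windows that share this SAX word."""
    return list(index.get(word, []))

add_window("aadd", "cpu.host1", 1000)
add_window("aadd", "cpu.host2", 1010)
add_window("abcd", "cpu.host1", 1060)
print(lookup("aadd"))  # both windows whose shape normalized to "aadd"
```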

~~~
cbsmith
iSAX2 yields a much better inverted index style model.

The thing is, _particularly_ with time series data, a lot of times it is
sufficient to at least _start_ with the summary data in the index.

------
mbell
I actually love Graphite's API. I've yet to find another solution that is as
easy to use and has as many available functions
([http://graphite.readthedocs.org/en/latest/functions.html](http://graphite.readthedocs.org/en/latest/functions.html)).
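A taste of that API: functions compose directly inside the target expression
of a render URL. summarize, sumSeries, and alias are real functions from the
linked reference; the host and metric names below are made up:

```python
from urllib.parse import urlencode

# Build a render URL that sums a wildcard of series, rolls the result
# into 5-minute buckets, and relabels it - all server-side.
target = ('alias(summarize(sumSeries(servers.web*.requests), '
          '"5min", "sum"), "reqs per 5min")')
url = "http://graphite.example.com/render?" + urlencode(
    {"target": target, "from": "-1h", "format": "json"}
)
print(url)
```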

For that reason we still use Graphite's API, but not the UI or datastore.
Neither Whisper nor whatever the newer one is would survive the load we put
on it.

Our setup looks like:

grafana -> graphite-api -> graphite-influxdb -> InfluxDB <- statsite (C impl of statsd) <- metrics

~~~
js2
Is it the case even with the latest Grafana (2.5) that you still get more
functionality by running graphite-influxdb as a middleman?

I have Grafana 1.9 pointed directly at InfluxDB 0.8 and here's one thing you
_can't_ do: group a series by X, then plot only the top Y groups. Is that the
sort of thing that using graphite-influxdb provides?

~~~
mbell
There are a lot of things missing from Influx's query language. The most
glaring issue is that you can't do 'multi-step math', e.g. something like
'average series A and series B into 5-minute buckets, then sum them'.
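Done client-side, that missing multi-step math is trivial; a sketch with
plain dicts standing in for query results:

```python
from collections import defaultdict

def avg_5min(points):
    """Average (timestamp, value) points into 5-minute buckets."""
    buckets = defaultdict(list)
    for t, v in points:
        buckets[t - t % 300].append(v)
    return {b: sum(vs) / len(vs) for b, vs in buckets.items()}

def sum_series(*bucketed):
    """Pointwise sum of bucketed series (missing buckets count as 0)."""
    out = defaultdict(float)
    for series in bucketed:
        for b, v in series.items():
            out[b] += v
    return dict(out)

a = [(0, 2.0), (60, 4.0), (300, 6.0)]
b = [(0, 10.0), (310, 20.0)]
result = sum_series(avg_5min(a), avg_5min(b))
print(result)  # {0: 13.0, 300: 26.0}
```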

------
lumpypua
What does a modern metrics stack look like? There are so many names...
Grafana, Kibana, StatsD, Graphite, etc. Which bits should I choose and wire
together?

~~~
Karunamon
I've got to advise against Kibana/Elasticsearch until the company gets more
mature. I was 100% on board with them up until the redesign.

Not so much for the _design_ (oh look it's white instead of black, who cares),
but the way they've handled it subsequently.

The 3.x -> 4.x transition for Kibana left a product that was missing really
basic features (like the ability to set graph colors, for one - a bug/feature
request that's been open since last January).

As it stands, upgrading from Kibana 3 to Kibana 4 is a step _backwards_. You
lose functionality rather than gaining it.

They also decided to require a major version bump to Elasticsearch (to 2.x)
with a point release of Kibana.

I used to be super optimistic about the Elastic guys, but some of these
decisions are just head-scratchingly awful.

~~~
meowface
After many years of using Splunk at my job, switching to ELK for personal
project use was quite a disappointment for me. Of course, ELK is free and
Splunk is very expensive, but I was still surprised at the gap.

This might be derailing the thread a bit, but is there any log management
platform like ELK or Splunk that has an expressive and versatile query
language like Splunk's? My biggest issue with ELK is that analytics is mostly
expected to be done through Kibana's GUI, while with Splunk you can craft
terse queries to do almost any sort of transformation and visualization
imaginable. I don't like how ELK is so GUI-oriented.

~~~
glial
Not sure what kind of queries you want to do, but Elasticsearch has a fairly
extensive Query DSL that'll let you do all sorts of aggregations:

[https://www.elastic.co/guide/en/elasticsearch/reference/curr...](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html)

There is a Python implementation that makes creating complex queries pretty
easy:

[https://github.com/elastic/elasticsearch-dsl-py](https://github.com/elastic/elasticsearch-dsl-py)

I agree with the criticisms of Kibana, but I have had no problems querying
Elasticsearch directly. It also supports scripted queries if the built-in
aggregations aren't enough.

Of course, then you have to build your own visualizations with the results...

~~~
meowface
I'm not sure I'd consider that a DSL. It's unwieldy to write a full JSON
object for every one-off query I want to do.

Using one of their examples, this:

    
    
        {
          "query": {
            "filtered": {
              "query": {
                "bool": {
                  "must": [{"match": {"title": "python"}}],
                  "must_not": [{"match": {"description": "beta"}}]
                }
              },
              "filter": {"term": {"category": "search"}}
            }
          },
          "aggs" : {
            "per_tag": {
              "terms": {"field": "tags"},
              "aggs": {
                "max_lines": {"max": {"field": "lines"}}
              }
            }
          }
        }
    

would be the following Splunk query:

    
    
        title=python description!=beta | stats max(lines) by tags
    

It would be nice if there were some kind of query compiler that could
generate ES JSON from an expressive query language.
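A toy sketch of such a compiler, handling only the field=value /
field!=value part of the example above (the pipe/stats stage is left out):

```python
def compile_query(expr):
    """Translate a tiny Splunk-like filter into ES bool-query JSON."""
    must, must_not = [], []
    for term in expr.split():
        if "!=" in term:
            field, value = term.split("!=", 1)
            must_not.append({"match": {field: value}})
        else:
            field, value = term.split("=", 1)
            must.append({"match": {field: value}})
    return {"query": {"bool": {"must": must, "must_not": must_not}}}

print(compile_query("title=python description!=beta"))
```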

~~~
illumen
The linked Python DSL does pretty well. Not as nice as your reduced DSL, but
close.

    
    
      from elasticsearch_dsl import Search, Q
      # 'client' is an elasticsearch.Elasticsearch instance
      s = Search(using=client, index="my-index") \
          .filter("term", category="search") \
          .query("match", title="python") \
          .query(~Q("match", description="beta"))

------
KyleBrandt
With [http://bosun.org](http://bosun.org) (Stack Overflow's alerting system:
expressions, notification templates, historical testing, etc.) we have hedged
our bets by adding different query functions to the expression language.
Currently it can query:
Currently it can query:

\- OpenTSDB

\- InfluxDB

\- Graphite

\- Elastic (Expects to be populated by logstash)

OpenTSDB was the original time-series backend, so it has some extra UI
features for graphing compared to the others. My hope is that InfluxDB will
mature to the point, availability-wise, that we can use it and ditch the
HBase dependency. But currently OpenTSDB is the best option for us.

------
jrv
If you prefer running an open-source solution yourself, Prometheus
([http://prometheus.io/](http://prometheus.io/)) addresses exactly those
points and works especially well in a dynamic cloud / microservices /
container scheduler world.

It has a dimensional data model, a powerful query language to go with it, and
covers aspects from instrumentation to storing data, all the way to alerting
and dashboarding. The latest version of Grafana has native Prometheus support
now too. Many tools (like Kubernetes or etcd) already export Prometheus
metrics natively, so you can monitor them with Prometheus right out of the
box. Support for many kinds of service discovery (Kubernetes, Marathon, EC2,
Consul, ...) make it work very well to monitor dynamically scheduled services
as well. Disclaimer: Prometheus author.
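As a taste of the dimensional model, here is a sample rendered in
Prometheus's text exposition format; the metric and label names are made up:

```python
def exposition_line(name, labels, value):
    """Render one '<name>{<labels>} <value>' exposition-format line."""
    body = ",".join('%s="%s"' % (k, v) for k, v in sorted(labels.items()))
    return "%s{%s} %s" % (name, body, value)

line = exposition_line("http_requests_total",
                       {"method": "post", "code": "200"}, 1027)
print(line)  # http_requests_total{code="200",method="post"} 1027
```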

~~~
js2
I chose InfluxDB about a year ago because it was the only option that allowed
me to put each individual event into a time series and then do the
grouping/counting after the fact. The Prometheus docs used to say it wasn't
suited to that use case. Is that still the case?

edit: yup, the Prometheus docs indicate InfluxDB is better suited to this use
case.

[http://prometheus.io/docs/introduction/comparison/](http://prometheus.io/docs/introduction/comparison/)

Aside, I really appreciate this comparison page and the prometheus docs in
general are well done.

~~~
jrv
Yes, Prometheus is fundamentally a store for numeric time series (with a set
of dimensions attached), not a store for individual events or log entries.

------
ceejayoz
It's buried about 10 paragraphs down, but it should be noted the author is a
Graphite competitor.

------
discordianfish
It looks like they (deliberately?) didn't mention the open-source projects
that were specifically built to address Graphite's shortcomings, like
Prometheus or OpenTSDB.

~~~
late2part
The author most likely didn't mention them because they suffer from the same
limiting assumptions that Graphite does. When you're dealing with
multidimensional, changing metrics, OpenTSDB has the same issues Graphite
does.

~~~
discordianfish
Graphite's data model doesn't support multidimensional metrics, which is in
my opinion its biggest shortcoming. To support multidimensional data, a
suitable data model is needed. AFAIK InfluxDB stores all label dimensions
along with the value for each data point. No matter how high your cardinality
is, the storage requirement is the same. If you need, let's say, a
'client_ip' dimension on an http_response_time metric, that's a reasonable
model. I'd argue, though, that outside analytics/data warehousing such high
cardinality is rarely needed. That's why Prometheus (can't speak for OpenTSDB)
stores the dimensions identifying a metric (metric key/label pairs) once; for
each data point it just needs to store the value, which makes reading and
writing less expensive. However VividCortex stores its metrics, it has to
choose from similar trade-offs. Sure, they host it for you - still, there are
open-source projects out there addressing exactly these issues.
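A sketch of that layout: the identifying label set is stored once per series,
and data points carry only (timestamp, value). The structure is purely
illustrative:

```python
series_ids = {}  # frozenset of (label, value) pairs -> series id
points = {}      # series id -> list of (timestamp, value)

def append(labels, t, v):
    """Store the label set once; points cost only a timestamp and value."""
    key = frozenset(labels.items())
    sid = series_ids.setdefault(key, len(series_ids))
    points.setdefault(sid, []).append((t, v))

# two points in the same series cost one stored label set, not two
append({"__name__": "http_response_time", "path": "/"}, 0, 0.12)
append({"__name__": "http_response_time", "path": "/"}, 15, 0.09)
print(len(series_ids), len(points[0]))  # 1 2
```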

~~~
jrv
Though I'm a Prometheus author, I need to defend InfluxDB here :) Since
0.9.0, they support tags (vs. fields), which store indexed dimensions
similarly to how Prometheus does. So this argument no longer holds. Still,
there are many other differences in scope and functionality between the two
systems.

~~~
discordianfish
So InfluxDB is now ready to be used as Prometheus's long-term storage?

------
anton_gogolev
I dunno. It really rocks. And we were so envious that Linux folks have
Graphite/StatsD that we've ported it to .NET/Windows:
[https://bitbucket.org/aeroclub-it/statsify](https://bitbucket.org/aeroclub-it/statsify)

------
yeukhon
Graphite is ugly. Graphite is hard to use beyond just clicking a few graphs.
Graphite requires some hard work to extend. But I am still using Graphite
because

1) it is simple to set up!

2) it's nice to have a fixed-size flat file as the database, although
performance degrades very quickly - cache misses are too frequent

3) sending data over a raw socket is also quite attractive
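Point 2 is the round-robin idea behind Whisper's fixed-size files; a minimal
in-memory sketch (not Whisper's actual on-disk format):

```python
class RoundRobin:
    """Fixed-size archive indexed by timestamp modulo its span."""

    def __init__(self, step=60, slots=10):
        self.step = step
        self.buf = [None] * slots  # allocated up front, never grows

    def _slot(self, t):
        return (t // self.step) % len(self.buf)

    def write(self, t, value):
        # old data in the slot is simply overwritten in place
        self.buf[self._slot(t)] = (t - t % self.step, value)

    def read(self, t):
        entry = self.buf[self._slot(t)]
        # a slot only counts if it still holds this timestamp's bucket
        if entry and entry[0] == t - t % self.step:
            return entry[1]
        return None

rr = RoundRobin()
rr.write(0, 1.0)
rr.write(600, 2.0)  # ten minutes later: wraps around onto slot 0
print(rr.read(600), rr.read(0))  # 2.0 None
```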

No other solution can compete with Graphite for its simplicity. But there is
so much more than just sending data to graphite from collectd, or your custom
Python program....

Whenever I see OpenTSDB I sigh. Do we really need Hadoop technology here?
Yeah, I work for a small shop, but do I really want to maintain OpenTSDB...
when I already have so many databases? I am choosing between Cassandra and
PostgreSQL for my TSDB.

~~~
erbdex
At the scale where it starts breaking, one doesn't consume the metrics
directly. In our setup of over 500K metrics/min, we had scripts consuming app
metrics from Graphite and pushing them back into a different, higher-level
hierarchy, which was then consumed by graphiti/grafana.

The Graphite URL API was visionary, and it still makes the cut in most common
use-cases.

------
cbsmith
Live by the sword, die by the sword.

It is kind of amazing how hard it is for existing time series database systems
to track the changing needs of the marketplace.

~~~
obfuscurity_
You mean for open source time series database systems run entirely by
volunteers in their spare (personal) time? I know, right? What jerks.

~~~
cbsmith
What? I'm not saying they're jerks. I'm pointing out that it's harder for
existing projects to track changes in the marketplace. It's _hard_ for a new
open source project to launch and build up enough of a community to be
viable.

------
jsudhams
Can someone change the heading so it's understood as the software and not
actual graphite?

~~~
dang
Ok, we changed the title to use representative language from the article.

Btw, this article was heavily flagged. It's not really legit to flag a story
just because people don't like the title. Plenty of good stories have
problematic titles. Depriving others of a chance to read the content,
especially when there's a good discussion going on in the thread, is a bad use
of flagging power.

~~~
mintplant
I didn't flag it, but perhaps others did because the author is a Graphite
competitor, yet this isn't disclosed until the end.

~~~
cwyers
It's not like this is a Medium post or something, it's on the company blog.

------
IIAOPSW
Here I was thinking I would read some informative article on graphite,
graphene and related carbon derivatives. Dear programmers: pick better names.

~~~
devdas
Naming is a hard problem in computer science :)

~~~
jabl
Like they say, the two fundamentally difficult problems in CS are cache
invalidation, naming things, and off by one errors.

