
OpenTSDB – A Distributed, Scalable Time Series Database - jmngomes
http://opentsdb.net/
======
kev009
OpenTSDB itself is pretty easy to configure: the TSDs are stateless, and you
can spin up some for ingest and some for the frontend/Grafana etc. by
blanket-applying a small config file. You can use TCP, HTTP, or DNS load
balancing to handle failure, depending on how you ingest. It's easy enough to
build packages for, and we recently committed it to FreeBSD ports, where you
can get a whole up-to-date HDFS/HBase stack without third-party garbage.

The hard part is HBase and, to a lesser degree, HDFS: there are just a lot of
things to read and learn about, and failure to address one of them will
eventually cause interesting struggles. That said, my company surveyed the
field in 2012 and chose OpenTSDB, and it has handled things well enough
considering a fairly poor HBase configuration. We recently reevaluated the
space and tried Influx and Kairos. Influx fell on its face in several ways
that will take years to fully iron out, and Kairos trades one set of evils for
another, so we decided to do some learning and redeploy OpenTSDB correctly.
Among the open source options, it's far ahead in the battle-hardened
department.

If you want to consider it, I would advise checking out some of the
slides/videos on [http://www.hbasecon.com/](http://www.hbasecon.com/) - these
two being the most critical:

* OpenTSDB and AsyncHBase Update

* HBase Performance Tuning @ Salesforce

HBase seems pretty vibrant. There's been a lot of progress with things like
WAL shipping, and performance tweaks such as SSD block caching, off-heap
(outside the GC heap) support, and read access from replica region servers.
OpenTSDB plugs along at a slower pace, but it's much more professional than,
e.g., Influx changing storage engines five times and getting worse with each.
The OpenTSDB 2.2 release looks really exciting and fixes some hard high-rate
problems for both reads and writes:
[http://opentsdb.net/docs/build/html/new.html#id1](http://opentsdb.net/docs/build/html/new.html#id1)

In the commercial space, Circonus looks pretty interesting and is built and
recommended by people I trust. It goes a lot farther than metrics storage. I
wasn't able to get my management to do more than a video call, so I never
really dug in and evaluated it myself.

------
secure
If you’re looking into time-series based monitoring/alerting, I’ve been
delighted with [http://prometheus.io/](http://prometheus.io/).

I’m mentioning this because I previously tried to use OpenTSDB for monitoring
(back when prometheus and others didn’t exist), and found it too cumbersome to
run (not packaged in Debian, requires an underlying HBase instance, etc.) for
my taste. Many more recently created projects, such as Prometheus, come pre-
packaged in a Docker container and can run on a single host — often they’re
even a single binary.

~~~
krenoten
Prometheus is for smaller problems. You have to manage your own sharding with
Prometheus, so it doesn't work for handling large volumes of data unless you
are willing to do complex automation for a large number of disparate
replication topologies. This works fine at SoundCloud, where it came from,
because the teams there like keeping their monitoring systems isolated from
each other, since they don't trust each other.

For problems where you require a single source of truth for high volumes of
data, you can't really justify that kind of effort for the data management
when distributed databases have been created to solve this problem.

Prometheus is nice and featureful, but it makes me laugh when the proponents
(usually [ex-]soundcloud folks) brag about how scalable it is.

Side note: they love to say that because it's pull based, that magically makes
it more scalable. This is a crock of shit. Push vs. pull doesn't get you out
of capacity planning, and from this perspective they are identical in terms of
throughput requirements. Pull means one source needs to know about a ton of
endpoints; push means a ton of sources need to know about one endpoint. Which
one is simpler to operate? They say it's more scalable because when they
failed to do capacity planning properly, pull hid their dropped data more
effectively and their dashboards kept running, oblivious to the degradation.

------
freezey00
I have been reading a lot of these comments, and some of the people who ran
into issues just sound like they don't understand what they were using or
building. As many have said before, OpenTSDB is great if you understand all of
your components. HBase is somewhat of a black art. You also truly need to
understand your data; you cannot just start writing data and then be upset
with the product when it fails. OpenTSDB is no silver bullet in the metrics
game, but it sure as hell is better than most at larger scale.

We are doing roughly 1.2M writes/s right now and will likely grow to 6-8M
depending on how we chop this up. Either way, you have to apply the same level
of intelligence to this product as you do to others. I cannot stress this
enough: plan out your data. Be very strict about the pattern of how things get
written and the number of tags you allow. Your layout won't all make sense
initially, and you may have to shift things around a bit. There is an active
PR against collectd that addresses metric mapping, which can be very
beneficial for high-throughput shops.
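
The tag-planning advice above comes down to cardinality: every unique
combination of tag values is its own time series (and its own hourly row in
HBase), so tag counts multiply. A quick back-of-the-envelope check, with
hypothetical tag names and counts:

```python
# Rough series-cardinality estimate for a single metric.
# Each unique combination of tag values is a distinct series,
# so total series = product of per-tag value counts.
# Tag names and counts here are illustrative, not from any real deployment.
from math import prod

tags = {
    "host": 500,        # hosts reporting this metric
    "datacenter": 4,
    "interface": 8,     # NICs per host
}

series = prod(tags.values())
print(f"{series} unique series for this metric")  # 500 * 4 * 8 = 16000

# Letting a high-cardinality tag (e.g. client IP) sneak in multiplies that:
series_with_ip = series * 100_000
print(f"{series_with_ip} series with a client-IP tag")  # 1.6 billion
```

This is why being strict about allowed tags matters: one careless tag can
turn thousands of rows into billions.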

Either way, just make sure you know what you are getting into and understand
your problem as well as the product you are choosing. OpenTSDB could be
overkill for people who just want a low amount of data per second. You will
also need to plan retention policies / long-term storage / hardware
requirements / etc.; remember, you are running HBase.

------
rodionos
Not to steal the limelight from OpenTSDB - it's great in many ways - but if
anyone is interested in experimenting with TSDBs, we've been developing the
HBase-based Axibase Time Series Database for 3+ years:
[https://axibase.com/products/axibase-time-series-
database](https://axibase.com/products/axibase-time-series-database). It
supports SQL and is somewhat better packaged, with a built-in rule engine for
alerting and visualization: example -
[http://apps.axibase.com/chartlab/2ef08f32](http://apps.axibase.com/chartlab/2ef08f32).
It accepts most line protocols, including nmon, tcollector, scollector (works
on Windows, often overlooked), collectd, and statsd.

Disclosure: I work for Axibase.

------
necubi
We've been running OpenTSDB at a very large scale (~100k writes/s) for the
past year and a half and have been pretty happy with it. It's not as reliable
on the read side as something like Ganglia, and the query options aren't as
powerful as InfluxDB's, but it can scale better than any other option I know of.

The ability to look back into the past and get full-resolution metrics is a
huge deal. With Ganglia or Graphite (which aggressively downsample old data)
there's a lot of squinting to try to make out patterns past a week or so.

That said, we also use HBase as our main data store, so we have a lot of
experience with operating it. I'm not sure I could recommend OpenTSDB to an
organization that lacks that expertise.

~~~
willempienaar
If you were to create an application that relies heavily on a time series
database, what would you recommend?

I'm writing an application of this sort, and my requirements won't be too bad
(about 1,000-4,000 metrics per second). I will also rely heavily on predefined
queries (rollups to 10s, 1m, 10m, 1h, etc.).

I'm hesitant to use OpenTSDB because the community "buy-in" seems to be going
down.

EDIT: I can't lose data to downsampling. I will need to store this for both
troubleshooting and regulatory purposes.
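
The predefined rollups mentioned above can be kept alongside the raw data
rather than replacing it (which avoids the downsampling-loss problem). A
minimal sketch of bucketed averaging, with illustrative intervals and the
aggregation function chosen arbitrarily as the mean:

```python
# Minimal sketch of rolling raw points up into fixed intervals.
# The raw points are untouched; rollups are derived views.
from collections import defaultdict

def rollup(points, interval_s):
    """points: iterable of (unix_ts, value) pairs.
    Returns {bucket_start_ts: mean of values in that bucket}."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % interval_s].append(value)
    return {b: sum(vals) / len(vals) for b, vals in buckets.items()}

raw = [(0, 1.0), (5, 3.0), (10, 5.0), (65, 7.0)]
print(rollup(raw, 60))  # {0: 3.0, 60: 7.0}
```

Most TSDBs (OpenTSDB included) do this kind of downsampling server-side at
query time, so storing full resolution and rolling up on read is a common
way to satisfy both dashboards and retention requirements.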

~~~
Cidan
I wouldn't worry too much about what the community thinks. TSDB fits the bill
and works great. That being said, I've also heard wonderful things about
druid.io ([http://druid.io/](http://druid.io/)), however I've yet to
personally try it. We're going to start testing and throwing data at druid
early next year though.

------
crudbug
I have been looking for a scalable TS solution similar to Elasticsearch - from
a single node to thousands, all plug-and-play.

------
chaotic-good
OpenTSDB is really nice, but it has some drawbacks:

1. Limited timestamp precision (because of the storage schema).

2. Dependency on HBase.

------
jtwebman
It is a lot of work to keep up with. I would use KairosDB instead:
[https://kairosdb.github.io/](https://kairosdb.github.io/).

~~~
Cidan
What sort of work are you referring to when you say it's a lot to keep up
with?

We've been using OpenTSDB here in an extremely high-traffic setup for about a
year now, with absolutely no issues at all. It took a few hours at most to set
up and figure out scaling.

~~~
krenoten
I love OpenTSDB, but I think it's highly unusual for people to have
"absolutely no issues at all". I ran OpenTSDB in the hundreds-of-thousands to
low-millions of ops per second range for several years. OpenTSDB has
historically crapped itself when writes hit rows on which it had already run
its all-columns-for-an-hour-into-a-single-column compaction. Now it just drops
that data on reads.

If you have a large team of engineers writing data into it, they will
sometimes abuse the schema by overloading single metrics with many tags. Every
permutation of tag values creates a new row per hour of data. This creates
extremely hot shards when engineers inevitably do something like store a
client IP address in a tag. When this happens, you may have to write some
web-UI-scraping code to figure out which regions are getting the most traffic
(assuming you have "extremely high traffic setup" numbers of region servers),
or script up something like Misra-Gries on a tcpdump of the inbound metrics to
see where the hot shit is so you can get somebody's deploy reverted. I know of
some companies that have forked OpenTSDB and prefixed each row with a hash so
that overloaded metrics spread around the cluster much more evenly. KairosDB
avoids this by not using a lexicographically sharded database (it uses
Cassandra).

When run properly, it runs great. But learning how to run it properly
generally takes some really painful experiences.

~~~
kev009
OpenTSDB 2.2 has this hot region problem fixed with salted UIDs.
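
For anyone unfamiliar with salting: the idea is to prepend a hash-derived
bucket to each row key so that a hot metric's otherwise lexicographically
adjacent rows scatter across region servers. The sketch below uses an
illustrative bucket count and hash choice, not OpenTSDB's exact
implementation:

```python
# Minimal sketch of row-key salting. Sequential keys for one hot
# metric normally sort next to each other and land on one region
# server; a leading bucket byte spreads them out.
import hashlib

NUM_BUCKETS = 20  # illustrative; the real bucket count is configurable

def salted_key(row_key: bytes) -> bytes:
    """Prepend a deterministic bucket byte derived from the key."""
    bucket = hashlib.md5(row_key).digest()[0] % NUM_BUCKETS
    return bytes([bucket]) + row_key

# Hourly rows for a single hot metric now scatter across buckets:
for hour_ts in (1450000800, 1450004400, 1450008000):
    key = b"sys.cpu.user" + hour_ts.to_bytes(4, "big")
    print(salted_key(key)[0], key)
```

The trade-off is on reads: a query for one metric now has to fan out one scan
per bucket and merge the results, which is why salting is opt-in.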

~~~
krenoten
Badass! This would have saved me about a week of my life if it had come a few
years ago!

------
dschiptsov
So, we need a new database engine instead of merely using tables with
partitioning in PostgreSQL with a timestamp as the primary key?

And it is Java - waste twice the resources of the data you have &trade;

~~~
rodionos
Not sure your Java comment is warranted; a lot of Apache projects are either
written in Java or run on the JVM. Yes, HBase's memstore can be optimized, in
particular for small key-values, but that doesn't negate the need for
time-series databases with proven scaling, which HDFS offers. How were you
planning to design your tables in PostgreSQL if you have 1M+ unique series
inserting at 100K+/sec? Assuming you use the timestamp as a PK, does that
translate into 1M+ tables?

~~~
dschiptsov
We have already seen how the Cassandra clone in C++ performs.

I think that adding a new indexing strategy to a decent engine could be a good
idea.

Informix did this.

~~~
kev009
MapR has an HBase-compatible stack in C++; I would love to see TSDB numbers on
that. TSDB is pretty efficient itself, and since metrics collection is such a
batchy operation, the JVM isn't really that much of a problem. You can run
some dedicated TSDs for frontends/UIs to ameliorate JVM pain there too.

