
Zabbix, Time Series Data and TimescaleDB - RobAtticus
https://blog.zabbix.com/zabbix-time-series-data-and-timescaledb/6642/
======
boomskats
Every time TimescaleDB is brought up, I feel the need to point people to their
shadily worded proprietary licence[0], and pg_partman[1].

Do the same benchmarks against a pg_partman managed partitioned db and you'll
get the exact same performance. We do, at least - 150k or so metrics per
second, 10 columns per metric.

Not trying to crap on the TimescaleDB guys, I've found a lot of their writeups
extremely useful and can totally see how their commercially supported product
fits. However, I'd like to see pg_partman at least mentioned somewhere in the
article/comments. It's awesome and does the same job.

[0][https://github.com/timescale/timescaledb/blob/master/LICENSE](https://github.com/timescale/timescaledb/blob/master/LICENSE)

[1][https://github.com/pgpartman/pg_partman](https://github.com/pgpartman/pg_partman)
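
For reference, a minimal pg_partman setup for a metrics table looks roughly like this (table and column names are hypothetical; assumes pg_partman 4.x with native partitioning installed in the `partman` schema):

```sql
-- Hypothetical metrics table, natively range-partitioned by time
CREATE TABLE public.metrics (
    time    timestamptz NOT NULL,
    host_id int         NOT NULL,
    value   double precision
) PARTITION BY RANGE (time);

-- Let pg_partman create and maintain daily child partitions
SELECT partman.create_parent(
    p_parent_table => 'public.metrics',
    p_control      => 'time',
    p_type         => 'native',
    p_interval     => 'daily'
);
```

From there, pg_partman's background worker (or a cron'd `run_maintenance()`) keeps creating partitions ahead of time and can drop or detach old ones by retention policy.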

~~~
mfreed
(Timescale cofounder here)

Hey, just wanted to clarify: the vast majority of TimescaleDB code is Apache2,
and you can easily compile (and we ourselves build & distribute) Apache2-only
binaries.

When we announced a new license in December, we didn't _relicense_ any code,
we just said that some _future_ features will be available under a Community
or Enterprise License. The code under this "Timescale License" is clearly
marked and in a separate subdirectory, and for virtually all users (except the
public cloud DBaaS providers), the community features are free.

This is the actual top-level LICENSE file in the repo:
[https://github.com/timescale/timescaledb/blob/master/LICENSE](https://github.com/timescale/timescaledb/blob/master/LICENSE)

And here's a blog post discussing in more depth:
[https://blog.timescale.com/how-we-are-building-an-open-
sourc...](https://blog.timescale.com/how-we-are-building-an-open-source-
business-a7701516a480/)

~~~
GordonS
Timescale user here. I actually think your "Community" (TSL) and Enterprise
licenses are a good compromise.

Perhaps what's not immediately obvious is that the TSL license is there to
protect against cloud providers offering hosted TimescaleDB without
contributing back - systems that add value (e.g. are backed by TimescaleDB for
DML) can use TSL-licensed code without any issue.

------
lima
TimescaleDB confuses me. Postgres is an OLTP database, and its disk storage
format is uncompressed and not particularly space-efficient.

By clever sharding, you can work around the performance issues somewhat but
it'll never be as efficient as an OLAP column store like ClickHouse or MemSQL:

- Timestamps and metric values compress very nicely using delta-of-delta
encoding.

- Compression dramatically improves scan performance.

- Aligning data by columns means much faster aggregation. A typical time
series query does min/max/avg aggregations by timestamp. You can load data
straight from disk into memory, use SSE/AVX instructions, and only the small
subset of data you aggregate on will have to be read from disk.
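
As a toy sketch of why delta-of-delta works so well on timestamps (not any particular database's implementation): samples arriving at a near-constant interval reduce to a stream that is almost all zeros, which an entropy coder then shrinks to a few bits per point:

```python
def delta_of_delta(timestamps):
    """Encode a sorted list of >= 2 timestamps as: first value,
    first delta, then the change in delta for each later point."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    dods = [b - a for a, b in zip(deltas, deltas[1:])]
    return [timestamps[0], deltas[0]] + dods

def decode(encoded):
    """Invert delta_of_delta, proving the encoding is lossless."""
    first, delta, dods = encoded[0], encoded[1], encoded[2:]
    out = [first, first + delta]
    for d in dods:
        delta += d
        out.append(out[-1] + delta)
    return out

# Samples every ~15s with a little jitter: the encoded stream is
# mostly zeros, ideal for bit-packing.
ts = [1000, 1015, 1030, 1045, 1061, 1077]
print(delta_of_delta(ts))  # [1000, 15, 0, 0, 1, 0]
```

Gorilla-style formats apply this idea (plus XOR encoding for float values) to get timestamps down to roughly a bit or two each in the common case.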

So what's the use case for TimescaleDB? Complex queries that OLAP databases
can't handle? Small amounts of metrics where storage cost is irrelevant, but
PostgreSQL compatibility matters?

Storing time series data in TimescaleDB takes at least 10x (if not more) space
compared to, say, ClickHouse or the Prometheus TSDB.

~~~
akulkarni
(TimescaleDB co-founder)

TimescaleDB is more performant than you may think. We've benchmarked this
extensively: e.g., outperforming InfluxDB [1] [2], Cassandra [3], and Mongo
[4].

We've also open-sourced the benchmarking suite so others can run these
themselves and verify our results. [5]

We also beat MemSQL regularly for enterprise engagements (unfortunately can't
share those results publicly).

I think the scalability of ClickHouse is quite compelling, and if you need
more than 1-2M inserts a second and 100TBs of storage, then that would be one
reason I'd recommend another database over our own. But we have been working
on horizontal scalability for nearly a year, so we expect this to be less of
an issue in the near future (we will have more to share later this month).

You are correct, however, that TimescaleDB requires more storage than some of
these other options. If storage is the most important criterion for you (i.e.,
more important than usability or performance), then again I would point you
to one of the other databases that are more optimized for compression.
However, you can get 6-8x compression by running TimescaleDB on ZFS today,
and we are also working on additional techniques for achieving higher
compression rates.

[1] [https://blog.timescale.com/timescaledb-vs-influxdb-for-time-series-data-timescale-influx-sql-nosql-36489299877/](https://blog.timescale.com/timescaledb-vs-influxdb-for-time-series-data-timescale-influx-sql-nosql-36489299877/)

[2] [https://blog.timescale.com/what-is-high-cardinality-how-do-time-series-databases-influxdb-timescaledb-compare/](https://blog.timescale.com/what-is-high-cardinality-how-do-time-series-databases-influxdb-timescaledb-compare/)

[3] [https://blog.timescale.com/time-series-data-cassandra-vs-timescaledb-postgresql-7c2cc50a89ce/](https://blog.timescale.com/time-series-data-cassandra-vs-timescaledb-postgresql-7c2cc50a89ce/)

[4] [https://blog.timescale.com/how-to-store-time-series-data-mongodb-vs-timescaledb-postgresql-a73939734016/](https://blog.timescale.com/how-to-store-time-series-data-mongodb-vs-timescaledb-postgresql-a73939734016/)

[5] [https://github.com/timescale/tsbs](https://github.com/timescale/tsbs)

~~~
ruw1090
> You are correct however that TimescaleDB requires more storage than some of
> these other options. If storage is the most important criteria for you (ie
> more important than usability or performance), then again I would recommend
> you to one of the other databases that are more optimized for compression.
> However, you can get 6-8x compression by running TimescaleDB on ZFS today,
> and we are also currently working on additional techniques for achieving
> higher compression rates.

This is a weird answer, since columnar databases like MemSQL and ClickHouse
use compression to both save on storage and accelerate queries. Compare this
to generic filesystem compression, which both compresses worse and makes the
system slower.

~~~
RobAtticus
We haven't really found it to be the case that the system is slower with ZFS.
As the sibling mentions, you are trading some CPU for better I/O. We usually
see better insert performance and similar/better query latency.

------
linsomniac
Experiences with Zabbix? I tried it back around a decade ago and wanted to
like it, but didn't find it very reliable. And now the details are escaping
me. I ended up sticking with Nagios and Opsview. Around 5 years ago I switched
to a templated Icinga2 config and have been pretty happy with that, but it's
pretty low level.

~~~
wbh1
Surprised to see Prometheus hasn't been mentioned yet, and even _Nagios_ is
being mentioned as a better alternative. My company (higher-ed, ~100k combined
students/fac/staff) is desperately trying to get away from Nagios. Once you
get Nagios to the scale where you have to implement mod_gearman, you've gone
too far.

I'd recommend taking a look at Prometheus[1]. It has its own _very_ performant
TSDB, there are exporters for just about everything, it's the de facto way
that things like Kubernetes expose metrics, and it has first-class support in
Grafana for visualization.

We POC'd Zabbix, Icinga, ScienceLogic, Instana, Sensu, and Prometheus.
Prometheus was our favorite. Take a look at the comparison between it and
other popular monitoring products to see if it fits your needs though [2].

[1]
[https://github.com/prometheus/prometheus](https://github.com/prometheus/prometheus)
[2]
[https://prometheus.io/docs/introduction/comparison/](https://prometheus.io/docs/introduction/comparison/)
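
For anyone who hasn't tried it: a minimal Prometheus scrape config is just a few lines of YAML (the job name and target host here are placeholders; 9100 is node_exporter's default port):

```yaml
# prometheus.yml -- minimal sketch
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['myhost.example.com:9100']
```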

~~~
tecleandor
The problem I have with Prometheus is that most of my nodes are in very
locked-down networks I don't control (healthcare), and I can't set up proxies
for Prometheus to reach them; I can only make outbound connections. So, for
now, my best option seems to be InfluxDB, which doesn't look bad to me.

~~~
linsomniac
I've been using InfluxDB for ~3 years now for storing metrics (almost
exclusively via Telegraf, a few custom ones), and it has been great! It
replaced a collectd setup and dramatically decreased load across my fleet.

When I first started using it, it was pretty early and had some issues; in
fact, I nearly trashed it. I also didn't like Prometheus's pull model compared
to InfluxDB's push model. They ended up resolving the InfluxDB issues I was
having right as I was about to give up on it, and it's been solid since. I use
it with Grafana to generate graphs of system use. I set it up before TICK was
a thing.

~~~
h1d
I was about to like InfluxDB, but ever since hearing people say it eats memory
and your data, I stopped caring.

[https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/FAQ](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/FAQ)

("How does VictoriaMetrics compare to InfluxDB?")

~~~
linsomniac
That hasn't been my experience. I've been running it for ~3 years in our dev,
stg, and prod environments. Prod is using 1.5GB of RAM on a 5GB instance. I've
never had a data loss issue.

------
marknadal
The biggest problem I've had with time series systems is managing the
SSDs/HDDs underneath.

Having to resize/grow/stripe/etc. them is a pain.

So we came up with a clever solution that batches chunks to S3:

[https://www.youtube.com/watch?v=x_WqBuEA7s8](https://www.youtube.com/watch?v=x_WqBuEA7s8)

$10/day for 100M records (100GB data), all costs included!

And best of all, reduced DevOps! Very practical, super simple.

~~~
enordstr
Timescale engineer here. Just want to point out that you can also attach
additional disks using tablespaces, which are fully supported on hypertables.
With a few simple commands, this allows you to add new disks and move old
disks out of rotation while still being able to query the old data on them.
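
For the record, those commands look roughly like this (the disk path and hypertable name are hypothetical):

```sql
-- Mount the new disk, then register it as a Postgres tablespace
CREATE TABLESPACE disk2 LOCATION '/mnt/disk2/pgdata';

-- New chunks of the hypertable will now land on the new disk
SELECT attach_tablespace('disk2', 'conditions');

-- Stop placing new chunks there; existing chunks stay queryable
SELECT detach_tablespace('disk2', 'conditions');
```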

------
techntoke
Zabbix is like a step up from Nagios. I don't know how it can even stay
relevant with Prometheus around.

------
sreeramb93
My experience with TimescaleDB is that it does not support Gorilla encoding,
so its storage needs are very high.

~~~
SEJeff
Gorilla TSDB format paper for those who might not get the reference:
[https://www.vldb.org/pvldb/vol8/p1816-teller.pdf](https://www.vldb.org/pvldb/vol8/p1816-teller.pdf)

