
Does TimescaleDB support automated downsampling using various functions (min/max/mean/avg) and then during querying automatically picking the correct downsampled data? This is the biggest issue that I and others have with InfluxDB, that it doesn't do that, so the only convenient way to use it is just to expire all data outside the retention policy. Ticket here: https://github.com/influxdata/influxdb/issues/7198





I think what you are referring to is the TimescaleDB real-time aggregates https://docs.timescale.com/latest/using-timescaledb/continuo...

It allows you to define aggregations that are automatically used when querying the raw table if the query matches, and it also allows you to drop the raw data with a retention policy but keep the aggregated form (https://docs.timescale.com/latest/using-timescaledb/continuo...).
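
To make the idea concrete, here is a minimal Python sketch of what "downsample, then drop the raw data but keep the aggregated form" means. It is not TimescaleDB code: `downsample`, `apply_retention`, and the sample points are hypothetical names and data chosen for illustration, and the database does this work in SQL on its own storage.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Group (timestamp, value) pairs into fixed-width buckets
    and compute min/max/mean for each bucket."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return {
        start: {"min": min(vals), "max": max(vals), "mean": mean(vals)}
        for start, vals in sorted(buckets.items())
    }


def apply_retention(points, now, max_age_seconds):
    """Keep only raw points inside the retention window.
    The rollup is stored separately and is not affected."""
    return [(ts, v) for ts, v in points if now - ts <= max_age_seconds]


# Hypothetical raw samples: (seconds since start, value).
raw = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0), (120, 9.0)]

rollup = downsample(raw, bucket_seconds=60)
recent_raw = apply_retention(raw, now=120, max_age_seconds=60)
```

After this runs, `rollup` still describes the whole history at one-minute resolution, while `recent_raw` holds only the last minute of full-fidelity points.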


OK, but it looks like I still have to define these aggregates manually. I was really more talking about the standard use-case that folks used to use Graphite / rrdtool for: Keep track of real-time high-fidelity metrics while still being able to query aggressively-downsampled historical data for comparison, and doing so without having to configure anything.
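
The "zero configuration" behaviour described here amounts to a fixed ladder of resolutions plus a rule for choosing one at query time. A rough Python sketch of that rule follows; the `TIERS` values and the `pick_tier` helper are assumptions made up for illustration, not defaults taken from Graphite, rrdtool, or TimescaleDB.

```python
DAY = 24 * 3600

# Hypothetical built-in tiers: (step_seconds, retention_seconds).
TIERS = [
    (10, 1 * DAY),      # 10-second points kept for a day
    (60, 7 * DAY),      # 1-minute points kept for a week
    (3600, 365 * DAY),  # 1-hour points kept for a year
]


def pick_tier(oldest_age_seconds, tiers=TIERS):
    """Return the finest step whose retention still covers
    the oldest point the query asks for."""
    for step, retention in sorted(tiers):
        if oldest_age_seconds <= retention:
            return step
    raise ValueError("query reaches past the longest retention")
```

With these assumed tiers, a query over the last hour reads the 10-second data, a query over the last three days falls back to 1-minute data, and a query over the last month reads hourly data.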

Hi @heipei -- one thing to observe is that Graphite & rrdtool are designed for a specific monitoring use case, while TimescaleDB is a more general-purpose time-series database.

So what that means is that TimescaleDB has mechanisms to make it really easy to define downsampling (continuous aggregates, data retention policies), and even to have queries that transparently query across the historical aggregates and new raw data (real-time aggregates, which parent pointed to, and which InfluxDB doesn't support).
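
A simplified way to picture a query that spans historical aggregates and new raw data: serve already-materialized buckets up to some cutoff, and aggregate the raw rows after it on the fly. The Python below is only a sketch under that assumption; `realtime_view`, the `watermark` parameter, and the sample data are hypothetical and do not reflect TimescaleDB's internals.

```python
from statistics import mean


def bucket_means(points, bucket_seconds):
    """Compute the mean value per fixed-width bucket."""
    buckets = {}
    for ts, value in points:
        buckets.setdefault(ts - ts % bucket_seconds, []).append(value)
    return {start: mean(vals) for start, vals in buckets.items()}


def realtime_view(materialized, watermark, raw, bucket_seconds):
    """Combine stored buckets older than the watermark with buckets
    computed on the fly from raw points at or after the watermark."""
    fresh = bucket_means(
        [(ts, v) for ts, v in raw if ts >= watermark], bucket_seconds
    )
    combined = {s: m for s, m in materialized.items() if s < watermark}
    combined.update(fresh)
    return dict(sorted(combined.items()))


# Hypothetical state: two buckets already materialized, newer raw rows not yet.
materialized = {0: 2.0, 60: 6.0}
raw_recent = [(120, 9.0), (150, 11.0), (180, 4.0)]

view = realtime_view(materialized, watermark=120, raw=raw_recent, bucket_seconds=60)
```

The caller sees one continuous series in `view`, without needing to know which buckets came from storage and which were computed at query time.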

What the database _by itself_ doesn't do is automatically create certain continuous aggregates on metrics immediately, because frankly, users' needs vary so much.

That said, we have built stacks/solutions that leverage TimescaleDB and do precisely that. For example, we just released a design doc and beta for our refreshed native integration with Prometheus, which addresses a use case extremely similar to that of Graphite / rrdtool. Because this is now automated, it defines many of these things out of the box, so you don't need to configure anything. Check it out, and input is welcome!

https://tsdb.co/prom-design-doc


Thanks for the pointer. I truly understand that TimescaleDB is a general-purpose time-series DB, and I understand that most use-cases are unique, so it makes sense to make these decisions about what and how to downsample consciously. However, I feel that there is a large audience of people who "just" want a database that they can point their system-metrics collector at (Telegraf), point their dashboard at (Grafana) and just hit "go", much like they would with something like Datadog, and have the confidence that they can still scale the database if it's ever necessary. Much like Elasticsearch provides default mappings (text/keyword/date/number), this would be a great 80-20 solution for the default use-case of "I want to collect system metrics from my hundreds of servers and have a few sensible defaults about granularity, downsampling and data-retention, and only then will I start to worry about whether that data will eventually exceed my one-server deployment."

Yep, that's exactly what the "Timescale Observability" stack is about. Type "helm install", and a full stack is spun up and auto-configured to scrape information. You have graphs up in Grafana within 2 minutes, with zero configuration.

- See https://github.com/timescale/timescale-observability

- Or join the #prometheus channel at https://slack.timescale.com


Very, very interested in this too. It's sometimes called automated rollups. I know Elasticsearch does this.

I replied to the parent comment. In short: yes, it is supported (https://docs.timescale.com/latest/using-timescaledb/continuo...).


