Hacker News

Why should someone use TimescaleDB over ClickHouse for time-series/analytics workloads?

I've heard several arguments for choosing TimescaleDB, as an extension of PostgreSQL, over ClickHouse:

1. As already mentioned, if metadata (data about the time series) is already in PostgreSQL, it is convenient to stay in the same database engine and query with joins across both the metadata and the time series data, so there is no need to integrate the two sources in the application layer.

2. Also related to the first item: the advantage of already knowing the PostgreSQL API. ClickHouse has a different management API that you would have to learn, whereas if you know PostgreSQL, you only need to learn the time-series-specific API of TimescaleDB.

3. ClickHouse doesn't support updating and deleting existing data in the same way relational databases do.
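To make points 1 and 3 concrete, here's roughly what staying in PostgreSQL looks like (table and column names are made up):

```sql
-- Metadata in a regular PostgreSQL table, readings in a TimescaleDB
-- hypertable; joining the two is just ordinary SQL:
SELECT d.location, avg(r.temperature) AS avg_temp
FROM readings r
JOIN devices d ON d.device_id = r.device_id
WHERE r.time > now() - INTERVAL '1 day'
GROUP BY d.location;

-- Standard UPDATE/DELETE work as usual, unlike ClickHouse's
-- asynchronous ALTER TABLE ... UPDATE/DELETE mutations:
DELETE FROM readings WHERE device_id = 42 AND time < '2019-01-01';
```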

In the end, the final decision still depends on your needs.

The biggest reason to pick TimescaleDB is if you're already using Postgres as an operational database and want some time-series/analytical capabilities on top of it.

Originally Timescale wasn't much more than automatic partitioning, but with the new compression and scale-out features, along with automatic aggregations and other utilities, it can actually deliver pretty good overall performance. It still won't get you the raw speed of ClickHouse, but in exchange you get all the functionality of Postgres (extensions, full SQL support, JSON, etc.) and can avoid big ETL jobs.
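For instance, the automatic aggregations are set up with plain SQL - a rough sketch with made-up names (exact syntax depends on the TimescaleDB version):

```sql
-- A continuous aggregate: hourly rollups that Timescale keeps
-- up to date automatically as new rows arrive
CREATE MATERIALIZED VIEW readings_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS bucket,
       device_id,
       avg(temperature) AS avg_temp
FROM readings
GROUP BY bucket, device_id;
```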

Another PG extension is Citus, which does scale-out automatic sharding with distributed nodes but is more generalized than Timescale for handling non-timeseries use cases. Microsoft offers Citus on Azure.

Microsoft also offers Timescale on Azure, but only the Apache-licensed parts.

Yes, along with Aiven and a few others. The free community license is great, but unfortunately it requires either Timescale Cloud or running it yourself.

That's a good question! Especially considering these overwhelming benchmarks [1] run with Timescale's TSBS [2].

[1] https://www.altinity.com/blog/clickhouse-for-time-series

[2] https://github.com/timescale/tsbs

Those 2018 benchmarks pre-dated many of the features we released last year, including columnar compression, continuous/real-time aggregates, etc.

Then it would be great to post updated benchmark results on the TimescaleDB blog.

If you use PostgreSQL, then it feels natural to add TimescaleDB extension and start storing time series or analytical data there alongside other relational data.
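Getting started is only a couple of statements (assuming the extension is installed; the schema here is hypothetical):

```sql
CREATE EXTENSION IF NOT EXISTS timescaledb;

CREATE TABLE readings (
  time        TIMESTAMPTZ NOT NULL,
  device_id   INTEGER,
  temperature DOUBLE PRECISION
);

-- Turn the plain table into a hypertable, auto-partitioned by time
SELECT create_hypertable('readings', 'time');
```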

If you need to store trillions of rows efficiently and perform real-time OLAP queries over billions of rows, then it is better to use ClickHouse [1], since it requires 10x-100x fewer compute resources (mostly CPU, disk IO and storage space) than PostgreSQL for such workloads.
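The typical ClickHouse setup for such a workload is a MergeTree table - a sketch with hypothetical names:

```sql
CREATE TABLE readings (
    time        DateTime,
    device_id   UInt32,
    temperature Float64
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(time)
ORDER BY (device_id, time);
```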

If you need to store and query large amounts of time series data efficiently, then take a look at VictoriaMetrics [2]. It is built on ideas from ClickHouse, but is optimized solely for time series workloads. It has performance comparable to ClickHouse, while being easier to set up and manage. It also supports MetricsQL [3] - a query language that is much easier to use than SQL when dealing with time series data. MetricsQL is based on PromQL [4] from Prometheus.

[1] https://clickhouse.tech/

[2] https://github.com/VictoriaMetrics/VictoriaMetrics

[3] https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/Metr...

[4] https://medium.com/@valyala/promql-tutorial-for-beginners-9a...

We spent about 6 months looking at pretty much every database tech on the market; Cockroach, ClickHouse, Influx, VoltDB, MemSQL, etc. were top contenders. There was an outdated article on medium.com (by VictoriaMetrics) which slammed TimescaleDB for its disk usage; we didn't realise it was biased, so TSDB dropped off the list. But then we saw an email about their compression segmented by device_id and gave it a shot. We implemented it, and 5 months after our production release we now have outstanding performance and compression (95x). We are planning to move the rest of our databases to TSDB now, as it ticks our boxes: our use case is HTAP, not solely OLAP or OLTP.
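For anyone wanting to try the same thing, the compression setup described here boils down to something like this (column names assumed; the policy function name has changed across TimescaleDB versions):

```sql
ALTER TABLE measurements SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'device_id',
  timescaledb.compress_orderby   = 'time DESC'
);

-- Automatically compress chunks older than 7 days
SELECT add_compression_policy('measurements', INTERVAL '7 days');
```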

I'm super excited about this news, but TSDB please work on allowing us to put data over 1 year old on separate slow-disk servers, so we can keep the hot stuff on the NVMe servers. Once you get this sorted it will be the perfect fit for us.

> TSDB please work on allowing us to put data over 1 year old on separate slow-disk servers, so we can keep the hot stuff on the NVMe servers. Once you get this sorted it will be the perfect fit for us.

ClickHouse recently added multi-volume storage for exactly the use case you describe. [1] It's a great feature.

[1] https://www.altinity.com/blog/2019/11/27/amplifying-clickhou...
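With volumes (say 'hot' and 'cold') defined in the server's storage policy, moving old data to slow disks becomes a table-level TTL - a sketch with hypothetical names:

```sql
CREATE TABLE readings (
    time        DateTime,
    device_id   UInt32,
    temperature Float64
) ENGINE = MergeTree()
ORDER BY (device_id, time)
TTL time + INTERVAL 1 YEAR TO VOLUME 'cold'
SETTINGS storage_policy = 'tiered';
```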

Glad to hear it is working out for you! I'll relay the request re: old data. But please also feel free to email me directly at ajay (at) timescale.com (or email support (at) timescale.com) if you have any follow up questions / requests.

Good news: TimescaleDB already offers this feature. Feel free to ping us support (at) timescale.com and we can walk you through it. Thanks!
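For reference, this works by moving chunks between tablespaces - something along these lines (chunk and tablespace names are placeholders):

```sql
-- List chunks older than a year
SELECT show_chunks('readings', older_than => INTERVAL '1 year');

-- Move one of them to a tablespace backed by slow disks
SELECT move_chunk(
  chunk => '_timescaledb_internal._hyper_1_2_chunk',
  destination_tablespace => 'history'
);
```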

It is customary on HN to disclose when you are a member of, or contributor to, products you are suggesting.

Another thing to mention is that TimescaleDB has much stronger ACID guarantees than ClickHouse, which means you get clearer consistency semantics.
