Thanks for sharing, I really like Postgres. However, I generally use Postgres for OLTP work. I would like to point out two things:
1. Based on ClickBench results, pg_analytics is still far from top-tier performance. If you're looking for a high-performance OLAP database, you should consider the top-ranked products.
2. The queries tested by ClickBench are really simple, far from real data analysis scenarios. You are probably aware of the TPC-DS and TPC-H benchmarks; ClickHouse simply cannot run these test suites, which is why their ClickBench does not include these datasets.
Lastly, I want to say, if your enterprise is of a certain size, separating OLTP and OLAP into two databases is the right choice, because you will have two different teams responsible for two different tasks. By the way, I use StarRocks for OLAP work.
1. On ClickBench, make sure you're doing an apples-to-apples comparison by comparing scores from the same instance. We used the most commonly used instance, c6a.4xlarge. While a few databases like DuckDB rank higher, the performance of DataFusion (our underlying query engine) is constantly improving, and pg_analytics inherits those improvements.
Then again, people only care about performance and benchmarks up to a certain threshold. The goal of pg_analytics is not to displace something like StarRocks, but to enable analytical workloads that require both row- and column-oriented data or Postgres transactions.
2. We're working on TPC-H benchmarks. They're good for demonstrating JOIN performance and we'll have them published early next week.
Most of the time, all that matters in terms of performance is the user's tolerance. Once that threshold is met, operational complexity becomes a lot more important. We use raw Postgres for analytics, knowing that projects like these and cloud offerings like AlloyDB will make our lives easier (in terms of performance) as time goes on.
pg_bm25 looks awesome too! Next up, take FDW to the level of Trino/Drill, and we won't need anything other than Postgres and its extensions!
ClickBench is really nice because it is so easy to compare and contribute benchmarks. Is there anything out there like ClickBench, but for the TPC-DS and TPC-H benchmarks?
TPC-H has some published benchmarks on their website, but they're for specific hardware systems. They're certainly not as user-friendly and modern as ClickBench. I'm not sure if it's possible to make something better without their consent, though.