I wonder if ClickBench needs more data or something.
I just randomly compared ClickHouse, Databend and Doris on a c6a.4xlarge machine. For some queries there are big differences, like ClickHouse being an order of magnitude faster than the others in Q28 and two orders of magnitude slower in Q29. That sounds useful, because it shows where some of the databases do something differently from the others and where there is room for improvement.
But for most of the queries, the comparison is like 0.02s vs 0.03s vs 0.03s. That doesn't sound very meaningful, and I also wonder how precise the measurement is: when the differences are a matter of milliseconds, it is much easier for randomness to sneak in than with measurements in minutes.
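To make the precision worry concrete, here is a minimal Python sketch of how one could check whether a 0.02s vs 0.03s gap is larger than run-to-run noise. It is not how ClickBench itself measures anything; the helper names (`time_repeated`, `summarize`, `clearly_different`) and the "gap must exceed 2x the standard deviation" rule are assumptions made up for illustration, and `run_query` stands in for whatever client call actually executes the query.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_repeated(run_query: Callable[[], object], repeats: int = 10) -> List[float]:
    """Run a callable several times and return wall-clock durations in seconds.

    `run_query` is a hypothetical placeholder for a real database client call.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Return min, median and relative spread (stdev / median) of the samples."""
    median = statistics.median(durations)
    if len(durations) > 1 and median > 0:
        rel_spread = statistics.stdev(durations) / median
    else:
        rel_spread = 0.0
    return {"min": min(durations), "median": median, "rel_spread": rel_spread}


def clearly_different(a: List[float], b: List[float], factor: float = 2.0) -> bool:
    """Crude heuristic (an assumption, not a formal statistical test):
    call two systems different only if the gap between their medians
    exceeds `factor` times the larger of the two sample standard deviations.
    """
    gap = abs(statistics.median(a) - statistics.median(b))
    noise = max(statistics.stdev(a), statistics.stdev(b))
    return gap > factor * noise
```

With tightly clustered samples around 0.02s and 0.03s, `clearly_different` reports a real gap; with the same medians but samples scattered between 0.01s and 0.045s, it does not. A proper significance test would be better, but even this rough check shows why a single millisecond-scale number per query says little.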
While I can read it as saying that modern columnar databases are superfast and awesome, I wonder what the results would look like on one or two orders of magnitude more data.
It can be applied to a larger dataset, see https://github.com/ClickHouse/ClickBench/tree/main/clickhous... (100 billion records), but so far only ClickHouse has been tested at this volume, since loading this data into every DBMS takes too much time and money.
Trust for what purpose? Are you a Marketing Droid for a product which looked good/bad in some particular benchmark? A fanboy yearning to proclaim yourself Right on the internet? A paid-per-click author cranking out web content? A PHB looking for validation of his half-assed decisions?
For the great majority of use cases, any looks-good benchmark will do the job.
(Vs. if you actually need to know about real-world performance - you'll have to do some serious work for that information.)
Full disclosure: I work at StarTree, which is powered by Apache Pinot.
ClickHouse's ClickBench is a good general tool. However, it's not the end-all, be-all of performance benchmarking and testing. Its results may or may not be applicable for guidance on the performance of your specific use case when you get to production.
It is definitely a stab at getting an objective suite of tools for the real-time analytics space. But just like you had YCSB as a good general performance test, eventually a subset of users wanted something specific for Cassandra and Cassandra-like databases (DSE, ScyllaDB, etc.), so you eventually saw cassandra-stress. We have to consider cases where certain databases may need to have testing suites that really capture their capabilities.
ClickHouse itself publishes a list of Limitations that everyone should keep in mind as they run ClickBench:
> It looks like the queries are all single table queries with group-bys and aggregates over a reasonably small data set (10s of GB)?
> I'm sure some real workloads look like this, but I don't think it's a very good test case to show the strengths/weaknesses of an analytical database's query processor or query optimizer (no joins, unions, window functions, complex query shapes?).
> For example, if there were any queries with some complex joins, ClickHouse would likely not do very well right now given its immature query optimizer (ClickHouse blogs always recommend denormalizing data into tables with many columns to avoid joins).
So, again, ClickBench is a good (great) beginning. As an industry we should not let it be seen as the end. I'd be interested in the community's opinions on what and how we should be doing better.
I wouldn't take TechEmpower as the holy grail of testing web frameworks. Most of the top-performing language+framework entries don't represent how real developers write code.
(I work at ClickHouse)