It's a great starting insight, but again it's a small dataset (100 GB) that almost fits in memory, and I think many details are missing (for example, ClickBench publishes all configs and queries, plus a more detailed report, so vendors can reproduce/optimize/dispute the results).
What counts as large or small varies a lot depending on the context of the conversation/analysis.
MotherDuck's "Big Data is Dead" post [0] sticks in my mind:
> The general feedback we got talking to folks in the industry was that 100 GB was the right order of magnitude for a data warehouse. This is where we focused a lot of our efforts in benchmarking.
Another point of reference is [1]:
> [...] Umbra achieves unprecedentedly low query latencies. On small data sets, it is even faster than interpreter engines like DuckDB