On 2, I understand your thinking around purpose-built — but you're retrofitting an analytical database into a transactional database without fully supporting all the features (both in terms of functionality and performance) of either. It's really hard to be truly purpose-built this way. As a result, users might not get the best of both worlds.
PeerDB is different. We keep Postgres and ClickHouse separate and just move data reliably between them. Users get to query Postgres and ClickHouse in isolation and make the best of each of them.
Anyway, keep up the good work! Just wanted to share some challenges we've seen before when building an analytics extension (Citus), particularly around chasing both Postgres compatibility and performance.
Yep, what I want to say is that the line between the two designs is indeed very blurry.
Logical replication with mooncake will try to create a columnar version of a Postgres heap table that can be read within Postgres (using pg_mooncake), or outside Postgres (similar to PeerDB + ClickHouse) with other engines like DuckDB, StarRocks, Trino, and possibly ClickHouse.
But since we can purpose-build the columnstore storage engine with Postgres CDC in mind, we can replicate real-time updates/deletes (especially in cases where a traditional OLAP system won't keep up).
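To make the update/delete point concrete, here is a toy sketch of applying a CDC event stream to a columnar copy. Everything here is illustrative, not mooncake's actual code or file format: the table layout, the `id` primary key, and the tombstone approach to deletes (a common trick for making deletes cheap in column-oriented storage) are all assumptions for the example.

```python
# Toy sketch (not mooncake's actual implementation): applying a Postgres CDC
# stream of insert/update/delete events to a columnar copy. Deletes are
# recorded as tombstones instead of rewriting the column arrays.

class ColumnStore:
    def __init__(self, columns):
        self.columns = {c: [] for c in columns}  # column name -> value list
        self.row_of = {}       # primary key -> row position
        self.deleted = set()   # tombstoned row positions

    def apply(self, op, row):
        pk = row["id"]
        if op == "insert":
            self.row_of[pk] = len(self.columns["id"])
            for c, vals in self.columns.items():
                vals.append(row[c])
        elif op == "update":
            pos = self.row_of[pk]
            for c, vals in self.columns.items():
                vals[pos] = row[c]
        elif op == "delete":
            # Mark the row dead; physical cleanup can happen lazily.
            self.deleted.add(self.row_of.pop(pk))

    def scan(self, col):
        # Visible values of one column, skipping tombstoned rows.
        return [v for i, v in enumerate(self.columns[col])
                if i not in self.deleted]

store = ColumnStore(["id", "status"])
store.apply("insert", {"id": 1, "status": "new"})
store.apply("insert", {"id": 2, "status": "new"})
store.apply("update", {"id": 1, "status": "shipped"})
store.apply("delete", {"id": 2})
print(store.scan("status"))  # -> ['shipped']
```

The tombstone set is what lets updates/deletes arrive at replication speed: an append-only OLAP system that can only bulk-rewrite partitions is exactly the "won't keep up" case mentioned above.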
I understand. In that scenario, why can't users just use these other query engines directly instead of the extension? You're heavily relying on DuckDB within your extension, but you may not be able to unleash its full power since you're embedding it within Postgres and operating within the constraints of the Postgres extension framework and interface.
The focus of mooncake is to be a columnar storage engine that natively integrates with Postgres: allowing writing from pg, replicating from pg, and reading from pg using pg_mooncake.
We want people to use other engines to read from mooncake, and in that setup they are effectively stateless engines, which are much easier to manage and avoid all the data ETL problems.
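A rough sketch of the "stateless reader" idea: the storage engine owns the files, and any number of engines open those same files directly, holding no state of their own. The JSON-file-per-column layout below is made up purely for illustration (real systems would use something like Parquet).

```python
# Rough sketch: several "engines" reading the same on-disk columnar data
# directly. Each reader holds no state of its own, so there is no per-engine
# copy of the data to load, sync, or back up -- which is the ETL problem
# being avoided. The JSON-per-column layout is purely illustrative.
import json
import tempfile
from pathlib import Path

def write_columns(dirpath, table):
    # Writer (the storage engine's job): one file per column.
    for name, values in table.items():
        (Path(dirpath) / f"{name}.json").write_text(json.dumps(values))

def stateless_reader(dirpath, column):
    # Reader (any engine's job): open the shared files, compute, keep nothing.
    return json.loads((Path(dirpath) / f"{column}.json").read_text())

with tempfile.TemporaryDirectory() as d:
    write_columns(d, {"id": [1, 2, 3], "amount": [10, 20, 30]})
    # Two independent "engines" query the same files; neither copied the data.
    print(sum(stateless_reader(d, "amount")))  # -> 60
    print(max(stateless_reader(d, "id")))      # -> 3
```

Because the readers never own a copy, swapping one engine for another (DuckDB today, something else tomorrow) doesn't require moving data again.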
Sounds good. I'm still a bit confused, but will wait for your next version. :) ETL problems still aren't avoided — replicating from Postgres sources using logical replication is still ETL. One topic we didn't chat much about: be careful what you're signing up for with logical replication — we built an entire company just to solve the logical replication/decoding problem. ;)
1, makes sense.