I understand. Can you give an example of a “modern distributed relational column...

manigandham · on Nov 8, 2018

ClickHouse, MemSQL, Redshift, MapD, Kinetica, etc.

If you just want rollups and don't care about every row, then look at Druid (or imply.io, a startup that makes it easier to run).

All these systems can delete old data very quickly, as they just delete entire compressed partition files.
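To make the partition-drop point concrete, here is a minimal sketch, assuming one gzip file per day. It is not how any of the systems named above lay out data on disk; the file naming and the helpers `write_partition`, `drop_partitions_before`, and `demo` are all hypothetical. It only shows why expiry is cheap: old data is removed by unlinking whole files, with no row-by-row deletes.

```python
import gzip
import tempfile
from datetime import date
from pathlib import Path


def write_partition(root: Path, day: date, rows: list[str]) -> Path:
    """Write one day's rows into a single compressed partition file."""
    path = root / f"{day.isoformat()}.gz"  # hypothetical naming scheme
    with gzip.open(path, "wt", encoding="utf-8") as f:
        f.write("\n".join(rows))
    return path


def drop_partitions_before(root: Path, cutoff: date) -> list[str]:
    """Delete every partition older than cutoff by removing whole files."""
    dropped = []
    for path in sorted(root.glob("*.gz")):
        day = date.fromisoformat(path.stem)  # stem is the YYYY-MM-DD part
        if day < cutoff:
            path.unlink()  # one filesystem call per partition, no row scans
            dropped.append(path.name)
    return dropped


def demo() -> tuple[list[str], list[str]]:
    """Create three daily partitions, expire the two oldest."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for day in (date(2018, 11, 6), date(2018, 11, 7), date(2018, 11, 8)):
            write_partition(root, day, ["row-a", "row-b"])
        dropped = drop_partitions_before(root, date(2018, 11, 8))
        remaining = sorted(p.name for p in root.glob("*.gz"))
        return dropped, remaining
```

The cost of expiry here scales with the number of partitions, not the number of rows, which is the property being described.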

gianm · on Nov 8, 2018

Fwiw, more recent versions of Druid have a no-rollup mode that does ingestion row-for-row. It ended up being useful for cases where you _do_ care about every row, maybe because you want to retrieve individual rows or maybe because you don't want to define your rollups at ingestion time. And in that mode, Druid behaves like the other DBs you mention.
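As a rough illustration of the difference (a toy sketch, not Druid's ingestion code or API; the event shape and the functions `ingest_rollup` and `ingest_no_rollup` are made up for this example), rollup aggregates rows into one record per time bucket and dimension at ingestion, while no-rollup stores every input row as-is:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event shape: (ISO timestamp, page, bytes served)
Event = tuple[str, str, int]


def ingest_rollup(events: list[Event]) -> dict[tuple[str, str], dict[str, int]]:
    """Aggregate at ingestion: one stored row per (hour, page)."""
    totals: dict[tuple[str, str], dict[str, int]] = defaultdict(
        lambda: {"count": 0, "bytes": 0}
    )
    for ts, page, nbytes in events:
        # Truncate the timestamp to the hour to form the rollup bucket.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        key = (hour.isoformat(), page)
        totals[key]["count"] += 1
        totals[key]["bytes"] += nbytes
    return dict(totals)


def ingest_no_rollup(events: list[Event]) -> list[Event]:
    """Store row-for-row: individual events stay retrievable."""
    return list(events)


EVENTS: list[Event] = [
    ("2018-11-08T10:05:00", "/home", 100),
    ("2018-11-08T10:40:00", "/home", 250),
    ("2018-11-08T11:15:00", "/home", 75),
]
```

With `EVENTS`, the rollup path keeps two aggregated rows (one per hour) and loses the individual timestamps, while the no-rollup path keeps all three original rows, so aggregations can be chosen at query time instead.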

(I am a Druid committer.)

preetamjinka · on Nov 8, 2018

Some of those we’ve looked at before and decided not to go with because of unknown observability, high operational requirements, or cost. But yeah, no real problems with data models or queries.

I think Druid comes closest to the ideal system for the requirements I’ve had to deal with, but I haven’t used it yet.

Thanks, by the way! This helps a lot.