
I understand. Can you give an example of a “modern distributed relational column-oriented database”?

Two capabilities that are important in my work are roll-ups (reducing resolution of data) and fast bulk deletes of old data.



ClickHouse, MemSQL, Redshift, MapD, Kinetica, etc.

If you just want rollups and don't care about every row, then look at Druid (or imply.io for a startup making it easier).

All these systems can delete old data very quickly because they just delete entire compressed partition files.
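
To make the partition point concrete, here's a toy sketch in Python. It isn't how any of the systems above are actually implemented; `PartitionedStore` and its methods are made-up names, and the "partition files" are just in-memory lists keyed by day. The idea is only that deleting old data means dropping whole partitions rather than scanning rows.

```python
from collections import defaultdict
from datetime import date, datetime


class PartitionedStore:
    """Toy store that groups rows into one partition per calendar day.

    Hypothetical illustration only: real systems keep each partition as
    compressed files on disk, not as Python lists.
    """

    def __init__(self):
        # day -> list of (timestamp, value) rows
        self.partitions = defaultdict(list)

    def insert(self, ts: datetime, value: float) -> None:
        # The row's day decides which partition it lands in.
        self.partitions[ts.date()].append((ts, value))

    def drop_before(self, cutoff: date) -> int:
        """Drop every partition older than cutoff; return how many were dropped."""
        old_days = [day for day in self.partitions if day < cutoff]
        for day in old_days:
            # One delete per partition, no matter how many rows it holds.
            del self.partitions[day]
        return len(old_days)

    def row_count(self) -> int:
        return sum(len(rows) for rows in self.partitions.values())
```

The cost of `drop_before` depends on the number of partitions, not the number of rows, which is why retention-style deletes are cheap in this layout.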


Fwiw, more recent versions of Druid have a no-rollup mode that does ingestion row-for-row. It ended up being useful for cases where you _do_ care about every row, maybe because you want to retrieve individual rows or maybe because you don't want to define your rollups at ingestion time. And in that mode, Druid behaves like the other DBs you mention.

(I am a Druid committer.)
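
For anyone unfamiliar with the distinction, here's a rough Python sketch of what rolling up at ingestion time means versus keeping rows as-is. This is not Druid's ingestion code or API; `rollup_hourly` and the `(timestamp, key, value)` row shape are assumptions made up for the example.

```python
from collections import defaultdict
from datetime import datetime


def rollup_hourly(rows):
    """Aggregate (timestamp, key, value) rows into one row per (hour, key).

    Returns a dict mapping (hour, key) -> (count, sum). The individual
    input rows can no longer be recovered from the result.
    """
    totals = defaultdict(lambda: [0, 0.0])  # (hour, key) -> [count, sum]
    for ts, key, value in rows:
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        bucket = totals[(hour, key)]
        bucket[0] += 1
        bucket[1] += value
    return {k: (count, total) for k, (count, total) in totals.items()}
```

Storing the output of `rollup_hourly` is the rollup case: fewer rows, but the granularity is fixed up front. Storing the input list unchanged is the row-for-row case, where you can still fetch individual rows or choose a different aggregation later.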


Some of those we’ve looked at before and decided not to go with because of unknown observability, high operational requirements, or cost. But yeah, no real problems with data models or queries.

I think Druid comes closest to an ideal system for the requirements I’ve had to deal with, but I haven’t used it yet.

Thanks, by the way! This helps a lot.



