
Ok, autosizing is missing in pg_partman. Retention works the same way (just dropping child tables). I've read the paper and things are clearer to me now.
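For reference, this is roughly how that retention looks in pg_partman (a sketch, assuming a parent table already managed by pg_partman; column names are from memory, check the docs):

    -- Retention is configured in the table's part_config row; run_maintenance()
    -- then drops child tables older than the retention window.
    UPDATE partman.part_config
       SET retention = '30 days',
           retention_keep_table = false   -- actually drop, don't just detach
     WHERE parent_table = 'public.metrics';

    SELECT partman.run_maintenance();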

So you distribute the child tables across several nodes of a server cluster.

Is network latency a problem? I guess one should colocate the servers in one location rather than spread them out?

How well does it work when nodes die?

Do you use query parallelization (available since 9.6 in vanilla) on a single node and across different nodes?




Yes, we distribute the child tables among the cluster. Our default distribution mode uses "sticky" partitioning, where a partition prefers to stay on the same node. This allows you to control data colocation via the partition key. Our clustered solution is not released yet, but we plan to handle node failures via regular postgres STONITH mechanisms. Once a node failure is detected, the system reconfigures which children to use.
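For a rough idea of how the partition key drives colocation (this is a sketch using the single-node hypertable API; since the clustered version isn't released yet, node placement itself isn't shown):

    -- Illustrative only: a hypertable partitioned on time plus a "space"
    -- dimension (device_id). Rows with the same device_id hash to the same
    -- space partition, which is what a sticky placement policy would keep
    -- together on one node.
    CREATE TABLE conditions (
        time        timestamptz      NOT NULL,
        device_id   text             NOT NULL,
        temperature double precision
    );
    SELECT create_hypertable('conditions', 'time', 'device_id', 4);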

Query parallelization works in single and multi-node cases.
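On a single node this is just vanilla PostgreSQL 9.6 parallel query, e.g. (settings and query purely illustrative):

    -- Plain PostgreSQL 9.6+ parallel query; nothing TimescaleDB-specific here.
    SET max_parallel_workers_per_gather = 4;
    EXPLAIN
    SELECT avg(temperature)
      FROM conditions
     WHERE time > now() - interval '1 day';
    -- A parallel plan shows a Gather node with parallel worker scans beneath it.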


And a clarification just in case this wasn't clear: We use constraint exclusion analysis to determine which of the "child tables" of the parent hypertable to query. So, when performing query parallelization, you aren't sending subqueries to chunks/nodes that will not match your SQL predicate (plus a bunch of other query optimizations we do with pushdown, etc.).
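Very roughly, that looks like this (table and predicate are just illustrative):

    -- The time predicate lets the planner exclude chunks whose constraints
    -- can't overlap the requested range, so only those child tables show up
    -- under the Append node in the plan.
    EXPLAIN
    SELECT * FROM conditions
     WHERE time >= '2017-03-01' AND time < '2017-03-02';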

This is also true at write time: You can insert a batch of rows in a single insert, and the query planner will split them into sub-batches, which it will then insert into the appropriate chunks. So you don't need to "pre-partition" your data before writing to the system (even in the forthcoming distributed version).
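For example (again just a sketch):

    -- One multi-row INSERT; each row is routed to whichever chunk covers its
    -- timestamp, so the client doesn't pre-partition anything.
    INSERT INTO conditions (time, device_id, temperature) VALUES
        ('2017-03-01 00:00+00', 'dev1', 21.4),
        ('2017-03-01 00:05+00', 'dev2', 19.8),
        ('2017-03-08 00:00+00', 'dev1', 22.1);  -- different week, different chunk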


I've pinged someone on the team who's a bit more familiar with our clustering, so hopefully they can give you a more detailed response soon.



