Like you, I think CREATE STATISTICS is huge. I work with a sharded PostgreSQL setup where we roll our own sharding based on customer data. This means that most of our tables have compound primary keys where the identifying account data is part of the key.

Just speculating offhand, but this sounds like a schema-defined column dependency, which is doubly troublesome since, like I said, it sits within our primary key index. I am super excited to see just how much CREATE STATISTICS will improve overall performance.

It's difficult to say whether extended statistics could help with your schema, particularly in PostgreSQL 10, where we only implemented two simple statistics types - functional dependencies and ndistinct (GROUP BY) coefficients.
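To make the functional-dependency case concrete, here is a rough sketch of the arithmetic, not the planner's actual code. The table layout and the names shard_id and account_id are made up for illustration. The dependency-aware formula P(a,b) = P(a) * [f + (1 - f) * P(b)] is the simplified form described for functional dependencies, with f the degree of the dependency, assumed to be 1.0 here.

```python
from fractions import Fraction

# Hypothetical data: 100 accounts spread over 10 shards, 100 rows per account.
# shard_id is derived from account_id, so account_id -> shard_id is a
# functional dependency (both column names are invented for this sketch).
rows = [(acct % 10, acct) for acct in range(100) for _ in range(100)]
total = len(rows)


def selectivity(pred):
    """Fraction of rows matching a single-column predicate."""
    return Fraction(sum(1 for r in rows if pred(r)), total)


sel_shard = selectivity(lambda r: r[0] == 3)   # shard_id = 3
sel_acct = selectivity(lambda r: r[1] == 13)   # account_id = 13

# Without extended statistics the clauses are treated as independent,
# so the selectivities are simply multiplied.
independent_estimate = total * sel_shard * sel_acct

# With a functional dependency of degree f (assumed perfect, f = 1):
# P(a, b) = P(a) * [f + (1 - f) * P(b)]
f = Fraction(1)
dependency_estimate = total * sel_acct * (f + (1 - f) * sel_shard)

# Ground truth for WHERE shard_id = 3 AND account_id = 13.
actual = sum(1 for r in rows if r[0] == 3 and r[1] == 13)

print(independent_estimate, dependency_estimate, actual)
```

In this toy data the independent estimate is ten times too low, while the dependency-aware one matches the real row count. Whether your compound keys behave like this depends on how strongly the account columns determine the rest of the key; creating a statistics object with the dependencies kind on those columns and comparing EXPLAIN ANALYZE estimates before and after would be the way to check.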

Maybe the changes in 9.6 that allow using foreign key constraints during estimation would help, though?

