In postgres, distinct aggregates are a blind spot in the planner.
DISTINCT itself is handled well, e.g. "SELECT DISTINCT x, y ...".
But distinct aggregates are not really visible to the planner, so they are always implemented with a sort. That makes them a peculiar case for postgres.
One reason is that distinct aggregates can be more complicated to plan when you have a case like "SELECT COUNT(distinct x), COUNT(distinct y) ...", because you actually need two different distinct sets as input.
Some databases have solutions for that problem, but postgres does not.
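If you hit that in practice today, one workaround (just a sketch; the table t and columns x, y are made up) is to compute each distinct set in its own scalar subquery, so each one can be planned independently:

select
  (select count(distinct x) from t) as distinct_x,
  (select count(distinct y) from t) as distinct_y;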
This particular change, in its complete form, would be challenging in my opinion. Many other kinds of planner changes aren't so bad, but people (including me) tend to avoid them.
Executor changes are much less intimidating, and lots of people make them. In my opinion, the executor is the most approachable/easiest major component.
I would encourage people to not get intimidated though. For every project where I was determined enough, I succeeded. Tom Lane does especially amazing things, of course, but I think reasoning like "I'm not Tom Lane, therefore the planner is too hard" is the wrong lesson to take away.
Yeah, in theory the database is supposed to take care of things like this for you.
In practice, every time I've seen a slow DB query, I've always been able to get huge improvements on it, even though the DB in theory had all the knowledge necessary to realize that I don't care about the intermediate results.
FYI, in mssql at least, distinct and group by are entirely equivalent to the query planner. I'm not saying you're wrong though, since group by lets you explicitly pick which columns to go by, whereas distinct uses them all.
I don't understand what you mean by "group by lets you explicitly pick which columns to go by, whereas distinct uses them all". Can you give an example?
select userid, name from users group by userid, name
is equivalent to
select distinct userid, name from users
It's easy to add another column to the "distinct" based query and change the behavior accidentally. With the "group by" query, you'd get an error saying that the new column needs to be either aggregated or added to the group by.
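To make that concrete (the extra "city" column here is hypothetical):

select distinct userid, name, city from users
-- silently starts returning one row per (userid, name, city) combination

select userid, name, city from users group by userid, name
-- rejected outright: city has to be aggregated or added to the group by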
SELECT COUNT(DISTINCT ...) is a very different thing. It's counting distinct values of a particular column in a result set and totally not a square peg/round hole situation.
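For example (hypothetical "country" column), these answer different questions:

select count(distinct country) from users   -- one row: how many distinct countries
select distinct country from users          -- one row per distinct country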
I've never heard DISTINCT described as imperative.
It still leaves the planner room to determine how and when to achieve the DISTINCT property, perhaps even doing no work at all:
* It can do a sort+unique or a hash, depending on cardinality, availability of memory, and a few other factors.
* It can push down a distinct or pull up a distinct, resulting in a faster plan.
* If there is another part of the plan that already implies distinct, like a GROUP BY, then it doesn't need to do any work at all. (Or similarly, if you're doing a distinct on a primary key).
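As a sketch of that last point (hypothetical table, with userid as the primary key):

select distinct userid from users
-- every row is already unique, so the planner is free to drop the
-- sort/hash step entirely and just scan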
The declarative language promise was that it would be possible to perform such optimizations. Indeed, by-and-large, query planners do a pretty great job. I don't think it was ever a promise that query planners would be able to always find the fastest query for what you want.
I read the declarative language promise as "you will never have to know what's going on under the hood." This is never true in any non-trivial declarative language being used for non-trivial work. I have developed a very negative initial reaction to anything that claims to be "declarative".
Likewise. The more declarative a platform is, the more I'm trusting its abstractions to never leak, because when they do I'm going to have to learn every agonizing minute detail of how it works under the hood.
SQL is a tiny kernel of beautiful relational mathematics wrapped up in a 3-foot ball of ugly implementation hacks.
NoSQL is what happens when you throw out the beautiful kernel and just keep the hacks, so it's not really better I guess.
I don't think it is a matter of intelligence, just skillset. Query planners could go a long way with a few people with strong relational logic skills who could prove the logical equivalence of expressions.
There are some settings worth tweaking in postgres' config to improve the query planner. The one that made the biggest difference to us was "cpu_tuple_cost". Setting it to 0.2 (the default is 0.01) made it choose more optimal query plans more often.
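For anyone who wants to try it, the setting can be changed per session before baking it into postgresql.conf:

SET cpu_tuple_cost = 0.2;               -- current session only, handy for testing one query
ALTER SYSTEM SET cpu_tuple_cost = 0.2;  -- persist it, then reload:
SELECT pg_reload_conf();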
Huh, interesting! I just looked up the parameter - "the planner's estimate of the cost of processing each row during a query". So increasing this will tend to cause the query planner to forgo complete table scans in favor of indexed queries more often. Increasing the parameter by a factor of 20 seems like you're telling it "never do complete table scans unless it's completely unavoidable, always use an index".
I can see why this would make the query plans look better (in that index scans make more sense to humans) but are you sure you're actually getting better performance? If you are getting better performance, why is the default so wrong? Computers are really fast at sequential scans, less so at random-access stuff.
"Computers are really fast at sequential scans, less so at random-access stuff."
This is true if you're simply talking about raw IO throughput, but in terms of algorithmic searching the truth is almost exactly the opposite. For example, finding a record in a 100 million record table with a unique key using a sequential search would require scanning 50 million records on average. To access the same record when that key is indexed requires accessing/testing at most 27 or so entries in the index (2^27 is roughly 134 million) and then direct access of the exact record needed in the table. Indexes are used because lookup cost grows logarithmically with table size instead of linearly, even though it may be true that random access IO is much slower than sequential IO.
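The back-of-the-envelope numbers, checked directly (just the arithmetic, not a benchmark):

select 100000000 / 2;             -- 50000000 rows touched on average by a full scan
select ceil(log(2, 100000000));   -- 27 comparisons for a binary search over the keys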
People, I suspect, also have much higher throughput when scanning sequentially compared to random access of items. But imagine trying to find the single person in a copy of the Manhattan white pages (is there still such a thing?) with the name 'Horace Greeley'. Would you prefer to scan sequentially from the beginning (no use of an index) or make use of the fact that white page phone books are indexed by lastname, firstname?
Of course, because sequential scanning is so much faster than random access, scanning will always win on tiny tables. It will depend on the system and record size, but I'm guessing indexing will take over as the faster method as table size grows beyond 5k records or so, give or take an order of magnitude. Not sure of exact numbers but you get the idea. As the number of records goes up from there, the speed advantage of indexing over scanning keeps growing, roughly in proportion to table size.
The default is configured for magnetic disks, where random IO is brutally slow (on the order of 30-100 operations per second) and sequential reads are reasonably fast. But what if you're running it on a Micron P420m that can do 700,000 IOPS?
It's worth noting that SQL Server, presuming you've configured uniques and relationships correctly, actually runs exactly the same query plan for their first query and their last query.
A broken promise of declarative languages I guess.