I kind of hate this "tl;dr" culture. I can explain the reason why this is never ...

ataggart · on May 7, 2012

I'm not sure I buy your argument. It is certainly true that with MVCC "there is more than one legitimate answer the question of COUNT", but that's true for any query, not just COUNT. Given that we can maintain transactional correctness for the data, it's not at all clear to me why the same mechanisms cannot be used to maintain transactional correctness for the metadata.

fusiongyro · on May 7, 2012

Yes, but we're not surprised when those other queries have to read the table. For some reason it's surprising with COUNT. I maintain this is not a technical problem so much as an intuition problem.

Explain how you're going to maintain this metadata in more depth. If it can be done in an MVCC manner (i.e. without introducing locks) then it should be explored. How's it going to work?

seunosewa · on May 7, 2012

We can just maintain multiple versions of the row count.

fusiongyro · on May 7, 2012

It might be worth seeing what the cost of this would be in practice. I suspect if the answer were this easy the developers would have just done it by now, but you never know.

ralph · on May 7, 2012

I agree. Given count(star) is a common need you'd think there would be multiple count(star) results in the metadata for the table, each with a generation associated with it.

fusiongyro · on May 7, 2012

It's a short, easy query to write. I'm not convinced it's a big need.

So, your proposal is basically this: make a metadata table. On every insert or delete from the table, update the metadata table accordingly. MVCC will make sure anybody reading from it will get the right answer. Is that about right?

jeltz · on May 7, 2012

I do not see how this would be implemented without introducing extra locking which would harm concurrency, which would remove some of the performance advantage of having MVCC.

Now I see it could be worth taking that penalty on specified tables, jsut like you do for indexes. And some databases do support this in a more general form: materialized views.

fusiongyro · on May 7, 2012

I think you could fake this proposal with triggers. It might be worth seeing what the additional cost is.

scotty79 · on May 8, 2012

Proper solution (not fake with triggers) should be at most as costly as having partial index on the table.

scotty79 · on May 8, 2012

^ tl;dr no, it's still slow

sorry, I couldn't help myself :-)

I understand that it's semi-hard problem given MVCC but I could keep track of counts myself for each set of conditions and each table that I need. It's as simple as keeping count as record in some table, incrementing it in on insert, decrementing in on delete and adjusting on update if row begins or stops to satisfy conditions I want to count by. The only grudge I have with PostrgreSQL is that when confronted with this problem PostgreSQL fans respond with "We've got MVCC so it is supposed to be slow. Stop thinking it's supposed to be fast." not with "Hmm... Maybe we should introduce new facility similar to indexes where you could define what you want to count and the db will count by this conditions and provide results for you fast."

fdr · on May 8, 2012

> The only grudge I have with PostrgreSQL is that when confronted with this problem PostgreSQL fans respond with "We've got MVCC so it is supposed to be slow. Stop thinking it's supposed to be fast." not with "Hmm... Maybe we should introduce new facility similar to indexes where you could define what you want to count and the db will count by this conditions and provide results for you fast."

Plausible, but somewhat bothersome. It could be a special case of the materialized view problem, but it would take some convincing that to suggest that a special count-optimizing physical structure is worth the additional knob and maintenance burden it brings with it. Not impossible to convince, but it would require some tight argumentation with ample evidence.

In my quick assessment, the need for faster counts over lots of data is subsumed by materialized views, which unfortunately is a very tough feature to write, but has the advantage of generally considered being being worth it. A cut-down version of materialized views that just supports counting on various dimensions that adds more bulk to the planner as well as requiring execution maintenance and bug-fixing seems less-worth it. Somewhere in-between is a variant that supports more aggregate functions besides count(), particularly ones where one can define an inverse transition function (which in the case of count is "add one for every record").

There are many interesting features one could add -- and this is one of them -- but at the end of the day there is going to be an assessment by the long-time community members who have the privilege of fixing obscure bugs for years as to:

* Who is going to write it?

* How long will it take vs. other features?

* Is a feature is going to be maintained to a level of quality we find acceptable into the foreseeable future?

* Is the impact worthwhile?

* Could this in any way be done as an extension?

The level of the bar of quality has changed over time, too, pretty much inexorably moving up. Getting LISTEN/NOTIFY in now would be much harder than when it showed up in...Postgres95 (and, granted, it was pretty busted back then, apparently, along with...a lot of stuff). LISTEN/NOTIFY turned out to have a good impact/complexity ratio, so it gets good maintenance, but in a hypothetical world where it was not already committed with a good history of service and use I bet it would have required quite some convincing to add.

There is some old code in existence now that would not be accepted again today. A more sad example is dusty implementation of hash indexes that thankfully few people use or see reason to use. Hash indexes possess ample warnings in the manual to not use them. Yet, it doesn't seem reasonable to deprecate them for people that rely on them, and nobody seems to care enough to improve it, or even convincingly whine about improving it. The community is pretty sensitive to these code barnacles, so if one goes in wanting such a feature, one has to be well-prepared to argue both feasibility of implementation and size and duration of impact.

scotty79 · on May 8, 2012

People ask "why others are using MySQL instead proper database like PostgreSQL?"

Some of the people try and get burned because features that MySQL did without anyone noticing now take so much time that one wonders if he triggered some strange bug that caused db to do full table scan where one is unnecessary. Then he goes to the internet and look for clues and he sees response "oh, it's how it supposed to work, nobody is working on because it does not seem like a problem" over and over.

I think that feature might have an impact.

Humble question from person who never tried to build a database engine:

Do you know why db can't just keep track of number of entries in a full or partial primary key indexes and use them to give count fast ?

fdr · on May 8, 2012

> Some of the people try and get burned because features that MySQL did without anyone noticing now take so much time that one wonders if he triggered some strange bug that caused db to do full table scan where one is unnecessary. Then he goes to the internet and look for clues and he sees response "oh, it's how it supposed to work, nobody is working on because it does not seem like a problem" over and over.

Just about every non-trivial database implementation will have a pathology that others do not. In this case, InnoDB is was faster because it supported index-only scans, a feature that was very useful, but hard to implement in PG because of some details of its MVCC implementation -- now there is an implementation in 9.2 that gives it a shot at being in the same class of performance even on wide-row tables. However, both are still much slower than MyISAM: if you search for "why is count slow on innodb?" you'll get a lot of hits -- it suffers much the same defect vs. MyISAM (which has poor SMP concurrency, allowing it to count things as a luxury) that Postgres does.

Perhaps "not a problem" is a lazy answer for "it's hard to get exactly right, and nobody in the same class of implementation is bothering, and this has been discussed to death and is more work than you realize, so don't expect anyone to implement it soon unless you intend to argue it (unless you are also a long-term maintainer) and do it." As you can see...that was many more words, including giving background on two MySQL storage engines.

> Do you know why db can't just keep track of number of entries in a full or partial primary key indexes and use them to give count fast ?

It could. But you'd have to spec out a brand new top-level set of utility commands (like CREATE INDEX) to make this physical structure, it now is yet another knob that can drastically change performance characteristics, the planner gets to scan the projections and qualifiers for yet another little optimization, someone else gets to maintain it (unless you are maintainer), and then there's going to be the person that asks "why isn't is this snapshot-isolated version not even nearly as fast as caching the count in memcached?" (or, if one does it the other way, the reverse), and then finally when you get around to implementing a more general set of functionality you'll be stuck with this old form for basically eternity (depending on your release policy) much like hash indexes. Don't forget to add backup-and-restore support, manual pages, and EXPLAIN and/or other diagnostic output, if necessary.

On the plus side, it could end the endless howling about this problem.

Personally, I think we can get enough hooks in place that someone should be able to implement this secondary structure in a satisfying way without coupling it inextricably into the database, and that's a project I'd have enthusiasm for.

If someone else made this feature in another popular database that serves similar use cases and it got used a lot, then I think the understanding of the upsides would be such that such a feature is more likely to happen. Index-scans fall into that category, and a special structure not seen in other databases doesn't appear to at this time.

scotty79 · on May 12, 2012

Thank you for all the information.

> On the plus side, it could end the endless howling about this problem.

:-) I think that would be also immense relief for the howlers.