

Constant factors: constantly foiling you - leonidg
http://blog.ksplice.com/2010/05/constant-factors-constantly-foiling-you/

======
marcua
Depending on (1) the storage engine you're using and (2) what other queries
you run against this table, you could cluster the table on the 'time' column.
That would store the rows in the table in the same order as they appear in
the index, and make time-based range queries a lot faster.
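
A rough sketch of what that looks like in MySQL (the table and column names
here are invented, since the thread never shows the real schema):

    -- InnoDB clusters rows by primary key, so leading the primary key
    -- with `time` stores the rows on disk in time order:
    CREATE TABLE events (
        time DATETIME NOT NULL,
        id   BIGINT   NOT NULL AUTO_INCREMENT,
        data VARCHAR(255),
        PRIMARY KEY (time, id),  -- id breaks ties between equal timestamps
        KEY (id)                 -- InnoDB requires the auto-increment
                                 -- column to lead some index
    ) ENGINE=InnoDB;

    -- With MyISAM, ALTER TABLE ... ORDER BY does a one-time physical
    -- sort; later inserts are not kept in order:
    ALTER TABLE events ORDER BY time;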

~~~
leif
Shouldn't matter; this query shouldn't hit the data file at all. Clustering
would just make fragmentation worse. I'd like to see the OP verify that it's
only hitting the index file, though (see my other post).

------
kscaldef
I have to admit that I'm still a little puzzled here. Why does MySQL need to
retrieve the actual rows from disk at all? Since the query is just returning
count(*), can't it be answered entirely from the index? Does the index not
reflect deletions or something like that?
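
One quick way to check, assuming an `events`-style table with an index on
`time` (names invented):

    -- If the Extra column of the plan says "Using index", MySQL is
    -- answering the query from the index alone and never touches the
    -- data file:
    EXPLAIN SELECT COUNT(*) FROM events
    WHERE time BETWEEN '2010-05-01' AND '2010-05-04';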

~~~
leonidg
That's a good question, and one I'm not entirely sure about. My guess is that
the index itself is also scattered across the disk, since it gets added to at
the same time as the rows get added. So in the real query, where we are
reading actual data and not just counting the number of rows, there are
actually two seek costs --- seeking to find the correct part of the index to
read, and then seeking to find the row it points to.

~~~
leif
When you insert in sorted order into a b-tree (as you do with timestamps),
later inserts will be more likely to be all in one block, and also more likely
to be near the top of the tree.

Since b-trees guarantee that no root-to-leaf path is more than a constant
factor longer than any other, you are probably hitting the very edge of that
constant. And since disk seeks are the designated hitter of the performance
team, you're getting hit pretty hard.

So, effectively, even your index is growing fragmented the more you use it.

To verify that you're not hitting the data file, though, maybe you could
strace it and watch for reads on the index file versus the data file.
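
A way to check from inside MySQL too, without strace (same invented `events`
table as above):

    FLUSH STATUS;  -- reset this session's handler counters
    SELECT COUNT(*) FROM events
    WHERE time BETWEEN '2010-05-01' AND '2010-05-04';
    SHOW SESSION STATUS LIKE 'Handler_read%';
    -- A large Handler_read_next with Handler_read_rnd_next near zero
    -- means the rows were found through the index, not a table scan.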

~~~
kscaldef
Typically, one expects a b-tree lookup to require at most 1 disk seek (for
the leaf node), and ideally your indices are fully in memory. So, color me
skeptical that fragmentation of the index itself is to blame here.

~~~
leif
Inserting in sorted order is not a typical case, and I strongly doubt your "at
most 1 disk seek" claim, but my memory is fuzzy, so you may be correct.

You're definitely right that the cache should be hot though, so I am now also
a skeptical hue.

~~~
kscaldef
One of the primary virtues of b-trees is that they make efficient use of
disk. Even if the full data structure is too large to fit in memory, the
large branching factor means you can usually fit all the interior nodes in
memory: with, say, 100 children per node, a billion keys need only about five
levels, and everything above the leaves is a tiny fraction of the tree. So,
in a lookup, only the final access of the leaf node requires a disk access.
Of course, traversal may require additional seeks and reads.

------
edw519
Hmmm...let's think outside the schema for a moment.

OP is benchmarking two COUNT queries against a relational database, each
looking for the total number of rows in a 3-day period, right?
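
Presumably something like this (again, guessing at names):

    SELECT COUNT(*) FROM events
    WHERE time BETWEEN '2010-05-01' AND '2010-05-04';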

I'm going to make a guess that this is historical data and that "time" is
stamped and _never changes again_...

Why not just pre-answer the question by incrementing a field (keyed by date)
in another table every time you add a record? Then instead of COUNTing
millions of records in seconds, you can read 3 records in microseconds.
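
A minimal sketch of that idea (names invented; the increment can live in the
app, or in a trigger if you'd rather):

    -- One row per day:
    CREATE TABLE daily_counts (
        day DATE NOT NULL PRIMARY KEY,
        n   INT UNSIGNED NOT NULL
    );

    -- Run alongside every insert into the main table:
    INSERT INTO daily_counts (day, n) VALUES (CURRENT_DATE, 1)
    ON DUPLICATE KEY UPDATE n = n + 1;

    -- The 3-day report is now a 3-row read:
    SELECT SUM(n) FROM daily_counts
    WHERE day BETWEEN '2010-05-01' AND '2010-05-03';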

I know what you're thinking: this will "slow down" your transaction
processing time. That's true, but only by microseconds. I'd rather wait a few
microseconds several times per day than 61 seconds when I really need to know
something.

Just because a DBMS has lots of capabilities (SQL, indexing, triggers, stored
procedures) doesn't necessarily mean that's the best way to deploy something.
Sometimes a little thinking and some extra functionality in your app can
leapfrog mountains.

[Aside: I have implemented this exact solution many times, often reducing
historical reporting run times by 95% or more. No one ever noticed any latency
at transaction update time.]

~~~
leonidg
You are totally right: if all I wanted to do was figure out how many total
rows there were, and needed to do this often, it would make much more sense
to store that number in a separate table. Unfortunately, those particular
queries were just simple examples illustrating the problem --- the real
program actually needed to read and use the real data. :-)

