

Caching COUNT with PHP and Redis - calebrown
http://forrst.com/posts/Caching_COUNT_with_PHP_and_Redis_Forrst_http_fo-m0v

======
antirez
> if for some reason the memory cache diverges from the true count, we just
> wipe out the memory cache and the app takes care of regenerating it
> automatically

Warning, this is a myth: once you start relying on caching to serve your
traffic, flushing the cache will have the result of taking the site down. This
is why persistence is an absolute requirement of a serious caching server
IMHO. With Redis, when things may be in desync it's better to selectively
remove entries (with RANDOMKEY+DEL) at a rate that the system is able to
handle.
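A minimal sketch of that paced eviction idea, with a plain dict standing in for a Redis instance (in a real client such as redis-py, `randomkey()` and `delete()` map to the RANDOMKEY and DEL commands):

```python
import random
import time

# Plain dict standing in for Redis; random.choice over the keys plays the
# role of RANDOMKEY, and `del` plays the role of DEL.
cache = {f"count:post:{i}": i * 10 for i in range(100)}

def evict_randomly(cache, batch=10, pause=0.01):
    """Drop random entries in small paced batches instead of flushing
    everything at once, so the database only has to regenerate a few
    counts at a time."""
    while cache:
        for _ in range(min(batch, len(cache))):
            key = random.choice(list(cache))  # RANDOMKEY
            del cache[key]                    # DEL
        time.sleep(pause)  # pace evictions at a rate the system can handle

evict_randomly(cache)
```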

Also, given that you are using Redis, which has atomic increments, why not go
the extra mile and issue INCR/DECR operations when something is
added/removed?
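A rough sketch of that INCR/DECR approach, again with a dict standing in for Redis (in real Redis, INCR/DECR are atomic, so concurrent adds and removes cannot lose updates; `count_comments_in_db` is a hypothetical fallback query used on cache misses):

```python
# Dict standing in for a Redis client.
store = {}

def count_comments_in_db(post_id):
    # Hypothetical fallback, e.g. SELECT COUNT(*) FROM comments WHERE post_id = ?
    return 0

def comment_count(post_id):
    key = f"count:comments:{post_id}"
    if key not in store:                       # cache miss: regenerate once
        store[key] = count_comments_in_db(post_id)
    return store[key]

def on_comment_added(post_id):
    comment_count(post_id)                     # make sure the key exists
    store[f"count:comments:{post_id}"] += 1    # INCR in real Redis

def on_comment_deleted(post_id):
    comment_count(post_id)                     # make sure the key exists
    store[f"count:comments:{post_id}"] -= 1    # DECR in real Redis

on_comment_added(42)
on_comment_added(42)
on_comment_deleted(42)
```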

~~~
kylebragger
I should have been more specific -- technically the cache would be wiped for a
specific post, never for the entire db of posts. But you're totally right
about the persistence -- that's why something like Redis made sense to me.

Useful tip about randomkey and del, thank you.

I def. could switch it over to an incr/decr setup. Maybe I'll do that tonight.

------
sumeeta
Doesn’t this seem too basic to have been voted up so much? That is, compared
to the rest of the technical posts that are voted up.

And perhaps MemcacheDB (<http://memcachedb.org/>) is a better bet if you don’t
need Redis’s list and set operations. But I don’t understand why the cache
needs to be persistent.

------
seldo
Two things occur:

1. This is a better fit for memcache than redis, since you can get the counts
to expire automatically.

2. If you are fetching paginated results with a LIMIT (and you usually are)
then you can use the SQL_CALC_FOUND_ROWS prefix on your SELECT to get these
counts for "free" from MySQL without needing to do your own caching (see
<http://dev.mysql.com/doc/refman/5.0/en/select.html>).
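For reference, the pattern the parent describes looks like this (MySQL-specific; table and column names here are made up, and note that SQL_CALC_FOUND_ROWS was later deprecated in MySQL 8.0.17):

```sql
-- Page 3 of posts, 10 per page; SQL_CALC_FOUND_ROWS tells MySQL to
-- compute the total match count as if there were no LIMIT.
SELECT SQL_CALC_FOUND_ROWS id, title
FROM posts
ORDER BY created_at DESC
LIMIT 10 OFFSET 20;

-- Retrieve that total in the same connection.
SELECT FOUND_ROWS();
```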

------
carbocation
Assuming good indexing with non-enormous tables, he could use subqueries to
get the COUNT() data at the same time that he pulls the records themselves.
And, as he notes, he could have a count field in MySQL, caching at that level
instead of in Redis.
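That subquery approach might look like this (a sketch with made-up table names, run against SQLite to keep the example self-contained; the same SQL shape works in MySQL):

```python
import sqlite3

# Made-up schema for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER);
    INSERT INTO posts VALUES (1, 'first'), (2, 'second');
    INSERT INTO comments (post_id) VALUES (1), (1), (2);
""")

# Fetch each post and its comment count in a single round trip.
rows = conn.execute("""
    SELECT p.id, p.title,
           (SELECT COUNT(*) FROM comments c WHERE c.post_id = p.id) AS n_comments
    FROM posts p
    ORDER BY p.id
""").fetchall()
```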

------
arnorhs
Redis is great, but maybe memcached would serve better for this purpose, as
some of the comments have mentioned. I just want to point out one thing about
his multiple count queries per page.

You can also do the counts in a single grouped query, using something like:

SELECT post_id, COUNT(*) FROM comments GROUP BY post_id;

From that he gets an array of all the counts, so he can look those up easily.

And that result can also be cached as well, of course.
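Sketched with made-up table names against SQLite (the SQL is the same in MySQL), the grouped count becomes one query plus a constant-time dict lookup:

```python
import sqlite3

# Made-up schema for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER);
    INSERT INTO comments (post_id) VALUES (1), (1), (1), (2);
""")

# One grouped query yields every post's count at once; a dict makes each
# per-post lookup O(1) when rendering the page.
counts = dict(conn.execute(
    "SELECT post_id, COUNT(*) FROM comments GROUP BY post_id"
))
```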

------
ars
MySQL already caches these. You've added a cache on top of a cache. Is a round
trip to the database so slow?

~~~
kylebragger
I was under the (perhaps false) impression that COUNT(*) queries, with or
without WHERE conditions, are not cached on InnoDB tables.

~~~
ars
Not exactly. On MyISAM tables, COUNT(*) queries can sometimes look up the
answer directly from the index or the table statistics.

On innodb, because of transactions, this is not possible, and it needs to
actually count the rows.

BUT, the query itself is still cached - at least until the underlying table
gets updated, which invalidates the cache (I'm not sure of the exact
invalidation strategy, but it's part of MySQL and doesn't depend on the
engine).

~~~
chrisbolt
Updating the table invalidates all cached queries that refer to that table,
making the query cache useless for any site that gets a decent amount of
updates.

