

Optimizing Cache Performance on Posterous - jrnkntl
http://technology.posterous.com/planning-and-engineering-your-cache-for-maxim-0

======
anamax
> Most important is the cache’s miss rate: how frequently do we need to
> regenerate data? It is the miss rate that ultimately impacts site
> performance.

Not so fast. While the miss rate determines how much you need to regenerate,
the miss time can have a huge effect on the average access time, which is what
your users see.

If a is the target average access time and m is the miss time (both in units
of hit time), then a = h + (1-h)m, so the required hit rate is h = (m-a)/(m-1).
For largish m and reasonable a, the required hit rate can be quite high.
(Memcache can be very fast relative to query+render.)
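
To put numbers on that, here is a minimal Python sketch of the formula. It assumes the simple two-outcome model a = h + (1-h)m, with everything measured in units of hit time. The function name and the sample values are made up for illustration.

```python
def required_hit_rate(a: float, m: float) -> float:
    """Hit rate needed to reach a target average access time.

    a: target average access time, in units of hit time
    m: miss time, in units of hit time (must be > 1)
    Derived from a = h * 1 + (1 - h) * m, solved for h.
    """
    if m <= 1:
        raise ValueError("miss time must exceed hit time (m > 1)")
    if not 1 <= a <= m:
        raise ValueError("a must lie between 1 (all hits) and m (all misses)")
    return (m - a) / (m - 1)


if __name__ == "__main__":
    # Hypothetical numbers: a miss costs 100x a hit.
    for target in (2, 5, 10):
        rate = required_hit_rate(target, 100)
        print(f"a={target}: need hit rate {rate:.3f}")
```

With a miss costing 100 hit times, keeping the average at 2 hit times needs a hit rate of 98/99, or roughly 99%.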

Things are more complicated when people are involved, because people don't
task-switch while waiting; they wander off. For example, you might want to
cache so that frequently accessed things have a higher miss rate than rarely
accessed ones. Why? Because an occasional long access time on something you do
frequently can be seen as a glitch, while a long access time on something
you're doing for the first time (or rarely) looks like a broken system.

------
ojilles
Pretty cool stuff, especially the tool. I usually just graph the memcached
hit/miss ratio in (cacti|ganglia|zabbix). Isn't that a much easier way to
determine current and past cache performance than adding additional logging?

(Doesn't allow for simulation of course)

~~~
vincentchu
Definitely graph memcached hit/miss ratios -- they're a good way to diagnose
issues in your cache. However, I can think of a few instances where more
logging is helpful:

1. Your cache server is shared between several different classes of objects.
In this case, the overall memcache hit/miss ratios might mask issues occurring
in one class of cached values.

2. You want to figure out how your cache will improve with a given change in
cache size (i.e., simulate results ahead of time, as you mentioned).
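
Both cases can be sketched by replaying a logged access trace through a simulated cache. The Python below is only an illustration, not the tool from the article: it assumes plain LRU eviction, counts capacity in entries rather than bytes, and ignores expiry times. The log format of (object class, key) pairs and the name simulate_lru are hypothetical.

```python
from collections import OrderedDict, defaultdict


def simulate_lru(accesses, capacity):
    """Replay (object_class, key) accesses through an LRU cache holding
    at most `capacity` entries. Returns {object_class: (hits, misses)}."""
    cache = OrderedDict()
    stats = defaultdict(lambda: [0, 0])  # class -> [hits, misses]
    for obj_class, key in accesses:
        entry = (obj_class, key)
        if entry in cache:
            cache.move_to_end(entry)       # mark as most recently used
            stats[obj_class][0] += 1
        else:
            stats[obj_class][1] += 1
            cache[entry] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return {c: tuple(v) for c, v in stats.items()}


if __name__ == "__main__":
    # Made-up trace: a few hot "post" keys mixed with many "comment" keys.
    log = []
    for i in range(300):
        log.append(("post", i % 3))
        log.append(("comment", i % 40))

    for size in (4, 16, 64):
        for obj_class, (hits, misses) in sorted(simulate_lru(log, size).items()):
            rate = hits / (hits + misses)
            print(f"size={size:3d} {obj_class:8s} hit rate {rate:.2f}")
```

Reporting the rate per class is what exposes a class that the overall ratio would hide, and rerunning the same trace at several sizes gives the ahead-of-time estimate.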

