

This is a story of caching - terpua
http://code.google.com/p/memcached/wiki/TutorialCachingStory

======
pilif
... and then the customer called and asked why the graphs on the front page
were wrong even though they clearly just edited hugetable.

You explain to them that it'll just take a little while to update, but the
customer doesn't like that answer. The data needs to always be current.

Apparently, you need to flush parts of the cache as new data arrives.
Unfortunately, you can't, as memcache is a strict key/value store. So you
change how you name the cache keys and make them dependent on, say, the
max(timestamp) of your hugetable.
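A minimal sketch of that key-versioning trick, with a plain dict standing in for the memcache client (all names and the timestamp query here are hypothetical, not from the original story):

```python
# Stand-in for a memcache client: a plain dict (names are hypothetical).
cache = {}

def max_timestamp():
    # In reality: SELECT max(ts) FROM hugetable -- cheap if ts is indexed.
    return 1234567890

def expensive_report_query():
    # Placeholder for the heavy aggregation over hugetable.
    return {"rows": 42}

def front_page_report():
    # Bake the table's max(timestamp) into the key: as soon as a new row
    # arrives, every request computes a new key, the stale entry is never
    # read again, and it simply falls out of the cache via LRU/expiry.
    key = "report:%d" % max_timestamp()
    if key not in cache:
        cache[key] = expensive_report_query()
    return cache[key]
```

Nothing ever has to be deleted explicitly; stale entries just become unreachable, at the price of one extra max(timestamp) query per request.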

Load goes back up to 2 because every request still has to query the table for
max(timestamp).

But it's still nowhere near as bad as before.

Until the next phone call...

~~~
fliph
Or you could just update the cache when the data changes.

~~~
pilif
True, if it's possible.

Let's say that hugetable is some interface table filled by a different system
you have no control over. You could add a trigger on the database that shells
out to some script to clean the cache, but if that external tool adds rows
one by one, that's really expensive (aside from the fact that this is NOT what
triggers were invented for).

Or the data in hugetable depends on a lot of different components in your
application. Then it's really hard to always invalidate the cache correctly,
and there's bound to be some place where you'll forget.

In addition, invalidating the cache on write works counter to the pattern
described in the tutorial, which concentrates the caching around retrieval.
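The tension between the two patterns can be sketched like this (a dict stands in for memcache; the user table and every name are hypothetical): the read-through half keeps all caching logic in one place, while the invalidate-on-write half must be repeated at every write path in the application.

```python
cache = {}  # stand-in for a memcache client

# Toy "database" so the sketch runs standalone.
_db = {1: {"name": "alice"}}

def db_load_user(user_id):
    return dict(_db[user_id])

def db_save_user(user_id, name):
    _db[user_id]["name"] = name

def fetch_user(user_id):
    # Read-through: all the caching logic lives here, at the read path.
    key = "user:%d" % user_id
    if key not in cache:
        cache[key] = db_load_user(user_id)
    return cache[key]

def update_user(user_id, name):
    # Invalidate-on-write: every single write path in the whole
    # application must remember this pop() -- that's exactly the
    # "place where you'll forget" from the comment above.
    db_save_user(user_id, name)
    cache.pop("user:%d" % user_id, None)
```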

Don't get me wrong: I agree with the article. It's just never as easy as these
tutorials make it seem.

~~~
mattlanger
In what way was this a tutorial? Seemed more like a parable meant to drive
home the line about how "When they have questions, they ask the mailing list
or read the faq again."

Each problem poses different challenges, and it doesn't seem fair to attempt
to invalidate a given example by theorizing about hypothetical complexities
the original authors never alluded to.

~~~
pilif
You misunderstood the intent of my possible continuation of the original post.
I was not trying to invalidate it, but just noting what problems might (and
do) arise.

I totally agree with everything the article says. Caching in general, and
memcache in particular, is awesome, but as always it's a trade-off: what you
gain in performance, you pay for in complexity.

------
fliph
For some reason, I started reading the story with the assumption that it was a
"don't do it this way" tutorial, and I got very nervous towards the end. ("But
that's exactly how I use memcache!")

------
fizzfur
hehe, all software documentation should come in 3 forms: Reference, Tutorial
and Pop-up Book

~~~
steveklabnik
_why actually proposed (on a few separate occasions) that there should be more
computer books in the 80 page range. More like the Poignant Guide or Nobody
Knows Shoes than a slow, dead-tree version of Google.

------
sloak
The real story is in how they push untested code into production just to see
what happens. ;)

~~~
bigiain
Heh - 'cause none of _us_ have ever done that, right? <looks around nervously
for any colleagues reading>

------
adamtj
Programmer and Sysadmin were either very lucky, or not working on anything
important, or else they would have been fired or gone out of business. You
can't just add caching and magically expect things to work. You have to think
hard about expiration policies and test to make sure you aren't going to get
wrong answers, or else you need to prove that wrong answers are ok.

------
mikeklaas
"All programming is an exercise in caching."

-Terje Mathisen

