
Caching in theory and practice - lowe
https://tech.dropbox.com/2012/10/caching-in-theory-and-practice/
======
jcampbell1
This was a disappointing read:

It is an interesting question -- Given actual dropbox usage patterns, is there
there an caching approach that is better than LRU for real Dropbox customers?

The answer I got was, hybrid LFU/LRU algorithms could work better, but they
are complicated and were never tested against real Dropbox access patterns.

~~~
eblume
Only so much an intern can do in a summer, I suppose.

~~~
dskang
He says that he was an intern back in '11. I'm guessing he's full-time now.
Not very clear.

~~~
dudey
He's a junior now :-)

------
pencilcode
is there any known algorithm that looks at the relationship between cached
items? Ie. File A is accessed, then B, then C, then D, then E, and so on. A
would a have stronger relationship to B, one step away, than to D, two steps
away. So if we later access file A, the algorithm would know there's a higher
probability that we need File B next, so it could check if file B is in the
cache and if not, prefetch it and save it in the cache.

~~~
eblume
Seems like a natural place for an application of a 1st-order markov model -
one state for every entity (file), record the load events from each previously
loaded file to the next file, and then your cache can predictively load a set
of files by computing the most likely transitions.

Going higher than 1-order might make it even smarter, but with the cost of
taking more memory (increasing the likelyhood of thrashing).

It actually seems like a very novel and interesting approach, although I don't
know if it's been done before.

(Quick googling reveals that there is a LOT of work done on this approach, but
I don't have the time right now to see if it's a good one or not.)

~~~
azth
Didn't Microsoft introduce something similar in Windows Vista, where it tries
to keep the system cache occupied in order to have a better response time for
the user?

------
dllthomas
In the case of viewing a slideshow of photographs on a webpage (and,
doubtless, some other things), the browser cache probably further reduces the
usefulness of LRU on the server side and reduces the chance of worst-case for
MRU.

------
aidenn0
I still like random replacement; it's really fast and works better than you
think.

