
Random notes on improving the Redis LRU algorithm - dwaxe
http://antirez.com/news/109
======
jonstewart
Anyone familiar with caching algorithms will groan when they read the
description of the "sampling" LRU. First, LRU isn't that great--it is a _lot_
better than nothing, and having a suitably large cache tends to cover up
eviction mistakes, but LRU is still susceptible to a number of problems. The
pseudo-LRU redis used to have is appropriate for very high-speed caches (it's
essentially a 5-way associative cache), but it is far less appropriate when
going out to disk, where the access latency is horrible.
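To make the sampling idea concrete, here is a minimal Python sketch of sampled ("pseudo") LRU eviction: pick a handful of random keys and evict the least-recently-used key among just that sample. Plain dicts stand in for the key space and per-key access clocks; none of these names come from Redis itself.

```python
import random

def evict_one(store, access_times, sample_size=5):
    """Approximate LRU: sample a few random keys and evict the
    least-recently-used key among the sample only.

    `store` and `access_times` are plain dicts standing in for a
    key space and per-key access clocks (illustrative, not Redis's
    actual internals). With sample_size >= len(store) this degrades
    to exact LRU; with a small sample it is only an approximation.
    """
    sample = random.sample(list(store), min(sample_size, len(store)))
    victim = min(sample, key=lambda k: access_times[k])
    del store[victim], access_times[victim]
    return victim
```

With `sample_size=5` (the old Redis default for `maxmemory-samples`) each eviction is O(5) regardless of key count, which is the speed/quality trade-off being criticized above.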

ARC, as mentioned below, incorporates LFU statistics, but it is
patent-encumbered. A number of other NoSQL systems implement either 2Queues
("Segmented LRU") or LIRS. My team recently implemented both of these for a
block-caching system and found LIRS to perform better for our purposes;
essentially LIRS collects a bit more information, so it can evict
low-future-probability data faster and thus retain more data that is likely to
be accessed again in the future. It only takes a few more hits to be worth it.
The downside to LIRS is that it's more complicated than 2Queues, and far less
well described in the literature.
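For readers who haven't met it, a minimal Python sketch of the 2Queues / segmented-LRU idea: new keys land in a probationary segment, and only a second hit promotes them to a protected segment, so one-shot scans cannot flush hot data. Segment sizes and names here are illustrative, not taken from any particular system.

```python
from collections import OrderedDict

class SegmentedLRU:
    """Minimal 2Q-style ("segmented LRU") cache sketch: new keys
    enter a probationary segment; a second hit promotes them to a
    protected segment, so one-shot scans cannot flush hot data."""

    def __init__(self, probation_size=4, protected_size=4):
        self.probation = OrderedDict()
        self.protected = OrderedDict()
        self.probation_size = probation_size
        self.protected_size = protected_size

    def get(self, key):
        if key in self.protected:
            self.protected.move_to_end(key)      # refresh recency
            return self.protected[key]
        if key in self.probation:
            value = self.probation.pop(key)      # promote on second hit
            self._insert_protected(key, value)
            return value
        return None

    def put(self, key, value):
        if key in self.protected:
            self.protected[key] = value
            self.protected.move_to_end(key)
            return
        self.probation[key] = value
        self.probation.move_to_end(key)
        if len(self.probation) > self.probation_size:
            self.probation.popitem(last=False)   # evict oldest probationary

    def _insert_protected(self, key, value):
        self.protected[key] = value
        if len(self.protected) > self.protected_size:
            # demote the oldest protected entry back to probation
            old_k, old_v = self.protected.popitem(last=False)
            self.put(old_k, old_v)
```

A key that was hit twice survives a flood of one-shot insertions, which is exactly the scan-resistance plain LRU lacks.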

See
[https://en.wikipedia.org/wiki/Cache_algorithms](https://en.wikipedia.org/wiki/Cache_algorithms)

The thing that's really necessary, though, is to collect metrics and to have
an instrumented version that will log all accesses. Given a log of all
accesses, you can compare whatever implementation you use to Bélády's
algorithm (i.e., perfection). Without instrumentation and testing, it is very
very difficult to know what the right strategy is.
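As a concrete illustration of the comparison being suggested, here is a short Python sketch of Bélády's algorithm over a recorded access log: on each miss with a full cache, evict the key whose next use lies farthest in the future, and count the hits to get the offline optimum that any real policy can be scored against. This is a sketch, not tuned for speed (eviction is O(capacity) per miss).

```python
def belady_hits(accesses, capacity):
    """Bélády's optimal (clairvoyant) replacement: on a miss with a
    full cache, evict the resident key whose next use lies farthest
    in the future. Returns the hit count, an upper bound to compare
    any online eviction policy against."""
    # Precompute, for each position, where that key is used next.
    next_use = [None] * len(accesses)
    last_seen = {}
    for i in range(len(accesses) - 1, -1, -1):
        key = accesses[i]
        next_use[i] = last_seen.get(key, float("inf"))
        last_seen[key] = i

    cache, upcoming, hits = set(), {}, 0
    for i, key in enumerate(accesses):
        if key in cache:
            hits += 1
        else:
            if len(cache) >= capacity:
                # evict the key re-used farthest in the future
                victim = max(cache, key=lambda k: upcoming[k])
                cache.remove(victim)
            cache.add(key)
        upcoming[key] = next_use[i]
    return hits
```

On the cyclic trace `a b c a b c` with capacity 2, plain LRU scores zero hits while the oracle scores two; gaps like that are what the instrumented log makes visible.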

Anyway, this helps confirm redis in my mind as a great amateur NoSQL system.

~~~
antirez
Hello jonstewart,

That LRU is not that great is the main argument of the blog post; this is why
Redis now contains an eviction policy with LFU elements. Pseudo-LRU in Redis
is a compromise between different goals: Redis has many use cases, so there
are tensions between different features. The pseudo-LRU/LFU, augmented with
the pool of visited objects, as you can see by reading the whole blog post,
provides quite good results without forcing Redis to bind the eviction policy
to the underlying data structure used to represent the key space, and
especially, without using additional memory.
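The "pool of visited objects" mentioned above can be sketched roughly like this in Python: candidates found in earlier sampling rounds are kept in a small pool sorted by idle time, so the eventual victim is the most idle key seen across rounds, not just within the current sample. The pool size and data structures here are my guesses for illustration, not Redis's actual code.

```python
import random

POOL_SIZE = 16  # illustrative; a small, fixed candidate pool

def evict_with_pool(store, idle_time, pool, sample_size=5):
    """Sampled eviction augmented with a candidate pool: good
    candidates found in earlier sampling rounds are remembered,
    so the victim is the most idle key seen so far rather than
    just the best of the current sample.

    `store` is a dict standing in for the key space, `idle_time`
    a callable returning a key's idle time, `pool` a list carried
    across calls. All illustrative, not Redis internals."""
    sample = random.sample(list(store), min(sample_size, len(store)))
    for key in sample:
        if key not in pool:
            pool.append(key)
    # keep only the most idle POOL_SIZE candidates still present
    pool[:] = sorted((k for k in pool if k in store),
                     key=idle_time, reverse=True)[:POOL_SIZE]
    victim = pool.pop(0)          # most idle candidate seen so far
    del store[victim]
    return victim
```

Because the pool persists between eviction rounds, a nearly-ideal victim that narrowly lost one round is still available in the next, which is where the quality improvement over plain sampling comes from.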

All we can do for now is to have 24 bits per object, which is different from a
24-bit overhead per object, as the 24 bits must be stored _in_ the object
itself.
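Those 24 bits can be split, as the blog post describes for the LFU mode, into a 16-bit decay clock plus an 8-bit logarithmic counter. A tiny Python sketch of that packing; the exact field layout is my reading of the post, so treat it as approximate:

```python
def pack_lfu(minutes, counter):
    """Pack a 16-bit timestamp (decay clock, in minutes) and an
    8-bit frequency counter into a single 24-bit field, in the
    spirit of the 24 bits Redis keeps inside each object header.
    Layout (high 16 bits = clock, low 8 bits = counter) is an
    assumption for illustration."""
    assert 0 <= minutes < (1 << 16) and 0 <= counter < (1 << 8)
    return (minutes << 8) | counter

def unpack_lfu(bits24):
    """Inverse of pack_lfu: split the 24-bit field back out."""
    return bits24 >> 8, bits24 & 0xFF
```

The point of the exercise is exactly the one made above: the field lives inside the object header that already exists, so the eviction metadata costs no extra allocation.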

With the new LFU policy I hope to provide a higher-quality eviction strategy
for Redis compared to the previous one; however, note that in practice the
sampling LRU plus the pool provided acceptable real-world performance in many
use cases.

What I think you don't get is that providing real-world software with many
features is not a matter of: let's read the Wikipedia page about caching
algorithms plus 100 papers, select the best, and implement it. System software
that works in many use cases and environments is an exercise in compromises,
not in "picking the best". Otherwise whoever was able to read the latest paper
on a topic and implement it would be the winner, which is not the case.

~~~
jonstewart
Hello antirez,

Indeed, I think the pseudo-LRU approach as you describe, especially when
combined with a pool, is a pragmatic caching solution for when you don't want
to make major changes to your codebase. You found 24 bits, you dedicated a
small data structure off to the side, and those two in combination are a
really good approximation of LRU, without a lot of code.

I'm obviously not as familiar with the guts of redis's implementation as you
are, so I don't know all the ways that keys can be stored and what the
implications for caching are. Generally a caching data structure can be used
that references the cached data/objects (pointers, basically), so it could be
external to how you otherwise store data. That may not be the right approach
with a low-level C/C++ block caching server, but that's not what redis is,
either. I think you would find that adding a separate data structure would
help you out, but that's just naïve advice; I don't know the guts of redis so
perhaps you can't store keys in a separate structure.

I also well understand that providing real-world software is an exercise in
compromises. However, I recently discovered how horribly an existing LRU
implementation performed in practice for my application (digital forensics)
and spent considerable resources exploring alternatives. What we really
found during all of this is that instrumentation and testing under a
diverse set of inputs are necessary in order to make an informed decision. For
some inputs, LRU was gangbusters. For others, it was lousy. I've also learned
that sometimes it's necessary to put pragmatism aside and go off and implement
the ideal approach; it can pay off.

Cheers,

Jon

~~~
eternalban
The "ideal approach" to caching, Jon, is an _oracle_. There is no such thing
as an "ideal cache eviction algorithm", just FYI.

~~~
jonstewart
Also known as Bélády's algorithm.

~~~
eternalban
This is my final note (and I am not downvoting you, fyi) but you said
"implement the ideal approach". Magic?

Let's see the code. I checked your github repo and can't find this
"implementation".

~~~
jonstewart
The last sentence I meant more generally than the domain of cache algorithms,
setting up "ideal" as a contrast to "pragmatic", since I'm taking flak for
being "academic" (haha, you should have seen my grades). One lesson I've
learned in my career is that there are instances where it pays to take some
time, see what the research says, experiment, and then refactor. It takes a
lot of time, but the best software is also long-lived so it can pay to take
the time to make one's software the best. antirez has 24 bits to play with and
doesn't want to make major changes, probably also doesn't have a lot of spare
time, and that's his choice.

But I am also aware of the oracle algorithm.

~~~
lossolo
You still don't understand what Redis is and what antirez wrote in his first
reply to you. Your "major changes" would have an impact on my use cases. Redis
is not only used for caching; I am not using Redis in production for caching,
and many other people also do not use Redis for caching.

------
rawnlq
I wonder if he looked at Adaptive Replacement Cache[1]? It's supposedly the
best of both worlds, least-recently-used and least-frequently-used, and the
only reason people don't use it more is IBM's patent.

[1]
[https://en.wikipedia.org/wiki/Adaptive_replacement_cache](https://en.wikipedia.org/wiki/Adaptive_replacement_cache)

~~~
tedunangst
With random sampling?

------
squeaky-clean
That logarithmic counter trick is awesome. It's obviously not something you'd
implement everywhere, but it seems really elegant here. I've never really
thought of treating memory that way; it blew my mind a little.
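The trick being admired is a probabilistic, Morris-style counter: the higher the current count, the lower the probability that another hit increments it, so 8 bits can represent millions of accesses. A Python sketch using constants matching Redis's documented defaults (LFU_INIT_VAL=5, lfu-log-factor=10); treat the formula as an approximation of the real one:

```python
import random

def lfu_log_incr(counter, log_factor=10):
    """Probabilistic logarithmic counter in the style of the Redis
    8-bit LFU counter: the more hits a key already has, the less
    likely another hit is to bump the counter. Constants mirror
    the documented Redis defaults (LFU_INIT_VAL=5, lfu-log-factor
    =10) as I understand them; an approximation, not the exact
    implementation."""
    if counter >= 255:
        return 255                       # saturate at 8 bits
    baseval = max(counter - 5, 0)        # subtract LFU_INIT_VAL
    p = 1.0 / (baseval * log_factor + 1)
    return counter + 1 if random.random() < p else counter
```

Below the initial value the counter always increments; past it, each step gets roughly `log_factor`-times rarer, which is how 8 bits end up spanning such a huge access range.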

------
onetwotree
Great post. Antirez delivers once again!

I think this is an excellent read for anyone who wants to gain some insight
into coming up with pragmatic, real-world solutions to problems (caching) that
are usually taught in a very formal, "use this bestest algorithm or you're a
dummy" kind of way.

Also the tricks used here are beyond clever.

------
coldcode
I love reading the antirez blog; his is the kind of great analysis you rarely
get to read.

