A classic LRU cache has a map from key to value, with the entries of the map arranged in a doubly linked list. When a value is accessed, it is promoted to the front of the list. So the last item in the list is least-recently accessed.
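To make that concrete, here's a minimal sketch in Python. It leans on OrderedDict, which is itself a hash map threaded through a doubly linked list, rather than hand-rolling the nodes; the class name is illustrative:

    from collections import OrderedDict

    class LRUCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.entries = OrderedDict()  # front = most recent, back = LRU

        def get(self, key):
            if key not in self.entries:
                return None
            self.entries.move_to_end(key, last=False)  # promote to front
            return self.entries[key]

        def put(self, key, value):
            self.entries[key] = value
            self.entries.move_to_end(key, last=False)  # new items start in front
            if len(self.entries) > self.capacity:
                self.entries.popitem()  # evict from the back: least recent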
In an LFU cache, you might try the same idea, except each node also carries a counter. When an item is accessed, increment its counter, then swap it forward past items whose counters are now lower, to keep the list sorted. In this way an infrequently accessed item does not get promoted.
However, this has a problem for certain access patterns. If every item has the same counter value, then accessing the last item in the list forces it to migrate up the whole list: O(N). You could use a heap instead and get logarithmic time, but that's still not the constant time that LRU provides.
What's proposed here is to bucket items by counter. Each bucket has a linked list of entries, and each entry knows its bucket. The buckets themselves are arranged in a linked list. When an item in bucket N is accessed, it conceptually unhooks itself from bucket N and adds itself to bucket N+1, creating that bucket if necessary.
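A compact sketch of that design in Python, with every operation O(1); it assumes capacity >= 1. To keep it short, the hand-rolled lists are stood in for by dicts: each OrderedDict plays the role of one bucket's entry list, and min_count plays the role of a pointer to the first bucket. Names are illustrative:

    from collections import defaultdict, OrderedDict

    class LFUCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.values = {}                         # key -> value
            self.counts = {}                         # key -> access count
            self.buckets = defaultdict(OrderedDict)  # count -> keys, oldest first
            self.min_count = 0

        def _touch(self, key):
            # Unhook from bucket n, add to bucket n+1, creating it if necessary.
            n = self.counts[key]
            del self.buckets[n][key]
            if not self.buckets[n]:
                del self.buckets[n]
                if self.min_count == n:
                    self.min_count = n + 1
            self.counts[key] = n + 1
            self.buckets[n + 1][key] = None

        def get(self, key):
            if key not in self.values:
                return None
            self._touch(key)
            return self.values[key]

        def put(self, key, value):
            if key in self.values:
                self.values[key] = value
                self._touch(key)
                return
            if len(self.values) >= self.capacity:
                # Evict the oldest key in the lowest-count bucket (ties broken LRU).
                victim, _ = self.buckets[self.min_count].popitem(last=False)
                if not self.buckets[self.min_count]:
                    del self.buckets[self.min_count]
                del self.values[victim], self.counts[victim]
            self.values[key] = value
            self.counts[key] = 1
            self.buckets[1][key] = None
            self.min_count = 1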
Incidentally, a tip for writing this sort of cache: make your list circular, starting with a sentinel node (I call it the "mouth" since it eats its tail). This eliminates all the null checks in the linked list manipulation.
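A sketch of the sentinel trick: because the list is circular and the mouth node always exists, insert and unlink never have to test for None:

    class Link:
        __slots__ = ("prev", "next")
        def __init__(self):
            self.prev = self.next = self  # the mouth eats its own tail

    def insert_after(anchor, node):
        node.prev, node.next = anchor, anchor.next
        anchor.next.prev = node
        anchor.next = node

    def unlink(node):
        node.prev.next = node.next  # no null checks anywhere
        node.next.prev = node.prev
        node.prev = node.next = node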
You may know what eggs and wheat and milk and butter are but if you've never seen a pancake before then boy have I got some good news for you. We can talk about croissants later once the buzz wears off.
Not saying that whipping up this sort of thing makes for a good interview, just saying it's a cute little problem that has probably been solved a thousand times by a thousand different people.
Are there any blogposts about this?
For a hint, have a look at how a CPU implements its LRU caches (e.g. L1, L2). It's not rocket science.
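For what it's worth, hardware generally approximates LRU rather than tracking it exactly; a common scheme is tree pseudo-LRU, which covers a 4-way set with just 3 bits. A sketch of one set, with each bit pointing toward the half to evict next:

    class TreePLRU4:
        def __init__(self):
            self.b0 = self.b1 = self.b2 = 0  # root, left pair, right pair

        def touch(self, way):
            # Flip the bits on the path so they point away from the accessed way.
            if way < 2:
                self.b0, self.b1 = 1, 1 - way
            else:
                self.b0, self.b2 = 0, 3 - way

        def victim(self):
            # Follow the bits down to the pseudo-least-recently-used way.
            if self.b0 == 0:
                return self.b1
            return 2 + self.b2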
There are a few cool alternatives. The first and most obvious: use a good dense hash table implementation! For eviction strategies, a random eviction strategy actually does pretty well in the general case and requires no extra memory, which is great if you're caching an int32->int32 mapping and the cost of recomputing isn't astronomical. You can also try the CLOCK algorithm, which requires just 1 bit of storage per cache entry.
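A minimal sketch of CLOCK, assuming a fixed-size table (names illustrative). One reference bit per slot; on a miss, a hand sweeps the slots, clearing bits until it finds an unreferenced victim:

    class ClockCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.slots = {}                 # key -> slot index
            self.keys = [None] * capacity
            self.values = [None] * capacity
            self.ref = [0] * capacity       # the single bit per entry
            self.hand = 0

        def get(self, key):
            i = self.slots.get(key)
            if i is None:
                return None
            self.ref[i] = 1                 # second chance on the next sweep
            return self.values[i]

        def put(self, key, value):
            i = self.slots.get(key)
            if i is None:
                # Sweep: clear reference bits until an unreferenced slot appears.
                while self.ref[self.hand]:
                    self.ref[self.hand] = 0
                    self.hand = (self.hand + 1) % self.capacity
                i, self.hand = self.hand, (self.hand + 1) % self.capacity
                if self.keys[i] is not None:
                    del self.slots[self.keys[i]]
                self.keys[i] = key
                self.slots[key] = i
            self.values[i], self.ref[i] = value, 1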
From an abstract perspective, all page replacement algorithms can also be used as cache eviction policies. There's a whole list of these at https://en.wikipedia.org/wiki/Page_replacement_algorithm#Pag... .
Binary heaps are notorious for bad CPU-cache behaviour, but improved variants exist, so only benchmarks will convince me.
Knowing how some real-life high-performance LFU caches work, I think this is optimizing the wrong thing. Look at the design of Caffeine and Ristretto for some discussion of the tradeoffs, with actual measurements. I believe both are based on TinyLFU with modifications.
I don't know of any serious LFU implementations that actually use a min-heap and are still competitive, so the premise that LFU is commonly O(log N) is not actually true.
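For reference, the heart of TinyLFU is an approximate frequency counter (a count-min sketch) used as an admission filter: a candidate only displaces the would-be victim if its estimated frequency is higher. A bare-bones sketch of the counter, leaving out TinyLFU's periodic counter halving and doorkeeper filter:

    import random

    class CountMinSketch:
        def __init__(self, width=1024, depth=4):
            self.width = width
            self.seeds = [random.randrange(1 << 31) for _ in range(depth)]
            self.rows = [[0] * width for _ in range(depth)]

        def _indexes(self, key):
            return (hash((seed, key)) % self.width for seed in self.seeds)

        def add(self, key):
            for row, i in zip(self.rows, self._indexes(key)):
                row[i] += 1

        def estimate(self, key):
            # Collisions only inflate counts, so the minimum row is the best bound.
            return min(row[i] for row, i in zip(self.rows, self._indexes(key)))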
My implementation: https://pastebin.com/48mMTTdK
I am surprised to see that the paper was only published in 2010. I would expect something this simple to have been discovered earlier. I guess it's easy only after someone tells you constant time is possible.
Interestingly, there are access patterns that force LFU to perform worse than LRU. From the paper referenced above:
>A counterexample for LFU is a sequence of k + 1 accesses to each of N - 1 pages, followed by 2k alternating accesses to two new pages, k to each, with this pattern repeated indefinitely (so that all but the initial N - 1 pages have k accesses). [here N is the size of the cache]
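To make the pattern concrete, here's a small generator for that trace (n is the cache size; names illustrative). The n - 1 warm pages settle at count k + 1 while each fresh page never exceeds count k, so under LFU the two fresh pages keep evicting each other and miss on every single access, whereas LRU keeps both resident after their first touch:

    def lfu_pathological_trace(n, k):
        for page in range(n - 1):       # k+1 accesses to each of n-1 pages
            for _ in range(k + 1):
                yield page
        fresh = n - 1
        while True:                     # then, forever: 2k alternating accesses
            a, b = fresh, fresh + 1     # to two new pages, k to each
            fresh += 2
            for _ in range(k):
                yield a
                yield b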
The best of both worlds is to use LRFU.
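For context, LRFU (Lee et al., 2001) scores each block by a combined recency-and-frequency value and evicts the minimum; a parameter \lambda slides between the two policies:

    C(b) = \sum_i F(t_now - t_i),  with F(x) = (1/2)^{\lambda x}

\lambda -> 0 weighs all past references equally, approaching LFU; \lambda = 1 recovers LRU.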
Now an O(1) LRFU implementation would be nice to see.
Also, I believe the blog doesn't discuss the constant factor at all.
After a code search on GitHub: Gecko, Chromium, and Linux have zero LFUs despite having tons of LRUs.
And zero LRFU too.
So it seems that browser and OS engineers should learn better cache algorithms; it could allow huge performance gains!