- It primarily compares against relatively weak, naive algorithms like LRU because (inexplicably) that is what many open source systems use. All of the high-performance closed source systems I've seen use better algorithms that are not represented here. Maybe an availability bias? Also, some great algorithms don't even have names AFAIK, just implementations, which makes them hard to reference.
- CPU cache efficiency of algorithm implementations is often highly relevant to real-world performance of cache replacement algorithms. This is frequently overlooked in the academic literature, which focuses on hit rate under various workloads. Many algorithms that produce good hit rates have poor throughput in practice due to CPU cache behavior; quite a bit of R&D at database companies focuses on minimizing CPU cache thrashing from these algorithms. A slight reduction in hit rate that substantially reduces CPU cache thrashing can be a big win. Cache throughput often comes from surprising places.
- Schedule-based replacement mechanics are ignored. Given a large number of concurrent in-flight operations against the cache, you can adaptively reschedule the order of operations to approximate Bélády's optimal algorithm. For many workloads you can get a step-function improvement in cache throughput from this, and it applies to the aforementioned CPU cache efficiency as well. Designing schedule-optimized cache replacement algorithms is not trivial; the obvious formulation is NP-hard, so the challenge is finding efficient approximations.
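On the CPU cache efficiency point above: a concrete illustration (my own, not from the paper) is why array-based policies like CLOCK often out-throughput linked-list LRU. The metadata lives in flat arrays that the hardware prefetcher handles well, and a hit only flips a bit instead of splicing a doubly linked list. A minimal sketch:

```python
# Sketch: CLOCK, an array-based approximation of LRU. Hits set a referenced
# bit in a contiguous array (cheap, prefetch-friendly); misses sweep a "hand"
# around the array looking for an unreferenced slot to evict.

class ClockCache:
    """Fixed-size cache using the CLOCK approximation of LRU."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.keys = [None] * capacity        # contiguous metadata arrays
        self.referenced = [False] * capacity
        self.index = {}                      # key -> slot
        self.hand = 0

    def access(self, key):
        """Returns True on a hit, False on a miss (after admitting the key)."""
        slot = self.index.get(key)
        if slot is not None:
            self.referenced[slot] = True     # hit: just set a bit, no list splice
            return True
        # Miss: sweep the hand, clearing referenced bits, until a victim is found.
        while self.referenced[self.hand]:
            self.referenced[self.hand] = False
            self.hand = (self.hand + 1) % self.capacity
        victim = self.keys[self.hand]
        if victim is not None:
            del self.index[victim]
        self.keys[self.hand] = key
        self.index[key] = self.hand
        self.referenced[self.hand] = False
        self.hand = (self.hand + 1) % self.capacity
        return False
```

CLOCK's hit rate is slightly worse than true LRU, which is exactly the trade mentioned above: give up a little hit rate to stop thrashing the CPU cache on every access.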
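For reference, the offline-optimal baseline mentioned above (Bélády's MIN: with the full future request sequence known, evict the entry whose next use is farthest away) can be sketched as a trace simulator:

```python
# Sketch of Belady's offline-optimal (MIN) policy -- the baseline that
# schedule-based approaches try to approximate online. Requires knowing the
# full future request sequence, so it is a bound, not a deployable policy.

def belady_hits(requests, capacity):
    """Simulate Belady's MIN on a known request trace; returns the hit count."""
    # Precompute, for each position, when the same key is next requested.
    INF = float('inf')
    next_use = [INF] * len(requests)
    last_seen = {}
    for i in range(len(requests) - 1, -1, -1):
        next_use[i] = last_seen.get(requests[i], INF)
        last_seen[requests[i]] = i

    cache = {}  # key -> position of its next use
    hits = 0
    for i, key in enumerate(requests):
        if key in cache:
            hits += 1
        elif len(cache) >= capacity:
            # Evict the resident key whose next use is farthest in the future.
            victim = max(cache, key=cache.get)
            del cache[victim]
        cache[key] = next_use[i]
    return hits
```

Running any online policy against `belady_hits` on the same trace shows how much headroom scheduling tricks have to work with.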
Generally though, I like the formulation of LHD. It would be interesting to see how it sits in the broader scope of real-world cache replacement algorithm implementations. This is one of those areas in database computer science where the closed source design theory is richer than what makes it into the academic literature, so it is more difficult to figure out what the state-of-the-art even looks like.
I wonder if Jeff Dean (and Google) have tried a neural network version of this. The core idea here seems similar to “learned index structures”.

In any case, kudos to the authors; I really like the intuition of this approach!
That is essentially the same observation this policy is based on, but with gradations instead of a binary decision.
I still need to give it a more thorough read, but I'm spotting other errors, e.g. FB's memcache (according to their paper) balances pages by tail age, not hit rate.
Seems interesting though.
The eviction does random sampling of N entries and is straightforward. But how does it compute the hit density online, per entry?
Edit: perhaps in a file system, the higher-level information could still be provided?
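My hedged reading of how the online estimate works: hit density isn't stored per entry at all. Objects are bucketed by age, each bucket accumulates hit and eviction counts, and an entry's hit density is derived from its current age's bucket (probability of a hit divided by expected remaining lifetime). A rough sketch, where the bucket granularity, sample size, and the simplified lifetime term are all my illustrative choices, not the paper's:

```python
import random

# Sketch of hit-density-based eviction with random sampling. Assumption: hit
# density is derived from coarse per-age statistics rather than tracked per
# entry. AGE_BUCKETS and SAMPLE_SIZE are illustrative constants, and using
# (age + 1) as the expected-lifetime term is a deliberate simplification.

AGE_BUCKETS = 64
SAMPLE_SIZE = 8

class HitDensitySketch:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # key -> timestamp of insertion or last hit
        self.clock = 0
        self.hit_counts = [1.0] * AGE_BUCKETS    # smoothed counters, per age
        self.evict_counts = [1.0] * AGE_BUCKETS

    def _bucket(self, age):
        return min(age, AGE_BUCKETS - 1)

    def _hit_density(self, age):
        # P(hit at this age) / crude expected-lifetime proxy.
        b = self._bucket(age)
        p_hit = self.hit_counts[b] / (self.hit_counts[b] + self.evict_counts[b])
        return p_hit / (age + 1)

    def access(self, key):
        """Returns True on a hit, False on a miss (after admitting the key)."""
        self.clock += 1
        if key in self.entries:
            age = self.clock - self.entries[key]
            self.hit_counts[self._bucket(age)] += 1
            self.entries[key] = self.clock  # age resets on a hit
            return True
        if len(self.entries) >= self.capacity:
            # Sample a few resident entries; evict the lowest hit density.
            sample = random.sample(list(self.entries),
                                   min(SAMPLE_SIZE, len(self.entries)))
            victim = min(sample,
                         key=lambda k: self._hit_density(self.clock - self.entries[k]))
            self.evict_counts[self._bucket(self.clock - self.entries[victim])] += 1
            del self.entries[victim]
        self.entries[key] = self.clock
        return False
```

The key property is that the per-access cost is O(1) bookkeeping plus a small sample at eviction time, with no global ranking structure to maintain.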