Cache Modeling and Optimization Using Miniature Simulations [pdf] (usenix.org)
30 points by erwan 81 days ago | 9 comments



Sharing this paper because effective workload simulation is a necessary step toward the Holy Grail: workload-aware, self-tuning caches. In plain English, that means a cache that can adjust its own parameters so that it is most effective for a given workload. Ideally you want all of that without human supervision or pre-trained models, and in its fully accomplished form the cache can even provision additional resources when doing so pays off. Note that effectiveness here is traditionally measured in terms of hit:miss ratio, but other metrics exist too.

That's part of the broader idea of "intelligent systems" proactively adjusting themselves (resources, internal structure, overall policies) to the workload they are fed, in order to optimize performance or cost-effectiveness.

The most basic caches have static eviction policies (e.g. LRU/LFU), there has been some work on adaptive eviction policies (e.g. ARC), and more recently on a clever combination of adaptive eviction and admission policies (e.g. TinyLFU). The overarching goal of this line of research is to find a "silver bullet" cache that performs optimally (or close to it) regardless of read/write distributions by switching between different admission and eviction policies.
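
To make the miniature-simulation angle concrete, here is a toy sketch (unit-size objects, tiny made-up trace, all names illustrative): feed each access to two small shadow caches, one LRU and one LFU, and adopt whichever policy shows the better hit ratio. Real systems sample the key space instead of replaying every request through the shadows, but the feedback loop is the same idea.

    /* Sketch: pick an eviction policy by running tiny "shadow" simulations.
       Assumes unit-size objects and integer keys; all names are illustrative. */
    #include <stdio.h>

    #define SHADOW_CAP 8   /* miniature cache size */

    typedef struct {
        int keys[SHADOW_CAP];
        long meta[SHADOW_CAP];   /* LRU: last-use tick; LFU: use count */
        int n;
        long hits, lookups;
    } Shadow;

    /* Evicts the entry with the smallest meta value
       (oldest tick for LRU, lowest count for LFU). */
    static void shadow_access(Shadow *s, int key, long tick, int lru) {
        s->lookups++;
        for (int i = 0; i < s->n; i++) {
            if (s->keys[i] == key) {            /* hit */
                s->hits++;
                s->meta[i] = lru ? tick : s->meta[i] + 1;
                return;
            }
        }
        if (s->n < SHADOW_CAP) {                /* miss, room left */
            s->keys[s->n] = key;
            s->meta[s->n++] = lru ? tick : 1;
            return;
        }
        int victim = 0;                          /* miss, evict smallest meta */
        for (int i = 1; i < s->n; i++)
            if (s->meta[i] < s->meta[victim]) victim = i;
        s->keys[victim] = key;
        s->meta[victim] = lru ? tick : 1;
    }

    int main(void) {
        int trace[] = {1,2,3,1,1,2,4,5,6,7,8,9,1,2,1,2,3,3,3,1};
        Shadow lru = {0}, lfu = {0};
        for (long t = 0; t < (long)(sizeof trace / sizeof *trace); t++) {
            shadow_access(&lru, trace[t], t, 1);
            shadow_access(&lfu, trace[t], t, 0);
        }
        double r_lru = (double)lru.hits / lru.lookups;
        double r_lfu = (double)lfu.hits / lfu.lookups;
        printf("shadow LRU hit ratio %.2f, shadow LFU hit ratio %.2f -> use %s\n",
               r_lru, r_lfu, r_lru >= r_lfu ? "LRU" : "LFU");
        return 0;
    }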


Along your line of thinking, we have a paper scheduled for Middleware (Dec. 2018) that explores how to make W-TinyLFU adaptive. I can share the paper privately until the conference (ben.manes @ gmail). It shows how to use simple, low-overhead schemes to improve the policy across workloads that exhibit various recency/frequency biases.


Are there not significant diminishing returns to such techniques? More sophisticated eviction strategies need to be proportionally more effective to compensate for their increased overhead. Or do you see the simulations only as an offline tool for researching caching strategies?
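
To put rough numbers on the overhead question (the constants below are made-up assumptions, not measurements): if a miss costs on the order of 100 µs of backend latency and a heavier policy adds a couple hundred nanoseconds of bookkeeping per access, the extra hit ratio needed just to break even is tiny.

    /* Back-of-envelope: how much extra hit ratio must a heavier policy buy?
       All constants are illustrative assumptions. */
    #include <stdio.h>

    int main(void) {
        double miss_penalty_ns   = 100000.0;  /* assumed backend miss cost: 100 us */
        double extra_overhead_ns = 200.0;     /* assumed added bookkeeping per access */
        /* Every access pays the overhead; every converted miss saves the penalty,
           so break-even requires delta_hit_ratio * miss_penalty >= overhead. */
        double breakeven = extra_overhead_ns / miss_penalty_ns;
        printf("break-even hit-ratio improvement: %.2f%%\n", breakeven * 100.0);
        return 0;
    }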


It turns out that most caching algorithms we use in practice (CDNs, memcached, storage, etc.) are still pretty far from the optimal hit/miss ratio.

See, e.g., recent work that exposes that gap (I'm an author)

https://www.cs.cmu.edu/~dberger1/pdf/2018PracticalBound_SIGM...

https://github.com/dasebe/optimalwebcaching
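
For unit-size objects the textbook offline baseline is Belady's MIN: evict the cached item whose next use lies farthest in the future. Here is a brute-force toy sketch of that baseline (not the algorithm from the linked paper, which targets the harder variable-object-size case):

    /* Belady's MIN on a small trace: on a miss with a full cache, evict the
       key whose next use is farthest in the future (or never). Unit-size
       objects, O(n^2) brute force -- a baseline sketch only. */
    #include <stdio.h>

    #define CAP 3

    static int next_use(const int *trace, int len, int pos, int key) {
        for (int i = pos + 1; i < len; i++)
            if (trace[i] == key) return i;
        return 1 << 30;                      /* never used again */
    }

    int main(void) {
        int trace[] = {1,2,3,4,1,2,5,1,2,3,4,5};
        int len = (int)(sizeof trace / sizeof *trace);
        int cache[CAP], n = 0, hits = 0;

        for (int t = 0; t < len; t++) {
            int key = trace[t], found = 0;
            for (int i = 0; i < n; i++)
                if (cache[i] == key) { found = 1; break; }
            if (found) { hits++; continue; }
            if (n < CAP) { cache[n++] = key; continue; }
            int victim = 0, worst = -1;       /* evict farthest next use */
            for (int i = 0; i < n; i++) {
                int nu = next_use(trace, len, t, cache[i]);
                if (nu > worst) { worst = nu; victim = i; }
            }
            cache[victim] = key;
        }
        printf("OPT (Belady) hit ratio: %.2f\n", (double)hits / len);
        return 0;
    }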


Another pointer to a paper that seems related: https://www.usenix.org/conference/nsdi18/presentation/beckma...

This is a very hot topic at the moment. It seems to me that several communities (databases, CDNs, storage, CPU architecture, theory) do a lot of redundant work and don't communicate much with each other.


Yes, that paper (LHD, from a colleague of mine) does use online stochastic modeling of the workload to make better eviction decisions, so it definitely fits the bill.
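
For anyone picturing it: LHD evicts the object estimated to have the lowest hit density (roughly, expected hits per byte of space occupied over its lifetime), and it ranks a small random sample of cached objects instead of maintaining a global priority order. Below is a much-simplified sketch of just that sampling step, with the real estimator replaced by a crude placeholder.

    /* Sampled eviction in the spirit of LHD: rank a few random candidates by an
       estimated hit density and evict the worst. The estimator below is a crude
       placeholder, not LHD's actual age-distribution model. */
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        int    key;
        size_t size;        /* object size in bytes */
        long   hits;        /* hits observed so far */
        long   age;         /* ticks since admission */
    } Entry;

    /* Placeholder estimator: observed hits per byte per tick. LHD's real
       estimator models hit probability and expected lifetime from binned
       age distributions learned online. */
    static double est_hit_density(const Entry *e) {
        return (double)e->hits / ((double)e->size * (double)(e->age + 1));
    }

    /* Pick an eviction victim by sampling `samples` random entries. */
    static int pick_victim(const Entry *tbl, int n, int samples) {
        int victim = rand() % n;
        for (int s = 1; s < samples; s++) {
            int c = rand() % n;
            if (est_hit_density(&tbl[c]) < est_hit_density(&tbl[victim]))
                victim = c;
        }
        return victim;
    }

    int main(void) {
        Entry tbl[] = {
            {1, 4096, 10, 50}, {2, 1 << 20, 1, 200},
            {3, 512, 3, 10},   {4, 8192, 0, 300},
        };
        int n = (int)(sizeof tbl / sizeof *tbl);
        int v = pick_victim(tbl, n, 3);
        printf("evict key %d (estimated hit density %.3g)\n",
               tbl[v].key, est_hit_density(&tbl[v]));
        return 0;
    }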


Thank you for the post. This is an area I am currently exploring with regard to parallelization, and I have had a somewhat tough time finding papers on the topic.


My pleasure! Could you give a high-level description of the problem/angle you are approaching?


It's still somewhat preliminary, but it's for a grad-level course in algorithms, and the goal is to produce a paper that can be submitted to a conference of our choice. My background is in relational databases, so I have generally leaned towards optimization in that area. My initial explorations have been to use C's process forking and shared memory (shmem) to parallelize both merge sort and quicksort on my laptop, but I want to extend this to generate results on different types of hardware and for different types of data sets. I hypothesize that a better understanding of caching will facilitate a lot of this optimization, as many enterprise systems are pretty advanced in this area.
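
Roughly the kind of sketch I mean, boiled down to two processes and a toy array size:

    /* Sketch: parallel merge sort with fork() and shared memory via mmap.
       Two children each sort half of a shared array; the parent merges.
       The array size and two-way split are illustrative simplifications. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define N 16

    static int cmp_int(const void *a, const void *b) {
        return (*(const int *)a > *(const int *)b) - (*(const int *)a < *(const int *)b);
    }

    int main(void) {
        /* Anonymous shared mapping: visible to parent and forked children. */
        int *data = mmap(NULL, N * sizeof *data, PROT_READ | PROT_WRITE,
                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (data == MAP_FAILED) { perror("mmap"); return 1; }

        for (int i = 0; i < N; i++) data[i] = rand() % 100;

        int half = N / 2;
        for (int c = 0; c < 2; c++) {
            pid_t pid = fork();
            if (pid == 0) {                       /* child: sort its half in place */
                qsort(data + c * half, (c == 0) ? half : N - half,
                      sizeof *data, cmp_int);
                _exit(0);
            }
        }
        while (wait(NULL) > 0) ;                  /* parent waits for both children */

        int out[N], i = 0, j = half, k = 0;       /* merge the two sorted halves */
        while (i < half && j < N) out[k++] = (data[i] <= data[j]) ? data[i++] : data[j++];
        while (i < half) out[k++] = data[i++];
        while (j < N)    out[k++] = data[j++];

        for (k = 0; k < N; k++) printf("%d ", out[k]);
        printf("\n");
        munmap(data, N * sizeof *data);
        return 0;
    }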

Thanks again!




