
Beyond the LRU cache policy - japaget
https://github.com/ben-manes/caffeine/wiki/Efficiency
======
Terretta
Cat's finally out of the bag, at least a whisker or paw. For anyone thinking
about this, here is some more:

Advection has had a tiered version of a cache eviction algo, very similar to
this single-tier take, in place since the early 2000s. These graphs showing
near-optimal curves hold up in the real world.

When you have the one cache and window shown here, it's easy to see how to
extend to layers (capacity/cost/performance) of cache. It's harder to extend
this across independent nodes across a global footprint of federated hubs, but
that works too.

Crucial to get eviction and refill right when you're dealing with video-sized
files and a limited cache pool. Pushed the VDN to innovate here well beyond
what CDNs had or still have.

_// Advection was a private label VDN (under strong NDAs and no sales and
marketing) under the hood of brand name CDNs, with a VDN provisioning API.
Several of the largest premier certified streaming CDNs were Advection powered for
their streaming certification, and many of the largest events throughout the
last decade used Advection under the hood, even into this decade. The company
was bootstrapped, profitable, and growing into the start of this decade, but
HLS finally meant other CDNs could do video well enough that they didn't have
to have a parallel dedicated video infra; they could finally use their own edge,
taking the wind out of the sails. While I stepped away several years ago, AFAIK
Advection still offers boutique transactional video, able to track and control
every user's stream in real time, enabling at scale a ton of very slick
capabilities still behind some extraordinary VOD and live video business
models, the ones that aren't just naïvely ad supported. Advection also
innovated something we called "zero admin", now called devops. It had no
sysadmins or support people, only devs._

~~~
NovaX
Are you allowed to describe the eviction policy? It would be interesting to
know the differences. You can contact me privately (ben.manes at gmail). I'd
be interested in knowing:

- Did it use a sketch in a similar manner?
- Was it an LRU window to a classic LFU? Did it retain history?
- Was it robust in all workload patterns or optimized for VDN workloads?

I would expect a 4-Segment LRU (S4LRU) policy to work well in a VDN, while
being a simple O(1) policy. That's used by Facebook photos, among others, so
it's probably good for CDN content.
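
As a rough illustration of S4LRU as it is usually described (four equal LRU
segments; a miss enters the lowest segment, a hit promotes one segment up, and
overflow demotes the tail one segment down until it falls out of the bottom),
here is a minimal Python sketch. The class name, method names, and the
equal-split sizing are assumptions for the example, not anyone's production
code.

```python
from collections import OrderedDict


class S4LRU:
    """Segmented LRU with `levels` equal-sized segments (4 for S4LRU)."""

    def __init__(self, capacity, levels=4):
        if capacity < levels:
            raise ValueError("capacity must be at least the number of levels")
        self.levels = levels
        self.segment_size = capacity // levels
        # Each OrderedDict is one segment; the last item is the most recent.
        self.segments = [OrderedDict() for _ in range(levels)]

    def _find(self, key):
        # Scans a fixed number of segments, so this is O(1) per lookup.
        for level, segment in enumerate(self.segments):
            if key in segment:
                return level
        return None

    def _insert(self, level, key, value):
        self.segments[level][key] = value
        # Cascade overflow: the LRU item of a full segment is demoted to
        # the segment below; overflow from segment 0 leaves the cache.
        while level >= 0 and len(self.segments[level]) > self.segment_size:
            old_key, old_value = self.segments[level].popitem(last=False)
            level -= 1
            if level >= 0:
                self.segments[level][old_key] = old_value

    def get(self, key):
        level = self._find(key)
        if level is None:
            return None
        value = self.segments[level].pop(key)
        # A hit promotes the item one segment up (capped at the top).
        self._insert(min(level + 1, self.levels - 1), key, value)
        return value

    def put(self, key, value):
        level = self._find(key)
        if level is None:
            self._insert(0, key, value)  # misses enter the lowest segment
        else:
            self.segments[level].pop(key)
            self._insert(min(level + 1, self.levels - 1), key, value)
```

With `S4LRU(capacity=8)`, an item that is read once after insertion moves to
the second segment and survives a burst of one-off inserts that churn only the
lowest segment.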

~~~
Terretta
There are a couple other key diffs not yet out of the bag, so I can't go into
more, but I will say S4LRU is not at all optimal for very large video edge.

Most of the research work remains focused on HTML or images. The problem was
very different for video, given the number of objects per cache and the
nontrivial time to replace one.

~~~
NovaX
Any chance you have pointers for public trace files? It's interesting to see
the behaviors under different workload patterns.

Adding a cost model (GDSF, et al.) is specialized enough that I haven't spent
time looking into that domain. From your hints, that sounds like the unexplored
area that your work successfully optimized.
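
For context on what a cost model adds, here is a minimal Python sketch of
GDSF (Greedy-Dual-Size-Frequency) as commonly described: each entry gets
priority L + frequency * cost / size, the lowest priority is evicted, and L is
raised to the evicted priority so that older entries age out. The class name,
the dict layout, and the linear scan for the victim are simplifications
assumed for this example; it says nothing about what Advection actually did.

```python
class GDSFCache:
    """Size- and cost-aware cache using the GDSF priority formula."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.clock = 0.0   # the inflation value "L" in the formula
        self.entries = {}  # key -> {"priority", "freq", "size", "cost"}

    def _priority(self, freq, size, cost):
        # Small, frequently used, expensive-to-refetch objects score highest.
        return self.clock + freq * cost / size

    def access(self, key, size, cost=1.0):
        """Record a request for `key`; return True on a hit, False on a miss."""
        entry = self.entries.get(key)
        if entry is not None:
            entry["freq"] += 1
            entry["priority"] = self._priority(
                entry["freq"], entry["size"], entry["cost"]
            )
            return True
        if size > self.capacity:
            return False  # object can never fit, so it is not cached
        while self.used + size > self.capacity:
            # Linear scan for brevity; a heap would make this O(log n).
            victim = min(self.entries, key=lambda k: self.entries[k]["priority"])
            self.clock = self.entries[victim]["priority"]
            self.used -= self.entries[victim]["size"]
            del self.entries[victim]
        self.entries[key] = {
            "priority": self._priority(1, size, cost),
            "freq": 1,
            "size": size,
            "cost": cost,
        }
        self.used += size
        return False
```

Setting `cost` to the time needed to refill an object is one way to make the
policy reluctant to evict items that are slow to replace, which is the kind of
concern raised above for video-sized files.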

