
Cache Concurrency Control - Braze
https://www.braze.com/perspectives/article/cache-concurrency-control
======
dormando
People learn and re-learn this at almost every company, which I find
fascinating. The redis nx bits aren't a bad way of dealing with it.

Since they mention memcached: I've been working on a protocol extension to
bake this exact thing in more directly. Though in braze's case, it's unclear
to me why they didn't use the method of add'ing a secondary key with a low
TTL, since that at least doesn't cross systems?
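That add-a-lock-key pattern is simple enough to sketch. Here a plain dict
stands in for memcached, and `cache_add` mimics memcached's add command
(which succeeds only if the key doesn't already exist); all names and TTLs
are illustrative:

```python
import time

cache = {}  # stands in for memcached: key -> (value, expiry timestamp)

def cache_add(key, value, ttl):
    """Mimic memcached's add: succeed only if the key is absent or expired."""
    entry = cache.get(key)
    if entry is not None and entry[1] > time.time():
        return False
    cache[key] = (value, time.time() + ttl)
    return True

def cache_get(key):
    entry = cache.get(key)
    if entry is None or entry[1] <= time.time():
        return None
    return entry[0]

def get_or_recache(key, recompute, ttl=180, lock_ttl=10):
    value = cache_get(key)
    if value is not None:
        return value
    # Miss: only the client that wins the short-lived lock key recomputes.
    if cache_add(key + ":lock", 1, lock_ttl):
        value = recompute()
        cache[key] = (value, time.time() + ttl)
        return value
    # Losers fall through (serve stale, retry, or recompute anyway).
    return None
```

The lock key's low TTL bounds how long a crashed winner can block recaching.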

With the new protocol, memcached hands out "win" tokens, which are very
loosely similar to leases. Rather than explicit lease tokens, a client is
notified whether it "won" or whether an object is "stale", etc., and the
existing CAS mechanisms are used for replacing objects.

IE: If you fetch an object and miss, it'll auto-create an object with a
specified TTL, and return a CAS value (a version number). Winner recaches,
other clients are told to retry or wait.

Closer to the braze use case, you can set a "TTL remaining threshold" with a
request. If you fetch an object which initially had a 180s TTL, but now has a
<90s one, you get a win token. Other clients get the existing value, but only
one client is allowed to recache.
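A loose in-process model of those semantics (the real protocol uses CAS
version values rather than the boolean flag here, and all names are made up):

```python
import time

class WinTokenCache:
    """Toy model: on a miss the first client 'wins' and should recache;
    once remaining TTL drops below a threshold, exactly one client wins
    early while everyone else keeps getting the existing value."""

    def __init__(self):
        self.store = {}  # key -> {"value", "expires", "win_claimed"}

    def get(self, key, ttl=180, threshold=90):
        now = time.time()
        entry = self.store.get(key)
        if entry is None or entry["expires"] <= now:
            # Miss: auto-create a placeholder; this client wins.
            self.store[key] = {"value": None, "expires": now + ttl,
                               "win_claimed": True}
            return None, True
        if entry["expires"] - now < threshold and not entry["win_claimed"]:
            entry["win_claimed"] = True  # first one past the threshold wins
            return entry["value"], True
        return entry["value"], False     # others get the existing value

    def set(self, key, value, ttl=180):
        self.store[key] = {"value": value, "expires": time.time() + ttl,
                           "win_claimed": False}
```

The winner recaches via `set`; non-winners never block on the recompute.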

There's a bit more to it, as the changes are trying to stay flexible for a
number of possible scenarios. It cuts out roundtrips and finally gives people
more modern cache semantics built in. Hoping to ship this soon, but I need to
track down client authors for feedback.

------
jwahba
For patterns like this I like to reach for request coalescing. Here's an
example of a package that does it, from golang's x/sync libraries:
[https://godoc.org/golang.org/x/sync/singleflight](https://godoc.org/golang.org/x/sync/singleflight)
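singleflight itself is Go; the same coalescing idea fits in a few lines of
Python with a lock and an event (a sketch of the concept, not the package's
API):

```python
import threading

class SingleFlight:
    """Coalesce concurrent calls for the same key into one execution,
    roughly in the spirit of Go's x/sync/singleflight."""

    def __init__(self):
        self._lock = threading.Lock()
        self._calls = {}  # key -> (done event, shared result holder)

    def do(self, key, fn):
        with self._lock:
            call = self._calls.get(key)
            if call is None:
                call = (threading.Event(), {})
                self._calls[key] = call
                leader = True
            else:
                leader = False
        event, result = call
        if leader:
            try:
                result["value"] = fn()  # only the leader runs fn
            finally:
                with self._lock:
                    del self._calls[key]
                event.set()
        else:
            event.wait()  # followers just wait for the leader's result
        return result["value"]
```

Every caller gets the same result; the expensive function runs once per
in-flight key.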

~~~
tyingq
I'm curious if assigning a random value from an acceptable range for the
expiry time is a common approach. I assume that wouldn't work for all cases,
but might spread the avalanche for some.

Google searches on this topic (random expiry to mitigate an avalanche cache
refresh) turn up very little.

~~~
NovaX
It is very common and often referred to as jitter.

A variation, called scaled ttl, was nicely discussed in this video.
[https://www.youtube.com/watch?v=kxMKnx__uso](https://www.youtube.com/watch?v=kxMKnx__uso)
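Plain jitter is essentially one line; a sketch (the 10% spread is an
arbitrary choice):

```python
import random

def jittered_ttl(base_ttl, spread=0.1):
    """Return base_ttl +/- up to `spread` fraction, so keys written
    together don't all expire together."""
    return base_ttl * (1 + random.uniform(-spread, spread))
```

E.g. `jittered_ttl(300)` yields something between 270 and 330 seconds.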

