> ... then used this hash as not only the stored key but also to figure out the shard the key should go into. This does introduce a chance of key collision, that’s something we plan to deal with later.
If a Get and a Set can produce a collision and it is not dealt with, then it is unsafe for production. Imagine requesting user foo's data and getting user bar's data back instead. That could leak PII, and it would be a bug that goes unnoticed. Generally, when collisions are possible, you drop down to a list and accept the hit to your runtime to ensure accuracy. I need a correct cache over a fast cache.
Unless I misunderstood, this is not production ready. Again, amazing work; it is just not done yet.
Ristretto does that already. It evicts as many keys as needed to stay within the MaxCost. This is done outside the critical path by background goroutines. However, you also don't want to evict too many keys, because that would affect the hit ratios.
That was confusing to me in the article as well. However, they also wrote:
> If however, a key is already present in the cache, Set would update the key immediately. This is to avoid a cached key holding a stale value.
So it sounds like inserts into the cache might not apply immediately, but overwrites of existing keys are guaranteed to invalidate stale data in the cache.
Rough numbers but maybe 75% of those keys existed last month and will still exist next month and probably next year, too, and 25% will exist for minutes to hours and likely never be seen again.
Collisions are uncommon, but a regular enough occurrence (a dozen or two a year).
One way to check for collisions would be to either store the full key as well or hash it with a second algorithm (which would further decrease the probability of a collision, but not eliminate it). Either would introduce some slowness, because every Get would need to perform the check.
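For illustration, here is a minimal sketch of the first option (hypothetical, not how Ristretto stores entries): keep the full key next to the value and compare it on every Get, so a collision degrades into a miss instead of returning the wrong entry.

```go
// Hypothetical sketch: guard against hash collisions by storing the
// original key next to the value and verifying it on every Get.
package main

import (
	"fmt"
	"hash/fnv"
)

type entry struct {
	key   string // full original key, kept for verification
	value string
}

type collisionSafeCache struct {
	items map[uint64]entry // indexed by the 64-bit hash of the key
}

func hashKey(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

func (c *collisionSafeCache) Set(key, value string) {
	c.items[hashKey(key)] = entry{key: key, value: value}
}

func (c *collisionSafeCache) Get(key string) (string, bool) {
	e, ok := c.items[hashKey(key)]
	if !ok || e.key != key { // hash matched but keys differ: collision, treat as a miss
		return "", false
	}
	return e.value, true
}

func main() {
	c := &collisionSafeCache{items: map[uint64]entry{}}
	c.Set("user:foo", "foo's data")
	v, ok := c.Get("user:foo")
	fmt.Println(v, ok)
}
```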
...kind of wondering: they hit at least three "well, Go doesn't do ..." rabbit holes (lock-less data structures, thread-local storage, and channels not being high-perf), so I wonder if Go was really the right implementation language for their project (IIRC they've also had articles go by about writing best-in-class Go versions of RocksDB, etc.).
But, they do seem very capable of rolling their own infra, and I suppose someone has to do it to bootstrap ecosystems in each new/upcoming language.
You're right about our rabbit holes. Our initial "why" for this project was essentially: "Caffeine is written in Java with excellent performance. Go should be (and usually is) competitive with Java. Why is there no Go equivalent?"
With such a high bar we were disappointed in naive, idiomatic Go solutions, but we knew there had to be some way to close the gaps.
I look forward to continuing to do so. We have a few things on the roadmap such as adaptivity (see Ben Manes' work for more info) and managing our own memory.
I joke that Go is kind of like the Wild West. It comes with its own benefits and unique challenges.
Not an expectation/criticism/etc. of Ristretto, but out of curiosity, does anybody know of work about making the cache aware of the expected cost of a cache miss for each key?
To spell out what I mean: if I'm sending a query result to the cache, I could also pass along how long the query took to run. Then, in theory, the cache could try to drop keys with low (expected number of fetches * expected cost per miss), not just low (expected number of fetches). Of course that's rough/imperfect, and for some caching scenarios the miss cost doesn't vary by key.
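Concretely, the score I'm imagining looks something like the sketch below; this is just an illustration of the idea, with made-up numbers, not something any particular cache implements this way.

```go
// Hypothetical sketch of a cost-aware eviction policy: instead of
// evicting the least-frequently-used key, evict the key with the
// lowest (estimated fetches) * (expected cost of a miss).
package main

import (
	"fmt"
	"time"
)

type meta struct {
	estFetches float64       // estimated future accesses (e.g. from a frequency sketch)
	missCost   time.Duration // observed cost to recompute/refetch the value on a miss
}

// pickVictim returns the key whose eviction loses the least expected work.
func pickVictim(candidates map[string]meta) string {
	var victim string
	lowest := -1.0
	for key, m := range candidates {
		score := m.estFetches * float64(m.missCost)
		if lowest < 0 || score < lowest {
			lowest, victim = score, key
		}
	}
	return victim
}

func main() {
	candidates := map[string]meta{
		"cheap-and-cold":     {estFetches: 2, missCost: 2 * time.Millisecond},
		"expensive-but-cold": {estFetches: 2, missCost: 800 * time.Millisecond},
		"cheap-but-hot":      {estFetches: 500, missCost: 2 * time.Millisecond},
	}
	fmt.Println("evict:", pickVictim(candidates)) // "cheap-and-cold"
}
```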
I guess mostly the question is if it's enough of a help to justify the metadata+complexity, which I guess requires a trace of gets/sets from something that has miss costs that vary.
The intuition is just that now that we're pretty good at optimizing hit rate, maybe expanding the problem to "minimize total miss cost" could be an interesting direction for the use cases where there's a difference.
Makes me wish I could justify adding instrumentation to our 'memoize' wrappers at work, although our caching doesn't really match the use case for something like Ristretto or Caffeine. (Little enough stuff and the latency of a networked cache is fine for us, so "just set aside lots of RAM on some cache servers" works well enough.)
I really enjoyed this write-up as well! But I couldn't tell from the post: does it have a distributed mode like groupcache? (I think the answer is no). I think nearly all decisions in groupcache are because of the word "group" :).
Similarly, it would be great if there were a "safe mode" instead of the (IIUC) "start dropping Set requests when overloaded" behavior. You'd naturally want applications to be prepared for the cache to get blown away, so you should be prepared for it to be lossy anyway, but I'd certainly want a knob for that :).
For the part where an overloaded cache drops Set requests, the most frequent Set requests are the ones most likely to be admitted in between the dropped ones, so I guess this is also sort of taken care of.
I have no affiliation with the project, but after reading and asking myself the same question as you, I reasoned my way to what I wrote above.
> does it have a distributed mode like groupcache?
Not at the moment, no. Though it could, theoretically, be used within groupcache. We'll look into that as it has been brought up a few times now.
> it would be great if there's a "safe mode" instead of the (IIUC) "start dropping Set requests when overloaded".
We could definitely look into adding a config flag for that. I'd love to know: in what situations would lossless Sets be important to you? We already have guaranteed updates, so the only Sets that are dropped are new items.
Of course, one way to ensure every key goes in is to have a cache of infinite MaxCost. But, that’s an orthogonal idea, I guess.
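For context, callers can already see when a write isn't accepted: Set returns a bool, and even a true only means the item made it into the internal buffers. A rough usage sketch (config values are made up) of treating the cache as best-effort:

```go
package main

import (
	"fmt"
	"log"

	"github.com/dgraph-io/ristretto"
)

func main() {
	cache, err := ristretto.NewCache(&ristretto.Config{
		NumCounters: 1e7,     // number of keys to track frequency of
		MaxCost:     1 << 30, // maximum cost of the cache (~1GB)
		BufferItems: 64,      // number of keys per Get buffer
	})
	if err != nil {
		log.Fatal(err)
	}

	// Set returns false when the write was dropped (e.g. buffers full),
	// and even true only means it was accepted for admission, so the
	// application must always be prepared for a later miss.
	if ok := cache.Set("user:42", "some value", 1); !ok {
		fmt.Println("set dropped, continuing without caching")
	}

	if v, found := cache.Get("user:42"); found {
		fmt.Println("hit:", v)
	} else {
		fmt.Println("miss: fetch from the source of truth instead")
	}
}
```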
There is no "distributed mode" as far as I can see. It is an in memory library for a K/V cache that you use in your code. That's it.
When setting up the cache, can values be arbitrary-length, binary-safe strings? Can this be used to store indexed blob data of large size? Not saying this is a good idea, btw...
We spend a lot of time here in postmortems asking what the hell they were thinking. This is a great opportunity to ask pre-catastrophe.
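To the question above about binary values: keys and values go in as plain interface{}, so a []byte of any length works; you just supply the cost yourself. A rough sketch (sizes and config values made up):

```go
// Hypothetical sketch: caching an arbitrary binary blob, using its
// length in bytes as the cost so MaxCost roughly tracks memory use.
package main

import (
	"fmt"
	"log"

	"github.com/dgraph-io/ristretto"
)

func main() {
	cache, err := ristretto.NewCache(&ristretto.Config{
		NumCounters: 1e6,      // keys to track frequency of
		MaxCost:     64 << 20, // budget the cache at roughly 64MB of blob data
		BufferItems: 64,
	})
	if err != nil {
		log.Fatal(err)
	}

	blob := make([]byte, 1<<20) // an arbitrary 1MB binary value

	cache.Set("blob:123", blob, int64(len(blob)))

	// Sets are buffered, so a Get immediately afterwards may still miss.
	if v, ok := cache.Get("blob:123"); ok {
		fmt.Printf("got %d bytes back\n", len(v.([]byte)))
	} else {
		fmt.Println("not cached (yet)")
	}
}
```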
The iteration order of Go maps is randomized, but it's far from being evenly distributed (it could be good enough for the Ristretto use case though). Here is a test that shows that some keys are 6 times more likely to be selected on the first iteration https://play.golang.org/p/dT1CWuqoHEM.
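For anyone who doesn't want to click through, a similar quick experiment is sketched below (not the linked test verbatim); the exact skew depends on the map's size and internal bucket layout.

```go
// Observe the skew yourself: build a map, repeatedly take the first key
// the range loop yields, and count how often each key comes up first.
package main

import "fmt"

func main() {
	m := map[int]struct{}{}
	for i := 0; i < 16; i++ {
		m[i] = struct{}{}
	}

	firsts := map[int]int{}
	for trial := 0; trial < 100000; trial++ {
		for k := range m {
			firsts[k]++ // only record the first key of each iteration
			break
		}
	}
	for k, n := range firsts {
		fmt.Printf("key %2d was first %d times\n", k, n)
	}
}
```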
1. If the chosen sample keys have high Estimates, the incoming keys could be rejected, affecting the hit ratios. However, our hit ratio benchmarks didn't show such behavior (within 1% of exact LFU).
> With this approach, the hit ratios are within 1% of the exact LFU policies for a variety of workloads.
2. The chosen keys have low Estimates, in which case they'd be removed and hence won't suffer from repeated selection.
So, yeah. Doesn't affect Ristretto. But, good thing to keep in mind for other workloads.
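For the curious, here's a rough sketch of the sampling idea being discussed (not Ristretto's actual eviction code): pick a handful of candidates via map iteration and evict the one with the lowest Estimate.

```go
// Rough sketch of sampled eviction: grab a few keys via Go's randomized
// (but not uniform) map iteration and evict the least-used one.
package main

import "fmt"

type policy struct {
	costs     map[string]int64 // key -> cost
	estimates map[string]int64 // key -> estimated access frequency (e.g. from a count-min sketch)
}

// sampleVictim picks up to sampleSize keys from the map and returns the
// one with the lowest frequency estimate.
func (p *policy) sampleVictim(sampleSize int) string {
	victim, lowest := "", int64(-1)
	i := 0
	for key := range p.costs {
		if est := p.estimates[key]; lowest < 0 || est < lowest {
			victim, lowest = key, est
		}
		if i++; i >= sampleSize {
			break
		}
	}
	return victim
}

func main() {
	p := &policy{
		costs:     map[string]int64{"a": 1, "b": 1, "c": 1, "d": 1},
		estimates: map[string]int64{"a": 10, "b": 2, "c": 7, "d": 1},
	}
	// Which keys get sampled depends on iteration order, so the victim
	// is the least-used key among whichever three were sampled.
	fmt.Println("evict:", p.sampleVictim(3))
}
```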
I wonder how much the type casting will cost in real-world situations. I've not benchmarked that in general, but it's always something I worry about with ‘generic’ data structures like this.
I was worried about type casting as well and generally avoided bare interfaces like the plague. However, in all the benchmarks I've run for throughput, type casting was a small percentage of CPU time. I'll see if I can dig up a flamegraph.
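If anyone wants to measure the assertion in isolation, a micro-benchmark along these lines (hypothetical, not one of our actual benchmarks) gives a rough number; in the profiles I mentioned it was a small slice of total CPU time.

```go
// Hypothetical micro-benchmark (save as a *_test.go file and run
// `go test -bench=.`): measures asserting a concrete type out of an
// interface{}, which is what every Get from a generic cache does.
package cache

import "testing"

type payload struct{ a, b, c int64 }

var sink int64 // prevents the compiler from optimizing the loops away

func BenchmarkTypeAssertion(b *testing.B) {
	var v interface{} = &payload{a: 1, b: 2, c: 3}
	for i := 0; i < b.N; i++ {
		p := v.(*payload) // the assertion under test
		sink = p.a
	}
}

func BenchmarkDirectAccess(b *testing.B) {
	p := &payload{a: 1, b: 2, c: 3}
	for i := 0; i < b.N; i++ {
		sink = p.a
	}
}
```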
Of course, the only reason sync.Pool performs so well is its internal use of thread-local storage. If the Go team exposed that, we could make our own typed sync.Pool implementations... I won't hold my breath, though.
Alternatively, I think within the decade we are going to expect every service to contain its own fault-prevention logic, not just the web tier, at which point every service will have its own reverse caching proxy in front of it.
But neither of those is a requirement for microservices. I'll certainly try this out the next time I need a cache in my microservice.