
The State of Caching in Go - boyter
https://blog.dgraph.io/post/caching-in-go/
======
djpilot
Great article (or start of article)!

The GitHub stars obsession is too much, though. I read the request in the main
content area, then a nag banner popped up at the bottom of the page requesting
it again. How annoyingly desperate and uncool.

Enough already, I'm trying to read!

Or _was_, anyway...

Plus I'm not even logged in to GitHub at present, and I can't log in without
getting to another device for 2FA.

No offense, but now I'll likely never give that star.

~~~
mrjn
Arrr... Sorry about that. That was my idea, so my bad. We will get rid of the
pop-up banner.

~~~
justinclift
Please do. Same here... I read the initial request, but when the lower banner
thing popped up _also_ begging for a star, my perception of it was negative.

Clicking on the [x] in the corner of the pop-up (to get rid of it) then
launched a new window (well, a new tab in this case).

Didn't read the rest of the post.

~~~
paulftw
Do you mind sharing what browser/OS you were using? Clicking on the [x]
shouldn't do that.

(I'm the person who implemented that widget.)

~~~
bepvte
On Firefox on Android, the X opens GitHub.

~~~
paulftw
Thanks, will look into that.

------
mrjn
(co-author here) Thanks for sharing this post! We've elaborated on the various
issues we encountered trying to use a concurrent LRU cache in Dgraph, and our
dissatisfaction with existing choices, which fail to provide memory
management, high concurrency, and good hit ratios.

With guidance from Ben Manes, we're planning to work on a new concurrent Go
cache based on Caffeine (Java). If you're interested in helping (or already
have something similar), do reach out to us!

Meanwhile, check out Dgraph, our distributed graph database, which is what
keeps us busy: [https://github.com/dgraph-io/dgraph](https://github.com/dgraph-io/dgraph)

~~~
kasey_junk
What are you thinking about for the API signatures? Without beating a dead
horse, this is where Go's lack of generics really becomes burdensome.

If it were an on-disk cache, []byte would likely make sense, but given it's in
memory, I'm not sure. If you use interface{}, you'll need to measure that cost
as well.
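
For concreteness, the two shapes under discussion might look like this (a
sketch; the names are made up, not from any of the benchmarked libraries):

    // []byte-valued API: callers serialize values themselves; the
    // cache only stores and returns bytes.
    type ByteCache interface {
        Set(key string, value []byte)
        Get(key string) ([]byte, bool)
    }

    // interface{}-valued API: no serialization step, but boxing can
    // allocate, and callers must type-assert on the way out.
    type AnyCache interface {
        Set(key string, value interface{})
        Get(key string) (interface{}, bool)
    }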

~~~
amanmangal
Our current plan is to store []byte, given that Badger stores blobs of bytes.
Using interface{} may have some overhead, but we are not so worried about that
right now.

~~~
kasey_junk
So the []byte solution requires running through a serialization step, which
for most in-memory uses will be expensive. Did you choose that because in your
use case it eventually ends up serialized anyway?

~~~
amanmangal
Correct!
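
For illustration, the round trip a []byte API implies might look like this (a
sketch using encoding/json; the thread doesn't name an actual codec, and
loadUser is a hypothetical backing lookup):

    type User struct {
        ID   int
        Name string
    }

    func cachedUser(c ByteCache, id string) (User, error) {
        var u User
        if b, ok := c.Get(id); ok {
            err := json.Unmarshal(b, &u) // Get path: decode on every hit
            return u, err
        }
        u = loadUser(id)          // hypothetical backing lookup
        b, err := json.Marshal(u) // Set path: encode before caching
        if err != nil {
            return u, err
        }
        c.Set(id, b)
        return u, nil
    }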

------
valyala
Cache hit ratio increases with the size of the cache and decreases with the
number of "hot" items, regardless of the cache eviction policy (LRU, FIFO,
etc.). The hit ratio reaches 100% when the cache becomes large enough to hold
all the "hot" data. It would be great to see
[https://github.com/VictoriaMetrics/fastcache](https://github.com/VictoriaMetrics/fastcache)
in the benchmark results. It is faster and uses less memory compared to other
solutions for caching the same amount of data.

------
sethammons
I think it is a well put together analysis, but I feel I'm missing something.

> FreeCache and GroupCache reads are not lock-free and don’t scale after a
> point (20 concurrent accesses). (lower value is better on y axis)

If you look at the graphs, they level out and seem to work best with more
concurrent load, the exact opposite of what they say. Am I missing something?

Also, I think it would be valuable to see higher concurrent requests: with
what I'm used to, 60 concurrent requests would be low - I'm interested in
maybe 1-5k concurrent requests.

~~~
cpitman
I agree, those graphs were pretty confusing. As far as I can tell, they are
not showing the average time each operation takes (i.e., average elapsed wall
time); they must be taking the length of the test and dividing by the number
of operations. It would be much easier to parse if they showed
operations/second instead.

In other words, 9 concurrent pregnant mothers would show 1 month/baby on those
charts.

If I have that right, their comment makes more sense. After 20 concurrent
requests, the overall throughput of operations/second does not increase with
additional concurrency.

~~~
amanmangal
This is true that the latency of each operation won't change much as we
increase concurrent accesses, but the amortized latency would (be expected to)
reduce as the graphs show. When we did benchmarks, the Go benchmark framework
provides us with ns/ops which we directly used to plot the graphs. We could
calculate ops/sec too but that would just be another step of calculation using
throughput = 1/(time taken/ops), may be more clear to understand though. The
graphs would still flatten out after 20 concurrent accesses.
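
For reference, the conversion is one division (with a made-up number):

    nsPerOp := 125.0           // hypothetical value from `go test -bench`
    opsPerSec := 1e9 / nsPerOp // one second is 1e9 nanoseconds
    fmt.Printf("%.0f ns/op = %.0f ops/sec\n", nsPerOp, opsPerSec) // 8000000 ops/sec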

------
dstroot
The post talks about Go not having a thriving ecosystem of packages. My view
is that there are actually a lot of Go packages out there, but Go's
indifference to package management, and the fact that it is only now getting
an “official” package capability (RIP dep), has certainly contributed to the
slower growth of the ecosystem. NPM has had many, many growing pains, but I
sure do love its ease of use and package discoverability. I sincerely hope Go
catches up. I love the language and tooling.

~~~
cdoxsey
Go has lots of packages. They are easily discoverable on godoc.org or with
some googling.

There is a cultural skepticism about the overuse of packages. There are
occasions where the standard library copied a function instead of adding an
import, for example. Like many things in Go, this pushback is needed but
sometimes goes too far.

But maybe single-function packages in npm were a bad idea.

Anyway, caching is a mixed bag in my opinion. Local per-thread caching is
often better than a global cache, as it avoids contention. It's also trivial
to implement with a map and requires no special coordination.
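
A minimal sketch of that idea (expensiveLookup is a hypothetical stand-in for
whatever the cache sits in front of):

    // Each worker goroutine owns an unshared map, so no locking is
    // needed; a key is only ever touched by one goroutine.
    func worker(jobs <-chan string, results chan<- string) {
        cache := make(map[string]string)
        for key := range jobs {
            v, ok := cache[key]
            if !ok {
                v = expensiveLookup(key)
                cache[key] = v
            }
            results <- v
        }
    }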

It does require rethinking how you design a solution to a problem. FWIW, that
design process tends to lead you in a better direction anyway when building
distributed systems.

For example, one giant, randomly distributed Kafka topic with a global Redis
DB for a cache is probably a lot worse off than a system with more predictable
data locality on the consumers.

~~~
mrjn
(co-author here) The number of Go libraries tends to be just slightly below
the number of their users (dramatically speaking).

Libraries are hardened by repeated usage: many different users each bring
their own special use cases and improve a library until it reaches production
quality. Go has no platform where certain well-written libraries can be
recommended and given more exposure. As a result, libraries don't tend to
mature beyond the specific use case they were written for, assuming they are
still being maintained at all.

Arch Linux is a great example of giving well-performing packages more
exposure, by promoting a package from AUR to community to core/extra. That
way, more users gather around the package, improving it even further.

To the second point about per-thread caching: Go does not expose threads to
end users, so there's no concept of thread-local storage. What you're
describing results in lock striping, which has the contention issues described
in the post.
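
For readers unfamiliar with the term, here is a rough sketch of lock striping
(not from the post; the shard count and hash are arbitrary choices):

    import (
        "hash/fnv"
        "sync"
    )

    const numShards = 256

    // Each shard has its own lock; a key's shard is fixed by its hash.
    type shard struct {
        mu sync.Mutex
        m  map[string][]byte
    }

    type stripedCache [numShards]shard

    func (c *stripedCache) Get(key string) ([]byte, bool) {
        h := fnv.New32a()
        h.Write([]byte(key))
        s := &c[h.Sum32()%numShards]
        s.mu.Lock() // hot keys still serialize on their shard's lock
        defer s.mu.Unlock()
        v, ok := s.m[key]
        return v, ok
    }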

~~~
sagichmal
> Go has no platform where certain well-written libraries can be recommended
> and get more exposure.

Godoc.org is that platform; it serves the purpose. Others in the module
universe are under development.

> To the second point about per-thread caching, Go does not expose threads to
> end-users. So, there's no concept of thread-local. What you're describing
> results in lock striping, which has contention issues as described in the
> post.

In Go you'd have to orchestrate this locality yourself, i.e., a fixed worker
(goroutine) pool, each worker with its own cache. This is probably less work
than it sounds.
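
One way to wire that up, sketched under the same assumptions as the
per-goroutine map above (worker is that hypothetical function; fnv and the
channel size are arbitrary choices):

    // Hash each key to a fixed worker, so a given key always lands on
    // the same goroutine and therefore the same private cache.
    func startWorkers(n int, results chan<- string) []chan string {
        jobs := make([]chan string, n)
        for i := range jobs {
            jobs[i] = make(chan string, 64)
            go worker(jobs[i], results)
        }
        return jobs
    }

    func dispatch(jobs []chan string, key string) {
        h := fnv.New32a()
        h.Write([]byte(key))
        jobs[h.Sum32()%uint32(len(jobs))] <- key
    }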

Generally, Go does encourage you to author solutions to your specific problem,
rather than adapting a general-purpose library. This is almost as much a part
of the ethos of Go as implicitly-satisfied interfaces, or "share memory by
communicating", or any of the other proverbs. Maybe Go takes it too far. But
effectively no other language exists at this point on the spectrum, and I'm
happy that we have at least some representation over here; I think lots of
programmers live here, too, and appreciate the tradeoffs.

~~~
mrjn
> Godoc.org is that platform, it serves the purpose.

Godoc.org is great. But (beyond documentation) it is at best a search engine,
giving an equal platform to all libraries, however production-ready or broken
they might be. Unless I'm missing something, it does not intend to promote
certain libraries over others the way AUR -> community -> core works in Arch
(which is the model I think is missing in Go).

> In Go you'd have to orchestrate this locality yourself, i.e. a fixed worker
> (goroutine) pool each with its own cache.

Having many small caches within the same process would result in more misses
per key, which, if it results in disk accesses, would not be ideal and might
be worse than contention.

Moreover, being able to spin up goroutines as and when required, to branch a
big job off into smaller tasks, is the beauty and benefit of Go compared to
other languages like C++ or Java, where you must start a thread pool upfront
and ship tasks off to it.

~~~
Foxboron
>the same way as AUR -> community -> core works in Arch (which is the model I
think is missing in Go).

Arch packager here.

It doesn't _quite_ work like that. The distinction between community and extra
is largely based on who packages it; it used to be defined as above 5% usage.
The distinction between those and core is that core is considered essential to
the distribution. No package is going to go from AUR to core without replacing
some integral part of the system.

~~~
mrjn
I see. Then I guess there's a different model that could more accurately
represent the idea: giving certain well-written, actively maintained, popular
packages a special platform over others, to make it more enticing for the
wider community to adopt them.

The underlying issue is that a production-ready library doesn't just happen --
it needs one or a few initial authors and (along with a large group of users)
a group of dedicated power users who continue to maintain the library and
optimize it for their usage, in turn developing it enough to be considered
production ready.

------
northwindfoo
I'm not very familiar with Go, but one thing I don't understand is why the
cache needs to be shared. Why not just have one cache per thread?

~~~
jitl
That requires N times more memory, where N is the number of threads. 32x more
memory??

~~~
Twirrim
Given the Zipf distribution, I wonder if just a small LRU cache per thread
might not be a terrible thing?

Juggling caches on databases is a challenging thing, and there has been some
back-and-forth on best practices. MySQL for the longest time shipped with a
query cache. As of MySQL 5.7.20 it was deprecated, and has now been removed in
MySQL 8, largely because it was as likely to hurt you badly as help you,
particularly with correctly sized InnoDB buffering.

~~~
NovaX
Zipf is a little idealistic. It is perfect for a quick analysis, as the base
case that a cache should excel at. Unfortunately, LRU doesn't, because it can
be easily polluted (examples:
[https://github.com/ben-manes/caffeine/wiki/Efficiency](https://github.com/ben-manes/caffeine/wiki/Efficiency)).

A database will often scan many records, which would flush an LRU. Postgres
uses small LRU buffer caches in your per-thread model, backed by a larger
LRU-like cache. The buffer caches are easily flushed by scans, but they
protect the shared cache from this noise. That shared cache could probably
benefit from a smarter policy, and this is an ongoing topic.
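
For reference, the LRU being polluted here is the textbook map-plus-list
structure; a minimal, unsynchronized sketch:

    import "container/list"

    type entry struct{ key, val string }

    // A map for lookup, a list for recency order. One scan over
    // `capacity` cold keys evicts every hot entry -- the pollution
    // described above.
    type lru struct {
        capacity int
        ll       *list.List // front = most recently used
        m        map[string]*list.Element
    }

    func newLRU(capacity int) *lru {
        return &lru{capacity: capacity, ll: list.New(), m: make(map[string]*list.Element)}
    }

    func (c *lru) Get(key string) (string, bool) {
        if e, ok := c.m[key]; ok {
            c.ll.MoveToFront(e)
            return e.Value.(*entry).val, true
        }
        return "", false
    }

    func (c *lru) Add(key, val string) {
        if e, ok := c.m[key]; ok {
            e.Value.(*entry).val = val
            c.ll.MoveToFront(e)
            return
        }
        c.m[key] = c.ll.PushFront(&entry{key, val})
        if c.ll.Len() > c.capacity {
            back := c.ll.Back()
            c.ll.Remove(back)
            delete(c.m, back.Value.(*entry).key)
        }
    }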

------
djpilot
> LSB 9-16

What is this? Google thinks it's some sort of metal stamp.

~~~
mwkaufma
Least significant bits
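
Presumably bits 9 through 16, counting from the least significant bit -- i.e.
the hash's second-lowest byte. A guess at the extraction, with a made-up hash
value:

    var h uint64 = 0xCAFEBABE
    lsb9to16 := (h >> 8) & 0xFF // shift past bits 1-8, mask one byte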

~~~
djpilot
Makes sense now, thanks!

