Groupcache: an alternative to memcached, written in Go (github.com/golang)
341 points by sferik on July 29, 2013 | 97 comments



For those that didn't catch it, this is from Brad Fitzpatrick, the same guy who made memcached.


ASAIK, he wrote the original memcached in Perl, but what everyone is using now is a rewrite of memcached in C by Anatoly Vorobey [1]. Not that it diminishes his efforts or anything, just pointing it out.

[1] http://en.wikipedia.org/wiki/Memcached#History


And then I rewrote it in C++ for App Engine and other users within Google (Blogger, Plus, hundreds of others).

I also then ported the C++ memcached (called memcacheg) to Go, for profiling Go vs C++, and named that one memcachego. So I've written it about 4 times.


"I've written it 4 times, and I've gotten exceedingly good at it" :)

Kidding aside, is there anything you learned during the rewrites that caused noticeable changes to the design, or is it still the same thing, just transliterated?


So, to get this straight, the popular port, which people install with a "yum install memcached" (or from source), is still the C version, right?

The others sound like they are mostly used by Google.


How did the Go version compare to the C/C++ one?


> I also then ported the C++ memcached (called memcacheg) to Go, for profiling Go vs C++

How were the results of the profiling? Anything interesting come out?


Well, at least he invented it :)


Off-topic: What exactly does ASAIK mean? All slang definitions online point to "as soon as I know," which frankly makes no sense.


I meant AFAIK.


Next in line, memcached in JS? Or will anything that can be written in Go be written in Go?


And if stuff doesn't get written in Go, then people will criticise Go for not having anything in production. The internet is not going to run out of space through people writing too much software.


groupcache has a couple of differences from memcached that are probably part of why it exists:

- thundering herds suck, so it manages how the cache is filled when a key is missing

- repeatedly retrieving super-hot values will eventually exhaust the network pipe of whichever server holds them, so groupcache replicates hot values to more than one cache server

Since this was built for dl.google.com, the 'value' might be big (say, a chunk of the Chrome binary) and the request rates for the top cache keys probably get pretty high. So, even if memcached is doing good work for lots of folks, the distinct features of groupcache probably address real needs in the app.

And, yeah, groupcache and the dl.google.com rewrite are Go partly because Brad likes Go, but both projects had reasons to exist quite apart from deploying Go for Go's sake.
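If it helps to make that concrete, here's roughly what wiring up a group looks like (a minimal sketch against the current package; the getter signature has changed over time -- early versions took an opaque groupcache.Context instead of context.Context -- and loadChunk is a made-up stand-in for the slow path):

    package main

    import (
        "context"
        "fmt"
        "log"

        "github.com/golang/groupcache"
    )

    // loadChunk is a hypothetical stand-in for the expensive backend
    // (disk, a blob store, ...). groupcache only sees it via the getter.
    func loadChunk(key string) ([]byte, error) {
        return []byte("chunk data for " + key), nil
    }

    func main() {
        // One group = one namespace, one byte budget, one cache filler.
        // If many requests miss on the same key at once, the filler runs
        // once and everyone shares the result (no thundering herd).
        chunks := groupcache.NewGroup("chunks", 64<<20, groupcache.GetterFunc(
            func(ctx context.Context, key string, dest groupcache.Sink) error {
                data, err := loadChunk(key)
                if err != nil {
                    return err
                }
                return dest.SetBytes(data)
            }))

        var data []byte
        err := chunks.Get(context.Background(), "chrome-chunk-0001",
            groupcache.AllocatingByteSliceSink(&data))
        if err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%s\n", data)
    }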


Go seems like a reasonable


We just published his talk at dotScale 2013 where he talks about software design, Go, lockservers and more: http://www.youtube.com/watch?v=sYukPc0y_Ro


I'd be curious to hear performance numbers (assuming a reasonable front-end server to this library). I get that the replicated in-memory caching part is valuable. But (from painful experiences with Java) I also fear that a GC-based memory management system is anti-optimal for an in-memory cache of small objects, especially as the size of the heap grows beyond a couple of GB (*).

(*) http://cdn.parleys.com/p/5148922a0364bc17fc56c60f/GarbageCol...


Go doesn't scan big []byte/strings, so the GC isn't a problem.


Brad, could you elaborate on this statement, please? I tried a few Google searches to find documentation on that behaviour but failed.


Roughly: a GC has to look at the pointers in use to see what objects are still reachable. But it doesn't have to look at the contents of strings, byte slices, int arrays, etc., because, by definition, those don't contain pointers.

For dl.google.com, the first user of groupcache, it sounds like the Go runtime was working with relatively few, relatively large chunks of memory (think 2MB chunks of a Chrome binary). Since that doesn't have so many pointer-containing objects, just big chunks of inert data that don't need to be scanned, it's easyish. On the other hand, something like throwaway2424's case could be a GC stress test--many gigs of RAM holding a network of small, pointer-filled objects.

Really, the tl;dr may be to prototype/measure if you need to know how a GC will do. Assumptions are easy; data is hard.
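To picture the difference, compare the two heap shapes (a toy sketch, not groupcache code):

    package main

    type node struct {
        next    *node
        payload []byte
    }

    func main() {
        // Pointer-free: the GC notes the slice header but never walks the
        // 64 MB payload, since []byte elements can't hold pointers.
        blob := make([]byte, 64<<20)

        // Pointer-dense: a million slots the collector has to chase on
        // every cycle to decide what's still reachable.
        graph := make([]*node, 1<<20)

        _, _ = blob, graph
    }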


Ahh, got it. Also, the reason Brad specifically mentions big arrays of bytes/strings is that they don't look like pointers. So... I'm guessing you still want to stay away from huge arrays of ints (or whatever type has the same size as pointers on your platform). Right?

And as for measuring... it's not cheap (developer-time-wise) to build a realistic enough model of your application and put realistic enough loads on it for long periods of time to test out each of your hypotheses (and GC theories do require long periods of load). So you have to narrow the space of choices, informed by past experiences and sometimes by gathering semi-reliable folklore. I'm currently engaging in the latter activity :-)


++ablankst: that byte isn't special here. Brad singled out []byte/string because those are the types stored in groupcache (see ByteView in the groupcache code for his sneaky tricks to make them look the same to other code).

And yes, unfair to ask someone to test every guess they have. But if you do want to know a bit more about what effect GC and Go would have, experimenting with toy programs isn't a bad way to get a feel.


In Go, and other type-safe languages, the compiler usually knows exactly which ints are being used as pointers. This is the difference between "precise" and "conservative" GC and is one of the reasons that comparing GC performance in C++ to GC performance in other languages is difficult.


I was under the impression (again, folk-lore, mailing lists etc.) that Go had a conservative collector. I can't find an official documentation link that'll tell me if it is or not at the moment.


This neat StackOverflow answer appears to be by the 'atom' fellow who contributed a patch to make the collector more precise (patch review = https://codereview.appspot.com/7307086/): http://stackoverflow.com/questions/7823725/what-kind-of-garb...

Atom also says 1.0's was more conservative, but, as Brad also said, still didn't scan "objects such as []byte" (meaning all plain-old-data arrays? who knows). The Go 1.1 Release Notes mention the collector becoming more precise, which was a particular issue on 32-bit because big heaps could span a lot of the address space.

You can see the GC source itself doing some per-type switching: https://code.google.com/p/go/source/browse/src/pkg/runtime/m...

At some point, this sort of discussion probably gets you less useful info per unit effort than just playing with a Go distribution, trying out whatever toy programs you find interesting.


Go's GC isn't nearly as powerful as any of the many GC implementations for the JVM, but Go's approach is also quite different. In Go, you can reduce or avoid the garbage that you create. And when the GC turns out to be the limiting factor of your cache, you can also mmap some dedicated pages from the OS and manage the memory on your own. But I guess that won't be necessary. Do you have any issues with Go's GC today?


Well, I don't have issues with Go's GC today, because my horrific experiences with various JVMs have been a powerful deterrent against any future experiments with performance-critical, large-heap server processes on a GC-controlled heap. Other than that, I really, really like the design of Go. That's in addition to my faith in the solid team and company backing the language.

I'm quite interested in the two mitigations you mentioned, especially the "manage memory on your own" option. I was not aware that there was a Go-language-blessed (that is, not a C extension) way to do such a thing.

As for "reduce/avoid" creating garbage, how can I achieve that for a large heap caching application (like the one we're talking about here) ?


In any GC'd language, the less you allocate the less you GC, so "recycling" old objects when you need new ones can help. So where you'd normally make() a []byte, you can instead get one from a list or buffered channel you made of old discarded []bytes. So that gets you back to malloc/free/new/delete, but you can at least do it only for the most performance-relevant allocations in your program and keep being lazy everywhere else.
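A minimal sketch of that buffered-channel free list (names here are made up; later Go releases added sync.Pool for exactly this job):

    package main

    // bufFree holds discarded buffers for reuse.
    var bufFree = make(chan []byte, 64)

    func getBuf() []byte {
        select {
        case b := <-bufFree:
            return b[:0] // recycle: keep the capacity, drop the contents
        default:
            return make([]byte, 0, 4096) // list empty: allocate fresh
        }
    }

    func putBuf(b []byte) {
        select {
        case bufFree <- b:
        default: // list full: drop it and let the GC have it
        }
    }

    func main() {
        b := append(getBuf(), "hello"...)
        putBuf(b)
    }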

Another sometimes-useful way to save allocations--not for a cache, but in general--is just to turn short-lived allocations into longer-lived ones: if fooBuf is used and thrown away by each of several calls to obj.baz(), then make a single fooBuf and store it as a (private) field on obj, assuming that doesn't present thread-safety or other problems in the specific context.

There are certainly things you wouldn't use Go for, but now you know more about reducing memory pressure in GC'd languages. :)


Well, various techniques to recycle/reuse allocations instead of "garbaging" them are well known in the Java world too. Also, the generational collectors popular in JVMs do a good job with lots of short-lived allocations (as long as the vast majority of them die young).

The problem is, none of those things help when you have a large cache of small(ish) objects that need to be scanned fully at every GC cycle. There's nothing worse for your performance than stopping the world for a few seconds and using that time to wipe your L1/2/3 caches with completely useless data. That's the reason I called large-heaped caches the anti-pattern for a GC'd system.

Brad's answer about using large byte array allocations (and presumably, managing the smaller chunks manually) is a fair enough answer to this question. But of course in this case you'll be writing code to manage small allocations out of that big buffer yourself, thus negating the whole point of GC. But it might be a decent enough compromise if you like the rest of the language a lot (which Brad clearly does :-)


The easiest way would be to store everything you can on the stack. The Go compiler does a very good job at escape analysis and tries to put everything on the stack that does not escape. Therefore, it's often better to use output parameters instead of return values to reduce (or eliminate) the number of allocations.
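For example (a sketch; you can see what escapes with go build -gcflags='-m'):

    package main

    type point struct{ x, y float64 }

    // Returning a pointer to a local forces it onto the heap: it escapes.
    func newPoint() *point {
        p := point{1, 2}
        return &p
    }

    // An output parameter lets the caller decide where the value lives;
    // if the caller's variable doesn't escape either, there's no heap
    // allocation at all.
    func fillPoint(p *point) {
        p.x, p.y = 1, 2
    }

    func main() {
        _ = newPoint()
        var p point
        fillPoint(&p)
        _ = p
    }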

Some careful thoughts about the memory layout of your structs, especially which of them should be embedded and / or passed around by pointers and which of them shouldn't, might also pay off.

Another common optimization is to put allocated objects back into a memory pool for later use. Take a look at the bufCache channel [1] in the bufio package, for example (the http and json packages use the same trick).

[1]: http://tip.golang.org/src/pkg/bufio/bufio.go#L79


Go's garbage collector sucks pretty hard for large heaps. A 16GB heap that's mostly live objects will take tens of seconds per round, during which no goroutines will run (it's a stop-the-world collector). Basically it's 15 years behind Java in this regard.

If you're worried about this, stick with C++.


I couldn't find in the code where the objects are actually stored, but theoretically there's nothing stopping you in Go from allocating an array of X bytes (for X in the mega- or gigabytes) and completely managing it yourself, leaving the GC merely for incidental activities, which themselves can be minimized through various design patterns and some care.

It is true that you should describe Go as a garbage-collected language, but unlike a lot of other such languages, Go has C-like mutable arrays readily available. In practice it's more like a sort of hybrid, in that even though it's garbage collected you have a lot of opportunities to write code that still doesn't really use it, without having to "drop down to" C or something. I still hope to see some improvements in it before I could make a big commitment to it, but I've certainly got my eye on it.
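A toy version of that self-managed byte array might look like this (illustrative names, nothing from groupcache):

    package main

    // arena carves fixed-size slots out of one big []byte, so the GC sees
    // a single pointer-free allocation instead of millions of small objects.
    type arena struct {
        buf      []byte
        slotSize int
        next     int
    }

    func newArena(slots, slotSize int) *arena {
        return &arena{buf: make([]byte, slots*slotSize), slotSize: slotSize}
    }

    // alloc hands out the next slot, or nil when the arena is exhausted.
    func (a *arena) alloc() []byte {
        if a.next+a.slotSize > len(a.buf) {
            return nil
        }
        b := a.buf[a.next : a.next+a.slotSize]
        a.next += a.slotSize
        return b
    }

    func main() {
        a := newArena(1024, 4096)
        slot := a.alloc()
        copy(slot, "some cached value")
    }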


>I couldn't find in the code where the objects are actually stored, but theoretically there's nothing stopping you in Go from allocating an array of X bytes (for X in the mega- or gigabytes) and completely managing it yourself,

Nothing stops you, except:

- The determination NOT to do something the computer can do

- Previous exposure to a proper, modern GC

- The fact that you used Go to get away from this kind of shit in the first place


I did not mean this is a technique you'd use in general. I meant specifically in the context of groupcache, where an array of bytes that you manage yourself is a reasonable approach to the problem.

If you don't want to do that, don't. I'd never start the prototype with that functionality, for instance. But when you discover that you need it, Go permits it in a way that Python or Perl do not. Go is GC'ed and you are free to use that to the extent you want, but it's easier to escape from the GC than it is in many GC'ed languages. It may not be obvious from a casual reading of the spec, but there are a lot of ways around garbage in Go. Not like, say, Rust, and certainly not the same level of Raw Unstoppable Power as C++ (at a corresponding huge complexity cost), but it's a very interesting middle ground.

Note that if you were writing groupcache in Java, you'd probably end up doing the same thing and just allocating an expanse of bytes which you manage yourself.


"Previous exposure to a proper, modern GC"

I've often heard people say things that imply that "proper" GCs don't behave like whatever badness I'm experiencing currently. Can you give me an example of a garbage collector that I could safely use for a large heap of small cached (i.e., long-lived) objects without a significant throughput penalty, and with deterministic latencies (i.e., no random full-system pauses), over, say, a tcmalloc-based hand-allocated/freed system?


> A 16GB heap that's mostly live objects will take tens of seconds per round

The situation you describe no longer occurs (did it ever?). I run stuff with 12-20GB of active objects and a few dozen HTTP hits a second, and the only times I get a latency over 100ms are when I reload/refresh some data from disk (1.5GB of gob); then it goes to 150-250ms, typically.


Is this figure for Go 1.0 or 1.1? The latter is said to have a much-improved parallel GC. http://golang.org/doc/go1.1#performance


It is still stop the world.


> Basically it's 15 years behind Java in this regard

This is fanboyism or just ignorance.

It does not matter how sophisticated the GC code is when you have millions of tiny objects from crappy Java code, with useless copying and referencing. This is why it uses so much memory.

It is very difficult to find anything worse than Java in terms of wasting memory. Take any Hadoop node and measure the ratio between the memory used by the Java processes and the amount of actual data stored in memory. Near a factor of 2?


>It does not matter how sophisticated the GC code is when you have millions of tiny objects from crappy Java code, with useless copying and referencing. This is why it uses so much memory.

Or, you know, you can avoid writing "crappy" code. Whereas with Go's GC, even proper code will have problems.

You are arguing something that is beyond discussion: Java's GC is better than Go's.


> Or, you know, you can avoid writing "crappy" code.

In Java? Hardly. It requires discipline and skills beyond the capacity of a typical developer. This phenomenon could be called "memory is cheap and the JVM will handle everything".

Even in Clojure code, for example, they implement a node of a binary tree as a hash table, which is a meaningless waste. Position-based selectors would be good enough.

> Java's GC is better than Go's

Yes, in theory, on paper. In practice, well-written Go code could outperform typical Java code, even with a less sophisticated GC.

Of course, I cannot prove that, but there are some intuitions to support such claims.


I really wouldn't call this an alternative. If you are running memcached, it's very unlikely you can switch to Groupcache.

Parts of your application may rely on the expiration feature. But the biggest change is the inability to overwrite a current cache key. Every application I've used does this constantly (object updates).

Groupcache in its current form is useful for a very narrow set of applications.


In Rails 4 with Russian doll caching, when the data or the view changes, the key changes, so rather than expiring keys, you just don't use them again. So you have no real need for expiry or overwrites, as when data changes, the key changes with it (I think they use the object's modified date + a hash of the view template). As far as I can see, groupcache would be compatible with that caching scheme, or others like it.

The only bit which could be tricky is if you want to clear the cache, I don't see a way of doing that from a quick look at the API. But to clear the cache you could have a special version key hashed into each key whose sole purpose is to empty the cache, which you bump whenever you want to remove all keys at once for whatever reason (in a perfect world, this shouldn't be necessary of course).
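A hypothetical sketch of such a key scheme in Go (the generation value is the "special version key" idea; none of this is a Rails or groupcache API):

    package main

    import (
        "fmt"
        "time"
    )

    // generation is hashed into every key; bump it and every key in the
    // cache changes at once, which effectively empties the cache.
    var generation = 1

    func cacheKey(model string, id int64, updatedAt time.Time, tmplDigest string) string {
        return fmt.Sprintf("v%d/views/%s/%d-%d/%s",
            generation, model, id, updatedAt.UnixNano(), tmplDigest)
    }

    func main() {
        // A data change gives a new updatedAt, hence a new key; the old
        // entry is simply never read again and ages out of the LRU.
        fmt.Println(cacheKey("post", 42, time.Now(), "a1b2c3"))
    }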


The intention with Russian doll caching is that the cache uses LRU, so you don't need to worry about clearing the cache. No poison pill needed. Memcache operates this way.

Given that there's an 'lru' folder with a package that says 'implements a LRU cache' as the first comment, I assume this works that way as well.


Yes in a perfect world you never need to clear everything, but sometimes it's useful for testing or to allow end users to clear the cache in case of caching bugs.


I would think that the real limiting factor right now is that it is only available to Go applications (it's a Go library, not a standalone server). There are comparatively very few Go applications in the wild that could use this... whereas a memcached server can interface with just about anything. Or am I missing something?


Groupcache being a library is what made me so excited about it in the first place.

I'm writing a torrent tracker in Go that could make great use of a distributed cache. This would enable you to put an arbitrary number of "tracker nodes" behind a load-balancer.

One of the goals of my project is that it's easy to configure: everything except the RDBMS is hosted inside a single binary and configured with a single file. In the simplest case, that binary is responsible for two web servers, a process-local cache, and a DB connection pool. (In more complex cases, each binary simply acts as another node behind a load-balancer, and they attempt to cooperate.)

It's useful to avoid hitting the disk by having the tracker cache torrents that are active. When you start adding more tracker nodes, this is where having a distributed cache would be nice.

Without groupcache, that means I've just forced my users to install and configure memcached. With groupcache, the configuration and server are simply part of my application. (As an added bonus, since Go is statically linked, I haven't even added any external dependencies on shared libraries.)

---

I agree that the use-cases for groupcache are far more limited in scope, but I was still glad to hear that this was a library, and not a standalone server. I think we need _more_ distributed caches implemented as libraries, not less.

There are many use-cases where it may not be preferable to have a separate caching server. In my career: getting approval to install memcached on a customer's server could add _weeks_ to our deploy time simply because of red tape; in my personal programming endeavors: setting up memcached is just an added layer of complexity that could easily be avoided by using libraries instead of servers.

I'd argue that we should leave memcached [more specifically: standalone distributed caches] for the more complicated scenarios where a dedicated solution is called for. Bring on the distributed-cache libraries where a much simpler, ad-hoc solution would be more beneficial to end-users.


I suppose you could write a small Go app to use the groupcache library, and call that Go app from your non-Go app co-located on the same machine.


So your key is key + transaction ID, where the transaction ID is simply a value that changes when the data changes. You search the cache with the current value of the ID, and if the data changed the cache entry will be regenerated.

An example of a good transaction ID would be SHA1(data).
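In Go, that might look like this (a sketch; SHA1 is just one convenient content digest):

    package main

    import (
        "crypto/sha1"
        "encoding/hex"
        "fmt"
    )

    // versionedKey appends a content digest as the "transaction ID": new
    // data means a new key, and the old entry ages out of the LRU unread.
    func versionedKey(key string, data []byte) string {
        sum := sha1.Sum(data)
        return key + "@" + hex.EncodeToString(sum[:])
    }

    func main() {
        fmt.Println(versionedKey("user:42:profile", []byte("profile bytes")))
    }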


Values that must expire could perhaps use a truncated time value as part of the key? It's not as flexible as regular expiration, because the entries would only expire when you cross the point in time where the truncated time value changes, which could be a problem in a lot of applications (sudden surges of requests every x seconds).
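Something like this, for illustration (note the synchronized refill surge at each bucket boundary, the problem mentioned above):

    package main

    import (
        "fmt"
        "time"
    )

    // timedKey buckets the clock into ttl-sized windows; when the window
    // rolls over, every reader moves to a new key and the old entry goes
    // stale in the LRU.
    func timedKey(key string, ttl time.Duration) string {
        bucket := time.Now().Unix() / int64(ttl/time.Second)
        return fmt.Sprintf("%s@%d", key, bucket)
    }

    func main() {
        fmt.Println(timedKey("frontpage", 5*time.Minute))
    }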


[deleted]


Read the article for reasons for using groupcache.


I want to use this, but since the keys are immutable, how can I store data like sessions which can change and would sometimes have to be invalidated from the server side (i.e. you can't simply change the session ID in the cookie and use a new cache entry, because bad-guy could still be holding on to an old stolen session ID)?

In general, how can one learn to think in an immutable fashion to effectively exploit this?


The simple solution is to always version everything. There's no such thing as an update, merely a new version of a thing.


So without any mutable storage, how do you verify that the version that the client requested is the latest version?


If the client has an old link, serve an old version. But have links to what will be future versions and make the client walk them. It is doable, but different. Lots of stuff is static anyway, like CSS, so you can use it for those and have a different process for stuff that varies.


But as the grandparent said, serving the old version may result in a security vulnerability. There are cases where you MUST serve the latest version, and the latest version only.


Then don't use groupcache.


That much is obvious.


The general approach is to store initial values and then changesets to those objects.

For example git has loads of immutable data (every commit, tree and blob is immutable), and only very little mutable data (the refs) that point to some of those immutable objects.

You simply wouldn't use groupcache for session data; you'd use memcached.


As others said, groupcache isn't useful for session storage. memcache was used for a lot of things other than caching, because it was a very versatile hammer and a lot of things looked like nails. groupcache is for when your data's immutable and bandwidth used for the most popular keys might exhaust a single memcache server's pipe.


This isn't for sessions unless they're also backed up somewhere else. It's just for caching. You would need some other system where you look up which version of a session is current, then look up sessions in the cache by a unique version id.


That "other system" could and probably should be a database.


Am I reading this as a distributed, immutable, weak hash table rather than what one would consider a 'cache'?

Mind you, doing so avoids the hardest parts of caching (and especially distributed caching, which otherwise begins to underperform beyond 5-7 nodes), so I can see significant upside. No surprise stales, distribution update clogging, etc.


I noticed this when he talked about speeding up the Google download servers. Very interesting :)

It's an alternative to memcache but not a direct replacement. I hope he adds CAS etc.

I hope they start using the kernel's buffer cache as the backing store, or explain why it's not a good idea: http://williamedwardscoder.tumblr.com/post/13363076806/buffc...


Apparently there is no CAS by design; this allows groupcache to replicate hot keys on multiple hosts.


Yes, but if it was opt-in...

It's nice if you can use groupcache as a memcache replacement, even if some use-cases tie the hands of the implementation, e.g. replication of hot items.


I've been thinking about something similar. I don't see how timed expiration would conflict with the two most important features - the filling mechanism and the replication of hot items. Am I missing something that would make timed expiration impossible?


Yeah, on a first pass of the problem you seem right.

The CAS must have an authoritative node (my mind wanders thinking about replication and failover) but the key it protects - with the version baked in - can be replicated surely?


CAS is incompatible with the distribution architecture, which uses a best-effort distributed lock in lieu of e.g. a strongly consistent distributed state machine. It would require a lot more work.


If you have a bug in another application on the server running the cache that causes it to grow its memory use, your cache would suddenly disappear/underperform, and the failure could cascade on to the system that the cache is in front of. If instead you let the offending program crash because the cache is using regular memory, this would not happen. Just a thought.


If the cache uses regular memory, you hit swap...

I think the failure mode of a system-level cache is rather better than per-application islands that can conflict.


If you hit swap, again only the offending application or instance is punished, not everyone else (for instance by pummeling a backend database server that other services are using as well)


Actually not to my thinking:

If program A hits swap, it means that cold pages are written to swap so that A can get those pages; this initial writing is done by program A, it's true. But A may not be the cause of the problem; A is just the straw that breaks the camel's back.

And those pages that got written to swap likely belong to others, and they pay the cost when they need those pages back...

In my practical experience, when one of my apps hits swap, the whole system becomes distressed. It is not isolated to the 'offender'.

You can of course avoid swap, but with your OS doing overcommit on memory allocations, you are just inviting a completely different way of failing and that too is hard to manage. You end up having to know a lot about your deployment environment and ring-fence memory between components and manage their budgets. If you want to have both app code and cache on the same node - and that's a central tenet of groupcache - then you have to make sure everything is under-dimensioned because the needs of one cannot steal from the other; your cache isn't adaptive.

That's why I built a system to do caching centrally at the OS level.

I hope someone like Brad is browsing here and can make some kind of piercing observation I've missed.


I understand Google solves that problem by not enabling swap on their servers.


That's rather common. If swapping can harm your application, then don't swap. On a machine where slowdown is tolerable (temporarily, on a desktop), swap is fine. On a machine whose entire purpose is to serve as a fast cache in front of slow storage, swapoff and fall back to shedding or queuing requests at the frontend.


Without any specific knowledge of Google's practices, I can say this is certainly true - this is standard nowadays.


That is my experience as well. In my thought experiment the 'offender' would be a server instance, not a process running among other applications on a single machine. Applications that hit swap often have memory leaks, and hitting swap is then just a matter of time. Creating a cascading failure may be preventable however.


http://talks.golang.org/2013/oscon-dl.slide#46

> 64 MB max per-node memory usage

So this is best used as an LRU cache of hot items.

It doesn't compete with or replace memcache comprehensively, but it does attack the use of memcache as relief for hot items.

I can see me mixing my Go programs with both groupcache and memcache.

Edit: I have glanced through the code and cannot see where the 64 MB per-node limit comes in. Anyone see that?


It's a configurable limit. 64MB is just what he set the limit to for that particular group, that's what the "64<<20" in the NewGroup call is for. You could configure it to use many gigabytes.
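For instance, nothing stops you from giving different groups very different budgets (a sketch; the getter here is a do-nothing placeholder):

    package main

    import (
        "context"

        "github.com/golang/groupcache"
    )

    // nullGetter stands in for a real cache-filling function.
    var nullGetter = groupcache.GetterFunc(
        func(ctx context.Context, key string, dest groupcache.Sink) error {
            return dest.SetBytes([]byte(key))
        })

    // Each group gets its own byte budget: 64<<20 is 64 MB, 8<<30 is 8 GB.
    var (
        thumbs = groupcache.NewGroup("thumbnails", 64<<20, nullGetter)
        videos = groupcache.NewGroup("videos", 8<<30, nullGetter)
    )

    func main() {}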


Ah, that's what I was not seeing when I was glancing through the source.

In that case... for me it fully replaces memcache for how I'm using memcache today.


The fact that it doesn't have cache expiration times or inc/dec disqualifies it as a memcache replacement, at least for me. Well, that and the fact that only Go is supported at the moment.


How does its sharding-by-key algorithm work, and how does it handle adding new peers? I was looking for it in the source, but couldn't find anything related to it.


You need to implement that part yourself. Look in peers.go. It'll be dependent on how you discover peers in general.
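For what it's worth, the repo does ship an HTTP-based peer implementation that you hook your own discovery into, and it shards keys over the peer list with a consistent hash (see the consistenthash subpackage), so adding a peer only remaps a fraction of the keys. A sketch with made-up addresses:

    package main

    import (
        "log"
        "net/http"

        "github.com/golang/groupcache"
    )

    func main() {
        // NewHTTPPool registers a /_groupcache/ handler on the default mux
        // and makes this process a peer in the pool.
        self := "http://10.0.0.1:8080"
        pool := groupcache.NewHTTPPool(self)

        // Call Set again whenever your discovery mechanism (static config,
        // DNS, etcd, ...) reports a membership change.
        pool.Set(self, "http://10.0.0.2:8080", "http://10.0.0.3:8080")

        log.Fatal(http.ListenAndServe(":8080", nil))
    }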



I'm a bit confused about how you use this system with immutable keys. At face value it's a great idea, but I need a simple example of how it's used to, say, retrieve a piece of data, then later update it to a new value.

Is this anything like how vector clocks are used, where the client uses the clocks to figure out which is the right state in a distributed system?


The use-case for which this was originally designed was a download server: the values were chunks of files, each of which gets its own unique identifier. If you want to update a file, you simply create a new file for your new version, and point clients at it instead of the old version.
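So keys might look something like this (a hypothetical scheme, not the actual dl.google.com format):

    package main

    import "fmt"

    // chunkKey names an immutable chunk by a digest of the file plus a
    // chunk index. A new build of the file gets a new digest, hence all
    // new keys; nothing is ever overwritten.
    func chunkKey(fileDigest string, chunk int) string {
        return fmt.Sprintf("%s/%06d", fileDigest, chunk)
    }

    func main() {
        fmt.Println(chunkKey("9f2c1a7e", 3))
    }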


I like the idea, but it seems like it would make deployment a pain. How do I spin up a new server without rebalancing and/or restarting the world? Not to mention that now when I do need to restart the world, I can't do so without also clearing my cache.



How does this compare to Hazelcast[1]? Seems like the same idea, but far fewer features?

[1] http://www.hazelcast.com/


Hazelcast is incredibly buggy. Avoid at all costs.


Got any links/talks/papers on this? A dev group at my work just started using it.


Just personal experience, sorry.

* it deadlocks. Often.

* it often doesn't recover from network partitions. Symptoms include both sides of the partition never merging, and deadlocks. Recovering requires rebooting the entire cluster at once. Due to their locking strategy, a rolling reboot of the cluster isn't enough.

* some of the developers didn't seem to demonstrate an understanding of races, or of distributed systems. When I reported race bugs, they asked if I could attach reliable unit tests.


It'd be interesting to compare this alternative to other solutions like Redis.

Has anyone used both/either?


Redis is a data store and is limited by RAM. Groupcache is a distributed cache. They are not comparable?


If I were to speculate, I would say that Redis is probably faster.


TIL that waiting for a network roundtrip is faster than accessing an in-process hash map directly.


How do you set an item's TTL? How do you set the maximum size of the process's cache?


> How do you set an item's TTL?

Values stored are immutable, and expire when they are no longer used. There is no concept of a TTL.

> How do you set the maximum size of the process's cache?

The cache is broken into a set of named groups, each with their own separate cache and cache-filling method. The size of each group's cache is set when created via NewGroup[1].

--

[1] http://godoc.org/github.com/golang/groupcache#NewGroup



