Hacker News new | past | comments | ask | show | jobs | submit login
Writing a very fast cache service with millions of entries in Go (allegro.tech)
180 points by janisz on March 30, 2016 | hide | past | favorite | 88 comments

From the article:

> [Go] also has managed memory, so it looks safer and easier to use than C/C++.

But most of the post describes a sophisticated way to work around the garbage collector, totally reliant on a specific implementation detail of the current Go GC (skipping of pointer-free data types), documented in a GitHub issue. It seems easier to not have, or to not use, the GC in the first place for this specific project if most of the engineering effort is going to go into an elaborate workaround for it. In particular, I don't understand the reason for rejecting offheap: the article suggests "a cache which relied on those functions would need to be implemented", but surely this is less complex to implement than the cache that relies on hiding what are effectively pointers behind offsets so the GC won't think to scan them.

(This isn't a suggestion to not use Golang at all, to be clear. Nor do I mean to suggest that GCs are bad things in general.)

Maybe this type of program is better suited for a language like Rust.

However, while not having a GC, you have to take care of all memory issues in a way that you convince the compiler that your won't ever blow up.

It would be very interesting to have such a comparison, so we could see whether it's easier to work around the GC, or easier to write bullet-proof code with manual memory management.

I'd expect the Rust to be more robust with regard to performance, but the question is: Would it also pay off in term of code complexity?

I don't want to argue specifically that Rust is the solution here, because I think using Go with offheap is a perfectly fine solution.

But I do think that the code complexity for a trivial hash table with striped locking can't possibly be higher than the complexity of serializing everything into a giant byte array and maintaining offsets into that array into a map which you make sure doesn't have any pointers in it.

I'm not sure how the current implementation of Go handles it, but its spiritual relatives Modula-3/Oberon handled this quite well, with a GC for most occasions and ways to bypass this with "unsafe" modules that allowed for untracked allocations/deallocations and pointer arithmetic. It's not really an either/or situation by (language) definition...

Yes, it is quite sad that the massive industry adoption of Java and .NET forced those languages into extinction.

Although from all mainstream languages, C# is probably the closest in spirit to Modula-3.

I'm puzzled as to why all mature GC doesn't have "permspace."

Go's GC is non-moving and therefore cannot be generational.


Nitpick: you don't need moving GC for generational GC. But you lose most (but not all) of the benefit of generational GC without moving GC.

if you understand functional paradigms, rust is actual easier to understand in larger programs

I confronted a similar caching requirement (in my case the cache needs to be much larger) in Java recently and chose to implement off-heap for some of the reasons you mention. It avoids GC and heap size concerns entirely and makes it easy to tune the rest of the application's GC profile. Systems handles 45k writes/sec and about double that for reads with very low latency minimal CPU. Implementing concurrent writes/eviction without typical Java concurrency controls was a bit tricky though.

Project is here for those interested: https://github.com/Maascamp/fohlc

Wondering what your use case is for such a large cache that is not clustered (unless I misunderstood the code). Both in terms of single process dying = lost 10gb of data and also what single process needs a 10gb cache. I do see ability to read from disk listed.

How would you compare your cache to open source data grids that also provide off heap (infinispan, geode/gemfire, ignite) or just other general cache solutions (redis/memcache)?

The use case in mainly deduplication of a high volume data stream (though it's got a few other uses). The write volume is fairly stable so it's sized in such away that we'll never emit dupes even when the upstream source crashes and needs to be rebuilt from backups (for this case that means > a billion cache entries). Something like the opposite of a bloom filter (https://www.somethingsimilar.com/2012/05/21/the-opposite-of-...) didn't work because we don't want false negatives either. Since the cache is fed by a Kafka log HA is achieved simply by having multiple consumers individually populating their own cache instance. The persistence mechanisms are to allow for code deployments that don't blow away the cache, not HA.

We actually experimented with grid caches (ignite in particular since it offers off-heap in memory storage as well), but the performance simply isn't there. At the volume we're writing even millisecond latency is a non-starter. We did explore both memcached and redis, but we need strict FIFO and both of those solutions provide nondeterministic LRU.

Now that I think about these issues, I'm wondering now how performant AllegroCache is.

I think the problem with using offheap for something like this is that it requires you to specify the exact size of the data you want to put in each "cell" of the hash table. Take a look at the NewHashFileBacked function in https://github.com/glycerine/offheap/blob/master/offheap.go. If you wanted to cache a variety of json data this could lead to a lot of unused space if most of the items are smaller than the cell size. So, their large byte array plus offsets solution seems preferable.

I do agree with you that perhaps Golang was not the right choice for this project since they spent much of the time working around features of the language.

I still don’t get why people try writing their own data store, especially in a language that's simply not very well suited to that task (and we're an almost 100% Golang shop here). Seems to be a rite of passage.

The requirements are literally <100LOC of wrapping an HTTP interface around Redis with using TTL - which has been battle tested for years and is both rock solid and ridiculously fast.

Public service announcement: Don't write your own data store. Repeat after me: Don't write your own data store, except if you want to experimentally find out how to build data stores. It's arbitrarily hard and gets even more so at every layer. Plus, you leave behind an unmaintainable mess for the people after you. There's already a great OSS data store optimized for every use case and storage medium I could possible imagine.

There are two datastores I would really like that don't exist.

1) An efficient, high-performance distributed persistence store for arbitrarily large CRDT's.

2) A Kafka-like highly-available distributed binary log w/ cheap topics, that doesn't require external coordination, and doesn't lose acknowledged writes (which I'll happily give up any shape of linearization guarantee for).

Riak does CRDTs (not sure about arbitrarily large), though I never understood their reasoning for requiring that CRDTs be pre-registered in a schema — it's a major drawback in an otherwise schemaless K/V store.

But they're also required to be relatively small, because of how they're treated on disk and inside the system as opaque at the storage layer. As a result of this, and a smattering of other reasons, they're also not very fast.

can't help with 1), but as for 2), maybe Kudu is an option for you? http://blog.rodeo.io/2016/01/24/kudu-as-a-more-flexible-kafk...

Uses Raft. Not highly-available enough unfortunately. Availability > Consistency for the use case that matters to me (basically a service message bus where all ACK'd writes will eventually be read by some consumer(s), but I don't care when or in what order, just that they all eventually get consumed, and where as long as any node in the cluster is up, it'll ACK a write).

> just that they all eventually get consumed

Unless you have some sort of replication or consensus, this is still just best-effort delivery. For example, the message could be acknowledged & fsynced followed by drive death. At scale, this would happen.

However, with asynchronous best-effort replication, broker death would be much less likely to lead to a loss of messages.

In general, it's all about playing the odds. Acknowledged writes can still be lost in the best of systems if the entire DC catches fire.

If you tune Kafka settings, you essentially get this behavior. You have to ensure ZK is up, but it will allow all topics to be published to so long as a single broker is alive.

If I tune my Kafka settings I can't get this behavior.

The closest I can get is allowing sloppy leader election, but that will definitely, definitely lose writes that have been ACK'd by a stale leader who comes back and has to become a follower. So I can get Kafka to basically accept all writes regardless of level of cluster degradation, but I can't keep it from dropping some of those writes silently onto the floor.

> Uses Raft. Not highly-available enough unfortunately.

Did you test this or is it just a general view? What sort of write rate are you looking for, per topic?

Raft requires both a stable leader and a majority of replicas to be reachable in order to advance the consensus protocol. I'd like to be able to lose more than a majority of replicas and require no leaders because I don't care about consistency.

I care that as long as even one node in the system is alive, it will accept writes, and those writes will eventually propagate and converge on the union set of all writes ever observed, without obliterating any ACK'd observed writes, as the system as a whole moves toward a less degraded state.

I'd want topics themselves to be able to be basically "free" to create, and the write latency to remain stable over arbitrarily large datasets. Write rate isn't really much of a consideration since it's almost embarrassingly low (hundreds per second).

> I don't care about consistency.

Fair nuff, I asked because I was curious if you were assuming that eventual consistency was necessary for high availability.

> Write rate isn't really much of a consideration since it's almost embarrassingly low (hundreds per second).

Yup, looks like my concern above is the case. A single consensus replica set will do that no problem. I'd use raft if you're in a single geo location or epaxos otherwise (the epaxos source is a little early and needs eyeballs/production diversity testing, but the algorithm is absolutely solid). There's no reason to borrow all the troubles of CRDT's when it's not needed.


Example from the epaxos paper, this is latency vs throughput for 3 replicas in one geo location: http://charap.co/wp-content/uploads/2015/10/latency_over_thr...

What happens when I don't have a majority of Raft replicas alive?

Then raft locks until you do again. We live in the era of on demand infrastructure. With even just 3 replicas and some reasonable automation it's extremely unlikely you'll see any unavailability. Or run 5 way across two different systems (ie 3 at google, 2 at AWS). That setup is less than $100/month to run now.

You seem extremely concerned about a fault tolerant system experiencing more faults than the majority, which I think is quite easily dealt with, and utterly blithe about making inconsistency your best case.

At this kind of request rate I'd honestly just shove things in postgres and call it good. There's just no reason to build out a bunch of CRDT complexity when you don't _need_ it.

The system that's my benchmark to go up against has seen 6 seconds of downtime in 25 years.

Does that give you a better perspective for my availability concerns?

I also don't have "on demand" infrastructure. I have to deploy to a fixed footprint of bare metal hardware.

I'm "utterly blithe" about inconsistency being my best case for a variety of reasons:

1) I know my use case extremely well.

2) I know that within that use case there are several places where availability is the #1, #2, and #3 priority because the system being unavailable for any time at all has a non-trivial direct financial impact.

3) I know within that use case where there are necessary points of consistency and how to move them out of the critical path that requires extreme availability.

4) I know exactly the problem that CRDTs solve for me, and exactly why I need them in this case, and how to avoid them with simpler immutable/idempotent write patterns where possible.

5) I have to be able to tolerate very high proportion of the infrastructure experiencing temporary and/or permanent failure because the places I have to deploy into are no where near the level of reliability of even a low-mid tier data center in the US. Janky telcos, natural disasters, malicious operators, and flimsy power delivery are the norm.

And what consistency assurances does it provide?

Edit: I see you edited in a longer comment after my question above. I was just curious, not looking to pick a fight. Suffice it to say we have very different perspectives on how to attack those requirements. I'm bowing out now, cheers!.

The problem is this:

1) Anything that is suggested which avoids data loss via consistency models is going to reduce my availability to unacceptable levels.

2) Anything that is suggested which keeps my availability up where it needs to be and doesn't use CRDTs, and/or a sufficiently similar semantic guarantee, is going to silently lose data due to "lost-updates" (ACK'd writes that get obliterated during conflict merges and are now gone forever).

At a minimum I need sibling behavior for conflicting writes to ensure that some half-baked LWW strategy that's barely better than "pick at random" doesn't kill my data (which has very high regulatory and audit requirements around it), and then taking that a step further CRDTs, implemented correctly, provide me a nice interface for interacting with multiple version conflicts and resolving them in coherent ways without forcing me, or my other developers, to have to think up case-by-case resolution strategies and verify their correctness.

There really isn't a tremendous amount of complexity to using CRDTs. That's the point of them. They're easy to use. The complexity I'd really like to avoid if possible by avoiding CRDTs is the cognitive burden my team needs to have around distributed data and all it's horrible nuances to satisfy our use case. The knowing when and why to use them, not so much the how.

"At this kind of request rate I'd honestly just shove things in postgres and call it good." is very, very far away from a workable solution because my problem isn't request rate. The number of ops/sec could be hundreds or millions, and it doesn't really change my actual problem. No matter what the request rate is I could just write to a bunch of postgres instances that have a chash ring overlaid onto them... except now I have to deal with my actual problem still, which is redundancy, and local consistency in any postgres replica doesn't guarantee me much, I can't introduce a strong coordinator like a consensus protocol over the top of them, because it dumps my availability, and so I have to figure out how to safely deal with merge conflicts between the replicas... now I'm stuck figuring out how to build CRDTs on top of postgres anyway. So... what exactly did I win?

(1) Spanner does exist, though not really available outside of Google

CockroachDB is probably the closest system to spanner that I've come across (and built in Go)


Yes, Spanner does exist. Spanner also does not satisfy what I said in #1.

Can you elaborate on this?

Well, for starters Google Spanner has almost nothing to do with CRDTs.

For another it would be really weird for Spanner to expose something CRDT shaped since Spanner is a strongly-consistent distributed database which depends heavily on global order and read/write locks driven by meticulously coordinated multi-site global wall-clock time and consensus groups.

In some ways it's almost the opposite of what I want... with exception of the fact that it's also distributed. I want extreme availability with lazy, weak coordination.

Read/write locks...

No, it's slightly more lazy in that that. It does timestamp coordination, but retrospectively.

You mean except for the lock table that's maintained by every paxos leader?

Not sure if this applies to their use case since they mention FIFO in the context of it being a simple eviction policy, but if you _require_ FIFO semantics then both Redis and memcached are out of the question since they use nondeterministic LRU policies (memcached's LRU is particularly egregious in its nondeterminism due to the way slab allocation works).

this is rarely a concern for things that are ephemeral in nature to begin with. it's also a 20 line code change to add a similar ttl buffer into memcached.

I used Redis with Go and I was very happy with the results. I did have one case where I used boltdb.

I run things on a lean vps, and I had a large amount of data I tried to stuff in a trie. Well it worked fine on my macbook with plenty of memory, but it ran out of memory on my vps. Using bolt helped in this case with the memory mapped file.

Can you explain further why golang is bad for data stores, do you mean cache in particular? Seems contrary to many of the new datastores that I see being developed: etcd[1] , cockroachdb[2], influxdb[3] just to name a couple that are written in go.

[1]https://github.com/coreos/etcd [2]https://github.com/cockroachdb/cockroach [3]https://github.com/influxdata/influxdb

Go is awesome, but has very few facilities for effectively working very close to the machine and memory - you can't skip the runtime or GC. Of course, this mostly affects high throughput, high mutation rate, high memory use data stores. Eventually, you'll end up fighting the garbage collector all the time. ETCD isn't directly affected as it isn't so affected as it's not really about larger data or performance, cockroachdb is still more proof-of-concept of a pretty gigantic (though promising) architecture and Influx has honestly been an imperformant mess for us in Prod so far. It's not that it's not possible to write a decent data store in Go, but eventually, as your GC runs wild from the purposefully simple and naive stdlib implementations of maps for example, the neccessary solutions of aggressive pooling or mmapping off-heap space and writing your own allocators for it (while constantly casting something from and to *unsafe) will suck you deep down the rabbit hole and wish you had written it in C(++) or Rust in the first place.

Sort of came here to say this. Interesting article, especially some of the design choices (such as http for the interface), but I am most curious as to why redis, or memcached, or some other k/v store was not good enough?

I wanted to like this article. But this:

> Considering the first point we decided to give up external caches like Redis, Memcached or Couchbase mainly because of additional time needed on the network.

Even mysql+innodb can easily handle 10k read/write queries against a simple pk table with "millions of entries" on desktop-grade hardware.

If the article was "here's a fun experiment to make a cache server in Go", fair enough.

That sounds a little bit harsh (even if I ignore the claims of unsuitability of Go for that purpose). What if I want to write, for example, a Go port of DataDraw? Perhaps with profiling-based memory layout optimization thrown in? And maybe even caching/lazy/caching+lazy collections/maps. If I were obeying your recommendations verbatim, I'd probably still be using mainframe libraries.

> Don't write your own data store

I wish more people had this attitude. The amount of times I have seen people re-implementing something that PhD's have been perfecting since the 70's... it's cringe worthy.

I wonder if I'm missing the point here, but why are Redis or Memcached—implementations of exactly this service which are battle-tested and well-used—not suitable due to "additional time needed on the network", but this service is suitable? Is it just down to the requirement for a HTTP API?

One thing I've noticed is an extreme demand for making internal services available over HTTP. It has it's benefits, but the obvious downsides are the overhead and complexity of HTTP being totally overkill for things like a key-value store of this nature.

I had to read the relevant sections three times to sort this out. I believe what they are saying is that this service they made had to speak HTTP in some specific way, and since Redis doesn't speak that way directly they would have needed to proxy requests in their service to Redis which would mean one network hop to reach their HTTP service plus one hop to reach Redis.

Of course, Redis and Memcached support Unix domain sockets which do not use the network and do not suffer from the overhead of TCP. The authors do not address this at all, suggesting they weren't aware of UDS, nor the fact that TCP within one Linux host does not touch the network at all.

Adding to the confusion is that even given an in-memory cache library which overcame their objections to the "network" based ones, they still elected to write their own low-level cache.

So the comment about time needed on the network was either spurious or misinformed. And one thing it was not: measured.

Quoting redis docs (http://redis.io/topics/benchmarks):

> When the server and client benchmark programs run on the same box, both the TCP/IP loopback and unix domain sockets can be used. Depending on the platform, unix domain sockets can achieve around 50% more throughput than the TCP/IP loopback (on Linux for instance). The default behavior of redis-benchmark is to use the TCP/IP loopback.

Haven't measured myself though.

I was wondering the same. I.e., how this might compare to an NGINX module talking to a caching process on a unix socket.



This has the added benefit of being able to take advantage of other nginx features. For example, you could set different permissions on `set` and `get` routes.

Agreed. Redis is much faster out of the box with minimal tuning than this solution.

I was wondering about the same

Redis eats million of entries for breakfast, is pleasant to work with, has TTL expiration of keys built in and is available as a managed AWS ElastiCache service when you get into serious data sizes: up to 32 core 237 GiB nodes, and then you shard to add more. Redis is also super as a local cache, and simple to deploy and manage together with the app that uses it.

Since you obviously ran some quick benchmarks and concluded that running it locally over a unix socket (confused why you would mention "time needed on the network"... you tested with local sockets, right?) was too slow, you should at least let Antirez know you've run into a new mysterious performance bug ;) Writing a cache service can be a fun side project, but I doubt you gained anything by doing so except another homegrown part to maintain.

redis is single threaded, so 32 cores doesn't mean much without sharding.

I generally prefer memcache unless you have a super locked-down infrastructure (no engineers to deploy a KEYS operation that destroys a shard and all the systems that rely on the data inside until it's finished). Multithreaded + simpler API is great for multitenancy when you have to provide infrastructure to engineers who don't want to learn about infrastructure.

Why did they have to invent "a very fast cache service with millions of entries"? Are they the first company to ever need one so they had to write it? Can't Redis or Memcached be fast (despite the "additional time needed on the network" — even though Redis uses raw TCP for transactions, while this uses HTTP + JSON)?

As others pointed out, Go is also a very poor choice if you need to work around every part of the language.

Probably because they wanted full control and understanding of the source code. Large companies often prefer to develop things in-house, even if there already exist good alternatives.

Large companies often prefer to develop things in-house, even if there already exist good alternatives.

Fair point. Except they are a small consulting company.

Nope. We are not :) We are e-commerce platform - part of Naspers Group: http://www.naspers.com/page.html?pageID=3

Sorry, I stopped reading at 10K rps and p50 of 5ms. In this day and age these numbers are pretty bad, especially for a cache where presumably all accesses are constant time. Every single caching solution listed out-performs this handily.

I had the same feeling when reading this article. That's nothing like "high performance", especially the 5ms median latency. memcached is a whole two orders of magnitude lower in terms of median latency.

> Considering the first point we decided to give up external caches like Redis, Memcached or Couchbase mainly because of additional time needed on the network.

Uh... we regularly get performance better than what the OP has described in their "needs" with Redis. It can run in memory - on the same machine and has plenty of HTTP frontends you can work on (not a lot of networking).

So essentially, to meet their requirements, they had to work around the Go garbage collector and use a non-standard HTTP server and JSON parser. Why not just write it in C++?

> Why not just write it in C++?

No need. Someone already wrote Redis.

Redis is written in C, afaik.

Correction: they thought they had to. I suspect that an in-process solution would avoid the HTTP and JSON issues, and better implementation of the store itself would avoid GC issues.

Can you expand? Not knowing anything about Go, what's an 'in-process solution'?

One of the points of Go seems to have been the idea of making a Go process a lot like a mini-Unix, with many small programs plumbed together and communicating through pipes, only strong-typed ones (and often by passing ownership of data using pointers, so that you don't have to actually copy stuff). Which uses zero HTTP and JSON traffic, which is faster than even the fastest HTTP and JSON library could ever be.

The reason why Go doesn't (currently?) have a super-performant HTTP server and a super-performant JSON library is that the designers probably didn't envision someone using the language like this.

Well yeah obviously, but the requirement was to make a networked cache server. Obviously if they'd cache on the machine that was generating the data it would be faster, regardless of language.

"is that the designers probably didn't envision someone using the language like this"

They didn't envisage that the HTTP server would be used to, you know, serve HTTP requests?

Golang's HTTP server is pretty performant: https://www.reddit.com/r/golang/comments/28so0e/go_networkin...

In this case, while that is a microbenchmark, it's a relevant one; both things are basically measuring "what's the minimal cost for a web request"? I could still quibble around the question of routing, but generally "within 2x of nginx" is good enough for most uses, and generally, for any non-trivial use of either nginx or Go's web server, you're going to dominate the cost of the HTTP request with your processing. (Considering how nasty the inside of nginx is and how nice the Go HTTP server looks, that's actually surprisingly good performance. And I don't mean that "nasty" as a criticism; it is what it is for good reason.)

(That said, if my back was against the wall performance-wise, I'd seriously consider looking at my incoming requests, seeing if there's a strong enough pattern in what's going on, and writing myself a psuedo-HTTP server that isn't actually an HTTP server, but just looks like one, skipping as much parsing as I can on the way in, and emitting a lot more hard-coded stuff as headers on the way out. I've never had to do this yet, but it's an option I'm keeping in my pocket.)

As for JSON, well, people generally conflate "parsing" and "marshalling" with JSON. JSON parsing is so drop-dead easy that one skilled in the art can write a decent parser in just a day or two; it's a good format that way. However, the task of converting a parsed JSON representation into local data types is actually surprisingly subtle, and any mature language will almost certainly have at least two if not more JSON marshallers that work in fundamentally different ways.

There will generally be at least one built for raw parsing speed, but will always hand you back a very generic data structure that has none of your application-specific types in it. There will be one built for really, really convenient marshalling and unmarshalling of your application-specific types, but it'll probably be significantly slower, and make certain decisions that will mean it can't be used safely on arbitrary JSON... i.e., if there's something that may be a string but may be an object, this library will range from inconvenient to impossible to use. And there are other valid cost/benefit points; the JSON marshaller that loads the JSON into memory and deparses it with nearly-0 additional overhead by re-using the original byte buffer intelligently, the JSON marshaller that can build up specialized parsers with code generation at compile time even if you're in a dynamic language, the JSON parser that for better or worse permits a certain amount of sloppiness to deal with sloppy emitters, etc.

Go's default encoding/json is the one built for convenience of application-specific types that can't handle or emit arbitrary JSON easily. As a sideline it can also do the generic raw parsing if you pass it the correct type to start with, but I believe it's paying some overhead vs. something custom written for that. I'm pretty sure they know they made this choice; all that stuff I described was pretty clear by the time Go was being written, that it is basically impossible to write the JSON library, so you might as well choose which one you're looking to ship. I think for a standard library it was the right choice, because it's a solid middle-of-the-ground choice... most JSON can be marshalled by it, because most JSON is still well-enough behaved for that to work. It's mostly good enough for most uses. But if you need the ultimate speed, or the ultimate flexibility, or the ultimate anything-else, you'll need to pick something else.

Why not use Varnish? POST messages of 500 bytes could easily be rewritten / proxy'd to GET requests. That might not be 100% restfull but seems like a lot less work. On our production environment Varnish always responds in less then 2 ms. Even on my development VM I never see response times > 5 ms. It has all the other requirements they state.

Perhaps I'm prejudiced because Varnish has proven to be such an awesome caching mechanism (we still use Redis as a key/value store), but this seems like NIH.

Varnish is actually difficult to use for POST requests unless you know C and can tweak the existing vmods. I just came off the same basic requirements and ended up going with Nginx (OpenResty) with Lua & Redis.

As a user and advocate of Go (in my case primarily as glue code that is beneficial because it's easy and efficient to get to results), articles like this do the platform a disservice.

This implementation is far from fast (two magnitudes better performance and it would be credible as "very fast"), and it is non-idiomatic, specifically doing things to avoid the benefits of Go.

As an aside -- HTTP and serialization are both costly. In many, many cases where I've seen them in effect, they were a significant expense for little to no architectural gain.

> HTTP and serialization are both costly.

But can be done very fast (introducing very little latency) and are essential pure operations, so can be parallelized very well (for throughput). Of course, this doesn't account for dev costs, and doesn't make your architectural point invalid.

Straightforward, simple functionality, extremere perfomance requirements and avertion to allocations and GC?

That sounds like model use case for good old C. I love Go and currently am actively learning it, but why take a memory-safe, GC language and then build your own ad-hoc memory management on top of it to avoid it?

This is probably why the creators of memcached and redis chose C.

I admit I only skimmed the article, but I think the tradeoff of parting with the requirement of having an HTTP API with JSON as the message format for transporting IDs over the wire, and just using Redis instead and its wire protocol, would probably have been the more time-efficient way to arrive at an (at least) acceptable solution for the problem they were looking to solve. But yeah, where's the NIH in that? ;)

I wonder if your team considered leveraging Nginx with some good caching policy, for most use cases it should already have everything you need and it'll probably be tough to beat Nginx's performance.

I found the mention of ffjson interesting, a faster serializer then the standard buildin json serializer ( 2x - 3x as fast) ==> https://github.com/pquerna/ffjson

You should check https://github.com/buger/jsonparser and https://github.com/mailru/easyjson as well. They are even more faster.

Manual memory management isn't an esoteric technology. If that's what's required to reduce GC overhead, it's an acceptable choice, especially when it's in a fairly simple system like a cache (a hash table with an eviction policy).

If you think "very fast" is 5ms, over a LAN, then truly I despair.

I mean, if they've paid you to write this, I presume you have reasonable hardware to run it on. Bog standard BSD sockets TCP on Linux is down around the 10 microsecond range now. What on earth are you doing with the other 4.99ms?

"basically everything in Go is built on pointers: structs, slices, even fixed arrays"

As I understand it while that does apply to pointer-to-struct and slices and probably strings, that isn't true for naked structs and naked arrays. Those both behave as value types like int.

True, one of the prime considerations for Go's design was that despite having pointers, structs could be embedded as values into other structs. That's the reason it doesn't look half like Java.

I'm curious why they didn't use something like mmap. That would have skipped the off heap approach, and also allowed for management, statistics, etc, to run as a separate process.

Edit: Apparently the offheap package does use mmap if you pass a path to their Malloc.

Regarding the comments here "why not Redis via UDS/loopback?" etc. - there are some cases where such solution won't fit - e.g., running your service on Mesos cluster.

Is this the sort of problem where optimistic concurrency would be a better fit than standard locks?

very fast. much entries. so cache. such go. wow

Go is web scale.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact