
A multithreaded fork of Redis that is faster - ericblenkarn
https://docs.keydb.dev/blog/2019/10/07/blog-post/
======
erulabs
Any time these "much faster than Redis" databases come up, the sysadmin in me
wonders how many people have had actual performance limitation issues with
Redis. I've seen Redis servers handle hundreds of GB of traffic per hour. I've
worked at companies where Aerospike and others are proposed as replacements
for Redis because "they're faster" - and I point out the 98% idle CPUs on the
Redis server, and the near-100%-usage CPUs on the app server fleet and mouth
"But... why?"

Replacing Redis with "something faster" is a bit like removing the doors on a
car because "lighter means faster!". It might look good on a racetrack, but
it's about as pragmatic as climbing through a window every morning before
setting off for work.

~~~
mperham
As the Sidekiq maintainer, I’ve seen many customers need to shard Redis at
around 5000-10000 jobs/sec. Sharding is a major operational headache, so this
could be very useful to heavy job processors if it does prove to scale better.

I also find it interesting that the BSD license enables this 3rd party company
to fork Redis and build closed source commercial software on top of it. One of
the trade offs to consider when licensing a project.

~~~
foota
Ok, what is a job/second?

~~~
nitrogen
Look up what Sidekiq is. It is awesome.

~~~
foota
Aw, thanks! I was mis-parsing this as '5000 Redis jobs per second' instead of
the intended 'needed to use sharding when sidekiq is scheduling more than 5000
jobs/second'

Thanks!

------
scott_s
Two HN threads from around when the project started:

Show HN: KeyDB – A Multithreaded Fork of Redis,
[https://news.ycombinator.com/item?id=19257987](https://news.ycombinator.com/item?id=19257987)

KeyDB: A Multithreaded Redis Fork,
[https://news.ycombinator.com/item?id=19368955](https://news.ycombinator.com/item?id=19368955)

------
maxpert
So this year at RedisConf Antirez demoed a threaded version of Redis (only the
transport needs to be multi-threaded; the core remains single-threaded). The
numbers were already amazing. I will pick the community version of Redis any
day over forks.

~~~
drenvuk
I feel like this take is somewhat unfair. Antirez was against threading until
jdsully proved that it worked in exactly the same way that you're saying,
multi-threaded transport with a single threaded core. In addition, KeyDB lets
users use SSDs alongside RAM, whereas that is a paid feature of Redis Labs'
enterprise support.

jdsully spurred the implementation of two of the best changes that you might
see in redis in the near future and you consider even using keydb pointless.

Yea, ok.

~~~
antirez
This is historically not correct, the threaded I/O branch was started 1.5
years ago, you can check the history online, it's all public. Moreover before
me, and after me, a number of individuals did the same work _many times_,
including Alibaba team, AWS team, and so forth. It's an obvious feature, the
reason for not doing this, or doing this in a very limited fashion, is
philosophical rather than technical. Btw Keydb way of doing threading has
nothing to do with the trick used by Redis I/O threading AFAIK, of just
fanning out to N threads only in the hot places.

------
tengbretson
I've always considered the single-threaded nature of Redis to be one of its
greatest features.

~~~
femto113
Only one thread can access the data at any given time, so it seems like most
of the things you'd expect to be guaranteed by a single thread still are. I
found this comment particularly interesting

    
    
       Unlike most databases the core data structure is the
       fastest part of the system. Most of the query time
       comes from parsing the REPL protocol and copying data
       to/from the network.
    

I wonder if anyone in the Redis ecosphere has explored a binary client-server
protocol, something that could be parsed/compiled on the client and then
executed without parsing on the server. If the above is really true, it seems
like that might offer even more perf gain than multithreading on the server.
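
For context on what the server actually has to parse, here is a minimal sketch of the framing Redis uses for commands on the wire (RESP: an array of length-prefixed bulk strings). The function names `encode_command` and `parse_command` are made up for illustration and are not from any Redis client library, and error handling is deliberately thin.

```python
def encode_command(*args: bytes) -> bytes:
    """Frame a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]                      # array header: number of arguments
    for arg in args:
        out.append(b"$%d\r\n%s\r\n" % (len(arg), arg))  # byte length, then the payload
    return b"".join(out)


def parse_command(buf: bytes) -> list:
    """Parse one complete RESP array of bulk strings back into its arguments."""

    def read_line(pos):
        end = buf.index(b"\r\n", pos)
        return buf[pos:end], end + 2

    header, pos = read_line(0)
    if header[:1] != b"*":
        raise ValueError("expected an array header")
    args = []
    for _ in range(int(header[1:])):
        size_line, pos = read_line(pos)
        if size_line[:1] != b"$":
            raise ValueError("expected a bulk string header")
        size = int(size_line[1:])
        args.append(buf[pos:pos + size])  # payload is taken by length, not by scanning
        pos += size + 2                   # skip the payload and its trailing CRLF
    return args


wire = encode_command(b"ZRANGE", b"myset", b"0", b"-1", b"WITHSCORES")
print(wire)
print(parse_command(wire))
```

Since every argument carries its length, the parser only scans the short header lines, so a binary framing might save less than it first appears. The option matching inside the command handlers is a separate cost.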

~~~
cheald
The human-readable/writable protocol is one of my favorite things about Redis,
tbh.

I can see cases where a really optimized system could benefit from a binary
protocol, but I suspect it'd be a loss for most people.

~~~
axaxs
Why not just offer both?

~~~
femto113
That was my thinking as well, though taking a peek at the actual code suggests
that there's a pretty deep expectation that the client is speaking strings,
e.g. in code that handles the ZRANGE command[1] I see

    
    
        if (c->argc == 5 && !strcasecmp(c->argv[4]->ptr,"withscores"))
    

and a quick grep suggests that's a common pattern

    
    
        % grep argv src/*.c | grep -c -e 'str\(case\)*cmp'
        482
    

I guess this means someone would have to tackle creating an intermediate
binary format first, rewriting the command handlers to expect that format, and
then making client libraries that can produce the format. Perhaps still worth
it in the end, but not trivial.

[1]
[https://github.com/antirez/redis/blob/unstable/src/t_zset.c#...](https://github.com/antirez/redis/blob/unstable/src/t_zset.c#L2422)

------
cyrux004
I would like to hear from somebody who is using this in production. We started
to use Dynomite recently after hearing good things. I would like to hear
comparisons.

~~~
bengotow
Same! I'd also be curious to hear about production scenarios that would really
benefit from Redis going 5x faster. It's pretty darn fast to start with!

~~~
penagwin
At my job we're currently re-building our website in django, and we make heavy
use of redis caching.

Our website is definitely not "high traffic" but we get somewhere around
300,000 requests a day, mostly concentrated around business hours (We're a
local clothing wholesaler).

I haven't tested it under production loads, but just swapping our redis for
keydb (THANKS DOCKER!) I saw no improvement in my artificial load tests.

I didn't expect to see much real improvement for this use case, but I just
thought it was worth mentioning that it isn't necessarily faster for all
workloads.

~~~
darkr
We currently do nearly a million requests per minute at peak, on a non
clustered redis pair for caching and rate limiting (so at least one read/write
per request). This design won’t hold up forever, but we’ve got at least a few
years' headroom before we need to think about anything more complicated.
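
To put the two traffic figures in this subthread on the same scale, here is a quick back-of-the-envelope conversion to requests per second. The 8-hour business day is my assumption, not something stated above, and both results are averages rather than true peaks.

```python
SECONDS_PER_HOUR = 3600
BUSINESS_HOURS = 8  # assumption: the 300,000 daily requests land in an 8-hour window

# "300,000 requests a day, mostly concentrated around business hours"
small_site_rps = 300_000 / (BUSINESS_HOURS * SECONDS_PER_HOUR)

# "nearly a million requests per minute at peak"
large_site_rps = 1_000_000 / 60

print(f"small site: ~{small_site_rps:.1f} requests/sec")
print(f"large site: ~{large_site_rps:.0f} requests/sec")
print(f"ratio:      ~{large_site_rps / small_site_rps:.0f}x")
```

Under these assumptions the first workload is far below what a single Redis instance is reported to handle in this thread, which would be consistent with seeing no difference after swapping servers.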

------
jadbox
Since this has been around for a while, why hasn't Redis adopted this strategy
into core?

~~~
zymhan
From Antirez's (Redis Maintainer) blog:

> Another thing to note is that Redis is not Memcached, but, like memcached,
> is an in-memory system. To make multithreaded an in-memory system like
> memcached, with a very simple data model, makes a lot of sense. A multi-
> threaded on-disk store is mandatory. A multi-threaded complex in-memory
> system is in the middle where things become ugly: Redis clients are not
> isolated, and data structures are complex. A thread doing LPUSH need to
> serve other threads doing LPOP. There is less to gain, and a lot of
> complexity to add.

[http://antirez.com/news/126](http://antirez.com/news/126)
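
As a rough illustration of the LPUSH/LPOP point, here is a sketch of the simplest way to share one list between threads: a single lock around the data structure, with blocked poppers woken by pushers. This is not how Redis or KeyDB implement it; `LockedList`, `lpush`, `blocking_lpop` and `run_demo` are hypothetical names chosen for the example.

```python
import threading
from collections import deque


class LockedList:
    """A list shared across threads, guarded by one coarse lock."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()  # the lock plus a wait queue for poppers

    def lpush(self, value):
        with self._cond:
            self._items.appendleft(value)
            self._cond.notify()  # a pushing thread "serves" one waiting popper

    def blocking_lpop(self, timeout=None):
        with self._cond:
            # Sleep until an item is available or the timeout expires.
            if not self._cond.wait_for(lambda: self._items, timeout=timeout):
                return None
            return self._items.popleft()


def run_demo(workers=4, per_worker=250):
    shared = LockedList()
    popped, popped_lock = [], threading.Lock()

    def produce(worker_id):
        for i in range(per_worker):
            shared.lpush((worker_id, i))

    def consume():
        for _ in range(per_worker):
            item = shared.blocking_lpop(timeout=5)
            with popped_lock:
                popped.append(item)

    threads = [threading.Thread(target=consume) for _ in range(workers)]
    threads += [threading.Thread(target=produce, args=(w,)) for w in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return popped


print(len(run_demo()), "items moved between threads")
```

Every operation here still goes through one lock, so the data structure itself gains nothing from extra cores. Any win would have to come from work done outside the lock, such as parsing and socket I/O.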

~~~
mmaunder
This should be top comment. I came here to chat about potential downsides
introduced by complexity of having multiple threads accessing the internal
data structure. Until someone runs this in production where they actually use
the performance it delivers over and above vanilla redis, I'll probably hold
off. I'd like to know it's stable under very high load with contention.

No offense to the creator(s) and I have a ton of gratitude and respect for
them pushing the boundaries. But I also know Antirez is a smart dude and Redis
has delivered insane performance thus far with few issues.

~~~
jdsully
No offense taken. KeyDB has different goals than Redis so you’ll see us try
things Redis might not. I’m willing to make the code more complex if it makes
the user’s life easier in some way.

~~~
gigatexal
That KeyDB exists and others like it is great! I likely won’t be working on
projects that will outstrip the performance of Redis but if I did I’d be
looking at the work of people much, much smarter than I for alternatives and
from what I’ve seen KeyDB looks really compelling. It’s definitely a project
I’ll be following.

------
sb8244
I'm really curious about the latency metric. I'm running elasticache redis and
the round trip average of our Redis get commands is around 1ms. We may have
5-10 in a transaction and look for 10-20ms total time.

Anecdotal, but it leaves me slightly confused

~~~
jdsully
Higher traffic loads result in latency increasing. This benchmark covers the
maximum load so latency will be much higher.
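
One way to see why latency climbs with load is a textbook single-server queue (M/M/1), where the mean time in the system is 1 / (service rate - arrival rate). This is only an idealized model I am assuming for illustration, not how the benchmark was measured, and the capacity figure below is hypothetical.

```python
def mean_latency_seconds(service_rate: float, arrival_rate: float) -> float:
    """Mean time a request spends queued plus being served in an M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrivals meet or exceed capacity")
    return 1.0 / (service_rate - arrival_rate)


CAPACITY = 100_000.0  # hypothetical ops/sec the server can sustain

for utilization in (0.10, 0.50, 0.90, 0.99):
    latency = mean_latency_seconds(CAPACITY, CAPACITY * utilization)
    print(f"{utilization:.0%} load -> {latency * 1e6:.0f} microseconds")
```

In this model latency stays nearly flat at low utilization and grows sharply as load approaches capacity, which matches the explanation that a maximum-throughput benchmark will show much higher latency than a lightly loaded production server.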

~~~
sb8244
Thanks for explaining.

------
graycat
Q. Interesting. For my Web site, I wrote a simple key-value store, I use for
Web user session state, based on two standard .NET collection classes. My code
is single threaded. Sure, multi-threading could be better, but with my code
design then I'd have to use multi-threaded versions of the collection classes,
IIRC, which ARE in .NET.

But, I'd guess that multi-threaded collection classes, due to the logic for
locking or other means of concurrency control, would be slower and not faster.

So, any thoughts on why, how multi-threaded could be so much faster, e.g., the
OP's 5X, not just for the OP here but in general and maybe general enough to
apply to my code?

By the way, with my code I get a weak version of multi-threaded because the
software _interface_ to my key-value store is just via standard TCP/IP sockets
moving byte arrays from object instance de/serialization. So, I'm taking
advantage of the standard TCP/IP FIFO (first in, first out) queue for the
incoming work to be done. I.e., more than one Web server can be sending a key-
value request to my key-value server at the same time; TCP/IP handles that
_multi-threading_; and I get a _weak_ version of multi-threading. Broadly I'm
wondering if having my actual code and the collection classes multi-threaded
has any chance of being faster: Okay, the server has 8 cores so that MIGHT be
the key to being faster.
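
A rough way to reason about the question is Amdahl's law: if a fraction p of the per-request work can run in parallel on n cores, the best-case speedup is 1 / ((1 - p) + p / n). The sketch below just evaluates that formula. It ignores locking overhead entirely, so real numbers would be worse, and the function names are made up for the example.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Best-case speedup when only part of the work can use extra cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def required_parallel_fraction(target_speedup: float, cores: int) -> float:
    """Amdahl's law solved for the parallel fraction needed to hit a target."""
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / cores)


for p in (0.50, 0.90, 0.99):
    print(f"parallel fraction {p:.2f} -> {amdahl_speedup(p, 8):.2f}x on 8 cores")

print(f"5x on 8 cores needs about {required_parallel_fraction(5, 8):.1%} parallel work")
```

Under this model a 5x gain on 8 cores is only plausible if the great majority of the time goes to work that does not touch the shared collection, such as serialization and socket handling. If most of the time is spent inside the locked collection, adding threads would help little.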

------
coleifer
Friendly reminder that Kyoto Tycoon might be an option. Has real persistence
(not just dump everything / reload everything, or cripple performance by
turning on aof), is multi-threaded, scriptable via lua, amazing performance.

~~~
manigandham
KT has been out of maintenance for years and doesn't have all the higher-level
useful data structures and operations that Redis does. There are a lot of
options if you just wanted fast key/value with persistence from ScyllaDB,
Tarantool, LMDB, RocksDB, etc.

~~~
coleifer
Eh, some of those are embedded, kt is a server like Redis or memcached. Kt was
good enough for cloudflare, if that's enough of an endorsement.

~~~
manigandham
I know, my point is that performance is rarely the need. Usability and useful
APIs are more important, and Redis excels at both.

Cloudflare stopped using KT because it wasn't much else other than simple and
fast, and was missing a lot of other features.

~~~
coleifer
This post / the topic was performance

~~~
derefr
Normally, when choosing components for a software stack, infra engineers are
pretty good about choosing the components with the fewest features that fit
the semantics that the software's design demands. You wouldn't choose Postgres
where sqlite would do; you wouldn't choose Kubernetes where a single machine
running Docker would do; etc. Components with simpler semantics are not only
lower-maintenance, but _usually_ can be more highly optimized due to having
fewer "gotcha" requirements going into building them.

Which is to say, if someone is looking for a "faster Redis", it's probably
because they originally went with Redis as the "least software they can get
away with" for their particular design needs.

And, therefore, any software that has more narrow semantics than Redis itself
is not, in fact, a viable "faster Redis", for anyone but those who had no
reason to be running Redis in the first place.

------
jhgg
This is pretty interesting. At work we run a few dozen cache nodes serving
several million QPS during peak. We were debating switching some of this
workload to memcached from redis - but this may be a viable alternative.

------
croh
This looks promising.

But I still love Redis.

1. It is one of the most reliable pieces of software I have used in
production.

2. Not everyone wants 'faster than light' software.

3. The simplicity of Redis blows my mind. It is very easy to maintain and I
almost never have a single issue.

4. These days everyone talks about scale but very few projects need
FAANG-level scale. Point is there are so many small projects where current
Redis fits seamlessly. Just making the case for Redis.

------
JackFr
While I get the performance improvement with multi-threading, won't you lose
ordering of updates?

In many use cases that's not a problem, but I imagine for some it's a
dealbreaker.

------
colinjfw
One big thing I like about redis is that it is single threaded.

------
gok
On 32 cores? Doesn't seem like particularly great scaling.

~~~
antirez
In contrast running N instances scales linearly. That's why Redis multi
threading just attempts to get the low hanging fruit of the write calls. For
serious scalability it's better to orchestrate multiple instances in our
vision.
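
A minimal sketch of the "N instances" approach from the client side: hash each key to a slot and map the slot to an instance. Redis Cluster derives slots as CRC16(key) mod 16384; this sketch follows that but skips hash tags and resharding. The instance addresses and the `pick_instance` helper are hypothetical.

```python
import binascii

SLOTS = 16384  # number of hash slots used by Redis Cluster

# Hypothetical instance addresses, for illustration only.
INSTANCES = ["redis-0:6379", "redis-1:6379", "redis-2:6379", "redis-3:6379"]


def key_slot(key: bytes) -> int:
    """Map a key to a hash slot using CRC16 (XMODEM variant) mod 16384."""
    return binascii.crc_hqx(key, 0) % SLOTS


def pick_instance(key: bytes, instances: list) -> str:
    """Choose the instance that owns this key's slot (simple modulo mapping)."""
    return instances[key_slot(key) % len(instances)]


for key in (b"user:1", b"user:2", b"session:abc"):
    print(key.decode(), "-> slot", key_slot(key), "->", pick_instance(key, INSTANCES))
```

Each instance stays single-threaded and owns a disjoint set of keys, so there is no shared state to lock. The trade-off is that multi-key operations only work when all the keys land on the same instance.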

------
not_a_cop75
I guess I can no longer use Redis. I mean what can the use case of normal
Redis possibly be?

~~~
kirstenbirgit
It works, it's simple, and it's battle tested.

~~~
not_a_cop75
Does anyone not read into the sarcasm? This is the same cliche format stating
that x is better than y, in such a manner to supposedly shame people still
using y.

~~~
gtirloni
Nobody is shaming anyone.

~~~
not_a_cop75
No one is shaming ANYONE?

Look around. Who is not shaming the oil and coal industry right now?

~~~
quickthrower2
There is a guideline on Hacker News (and this would apply to any online
conversation):

"Please respond to the strongest plausible interpretation of what someone
says, not a weaker one that's easier to criticize. Assume good faith."

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

------
stephenr
You went to the effort of forking Redis but didn’t bother to fix the glaring
lack of TLS support?

What is the fucking point of a replicated KV store if the connections between
nodes aren’t encrypted.

~~~
antirez
We have it in Redis 6, ETA for RC1 is end of year. However we did it the right
way with a socket abstraction layer, one of the reasons it took so long.

~~~
stephenr
Is there any relevant documentation/discussion available?

~~~
antirez
Yes, the implementation is in this PR:
[https://github.com/antirez/redis/pull/6236](https://github.com/antirez/redis/pull/6236)

~~~
finnh
Kudos to you for a calm response to an angry question.

