
Lazy Redis is better Redis - mostafah
http://antirez.com/news/93
======
nxb
Is there a document that lists out Redis best practices like "Redis is very
fast as long as you use O(1) and O(log_N) commands"?

Sure it's probably all obvious things, but it would be nice to have a
checklist to skim over, to be sure I haven't forgotten any major consideration
when designing a new system.

~~~
eric_bullington
> "Redis is very fast as long as you use O(1) and O(log_N) commands"

Well, like you say, to me that particular practice would be pretty obvious.
More helpful would be a page that shows the time and space complexity of the
various commands, akin to the following page for Python:

[https://wiki.python.org/moin/TimeComplexity](https://wiki.python.org/moin/TimeComplexity)

Edit: Yeah, like the Redis documentation, lol. I think it says a lot for Redis
that I almost never need to visit the docs, other than the single page that
lists all the commands (back when it was red was the last time I actually
needed to look at it). Once you know the commands, quickly reading the
changelogs provides me with all I need. Anyway, all that to say I had no idea
Redis already provided this complexity information; I don't remember it having
this back when I first learned Redis (although maybe it did). Good to
know.

~~~
latch
You mean, akin to the redis documentation?

[http://redis.io/commands/sort](http://redis.io/commands/sort)

"Time complexity: O(N+M*log(M)) where N is the number of elements in the list
or set to sort, and M the number of returned elements. When the elements are
not sorted, complexity is currently O(N) as there is a copy step that will be
avoided in next releases."
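
A rough way to see that bound in practice is to time SORT against lists of
growing size. This is just a sketch, assuming a local Redis server and the
third-party redis-py client; key names are illustrative.

    import time
    import redis  # third-party redis-py client, assumed installed

    r = redis.Redis()  # assumes a local Redis on the default port

    # Time SORT on lists of growing length N while always returning M=10
    # elements; per the documented O(N+M*log(M)) bound, the time should
    # grow roughly linearly with N here.
    for n in (1_000, 10_000, 100_000):
        r.delete('mylist')
        r.rpush('mylist', *range(n))
        t0 = time.perf_counter()
        r.sort('mylist', start=0, num=10)
        print(f"N={n}: {(time.perf_counter() - t0) * 1000:.2f} ms")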

------
mwsherman
As I was reading through, the back of my mind was saying “Nice heuristics but
now you’re adding uncertainty. The behavior of DEL could vary dramatically and
users would not know why.”

I was relieved that antirez recognized this as a semantic change and gave it a
new name. Very thoughtful.

Perhaps there is something to be learned from GC algorithms in this case?

------
korzun
> Everybody knows Redis is single threaded.

You would be surprised.

The whole point of Redis is to run it on machines with powerful
single-threaded performance. I can't count the number of times 'experts' threw
it on multi-core beasts of systems and were surprised when a single-core
machine that cost half as much destroyed their provisioning.

At the end of the day, latency is king.

~~~
antirez
Agreed. Worth noting that it's always better to have at least two cores per
Redis process if persistence is enabled, since the saving process will burn a
full core from time to time.

------
seivan
I am probably missing something here, but if a delete happens asynchronously,
wouldn't the key still be available? What happens if you check whether that
key still exists in a different operation while you're deleting it?

Also, how slow would it be to rename the key before deleting it?

~~~
janerik
The key is removed from the main hashtable and thus not accessible via the
normal code path. It's then appended to an internal list for deletion. That's
basically just moving a pointer around.

A manual rename is not needed.
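
A toy sketch of that idea in plain Python (not Redis's actual C internals):
the database is a dict, and the lazy delete pops the entry so lookups miss
immediately, then hands the object to a background thread to reclaim.

    import threading
    import time
    from collections import deque

    # Toy model: the main hashtable is a dict; "unlinking" a key is an O(1)
    # pop that makes it invisible to lookups, plus an O(1) append that hands
    # the object to the background reclaimer. Just pointer shuffling.
    db = {'bigkey': set(range(1_000_000))}
    lazy_free_queue = deque()

    def unlink(key):
        value = db.pop(key, None)          # removed from the main hashtable
        if value is not None:
            lazy_free_queue.append(value)  # background thread will reclaim it

    def lazy_free_worker():
        while True:
            if lazy_free_queue:
                lazy_free_queue.popleft().clear()  # stand-in for freeing memory
            else:
                time.sleep(0.01)

    threading.Thread(target=lazy_free_worker, daemon=True).start()

    unlink('bigkey')           # returns immediately
    print('bigkey' in db)      # False: a GET/EXISTS misses right away

Since the entry leaves the hashtable synchronously, a concurrent GET or EXISTS
misses immediately; only the memory reclamation is deferred.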

~~~
seivan
Ah, appreciated. A lot of it was going over my head.

------
tedd4u
He kind of buried the lede: he intends to implement thread-per-connection:

"... it is finally possible to implement threaded I/O in Redis, so that
different clients are served by different threads. This means that we’ll have
a global lock only when accessing the database, but the clients read/write
syscalls and even the parsing of the command the client is sending, can happen
in different threads. This is a design similar to memcached, and one I look
forward to implement and test."

~~~
antirez
Absolutely not, it's the memcached model: there are N threads (like 4 or 8, or
one per core) serving all the clients in a multiplexed way, as happens now
with a single thread.

If we go the extra mile and also implement background slow operations, we'll
send them to another "ops" thread, block the client, and resurrect the client
and send the result when it's available, like we do currently for blocking
operations such as BLPOP.
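
A schematic sketch of that model, with plain Python queues standing in for
each thread's multiplexed client sockets (this is not the planned
implementation, just the shape of it): command parsing happens in the I/O
threads, and only the actual database access takes the global lock.

    import threading
    import time
    import queue

    N_THREADS = 4
    db = {}
    db_lock = threading.Lock()
    # Each thread multiplexes its own set of clients; a queue stands in for
    # the readable sockets that thread is watching.
    client_queues = [queue.Queue() for _ in range(N_THREADS)]

    def io_worker(q):
        while True:
            raw = q.get()                    # read one client request
            cmd, *args = raw.split()         # command parsing: no lock needed
            with db_lock:                    # global lock only around the data
                if cmd == 'SET':
                    db[args[0]] = args[1]
                elif cmd == 'GET':
                    print(db.get(args[0]))

    for q in client_queues:
        threading.Thread(target=io_worker, args=(q,), daemon=True).start()

    # Clients are spread across threads; each thread serves many of them.
    client_queues[0].put('SET foo bar')
    time.sleep(0.1)                          # crude ordering for the demo
    client_queues[1].put('GET foo')
    time.sleep(0.1)                          # let the worker print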

~~~
tedd4u
Okay, I get it -- thanks.

------
vinay_ys
It's a nightmare to do perf/scale testing and capacity planning with a backend
whose performance characteristics change with the dataset and the operations
being performed, and in a fuzzy way at that, due to 'lazy'. I like memcached
way better than this simply because its API operations are all deterministic.

~~~
antirez
As specified in the article, DEL remains a blocking DEL. It's application code
that calls UNLINK when a background release of memory is needed because of low
latency requirements. However, 50% of the blog post explains how the
implementation avoids creating problems, by releasing objects faster than you
can allocate new ones, in order to avoid nondeterministic memory behavior[1].
Anyway, memcached is a lovely product, and I hope it will be developed more in
the future so that users have a choice. But in the context of our discussion,
there is no equivalent in memcached for this feature, since objects are
composed of a single allocation. You get the same if you just use the Redis
SET / GET / DEL commands.

[1] This is also handled in the LRU code. If there is nothing to free but
there are objects in the free list, the server waits for the background thread
to make progress before continuing. For expiration and eviction of keys there
will be explicit options where the user specifies whether server-deleted
objects should be deleted in a lazy or blocking way, with the old plain
blocking behavior as the default.
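
To make the DEL-vs-UNLINK contract concrete, here is a client-side sketch. It
assumes the third-party redis-py client and a server new enough to have UNLINK
(it shipped in Redis 4.0); the key names are made up.

    import time
    import redis  # third-party redis-py client, assumed installed

    r = redis.Redis()  # assumes a local Redis >= 4.0 (UNLINK support)

    def timed(label, fn):
        t0 = time.perf_counter()
        fn()
        print(f"{label}: {(time.perf_counter() - t0) * 1000:.2f} ms")

    # Build two identical large sets, then compare the blocking DEL against
    # the lazy UNLINK: only the command's own latency should differ, since
    # UNLINK defers the memory reclamation to the background thread.
    for key in ('big:del', 'big:unlink'):
        for i in range(0, 1_000_000, 10_000):
            r.sadd(key, *range(i, i + 10_000))

    timed('DEL   ', lambda: r.delete('big:del'))
    timed('UNLINK', lambda: r.unlink('big:unlink'))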

------
bipin_nag
I have a question. UNLINK will be more responsive than DEL, since it doesn't
block and runs the deletion in the background. But will subsequent requests be
faster than with DEL? How will it behave if immediately followed by a GET?

For the background processing, I think it keeps a list of elements to delete.
So when a GET is received, it will check against that list too. The overhead
will remain while you have to process incoming queries and the deletion is not
finished yet.

DEL followed by GET takes x sec (mostly due to DEL).

// GET time after DEL is not affected.

UNLINK followed by GET1, GET2, ... takes y1, y2, ... sec.

// Will y1 > y2 > ... until the deletion finishes?

Is this correct? It looks like a trade-off: improving current latency at the
cost of later latency. (I feel it is worth it.)
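
One way to test the y1 > y2 hypothesis empirically, sketched under the same
assumptions as above (the redis-py client, a server with UNLINK, i.e. Redis
>= 4.0, and made-up key names):

    import time
    import redis  # third-party redis-py client, assumed installed

    r = redis.Redis()  # assumes a local Redis >= 4.0 (UNLINK support)

    # Populate a large set, UNLINK it, then time a burst of GETs on another
    # key while the background thread is presumably still reclaiming memory.
    for i in range(0, 1_000_000, 10_000):
        r.sadd('victim', *range(i, i + 10_000))
    r.set('probe', 'x')

    r.unlink('victim')                 # returns quickly
    for n in range(5):
        t0 = time.perf_counter()
        r.get('probe')
        print(f"GET #{n + 1}: {(time.perf_counter() - t0) * 1e6:.0f} us")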

