

The Great Redis Misapprehension - dickeytk
http://jeffdickey.info/the-great-redis-misapprehension

======
tabbyjabby
Hmm. The author's recommendation to use Redis' 'KEYS' command to pattern match
keys for invalidation is a dangerous one. 'KEYS' runs in O(n) time. If you're
using Redis in a serious production environment, you do not want to be
using an O(n) command each time you need to invalidate a group of keys. It's
better to group related items in a Redis hash table, and that way they are all
stored under one common key.

Honestly, I think this article is a more impoverished version of antirez's
post on the same topic [1]. antirez, being one of the principal authors of
Redis, is a much more authoritative source, and he actually describes all the
patterns that this author described in greater detail.

[1] <http://antirez.com/post/take-advantage-of-redis-adding-it-to-your-stack.html>

~~~
dickeytk
tabby, I've also updated the article to mention the fact that KEYS may not be
a solid use of the system. I would appreciate any feedback!

~~~
tabbyjabby
Sorry, I didn't mean to come across as a dick. I'm glad that you've taken the
feedback of your readers into consideration.

I'm a big fan of Redis, and it's a key component of our stack. Sorted sets are
useful for a lot more than just leader boards, though that is a good use case
for them. It's a bit late here, so I'm not feeling up to writing a big post,
but I'm considering writing my own blog post on my experiences with Redis.

~~~
dickeytk
I don't think you were at all. Many people mentioned the same thing you did.

I'm new at blogging, so I'm trying to take in everything and produce the best
content I can, so I love the feedback!

I would love to see a follow-up article too by the way!

------
typicalrunt
_Now, this is where Redis comes in. You can find keys on wildcards! So you can
just query it like so:_

keys post/83/*

No no no. This is slow and there is a large warning section in the notes
(<http://redis.io/commands/keys>) about using this in production environments.

[Edit: Why is this bad? Suppose you have millions of keys in your
environment. KEYS has to iterate over every one of them to find the ones
that match your pattern.]

As an alternative to using KEYS, Redis provides the hash object. Store
everything about an object in a hash (e.g.: "posts:83") and then just delete
the hash object. Everything under it will be removed as well. If you need to
know what's in the hash that is going to be deleted, use the HKEYS command
(which has no such warning about performance).
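
A minimal sketch of the hash-per-object idea described above, in Python. The `client` argument is assumed to expose redis-py style `hset`, `hkeys` and `delete` methods; the helper names and field names are hypothetical, and `FakeHashClient` is only an in-memory stand-in so the sketch runs without a server.

```python
# Hypothetical sketch: one hash per post, so invalidation is a single DEL
# instead of a KEYS scan over the whole keyspace.


class FakeHashClient:
    """In-memory stand-in for a Redis client (assumption, not a real library)."""

    def __init__(self):
        self.data = {}

    def hset(self, name, key, value):
        self.data.setdefault(name, {})[key] = value

    def hkeys(self, name):
        return list(self.data.get(name, {}))

    def delete(self, *names):
        # Like DEL, report how many of the named keys actually existed.
        removed = 0
        for name in names:
            if name in self.data:
                del self.data[name]
                removed += 1
        return removed


def post_key(post_id):
    """Key naming follows the "posts:83" example in the comment."""
    return f"posts:{post_id}"


def cache_fragment(client, post_id, field, value):
    """Store one cached fragment as a field of the post's hash."""
    client.hset(post_key(post_id), field, value)


def cached_fields(client, post_id):
    """List what is cached for the post (HKEYS), e.g. before deleting it."""
    return client.hkeys(post_key(post_id))


def invalidate_post(client, post_id):
    """Drop every cached fragment of the post with one DEL."""
    return client.delete(post_key(post_id))


client = FakeHashClient()
cache_fragment(client, 83, "body_html", "<p>hello</p>")
cache_fragment(client, 83, "comment_count", "12")
print(cached_fields(client, 83))
print(invalidate_post(client, 83))
```

Note that HKEYS is still linear, but only in the number of fields of that one hash, not in the size of the whole keyspace.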

~~~
dickeytk
Right, I know what the docs say, however, it's fine to use this for cache
invalidation (but maybe not other use-cases) in production.

The docs also say:

"While the time complexity for this operation is O(N), the constant times are
fairly low. For example, Redis running on an entry level laptop can scan a 1
million key database in 40 milliseconds."

If you have so many keys that this is an issue for cache invalidation, you
should be using memcached anyway, since it can be distributed, whereas in
Redis distribution is left up to you to figure out.

I wasn't able to dig it up, but I know I read an article about some consulting
group making a site for a major shoe company where they did exactly this.

Your hash method isn't the best way either though since it's more efficient to
store everything in individual key-value pairs, and hashes cannot be nested.

Really the BEST way to do this in Redis is to use a set containing all the
keys related to an object, then clear each of them out when destroying an
item.
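
The set-based approach could look roughly like this. It is a sketch under assumptions: `client` is expected to offer redis-py style `set`, `sadd`, `smembers` and `delete`; the key layout (`post:<id>:keys`) and the function names are invented for illustration; `FakeClient` is an in-memory stand-in so the code runs without Redis.

```python
# Hypothetical sketch: remember every cache key that belongs to a post in a
# set, then delete the tracked keys (and the set itself) on invalidation.


class FakeClient:
    """In-memory stand-in for a Redis client (assumption, not a real library)."""

    def __init__(self):
        self.data = {}

    def set(self, name, value):
        self.data[name] = value

    def sadd(self, name, *values):
        self.data.setdefault(name, set()).update(values)

    def smembers(self, name):
        return set(self.data.get(name, set()))

    def delete(self, *names):
        removed = 0
        for name in names:
            if name in self.data:
                del self.data[name]
                removed += 1
        return removed


def index_key(post_id):
    """Name of the set that tracks a post's cache keys (made-up convention)."""
    return f"post:{post_id}:keys"


def cache_value(client, post_id, key, value):
    """Write a plain key-value pair and record the key in the post's set."""
    client.set(key, value)
    client.sadd(index_key(post_id), key)


def invalidate_tracked(client, post_id):
    """Delete every tracked key plus the tracking set; returns keys removed."""
    index = index_key(post_id)
    tracked = client.smembers(index)
    return client.delete(index, *tracked)


client = FakeClient()
cache_value(client, 83, "post/83/body", "<p>hello</p>")
cache_value(client, 83, "post/83/sidebar", "<ul></ul>")
print(invalidate_tracked(client, 83))
```

Against a real server the write and the `sadd` are two separate commands, so wrapping them in a MULTI/EXEC transaction is probably worth considering.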

~~~
dickeytk
It's no authoritative source, but I'm obviously not the only one that thinks
this: <http://www.webdevelopment-blog.com/2011/07/puma-on-redis/>

~~~
tabbyjabby
Why is it more efficient to store everything in individual k-v pairs? Hash
tables are certainly more memory efficient, and the difference in CPU
efficiency is so slight as to be inconsequential.

~~~
dickeytk
doh! I got it backwards. You are right that hashes are more efficient:
<http://redis.io/topics/memory-optimization>

However, for this particular problem, the docs do say:

Don't use KEYS in your regular application code. If you're looking for a way
to find keys in a subset of your keyspace, consider using sets.

I wonder why they don't suggest hashes?

~~~
tabbyjabby
Because you might want to use sets to group _any_ type of keys, including
hashes.

------
nostrademons
I'm curious why you would use Redis instead of just using in-memory
datastructures in your app servers? It's trivially easy to implement a
leaderboard as a priority queue, for example. And it eliminates the need to
run yet another server and deal with the associated RPC & command parsing
overhead.
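
For reference, an in-process leaderboard of the kind mentioned above might be sketched like this in Python. The class and method names are made up; it keeps scores in a dict and uses the standard library's `heapq.nlargest` for top-N queries.

```python
import heapq


class Leaderboard:
    """In-process leaderboard: a dict of scores, heap-based top-N lookup."""

    def __init__(self):
        self.scores = {}

    def add_points(self, player, points):
        """Add points to a player's running total."""
        self.scores[player] = self.scores.get(player, 0) + points

    def top(self, n):
        """Return the n highest (player, score) pairs, best first."""
        return heapq.nlargest(n, self.scores.items(), key=lambda item: item[1])


board = Leaderboard()
board.add_points("alice", 10)
board.add_points("bob", 30)
board.add_points("carol", 20)
board.add_points("alice", 15)
print(board.top(2))
```

This state lives inside a single process, so it is lost on restart and is not shared between app servers.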

That's the approach Hacker News and Viaweb took, along with Mailinator and
probably several other startups.

~~~
patio11
_I'm curious why you would use Redis instead of just using in-memory
datastructures in your app servers?_

I make fairly pedestrian use of Redis, generally as either a persistent cache,
shared memory, or schemaless DB shared by multiple Rails processes. In-memory
structures have a lot to anti-recommend them in the Rails world: at any given
time I have 4 server processes and 2 worker processes running, and each of
them would need a separate copy of everything. There would be consistency
problems. Those processes have a lifetime measured in days in the best of
cases to minutes in the worst of cases: following a restart, any in-memory
structures have to be rebuilt from the underlying data source. Hypothetically
assuming demand for my products explodes and I can no longer deal with only a
single physical server, Redis plays very well with being accessed from
multiple servers, whereas I'd have to write some sort of REST API to
reimplement Redis (poorly) on top of my actual people-pay-money-for-this
application code to share that state among multiple physical servers, if I
were to go down that route.

Redis has been an absolute dream to administrate: the total overhead for me
was "apt-get install redis-server", adding three lines of configuration to
Rails and tweaking two in Redis, and doing one SCP command when I migrated
servers. The RPC/command parsing overhead is, empirically, negligible in my
use cases. Don't take this advice if you're Google (I know you're Google, but
for the general "you" here), but many people are not Google.

~~~
nostrademons
Ah, I was kinda assuming that there's already a separate appserver tier
distinct from the webservers. If you don't have that, I can see how something
like Redis might be a useful intermediate step so you don't have to go build
one until there's actually a substantive need for it.

It's kinda like a complement to memcached then, right? Memcached gives you an
off-the-shelf distributed hashtable that you can stick things in. Redis gives
you an off-the-shelf list or heap server that you can stick things in. You
might eventually want more control of the algorithms that you can run on
these, but if it's not yet worth setting up a separate server, you can glue
these components together and get a decent approximation.

~~~
patio11
_Ah, I was kinda assuming that there's already a separate appserver tier
distinct from the webservers._

That's kinda an enterprise-y architecture choice in my experience. There are
excellent reasons for it (much like Service Oriented Architecture) but I
generally see folks evolve into it over time rather than starting from it,
unless they come from an enterprise-y background where it's assumed from the
beginning. In particular, Rails and some other opinionated frameworks start
from the assumption that 99%+ of the business logic is going to get executed
in the web tier, and while I'll bet you that some of the more famous Rails
deployments eventually move away from that, Rails would fight you every step
of the way if you were trying to do it in greenfield development.

Redis makes a great complement (or drop-in replacement depending on use case)
for memcached. Relatedly, I love how these (and other OSS tools) let little
guys play with big boy solutions without having to have big boy budgets or
organizational resources to make use of them. I think Facebook probably has
about 10 terabytes more memcached than I do, but it turns out that memcached
is really freaking useful way down the scaling/complexity curve, too.

------
philjackson
_"It has some persistence support, but does not appear to be super durable. If
you're thinking of using like that though, you're misunderstanding the tool."_

This warrants a _much_ more detailed explanation. The author should have
spoken about possible durability options, like the append-only file, which,
given the right configuration, makes Redis "fully-durable" at the inevitable
sacrifice of some speed.
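
As a rough illustration of the append-only-file option, the sketch below turns on AOF with the strictest fsync policy at runtime. `appendonly` and `appendfsync` are real Redis configuration directives; the `client` object is assumed to expose redis-py style `config_set` and `config_get`, and `FakeConfigClient` is only an in-memory stand-in so the code runs without a server. The same two directives can be set in redis.conf instead.

```python
# Hypothetical sketch: enable the append-only file and fsync on every write.

AOF_SETTINGS = {
    "appendonly": "yes",      # turn on the append-only file
    "appendfsync": "always",  # fsync after every write: most durable, slowest
}


class FakeConfigClient:
    """In-memory stand-in for a Redis client (assumption, not a real library)."""

    def __init__(self):
        self.config = {"appendonly": "no", "appendfsync": "everysec"}

    def config_set(self, name, value):
        self.config[name] = value

    def config_get(self, pattern):
        # The stand-in only handles exact names, not glob patterns.
        return {pattern: self.config[pattern]}


def enable_strict_aof(client, settings=AOF_SETTINGS):
    """Apply the settings with CONFIG SET and read them back with CONFIG GET."""
    for name, value in settings.items():
        client.config_set(name, value)
    return {name: client.config_get(name)[name] for name in settings}


print(enable_strict_aof(FakeConfigClient()))
```

Setting `appendfsync` to `everysec` instead trades a possible second of lost writes for noticeably better throughput.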

A more informative read would be: <http://redis.io/topics/persistence>

------
Detrus
_so if you've been thinking it's persistent, get that OUT of your head now!_

On persistence, from antirez:
<http://antirez.com/post/short-term-redis-plans.html>

_We want also work both in the communication (most users don't understand that
Redis with both AOF and RDB enabled is very durable already, and this is the
setup we suggest) and the implementation to make sure that Redis AOF can be a
very durable solution, as durable as the best SQL databases out there._

------
antirez
This post is not correct, but of course the biggest problems are the claim
that Redis is not persistent, and the use of the KEYS command.

------
pkulak
If I was greenfielding a new project tomorrow, I'd use a Redis/MySQL combo.
MySQL for all the data in perfect first-normal form, and then Redis for
storing difficult joins, caches, queues, etc. It would be a perfect marriage.

~~~
dickeytk
YES! It's a winning powerhouse.

------
jbwyme
_For some reason, people seem to think it's a key-value store, or some
persistent database, but that's totally not it at all._

From what I understand, it is actually a key-value store and is basically a
superset of memcached. Therefore, if my understanding is true you could use it
merely as a key-value store as well and use its other features (native sets,
lists, etc., pub-sub, and persistence) as needed.

~~~
dickeytk
Your understanding is correct, but those 'other features' are right where Redis
becomes interesting.

If you just want a better memcache, use membase.

------
jasonjei
I use Redis as a distributed locking mechanism too, especially with its setex
feature which can help reduce deadlocks. We have a UUID string as the value
for the key, and the only way to release the lock is that the UUID must be
passed and matched against the value in Redis.
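
A lock along these lines might be sketched as follows. This is not the commenter's code: it assumes `SET` with the `NX` and `EX` options (rather than a separate SETEX call) so that acquiring and setting the expiry happen in one step, and a small Lua script so that the compare-and-delete on release is atomic. `client` is assumed to expose redis-py style `set` and `eval`; the function names are hypothetical, and `FakeLockClient` is an in-memory stand-in that has no expiry and does not run Lua.

```python
import uuid

# Delete the lock only if it still holds our token.
RELEASE_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
end
return 0
"""


class FakeLockClient:
    """In-memory stand-in; it only mimics the two calls used below."""

    def __init__(self):
        self.data = {}

    def set(self, name, value, nx=False, ex=None):
        if nx and name in self.data:
            return None
        self.data[name] = value
        return True

    def eval(self, script, numkeys, key, token):
        # Mimics RELEASE_SCRIPT: delete only when the stored token matches.
        if self.data.get(key) == token:
            del self.data[key]
            return 1
        return 0


def acquire_lock(client, name, ttl_seconds=30):
    """Try to take the lock; return the UUID token on success, else None."""
    token = str(uuid.uuid4())
    if client.set(name, token, nx=True, ex=ttl_seconds):
        return token
    return None


def release_lock(client, name, token):
    """Release the lock only if `token` matches the stored value."""
    return client.eval(RELEASE_SCRIPT, 1, name, token) == 1


client = FakeLockClient()
token = acquire_lock(client, "lock:report")
print(token is not None)
print(release_lock(client, "lock:report", token))
```

The expiry is what limits deadlocks: if the holder crashes, the key disappears after `ttl_seconds` and another client can acquire the lock.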

------
Mikushi
Valid article, mostly. But the point is still valid, and I feel that most
people using Redis do not get it, it is a data structure server, it's not that
hard to understand...

Use it as such and you won't be disappointed.

~~~
dickeytk
Yes!! Thank you so much. That's exactly the point I was trying to make.

------
tkahn6
In order to argue that redis isn't a database you have to first define what a
database is and explain why redis doesn't fit that definition. All the author
did was assert that redis isn't a database and give 3 use cases.

This is just bad writing, to be blunt.

And for what it's worth antirez wrote an HN clone using redis as the database.

------
freeformz
Hah! I've been saying this for a while. Good to see it on Hacker News!

