

200,000,000 keys in Redis 2.0.0-rc3 - nochiel
http://blog.zawodny.com/2010/07/24/200000000-keys-in-redis-2-0-0-rc3/

======
rbranson
Try out Tokyo Tyrant. It'll use at least half the space.

------
ergo98
The key is a long cast to a string, and the value is an int. So 2.23GB of data
became 24GB. Quite a large loss factor there.

Calculating the load capacity seems like it should be a trivial exercise and
shouldn't require a test-it-and-see approach.

~~~
antirez
If you turn a txt file with a list of "common surnames -> percentage of
population" into a binary tree, it will grow by roughly an order of magnitude
in memory compared to the raw txt file.

This is a common pattern: once you add a lot of metadata for fast access,
memory management, "zero-copy" transmission of this information, expires, and
so on, the size is not going to be the size of all this data concatenated
into a single string.
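
A minimal Python sketch of that point (my own illustration, not Redis internals): the same surname -> percentage data costs roughly an order of magnitude more once it sits in an indexed in-memory structure than it does as raw text.

```python
import sys

# Raw "surname -> percentage" data as it might appear in a txt file.
pairs = [("smith", 1.006), ("johnson", 0.810), ("williams", 0.699),
         ("brown", 0.621), ("jones", 0.621)]

raw_text = "\n".join(f"{name} {pct}" for name, pct in pairs)
raw_size = len(raw_text.encode())

# The same data in an in-memory map: every entry carries hashing metadata,
# pointers, and per-object headers on top of the payload bytes.
table = dict(pairs)
indexed_size = sys.getsizeof(table)
indexed_size += sum(sys.getsizeof(k) + sys.getsizeof(v)
                    for k, v in table.items())

print(raw_size, indexed_size)
```

The exact ratio depends on the runtime, but the indexed form is consistently several times larger than the concatenated text.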

If we want, we can save more memory per key, with some tradeoffs. Example:
instead of having a hash table where every hash slot is a pointer (holding
the key) -> pointer (holding the value) map, we can store each slot as a
single string of <len>key<ptr><len>key<ptr>... for all the colliding keys,
then tune the hash table to have on average 5 or 10 collisions per hash slot.

That is: a length-prefixed string for the key, followed by a pointer (which is of fixed size).
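
A hypothetical sketch of that packed-slot layout (the function names and the 4-byte/8-byte field widths are my assumptions, not Redis code): colliding keys live inline in one contiguous string, and a lookup linearly scans the slot.

```python
import struct

def pack_bucket(entries):
    """entries: list of (key_bytes, ptr_int) that hash to the same slot.
    Layout per entry: <4-byte len><key bytes><8-byte fixed-size pointer>."""
    out = bytearray()
    for key, ptr in entries:
        out += struct.pack("<I", len(key))   # length prefix
        out += key                           # key bytes, inline
        out += struct.pack("<Q", ptr)        # fixed-size "pointer"
    return bytes(out)

def bucket_lookup(bucket, key):
    """Scan the packed bucket for key; return its pointer, or None."""
    i = 0
    while i < len(bucket):
        (klen,) = struct.unpack_from("<I", bucket, i)
        i += 4
        if bucket[i:i + klen] == key:
            return struct.unpack_from("<Q", bucket, i + klen)[0]
        i += klen + 8
    return None

bucket = pack_bucket([(b"foo", 0x1000), (b"foobar", 0x2000), (b"baz", 0x3000)])
print(hex(bucket_lookup(bucket, b"foobar")))  # → 0x2000
```

With 5-10 collisions per slot, the scan stays cheap while the per-key cost drops from two pointers plus a heap-allocated key object to just the inline length, bytes, and one pointer.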

Maybe we'll do this in the future; in Redis master we are already doing a lot
of this on the "value" side of the business. So if you store a list of 10
elements, it will use an amount of memory very similar to storing the same
bytes as a comma-separated value.
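
A rough sketch of that value-side idea, assuming a ziplist-like encoding rather than Redis's actual format: a small list serialized into one contiguous blob with 1-byte length prefixes costs barely more than the raw bytes, instead of one heap object plus pointer per element.

```python
def encode_small_list(items):
    """Serialize a list of short strings into one contiguous byte string:
    <1-byte len><bytes> per element (elements must be under 256 bytes)."""
    out = bytearray()
    for item in items:
        data = item.encode()
        assert len(data) < 256
        out.append(len(data))
        out += data
    return bytes(out)

def decode_small_list(blob):
    """Walk the blob and recover the original list of strings."""
    items, i = [], 0
    while i < len(blob):
        n = blob[i]
        items.append(blob[i + 1:i + 1 + n].decode())
        i += 1 + n
    return items

elems = [f"item{n}" for n in range(10)]
blob = encode_small_list(elems)
print(len(blob), decode_small_list(blob) == elems)
```

Here the 10-element list costs one byte of overhead per element, which is essentially the comma in a comma-separated value.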

But for now our reasoning is: it's not bad to be able to store 1 million keys
in less than 200 MB of memory (100 MB on 32-bit systems) if an entry-level
box is able to serve this data at a rate of 100k requests/second, including
the networking overhead. And with hashes we get much better memory
performance compared to top-level keys. So... with a few GB our users can
store tens or hundreds of millions of items in a Redis server.
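
The per-key arithmetic behind those figures, for reference:

```python
# Back-of-the-envelope check of the figures above.
keys = 1_000_000
bytes_per_key_64 = 200 * 1024 * 1024 / keys   # < 200 MB on 64-bit systems
bytes_per_key_32 = 100 * 1024 * 1024 / keys   # < 100 MB on 32-bit systems
print(round(bytes_per_key_64), round(bytes_per_key_32))  # → 210 105
```

So the claimed overhead is roughly 200 bytes per top-level key on 64-bit and half that on 32-bit, where pointers are half the size.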

Better to focus on stability and clustering for now... and once we have a
system that can scale transparently, we'll look at memory overhead again and
check whether it's still important to reduce it, or whether, since we want to
be a very high performance DB, it's just better to accept the tradeoff.

~~~
ergo98
Interesting post. Thank you.

