
Realtime metrics using Redis bitmaps - rubyorchard
http://blog.getspool.com/2011/11/29/fast-easy-realtime-metrics-using-redis-bitmaps/
======
antirez
It is very good to see this article and Redis bitmaps being put to use, since
they are an extremely memory-efficient way to store data and, given the
encoding, also a very fast way to fetch large amounts of information.

I really suggest also looking at the GETRANGE and SETRANGE operations, which
let you access sub-ranges of a large bitmap, fetching or setting arbitrary
ranges of bits.
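
To make the sub-range idea concrete, here is a minimal sketch in Python of the offset arithmetic a client would need. It assumes the SETBIT bit ordering (bit 0 is the most significant bit of byte 0) and that GETRANGE takes inclusive byte offsets. The helper names `byte_span` and `bits_from_chunk` are made up for illustration, and a local `bytes` value stands in for the server reply, so nothing here talks to Redis.

```python
def byte_span(first_bit, last_bit):
    """Return the inclusive (start, end) byte offsets covering a bit range."""
    if first_bit < 0 or last_bit < first_bit:
        raise ValueError("expected 0 <= first_bit <= last_bit")
    return first_bit // 8, last_bit // 8


def bits_from_chunk(chunk, chunk_start_byte, first_bit, last_bit):
    """Extract bits first_bit..last_bit from a chunk fetched at chunk_start_byte.

    Assumes bit 0 of the bitmap is the most significant bit of byte 0.
    """
    bits = []
    for offset in range(first_bit, last_bit + 1):
        byte_index = offset // 8 - chunk_start_byte
        # Bytes past the end of the stored string are treated as zero.
        byte = chunk[byte_index] if byte_index < len(chunk) else 0
        bits.append((byte >> (7 - offset % 8)) & 1)
    return bits


# Stand-in for the stored value: bits 1, 9 and 10 are set.
bitmap = bytes([0b01000000, 0b01100000, 0b00000000])
start, end = byte_span(9, 12)      # byte offsets to request from the server
chunk = bitmap[start:end + 1]      # simulates the reply for that byte range
print(bits_from_chunk(chunk, start, 9, 12))  # [1, 1, 0, 0]
```

Only one byte is transferred here instead of the whole bitmap, which is the point of fetching by range.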

Probably Lua scripting in 2.6 will enable a lot more interesting things
combining server side execution and bitmaps. Thanks for sharing this work.

~~~
pbrumm
It would be really cool if Redis supported counting the high bits in a string
as a native command. No need to pull a 16k string down to the client.

Although it would be another command in a growing list of them. Maybe
something for Lua.

~~~
pjscott
This is something that should probably be implemented natively, for speed.
Preferably with GCC's __builtin_popcount intrinsic, if it's available.
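
For comparison, this is roughly what the client-side version looks like today: pull the whole string down and count the set bits locally. It is only a sketch in Python, with a locally built 16 KiB buffer standing in for the fetched value, and the function name `count_set_bits` is hypothetical.

```python
def count_set_bits(data):
    """Count the 1 bits in a byte string that was fetched from the server."""
    # Treat the whole string as one big integer and count the ones
    # in its binary representation.
    return bin(int.from_bytes(data, "big")).count("1")


# 16 KiB buffer with three bits set, standing in for a fetched bitmap.
sample = bytearray(16 * 1024)
sample[0] = 0b10000000
sample[100] = 0b00000011
print(count_set_bits(bytes(sample)))  # 3
```

The counting itself is cheap. The cost being discussed is shipping all 16 KiB over the network just to get a single integer back.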

------
pshc
How does it compare with just using sets? Usually only a small fraction of
your total userbase is online any given day, especially as dead accounts
accumulate... wouldn't the bitmaps be inefficiently sparse?

~~~
seiji
You can do clever things with word aligned hybrid bitmaps to heavily compress
sparse bitmaps.

Though, if you are storing one bit per user, 700 million users take less than
90 MB of space (assuming a straight array implementation). Old days wouldn't
_need_ to be kept in memory. If you want to think beyond Redis, you could keep
the active data in Redis and then flush old data to disk for later querying.
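
The size estimate is easy to check. A minimal sketch of the arithmetic in Python, assuming a flat array with exactly one bit per user and decimal megabytes (the function name `bitmap_bytes` is just for illustration):

```python
def bitmap_bytes(user_count):
    """Bytes needed for a flat bitmap with one bit per user, rounded up."""
    return (user_count + 7) // 8


size = bitmap_bytes(700_000_000)
print(size)               # 87500000
print(size / 1_000_000)   # 87.5 (decimal megabytes)
```

That is the cost per bitmap, so per day if you keep one bitmap per day, regardless of how many of those users were actually active.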

Sounds useful if your users map to a (0, Max] integer representation. Sounds
complicated if you use uuids or external vendor IDs for users anywhere (you'd
need an intermediate mapping table somewhere).

------
cmdrkeene
Here's a quick ruby implementation:

<https://github.com/cmdrkeene/bitmap-counter>

~~~
rubyorchard
Awesome!

------
wahnfrieden
An honest question: what use are realtime metrics if you can't act on them in
realtime? If it takes a day or more to gather enough data to make a decision
and react to it with code or otherwise, then anything more than daily metrics
seems like a distraction. I suppose it can be useful if you have some automated
systems in place or if you're using it for alarms.

~~~
jphackworth
If you're pushing changes to production many times a day, you can use realtime
metrics to notice if anything goes awry, and know when you should roll back.
It won't give you the same nuances as an A/B test over a month, but it's a
great way to do a quick double check to keep things moving quickly while
avoiding disaster.

------
yread
How do they know that user ids are contiguous?

~~~
darklajid
The same way the length of a meter is known. By defining it. They are
measuring visitors, they can assign any numbers they like to those.

In my (limited) use cases for redis which interfaced with external information
like that I usually had a mapping table anyway, to get something like

users:nameFromExternalSource:myIncrementalId

foo:myId:ThatUsersFoo

bar:myId:ThatUsersBar

Note again that I'm no expert on redis. There might be problems with that
approach that I don't know about - but for me this worked out quite well and
seemed to save memory vs any foo:externalIdOrUsername:dataHere name scheme.
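
A minimal sketch of such a mapping table, in Python with a plain dict standing in for the store. The class name `IdMapper` and the example external ids are hypothetical. In Redis this would presumably be a counter plus a lookup key per external id, but that part is an assumption and not shown.

```python
class IdMapper:
    """Assign dense incremental ids to arbitrary external ids (in-memory stand-in)."""

    def __init__(self):
        self._ids = {}      # external id -> incremental id
        self._next_id = 0   # next unused id, usable as a bit offset

    def id_for(self, external_id):
        """Return the existing id for external_id, or allocate the next one."""
        if external_id not in self._ids:
            self._ids[external_id] = self._next_id
            self._next_id += 1
        return self._ids[external_id]


mapper = IdMapper()
a = mapper.id_for("vendor:9f3c")
b = mapper.id_for("uuid:1b4e28ba")
print(a, b, mapper.id_for("vendor:9f3c"))  # 0 1 0
```

Because ids are handed out in order of first appearance, they stay contiguous by construction, which is what makes them usable as bit offsets.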

------
joshu
Compressed bitmaps would be cool too.

~~~
est
Are there any cool bit level compressors?

You can take advantage of sparse bits and do logic operations on them in
their compressed state.
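
As a rough illustration of operating on compressed bitmaps, here is a sketch in Python that ANDs two bitmaps stored as runs of set bits, without expanding them. This is plain run-length encoding chosen for brevity, not the word-aligned hybrid scheme mentioned earlier in the thread, and the names `to_runs` and `and_runs` are made up for the example.

```python
def to_runs(positions):
    """Encode sorted, distinct set-bit positions as half-open runs (start, end)."""
    runs = []
    for pos in positions:
        if runs and runs[-1][1] == pos:
            runs[-1][1] = pos + 1          # extend the current run
        else:
            runs.append([pos, pos + 1])    # start a new run
    return [tuple(run) for run in runs]


def and_runs(a, b):
    """Intersect two run lists without expanding them back to individual bits."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever run finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


monday = to_runs([3, 4, 5, 6, 1000, 1001])
tuesday = to_runs([5, 6, 7, 1001, 5000])
print(and_runs(monday, tuesday))  # [(5, 7), (1001, 1002)]
```

The work is proportional to the number of runs, not the number of bits, which is why sparse data benefits.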

------
mythz
Cool!

