
100 more of those BITFIELDs - fcambus
http://antirez.com/news/103
======
derefr
With the #N syntax, Redis is getting dangerously close to having C-struct-
typed fields (i.e. product types with a known bit-level encoding.)

I could imagine a STRUCT LOAD command (like SCRIPT LOAD) to push a struct
definition to the server; and then STRUCT GETFIELD and STRUCT SETFIELD
commands to manipulate the structs. The GETFIELD and SETFIELD commands would
just boil down to the same BITFIELD command after dereferencing. (Heck, Redis
could probably cheaply tag the fields with their struct definition, and then
GETFIELD and SETFIELD would be _polymorphic_.)

...thinking about this more, this is possible _now_ , isn't it? The struct-
definition loader command, and the dereferencing commands, can all just be Lua
functions passed to EVAL. The struct definitions could be loaded into a sorted
set; the dereferencing commands would read from that set and then translate
your query into a call to this BITFIELD command to do the work.

Anyone care to implement this silliness? :)
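
For illustration, here is a minimal Python sketch of the dereferencing step described above: it turns a struct definition into the BITFIELD arguments a client (or a Lua script) would send. The names `layout`, `getfield`, `setfield`, the `PLAYER` struct and the key `player:1` are all hypothetical, and fields are assumed to be packed back to back with no padding. Nothing here talks to a server; it only builds the command.

```python
def parse_width(type_spec):
    """Return the bit width of a BITFIELD type such as 'u8' or 'i16'."""
    if type_spec[0] not in ("u", "i"):
        raise ValueError("expected a type like u8 or i16, got %r" % type_spec)
    return int(type_spec[1:])


def layout(fields):
    """Map each field name to (type, absolute bit offset), packed in order."""
    struct = {}
    position = 0
    for name, type_spec in fields:
        struct[name] = (type_spec, position)
        position += parse_width(type_spec)
    return struct


def getfield(key, struct, name):
    """Build the BITFIELD GET arguments for one struct field."""
    type_spec, offset = struct[name]
    return ["BITFIELD", key, "GET", type_spec, str(offset)]


def setfield(key, struct, name, value):
    """Build the BITFIELD SET arguments for one struct field."""
    type_spec, offset = struct[name]
    return ["BITFIELD", key, "SET", type_spec, str(offset), str(value)]


# Hypothetical struct definition, made up for this example.
PLAYER = layout([("level", "u8"), ("health", "u16"), ("delta", "i5")])

print(" ".join(getfield("player:1", PLAYER, "health")))
print(" ".join(setfield("player:1", PLAYER, "delta", -3)))
```

If every field had the same width, the `#N` form could replace the absolute offsets computed here.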

~~~
itamarhaber
Not that silly and perfectly doable with the current support for Lua
scripts...

------
a1k0n
Check out Concise [0] or Roaring Bitmaps [1] before rolling your own
compressed bitset scheme. Wish I had known about these sooner.

(I realize now this is offtopic, as this is bit fields not bit sets;
nevermind, but still it's interesting!)

[0] [http://arxiv.org/pdf/1004.0403.pdf](http://arxiv.org/pdf/1004.0403.pdf)

[1] [http://roaringbitmap.org/](http://roaringbitmap.org/)

~~~
g4nt1
+1 for Roaring Bitmaps. We are starting to use it in multiple products.

------
connor4312
I'm quite excited for bitfields in Redis. We are going to need to implement
some bloom filter-based data structures for one of our products in the near
future. Previously the best way to do this was to either keep a hash map or a
binary k/v, both of which have substantial drawbacks (such as the need for
locking). These bitfields, however, are a perfectly-timed ideal solution to
the problem.

~~~
dvirsky
You can implement bloom filters in redis >2.6 with SETBIT and GETBIT. But I'm
guessing this is not your case exactly? I'm curious what exactly you're trying
to model.

~~~
connor4312
We want to count the number of users, including guests, who are currently
viewing a resource. To count guests, we plan to count unique IPs, so each IP
should be counted at most once even if it connects to several different
servers as a result of load balancing.

The approach we're moving towards is an augmented bloom filter where each
"bit" is actually an unsigned integer counting the number of items hashed to
the position, allowing us to add and remove IPs to the filter as users come
and go (just rehashing and "subtracting" the IP when it leaves). This lets us
ensure that IPs are not counted more than once. SETBIT and GETBIT would work
with a normal filter, but they can't increment multiple bits, so we'd end up
needing to lock.
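
A rough in-memory sketch of that counting filter, in Python, may make the idea concrete. The class name, slot count, number of hashes and the SHA-256 based hashing are all assumptions chosen for the example, not details of the actual system. The `bitfield_command` method shows how one add or remove could probably be expressed as a single BITFIELD call with several INCRBY subcommands (here with 8-bit counters and a hypothetical key name), which is what would avoid the locking.

```python
import hashlib


class CountingBloom:
    """Bloom filter whose slots are small counters instead of single bits."""

    def __init__(self, slots=1024, hashes=3):
        self.slots = slots
        self.hashes = hashes
        self.counters = [0] * slots

    def _positions(self, item):
        # Derive one slot per hash function by salting the item with an index.
        for i in range(self.hashes):
            digest = hashlib.sha256(("%d:%s" % (i, item)).encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.slots

    def add(self, item):
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item):
        # Only meaningful for items that were added earlier.
        for pos in self._positions(item):
            if self.counters[pos] > 0:
                self.counters[pos] -= 1

    def __contains__(self, item):
        return all(self.counters[pos] > 0 for pos in self._positions(item))

    def bitfield_command(self, key, item, delta):
        # Equivalent update as one BITFIELD call, assuming u8 counters.
        args = ["BITFIELD", key]
        for pos in self._positions(item):
            args += ["INCRBY", "u8", "#%d" % pos, str(delta)]
        return args


viewers = CountingBloom()
viewers.add("203.0.113.7")
print("203.0.113.7" in viewers)   # True
viewers.remove("203.0.113.7")
print("203.0.113.7" in viewers)   # False
print(" ".join(viewers.bitfield_command("viewers:page42", "203.0.113.7", 1)))
```

Like any Bloom filter, this can report an IP as present when it is not; the counters only make removal possible.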

~~~
antirez
I wonder if you could use Redis hyperloglogs for this... You count IPs coming
in with one HLL, count IPs going out with another, and subtract.

~~~
dvirsky
sounds to me like it's more suitable for HLL than for blooms actually.

------
andrewfromx
I love using 64 bit unsigned ints in MySQL for 64 boolean columns. I do:

alter table foo add column bar bigint(20) unsigned default
18446744073709551615

and that gets you 64 111111111111111111111111111111 etc.

~~~
Zikes
That sounds pretty painful to query, though. If the first bit position
represented if the user had confirmed their email address, wouldn't a SELECT
for confirmed users look something like SELECT * FROM users WHERE (mydata &
b'0000000000000000000000000000000000000000000000000000000000000001')?

~~~
33degrees
No, you can just compare against an integer: SELECT * FROM users WHERE mydata &
1
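
The same arithmetic can be checked outside the database. This small Python sketch shows that the 64-character binary literal and the integer 1 are the same mask, and that the default from the ALTER TABLE above is simply all 64 bits set. The flag name `EMAIL_CONFIRMED` and the helper functions are made up for the example.

```python
ALL_SET = 18446744073709551615        # the column default quoted above
EMAIL_CONFIRMED = 1 << 0              # hypothetical meaning of bit 0


def has_flag(value, flag):
    """True if every bit in `flag` is set in `value`."""
    return value & flag == flag


def clear_flag(value, flag):
    """Return `value` with the bits in `flag` cleared, kept within 64 bits."""
    return value & ~flag & ALL_SET


# The long binary literal is just another spelling of 1.
print(int("0" * 63 + "1", 2) == EMAIL_CONFIRMED)   # True
print(ALL_SET == 2**64 - 1)                        # True

row = clear_flag(ALL_SET, EMAIL_CONFIRMED)
print(has_flag(ALL_SET, EMAIL_CONFIRMED))          # True
print(has_flag(row, EMAIL_CONFIRMED))              # False
```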

~~~
astrange
This sounds like it'd be a full table scan to me.

------
Dylan16807
53 bits.. they're returning a bitfield as a double float at some layer?

Or am I misreading the implication and they can do signed 64 bits but not
unsigned?
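
For context on why 53 bits would suggest a double: a 64-bit IEEE 754 float has a 53-bit significand, so integers stop being exactly representable just above 2**53. The Python check below only illustrates that property of doubles; it assumes nothing about how Redis itself stores or returns the value.

```python
LIMIT = 2**53

# Below the limit, neighbouring integers map to distinct doubles.
print(float(LIMIT - 1) == float(LIMIT))   # False

# Above it, adjacent integers start to collapse onto the same double.
print(float(LIMIT) == float(LIMIT + 1))   # True

# A signed 64-bit integer, by contrast, holds every value up to 2**63 - 1.
print(2**63 - 1)
```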

------
bluedino
Would like to see speed vs storage benchmarks. Bit twiddling must be a bit
slow, no?

~~~
dxhdr
The only slow operations these days are L3 cache misses and transcendental
function calls, around 100 cycles each. Optimize for that, not a 1 CPU clock
cycle "bit twiddle."

