

The Redis Manifesto - Rust
http://antirez.com/post/redis-manifesto.html

======
shapeshed
'Code is like a poem'. 'We optimise for joy'.

Too often software development is veiled in acronyms and jargon. It is great
to see a lot of humanity in this manifesto. Software development becomes a
huge part of a benevolent dictators life. The simple aspiration of saying 'I
want to enjoy this' is a refreshing thing to see.

~~~
pjscott
Impressively, the Redis code actually lives up to the lofty rhetoric.

<https://github.com/antirez/redis>

It is some of the best C code that I've seen.

~~~
psykotic
The very first function of the very first file I read had an algorithmic bug.
It's dictGetRandomKey() in dict.c. First it finds a random occupied bucket by
rejection sampling. Then it proceeds like this:

    
    
        /* Now we found a non empty bucket, but it is a linked
         * list and we need to get a random element from the list.
         * The only sane way to do so is counting the elements and
         * select a random index. */
        listlen = 0;
        orighe = he;
        while(he) {
            he = he->next;
            listlen++;
        }
        listele = random() % listlen;
        he = orighe;
        while(listele--) he = he->next;
        return he;
    

This sampling algorithm is not truly random. Elements in smaller buckets are
more likely to be picked. Especially in moderately-sized dictionaries the skew
can be significant, even with a good hash function with pseudorandomness
characteristics (Chernoff-style bounds don't say much for small sizes). To say
nothing of what could happen in an adversarial situation, where an attacker
might heavily skew the outcome of a "random" draw by affecting the
distribution of bucket sizes. That could potentially be much worse than the
usual denial-of-service attacks on hash tables.

Unfortunately, there's no good way to fix this with their hash table
implementation. If they had used open adddressing (which is generally superior
to external chaining for other reasons as well), this wouldn't be a problem.
You'd just try slots at random until you found an occupied one, which you'd
return.

(Even if they wanted to use this skewed sampling method, you can pick a random
element in a linked list of unknown length in a single traversal. It's simple
enough to be a standard interview question, so the two-pass method is
certainly not "the only sane way". The benefit of doing that here is
admittedly very limited, since the buckets will be small to ensure the
efficiency of hash table lookups, and that cost is dominated by cache misses.)

Pedantry on my part? Perhaps. But if nothing else, it betrays a lack of
attention to details.

------
js4all
Brilliant. I wish more developers would adopt to these principles.

I recently discovered redis while looking for a website statistics option. I
was impressed how easy and well designed redis is. Starting with building it,
over using it to sample the data, to querying the data for the charts.

I was so impressed, that we will add Redis to our hosting offerings for
Node.js apps.

------
twsted
Just FYI, it was written almost a year ago.

