

NoSQL East 2009 day 2: Pig/Twitter, Cascading, Neo4j, Redis, Sherpa/Yahoo  - uggedal
http://journal.uggedal.com/nosql-east-2009---summary-of-day-2

======
antirez
There is some problem with the numbers for Redis:

"Can do 19,600 gets and 13,900 sets a second on a MacBook Pro"

On my MacBook (not Pro), Redis performs like this:

    
    
        % ./redis-benchmark -q
        SET: 34705.88 requests per second
        GET: 31055.90 requests per second
        INCR: 28739.25 requests per second
        LPUSH: 35013.98 requests per second
        LPOP: 30496.95 requests per second
        ^C
    

But Redis _sucks_ on Mac OS X compared to how it performs on Linux. The same
MacBook running Linux reaches almost 100k queries/sec. An entry-level server
running Linux is in the 150k/sec zone.

Sorry, but I spent a lot of time making Redis this fast, so seeing numbers an
order of magnitude lower does not make me happy ;)

About replication: Redis supports master-slave replication with a very fast
first synchronization. Replication is non-blocking: if you attach N slaves to
the master, it continues replying to clients without trouble while syncing
with the slaves.

If the link between master and slave goes down, the two will automatically
resynchronize. It's also possible to use a replica to enhance data
durability.

Replication can be controlled at runtime. For instance, to make an instance
become a replica of another instance, all you need to do is something like
this:

    
    
        echo -e "slaveof 1.2.3.4 6379\r\n" | nc 1.1.1.1 6379
    

A final note about the snapshotting persistence mode: Redis edge on git
already supports an append-only journal, which makes Redis an option even
when data is very important.

~~~
lucifer
The benchmark is a C program. Do any of the clients come close to matching the
benchmark?

~~~
antirez
Yes, it's just a matter of parallelization.

If you measure the performance of even a C client in a busy loop, you are
really measuring the round trip time, because it's a request-reply protocol
and most clients block until the reply is ready.

Even using a Ruby / Python / ... client, if you run N of these clients you'll
see that Redis can handle this number of queries every second.
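The effect can be sketched with a toy model (the round trip time and server capacity below are illustrative assumptions, not benchmark results):

```python
# Toy model of aggregate throughput vs. number of parallel blocking clients.
# RTT and SERVER_CAPACITY are illustrative assumptions, not measurements.

RTT = 0.0005               # assumed 0.5 ms round trip per request
SERVER_CAPACITY = 100_000  # assumed max requests/sec the server can process

def aggregate_throughput(n_clients):
    # Each blocking client can issue at most 1/RTT requests per second;
    # aggregate throughput grows with N until the server itself saturates.
    return min(n_clients / RTT, SERVER_CAPACITY)

print(aggregate_throughput(1))   # a single busy-loop client measures only 1/RTT
print(aggregate_throughput(50))  # enough parallel clients saturate the server
```

A single client, in any language, is bounded by the round trip; only many concurrent clients reveal what the server can actually sustain.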

~~~
lucifer
I understand that. I was gently hinting that the conference presenter was
probably using a single client, given his audience.

As an aside, from the end user's point of view (assuming the typical end user
is a web 2.0 app), _throughput_ isn't the only consideration. Even with N
clients achieving 100k/s _throughput_, request latency is likely going to be
roughly N times 1/tps. ~0.1 ms is probably the sort of request latency the
end user is going to be looking at, not 0.03 ms (taking your Mac numbers as a
baseline). Bump up the number of clients and that latency will get higher,
even while throughput gets better.

This has nothing to do with Redis (which is great). Just something to keep in
mind when looking at these sorts of performance measures.

~~~
antirez
I agree that requests/second is not the only, or even the most meaningful,
performance metric; that's why redis-benchmark also reports latency
percentiles. I just suppressed that output in the example, but it looks like
this:

    
    
        ====== SET ======
        10008 requests completed in 0.39 seconds
        50 parallel clients
        3 bytes payload
        keep alive: 1
    
        1.03% <= 0 milliseconds
        38.83% <= 1 milliseconds
        73.12% <= 2 milliseconds
        95.34% <= 3 milliseconds
        97.93% <= 4 milliseconds
        99.50% <= 5 milliseconds
        99.75% <= 7 milliseconds
        99.84% <= 8 milliseconds
        99.93% <= 9 milliseconds
        99.94% <= 10 milliseconds
        100.00% <= 11 milliseconds
        25401.02 requests per second
    

As you can see, under this load most clients are served in 4 milliseconds or
less, including both the transmission of the request and the reception of the
full reply.

