

Scaling Memcached: 500,000+ Operations/Second - cyberian
http://blogs.sun.com/zoran/entry/scaling_memcached_500_000_ops

======
baguasquirrel
Does anyone know what the comparable figures for an Intel server cluster are?
Looking around the net, it seems like the cost of the hardware required for
that half a million ops figure is in the neighborhood of $30k. The article had
an 8-core T2 with 64GB of RAM.

http://www.google.com/products?client=safari&rls=en-us&#...

Take a look on Newegg, and it would seem that it'd be enough to buy you around 20 Intel
boxes with 16 gigs of ram each. The configuration I used put a Core2 Quad on
each machine, so our cluster would have 80 cores and 320gb of memory. Unless
there's a huge gotcha somewhere regarding the system bus in the Intel
architecture vs. the Sun architecture, the Intel cluster still seems to win
out. The Sun box indicated that they used DDR2 chips but didn't indicate what
speed, so I picked DDR2 800.

~~~
Retric
Splitting memcached over several servers has all of the classic scaling issues,
including the need for 10Gig switches, which are still stupidly expensive:

Cisco 16-Port 10 Gigabit Ethernet Module Our Price $27,433.91

PS: You could probably get close with a 32-port gigabit switch with a 10Gb uplink,
but it's still not free.
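The split Retric describes is usually done client-side: every client hashes a key to the same node, so no coordination is needed between servers. A minimal consistent-hashing sketch (the server addresses are invented for illustration; real clients such as libmemcached implement the same idea):

```python
# Consistent hashing: keys map to points on a ring, so adding or removing
# a node remaps only ~1/N of the keys instead of reshuffling everything.
import bisect
import hashlib

def _hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, servers, replicas=100):
        # Each server gets `replicas` virtual points for smoother balance.
        self._ring = sorted(
            (_hash(f"{srv}#{i}"), srv)
            for srv in servers
            for i in range(replicas)
        )
        self._keys = [h for h, _ in self._ring]

    def server_for(self, key: str) -> str:
        # Walk clockwise to the first ring point at or after the key's hash.
        idx = bisect.bisect(self._keys, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]
ring = HashRing(SERVERS)
```

If a node is dropped from the ring, keys that lived on the surviving nodes keep their old placement, which is exactly the property that makes growing a memcached tier tolerable.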

------
antirez
A single-threaded Redis instance on a $300 Linux box performs 100,000
operations/second. Assuming a quad-core box and running N instances,
approaching 500,000 ops/second on cheap hardware should not be hard, so I'm
not very impressed by these numbers, especially given the hardware used.
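The setup antirez sketches amounts to one Redis instance per core, each on its own port, with clients sharding keys by hash. A rough sketch (the ports and the per-instance throughput are assumptions taken from the comment, not measurements):

```python
# One single-threaded Redis instance per core, sharded client-side by port.
import hashlib

PORTS = [6379, 6380, 6381, 6382]  # one instance per core on a quad-core box

def port_for(key: str) -> int:
    # Stable hash so every client picks the same local instance for a key.
    h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return PORTS[h % len(PORTS)]

# Back-of-envelope: 4 instances x ~100k ops/s each ~= 400k aggregate,
# approaching the article's 500k figure on one cheap box.
aggregate_ops = len(PORTS) * 100_000
```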

~~~
vicaya
Read the Facebook engineering notes on memcached. The Linux kernel needs to be
patched to fully utilize all cores even at 200k qps with 173ms average
latency. Also, you didn't give average request size and latency numbers, which
makes your numbers less helpful.

------
piramida
It does advertise Sun's hardware, but scaling memcached to heavily multicore
setups is interesting regardless of the hardware used; the same points would
be valid on high-end Intel servers. Good to know it's possible to saturate 10G
without much tweaking.

------
vicaya
A typical "drag race" benchmark. There are no latency numbers for the
corresponding throughput. The Facebook engineering notes were a little more
helpful, with at least an average latency of 173ms at 200k qps on an Intel
8-core box with a 1GE NIC (they could push it to 300k, but the latency was too
high). So a max of 500k qps (with unknown latency) on a much more expensive T2
with a 10GE NIC is not that impressive.

I'd like to see median and 99 percentile latency for any throughput numbers.
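The reporting vicaya asks for is cheap to produce: record per-request latencies during the run and summarize them with percentiles rather than throughput alone. A small sketch using the nearest-rank method (the sample latencies are invented):

```python
# Nearest-rank percentile over recorded per-request latencies.
def percentile(samples, p):
    s = sorted(samples)
    idx = max(0, int(round(p / 100.0 * len(s))) - 1)
    return s[idx]

# Invented microsecond latencies from a hypothetical benchmark run;
# one slow outlier dominates the tail but barely shifts the median.
latencies_us = [120, 95, 110, 400, 105, 98, 2500, 115, 101, 99]
median = percentile(latencies_us, 50)   # 105
p99 = percentile(latencies_us, 99)      # 2500
```

The point of the p99 column is visible even in this toy data: the median looks healthy while the tail is 20x worse, which a throughput-only "drag race" number hides entirely.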

------
brendano
also see the older <http://www.facebook.com/note.php?note_id=39391378919>

------
sahaj
Is this an ad? Serious question.

