

Memcache top - hamstah
http://engineering.tumblr.com/post/48701285213/open-source-memcache-top

======
jbert
Does dropping packets matter if you're looking for hot keys?

As long as you manage to be fair in what you drop, surely all you really need
is a representative sample (even 1-in-100 or whatever would do).
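
A rough sketch of that intuition, assuming packets are dropped uniformly at random (the key names, weights, and 1% rate below are made up for illustration and are not taken from memkeys or mctop):

```python
import random
from collections import Counter

def top_keys(stream, n=3):
    """Return the n most frequent keys in the stream, hottest first."""
    return [key for key, _ in Counter(stream).most_common(n)]

def sample(stream, rate, rng):
    """Keep each request independently with probability `rate` (a 'fair' drop)."""
    return [key for key in stream if rng.random() < rate]

rng = random.Random(42)

# Hypothetical skewed workload: three hot keys plus a long tail of cold keys.
keys = ["user:1", "user:2", "user:3"] + [f"cold:{i}" for i in range(200)]
weights = [50, 25, 10] + [1] * 200
stream = rng.choices(keys, weights=weights, k=200_000)

full_top = top_keys(stream)
sampled_top = top_keys(sample(stream, 0.01, rng))  # roughly 1-in-100

print("all traffic:", full_top)
print("1% sample:  ", sampled_top)
```

This only covers ranking by hit count on a strongly skewed workload with unbiased drops. If drops correlate with packet size or bursts, the sample is no longer representative.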

~~~
corresation
This was my sentiment as well. Further, I'm not quite sure how this works --
do you run one of these on every single memcache machine, using packet
capture? Is there no mechanism to work across a cluster?

It seems like something one would write as a patch for memcache itself (where
injecting such metrics seems quite simple).

------
bmatheny
Since I wrote memkeys, maybe I can clarify a few things.

First, dropping packets matters. If you see only 30-40% of your traffic you
can't guarantee that you have enough data to know what your hot keys actually
are. This is especially true when you are interested in (for instance) sorting
keys by bandwidth usage. You might have a key that gets half as many hits as
the hottest key but is 4x the size and causing network link saturation. In
this case, depending on how much data you're able to capture, you may or may
not even see this data point. Also, the follow-up comment from corresation
about patching memcache doesn't make sense to me.
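
To make the bandwidth point concrete, here is a small sketch with invented numbers (the key names, hit counts, and value sizes are hypothetical): a key with half the hits of the hottest key but 4x the value size moves twice as many bytes.

```python
from dataclasses import dataclass

@dataclass
class KeyStats:
    key: str
    hits: int          # requests observed for this key
    value_bytes: int   # size of the value returned per hit

    @property
    def bandwidth(self) -> int:
        """Total bytes served for this key over the observation window."""
        return self.hits * self.value_bytes

# Hypothetical counters, not real measurements.
stats = [
    KeyStats("hottest", hits=1000, value_bytes=1_000),
    KeyStats("big", hits=500, value_bytes=4_000),    # half the hits, 4x the size
    KeyStats("small", hits=800, value_bytes=100),
]

by_hits = [s.key for s in sorted(stats, key=lambda s: s.hits, reverse=True)]
by_bandwidth = [s.key for s in sorted(stats, key=lambda s: s.bandwidth, reverse=True)]

print("by hits:     ", by_hits)
print("by bandwidth:", by_bandwidth)
```

Sorted by hits, "big" comes last; sorted by bandwidth, it comes first, which is why missing its packets would hide the key that is actually saturating the link.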

Second, this was no 'jab' at Etsy. I know the Etsy guys incredibly well and
we're all friends. We've collaborated on work on more than one occasion. The
jab comment seems like unnecessary speculation. The comment about seeing how
memkeys affects performance is of course spot on. In this case, one thread
will peg a CPU core for packet capture but besides that will not be CPU
intensive. Since it uses packet capture, memkeys doesn't actually interact
with memcached directly so the impact should be minimal. We used it at Tumblr.

Third, fixing the packet loss issue in mctop wasn't feasible as the problem is
with ruby-pcap, not with mctop. Additionally, while Tumblr has plenty of Ruby
code in production, we don't generally use it for building 'real-time'
applications. There are better languages for the job.

I built memkeys because it solved a problem we had, and was fun. That's it.

------
n1c
Larger screenshot from the Github page:
[https://raw.github.com/wiki/bmatheny/memkeys/misc/screenshot...](https://raw.github.com/wiki/bmatheny/memkeys/misc/screenshot.png)

~~~
pestaa
The only useful line to me is the second from the bottom. It says the packet
loss is more than 4%, as opposed to the less than 2% the article claims.

Still a nice tool, though.

~~~
bmatheny
This screenshot was taken on my dev box which has plenty of other activity
happening. On a production memcache box, seeing 1Gb/s, I see ~2% packet loss.

------
mmuro
I guess when you get up to the scale of Tumblr, you need any measure of
performance. I wish I knew what was going on there. Anyone care to elaborate?

~~~
nasalgoat
At Tumblr or in the supplied image?

If the latter, it's a list of memcache key IDs, ordered by hit count.

------
alekseyk
Cool tool, a bit of a pointless jab at Etsy.

I don't know how it queries Memcached, but it's probably a good idea to see
how it affects Memcached's performance before running it against a production
cache pool.

Just a thought.

~~~
crescentfresh
Note: going solely off the two screenshots (mctop:
<http://etsycodeascraft.files.wordpress.com/2012/12/mctop.jpg> , memkeys:
[https://raw.github.com/wiki/bmatheny/memkeys/misc/screenshot...](https://raw.github.com/wiki/bmatheny/memkeys/misc/screenshot.png)),
it's not clear why tumblr didn't fix the stated packet loss issue in mctop
instead.

