

Memcached on TILEPro64 at Facebook - brooksbp
http://gigaom2.files.wordpress.com/2011/07/facebook-tilera-whitepaper.pdf

======
jacques_chester
Interesting reading.

My main concern is that the engineers changed _two_ variables for the test:
the hardware _and_ the software.

They compared standard memcached on x86 versus a modified version of memcached
on the Tilera chips. The modifications are not minor -- they changed the way
requests are handled so that multiplexing can be done in parallel. That
multiplexing step is, per a loosey-goosey application of Amdahl's law,
probably the _actual_ performance constraint on the x86es.
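
To make the Amdahl's law point concrete, here is a rough sketch of the bound.
The function name and the serial fractions are my own illustrative assumptions,
not figures from the whitepaper; the idea is only that a serialized
demultiplexing step caps the speedup no matter how many cores sit behind it.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only the non-serial part scales.

    serial_fraction: share of the work that cannot be parallelized
                     (hypothetical here; not measured on memcached).
    workers:         number of cores or threads doing the parallel part.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


if __name__ == "__main__":
    # Made-up serial fractions, purely to show the shape of the curve.
    for s in (0.5, 0.1, 0.01):
        print(f"serial={s:.2f}  64 workers -> {amdahl_speedup(s, 64):.2f}x")
```

If half the request path were serialized, 64 cores would give just under 2x.
Parallelizing that step in software moves the ceiling on any hardware, which
is why it needs to be separated from the effect of the chip.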

I think a better test would be to backport the Tilera version to the x86 chips
and then run the tests again. Then they'd have a better idea of what degree of
throughput performance is given by the hardware and what degree by the changes
to the software.

~~~
yvdriess
The Tilera architecture is different enough to make that benchmark meaningless
though. You have to compare the best algorithm for one architecture with the
best for the other. Of course, you could be right if that bag of tricks they
opened up for the Tilera also makes for a more efficient x86 implementation.

~~~
jacques_chester
On my reading, there's nothing sufficiently different about the Tilera
architecture that would prevent them back-porting the multi-threaded
demultiplexer to x86.

As published, it's not a meaningful test in a scientific sense. While I suspect that Tilera
would perform better, it's not properly controlled for. Some of the recorded
difference will be due to the software changes.

