Or is it necessary to instrument each application with high-resolution timers and count instructions between the timer points?
Linux probably has something similar.
kldload: can't load hwpmc: Exec format error
Looks like the Linux/Ubuntu equivalent may be 'pfmon', which is also not enabled in the stock kernel.
I think this is the relevant project page: http://perfmon2.sourceforge.net/man/pfm_get_cycle_event.html
The researchers just chose to use Bing for their test data instead of MySQL or something else, and their point was that Bing is more CPU-intensive, so the results would be more solid.
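On the manual-instrumentation question, here is a rough sketch of what wrapping a region of code between two high-resolution timer points could look like. It is only an illustration: `timed_region` and `busy_work` are made-up names, and it records wall-clock nanoseconds only. Actual instruction counts would still have to come from the hardware counters (hwpmc, perfmon and the like), which this does not touch.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_region(label, results):
    """Record elapsed nanoseconds for the enclosed block under `label`.

    Hypothetical helper for illustration; it measures wall-clock time,
    not retired instructions.
    """
    start = time.perf_counter_ns()  # first timer point
    try:
        yield
    finally:
        # second timer point; store the difference for later inspection
        results.setdefault(label, []).append(time.perf_counter_ns() - start)


def busy_work(n):
    """Stand-in CPU-bound workload (assumed, not from the thread)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


results = {}
with timed_region("busy_work", results):
    value = busy_work(100_000)

print(results)
```

The downside is the one implied by the question: every application has to be edited by hand, which is why a kernel-level counter facility is more attractive if it can be loaded.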