They probably have some faults, but the general conclusions smell right to me: I don't think they're in the "really screwed up and wildly misleading" category, but in the "ok, interesting, could use some work though" category.
There is nothing you should be more wary of than a benchmark that matches your pre-existing intuition. It'll lead you to ignore serious methodological issues, without any sound scientific (or any other epistemological) reason. https://speakerdeck.com/alex/benchmarking is a slide deck from a talk I gave at my office on how to do better benchmarking.
EDIT: I should probably mention I work at Rackspace, and thus everything I say on this subject should be taken with the appropriate grains of salt :)
This statement is exactly the problem he is describing. :) One metric for a specific use case or scenario is a terrible indicator of overall "quality". It is much more nuanced than that. The worst tickets I've gotten in my 10 years of sysadmining so far are the ones where a customer just states their app is "slow".
Yes, in simplistic terms, for a specific metric I'm sure other providers have better hardware than AWS. If that is all one wants to base their definition of "better" on, then so be it, but that is pretty naive.
Many argue that the AWS ecosystem (25 services at last count) and its extensive feature set outweigh the bare-bones "fast" metrics of other providers.
I think, as the poster above mentions, there is generally more to it than a simple metric or two sampled a few times from a single endpoint. But I guess it all comes down to one's definition of what they consider valuable...
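To make that point concrete, here is a minimal sketch (not from any of the benchmarks under discussion) of why "a metric or two sampled a few times" can mislead. It simulates heavy-tailed request latencies with a made-up lognormal distribution; the parameters are purely illustrative assumptions. Small samples swing wildly run to run, while a large sample gives a much more stable picture:

```python
# Hypothetical illustration: a handful of latency samples from one
# endpoint vs. a large sample. The distribution below is an assumed
# stand-in for real request latencies, not measured data.
import random
import statistics

random.seed(42)

def sample_latency_ms() -> float:
    """Simulated request latency with a heavy tail (assumed distribution)."""
    return random.lognormvariate(3.0, 0.8)  # median ~20 ms, occasional spikes

def summarize(n: int) -> tuple[float, float]:
    """Return (mean, approximate p99) over n simulated requests."""
    samples = sorted(sample_latency_ms() for _ in range(n))
    mean = statistics.fmean(samples)
    p99 = samples[int(0.99 * (n - 1))]
    return mean, p99

# Three tiny samples (like a quick spot check) vs. one large sample.
for n in (5, 5, 5, 10_000):
    mean, p99 = summarize(n)
    print(f"n={n:>6}: mean={mean:7.1f} ms  p99={p99:7.1f} ms")
```

Running this, the three n=5 spot checks report noticeably different means, and none of them says anything useful about tail latency; only the large sample does. That gap between "a few pings looked fast" and actual service behavior is exactly why a single metric makes a poor proxy for overall "quality".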