The move to Real User metrics isn't really about averages at all. Any recording and instrumentation process benefits from tools more sophisticated and nuanced than a simple average. Real user metrics are an important data point, but they lack certain information you can only get from external monitoring systems. We use both Pingdom and Catchpoint; Catchpoint is by far my favorite because I can see things like which ISP is involved in a slow request, which geographic region, etc. I can also get scatter plots and nice statistical graphs around the median, geometric mean, and the 75th, 95th, and 99th percentiles.
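If you want to reproduce those same summary statistics on your own timing data, Python's standard library is enough; here's a minimal sketch (the sample values are made up for illustration):

```python
import statistics

# Hypothetical response-time samples in milliseconds (illustrative only).
samples = [120, 130, 125, 140, 900, 135, 128, 2400, 132, 138]

median = statistics.median(samples)
gmean = statistics.geometric_mean(samples)

# quantiles(n=100) returns the 99 percentile cut points, so index 74 is the
# 75th percentile, 94 the 95th, and 98 the 99th. The 'inclusive' method
# keeps estimates inside the observed data range on small samples.
pct = statistics.quantiles(samples, n=100, method="inclusive")
p75, p95, p99 = pct[74], pct[94], pct[98]

print(median, round(gmean, 1), p75, p95, p99)
```

Notice how the geometric mean and median stay close to "typical" values while the high percentiles expose the slow outliers.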
So in short: the main points are good, simple averages are misleading, and capturing end-user performance data is valuable. Dropping external monitoring, though, isn't a good idea, because there are a number of things you can only identify with that insight.
Looking at your 95th percentile with RUM data means something totally different than looking at your 95th percentile with synthetic data. With RUM you are looking at your actual visitors across every page on your site. With synthetic testing (Keynote, Catchpoint, etc) you're looking at a small sample of pages from random nodes in random locations around the world. The problem with synthetic testing is that it makes all sorts of assumptions about your visitors like their browser, geography, connection speed, state of browser cache, etc. This doesn't mean that synthetic testing isn't useful (it is!), but it's important to recognize the shortcomings of your methodology whether that's from looking at an average or looking only at synthetic data.
I did a talk a couple of years ago on the statistics of web performance where I cover things like median, arithmetic mean, geometric mean, margin of error and sample sizes to carry out proper data analysis. Slides available here: http://www.slideshare.net/bluesmoon/index-3441823
To zashapiro's point about the geometric mean... while it tends to be superior in the ideal case where the distribution is perfectly log-normal, in practice most distributions deviate from a perfect log-normal. The median gives you a slightly better measure of central tendency in that case.
Secondly, there's the problem of how users perceive the geometric standard deviation (and consequently the margin of error). Unlike the arithmetic standard deviation, which is additive (±), the geometric standard deviation is multiplicative (×/÷), which means it's not visually symmetric... humans have an easier time visualizing additive symmetry than multiplicative symmetry.
At LogNormal.com, we track the median, arithmetic mean, geometric mean, a whole bunch of percentiles and margins of error, and a complete distribution curve.
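The multiplicative-symmetry point is easy to see numerically. A minimal sketch with Python's standard library, using made-up page-load times that are roughly log-normal:

```python
import math
import statistics

# Hypothetical page-load times in ms (illustrative, heavy right tail).
times = [180, 210, 250, 320, 400, 520, 700, 1100]

logs = [math.log(t) for t in times]
gmean = math.exp(statistics.fmean(logs))   # geometric mean
gsd = math.exp(statistics.stdev(logs))     # geometric standard deviation

# The one-sigma band is multiplicative: gmean / gsd .. gmean * gsd.
# On a linear axis that band is NOT symmetric around gmean.
low, high = gmean / gsd, gmean * gsd
print(round(gmean), round(gsd, 2), round(low), round(high))
```

The upper arm (gmean × gsd) extends much further than the lower arm (gmean ÷ gsd), which is exactly the visual asymmetry described above.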
The averages of [0,100,200] and [99,100,101] are both 100, and yet these two data sets are clearly different.
Measures of central tendency should always be accompanied by measures of dispersion (range, standard deviation, etc.), and not just in web analytics.
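That example takes two lines to verify; a dispersion measure immediately separates the two data sets that the mean conflates:

```python
import statistics

a = [0, 100, 200]
b = [99, 100, 101]

# Both means are 100, but the spread tells the two sets apart.
print(statistics.mean(a), statistics.mean(b))
print(statistics.pstdev(a), statistics.pstdev(b))
```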
I think the more important point is that knowledge of the underlying distribution -- normal (Gaussian), log-normal, exponential, power law, Weibull, etc. -- is very important. The more skewed the underlying distribution, the less relevant the mean becomes, and it can lead you to some very bad inferences.
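A quick illustration of how skew pulls the mean away from the median, using a synthetic log-normal sample (the distribution parameters here are arbitrary, chosen only to mimic heavy-tailed response times):

```python
import random
import statistics

random.seed(42)  # reproducible illustrative sample

# Hypothetical skewed "response times": log-normal, so a long right tail.
sample = [random.lognormvariate(mu=5.0, sigma=1.0) for _ in range(10_000)]

mean = statistics.fmean(sample)
median = statistics.median(sample)

# For a log-normal, the mean sits a factor of exp(sigma**2 / 2) ~ 1.65x
# above the median here, so the "average" overstates the typical value.
print(round(mean), round(median))
```

Run against a roughly symmetric (normal) sample instead, the two numbers would nearly coincide; the gap between them is itself a cheap skew check.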