There is a technical reason to prefer Mean Squared Error derived measures (like RMSE and standard deviation) in some situations (such as machine learning and value estimation): when minimizing one of these measures you tend towards the mean and get expected values correct. Expected values are additive, so they roll up nicely (get the individuals right and you also have the group).
My example tends to be lottery tickets. You minimize MAD by saying they are all worth zero (which is pretty much my opinion). But then you don't get the value of the lottery by summing up all the ticket values. You do/should get that with mean/expectation based estimates.
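The lottery point can be checked numerically. A minimal sketch with a hypothetical lottery (the ticket counts and prize amounts are made up): the median minimizes mean absolute deviation and values every ticket at zero, while the mean minimizes squared error and rolls up to the actual prize pool.

```python
from statistics import mean, median

# Hypothetical lottery: 100,000 tickets, five $20,000 winners, the rest worthless.
tickets = [20_000.0] * 5 + [0.0] * 99_995

median_est = median(tickets)  # minimizes mean absolute deviation: 0.0 per ticket
mean_est = mean(tickets)      # minimizes mean squared error: 1.0 per ticket

# Roll-up: only the mean-based per-ticket estimate recovers the total pool.
print(median_est * len(tickets))  # 0.0
print(mean_est * len(tickets))    # 100000.0, which equals sum(tickets)
```

Summing the median-based estimates says the whole lottery is worth nothing; summing the mean-based estimates reproduces the $100,000 actually paid out.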
The Mean Absolute Deviation is a very useful error estimator. Unfortunately, unlike the standard deviation, it requires two passes over a dataset to compute: you need the mean before you can take deviations from it.
I actually used the Mean Absolute Deviation at work to good effect. When you're testing variance over very large sample sizes, it's a lot better than using something like Chi-Squared tests, which nearly always yield a significant result beyond a certain sample size (maybe 20,000 or so).
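The large-sample significance effect is easy to demonstrate. A sketch, using a made-up two-cell goodness-of-fit example (a 51%/49% split tested against 50/50): the skew is fixed, but the chi-squared statistic grows linearly with n, so the same tiny effect flips from non-significant to highly significant as the sample grows. With one degree of freedom the p-value can be computed from the normal tail via `math.erfc`, so no stats library is needed.

```python
import math

def chi2_p_df1(stat):
    # Survival function of a chi-squared with 1 df, via the normal tail.
    return math.erfc(math.sqrt(stat) / math.sqrt(2))

def two_cell_chi2_p(n, p_obs):
    # Goodness-of-fit of an observed split against a fair 50/50 split.
    observed = [p_obs * n, (1 - p_obs) * n]
    expected = [0.5 * n, 0.5 * n]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2_p_df1(stat)

# Same 51/49 skew, growing sample size: p-value collapses as n grows.
for n in (1_000, 20_000, 1_000_000):
    print(n, two_cell_chi2_p(n, 0.51))
```

At n = 1,000 the 51/49 split is nowhere near significant; by n = 20,000 it already is, even though the effect is practically negligible.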
1. The SD does have an interpretation: it is the (rescaled) Euclidean distance between your observed data set and a data set in which all points are replaced by the sample mean. Not terribly useful, but arguably not useless.
2. Are there models for which MD is in fact a sufficient statistic?
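Point 1 above can be shown directly. A sketch with an arbitrary small dataset: the Euclidean distance between the data and the "all points replaced by the sample mean" data set, divided by sqrt(n), is exactly the population SD.

```python
import math

xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(xs)
m = sum(xs) / n  # sample mean is 5.0 here

# Euclidean distance between the data and the all-mean data set (m, m, ..., m).
dist = math.sqrt(sum((x - m) ** 2 for x in xs))

sd = dist / math.sqrt(n)  # rescale; use sqrt(n - 1) for the sample SD convention
print(sd)  # 2.0 for this data set
```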
More of my writing on this: http://www.win-vector.com/blog/2014/01/use-standard-deviatio... . Though I am also a fan of quantile regression (it just solves different problems).