
Revisiting a 90-year-old debate: the advantages of the mean deviation (2004) - yconst
http://www.leeds.ac.uk/educol/documents/00003759.htm
======
jmount
There is a technical reason to prefer Mean Square Error derived measures (like
RMSE and standard deviation) in some situations (such as machine learning and
value estimation): when minimizing one of these measures your estimate tends
towards the mean, so you get expected values correct. Expected values are
additive, so they roll up nicely (get the individuals right and you also have
the group right).

My example tends to be lottery tickets. You minimize MAD by saying they are
all worth zero (which is pretty much my opinion). But then you don't get the
value of the lottery by summing up all the ticket values. You do (or should)
get it with mean/expectation-based estimates.
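
A minimal sketch of the lottery point, using made-up numbers (1,000 tickets,
one of which pays 1,000; these figures and the helper names are assumptions
for illustration only). The constant that minimizes mean absolute error is the
median, the constant that minimizes mean squared error is the mean, and only
the mean-based per-ticket value sums back to the pot:

```python
from statistics import mean, median

# Hypothetical lottery: 1,000 tickets, exactly one pays out 1,000.
payouts = [0.0] * 999 + [1000.0]


def mean_abs_error(estimate, values):
    """Average absolute error of one constant per-ticket estimate."""
    return sum(abs(v - estimate) for v in values) / len(values)


def mean_sq_error(estimate, values):
    """Average squared error of one constant per-ticket estimate."""
    return sum((v - estimate) ** 2 for v in values) / len(values)


per_ticket_median = median(payouts)  # minimizer of mean absolute error
per_ticket_mean = mean(payouts)      # minimizer of mean squared error

# Roll the per-ticket estimates up to the whole lottery.
total_from_median = per_ticket_median * len(payouts)
total_from_mean = per_ticket_mean * len(payouts)
actual_total = sum(payouts)

print("sum of median-based values:", total_from_median)
print("sum of mean-based values:  ", total_from_mean)
print("actual pot:                ", actual_total)
```

The absolute-error estimate is the better guess for any single ticket, but it
says nothing useful about the group total.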

More of my writing on this: [http://www.win-vector.com/blog/2014/01/use-
standard-deviatio...](http://www.win-vector.com/blog/2014/01/use-standard-
deviation-not-mad-about-mad/) . Though I am also a fan of quantile regression
(it just solves different problems).

------
OliverJones
There's an epic rant on this topic here, [https://www.edge.org/response-
detail/25401](https://www.edge.org/response-detail/25401) , by Nassim Taleb,
one of the 21st century's greatest masters of epic rants.

This Mean Absolute Deviation is a very useful error estimator. Unfortunately
it, unlike the standard deviation, requires two passes over a dataset to
compute.

~~~
GFK_of_xmaspast
What about the online algorithm?
[https://en.wikipedia.org/wiki/Algorithms_for_calculating_var...](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm)
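
For reference, a rough sketch of the single-pass update that linked section
describes (Welford's method), next to a plain two-pass mean absolute deviation
about the mean. The function names and sample data are my own, not from any
library:

```python
import math


def online_mean_and_sd(stream):
    """One pass over the data: running mean and sample standard deviation."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # uses the mean both before and after update
    sd = math.sqrt(m2 / (n - 1)) if n > 1 else float("nan")
    return mean, sd


def two_pass_mean_abs_dev(values):
    """Two passes: the mean must be known before deviations can be summed."""
    values = list(values)
    mean = sum(values) / len(values)                          # pass 1
    return sum(abs(x - mean) for x in values) / len(values)   # pass 2


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up data
print(online_mean_and_sd(iter(sample)))
print(two_pass_mean_abs_dev(sample))
```

The squared-deviation sum can be corrected incrementally as the mean moves;
the sum of absolute deviations has no such simple exact update, which is why
the naive version needs the mean first.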

------
arafa
I actually used the Mean Absolute Deviation at work to good effect. When
you're testing variance over very large sample sizes, it's a lot better than
using something like Chi-Squared tests, which nearly always yield a
significant result over a certain sample size (maybe 20,000 or so).

------
busyant
This may seem weird, but is this being posted because of some standardized
testing?

My 6th grader took home a practice standardized test today. Last question on
the test required a MAD computation.

He didn't know what MAD was and I had to look it up myself.

I think understanding the _idea_ behind MAD is great and potentially useful
and interesting.

However, I'm not sure that 6th graders are going to be able to grind through
two iterations over a data set to get the correct answer.

------
rlucas
Why, oh why, didn't they study _this_ kind of stuff when I very nearly managed
to drop out of the History of Science program in college?

------
nerdponx
Two thoughts:

1\. The SD does have an interpretation: it is the (rescaled) Euclidean
distance between your observed data set and a data set in which all points are
replaced by the sample mean. Not terribly useful, but arguably not useless.

2\. Are there models for which MD is in fact a sufficient statistic?
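
Point 1 can be checked numerically. A small sketch with made-up data, assuming
the rescaling is division by sqrt(n), which gives the population SD (dividing
by sqrt(n - 1) instead would give the sample SD):

```python
import math
from statistics import mean, pstdev

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up values

# A data set of the same size with every point replaced by the mean.
center = [mean(data)] * len(data)

# Euclidean distance between the two data sets (math.dist needs Python 3.8+).
distance = math.dist(data, center)

# Rescale by sqrt(n) and compare with the population standard deviation.
rescaled = distance / math.sqrt(len(data))

print(rescaled, pstdev(data))
```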

