
Nassim Taleb: We should retire the notion of standard deviation - pyduan
http://www.edge.org/response-detail/25401
======
Homunculiheaded
I sometimes think that progress in the 21st century will be summed up as: "The
realization that the normal distribution is not the only way to model data".

Taleb's favorite topic is the "black swan event", which is something that the
normal distribution, and the idea of standard deviation, don't model that
well. In a normal distribution, very extreme events should happen only once in
the lifetime of several universes. Of course, assuming variation in line with
a Gaussian process is at the heart of how the Black-Scholes model calculates
risk/volatility/etc.

Benoit Mandelbrot argued that financial markets follow a distribution much
more similar to the Cauchy distribution (specifically the Lévy distribution)
than a Gaussian. The problem, of course, is that the Cauchy distribution
is pathological in that it doesn't have a mean or variance; you can calculate
similar properties for it (location and scale), but it doesn't obey the
central limit theorem, so in practice it can be very strange to work with.
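A quick way to see how strange the Cauchy distribution is in practice (a minimal sketch, drawing standard Cauchy samples via the inverse-CDF trick):

```python
import math
import random

random.seed(0)

def cauchy_sample_mean(n):
    # Draw standard Cauchy samples via the inverse CDF:
    # tan(pi * (U - 1/2)) for uniform U in [0, 1).
    draws = [math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)]
    return sum(draws) / n

# Unlike for a Gaussian, the sample mean does not settle down as n grows:
# the mean of n standard Cauchy draws is itself standard Cauchy.
for n in (100, 10_000, 1_000_000):
    print(n, cauchy_sample_mean(n))
```

Run it a few times and the "mean" jumps around wildly no matter how many samples you take, which is exactly why the central limit theorem doesn't apply.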

The normal distribution is fantastic in that it does appear frequently in
nature, is very well behaved, and has been extensively studied. However, a
great amount of future progress is going to come from wrestling with more
challenging distributions, and paying more attention to when assumptions of
normality need to be questioned. Of course one of the challenges of this is
that the normal distribution is baked into a very large number of our existing
statistical tools.

~~~
jerf
This is actually what I expected to read: "The standard deviation is useful
because with the average and the standard deviation, one can fully
characterize a normal distribution. However, the standard deviation is less
useful a statistical summary the farther away from 'normal' you get, and in
reality, there is no such thing as a normal distribution, as a true normal
distribution is defined on the entire real number line from negative infinity
to positive infinity. Reality always provides some bound, and it's often quite
distorted from Gaussian. For instance, a 'normal' distribution averaging 2
with a standard deviation of 1.4, bounded by 0, is quite non-Gaussian in many
important ways! (Not least of which is that you're going to have to do
something to replace the missing probability...)
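For concreteness, the missing probability in that example can be computed directly (a sketch using only the standard library's error function):

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Probability mass an N(2, 1.4) model places below the bound at zero.
p = normal_cdf(0, 2, 1.4)
print(f"{p:.1%}")  # about 7.7% of the model's mass is below zero
```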

"People rarely check how closely their data conform to the normal
distribution; indeed, many people blindly apply the standard deviation to
their data regardless of its distribution! The resulting number is often more
obfuscatory than helpful, to the extent that it crowds out more useful
summaries.

"It's a useful metric when treated carefully, but it is rare to encounter it
treated carefully. Science courses would be well-served to stop teaching it in
favor of a stronger emphasis on multiple distributions. (Multiple
distributions are usually touched upon, but implicitly our curricula overfavor
the Gaussian distribution and end up accidentally convincing students it's the
only one.)"

But that's just me.

~~~
Helianthus
>Reality always provides some bound

But... it doesn't. You ever hear about the hypothetical possibility of your
atoms lining up and falling through the floor?

It's hypothetical in the sense that it's really ridiculously unlikely, but
_there is no bound preventing it_.

Now the central point about different probability curves stands, but that's
not what Taleb was talking about--he seems to think that it's the tool's fault
if people are using it wrong--and it's also not what Homunculiheaded argued.

~~~
jerf
"But... it doesn't. You ever hear about the hypothetical possibility of your
atoms lining up and falling through the floor?"

A bad example; that's a very, very large sample space, such that deviations
from mathematical perfection are irrelevant. They do exist, if you're precise
enough (for instance, the universe is not modeled by perfectly continuous
space), but I'm not inclined to argue them, because it's too easy to argue
that they're irrelevant. So instead consider something more human-sized: Match
a normal distribution to the height of human beings.

It works very well, except in real life, the probability of a negative-height
human being is _zero_. This is not what the Gaussian model predicts.

Unfortunately, rather more science takes place in the second domain than the
first.

"that's not what Taleb was talking about"

I'm quite aware. The fact that I commented on how I got something other than
what I expected rather suggested that, I thought... The fact that this isn't
precisely what Homunculiheaded said is also why I posted, rather than just
upvoting....

~~~
Helianthus
>The fact that this isn't precisely what Homunculiheaded said is also why I
posted

Ah. I misread the following...

>>This is actually what I expected to read:

as agreement ("This is actually what I expected to read."). My mistake.

>Unfortunately, rather more science takes place in the second domain than the
first.

As I said to Homunculiheaded, this is because of the relative utility of the
models, which we _understand_ -- and even those who do not understand it do
not make the tool's use invalid.

What are we bemoaning, here, but actual misunderstanding itself?

And really, what's the point of that?

------
n00b101
Taleb has a good point about people mistakenly interpreting standard deviation
(sigma) as Mean Absolute Deviation (MAD). I like that he gives some
conversions (sigma ~= 1.25 * MAD for the normal distribution).

I think it's rather silly to talk about "retiring" standard deviation, but we
can't blame Taleb - the publication itself posed the question "2014: What
Scientific Idea is Ready for Retirement?" to various scientific personalities.

What Taleb failed to mention is that, once properly understood, standard
deviation has distributional interpretations that can be much more useful than
MAD. For example, if the data is approximately normally distributed, then
there is approximately a 99.99% probability that the next data observation
will fall within 4 * sigma of the mean.
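The 1.25 conversion factor is just sqrt(pi/2) ≈ 1.2533, which is easy to check numerically (a sketch with made-up simulated data):

```python
import math
import random

random.seed(42)

# For normally distributed data, sigma = sqrt(pi/2) * MAD ≈ 1.2533 * MAD.
xs = [random.gauss(0, 1) for _ in range(200_000)]
mean = sum(xs) / len(xs)
sigma = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
mad = sum(abs(x - mean) for x in xs) / len(xs)

print(sigma / mad)             # close to 1.2533
print(math.sqrt(math.pi / 2))  # the exact conversion factor
```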

Not everything is approximately normally distributed, but a lot of phenomena
ARE normally distributed. It's a well-known fact that the phenomena which
Taleb is most interested in (namely, financial return time-series) are not
normally distributed. But I would like to know how Taleb proposes to "retire"
volatility (sigma) from financial theory and replace it with MAD? Standard
deviation is so central in finance that even the prices of some financial
instruments (options) are quoted in terms of standard deviation (e.g. "That
put option is currently selling at 30% vol"). How do we rewrite Black-Scholes
option pricing theory and Markowitz portfolio theory in terms of MAD and
remove all the sigmas everywhere? Surely Taleb has already written that paper
for us so that we can retire standard deviation?

~~~
regularfry
I think his point is that Black-Scholes et al are holed beneath the waterline
precisely because they involve standard deviations. In his world, you're
better off being unable to price an option than you would be with Black-
Scholes. Your example of "That put option is currently selling at 30% vol" is
actually an example of why the system is so completely broken: if volatility
as standard deviation were valid, _all options against the same underlying
instrument would have the same implied volatility_. The volatility smile
shouldn't exist.

This wouldn't matter if the down-side wasn't so crippling.

I don't think Taleb has to be the one to propose a replacement for portfolio
theory, and I think criticism of him for not doing so is pointless. You don't
need to have a spare tire handy to point out that your neighbour's car has a
flat, and you don't have to run an airline to tell people not to get on a
plane with the engines visibly on fire.

~~~
tdees40
I've never understood this part of Taleb's argument. Of course, constant vol
Black-Scholes does not hold. BUT NO ONE USES THIS. Everyone in the financial
industry is well aware of the volatility skew, and spends lots of time
adjusting for it.

B-S vols are putting the "wrong number into the wrong equation to get the
right price," as Rebonato famously said.

------
JASchilz
The central limit theorem shows us that unimodal data with lots of independent
sources of error tends towards a normal distribution. That description is a
good first-pass, descriptive model for lots and lots of contexts, and standard
deviation speaks well to normally distributed data.

Squaring error isn't just a convenient way to remove sign, it's driven by a
lot of data-sets' conformance to the central limit theorem.
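As a toy check of that description (a sketch summing uniform "error sources"; the sizes are made up):

```python
import random
import statistics

random.seed(7)

# Sum many independent uniform "error sources" and the totals look
# normal: roughly 68% of them land within one standard deviation of
# the mean, as a Gaussian predicts.
totals = [sum(random.random() for _ in range(50)) for _ in range(20_000)]
mu = statistics.mean(totals)
sd = statistics.stdev(totals)
within_1sd = sum(abs(t - mu) <= sd for t in totals) / len(totals)
print(round(within_1sd, 3))  # close to 0.683
```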

~~~
dj-wonk
Thank you. I don't think it is intellectually honest for Taleb to omit this
fact.

------
ClementM
This article is based on a paper Taleb published in 2007. If you want to test
yourself, try the experiment on page 3:
[http://papers.ssrn.com/sol3/papers.cfm?abstract_id=970480](http://papers.ssrn.com/sol3/papers.cfm?abstract_id=970480)

~~~
CamperBob2

       A stock (or a fund) has an average return of 0%. It moves 
       on average 1% a day in absolute value; the average up move 
       is 1% and the average down move is 1%.
    

How does that yield an average return of 0%?

~~~
baking
The usual way to do this is to take the natural log, so an up 1% day followed
by a down 1% day (or vice versa) will always net out to a 0% change.
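To spell that out (a small sketch; the $100 starting price is arbitrary):

```python
import math

# In simple returns, a +1% move followed by a -1% move does NOT
# return to the start: 1.01 * 0.99 = 0.9999.
print(1.01 * 0.99)

# In log returns, a +1% day followed by a -1% day nets out to zero.
price = 100.0
price *= math.exp(0.01)   # up 1% in log terms
price *= math.exp(-0.01)  # down 1% in log terms
print(price)  # ~100.0, up to floating-point rounding
```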

------
programminggeek
I think because it's called "standard deviation" that it sounds like the thing
to use or look for. It sounds more correct because of the word standard.

I feel like it is the same kind of failing, due to human perception of
language, that programmers have with the idea of exceptions and errors,
especially the phrase "exceptions should only be used for exceptional
behaviors". That's a cool phrase, but people latch on to it because the word
"exception" sounds like something extremely rare and out of the ordinary,
whereas we see errors as common; but they are in fact the same thing. Broke is
broke; it doesn't matter what you call it, but thousands of programmers think
differently because of the name we gave it.

We are human and language absolutely plays a role in our perception of things.

~~~
klodolph
> I think because it's called "standard deviation" that it sounds like the
> thing to use or look for.

Yes! Because it's an awesome trick and lets you do good estimates on napkins.

The other day I was buying lunch at a food cart and thought about how much
change the food carts had to carry, as a function of how many customers they
have, under the assumption that they want to be able to provide correct change
to 99% of their customers.

Let's say that the average amount of change a customer needs is $5, and a
99-th percentile customer needs $15 in change. If we _pretend_ that the
distribution is approximately Gaussian we can calculate that 1,000 food carts
with 1 customer each would need $15,000 in change, but 1 food cart with 1,000
customers would need $5 x 1,000 + ($15 - $5) * sqrt(1,000) ≈ $5,320. That's
math you can do in your head without a calculator (being a programmer, 1,000 ≈
2^10 so sqrt(1,000) ≈ 2^5).
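Those napkin numbers can be sanity-checked with a quick simulation (a sketch that keeps the comment's pretend-Gaussian assumption; 2.33 is the 99th-percentile z-score):

```python
import random

random.seed(1)

# Pretend-Gaussian per-customer change need: mean $5, with a
# 99th percentile of $15, i.e. sigma ≈ (15 - 5) / 2.33.
mu, sigma = 5.0, 10.0 / 2.33
n_customers = 1000

# Simulate many days of 1,000 customers and take the 99th
# percentile of the daily totals.
totals = sorted(
    sum(random.gauss(mu, sigma) for _ in range(n_customers))
    for _ in range(2000)
)
p99 = totals[int(0.99 * len(totals))]
print(round(p99))  # in the neighborhood of the $5,320 napkin figure
```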

The standard deviation and assumptions of normality are so useful because of
the central limit theorem. That is, if you have many iid variables which have
finite standard deviation the sum will converge to a Gaussian distribution as
the number of variables increases.

Then you say "Well, the standard deviation weighs the tail too heavily" and
the response is "well, use higher-order moments then; that's what they're
for".

~~~
skybrian
It's a neat math trick, but it seems more accurate to say this lets you
calculate _bad_ estimates on the back of a napkin. Unless you really think
food carts carry $5000 in change.

The quantitative work I do has to do with measuring latency, where the
minimum, median, 90%, and 99% values are more meaningful than the mean or
standard deviation. Programs typically have a best-case scenario (everything
cached) and a long one-sided tail.

~~~
klodolph
Saying that it's silly to think that food carts have $5,000 in change is
unproductive because I was illustrating how the calculation works, not how the
economics of food carts works, and the numbers I used for illustrative
purposes were not intended to reflect reality. (1,000 customers in a day? Not
likely, my guess is 200 for the busiest food carts.)

But it's good to have bad estimates; at least, it's better to have bad
estimates than to have no estimates at all. I'm not saying that standard
deviation is a substitute for more thorough analysis, just that standard
deviation is an improvement over just talking about the mean.

Another example: We'd like to hire you, the mean number of hours per week
you'd work is 40.

Versus:

We'd like to hire you, the mean number of hours per week you'd work is 40, and
the standard deviation is 15. So your _bad_ estimate is that you'd have two
70-hour weeks each year. But it's better than no estimate.

~~~
skybrian
Sure, two points is better than one but what's special about two? I'd rather
have a graph. We have computers so there's rarely a reason to compress the
data so much.

~~~
klodolph
We often have to compress the data down to a single decision or statistic:
yes/no should I accept the job offer, how much money should I save before
buying a house, or what's the probability that I'll die in the next 10 years.

I hate to quote XKCD, but it's like saying your favorite map projection is a
globe ([http://xkcd.com/977/](http://xkcd.com/977/)). Yes, you've preserved
all the data, but even _with_ computers, your beloved graph will not make it
all the way to the end.

~~~
skybrian
Preserving all the data is the logical endpoint but that's not what I was
suggesting. I'm just saying there's nothing special about keeping two points.

I'd rather not feed two points to my decision algorithm, whether it's machine
learning or a human looking at the data. It makes more sense to make some
attempt to preserve the shape of the graph unless you have strong reason to
believe it's Gaussian, and even then the assumption should be checked.

------
cheald
I really tried to get through "The Black Swan" and Taleb's writing struck me
as so pretentious and self-involved that it made it impossible for me to
finish.

He strikes me as someone who is so desperate to be important and recognized
that an assertion like this doesn't really surprise me.

~~~
calroc
I started reading that book and I thought he was a genius, but then, right
about the point where he starts bagging on the Uncertainty Principle, I
realized that he's actually kind of an idiot.

The book does make a good point though.

~~~
cheald
Maybe I'm the idiot, but I found the book's point to be pretty trivial.
"Sometimes unexpected things happen, and people are bad at expecting the
unexpected" seems pretty damn trivial to me.

Maybe it's because I think like a programmer rather than a finance guy, but a
large portion of what I do on a daily basis is about mitigating risk from the
unknown. A programmer who only guards against known risks is going to get his
ass handed to him sooner rather than later (these are the sorts of people who
use blacklists and regexes to sanitize SQL). A finance guy who only guards
against known risks is just playing the averages. My experience with traders
is that they tend to be pretty abstracted from reality and like to impose
rules and patterns where there are none. I can see why those sorts of people
would find Taleb's work groundbreaking.

Taleb's undoubtedly intelligent, but I feel like he woke up one day, decided
that he wanted to be a philosopher who brings wisdom to the masses, and built
his temple on a mind-searingly obvious principle, which he now proclaims to be
his great gift to humanity, for which he should be praised, hallowed be his
name. The impression I got off of him was that he considers himself a prophet,
imparting a word to the masses that the rest of us are just too stupid to
recognize. Gag me.

~~~
calroc
I agree with you except that you left out the only really important (in my
opinion) part of his pretty trivial message: "Sometimes unexpected things
happen, and people are bad at expecting the unexpected, and then they FAIL TO
NOTICE that they are bad at expecting the unexpected"

It can be difficult for folks (myself included!) to separate their message
from their irascible personalities... ;-)

~~~
cheald
But if we were good at noticing that we're bad at expecting the unexpected,
then we would _expect_ the unexpected, making it no longer really all that
unexpected, no? His whole argument just felt tautological to me.

~~~
jessaustin
If you _really try_ , for a _long time_ , you can be that rational. Please
note, however, that 99% of people are not that rational, and if you haven't
realized it, you're also not that rational. (Neither am I! But at least I
realize I have a problem... b^)

If none of this makes sense, take a look at the LessWrong site.

------
scythe
While the mean deviation as presented is slightly nicer than sigma for
intuitive purposes, it isn't as appropriate (iirc) for statistical tests on
normal distributions and t-distributions.

More importantly, it doesn't fix the _real_ problem, which is that the mean
and standard deviation don't tell you everything you need to know about a data
set, but often people like to pretend they do. It's not rare to read a paper
in the soft sciences which might have been improved if the authors had
reported the skewness, kurtosis, or similar data which could shed light on the
phenomenon they're investigating. These latter statistics can reveal, for
instance, a bimodal distribution, which could indicate a heterogeneous
population of responders and non-responders to a drug, and that's just one
example.

I'm not a statistician, so some of this might be a bit off.

~~~
drblast
Whenever you try to describe a large data set with a single number, you lose a
lot of information, like you said. Having more measurements helps, but I think
the larger point is that we don't have to do this anymore, we could publish
the entire data set instead.

Without computers, this would be a waste of paper, but transmitting the data
electronically is cheap.

So why argue over the measurements? Publish the data and my software can give
me any measurement I'm interested in.

------
bluecalm
So first about the article:

>>The notion of standard deviation has confused hordes of scientists

What an assertion! It also proved to be very useful for hordes of
scientists... what about some examples of confused scientists?

>>There is no scientific reason to use it in statistical investigations in the
age of the computer

As someone who uses it daily I am eagerly awaiting his argument.

>>Say someone just asked you to measure the "average daily variations" for the
temperature of your town (or for the stock price of a company, or the blood
pressure of your uncle) over the past five days. The five changes are: (-23,
7, -3, 20, -1). How do you do it?

Ok... if I am to calculate the average, I calculate the average; if I need
to know the standard deviation, I calculate the standard deviation...

>> It corresponds to "real life" much better than the first—and to reality.

What the flying fuck. What "real life"? Standard deviation tells you how
volatile measurements are, not what the mean deviation is. Those are both very
real-life things, just not the same thing.

>>It is all due to a historical accident: in 1893, the great Karl Pearson
introduced the term "standard deviation" for what had been known as "root mean
square error". The confusion started then: people thought it meant mean
deviation.

I don't know how one can read it and not think: "is this guy high or just
stupid?".

>>. The confusion started then: people thought it meant mean deviation.

I have yet to see anybody who thinks that standard deviation is mean
deviation. It's Taleb, though. Baseless assertions insulting groups of people
are his craft.

>>What is worse, Goldstein and I found that a high number of data scientists
(many with PhDs) also get confused in real life.

One example, please? I can give hundreds where std dev is useful and mean
deviation isn't. Anything where you decide what % of your bankroll to bet on
a perceived edge, for example.

Ok, so he asserted that people should just use mean deviation instead of the
mean of squares. Guess what, though: taking the squares has a purpose. It
penalizes big deviations, so two situations which have the same mean
deviation, but one of which is more stable, have different standard
deviations. This information is useful for many things: risk estimation, or
calculating the sample size needed for a required confidence level (if you
need more experiments, how careful you should be with conclusions and
predictions, etc.). He didn't mention how we are going to achieve those with
his proposal. Meanwhile he managed to throw insults at various groups without
giving one single example of the misuse he describes.
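The point about squaring can be made concrete with two toy sets of deviations (a minimal sketch; both sets have mean zero):

```python
import math

# Two sets of deviations with the same mean absolute deviation but
# different standard deviations: squaring penalizes the big, rare
# deviations in the second set.
steady = [1, -1, 1, -1]   # MAD = 1, every move the same size
lumpy  = [2, 0, -2, 0]    # MAD = 1, but the moves come in bursts

def mad(xs):
    return sum(abs(x) for x in xs) / len(xs)

def std(xs):
    return math.sqrt(sum(x * x for x in xs) / len(xs))

print(mad(steady), mad(lumpy))  # 1.0 1.0
print(std(steady), std(lumpy))  # 1.0 ~1.414
```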

This is not the first time he has written something this way. His whole recent
book is like that. It's anti-intellectual bullshit with many words and zero
points. He doesn't give any arguments, he throws a lot of insults, he misuses
words and makes up redundant terms which he then struggles to define. The guy
is a vile idiot of the worst kind: ignorant and aggressive. Him gaining so
much following by spewing nonsense like this article is for sure fascinating,
but there is no place for him in any serious debate.

~~~
azakai
>> The notion of standard deviation has confused hordes of scientists

> What an assertion! It also proved to be very useful for hordes of
> scientists... what about some examples of confused scientists ?

He is exaggerating, for sure. But the point is valid: the mean absolute
deviation (MAD) is often very different from the standard deviation (STD), and
the MAD is more intuitive; it has a natural geometric interpretation, while
the STD's use of squared distances makes it more complex.

And yes, this confuses people in some cases, including scientists. Many
scientists are not statistical experts, they use tools as they were taught,
and they often assume MAD is approximately the STD, because it usually is,
except in rare cases when it is not. I've seen examples of those people in
grad school, he is not making this up.

The STD is far easier to analyze mathematically. That is the huge value it
brings: squaring is an operation you can take the derivative of, but absolute
value you cannot. STD gives us nice properties, like the easily provable fact
that the sum of the variances is the variance of the sum for independent
variables.
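That additivity is easy to check with simulated independent variables (a sketch; MAD has no such clean rule):

```python
import random
import statistics

random.seed(3)

# Two independent variables with known variances 4 and 9.
xs = [random.gauss(0, 2) for _ in range(100_000)]
ys = [random.gauss(0, 3) for _ in range(100_000)]
sums = [x + y for x, y in zip(xs, ys)]

# Variance of the sum ≈ sum of the variances (4 + 9 = 13).
print(statistics.pvariance(sums))
```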

MAD, however, is nicer for reporting data, since it is more intuitive. I think
he makes a valid point that STD is used more frequently than it should be.

> Ok so he asserted that people should just use mean deviation instead of mean
> of squares. Guess what though, taking the squares have a purpose: it
> penalizes big deviations so two situations which have the same mean
> deviation but one is more stable have different standard deviations.

His point is that many people are not aware of that property and do not want
it.

~~~
elipsey
I'm relieved to hear from others who didn't love his book; now I don't feel so
left out. I checked out Black Swan from the library a while back. It seemed
like Taleb's smug prose and mudslinging were writing a check his evidence
couldn't cash, but I couldn't stand more than the first 50 pages, so I never
got to find out. My girlfriend read all of it and sort of gave me the cliff
notes; she thought I wasn't missing much.

If I ever met anyone in real life who was as impressed by his book as the
inexplicably fawning blogs and reviews I have seen, I would give it another
try, but I sorta feel like I've been had. At least it was a library book, so I
didn't give him 12 bucks.

Anybody here love his stuff, wanna convince me to try again?

~~~
chubot
What is hilarious is that I have been scratching my head over the exact
opposite phenomenon!

I actually love Taleb's books (Black Swan, Antifragile). And I work with "data
scientists", most with Ph.D.s, many in statistics.

I always talk about his books, but I cannot for the life of me find a single
person who's even read them. I think: these are some of the ONLY books about
stats on the NY Times best-seller lists (up until Nate Silver). And yet all
these professionals have not only not read them, but have barely even heard of
them?

My hypothesis was that it is more popular on the East Coast than the West
Coast (of the US). Are you from either of those places? I am originally from
the east coast but work on the west coast. To hand wave a bit, I feel like
west coast people are less into "ideas" and more into actions and experiences.
Taleb's ideas do have somewhat of a nytimes-ish new yorker-ish east coast
culture flavor. And a lot of people working in Silicon Valley are not really
that interested in philosophy.

On the subject of his writing, I can totally see that people can be turned off
by his writing. He can be arrogant and insulting. I find it kind of funny, but
that's a matter of taste.

A few things I remember from his books that I really liked:

- The story of Nobel prize winner Myron Scholes, namesake of the Black-Scholes
equation, which I learned about in computational finance in college. He
started a company, "Long Term Capital Management", to monetize these ideas,
and promptly lost billions of dollars.
[http://en.wikipedia.org/wiki/Myron_Scholes](http://en.wikipedia.org/wiki/Myron_Scholes)

That's not interesting? I think the difference between theory and practice is
intensely interesting, and Taleb has a lot to say about it.

- I largely agree with his philosophy that people who claim to know things
cause more harm than good. The downfall of Alan Greenspan and Bernanke is
their arrogance. They think they can control the economy. But they can't and
caused millions of people real harm.

- Respect for the old. For all its virtues, Silicon Valley does have a severe
case of "neomania". Taleb's ideas about things that last apply to software
too. Unix and C are going to be around a lot longer than say Hadoop or
Puppet/Chef.

- The philosophy of fragility also applies to software in a straightforward
way. Most people know this now, but you should continually expose your
software to users and the market, not build up grand ideas in your head.

- Actions over knowledge, i.e. people who know how to do things but can't
explain them.

I could list a half a dozen more important ideas but I'll stop there.

I did write in my comments about this book that he overreached on his
"trilogy" idea for the Antifragile. But I do like how he draws together a lot
of seemingly disparate ideas that are philosophically related.

~~~
elipsey
Ok, you convinced me to try again, thanks.

FWIW, I'm from the west coast, but have spent the last several years in New
York working with natural sciences students and post-docs. I might be what you
would call an "ideas" person. Abstraction appeals to me. People in our
department seem happier when their work is closer to physical observations.
Measuring stuff with yardsticks and radar: good. Getting big piles of data
from other people and doing stats: OK. Fitting a parameterized model to
someone else's data: healthy skepticism. Of course, healthy skepticism is
generally cultivated.

It wouldn't be surprising if our friends' reading lists (and reactions) depend
on what they do for a living. Did you meet a lot of people in finance or
economics on the east coast? Maybe people's professions are spatially
correlated.

My peers seem to treat models cautiously (including their own), and have
tended to respond to abstract economic ideas with measured skepticism. I can't
speak for them, but I sometimes perceive that economic arguments are suspected
of being insufficiently empirical and subject to ulterior motives. Which, it
seems, is at least a part of what Taleb is complaining about. Obviously these
things can be true of any kind of argument, I'm just reporting my impression.
Anyhow, it would be easier to listen to Taleb if his tone were more restrained.

Most of the economists I have paid attention to (not nearly as many as I would
like) seemed inclined to provocation. Sowell in "Basic Economics", and
Friedman in his speeches (I haven't read his papers) tend to poke fun at their
fellow citizens, for example. I think they sometimes alienate those outside
their discipline because of this. It makes reading fun though. I find Friedman
very amusing, and I think so does he.

~~~
chubot
I am a programmer in Silicon Valley, but I was more interested in
philosophy/mathematics when I was young (I was raised and educated on the east
coast). I do feel there is a cultural difference -- not sure precisely what it
is though.

I guess I share Taleb's problems with models. Even before I read Taleb I would
say to myself "the map is not the territory", particularly with regard to
software abstractions. I think the space between the model and reality is
where you find a lot of interesting things (including the ability to make a
lot of money).

I also share Taleb's skepticism with economics. The core problem is that it's
not really a predictive science. It's a lot of people talking about stuff. Did
those ideas help anyone? You can make a good case that they hurt a lot of
people. If they are so smart, why aren't they rich? The Scholes case is a
great example of that.

I'm currently reading "The Signal and the Noise" by Nate Silver, which is
actually a fantastic complement to Taleb's books. They say very much the same
things, in very different ways. The good part is that you will not be turned
off by Silver's prose -- he's humble and very readable. I didn't follow 538 at
all, and didn't pay all that much attention to the 2012 election, but I can
tell that his writing skill was a big reason he became so popular.

To give an example, Taleb talks over and over about "negative knowledge" --
what not to do, what things don't work, etc. And Nate Silver says the same
thing. To make accurate predictions and models, you have to be aware of known
classes of mistakes, cognitive biases, etc. and not fall into those traps.
People often think that they need to improve themselves by learning more. But
for a reasonably smart person, the bottleneck to your effectiveness is
actually thinking that you know something you don't.

I am also an "ideas" person but I share the utilitarianism and empiricism of
Taleb. There has to be "skin in the game", as he says. There are so many ideas
out there, and generally most philosophical arguments (and journalism,
advocacy, etc.) boil down to confused semantics. So the way to find truth is
through actions and experiments. Economics fails these tests for truth.

EDIT: Nate Silver talks about this paper:
[http://www.plosmedicine.org/article/info:doi/10.1371/journal...](http://www.plosmedicine.org/article/info:doi/10.1371/journal.pmed.0020124)
This would resonate with Taleb quite a bit. Most ideas are false, including
published ones. It would actually violate economic theory if that weren't the
case -- if most science was true -- because scientists have bad incentives
(something I know from direct experience).

------
Glyptodon
All I know is this reminds me a lot of high school, where we always had to
compute std dev in problems, homework, and sometimes labs, but nobody ever
really explained how to interpret it. It was always like "This is std dev.
This is how you compute it. Make sure you put it in your tables and report."

Eventually someone (or something) did explain it, but once I understood it, it
became clear that it wasn't always a sensible thing to be asked to calculate;
it was often just a reflexive requirement.

------
spikels
You gotta love the acronyms: STD versus MAD!

Taleb is definitely mad, but his use of the MAD acronym (mean absolute
deviation) is actually correct. However, the STD acronym (all caps) refers to
"sexually transmitted disease" and is not generally used for "standard
deviation". Most people use SD, Stdev, StDev or sigma.

Once again his ability to coin new terminology outstrips his ability to form
coherent ideas that are anything more than trivial (e.g. we have known about
fat tails in stock returns for 50+ years). As with George Soros[1], Taleb's
success says more about the state of the world of finance than about his
contribution to our knowledge.

[1]-See his book "The Alchemy of Finance"

~~~
Fomite
STD is also an obsolete term. Epidemiology, medicine, etc. now most often
refer to them as STIs, or Sexually Transmitted Infections.

The reasoning for this is that many sexually transmitted infections can be
acquired, passed on to others, etc. without causing any clinical symptoms.
See: HPV, among others.

------
justin66
Taleb has a textbook draft up which is more technical than his popular
writings:

[http://www.fooledbyrandomness.com/FatTails.html](http://www.fooledbyrandomness.com/FatTails.html)

There might be something there for the more rabid critics. At least it will
keep them off the internet for a few days...

------
zeidrich
It's not that we should retire the notion of standard deviation. It's more
that we should understand the tools that we are using and use the appropriate
tool for the job.

------
puranjay
NNT is my intellectual superhero but the amount of hate he gets is tremendous.

Please understand that NNT's biggest issues are not so much with the way
statistical models are applied to economics and finance, but how social
scientists sometimes feel compelled to apply them to social fields as well,
which is plain unscientific, dumb, and mostly disastrous.

So when you bear down on his arguments, please keep this context in mind.

~~~
Fomite
Among other things, I've seen way more appalling applications of statistics in
finance and economics than I have in social science.

Also, the assertion in your post that the misapplication of statistical models
in _social science_ is "disastrous" but somehow giving finance a pass? You've
got to be kidding me.

~~~
jessaustin
NNT, at least, doesn't give finance a pass. He excoriates them at every
opportunity.

~~~
Fomite
That's why I was so puzzled - NNT rather eagerly takes finance to task.

------
dxbydt
The notion of area has confused hordes of scientists; it is time to retire it
from common use and replace it with the more effective one of circumference.
Area should be left to mathematicians, topologists and developers selling real
estate. There is no scientific reason to use it in statistical investigations
in the age of the computer, as it does more harm than good.

Say someone just asked you to measure the area of a circle with radius pi. The
area is exactly 31. But how do you do it?

scala> math.round(math.Pi * math.Pi * math.Pi).toInt

res1: Int = 31

Do you pack the circle with n people, count them up and verify n == 31 ? Or do
you pour a red liquid into the circle and fill it up, then drain it and
measure the amount of red ? For there are serious differences between the two
methods.

If instead, you were asked to measure the circumference of a circle with
radius pi.

scala> math.round(2 * math.Pi * math.Pi).toInt

res2: Int = 20

You just ask an able-bodied man, perhaps an unemployed migrant, to walk around
this circle while another man, an upstanding Stanford sophomore, starts
walking from Stanford to meet his maker, I mean VC, well, it's the same thing...

So by the time the migrant finishes walking around the circle, our upstanding
Stanford entrepreneur is greeting the VC on the tarmac of the San Francisco
International Airport. This leads one to rightfully believe that the
circumference of the circle of radius pi is exactly the distance from Stanford
to the SF Airport, i.e. 20 miles. It corresponds to "real life" much better than
the first—and to reality. In fact, whenever people make decisions after being
supplied with the area, they act as if it were the distance from their
university to the airport.

It is all due to a historical accident: in 250BC, the Greek mathematician
Archimedes introduced Prop 2, the Prevention of Farm Cruelty Act (
[http://en.wikipedia.org/wiki/California_Proposition_2_(2008)](http://en.wikipedia.org/wiki/California_Proposition_2_\(2008\))
). No, I believe this was a different Prop 2. This Prop 2 states that the area
of a circle is to the square on its diameter as 11 to 14
([http://en.wikipedia.org/wiki/Measurement_of_a_Circle](http://en.wikipedia.org/wiki/Measurement_of_a_Circle)
) .The confusion started then: people thought it meant areas had to do with
being cruel to farm animals. But it is not just journalists who fall for the
mistake: I recall seeing official documents from the department of data
scientists, which found that a high number of data scientists (many with PhDs)
also get confused in real life.

It all comes from bad terminology for something non-intuitive. Despite this
confusion, Archimedes persisted in the folly by drawing circles in the sand,
an infantile persuasion, surely. When the Romans waged war, Archimedes was
still computing the area of the circle. The Roman soldier asked him to step
outside, but Archimedes exclaimed "Do not disturb my circles!"
([http://en.wikipedia.org/wiki/Noli_turbare_circulos_meos](http://en.wikipedia.org/wiki/Noli_turbare_circulos_meos))

He was rightfully executed by the soldier for this grievous offense. It is sad
that such a minor mathematician can lead to so much confusion: our scientific
tools are way too far ahead of our casual intuitions, which starts to be a
problem with a mad Greek. So I close with a statement by famed rapper Sir Joey
Bada$$, extolling the virtues of the circumference: "So I keep my
circumference of deep fried friends like dumplings, But fuck that nigga we
munching, we hungry." ([http://rapgenius.com/1931938/Joey-bada-hilary-
swank/So-i-kee...](http://rapgenius.com/1931938/Joey-bada-hilary-swank/So-i-
keep-my-circumference-of-deep-fried-friends-like-dumplings))

~~~
mekael
Possibly the best thing I've read all week. Thank you sir.

~~~
eruditely
Except the difference here is that generally Taleb is right and has a point.
The difference between a probabilistic Hacker News title generator and the
actual titles found here is that even though they sound the same and have the
same style... one is actually real.

------
lambdasquirrel
I think we'd be better off if we recognized that there are statistical
distributions in the world besides the plain old Gaussian. For example, wealth
does not follow a Gaussian, so why the heck do we throw around ideas like
"above average wealth"?

Is MAD any better? Definitely. But I'd like to see a visual demonstration of
how well it models exponential-based distributions. How well does it describe
their "shape", the skew of the tail?
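
To illustrate the gap on a skewed distribution (a sketch of my own, not from
the parent comment): for an exponential distribution with mean 1, the standard
deviation is exactly 1 while the mean absolute deviation is 2/e ≈ 0.736, so
the two summaries already disagree before the tail gets genuinely heavy.

```python
import random
import statistics

# For an exponential distribution with mean 1, the standard deviation is
# exactly 1 while the mean absolute deviation is 2/e ≈ 0.736.
random.seed(2)
xs = [random.expovariate(1) for _ in range(200_000)]
m = statistics.fmean(xs)
mad = statistics.fmean(abs(x - m) for x in xs)
sd = statistics.pstdev(xs)
print(round(sd, 2), round(mad, 2))  # roughly 1.0 and 0.74
```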

~~~
chilldream
In that specific case, I'd submit that the word "average" doesn't belong in a
conversation attempting any level of rigor. It only encourages the confusion
between median and mean.

~~~
vorg
Medians and means are interrelated. Averages are best thought of as being on a
scale from meanlike to medianlike: a median requires a mean of the middle 2
values when there's an even number of them, and a mean can have outliers on
each side to be eliminated before calculation. The word _average_ abstracts
away this detail. (I guess there'd be another scale from meanlike to
medianlike absolute deviation, but that's another story.)

------
cwyers
"In fact, whenever people make decisions after being supplied with the
standard deviation number, they act as if it were the expected mean
deviation."

Boy, is that statement useless without any kind of context, example or
citation.

~~~
icebraining
The next two paragraphs have examples with some context.

~~~
cwyers
Like this?

"But it is not just journalists who fall for the mistake: I recall seeing
official documents from the department of commerce and the Federal Reserve
partaking of the conflation, even regulators in statements on market
volatility. What is worse, Goldstein and I found that a high number of data
scientists (many with PhDs) also get confused in real life."

It doesn't tell us what happened, it just asserts that it did in certain
contexts. It doesn't cite the paper he presumably wrote with Goldstein (which
Goldstein?) about it. I feel like I'm getting a summary of an abstract with
all the citations missing.

~~~
lstamour
> which Goldstein?

From ClementM above,
[http://papers.ssrn.com/sol3/papers.cfm?abstract_id=970480](http://papers.ssrn.com/sol3/papers.cfm?abstract_id=970480)

~~~
cwyers
Thank you.

From that paper:

"We first posed this question to 97 portfolio managers, assistant portfolio
managers, and analysts employed by investment management companies who were
taking part in a professional seminar. The second group of participants
comprised 13 Ivy League graduate students preparing for a career in financial
engineering. The third group consisted of 16 investment professionals working
for a major bank."

From the article:

"What is worse, Goldstein and I found that a high number of data scientists
(many with PhDs) also get confused in real life."

I get that "data scientist" is a really broad term at this point, but I don't
think it's a very good description of the people quizzed in this paper, if
this is the paper he was indeed referring to.

~~~
cwyers
And all his examples come from the financial sector, it would seem, but in the
article he refers to "hordes of scientists," "people in social science," and
even "problems with social and biological science." So even the hard sciences
get pulled into his indictment of standard deviation, even though he never
once gives an example from them.

------
ChristianMarks
Climate scientists--among others--have made similar recommendations to use the
mean absolute error in place of the standard deviation, depending on the
application. Taleb might have cited the extensive methodological literature--
for example:

Cort J. Willmott, Kenji Matsuura, Scott M. Robeson. _Ambiguities inherent in
sums-of-squares-based error statistics._ Atmospheric Environment 43 (2009)
749–752.

URL:
[http://climate.geog.udel.edu/~climate/publication_html/Pdf/W...](http://climate.geog.udel.edu/~climate/publication_html/Pdf/WMR_Atmos_Env_09.pdf)

------
bayesianhorse
Nassim Taleb somehow likes to beat up on normals...

We Bayesians have similar notions, but we usually try not to overly bully
frequentist methods, the poor things. Also, being familiar with Bayesian
methods, a lot of what Taleb is saying sounds vaguely familiar...

~~~
kylebrown
Well, frequentist methods are taking the blame for the most recent financial
crisis, such as assuming a normal distribution when the empirical one is fat-
tailed.

Perhaps the Bayesian methods will take the blame in the next financial crisis.
Such as the error in estimating a non-stationary distribution and quantifying
the uncertainty.

~~~
bayesianhorse
Statistical economists or analysts always said that the normal distribution is
a simplification and that this simplification has its own problems.

It's just that traders and bankers would fire statisticians who were too vocal
about it as wasting their time with unnecessary explanations...

Bayesian methods in general can only take the blame if you can prove some
other method being more reliable.

Also they have another advantage: Bayesian methods are so mind blowing and
beautiful, that it is hard to blame them for anything!

------
randomsample2
Standard deviation and mean absolute deviation are both useful, but I think
it's silly to suggest that we all adopt exactly one measure of variability to
summarize data sets. When in doubt, make a fucking histogram.

~~~
michaelhoffman
Histograms can be misleading too, especially when the breaks are not set well.

~~~
randomsample2
Anything can be misleading, but in most cases a histogram conveys more
information than a single number ever could.

~~~
michaelhoffman
Absolutely. They just aren't a panacea.

------
thetwiceler
It is sad that Taleb does not see the value in the standard deviation;
standard deviation is far more natural, and more useful, than MAD.

For example, if X and Y are independent, X has a standard deviation of s, and
Y has a standard deviation of t, then the standard deviation of X + Y is
sqrt(s^2 + t^2). There is a geometry of statistics, and the standard deviation
is the fundamental measure of length.
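
That Pythagorean addition is easy to check numerically; here is a quick
simulation sketch of my own (note that it relies on X and Y being
independent):

```python
import random
import statistics

# Independent variables with standard deviations 3 and 4: the standard
# deviation of their sum should be sqrt(3^2 + 4^2) = 5.
random.seed(0)
n = 100_000
x = [random.gauss(0, 3) for _ in range(n)]
y = [random.gauss(0, 4) for _ in range(n)]
sd_sum = statistics.pstdev(xi + yi for xi, yi in zip(x, y))
print(round(sd_sum, 1))  # close to 5.0
```

MAD has no such general addition rule: the MAD of a sum depends on the full
shapes of the two distributions, not just on their individual MADs.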

To retire the standard deviation is to ignore the wonderful geometry inherent
in statistics. Covariance is one of the most important concepts in statistics,
and it is a shame to hide it from those who use statistics.

Additionally, I will mention that we do not need normal distributions to make
special the idea of standard deviations. In fact, it is the geometry of
probability - the fact that independent random variables have standard
deviations which "point" in orthogonal directions - which causes the normal
distribution to be the resulting distribution of the central limit theorem.

------
tn13
There is nothing wrong with STD or MAD. The real problem is a lot of people
apply them without realizing the nature of their data and what kind of
analysis they want to do.

In this case what matters in the end is the kind of impact a deviation from
the mean has on the real-world variable you have. I agree that in most
Gaussian experiments MAD might be more useful than STD.

STD is more useful when the real-world impact of the deviation increases
exponentially with the magnitude of the deviation, and hence it is a good
idea to magnify the (x - mean) term by squaring it. In many cases the impact
is linear, and there MAD clearly works better -- for example in cricket,
where n runs are n times better than 1 run. But in the case of shooting,
hitting 9 targets out of 10 might be 100 times better than 1 out of 10, so
there MAD would be misleading.

------
TTPrograms
There is some argument that MAD is actually better than RMS for a lot of
applications. Apparently it predated RMS, but one of the reasons the field
switched was that RMS-minimizing (least-squares) linear regression is much,
much simpler to calculate. Also consider comparing the robustness of
RMS-based regression with MAD-based regression. See:
[http://matlabdatamining.blogspot.com/2007/10/l-1-linear-
regr...](http://matlabdatamining.blogspot.com/2007/10/l-1-linear-
regression.html)
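
The robustness difference is easy to see in miniature (a sketch of my own,
not from the linked post): fitting a constant by least squares gives the
mean, while fitting it by least absolute deviation gives the median, and a
single outlier treats the two very differently.

```python
import statistics

# Fitting a constant by least squares gives the mean; by least absolute
# deviation, the median. Add one wild outlier and compare.
data = [1.0, 1.1, 0.9, 1.05, 0.95]
outlier_data = data + [100.0]

print(round(statistics.mean(outlier_data), 3))   # dragged to 17.5
print(statistics.median(outlier_data))           # barely moves: 1.025
```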

------
yetanotherphd
I had hoped this would be about the revolution occurring in
statistics/econometrics where confidence intervals based on strong parametric
assumptions (e.g. the confidence intervals you would obtain using the standard
deviation) are being replaced by confidence intervals obtained using the
bootstrap (and other non-parametric methods) that don't rely on such strong
assumptions.

But no, it is just advocating using mean absolute deviation instead of the
standard deviation. Which I guess is to be expected from someone whose work
focuses mostly on long-tailed distributions.

Still, I think that non-parametric methods are much more valuable as a
solution to dealing with non-normal data than what Taleb is proposing.

------
valtron
He makes a good point about infinite MAD vs. STD.

------
afterburner
I've found MAD a potentially useful measure for monitoring whether something
gets out of whack; when using STD I needed to modify it to give less weighting
to outliers.

------
aredington
The way I read it he's proposing two things:

1) Refer to the analysis of Root Mean Square Error always by that name. (RMS
is already often used in certain jargon instead of stddev).

2) Stop treating RMS as a default measure of variance. Treat Mean Absolute
Deviation as the default measure of variance, because the figure it provides
is more consistent with people's psychological interpretation.

It's not really retiring RMS, just retiring the idea that it is a good default
statistical analysis.

------
MaysonL
How often do "six sigma" events occur in financial markets? A hell of a lot
more often than the 0.0000001973% of the time that they would in a normally
distributed system.
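
That percentage checks out under the normal assumption; here is a quick
verification of my own using the complementary error function:

```python
import math

# Two-sided probability of a six-sigma event under a normal distribution:
# P(|Z| > 6) = erfc(6 / sqrt(2)).
p = math.erfc(6 / math.sqrt(2))
print(f"{100 * p:.10f}%")  # ≈ 0.0000001973%
```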

~~~
tsax
Exactly. I'm astounded that this basic fact - that financial modelers have
been using the wrong distribution for decades - IS NOT BEING DENOUNCED. Take,
for example, the Capital Asset Pricing Model (CAPM), which assumes that
returns fall on a normal distribution.

------
Beliavsky
If data is drawn from a Laplace distribution of the form p(x) ∝ exp(-|x|), the
mean absolute deviation is more informative than the standard deviation, but
if its form is close to the normal, p(x) ∝ exp(-x^2), the standard deviation
is more informative. So whether to use the mean absolute or the standard
deviation depends on the distribution of the data. There is a field called
robust statistics that looks at this question.
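
That dependence shows up directly in the MAD/SD ratio, which converges to
sqrt(2/pi) ≈ 0.80 for a normal distribution and to 1/sqrt(2) ≈ 0.71 for a
Laplace; a quick simulation (a sketch of my own) makes it visible:

```python
import random
import statistics

def mad(xs):
    """Mean absolute deviation about the mean."""
    m = statistics.fmean(xs)
    return statistics.fmean(abs(x - m) for x in xs)

random.seed(1)
n = 200_000
samples = {
    "normal": [random.gauss(0, 1) for _ in range(n)],
    # A standard Laplace variable is the difference of two unit exponentials.
    "laplace": [random.expovariate(1) - random.expovariate(1)
                for _ in range(n)],
}
ratios = {name: mad(xs) / statistics.pstdev(xs)
          for name, xs in samples.items()}
print({name: round(r, 2) for name, r in ratios.items()})
```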

------
snake_plissken
I've always thought his writings were more allegorical than scientific; you
can't rely on the standard deviation to never go against you at the worst
possible time. But like anything else, it can and it (probably) will.

Also, yes, his writing style is grating and he takes opportunistic character
swipes at pretty much everyone.

------
beloch
I'm a physicist, so I'm one of the people this guy says standard deviation is
still good for. However, despite some "oddities" (pointed out by others here)
in his article, I'm more than willing to admit a simpler, easier to understand
term would be helpful for explaining many things to the general public. Hell,
it would be helpful for explaining things to _journalists_ , who we then trust
to explain things to the public!

Look at a reputable news site or paper. Odds are they post articles based on
polls several times a day. How many report confidence intervals or anything of
the sort? These are _crucial_ for interpreting polls, but are left out more
often than not. Worse yet, many stories make a big deal about a "huge" shift
in support for some political policy, party or figure, when the previous
month's figure is actually well within the confidence interval of the current
month's poll!

Standard deviation, confidence intervals, etc. are all ways of expressing
uncertainty, and it's become abundantly clear that the average journalist, to
say nothing of the average person, has no clue about what the concept means.
If the goal is to communicate with the public, then we really need to take a
step back and appreciate the stupendously colossal wall of ignorance we're
about to butt our heads against. When we talk about the general public, we
should keep in mind that rather a lot of people know so little about the
scientific method that they interpret the impossibility of proving theories as
justification for giving religious fables equal footing in schools. This kind
of ignorance isn't a nasty undercurrent lurking in the shadows. It's running
the show, as evidenced by many state laws in the U.S.! There is absolutely
_no_ hope of explaining uncertainty to most of these people.

There _is_ hope of explaining basic statistics to journalists, if only because
they are relatively few in number and it's a fundamental part of their job to
understand what they are reporting. Yes, I just said that every journalist who
has reported a poll result, scientific figure, etc. without the associated
uncertainty has _failed_ to adequately perform their job. We need to make
journalists understand _why_ they are failing. If simplifying the way we
report uncertainties will assist with this, then I'm all for it. Bad
journalism is a root cause of a great deal of ignorance, but it's not an
insurmountable task to fix it.

If you are a scientist who speaks to journalists about your work, make sure
they include uncertainties. If you are an editor, slap your peons silly if
they write a sensationalistic poll piece when the uncertainties say it's all a
bunch of hot air. If you are a reader, please mercilessly mock bad articles
and write numerous scornful letters to the editor until those editors pull out
their beat-sticks and get slap-happy. We should not tolerate this kind of crap
from people who are _paid_ to get it right.

~~~
latk
Maybe a solution would be to only communicate the bounds of a value rather
than the value itself, e.g. instead of “party A has 35% with sigma=2%”,
something like “we expect party A to have 33%–37%”. Providing a range is
shorter than explaining the SD, and can be visualized easily, e.g. in a bar
chart.
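
That convention is mechanical enough to automate; here is a sketch of my own
(using the usual 1.96 factor for a 95% normal interval rather than a
one-sigma range, and assuming a simple random sample of size n):

```python
import math

def poll_range(p, n, z=1.96):
    """Percent range for a poll share p (0-1) from a sample of size n.

    z = 1.96 gives the usual 95% normal interval.
    """
    se = math.sqrt(p * (1 - p) / n)
    return 100 * (p - z * se), 100 * (p + z * se)

low, high = poll_range(0.35, 1000)
print(f"we expect party A to have {low:.0f}%-{high:.0f}%")  # 32%-38%
```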

------
RivieraKid
I was just wondering about a very related problem. I do 5 measurements of
some random variable (let's say execution time) and average them. How should I
report the variability of that average?

State the sample size and standard deviation?

~~~
perlgeek
If it were 10 or more measurements, I'd use average, standard deviation and
sample size _if_ we can expect that variable to be normally distributed.

I don't know how to make meaningful statistics with fewer data points.
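
For what it's worth, very small samples are exactly what the Student-t
interval handles; here is a sketch of my own with made-up timings (2.776 is
the 97.5% t quantile for 4 degrees of freedom):

```python
import statistics

# Five hypothetical timing measurements, in seconds (made-up numbers).
times = [1.21, 1.35, 1.28, 1.40, 1.26]

n = len(times)
mean = statistics.fmean(times)
sem = statistics.stdev(times) / n ** 0.5  # standard error of the mean
t = 2.776                                 # 97.5% Student-t quantile, df = 4
print(f"{mean:.3f} ± {t * sem:.3f} s (95% CI, n={n})")
```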

------
al2o3cr
Shorter social scientists: "Gaussian distribution sez wut?"

------
vzhang
I'm seriously questioning some people's reading comprehension - he NEVER said
STD is not useful! He's only saying the name "Standard Deviation" is badly
chosen.

~~~
bluecalm
Read the blog post carefully:

> it is time to retire it from common use and replace it with the more
> effective one of mean deviation

> Standard deviation, STD, should be left to mathematicians, physicists and
> mathematical statisticians deriving limit theorems

>There is no scientific reason to use it in statistical investigations in the
age of the computer, as it does more harm than good

He is saying it's not useful for real-world things and that people are just
confused. What you wrote is a reasonable view. What Taleb writes isn't.

------
etanazir
The minimum uncertainty wave equation is ~ e^(-x^2) ergo the standard measure
is in terms of x^2. QED.

------
tehwalrus
at least he's leaving us physicists alone with it...

------
dschiptsov
Why, it is pretty good in describing probability distributions. What we should
retire are idiots, who assume that it predicts an outcome of the next event.

~~~
regularfry
Everyone is an idiot some of the time.

------
notastartup
I've been a long-time fan of Dr. Nassim Taleb. The first book of his I read
was the one about his time as a trader and how, in the Black Monday market
crash, he made a killing, cleared his desk, and never had to work again.

There are those who dislike his ideas because they threaten their existing
assumptions about probability and statistics. He argues that experts and the
majority of people do not account for the unpredictable but significant
impact a single event can have, which often shatters commonly held beliefs.
For example, all swans were assumed to be white until black swans were
discovered in Australia, and "too big to fail" multinational corporations
like Lehman Brothers went bankrupt.

He's not anti-academic, but he is against teaching in mainstream academia
that rests on naive assumptions tailored to serve those who thrive on limited
quantitative measures: market callers, hedge funds selling complicated
quantitative trading algorithms, academics seeking fame and fortune by
writing the most logical and quantitative paper without questioning any of
the tools they are using. It is this hypocrisy and laziness that is apparent,
and some deny it to the point of making ad hominem remarks against a man who
simply observes these things and decides to write about them in an
entertaining manner (otherwise nobody would give a shit, because the topic
would be dry without layman's lingo).

Keep an open mind. A lot of what he says I find interesting, and it has
influenced my thinking quite a bit; it is in no way grounds for cracking
jokes or ridicule. In fact, when I read some of the comments here, it's a bit
shameful. We should be embracing new ideas in order to explore them,
regardless of how explosive the claim, because the black swan event is very
real and is not captured or understood completely by our current set of
statistical tools and methodologies, which rest on questionable assumptions
about how the real world operates. For example, a 1/2500 chance is not what
we think it means in the real world, because black swan events are more
common than we think; a percentage probability does not fully reflect an
event's frequency or magnitude.

Note the fall of crime rates in the United States following the decision to
legalize abortion. Economists and experts would come on television and bring
up all sorts of theories, but little did they realize it was a chain effect
from a court ruling passed decades earlier, until two economists came out
with a paper that was ridiculed because it suggested that 'killing babies
from poor neighbourhoods = lower crime rate', where most poor neighbourhoods
are occupied by African Americans. That idea was earthshakingly controversial
and is still denied even to this day. Galileo was tried by the Inquisition
for claiming the earth revolved around the sun. This is simply the nature of
our world: in almost all parts of life there exists a hierarchy, and people
simply do not ask questions, either out of blind trust or fear of reprisal.

~~~
jessaustin
I think I agree with most of this, but now aren't we supposed to say that the
drop in crime is due more to reduced childhood lead exposures than to
abortion? Not just because it's a feel-good, humanistic explanation rather
than a horrific racist eugenics explanation, but also because recent studies
show a pretty strong lagged correlation...

------
roywei
four-day returns of stock x: (-.3, .3, -.3, .3) -> MAD = 0; four-day returns
of stock y: (-.5, .5, -.5, .5) -> MAD = 0.

~~~
jameshart
The A in MAD stands for 'absolute', so no, the MAD for those two stocks is .3
and .5 respectively - versus standard deviations of 0.35 and 0.58.
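
Those figures check out; here is a quick verification snippet of my own:

```python
import statistics

def mad(xs):
    """Mean absolute deviation about the mean."""
    m = statistics.fmean(xs)
    return statistics.fmean(abs(v - m) for v in xs)

x = [-0.3, 0.3, -0.3, 0.3]
y = [-0.5, 0.5, -0.5, 0.5]
print(round(mad(x), 2), round(statistics.stdev(x), 2))  # 0.3 0.35
print(round(mad(y), 2), round(statistics.stdev(y), 2))  # 0.5 0.58
```

(`statistics.stdev` is the sample standard deviation, which is what those
0.35/0.58 figures correspond to; the population version would give exactly
0.3 and 0.5 here.)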

------
truthteller
he's really lost the plot. :(

