
So You Think You Have a Power Law - saurabh
http://vserver1.cscs.lsa.umich.edu/~crshalizi/weblog/491.html
======
tokenadult
The author is quite an interesting thinker, and I like many of his posts. (I
agree that the default font set on his website is all wrong, much too small,
so I zoomed in with my browser to read his words.) The key claim in his
abstract is important and deserves more attention. (The article title in
submission to HN should have "2007" included, as this post dates back that
far.)

"Power-law distributions occur in many situations of scientific interest and
have significant consequences for our understanding of natural and man-made
phenomena. Unfortunately, the empirical detection and characterization of
power laws is made difficult by the large fluctuations that occur in the tail
of the distribution. In particular, standard methods such as least-squares
fitting are known to produce systematically biased estimates of parameters for
power-law distributions and should not be used in most circumstances."

The author's take-home points are a list of good practices. He comments that
much of the content of whole journals would disappear if authors and reviewers
followed these practices.

"1. Lots of distributions give you straight-ish lines on a log-log plot.
True, a Gaussian or a Poisson won't, but lots of other things will.

"2. Abusing linear regression makes the baby Gauss cry.

"3. Use maximum likelihood to estimate the scaling exponent. It's fast! The
formula is easy! Best of all, it works!

"4. Use goodness of fit to estimate where the scaling region begins.

"5. Use a goodness-of-fit test to check goodness of fit. In particular, if
you're looking at the goodness of fit of a distribution, use a statistic meant
for distributions, not one for regression curves.

"6. Use Vuong's test to check alternatives, and be prepared for
disappointment. Even if you've estimated the parameters of your power law
properly, and the fit is decent, you're not done yet.

"7. Ask yourself whether you really care. Maybe you don't. A lot of the time,
we think, all that's genuinely important is that the tail is heavy, and it
doesn't really matter whether it decays linearly in the log of the variable
(power law) or quadratically (log-normal) or something else."

Good stuff. It takes a lot of practice to get statistical analysis right.
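
To make points 2 and 3 concrete, here is a minimal sketch, assuming a continuous power law with a known lower cutoff. It draws synthetic data, applies the standard continuous maximum-likelihood formula alpha = 1 + n / sum(ln(x / xmin)), and for comparison fits a straight line to a linearly binned log-log histogram. The function names, the bin width, and the synthetic parameters (alpha = 2.5, xmin = 1) are my own choices for illustration, not anything from the post.

```python
import math
import random


def sample_power_law(n, alpha, xmin, rng):
    """Inverse-transform sampling from p(x) proportional to x^-alpha, x >= xmin."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def mle_alpha(xs, xmin):
    """Continuous MLE for the exponent, plus its asymptotic standard error."""
    tail = [x for x in xs if x >= xmin]
    n = len(tail)
    alpha = 1.0 + n / sum(math.log(x / xmin) for x in tail)
    return alpha, (alpha - 1.0) / math.sqrt(n)


def least_squares_alpha(xs, xmin, bin_width=1.0):
    """Negated slope of log(count) vs log(bin centre) on a linear-bin histogram."""
    counts = {}
    for x in xs:
        if x >= xmin:
            k = int((x - xmin) // bin_width)
            counts[k] = counts.get(k, 0) + 1
    pts = [(math.log(xmin + (k + 0.5) * bin_width), math.log(c))
           for k, c in counts.items()]
    mx = sum(px for px, _ in pts) / len(pts)
    my = sum(py for _, py in pts) / len(pts)
    sxy = sum((px - mx) * (py - my) for px, py in pts)
    sxx = sum((px - mx) ** 2 for px, _ in pts)
    return -sxy / sxx


rng = random.Random(0)  # fixed seed so the run is repeatable
data = sample_power_law(10_000, alpha=2.5, xmin=1.0, rng=rng)
alpha_hat, se = mle_alpha(data, xmin=1.0)
alpha_ls = least_squares_alpha(data, xmin=1.0)
print(f"MLE: {alpha_hat:.3f} +/- {se:.3f}   histogram regression: {alpha_ls:.3f}")
```

The MLE should land close to the true 2.5; how far the regression estimate drifts depends on the binning and on how many sparse bins sit in the tail, which is rather the point.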

------
ChuckMcM
Random note to the author, if they read this: the default font is unreadably
small on my 24" monitor.

~~~
jotm
Use CTRL and +/- to zoom in/out.

------
lomendil
I love Clauset, Shalizi, and Newman for keeping up this fight. Even if you
agree with them, you still have to include a linear regression in your paper
to satisfy reviewers.

I was surprised to see this on HN, though. I guess everyone is trying to use
power laws in one way or another.

~~~
pliny
Well, ever since Saddam tried to use power law to justify his invasion of
Kuwait we've all been stepping on eggshells.

~~~
TorKlingberg
What?

------
ronaldx
Visually identifying power laws from a log-log plot is a pervasive anti-
pattern; this is a good treatment of why we should be sceptical and what we
should do instead. Thanks to OP.
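
A quick illustration of why the eyeball test fails, as a sketch under my own assumptions: take the exact tail probability P(X > x) of a log-normal (sigma = 3, chosen only because it is heavy enough to make the point), evaluate it over three decades, and fit a straight line on log-log axes. None of the names or parameters come from the article.

```python
import math


def lognormal_ccdf(x, mu=0.0, sigma=3.0):
    """P(X > x) for a log-normal with log-mean mu and log-sd sigma."""
    return 0.5 * math.erfc((math.log(x) - mu) / (sigma * math.sqrt(2.0)))


def linear_fit_r2(xs, ys):
    """Ordinary least-squares slope and R^2 for y against x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


# 50 points evenly spaced in ln(x), covering x from 1 to 1000.
log_x = [math.log(10.0) * 3.0 * i / 49 for i in range(50)]
log_p = [math.log(lognormal_ccdf(math.exp(u))) for u in log_x]

slope, r2 = linear_fit_r2(log_x, log_p)

# Local slopes at the two ends of the range; a true power law would keep
# these equal.
first_slope = (log_p[1] - log_p[0]) / (log_x[1] - log_x[0])
last_slope = (log_p[-1] - log_p[-2]) / (log_x[-1] - log_x[-2])

print(f"fitted slope {slope:.2f}, R^2 {r2:.3f}")
print(f"local slope at start {first_slope:.2f}, at end {last_slope:.2f}")
```

The straight-line fit scores a high R^2 even though the local slope more than doubles across the range, so a good-looking regression line on a log-log plot says very little on its own.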

------
fiatmoney
This pattern annoys me to no end. People are quick to jump from a claim like
"some users are more valuable than others" to "POWER LAW", and from there to
folksy wisdom about an "80-20 rule".
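
For reference, the "80-20" figure is not a generic property of power laws; it pins down one particular exponent. A short derivation, assuming a pure Pareto tail with density exponent \alpha > 2 and lower cutoff x_{\min} (these symbols are mine, introduced only for this sketch): let P be the fraction of users above x and W the fraction of total value they hold.

```latex
P(x) = \Pr(X > x) = \left(\frac{x}{x_{\min}}\right)^{-(\alpha-1)},
\qquad
W(x) = \frac{\int_x^{\infty} t\,p(t)\,dt}{\int_{x_{\min}}^{\infty} t\,p(t)\,dt}
     = \left(\frac{x}{x_{\min}}\right)^{-(\alpha-2)}
\;\Longrightarrow\;
W = P^{(\alpha-2)/(\alpha-1)} .

W = 0.8,\; P = 0.2
\;\Longrightarrow\;
\frac{\alpha-2}{\alpha-1} = \frac{\ln 0.8}{\ln 0.2} \approx 0.139
\;\Longrightarrow\;
\alpha \approx 2.16 .
```

So even granting a power law, any other exponent gives a different split, and for \alpha \le 2 the mean diverges and the share is not defined at all.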

~~~
greattypo
I agree, both terms feel oddly specific to me when people throw them around
casually, but the layman use is at least directionally correct.

Much more cringe-inducing are all of the invented definitions of the Law of
Large Numbers! (e.g. [http://www.fastcompany.com/1825592/9-reasons-choose-
corporat...](http://www.fastcompany.com/1825592/9-reasons-choose-corporate-
job-over-startup)).

~~~
Fomite
Guh. I encountered this so much on an investment site that I wrote a blog post
about the Law of Large Numbers because it was annoying me enough that I wanted
to have a shorthand link. [http://confounding.net/2012/03/12/thats-not-how-
the-law-of-l...](http://confounding.net/2012/03/12/thats-not-how-the-law-of-
large-numbers-works/)

I equally hated "Well, it means something different in this field." No it
doesn't, it's _math_. That's the whole point.

------
ganeumann
About eight months ago I took the data from the Angel Investor Performance
Project(1) and ran it through the powerlaw software(2).

The software said the data on angel funded startups (at least the ones in the
AIPP survey) did not indicate that returns followed a power law. Given how
often we say "power law!" with regard to startup outcomes, I just thought that
was interesting.

(1)
[http://www.angelcapitalassociation.org/data/Documents/Resour...](http://www.angelcapitalassociation.org/data/Documents/Resources/AngelGroupResarch/1d%20-%20Resources%20-%20Research/6%20RSCH_-
_ACEF_-_Returns_to_Angel_Investor_in_Groups.pdf)

(2) [http://code.google.com/p/powerlaw/](http://code.google.com/p/powerlaw/)
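
For anyone who wants to run this kind of check by hand, here is a rough sketch of the likelihood-ratio comparison from point 6 of the post (Vuong's test), pitting a power law against a shifted exponential above a fixed xmin. I have not checked what the linked package does internally; the function name, the choice of alternative, and the synthetic data are all my own assumptions, and a real analysis would also estimate xmin.

```python
import math
import random


def vuong_power_law_vs_exponential(xs, xmin):
    """Return (R, p): R > 0 favours the power law, R < 0 the exponential.

    p is the two-sided p-value for the hypothesis that the two fits are
    equally good, using the normal approximation for the summed
    log-likelihood ratio.
    """
    tail = [x for x in xs if x >= xmin]
    n = len(tail)
    # Maximum-likelihood fits of both candidate distributions on x >= xmin.
    alpha = 1.0 + n / sum(math.log(x / xmin) for x in tail)
    lam = 1.0 / (sum(tail) / n - xmin)
    # Pointwise log-likelihood differences (power law minus exponential).
    diffs = [
        (math.log((alpha - 1.0) / xmin) - alpha * math.log(x / xmin))
        - (math.log(lam) - lam * (x - xmin))
        for x in tail
    ]
    R = sum(diffs)
    mean = R / n
    sigma = math.sqrt(sum((d - mean) ** 2 for d in diffs) / n)
    p = math.erfc(abs(R) / (math.sqrt(2.0 * n) * sigma))
    return R, p


rng = random.Random(1)
# Synthetic power-law data with alpha = 2.5 and xmin = 1.
pl_data = [(1.0 - rng.random()) ** (-1.0 / 1.5) for _ in range(2000)]
# Synthetic exponential data shifted to start at xmin = 1.
exp_data = [1.0 + rng.expovariate(1.0) for _ in range(2000)]

R_pl, p_pl = vuong_power_law_vs_exponential(pl_data, xmin=1.0)
R_exp, p_exp = vuong_power_law_vs_exponential(exp_data, xmin=1.0)
print(f"power-law data:   R = {R_pl:.1f}, p = {p_pl:.3g}")
print(f"exponential data: R = {R_exp:.1f}, p = {p_exp:.3g}")
```

The sign of R says which model fits better and p says whether the difference is distinguishable from noise. With real data the honest answer is often "can't tell", which is the disappointment the post warns about.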

~~~
kaa2102
Investment returns may not follow a power law given that they can come from
"normalstan". The distribution of startup values for both funded and unfunded
startups (including IPOs AND failures) would come from "extremistan" due to
the presence of outliers. The data from extremistan will more likely follow a
power law distribution. The terms come from the writings of Nassim Taleb (his
name for the first is Mediocristan), author of Fooled by Randomness and The
Black Swan: The Impact of the Highly Improbable.

~~~
klodolph
> The data from extremistan will more likely follow a power law distribution.

That's exactly the kind of logical inference that the linked article warns
against, unless you have some additional reasoning behind that assertion.

------
LolWolf
In general, very, very few things in nature follow a power law. And either
way, as the article says, it is fairly difficult (if not impossible) to even
make sense of what _is_ a power law. Notably, in phenomenological cases, power
laws have no interpretable units of their own; the units come about only
through the constants.

------
MidsizeBlowfish
This post is fantastic and the paper is fascinating. I particularly enjoyed
his writing:

> This is why God, in Her wisdom and mercy, gave us the bootstrap.
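
One place the bootstrap earns its keep in this kind of analysis is the goodness-of-fit p-value from point 5. Below is a simplified sketch, assuming xmin is known and fixed (the full procedure in the paper re-estimates it on every synthetic sample). It measures the Kolmogorov-Smirnov distance between the data and the fitted power law, then counts how often samples drawn from the fitted model look at least that bad after being refitted. Function names, sample sizes, and the number of replicates are my own choices.

```python
import math
import random


def fit_alpha(tail, xmin):
    """Continuous maximum-likelihood estimate of the exponent."""
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)


def ks_distance(tail, xmin, alpha):
    """Largest gap between the empirical CDF and the fitted power-law CDF."""
    xs = sorted(tail)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        model = 1.0 - (x / xmin) ** (1.0 - alpha)
        d = max(d, abs(model - i / n), abs(model - (i + 1) / n))
    return d


def bootstrap_p_value(data, xmin, n_boot=200, seed=0):
    """Fraction of synthetic power-law samples that fit worse than the data."""
    rng = random.Random(seed)
    tail = [x for x in data if x >= xmin]
    alpha = fit_alpha(tail, xmin)
    observed = ks_distance(tail, xmin, alpha)
    worse = 0
    for _ in range(n_boot):
        # Draw from the fitted model, refit, and measure its own KS distance.
        synth = [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
                 for _ in tail]
        if ks_distance(synth, xmin, fit_alpha(synth, xmin)) >= observed:
            worse += 1
    return worse / n_boot


rng = random.Random(42)
pl_data = [(1.0 - rng.random()) ** (-1.0 / 1.5) for _ in range(500)]
exp_data = [1.0 + rng.expovariate(1.0) for _ in range(500)]

p_pl = bootstrap_p_value(pl_data, xmin=1.0)
p_exp = bootstrap_p_value(exp_data, xmin=1.0)
print(f"p-value, power-law data:   {p_pl:.2f}")
print(f"p-value, exponential data: {p_exp:.2f}")
```

A small p-value means the power law is a poor description of the data; a large one only means it has not been ruled out. For the exponential sample the p-value should come out near zero.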

