

Razoring Ockham’s Razor - mtviewdave
http://rationallyspeaking.blogspot.com/2011/05/razoring-ockhams-razor.html

======
lotharbot
Ockham's Razor is suggestive, not conclusive.

It is most useful in choosing between two hypotheses where one is a subset of
the other. If hypothesis A completely explains the data, and hypothesis A+
completely explains the data but includes extra axioms, Ockham prefers A over
A+. One simplified argument that can be made is that A+ has a higher
probability of containing wrongness, since it contains any wrongness that is
in A plus any additional wrongness contained in +.
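A minimal sketch of that conjunction argument. The per-axiom probabilities are made-up numbers, and treating the axioms as independent is an assumption made only for illustration:

```python
from math import prod


def prob_all_correct(axiom_probs):
    """Probability that every axiom holds, assuming the axioms are independent."""
    return prod(axiom_probs)


# Hypothetical numbers: each value is the chance that one axiom is correct.
hypothesis_a = [0.9, 0.8]
extra_axioms = [0.7]  # the "+" in A+
hypothesis_a_plus = hypothesis_a + extra_axioms

p_a = prob_all_correct(hypothesis_a)
p_a_plus = prob_all_correct(hypothesis_a_plus)

print(f"P(A)  = {p_a:.3f}")
print(f"P(A+) = {p_a_plus:.3f}")
```

Whatever numbers are plugged in, multiplying by an extra factor no greater than 1 can only keep the product the same or shrink it, which is all the argument needs.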

When choosing between two disjoint hypotheses, A and B, both of which
completely explain the data, Ockham's Razor is significantly less useful.
Whichever of A and B is longer has more places it can be wrong, but the
probabilities of wrongness in any particular place are uncorrelated between A
and B. Ockham (or Kolmogorov Complexity or whatever form of the argument you
like) suggests a preference for the simpler of A and B, but it is not a very
strong suggestion.

~~~
olavk
If A and A+ both explain the data and make the same predictions, then they
have exactly the same probability of being wrong: If A is true then A+ is true
and vice versa. The additional complexity in A+ does not correspond to
additional information about the world, and this is why we can safely choose A
over A+.

------
jmathes
Ockham's Razor has two interpretations.

One says that if you have several theories that might explain what you're
observing, the simpler one is probably right. It doesn't matter whether this
is true; it saves time by letting you fail faster if you're wrong. This is the
more popular understanding of the razor, but it's less useful because it can
be wrong, and jackasses often use examples of its wrongness to claim that it
never provides any information, in order to advance crackpot metaphysics or
religion.

The other says that if you have two theories which both explain everything
you're observing, then it doesn't matter which one you adopt, so you might as
well adopt the simpler one. For example, if I hypothesize that E=mc^2 + u,
where u is the number of invisible, otherwise-undetectable pink unicorn
particles within three meters of Mars, this theory makes all the same
predictions as E=mc^2, and is nondisprovable. It's Occam's Razor that says
that we shouldn't bother with this pink unicorn version.

------
bravura
"The obvious question to ask about Ockham’s razor is: why? On what basis are
we justified to think that, as a matter of general practice, the simplest
hypothesis is the most likely one to be true?"

There is an argument from computational learning theory. Specifically, there
are many complicated hypotheses, whereas there are few compact (simple)
hypotheses.

If a simple hypothesis correctly models the observations, it is less likely to
occur by chance than if we find a complicated hypothesis that can model the
data. So, everything else being equal, choosing a simple hypothesis has less
risk.
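A rough sketch of the counting argument, under two assumptions that are not in the comment above: hypotheses are encoded as bit strings, and the observations are pure noise (fair coin flips). The function names are hypothetical.

```python
def num_hypotheses(max_bits):
    """Count distinct bit-string descriptions with length 1..max_bits."""
    return sum(2 ** k for k in range(1, max_bits + 1))


def chance_fit_bound(max_bits, num_observations):
    """Union bound on P(at least one hypothesis in the class fits pure noise).

    Assumes each observation is a fair coin flip, so any fixed hypothesis
    matches all of them with probability 2**-num_observations.
    """
    return min(1.0, num_hypotheses(max_bits) * 2.0 ** -num_observations)


observations = 30
simple_bound = chance_fit_bound(10, observations)   # short descriptions only
complex_bound = chance_fit_bound(40, observations)  # long descriptions allowed

print(f"hypotheses of <= 10 bits: {num_hypotheses(10)}")
print(f"chance-fit bound, <= 10 bits: {simple_bound:.2e}")
print(f"chance-fit bound, <= 40 bits: {complex_bound:.2e}")
```

With few short hypotheses available, one of them fitting 30 noisy observations by accident is very unlikely, so a fit is informative. Once long hypotheses are allowed, the bound becomes vacuous: some hypothesis may well fit by chance.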

~~~
olavk
The Razor does not say that "the simplest hypothesis is the most likely one to
be true". Obviously a more complex theory which actually corresponds to
observations is truer than a simpler theory which doesn't match observations
as well.

Occam's Razor says that if we have two theories _which both explain all the
facts_ (and have the same predictive power), then we might as well choose the
simpler one.

An example: The Copernican and the Tychonian planetary models match and
predict the same observations, but the Tychonian model was more complex.
According to Occam's razor, the Copernican system should then be chosen. But
if the models actually differed in some predictions, then Occam's razor would
not be relevant.

Occam's Razor is not a statement about the world. It is a tool for thinking.

~~~
ced
I agree with the thrust of your post, but not with:

 _Occam's Razor is not a statement about the world. It is a tool for thinking._

In a world where everything is best fitted by a fifth-order polynomial,
scientists who encounter a seemingly linear relationship will think "Oh, what
a coincidence. I bet that if we had better observations, those higher
coefficients wouldn't be exactly 0." [1]

We live in a world that has:

- Very simple fundamental rules

- A ton of emergent complexity arising from those rules

That's why Occam's razor has a good track record: the world is reducible to
simpler laws.

[1] Similarly, Occam's razor does not hold in worlds where gods intervene,
because every "coincidence" can have epic, Iliad-sized explanations in terms
of Zeus getting mad at breakfast.

~~~
olavk
It might be that simpler theories are more often true due to the nature of our
universe - but that is not what Occam's Razor is about. So it's really two
different issues.

~~~
ced
Ah, I did not know that the systems of Tycho and Copernicus were the same.

I think that when most scientists (and bravura above) invoke Occam's razor,
it's to pick one of two _non-equivalent_ theories. For instance, the Ptolemaic
system made predictions that were, _given the limited accuracy of the
instruments at the time_, indistinguishable from the Copernican system. But
I'm quite sure that scientists felt justified in "betting" on Copernicus
because the equations are simpler and use fewer arbitrary constants.

------
pjscott
Let's see if we can restate Occam's Razor in a way less susceptible to misuse.
If anybody wants to improve on this, do so with my blessing:

__Occam's Razor:__ Since P(A and B) is always less than or equal to min(P(A),
P(B)), hypotheses with smaller Kolmogorov complexity should be accorded a
higher prior probability.

From there, update on evidence. If a hypothesis (no matter how simple) is
contradicted by the evidence, then obviously your estimate of its probability
should go way down.
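A minimal sketch of that prior-then-update recipe. Kolmogorov complexity is uncomputable, so the description lengths below are hypothetical stand-ins, and the prior weight 2^-bits is one common choice rather than something the statement above specifies:

```python
def posterior(hypotheses):
    """Return normalized posteriors for {name: (description_bits, likelihood)}.

    Prior weight is 2**-description_bits (an assumed complexity prior);
    likelihood is P(evidence | hypothesis).
    """
    weights = {
        name: 2.0 ** -bits * likelihood
        for name, (bits, likelihood) in hypotheses.items()
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical numbers, chosen only to illustrate the two regimes.
equal_fit = {"simple": (10, 0.5), "complex": (20, 0.5)}
simple_contradicted = {"simple": (10, 1e-9), "complex": (20, 0.5)}

print(posterior(equal_fit))            # the simpler hypothesis dominates
print(posterior(simple_contradicted))  # the evidence overrides the prior
```

When both hypotheses fit the evidence equally well, the prior decides in favor of the shorter one. When the evidence strongly contradicts the shorter one, its posterior collapses despite the head start.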

The use of Kolmogorov complexity (or something roughly equivalent like Minimum
Description Length) is important here. As explanations of combustion, "A
wizard did it" sounds simpler than an explanation of the chemical reaction
that actually happened, but a full model of each of these explanations
(including the wizard and chemistry) would reveal their true relative
complexity: the wizard is much more complex than a few simple rules about
chemical reactions.

~~~
olavk
That is itself a hypothesis. Here is a simpler one: The probability of a
hypothesis being true has nothing to do with its Kolmogorov complexity.

~~~
TheEzEzz
That's a valid hypothesis, but it is empirically false.

~~~
ignifero
Empirically, complexity(Einstein's field equations) > complexity(Newton's
law), complexity(spherical harmonics) > complexity(electrons spinning around
the atom) and so on and so on. Complexity is not a criterion for truth.

~~~
Locke1689
Except that those theories explain things that cannot be explained by simpler
theories, thus they still satisfy the criterion.

~~~
ignifero
Their validity has nothing to do with complexity, though.

------
hugh3
I'd go so far as to say that Ockham's Razor doesn't say anything about whether
the simplest explanation is _likely_ to be true, but merely suggests a way of
attacking the problem. Pick the simplest hypothesis, test it exhaustively, and
if you find some data which contradicts the hypothesis then you move right on
up to the simplest hypothesis that also explains the new data.

There's always an infinite number of arbitrarily complex theories that can
explain the data. Maybe that bump in the night was the hot water system. Maybe
it was the hot water system _and_ the boogeyman. Maybe it was the hot water
system and the boogeyman riding a unicorn being chased by a liger... but these
hypotheses aren't worth entertaining until we have something other than a
bump.

An example from my own field of planetary interiors: we've known the density
and outer atmospheric composition of giant planets for quite a few years now.
In the absence of anything else to go on, we tended to assume that the
interior was chemically homogeneous (or maybe homogeneous plus a distinct
core). But as our models and measurements improve, we get evidence that we
really need a non-homogeneous model of the interior to explain everything.
Does this make us stupid to have entertained homogeneous models? No, that was
the right place to start, until we got evidence to the contrary.

------
michaeldhopkins
Aquinas covered this. "If the assumption is made, the sensible appearances are
accounted for. But this is not strict proof since they also could be accounted
for by different assumptions."

------
warmfuzzykitten
The field of nutrition is rife with simple hypotheses that turn out not to
explain the data or have predictive power. Occam's razor is not refuted. We
have no reason to expect biochemistry or the universe to operate in a simple
way, and they manifestly do not. But we have every reason to be parsimonious
in our explanations. Humans are fanciful, superstitious creatures who eagerly
trade in nonsense. We need that razor to trim off our fantasies.

------
6ren
The idea is that the simpler hypothesis is more _likely_ to be true, given
that it explains the data equally well. It doesn't mean it _is_ true. Also, I
believe elliptical motion predicts the observations of planetary motion more
accurately than circles.

A problem with defining "simpler" is that it depends on the language that you
use to express it. If you lack the right concepts for it, it may seem more
complex. And if these concepts are quite difficult to understand (require a
lot of training), it's kind of cheating to say that a theory expressed in them
is "simple". A bit like a 1k demo that uses multi-megabytes of library code.

And now, my conjecture for why Occam's Razor seems to work: it is because we
pre-select simple theories. That is, there are lots of complex theories out
there that are the truth, and some of them are so complex, that we actually
can't understand or even conceive of them (and never will - they are not
susceptible to hierarchical decomposition). Therefore, correct theories turn
out to be simple (they are more correct than the slightly more complex
theories they compete with; but are never compared with the actually true
theories that are bizarrely complex).

Plus, our language (including mathematical notation) is constantly being
refined for how we use it - by adapting it to what we know. So it is
overfitted to our present theory (the first thing a scientist or mathematician
does when confronting something too complex is invent a notation for it that
makes it simpler - the notation absorbs some of the complexity, which it
borrows from our massive stores of analogies and ideas - and sneaks complexity
in via the back door).

 _The Unreasonable Effectiveness of Mathematics in the Natural Sciences_ ,
Wigner, <http://www.dartmouth.edu/~matc/MathDrama/reading/Wigner.html>

_THE UNREASONABLE EFFECTIVENESS OF MATHEMATICS_ , Hamming,
<http://ned.ipac.caltech.edu/level5/March02/Hamming/Hamming.html>

 _(these are perennial favourites relevant to the topic - not claiming they
support my conjecture)._

~~~
olavk
It's philosophically problematic to say that one theory can be truer than
another if they both explain the same facts and yield the same predictions.
This would mean that there is a notion of truth which can't be verified or
falsified experimentally, which really opens the door for all kinds of
metaphysics.

~~~
6ren
Let's say I have a theory that can explain the complete works of Shakespeare
so well, that an algorithm based on it (written in C) can generate the
complete works.

Now, it turns out that my "theory" is just a printf with a loong string...
containing the complete works of Shakespeare. You can measure how "simple"
this theory is, by the information content.

What if there was another theory (algorithm) that could also generate the
works exactly, but was shorter - would you say it's a better theory?
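One hedged way to put numbers on that comparison is to use a general-purpose compressor as a crude stand-in for "a shorter program". The text below is a hypothetical repetitive sample, not Shakespeare, and compressed size is only an upper bound on description length (it ignores the fixed size of the decompressor):

```python
import zlib

# Stand-in text (hypothetical); the example above uses the complete works.
text = ("to be or not to be that is the question " * 50).encode("ascii")

# "Theory" 1: a print statement holding the whole text as a literal.
literal_bits = 8 * len(text)

# "Theory" 2: a compressed payload plus a decompressor.
# Only the payload is counted here; the decompressor is a fixed overhead.
compressed = zlib.compress(text, 9)
compressed_bits = 8 * len(compressed)

# Both "theories" reproduce the text exactly.
assert zlib.decompress(compressed) == text

print(f"literal:    {literal_bits} bits")
print(f"compressed: {compressed_bits} bits")
```

Both versions generate the same output, so by the measure proposed above the second one counts as simpler only because it exploits regularities in the text.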

~~~
pjscott
What predictions do those "theories" generate?

~~~
6ren
What the next character will be.

------
hadronzoo
MacKay makes an argument for Occam's Razor using logical probability:
<http://www.cs.toronto.edu/~mackay/itprnn/ps/345.357.pdf>

------
ignifero
_"On what basis ... the simplest hypothesis is the most likely one to be
true?"_

Occam didn't say that. The true hypothesis is the one confirmed by
experiments. One would use Occam's razor to choose among equally valid
explanations. Now, when you don't have a valid explanation, don't expect
William of Occam to fix that for you.

