
Han Solo and Bayesian Priors - rck
http://www.statslife.org.uk/the-statistics-dictionary/2148-han-solo-and-bayesian-priors
======
p1mrx
You also need to consider the survivorship bias of a story from a long time
ago in a galaxy far, far away.

All the other potential stories where the protagonist predictably wins the
Darwin Award would be too uninteresting to reach us over such an immense time
and distance.

------
akozak
C3PO's estimation would have been a lot more useful if he'd calculated _their_
odds.

Also his calculation suggests there's a strong open data regime in the empire,
which is a nice thought.

~~~
aet
The odds may have been based on simulation rather than "real data".

~~~
akozak
Ah but in that case I find it hard to believe a robot wouldn't take piloting
skills into account.

For example when wind energy developers study wildlife impact they tend to
include avian "avoidance behavior" in modeling risk of collision.

------
ced
I like the concept, it's a cool way of introducing Bayesian probabilities!

The conclusion strikes me as overly confident, though. It implies that if we
had 100 Han Solos, we would be _very_ confident that about 3/4 of them are
going to make it through. This comes about because it uses this:

 _We're going to say that C3PO has records of two people surviving and 7,440
people ending their trip through the asteroid field in a glorious explosion._

as data for Han's odds of making it. But 100 out-of-shape people dying on the
ascent of Mount Everest does not tell us anything about the odds for someone
who is very fit.

I would rather model that there is no one true probability of making it
through - it's going to be dependent on the pilot. Mediocre pilots might have
odds in the neighborhood of 1/3720, but there is presumably a lot of variance
depending on skill, and my prior belief is that Han would wind up in the upper
end of the distribution.

~~~
sukilot
How is your model different from the blog post? Yours sounds like a
paraphrase.

Do you disagree that Han is more likely to survive more difficult challenges
than easier ones?

~~~
ced
The blog post has the right idea, but the wrong equations. If we take C3PO's
statement to mean:

 _We're going to say that C3PO has records of two people surviving and 7,440
people ending their trip through the asteroid field in a glorious explosion._

Then the big question is how skill-dependent the challenge is. Is it like
swimming across the English channel, or is it random like surviving the
Niagara Falls in a barrel? There exists a distribution of survival rates as a
function of skill level; we just don't know what it is. We could assume a
sigmoid form. C3PO's statement gives us _some_ constraint on what the
parameters of the sigmoid are, but not enough. We could make some further
assumptions and get an answer, but the results will depend strongly on the
specific assumptions. In other words: we don't know the answer.
[https://xkcd.com/384/](https://xkcd.com/384/)
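
To make the sigmoid point concrete, here is a rough sketch rather than a real model. It assumes pilot skill is standard normal across the population, that survival probability is `sigmoid(offset + steepness * skill)`, and that Han sits at +3 standard deviations. All three are my assumptions; none of them come from C3PO or the post. For each steepness, the offset is fitted so the population-wide survival rate matches the quoted 2 out of 7,442, and then Han's own odds are read off.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

# Grid over pilot skill. ASSUMPTION: skill is standard normal across pilots.
N = 2401
GRID = [-6.0 + 12.0 * i / (N - 1) for i in range(N)]
WEIGHTS = [math.exp(-0.5 * s * s) for s in GRID]
TOTAL = sum(WEIGHTS)

def population_rate(offset, steepness):
    """Average survival rate over all pilots for p(s) = sigmoid(offset + steepness * s)."""
    acc = sum(w * sigmoid(offset + steepness * s) for s, w in zip(GRID, WEIGHTS))
    return acc / TOTAL

def fit_offset(steepness, target):
    """Bisect on the offset until the population rate matches the target."""
    lo, hi = -60.0, 60.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if population_rate(mid, steepness) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

TARGET = 2 / (2 + 7440)   # the records quoted above: 2 survivors, 7,440 explosions
HAN_SKILL = 3.0           # ASSUMPTION: Han is a +3 sigma pilot

def han_survival(steepness):
    """Han's survival probability under a sigmoid fitted to the population rate."""
    offset = fit_offset(steepness, TARGET)
    return sigmoid(offset + steepness * HAN_SKILL)

if __name__ == "__main__":
    for k in (0.0, 1.0, 3.0):   # 0.0 = pure luck, larger = more skill-dependent
        print(f"steepness {k}: Han survives with p = {han_survival(k):.5f}")
```

Every steepness reproduces the same population-wide rate, yet Han's own number moves by orders of magnitude, which is the sense in which the records alone don't pin down the answer.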

~~~
xaetium
I often find myself wondering if the major problem with any given statistical
analysis is whether Bayesian inference ought to have been used at all.

------
Matumio
The 75% result (chance to survive) seems about right, but the confidence in
this result seems completely unnatural. The sharp peak in the posterior plot
suggests that I have nailed down the exact probability to +/- 2%. No way.
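
For what it's worth, here is a small sketch of where that sharp peak comes from. It assumes the records quoted elsewhere in this thread (2 survivors, 7,440 explosions) are treated as Beta pseudo-counts, and that the prior is Beta(20000, 1), which is my reading of the 20,000:1 figure mentioned in this thread rather than something I can confirm. Adding the pseudo-counts is also an assumption about how the post combines them.

```python
import math

def beta_mean_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)

# C3PO's records as quoted in this thread.
C3PO_A, C3PO_B = 2, 7440
# ASSUMPTION: the "Han is a badass" prior is Beta(20000, 1).
PRIOR_A, PRIOR_B = 20000, 1

# ASSUMPTION: the two are combined by simply adding pseudo-counts.
POST_A, POST_B = C3PO_A + PRIOR_A, C3PO_B + PRIOR_B
POST_MEAN, POST_SD = beta_mean_sd(POST_A, POST_B)

if __name__ == "__main__":
    print(f"posterior mean = {POST_MEAN:.4f}, posterior sd = {POST_SD:.4f}")
```

With roughly 27,000 pseudo-observations behind it, the posterior can't be anything but narrow. The width reflects how many counts were fed in, not how much anyone actually knows about Han.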

------
brobo
Thought experiment: imagine there is a second robot, C4P1, on one of the TIE
fighters. He independently comes up with an estimate of 10,000:1 for Han's
death. Are we now going to slide our estimate all the way down? Or, what if
C3PO's line was revised to say 1,000,000:1? Would the author now say, 'whelp,
I was pretty confident at 20,000:1, but now that C3PO's number is so much
bigger than mine, I guess Han's gonna die'?
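
The thought experiment is easy to run under the same kind of Beta bookkeeping. This is only a sketch: it assumes C3PO's odds are encoded as Beta(2, 2 * odds), matching the 2 survivors and 7,440 explosions quoted in this thread for 3,720:1, that the prior is Beta(20000, 1), and that the two are combined by adding pseudo-counts. The function name `posterior_survival` is made up for illustration.

```python
def posterior_survival(odds_against, survivors=2, prior=(20000, 1)):
    """Posterior mean survival rate when C3PO quotes `odds_against`:1.

    ASSUMPTIONS: C3PO's claim becomes Beta(survivors, survivors * odds_against),
    the prior is Beta(*prior), and pseudo-counts are simply added.
    """
    a = survivors + prior[0]
    b = survivors * odds_against + prior[1]
    return a / (a + b)

if __name__ == "__main__":
    for odds in (3_720, 10_000, 1_000_000):
        print(f"C3PO says {odds:>9,}:1 -> posterior survival {posterior_survival(odds):.4f}")
```

Under these assumptions the answer slides from about 73% to about 50% to about 1% as C3PO's number grows, so yes: the conclusion is hostage to how the two sets of made-up counts are balanced against each other.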

------
EGreg
I think this is the kind of analysis you need instead:

[http://tvtropes.org/pmwiki/pmwiki.php/Main/PlotArmor](http://tvtropes.org/pmwiki/pmwiki.php/Main/PlotArmor)

------
mikexstudios
Original source: [http://www.countbayesie.com/blog/2015/2/18/hans-solo-and-bay...](http://www.countbayesie.com/blog/2015/2/18/hans-solo-and-bayesian-priors)

------
elliott34
Does the author have something backwards or do I?

P(RateOfSuccess|Successes) = Beta(α,β)

"In Bayesian terms, C3PO's estimate of the true rate of success given observed
data is referred to as the likelihood."

BUT we know likelihood(rateofsuccess|data) = probability(data|rate of
success).

I am confused.
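
One way to see how the two are related, as a sketch and not a claim about what the author intended: take the binomial likelihood p(data | rate) for the quoted 2 survivors and 7,440 explosions, treat it as a function of the rate, and normalize it. Assuming a flat prior on the rate (my assumption), the result is p(rate | data), and it matches a Beta density with both parameters shifted up by one.

```python
import math

SURVIVED, EXPLODED = 2, 7440   # records quoted in this thread

def likelihood(rate):
    """p(data | rate), dropping the constant binomial coefficient."""
    return rate ** SURVIVED * (1.0 - rate) ** EXPLODED

def beta_pdf(rate, a, b):
    """Density of Beta(a, b) at `rate`, computed in log space for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(rate) + (b - 1) * math.log(1.0 - rate))

# Midpoint-rule integral of the likelihood over rate in (0, 1).
STEPS = 100_000
DX = 1.0 / STEPS
AREA = sum(likelihood((i + 0.5) * DX) for i in range(STEPS)) * DX

def posterior_flat_prior(rate):
    """p(rate | data) under a flat prior: the likelihood rescaled to integrate to 1."""
    return likelihood(rate) / AREA

if __name__ == "__main__":
    r = SURVIVED / (SURVIVED + EXPLODED)
    print(posterior_flat_prior(r), beta_pdf(r, SURVIVED + 1, EXPLODED + 1))
```

So the Beta curve is a distribution over the rate, while the likelihood is the same shape read as a function of the rate before normalizing. The two differ by a constant factor and by whatever prior is assumed.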

~~~
jmcohen
"Likelihood" can refer to both p(param|data) and p(data|param). See [1] for an
example of the first, and [2] for the second.

[1]
[http://en.wikipedia.org/wiki/Likelihood_function](http://en.wikipedia.org/wiki/Likelihood_function)

[2]
[http://en.wikipedia.org/wiki/Bayes%27_theorem#Events](http://en.wikipedia.org/wiki/Bayes%27_theorem#Events)

------
UhUhUhUh
One factor seems missing: the deviation of Han's skills from that of the
average pilot. Does C3PO take it into account in his likelihood? Do we take it
into account in our probability of success? Although I do get the point, it
seems there's often something of that sort, a simplification, in that kind of
Bayesian reasoning.

------
lordnacho
Someone needs to do this with Game of Thrones characters. Everything I thought
about important characters' survival chances turned upside down as I watched
it.

~~~
davmre
Since this is the Internet, _of course_ this has been done, several times:

[http://www.math.canterbury.ac.nz/~r.vale/got_290814.pdf](http://www.math.canterbury.ac.nz/~r.vale/got_290814.pdf)

[http://allendowney.blogspot.com/2015/03/bayesian-survival-an...](http://allendowney.blogspot.com/2015/03/bayesian-survival-analysis-for-game-of.html)

