
How Statisticians Found AirFrance Flight Two Years After It Crashed - linux_devil
http://www.technologyreview.com/view/527506/how-statisticians-found-air-france-flight-447-two-years-after-it-crashed-into-atlantic/?utm_campaign=socialsync&utm_medium=social-post&utm_source=facebook
======
raverbashing
It was not really "statisticians" who found it.

The initial searches based their search area on backdrift from the debris
found, which is _very unreliable_.

Instead, if they had focused their search area on the original route and
estimated flight time before crash they would have found it sooner.

THEN if you don't find it you try different things.

Their Bayesian analysis basically took into account the area that had already
been scanned (looking for ULBs, which is very unreliable as well) without
finding the plane.

I'm not trying to rain on their parade, but I think the main mistake (not
theirs, but the search team's) was relying more on backdrift than on other
information (see figure 14 in
[http://isif.org/fusion/proceedings/Fusion_2011/data/papers/1...](http://isif.org/fusion/proceedings/Fusion_2011/data/papers/140.pdf):
the highest probability area is, obviously, near the last known position).

------
mschuster91
The biggest problem with MH370 is that there is not a single "higher quality"
location datapoint available. In AF447 you had dead bodies and iirc debris,
which at least helped to roughly identify the area of the crash site.

With MH370, the possibilities are next to infinite. Nothing is known except
radio data with an accuracy of dozens of kilometers.

~~~
raverbashing
Correct

With AF447 you had the exact position at approximately 10min before the crash
(and debris).

With MH370 and the Inmarsat data you have something that is _much more
uncertain_

------
ACow_Adonis
This was actually sent round a couple of weeks ago where I work by one of our
executives (I work at a place that has Statistics in its name).

I didn't mention it then, because I didn't want to be a debbie downer and hurt
everyone's "rah rah statistics yay!" feeling. Or send something round that
contradicts an executive :P But the paper seemed dodgy as hell to me.

This wasn't solved with Statistics. Nor was it solved by Bayesian statistics.
It was solved when people did lots of searching, failed to find anything for a
while despite searching where the plane was, and then they eventually found
the plane near its last known location despite having already searched there.

If anything, the paper was just an application of the Texas sharpshooter
fallacy. I believe the authors made several models, and then included in the
paper the one that showed the result they wanted.

See paper:
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.370...](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.370.2913&rep=rep1&type=pdf)

Indeed, look at the graphical representation of the probability distribution
in the paper: you'll notice that the plane was found in a section of the
distribution that had already been searched earlier, an area that should now
have a lower probability than the opposite half of the distribution if we're
using semi-reasonable Bayesian techniques. Indeed, this is what we see in
figure seven.
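
To make that update concrete, here is a minimal sketch of the textbook Bayesian search step on a toy grid. It is not the paper's model: the four-cell prior, the function name `update_after_failed_search` and the 0.9 detection probability are all made up for illustration.

```python
def update_after_failed_search(prior, searched, p_detect):
    """Posterior over cells after searching `searched` and finding nothing.

    prior:    list of cell probabilities summing to 1 (illustrative values)
    searched: set of indices of the cells that were searched
    p_detect: assumed chance of finding the wreck if it is in a searched cell
    """
    # Likelihood of "nothing found" given the wreck is in each cell:
    # (1 - p_detect) for a searched cell, 1 for an unsearched one.
    unnormalized = [
        p * ((1.0 - p_detect) if i in searched else 1.0)
        for i, p in enumerate(prior)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Toy prior: most of the mass in the two middle cells.
prior = [0.1, 0.4, 0.4, 0.1]

# Search cell 1 with a fairly reliable sensor and find nothing.
posterior = update_after_failed_search(prior, {1}, 0.9)

# Same search, but with a sensor that cannot detect anything.
unchanged = update_after_failed_search(prior, {1}, 0.0)
```

With these numbers the searched cell drops from 0.4 to roughly 0.06 and the unsearched half picks up the difference, which is the "search the other half next" behaviour described above. Setting `p_detect` to 0 leaves the prior untouched.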

In figure 8, we see their new probability distribution under the assumption
that the beacons did not work at all, where they try to say their method
pinpointed where the plane was. But this was not known at the time, and
figure 7 is the far more reasonable Bayesian model given previous searches in
the area (figure 8 assumes a 100% probability of both beacons failing,
something practically no Bayesian would do). So I posit that they either made
that model after the fact, or they had several models fitting several
different scenarios and, after the plane was found, chose the one that best
fit the data post hoc. Odds are, wherever the plane was found, it would fit
one of their scenarios, and they would then write a paper saying how their
model was such a success. Figure 7 is the far more reasonable Bayesian
distribution, and it actually tells you to now search in the wrong area.
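
For comparison, the usual way to handle doubt about the beacons is to give "the beacons failed" its own prior probability and update it together with the location, rather than fixing it at 0% or 100%. The sketch below is not the paper's method. It assumes a single working/failed state for the beacons, and every name and number in it is invented.

```python
def posterior_with_beacon_doubt(prior, searched, p_detect, p_beacon_fail):
    """Joint update over (cell, beacons failed?) after an unsuccessful search.

    prior:         list of cell probabilities summing to 1 (illustrative)
    searched:      set of indices of the cells that were searched
    p_detect:      assumed chance of hearing working beacons in a searched cell
    p_beacon_fail: assumed prior probability that the beacons were dead
    """
    joint = {}
    for i, p in enumerate(prior):
        for failed, p_state in ((True, p_beacon_fail), (False, 1.0 - p_beacon_fail)):
            # A miss is certain if the beacons are dead or the cell was not searched.
            p_miss = 1.0 if (failed or i not in searched) else 1.0 - p_detect
            joint[(i, failed)] = p * p_state * p_miss
    total = sum(joint.values())
    # Marginal over cells, and posterior probability that the beacons failed.
    cells = [(joint[(i, True)] + joint[(i, False)]) / total for i in range(len(prior))]
    p_failed = sum(v for (_, failed), v in joint.items() if failed) / total
    return cells, p_failed


prior = [0.1, 0.4, 0.4, 0.1]
cells, p_failed = posterior_with_beacon_doubt(prior, {1}, 0.9, 0.25)
```

With a 25% prior on failure, the searched cell falls from 0.4 to about 0.18 instead of about 0.06, and the probability that the beacons were dead rises to about 0.34. The searched area is discounted but not written off, which sits between the two extremes of "beacons certainly worked" and "beacons certainly failed".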

If they had followed Bayesian methods, there is in fact a bottom half of their
distribution that should have been searched next (where the plane wasn't), and
they in fact found the plane in an area that had already been passively
searched: an area that should have been downweighted by Bayesian probabilities
for future searches, because it is unlikely that this area would be searched
and still yield nothing if the plane were there.

I actually like Bayes, A LOT, but this is not a good example, except perhaps
of the precept: if you want to find a lost plane, it's probably a good idea to
start looking near where it was last located.

~~~
lotsofmangos
More or Less covered this much better.

The company had already been called in once and had completely failed to find
anything because their assumption was that the plane would be unlikely to be
anywhere that had been searched. This was their second go and they changed
their assumptions.

_"It still was a minor miracle that we found it," says Keller._

[http://www.bbc.co.uk/news/magazine-26680633](http://www.bbc.co.uk/news/magazine-26680633)

------
truncate
For those interested in more technical details, here is a paper I found -

[http://isif.org/fusion/proceedings/Fusion_2011/data/papers/1...](http://isif.org/fusion/proceedings/Fusion_2011/data/papers/140.pdf)

[https://www.informs.org/ORMS-Today/Public-Articles/August-Vo...](https://www.informs.org/ORMS-Today/Public-Articles/August-Volume-38-Number-4/In-Search-of-Air-France-Flight-447)

------
curtis
The thing that I find most interesting about this part of the AF447 story is
this:

> But Stone and co chose to include the possibility that the _acoustic beacons
> may have failed_ , a crucial decision that led directly to the discovery of
> the wreckage. [Emphasis mine]

------
spitfire
The actual paper on the topic.

[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.370...](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.370.2913&rep=rep1&type=pdf)

------
dba7dba
Another similar case of using stats and science to find something (in this
case German U-boats) in the middle of the ocean:
[http://www.amazon.co.uk/Blacketts-War-Defeated-U-Boats-Broug...](http://www.amazon.co.uk/Blacketts-War-Defeated-U-Boats-Brought/dp/030759596X)

The UK, America and the other Allies were suffering horrendous shipping losses
caused by German U-boats. What really helped turn the tide was scientists
using 'science' to predict where to find the German U-boats.

------
ajb
Yeah, Bayesian search theory is an interesting topic. A while back I wrote a
program to search for intermittent bugs using it (a sort of Bayesian version
of 'git bisect'). It hasn't seen real use (intermittent bugs tend to involve
real hardware, so you can't just try it out on any bug database). But if
anyone wants to try it out, it is at
[https://github.com/Ealdwulf/bbchop](https://github.com/Ealdwulf/bbchop)
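
For anyone wondering what a Bayesian bisect step might look like, here is a minimal sketch of the general idea. It is not taken from bbchop and makes no claim about how that tool works. It assumes a linear history, a bug that never shows up before the commit that introduced it, and a known per-run failure rate; the function name and numbers are hypothetical.

```python
def update_culprit_posterior(belief, tested, failed, p_fail_if_buggy):
    """Update belief over which commit introduced an intermittent bug.

    belief[k]:       probability that commit k is the first buggy commit
    tested:          index of the commit that was just tested
    failed:          True if this test run showed the bug
    p_fail_if_buggy: assumed chance a single run shows the bug when present
    """
    updated = []
    for k, p in enumerate(belief):
        # The tested commit contains the bug iff it was introduced at or before it.
        bug_present = k <= tested
        if failed:
            # Assumption: no false positives, so a failure rules out later culprits.
            likelihood = p_fail_if_buggy if bug_present else 0.0
        else:
            # A pass is only weak evidence, since the bug may simply not have fired.
            likelihood = (1.0 - p_fail_if_buggy) if bug_present else 1.0
        updated.append(p * likelihood)
    total = sum(updated)
    return [u / total for u in updated]


# Four commits, uniform prior, bug fires in half of the runs when present.
belief = [0.25] * 4
after_pass = update_culprit_posterior(belief, tested=1, failed=False, p_fail_if_buggy=0.5)
after_fail = update_culprit_posterior(after_pass, tested=2, failed=True, p_fail_if_buggy=0.5)
```

Unlike plain 'git bisect', a passing run never eliminates the earlier commits here; it only shifts weight toward the later ones. A failing run does eliminate everything after the tested commit. This is the same downweight-but-don't-discard logic as in the search for the plane.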

