Amanda Knox and bad maths in court (bbc.co.uk)
49 points by mitmads 1517 days ago | 40 comments

Does DNA testing work like that? If so, you'd just build more iterations into the test itself. If they refer to taking another sample, again, for such a critical test, why not take multiple samples? The coin analogy seems misleading as you can flip many times for free, and each flip is (usually) completely independent.

It'd seem a retest would only matter if there's a new test or the previous test was suspected to be improperly done. In any case, you'd need a reason to believe a different outcome is possible.

In this case, a quick search finds[1]. It says evidence and sample procedures were not properly followed, and there wasn't even evidence of blood on the knife.

I'm all for being corrected on statistics, but this doesn't seem like a case of bad math, does it?

1: http://www.westseattleherald.com/2011/06/29/news/update-dna-...

More to the point, even if her blood was on the knife would that necessarily be conclusive evidence? I know that I've left a bit of blood on various cooking knives of people I've lived with.

No offense (maybe a giggle) but that seems highly unusual

I cut myself cooking sometimes. Nothing to be proud of, but hardly unusual I'd think.

Anyone that isn't a professional chef is probably going to leave a little blood on a knife by cutting themselves with it.

There's a difference between a knife drenched in blood and a tiny bit of blood from a cut, certainly, but as far as forensics goes that may not matter - they may want to find any evidence of blood, like some blood caught in the space between the handle and the blade, on the assumption that a killer may have tried to clean blood off a weapon.

> on the assumption that a killer may have tried to clean blood off a weapon

Surely cleaning blood off the knife makes just as much sense if it was a cooking accident as if you've killed someone with it?

> Anyone that isn't a professional chef is probably going to leave a little blood on a knife by cutting themselves with it.

That's a strong claim.

I'm not a professional chef but I like to use a chef's knife, and I cut rather fast. Yet I don't remember ever cutting myself to the point of bleeding. I only remember scraping off nails and skin.

> like some blood caught in the space between the handle and the blade

I reckon that if it gets stuck between the handle and the blade then we're talking about more than a little blood.

I've cut my fingers deeply with a kitchen knife more than once, and bled profusely.

I rarely cook, but I work on cars a lot. Cutting my hands and fingers is part of the routine. Oddly, such cuts have never gotten infected, even when the dirt gets embedded below the skin. I think car grease is an effective bactericide!

(I find wearing gloves while working on a car uncomfortable, though I'll do it when cracking a rusted nut loose - when it finally gives way you'll always bang your hand against something.)

That same grease makes my cuts all red and inflamed. What gets me is that it's rarely the jobs you know will be hard that result in cuts. It's the small, simple ones. Switch a RAM stick on this beige box, change a set of windscreen wipers. Adjust the temp on the hot water cylinder. Do anything maintenance related to a washing machine.

I don't cut fast and have cut myself a few times to the point of bleeding. Maybe the fact that you cut fast indicates that you are reasonably skilled at this and therefore less likely than others to make yourself bleed, rather than speed indicating that mistakes are more likely.

And yet the mathematics in the article seems bogus to me, too. Can anyone figure out what calculation Colmez is doing?

I'm not a statistician, but if you toss a fair coin 20 times, there is about 0.1% chance of getting 17 heads, but to figure out the probability that the coin is fair given this data, it seems you need Bayes' theorem, which requires a prior probability on the coin being fair.
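For anyone who wants to check the 0.1% figure, here is a minimal sketch using only Python's standard library. The function names `binom_pmf` and `binom_tail` are my own, not from the article. Note that this only gives P(data | fair coin); it says nothing yet about P(fair coin | data).

```python
from math import comb

def binom_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k heads in n independent flips with P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_tail(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more heads in n independent flips."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

exactly_17 = binom_pmf(17, 20)    # equals 1140 / 2**20
at_least_17 = binom_tail(17, 20)  # equals 1351 / 2**20

print(f"P(exactly 17 heads | fair coin) = {exactly_17:.4%}")
print(f"P(17 or more heads | fair coin) = {at_least_17:.4%}")
```

Both numbers come out a little above 0.1%, so the figure quoted above holds whether you read it as "exactly 17" or "17 or more".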

Confusing "the odds of this occurring with a fair coin" with "the odds that the coin is fair" is by far the most frequent statistical error I see, and by far the least often corrected. It's mildly terrifying.

Is there a formula you can use to convert between the two?

Yes, but you need to estimate a probability distribution across various types of coin biases first.


Reading some other comments in this thread, I feel that I really ought to have included some more stuff here. There isn't actually such a thing as "the odds that the coin is fair." Either it's fair or it's not. What we can talk about is what probability we should ascribe to it being fair given what we know. Even a single coin flip will have a single result, uniquely determined by the way in which it is launched into the air and caught. Probabilities only exist in the presence of our ignorance of the actual facts, and some people consider probabilities to be themselves a measure of our ignorance of the world.

Oh I get it

P(H0|observation) = P(observation|H0) * P(H0) / P(observation)

P(observation) = P(observation|H0) * P(H0) + P(observation|!H0) * P(!H0)

A normal significance test tells you P(observation|H0). [Though I'm not sure about P(observation|!H0)]. To apply Bayes' Rule you need P(H0), where P(H0)=???
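To make the two formulas above concrete, here is a rough sketch. Everything numeric in it is an assumption for illustration: I pretend the only alternative to a fair coin is one that lands heads 80% of the time, and I try a few made-up values for P(H0). The point is only that the answer swings wildly with the prior.

```python
from math import comb

def likelihood(heads: int, flips: int, p_heads: float) -> float:
    """P(observation | coin with the given heads probability)."""
    return comb(flips, heads) * p_heads**heads * (1 - p_heads) ** (flips - heads)

def posterior_fair(heads: int, flips: int, prior_fair: float,
                   biased_p: float = 0.8) -> float:
    """P(H0 | observation), where H0 = fair coin and !H0 = coin with P(heads) = biased_p.

    biased_p is a hypothetical single alternative; a real analysis would
    need a whole distribution over possible biases.
    """
    p_obs_h0 = likelihood(heads, flips, 0.5)
    p_obs_not_h0 = likelihood(heads, flips, biased_p)
    p_obs = p_obs_h0 * prior_fair + p_obs_not_h0 * (1 - prior_fair)
    return p_obs_h0 * prior_fair / p_obs

# Same data (17 heads in 20 flips), three different assumed priors.
for prior in (0.5, 0.99, 0.999999):
    print(f"P(H0) = {prior}: P(H0 | 17 of 20 heads) = {posterior_fair(17, 20, prior):.4f}")
```

With a 50/50 prior the fair coin looks very unlikely; with a prior of 0.999999 (say, a coin straight out of your pocket) it is still almost certainly fair. Same data, same significance test, opposite conclusions.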

Bayes theorem. The problem is that it requires a prior, which is usually unknowable.

You are exactly right. Saying "there is only an 8% chance of this happening with a fair coin" is something completely different from saying "there is a 92% chance the coin is biased". The author is utterly clueless when it comes to probability.

This is how frequentist statistics works. You ask the wrong question ("the chance of the data occurring given the assumption") and use clean, rigorous, impeccable math to get an answer. Bayesian statistics is (usually) the opposite - you ask the correct question ("the chance of the assumption being correct") but find that there is no way without making some big assumptions to get to the answer.

Here's some good (possibly more "fair" than I've been above) discussion if anyone wants to read/think about this more:



A concise explanation of the difference between Bayesian and Frequentist techniques in statistics:


You know, I didn't understand that comic when it was posted (despite feeling like I have an understanding of Bayesian vs. frequentist statistics) and I still don't. So, I looked it up and apparently I'm not the only one.

It seems to me, and to the commenters on stats.stackexchange [1], that this comic both misinterprets frequentist statistics and misrepresents Bayesian statistics. I realize that XKCD is a nerdy comic meant to be entertaining - I just wanted to leave this discussion here in case anyone else is confused; I think this is an important distinction and one most people interested in statistics should spend some time thinking about.

[1] http://stats.stackexchange.com/questions/43339/whats-wrong-w...

Edit: I can't edit my first comment now, but gweinberg's post (sibling to the grandparent of this) words the problem perfectly.

You are correct. But the reality is even worse.

I don't know the particular DNA test used in this case, but let's assume it gave a certainty of 92% that the DNA isolated was from AK. This means that the particular sequences of DNA identified could have come from another person with a probability of 0.08 (i.e. a one-in-12.5 chance, which is not particularly low in a case like this). It does not mean that the DNA is correctly characterized with a probability of 92%.

For a repeated test to give a different probability, the identity of one or more of the sequences isolated from the sample would have to have been incorrect in one of the assays, i.e. there is a procedural error.

It is not at all like tossing coins. An analogy would be getting someone's eye color as blue the first time and brown the second.

I'm not sure if Bayes' theorem is relevant here (maybe behind the scenes), but you would probably go for http://en.wikipedia.org/wiki/Statistical_hypothesis_testing

Yea, if I flipped a coin 10,000 times and 55% of them were heads then the coin is definitely biased toward heads, not a 55% chance that it's biased.
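A quick way to see how extreme 5,500 heads in 10,000 flips is: a sketch using the normal approximation to the binomial, with a helper name (`fair_coin_z_test`) that I made up. It still only answers "how surprising is this data if the coin is fair", not "what is the chance the coin is fair".

```python
from math import erfc, sqrt

def fair_coin_z_test(heads: int, flips: int) -> tuple[float, float]:
    """Return (z-score, two-sided p-value) under the fair-coin hypothesis.

    Uses the normal approximation, which is reasonable for large flip counts.
    """
    expected = flips * 0.5
    std_dev = sqrt(flips * 0.25)          # sqrt(n * p * (1 - p)) with p = 0.5
    z = (heads - expected) / std_dev
    p_value = erfc(abs(z) / sqrt(2))      # two-sided tail of the standard normal
    return z, p_value

z, p = fair_coin_z_test(5500, 10_000)
print(f"z = {z:.1f}, two-sided p-value = {p:.2e}")
```

The result is 10 standard deviations away from what a fair coin would give, which is why "definitely biased" is a fair summary in practice, even though strictly speaking a prior is still lurking in the background.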

Article on statistics, IDing, and trials from a former professor of mine: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1432516 (free download).

Also very depressing to read his work on forensics. Long story short, everything in CSI besides DNA evidence is an unscientific sham. And even DNA evidence is dominated by lab error (1-2%). See: lst.law.asu.edu/fs09/pdfs/koehler4_3.pdf.
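To illustrate what "dominated by lab error" means, here is a back-of-the-envelope sketch. The random match probability of one in a billion is a number I made up for illustration, the 1-2% lab error range is taken from the comment above, and treating the two error sources as independent is a simplifying assumption of mine, not something from the paper.

```python
def false_positive_rate(random_match_prob: float, lab_error_rate: float) -> float:
    """Chance that a match is reported for someone who is NOT the source.

    Assumes (simplistically) that a coincidental match and a lab error are
    independent, and that either one alone produces a reported match.
    """
    return 1 - (1 - random_match_prob) * (1 - lab_error_rate)

rmp = 1e-9  # hypothetical random match probability, for illustration only

for lab_error in (0.0, 0.01, 0.02):
    fpr = false_positive_rate(rmp, lab_error)
    print(f"lab error = {lab_error:.0%}: P(reported match | not the source) = {fpr:.2e}")
```

Under these assumptions, once the lab error rate is anywhere near 1%, the impressive-sounding one-in-a-billion figure barely affects the overall chance of a false match.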

If you know anything about it, how would you/he propose fixing the problem from an institutional perspective?

Section V of that paper suggests that being more precise in language choice (an important problem in statistics) is his main advice there, but perhaps making the data and conclusions open in some "anonymized" form would support this endeavour?

My understanding of the original scene is that the police first on the scene were not very good, so even if the DNA on the knife is hers, one has to think about contamination by the original (postal) police.


Math is great for winning court cases. I got pulled over 12 years ago, and went into the magistrate hearing with a drawn out model indicating how the cop was either dangerously speeding himself to pull me over or had inaccurately assessed my speed and bumped up the speed he indicated on my ticket. The lawyer took one look at my sheet full of equations and said "Well, just watch your speed next time." He let me off without any fine and dismissed the case. I would always advise high school students who say "I don't need math" to understand that the world is shifting such that people who know math are becoming very powerful. Especially here in America, where no one would ever admit to being "dumb" and will therefore do anything they can to avoid doing math and looking bad. This is a serious weakness, and can easily be exploited if you ever run into a crafty mathematician.

Here is a much better article on the application of probability to the Amanda Knox case: http://lesswrong.com/lw/1j7/the_amanda_knox_test_how_an_hour...

It will be fascinating to see how this discussion plays out in a mainly American forum. In very broad terms, the attitude that the Italians seem to have to Knox is similar to the attitude that the Americans had to Woodward, and very much vice versa.

For anyone who doesn't know what "Woodward" refers to: http://en.wikipedia.org/wiki/Louise_Woodward_case

Stuff like this terrifies me:

"In the days following the verdict it emerged that the jury had been split about the murder charge, but those who had favoured an acquittal were persuaded to accept a conviction."

I just don't understand it. How can a person serving on a jury, with another person's life in their hands, be "persuaded to accept a conviction" when they don't actually believe the defendant is guilty?

Have you ever sat on a jury? Unless all 12 people start in unanimous agreement, then some people are probably going to have to be persuaded to change their mind. This is intentionally part of the process; it's how the system is supposed to work.

Maybe I'm reading it wrong, but the way it's worded makes me think that they never changed their mind about the innocence of the defendant, but were simply persuaded to vote against what they believed to be true.

I'm perfectly fine with someone starting out thinking the defendant is innocent and then later changing their minds to thinking the defendant is guilty. I'm very much against anyone thinking the defendant is innocent but voting to convict due to peer pressure or whatever.

Watch 12 Angry Men backwards.

Funny you bring it up, because from a Bayesian perspective, the main character makes some pretty specious arguments and the final choice to not convict is probably wrong; see http://rationallyspeaking.blogspot.com/2012/11/odds-again-ba... and the excerpts from http://scholarship.law.duke.edu/cgi/viewcontent.cgi?article=... ("Was He Guilty as Charged? An alternative narrative based on the circumstantial evidence from '12 Angry Men'", Vidmar et al) in http://studiolo.cortediurbino.org/how-useful-is-bayesianism/

That's an awesome comment. What a disturbing movie that would be.

A DNA Test doesn't work like that.

Probability tests do, but not things we label "probability" that aren't. "there is a 25% chance of rain tomorrow" isn't really a 25%. It is a confidence score.

Something people can grasp better than DNA: Let's say you have a partial thumb print in an imaginary murder. You could eliminate suspects who don't have that portion of the thumb print, but you couldn't confirm that the person or people who match did it. Only that the thumbprint is a "Pocked Loop", "Whorl", or "mixed" and so anyone with a "tent Arch" is not the killer.

You can be 100% confident that the print excludes the person with the "Tent Arch", and if you knew there were only 4 people in the room when the victim in our imaginary scenario died, you could even go so far as to say that, since only 2 of them have a potential match, you have 50% confidence in the match. But running the test 800 times will not get you to a number better than that.
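Here is a toy sketch of that thumbprint scenario. The suspect labels A to D and the equal starting weights are my own assumptions. Re-running the comparison on the same partial print is modeled as applying the same yes/no likelihood again, since it carries no new information.

```python
def update(prior: dict[str, float], consistent: dict[str, bool]) -> dict[str, float]:
    """One Bayesian update: zero out excluded suspects, then renormalize."""
    weighted = {name: p * (1.0 if consistent[name] else 0.0)
                for name, p in prior.items()}
    total = sum(weighted.values())
    return {name: w / total for name, w in weighted.items()}

# Four hypothetical people in the room, equally likely before any evidence.
suspects = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
# Only A and B have a print pattern consistent with the partial thumbprint.
print_matches = {"A": True, "B": True, "C": False, "D": False}

posterior = dict(suspects)
for _ in range(800):  # repeat the same test 800 times
    posterior = update(posterior, print_matches)

print(posterior)
```

After the first pass A and B sit at 0.5 each and C and D at 0, and the remaining 799 passes change nothing. This is the difference from coin flips, where each new flip is an independent observation.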

Exactly, that's why it helps to have multiple tests with different sensitivities and specificities for ruling people out and in, respectively. Whorls and a grip circumference of 14cm, say.

I don't like that RFLP analysis is still so common. Practitioners of forensics should genotype a few hundred thousand markers and be done with it. I see that Illumina offers tools for this purpose, in fact: http://www.illumina.com/applications/forensics.ilmn?sciid=20...
