
In Search for Killer, DNA Sweep Exposes Intimate Family Secrets in Italy - eruditely
http://www.nytimes.com/2014/07/27/world/europe/in-search-for-killer-dna-sweep-exposes-intimate-family-secrets-in-italy.html?partner=rss&emc=rss&smid=tw-nytimes&_r=0
======
gph
It makes me wonder if in the future there might be false convictions based on
DNA evidence, and not from chimeras or inaccurate testing. DNA found at a
crime scene has basically come to mean that person is guilty in the public
mind (thanks CSI). Even though it's unlikely, it's possible to get some DNA on
someone you're near in public who then gets murdered. So you've got DNA on the
person, your cell phone is in the area, and you don't have a particular alibi.
In most cases that probably means that person is guilty, but I can imagine a
number of cases where the person could just be in the wrong place at the wrong
time.

From the article, it sounds like this guy is guilty, given where the DNA was
located and the other corroborating evidence. But I'm glad the article ended
by saying some of the geneticists weren't condemning the accused based on the
DNA evidence alone. I hope that type of thinking expands into the wider
public.

~~~
smtddr
I wonder about this all the time.

Sometimes I comb my hair, or bite my fingernails. What if part of it fell on
the floor and some random person steps on it. Stuck on the bottom of their
shoe. A bunch of coincidental stuff comes up and my DNA is tested against a
fingernail/hair under victim's shoe... match. I'm doomed.

Indeed, I fear the mentality that DNA-match = Guilty.

How many people have been sent to jail/put to death based on false DNA results,
or on accurate results that didn't come from the crime they're accused of?

~~~
rl3
> _Sometimes I comb my hair, or bite my fingernails._

You don't even need to comb your hair to leave DNA all over the place. A
normal person will lose 50-100 hairs every day.[0]

Also, everyone sheds dead skin cells, and certain dermatological conditions
can exacerbate matters.

[0] [http://www.aad.org/dermatology-a-to-z/diseases-and-treatment...](http://www.aad.org/dermatology-a-to-z/diseases-and-treatments/e---h/hair-loss)

~~~
jessaustin
_...everyone sheds dead skin cells..._

True. If you're in an office you haven't visited before and would like to be
disgusted, examine the bottoms of a few mice.

------
bcoates

      “The important fact is that this man’s DNA is the same as
      the DNA found on Yara,” he said, noting that the odds of
      anyone else sharing the same genetic profile was “one
      person out of two billion of billions of billions” or
      practically nonexistent.
    

This fundamental misunderstanding of statistics really doesn't increase
confidence in this investigation. When you go on a fishing expedition across
22,000 suspects, the probability of getting a false positive from one cause or
another is nearly one.

~~~
themartorana
Is this true? It _feels_ wrong - that a false positive DNA test is not like
picking the right number on a roulette table. Considering the number of bits
of information that have to match to be a positive hit in DNA testing, even
22k tests shouldn't hit a false positive.

Statistics being what they are, what is the science behind false positives in
DNA testing across a fairly large sample set?

~~~
e12e
Bear in mind that DNA test != full sequencing.

I'm a bit rusty on my combinatorics, but borrowing from wikipedia[1] "if
n(p;d) denotes the number of random integers drawn from [1,d] to obtain a
probability p that at least two numbers are the same, then:"(...)

We have stated above that n=2, p=1/(2*10^(3*9))=1/(2e27), so we have:

    
    
          p ~ 1 - ( (d-1)/d )^(n(n-1)/2)
            = 1 - ( (d-1)/d )^(2(2-1)/2) # -> 2/2 -> 1
            = 1 - ( (d-1)/d )
         p-1=-(d-1)/d
        (p-1)d=-d+1
         pd-d=1-d  
           pd=1
            d=1/p 
            d~2e27 #Yeah, I guess this should have been obvious, in
                        retrospect
    

For n=20k=2e4:

    
    
         p(n=2e4,d=2e27) ~ 1-((d-1)/d)^(n(n-1)/2)
    

Which is so small that in Python I had to use the Decimal module to compute
p~1e-19 or so.
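
A minimal sketch of that computation, assuming d = 2e27 equally likely profiles and n = 2e4 samples as above. The function names and the 60-digit precision are my own choices for illustration, and the second function is an alternative that avoids Decimal by using log1p/expm1:

```python
from decimal import Decimal, localcontext
import math


def collision_probability(n, d, precision=60):
    """Approximate P(at least two of n draws match) for d equally likely values.

    Uses the approximation p ~ 1 - ((d-1)/d)^(n(n-1)/2) from the thread.
    """
    pairs = n * (n - 1) // 2  # number of distinct pairs of samples
    with localcontext() as ctx:
        ctx.prec = precision  # enough digits that 1 - tiny does not round to 1
        d = Decimal(d)
        return 1 - ((d - 1) / d) ** pairs


def collision_probability_float(n, d):
    """Same approximation in plain floats, written to avoid cancellation."""
    pairs = n * (n - 1) // 2
    # (1 - 1/d)^pairs = exp(pairs * log1p(-1/d)); expm1 keeps the tiny result
    return -math.expm1(pairs * math.log1p(-1.0 / d))


if __name__ == "__main__":
    d = 2 * 10**27
    print(collision_probability(20_000, d))
    print(collision_probability_float(20_000, d))
```

Both versions should land around 1e-19; a naive float evaluation of `1 - ((d-1)/d)**pairs` would just return 0.0, because (d-1)/d rounds to exactly 1.0.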

However, this assumes an _even distribution_ across the possibilities -- and
we've approximated the number of combinations based on the statement of
collision probability -- not based on actual number of base pairs, or based on
some model on how these are likely to be distributed... (eg: if we test for
sex, but only sample males -- does that mean we lose a certain number of
possibilities? How about caucasian vs other races? It depends on what markers
we are testing. Further we _know_ that some markers are likely to match,
because there's bound to be some common ancestors among some members in the
sample etc.)

Note that the math above might be entirely bogus -- I'm off to bed, so I'd
assume I've made some silly mistakes...

[edit: On actually reading the article, it could appear that the quote on "two
billion billions of billions" is indeed meant to indicate the likelihood of
someone sharing the same DNA, rather than the likelihood of two samples
testing as similar.]

[1]
[https://en.wikipedia.org/wiki/Birthday_problem#The_generaliz...](https://en.wikipedia.org/wiki/Birthday_problem#The_generalized_birthday_problem)

~~~
e12e
For a more _relevant_ discussion, see eg:

[http://www.councilforresponsiblegenetics.org/GeneWatch/GeneW...](http://www.councilforresponsiblegenetics.org/GeneWatch/GeneWatchPage.aspx?pageId=57)

Keep in mind that doing 20k tests, there are likely to be mistakes made. In
the case of a false positive (or a disputed positive), that shouldn't be too
much of a problem -- just run the test again, perhaps using multiple labs to
check the results.

For a false negative match -- you've just missed/"cleared" the probable
perpetrator. Such an error is unlikely to be caught (I'd imagine) -- on the
other hand if the error rate is low (say 0.5%?) -- the chance that an error is
made when testing that (presumably) _one_ swab that is from the suspect, is
pretty low, even with 20k tests.
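
To put rough numbers on that, here is a small sketch. The 0.5% per-test error rate is the guess from above, not a measured figure, and it assumes errors are independent between tests; the function name is made up for illustration:

```python
def lab_error_summary(n_tests, error_rate):
    """Back-of-the-envelope numbers for independent per-test lab errors."""
    # Average number of tests that go wrong somewhere in the whole sweep
    expected_errors = n_tests * error_rate
    # Chance that at least one of the n_tests results is erroneous
    p_any_error = 1 - (1 - error_rate) ** n_tests
    # Chance that the single swab from the actual perpetrator is the one botched
    p_suspect_swab_error = error_rate
    return expected_errors, p_any_error, p_suspect_swab_error


if __name__ == "__main__":
    expected, p_any, p_swab = lab_error_summary(20_000, 0.005)
    print(f"expected erroneous tests: {expected:.0f}")
    print(f"P(at least one error):    {p_any:.6f}")
    print(f"P(suspect's swab wrong):  {p_swab:.3f}")
```

Under those assumptions some error somewhere in the sweep is close to certain, while the chance of it hitting the one swab that matters stays at the per-test rate.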

------
teddyh
Those referencing _Gattaca_ (1997) here should read _The Unreconstructed M_
(1957) by Philip K. Dick:

[http://www.sffaudio.com/podcasts/TheUnreconstructedMByPhilip...](http://www.sffaudio.com/podcasts/TheUnreconstructedMByPhilipK.Dick.pdf)

------
eruditely
Prosecutors now need to be literate in technical fields such as statistics,
probability, and perhaps basic forensics. Overzealous prosecutors are some of
the most dangerous individuals, and they seem to go unchecked.

~~~
venomsnake
Overzealous means - we got a guy, now we must find the crime that fits him
best.

------
chatmasta
Why stop at a dragnet of 22,000 samples? Why not take DNA samples of every
newborn?

~~~
msandford
[http://www.imdb.com/title/tt0119177/](http://www.imdb.com/title/tt0119177/)

------
nicolasp
There is a similar case in France (albeit at a much smaller scale), where DNA
tests were run on all male pupils and staff from a school in an effort to find
the perpetrator of a rape. So far they haven't found a suspect.

[http://www.theguardian.com/world/2014/apr/13/french-school-d...](http://www.theguardian.com/world/2014/apr/13/french-school-dna-testing-rapist-la-rochelle)

[http://www.theguardian.com/world/2014/may/22/french-school-r...](http://www.theguardian.com/world/2014/may/22/french-school-rape-dna-testing-la-rochelle)

------
coldcode
Would not likely happen in the US.

~~~
MrBra
Sarcasm?

