
An excellent post by Schneier.

> The problem isn't just that such a system is wrong, it's that the mathematics of testing makes this sort of thing pretty ineffective in practice. It's called the "base rate fallacy." Suppose you have a test that's 90% accurate in identifying both sociopaths and non-sociopaths. If you assume that 4% of people are sociopaths, then the chance of someone who tests positive actually being a sociopath is 26%. (For every thousand people tested, 90% of the 40 sociopaths will test positive, but so will 10% of the 960 non-sociopaths.) You have to postulate a test with an amazing 99% accuracy -- only a 1% false positive rate -- even to have an 80% chance of someone testing positive actually being a sociopath.
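
The figures in the quote can be checked with a few lines of Python. This is a minimal sketch, assuming "90% accurate" means that both the true-positive rate and the true-negative rate are 0.90; the function name `positive_predictive_value` and its parameter names are illustrative and do not come from the article.

```python
def positive_predictive_value(base_rate, sensitivity, specificity):
    """P(condition | positive test), via Bayes' rule."""
    true_pos = sensitivity * base_rate                    # has it, flagged
    false_pos = (1.0 - specificity) * (1.0 - base_rate)   # lacks it, flagged
    return true_pos / (true_pos + false_pos)

# Assumed inputs: 4% base rate, symmetric accuracy of 90% and then 99%.
ppv_90 = positive_predictive_value(0.04, 0.90, 0.90)
ppv_99 = positive_predictive_value(0.04, 0.99, 0.99)
print(f"90% accurate test: {ppv_90:.1%}")
print(f"99% accurate test: {ppv_99:.1%}")
```

Under these assumptions the 90% case comes out nearer 27% than the quoted 26%, and the 99% case lands at roughly 80%, in line with the quote.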

Interestingly here he uses percentages to describe base rates and risk. Gerd Gigerenzer has a nice book, Reckoning with Risk, where he explains with many examples the problems of this approach. Gerd asks people to use real numbers instead, which are much easier to understand for most people.

Thus, Schneier's example becomes:

> Out of 1,000 people, about 40 will be sociopaths. You have a test that will tell you if someone is, or is not, a sociopath. The test will be correct 9 times out of 10. Bob has taken the test, and has been identified as a possible sociopath. The chance that Bob is actually a sociopath is about 1 in 4. This is because the test will tell you that 36 of the 40 sociopaths are sociopaths, but it will also incorrectly tell you that 96 non-sociopaths are sociopaths.
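
The same head count can be written out as whole-number arithmetic. This is a rough sketch that assumes "correct 9 times out of 10" applies separately to the sociopaths and to the non-sociopaths; the variable names are made up for illustration.

```python
from fractions import Fraction

population = 1000
sociopaths = 40                              # 4% of 1,000
non_sociopaths = population - sociopaths     # 960

# The test is assumed right 9 times in 10 for each group.
true_positives = sociopaths * 9 // 10        # sociopaths correctly flagged
false_positives = non_sociopaths * 1 // 10   # non-sociopaths wrongly flagged

flagged = true_positives + false_positives
chance = Fraction(true_positives, flagged)
print(f"{true_positives} of {flagged} flagged people are sociopaths ({chance})")
```

Counting people this way keeps every intermediate figure an integer, so the only division happens at the very end.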

My writing is lousy, and other people will be able to clean this up, but even with my poor writing style it's easier for most people to follow and understand than the percentages.

This is alarmingly important when you're making a health decision - "Should I remove my breasts to reduce my risk of breast cancer?" for example.

(http://www.amazon.com/Reckoning-Risk-Learning-Live-Uncertain...)

EDIT: I use "sociopath" because it's in the source article. I agree with NNQ that it's very troubling to bandy around diagnostic labels like this, and deem people to be dangerous, just because of a tentative probabilistic diagnosis.




This is a pretty bad example to illustrate the fallacy because a 25% confidence is actually extremely good. I don't know what good a test for sociopathy is, but if we had a test this good at identifying terrorists, it would be incredibly useful. If signals intelligence could produce a list of people and guarantee that a quarter of the people on that list are terrorists, it would absolutely revolutionize law enforcement.


This issue has nothing to do with the efficacy of the tests. In a perfect world having a list where 25% of the names are potential trouble-makers (for various definitions of trouble-maker) would be an enormous benefit and allow resources to be targeted more efficiently. The real problem is what happens in the imperfect world where law enforcement, government, and self-imposed officers of authority get lazy or downright malevolent. Do you really want to live in a world where your family members might be carted off, never to return, because their name came up on a list generated by a computer? Try asking the folks in North Korea, for example, whether this type of test is a good idea or not.


The best outcome would be for these kinds of traits to be passively detected and for the community to provide help and support as an emergent property of that community. In a way this already happens on sites like Reddit, where suicide prevention emerged organically based on the dynamics of the community. This is in stark contrast to the "real world".

The internet is well known as a negative influence on certain people, but couldn't it also be having a positive effect that is harder to measure and more of an unintended side effect?


Yes, but what happens when your community finds out that you no longer believe in God or that you think the Earth actually revolves around the Sun? It's not when things go right but when things go wrong that matters. The points of failure for this type of system are innumerable.

The real goal is building social/political systems that are robust and have checks and balances so that they cannot be perverted by special interests and are accessible to those who need them (child abuse support lines are a good example). Anything where a group intervenes on behalf of an individual is prone to disaster.


The system you describe is exactly what we have in the world now. The financial system has numerous checks and balances and is notoriously prone to non-virtuous behavior. Aren't hacker news or stackexchange examples of creating virtuous behavior using an algorithm, and a strong community? It is self regulating because if you suddenly have a deep opposition to the ethos of a community you can just leave (slashdot -> digg -> reddit).


You can 'just leave' pseudonymous communities because their aggregated judgment doesn't follow you to the next one. But if one's real identity is flagged as "(likely) sociopath" on a Real Name service, how does one 'just leave' that determination behind?

Are search engines and archives going to all willingly 'forget' that data when you 'just leave' Facebook? Are they going to not aggregate and correlate it to any new service you join?

This is one of the huge points of criticism of Real-Name-required services: a person can never escape an unjust judgment of such communities, due to the long memory of the internet.


"We're here to help you. And watch you succeed, friend!"


> In a perfect world having a list where 25% of the names are potential trouble-makers [...]

... would be pointless, as a perfect world would have no concept of "trouble".


"in a perfect world" is an English phrase. It means, in this context, "if this worked perfectly".

It is not a statement about the world in general.


You say idiom, I say chronic lack of imagination.


> and guarantee that a quarter of the people on that list are terrorists,

Honest question, what would happen to the other three quarters?


They'd end up on a no fly list and be deprived of basic liberties to travel with no right to appeal secret courts and determinations. Or they'd get invasive checks even though they are a kid, or in a collapsible wheelchair and obviously not a terrorist.

All because people can't understand the example in question, which appears in the first few chapters of most introduction to statistics books. And while all that money is being spent on useless checks the 9/11 terrorists, who the agencies were warned about, and the Boston bombers, who the agencies were ALSO warned about, are not followed up on because human and other resources are being spent on mass surveillance.


They'd be subject to background checks and surveillance that they ideally would not even notice - yes, I'm aware that we don't live in an ideal world, but the point is that traditional "leg work" policing is pretty good at determining whether a suspect is actually engaged in nefarious activities - but it's expensive and requires a reasonably narrow list of suspects to begin with.


Most likely nothing. It depends on what sort of test it is; if it's something non-intuitive like (say) a habit of writing sentences that always have a prime number of words in them, you'll get your false positives but most of those people won't pass any other tests, whereas the actual terrorists will.

What Schneier is missing is that while you can't ID people that well from a single test, you can apply a bunch of them. In his example, one test improves the probability of correctly IDing a sociopath from 4% to 24%. Apply another, different test of similar efficacy to that result set and you'll have a population of 21 true positives, and 8 or 9 false positives, increasing the probability of a successful ID from 25% to ~70%. Sure, there's no single test that will give you reliable answers, but so what? It's OK to use a multi-pronged solution.
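
The chaining idea can be sketched by feeding the result of one test in as the starting probability for the next. This assumes both tests are 90% accurate in both directions and that their errors are statistically independent, which is a strong assumption; the `update` helper is hypothetical.

```python
def update(prior, sensitivity=0.90, specificity=0.90):
    """Probability of the condition after one positive result (Bayes' rule)."""
    true_pos = sensitivity * prior
    false_pos = (1.0 - specificity) * (1.0 - prior)
    return true_pos / (true_pos + false_pos)

after_first = update(0.04)            # start from the 4% base rate
after_second = update(after_first)    # only valid if the tests are independent
print(f"after one positive:  {after_first:.1%}")
print(f"after two positives: {after_second:.1%}")
```

With exact rather than rounded inputs the second figure comes out somewhat above ~70%, at roughly 77%. If the two tests tend to make the same mistakes, the second positive adds much less than this.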


Applying multiple tests only works if the tests are independent. When you're searching for the proverbial needle in the haystack, you probably don't have enough needles to let you reliably calibrate several independent tests in the first place.


Waiting until someone actually commits a crime provides a list that is 100% accurate.


I can just see the Daily Mail headline now:

"Facebook records reveal convicted killer wrote 13-word post 5 years ago - red flag was raised - why was nothing done?"


No, it doesn't. People commit crimes and get away with them all the time, in fact. I'm not proposing that we put people in jail for having criminal potential.


Nothing that is being proposed will stop people getting away with crimes.

Waiting until someone actually commits a crime will stop people being persecuted for a coincidental similarity of their behavior to that of a terrorist, sociopath or mime artist.


I think the fact that in this hypothetical 4% of all people are actually terrorists, and the potential terrorist list would be 10% of the total population, would have a greater impact on law enforcement than the accuracy of the test.

Even 0.4% of the population that get tested and are incorrectly "proved" innocent of being a terrorist amounts to more than a million undetectable terrorists in the US alone.


Do I understand you correctly?

You're saying that you would be happy to join 74 other non-terrorists (i.e. law abiding citizens) plus 25 actual terrorists and be taken off to Guantanamo Bay indefinitely?

You're really sure about that being a Good Thing for law enforcement?


He got 25% starting from a base rate of 4%. The base rate for terrorists is a little lower than that. I agree that the essay probably ought to emphasize this point so skimmers won't take away the wrong idea.
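
How strongly the result depends on the base rate can be seen by rerunning the same calculation with rarer conditions. The base rates below other than 4% are hypothetical values picked only for illustration, not estimates of anything real, and `chance_if_flagged` is a made-up helper that assumes a test 90% accurate in both directions.

```python
def chance_if_flagged(base_rate, accuracy=0.90):
    """Share of flagged people who truly have the condition."""
    true_pos = accuracy * base_rate
    false_pos = (1.0 - accuracy) * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)

# 4% is the example's figure; the two smaller rates are assumptions.
base_rates = [0.04, 0.001, 0.00001]
results = {rate: chance_if_flagged(rate) for rate in base_rates}
for rate, chance in results.items():
    print(f"base rate {rate:.5%} -> {chance:.4%} of flagged people are true positives")
```

The accuracy of the test never changes here; only the rarity of the condition does, and that alone is enough to push the share of true positives on the list from about a quarter to well under one percent.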


I don’t pretend to know what the value of N ought to be[1], but 1/3 is ridiculously low.

[1] http://www2.law.ucla.edu/volokh/guilty.htm


I found his version clearer because "%" distinguishes proportions from quantities.


Yes, but imagine you're taking the test.

Suppose you have a test that's 90% accurate in identifying both people with X and people without X. Assume that 4% of people have X, and you're told that you have tested positive for X.

Do you really find it easy to arrive at your actual chance (26%) of having X? Let's not forget that most people on HN are at the smarter end of the bell curve. It'd be interesting to see the results of a large scale study about answers to questions like this.


I think you're misreading my comment. When I said "%" distinguishes proportions from quantities, I was implying there'd be both proportions and quantities (as his version has). Otherwise, they needn't be distinguished.

When I said I found his version clearer, I meant between the two versions originally given. The one you've just added is of course less clear because unlike the other two, it doesn't point out the issue.

BTW: But maybe it is clearer, for calculation rather than understanding, because I get 27.(27)%, not 26%... https://www.google.com/search?q=%28.9*.04%29/%28.9*.04%2B.1*...


not sure how to calculate this with percents

true positives .90 * .04 = 0.036

false positives .10 * .96 = 0.096

total positives 0.132

positives that are true positives 0.036 / 0.132 = 0.2727...

I had to think about the calculation as I was doing it; it wasn't automatic, even though it was just multiplication. But I think the difficulty has more to do with the fact that you have to use some relative of Bayesian probability, not really the fact that you had to deal with percentages.


I vote for just using real numbers instead of percentages, like "0.4 chance" instead of "40% chance". The reason is that a lot of people get the math of percentages wrong simply because it involves a lot of back and forth mental conversion between percentages and fractional representation. I always found it easier to just use the latter.


To be honest, with these numbers the percentages actually do a better job of giving me the impression that this test is worthless. 25% sounds much worse than 1 in 4.


Once you get to 25% that's true.

But today try to ask a few people around you, and see what they say.

> Suppose you have a test that's 90% accurate in identifying people who have a disease, and 90% accurate in identifying people who do not have the disease. Assume that 4% of people have this disease. Hypothetical_Bob is tested, and the test says that he has the disease. What are the chances that Bob actually does have the disease?

Lots of people - smart people too! - struggle with this. Even if you give them pencil and paper and let them doodle around they will often give you an incorrect number. And most of them will be surprised if you tell them it's as low as 26%.


Ah yes, I see what you're saying. Real numbers are easier to reason with, and more likely to lead to correct results, than percentages.

I think my point is slightly orthogonal since I misunderstood you; if you tell someone that something is "10%" they will think "that is pretty bad" whereas "1 in 10" is more likely to get a "hey, that's not too shabby" response. Percentages sound "worse" than numbers, even when they are the same (at least to me). Perhaps because they are harder to reason with?



