
How to lie without statistics edition - solutionyogi
https://www.chrisstucchio.com/blog/2016/propublica_is_lying.html
======
stdbrouw
> The predictor is probably not biased against any particular race - the
> race_factorAfrican-American:score_factorHigh term is not statistically
> significant. Or, as ProPublica puts it, it's "almost statistically
> significant".

This is not how you interpret p-values. Not attaining statistical significance
(insofar as you think that that's a good measurement to start with) means
there isn't sufficient evidence to confirm that a factor truly makes a
difference. It is not proof that no racial bias is present -- to prove a
negative with the same 95% confidence we expect elsewhere, the analysis would
need to have 95% power to detect the bias if it did exist... and unfortunately
it just so happens that interaction effects always have lower power for the
same sample size than first-order effects.
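
To put rough numbers on the power point, here is a minimal sketch. It assumes a balanced 2x2 design with the same number of observations in every cell, a common noise level `sigma`, and a normal approximation to the test statistic. The function names and the effect size are illustrative assumptions and are not taken from ProPublica's R script.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided(effect, se, alpha=0.05):
    """Approximate power of a two-sided z-test for a true effect with standard error se."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect / se
    # Probability that the test statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def se_main_effect(n_per_cell, sigma=1.0):
    # Difference of two marginal means, each averaging 2 * n_per_cell observations.
    return sigma * sqrt(1.0 / n_per_cell)


def se_interaction(n_per_cell, sigma=1.0):
    # Difference of differences across the four cell means.
    return sigma * sqrt(4.0 / n_per_cell)


# Hypothetical values: same true effect size for both contrasts.
effect = 0.2
n_per_cell = 100

power_main = power_two_sided(effect, se_main_effect(n_per_cell))
power_inter = power_two_sided(effect, se_interaction(n_per_cell))
print(f"main effect power: {power_main:.2f}")
print(f"interaction power: {power_inter:.2f}")
```

Under these assumptions the interaction contrast has twice the standard error of the main-effect contrast, so an effect of the same size is detected much less often. A failure to reach significance on the interaction term is therefore weak evidence of no bias unless the power is known to be high.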

The author also accuses ProPublica of cherry-picking numbers but then picks
the one number from the entire article that is least convincing (one of the
numbers on predictive accuracy of the algorithm for black vs. white people)
and ignores the other statistics about predictive accuracy that are mentioned
in the article.

Do I necessarily want to defend ProPublica here? I dunno, I haven't gone
through the analysis they did for the article in depth. I also like the
author's note that statistical models are often wonderful tools for reducing
human biases and (in this case) reducing racism. But I do feel the author is
being very hostile when ultimately his rebuttal is pretty feeble.

~~~
yummyfajitas
You're right that I didn't check the power of their test, and I'm also not a
huge fan of p-value tests. If you want to say that their methodology sucked,
go ahead. Skimming their R script, they do appear to have skipped the power
analysis.

But that's not what I'm criticizing them for. Statistics is hard and bad stats
are forgivable. I'm criticizing them for ignoring their own numbers when they
came out insufficiently clickbaity - that's deliberate dishonesty.

------
wodenokoto
I'm still unclear on whether race is a feature in the original model.
Moreover, the offense and re-offense rates of black people are also tied to
racial bias by the police.

If the police let speeding white guys off with a warning more often than
speeding black guys (as a rate), then the bias is applied at the data level.
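
A toy calculation of that data-level effect, with made-up numbers: both groups are assumed to speed at the same true rate, and only the chance of being let off with a warning differs. The function name and all values are hypothetical.

```python
def recorded_rate(true_rate, warning_prob):
    """Share of a group that ends up with a recorded offense."""
    return true_rate * (1.0 - warning_prob)


# Hypothetical numbers, chosen only for illustration.
true_rate = 0.10  # identical underlying behavior in both groups
rate_a = recorded_rate(true_rate, warning_prob=0.5)  # let off more often
rate_b = recorded_rate(true_rate, warning_prob=0.2)  # let off less often

print(f"group A recorded rate: {rate_a:.3f}")
print(f"group B recorded rate: {rate_b:.3f}")
```

Any model trained on the recorded rates would see group B as higher risk even though the underlying behavior is the same by construction.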

------
areyousure
yummy: I'm commenting here because I can't figure out how to comment on your
blog. :(

As a potentially useful but minor correction: Sorelle, who you mention in your
previous post, is a "she".

