
Exit polls aren't what you think they are - mcbrown
https://nuttersandnuttier.com/exit-polls-arent-what-you-think-they-are-e93d031726fb#.z8cow174n
======
jobigoud
He didn't choose Springfield at random for his example: this particular bit of
counter-intuitive statistics is called Simpson's paradox.

[https://en.wikipedia.org/wiki/Simpson%27s_paradox](https://en.wikipedia.org/wiki/Simpson%27s_paradox)

~~~
bo1024
I don't think this is really Simpson's paradox (named after Edward Simpson,
who described it in 1951). In Simpson's paradox, you have a statistic X that
is higher than Y in both cases, but overall, Y is higher than X.

It _would_ sound like Simpson's paradox if one candidate won a higher
percentage of the vote in both East and West Springfield, yet lost the
election. But this is of course impossible.

Simpson's paradox arises when you compare two different percentages, say
belonging to Candidate A and B, across two different treatments, say East and
West Springfield, but you don't compare the sample sizes. It doesn't apply
here because everyone who votes is assumed to vote for either A or B.

An example of Simpson's paradox would be like this. We look at the percentage
of _their own party_ that a candidate wins. Then it could be that Candidate A
wins 90% of the Democrats in East Springfield while B wins only 80% of the
Republicans; and in West Springfield, A wins 60% of the Democrats while B wins
50% of the Republicans. Yet, due to differences in population between East and
West, A overall only wins 65% of the Democrats while B wins 75% of the
Republicans.
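
To make the arithmetic concrete, here is a minimal sketch in Python. The voter counts are hypothetical, picked only so that the pooled figures come out to the 65% and 75% above; the dictionary names and `overall_share` are made up for illustration.

```python
from fractions import Fraction

# Hypothetical voter counts (assumption): chosen only to reproduce the
# percentages in the example. Each entry is
# (party voters in region, share won by the party's own candidate).
democrats = {"East": (100, Fraction(90, 100)), "West": (500, Fraction(60, 100))}
republicans = {"East": (500, Fraction(80, 100)), "West": (100, Fraction(50, 100))}


def overall_share(groups):
    """Pool the regions: total votes won divided by total party voters."""
    won = sum(n * share for n, share in groups.values())
    total = sum(n for n, _ in groups.values())
    return won / total


a_overall = overall_share(democrats)    # A's share of all Democrats
b_overall = overall_share(republicans)  # B's share of all Republicans

# A does better with their own party than B does in each region...
for region in ("East", "West"):
    assert democrats[region][1] > republicans[region][1]

# ...yet B does better once the regions are pooled.
print(f"A wins {float(a_overall):.0%} of Democrats overall")
print(f"B wins {float(b_overall):.0%} of Republicans overall")
```

The reversal comes entirely from the group sizes: most Democrats are in West, where A is weaker, and most Republicans are in East, where B is stronger.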

~~~
Sacho
The problem described by the article seems like a case of violating
dimensional analysis, rather than Simpson's paradox. It might be more obvious
if we use clear units: 50 scores voted for A in West, and 80 dozens voted in
East.

The article talks about "weighting" the results, which is exactly figuring out
the conversion from "democrats in West" to a common unit, "single person", to
allow proper arithmetic operations on them.
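
A rough sketch of what that conversion looks like, in Python. All numbers here are hypothetical (they are not from the article), and the function names are invented for the illustration: a regional percentage is turned into a count of people via turnout before anything is added together.

```python
from fractions import Fraction

# Hypothetical exit-poll shares for candidate A, per region (assumption).
share_for_a = {"West": Fraction(50, 100), "East": Fraction(80, 100)}
# Hypothetical turnout per region, in single persons (assumption).
turnout = {"West": 20_000, "East": 5_000}


def naive_average(shares):
    """Average the percentages directly, ignoring how many people each covers."""
    return sum(shares.values()) / len(shares)


def weighted_share(shares, voters):
    """Convert each percentage to a head count, add, then divide by all voters."""
    votes_for_a = sum(shares[region] * voters[region] for region in shares)
    return votes_for_a / sum(voters[region] for region in shares)


naive = naive_average(share_for_a)
weighted = weighted_share(share_for_a, turnout)

print(f"naive average of percentages: {float(naive):.0%}")
print(f"turnout-weighted share:       {float(weighted):.0%}")
```

With these made-up figures the two answers differ by nine points, which is the sense in which early numbers can be off until real turnout is known.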

------
fotbr
One thing I rarely see brought up is that people lie to exit poll takers.
Maybe I'm just stupid, or politics are too far above my understanding, but I
don't understand why exit polling is taken as gospel (see 2000 shrub vs bore,
or Brexit if you prefer an international flavor), given there's absolutely
zero requirement that responses be truthful.

I know I wasn't good with my statistics classes (I managed low to mid "A"s,
but I never really understood the steps I was reproducing, or the why behind
the process), but how do you correct for that type of uncertainty?

Is there a good, basic statistics reference that HN would recommend? We used
Devore's "Probability and Statistics for Engineering and the Sciences", and it
didn't "click" with me. I'd love to find a good textbook on the subject.

------
wyldfire
> The reason for the initial error in the 2016 primary is obvious: the
> rural/urban split caught exit pollsters — who probably assumed things would
> look a lot like 2012 — completely by surprise.

Wouldn't it make a lot more sense to use the 2012 Democratic primary as a basis
instead?

> And if you hear anyone say the exit polls are a sign of a rigged election,
> please do tell them that I told you to tell them that I said to say that
> they’re not very knowledgeable about the subject.

Yeah, it's too bad. Now that I know this info about exit poll results it would
be nice if they could qualify the numbers a little when reporting them.

~~~
maxerickson
There wasn't a 2012 Democratic primary. Not a meaningfully contested one
anyway.

In New York State in 2008, Clinton beat Obama with support coming from both
urban and rural areas (with Clinton perhaps doing better in rural areas than
Obama):

[http://uselectionatlas.org/RESULTS/state.php?year=2008&fips=...](http://uselectionatlas.org/RESULTS/state.php?year=2008&fips=36&f=0&off=0&elect=1)

It's also not necessarily the case that the turnout for past elections will be
a sound guide to the turnout for the next election. A lot of people vote as a
result of affinity for a particular candidate or because of a motivating
issue.

------
DennisP
So obviously, early results from exit polls aren't reliable because we don't
know turnout numbers.

But after the election, we know exactly who voted, so it seems we _could_ use
exit polls at that point to sanity-check the results. Given a paper trail, a
significant discrepancy could trigger a recount.

------
snksnk
> Unfortunately, everyone (you, your family, that egg on Twitter, most
> pundits, and at least one organization purporting to be doing exit polling)
> has no fucking idea how exit polls are conducted and why those initial
> figures are a steaming pile of crap until real figures on turnout (i.e. the
> votes themselves) have been tabulated.

We do; don't be so condescending and profane. The popular press just wants
some cheap sound bites to get more views / clicks. They do the same with
academic research, in which they mistake economic significance for statistical
significance, ignore any shortcomings, ignore non-rejected hypotheses, and
project the findings outside of the (often very limited) scope.

~~~
vxNsr
This whole article uses condescension as a comedic device. It works for some
and not others. But he wasn't actually being condescending, he was just trying
to entertain.

