
Wouldn't this greatly throw off the results in cases of rare behavior? For example, if only 1 in 100 (1%) of people were leopard killers, but the die forces a given answer 1 time in 6 (16.7%), how would you prevent the results from being swamped by die-forced "yes" answers?

I'm assuming there's a statistical solution to this - if someone could explain this or link to one I'd appreciate it.




Easy version:

If the sample is large enough, 1/6 of the answers are forced "yes", 1/6 are forced "no", and only 4/6 of the answers are real.

So to get the actual frequency of "yes", you subtract the 1/6 of forced "yes" answers, but the real answers are only 4/6 of the original sample, so the formula is

  Actual_YES = (Measured_YES - 1/6) * (6/4)

  Actual_NO = (Measured_NO - 1/6) * (6/4)
  
(If Measured_YES+Measured_NO=1=100%, then Actual_YES+Actual_NO=1=100%)
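
Here is the same correction as a quick Python sketch (the function name, parameters, and the 20% example are mine, just for illustration):

  def corrected_yes_rate(measured_yes, p_forced_yes=1/6, p_truthful=4/6):
      # Subtract the die-forced "yes" fraction, then rescale by the
      # fraction of answers that were real.
      return (measured_yes - p_forced_yes) / p_truthful

  # Example: if 20% of all recorded answers were "yes"...
  print(corrected_yes_rate(0.20))  # ~0.05, i.e. an estimated 5% true "yes" rate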

More technical details:

You have to be careful, because the number of forced "yes" and "no" answers will not be exactly 1/6 of the total sample. For example, suppose that you ask 60 people if they are "aliens" using this method.

If you get 11 "yes" answers, it doesn't mean that (11/60 - 1/6) * 6/4 = 2.5% of the people are "aliens".

If you get 9 "yes" answers, it doesn't mean that (9/60 - 1/6) * 6/4 = -2.5% of the people are "aliens"!

And each of these sample outcomes happens ~13% of the times you run the poll.
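
You can check that ~13% figure with the binomial distribution. A sketch (assuming nobody is actually an "alien", so every "yes" is die-forced):

  from math import comb

  def binom_pmf(k, n=60, p=1/6):
      # Probability of exactly k forced "yes" rolls among n respondents.
      return comb(n, k) * p**k * (1 - p)**(n - k)

  print(binom_pmf(11))  # ~0.125, roughly 13%
  print(binom_pmf(9))   # ~0.135, also roughly 13%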


No, it wouldn't throw off the result, but no matter the technique, you would need a very large sample to accurately measure a behavior that rare.


You are right. The "signal to noise" will be terrible if the rare behavior is much less likely than the random event. The only "statistical" solution would be to calibrate things by making the random event similarly unlikely (e.g. one could use a 100-sided die). Whether this statistical solution destroys the psychological advantage of the whole thing, I don't know!
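
A back-of-the-envelope sketch of that signal-to-noise point (the 1% true rate, the n=1000 sample, and the 100-sided-die setup of 1/100 forced "yes" plus 1/100 forced "no" are all assumptions of mine):

  from math import sqrt

  def corrected_se(true_rate, p_forced_yes, p_forced_no, n):
      t = 1 - p_forced_yes - p_forced_no   # fraction answering truthfully
      m = p_forced_yes + t * true_rate     # expected measured "yes" rate
      # Standard error of the corrected estimate: the binomial noise on
      # the measured rate, inflated by the 1/t rescaling.
      return sqrt(m * (1 - m) / n) / t

  n = 1000
  print(corrected_se(0.01, 1/6, 1/6, n))    # ~0.018: noise bigger than the 1% signal
  print(corrected_se(0.01, 0.01, 0.01, n))  # ~0.0045: noise well below the signal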


I'm concerned that some of the respondents wouldn't quite have understood the rules of the game, and that this might throw the results off significantly.


You could always ask them other questions you already know the answers to in order to measure that.


That would only be a possible approach, even theoretically, if you had questions that you knew this group of people were exactly as likely to lie about as killing protected leopards. How could one know which questions these would be? One couldn't. Despite this problem, I would still like to see such calibration questions added to each round, such as "Does astrology work for you?" and "Have you ever been taken by aliens?" The results for these questions would give some useful data for comparison.


You're assuming they lie. I'm addressing the worry that they might not understand the rules.


All right, that is a very good question to bring up. Rules like those described are not as straightforward for ordinary people to understand as they may seem to the experimenter. I had to read it a couple of times to understand that the idea was to tell the truth on 2-5 and give a fixed response on 1 or 6. Surely some participants didn't understand, and the number among rural farmers is going to be different from the number among the western college psychology students on whom most of these sorts of tests are developed and calibrated.

6 is an unlucky number, associated with the devil, in some cultures. Other cultures have feelings about 1 (unity), 2 (dualism), 3 (trinity), 4 (unlucky in Chinese culture, a sacred number in India) and 5 (witchcraft). When talking of small effects, there may be emotions in play that vary from person to person depending on education, IQ, cultural background and environment. This can skew results in a way that depends on the particular population tested. A person from a culture that believes 6 is an evil number and a bad omen may be very slightly more likely to change their response on a 6, seeing it as a warning.

One can't take a test like this, which is quite subtle and looks for tiny effects within noise that comes not from rocket engines but from subjective human reactions, dump it on any population, and assume results calibrated on a different population are a valid interpretation. That would have to be shown first, in cross-cultural comparison testing.



