One must ask if the investigators of this study selected the correct faculty for examination. (I kid, this is embarrassing)
As a mathematician, I was primed with the knowledge that a large fraction of a mathematics department failed this test. I looked it up on Wikipedia, didn't spoil the answer, and thought damn hard before unfolding the "solution" section. I was relieved to see my answer therein. I do wonder if the students and staff were primed to think about this as a logic puzzle, or if they simply went with a gut answer. Because in my experience, that makes loads of difference in how people of all stripes, mathematicians included, respond to challenging questions.
My gut response was to flip an extra card, for what that's worth. Secondary consideration took a couple of seconds, and I spent another thirty convincing myself that I was correct.
I think we're looking at this wrong. This test seems designed to investigate social biases, not to test logical skill, and if these people are failing it, it's not so much a failure in their understanding of logic as an artifact of the way the question is framed - which is probably precisely why "reframing it in a social context" changes the result populations. I think this test is extremely sensitive to how you pose the question.
Are we trying to test whether the candidate can solve the logic problem, or are we trying to test how they handle an /intentionally confusing/ situation and what (psychological) biases they jump to in their solution?
If it's a test of their logic capabilities, then the numbers seem artificially low, so maybe it's not as embarrassing as you say... Reason being, I think there are several confounding variables in the results that they'd need to control for if that were the point.
An obvious one: if we were testing logic directly, I wonder whether participants were allowed to "show their work" rather than just submit their final choice of cards. That would eliminate the "carelessness" confounder, where someone didn't thoroughly think through all of the logical cases of the cards, or accidentally included an incorrect card but understood the nature of the required solution. I.e., if they knew they needed to disprove rather than confirm, but accidentally included a card that's useless for disproving, they still understood how to solve the problem, and thus the logic. What percentage of the results falls into that bucket?
There are also other confounding factors set up to "confuse" the participant that could be removed if we wanted to truly test /logical skill/ and /not/ some psychological/sociological property. For example, the question merely says: "test that if a card shows an even number". In English, "if" can be read as a one-way implication or as "if and only if" depending on context - it's needlessly vague - and additionally, I posit that given the common usage of the phrase "test ... if", the wording is /leading/ the participant to look for /positive confirmation of the rule/ rather than the negative. You can of course derive that the negative test is needed by studying the cards, but why mislead them outright? Why not say "choose the set of cards you'd need to flip to prove the rule is false"? That clearly states the task and doesn't send them on a wild goose chase.
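To make the "disprove, don't confirm" point concrete, here's a quick brute-force sketch in Python. It assumes the standard Wikipedia version of the task (visible faces 3, 8, red, brown; rule: "if a card shows an even number on one face, then its opposite face is red") and a toy universe of possible hidden faces - those are my assumptions, not anything from the study. It searches for the smallest set of flips that settles the rule in every possible world:

    from itertools import product, combinations

    VISIBLE = ["3", "8", "red", "brown"]
    NUMBERS = ["3", "8"]        # toy universe of number faces (an assumption)
    COLOURS = ["red", "brown"]  # toy universe of colour faces (an assumption)

    def hidden_options(face):
        # A number face hides a colour, and a colour face hides a number.
        return COLOURS if face in NUMBERS else NUMBERS

    def violates(number, colour):
        # "Even => red back" is falsified only by an even number
        # paired with a non-red colour.
        return int(number) % 2 == 0 and colour != "red"

    def rule_holds(world):
        # world[i] is the hidden face of card i.
        for visible, hidden in zip(VISIBLE, world):
            number, colour = (visible, hidden) if visible in NUMBERS else (hidden, visible)
            if violates(number, colour):
                return False
        return True

    # Every possible assignment of hidden faces.
    WORLDS = list(product(*(hidden_options(f) for f in VISIBLE)))

    def sufficient(flips):
        # A flip set settles the question iff any two worlds that agree on
        # the flipped cards' hidden faces agree on whether the rule holds.
        verdicts = {}
        for w in WORLDS:
            key = tuple(w[i] for i in flips)
            if verdicts.setdefault(key, rule_holds(w)) != rule_holds(w):
                return False
        return True

    # Report the smallest sufficient flip sets.
    for size in range(len(VISIBLE) + 1):
        winners = [c for c in combinations(range(len(VISIBLE)), size) if sufficient(c)]
        if winners:
            for c in winners:
                print("flip:", [VISIBLE[i] for i in c])
            break

It prints flip: ['8', 'brown'] - exactly the two cards that could /disprove/ the rule. The 3 and the red card can never falsify it, so flipping them is pure confirmation-seeking, and the same answer falls out if you enlarge the universe of numbers and colours.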
There are other things too. It doesn't say whether these cards come from some larger set or the rule is only meant to be tested on the 4 cards presented. It implies the latter, but if you start thinking about "confirming the rule is true for all cards", it sends you down another useless logical rabbit hole - yet /cards normally come from a deck in real life/, so it's natural to expect there are more cards. Maybe if they wanted to be exact they shouldn't be using cards at all, but wooden blocks or something.
And I'm sure there are more "biases" that I'm not catching here. If your goal is to test people's likelihood of being affected by certain psychological biases, then all's well and good with the test, go right ahead. But if you're going to present the poor results as some sort of indicator of a population's skill at logic, it's maybe not the best test without some better procedures, imo.