A - "Philadelphia is the capital, and others will agree."
B - "Philadelphia is the capital, but most others won't know that."
C - "Harrisburg is the capital, and others will agree."
D - "Harrisburg is the capital, but most others won't know that."
This technique eliminates groups A and C from consideration, and measures the difference in size between groups B and D.
Both groups B and D think they know something other people don't, but B is wrong and D is right.
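The elimination-and-compare step described above (drop groups A and C, then weigh B against D) is easy to mechanise. A minimal Python sketch, with made-up respondent data; each respondent gives their answer plus whether they expect most others to agree:

```python
from collections import Counter

# Hypothetical poll data: (own answer, "will most others agree with me?").
# The groups A-D from the comment above map onto these pairs.
responses = [
    ("Philadelphia", True),   # group A
    ("Philadelphia", True),   # group A
    ("Philadelphia", False),  # group B: wrong, expects others to disagree
    ("Harrisburg", True),     # group C
    ("Harrisburg", False),    # group D: right, expects others to disagree
    ("Harrisburg", False),    # group D
    ("Harrisburg", False),    # group D
]

def surprisingly_popular(responses):
    # Drop everyone who expects agreement (groups A and C), then compare
    # the sizes of the remaining dissenting groups (B vs D).
    dissenters = Counter(ans for ans, expects_agreement in responses
                         if not expects_agreement)
    return dissenters.most_common(1)[0][0]

print(surprisingly_popular(responses))  # -> Harrisburg (D outnumbers B)
```

In the published version of the technique respondents actually predict the full answer distribution rather than giving a binary agree/disagree; the binary version here just mirrors the comment's framing.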
In cases where people feel like they have "inside" knowledge, generally speaking it's because they are correct and knowledgeable (group D), not because they are misled (group B).
You often arrive in group D by virtue of having been in group A and then learning the actual truth of the matter.
You can arrive in group B by falling prey to a conspiracy theory, which perhaps invalidates the technique in such cases. I wouldn't be surprised if the question "Do vaccines cause autism?" had a surprisingly popular answer of "Yes".
I can't say what such a poll would show, but I feel fairly confident that any sort of "surprising popularity" measure is no guarantee of truth, only an after-the-fact excuse to explain a result.
As you say, for some questions the "surprisingly popular" answer could just as easily be dead wrong.
Sometimes the experts turn out to be right, other times wrong. Vaccines, climate change, and current politics could all show a strong effect here and still be wrong.
>is a wisdom of the crowd technique that taps into the expert minority opinion within a crowd.
Yes you did. :)
If you give an answer but it's a guess, you'd be less likely to say people agree with you.
"My best guess is X but I believe most people would contradict me" suggests that you consider yourself an expert who knows better than most people.
We live in cultural bubbles, tending to cluster close to people like ourselves: people who speak the same language, hold similar points of view, share the same economic level, carry the same culturally ingrained answers to the same problems, eat the same food cooked the same way, and are exposed to the same advertising, popular songs, sports teams, and ideological propaganda. People make most of their friends at the same schools, being taught the same things by the same teachers.
As a result, we are notoriously bad at guessing what other people think... outside our cultural bubble.
If you ask people questions like "Would your neighbors enjoy a pork barbecue?", "Are these mushrooms poisonous?", or "Is it OK to put soap directly in your bath water?", they can provide a reliable answer, but the results can't be extrapolated outside the cluster. They are useless for finding the truth. All cultures choose to ignore big chunks of human knowledge or do some things plain wrong. A poisonous mushroom can be made edible by an obscure cooking technique spread culturally. Many Jews will dislike pork. Many Japanese will not find it OK to put soap in the bath. The majority of people on the planet don't know, and will not care, what the capital of Guangdong, Entre Ríos, or Pennsylvania is.
If your starting variables are unconstrained (the people answering the question are anonymous and drawn from an unknown pool), the technique is a sociological dice roll that returns different results each time (AKA pseudoscience).
Wouldn't you get the same result simply by saying "let's just consider the answers from people who said others would respond differently because they're more likely to be right"? That's more intuitive to me and I believe it's effectively the same operation.
First order: "How many jelly beans are in this jar?"
Second order: "What percentage of people will get the wrong answer to this question about Pennsylvania's capital?"
There's also the interplay with the Keynesian beauty contest:
In some sense the stock market is an infinite regress of these "what will other people think about value..." questions.
When Buffett was starting out in the 1950s and 1960s, shareholders had stronger rights and many shares did pay dividends.
The worst case scenario would be gaining a controlling share and forcing some sort of payback.
Today there are many classes of shares which are disturbingly close to buying random cryptocoin.
Think about buying non-voting shares of Google, Facebook, Zynga, etc. Or how about buying whatever that travesty of an Alibaba ADR is? What are you actually getting when you buy Alibaba in the US?
What are you getting really when you buy those kind of shares?
You have very little hope of getting dividends.
You have no hope of getting controlling shareholders to give up their controlling "some animals are more equal" shares.
So there is very little intrinsic value in those.
Thus the only value is the "castles in the air" valuation.
However, for those with fear of that situation, there are saner equities on the market.
You don't need lots of potential buyers for liquidity, only one for what you're trying to sell. It's why very low volume secondary markets can function and provide liquidity.
A lot of people will also obviously invest into things even when they do not consider them safe investments. If the potential return is thought to be high enough, large numbers of people will routinely invest into unsafe investments with full awareness that it's at least somewhat dangerous (whether stocks, crypto, real-estate, schemes, etc).
(In particular, I'm referring to the assertion at the end of the article: "Because of the relatively high margin of 10%, there can be high confidence that the correct answer is No.")
The MIT summary notes: "The researchers first derived their result mathematically, then assessed how it works in practice, through surveys spanning a range of subjects, including U.S. state capitals, general knowledge, medical diagnoses by dermatologists, and art auction estimates." Across all those areas, this technique had error rates about 20% lower than those of competing techniques, which ranged from simple majority vote to two different kinds of confidence-weighted scoring.
The paper earned publication in the prestigious journal Nature.
Prepare two equally sized ensembles of classifiers, let's call them A and B.
1. Train each classifier in ensemble A on labelled data to predict whether a picture contains a cat.
2. Take some other, unlabelled dataset and collect the answers of the classifiers in A for each picture in it.
3. Train each classifier in ensemble B to predict the average answer of the classifiers in A for each picture in the unlabelled dataset.
Then, for a picture from the test dataset, you can get answers from ensemble A and from ensemble B and calculate the surprisingly popular answer.
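A minimal sketch of those three steps. This is a toy: one-dimensional threshold "classifiers" stand in for image models, the nearest-neighbour regressors in ensemble B are an arbitrary choice, and all data is synthetic. For a binary question, the surprisingly popular rule reduces to "cat" iff its actual popularity among A exceeds the popularity B predicted:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for "pictures": one feature per item; true label 1 ("cat")
# iff the feature is positive. Everything here is invented for illustration.
def make_data(n):
    x = rng.normal(size=n)
    return x, (x > 0).astype(int)

x_lab, y_lab = make_data(200)   # step 1: labelled training set
x_unlab, _ = make_data(200)     # step 2: unlabelled set (labels unused)
x_test, _ = make_data(50)

# Ensemble A: threshold classifiers fit on the labelled data. Each picks,
# from a handful of random candidates, the threshold with lowest training error.
def fit_threshold(x, y):
    candidates = rng.normal(scale=0.5, size=20)
    errors = [np.mean((x > t).astype(int) != y) for t in candidates]
    return candidates[int(np.argmin(errors))]

A = [fit_threshold(x_lab, y_lab) for _ in range(15)]

def vote_A(x):
    # Fraction of ensemble A answering "cat" for each input.
    return np.mean([(x > t).astype(int) for t in A], axis=0)

# Ensemble B: models trained to predict A's average answer on the unlabelled
# set. Here each is a 1-nearest-neighbour regressor on a bootstrap resample.
target = vote_A(x_unlab)

def fit_B():
    idx = rng.integers(0, len(x_unlab), size=len(x_unlab))  # bootstrap sample
    xs, ts = x_unlab[idx], target[idx]
    return lambda x: ts[np.abs(xs[:, None] - x[None, :]).argmin(axis=0)]

B = [fit_B() for _ in range(15)]

predicted = np.mean([b(x_test) for b in B], axis=0)  # B's guess at A's vote
actual = vote_A(x_test)                              # A's actual vote

# Surprisingly popular answer per test picture: "cat" exactly when its
# actual vote share among A exceeds the share ensemble B predicted for it.
sp_answer = (actual > predicted).astype(int)
print(sp_answer[:10])
```

Whether this buys you anything over just using ensemble A presumably depends on A's errors being systematic enough for B to learn, which this toy doesn't demonstrate.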
Q: Is the earth flat?
Yes: 10% - 3% = 7%
No: 90% - 97% = -7%
The earth is flat, I guess ;)
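For what it's worth, the mechanical computation being mocked here looks like this (the 10%/3% figures are the commenter's hypothetical, not real poll data):

```python
# Actual answer shares vs. the shares respondents predicted, from the
# hypothetical figures above.
poll = {
    "Yes": {"actual": 0.10, "predicted": 0.03},
    "No":  {"actual": 0.90, "predicted": 0.97},
}

# The surprisingly popular answer is the one whose actual share most
# exceeds its predicted share.
surprise = {ans: v["actual"] - v["predicted"] for ans, v in poll.items()}
sp_answer = max(surprise, key=surprise.get)
print(sp_answer)  # -> Yes: mechanically, "flat" wins by a 7-point surprise
```

Which is exactly the comment's point: if the premise (10% answering yes) were real, the rule would crown the fringe answer.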
Q. Is the earth flat?
Conclusion: the earth is not flat.
Not sure where you're getting the "10% believe the earth is flat" from, but if that's just your guess then it's pretty reasonable for people to say they think 10% of people will respond yes to the question.
It's not about the exact percentage of flat-Earthers (be it 1% or 0.1%); it's about which question gets the higher "no" percentage.
And it should be Q2 because
Normal people will answer "No" to both questions.
Flat-Earthers will answer "Yes" to the first question. Most of them would, however, answer "No" to the second, because they're well aware they are a minority on this topic (think "wake up, sheeple!").
Or did you interpret the question as suggested by bena in his sibling answer, ie as an estimate for the answer distribution (that would then have to be averaged) instead of a choice of the most popular answer (as was done in the wiki article)?
That does seem realistic considering how often the topic of flat-earthers comes up in discussions on social media, despite it being an extremely niche view.
Doesn't seem realistic to me.
I'm not sure that "yes" implies the person asked believes the Earth is flat. I would probably be confused if someone asked me such a dumb question (why?), and I might answer "yes" just to make the situation obscure for the asker, not just for me.
I think OKCupid got numbers like this for that question, or for something like "Is the Earth bigger than the Sun?", which I attribute to the smartass constant, or rather to the fact that dumb questions elicit dumb responses.
But, interestingly, there may be a symmetry between people who overestimate the amount of creationists and creationists who underestimate their numbers.
So maybe that technique is actually valid. It spots an asymmetry in people who think the majority is wrong on something.
It could just as easily be crazy answers from crazy people.
* each side believes that those who disagree are wrong, not just mistaken in their remembrance of a fact.
* each side is well-informed about which view is more popular
The fact that everyone is willing to stick to their answer despite everyone knowing what "most people" think means that an answer's being surprisingly popular doesn't bear either way on its being right. The test is ideal for simple questions of fact which are not disputed.
As you said, knowingly going against the consensus does not an expert make.
The correct answer to the second question is "No". That is what most people would say. So technically, if we saw Yes: 0, No: 100, that would just mean everyone knows what the popular choice is.
What they want to ask is "How many people would say the answer is 'No'?" Then, if the answers are far afield, we can say whether or not an answer is surprisingly popular.
So in the article and with your example, the second question isn't actually a reflection of the proportions of the first. They're kind of disconnected.
Still, the article says this:
Because of the relatively high margin of 10%, there can
be high confidence that the [surprisingly popular answer] is No.
One, I'm sure that nowhere near 10% of an educated population believes in flat-earthism.
Two, I think this technique is certain to yield the correct answer quite a bit more often than it will yield an incorrect one. Since it claims to achieve "high confidence" and not "absolute confidence" I think it would still be a pretty valuable metric in many instances.
I guess this technique may also detect facts that are the subject of common conspiracy theories.
Perhaps the number of bets is analogous to what people believe, while total money bet is analogous to what people think others believe. If so, the surprisingly popular answer is the one with the smallest average bet.
The obvious analog in the stock market is average order size, but this isn't a generally-available statistic, and it's not clear to me whether it would be meaningful.
Volume (shares traded per time unit) is closely-watched, however, and several technical analysis "indicators" compare price changes to volume.
High liquidity is a state of relatively small price changes per unit of trading volume. It generally means traders are not paying large premiums/discounts relative to previous trades. In a way, this corresponds to smaller bets on future returns, and thus, to surprisingly popular returns...
There are similar concepts in modern portfolio theory, such as Amihud illiquidity.
So the assumption is that people using this technique could infer the correct answer by finding the surprisingly popular one, not just make an observation about what the population thinks.
But it would probably help in cases where popular opinion is entirely misinformed about the subjective question, not having any basis other than (already misinformed) hearsay on which to form their own subjective opinion.
So, for example, if there was a musician who had an absolutely terrible song that somehow became the song they were best known for (being a "one-hit wonder" whose song wasn't really a "hit"), the public might believe that that song is their best song, since it's the only song of theirs the public has ever heard of. Experts (i.e. people who have heard more than the one song of theirs), on the other hand, would tend to agree that it's certainly not their best song.
(Given that example, I'm inclined to suggest that you could use this algorithm to determine when people are being judged overly-harshly for things, e.g. whether to ban someone from a website just because they've received a lot of reports about that person's behavior.)
1. Who will you vote for?
2. Who do you think will win?
I don't know if we can learn anything from the surprisingly popular answer here, but now I wonder if question 2 would be a better predictor of the election result. But it will be tainted by previous polls and news.
It doesn't really matter here that people answered incorrectly.
There is this core part of the unpopular-but-correct answerers that knows that the incorrect answer is popular.
The larger this core part is, the higher the "score" of the "surprisingly popular" answer is. It's about finding something that is "mistakenly popular".
Saying that you believe the popular option is different to yours is a display of high confidence in your own answer, and low confidence in other people's knowledge.
Dunning-Kruger only deals with highly confident wrong people, which isn't the interesting group in our surprisingly popular measurements.
Bringing Dunning-Kruger into it would make more sense if there were an "I'm not 100% confident" option, but instead it was an absolute dichotomy.
It doesn't really mean that all of the people answering incorrectly are completely confident and therefore Dunning-Krugering. Someone who is not overconfident, at 60% confidence, adds the same "yes" data point that someone who is 100% confident and wrong does.
* Do you think Trump will win the US Presidential election?
* Do most voters think Trump will win?
The result of moving the capital to Philadelphia would surely strain the legitimacy of the state government even more.
Moving it to Pittsburgh would accomplish the same exact thing except it would be arguably even less just.
There are any number of reasons why keeping it in Harrisburg is imperfect but it's probably the least bad solution.
Yes, the distance between respective city centers is farther than those others, but that makes it more inclusive.