
Knowledge from small number of debates outperforms wisdom of large crowds (2017) - Dowwie
https://arxiv.org/abs/1703.00045
======
hobofan
I would say that this has been known for quite some time in philosophy, but I
guess it's good to have some real-life verification for it. This article on
Belief Merging and Judgement Aggregation[0] is a good entry point for the
field, for anybody that is interested.

[0]: [https://plato.stanford.edu/entries/belief-merging/](https://plato.stanford.edu/entries/belief-merging/)

~~~
watersb
Off-topic, but I love the logo at the upper-left corner of the
plato.stanford.edu web pages.

Stanford has an amazing collection of sculptures by Auguste Rodin, including
his iconic "The Thinker". If you have business in the Silicon Valley area or
are just visiting, it's worth exploring the Stanford campus and its Rodin museum.

[https://en.m.wikipedia.org/wiki/Auguste_Rodin](https://en.m.wikipedia.org/wiki/Auguste_Rodin)

------
ajuc
Am I correct that they asked the same people the same questions 3 times?

It's no surprise that the 3rd time the answers were the best. I'd expect this
to happen even if there was no debate (if not to the same degree).

> Each participant was provided with pen and an answer sheet linked to their
> seat number. The event’s speaker (author M.S.) conducted the crowd from the
> stage (Fig. 1A). In the first stage of the experiment, the speaker asked
> eight questions (Supplementary Table 1) and gave participants 20 seconds to
> respond to each of them (stage i1, left panel in Fig. 1A). Then,
> participants were instructed to organize in to groups of five based on a
> numerical code in their answer sheet (see Methods). The speaker repeated
> four of the eight questions and gave each group one minute to reach a
> consensus (stage c, middle panel in Fig. 1A ). Finally, the eight questions
> were presented again from stage and participants had 20 seconds to write
> down their individual estimate, which gave them a chance to revise their
> opinions and change their minds (stage i2, right panel in Fig. 1A).
> Participants also reported their confidence in their individual responses in
> a scale from 0 to 10.

~~~
thenaturalist
I can’t follow your reasoning. How does asking you the exact same question
twice improve your accuracy? If you don’t know how high the Eiffel Tower is,
why would you know it the second time? Conversely, you likely created a mental
anchor when giving an estimate the first time and would be hard pressed to
provide an answer contradicting your first estimate - even if it might be more
accurate.

There were no answers provided between asking the questions as far as I
understand this excerpt.

In the group stage only 4 of the 8 questions were asked, and that is where a
difference in accuracy can arise. Maybe you’re grouped with travelers who recently visited Paris,
historians or people who due to other circumstances or pure luck provide more
accurate estimates, effectively influencing the anchor of your first estimate.

~~~
ajuc
Given more time I can remember something, or I can notice that the last
question gives some hints or even unrelated associations that help to answer
the first question.

I don't think "if you don't know the answer in 20 seconds you won't ever know
it" is true.

~~~
thenaturalist
As far as I understand it, these were not simple right/ wrong questions, but
questions about estimating a continuous measure (height, age, percentages). I
have no reason to believe that questions were related or would give hints.
Such a correlation would destroy the power of the experiment.

As far as I understand it, this paper is not about "Do you know what is true
or false in 20 seconds" but "what is a value you confidently estimate within
20 seconds". This is a field much studied in psychology and when you look into
Kahneman and associated research I would be surprised to find any scientific
evidence that time improves your estimate. I'm not saying it's impossible, but I
am confident that - on average - it simply does not happen.

Kahneman showed we're full of biases and this research shows that calibrating
ourselves with others is a much stronger predictor of improved estimation
accuracy than time.

~~~
kortilla
Wow, that’s a terrible way of conducting an experiment. Having more than 20
seconds to reason through an estimation will produce a much better result if
you are any kind of systematic thinker.

In 20 seconds for the Eiffel Tower I’ll just pull a number out of my ass. In 5
minutes I will think through the comparison charts where it shows up next to
other high rises. I’ll remember the half scale one in Las Vegas and its
relative height to the Bellagio across the strip (about the same) and that the
Bellagio was about 40 stories. Given 40 stories at 13 feet per floor, you get
520 ft * 2 = 1040 ft for the Eiffel Tower.
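
The back-of-the-envelope reasoning above can be written out as a short Python sketch. The inputs (40 stories, 13 ft per floor, a half-scale replica) are the commenter's recollections rather than measured values, and the function name is purely illustrative.

```python
FEET_TO_METERS = 0.3048  # exact, by definition of the international foot


def estimate_eiffel_height_ft(stories=40, feet_per_story=13, replica_scale=0.5):
    """Fermi estimate: guess a reference building's height, then scale up the replica."""
    reference_height_ft = stories * feet_per_story  # hotel guess: 40 * 13 = 520 ft
    # The replica is assumed to be about as tall as the hotel and half the size
    # of the original, so divide by the scale to get the full-size tower.
    return reference_height_ft / replica_scale


height_ft = estimate_eiffel_height_ft()
height_m = round(height_ft * FEET_TO_METERS, 1)
print(height_ft, height_m)
```

Every input is a named parameter, so it is easy to see how sensitive the result is to each guess (for example, 12 ft per floor instead of 13).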

------
amelius
Perhaps we can save democracy by replacing the voting mechanism: place
people in groups of 10 and let them reach consensus before casting a vote.

~~~
AnthonyMouse
That was how the US Senate originally operated. You would elect your state
legislature (where your vote is _much_ less diluted than at the federal level)
and they would elect your Senator.

Early 20th century populism took that away. It also meant that the states now
have no representation in the federal legislature, which led to an almost
immediate federal takeover of basically everything.

~~~
axiak
That amendment really screwed democracy in the US. Not only does it remove
room for debate, but it has IMHO been a major factor in lessening voter
interest in the state legislature. This has far-reaching consequences, such as
making it harder to make new amendments and checks against gerrymandering.

------
ivanmaeder
I've only read the summary but Philip Tetlock comes to this conclusion based
on his work with "The Good Judgment Project" (described in his book
"Superforecasting").

The GJP is a kind of experiment he's been running for a few years in an
attempt to learn how to improve predictions.

[https://en.wikipedia.org/wiki/The_Good_Judgment_Project](https://en.wikipedia.org/wiki/The_Good_Judgment_Project)

From what I remember: overall, teams did better than individuals and wisdom of
the crowds because they were able to feed off and combine each other's points
of view and separate knowledge.

However! It's important for teams to not let groupthink dominate—individuals
within teams needed to challenge each other.

------
hkt
Deliberative democracy is great. Put people in enriched decisionmaking
environments for better outcomes, _and then_ get greater democratic legitimacy
from doing so than even some representatives? Win win.

This is the opposite of populism too: when we voted Brexit in the UK, it was
an uncontrolled, information-scarce (and falsehood-rich) environment. A room
where people were organised and where they asked for their own experts and
talked to one another would never have delivered that result.

~~~
sbhn
People in the UK voted for Brexit because it came just after the UK, US and
France were dropping bombs in Libya and Syria. Arab Spring, they called it. The
destruction of infrastructure created millions of refugees, and the BBC amplified
the importance and strength of ISIS and flashed images of thousands of
refugees in Calais tripping over each other trying to illegally cross the
Channel into the UK. The good souls of England thought they were under attack,
bless 'em.

------
sytelus
You can make a case that democracy is a grand application of the “wisdom of
large crowds”. But if that can be easily and consistently outperformed, then do
we have a better political system than democracy? What are the consequences of this?

~~~
mtgx
That would be _direct democracy_. No "democracy" on Earth is actually a direct
democracy; they are representative democracies, which seems more in line with
what this study proposes, no? A small group of elected officials to "debate"
issues.

~~~
Askolein
Switzerland is close to a direct democracy, with a model that is
representative by default but direct whenever the people
feel a subject should be handled that way. That system has proven very stable.

~~~
hobofan
> That system has proven very stable.

Caveat: with a small, well-educated population. Try and apply the same system
elsewhere and your mileage may vary.

~~~
GarvielLoken
Then the obvious solution is to divide nation states down to small, well-
educated populations no?

~~~
arethuza
Say about 6 million people? On average?

[I'm a Scot - though that's not where I got the number from]

------
denzil_correa
We intuitively do know this and mankind has used different variations of this
concept throughout human civilization. Sortition has been used for more than
2000 years to come up with fair representation of the population in governance
[0]. A more contemporary example would be the jury system in the US. But, nice
to have experimental results on this concept.

[0]
[https://en.wikipedia.org/wiki/Sortition](https://en.wikipedia.org/wiki/Sortition)

------
notahacker
There's a big unanswered question about whether "debate" is quite so useful at
improving accuracy of answers when it consists of an audience with strong
priors listening to motivated reasoning rather than people deferring to the
people who are most confident they have a decent grasp of the subject of a
neutral trivia question. I think it's conceivable the opposite effect might
occur if the experiment were to be repeated using polarising political issues,
even if the questions themselves were fact-based (economic growth rates, crime
rates, immigration figures, global temperature changes etc).

~~~
patcon
In the same experiment (but still unpublished) they actually find evidence
that a similar CONSENSUS effect is observed in polarized issues with no clear
"best" choice.

[https://www.ted.com/talks/mariano_sigman_and_dan_ariely_how_...](https://www.ted.com/talks/mariano_sigman_and_dan_ariely_how_can_groups_make_good_decisions/transcript?language=en)

The most interesting bit to me is the fact that a rare middle position
stakeholder, a "high confident gray", allows groups to reach consensus more
often. This potentially has huge applications in system design of the
processes of democracy imho

------
baud147258
It reminds me of that assertion that the IQ of a crowd is the lowest IQ
divided by the number of people in the crowd. Seems the study validates this,
in that smaller groups perform better

~~~
prewett
Except that the previous study refuted that idea: the average answer was
better than individual answers, which is definitely not IQ_crowd = IQ_min / n.

------
patcon
This TED video tuned me into this research earlier in 2018, and while not
rigorously tested, there are some even more interesting hypotheses that are
being sussed out of the experiment:

[https://www.ted.com/talks/mariano_sigman_and_dan_ariely_how_...](https://www.ted.com/talks/mariano_sigman_and_dan_ariely_how_can_groups_make_good_decisions/transcript?language=en)

------
sova
The line to know: "Remarkably, combining as few as four consensus choices
outperformed the wisdom of thousands of individuals." Confer away!

------
sharemywin
The aggregation of many independent estimates can outperform the most accurate
individual judgment. This centenarian finding, popularly known as the wisdom
of crowds, has been applied to problems ranging from the diagnosis of cancer
to financial forecasting. It is widely believed that social influence
undermines collective wisdom by reducing the diversity of opinions within the
crowd. Here, we show that if a large crowd is structured in small independent
groups, deliberation and social influence within groups improve the crowd's
collective accuracy. We asked a live crowd (N=5180) to respond to general-
knowledge questions (e.g., what is the height of the Eiffel Tower?).
Participants first answered individually, then deliberated and made consensus
decisions in groups of five, and finally provided revised individual
estimates. We found that averaging consensus decisions was substantially more
accurate than aggregating the initial independent opinions. Remarkably,
combining as few as four consensus choices outperformed the wisdom of
thousands of individuals.
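
For readers who want to see what the comparison in the abstract looks like mechanically, here is a minimal Python sketch. It assumes simple averaging as the aggregation rule, and every number in it is invented for illustration; none of it is data from the paper. Consensus values are passed in as inputs because they come out of discussion and cannot be computed from the individual answers.

```python
from statistics import mean


def aggregate_error(estimates, truth):
    """Absolute error of the simple average of a set of estimates."""
    return abs(mean(estimates) - truth)


# Hypothetical numbers, invented purely to show the mechanics.
truth = 100.0
individual_estimates = [60, 75, 90, 120, 150, 200, 80, 70, 95, 180]
consensus_estimates = [95, 110, 90, 105]  # one value per hypothetical group

individual_error = aggregate_error(individual_estimates, truth)
consensus_error = aggregate_error(consensus_estimates, truth)
print(individual_error, consensus_error)
```

The study's claim is about which of these two errors tends to be smaller on real answers; the toy inputs here say nothing about that either way.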

~~~
sharemywin
I wonder how that would compare to having people attach a confidence to their
guess and using a weighted average.
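
A confidence-weighted average could look like the Python sketch below. It assumes the 0 to 10 confidence scale mentioned in the excerpt quoted elsewhere in this thread; the function name, the zero-weight fallback, and the sample numbers are all hypothetical choices, not anything taken from the paper.

```python
def confidence_weighted_mean(estimates, confidences):
    """Average of estimates, each weighted by its self-reported confidence."""
    if len(estimates) != len(confidences):
        raise ValueError("need exactly one confidence per estimate")
    total_weight = sum(confidences)
    if total_weight == 0:
        # Nobody expressed any confidence: fall back to the plain mean.
        return sum(estimates) / len(estimates)
    weighted_sum = sum(e * c for e, c in zip(estimates, confidences))
    return weighted_sum / total_weight


# Hypothetical answers (in meters) with 0-10 confidence ratings.
estimates = [250, 300, 400]
confidences = [2, 8, 0]
weighted = confidence_weighted_mean(estimates, confidences)
print(weighted)
```

A rating of 0 drops that answer entirely, so an overconfident outlier can still dominate; whether this beats group consensus is an empirical question.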

------
ebetica0
I wonder how much of a role statistical independence plays in this, e.g. the large
crowd can be biased when they influence each other in a way that the small
debates cannot. What happens when you have small independent debates versus
large independent crowds?

------
PaulHoule
We had planning poker introduced to my team and within two months or so we got
to the point where most of the time we all get the same number or get
something like 8-13-21.

------
theontheone
I think attributing the accuracy to debates is a rather hasty conclusion. I
would love to see an experiment where the individual participants rated their
confidence from 1 to 10, and only the highest confidence answer in each group
was taken. My hunch is this would perform just as well (if not better) than
post-discussion.
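
The alternative proposed here can be sketched in a few lines of Python: take the most confident member's answer from each group, then average those picks across groups. The data layout, the tie-breaking rule, and the sample values are assumptions made for this sketch, not part of the study.

```python
def most_confident_answer(group):
    """Return the estimate of the member reporting the highest confidence.

    `group` is a list of (estimate, confidence) pairs. Ties go to the first
    such member, an arbitrary choice made for this sketch.
    """
    estimate, _confidence = max(group, key=lambda member: member[1])
    return estimate


def aggregate_most_confident(groups):
    """Average the most-confident answers across all groups."""
    picks = [most_confident_answer(group) for group in groups]
    return sum(picks) / len(picks)


# Hypothetical groups of (estimate, confidence) pairs.
groups = [
    [(250, 3), (320, 9), (400, 5)],
    [(310, 7), (280, 7), (500, 1)],
]
result = aggregate_most_confident(groups)
print(result)
```

Running this on the same answers as the post-discussion consensus values would be one way to test the hunch.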

------
reggieband
Whenever I come across such ideas it makes me want to more deeply investigate
the Communicative Rationality [1] of Jurgen Habermas. Despite modern trends
towards a conservative stoicism (e.g. Jordan Peterson) I actually think Social
Critical Theory is moving in the more correct direction. However, the
current negative association with moral-relativism and postmodernism is a new
McCarthyism. The bridge should be pragmatism but even that is too closely
associated with socialism.

Too many isms but the ones that are scaring me most now are authoritarianism,
fascism and totalitarianism. Ideas like communicative rationality feel to me
like the most reasonable solutions.

[1]
[https://en.wikipedia.org/wiki/Communicative_rationality](https://en.wikipedia.org/wiki/Communicative_rationality)
[2]
[https://en.wikipedia.org/wiki/Critical_theory](https://en.wikipedia.org/wiki/Critical_theory)

