
Show HN: Guess which government programs work - robertwiblin
https://80000hours.org/articles/can-you-guess/
======
robertwiblin
I'm one of the creators of the quiz!

Note that the projects were _not_ chosen based on their results, and certainly
not because the results are counterintuitive. Rather they are the ten best
understood social programs we could find.

You can read about the findings of our research into whether people can guess
what works and what doesn't ahead of time:
[http://www.vox.com/2015/8/13/9148123/quiz-which-programs-work](http://www.vox.com/2015/8/13/9148123/quiz-which-programs-work)

~~~
NateLawson
Hey, this is really neat. It reminds me of the lesswrong.com community, where
the goal is to figure out ways of offsetting cognitive biases.

One thing I've always wanted to see is a mandatory A/B test built into any new
legislation. That is, funding for a study after X years of the new law or
program being in place would have to be part of the law. It could even
include a threshold for the desired controlled effect of the law, or else it
is immediately revoked at that point.

This way we'd force people to argue not about 2nd order issues ("I think
mandatory sentencing is good" vs. "No, I think it's bad") when they could
agree on first order issues ("We all want to see a 10% decrease in robberies
within 3 years.") This way a bad program would automatically expire if it
didn't meet its goals, and the goals would have to be open enough that people
could throw out bad laws ("Goal is to incarcerate 20% of the population"
wouldn't pass, no matter what your political leaning).

~~~
asgard1024
> One thing I've always wanted to see is a mandatory A/B test built into any
> new legislation

This is what I find great about the US and the EU (I am from the Czech
Republic): in a _federal_ system, you can test legislation in one state first
and then see the results, which lets the other states adopt or avoid that
approach.

Unfortunately, strong moneyed interests break that model by lobbying strongly
in Washington and Brussels from the start, which I think is a bad trend, but I
have no idea how to prevent it. The effect of having more cultural diversity
is very hard to quantify economically. (There is also another side of the coin,
though: too much state competition can lead to polarization and war.)

Although sometimes there are things where you have to be careful about this
approach - for example, if one state engages in lower taxes or pro-export
policies, it may gain, but the same policy enacted by everybody will make
everyone lose.

~~~
Retric
There are a lot of things which work better as federal programs than as state /
local programs. Homeless populations spring to mind: if some cities treat
their homeless population poorly, they can 'export' the problem to another city
/ state, whereas a state with overly generous benefits may see migration.

Education is another, with states that export their most educated population
having little incentive to subsidize their higher education system. Which
likely has led to the dismantling of state funded schools as people became
more mobile.

~~~
asgard1024
> Which likely has led to the dismantling of state funded schools

I disagree this is a necessary outcome. It may as well happen that backward
states will take notice and improve their higher education system in various
ways.

But I think what you are saying is a fair point and I actually mentioned it.
What I would add to my comment is that the common solution to this is called
the "principle of subsidiarity", and it's used in the EU and Switzerland (which
is another example of a federation that often tests different law variations
at a smaller scale first).

------
alexandercrohde
I was thinking about why this website triggered my skeptic alarms and came up
with a few things.

1\. The web interface makes it hard to go back. The first thing I want to do
when I see a surprising answer is reread the question. I understand this is
likely just bad engineering, but it makes the whole site _feel_ less
trustworthy.

2\. Some questions get summarized poorly. For example, the mindfulness
question asks "What effect does mindfulness based stress reduction have on
self-reported mental health (rates of anxiety, stress, and depression)?" but
then summarizes the choices to "reduction rates of mental health issues." You
can't drop a word like self-reported from a question; after all, physically
disabled people self-report being happier after their disability (e.g.
[http://www.ncbi.nlm.nih.gov/pubmed/21870935](http://www.ncbi.nlm.nih.gov/pubmed/21870935)).

Also in the "Drug Substitution Programs" question the text indicates that the
research is based off of cases where "Addicts were given heroin or substitutes
such as methadone or buprenorphine, based on their needs," however the choices
are phrased "Positive effect - Prescribing heroin to addicts reduces crime
rates" [note that it dropped the _or substitutes_]. This feels like going
for shock value.

3\. At the end, the website pushes some newsletter very hard. Apparently the
website is focused on a career guide? If you truly
have no axe to grind then present high-quality information and I'll naturally
explore the site more.

4\. If your objective is really to help raise awareness about how often media
publicity for social interventions doesn't reflect efficacy as measured in
journals then I would think at the end you would propose a plan of action,
such as "When hearing about a social program, you can use google scholar to
tap into research findings..."

~~~
yuncun
There is also an unnecessary "are you sure?" notification that pops up when
you leave midway through the quiz.

------
asgard1024
This is great! One of the interesting things about social sciences (and
engineering, although just like computer hacking, social engineering has
unfortunately strong negative connotations) is that the real world is often
counter-intuitive. In particular, people tend to believe in punishment a lot,
while it commonly escalates the problem (I like the quip "emphasis on
punishment is the sign of an obedience frame").

I am dismayed that I didn't do better than random chance, even though I like
to read about social issues. I think this really shows the importance of
empiricism in social sciences and engineering.

~~~
davidgrenier
My cue was to vote against strategies that seemed built on a rewards and
punishments foundation.

~~~
toyg
The problem with reward/punishment strategies is that rewards are inevitably
deferred; and people who are in problematic situations likely have much lower
ability to deal with deferred satisfaction. That's just how it is.

Unfortunately, because bureaucracy is mostly run by people with great skills
in the area, this fact is really hard to get across, or is even rejected
outright. After all, if I-Court-Judge or I-Politician could keep it together
enough to get where I am, why can't anybody else manage just a little bit? And
then the morality discourse kicks in and there is no way out.

------
noir_lord
The Elderly care question said that there was no effect on 3 year mortality,
which is fine and I'm sure that is accurate but it doesn't answer the question
of "Did it have a net positive/negative effect on quality of life for those 3
years".

Not being a negative Nelly, just an observation that something as complex as
social interaction problems can't always be summed up with a clear net
gain/loss.

Loved the idea and implementation though.

------
dang
Having conclusive evidence that a program doesn't work (or even is harmful) is
significant. How effective is it for actually getting such programs stopped?
Have any been?

~~~
BenjaminTodd
Some have, but many persist for years (e.g. Scared Straight).

~~~
dclowd9901
Scared Straight is highly effective PR. Hood-looking young people getting
yelled at by scary men? Possibly crying? It's like schadenfreude for
"civilized" society.

I don't understand how policy leaders can look at hard numbers and say, "fuck
it, let's keep funding it."

~~~
vacri
_Tucker: My expert would totally disprove that.

Abbott: Who is your expert?

Tucker: I don’t know, but I can get one by this afternoon. The thing is,
you’ve been listening to the wrong expert. You need to listen to the right
expert. And you need to know what an expert is going to advise you before he
advises you._

\- The Thick Of It

------
superuser2
Home visits for older adults is a little disingenuous. The purpose is not
really to reduce death risk, but to increase quality of life. I don't think
that counts as "no effect", just "no effect on mortality", which is
unsurprising.

~~~
Thriptic
> The purpose is not really to reduce death risk, but to increase quality of
> life.

Not necessarily. There are many at-home services geared around attempting to
improve medication adherence in elderly patients, for example, as they
frequently are on several medications concurrently and have trouble keeping
track of dosing requirements due to cognitive dysfunction. Non-adherence can
have a significant effect on longevity.

------
ridgeguy
I got way fewer of these right than I would have hoped. It's clear that I need
to update my understanding of the questions presented. I owe a debt to those
who put this together. Thanks very much.

~~~
Asbostos
If it helps, any time you choose "No effect", you're making a guess about how
thorough the experiment was, not about the program itself. Every program will
have some effect, but it may be too small to have been detected. Similarly, if
the correct answer is "no effect", that just means the studies weren't accurate
enough.

------
jrnvs
It would be nice to see a summary of the projects and their effects after
finishing the quiz. By that time, I had already forgotten about some and
wondered what the answers were.

~~~
robertwiblin
Thanks for the feedback, I'll see if we can change that!

------
pluma
I find it a bit unintuitive that "no change" errors are valued the same as
"opposite" errors (e.g. answering "positive" when the change was actually
negative). I find it far less interesting when the discrepancy is whether
there is any change or not than when the effect is actually opposite to what
one would apparently expect.
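
A rough sketch of what such a weighting could look like, assuming a hypothetical scoring rule (not the quiz's actual one) in which the three answers sit on a scale and a miss costs more the further it is from the measured result. The names `SCALE`, `penalty` and `total_penalty` are made up for illustration:

```python
# Hypothetical ordinal encoding of the three answer choices.
# This is an assumed scheme, not how the quiz itself scores answers.
SCALE = {"negative": -1, "no effect": 0, "positive": 1}


def penalty(guess: str, actual: str) -> int:
    """Distance between the guess and the measured result on the scale.

    0 = correct, 1 = off by one step (e.g. "no effect" vs "positive"),
    2 = opposite direction ("positive" vs "negative").
    """
    return abs(SCALE[guess] - SCALE[actual])


def total_penalty(guesses: list[str], actuals: list[str]) -> int:
    """Sum of per-question penalties; lower is better."""
    return sum(penalty(g, a) for g, a in zip(guesses, actuals))
```

Under this rule, guessing "positive" for a program that turned out harmful costs twice as much as guessing "no effect" for it.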

Additionally as others have pointed out some of the questions and answers
aren't very clear (though maybe it's just my reading comprehension that's
failing). I too am unhappy about the way the elderly question is posed -- it's
not clear whether all of the programs were actually focussing on reducing
mortality. In fact the introduction mentions that it's merely one of several
goals and from anecdotal evidence I would expect that the mortality metric is
thrown off by patients stabilizing once their health has deteriorated enough
to require them to be placed under permanent care -- that you're less (or
equally) likely to die doesn't mean you're more (or equally) healthy.

~~~
BenjaminTodd
We have to just pick one outcome variable for each experiment though, or it's
going to get too complicated.

------
iopq
It's really hard to tell "negative effect" and "no effect" apart, I got two of
them switched.

Besides, I would consider both of those failed programs, not even sure if
there's a point in distinguishing.

I got 7/10 right, only one successful program I got wrong.

------
blakeweb
This may be specific to me (though I'm using the latest Chrome and the latest
Mac OS, so I doubt it), but the "share on Facebook" link didn't work at all for
me, and the Facebook "like" link took me to a page to share the post on my
timeline, not just "like" it. I didn't see any easy route to report such
things, so here it is!

------
frabcus
Ben Goldacre (of Bad Science) wrote a paper with the UK Government on A/B
tests of policies.

[http://www.badscience.net/2012/06/heres-a-cabinet-office-paper-i-co-authored-about-randomised-trials-of-government-policies/](http://www.badscience.net/2012/06/heres-a-cabinet-office-paper-i-co-authored-about-randomised-trials-of-government-policies/)

------
Freeboots
Interesting that the average user is less than 50%. I got two wrong and still
felt like an idiot.

~~~
Asbostos
Since there are 3 choices for each question, a random user should get 33%. So
the average user is slightly better than random (40%).
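
A minimal sketch of that baseline, assuming ten questions with three equally likely answers each and a guesser who picks uniformly at random. The function names are illustrative:

```python
from math import comb


def expected_correct(n_questions: int, n_choices: int) -> float:
    """Expected number of correct answers when guessing uniformly at random."""
    return n_questions / n_choices


def prob_at_least(k: int, n_questions: int, n_choices: int) -> float:
    """Probability of getting at least k answers right by pure guessing.

    Each question is an independent trial with success probability
    1 / n_choices, so the number of correct answers is binomial.
    """
    p = 1 / n_choices
    return sum(
        comb(n_questions, i) * p**i * (1 - p) ** (n_questions - i)
        for i in range(k, n_questions + 1)
    )


if __name__ == "__main__":
    print(expected_correct(10, 3))   # about 3.33 correct out of 10
    print(prob_at_least(7, 10, 3))   # chance of 7 or more by luck alone
```

With these assumptions a random guesser averages about 3.3 out of 10, so a 40% average is only a little above chance.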

------
eli_gottlieb
Ha! I'm as good as a coin flip! Take that, the average user, who is actually
nonrandomly bad!

------
mattmanser
While the content is great, the tech seems bloody awful. Why is it so slow to
load? It takes 5-10 seconds to load on both my laptop and my desktop (both < 1
year old) and I have nice fast fibre. Why would you start with a loading
screen in this day and age? And worse a loading screen that doesn't tell you
what it's loading. Flash is dead.

It severely impacts the usability: it looks like nothing is happening, and I
can imagine a lot of people just closing it before it loads.

And it's (virtually) static content! Why doesn't it load with the opening page
at least?

~~~
BenjaminTodd
Sorry about that - we got a bit overloaded. It's also a beta platform, hence
the loading page issue.

------
crdoconnor
There are a couple of those where I suspect that they might have the causality
reversed.

