
How I Used Amazon’s Mechanical Turk to Validate my Startup Idea - toumhi
http://harperlindsey.wordpress.com/2010/09/01/how-i-used-amazons-mechanical-turk-to-validate-my-startup-idea/
======
goodwinb
I'd caution against making your survey "Hey, would you use Web app A?"
Mechanical Turk users want to blaze through the survey as fast as possible and
get paid.
Clicking "Yes" gets them through the fastest and also lets them please the
surveyor (the same bias as your friends or family).

Force them to make a choice. Present Web app A (real), Web app B (dummy), and
Web app C (dummy). Make them rank which web app is best.

Set up the survey three ways: A, B, C; B, C, A; and C, B, A. Have a third of
the sample take each survey. You would be surprised at the first-choice bias
with Mechanical Turk. Actually, you wouldn't be surprised once you remember
that these people just want to get the thing done.
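
A minimal sketch of that rotation in Python, assuming each response is stored
as a pair (ordering shown, app ranked first). The function names and the
sample answers are made up for illustration:

```python
from collections import Counter
from itertools import cycle

# The three orderings from the comment above; "A" is the real app.
ORDERINGS = [("A", "B", "C"), ("B", "C", "A"), ("C", "B", "A")]


def assign_orderings(respondent_ids):
    """Give each respondent one ordering, round-robin, so each gets a third."""
    return {rid: order for rid, order in zip(respondent_ids, cycle(ORDERINGS))}


def first_position_rate(responses):
    """Share of respondents whose top pick was whatever was listed first.

    `responses` is a list of (ordering, top_pick) pairs. With three options
    and no position bias this should hover around 1/3.
    """
    hits = sum(1 for ordering, top_pick in responses if top_pick == ordering[0])
    return hits / len(responses)


def top_pick_counts(responses):
    """How often each app was ranked first, pooled across all orderings."""
    return Counter(top_pick for _, top_pick in responses)


if __name__ == "__main__":
    assignment = assign_orderings(range(6))
    # Made-up answers: respondents 0-3 picked the first item shown,
    # respondents 4-5 picked "A" even though it was listed last.
    picks = ["A", "B", "C", "A", "A", "A"]
    responses = [(assignment[i], picks[i]) for i in range(6)]
    print(first_position_rate(responses))
    print(top_pick_counts(responses))
```

If the first-position rate comes out well above one third, the ranking is
telling you more about list order than about the apps.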

Finally, a good secondary survey is to have them rank features by how much
each one is worth to them. This helps find your MVP. (Read more about
"conjoint analysis" if this interests you.)

------
michael_dorfman
It's not clear from the article what kind of business model the startup in
question is planning to use. This is relevant, as the fact that 73% of random
strangers said they would "use the service as described" doesn't give any
indication that they'd be willing to pay for it.

Clearly, the author got his $27.50 worth; how much that's really worth,
though, in the long term, remains to be seen.

~~~
wdewind
Yeah, I've done this before. Of the 200 people I surveyed a much higher %
(around 95%) said they liked and would use the service...and then did not. My
experience with this stuff makes me think that people who do it generally have
a perception that you are their "boss" and that they need to sugar coat
everything they tell you.

TLDR: Be HIGHLY suspect of any results you get from mechanical turk (same goes
for feedback army and others).

~~~
subbu
It applies to everyone, not just Mechanical Turk folks. When someone says they
'will use' your product, they are being non-committal. (I have done this many
times for fear of offending the person asking me.) When it's time to take out
their credit cards, they backtrack. Those are two different decision points.
The latter is what matters to startups.

~~~
frossie
_When it's time to take out their credit cards, they backtrack_

And to be fair, even if I am trying to be super-honest, I find it very hard to
decide whether _I_ would be happy to whip out a credit card from a concept
pitch. It's not just a question of "would I pay for A", it is a question of
"would I pay for A if it really rang my bell", and I can't tell whether you
are going to ring my bell or not until I have tried your product _and_ your
customer support philosophy.

When I look at my credit card statement, I see a bunch of companies not just
whose product I like and use, but also a bunch of folks for whom I can say "I
like the way they do business".

I didn't pay to cloud-store photos for over a decade, until I found smugmug.
So what is the right answer to the question "would you pay to store photos"?

This isn't the kind of nuance you would get from an MT survey I don't think.

------
ScottWhigham
Strange use of Mechanical Turk if you ask me. You could've spent about the
same in Google Adwords asking people to take a survey. At least then you'd
have a survey of "people who are searching for important keywords related to
my topic of interest" rather than "people who are so technically capable that
they are using MT".

"I also found out to my surprise that more men than women said they’d use the
service. I fully expected it to be the other way around." 200 people is not a
large sample size and I would caution you against making sweeping decisions
about the future of a company based on this small size. It could be that the
MT program attracts more men than women? I don't know how you set it up but
you said earlier that you didn't really do any segmentation.
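
To put a number on "not a large sample": a back-of-the-envelope margin of
error, using the 73% and 200 figures from the article. This is a sketch that
assumes a simple random sample, which an MTurk survey is not, so the real
uncertainty is likely larger. The subgroup size of 100 is hypothetical, since
the post doesn't give the gender split.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a proportion p from n answers.

    z=1.96 corresponds to roughly 95% confidence. Assumes simple random
    sampling, so for a self-selected panel treat the result as optimistic.
    """
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # 73% "would use" out of 200 respondents (figures from the article).
    print(round(margin_of_error(0.73, 200), 3))
    # Hypothetical subgroup of 100, e.g. one gender if the split were even.
    print(round(margin_of_error(0.73, 100), 3))
```

That works out to roughly plus or minus 6 points for the whole sample and
closer to 9 points for a subgroup of 100, so a modest gap between men and
women could easily be noise.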

~~~
cinimod
It is still waaaaaaaaaaay better than asking your family and friends.

~~~
zackattack
Dude. No.

They're both 100% useless.

Mechanical Turkers are incentivized to give positive feedback because there's
a chance (lurking in the back of their minds, surely) that people are
emotionally sensitive and if they get criticized, they will be emotionally
wounded and won't approve their HIT.

Do you know how I know this? I asked Turkers to give me feedback on one of my
sites, and it was nothing but sunshine and razzmatazz smoothies.

You know you're building something people want when they vote with their
wallets or attention (give you their email address, phone number...)

New Theory: hell, an even better test might be whether they are willing to
"invite their friends" to tell them about it. People share things with friends
because they like their friends - is it of sufficiently high value that they
will get a "social credit" for sharing with their friends?

~~~
nandemo
I guess you could ignore the 73% that said they would use it. But note that
27% said they wouldn't use it and also stated "WHY they wouldn’t use" it. This
should be at least 1% useful.

------
akshayubhat
People who use AMT for earning money may not necessarily be the right audience
you are looking for.

~~~
yellowbkpk
I don't think that people use Mechanical Turk as their sole source for money.
If I remember correctly, someone figured out that mechanical turk pays roughly
$5/hour if you spend your whole day doing it. That's not very much.

Mechanical Turk seems to be filled with people itching for some community to
participate in -- why not get paid for it at the same time?

~~~
snprbob86
I've run an MTurk survey as well and asked standard demographic information.
My results are surprisingly in line with national averages. I ran the survey
at the suggestion of someone with much more experience running MTurk surveys,
who claims he has verified that the results are _better_ than typical phone
and mall campaigns.

~~~
pbhjpbhj
>I've run an MTurk survey as well and asked standard demographic information.
My results are surprisingly in line with national averages.

Do MTurkers tell the truth though? Presumably one person can run a dozen
fronts so that they can answer questions that require a specific demographic.
Or they masquerade as elderly females to appear more trustworthy and shame you
out of objecting if they've not done the job properly or whatever.

~~~
Revisor
Do you think they would lie so as to mirror the national averages?

------
lee
I would take the "would you use this service" feedback with a grain of salt.

It costs you nothing to say "yes", but everything changes as soon as you start
to charge for it.

Going from Free to $0.01 for any online product or service significantly
changes your conversion numbers.

You might get better feedback standing on a corner downtown, asking if people
would use your service... and charge them $1 for a voucher that would allow
them to use it when it is ready.

PS: Please get out of stealth mode!

------
rkalla
For anyone else interested in this "quick, cheap, feedback" cycle, PickFu is
another service that does this: <http://pickfu.com/>

I spent about $20 to get 200 responses for my question, but only ended up
collecting 103 answers before the question expired.

None of this stuff is perfect, but employing a few of these approaches all at
the same time will generate results that can be handy to correlate.

~~~
justinchen
Actually, it didn't expire, it finished with 104 of 200 answers picking "B -
Paypal" <http://pickfu.com/00PULC>

Also, just to clarify, it's $17 for 200 responses.

Thanks for trying out PickFu!

~~~
rkalla
Justin, my apologies on that, I misread the screen and thought I only got 104
answers.

It was a great experience overall, and for micro-testing (where something like
Usability testing is too much) PickFu is on my short list for future projects.

~~~
justinchen
No worries. Other people have misread it too so we probably need to tweak the
wording to make it more clear.

Glad to hear it!

------
harperlindsey
This is Lindsey - who wrote the post. I wanted to clarify a couple of things,
based on the feedback I’m seeing here. (btw. This was my first blog post :)).
I am still in stealth mode (which is a completely different conversation that
I’ll be blogging about soon). www.Swayable.com will be a free, consumer-facing
service. With that said, the Turk system will not work for a lot of startups
that need niche-specific data. I was looking for general consumers that are
web/computer savvy and the Turk system fit the bill for what I needed. The
most valuable feedback I got was actually from the text responses as opposed
to the yes/no response. I will definitely follow up on this post as soon as I
am out of stealth mode, and provide more specific data. Thanks for all the
great feedback!

~~~
mikeklaas
You are in stealth mode, yet willing to pitch the idea to hundreds of random
strangers?

------
mitko
Cool. My girlfriend had an idea for a startup and we started discussing it. At
some point I told her 'let's stop hand-waving and look at some data'. For ~10
bucks we got 110 good responses to an 11-question survey on Mechanical Turk.
It really gave us a new perspective on some of our problems. That was much
easier than asking our friends for feedback (which we did too).

------
olalonde
The post would be more interesting if we knew what questions were asked in the
survey.

Edit: I meant what exactly the survey looked like, including the service
description. If it's a paid service, I highly doubt the 73% figure.

~~~
timinman
Me, too. I checked my stats after using MT for a site-survey. Most of the
users were from the same geographical region, they gave very positive
feedback, and they spent very little time on the site. Pair that with an
understanding that MT only pays well if you do lots of little tasks very fast,
and you start to get the idea that the feedback might not be the most useful.

~~~
timinman
You need to put in a text box: "Please describe, in your own words, the
purpose of my site:" Then throw out all the responses that are incomplete or
way off. The problem is that when I did that, there were too few useable
results.
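
A rough sketch of that filter in Python. The "description" field name, the
keyword list and the sample answers are all made up for illustration, and a
keyword check is only a crude stand-in for actually reading the answers:

```python
def looks_valid(description, keywords, min_words=5):
    """Keep an answer only if it is long enough and mentions at least one
    keyword that someone who actually read the page would likely use."""
    text = description.lower()
    if len(text.split()) < min_words:
        return False
    return any(k.lower() in text for k in keywords)


def filter_responses(responses, keywords, min_words=5):
    """`responses` is a list of dicts with a "description" field (assumed)."""
    return [r for r in responses
            if looks_valid(r.get("description", ""), keywords, min_words)]


if __name__ == "__main__":
    # Hypothetical photo-storage site and hypothetical worker answers.
    keywords = ["photo", "album", "storage"]
    responses = [
        {"worker": "w1", "description": "It lets you store and share your photos online"},
        {"worker": "w2", "description": "good site"},
        {"worker": "w3", "description": "This is a very nice website and I like it a lot"},
    ]
    kept = filter_responses(responses, keywords)
    print([r["worker"] for r in kept])
```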

~~~
nmcfarl
This is incredibly good advice. Any MTurk task that is small, or that will be
approved by a single person, should ask an open-ended question that requires a
response.

This greatly reduces fraud and shoddy work. And it has a tendency to
highlight the really good workers (some of whom are amazing!) - maximizing
your benefit as a requester.

------
kitchen
Interesting article, but probably a poorly timed submission to HN. You could
have used this article to promote your product, but you're in stealth mode so
we get a "sign up to hear when my product is available" page, but still have
no idea what your product is. Good article, bad marketing. Should have
released this about a week after you launched the site :)

------
huhtenberg
I wonder how many of these 200 people were from India, which appears to be
supplying a good chunk of Mech Turk userbase.

I also ran a couple of polls with similar intent on MT and then stopped doing
that because results were skewed dramatically. For one there's no filtering by
a geographical region and you bet your 10c-per-answer that you will get a
_ton_ of replies from the poorest corners of the world. Secondly, responders
_are_ afraid of you rejecting their answer and thus affecting their MT rating,
so they tend to tell you what you want to hear and not what they actually
might be thinking. Hence your 73% approval rate.

This sampling is not random, very far from it. You are effectively sampling MT
community with its own dynamics, the community that is likely NOT to be
representative of your own target user base.

~~~
jlees
I've been using CrowdFlower as an interface to MTurk, and it let me exclude
specific countries. Your point about the secondary motivation is unfortunately
valid, though.

------
zumda
Might be a bit off topic, but does anyone know a good alternative to MTurk for
people outside the US?

~~~
chrisconley
I'm building an API on top of MTurk, <http://houdinihq.com>, that you could
use. It's currently in alpha, but feel free to email me at
presto@houdinihq.com if you're interested in an api key.

~~~
rarestblog
Simple API looks really simple, something that I definitely like!

How do you plan to accept payments? How much would Houdini itself cost (per
task, per month?)? Do you plan to accept non-US users?

~~~
chrisconley
Thanks! We also have some more advanced apis in the works, like asking
multiple workers to do the same work and automatically determining the 'true'
answer. From your point of view, they'll be just as easy to use though.
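
One simple way to settle on a 'true' answer from several workers is a majority
vote. The sketch below only illustrates that idea; it is not how Houdini does
it (nothing here says), and the function name and agreement threshold are made
up:

```python
from collections import Counter


def majority_answer(answers, min_agreement=0.5):
    """Return the most common answer if more than `min_agreement` of the
    workers gave it, otherwise None (meaning: ask more workers or review
    by hand). Answers are compared case-insensitively after trimming."""
    if not answers:
        return None
    normalized = [a.strip().lower() for a in answers]
    answer, count = Counter(normalized).most_common(1)[0]
    return answer if count / len(normalized) > min_agreement else None


if __name__ == "__main__":
    print(majority_answer(["Cat", "cat ", "dog"]))  # two of three agree
    print(majority_answer(["cat", "dog"]))          # no majority
```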

Pricing is still TBD. MTurk charges will be passed through directly to you
with Houdini-specific charges either per task or per month.

Yeah, I'm planning on accepting non-US users.

------
Keyframe
I've seen weirder use for mechanical turk:
<http://www.youtube.com/watch?v=D_CC5r5Wfm0> So, I guess why not?

------
bmcnamara82
What if you asked for a follow up email address / phone number? The survey
would then act as a screening device that would identify those who are really
interested.

------
terrapinbear
In case anyone doesn't know what a Mechanical Turk is:
<http://en.wikipedia.org/wiki/The_Turk>

------
EGreg
I think Amazon Turk could be better used for beta testing and early testing of
word-of-mouth. First, ask some people to register on your site and then track
what they do. Then, have them fill out a feedback form and tell you what they
would have you improve. One iteration would cost you around $20. And if you do
it once every 2 weeks or so, you can keep in touch with what your users
actually want.

------
ck2
Does Amazon say anywhere how diverse Turk users are? Maybe you ended up only
surveying mostly one gender in mostly one part of the world?

~~~
mrtron
When we were creating <http://markiter.com>

We did a blog post referencing a study:

<http://blog.markiter.com/#turk>

<http://behind-the-enemy-lines.blogspot.com/2008/03/mechanical-turk-demographics.html>

Which has been updated recently. Some high quality data presented in a clear
way.

<http://behind-the-enemy-lines.blogspot.com/2010/03/new-demographics-of-mechanical-turk.html>

~~~
ck2
Thanks for that - this set is from Feb 2010

<http://behind-the-enemy-lines.blogspot.com/2010/03/new-demographics-of-mechanical-turk.html>

    
    
        * United States: 46.80%    (65% female)
        * India:         34.00%    (70% male)
        * Miscellaneous: 19.20%
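
Those three lines are enough to bound the overall gender mix. A quick sketch,
assuming "70% male" for India means 30% female and leaving the Miscellaneous
group as unknown, since no split is given for it:

```python
# (country, share of all workers, share of women within that country).
# Figures are from the list above; None marks the split that is not given.
GROUPS = [
    ("United States", 0.468, 0.65),
    ("India", 0.340, 0.30),  # 70% male, so 30% female (assumed binary split)
    ("Miscellaneous", 0.192, None),
]


def female_share_bounds(groups):
    """Lowest and highest possible overall share of women, depending on
    how the groups with an unknown split turn out."""
    known = sum(share * female for _, share, female in groups
                if female is not None)
    unknown = sum(share for _, share, female in groups if female is None)
    return known, known + unknown


if __name__ == "__main__":
    low, high = female_share_bounds(GROUPS)
    print(f"women: between {low:.1%} and {high:.1%} of workers")
```

So overall the pool lands somewhere between about 41% and 60% women, which
means the headline numbers alone can't say which way a given survey leans; it
depends heavily on which countries your respondents came from.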

------
zaidf
Maybe the survey data is useful if your startup's target audience consists of
folks who earn pennies on the hour.

------
famousactress
Strange? Sure. Useful? Probably. But I really like the novelty of looking at
AMT as a group of real people and not just commoditized, bite-sized labor.
Opens my mind up to problems that might be solved with AMT.

------
ANH
Some good advice I've heard: When doing this kind of market research, don't
ask people if they would buy, ask them to buy.

Of course, that's easier said than done in certain circumstances.

------
lsc
Come back after you launch and tell us if the feedback was accurate- until
then, you are half way through executing a (rather interesting, imo)
experiment.

