
Regarding the Most Recent Amazon Mechanical Turk Data Contamination - dsr12
https://www.maxhuibai.com/blog/evidence-that-responses-from-repeating-gps-are-random
======
psb31
I’m the CTO at an MTurk competitor (Prolific.ac). We’ve written up some
thoughts on this below. It’s unlikely that these accounts are ‘true’ bots;
more likely they’re low-effort workers hiding their location with a VPN and
providing pseudo-random responses.

https://blog.prolific.ac/bots-and-data-quality-on-crowdsourcing-platforms/
https://blog.prolific.ac/how-to-improve-your-data-quality/

We’ve done a lot of work on improving data quality on Prolific and filtering
out these kinds of accounts, and we invest heavily in making sure our pool is
honest, attentive, and engaged.

If anyone’s concerned about data quality in online samples, please feel free
to get in touch.

~~~
NullPrefix
Your platform requires quite a lot of data. Is entering all that necessary to
be able to participate in studies?

~~~
psb31
No. All questions are optional, and typically studies only filter on 1-3. We
realise there are too many there by default and we’re working on staggering
the demographics more.

~~~
edoceo
This is a pattern I use: gotta trickle those Qs so people aren't overwhelmed,
and it keeps them engaged afterwards.

------
ris
Using MTurk for personal surveys and expecting to get good and meaningful
results from it is just all kinds of wrong. I do hope there isn't "serious"
research that's being done this way.

~~~
your-nanny
what evidence do you have to support this attitude?

~~~
flingo
If I had to guess at his reasoning, it's because Mechanical Turk is
"artificial artificial intelligence". You wouldn't survey an AI and expect to
get anything meaningful.

------
aantix
Data contamination has always been an issue with MTurk: bots submit junk data
in the hope that you just auto-approve the entire data set. You definitely
need a secondary validating question to verify the turker's authenticity.

------
vanpelt
What does he mean by GPS? Is he just getting location info from the IP, or
using the browser geolocation API? I’ve run my fair share of surveys on MTurk
and FigureEight. The best way to prevent scams is with honeypots: always have
a question or two deeper in the survey that seems quick and simple but
requires engagement. Be sure to change these questions often.
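A minimal sketch of that honeypot idea, assuming a simple dict of responses;
the question IDs and expected answers here are hypothetical and should be
rotated often, as suggested above:

```python
# Hypothetical catch questions: question ID -> expected answer.
# Rotate these regularly so low-effort workers can't share answer keys.
HONEYPOTS = {
    "q17": "seven",   # e.g. "Type the word for the number 7."
    "q42": "purple",  # e.g. "Which of these is a colour? (purple/table/run)"
}

def passes_honeypots(response: dict) -> bool:
    """Reject a submission if any catch question is answered incorrectly."""
    for qid, expected in HONEYPOTS.items():
        answer = response.get(qid, "").strip().lower()
        if answer != expected:
            return False
    return True
```

An attentive human sails through these; a bot or pseudo-random clicker fails
at least one with high probability.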

~~~
shmageggy
Yeah, it's standard practice for mturk experiments to have a catch trial or
two and to exclude responses that answer incorrectly. All of the experiments
I've seen in my field do this.

------
akuz
As it stands, nearly all problems in collecting data on MTurk can be solved by
1) using proper qualifications and 2) paying enough.

If anyone is having trouble, feel free to reach out.

Source: Working with MTurk since 2013, including a 3 month internship in the
belly of the beast. Starting a related PhD this fall.

~~~
glup
Seconded. Have run thousands of participants over eight years. Ya there's
crap, but that's true whenever you ask any random sample of people about
anything.

~~~
JackFr
> but that's true whenever you ask any random sample of people about anything.

Are MTurkers a random sample?

------
JamesCoyne
Can anyone comment on how common it is for social psychology research to rely
on surveys completed by Turkers?

~~~
mrtksn
According to Duncan Watts in his book "Everything is Obvious, once you know
it" it's quite widespread to use the Turkers for that kind of research.

It's described as a very cost-effective method compared to reaching out to
subjects directly and trying to convince them to gather somewhere for the
study.

It's also described as game-changing for studies that need a large participant
base, making studies possible that were previously impossible for practical
reasons.

~~~
JamesCoyne
Thanks.

------
techbio
The answer to so many similar classification questions is to apply the
simplest of case-built classifiers to detect red flags and then write a regex
to exclude new classes of bot data.

[https://en.wikipedia.org/wiki/K-means_clustering](https://en.wikipedia.org/wiki/K-means_clustering)

[https://en.wikipedia.org/wiki/Naive_Bayes_classifier](https://en.wikipedia.org/wiki/Naive_Bayes_classifier)
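The regex side of that suggestion might look something like this; the
red-flag patterns below are hypothetical examples of the kind of case-built
rules one would accumulate by inspecting junk submissions by hand:

```python
import re

# Hypothetical red-flag patterns drawn from junk submissions spotted
# manually; extend the list as new classes of bot data appear.
RED_FLAGS = [
    re.compile(r"^(.)\1{4,}$"),                 # one character repeated ("aaaaa")
    re.compile(r"^\W*$"),                       # empty or punctuation-only answer
    re.compile(r"\b(asdf|qwer|test)\b", re.I),  # keyboard-mash / filler words
]

def is_red_flagged(free_text: str) -> bool:
    """True if a free-text answer matches any known junk pattern."""
    return any(p.search(free_text.strip()) for p in RED_FLAGS)
```

The clustering/Naive Bayes step linked above would then catch the classes of
junk that hand-written patterns miss.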

------
imhoguy
Now this is quite a challenge - make an AI which mimics human intelligence on
MTurk. Profit!

------
raverbashing
I wonder if this could be someone testing an AI "quiz-taker"

------
predictionable
I don't get it. Why are mechanical turk outcomes being founded upon political
ideology as a concrete, objective data point?

There's something in this process of deriving authenticity that seems...
flimsy?

I mean, substitute "Skeletor" and "Hordak" as stand-ins for "KKK" and "Nazi
Party", and "Eternia" for "America".

When you make a video game out of clicking questionnaire responses and seal it
in an inconsequential vacuum, why would anyone expect conclusive results that
"Skeletor" and "Hordak" should be regarded objectively as detestable? There's
no real consequence to saying "blah", "yay", or "boo", and I don't think I'd
expect success if I were to try to ground my hypothesis on simple, obtuse
good/bad/indifferent political opinions to prove whether response data is
"authentic" or not.

This feels like a weaker conclusion than gauging lie detector readings.

You can safely ground response data on facts like:

    The sky is blue.
    Wood comes from plants.
    Stars exist in outer space.
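If one did want to ground an authenticity check on facts like these rather
than on opinions, a sketch might look as follows (question IDs, wording, and
accepted answers are all hypothetical):

```python
# Fact questions with unambiguous answers; opinion questions
# ("Is Mr. Rogers nice?") are deliberately excluded.
FACT_CHECKS = {
    "sky_color": {"blue"},
    "wood_source": {"plants", "trees"},
    "stars_location": {"outer space", "space"},
}

def fact_check_score(response: dict) -> float:
    """Fraction of fact questions answered correctly (0.0 to 1.0)."""
    correct = sum(
        response.get(qid, "").strip().lower() in accepted
        for qid, accepted in FACT_CHECKS.items()
    )
    return correct / len(FACT_CHECKS)
```

A low score flags a respondent as inattentive or automated without depending
on their political opinions.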
But, you can't really demand predictable results to bind on opinions.

    Is Mr. Rogers nice?
    Are leather jackets good?
    Do old cars go fast?

Whether evil people started answering, or ambivalent people started shrugging
at the questions doesn't provide a conclusive assessment of whether a system
is producing useful data. I feel like this is a garbage-in-garbage-out
scenario, no?

