

Mechanical Turk vs oDesk - Panos
http://www.behind-the-enemy-lines.com/2012/02/mturk-vs-odesk-my-experiences.html

======
tansey
This article is an excellent summary of all the frustrations I've encountered
using both oDesk and Mechanical Turk.

I'll add one more issue with MTurk: binary decisions. Some of my tasks can't
easily be broken down into accept/reject, or sometimes a worker gets something
_close_ but not all the way there. I want to be able to say "hey, you did
pretty well, but please make these small changes and we'll accept it."

But that's not possible right now. You get the HIT back and it's either
accepted or rejected. The result is that we sometimes have to reject workers
who are making an honest effort but didn't sufficiently complete the task, or
we have to accept sub-par answers to avoid dealing with worker blowback.

~~~
Panos
Let me suggest one solution for such cases: use iterative tasks, in the spirit
of TurKit.

For example, say you want to create a caption for an image. You let one worker
create a caption. Then you take this caption and give it to another worker,
asking them to improve it. Take the two versions and ask other workers, "Which
of the two versions is better?" Iterate until no further improvement is possible.

Not a trivial setup, but gets around the binary accept/reject decisions
problem and generates results of significantly superior quality.
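The improve/vote loop described above can be sketched roughly as follows. This is a minimal illustration, not a real crowd integration: `create_caption`, `improve_caption`, and `vote_better` are hypothetical stand-ins that would, in practice, each post a HIT and wait for worker answers (the stub bodies here just simulate plausible behavior).

```python
# Hypothetical stand-ins for crowd tasks; in a real system each would
# post a HIT and block until a worker responds.
def create_caption(image):
    return "a photo"

def improve_caption(image, caption):
    # A worker attempts to improve the current caption (simulated).
    return caption + "!" if len(caption) < 12 else caption

def vote_better(image, old, new, voters=3):
    # Several workers vote on which version is better; majority wins.
    votes = sum(1 for _ in range(voters) if len(new) > len(old))
    return votes > voters // 2

def iterative_caption(image, max_rounds=5):
    """TurKit-style loop: keep the improved version only if voters
    prefer it; stop when no accepted improvement is produced."""
    caption = create_caption(image)
    for _ in range(max_rounds):
        candidate = improve_caption(image, caption)
        if candidate != caption and vote_better(image, caption, candidate):
            caption = candidate
        else:
            break
    return caption
```

The key design point is that no single answer is ever accepted or rejected outright; each round only has to decide which of two versions is better, which is a much easier judgment for voters to make reliably.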

~~~
83457
Good idea. I've also thought about this for data entry/analysis type tasks. It
is fully expected that a worker at any level won't be perfect, especially on
these types of services. An approach I've considered is to have 3 workers do
the same task, then compare results and take the majority answer, which should
be close to (if not) 100% accurate. Over time you could even rate workers for
accuracy and recruit/pay the better ones more if possible.
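The redundancy-and-majority idea above is straightforward to sketch, assuming answers are simple labels. The two helper functions here are illustrative, not from any particular library:

```python
from collections import Counter

def majority_answer(answers):
    """Return the most common answer among redundant workers,
    or None if there is no strict majority."""
    answer, votes = Counter(answers).most_common(1)[0]
    return answer if votes > len(answers) // 2 else None

def worker_accuracy(history):
    """Fraction of a worker's answers that matched the majority
    answer; one simple way to rate workers over time.
    `history` is a list of (given_answer, majority_answer) pairs."""
    if not history:
        return 0.0
    agreed = sum(1 for given, majority in history if given == majority)
    return agreed / len(history)
```

For example, `majority_answer(["cat", "cat", "dog"])` returns `"cat"`, while a 1-1 split returns `None`, signaling that the task should be sent to an additional worker as a tie-breaker.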

~~~
Panos
On a very selfish note, take a look at these:

<http://qmturk.appspot.com/> <http://code.google.com/p/get-another-label/>

You may find the code useful for what you are trying to do.

------
msears
To get any sort of work done you need to give good instructions to good
workers, all within a good process. If any of these aren't in place it is hard
to get quality results.

(Disclaimer: I am the founder of CloudFactory, a next-generation mTurk working
to get the right mix of solutions, process and workers.)

Panos has been using mTurk for years and knows how to break work down and give
clear instructions to workers (plus validate to catch spammers and lazy
workers). Most people need to fail a bunch of times learning how to create
tasks well before becoming good "factory owners," as Panos described it. We
have a team of crowd solution developers who design workflows, create killer
task forms, analyze data, and improve everything so others don't have to go
through the painful ramp-up.

We have sent many thousands of tasks through mTurk while we build up our own
workforce from scratch. Anonymous workers that sign up online in 2 minutes
with fake info have no accountability or ownership in their work. This is just
never going to work: you end up with anarchy in the marketplace, and both
workers and requesters get ripped off. CloudFactory is taking a totally different
approach to building up our workforce to ensure that workers have
accountability, are matched to the right tasks and in general are motivated
and enjoy their work. Happy workers are good workers and we don't see any
technology making up for this.

That said, a process is required and this is where technology is important.
Quality control techniques are essential to catch mistakes even when the best
workers are given clear instructions. We take a factory, mass production
approach to this type of work and our platform offers tools to give real-time
control and transparency to your work done in the cloud.

So when the screwdriver (mturk) isn't working properly, don't get a hammer
(odesk) ... just get a better screwdriver!

------
lbarrow
If you're having trouble getting your HITs done or ensuring quality work on
Turk, MobileWorks (YC S'11) is essentially a better version of Turk that takes
care of these things for you. (Disclaimer: I'm an engineer at MobileWorks.)
Your experiment with oDesk is interesting, but I can't help but feel that this
sort of microtask work is not what oDesk was built for.

We're crowd researchers who spend our time working on methods for routing
tasks to qualified workers and QAing work. We aren't a wrapper on top of Turk
-- we're our own platform and we employ our own workers. You just push HITs to
our API and we take care of quality control and worker management for you.

If you're interested, email me at lionel @ (my company) .com and I can help
you get started.

~~~
Panos
Thanks for the offer. I have enough experience with MTurk to know how to get
things done.

My point is that microtask work is not necessarily the optimal setting for
tasks that are expected to last for longer periods of time. It is often
beneficial to train people and give them meaningful pieces of work, instead of
converting real work into micro-work and assuming that workers are not
intelligent enough to get things done properly.

------
smackay
How soon will it be possible to pick off Mechanical Turk tasks using computers
- assuming it is not being done already? Is this a real opportunity, or is the
big money going to be made in helping businesses design their work so it can
be automated this way (assuming the problems don't degenerate into just
another parallel programming task)?

~~~
aoe
What does "pick-off Mechanical Turk tasks using computers" mean?

~~~
wisty
I think he means "complete", as in automate the tasks. "Pick off" is usually
used as in: "The sniper was in the bell-tower, picking the enemy soldiers off
one by one" (if you're a grammar geek, it's an optionally separable phrasal
verb). It can also be used to refer to completing tasks.

------
almost
I think CrowdFlower (<http://crowdflower.com/>) would be useful in this
situation. I've used them before because I'm UK-based and MTurk wouldn't let
me create HITs (not sure if that's still true?), but they also add some things
on top of MTurk, like a HIT designer (a form designer) and, more interestingly
here, an interface for having people complete your HITs outside of MTurk.

------
4clicknet
What's a typical percentage rejection rate for your tasks?

I find that for some classification tasks my rejection rate is around a third
(this includes careless/sloppy answers, so the rejection rate of honest
answers is lower).

Even though there is always a right answer, some answers are almost correct,
but they get rejected and the worker's stats are affected.

