

Clockwork Raven - MTurk framework from Twitter - polyfractal
http://twitter.github.com/clockworkraven/

======
anandkulkarni
This adds to a long list of projects like Turkit
(<http://groups.csail.mit.edu/uid/turkit/>) that try to patch over the ease of
use / spam / quality holes in Mechanical Turk.

Better platforms than Turk exist today that handle this functionality out of
the box. Building more solutions like this isn't a great use of engineering
time!

~~~
aantix
What are the platforms that you're referring to?

~~~
bravura
Turkit, JavaScript code for achieving consensus among AMT workers, especially
with multiple stages. See the Find-Fix-Verify pattern:
<http://www.behind-the-enemy-lines.com/2011/04/want-to-improve-sales-fix-grammar-and.html>
By doing multiple stages of crowdsourcing, you can do complicated tasks with
high quality, like "Improve the grammar of these reviews." Zappos did this and
saw a lift in conversion.
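
To make the multi-stage idea concrete, here is a rough Python sketch of a
Find-Fix-Verify style pipeline with majority-vote consensus at each stage.
This is not Turkit's API. The names `majority` and `find_fix_verify` are made
up for illustration, and each "worker" is just a plain function standing in
for a real HIT.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def majority(answers: Sequence[str]) -> Optional[str]:
    """Return the answer given by a strict majority, or None if there is none."""
    if not answers:
        return None
    value, count = Counter(answers).most_common(1)[0]
    return value if count > len(answers) / 2 else None


def find_fix_verify(
    text: str,
    finders: Sequence[Callable[[str], str]],
    fixers: Sequence[Callable[[str], str]],
    verifiers: Sequence[Callable[[str, Sequence[str]], str]],
) -> str:
    # Find: each worker flags one problem phrase; keep it only if most agree.
    flagged = majority([find(text) for find in finders])
    if flagged is None or flagged not in text:
        return text

    # Fix: a separate group of workers each proposes a rewrite of the phrase.
    candidates = [fix(flagged) for fix in fixers]

    # Verify: a third group votes for the candidate they think is best.
    winner = majority([verify(flagged, candidates) for verify in verifiers])
    if winner is None:
        return text  # no consensus, so leave the text unchanged

    return text.replace(flagged, winner, 1)
```

Each stage only accepts output that a strict majority agrees on, so a single
spammy worker cannot push a bad edit through on their own.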

Crowdflower, which is good for achieving high-quality one-shot annotation, but
not so much for pipelines of annotations.

MobileWorks, which is good for achieving high-quality one-shot annotation, and
says in the docs that it has pipelines. I haven't figured this feature out
yet.

CrowdControl, which supposedly solves every problem, but is priced as an
enterprise solution.

If you want to build something cool, implement pipelines of work, i.e. build a
crowd-programming layer that has subroutines. Look at what CrowdControl says
they are doing.

[edit: If you also want to build something cool, implement a reputation
system. Don't just assign workers a single number. Figure out what _kind_ of
tasks the workers are good at, and do a per-task reputation system. For bonus
points, solve this correctly, by dynamically gauging the skillset and
difficulty for each task, rather than simply grouping tasks into N clusters,
where N is low.]
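
As a starting point, a per-task-type reputation tracker might look something
like the Python sketch below. It is only the simple version, with a fixed set
of task types rather than the dynamic skill/difficulty estimation suggested
for bonus points. The `Reputation` class, its method names, and the smoothing
prior are all assumptions made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


class Reputation:
    """Tracks each worker's accuracy separately for every task type."""

    def __init__(self, prior_correct: float = 1.0, prior_total: float = 2.0):
        # The prior smooths scores so a worker with no history starts at 0.5
        # instead of being ranked on zero evidence.
        self.prior_correct = prior_correct
        self.prior_total = prior_total
        # (worker, task_type) -> [number correct, number attempted]
        self._stats: Dict[Tuple[str, str], List[int]] = defaultdict(lambda: [0, 0])

    def record(self, worker: str, task_type: str, correct: bool) -> None:
        stats = self._stats[(worker, task_type)]
        stats[0] += int(correct)
        stats[1] += 1

    def score(self, worker: str, task_type: str) -> float:
        correct, total = self._stats.get((worker, task_type), (0, 0))
        return (correct + self.prior_correct) / (total + self.prior_total)

    def best_workers(self, task_type: str, workers: Sequence[str], k: int = 1) -> List[str]:
        # Rank by reputation on this task type only, not by a global number.
        ranked = sorted(workers, key=lambda w: self.score(w, task_type), reverse=True)
        return ranked[:k]
```

The point of keying on `(worker, task_type)` is that someone who is great at
grammar fixes but poor at image tagging gets routed accordingly, instead of
being averaged into one mediocre score.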

Email me if you want to discuss. I've been thinking about this for a while.

~~~
joshu
We also assume pipelining in <http://human.io> because the UI at each stage is
pretty minimal.

------
polyfractal
Whoops, an announcement that went with the project (probably should have
linked to that instead):

<http://engineering.twitter.com/2012/08/crowdsourced-data-analysis-with.html>

------
aantix
Are there any built-in controls for spam/bot responses? This is something I
have been contemplating adding to Turkee (<https://github.com/aantix/turkee>).

