
Crowdsourcing Moderation Without Sacrificing Quality - paulchristiano
https://sideways-view.com/2016/12/02/crowdsourcing-moderation-without-sacrificing-quality/
======
MarkPNeyer
Why must there be one "correct" view of the comments on a document? This
insistence seems to be what hobbles us.

People will disagree on what is or isn't worth reading. That's fine. We should
see the problem as sorting comments per viewer, not for all viewers.

If 10 of my close friends think a comment is worth reading, then I personally
would like to read it. It doesn't matter what the moderator or a quorum of
random internet people think at that point.

We should be using the social graph to filter out nonsense, and to surface
content that we are likely to enjoy.

If anyone is interested, I've written code to do this here:

[https://github.com/neyer/respect](https://github.com/neyer/respect)
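
For illustration only (this is not taken from the linked repository), the per-viewer idea can be sketched as below. The function name `rank_for_viewer`, the `follows` and `upvotes` dictionaries, and the choice to count only direct contacts are all assumptions made for the example.

```python
def rank_for_viewer(viewer, follows, upvotes):
    """Order comment ids for one viewer by how many of the viewer's
    direct contacts upvoted each comment (ties broken by comment id).

    follows: dict mapping a user to the set of users they trust (hypothetical)
    upvotes: dict mapping a comment id to the set of users who upvoted it
    """
    friends = follows.get(viewer, set())

    def friend_votes(comment_id):
        # Only votes from the viewer's own contacts count toward the score.
        return len(upvotes[comment_id] & friends)

    return sorted(upvotes, key=lambda c: (-friend_votes(c), c))


# Made-up data: two viewers with different contacts see different orderings.
follows = {"alice": {"bob", "carol"}, "dave": {"erin"}}
upvotes = {
    "c1": {"bob", "carol"},
    "c2": {"erin", "frank", "grace"},
    "c3": {"carol"},
}
alice_view = rank_for_viewer("alice", follows, upvotes)
dave_view = rank_for_viewer("dave", follows, upvotes)
```

Note that `c2` has the most raw upvotes but lands last for alice, because none of those votes come from people she follows.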

~~~
paulchristiano
I think that the correct behavior in the long run is to give personalized
recommendations, analogous to this proposal:
[https://sideways-view.com/2016/12/01/optimizing-the-news-fee...](https://sideways-view.com/2016/12/01/optimizing-the-news-feed/)

The main motivation for having a unified view is to decide what to show people
who visit but don't rate things / log in / etc.

I'm currently finishing my dissertation on collaborative learning algorithms,
and I agree that it is not generally possible to have a single global answer.
This paper does something similar to your respect matrix:
[https://arxiv.org/abs/1411.1127](https://arxiv.org/abs/1411.1127)

(The same technique can be modified to incorporate information from an
external social network.)

------
lightedman
So... basically, trying to automate what Slashdot already does. The machine-
learning program would need to be an expert in every subject to make this even
remotely workable.

~~~
paulchristiano
The hope is to use features like "person X upvoted this content," rather than
to use domain-specific features, natural language processing, and so on. Over
the long run, algorithms may play a larger role in that process, but at first
it would just be people.

~~~
lightedman
"The hope is to use features like "person X upvoted this content,""

Bad idea. They should focus on WHY such a post got upvoted, not that it was
upvoted.

~~~
paulchristiano
You seem to be imagining very impressive machine learning. I agree that's the
right thing in the very long run.

I don't understand why you say the simple version is "unworkable" though.
Currently many forums just use a heuristic based on date and "total number of
upvotes" to decide what to display. Would you describe that as unworkable?
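
A minimal sketch of what such a date-plus-upvotes heuristic could look like. The formula, the `+ 2` offset, and the `gravity` exponent are illustrative assumptions, not the rule any particular forum uses.

```python
def hot_score(upvotes, age_hours, gravity=1.8):
    """Score that rises with upvotes and decays with age.

    The +2 offset and the gravity exponent are illustrative assumptions.
    """
    return upvotes / (age_hours + 2) ** gravity


def rank(posts):
    """Sort (post_id, upvotes, age_hours) tuples from highest to lowest score."""
    ordered = sorted(posts, key=lambda p: hot_score(p[1], p[2]), reverse=True)
    return [post_id for post_id, _, _ in ordered]


# Made-up posts: an old popular one and two recent ones.
posts = [
    ("old_popular", 200, 48.0),
    ("new_modest", 15, 1.0),
    ("new_quiet", 2, 0.5),
]
front_page = rank(posts)
```

Nothing here looks at who cast the votes, which is the gap the proposal is trying to fill.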

~~~
ohyeshedid
> Currently many forums just use a heuristic based on date and "total number
> of upvotes" to decide what to display.

That's why these systems often fall victim to brigading by organized trolls.

~~~
paulchristiano
The proposed mechanism makes brigading harder (e.g. a voter is only weighted
highly if they are adding useful information, such that increasing their
weight improves predictive accuracy), but doesn't make it impossible.

In any case, it seems extreme to call existing systems "unworkable."

~~~
lightedman
"a voter is only weighted highly if they are adding useful information"

Just how do you propose one determines if the information is useful? Again,
back to my original point, you'd need a system trained in EVERY FIELD.

In other words, such systems are pretty much unworkable unless you're giving
it full access to the entire ACCURATE knowledge of mankind. You happen to have
a nice accurate database of everything man's ever accomplished and discovered
for such a machine to use? Even Watson isn't even close to that level.

~~~
paulchristiano
By "useful" I mean "improves our ability to predict what the moderator would
say." The mechanism for determining that is just the usual approach in machine
learning: set the weights to minimize prediction error. (Plus some
regularization, which may incorporate a prior against new users offering
useful feedback.)
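
As a rough sketch of that idea, assuming a plain logistic model with an L2 penalty (the post itself may use a different model and regularizer): each feature is "voter j upvoted this item," the target is the moderator's verdict, and gradient descent picks the weights. The function name, the hyperparameters, and the toy data are all made up for the example.

```python
import math


def fit_voter_weights(votes, moderator_labels, l2=0.1, lr=0.5, steps=2000):
    """Fit one weight per voter so that votes predict the moderator's label.

    votes: list of rows; votes[i][j] is 1 if voter j upvoted item i, else 0
    moderator_labels: list of 0/1 moderator verdicts, one per item
    l2: strength of the L2 penalty on voter weights (the bias is not penalized)
    """
    n_items = len(votes)
    n_voters = len(votes[0])
    w = [0.0] * n_voters
    b = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_voters
        grad_b = 0.0
        for x, y in zip(votes, moderator_labels):
            z = b + sum(wj * xj for wj, xj in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted P(moderator approves)
            err = p - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        # Gradient step on mean log loss plus (l2 / 2) * sum(w_j ** 2).
        for j in range(n_voters):
            w[j] -= lr * (grad_w[j] / n_items + l2 * w[j])
        b -= lr * grad_b / n_items
    return w, b


# Toy data: voter 0 always agrees with the moderator, voter 1 is unrelated
# to the moderator, and voter 2 upvotes everything.
votes = [
    [1, 1, 1], [1, 0, 1], [0, 1, 1], [0, 0, 1],
    [1, 1, 1], [0, 0, 1], [1, 0, 1], [0, 1, 1],
]
moderator_labels = [1, 1, 0, 0, 1, 0, 1, 0]
w, b = fit_voter_weights(votes, moderator_labels)
```

On this toy data only voter 0 ends up with a large weight; the voter who upvotes everything carries no information about the moderator, so the penalty pulls that weight toward zero. A stronger penalty for new accounts would be one way to encode the prior mentioned above.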

