

Wikipedia to Color Code Untrustworthy Text - Alex3917
http://www.wired.com/wiredscience/2009/08/wikitrust/

======
mjgoins
Our brains already do this for us. It's called the bullshit detector
(technical term).

------
Alex3917
Meh, I love how I got ignored for proposing basically the same thing five
years ago:

<http://bit.ly/QJFb0>

~~~
btn
How is your idea "basically the same thing"?

Your idea seems to boil down to the calculation of an edits/views metric that
supposedly implies something about the quality of the entire article.
WikiTrust tracks the edits affecting each word, along with the edit history
of the authors making those edits, to determine the trust of that text.
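To make the distinction concrete, here is a minimal sketch of the word-level idea (not the actual WikiTrust algorithm; the function, parameters, and update rule are all illustrative assumptions): a word's trust is seeded by the reputation of the author who inserted it, and rises each time a higher-reputation author revises the page and leaves the word in place.

```python
# Hypothetical sketch of WikiTrust-style word-level trust.
# NOT the real WikiTrust algorithm: the 0.2 learning rate, the
# reputation scale, and the update rule are invented for illustration.

def word_trust(initial_author_rep, reviewer_reps, max_rep=10.0):
    """Trust of a single word.

    initial_author_rep: reputation of the author who inserted the word.
    reviewer_reps: reputations of later authors who edited the page
                   without removing the word (an implicit endorsement).
    """
    trust = float(initial_author_rep)
    for rep in reviewer_reps:
        # Each implicit review pulls the word's trust part of the way
        # up toward the reviewing author's reputation.
        trust += 0.2 * max(0.0, rep - trust)
    return min(trust, max_rep)

# A word added by a low-reputation author gains trust as
# higher-reputation editors leave it untouched.
print(word_trust(1.0, [8.0, 9.0, 9.0]))
```

The contrast with a whole-article edits/views ratio is that trust here is attached to individual words, so a single vandalized sentence can be flagged even on an otherwise well-reviewed page.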

~~~
Alex3917
If my proposal had been discussed in 2004 then the consensus might have been
that the specific implementation that works best is the algorithm they are
putting in place today. That's fine.

The point is that in 2004 I identified the same problem (and opportunity),
came up with the same general approach drawing on the same set of resources,
and no one took me up on my offer to discuss it.

~~~
thumper
I don't think that there was disagreement that there would one day be a
problem, even in 2004. But "drawing on the same set of resources" doesn't seem
like a fair way to think of it, especially to conclude that your idea would
have led to the same thing.

I am one of the researchers on this project, and there are no "resources"
which are being made available to us -- it's been something of an uphill
battle to do more than just talk. We came up with our idea in January 2006,
and spent the year implementing it. Even now, with this attention and with it
being open source, there are no volunteers stepping forward to help make it a
reality on Wikipedia itself. We receive very little funding (from CITRIS
and LANL, not Wikipedia), and I pay my own tuition from my side-job. Our
research group has been really focused this last year to make the code
"production ready" instead of just research code, but don't forget that we are
academics -- it's difficult to justify our spending time this way, except that
we truly believe in the project.

In terms of "credit for the idea", the earliest published work that I have
seen is the "Puppy Smoothies" article in First Monday (
[http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/ar...](http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1400/1318)
) We cited that and every other blog posting (unusual for an academic paper!)
we could find in our first paper. You'll see that there are actually several
that existed at the time, but the real issue is in finding the manpower to
implement the ideas.

~~~
Alex3917
Gotcha. By resources I actually meant the data available (timestamp, page
views, contributor history, etc.), and not the actual financial support.

Anyway I wasn't trying to take credit for your work, I was just annoyed that
no one wanted to discuss it at the time. Great job though, this is a really
cool (and important) project!

~~~
thumper
Thanks. Okay, I see what you're saying. Actually, they still won't make that
information available, though we have been asking for a while. That was a big
challenge when we started, but I think it was a good constraint because it
pushed us to think about how the text itself evolves.
Upcoming work will be looking at new signals to inform the page quality, such
as activity on the Talk pages -- so there's no shortage of ideas, only of time
to work on it.

On the page views signal - there was a great paper in the last few years that
came up with a way to estimate it from multiple other data sources (e.g.,
Alexa). I don't remember the title, but it was an impressive bit of stitching
together. If I had the time/money/resources, I would love to get that as a
signal in our work and see if it helps. I'm not 100% convinced it would,
because of vandalism I've seen which lasted for years on a somewhat popular
page -- so even many eyes do not help if no one will take action to fix
incorrect data.

