
AI service gives Wikipedians ‘X-ray specs’ to see through bad edits - jasoncartwright
https://blog.wikimedia.org/2015/11/30/artificial-intelligence-x-ray-specs/
======
garblegarble
I can't help but wonder if this will have the side-effect that vandalism will
become a more subtle subversion of the pages (or maybe there's already a
number of people who do just that and this will thin out the field to make
them more obvious)

~~~
data_scientist
Relevant XKCD [https://xkcd.com/810/](https://xkcd.com/810/)

------
jacquesm
Is there anything about the false positive rate of this system?

Any binary 'good or bad' classifier should have at least a couple of stats
attached to it to give an indication of how reliable the classifier is.

To complete the circle, it would be nice to know how it performs according to
this:

[https://en.wikipedia.org/wiki/Precision_and_recall](https://en.wikipedia.org/wiki/Precision_and_recall)
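
For what it's worth, here is a rough sketch of the stats I'd want to see. It assumes we have the classifier's flags alongside human judgements for the same edits; the function and variable names are made up for illustration and are not taken from the tool itself.

```python
def classifier_stats(flagged, actually_bad):
    """Compute precision, recall and false positive rate.

    flagged:      list of bools, True if the classifier flagged the edit
    actually_bad: list of bools, True if a human judged the edit as bad
    (Both lists are hypothetical inputs, aligned by position.)
    """
    pairs = list(zip(flagged, actually_bad))
    tp = sum(1 for f, b in pairs if f and b)          # flagged, really bad
    fp = sum(1 for f, b in pairs if f and not b)      # flagged, actually fine
    fn = sum(1 for f, b in pairs if not f and b)      # missed bad edit
    tn = sum(1 for f, b in pairs if not f and not b)  # correctly left alone

    # Guard against division by zero when a denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "false_positive_rate": false_positive_rate,
    }
```

Precision says how many flagged edits were really bad, recall says how many of the bad edits got flagged, and the false positive rate says how many good edits were flagged by mistake.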

~~~
kenrick95
Not really, but one can report false positives and false negatives here:
[https://meta.wikimedia.org/wiki/Research_talk:Revision_scori...](https://meta.wikimedia.org/wiki/Research_talk:Revision_scoring_as_a_service#Misclassifications)

Since it is activated on Indonesian Wikipedia (I helped them extend the
tool to the Indonesian language), I have noticed that this tool can hardly
catch even obvious vandalism yet. For some other edits it is hard to verify
whether they are vandalism or not, even for a human. I believe this is a work
in progress and hope that it will improve over time. But hey, it is the first
time a vandalism detection tool has been deployed on Indonesian Wikipedia. :)

~~~
jacquesm
> Not really

That's worrisome then. You'd expect a bit more rigor before putting a bot like
that into production.

~~~
tptacek
It's not actually doing reversions, right? It's just scoring edits. Wikipedia
editors keep hot-lists of articles they watch edits for, or patrol lists of
edits or page creations; ostensibly, all this needs to do is sort those lists
of edits.
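
A minimal sketch of what I mean by sorting, assuming each edit on a watchlist already has a "probably bad" score between 0 and 1. The revision IDs, the `scores` mapping and the `triage` helper are all hypothetical; this is not the service's actual API.

```python
def triage(rev_ids, scores):
    """Order revision IDs so the most suspicious edits come first.

    rev_ids: list of revision IDs on an editor's watchlist (hypothetical)
    scores:  dict mapping revision ID -> estimated probability the edit
             is damaging; unscored edits are treated as 0.0
    """
    return sorted(rev_ids, key=lambda rev: scores.get(rev, 0.0), reverse=True)
```

The editor still makes every call; the score only decides which edits get looked at first.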

~~~
jacquesm
It's augmentation, not making any decisions on its own (and that's logical
given the false positive rate). So it's just an aid at this point but it can
probably help quite a bit with triage.

------
zellyn
Welcome to the spammer vs machine learning arms race, Wikipedia. :-)

(Former YouTube Abuse team member)

~~~
kuschku
Well, Wikipedia at least has humans working on it, who can catch issues and
improve it.

It can’t be worse than YouTube/Google’s systems anyway.

~~~
titanomachy
I don't see that many bot-generated comments anymore on YouTube. The genuine
user-generated comments are among the lowest-quality on any site that I
frequent, but I suspect there isn't much that Google can do about that.

~~~
kuschku
I’ve seen tons of spam, "buy viagra here", and I’ve seen hundreds of videos
that have just an empty image, no sound, and a full-screen annotation (and a
description) that link to another video site.

------
cowardlydragon
As referenced in the article, what is "good" and "bad" is dependent on
perspective.

Wikipedia has several perspectives just within the colloquial / perceived
mission. Is it for academic integrity? For common education? For the rich and
powerful? For the everyman? To record as much as possible? To cull the best
knowledge from the stream?

Is it for progressive ideals, which may be opposed by the common majority? Or
does it reflect a conservative viewpoint, adopting only what has proven to
coalesce in society?

For whose society? What about two sides of a war? Or two equally opposing
economic interests (entrenched oil vs alternative energy)?

"good" or "bad" isn't a number...

~~~
tptacek
"Good" simply means "unlikely to be reverted". The objective is to save time
for editors.

~~~
HarryHirsch
Maybe the better metric is usefulness to the reader. A suitable algorithm can
recognize things like "N.N. smells", but so can any user. The current
algorithms are very effective at dealing with "XY is gay" and "Buy herbal
viagra"; you don't see much of that.

The really insidious stuff is spin doctors for companies and nations. So far,
Wikipedia does a very poor job with geopolitical conflicts, and competing
financial interests are not taken seriously enough either.

~~~
iraphael
But the people who are supposed to define what's good for the reader are the
editors. They show the system what they think is good by either reverting
edits or not, so by saying "this edit is likely to be reverted", the AI is
saying "based on input from all editors, this edit may not be good for the
users".

Effectively, those are the same thing, unless you want to create a hard
definition of "good for the users". But that's going around the editors, and
taking their opinion of "good" out of the equation, which IMO is a very bad
idea.

------
tptacek
I'm surprised it took this long. Wikipedia has what would seem to be an
excellent and straightforwardly encoded set of training data: a constant
stream of edits and reversions.
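
Roughly what I have in mind, as a sketch only: assume a simplified event log where each entry has a `rev_id` and, if it is a revert, a `reverts` field naming the revision it undoes. That format is invented for illustration and is not how MediaWiki actually stores history.

```python
def label_edits(events):
    """Turn a stream of edits and reverts into training labels.

    events: list of dicts with 'rev_id' and, for reverts only, 'reverts'
            (the rev_id being undone). Hypothetical, simplified format.
    Returns a dict mapping each non-revert rev_id -> True if it was reverted.
    """
    reverted = {e["reverts"] for e in events if e.get("reverts") is not None}
    return {
        e["rev_id"]: e["rev_id"] in reverted
        for e in events
        if e.get("reverts") is None  # skip the revert actions themselves
    }
```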

~~~
logicallee
That's not training a model to predict the quality of edits - that's training
a model to predict the reactions of admins/editors :) Perhaps a better title
is "AI service models Wikipedia editors"?

~~~
tptacek
That's all the model needs to do to be valuable.

~~~
logicallee
Well, suppose something told me what I would decide if I gave any matter 15
seconds of consideration. It would be an invaluable tool I might use often,
for example for sorting and classifying, throwing out trash and so forth. For
example, it would be a fantastic spam classifier, since even after 15 seconds
there is no doubt. I could use it to get through hundreds of spam messages
easily, or delegate things to appropriate departments and so forth.

But it would not give me "x-ray vision", since at the end of the day it is the
same thing I myself conclude if I give the matter 15 seconds of consideration.
Something that I miss - a genuine mail written the way a spam might be ("hey!
it was nice to meet you!" in the subject, etc), by a recent acquaintance I
forgot I made - would be misclassified as spam. There is no x-ray vision
involved here - it's not training on the data - it's training on _me_!

I feel the distinction is an important one.

~~~
tptacek
It's 15 seconds of consideration you have to give to huge volumes of edits, so
singling out the ones where that 15 seconds is likely to be profitable for an
editor is extremely helpful.

(I have no idea how well the scoring works in practice).

Remember, the editorial tasks we're talking about here are enervating.

~~~
logicallee
fair enough, I guess I don't like the "x-ray specs" analogy. Their example is
a disruptive edit[1] where a URL is replaced by LLAMAS GROW ON TREES

perhaps rather than say it gives editors "x-ray specs", it would be fair to
say it gives editors caffeine :)

[1] -
[https://wikimediablog.files.wordpress.com/2015/11/diff_64221...](https://wikimediablog.files.wordpress.com/2015/11/diff_642215410.png)

------
Estragon
In case anyone else is interested in the source code, it's here:
[https://github.com/wiki-ai/ores](https://github.com/wiki-ai/ores)

------
exo762
This is very peculiar. A new tool is introduced, joining a set of existing
tools (bots) that address the same problem. Some comparison of quality and
robustness would be expected.

Instead, the author gives us some insight into his/her source of inspiration:

> A feminist inspiration

> “Please exercise extreme caution to avoid encoding racism or other biases
> into an AI scheme.”
> Wnt (from The Signpost)

And a few lines below we have this gem:

> While artificial intelligence may prove essential for solving problems at
> Wikipedia’s scale, algorithms that replicate subjective judgements can also
> dehumanize and subjugate people and obfuscate inherent biases.

The essence of not being sexist or racist is not evaluating people based on
sex or race! The author instead stresses that we should totally pay attention
to subjective factors (such as sex or race) instead of objective factors (such
as quality of edit).

And let me guess whose edits will be targeted with extra scrutiny... Edits of
people who don't buy the feminist agenda!

(Disclosure: I'm an egalitarian and an MRA. While I fully support women's
rights, I just believe that modern feminism does not have anything to do with
equal rights for anyone. This post nicely falls into this pattern.)

~~~
benten10
Alright buddy, I don't want to get in a flame war here, but I'll summarize my
arguments in short.

A hits B. Hard. When B tries fighting back, A suggests the world should be a
fair place, and really, he doesn't support violence at all, and there should
be global peace.

C turns D's family into indentured servants. When D becomes a part of a
revolution, and tries to make sure no one from C's (or D's) side can do it
again, C says "I'm all for equality yo, let's not be violent at each other".

F's people are particularly unfairly attacked by different members of G
groups, which are in power in their country. A lot of F's die unfairly and
prematurely, and are mistreated for no reason. When F's people say "our lives
matter, be kinder to us", some people who are in G's side say: "hey F, ALL
lives matter."

People called G's are unfairly discriminated against, overtly and covertly,
but often very openly. They are called names, and their presence in public is
often made very difficult, by people called H. So when G's get together and
try to get this G-rights going, some H's say: "Hey, there should be equality.
You shouldn't get special privileges. If you are systematically prejudiced
against, your bad, there should be equality okay". Let's call them HRA's. You
are HRA, exo762.

~~~
exo762
Your argument is in fact two arguments. The first one is: "women are
discriminated against". The second one is: "women are fighting back, men
brought it onto themselves".

So let's go through part one, "women are oppressed", point by point.

> A hits B. Hard. ...

Violence. 70% of non-reciprocal domestic violence is committed by women
against men. Article [0]. Men are the primary target of every police
intervention, even if it was the man who reported violence committed against
him by his female partner. Look up the Duluth model. And battered men have
nowhere to go - there are no shelters for male victims of DV.

> C turns D's family into indentured servants...

Slavery? What?

> ... members of G groups, which are in power in their country. ...

"Men in power". Yeah, those who are in power are mostly men. But what good
comes from it for men who are not in power? Majority of homeless are men;
majority of prisoners are men; majority of unemployed are men; majority of
victims of DV are men. Men get longer sentences for the same crimes; women get
suspended sentences for things men serve time for.

As for clear legal discrimination, there are multiple examples of it. Fathers'
rights are a joke, the draft is still for males only, selective service is for
males only; men can't opt out of parenthood, while options are available for
women. As for legal discrimination against women - there is one clear-cut
example and it is access to abortion.

As for your second argument. "Men had it coming, now repent for your sins".
This argument is dumb beyond recognition. Even Joseph Stalin said that "The
son is not responsible for the father's deeds". You seem to disagree with him
on the only thing everyone agrees with him on. And you basically place blame
ON HALF OF HUMANITY, collectively. Good job!

[0]
[http://www.nationalpost.com/opinion/columnists/story.html?id...](http://www.nationalpost.com/opinion/columnists/story.html?id=a41532d6-d4df-46a2-a784-f6499938f3b0)

If you want more sources, go here:
[https://www.reddit.com/r/MensRights/wiki/faq](https://www.reddit.com/r/MensRights/wiki/faq).
FAQs don't bite, and I've spent enough time typing out this answer.

EDIT: word of warning, /r/MensRights is a bad place, too much vitriol. Go to
/r/FeMRADebates instead.

