
Who Writes Wikipedia? - hiteshiitk
http://www.aaronsw.com/weblog/whowriteswikipedia
======
mseebach
So, in the article it's claimed that Wales wanted to do a study using this
method, and (implied) if appropriate, revise his conclusion.

This was four years ago; does anyone know if that happened?

Edited to add: One thing that has been annoying me a little is when articles,
regardless of their quality, are deleted because they lack notability. If these
deletes adhere to the conclusions in this article (written by an outsider,
then deleted by an insider, rather than both written and deleted by insiders),
this strikes me as an example of a policy that should be changed in the face of
this evidence, since capturing the knowledge of these drive-by contributors
seems more important than "saving space".

~~~
aaronsw
Wales admitted he was wrong to me through a mutual friend. I don't think he
ever publicly recanted, though.

~~~
mseebach
That is rather disturbing. Well, as PG says, there's still room to do to
Wikipedia what Wikipedia did to Encyclopedia Britannica.

~~~
barrkel
I don't find it disturbing. I think it's very much in Wales' character;
starting out in soft porn, co-founding Wikipedia, and then passing himself off
as founder, rather than co-founder. Every time I see his face at the top of a
Wikipedia page, it sickens me, and similarly every time I see him described as
"founder". Thus, the Wikipedia banner sickens me twice.

AdBlockPlus element hiding rule: wikipedia.org###siteNotice

~~~
sielskr
Thanks for the AdblockPlus rule!

------
xentronium
As a moderator of a wiki-project, I can confirm Aaron Swartz's results. After
we had gained enough popularity, most of the articles were written by
outsiders and the core community was there only to maintain -- categorize, wikify,
create interlinks here and there, delete garbage and, last but not least, have
flamewars in the discussion pages.

~~~
sliverstorm
It's only logical: the people best suited to write articles are people who are
well-versed in the subject, and the core community can't possibly contain
experts on every subject.

~~~
ErrantX
I'm not sure that is always accurate; WP simply summarizes content available
in reliable secondary sources.

For most topics it is possible to do so without expert knowledge. Of course,
some areas have always needed experts to write. I'd never touch a medical
article, or macro-biology, for example as it is all double dutch to me :) But,
say, history is often easy to write if you have the source material.

------
erikpukinskis
A key takeaway for me is that you can't just take crude metrics, make some
inferences and run with them. Jimmy Wales has been running his organization
under a wide reaching and _completely wrong_ understanding. That's because he
equates number of edits with importance of contribution.

It's a basic psychology 101 concept, but one that's easily missed: don't equate
your operationalized variables with the phenomena you are trying to measure.

~~~
billswift
That's a danger with any management - it is all too easy to overvalue what is
easy to measure and more or less ignore what is hard. I have read about
problems caused by that in everything from the military through software
engineering in the present. It is also referred to as "you get more of
whatever you measure".

------
vrruiz
Felipe Ortega wrote a Ph.D. thesis devoted to this topic, "Wikipedia: A
Quantitative Analysis", available at
<http://libresoft.es/Members/jfelipe/phd-thesis>. His conclusions were featured
in a Washington Post article, "Volunteers Log Off as Wikipedia Ages", discussed
here <http://news.cnet.com/8301-1023_3-10403467-93.html> and here
[http://news.slashdot.org/story/09/11/25/160236/Contributors-...](http://news.slashdot.org/story/09/11/25/160236/Contributors-Leaving-Wikipedia-In-Record-Numbers)

~~~
Ras_
Second opinion (with stats) by Wikimedia Foundation Data Analyst Erik Zachte:

[http://infodisiac.com/blog/2009/12/new-editors-are-joining-e...](http://infodisiac.com/blog/2009/12/new-editors-are-joining-english-wikipedia-in-droves/)
(addendum:
[http://infodisiac.com/blog/2009/12/why-i-changed-the-title-o...](http://infodisiac.com/blog/2009/12/why-i-changed-the-title-of-my-previous-blog/))

------
quant18
The article's dismissal of an editor by complaining that "[his] edits were all
deleting things and moving things around" is a perfect illustration of why
"bytes added" is just as bad a contribution metric as "edit count". We're not
in 2001 anymore. The Internet is not short of bytes about Alan Alda or
Anacondas, and nor is Wikipedia short of people who add bytes about them.

Newly-added bytes may be true or false. They may be useful or not useful even
if true --- readers do not want every byte about a given topic, they want a
few tens of thousands of the most useful ones. (This is entirely orthogonal to
the inclusionist-deletionist debate about what _topics_ should be included.
Some bytes may be useless in the context of a main topical article about Alan
Alda himself, but they would be very relevant to a subtopical article about
Alan Alda's dental health).

For a popular topic, you'll have dozens of people adding bytes of varying
quality. Insiders subtract the false or useless bytes (an action easily
captured in statistics and then maligned on the internet by pundits), but also
look at the true and useful bytes, fact-check them, and then _leave them in
place_. This contribution --- curation --- is not captured in any statistics,
but it is an important part of the mechanism by which you can have 1 expert
and two enthusiastic amateurs stop by every few weeks on their lunch break to
expand an article with no centralised notice or approval, without having
the place turned into a mess by the 97 vandals and well-meaning incompetents
who came by in the meantime.

The real problem Wikipedia faces is in the long tail of topics, where there is
only one person adding the bytes, and that person is either grinding an axe,
self-promoting, or afflicted with incurable "nerdview". The well-meaning,
harried, underinformed Wikipedia insiders inevitably screw up when they try to
distinguish useless vs. useful bytes on these topics, but I wouldn't call the
outsiders who added the bytes in the first place "experts". Unfortunately both
sides' conduct may be scaring away the people who _are_ actual experts on
those long-tail topics ...

~~~
tokenadult
_Insiders subtract the false or useless bytes (an action easily captured in
statistics and then maligned on the internet by pundits), but also look at the
true and useful bytes, fact-check them, and then leave them in place. This
contribution --- curation --- is not captured in any statistics. . . .
Unfortunately both sides' conduct may be scaring away the people who are
actual experts on those long-tail topics._

Correct. But not just long-tail topics, but any topic that can only be
properly treated by deep immersion in the subject, and especially any topic
that is controversial. A lot of point-of-view pushers on Wikipedia do a lot of
their POV-pushing by "framing," just making sure that some wikilinks to
articles they like persist, while others are deleted, or that navigation
templates or article categories play up the articles that best represent their
point of view. And that's before we even get to the issue of deleting reliable
sources.

------
brianmckenzie
Out of curiosity, how many HNers write on Wikipedia? I do it every once in a
while, but only on a few topics I'm intimately familiar with, or to fix
obvious vandalism.

~~~
flomo
I started doing so, until I realized the only way you could really contribute
an article was to enlist for a lifetime of janitorial duty. Otherwise,
whatever work you did quickly degenerates into run-on sentences and trivia.

The article's point resonated with me, because it seems that the whole process
is based on the assumption that you're a no-lifer who spends all day
patrolling Wikipedia.

They advertised themselves as the "open source encyclopedia", but they have
not adopted any other concepts from the software development world, such as
"quality assurance" and "stable releases".

~~~
binbasti
_They advertised themselves as the "open source encyclopedia", but they have
not adopted any other concepts from the software development world, such as
"quality assurance" and "stable releases"._

Adopting the concept of forking, distributing the whole thing, and making it
easy to push and pull in changes from any source would be a great goal.

Someone already started hacking on a tool for converting the dumps to Git
repos, which may or may not be a suitable base to build upon:
<https://github.com/scy/levitation>

~~~
riffraff
how would they distribute the thing, if not through the dumps? I mean, using
an scm to distribute it (I asked the freebase guys the same thing) would save
time/bandwidth but it's not a conceptual improvement.

Also, how would forking be useful in any way? It's not like someone can work
on a feature that later gets merged into the mainline, right? Could you
explain the advantages?

~~~
binbasti
Forking would be the key to eliminate the gatekeeper problem, so that for
certain topics people could publish their own articles or make changes to
existing ones without needing the approval of some opinionated moderator or
admin. On the German Wikipedia this problem is so bad that earlier this year
we had a really big discussion that even extended from the blogosphere to the
major news outlets.

What I mean is that Wikipedia could learn from GitHub how a lot of forks plus
easy pull requests can exponentially increase participation and generation of
new content while still being able to retain quality.

 _It's not like someone can work on a feature that later gets merged into the
mainline, right?_

Why not? You could extend an article or contribute some new articles for a
topic, publish the content yourself, and try to get it into the main project.
Then on the main site there could be a list of forks, and even a network graph
exactly like on GitHub.

~~~
riffraff
sorry I do not understand: who is in charge of importing the data anyway, if
not the gatekeeper?

Also, I mean that the difference with forking source code is that you can work
on something that will not get accepted into the mainline, but is useful to
you, or your organization.

It does not seem to me that you can have a private wikipedia to which you
refer people, it would be useless. Same with forks: if an officially blessed
fork does not exist, you get anarchy and all of them become unreliable, cf.
the rails foreign key plugins for reference: when strong, active and clear
leadership is missing, projects fade into failure.

OTOH if you just want "easy merge", that does not seem to replace the
gatekeeper; you just made him more visible.

But if I am missing something, I'd be happy to understand :)

------
jdp23
Fascinating analysis, using a different metric and coming to a completely
different conclusion than Wales did. If he's right, it'll be interesting to go
back and reread what various people have written about Wikipedia. Four years
later, has anybody run the larger-scale analysis he suggests here?

------
alex_c
What are some examples of policies that would be detrimental to casual
contributors?

~~~
redthrowaway
There's a huge problem with the current Wikipedia community being almost
outright hostile to new editors. Edits that are either poorly formatted or
poorly written are often reverted rather than improved, and the edits of
unregistered editors are very likely to be reverted. Mostly, this is due to
the desire to fight vandalism, and fairly strong xenophobia.

With tools like huggle, it becomes trivially easy to scan hundreds of edits an
hour, and reverting is as easy as pressing 'r'. The problem is, vandal
patrolling with huggle is _really boring_. You're on the lookout for things to
revert, and can quite easily misinterpret someone changing a number ("512" to
"568") as subtle vandalism. So, someone comes along, fixes a figure that was
wrong, and within 1 minute, their change is reverted. It presents a really
uninviting face to the uninitiated.

As for xenophobia, the wikipedia community is almost paranoid when it comes to
outside influence. This distrust isn't unwarranted, as there have been many
instances of various groups plotting to have wikipedia reflect their reality.
What this means, however, is that external calls to edit a particular article
will likely wind up at AN/I (Administrators' Noticeboard / Incidents,
basically where you go to tell on people), and the first thing any new editor
would be greeted with would be a notice saying a diplomatic version of "we're
on to you".

Policy itself isn't the problem; it's the community. Policy both reflects and
shapes community dynamics. However, Wikipedia is a "Jimbotocracy". Jimbo can,
has, and will continue to, overrule the community and do whatever he wants,
usually in the form of banning / unbanning / stripping or granting of
privileges. In addition, ArbCom, the "Supreme Court", if you will, defers to
Jimbo. While Jimbo doesn't set policy directly (although he reserves the right
to), he does heavily influence it. His views on what policy should be are
often taken as passed down from the mountaintop on stone tablets, so it matters
what he thinks and why he thinks it.

------
Uhhrrr
I found the followup at the bottom of the page equally interesting:
<http://www.aaronsw.com/weblog/writefp>

------
jrockway
There is that girl from Indonesia, and Jimmy Wales. And some other people.
According to the ads.

------
ErrantX
The problem is, neither method alone is a good measure of value.

Plenty of articles I come across have "drive by additions" of some length that
are unreadable, repetitive and disturbingly worded. There are contributors
whose sole purpose is to go round and copyedit that content, which they may do
in a number of edits... but change (by comparison) very few words. It can
still take hours of effort.

A Wiki article is the sum of all those edits.

------
jaysonelliot
One of the primary reasons that Wikipedia is written by such a tiny percentage
of its users is that the process of writing and editing is arcane and shrouded
in one of the most insular online cultures in existence.

I don't know if greater access to its priestly class would mean a better or
worse site, honestly. I'm sure there are thousands of people who are experts
in their fields and could improve the content of the site greatly - but I can
imagine that opening the floodgates and working to democratize authorship and
editorship could also drag it further into the gutter.

~~~
carbocation
The article that we are discussing is actually saying quite the opposite: that
the bulk of Wikipedia is written by many, many users. The bulk of the _edits_,
on the other hand, are made by the few.
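
To make the distinction concrete, here is a rough Python sketch that computes both metrics over a tiny, made-up revision log. The editor names, the numbers, and the "insider" grouping are all hypothetical. It is also a simplification: it counts every character added, whereas the article's measure, as I read it, only counts text that survives into the current version.

```python
from collections import Counter

# Hypothetical revision log: (editor, characters_added, characters_removed).
# Names and numbers are invented purely for illustration.
REVISIONS = [
    ("insider_a", 12, 40),
    ("insider_a", 5, 3),
    ("insider_a", 0, 120),
    ("insider_a", 8, 8),
    ("insider_b", 20, 15),
    ("insider_b", 3, 60),
    ("drive_by_1", 900, 0),
    ("drive_by_2", 650, 10),
    ("drive_by_3", 400, 0),
]


def edit_counts(revisions):
    """Number of edits per editor (the 'edit count' metric)."""
    return Counter(editor for editor, _added, _removed in revisions)


def chars_added(revisions):
    """Total characters added per editor, ignoring removals."""
    totals = Counter()
    for editor, added, _removed in revisions:
        totals[editor] += added
    return totals


def share(counter, editors):
    """Fraction of the counter's total that belongs to the given editors."""
    total = sum(counter.values())
    return sum(counter[e] for e in editors) / total if total else 0.0


insiders = {"insider_a", "insider_b"}  # assumed grouping, not from real data
edits = edit_counts(REVISIONS)
added = chars_added(REVISIONS)

print(f"insiders' share of edits:      {share(edits, insiders):.0%}")
print(f"insiders' share of added text: {share(added, insiders):.0%}")
```

On this toy log the two insiders make most of the edits but add only a small fraction of the text, which is the pattern the article describes. Neither number says anything about removals or fact-checking, so both miss the curation work mentioned elsewhere in the thread.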

------
woodall
Here is a dump I made some time ago but just posted. I am torn on what to
think. These companies know more about their products than the general public,
and thus are more qualified to talk about them, but there's that little cynic
in me saying they shouldn't. Take it with a grain of salt because some of them
are legit.

[http://www.christopherwoodall.com/blog/?x=entry:entry101221-...](http://www.christopherwoodall.com/blog/?x=entry:entry101221-024331;comments:1)

------
zavulon
Even though I disagree with his views on just about every political issue,
Aaron is an inspiration for me. I would love to be in a position where I don't
have to worry about money and can invest a significant amount of time in things
that interest me.

It seems like ever since I started working for myself, I don't have time for
hobbies at all - everything I do has to go through a prism of _how will this
affect my bottom line_...

------
guelo
2006

------
known
I came across many self-appointed _editors_ in Wikipedia.

