
In Defense of Inclusionism (2009-18) - luu
https://www.gwern.net/In-Defense-Of-Inclusionism
======
Permit
Purely as an end user of Wikipedia, I just want to chip in to say that I
have not seen any observable decline in Wikipedia over the last 15 years. I am
generally very happy with the results I get when I use it.

> The fundamental cause of the decline is the English Wikipedia’s increasingly
> narrow attitude as to what are acceptable topics and to what depth those
> topics can be explored, combined with a narrowed attitude as to what are
> acceptable sources, where academic & media coverage trumps any consideration
> of other factors.

I guess it sucks for people who wanted to write articles on each chapter of
Atlas Shrugged or thought Bulbasaur needed his own page. Presumably, though,
you agree there should be SOME criteria for what is notable enough to deserve
a Wikipedia article? For example, in order of increasing mundanity these
topics probably don't deserve a Wikipedia article: Me, my cat, my cat's water
bowl, the water in the bowl on May 5, 2020 etc.

Like, there must be some line that separates what deserves an article and what
does not. We can argue about where to draw that line but I'm actually
pretty happy with how Wikipedia has done it.

~~~
andrewla
> these topics probably don't deserve a Wikipedia article: Me, my cat, my
> cat's water bowl, the water in the bowl on May 5, 2020 etc.

Yes, they probably don't. But what is the cost of having them? If they confuse
an issue, like if your cat shares a name with a more notable animal, then
yours can just be renamed to "Fluffers (Permit's cat)". Who does it hurt to
have that page on there, even in the reductio ad absurdum? Even,
hypothetically, storage costs -- in the current deletionist world, the initial
article and its deletion will be preserved in history forever, so we don't
save anything.

If we need to provide the capability to flag pages as "deletionists don't like
this" and present a deletionist view of wikipedia to those who don't wish to
be exposed to that content, then go for it.

I don't mean to try to throw rhetorical exclamation points everywhere; I'm
genuinely curious about the cost of having a page about your cat's water bowl.

~~~
teraflop
Wikipedians have been arguing over these positions for many years, so if you
want to see the arguments in favor of deletionism, there are plenty of places
to look.

For example, here are IMO the most salient points from
[https://meta.wikimedia.org/wiki/Deletionism](https://meta.wikimedia.org/wiki/Deletionism):

> Some believe that the presence of uninformative articles damage the
> project's usefulness and credibility, particularly when casual visitors
> encounter them through Internet search engines or Wikipedia's "random page"
> or "recent changes."

> Articles on obscure topics, even if they are in principle verifiable, tend
> to be very difficult to verify. Usually, the more obscure, the harder to
> verify. Actually verifying such articles, or sorting out verifiable facts
> from exaggeration and fiction, takes a great deal of time. Not verifying
> them opens the door to fiction and advertising. This also leads to a de
> facto collapse of the "no original research policy", which is one of the
> fundamental Wikipedia policies. Empirically, there have been a number of
> hoax articles which were difficult to prove to be hoaxes but which could
> have easily been deleted by a sufficiently strict notability policy.

> Poorly-sourced articles can result in Citogenesis, as incorrect or unsourced
> information on Wiki (e.g., information that is the product of original
> research) is then repeated outside Wiki and eventually works its way into a
> publication that is normally regarded as a reliable source.

~~~
EamonnMR
When you hit random page, most of the time you get one of the following:

Uninformative article about a locality

Uninformative article about the football team of the above.

------
andrewla
As an inclusionist who has basically given up on editing wikipedia because of
deletionist forces, I read this article and see nothing but points I agree
with.

What I would like to see, however, is a good defense of deletionism. I have a
lot of trouble even understanding the point of view.

If there's some higher-level motivation about storage costs or something, then
I can buy it, but I don't think anyone is making that argument. Does it hurt
wikipedia to have low-quality articles? I think only insofar as the articles
are about subjects that are important, so the effect seems to be self-
balancing. Can wikipedia be used by people to glorify themselves with personal
wikipedia pages? Sure, but who cares -- unless someone does care in a
particular instance (i.e. a person named Albert Einstein tries to take over
the main article -- people will fight it because they want the article to
point to the right person).

~~~
tptacek
The argument for deletionism is straightforward: the scarcest resource on
Wikipedia --- and this article agrees! --- is volunteer time. Every article on
WP is a drag on volunteer time, because every article needs to be policed for
spam and vandalism. Wikipedia is an encyclopedia, not a Hitchhiker's Guide, so
scoping it to encyclopedic topics reserves volunteer time for meaningful
tasks.

 _later_

If you want to come at this from the vantage point of principles: one of the
oldest rules on Wikipedia is "No Original Research" (an encyclopedia is a
tertiary source). Articles on non-notable subjects are almost by definition
original research.

~~~
andrewla
This is one of the few responses I've seen that actually articulates a defense
of deletionism rather than just a summary of deletionist theory, so let me
engage here.

What exactly does policing for spam and vandalism consist of? There are a
number of automated tools already deployed on wikipedia that get improved over
time for detecting particular types of spam and vandalism. The use of
anonymous accounts, in particular, makes it hard to do any interesting spam at
scale. It seems to me that the larger wikipedia is, paradoxically, the easier
it will be to detect patterns of spam, especially out on the tail.

As for "No Original Research", that's probably the principle of wikipedia that
I'm most skeptical of; and really, it isn't respected, as wikipedia has become
such an important secondary source. I think it's useful as a guiding
principle, in that you don't want people using wikipedia as a platform to
advance their alternative theory of gravity, but for the most part the harm is
small to allow us to push the envelope a little bit.

~~~
tptacek
Policing spam and vandalism entails reading all the changes to pages you're
policing. That's why WP editors have watchlists, because no one person can
keep context for all the pages on the site in their head. A lot of spam
detection is automated, but a lot of spam isn't automated: it's people with a
fairly decent understanding of a topic surreptitiously editing its page to
give it a preferred spin (or promotion). The whole premise of Wikipedia is
that this is not allowed to stand.

As regards "no OR isn't a problem": if your feeling is that the project is
rife with OR, you should have no problem generating links to 5 pages that
demonstrate the phenomenon. Let's discuss specifics.

------
dang
A thread from 2016:
[https://news.ycombinator.com/item?id=13152255](https://news.ycombinator.com/item?id=13152255)

Thread from 2014 - interesting top comment there:
[https://news.ycombinator.com/item?id=8791791](https://news.ycombinator.com/item?id=8791791)

(Reposts are ok after about a year:
[https://news.ycombinator.com/newsfaq.html](https://news.ycombinator.com/newsfaq.html))

I've made the year in the title a gwernian range.

------
Stierlitz
“The fundamental cause of the decline is the English Wikipedia’s increasingly
narrow attitude as to what are acceptable topics and .. what are acceptable
sources, where academic & media coverage trumps any consideration of other
factors.”

.. as well as self-serving corporate and political interests. As in they sit
on an article 24/7 making sure nothing controversial gets in.

“Imagine a world in which every single person on the planet is given free
access to the sum of all human knowledge.”

Except for those that contradict the inner party. Go to the Talk Page and
discuss it they say. Do that and your account gets disabled for violating some
obscure WP rule.

------
narag
No comments yet? I guess others are still reading TFA...

There are a number of issues that seem to be eternal. Even if there's an
obvious right answer, I see year after year, decade after decade that they
keep being discussed, with the same points being made over and over again. Is
there a name for these?

I guess that for each of them, there is some kind of unspeakable reason that
trumps any sound rationale.

~~~
pdonis
_> I guess that for each of them, there is some kind of unspeakable reason
that trumps any sound rationale._

I think the general unspeakable reason is pretty simple: once you give people
power, some of them will misuse it. And since it takes more effort to correct
a misuse of power than to commit the misuse in the first place, any
institution that gives people power becomes more and more corrupt over time as
misuses of power outweigh valid uses of power.

~~~
kragen
The nice thing about Wikis in general is that it's usually easier to correct a
misuse of the power to edit a page than it is to commit the misuse in the
first place. That's why Wikis work at all.

~~~
narag
That works for the power to edit, not for the power to delete. So it makes
sense, yes.

------
miblon
I noticed the decline. About a year ago, during a hackathon, we tried to get
an article online. It took us 3 deletes and retries. Then I reached out to a
national official at Wikipedia and finally the article got accepted. That's
not good...

~~~
tptacek
What was the article? It'll have a history, and we can see for ourselves what
happened with it.

------
eitland
Same goes for stackoverflow.

Since the stackoverflow database is freely available I cannot see a single
good reason why they haven't been outcompeted years ago.

~~~
the_af
So I have my own beefs with stackoverflow and the stackexchange network at
large, but...

...at some point you have to ask yourself, _why_ haven't they been
outcompeted? For every post complaining about stackoverflow moderation or
policies, there are probably thousands of people using it every day to do
their jobs, and _it hasn't been outcompeted_!

So if we're honest, we cannot rule out the possibility they are doing
something right. They set out to replace closed sites like "Expert Sex Change"
-- ok, sorry for the joke, expertsexchange -- and also reduce low quality
noise. They succeeded and they are now the gold standard. So why hasn't anyone
simply taken their data and forked it?

A great example: softwareengineering.stackexchange -- formerly
programmers.stackexchange -- has a troubled history. At times it has decided
_everything_ was off-topic there (I'm not joking, there were times where every
question on the home page was closed as off-topic), and some long time
"inclusionist" and well-intentioned contributors declared they would fork it,
and their fork would include everything even tangentially related to
programmers and people wouldn't be censored for asking questions about
anything.

Where are those forks now?

~~~
sillysaurusx
First mover advantage. It's like asking why no one has created another Reddit,
or another HN. When the original exists, a derivative doesn't gain traction
unless something goes seriously wrong.

The sole time that Reddit was in danger was when their front page was blacked
out over the moderator rebellion. Luckily for them, Voat's servers completely
died under the load. Then Reddit came back online, and the rest was history.

I would imagine your questions have straightforward answers along those lines.
No one switches because inertia, not quality.

In general, a new thing has to be fundamentally different in some way. It's
rare that an internet product is out-competed by someone else doing literally
the same thing. Even Reddit was fundamentally different from digg. They were
both link aggregators, but that's about where the similarities stopped.

~~~
the_af
Agreed about the first mover advantage.

Do note StackOverflow is only _relatively_ a first mover: expertsexchange came
_first_, and StackOverflow killed it. Evidently, they did something better.

Wikipedia also had challengers. I don't remember their names (heh) but many
years ago there were forks. I don't mean Wikia, but actual alternatives to
Wikipedia. Nobody uses them.

> _In general, a new thing has to be fundamentally different in some way.
> It's rare that an internet product is out-competed by someone else doing
> literally the same thing._

Possibly right, too! But if an inclusionist attitude is not something
"fundamentally different" enough to succeed, then might it be that it wasn't
important enough to say that deletionism is killing Wikipedia/StackOverflow?
If people don't flock to inclusivist forks, maybe inclusivism isn't important
enough?

------
netcan
This sort of problem (or its opposite) is inevitable with our internet circa
2020.

Maybe wikipedia _is_ too narrow. Maybe it's too broad. These things don't have
singular, indisputable answers...

The problem is that we have an internet of bottlenecks. Wikipedia's choices
about what is encyclopedic or notable are the only definition of encyclopedic
that matters. Youtube's interpretation of fair use, twitter's definition of
offensive or facebook's definition of obscene... they're the working
definitions.

The internet needs to be less centralised... Even wikipedia.

------
domador
If Wikipedia is to retain a deletionist editorial culture, I'd at least like
to be able to access the history for deleted entries and read old versions of
such entries. As it stands, those entries seem to be permanently removed from
public access (and maybe even on the back end.) I don't like that 1984-style
versioning, which gives the message that certain entries never existed in the
first place.

(I'd understand a small exception for copyright infringing content--that such
content should remain unavailable when deleted.)

------
severine
Does anyone here remember Seth Finkelstein?

[https://www.theguardian.com/technology/2008/jul/31/wikipedia](https://www.theguardian.com/technology/2008/jul/31/wikipedia)

I miss his writing!

edit: His blog is still up:
[http://sethf.com/infothought/blog/archives/cat_wikipedia.htm...](http://sethf.com/infothought/blog/archives/cat_wikipedia.html)
but unfortunately, no new entries since 2013 :(

------
di4na
I have to say this is spot on. The move of things like the Culture spacecraft
names into wikia was a killing blow, particularly due to the use of ...
dubious... cookies and tracking on wikia.

And while i understand the feeling that this does not belong in wikipedia, it
was neither wrong nor low quality, and it was a good open door for new
editors. I could keep going, i have a long list of lost good information from
wikipedia over the years. This one is just the one that made me decide to stop
using it.

------
notriddle
> If you try to write niche articles on certain topics, people will tell you
> to save it for Wikia. I am not excited or interested in such a parochial
> project which excludes so many of my interests, which does not want me to go
> into great depth about even the interests it deems meritorious—and a great
> many other people are not excited either, especially as they begin to
> realize that even if you navigate the culture correctly and get your
> material into Wikipedia, there is far from any guarantee that your
> contributions will be respected, not deleted, and improved.

I hate to be seen as supporting deletionism, because I've never contributed
enough to WP to really have an opinion (I've mostly just expanded on existing
articles and fixed obvious vandalism and typos), but I don't understand why
this is a bad thing.

I don't like that they're being driven to an ad-supported website, but the
simple act of not having Wikipedia eat all of the web seems like a good thing.
The Iron Law of Oligarchy will weaken the website just like it's destroyed
every other website, with no good reason to believe it'll be different this
time. At least if Wikipedia is limited in scope, the damage it does will be
limited too.

~~~
vkou
> I don't like that they're being driven to an ad-supported website, but the
> simple act of not having Wikipedia eat all of the web seems like a good
> thing.

No, having all of this content under Wikipedia would be an amazing thing. For
all its warts, Wikipedia (When it has the information you are looking for) is
the gold standard in what the web _should_ be. It's amazing that it exists
as a website, and compared to the mountains of crud that surrounds it, it has
an amazing UX.

The inclusionist argument is that the web would be better if more of its
content was available under the same high standards that Wikipedia set.

Competitors can't deliver that same standard because of economies of scale.

~~~
notriddle
> Competitors can't deliver that same standard because of economies of scale.

The invasion of policy-driven jerks seems like a good counterargument; it's an
anti-economy of scale, since smaller websites have less of the problem.

edit: Arguing about scale also kind of ignores that there's a good space
between the scope of the website being "everything" and having a website
dedicated to the IDW "Sniffles the Mouse" comics. IMDB is well-respected,
TVTropes has been successful with a very different culture from IMDB, the Arch
Linux Wiki has a reputation for being useful even if you aren't technically
using Arch, and, as we've already proven, Wikia has pretty much made a
business out of this.

~~~
vkou
Smaller websites have this problem, it's just that the holy wars waged within
the Pokemon fandom rarely make it to the HN front page. If anything, the
politics get even more personal and vicious in smaller communities.

The business of Wikia may be successful, but as a user, it's complete rubbish.

------
Animats
Most of the important topics were covered years ago. Encyclopedias are not
high-maintenance; the maintenance team for Britannica wasn't all that big.

I used to edit Wikipedia quite a bit, but got into other things.

------
markwkw
Is it easy to tell what technology gwern uses for his website? I'm not
web-savvy, but the html looks like it's not generated but hand crafted. Is
it a static website?

~~~
astine
As Juped said, there is actually an in depth discussion of the tool used to
generate the website on the about page:
[https://www.gwern.net/About#tools](https://www.gwern.net/About#tools) It
looks like "statically generated website" is about right.

------
dntbnmpls
> fundamental cause of the decline is the English Wikipedia’s increasingly
> narrow attitude as to what are acceptable topics and to what depth those
> topics can be explored, combined with a narrowed attitude as to what are
> acceptable sources, where academic & media coverage trumps any consideration
> of other factors.

This is a result of the state/traditional media asserting more control over
internet propaganda. Year by year, what is acceptable on wikipedia, forums,
social media, etc has narrowed. Followed by the strong arming of these tech
companies into privileging authoritarian/authoritative sources.

Look at what happened to youtube. There was a time when they had great
recommendations and you could go into a "deep dive" of youtube for hours. Now,
the recommendations have no depth and are focused on "late night shows",
establishment news and pre-screened content. Google search is no better.
Reddit is just advertisement/pr ( celebrity X donated 5 masks to a hospital )
garbage and propaganda.

It's pretty impressive how easily and quickly a handful of "bankrupt" news
companies could control trillion dollar tech companies. What you could write,
say, etc and what you could watch, hear, etc has been limited in just a few
years. And the trend doesn't seem to be stopping. Just 10 years ago, nobody
could've predicted this could happen.

------
qu4ku
Gwern's website starts to be next level.

