
The thing that your post misses, and that most people who criticize Wikipedia miss, is that anything written anywhere could be wrong or malicious.

Any book, any blog, anything could be wrong by accident or on purpose. Wikipedia is open enough that we can investigate it and readily determine who the malicious or incorrect actors are. That is far more than any book from antiquity, most blogs, and even much professional journalism provide.

Should we trust Wikipedia implicitly? Of course not.

Should we trust any single source implicitly? Of course not.



When you view any page's history, all you get is a list of pseudonyms and IP addresses. Sure, you can probably get the country of origin and sometimes maybe even the employer from the latter, but that's basically it. I don't think that's very much, given that nobody on Wikipedia is required to disclose who they are in real life (there's even a policy specifically against posting such information about others[1]). Well, okay, in theory users with a financial conflict of interest in a topic are required to disclose it, but in practice PR agencies can often get away with doing anything to articles about their clients.

And openness in itself doesn't guarantee reliability anyway. OpenSSL being open-source didn't prevent Heartbleed from happening. The so-called "Linus's law" is often a mere vacuous truth; there are rarely enough eyeballs to make even the most obvious bugs shallow. And eyeballs themselves don't help if they keep looking in the wrong places.

The problem with Wikipedia isn't merely that it may sometimes be inaccurate. It is that its sources of inaccuracy are much more numerous, most are rather predictable, and yet next to nothing is done to address that. A book written by a single author may also be inaccurate; but I'd be more inclined to believe a book written by someone 0) whose good reputation is at stake and who therefore has a disincentive to actively misinform (sometimes at least), and 1) who has an editor above them watching out for inconsistencies; rather than a bunch of pseudonyms and numbers that can immediately publish anything they think up with basically no oversight. Trustworthiness is a spectrum, and Wikipedia does significantly worse on it than many other sources. Review processes on Wikipedia are overworked, and even those focus mostly on style rather than substance. There are some relatively sensible policies in place, but they are quite vague (and therefore bendable to the ever-changing whims of particular users) and selectively enforced. And don't get me started on the dispute resolution processes...

There is no systematic, consistently applied process to ensure Wikipedia's accuracy. It is addressed just like the gun problem is in the US: it proceeds from one outrageous incident to another[2], with everyone applying barely adequate ad-hoc measures after the fact, while the people who do have the power to make a real difference are either in complete denial that this is a problem at all or are too attached to the idea that the foundational laws can never ever be wrong to actually enact meaningful reforms.

[1] https://en.wikipedia.org/wiki/Wikipedia:OUTING

[2] https://en.wikipedia.org/wiki/Wikipedia:List_of_Wikipedia_ho...


There are new ways to introduce inaccuracies, but there are also new ways to weed them out.

There are enforced rules about citing sources, the history of any page can be checked, there is discussion for just about any page, and more controls that don't exist on any book or news site. Errors are often fixed as soon as they are discovered, which is impossible for books and less common than it ought to be for web pages.

As for anonymity, the author of most content may as well be anonymous. Other than for the most famous authors or the most learned readers, the author of any given data is generally not of interest for practical use. For the domain expert this is of course not true, but the domain expert is generally publishing in a peer-reviewed journal and reading from similar sources with an extreme eye on things like bias and sample size.

Since we are using Wikipedia as a source, for maximum irony here is the page on Wikipedia about the reliability of Wikipedia: https://en.wikipedia.org/wiki/Reliability_of_Wikipedia

It cites 240 sources, including the peer-reviewed journal Nature and panel discussions of specific domain experts.


> There are enforced rules about citing sources

There are articles with no citations that have been lying there and accumulating dust for almost ten years.[0] I wouldn't call that "enforced". And when you bring up the verifiability policy in deletion debates, sooner or later someone will inevitably say "the policy says the article has to be verifiABLE, not verifiED" or other such nonsense. Bring in enough users like that and the only sort-of-working cleanup process on Wikipedia grinds to a halt.

And let's not forget that while verifiability policies require sources to be "reliable", this word isn't defined anywhere. Which makes the definition bendable to the whims of the users at hand, as long as nobody else notices. The result is predictable.[1]

> the history of any page can be checked

Giving very little useful information, as I wrote before. And what does it matter if it can be checked? Who actually does it?

> there is discussion for just about any page

Oh yeah. Like the over 40 thousand words written over whether to capitalise a preposition, which only stopped after Randall Munroe drew a comic which called it out as an embarrassment to the project.[2] (And I don't even like the guy.) And no, it's not just a singular example. Disputes like this arise all the time; Wikipedians have dutifully collected them into a list[3], from which of course they refuse to draw any conclusions.

The discussion pages are often a liability rather than an advantage. It's easy to understand why those prolonged discussions happen: one, because they tend to concentrate on colour-of-the-bike-shed issues like this, and two, because nobody has a final say on any issue. You win discussions on Wikipedia usually 0) by inviting your friends to support you, 1) by making passive-aggressive remarks just stinging enough to annoy your opponents, but not enough to constitute overt, actionable personal attacks, and 2) by playing silly word-games with policies to twist them to your will. (Of course, rules do exist against all three, but see point 2.) This will leave your opponents outnumbered and/or too tired to argue any further, at which point you just wait for an administrator to declare consensus in your favour. Of course, it is easy to imagine what kind of people can afford to employ these tactics effectively: PR agencies and OCD sufferers with lots of free time.

> Errors are often fixed as soon as they are discovered

Which can be as late as 10 years after being introduced, and after careless writers have incorporated those errors into their own work, as happened with Jar'Edo Wens. Counter-vandalism patrol only looks for overt hooligans, not fraudsters. Traditional publishing, by contrast, makes errors harder to introduce in the first place.

> As for anonymity the author of most content may as well be anonymous. Other than the most famous authors or most learned readers the author of any given data is generally not of interest for practical use.

So page histories are useless after all? Is it not true that "we can investigate it and readily determine who the malicious or incorrect actors are"?

> Since we are using wikipedia as a source, for maximum irony here is the page on wikipedia about the reliability of wikipedia:https://en.wikipedia.org/wiki/Reliability_of_Wikipedia

> It cites 240 sources, including peer reviewed journal Nature and panel discussions of specific domain experts.

I think you mentioned sample sizes somewhere in your comment... didn't you? What does one article being ostensibly well-referenced prove? Can you trust the article to be actually supported by those sources? Never mind that the single Nature study, which you're no doubt referring to, used rather questionable methodology and is seriously outdated at this point.

[0] https://en.wikipedia.org/wiki/Category:Articles_lacking_sour...

[1] https://en.wikipedia.org/wiki/Category:All_articles_lacking_...

[2] https://xkcd.com/1167/

[3] https://en.wikipedia.org/wiki/Wikipedia:LAMEST


Yet despite all that, and your apparent bias, you continue to use Wikipedia as your primary source.


Admins don't even understand their own policies.



