Wikipedia fails as an encyclopedia, to science’s detriment (arstechnica.com)
67 points by nikbackm on Dec 29, 2015 | 56 comments

"And it's not all physics and math; the entry for the Ka/s ratio, a useful measure in evolutionary biology, assumes you already know a lot about evolutionary biology and the genetic code."

It seems to me that any information - in an encyclopedia or anywhere else - on a subtopic must assume knowledge of the parent topic.

Other gripes by the author seemed to be nitpicks about article quality, which are fair but extremely minor.

A well-written encyclopedia article should still be able to instruct a generally educated lay reader about what a thing is and why it's notable even if the more detailed explanation heads into depths that someone without appropriate background may not be able to follow.

I had a very similar discussion with someone involved with the Wikimedia Foundation earlier this year and he highlighted math/science as having exactly this issue. Way too many articles seem to be written by people who are far more comfortable and interested in using the equation editor than in providing an intelligible explanation.

The problem isn't universal to be sure. But it is widespread, especially in less popular topics.

There is a formal language for mathematics because we need it for efficient work. This formal language is intelligible, clear and instructive, given adequate education. It must be present in an encyclopedia.

Some mathematics articles, for example the one on determinants, are noisy for me because of the well-intentioned “educational” parts. In this example I even think they do more harm than good for people who are struggling with the concept.

The first part of the “Definition” section starts by explaining, in horribly incomprehensible natural-language English, a way to compute the determinant of a matrix. That text is unnecessarily confusing and complex. It also serves as a good example of where formal language can be easier to grasp than natural language. Would you explain quicksort in natural language rather than with a formal description?

If mathematics is taught like in that article, no wonder kids find math boring and hard.
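To make the quicksort comparison above concrete, here is a minimal Python sketch of a formal description. It is an assumption-laden illustration, not Hoare's original in-place partition scheme: it builds new lists instead of swapping within an array, and the name `quicksort` and the middle-element pivot choice are mine.

```python
def quicksort(items):
    """Return a new sorted list (list-building variant, not in-place)."""
    if len(items) <= 1:
        return list(items)
    # Assumed pivot choice: the middle element. Other choices work too.
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    # Sort each side recursively, then concatenate around the pivot group.
    return quicksort(smaller) + equal + quicksort(larger)
```

Calling `quicksort([3, 1, 2])` returns `[1, 2, 3]`. Whether this reads more easily than a paragraph of prose is exactly the question the comment raises.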

Yeah, if a "natural language" explanation is just writing out mechanical mathematical steps or formulae in clumsy English, it's completely missing the point.

Use natural language for what it's good at. Where did this mathematical idea originate? Who introduced it? What was the problem this mathematical idea helped address? How much does it help address that problem (assuming the why and how are beyond the layman, just describe how helpful it was.)

The Quicksort intro by gohrt is a perfect example.

You're not using the natural language to do away with the thorough mathematical explanation, you're giving the lay reader an idea of why the concept is important, why they should care, in the most general sense.


> Quicksort (sometimes called partition-exchange sort) is an efficient sorting algorithm, serving as a systematic method for placing the elements of an array in order. Developed by Tony Hoare in 1959,[1] with his work published in 1961,[2] it is still a commonly used algorithm for sorting. When implemented well, it can be about two or three times faster than its main competitors, merge sort and heapsort.[3]

Next paragraphs as well: they name and link properties it has, but also give a short definition of them, where you'd otherwise have to follow a bunch of links and read the definitions there.

The encyclopedia articles are not supposed to teach math. And most of the article can be incomprehensible to most people.

Having a plain English lead paragraph that puts the article in context is very important, but the leads are often hopeless.

I've had this problem with Wikipedia too when looking up various scientific or mathematical references I find here. Most do not have a layman's explanation at the top of the article, requiring me to look up other unknown terms referenced in the summary.

I really wish more articles had a summary for laymen with a more detailed explanation below.

Some scientific terms are too deep to directly describe them in layman's terms in a reasonable number of words.

Also, what one reader calls a layman's explanation another calls gobbledygook.

That's one of the reasons for inventing hypertext. For example, https://en.m.wikipedia.org/wiki/Riemannian_manifold has a nice introduction paragraph, but it presupposes quite a bit of knowledge. If you want to know more, feel free to click some links to learn more.

And yes, things probably aren't presented in a way that is optimal for _your_ learning, but there is no way to do that for every reader.

I have read the Wikipedia description, and at least half a dozen other mathematical descriptions, and the only description of the Chinese Remainder Theorem that made sense was described here[0] (in the very challenge that prompted me to research it).

I feel there's definitely room to make the Wikipedia description more friendly when it runs for pages and is unreadable from my side.

[0] http://cryptopals.com/sets/5/challenges/40/
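For readers in the same position, the theorem may be easier to see in code than in pages of notation. This is a minimal sketch, assuming pairwise coprime moduli and Python 3.8+ (for the modular inverse via `pow(m, -1, n)`); the function name `crt` is hypothetical and the code is not taken from the Wikipedia article or the linked challenge.

```python
from math import gcd

def crt(residues, moduli):
    """Find x with x % m_i == r_i for pairwise coprime moduli m_i.

    Returns the unique solution in the range [0, product of moduli).
    """
    x, m = 0, 1  # current solution x is valid modulo m
    for r, n in zip(residues, moduli):
        if gcd(m, n) != 1:
            raise ValueError("moduli must be pairwise coprime")
        # Choose t so that x + m*t has remainder r modulo n:
        # t = (r - x) * m^(-1) (mod n)
        t = ((r - x) * pow(m, -1, n)) % n
        x += m * t
        m *= n
    return x
```

For example, `crt([2, 3, 2], [3, 5, 7])` returns `23`: the number that leaves remainder 2 when divided by 3, remainder 3 when divided by 5, and remainder 2 when divided by 7.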

I'm inclined to agree with you, but instead I agree with the author's sentiment as represented here -- not specifically on Ka/s however. I spend a considerable portion of my time reading Wikipedia articles and very much enjoy it, however:

despite a strong background in science and engineering, I can't make much of the considerable majority of mathematics-heavy articles. Many articles, as the author no doubt notes, define concepts in mathematical symbols which are left undefined, whose meaning is utterly lost on someone without the relevant academic background. For the rest, I find myself lost in a nigh recursive maze of definition upon definition.

Imagine, for example, trying to understand https://en.wikipedia.org/wiki/Camera_matrix with little mathematical background.

But who are the hypothetical readers who need to understand the camera matrix without knowing math? In general, my feeling is that most Wikipedia articles are pretty well pitched to the people that would be interested in the article topic.

Camera projections are a very common technique in computer graphics, and software often refers to this kind of terminology. When I try to get a better understanding of what they actually mean, I often find Wikipedia articles lacking. It's like looking up a traditional food and finding only a recipe.

I'm not sure that the idea that an encyclopaedia should be targeted only at the already well educated, especially one with the goal that Wikipedia has, is much short of elitism.

...what exactly is missing?

Okay, the paragraph about projective spaces and degrees of freedom is a bit of mystery to me, but the actual concept seems to be simple enough. Should every article utilizing matrix multiplications to describe linear mappings spend time rephrasing https://en.wikipedia.org/wiki/Matrix_%28mathematics%29 i.e. a basic linear algebra course?
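For anyone following along without the linear algebra, the core idea can be sketched in a few lines of Python. This assumes the simplest pinhole setup (camera at the origin looking down the z axis, no rotation or translation); the function name `project` and the focal length value are hypothetical choices for illustration, not something taken from the article.

```python
def project(P, point3d):
    """Apply a 3x4 camera matrix P to a 3D point; return 2D image coordinates."""
    X = list(point3d) + [1.0]  # homogeneous coordinates (x, y, z, 1)
    # Matrix-vector product: one output value per row of P.
    u, v, w = (sum(P[r][c] * X[c] for c in range(4)) for r in range(3))
    if w == 0:
        raise ValueError("point lies in the camera's principal plane")
    # Dividing by w is the perspective step: farther points land nearer the centre.
    return (u / w, v / w)

f = 2.0  # assumed focal length, arbitrary units
P = [
    [f, 0.0, 0.0, 0.0],
    [0.0, f, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
```

With this matrix, `project(P, (1.0, 2.0, 4.0))` gives `(0.5, 1.0)`: each image coordinate is the focal length times the point's x or y, divided by its depth z.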

Yeah, I'm all for adding as much layman-intelligible context as possible, but you can always make these sorts of gripes if you expect arbitrarily niche science topics to be self-contained. The fact is that there's just not that much to say about quantum memory if you don't have the background to know what a Rabi oscillation is.

I wonder to what extent this is just a humanities/sciences divide. I'd bet that scientists reading wikipedia far outside their specialty are less surprised and frustrated by the level of accessibility than are non-scientists. With important exceptions, topics in the humanities generally just have fewer levels of dependency.

What are the exceptions?


(Needless to say, I don't think these levels of dependency imply true intellectual depth...)

But not only is that not the case, we don't expect that to be the case for most topics. Most people expect to go to Wikipedia, type in Caligula, and get an article about Gaius Julius Caesar Augustus Germanicus that was titled "Caligula" and which mentions that he is commonly known as Caligula in the first paragraph.

If the people who edited the history subsections had the same attitude as the people who edited many of the math and science pages, there would be only an article titled "Gaius Julius Caesar Augustus Germanicus", which wouldn't use the name "Caligula" anywhere, but somewhere deep within the article would mention that he had the nickname of "little soldier's boot." Because, hey, you need to have the knowledge of the parent topic, and if you don't even know the guy's name then why are you even looking up the article?

I'm sure the math and science people would go nuts if the articles for other topics were written in a similarly opaque fashion. But they seem to mistake the clarity others write with as evidence that those topics are inherently simpler.

> It seems to me that any information - in an encyclopedia or anywhere else - on a subtopic must assume knowledge of the parent topic.

I would argue that any subtopic that requires more than basic understanding of the parent topic is too specialized for a general knowledge encyclopedia.

That's why "general knowledge encyclopedia" isn't a good term.

If you mean general knowledge in the sense of "knowledge which is generally held," or even "knowledge which is generally accessible," then a general knowledge encyclopedia wouldn't be that useful, by definition, to most people.

Crucially, it would be useless to the people most likely to use an encyclopedia!

A general knowledge encyclopedia should aim to curate knowledge with no discrimination in terms of category. This is what Wikipedia is, and it's really good at it.

I think there is room for improvement in linking topics and subtopics, even potentially in-line within an article. See a word you don't understand? Click the "+" next to it to expand a sentence that fills its place syntactically and semantically!

Try logging in to Wikipedia and enabling Hovercards[1] in the beta features. It creates a pop-up with a summary of linked pages when you hover over the link.

[1] https://en.wikipedia.org/w/index.php?title=Special:Preferenc...

Oh wow, awesome. I'll give that a shot!

> Click ... to expand a sentence that fills its place syntactically and semantically!

proof of concept of something like what you're describing: http://www.telescopictext.org/text/pFjkqQY9bmfvQ (not the best example, but telescopictext is very cool)

The author of the editorial comments, "Disturbingly, all of the worst entries I have ever read have been in the sciences. Wander off the big ideas in the sciences, and you're likely to run into entries that are excessively technical and provide almost no context, making them effectively incomprehensible."

I am a Wikipedian. I have been editing various articles since 2010. I have won the Million Award[1] twice for major improvements to high-traffic articles. In the sciences I follow most closely in my own research, the Wikipedia articles are even worse than "excessively technical and provide almost no context," and are often simply wrong. The worst part is that many Wikipedia articles about science are wrong more because of omission of things that every working scientist in each field knows than because of miscopying of correct statements about science. Most Wikipedia articles are still very thinly sourced, and most are based only on sources that appear online, and many of those sources are not from professionally edited publications.

I think it would be a good idea for a rich philanthropist to fund a Free Online Encyclopedia X Prize to see what combination of organizational, technological, and other factors could build a team of encyclopedia-compilers that could put together a better free, online encyclopedia than Wikipedia. Right now, Wikipedia gets a lot of external funding, but it has a surprisingly tiny number of active editors,[2] and the Wikimedia Foundation strategic plan still calls for more improvement of content.[3] I think Wikipedia has exceeded everyone's expectations, and I see people who have access to better scholarly resources use it almost every time I'm with other scholars, but I also think Wikipedia today is like Excite or Lycos in 1998: the best available service in its category, but rather easily displaced by something as good as, say, AltaVista. There is still a lot of space to make a much better free, online encyclopedia, and some friendly competition might make Wikipedia a whole lot better too.

[1] https://en.wikipedia.org/wiki/Wikipedia:Million_Award

[2] https://en.wikipedia.org/wiki/Wikipedia:Wikipedians#Number_o...

[3] https://wikimediafoundation.org/wiki/Wikimedia_Movement_Stra...

> surprisingly tiny number of active editors

Nothing surprising about it, given the arcane rules and the likelihood of being told to "f--- off" if you try to edit an article someone has camped out on. Gave up trying to contribute years ago.

I don't think Wikimedia understands just how many editors they have lost due to this problem. It is simply too late to fix. Wikipedia is dead to far too many people with knowledge that could have been shared.

> could build a team of encyclopedia-compilers

OK, but if we want better science articles, we need to find a way to get practicing scientists to contribute. You can't improve the article on quantum memory without them.

> ...that could put together a better free, online encyclopedia than Wikipedia

What's the advantage of making it separate from Wikipedia? The only one I can think of is that you can put domain experts in charge who are allowed to appeal to their expertise rather than verifiability. But in this case, you want domain-specific encyclopedias rather than general purpose ones, like the celebrated Stanford Encyclopedia of Philosophy.

Arguably, you want better science writers to contribute who may or may not be practicing scientists. After all, the goal probably shouldn't be to replicate physics journal depth in Wikipedia but, rather, to provide a mainstream source of information for those who may have some background but are not specialists on a given topic.

> ...you want better science writers to contribute who may or may not be practicing scientists.

The practicing scientists don't necessarily need to be controlling the whole process, but if they aren't playing an integral role then the encyclopedia will be bad. Even in the mainstream news, reporters know that they can't just read a few journal articles and then start writing up a popular piece. They have to interview scientists.

In practice, I think the only way to do this for a reasonable price is to use non-monetary incentives (e.g., fame) to get the scientists to do at least some of the writing themselves. But maybe someone can figure out how to start a foundation that hires professional writers to interview scientists.

(As an aside, this does make me wonder why, when there is a collaboration between skilled writers and skilled scientists, the writers are always in charge and use the scientists as a resource. You'd think there'd be room for a model where the scientist outlines all the bullet points and uses the writer as a resource.)

> ...the goal probably shouldn't be to replicate physics journal depth in Wikipedia...

The question of whether to purge all physics topics from Wikipedia that have no hope of being useful to laymen (of which there are many thousands) is more-or-less the inclusionist/deletionist debate. Personally, I don't see why you'd want to get rid of them.

>As an aside, this does make me wonder why, when there is a collaboration between skilled writers and skilled scientists, the writers are always in charge and use the scientists as a resource.

Probably because it's the writer who is publishing the book whereas most scientists are far more focused on journal articles. That said, I'm sure that many of the well-known working scientists who also publish books had plenty of editorial help--as, indeed, do most published writers.

>is more-or-less the inclusionist/deletionist debate

I'm actually mostly in the inclusionist camp myself. I think notability is pretty hard to define because it's so dependent on context. Pretty much everything/everyone is notable if you get local enough.

> What's the advantage of making it separate from Wikipedia?

You don't get sucked into half a million words on what type of dash to use in page titles.

Another option would be to get more scientific knowledge online. When I log into my local library with my library card and do a JSTOR search, the difference between those results and what Google hands back are night and day.

"The Internet" returns the cheap, fast and crappy version of so, so many things. And people seem oblivious to it.

Lately I've been thinking about what it would take to implement fork/merge functionality on top of the Wikipedia database, to allow people to host their own versions of pages with specific changes, still updated with most of the other edits from the original. Do you think it would be a good idea to start several forks of Wikipedia with different rules of conduct and approaches to community, to see how they compare?

This is what I expected the article to be about: allowing the information to be distributed and democratized rather than belong to one organization's policies for all topics.

As for the depth of scientific entries, I rather enjoy being out-of-my-depth and after numerous encounters with related topics am able to appreciate more and more of it.

Let's think a bit broader: if you did truly solve that problem it's quite likely that the solution would have broad applications across a wide range of domains.

Unrelated, but I am curious on your opinion on something: do you perceive the current shortcomings as something that stems from the tool (the wiki platform itself), the organizational structure (the people), or a combination of both?

It's a combination of the people and the realities of a crowdsourced platform.

The problems in technical topics (beyond the, often, lack of coherent editing that can be a problem in general) include:

- Editors who think they know a lot more about a topic than they do

- Editors who are genuinely expert but aren't skilled at explaining the topic to someone who wasn't, say, a math major. (Or, indeed, it doesn't even occur to them that they should do so.)

> I am curious on your opinion on something: do you perceive the current shortcomings as something that stems from the tool (the wiki platform itself), the organizational structure (the people), or a combination of both?

Thanks for your question. To keep down the total number of replies in this thread, I'll also refer to other questions posted in replies to my comment.

I think that Wikipedia currently has a hostile editing culture that especially discourages participation in editing by subject-matter experts, who cannot count working on Wikipedia as "publishing" in the usual way that publishing is rewarded for academics, and who will find the willful ignorance of many Wikipedians more than a little tedious. I don't have any problems with the Wikipedia software. A wiki (a broadly editable collaboratively edited document) does seem to be the way to go to build a good, free, online encyclopedia--Wikipedia is proof of concept for that.

I think Wikipedia's human culture around editing started out pathological and stayed pathological because there was too much domination of the culture by amateurs (in editing) at the beginning of the project. I think a philanthropist would do well to fund a Free Online Encyclopedia X Prize because there are many forms of editorial culture in online wikis that have not yet been tried, and it may be that Wikipedia itself will reform its culture if it is exposed to more competition. (The very small number of people who so far participate actively in editing Wikipedia are easily replaceable if a new project comes along with a culture more welcoming of subject matter experts and of editors with professional editorial experience.) I care not at all which organization wins if there is an X Prize competition, but I would like to see the experiment tried, because plainly people all over the world will continue to use online encyclopedias, and there is still a lot of low-hanging fruit to pick for improving the existing online encyclopedias.

I agree with the comment (a subcomment of another reply to my earlier comment) that says that what Wikipedia (or some other project) needs more of is more professional writers who are used to explaining technical concepts to the general public. Yes, Wikipedia needs much more of that, and more sound editorial judgment in general.

> Let's think a bit broader: if you did truly solve that problem it's quite likely that the solution would have broad applications across a wide range of domains.

That's an important point. There are many projects besides building a free, online encyclopedia that would need to bring about very similar forms of collaboration, and you are correct that working on the online encyclopedia improvement project will probably turn up knowledge that will be applicable to other projects.

AFTER EDIT: Some other comments mention Simple English Wikipedia. But Simple English is a controlled vocabulary, and it's not even clear as a matter of linguistics that Simple English is more understandable to second-language learners for general purposes than everyday English carefully written by a sensitive editor. I happened to look at some of the Simple English articles on the topics I research in the last year, after extensive editorial work on the main Wikipedia articles on the English language, and the whole Simple English Wikipedia has an even bigger problem than most versions of Wikipedia with thinly sourced, incomplete articles.

Thank you for the reply, these are some really interesting insights.

To the problem of laymen understanding any given article, I almost question if this is truly a bad thing? To me, it seems that the 'web' itself was meant in the beginning to help solve this problem in the sense that you could traverse up and down a network of related hyperlinks to gain the context/knowledge needed to understand the page at hand.

This kind of leads me to the thought that maybe it's not the individual articles that are really at issue when it comes to the widest audience being able to read with understanding: maybe what is missing is the 'meta' structure which would allow a user to quickly understand what prerequisite knowledge they need, and where to find it.

In teaching, the analogue is the scope and sequence. We introduce each subject and topic so that it builds on previous understanding. I wonder if some meta structure or learning 'tracks' that are curated on top of the article structure might solve part of the problem?

I feel like following hyperlinks to related pages gets unwieldy quickly, because you then have to follow the links from there, ...

It might be helpful if you could expand links maybe in an extra column to the right, similar to how you might find notes on the margins of a book, to get definitions and info while also seeing the "source" article. But that's really hard to get right, both in UI and content.


i guess.

i mean, what's the layman's description of a rabi cycle? anyone? if someone can lay on me a description my very smart, but non-techy wife would understand, i'll be happy to update the wikipedia entry.

seems like all you have to do is explain quantum physics to a layman first. then (apparently) how a two-state quantum system interacts with an oscillatory driving field and what that has to do with that original layman's explanation of quantum physics.

piece of cake.

> i mean, what's the layman's description of a rabi cycle?

Ars technica tackled that without blinking in this article from 2010:


It's tempting to dismiss lay explanations as impossible when you're assuming the layman needs a full and complete understanding of every implication. But in lay explanations, you're not graded on completeness, you're just offering a foothold so people know generally what you're talking about and one or two implications.

What are we talking about? Quantum mechanics. What's that? When things are really tiny they seem to obey unexpected rules (of physics). So quantum mechanics describes the behavior of really tiny things. What's a rabi cycle? So you have some of these really tiny things, and they're flipping between two states, like a light switch. Maybe you shine a light on them to get them excited, and when they're excited in a certain way, they flip back and forth between these two states.

What does that get us? Maybe it helps us make lasers that have effects that are more focused than we would expect given diffraction limits--that's the limit of how focused light can get based on the properties of the lens you're using.
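For readers who want the quantitative version behind the light-switch picture, here is a minimal Python sketch. It assumes the standard textbook idealization: a two-level system that starts in its lower state, driven by a field under the rotating-wave approximation, with no decay or noise. The function and parameter names are hypothetical, and both frequencies are angular frequencies in radians per unit time.

```python
import math

def excited_probability(t, rabi_freq, detuning=0.0):
    """Probability that an ideal driven two-level system is in its upper state at time t."""
    # Generalized Rabi frequency: the flipping speeds up when the drive is off resonance.
    omega_eff = math.hypot(rabi_freq, detuning)
    if omega_eff == 0.0:
        return 0.0  # no drive at all: the system stays in the lower state
    # Off resonance, the flip is never complete; this factor caps the maximum.
    amplitude = (rabi_freq / omega_eff) ** 2
    return amplitude * math.sin(omega_eff * t / 2.0) ** 2
```

On resonance (zero detuning) the probability swings all the way from 0 to 1 and back, with one full cycle taking a time of 2π divided by the Rabi frequency. With detuning equal to the Rabi frequency, it never exceeds one half.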

There's nothing in that article that the layman can use to predict anything, i.e., the reader will not be able to answer any questions afterwards that aren't directly addressed in the original article. That's a sign that those sorts of explanations give the feeling of understanding, but don't actually impart knowledge.

If someone writes "there's a war in Syria", then the reader can accurately predict that there will be an above-average number of bombed out buildings, and probably lots of refugees on the border. But if someone writes "Rabi oscillations are when very tiny things flip back and forth", there are no non-trivial questions the reader can answer. The reason is that the reader knows roughly what "war" is and what a country is and that "Syria" is a country. But when you tell them tiny things are flipping...all they know is that tiny things are flipping. (Ask "do you think you can catch the tiny thing on its edge part way through its flip?" Or "Do the tiny things flip faster or slower than sound waves?" The reader will have no idea.)

In other words, "Energy makes it go!": http://www.textbookleague.org/103feyn.htm

dude! you should update that page! i actually understood what you said!

Wikipedia will always be in flux. I don't use it much for looking up science-based stuff but I do check it to see what's being said about the conflict in Syria, the Israel/Palestine conflict and other contentious geopolitical topics. Edit warring and vandalism/inserting bias are the norm for these pages. (I do edit entries when I have the time.) Arguably there is no "right" answer that can settle arguments about political topics, and no matter what is written somebody will come along and claim bias and change the wording or delete sections they find objectionable. Pages about less publicized events (e.g., the Russian-Georgian war of 2008, the Maher Arar affair) are blatantly slanted to one side or another and stay that way for months or even years.

The science articles I have read are sometimes overly technical or needlessly verbose, but others know more about that than I do so I can't provide examples. If every person who posts a long rant or complaint on an entry's Talk page took the time to edit (in good faith) the parts they think need improvement, the public would benefit by having access to better quality information. As it stands, people put a lot of energy into pointing out flaws, often valid ones, but very few actually do anything about them.

Wikipedia is only as good or as accurate as its editors make it, and humans, being the strange creatures we are, have a difficult time agreeing on even relatively basic things, so it will always be a work in progress. That it exists is a positive thing overall, and to really get a sense of how accurate an entry is on a topic you know little about, reading the Talk page is a must. And if you have knowledge that you think is missing from an entry, or think you can improve the wording...be bold and go for it!

Isn't the whole selling point of the hypertext format that we can follow as many links as we need for context and foundational information?

There's always simple.wikipedia.org for Simple English explanations of complicated topics, though it is nowhere near the size of the regular English Wikipedia.

"Simple" shouldn't mean "limited vocabulary". The problem isn't the words used, it's the way the writer expresses the subject using words.

It should mean gradually revealing a complicated subject by building one simple concept upon another.

An example is explaining a rainbow; avoiding the word "refraction" would actually make the explanation more difficult, so instead start by explaining what refraction is.

That's why hyperlinks exist.

Well if the author wants layman terms... Wiki has that: simple.wikipedia.org

example: simple.wikipedia.org/wiki/String_Theory

As for quality: then try and improve it. Much information gets hidden or lost in the Talk pages because of internal politics and other such pettiness.

And subtopics will always require knowledge of the parent topic. This goes for science, mathematics, linguistics, and hell - even video games.

Layman's Explanation of Gravity: Gravity is a force that pulls objects together. The more mass an object has, the stronger the pull. The closer an object is, the stronger the pull. On Earth, gravity pulls at 9.8 meters per second squared. Bla bla bla.

Wiki's Explanation of Gravity: Gravity is a 3-dimensional vector that transforms another vector along the 4th dimension. On Earth, this transform can be integrated over the 4th dimension to bla bla bla. Given a topological surface, this gravity vector, if the magnitude is big enough, can transform and bend the geometric space into a hole. Bla bla bla.

Layman's Thought: What the f*ck does all that even mean?

Maybe if universities offered incentives to undergraduates who cleaned up some of the articles? It would help them get recognition too.

I think betterexplained.com demonstrated that there is an alternative way of detailing information that a lot of people found helpful. Perhaps multiple versions of a page each with a different take on explaining it, perhaps rated versions... etc
