Show HN: Most Wikipedia articles lead to the same loop (wikiloopr.com)
295 points by seangransee on Aug 31, 2012 | 161 comments

From a mathematical perspective, this is actually not that surprising.

Think of the act of clicking the first link in the Wikipedia article as a function that takes in the page you're on and outputs another page.

If you call this function on itself over and over again (i.e. use the output of one step as the input of the next step), you will eventually enter a loop. The proof of this is simple: there are only a finite number of Wikipedia articles, therefore you must eventually reach an article you've seen before. (This is the same reason systems with a finite number of states cannot be chaotic.)
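The finiteness argument above can be sketched in a few lines of Python. This is only an illustration: the `first_link` table and the `find_loop` helper are invented names, and the five articles are toy data, not a real crawl.

```python
# Toy "first link" function: each article maps to exactly one other article.
# The titles and links are invented for illustration, not real Wikipedia data.
first_link = {
    "Fish": "Animal",
    "Animal": "Biology",
    "Biology": "Science",
    "Science": "Knowledge",
    "Knowledge": "Science",
}


def find_loop(start, step):
    """Follow `step` from `start` until a page repeats.

    Returns (tail, loop): the pages visited before the loop, then the loop.
    """
    seen = {}   # page -> position in the path
    path = []
    page = start
    while page not in seen:
        seen[page] = len(path)
        path.append(page)
        page = step[page]
    loop_start = seen[page]
    return path[:loop_start], path[loop_start:]


tail, loop = find_loop("Fish", first_link)
print(tail)  # ['Fish', 'Animal', 'Biology']
print(loop)  # ['Science', 'Knowledge']
```

Because the table is finite, the `while` loop can add at most `len(first_link)` pages to `path` before one of them repeats.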

Since it's necessarily true that all articles will eventually reach a loop when you iteratively click the first link, we have to ask: how unusual is it that they usually reach the SAME loop?

The fact that many lead to the same loop(1) of Argument <-> Logic is unsurprising, too. Wikipedia articles usually define their subject in the first sentence. Defining things works by saying what general category they belong to, then by differentiating them within that.


A fish is any member of a paraphyletic group of organisms that consist of all gill-bearing aquatic craniate animals that lack limbs with digits.

Google Inc. (NASDAQ: GOOG) is an American _multinational corporation_ which provides Internet-related products and services [...]

A table is a form of _furniture_ with a flat and satisfactory horizontal upper surface [...]

And so on. If you always walk up the abstraction chain because you're picking the first link (the general category), you'll end up at the root of the categorization tree, which is likely something along the lines of Argument, Logic, Fact etc.

Note, however, that the current state of Wikipedia has a different root loop for me: Logic leads to Philosophy.

(1) or one of a very limited number of loops, I also saw Science <-> Knowledge.

Unless the articles are edited before you come back!

Another way of thinking about it: There are N nodes that loop back to themselves. When starting at a given node, what are the odds that you'll never hit one of these N nodes? Additionally, we know that many of the N nodes are very general ones, like "truth", under which many topics roll up.

Agreed. I'd love to know how many different cycles the wikipedia article graph contains. If most articles lead to 1 out of 20 possible cycles, it's much less interesting than if it's 1 out of 5,000,000.
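One way to answer that question, assuming you already had a table of first links, is to group every article by the cycle it ends in. The sketch below uses invented toy data and a hypothetical `terminal_cycle` helper. It rewalks the chain for every article, which is fine for a demo but would want memoization on a real dump.

```python
from collections import Counter

# Invented toy graph: every article has exactly one outgoing "first link".
first_link = {
    "Fish": "Animal", "Animal": "Science", "Table": "Furniture",
    "Furniture": "Science", "Science": "Knowledge", "Knowledge": "Science",
    "Doubt": "Belief", "Belief": "Doubt",
}


def terminal_cycle(start, step):
    """Return the cycle reached from `start` as a frozenset of page titles."""
    seen = {}
    path = []
    page = start
    while page not in seen:
        seen[page] = len(path)
        path.append(page)
        page = step[page]
    return frozenset(path[seen[page]:])


# Map each distinct cycle to the number of articles that end up in it.
basins = Counter(terminal_cycle(page, first_link) for page in first_link)

for cycle, size in basins.most_common():
    print(sorted(cycle), size)
```

Here `len(basins)` is the number of distinct cycles, and the counts show how lopsided the split between them is.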

I've been looking to do something like this with the Networkx library in Python, though on an internal company MediaWiki with a much smaller article base. Going to try to visualise much the same thing, the major loops and cliques in the graph.

> This is the same reason systems with a finite number of states cannot be chaotic

Without more information, I don't think this is true. It is true if the transition function between states depends on a finite number of previous ones (WLOG if s_n is a function of only s_{n-1}), but I think that it isn't if the transition depends on an infinite number.

(Though, in this case, clearly there is no history, so what you say is true.)

If your model only depends on at most a maximum fixed length of history, you can always model your finite systems to only depend on the most recent state. (Just make a copy of the history part of that state.)
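A small sketch of that "copy the history into the state" trick, using a made-up rule that looks two steps back (s_n = (s_{n-1} + s_{n-2}) mod 5). The function names are hypothetical, and the rule is chosen only because it is easy to check by hand.

```python
from itertools import islice


def next_from_history(prev, prev_prev):
    """Hypothetical rule that depends on the last TWO values."""
    return (prev + prev_prev) % 5


def next_state(state):
    """Same rule on an enlarged state space.

    The state is the pair (s_{n-1}, s_{n-2}), so the next state depends
    only on the current state.
    """
    prev, prev_prev = state
    return (next_from_history(prev, prev_prev), prev)


def iterate(state, step):
    """Yield state, step(state), step(step(state)), ..."""
    while True:
        yield state
        state = step(state)


# 5 values give at most 5 * 5 = 25 pair-states, so 26 consecutive states
# must contain a repeat (pigeonhole), after which the sequence loops.
states = list(islice(iterate((1, 0), next_state), 26))
has_repeat = len(set(states)) < len(states)
print(has_repeat)  # True
```

The enlarged system is still finite, so the same "you must revisit a state" argument applies to it unchanged.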

In the case of depending on up to an infinite history, I wouldn't say that system has a finite number of states any longer.

Unless an article is reached that has no link. [A dead-state in the stated finite-state machine]

After the first few runs, I'm kinda surprised it's not a fixed point at Philosophy.

Back in May of 2011, it was. http://ai.stanford.edu/~west1/attractor.html

Never thought about it as a finite state system, but that's a great way of looking at it.

This may be hyper naive, but is this a consequence of most articles beginning by presenting the current topic as a more specific form of another topic?

For example: http://en.wikipedia.org/wiki/Algebra

"Algebra ... is a branch of mathematics." (mathematics being a link)

So if you continue to generalize you'll end up at the most general subject which appears to be philosophy.

[update] added an example

Ah, so that's why the degree is called Doctor of Philosophy?! Thanks for the enlightenment! I've wondered quite a lot of times why I'm a Doctor of Philosophy in Computer Science, and not Doctor of Computer Science.

Traditionally, a university had four faculties, the lower or artists faculty, and the three higher faculties of theology, law and medicine. Students would start in the artists faculty learning the seven liberal arts (grammar, rhetoric, logic, arithmetic, geometry, music, and astronomy (including astrology)) to become a magister artium (M.A.) and then go to one of the higher faculties to get their doctorate (Th.D, LL.D or M.D., respectively). Philosophy and science (née natural philosophy) developed in the artists faculty and over time became important enough to grant the artists faculty the right to grant doctorates, too, with philosophy leading the way. That's why many science faculties, which split off of the artists faculty over the centuries still grant the title of a doctor of philosophy.

Ah, and the first three of the liberal arts that were the first thing a student learned, grammar, rhetoric and logic, were called the trivium, hence the modern word "trivial" for obvious things everyone should know.

Really cool comment, I love finding interesting trivia in forums.

> 'interesting trivia'

In the above - grammar, rhetoric, logic - formed the "trivium" and - arithmetic, geometry, music, and astronomy - formed the "quadrivium".

Also "trivium" represented the place where three roads would intersect. People would meet here and exchange pleasantries and gossip. That is how you get the current meaning of the word "trivia".

That's really cool. Thanks for sharing.

Very interesting, thanks. http://atilf.atilf.fr and Dictionnaire historique de la langue Française both see two meanings for the French "trivial": one is common as in commonplace, and the other is common as in gross and vulgar. The first comes from /trivialis/, as fhars says. The latter comes from /trivium/, the crossroads. I don't know if English also has the second meaning: vulgar.

Thank you. I enjoyed reading this. Would be nice if you wrote a little blog post on the subject :)

I get that this is tongue-in-cheek, but philosophy as in "Doctor of Philosophy" doesn't refer to the field of philosophy per se, but to the more general concept of "love of knowledge."

Not sure why you are getting down votes, but checking over Wikipedia confirms your comment

In the context of academic degrees, the term "philosophy" does not refer solely to the field of philosophy, but is used in a broader sense in accordance with its original Greek meaning, which is "love of wisdom". In most of Europe, all fields other than theology, law and medicine were traditionally known as philosophy. The doctorate of philosophy as it exists today originated as a doctorate in the liberal arts at the Humboldt University of Berlin, and was eventually adopted by United States universities, becoming common in large parts of the world in the 20th century.[1] In many countries, the doctorate of philosophy is still awarded only in the liberal arts, which is known as "philosophy" in continental Europe.

That seems like a very reasonable assumption. I just randomly sampled a dozen or so articles and they all followed that theme.

This is beautiful. I've always thought sites like IMDB and Wikipedia are hypertext expressions of the purest form.

As a nerdy kid, over a few years I read my way through a significant chunk of the early 1990's Groliers Encyclopedia set my Grandma bought me at Krogers. And it was never cover-to-cover reading, instead it was filled with hopping around, page-to-page, volume-to-volume. Start at oscilloscope, but what's a cathode ray? And then it's in television? How does broadcasting work? I'd sit on the floor with a half dozen encyclopedias open around me, the same way we can do now with browser tabs.

Wikipedia has many flaws, but it's a fantastic tool and I'd have killed for it as a 12 year old with 5 volumes of an encyclopedia in my backpack!

Ah yes. To me (to my productivity's detriment...) quite often when I visit Wikipedia it becomes a learning adventure. From one thing you can hop to the other, finding out all sorts of interesting and obscure things. To a curious mind like mine, Wikipedia is heaven!

For myself, I would broaden this and say that almost every link I visit on HN, or every forum post I read leads to more than 1 additional link. Thus three windows of Chrome open with 10+ tabs each :)

I find this drill-down effect to be quite illuminating, if not productive. Strangely, it seems to provide both depth and breadth of topics in relatively equal measures.

Sometimes, I go onto wikipedia just to find out one fact... and end up with 6 or 7 tabs open on the topic, ranging from genetics to Judaism.

As far as I know, this was noticed in May last year on the xkcd forums. http://forums.xkcd.com/viewtopic.php?f=2&t=71309 There's a writeup of it with a gigantic picture over here: http://www.mrphlip.com/wikiphilosophy/

Seems like there is an even older link here on HN http://news.ycombinator.com/item?id=2584896

OK, and this game http://spu.co/get-to-philosophy was played at least as far back as August 2010.

True, the "wikipedia loop" itself is not a new discovery. But still, congrats for coming up with a nice GUI to demonstrate it!

I have a site: http://TheWikiGame.com that makes a game out of a related idea (finding the connection between Wikipedia articles).

Recently, I've been giving access to the game data to people (like university researchers, etc) to test different theories on path connections made by real people, etc.

The game has now been running for over 3 years, has about 1.19 million players, playing over 1.37 million games, with about 1.22 million won games (successful start/end article connection).

Got a really cool application for all the game data? I'd love to hear: alex@thewikigame.com

I have a kind of Wiki game that I play aloud with friends. Person A names a topic. Person B must guess whether Wikipedia owns the top Google (incognito/not signed-in) result for that term. Person B gets a point for guessing right, or Person A gets a point if Person B guesses wrong. The fun is in coming up with "stumpers."

Maybe the interesting part is the 150,000 lost games.

Surely most of those are just abandoned games?

Nobody linked xkcd yet? Relevant Alt-Text: http://xkcd.com/903/

As they describe in the page, when you describe something, you are trying to place it within a general framework.

In trying to describe a domain in an encyclopedia, it does little good to use language specific to the domain. If your reader doesn't know what Biology is, it's little use trying to describe it in terms of Microbiology and Immunology. You have to use more general language.

There are few words more general and meaningless in our language than "organization".

edit - grammar

i know. when i found that out, i spent half the day on wikipedia trying it out with different pages. so i built a way to automate it :)

Seems like it would be nice to cache some of these requests, instead of repeatedly hammering the Wikipedia API. And why is it grabbing images?


(Incidentally, my test case for the screenshot did not lead to Philosophy!)
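The caching suggestion above could look roughly like this. It is a sketch under assumptions: `fetch_first_link` is a stand-in for whatever request the site actually makes, its data is invented, and `functools.lru_cache` is just one simple in-process option.

```python
from functools import lru_cache

fetch_count = 0  # counts simulated network requests


def fetch_first_link(title):
    """Stand-in for a real HTTP request; the data here is invented."""
    global fetch_count
    fetch_count += 1
    fake_wikipedia = {"Algebra": "Mathematics", "Mathematics": "Quantity",
                      "Quantity": "Property", "Property": "Quantity"}
    return fake_wikipedia[title]


@lru_cache(maxsize=10_000)
def cached_first_link(title):
    """Only calls fetch_first_link the first time a title is requested."""
    return fetch_first_link(title)


def walk(start):
    """Follow cached first links until a page repeats."""
    path, page = [], start
    while page not in path:
        path.append(page)
        page = cached_first_link(page)
    return path


walk("Algebra")
walk("Mathematics")   # served entirely from the cache
print(fetch_count)    # 4
```

Since most chains share the same last few hops, even a small cache should absorb a large share of lookups. A real deployment would probably also want entries to expire, because the articles themselves get edited.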

totally agree. i hacked this together pretty quickly last night. definitely a lot of room for improvement. thanks for the feedback!

It looks like there is a bug where it doesn't fetch the correct page:



sorry to say, but couldn't you have thought about that before posting it on HN? you're on the front page, if your site really runs without caching, you must be wasting an enormous amount of WP's resources.

Either philosophy is a dead end, then, or it is the base of all knowledge. Depends on whether knowledge is a tree or a graph, which is... a philosophical question!

My brain exploded.

If there is some characteristic Q that is true of all knowledge, then Q must be true of Q. A cycle!

Only if Q is knowable.

Trees are connected acyclic graphs, so knowledge could be both. ;)

Yes but graphs can very well be cyclic, so eventually a path from a node could lead back to the node itself.

I seem to have broken it here: http://wikiloopr.com/List%20of%20minor%20planets:%2075001%E2...

Wikipedia page: http://en.wikipedia.org/wiki/List_of_minor_planets:_75001%E2...

That is, however, a weird Wikipedia page. It seems like the first link is a target to another location on the page, which (I'd assume) would put it in an infinite loop.
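A hedged sketch of how a crawler might avoid that trap: skip links that only jump within the current page. The `/wiki/Title` href form, the helper name, and the sample hrefs are all assumptions made for the example, not the site's actual code.

```python
def first_article_link(page_title, hrefs):
    """Return the first href that leads to a different article.

    `hrefs` is the list of link targets in page order. The "/wiki/Title"
    form is an assumption about how the links were extracted.
    """
    own_path = "/wiki/" + page_title.replace(" ", "_")
    for href in hrefs:
        if href.startswith("#"):
            continue  # jump to a section of the current page
        target = href.split("#", 1)[0]
        if target == own_path:
            continue  # link back to the same article
        return target
    return None  # dead end: no usable link


# Invented hrefs for a hypothetical page titled "List of examples".
hrefs = ["#section-1", "/wiki/List_of_examples#top", "/wiki/Minor_planet"]
print(first_article_link("List of examples", hrefs))  # /wiki/Minor_planet
```

Returning `None` for a page with no usable link also covers the dead-end case mentioned earlier in the thread.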

"Egypt" also breaks.

It's funny, but it's true that everything in life ultimately leads back to philosophy.

As noted by XKCD, everything is based on something else: http://xkcd.com/435/

And I think after Maths on that should come Philosophy.

"Nematode" and "Flow (psychology)" stop at "psychology" which I think is just another example of that discipline putting on airs.

That's changed now, thanks to me.

I agree. Thought that at the time. Was a philosophy student at the time too though.

Guys :) This is always fun when it crops up - especially the musing on how deeply rooted the topic of philosophy is.

But please don't go editing articles to improve the loops or otherwise change them. Unless of course the change is beneficial to the article.

It's like going for a walk somewhere fascinating; enjoy the view, but leave nothing behind :)

What happens for Biology? http://wikiloopr.com/Biology. For me, it goes down until it gets to Biology, and then highlights Biology over and over.

  Natural science
  Proof (truth)
  Class (biology)
  Biological classification
The first link on the Wikipedia page for Biology is "Biology (disambiguation)", but the first textual link is "Natural Science". I assume it loops on Biology because it keeps going back between Biology (the article) and Biology (disambiguation), but that doesn't explain why the first redirect is to Natural Science in the very beginning.

Am I just missing something here?

Edit: This no longer happens. Now it properly loops on Philosophy and Proposition.

Here is a similar visualization of the phenomenon from last year http://xefer.com/wikipedia

This is a superior visualisation in my opinion - it actually builds you a tree of links as you enter subsequent searches. Quite interesting to see how article chains cluster into 3 or 4 different "main branches".

Except there's no autocomplete. I couldn't match Carly rae jepsen... but once you get it, it is much cooler.

Works for:

* Carly Rae Jepson
* Paul is dead
* Telephony
* Easter Island
* Drum and bass
* Sunglasses
* CAT 5
* Willy Wonka
* Black hole
* Vancouver
* Cloud Cuckoo Land
* Bacon
* Radish
* Brunette
* Dunning-Kruger Effect
* Lady Gaga
* Sailor Moon

If the chain hits Philosophy first, it converges on Philosophy/Argument. If it hits Fact first, it converges on Fact/Truth.

I can't replicate it, but I encountered a weird bug. I entered "Star Trek" and hit Enter. The suggested items list appeared and I clicked on "Star Trek". Page titles appeared as normal, but everything appeared twice, like so:

  Star Trek
  Star Trek
  Cinema of the United States
  Cinema of the United States
  Level of measurement
  ...etc until the last page title, "Reality"
Seems like two processes were running in parallel and outputting to the same stream.

I also got a weird one:

    The page you specified doesn't exist

That's funny because the first link on the Animation page is


Is that an Easter Egg (illusion == doesn't exist)?

Same thing here, also from Animation.

I love wiki. I spend hours following links. Having loops is a feature or else I might never stop.

Greek City States ->

  Human settlement
  Level of measurement
  Stanley Smith Stevens
  United States
  Human behavior
  Natural science
  Proof (truth)
  Proposition loops to Philosophy

When images appear first in the page's source, the wrong link is chosen. I assume you are trying to go for the first link, as a human reader would see it.

For example, the page on Anime:



thanks! gonna fix that later


"Whale" leads to the "pinniped"/"marine mammal" loop. How about that one?

In fact, according to a crawl of the Wikipedia database from May 2011, nearly 95% of Wikipedia articles will take you back to Philosophy.

- Always interested in the backstory of folks that sit around thinking this stuff up (and then testing it).


I think this is fairly trivially explained in most cases by the fact that Wikipedia is an encyclopedia. Almost by construction, nearly every article starts with "X is a Y" (modulo grammatical niceties) where Y is a more general class to which X belongs. In cases when the article doesn't follow that form, the first linked term is usually still explanatory or definitional in some sense and hence more general, and cases where that doesn't apply (e.g. Obstacle) are unusual enough that they don't generally break the cycles. There are no "Wikipedia axioms" that I know of, so this is pretty much guaranteed to end in a cycle looping through a limited set of terms used to talk about abstract concepts. Hence Philosophy.

One of these, at least, is supported by a pretty weird edit:


A few days ago, when this was on the main page, it ended in a different loop than now.

Today most articles end in a cycle between Philosophy and Modern Philosophy. ( http://en.wikipedia.org/wiki/Modern_philosophy )

A few days ago, it was cycling between Philosophy and Agency ( http://en.wikipedia.org/wiki/Agency_(philosophy) )

What changed and why?

Aha, someone changed the first link on page Problem from purpose to obstacle.

It's a little bit like the butterfly effect :)


You have a bug. Pages like http://en.wikipedia.org/wiki/Apple_Inc. (note the dot at the end) have a "geographic coordinate system" link at the top, that isn't part of the article. So queries like http://wikiloopr.com/Steve%20Jobs are actually returning wrong results.

Yep, it's seeing the "Coordinates" link at the top right corner of the page.
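One possible fix is to ignore links inside known non-article containers while scanning the HTML. The sketch below uses Python's standard `html.parser`. The `coordinates` id and the sample markup are assumptions for illustration, not a verified description of Wikipedia's page structure or of how the site parses it.

```python
from html.parser import HTMLParser


class FirstBodyLink(HTMLParser):
    """Find the first <a href> that is not inside a skipped container."""

    SKIP_IDS = {"coordinates"}   # assumed id of the coordinates box
    VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0      # > 0 while inside a skipped container
        self.first_href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.skip_depth:
            # Track nesting so we know when the skipped container closes.
            if tag not in self.VOID_TAGS:
                self.skip_depth += 1
        elif attrs.get("id") in self.SKIP_IDS and tag not in self.VOID_TAGS:
            self.skip_depth = 1
        elif tag == "a" and "href" in attrs and self.first_href is None:
            self.first_href = attrs["href"]

    def handle_endtag(self, tag):
        if self.skip_depth and tag not in self.VOID_TAGS:
            self.skip_depth -= 1


# Invented markup: a coordinates box followed by the article's first sentence.
html = (
    '<span id="coordinates"><a href="/wiki/Geographic_coordinate_system">'
    'Coordinates</a></span>'
    '<p><b>Example Corp</b> is a <a href="/wiki/Corporation">corporation</a>.</p>'
)
parser = FirstBodyLink()
parser.feed(html)
print(parser.first_href)  # /wiki/Corporation
```

The same skip list could be extended to image captions or infoboxes, which appear to cause the similar wrong-link reports elsewhere in the thread.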

thanks! I'll go ahead and fix it.

This is the shortest one I've found so far - http://wikiloopr.com/monty%20python

here's the shortest one i've found: http://wikiloopr.com/home

Oh, I think I found one that doesn't loop:

Port Arthur http://wikiloopr.com/Port%20Arthur

Either that, or I found a bug.

Not as dramatic but http://wikiloopr.com/Nelson ends up with Great Britain (and then claims it loops to itself with no intermediaries).

Looks like a bug. Dalian should lead to Liaoning.

Epistemology reigns supreme.

How do you know?

Does this hold true with following the xth link on the page? I'd predict it does, but the further x is from 1, the longer the average chain before you get to the loop (by virtue of definitions being at the top of wiki articles, then at some point the average length would hit a consistent level). Would love to see this modified to let you determine x.
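A sketch of what that generalization might look like, on invented toy data where each article lists its links in page order. `walk_kth_link` is a hypothetical helper, and pages with fewer than k links are treated as dead ends.

```python
# Invented toy data: each article lists its links in page order.
links = {
    "Fish": ["Animal", "Water", "Gill"],
    "Animal": ["Biology", "Fish"],
    "Water": ["Chemistry", "Fish"],
    "Gill": ["Water"],
    "Biology": ["Science", "Animal"],
    "Chemistry": ["Science", "Water"],
    "Science": ["Knowledge", "Biology"],
    "Knowledge": ["Science", "Fish"],
}


def walk_kth_link(start, k, links):
    """Follow the k-th link (1-based) until a page repeats or runs out of links.

    Returns (path, loop_start): loop_start is the index in `path` where the
    loop begins, or None if the walk hit a page with fewer than k links.
    """
    seen = {}
    path = []
    page = start
    while page not in seen:
        seen[page] = len(path)
        path.append(page)
        targets = links.get(page, [])
        if len(targets) < k:
            return path, None
        page = targets[k - 1]
    return path, seen[page]


for k in (1, 2, 3):
    print(k, walk_kth_link("Fish", k, links))
```

Running this over a real dump for several values of k would give the chain-length comparison the comment asks about. The toy data here says nothing about what the real answer is.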

I quite like the loop between Doubt and Belief http://wikiloopr.com/Doubt

The loop found for "Psychology" is pretty neat: http://wikiloopr.com/Psychology

Edit: Looks like somebody is editing certain articles that don't have philosophy as the first link to have it. Psychology used to loop with itself through like 5 intermediates.

After having twenty tries that got me into the philosophy loop, I started to think what are the most basic concepts that philosophy is "made of". Maybe I could escape the loop that way?

And lo and behold, I tried the word "word" and it loops between "emotion" and "psychology". http://wikiloopr.com/word

I fixed that.

So did you edit Wikipedia to ensure the Philosophy loop?


Now it loops between "Human" and "Reason". I think there are two secret cabals at work. One to break the Philosophy link and one to thwart these evil schemes.

This reminds me of a game I used to play as an intern with other interns to kill time. Someone would pick a Wikipedia article deep down in the bowels of Wikipedia and we would race to the article from Wikipedia only using links. We could get to the article surprisingly fast, most within only 5 intermediate page visits from the homepage.
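For the curious, the fewest-clicks route in that kind of race is a shortest-path problem, which breadth-first search solves on an unweighted link graph. The graph below is invented toy data and `shortest_path` is a hypothetical helper, not anything from the site.

```python
from collections import deque

# Invented toy link graph: page -> pages it links to.
links = {
    "Main Page": ["History", "Science"],
    "History": ["Egypt", "Science"],
    "Science": ["Biology", "Physics"],
    "Biology": ["Nematode"],
    "Physics": [],
    "Egypt": [],
    "Nematode": [],
}


def shortest_path(start, goal, links):
    """Breadth-first search; returns the shortest click path or None."""
    parent = {start: None}   # also serves as the visited set
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page == goal:
            path = []
            while page is not None:
                path.append(page)
                page = parent[page]
            return path[::-1]
        for target in links.get(page, []):
            if target not in parent:
                parent[target] = page
                queue.append(target)
    return None


print(shortest_path("Main Page", "Nematode", links))
# ['Main Page', 'Science', 'Biology', 'Nematode']
```

Unlike the first-link walk, this explores every link on each page, so it answers "how close are two articles" rather than "where does one article drain to".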

This behaves erratically. Sometimes it considers the coordinates link above the entire article the "first link" (in the "Michael Phelps" chain), sometimes not (Chicago). Consequently, I believe "Michael Phelps", when the coordinate link is ignored, leads to its own loop that is not the philosophy loop in question.

Nice demo, but I think it is an incorrect assumption that the first link is to a broader subject. For example, for Perú it says: "It is bordered on the north by Ecuador and Colombia, on the east by Brazil" So Perú->Ecuador left me with a big wtf..


about an hour after posting this, someone changed the philosophy page so the philosophy-reality loop doesn't happen.


it still brings you to a loop, but it's not an immediate back-and-forth between the last two.

I don't think the loop is the interesting part, what's interesting is that articles lead to philosophy.

Funny... As I was playing with this, somebody changed the order of:

"In philosophy and sociology..."

On the Agency (philosophy) page. This changed the output of the results, but in the end the loop still existed. However, I couldn't help but feel like someone is trying to hack the matrix.

Someone just changed the Reason page [1] which is now putting the focus on Biology instead of Philosophy.

1: http://en.wikipedia.org/w/index.php?title=Reason&oldid=5...

Puts me in mind of Translation Party[1] (which is back up, kudos to WillC and Rick!). There's a lot of entertainment to be had from this sort of vertex-following algorithm.

[1] http://translationparty.com/
