
Show HN: Six Degrees of Wikipedia - jwngr
https://www.sixdegreesofwikipedia.com
======
jwngr
Creator here. Six Degrees of Wikipedia is a side project I've been
sporadically hacking on over the past few years. It was an interesting
technical challenge and it's fun to play with the end result. Here's the tech
stack:

    
    
      * Frontend: React (Create React App)
      * Backend: Python Flask
      * Database: SQLite
      * Web (frontend) hosting: Firebase Hosting
      * Server (backend) hosting: Google Compute Engine (it runs fine on a tiny f1-micro instance)
    

All the code is open source[1] and I'm happy to answer any questions about
building or maintaining it!

[1] [https://github.com/jwngr/sdow](https://github.com/jwngr/sdow)

~~~
thrownaway954
make it so that if I visit the page and just click the "go" button, it will
use the placeholder examples as the start and end points. i did this and got
an error message stating "You'll probably want to choose the start and end
pages before you hit that." that was annoying. the placeholders that were auto
chosen were actually really interesting.

~~~
ng-user
'Please' and 'thank you' go a long way when requesting additional features for an
OSS project.

~~~
larkeith
There's a certain irony to the lack of tact in this post suggesting better
manners. I suspect the point would be better received were it more politely
made.

I, of course, am merely propagating the cycle.

~~~
Stratoscope
When I was a kid my mom told me "always say 'please' and 'thank you'."

So from then on, when I wanted something, I said "please and thank you."

I think this was my first social hack. I just wish it had worked better!

~~~
dspillett
I've seen something similar when offering items on freecycle: people seem to
think that the more they insert "pls" and "thnx" into a message the more
likely it is that they'll be first in the queue.

I swear at least once there were more instances of "pls" and "thnx" (there seems
to be a significant overlap between people who overuse the words and people
who don't bother to type them properly, though that may be the linguistic snob
in me talking) than all other words combined. It backfired by making the
message difficult to read so I binned it and read the next.

------
turc1656
Not sure if you deliberately designed it this way, but I noticed when spot
checking some results that it includes the bibliography section links as
connections. This seems like it may not be desirable. Example, I did a search
that went from the Crusades to Buzz Aldrin and I noticed that Routledge was
the first hop from the Crusades. It strikes me as odd that Routledge (a
publishing company) would be mentioned on the Wiki article for the Crusades.
So I went to look and noticed the link it took was the citation for a book
published by this company. I wouldn't really count that as a legitimate hop
since it's a citation, not a content link.

EDIT - I noticed this also applies to the Notes section.

~~~
trishume
A few years ago I made something similar
([http://ratewith.science/](http://ratewith.science/)) that only uses bi-
directional links, that is pages that both link to each other, and this gives
much more interesting results.

When two Wikipedia pages both link to each other they are usually related in
some reasonable way, but unidirectional links give you things like Wikipedia
-> California, which only exists because Wikipedia is headquartered in
California, a pretty weak connection.

Other than the fact I have it running on an overburdened tiny VPS, my app is
also really fast even though I only do a unidirectional BFS because I use a
custom in-memory binary format that's mmapped directly from a file that's only
700MB, and a tight search loop written in D.

~~~
cyphar
Maybe I made a mistake somewhere, but I'm not sure it uses bi-directional
links. For a really odd search like GoldenEye -> Abbotsford [1] it uses
multiple "special" Wikipedia links that I'm pretty sure wouldn't be
bidirectional.

[1]:
[http://ratewith.science/#start=Goldeneye&stop=abbotsford](http://ratewith.science/#start=Goldeneye&stop=abbotsford)

~~~
trishume
Yah so what it does is try to find a path with bidirectional links, and then
if it can't it tries to find a unidirectional path. You can tell which one it
did by whether the numbers in the path have a pulsing glow animation, that
particular path does not.

------
labster
Anime --> Obesity

[https://www.sixdegreesofwikipedia.com/?source=Anime&target=O...](https://www.sixdegreesofwikipedia.com/?source=Anime&target=Obesity)

Somehow, I didn't expect a one-stop layover in Dubai.

~~~
52-6F-62
I went and made it political: Anime -> Alt-right.

The result was a little more predictable... I guess I shouldn't have had to
look it up.

[https://www.sixdegreesofwikipedia.com/?source=Anime&target=A...](https://www.sixdegreesofwikipedia.com/?source=Anime&target=Alt-right)

~~~
riking
But going Alt-right -> Anime has a second path through Vaporwave.

~~~
zouhair
Now try Donald Trump -> Nazism

------
tucif
I was getting interesting results until I noticed a pattern involving the
presence of "Wayback machine" as the only connecting dot between really
different things.

That adds noise, since articles now automatically use the wayback machine for
"archived" links, thus generating many paths that do not really connect
topics, just because the text "wayback machine" is part of the link text.

It may be an interesting exercise to find outliers like that and compute paths
without those nodes.

~~~
jwngr
I considered this and may eventually add an option to ignore those kinds of
pages, but I ultimately felt like the current mode remains more true to my
goal for the project which is to traverse the links as any human would be able
to. By the way, the two pages with the most incoming links are "Geographic
coordinate system" (1,047,096 incoming links) and "International Standard Book
Number" (955,957 incoming links).
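
For anyone who wants to experiment with the "ignore hub pages" idea, here is a minimal sketch: count incoming links per page, treat the most-linked pages as hubs, and run a breadth-first search that refuses to pass through them. The graph is a made-up toy, and `incoming_counts` and `shortest_path_avoiding` are hypothetical names, not part of the sdow code.

```python
from collections import Counter, deque


def incoming_counts(graph):
    """Count how many links point at each page."""
    counts = Counter()
    for outs in graph.values():
        counts.update(outs)
    return counts


def shortest_path_avoiding(graph, start, goal, banned):
    """BFS that never routes through a page in `banned` (the goal itself is allowed)."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page == goal:
            path = []
            while page is not None:
                path.append(page)
                page = parent[page]
            return path[::-1]
        for nxt in graph.get(page, ()):
            if nxt in parent or (nxt in banned and nxt != goal):
                continue
            parent[nxt] = page
            queue.append(nxt)
    return None


# Toy graph: every page links to "Hub", which shortcuts the real chain.
links = {
    "A": ["Hub", "B"],
    "B": ["Hub", "C"],
    "C": ["Hub", "D"],
    "D": ["Hub"],
    "Hub": ["D"],
}

hubs = {page for page, _ in incoming_counts(links).most_common(1)}
print(shortest_path_avoiding(links, "A", "D", banned=set()))
print(shortest_path_avoiding(links, "A", "D", banned=hubs))
```

With no pages banned the search takes the shortcut through the hub; with the hub banned it has to follow the longer chain of ordinary links.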

~~~
nayuki
Indeed, the Wikipedia pages "Geographic coordinate system" and "International
Standard Book Number" have the highest PageRank. See:
[https://www.nayuki.io/page/computing-wikipedias-internal-pag...](https://www.nayuki.io/page/computing-wikipedias-internal-pageranks)

------
purell_hack
After spending way too much time I got 9 degrees of separation. I did
piggyback off of someone else's work with "Phinney".
[https://www.sixdegreesofwikipedia.com/?source=Lion%20Express...](https://www.sixdegreesofwikipedia.com/?source=Lion%20Express&target=Phinney)

I found a ton of 5 degree paths and only a couple of 6 degree ones. Then I
pulled out the "big guns" (the dead-end pages category).
[https://en.wikipedia.org/wiki/Category:Dead-end_pages_from_F...](https://en.wikipedia.org/wiki/Category:Dead-end_pages_from_February_2018)

~~~
plaguuuuuu
I don't know if this is legit. The Lion Express node only connects to a couple
of Wikipedia's generic help pages, which I'm pretty sure don't link back to
Lion Express.

[https://en.wikipedia.org/wiki/Help:Link](https://en.wikipedia.org/wiki/Help:Link)
[https://en.wikipedia.org/wiki/Help:Searching](https://en.wikipedia.org/wiki/Help:Searching)

~~~
purell_hack
Here's one with 7 hops and not using The Lion Express.
[https://www.sixdegreesofwikipedia.com/?source=Zevenhoven&tar...](https://www.sixdegreesofwikipedia.com/?source=Zevenhoven&target=Phinney)

Like I said I spent too much time on this yesterday.

------
colemannugent
This makes the "How many clicks to Hitler" game much faster.

For those uninitiated, the game was to click the "Random Article" link in the
sidebar and count how many links it took to get to Hitler. It is really
interesting to see just how big of an event WWII was. Every country article
has a section on their involvement or why they were not involved.

After playing with it more, this is pretty fun. I vote that a "degrees from
Hitler" score be added to the top of every article. I think it might be an
interesting proxy for how esoteric a particular page is.

~~~
cortesoft
This reminds me of the wikipedia rule I learned a while back: If you click the
first link in an article (besides the pronunciation guide), you will always
end up on philosophy.

~~~
jffry
There's even a great page with a small graph and a rundown of some more
resources:
[https://en.wikipedia.org/wiki/Wikipedia:Getting_to_Philosoph...](https://en.wikipedia.org/wiki/Wikipedia:Getting_to_Philosophy)

~~~
cortesoft
It even works for that page!

------
fortythirteen
Age of Enlightenment -> Consumption of Tide Pods[0]

[0]
[https://www.sixdegreesofwikipedia.com/?source=Age%20of%20Enl...](https://www.sixdegreesofwikipedia.com/?source=Age%20of%20Enlightenment&target=Consumption%20of%20Tide%20Pods)

~~~
vanderZwan
Interestingly, if you go the other way it blows up:

[0]
[https://www.sixdegreesofwikipedia.com/?source=Consumption%20...](https://www.sixdegreesofwikipedia.com/?source=Consumption%20of%20Tide%20Pods&target=Age%20of%20Enlightenment)

~~~
saagarjha
I'd assume very few articles lead in to "Consumption of Tide Pods", while many
do to "Age of Enlightenment".

------
fapjacks
Actually, interestingly, I've been introducing this concept as a "party game"
with other nerds at RL gatherings for some years now. The goal is to start on
a random page and find the shortest path to another random page by only
clicking links in the articles. It can be quite a lot of fun, despite what
you're thinking! And anybody can understand the challenge and compete and have
fun. It's not just something for geeks.

~~~
jnbiche
What is an RL gathering?

~~~
chaseha
RL == "Real Life"

------
BWStearns
[https://www.sixdegreesofwikipedia.com/?source=Spud%20gun&tar...](https://www.sixdegreesofwikipedia.com/?source=Spud%20gun&target=Sputnik-1%20EMC%2FEMI%20lab%20model)

Found a no-path!

~~~
mmanfrin
One of your pages has nothing at all linking to it, so finding a route to that
link is impossible.

[https://en.wikipedia.org/wiki/Special:WhatLinksHere/Sputnik-...](https://en.wikipedia.org/wiki/Special:WhatLinksHere/Sputnik-1_EMC/EMI_lab_model)

However, you can go the other way:

    
    
      Sputnik-1_EMC/EMI_lab_model
      
      Sputnik 1
    
      Soviet space program
    
      United Nations Committee on the Peaceful Uses of Outer Space
    
      United Nations
    
      Model United Nations
    
      Dwight Schrute
    
      Spud gun
    
    

Also I'm realizing after finding that that I could have used the site in this
post...

------
Wile_E_Quixote
I think it would be interesting to have a "swap" button between the start and
end points, perhaps just under where it says "to". (similar to the swap
buttons in translator apps and GPS apps to quickly swap the start and end
points) As some other comments have mentioned, the paths are not necessarily
the same route or length, and it is fun to see how they might be different.

~~~
jwngr
Good suggestion! I intended to have that exact button, but couldn't find a way
to put it in the UI without making things more confusing. I expect I'll add it
in the future. Thanks for the suggestion!

------
sixstringtheory
Parquetry -> Romeo and Juliet, 46 paths
([https://www.sixdegreesofwikipedia.com/?source=Parquetry&targ...](https://www.sixdegreesofwikipedia.com/?source=Parquetry&target=Romeo%20and%20Juliet)).
None through Shakespeare.

Parquetry -> Tromeo and Juliet, 508 paths
([https://www.sixdegreesofwikipedia.com/?source=Parquetry&targ...](https://www.sixdegreesofwikipedia.com/?source=Parquetry&target=Tromeo%20and%20Juliet)).
All through Shakespeare.

~~~
incompatible
508 paths, nice. I was about to post Wake in Fright -> Las Vegas, which has an
amazing 414 paths, but yours is a winner.
[https://www.sixdegreesofwikipedia.com/?source=Wake%20in%20Fr...](https://www.sixdegreesofwikipedia.com/?source=Wake%20in%20Fright&target=Las%20Vegas)

~~~
hexane360
[https://www.sixdegreesofwikipedia.com/?source=Frank%E2%80%93...](https://www.sixdegreesofwikipedia.com/?source=Frank%E2%80%93Read%20source&target=Woodrow%2C%20Colorado)

This isn't really the best measure though, because it only counts # of paths
at the minimum depth level.

edit: although I found some deep searches with very few links:
[https://www.sixdegreesofwikipedia.com/?source=Frank%E2%80%93...](https://www.sixdegreesofwikipedia.com/?source=Frank%E2%80%93Read%20source&target=Fors%20%28locality%29)
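
To illustrate why the path count only reflects the minimum depth, here is a small sketch of one standard way to count shortest paths with a single breadth-first search. It is not taken from the site's code; the function name and the toy graph are assumptions for the example, and link lists are assumed to contain no duplicates.

```python
from collections import deque


def count_shortest_paths(graph, start, goal):
    """Return (degrees, number of distinct shortest paths), or (None, 0)."""
    dist = {start: 0}
    ways = {start: 1}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page == goal:
            # Every page one level closer was already processed,
            # so the count for the goal is complete here.
            return dist[page], ways[page]
        for nxt in graph.get(page, ()):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                ways[nxt] = ways[page]
                queue.append(nxt)
            elif dist[nxt] == dist[page] + 1:
                # Another shortest route into an already discovered page.
                ways[nxt] += ways[page]
    return None, 0


# Toy graph: two 2-hop routes from S to T, plus a 3-hop route via C and X.
links = {
    "S": ["A", "B", "C"],
    "A": ["T"],
    "B": ["T"],
    "C": ["X"],
    "X": ["T"],
}

print(count_shortest_paths(links, "S", "T"))
```

The longer route through C and X never contributes to the count, which is the effect described above: a pair with thousands of slightly longer routes can still report only a handful of paths.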

~~~
incompatible
I found one with 813 paths with 4 degrees, but yours has 1,645 paths with 5
degrees. I suppose there are end points with many thousands of links. It's a
bit of trivia, not really a measure of anything.

------
aplorbust
In case you receive the following message

"Sorry internet hipster, this little side project requires JavaScript."

here is a quick example of how to get the pages the "traditional way"[FN1]:

    
    
         #!/bin/sh
         test $# -eq 2||exec echo usage: $0 source target;
    
         exec curl -H"Content-type: application/json" \
         -d '{"source":"'"$1"'","target":"'"$2"'"}' \
         https://api.sixdegreesofwikipedia.com/paths \
         |exec sed '
         s/\",/\"\
         /g;s/,\"/\
         \"/g;s/:{/:\
         {/g;s/}/&\
         /g;s/\"pages\":/&\
         /'
    

It appears the author is using the Wikipedia API. I did not add any HTML tags,
etc. to the output, although this is very easy to do.

FN1. The original "web browsers" needed no GUI, no Javascript.
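
The same request can also be made from Python's standard library. This is a hedged sketch: the endpoint is the one used in the script above, but the assumption that the response body is JSON is mine, and `build_paths_request` and `fetch_paths` are hypothetical helper names. Building the body with `json.dumps` also escapes titles that contain quotes.

```python
import json
import urllib.request

# Endpoint copied from the shell script above.
API_URL = "https://api.sixdegreesofwikipedia.com/paths"


def build_paths_request(source, target):
    """Build (but do not send) the POST request for a source/target pair."""
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_paths(source, target, timeout=30):
    """Send the request and decode the reply (assumed to be JSON)."""
    request = build_paths_request(source, target)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_paths("M. C. Escher", "Lucky Charms")` would perform the actual lookup; nothing is sent until that function is called.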

~~~
dmytrish
Now it makes more sense.

I'd prefer to have a brief technical explanation why Javascript is needed
instead of this condescending labeling.

Edit: NoScript is not a luxury for technically minded geeks anymore, it's a
necessary protection against tracking, CPU-consuming advertisement and attacks
like Spectre/Meltdown and who knows what else Intel has for us.

------
dkuder
Six, very meta version

[https://www.sixdegreesofwikipedia.com/?source=Six%20Degrees%...](https://www.sixdegreesofwikipedia.com/?source=Six%20Degrees%20of%20Kevin%20Bacon&target=Phinney)

------
nathan_f77
This is awesome! I looked up "New Zealand" and "Yellow". There was only one
degree of separation, via Ochre:
[https://en.wikipedia.org/wiki/Ochre#In_Australia_and_New_Zea...](https://en.wikipedia.org/wiki/Ochre#In_Australia_and_New_Zealand)

I learned that the Maori people of New Zealand used ochre mixed with fish oil
to paint wakas (war canoes), and also as an insect repellent.

------
TuringTest
Incidentally, it's a good way to find incorrect ambiguous wikilinks.
[[Comparison of web browsers]] shouldn't link directly to [[Gnome]], which is
the article about small mythologic humanoids, not about the desktop
environment.

------
kyleschiller
What determines a "path"?

"Philosophy" is connected to "Ethiopia" through "Sexism", but I couldn't find
that phrase on either page.

[https://www.sixdegreesofwikipedia.com/?source=Philosophy&tar...](https://www.sixdegreesofwikipedia.com/?source=Philosophy&target=Ethiopia)

~~~
endorphone
Sexism links to both philosophy and Ethiopia. It is treating links as an
undirected graph (despite the visual indication showing a directional vector),
which seems entirely fair.

~~~
the_af
It seems somewhat unfair. If it's not a directed graph, it means you can't
actually navigate from the source to the target page by following these
links...

edit: in fact, the author describes the goal of his project in another comment
here: "my goal for the project which is to traverse the links as any human
would be able to".

~~~
larkeith
Parent was incorrect - the graphs are directed.

------
MurrayHill1980
About 12 years ago, some colleagues (Yehuda Koren and Chris Volinsky) and I
proposed a technique for measuring proximity in quasi-random (social)
networks. It can handle out-of-memory databases and more than two query
nodes, and it also finds a visualizable subgraph that represents as much of the
relationship as possible using dynamic programming. It is described here
(includes some figures):
[http://web2.research.att.com/export/sites/att_labs/groups/in...](http://web2.research.att.com/export/sites/att_labs/groups/infovis/res/legacy_papers/DBLP-conf-kdd-KorenNV06.pdf)
The proposed heuristic approximates the "cycle-free
effective [electrical] conductance" between the query nodes. In this paper, we
were able to use anonymized phone calls to confirm 6 degrees of separation.

There used to be a live demo on an AT&T Labs website but it is not available
now. There are published algorithms for all the phases of the proposed
heuristic, but my recollection is that Yehuda found that an efficient, robust
implementation of k-disjoint-shortest-paths was not easy.

This is an interesting problem, thank you for making your work available. (I
do agree the HTML form placeholders that change rapidly but are ignored when
you press the GO button are a little confusing; it took me a minute or two to
figure out what was going on.)

~~~
jwngr
Thanks a lot for sharing! BTW, I just fixed the confusing interaction with the
text input placeholders[1].

[1]
[https://github.com/jwngr/sdow/commit/6e42e06488a592784e5d3d2...](https://github.com/jwngr/sdow/commit/6e42e06488a592784e5d3d221fba81adbbe0259e)

------
maaaats
Could you make it possible to just test with the fun suggestions that scroll
through?

~~~
scrooched_moose
Yeah, that really confused me. Distance from "M.C. Escher" to "Lucky Charms"
sounds interesting only to get:

    
    
        You'll probably want to choose the start and end pages before you hit that.
    

Took me a while to realize it didn't "lock in" the suggestions and I had to
manually enter it.

~~~
jffry
Which is a shame, because that specific example ends up having some
interestingly diverse paths between the two:
[https://www.sixdegreesofwikipedia.com/?source=M.%20C.%20Esch...](https://www.sixdegreesofwikipedia.com/?source=M.%20C.%20Escher&target=Lucky%20Charms)

------
pjmorris
Took some tinkering to hit five degrees, from 'Hemisphere Dancer' (Jimmy
Buffett's old seaplane) to the VC (Vapnik–Chervonenkis) dimension. Rewarding
for the site to say ' _wipes brow_ I really had to work for this one.'

[0]
[https://www.sixdegreesofwikipedia.com/?source=Hemisphere%20D...](https://www.sixdegreesofwikipedia.com/?source=Hemisphere%20Dancer&target=VC%20dimension)

~~~
Tepix
Same with "Yilan Creole Japanese" to "Gustav Eberlein" with a nice 537 paths.

------
e0m
I would love to see a list compiled somewhere of article pairs with exactly 6
degrees of separation. This is proving to be extremely difficult. In the
entire HN thread so far, I only see one, by dkuder
[https://www.sixdegreesofwikipedia.com/?source=Six%20Degrees%...](https://www.sixdegreesofwikipedia.com/?source=Six%20Degrees%20of%20Kevin%20Bacon&target=Phinney)

~~~
jwngr
I'm storing all the search results and will do some analysis on the data.
Maybe I'll even get around to adding a new page with some of the interesting
stuff I find.

------
jxramos
This thing is pretty dang good, "Golden Ratio" to "Pope Pius X", what could
possibly go wrong? Nailed it with "1 path with 2 degrees of separation".
"Infinite Series" to "Declaration of Independence", nailed it in "4 paths with
3 degrees of separation". This is a lot of fun

------
Theodores
This is brilliant. For a long time I have had a 'motorway game' I play with my
brother in law where we identify two cars on the road and then have to come up
with the connection. So often you see badge engineered vans where one partner,
e.g. Nissan rather than Renault, makes the van as seen, so you can match the
other partner, e.g. Renault, in this game, since they are the same company,
then the next vehicle, e.g. some Mercedes car, how do you get the link, and
prove it when at motorway speed?

Clearly this app makes it effortless, you could find some supplier like Recaro
actually make the seats in both, even the van. Or you could find some joint
venture in Brazil that both companies share, who knows... So I look forward to
using this app on the M4 some time soon!

------
tw1010
Does anyone know what the "diameter" of the wikipedia graph is? In other
words, what is the longest shortest path between two wikipedia articles?

~~~
dbranes
Well that question only makes sense for connected graphs, we don't really know
whether this is connected. So in general a version of this question that makes
sense is, among all the connected components of the wikipedia graph, what is
the largest diameter.

~~~
jwngr
This is an interesting question that I'd like to answer now that I have all
the data. I am curious to see how long it will take to find a solution as I
believe even the most efficient algorithms for this have a high runtime
complexity.

And yes, the graph is not connected (there are both nodes with no outgoing
links and with no incoming links), but over 99% of the pages are connected, so
the answer would still be interesting and worthwhile.

~~~
jcranmer
Floyd-Warshall solves the all-pairs shortest path in time O(V^3). Running a
BFS search rooted from every node would find the shortest path in an
unweighted graph in O(V*(V + E)).
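
As a rough sketch of the BFS-from-every-node approach, assuming the link graph fits in memory as a dict of lists (the toy graph and the function names are invented for illustration):

```python
from collections import deque


def bfs_distances(graph, start):
    """Distances (in links) from start to every page reachable from it."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in graph.get(page, ()):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist


def longest_shortest_path(graph):
    """Return (distance, source, target) for the farthest reachable pair."""
    nodes = set(graph)
    for outs in graph.values():
        nodes.update(outs)
    best = (0, None, None)
    # One BFS per node: O(V * (V + E)) overall.
    for source in sorted(nodes):
        for target, d in bfs_distances(graph, source).items():
            if d > best[0]:
                best = (d, source, target)
    return best


# Toy directed graph; D has no outgoing links.
links = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": []}
print(longest_shortest_path(links))
```

Unreachable pairs are simply skipped, so on a graph that is not fully connected this reports the longest finite shortest path rather than an infinite diameter.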

~~~
tw1010
Hence, since there's a whole lot of Vs in the wikipedia graph, he's probably
going to be satisfied with an approximate solution, unless he has a lot of CPU
hours to spare.

~~~
jcranmer
There's 5,579,252 articles in English Wikipedia right now, which means the
largest component has 5 million and change or so vertices. Each individual BFS
should take a few CPU-seconds at most, and you can trivially parallelize the
BFS for each node across as many CPUs as you have.
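
A back-of-the-envelope version of that estimate, with the per-BFS time and the core count as stated assumptions rather than measurements:

```python
ARTICLES = 5_579_252       # article count quoted in the comment above
SECONDS_PER_BFS = 2.0      # assumption: one reading of "a few CPU-seconds at most"
CORES = 64                 # assumption: a hypothetical machine size

cpu_seconds = ARTICLES * SECONDS_PER_BFS
cpu_days = cpu_seconds / 86_400   # seconds in a day
wall_days = cpu_days / CORES      # assumes near-perfect parallel scaling

print(f"{cpu_days:.1f} CPU-days, about {wall_days:.1f} days on {CORES} cores")
```

Under these assumptions the full computation comes to roughly 129 CPU-days, or about two days of wall-clock time on 64 cores.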

------
Aardwolf
"You'll probably want to choose the start and end pages before you hit that."

Why? There were two random examples filled in the boxes. I have to retype them
manually?

Really interesting and fun other than that! I'm trying to find something with 7
degrees, but so far the most I could get is 4.

------
didgeoridoo
Would be cool to see the paths ranked in order of “narrowness”, e.g. ranking a
path with shorter, more specific articles more highly than a path that goes
through a mega-page like “France”.
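
One possible way to sketch such a ranking, assuming a per-page size (for example article length in bytes) is available. The `page_size` table, its numbers, and the function name are all made up for illustration.

```python
def rank_by_narrowness(paths, page_size):
    """Sort paths so that those through smaller intermediate pages come first."""
    def breadth(path):
        # Only the intermediate pages count; the endpoints are the same for every path.
        return sum(page_size.get(page, 0) for page in path[1:-1])
    return sorted(paths, key=breadth)


# Invented sizes, purely illustrative.
page_size = {"France": 300_000, "Small village": 1_200}
paths = [
    ["Start", "France", "End"],
    ["Start", "Small village", "End"],
]

for path in rank_by_narrowness(paths, page_size):
    print(" -> ".join(path))
```

Summing sizes is just one choice of score; the number of outgoing links per intermediate page would work the same way.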

------
tbirrell
This reminds me of that thing back around 2010 where if you clicked the first*
link on any wikipedia page you'd eventually get to the article on philosophy
or something. I forget which it was exactly, and I have no idea if it still
works, but it was pretty fun back then trying to find a starting article that
couldn't find the "philosophy" article.

*Non-italicised because of the disambiguation suggestions

~~~
tbirrell
It still works. Technically it can be any of the following 25 articles because
once you hit Philosophy, it loops through:

    
    
      Philosophy > Knowledge > Fact > Education > Learning > Evidence > Logic 
      > Ancient Greek > Greek language > Modern Greek > Colloquialism 
      > Vernacular > Human > Neontology > Biology > Natural science > Science 
      > Latin > Classical language > Language > Communication > Subject (philosophy) 
      > Subjective consciousness > Consciousness > Quality (philosophy) > Philosophy
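
The loop can be detected mechanically by following first links until a page repeats. In this sketch the loop itself is copied from the list above, while the `first_link` table, the "Example page" entry, and the function name are assumptions for illustration; nothing here queries Wikipedia.

```python
def follow_first_links(first_link, start):
    """Follow first links until a page repeats; return (lead_in, cycle)."""
    seen = {}    # page -> position in trail
    trail = []
    page = start
    while page is not None and page not in seen:
        seen[page] = len(trail)
        trail.append(page)
        page = first_link.get(page)
    if page is None:
        return trail, []   # dead end: no loop reached
    return trail[:seen[page]], trail[seen[page]:]


# The 25-article loop listed above.
LOOP = [
    "Philosophy", "Knowledge", "Fact", "Education", "Learning", "Evidence",
    "Logic", "Ancient Greek", "Greek language", "Modern Greek",
    "Colloquialism", "Vernacular", "Human", "Neontology", "Biology",
    "Natural science", "Science", "Latin", "Classical language", "Language",
    "Communication", "Subject (philosophy)", "Subjective consciousness",
    "Consciousness", "Quality (philosophy)",
]
first_link = dict(zip(LOOP, LOOP[1:] + LOOP[:1]))
first_link["Example page"] = "Science"   # hypothetical starting article

lead_in, cycle = follow_first_links(first_link, "Example page")
print(lead_in, len(cycle))
```

Whichever article the walk enters at, the detected cycle has the same 25 members, which is why any of them can stand in for "Philosophy".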

------
onychomys
There's only one path from Mersenne Prime to Bolivia, if anybody was
wondering. Mersenne Prime -> French People -> Bolivia.

~~~
lucb1e
Only one path of three. There are probably hundreds or thousands in four
steps.

------
tachyoff
This is fabulous! Got from cheese to postmodernism in two hops [1]. Really
excellent work, and thanks for making it open source!

[1]
[https://www.sixdegreesofwikipedia.com/?source=Cheese&target=...](https://www.sixdegreesofwikipedia.com/?source=Cheese&target=Postmodernism)

------
jquinby
I got 6 with "dig" to "dug" (though Dig ended up redirecting to "Double J
Radio"):

[https://www.sixdegreesofwikipedia.com/?source=Double%20J%20%...](https://www.sixdegreesofwikipedia.com/?source=Double%20J%20%28radio%29&target=Dug)

------
s_dev
Another similar project from years ago came to the conclusion that
disregarding "list" wikipedia entries, the centre was the article on The United
Kingdom. Did you reach a similar conclusion?

[http://mu.netsoc.ie/wiki/](http://mu.netsoc.ie/wiki/)

------
PinkMilkshake
I like the little trivia that show up, although sometimes they disappear too
quickly.

My favorite mentioned which Wikipedia article had the longest article name:

Suzukake no Ki no Michi de "Kimi no Hohoemi o Yume ni Miru" to Itte Shimattara
Bokutachi no Kankei wa Dō Kawatte Shimau no ka, Bokunari ni Nannichi ka
Kangaeta Ue de no Yaya Kihazukashii Ketsuron no Yō na Mono
([https://en.wikipedia.org/wiki/Suzukake_no_Ki_no_Michi_de_%22...](https://en.wikipedia.org/wiki/Suzukake_no_Ki_no_Michi_de_%22Kimi_no_Hohoemi_o_Yume_ni_Miru%22_to_Itte_Shimattara_Bokutachi_no_Kankei_wa_D%C5%8D_Kawatte_Shimau_no_ka,_Bokunari_ni_Nannichi_ka_Kangaeta_Ue_de_no_Yaya_Kihazukashii_Ketsuron_no_Y%C5%8D_na_Mono))

~~~
jwngr
The full fact list is on GitHub[1]. The fact list was a lot more interesting
and important when searches took longer to run. One of the bad things about
improving the performance so much was the fact that the facts don't have as
long to display :P

[1]
[https://github.com/jwngr/sdow/blob/master/website/src/resour...](https://github.com/jwngr/sdow/blob/master/website/src/resources/wikipediaFacts.json)

------
mirkonasato
I had some fun doing something similar with Neo4j a few years ago:
[https://github.com/mirkonasato/graphipedia](https://github.com/mirkonasato/graphipedia)

(The code may need some tweaks to work with the latest Neo4j version.)

------
pbnjay
It would be a bit more interesting if it could distinguish between the "See
Also" and other tables at the bottom of the pages, and the main content. E.g.
I connected Donald Knuth to a Geneticist simply because they both won awards
(in different categories).

~~~
jwngr
Unfortunately I'm not aware of a way to distinguish between the two. Wikipedia
stores both types of links in the same database. I would love to cull out all
the links in category boxes and sources. If anyone has any ideas, let me know!

------
turc1656
Very cool. And a lot fewer degrees of separation between my tests than I expected
- so far everything has a legitimate 3 degrees when I expected much more.
Higgs Boson to Taylor Swift in 3 steps, amazing. Not only that, but 91 ways to
get there in just 3 steps.

This basically quantifies what my wife and I jokingly refer to as "rabbit-holing"
online. She'll ask me what I'm reading and it will be something totally
unrelated to what I said I was coming to look up. And she's always like "how
did that happen?" And I never have a good answer. But now I do (if it involves
Wikipedia)!

~~~
empath75
You need to go more obscure -- Fiber Bundle to Lars Ulrich was 4 steps.
Calabi-Yau Manifold to Donnie Wahlberg is also 4.

I haven't gotten 5 yet.

~~~
el_benhameen
I got 9 steps with "Here (company)" to "There (wikipedia disambiguation
page)".

~~~
purell_hack
I got a 9 step one without a disambiguation page.
[https://www.sixdegreesofwikipedia.com/?source=Lion%20Express...](https://www.sixdegreesofwikipedia.com/?source=Lion%20Express&target=Phinney)
Took a long time though.

------
TheLoneTechNerd
You should track the "longest shortest paths" of things and display them - has
anyone managed to hit 10 degrees of separation, for example? If someone finds
one, let us know!

------
deckar01
Those graph node colors are not very color blind friendly. They are easy to
differentiate in the list view, but in the graph with black borders, they are
just too low contrast.

~~~
jwngr
I'll look into it. Interestingly, I'm using[1] one of the default d3 color
scales, which I assumed would be color blind friendly out of the box.

[1]
[https://github.com/jwngr/sdow/blob/a2699dc95d884ec64a4641630...](https://github.com/jwngr/sdow/blob/a2699dc95d884ec64a46416307bebd9c58f76412/website/src/components/ResultsList.js#L19)

------
bencunningham
I made something similar using Wikipedia data, specifically for Politicians
[http://www.poligraph.io](http://www.poligraph.io)

------
apeace
Did I find a bug?

[https://www.sixdegreesofwikipedia.com/?source=Adolf%20Hitler...](https://www.sixdegreesofwikipedia.com/?source=Adolf%20Hitler&target=Elon%20Musk)

It's showing "Bill Gates" and "Mark Zuckerberg" as the hops, but on the start
page I don't see links to those.

(Apologies for the subject matter. It was the first thing I thought of,
because of a Wikipedia-path-finding game I had heard of before.)

~~~
larkeith
It uses a pre-downloaded database of links [1], so presumably certain pages
that were once linked have been updated.

[1]
[https://news.ycombinator.com/item?id=16469427](https://news.ycombinator.com/item?id=16469427)

~~~
apeace
Thanks! Makes me wonder why those links were in there in the first place...

~~~
larkeith
That's an excellent question - I'd speculate that they were perhaps mentioned
in the passage beginning,

    
    
      ...Further, Haffner claims that other than Alexander the Great, Hitler had a more significant impact than any other comparable historical figure...
    

However, I wasn't able to find out for sure with a quick browse through the
page history.

------
raz32dust
Really cool! May be a bug: I tried Martin Luther King -> Elon Musk and it
showed me a link from MLK -> Mark Zuckerberg -> Musk. But I couldn't find a
Zuckerberg link from the MLK page.
[https://www.sixdegreesofwikipedia.com/?source=Martin%20Luthe...](https://www.sixdegreesofwikipedia.com/?source=Martin%20Luther%20King%20Jr.&target=Elon%20Musk)

~~~
larkeith
It uses a pre-downloaded database of links [1], so presumably certain pages
that were once linked have been updated. [1]
[https://news.ycombinator.com/item?id=16469427](https://news.ycombinator.com/item?id=16469427)

------
kazinator
Please make it Ctrl+Scroll to zoom (a UI standard obeyed in umpteen
applications!), and leave Scroll alone for scrolling through the page.

Nobody wants to have to click outside the graph view to change the state of
the UI so that Scroll scrolls the page.

(Of course, Ctrl+Scroll is already taken for browser zoom; but hijacking
browser zoom in this situation is more acceptable than hijacking vertical
scroll).

------
rtkwe
I'm confused and can't find the final link in this chain.

[https://www.sixdegreesofwikipedia.com/?source=Late%20capital...](https://www.sixdegreesofwikipedia.com/?source=Late%20capitalism&target=Five%20Nights%20at%20Freddy%27s)

I'm guessing it's part of an older copy of the page? Is there an easy way to
search revisions?

~~~
jwngr
So the reason is that "Theodore Roosevelt" links to "Freddy Fazbear" (ctrl+f
for "Articles related to Theodore Roosevelt" and then click it and then click
"Teddy Bears") which redirects to "Five Nights at Freddy's". So, technically,
there is a direct link there, albeit through the categories dropdowns.
Unfortunately, Wikipedia doesn't have anything in their database dumps
identifying if a link is in the article itself versus the categories dropdowns
or sources, so I include them all.

------
gabrielrc
That's really cool :-)

I have built a very similar project some time ago and although it's not as
beautiful and organized as yours, it's pretty fast! It's in Portuguese, but if
any of you guys want to check it out:
[http://wikigraph.russoft.tech/](http://wikigraph.russoft.tech/)

------
bussierem
Did this happen to be inspired by the Extra Credits episodes about gamifying
education? It's almost exactly that.

Context:
[https://www.youtube.com/watch?v=MuDLw1zIc94](https://www.youtube.com/watch?v=MuDLw1zIc94)

Addendum - I thought it was a really cool idea, and you made it look amazing!
Well done!

------
pcl
Cool! Since you're using a digraph, it'd be neat to have a UI affordance to
reverse the previous search.

------
maze-le
This is hilarious, from 'Riemann manifold' to 'Grindcore': 196 paths with 5
degrees.

[0]:
[https://www.sixdegreesofwikipedia.com/?source=Riemann%20mani...](https://www.sixdegreesofwikipedia.com/?source=Riemann%20manifold&target=Grindcore)

------
glup
I get a generic error for "Mansard roof" -> "Twister (1996 film)", otherwise
very cool!

------
Nullabillity
[https://www.sixdegreesofwikipedia.com/?source=Shit%20happens...](https://www.sixdegreesofwikipedia.com/?source=Shit%20happens&target=Stefan%20L%C3%B6fven)
seems to crash the tab in Firefox, but works fine in Chrome.

------
grzm
Pretty nifty! Nice visualization and execution. It reminded me of a game a
friend of mine wrote as a Greasemonkey script back in the day, where you do
the work: [http://www.playwikipaths.com](http://www.playwikipaths.com)

------
organman91
Not strictly about separation, but a similar chain-of-links phenomenon:
[https://en.wikipedia.org/wiki/Wikipedia:Getting_to_Philosoph...](https://en.wikipedia.org/wiki/Wikipedia:Getting_to_Philosophy)

------
dzink
One challenge with Wikipedia data I've struggled with over the years is that
knowledge graph and interest graph don't directly overlap. Cancer is about as
closely related to Healthcare as Horse is to Astronaut (stronger ties, but
same distance).

------
airstrike
Take from that what you will...

[https://www.sixdegreesofwikipedia.com/?source=Fragile%20X%20...](https://www.sixdegreesofwikipedia.com/?source=Fragile%20X%20syndrome&target=Counter-Strike%3A%20Global%20Offensive)

------
danschumann
Vanilla Ice is 3 degrees from apocalyptic literature.

[https://www.sixdegreesofwikipedia.com/?source=Apocalyptic%20...](https://www.sixdegreesofwikipedia.com/?source=Apocalyptic%20literature&target=Vanilla%20Ice)

------
tw456
More like seven degrees amirite
[https://www.sixdegreesofwikipedia.com/?source=John%20Arneil&...](https://www.sixdegreesofwikipedia.com/?source=John%20Arneil&target=Iphiothe%20criopsoides)

------
SeoxyS
The longest path I was able to find is this 5-step path:

[https://www.sixdegreesofwikipedia.com/?source=Red%20Solo%20C...](https://www.sixdegreesofwikipedia.com/?source=Red%20Solo%20Cup&target=Lighter%20fluid)

~~~
drinkzima
Disambiguation pages are a good source for bad endpoints (here's a 6-er):
[https://www.sixdegreesofwikipedia.com/?source=Disambiguation...](https://www.sixdegreesofwikipedia.com/?source=Disambiguation%20%28disambiguation%29&target=Zima)

------
jkprow
Wow, this is awesome work! Two of my classmates and I made something related
for our college capstone project:

[https://wikitree.website](https://wikitree.website)

If the creator happens to read this comment, I'd love to compare notes!

~~~
jwngr
Very cool! Your UI is great. I like all the animations and the graph is super
smooth. All my code is open source[1] and it is decently documented. I'm happy
to answer any questions you have and you're more than welcome to use any of
the code I wrote for your project.

[1] [https://github.com/jwngr/sdow](https://github.com/jwngr/sdow)

------
chasedehan
Interesting. I just got sucked into checking a whole bunch of connections. I
like it.

------
tyWWpow
Created an account just to say this: First time in a long time I've seen
something this fun on Hacker News. Just spent the last 10 minutes trying to
see the hidden connections between Bagels and the Atlantic Puffin. Great work!

------
calt
My friends and I used to play this as a race. We weren't concerned so much with
the number of hops as how quickly we could get there.

It was always fun reviewing each round and seeing how everyone got to the
destination.

------
akoster
Nice job. Reminds me of wikispeedia:
[http://www.cs.mcgill.ca/~rwest/wikispeedia/](http://www.cs.mcgill.ca/~rwest/wikispeedia/)

------
diehell
Really, really cool implementation, and I like the UI. This inspires me to
make a kind of six degrees of separation of companies by their board directors
(businesses) and the political climate in my country.

------
jumpmanjr
Can you create an option to find the most distant connection for a topic?

~~~
jwngr
I think this is going to be extremely computationally intensive. One of the
big performance wins I got when designing the search algorithm[1] was visiting
as few nodes in the graph as possible (which I did via a bi-directional
breadth-first search). To find the most distant node, I'd need to traverse the
entire graph, which consists of almost 6 million nodes. It can be done, but it
would take minutes, hours, days, ...

[1]
[https://github.com/jwngr/sdow/blob/a2699dc95d884ec64a4641630...](https://github.com/jwngr/sdow/blob/a2699dc95d884ec64a46416307bebd9c58f76412/sdow/breadth_first_search.py#L36)
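
For readers who haven't seen the technique, here is a minimal sketch of a
bi-directional breadth-first search on a directed link graph. It is not the
code linked above: the function names and the in-memory `(from_page, to_page)`
pair list are assumptions for illustration, and it returns only the number of
degrees rather than every shortest path.

```python
from collections import defaultdict


def build_link_maps(links):
    """Index (from_page, to_page) pairs by outgoing and by incoming links."""
    outgoing = defaultdict(set)
    incoming = defaultdict(set)
    for from_page, to_page in links:
        outgoing[from_page].add(to_page)
        incoming[to_page].add(from_page)
    return outgoing, incoming


def degrees_of_separation(links, source, target):
    """Return the length of a shortest source -> target path, or None."""
    if source == target:
        return 0
    outgoing, incoming = build_link_maps(links)
    dist_from_source = {source: 0}
    dist_to_target = {target: 0}
    forward_frontier = {source}
    backward_frontier = {target}

    while forward_frontier and backward_frontier:
        # Expand the smaller frontier so fewer pages are visited overall.
        expand_forward = len(forward_frontier) <= len(backward_frontier)
        if expand_forward:
            frontier, seen, other_seen, neighbors = (
                forward_frontier, dist_from_source, dist_to_target, outgoing)
        else:
            # The backward side walks links against their direction.
            frontier, seen, other_seen, neighbors = (
                backward_frontier, dist_to_target, dist_from_source, incoming)

        next_frontier = set()
        for page in frontier:
            for neighbor in neighbors.get(page, ()):
                if neighbor in other_seen:
                    # The two searches met: join the two half-distances.
                    return seen[page] + 1 + other_seen[neighbor]
                if neighbor not in seen:
                    seen[neighbor] = seen[page] + 1
                    next_frontier.add(neighbor)

        if expand_forward:
            forward_frontier = next_frontier
        else:
            backward_frontier = next_frontier

    return None  # one side ran out of pages, so no path exists
```

For example, with the made-up links `[("A", "B"), ("B", "C"), ("C", "D"),
("D", "A")]`, `degrees_of_separation(links, "A", "D")` is 3 while the reverse
direction is 1.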

~~~
AstralStorm
Not too expensive, you can use parallel IDDFS to get a good approximation
quickly. (Especially if you pick a good heuristic to follow links.)

Challenging part is keeping track of already visited pages to break cycles -
some variant of a Bloom filter will help.
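
As a rough sketch of the basic idea only: the version below is a sequential
iterative-deepening DFS between two pages, with no parallelism and no link
heuristic, and it breaks cycles with an exact set of the pages on the current
path rather than a Bloom filter. The name `iddfs_path` and the `outgoing` dict
are assumptions for illustration.

```python
def iddfs_path(outgoing, source, target, max_depth):
    """Return a shortest source -> target path of at most max_depth links, or None."""

    def search(page, remaining, path, on_path):
        if page == target:
            return list(path)
        if remaining == 0:
            return None
        for neighbor in outgoing.get(page, ()):
            if neighbor in on_path:
                continue  # already on this path: following it would loop
            path.append(neighbor)
            on_path.add(neighbor)
            found = search(neighbor, remaining - 1, path, on_path)
            path.pop()
            on_path.remove(neighbor)
            if found is not None:
                return found
        return None

    # Retry with a growing depth limit; the first hit is a shortest path.
    for depth in range(max_depth + 1):
        found = search(source, depth, [source], {source})
        if found is not None:
            return found
    return None
```

Memory stays proportional to the depth limit, at the cost of revisiting
shallow pages on every deepening round.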

------
imhoguy
Cool project and very compact!

Just a quick question: I found this in the Initial setup docs:

> Do not use Debian GNU/Linux 9 (stretch) due to degraded performance.

Could you elaborate or give some reference about that issue please? Thanks.

~~~
jwngr
Thanks! It's weird, but as soon as I tried upgrading to an identical GCP
machine running Debian 9 (stretch) and ran my database creation process[1], it
took many, many times longer to even download the several GB dumps from
Wikipedia via wget than it did on Debian 8 (jessie). As the beefy machine I
use for the database creation process costs ~40 cents per hour, I figured it
wasn't worth it to upgrade to Debian 9 since the Debian 8 machine finished in
one hour and worked fine. After trying it a second time a month later and
realizing the same thing happened, I added a note for myself in the README
about it. I did find some other reports about it online[2] and just decided it
wasn't that critical to upgrade it at the moment.

[1] [https://github.com/jwngr/sdow#database-creation-process](https://github.com/jwngr/sdow#database-creation-process)

[2] [https://lists.debian.org/debian-kernel/2017/12/msg00265.html](https://lists.debian.org/debian-kernel/2017/12/msg00265.html)

------
crazysim
I would suggest making the Go button work right away on the example query that
fades in and out without having to enter in anything.

Right now it prompts you to enter something if you enter nothing in.

------
lultimouomo
The same thing from 2008, with some fun graph facts included:
[http://mu.netsoc.ie/wiki/](http://mu.netsoc.ie/wiki/)

~~~
jwngr
Yes, Stephen Dolan's project (along with others) was definitely an
inspiration for me! Although, I like to think I made a lot of performance and
design improvements over his work. I used to have an acknowledgements section
in my README with his name in it, but I seem to have destroyed that. I'll
make sure to give credit where credit is due and list all my inspirations.

------
V-2
All I get for every search (no matter what terms) is: _"Whoops... something
is broken and has been reported. In the mean time, please try a different
search."_

~~~
jwngr
It was down for a little while due to the traffic but it's back up and running
again.

------
aecs99
This looks great! Nice work. I have a suggestion: Could you highlight a path
when an edge is clicked on? I guess the clicking on the nodes can link to
their wiki pages.

~~~
jwngr
Yes, definitely a feature I'd like to add. I can change opening the page on
Wikipedia to happen on double click instead of single click. I am still a
d3 noob and need to figure out how to implement the path highlighting.

------
SuperKiwi
If you want to play this as a game :
[http://2pages.net/wikirace.php](http://2pages.net/wikirace.php)

------
DIVx0
This is great. I often use "find the Bacon number of two Wikipedia articles"
as a take-home programming challenge question for software dev candidates.

------
jimnotgym
This is why I come on HN. Skeleton to Neon in two steps

------
diegorbaquero
How are the paths created? I found intermediate nodes that didn't link to the
actual articles. I searched Colombia > Silicon Valley

~~~
jwngr
The Wikipedia database doesn't differentiate links which appear in the main
article versus in the sources or categories sections. It's possible one of the
intermediate links is in there. You sometimes need to do a CTRL+f in "View
Source" to find the link. Also, the latest Wikipedia dump is from February
2nd, so it's possible the link has been deleted since that date. I'll
regenerate my database when the new dump lands in early March.

------
sleavey
Can you add a link to the title image to go back to the homepage? I try this
on pretty much every website out of muscle memory.

~~~
jwngr
I just added the query string stuff in the URL this morning and didn't even
think about this issue. Just made the change as you suggested[1]. Thanks!

[1]
[https://github.com/jwngr/sdow/commit/b9164b4455661d7775aeb78...](https://github.com/jwngr/sdow/commit/b9164b4455661d7775aeb78f5e4b713612d6f299)

------
zouhair
Nice. Wasn't there some website where you can play against others to see who
can get from one article to another the fastest?

------
intrasight
Seems broken. Always says "You'll probably want to choose the start and end
pages before you hit that."

------
kazinator
Direct link (== 1 deg) between "Lisp (programming languages)" and "C++".

However, 5 degrees between Emacs and Vim!

~~~
fmihaila
> However, 5 degrees between Emacs and Vim!

5 degrees if you pick Vim (disambiguation page). If you pick Vim (text editor)
you get 1 degree. Which is interesting, since Vim (disambiguation page) links
to Vim (text editor).

------
8bitsrule
The 'Brady Bunch' links to 'Idi Amin' through George Steinbrenner. Might have
known.

------
daemonk
Might be cool to implement a weighted relationship based on reciprocal linking
or number of links.
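
One way this could look, as a sketch only: charge less for a link whose two
pages link to each other, then search for the cheapest path with Dijkstra's
algorithm. The costs 1.0 and 2.0 and both function names are arbitrary
assumptions, and nothing here reflects how the site works.

```python
import heapq


def reciprocal_weights(outgoing, mutual_cost=1.0, one_way_cost=2.0):
    """Give each link a cost: cheaper when the target page links back."""
    weighted = {}
    for page, targets in outgoing.items():
        for target in targets:
            links_back = page in outgoing.get(target, ())
            cost = mutual_cost if links_back else one_way_cost
            weighted.setdefault(page, {})[target] = cost
    return weighted


def cheapest_path_cost(weighted, source, target):
    """Dijkstra's algorithm; returns the lowest total cost, or None."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, page = heapq.heappop(heap)
        if page == target:
            return cost
        if cost > best.get(page, float("inf")):
            continue  # stale heap entry, a cheaper route was already found
        for neighbor, weight in weighted.get(page, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return None
```

With weights like these, a path through pages that cite each other can beat a
path with the same number of hops made of one-way links.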

------
knodi123
This is too much fun.

Found 209 paths with 3 degrees of separation from Judge Roy Bean to Unit 731
in 5.35 seconds!

------
OisinMoran
This is wonderful, thanks for sharing!

I know you're getting swamped by feature requests, but a button to swap the
endpoints would be great. In searching for ever bigger degrees of separation I
found myself manually swapping them a lot to see the difference. It could also
lessen some of the confusion expressed in this thread about whether these
should be different at all.

As for how far I've gotten: I've found finite degrees all the way up to seven
[1] and also pages with no path from one to the other [2]. I have yet to find
a doubly-untraversable path or anything with a degree between seven and
infinity, and I suspect the latter to be impossible, especially considering
Goldbach's Extremely Strong Conjecture [3].

[1]
[https://www.sixdegreesofwikipedia.com/?source=Ramjohn&target...](https://www.sixdegreesofwikipedia.com/?source=Ramjohn&target=Jennings%20Formation)

[2]
[https://www.sixdegreesofwikipedia.com/?source=Coln%20Rogers&...](https://www.sixdegreesofwikipedia.com/?source=Coln%20Rogers&target=Wenke)

[3] [https://xkcd.com/1310/](https://xkcd.com/1310/)

------
schindlabua
This is way too addictive. I managed to get a 7!

Limit and colimit of presheaves -> Nichkesaisk Formation

------
IronWolve
Seems when I search for a person, it's mostly LinkedIn and Twitter to the
other match.

------
hydandata
Erlang programming language to Barbra Streisand crashed the browser tab :)

------
hestipod
This is cool. I love reading Wikipedia so this will enable my habit.

------
kahlonel
This was fun. It somehow connected towels and croissants.

------
luizfzs
Brigadeiro to Half-Life in 3 steps. Seems interesting.

~~~
mmanfrin
That is the closest the number 3 has been to Half-Life.

------
wglb
This is a lot of fun. Thanks for doing this.

------
gthinkin
This is awesome! Keep up the great work.

------
mwcmitchell
awww man perfect! i remember playing the wikipedia game a ton in a hs business
class, thanks for this!

------
mrgill
This is pretty cool. Great job!

------
bjornlouser
prokaryote -> hot spring -> FDR -> libertarianism

------
steffenabel
It reminds me of that old xkcd: [https://xkcd.com/214/](https://xkcd.com/214/)

------
IntronExon
9 paths with 3 steps from Gluon to Lemur.

No way this is going to be addictive. _Disappears for days_

Edit: I finally got 4 steps! Clitoris->Dictation

I am a child.

~~~
MichaelMoser123
Best result: No path of Wikipedia links exists from Bertolt Brecht to 97.

Found 87 paths with 5 degrees of separation from Asteroid family to SIX

Found 460 paths with 4 degrees of separation from Imaginary unit to Borscht

Found 460 paths with 4 degrees of separation from 433 Eros to Shooting of
Oscar Grant

~~~
MichaelMoser123
Found 3 paths with 6 degrees of separation from SIX to Separation

Found 571 paths with 6 degrees of separation from Sepulchre (comics) to
Separation

------
dingo_bat
This is probably the best showHN I have ever seen. Efforts like this inspire
me to not be lazy.

------
nukeop
So by default I block all javascript and I got this message:

"Sorry internet hipster, this little side project requires JavaScript."

These kinds of condescending messages towards people concerned with privacy
aren't going to win you any points here.

~~~
quickthrower2
It's more funny than condescending.

------
johnbatch
Found 2,318 paths with 3 degrees of separation from Firebase to Precipitation
in 16.80 seconds!

[https://www.sixdegreesofwikipedia.com/?source=Firebase&targe...](https://www.sixdegreesofwikipedia.com/?source=Firebase&target=Precipitation)

------
gingericha
Found 1,357 paths with 5 degrees of separation. 24.84 seconds. _wipes brow_

[https://www.sixdegreesofwikipedia.com/?source=Roopmati&targe...](https://www.sixdegreesofwikipedia.com/?source=Roopmati&target=Gravity%20Kills)

