Show HN: WikiBinge – discover how all things are vaguely connected (wikibinge.com)
246 points by jamez on April 14, 2023 | 61 comments
Connect two articles on Wikipedia, but do it the long way. I've always been a fan of the theory of six degrees of separation, but it's an overused concept when exploring the Wiki-graph.

Instead of showing the shortest path, which in my opinion is "boring" and ends up connecting super-important central articles, I came up with my own method: WikiBinge selects the smaller, less represented articles on Wikipedia. In a WikiBinge path, the underdogs are the kings!

How does it work? It's pretty straightforward! Compute PageRank on the Wiki-graph and assign as weight of each edge the PageRank value of the destination node. A WikiBinge path is then simply a shortest path using these weights: the algorithm will then favor paths passing through articles with lower PageRank values.
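The description above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the site's actual code: the function names `pagerank` and `binge_path`, the damping factor of 0.85, the 50 iterations, and the toy graph are all made up for the example.

```python
import heapq

def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict {node: [outgoing neighbours]}."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            out = graph[node]
            if out:
                share = damping * rank[node] / len(out)
                for dest in out:
                    new_rank[dest] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for dest in nodes:
                    new_rank[dest] += share
        rank = new_rank
    return rank

def binge_path(graph, rank, source, target):
    """Dijkstra where the cost of an edge is the PageRank of its destination."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for dest in graph[node]:
            nd = d + rank[dest]
            if nd < dist.get(dest, float("inf")):
                dist[dest] = nd
                prev[dest] = node
                heapq.heappush(heap, (nd, dest))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical toy graph: "Hub" is linked from many pages, x1 and x2 are obscure.
toy = {
    "A": ["Hub", "x1"], "x1": ["x2"], "x2": ["B"],
    "Hub": ["B"], "B": ["Hub"],
    "p1": ["Hub"], "p2": ["Hub"], "p3": ["Hub"], "p4": ["Hub"],
}
ranks = pagerank(toy)
path = binge_path(toy, ranks, "A", "B")
print(path)  # ['A', 'x1', 'x2', 'B']
```

In this toy graph the two-hop route through "Hub" is the plain shortest path, but the weighted search prefers the three-hop route through the low-PageRank pages.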

More on the motives to build this here: https://www.jamez.it/project/wikibinge/

This is an older project of mine, but it never got much exposure, so I'm humbly submitting it now.




This is absolutely amazing.

I built wikiscroll.blankenship.io for myself to scratch my neophile itch. You might be displacing it in my daily routine: a nice pre-built rabbit hole between two topics of interest has proven to be a lot of fun over the past 30 minutes.

Amazing work.

As a short aside, at first I didn't get it. I was surprised the paths between articles were so long. It wasn't until I tried "Adolf Hitler" -> Something (Hitler has notoriously short paths to everything) that I realized these weren't the shortest paths. Your loading text does a really great job of explaining that, but the "random" button appears to be pulling from a cache (clever!) so I didn't get to see that loading message about the "boring shortest path" until I went off the beaten path.

Since it seems like you are computing both the shortest and the "most interesting" path between the two articles, it would be cool to give me a way to see both on the final loaded page. The shortest path is interesting too, even if it is less interesting than the one you ultimately generate.

It'd also be cool to be able to "pin" one of the boxes so the random button only impacts the other. For example, if I started at the Great Molasses Flood, what path could I take to random other articles? Though I guess this can be accomplished by spinning and then retyping the "Great Molasses Flood"

Edit: I deeply appreciate your narrative at https://www.jamez.it/project/wikibinge/ - this is one of my favorite projects I've come across on HN in a long while.


> Since it seems like you are computing both the shortest and the "most interesting" path between the two articles, it would be cool to give me a way to see both on the final loaded page. The shortest path is interesting too, even if it is less interesting than the one you ultimately generate.

I agree. Sometimes it loads fairly quickly and you only get a second to look at the shortest path.


Thank you both, this is a good suggestion. I'll leave the short connection on screen, comparing it with the long and windy path is fun.


This is funny: I pressed the dice icon, got "Milk" and "Cookie", and thought it was going to be a short connection. It isn't. https://www.wikibinge.com/#Milk/Cookie


The generated paths are not shortest. Click the "About" section for details. It intentionally generates longer paths through smaller articles.


I'd love to be able to customize the algorithm. I'm curious what would happen if you applied adjustable weights to optimize for both the shortest path AND the most niche path (or however you'd describe this).
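One way to picture such a knob, as a rough sketch only: blend a constant per-hop cost with the destination's normalized PageRank. The parameter `alpha`, the helper names, and the toy ranks below are all invented for illustration; nothing here comes from the site. With `alpha = 1.0` every edge costs the same (plain shortest path), and with `alpha = 0.0` only PageRank matters.

```python
import heapq

def cheapest_path(graph, cost, source, target):
    """Dijkstra over {node: [neighbours]}; cost(dest) prices an edge by its destination."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist[node]:
            continue  # stale heap entry
        for dest in graph[node]:
            nd = d + cost(dest)
            if nd < dist.get(dest, float("inf")):
                dist[dest] = nd
                prev[dest] = node
                heapq.heappush(heap, (nd, dest))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def blended_cost(rank, alpha):
    """alpha weighs hop count; (1 - alpha) weighs normalized PageRank."""
    top = max(rank.values())
    return lambda dest: alpha + (1.0 - alpha) * rank[dest] / top

# Hypothetical graph and made-up PageRank values.
graph = {"A": ["Hub", "x1"], "x1": ["x2"], "x2": ["B"], "Hub": ["B"], "B": []}
rank = {"A": 0.02, "x1": 0.02, "x2": 0.04, "Hub": 0.45, "B": 0.40}

short = cheapest_path(graph, blended_cost(rank, 1.0), "A", "B")
niche = cheapest_path(graph, blended_cost(rank, 0.0), "A", "B")
print(short)  # ['A', 'Hub', 'B']
print(niche)  # ['A', 'x1', 'x2', 'B']
```

Values of `alpha` between the two extremes trade path length against obscurity.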


Just using my Wikigame skills, I went from Milk -> Hot Chocolate -> Chocolate -> Chocolate chip cookie -> Cookie. This path likely got downranked because all would be popular articles.


That is a shockingly long chain. Amusingly, if you flip them, Cookie links directly to Milk.


It’s only about 70 jumps from Allosauruses to bodybuilder and trainer Mark Rippetoe


This might be the first time I wish I’d read the comments before visiting the link. Assuming it would find the shortest path, I gave it two very unrelated concepts. I spent way too long exploring a very long chain thinking “How can this be the shortest path?”

So now I want to know, is there a similar tool that does shortest path? Because that would be fun too.


> is there a similar tool that does shortest path?

Yes, there are many of them, as OP said.

Here’s a popular one: https://www.sixdegreesofwikipedia.com/


On my first attempt, I couldn't search for "donut" ("Doughnut" is the canonical title), couldn't type the "()" parentheses that appear in page names, and couldn't use any of France, La France, or French Republic to reach the wiki page on France. Lots of Francescas, though.

Fun fun, thank you for sharing! In the interactive web interface*, I hope non-canonical names can be accepted, short names can be completed, exact matches can be used, and the input at least accepts the characters that appear in page names.

*It looks like writing the URL fragment yourself allows more leniency.


Unimpressed - I tried to get a chain from "Mocca, Yemen" to "Jimma". I got a whole long list of stuff about aircraft engines (I do get that the author was trying to make non-obvious links).

Mocca is a port on the Red Sea in Yemen. The Ottoman Turks used to require all shipping entering the Red Sea to put in at Mocca so their coffee could be taxed. Mocca is a port town; they don't actually grow coffee in the town, it comes from a mountainous region to the North.

Jimma is a coffee-producing region in Ethiopia, south of the entrance to the Red Sea.

I just bought 250g of beans labelled "Mocca Djimmah"; the vendor couldn't tell me whether it came from Yemen or Ethiopia. My guess is that exports from Yemen are "challenged" just now, but I'd like to taste some coffee from the original home of coffee.


In high school my friends and I played a Wikipedia game in which one of us hit the random button to find our "goal" and then had to navigate back to the goal from the Wikipedia home page using only links. It was usually possible to do so and the first person to the goal won.


I used to play a drinking game with the lads. You had to name one thing/person/event and had to click on "Random Article" on Wikipedia and land on that particular concept in 5 clicks or less. Good times...


Excellent. 32 degrees away, and brings in Air Bud.

https://www.wikibinge.com/#John_Wilkes_Booth/Hentai


I tried to go from "Bronze Age" to "Mail" and it took me on a long ride, including the dog from Oz. Quite fun. https://www.wikibinge.com/#Bronze_Age/Mail


It might be interesting to see this but with paths where they're instead weighted based on the strength of the relation (e.g., something like TF-idf on the articles each link to).

I think this would avoid the super common article problem, but also lead to a stronger relation across each link.
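One possible reading of this suggestion, sketched under assumptions: score each link by the TF-IDF cosine similarity of the two articles' text and use `1 - similarity` as the edge cost, so closely related pages are cheap to traverse. The helper names and the three tiny "articles" are invented, and whitespace tokenization is a deliberate simplification.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: {title: text}. Returns {title: {term: tf-idf weight}}."""
    tokenized = {title: text.lower().split() for title, text in docs.items()}
    n = len(docs)
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))  # document frequency: count each term once per doc
    vectors = {}
    for title, tokens in tokenized.items():
        tf = Counter(tokens)
        vectors[title] = {
            term: (count / len(tokens)) * math.log(n / df[term])
            for term, count in tf.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def relation_cost(vectors, src, dest):
    """Edge cost: low when the two articles share distinctive terms."""
    return 1.0 - cosine(vectors[src], vectors[dest])

# Made-up stand-ins for article text.
docs = {
    "Milk": "milk is a white drink from cows",
    "Cookie": "a cookie is a sweet baked snack often served with milk",
    "Allosaurus": "allosaurus is a large jurassic dinosaur",
}
vectors = tfidf_vectors(docs)
related = relation_cost(vectors, "Milk", "Cookie")
unrelated = relation_cost(vectors, "Milk", "Allosaurus")
print(related < unrelated)  # True
```

Words that appear in every document ("is", "a") get an IDF of zero here, so they contribute nothing to the similarity.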


TF-idf would definitely be something very interesting to try, though I also treasure the serendipity brought by the "blindness" to the content.


This is pretty neat! I tried https://www.wikibinge.com/#Moray_eel/Sony_Alpha and I was pleased with the path it took me on for the first big chunk. The last bit was a tour of a significant number of camera models (none of them Sony), which felt... strange? It certainly felt less varied than the combination of animalia, history, geography, and pop culture that the first part took me through.

Fun project, thanks for sharing!

If I had a bit of feedback to share, it's that I'd like the shortest path (which shows while loading the binge) to remain visible after it finishes loading, maybe at the bottom of the page?


Interesting. Seems like it'd be ideal if there were some way to penalize pages that are too similar to each other (similar categories/taxonomy, maybe similar text/structure, etc.) because when it does chain a bunch of weird stuff it is very interesting.


Cool path! The last bit of the tour you described is fairly common - the algorithm is just doing its job, getting closer to the target, avoiding articles with larger PageRank. It's very uncommon to pass through "important" pages. Glad you enjoyed it!


Asked it to connect two small towns on different continents.

One of those towns manufactures car parts. Then I got 10ish car models from various brands.

Then an actor who did a commercial for one of those brands 20 years ago.

He grew up in the second town.

Cool, but loose.


Not sure I get it, but on my first use it pointed out connections between Dehumanization and Memes that I had not previously considered. Curious.


The chains are so long, it's really not that impressive :/


The point is that the chains are long and winding instead of the shortest path between two articles.

It seems that, if you pick an uncached path, the loading screen shows you the shortest path while it computes the longer one. More info in their linked article.


I wish there was a toggle to always see the shortest path instead.


If you want the shortest path, just use one of the many websites that do that. For example: https://www.sixdegreesofwikipedia.com/


Nice! My wife and I used to play a game sort of like this - find the shortest path between two pages on Wikipedia. It actually made for a fun party game too.

I really love the circuitous path though. Fantastic route to discovery and I can see those even being a neat thing for schools


It is a pretty fun game https://wiki-race.com/



You two sound awesome together.


https://www.wikibinge.com/#Electoral_college/The_Long_Earth

Apparently Humptulips, Washington was Terry Pratchett's favourite place on earth. :)


Botswana National Front to The Cheesecake Factory was truly a wild ride, meandering between military hardware, old airplane designs, African social movements, some random celebrities, municipalities in Europe and more. Fun stuff.


Anyone got anything longer than this? (68)

https://www.wikibinge.com/#Madurai/Semiahmoo_Bay


I count 91 articles from FreeBSD to Weedonville, Virginia

https://www.wikibinge.com/#FreeBSD/Weedonville,_Virginia


I count 100+ articles between these two. It is a pretty long path and that is the point I guess. https://www.wikibinge.com/#Amy_Goodman/Anatoly_Karpov


My first search counted 118:

https://www.wikibinge.com/#George_Bush_Intercontinental_Airp...

I imagine you could double this.


Pretty cool project! FYI for some reason search doesn't seem to work that well.

hackNY won't come up, and if you try to add a place with a comma (Lowell, Massachusetts), you can't type it; you have to scroll to it.

https://en.wikipedia.org/wiki/HackNY

https://en.wikipedia.org/wiki/Lowell,_Massachusetts


I couldn't get NODAPL either. Seems to have trouble with uppercased letters?


Horse and Astronaut are more closely connected (through the shows, road, actors) than Cancer and Healthcare which are not connected at all, apparently. This was the problem with public datasets and counting on HTML links to build the graph of human knowledge years ago. I tried to build better search at the time and hit this wall. Large language models will be a bonanza for new products now that the wall is broken.


To be clear - the point is not to display what can or cannot be connected. The emphasis of this project is on the long and tortuous path chosen.

I take your point about the limits of knowledge graphs written manually vs LLMs. IMHO it's not either/or. We need both curation and statistical approaches, and when they are merged they give the best results. Just ask Wolfram: https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its... Edit: fixed link to Stephen Wolfram's blog.


Not "all things." I can't enter arbitrary titles of wiki articles, like "mathematics" and "poetry."


Strange, my first guess had no connections either way

chicken nugget <-> constitution of canada

Now I'm wondering if it's a bug or if there is actually no connection.



Would be great to be able to customize the randomness on the graph. Granted it's a hard problem to solve. I picked two unrelated topics and it seems like most of the bridge is obscure athletes and teams. Maybe excluding certain categories, like sports, in my case would be more helpful.


This is excellent. The path from "George Barnes (Musician)" to "Django Reinhardt" somehow managed to pass through Hysterectomy, 5-Bromouracil, Glucuronidation and Port Bannatyne (Scotland). Kudos for some creative coding!


Here is a challenge, what is the longest “shortest path” across all wiki?
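In graph terms this is the diameter of the link graph, restricted to pairs that are reachable at all. A brute-force sketch, assuming an unweighted directed graph small enough to run a breadth-first search from every node (the full Wikipedia graph would need something far more efficient); the function names and the toy graph are hypothetical.

```python
from collections import deque

def bfs_distances(graph, source):
    """Hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for dest in graph[node]:
            if dest not in dist:
                dist[dest] = dist[node] + 1
                queue.append(dest)
    return dist

def longest_shortest_path(graph):
    """Return (hops, source, target) for the farthest reachable pair."""
    best = (0, None, None)
    for source in graph:
        for dest, hops in bfs_distances(graph, source).items():
            if hops > best[0]:
                best = (hops, source, dest)
    return best

# Toy directed graph; "D" has no outgoing links.
toy = {"A": ["B"], "B": ["C"], "C": ["D", "A"], "D": []}
result = longest_shortest_path(toy)
print(result)  # (3, 'A', 'D')
```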


FWIW, it's worth mentioning the "Bacon number", as I don't see it here or on the page and it's a similar(ish) concept.

I managed ~50 intermediates on this tool from [my home town] to [Kevin Bacon].


Good news: Marc Dutroux is in no way related to quantum superposition.


Lindsay Marshall (catless/bifurcated rivets) and I played with doing this informally as a game back in 2002 or so, by hand.

Well done coding it up. The average path length will be fascinating.


There is a game where the aim is to navigate between two Wikipedia articles with as few hops as possible. I guess it's now vastly simpler to cheat at this...


I’d be careful trying to cheat using this tool. By the sounds of it, you’d lose every time.


yes it's called six degrees of separation... and there are tons of tools already built to optimize that. Did you read the post at all? lol


Would be interesting to see some transformations like sqrt(pagerank) weight to get the chains more like 10-12 steps instead of 60.
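A rough way to see why a square root would shorten the chains: compare how many hops through obscure articles cost the same as a single hop into a popular one. The PageRank values below are invented, and `breakeven_hops` is a hypothetical helper; whether real paths would land at 10-12 steps is not something this sketch can confirm.

```python
import math

def breakeven_hops(hub_rank, small_rank, transform=lambda r: r):
    """How many hops through small articles cost as much as one hop into a hub."""
    return transform(hub_rank) / transform(small_rank)

# Made-up ranks: the hub is 100x more "important" than an obscure page.
linear = breakeven_hops(0.01, 0.0001)
damped = breakeven_hops(0.01, 0.0001, math.sqrt)
print(round(linear), round(damped))  # 100 10
```

With raw PageRank the search will take up to about 100 obscure hops to avoid one hub; after the square root the detour is only worth about 10.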


It's nice that it highlights the less popular and less obvious articles - I've discovered a lot linking Potato to the Millennium Falcon :) https://www.wikibinge.com/#Potato/Millennium_Falcon


This is great fun. Though my first guess had no connection: Cleopatra and The Great Pacific Garbage patch


When you can't find a connection, always try the reverse! https://www.wikibinge.com/#Great_Pacific_garbage_patch/Cleop...


That's a nice project. I hope you will gather enough resources to build it for other language editions of Wikipedia.


Hmm, that six degrees of Wikipedia bridge is a little weird. I thought you could get anywhere in a few clicks via Hitler, but it takes over 70 steps to get from Hitler to Dill.


Ciccio!



