would be fun to measure the word/context similarity for each pair and order by that..

I gave that a shot using WordNet. It's flawed for a few reasons. A lot of the more unusual words in his list aren't in WordNet. I also had to stem the words first, and had to pick just the most used sense when a word had multiple senses.

So, the list is only those pairs that made it through the processing.


It would have been better if I had also weighted the scores using his approach, since pairs like "theater theatre" end up on top.

Still interesting though. It found these, which I liked:

  0.33  swinger wingers
  0.33  parrot raptor
  0.33  borsht broths

I also tried a variation based on "Anagrams by Percent of String Not Part of Common Substring, then Length"

It's flawed too, but does put some good ones at the top.
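
For anyone curious, here is a rough Python sketch of how I read that metric. It is not my actual script or the original author's code. The function names are made up for illustration, and I'm assuming "common substring" means the longest contiguous run of letters the two words share.

```python
def longest_common_substring_len(a, b):
    """Length of the longest contiguous run of characters shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, 1):
            if ch_a == ch_b:
                # Extend the run that ended at the previous character of both words.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def fraction_not_in_common(a, b):
    """Share of the word NOT covered by the longest common substring (assumed reading)."""
    if not a or len(a) != len(b):
        raise ValueError("expected two non-empty words of equal length")
    return 1 - longest_common_substring_len(a, b) / len(a)


def rank_pairs(pairs):
    """Most scrambled pairs first; ties broken by longer words first."""
    return sorted(
        pairs,
        key=lambda p: (fraction_not_in_common(p[0], p[1]), len(p[0])),
        reverse=True,
    )


pairs = [
    ("swinger", "wingers"),
    ("parrot", "raptor"),
    ("borsht", "broths"),
    ("theater", "theatre"),
]
for a, b in rank_pairs(pairs):
    print(f"{fraction_not_in_common(a, b):.2f}  {a} {b}")
```

Under this reading, pairs that are mostly one shifted or swapped chunk (like "theater theatre") sink toward the bottom, while pairs sharing no run longer than a single letter float to the top.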


