The 'Spelling Bee Honeycomb' puzzle: efficient computation in R

norvig · on Jan 17, 2020

Very nice, David and William.

I was also able to get to about a 2 second solution, but I didn't have to do bit operations: https://nbviewer.jupyter.org/github/norvig/pytudes/blob/mast...

wchargin · on Jan 17, 2020

Hey, I wrote one of these, too! Mine generates and solves all possible puzzles (with my dictionary, there are 54733) in about 1.8 seconds on my six-year-old laptop, and can also typeset them to LaTeX+TikZ to make printable puzzle sheets. That’s 0.03 milliseconds per puzzle. It turns out that you can get this thing to run quite fast once you realize that you can pack words into bitsets and express the whole algorithm in terms of bitwise operations on machine words. Of particular interest is that the running time is output-sensitive: proportional to the number of solutions to the puzzle but not the total size of the dictionary (after one-time preprocessing).

I used the “hard copy” rule set as used in the printed New York Times, which is slightly different from the web rule set: words must be at least five letters long, not four, and are worth 1 point each, or 3 for a pangram. The hard copy puzzles also include three score thresholds, so I had a bit of fun trying to reverse-engineer how those thresholds are chosen. I didn’t get exactly the right function, but I got fairly close, and most importantly the thresholds feel fair when playing. (The online version also has score thresholds, but there are many more of them, and it was easy for me to transcribe thresholds from the archives of the printed copies.)

In my admittedly biased yet assuredly humble opinion, both the algorithm performance and the rating threshold estimation are interesting: https://github.com/wchargin/spelling-bee/tree/master#perform...

Oh, and a web-based solver, for convenience: https://wchargin.github.io/spelling-bee/

Lovely to see how the author of this article and I both had a lot of fun with this by taking it in different directions. :-)

matrixlab · on Jan 17, 2020

Very nice! I also took a bitwise approach and ended up with something that solved all possible puzzles in 5 seconds (in 60 lines of matlab). A technique I used that helped bring the time down was to calculate the points for a given puzzle irregardless of the center letter. This provides an upper bound for the points that could come from that puzzle. After you've done that for all the boards, you sort by upper bound and then start calculating the actual scores for the different center letter options. Doing this for the 538 puzzle, I only had to look at center letters for the top 2 puzzles 'aeginrt' and 'adeinrt' (after those you find a score that is greater than the upper bound for any of the other puzzles).

JoeAltmaier · on Jan 17, 2020

Its amazing the speedups if you have the right model for the problem!

jim-jim-jim · on Jan 17, 2020

Very cool. I also wrote a solver using the NYT magazine's rules back in 2018, and found the bitwise method of matching both 1 and 3 point values incredibly fast. I just precalculated the hashes and ran a parallel loop over the whole array, but I know there are specialized data structures that games like Scrabble use. I wonder if that would have sped things up even more.

Anyway, whenever I see a problem that involves letters now, I reach for bit operators first.

swilliamsio · on Jan 17, 2020

You can play the daily New York Times Spelling Bee puzzle online[0]. It's free to play up till you get more than a certain number of points, then you require a subscription.

I made a little clone of the game to get around these restrictions[1].

[0] https://www.nytimes.com/puzzles/spelling-bee

[1] http://www.swilliams.io/fun/hex/

ncmncm · on Jan 17, 2020

I wrote one to generate all possible versions of the weekly puzzle (minimum 5-letter words) that score between 26 and 32 points, that runs in under 70 ms. It finds about 4.8k puzzles, for 14us per. (Nowadays they run puzzles that score as low as 20 (pfft).)

https://github.com/ncm/nytm-spelling-bee/

I did both C++ and Rust versions. They run in the same time. My goal was the fastest single-threaded program that would fit on exactly one printed page. My first version ran in about a second, and I spent months shaving off milliseconds. Rust in 2016 was much more sensitive to arbitrary details of the source code. Maybe it's better now.

The final version does enough setup to enable most of the time to be spent in a four-instruction loop that a modern core can do in a single cycle per iteration, with occasional detours.

Compilers (clang is especially insistent) prefer to arrange the loop to take two cycles, and save a cycle on the long detours instead. I filed a bug on Gcc over it preferring the slow loop, but they decided the slow version was better:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=67153

I absolutely recommend spending months optimizing a little but complicated program. I learned so much. Most things you think will make it faster make it slower.

The solution to any such puzzle is an easy one-line shellscript, also on the github page.

paulette449 · on Jan 17, 2020

I've started playing this puzzle every day. If you get every word that NYT expects, you get a "Queen Bee" award, the trick is that they don't tell you what the Queen Bee target is, and more often than not there is at least one word I've never heard of.

The words are revealed the next day on the puzzle page, or if you're impatient, they are in the source code for the Spelling Bee page each day.

hoseja · on Jan 17, 2020

>it must not contain the letter S (that would be too easy)

I don't understand. Why S?

madcaptenor · on Jan 17, 2020

If the puzzle contains S it will be very easy to form plurals and therefore run up the score.

madcaptenor · on Jan 17, 2020

More specifically, if you include S as a possible letter, it appears in the second-most words (53,556, compared to 63,445 for E). But once you have some words it's natural to just stick S on the end of everything (this will work for most nouns and verbs) and this makes the standard spelling bee puzzle uninteresting. (I don't think it makes the meta-puzzle less interesting, though.)

If you run Robinson's code but omit the filtering out of words with S, the best puzzle is E/AINRST (8681 points). The previous winner, R/AEGINT at 3898 points, is now only 442nd-best. (Notation: I put the required center letter before a slash.)

Interestingly, one of the five puzzles that's tied for 15 points, the second-worst possible, includes an S. The puzzle is Q/CIORSU - the words are CROQUIS and SUQS.