I wrote a solver a few weeks ago that solves easy mode in 3.45 guesses and hard mode in 3.55 guesses. Rather than using heuristics, it explored a large portion of the game tree (with a little pruning) and I'm hopeful that it's optimal. Interestingly, the EV-optimal hard mode strategy takes 3.52 guesses on average, but requires 7 guesses a tiny fraction of the time, so I instead exposed the best strategy that always requires 6 or fewer guesses.
As a point of curiosity, what do you mean by optimal?
To me, I can see at least two (potentially different) cost functions that a Wordle strategy can aim to minimise. Firstly, you can try to minimise the expected number of guesses, and secondly you can try to minimise the expected number of guesses conditional on never losing the game. In principle you could optimise for the first but not the second by allowing a small list of hard words to take > 6 guesses on average.
Part of my point is the lack of an absolute definition of optimal: strategies are only optimal up to the specified cost function.
IMO, never losing the game is the bare minimum for any reasonable cost function: never take > 6 guesses. Beyond that, you can have different optimization functions based on whether you're trying to minimize the worst-case depth of the decision tree (number of guesses) for each position, or the average depth, or some other reasonable function of the distribution over number of guesses.
Although you could have arbitrarily many cost functions, a fairly general class is the following: pick some nonnegative constants (w1, w2, w3, w4, w5, w6, w7), and for a certain strategy, define the cost as:
cost = w1·n1 + w2·n2 + … + w7·n7,
where nk (for 1≤k≤6) is the number of hidden (solution) words for which the strategy takes k guesses, and n7 is the number of words for which the strategy loses the game. Then,
• Setting w7 infinite / very high is a way of encoding the condition that you not lose the game. (You can set it finite/small if you don't mind occasionally losing the game for some reason!)
• Setting w1 = 1, w2 = 2, …, w6 = 6 will simply minimize the average number of guesses.
• Having them grow exponentially, e.g. setting wk = c^(k-1), where c is some constant larger than 12972 (the number of acceptable guess words), will make sure that the minimal-cost strategy first minimizes the maximum number of guesses, then breaks ties by having the fewest words take that many guesses, and so on.
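To make the family concrete, here's a minimal sketch in Python (the names and the example distributions are mine, not from any actual solver):

```python
# Sketch of the cost family above. counts[k-1] = n_k for k = 1..6
# (words solved in exactly k guesses); counts[6] = n_7 (words lost).
def strategy_cost(counts, weights):
    # cost = sum over k of w_k * n_k
    return sum(w * n for w, n in zip(weights, counts))

# Minimize average guesses; the huge w7 encodes "losing is unacceptable"
# (a big finite number rather than literal infinity keeps the arithmetic
# well-behaved when n7 = 0):
avg_weights = [1, 2, 3, 4, 5, 6, 10**9]

# Lexicographic "minimize the worst case first": w_k = c^(k-1) with c
# larger than the 12972 acceptable guess words, so one word pushed a
# level deeper always costs more than every other word moving shallower.
c = 12973
lex_weights = [c ** k for k in range(7)]

# Two made-up distributions over the 2315 solution words:
never_loses = [1, 131, 999, 919, 208, 57, 0]
one_loss    = [1, 131, 999, 919, 207, 57, 1]
```

Under either weighting, the strategy that never loses comes out cheaper than the one that loses a single word, which is exactly the point of making w7 dominate.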
> IMO, never losing the game is the bare minimum for any reasonable cost function: never take > 6 guesses.
I wouldn’t think so. If I get 99.9% of rounds correct with 3 guesses (and fail to find a solution to 0.1% of the rounds), and you get 100% of rounds with 6 guesses, I’d say I’ve soundly defeated you 99.9% of the time.
Right, that point of view makes sense if you imagine two players separately playing Wordle and comparing their results, and indeed that's what the OP may have been thinking. But as far as I know, Wordle is primarily a game for one player to play against the computer, where eventually getting to the word (or not) is the most prominent outcome (the only notion of "win" or "defeat" in the game itself), and solving it in fewer guesses is just a bonus. But sure, if you don't care about occasionally losing, you can just set an appropriate weight for w7 within the same family of cost functions.
Indeed, it’s essentially a single player game, but we’re talking about which bot a player would want to use which is inherently a notion of bots competing against each other.
...so sometimes even if it gets 4 of 5 letters right on the second try, it still has too many options left. Of course in cases like this it could go "one step back" and try a word with (most of) the letters that are different between these words (HBPMW or MPKDV). So try different words until the number of options is narrowed down sufficiently.
I was saying if you hypothetically got every correct guess in exactly 6 tries. You would pass the suggested “bare minimum” while my 99.9% strategy doesn’t, yet I claim my strategy is better.
I suspect that with most languages it’s impossible to achieve 100% win rate for five or more letters. So w7 should be by far the largest weight but I don’t think it can be infinite.
I suspect the opposite.
While the theoretical number of combinations is huge, the amount of information given by each guess grows faster than the number of real words of longer lengths.
They specified that in the post, they were able to constrain to a strategy that guarantees 6 or fewer and then optimize for EV to get 3.55, but could get down to 3.52 by allowing some small number of words to take more than 6.
I would be interested to know if 6 is the lowest you can constrain it. Can you guarantee a solution in 5, and if so what is the best EV with that constraint?
Yes, you can -- I believe the EV was around 3.47 for that. The "greedy" strategy (minimizing the worst-case size of the largest remaining set) also ends up using at most 5 guesses.
Perhaps unsurprisingly, you can't guarantee at most 4 guesses.
Another possible question is: What is the best strategy minimizing (at any point) _first_ the maximum number of tries, and second the EV (average number of tries over all possible words)? That's subtly different from either answer.
Funnily enough, I think neither of those notions are the correct thing to optimize. For me, you want to minimize the maximum number of guesses. This leads to an equilibrium. Otherwise, your opponent can just abuse your strategy.
My point is exactly that though - the choice of cost function is subjective. There is no best cost function, just a mapping from costs functions to optimal strategies.
The (exact) optimal EV-minimizing strategies (assuming each target word is equally likely) always solve in 5 or fewer guesses in normal mode, and 6 or fewer in hard mode. If you wish to guarantee 5 guesses in hard mode, there's a different (optimal) strategy with slightly higher EV. See: http://sonorouschocolate.com/notes/index.php?title=The_best_...
I don't think a maximum-number-of-guesses optimizer leads to an equilibrium (if you mean Nash equilibrium), assuming you're playing the game where the setter's score is the number of guesses. In particular, if the solver's strategy is deterministic, the setter will always pick one of the words that needs 5 guesses. There's no reason to think the Nash equilibrium can be easily found.
Adversarial strategies are another type of problem entirely (as in, if you're playing Word Mastermind). Wordle is a 1-player game, so I find hgibbs's comment totally reasonable. I would think that 'optimal' is more accurately his second definition - minimizing avg guesses but always winning (as in, losing is +inf guesses).
Isn't that just #2, but changing the definition of losing to be the least number of rounds in which you can always win (which for Wordle is 5)? These small changes to the definition don't seem to change the EV (or the code needed to generate it) that much.
If you don't care about optimizing the EV at all, the greedy tree already has the property you're interested in.
The entity choosing the word in a game like Wordle or Hang Man is more like the dungeon master in a game of D&D than an opponent in the traditional sense? They are there to help you have fun.
If two players are competing in a single round of wordle, then surely the player who guesses the correct word in the fewer number of guesses wins, right? Then it seems natural to say that the better of two players is the one with the most wins after playing all possible rounds.
The only problem is that I think this definition leads to the possibility of non-transitivity, where A beats B, B beats C, and C beats A.
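That non-transitivity is easy to demonstrate with made-up numbers. Here are three hypothetical strategies' per-round guess counts over the same three rounds, forming a rock-paper-scissors cycle:

```python
# Hypothetical guess counts for three strategies A, B, C over the same
# three rounds; lower is better. A matchup winner is whoever wins more
# rounds head-to-head.
A = (1, 2, 3)
B = (2, 3, 1)
C = (3, 1, 2)

def beats(x, y):
    # x beats y if it uses fewer guesses in more rounds than it loses.
    wins = sum(a < b for a, b in zip(x, y))
    losses = sum(a > b for a, b in zip(x, y))
    return wins > losses

# A beats B 2-1, B beats C 2-1, and C beats A 2-1: a cycle.
assert beats(A, B) and beats(B, C) and beats(C, A)
```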
I've seen a number of solvers; but my partner has forbidden me from writing one, forcing me to do everything in wetware. 3-4 guesses seems right intuitively. I saw some speculation that the dictionary was well-known, is that true?
Seems to me that the problem as-stated allows the guesser to discover a nonword "for free", but otherwise expects the dictionary to be "all five letter words" which is much larger than the real solution space. Though, maybe in practice that doesn't change the solvability by much.
The wordlist is more than well-known, the word for each day is predetermined. The entire game runs in the browser and the selected word is based on the date. Here’s a quick script based on any date you select: https://codepen.io/kevlened/full/NWaOmPw
This program actually inadvertently created a quite philosophical discussion amongst my friends! Some of them contended that writing a program to solve wordle sucked the joy out of the process of solving it, even if I solved the puzzle myself, first.
I once caused a vague division in my secondary school by writing a (looking back now) horrendously hacky OCR solver for our "world maths day" challenge.
Literally half the teachers thought it was cheating and I should be punished, whilst the other half thought (because I hadn't abused it to win prizes) that it showed a creative use of more modern skills to make up for my renowned weak mental-math.
I saw the joy in writing the solver as much more interesting than the math. Same way I can't focus on sudoku given I always try to "solve" solving it.
Yes it's interesting, precisely because it's intuitively "solvable" by machines. There's joy in solving by hand, and joy in creating a solver, but probably no joy in running your solver every day and sharing with your friends. Because every decision is some maximum-likelihood estimate, there's no tension or drama in the sequence of guesses.
In fact, simply the act of creating an infallible solver (100% likelihood of 6 guesses or less) demonstrates the futility of competing with their soft, fleshy skills.
I do think any solver that uses the solution-set is "cheating", but I suspect it makes little difference in the total solution (like... 2-3 bits of additional information needed, just a fraction of a round).
Oh dang! I was guessing that the optimal approach would be under 3.5 for easy mode, but was expecting it to be closer to 3! Amazing work. My next go at the problem is in Rust, which I am a total newcomer to. Thanks for sharing!
Nice! I was wondering almost the same thing this morning while my wife was playing Wordle - ended up coding an assistant to help you choose the optimum words.
If you just brute force solve with greedy search (always pick the word that results in the highest average reduction in remaining possible words), you get a solution with an average of 3.47. Even only guessing words on the primary word list (which Wordle uses to pick the word of the day) you can get 3.51. That's as far as I've chased the problem.
I'm surprised to hear that optimal play only gets you 3.4212 (according to the link you posted).
The graph of every possible game for Hard mode should be pretty tractable. Sounds like you wrote almost exactly what I started but have been too lazy to finish.
> but requires 7 guesses a tiny fraction of the time, so I instead exposed the best strategy that always requires 6 or fewer guesses.
That's awesome. My question is "what is the minimum number of guesses to solve every word while guaranteeing it never loses". For hard mode I'm pretty sure this can be provably brute forced. Easy mode requires enough pruning I'm not sure it can be proved.
Correct. In fact, thousands of plural words are allowed as guesses, but the solution will never be plural. Source: I grep'd the lists of solutions and authorized guesses.
I used the combined solution and valid word list when I was playing around with a program to find good starting words. Feels too much like cheating otherwise.
This [1] solves it in 3.4212 average guesses 100% of the time and claims optimality, which is 99% of the difficulty it seems. For the limited 2671-word set of possible words, I think the question is now settled (for the whole 12k-word set, I don't think anyone tried anything).
Neat, didn't know that existed! Definitely would have saved me some time :)
FWIW, I wrote a similar solution a couple of days ago as well, and came up with the same result (3.421166 guesses on average, with starting word SALET). So that would support that result.
It's a bit disappointing that many writeups of heuristic-based solvers have made the front page, but the writeup of an exact solver didn't. [Note: I was the person who submitted the link to the exact solver writeup to HN].
> From an information-theoretic perspective, the optimal initial guess is the one which is maximally informative about the target, given the resulting pattern.
This statement (which motivates the rest of the analysis) isn't correct, because it doesn't take into account the highly irregular search space of what you're actually allowed to guess. Sets of possible words can vary a lot in terms of how easily they can be further cut down: for example, if you're on hard mode and your set of possible words is [BILLY, DILLY, FILLY, HILLY, SILLY, WILLY], then you might be in trouble.
I don't think it works for easy mode either. The overwhelming majority of 6-word sets can be distinguished in two guesses in easy mode (their EV is either 11/6 or 2), while this set has an EV that's around 3.
I think you're right. Although a first guess might reduce the set size a lot, it might lock you into a situation where all the possible second guesses are useless.
For this specific set in easy mode, doesn't (bawds, chefs) result in unique responses for all of them? So that's 2.5 guesses total? Where are you getting the EV for 6-word sets?
You're right: the EV is 2.5 (my guess was pessimistic). For (almost all) 6-word sets, there's a single word you can use to delineate all six words, which gives an EV of 1.83 if that word is in the set and 2 if it isn't.
I wrote a very simple O(N^3) algorithm to determine that: for each possible actual word and each guessed word, figure out whether or not each remaining word is eliminated. Then score each guessed word by how many words it eliminates, averaged over all possible actual words.
(I originally thought O(N^3) would be way too slow to be useful. But it runs fast enough with a few thousand words.)
Like a sibling comment, I also found that the best initial guess is RAISE, although when I tried a different word list, I got different words like LARES (I don't think it's in the Wordle list).
One thing that surprises me is that by the time you get to the mid-game stage (after 2 guesses or so), occasionally it can be more optimal to guess a word with repeated letters; I did not originally expect that because I thought repeated letters are wasted opportunity to test more letters, but it turns out Wordle's handling of repeated letters can give you helpful information about the number of occurrences of a letter. (For example if you picked a word with 2 a's, the first one can be green the second one can be black, essentially telling you there is exactly one letter "a" in the word.)
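That repeated-letter behavior can be reproduced with a two-pass scoring function. A sketch (my own naming, not the game's code):

```python
from collections import Counter

def feedback(guess, solution):
    """Two-pass Wordle-style scoring: mark greens first, then hand out
    yellows capped by how many of each letter remain unmatched in the
    solution. The cap is what produces the 'first copy green, second
    copy black' behavior for repeated letters."""
    result = ["b"] * len(guess)
    leftover = Counter()
    for i, (g, s) in enumerate(zip(guess, solution)):
        if g == s:
            result[i] = "g"
        else:
            leftover[s] += 1
    for i, g in enumerate(guess):
        if result[i] == "b" and leftover[g]:
            result[i] = "y"
            leftover[g] -= 1
    return "".join(result)

# The repeated-letter case from above: "aroma" has two a's, but "abide"
# has only one, so the first 'a' scores green and the second black --
# telling you the solution contains exactly one 'a'.
print(feedback("aroma", "abide"))  # gbbbb
```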
FWIW, your O(N^3) algorithm can be equivalently implemented in O(N^2) by using the following observation: a word is not eliminated by a guess if and only if the result of testing the word against the solution is the same as the result of testing the guess against the solution. Using this, you can implement the algorithm like so: for a given solution, partition all of the guesses based on the result of testing them against that solution. This partitioning takes O(N). Then, the set of words that will be left over from a guess is equal to all the words in the same partition as the guess. This makes it easy to calculate "how many words can be eliminated from a particular guessed word".
I'm making this comment because I started with the same O(N^3) approach and then realized this. :)
> a word is not eliminated by a guess if and only if the result of testing the word against the solution is the same as the result of testing the guess against the solution
That's not true. Suppose the actual word is "abcdx" but the guessed word is "xefgh". The result of testing is Yellow-Black-Black-Black-Black.
Now suppose you have other words like "xijkl" that will test against the solution in exactly the same way (Yellow-Black-Black-Black-Black), but you know won't be the correct solution because in the first round the misplaced "x" already tells you "x" should be present at a different position.
So now we have a word that can be eliminated even if it tests against the solution in the way the guessed word does.
Yeah, I think I messed up which things you need to partition. Given a set of possible solutions and a guess, you partition the solutions based on which result Wordle gives back. Then you can calculate the expected number of solutions that will be left after you make that guess. This takes O(N). Doing this for all guesses is then O(N^2).
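Something like this sketch (the word list is the BILLY/DILLY toy example from elsewhere in the thread, and the feedback scorer is my own approximation of Wordle's rules):

```python
from collections import Counter

def feedback(guess, solution):
    # Compact two-pass Wordle-style scoring: greens first, then yellows
    # capped by the count of each letter left unmatched in the solution.
    result = ["b"] * len(guess)
    left = Counter(s for g, s in zip(guess, solution) if g != s)
    for i, (g, s) in enumerate(zip(guess, solution)):
        if g == s:
            result[i] = "g"
    for i, g in enumerate(guess):
        if result[i] == "b" and left[g]:
            result[i] = "y"
            left[g] -= 1
    return "".join(result)

def expected_remaining(guess, candidates):
    """O(N) per guess: bucket the candidates by the feedback pattern this
    guess would produce; E[remaining] = sum over buckets of (size/N) * size."""
    buckets = Counter(feedback(guess, s) for s in candidates)
    n = len(candidates)
    return sum(b * b for b in buckets.values()) / n

def best_guess(candidates, allowed):
    # O(N^2) total instead of O(N^3): one partition pass per guess.
    return min(allowed, key=lambda g: expected_remaining(g, candidates))

words = ["billy", "dilly", "filly", "hilly", "silly", "willy"]
```

With this set, any in-set guess leaves about 4.3 words on average, while an off-set probe like "windy" leaves 3, so `best_guess` picks the probe.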
I also implemented essentially this algorithm. For guesses I always consider the full list of possible words, not just the list of possible answers. Doing so requires a check so words that have been excluded are only used if they're better than words that haven't been.
Using essentially this method I get that the best initial guess is RAISE (there are some others that are very close). It seems the author of this post didn’t use the actual word list from the game so got a different word.
I have also hacked around on solving this problem. However, I decided that using the shorter list of answers was cheating, since end users wouldn't have access to this list. From a user's perspective, any word that Wordle lets you guess could be a valid solution.
It's still a work in progress, but my code died when I tried to generate a complete game tree for the full list.
However, in theory it could be done once, to determine the best starting word, and then hardcoded to always use the same start word. So my plan is to explore a game-tree approach from guess 2 onward.
No slight to anyone pursuing this strategy, but I also felt like using the shorter list for ranking candidate words was cheating. On the other hand, my solver ends up making silly suggestions that might be appropriate for early on in the game, but not for late guesses.
I've tried to think of what might be a good way of approximating a human's own sense of which words are viable Wordle answers. One thing I attempted was using a word frequency table to help bias the algorithm away from less common words. The funny thing is that some words rank low in word frequency but otherwise count as "common knowledge". For example, "tapir" was a Wordle solution a few weeks ago, but it ranked lower than some obvious non-answers like "grovy". It could be that the frequency table I was using was weird, but I can believe that there are well-known words that aren't used that often. Maybe I could come up with a better corpus (e.g. only use the NYT or some other newspaper) to use as the basis for a custom frequency table. Seems like a lot of work!
If you guess "bound" right off the bat and positions 2, 3, 4, and 5 all come up green, then your goal needs to be to use words like 'morse' to rule out (in one fell swoop) words like 'sound', 'mound' and 'round'. Then, if it's not ruled out, you can choose words that just have m and s, or s and r.
Why not simply choose a guess which narrows the set of possible solutions the most?
This could be improved by looking one step further, but this would probably require some kind of optimization / heuristic / approximation to make it computationally feasible.
It really depends on what you are trying to optimize. Least guesses on average may not want a 50/50 split; it might want, say, a 60/40 split where the 60% is 2 guesses and the 40% is 3. But that loses to a 48:52 split which gives 2 guesses 80% of the time, etc.
Narrowing the search space greedily doesn't solve the problem optimally, because the size of the search space is just a heuristic. Some search spaces of size X might be more difficult to reduce further than another search space of size Y (for Y < X). This is more apparent in hard mode where you have a very limited guess-pool. The greedy heuristic approach would work fine if all possible 5-letter combinations were valid answers.
If you want to get really fancy, a full minimax solution is probably feasible without massive resources. Pretend it’s a two player game with one player guessing and the other player thinking of a potentially different word for each guess subject to the constraint of being consistent with all previous answers.
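A toy sketch of that adversarial search (using a simplified greens-only feedback and two-letter "words" to keep it tiny, so this illustrates the recursion, not real Wordle scoring):

```python
from collections import defaultdict

def greens_only(guess, solution):
    # Toy feedback for the sketch: which positions match exactly.
    # (Real Wordle also reports yellows with repeated-letter rules.)
    return tuple(g == s for g, s in zip(guess, solution))

def minimax_guesses(candidates, allowed, feedback=greens_only):
    """Worst-case guesses needed against an adversary who, after each
    guess, keeps whichever feedback bucket is worst for us (cf. Absurdle):
    the guesser picks the guess minimizing the adversary's best bucket."""
    if len(candidates) == 1:
        return 1
    best = float("inf")
    for g in allowed:
        buckets = defaultdict(list)
        for s in candidates:
            buckets[feedback(g, s)].append(s)
        if len(buckets) == 1:  # this guess yields no information here
            continue
        worst = 1 + max(
            0 if bucket == [g] else minimax_guesses(bucket, allowed, feedback)
            for bucket in buckets.values()
        )
        best = min(best, worst)
    return best

words = ["aa", "ab", "ba", "bb"]
```

On this toy alphabet every opening guess splits the four candidates into singletons, so the adversary can always force exactly two guesses.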
Somebody had already tweeted about solving it and finding that 3 guesses suffice, using alpha beta search (though he didn’t use the term “alpha beta” because he might have reinvented it).
No it does not. It uses a heuristic at every layer of the tree. Specifically, instead of recursing down each subtree, it chooses the set with the maximum number of remaining elements, which isn't necessarily the set that is most difficult to split.
I believe you are saying that the best adversary would attempt to maximize the depth of the tree. For example a branch with "{L, B, C, R, ..}ake" requires more guesses than a balanced tree despite the fact that the two trees may have the same number of nodes.
> Why optimize for greens? Better to rule out more first no matter the position.
I haven't yet experimentally tested this, but I hope to soon! It may be that a correctly placed green letter rules out more possible targets than a grey letter.
Yeah, I think my average is closer to 4, ugh! But the program has shown me some strategies that seem interesting. It always guesses "slate" first, and if that has no greys it guesses "crony", and that one-two punch seems to do me well fairly often.
The generally optimal thing is to choose guesses to reduce the set of possible targets as much as possible. This is the same as choosing guesses with maximal entropy over the resulting patterns (that is, the result you get back from the game should be maximally informative). Because greens are rare, by maximizing the probability of greens, you are approximately maximizing the entropy of patterns, by distributing probability mass among rare patterns. This is why going for greens is a good heuristic.
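As a concrete illustration of "maximally informative": once you've bucketed the candidate solutions by the feedback pattern a guess would produce, the guess's information content is just the entropy of the bucket-size distribution. A small sketch (bucket sizes assumed precomputed by that grouping):

```python
import math

def pattern_entropy(bucket_sizes):
    """Entropy (bits) of the feedback-pattern distribution for one guess,
    given how many candidate solutions fall into each feedback bucket."""
    n = sum(bucket_sizes)
    return -sum((b / n) * math.log2(b / n) for b in bucket_sizes)

# A guess splitting 6 candidates into 6 singleton buckets is maximally
# informative; one that leaves them all in a single bucket tells you nothing.
print(pattern_entropy([1] * 6))  # log2(6) ≈ 2.585 bits
print(pattern_entropy([6]))      # 0.0 bits
```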
This sounds like the optimal strategy for a single guess. But shouldn't the optimal strategy for solving the game include that there are multiple guesses? While some specific word might provide the highest reduce in entropy for the first guess, it might actually negatively impact the entropy reduction of remaining possible words compared to another first guess.
E.g. if RAISE is the optimal reduction in search space for the given word list, what's the best 2nd guess for every possible word? Now taking the average 2nd guesses into account, is there a better first guess?
Good point. Because the set of possible guesses is constrained, the totally general solution needs to take future guesses into account. There doesn't seem to be a way to find the optimal guess other than searching the whole game tree (which it looks like someone in another comment did).
I guess it needs to find the right choice of [r, m, p, k, d, v] to get "shave" (since 'v' is so rare it takes 6 guesses!). Trying "marks" and "paved" would narrow that list down in 2 guesses (and a third to actually try "shave")
Anyone code this up yet? (I haven't, I took a greedy approach too [0]). Wonder how to generalize such a plan.
Yeah, I don't want to give the game away on my next program, but it's along those lines, and there are some challenges to making it work that I'm still figuring out haha.
> QNTM: So with Wordle, in theory, you can win in one guess. I've won in two guesses a couple of times. Absurdle - you cannot beat it in less than four guesses. Four is the limit. But generally just good luck - you're going to need it.
The worst case for Wordle and an evil Wordle like Absurdle should be the same... it's just that the evil/absurd versions always give you the worst case.
You guys here are all real programmers, but I am just good with Excel & a bit of JavaScript. I made this sheet to get the best possible words, & I am trying to get it converted into a bookmarklet.
The whole word list is available in the JavaScript code of the game page. I know all of it can be accessed by the JavaScript runtime, but the JavaScript Wordle uses is not something I can understand. Could somebody please have a look at the Wordle page source & let me know how I can access the variables defined in the JS file https://www.powerlanguage.co.uk/wordle/main.e65ce0a5.js, in the variable named La?
Some of the basic JS apps I have made for myself are hosted on GitLab under the username davchana.
In manual mode I try to use a first word with 3 vowels & no repeated letters. Hard mode is good for not letting the tries go to waste.
Hey! Don't say you aren't a real programmer! I learnt how to program in my spare time at an industrial site, using Excel! Excel runs the world, and a few years ago I got a programming contract because I had that VBA knowledge! That same company was so impressed that I eventually got a full time job with them. I am still a full time programmer to this day. So, keep working at it! Excel is real programming and javascript definitely is.
For your problem though, I just copy-pasted the entire list into a new file to use. If you want to use my file, it's in my repo in the article under the path ./lib/words.py!
Hey thanks for the success story & words of encouragement.. you are the hero!!
About the file: I was wondering how I can access the list in my JavaScript bookmarklet, which I can run on his game page. I mean, I could define the whole list in my own variable in my bookmarklet, but it would surely go over the character limit of a bookmarklet.
When the game page is loaded, how can one access that variable in the console? I tried simply typing La, but it looks like it was not declared in global scope; it is part of some object.
I am thinking of using a .js file in my bookmarklet to bypass the character limit, or adding a script tag to the DOM, but those would just duplicate data; the original variable is already there at runtime.
A very naive solver is giving me 3.289 guesses on average.
What I did was simply sort the word_list based on frequency analysis.
The guess is always the first word in the list (with the highest score)
After each guess, I update the list based on the results (gray, yellow, green)
For the final test, I just looped through each word in the list and played the game for that word. My algorithm got every single word (2315 words) in 7614 total guesses, giving an average of 3.289 guesses.
Right now, I am only using the list of final_words. I believe that I can get better results by using the list of attempt_words to reduce the number of possibilities, and then using final_words to give final guesses. So maybe the first 2 guesses will be using the attempt_words and then the guesses after that will be from final_words. Or maybe keep guessing from attempt_words until the list length is below a certain number and then start guessing from final_words. Something along these lines will give better results.
In my frequency analysis, I also accounted for the frequency of each letter at each index.
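Something like the following sketch, I'd guess (hypothetical names; the gray/yellow/green filtering is simplified and doesn't handle repeated-letter corner cases):

```python
from collections import Counter

def frequency_score(word, letter_counts):
    # Score a word by summing corpus letter frequencies over its unique
    # letters, so repeated letters aren't double-counted.
    return sum(letter_counts[c] for c in set(word))

def rank_words(words):
    # Sort the word list by frequency score, highest first; the solver
    # always guesses the first word remaining in this list.
    letter_counts = Counter(c for w in words for c in w)
    return sorted(words, key=lambda w: frequency_score(w, letter_counts),
                  reverse=True)

def filter_candidates(words, guess, pattern):
    # pattern: one of g/y/b per position, as returned by the game.
    # Keep only words consistent with the feedback (simplified rules).
    def consistent(word):
        for i, (ch, fb) in enumerate(zip(guess, pattern)):
            if fb == "g" and word[i] != ch:
                return False
            if fb == "y" and (word[i] == ch or ch not in word):
                return False
            if fb == "b" and ch in word:
                return False
        return True
    return [w for w in words if consistent(w)]
```

A round of play is then: guess `rank_words(candidates)[0]`, read the pattern off the game, and replace `candidates` with `filter_candidates(candidates, guess, pattern)`.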
As many have mentioned, it seems much more efficient to focus on eliminating letters, which means that the second word should not be very green at all.
Also, since there's only one word a day, I think we can assume that it's chosen manually, and it won't be some archaic anglo-saxon tool for shoeing horses or something, it will be a relatively common word.
I think it's chosen based on a random number generated from the date - but one of the most recent games had the answer "REBUS", which I've never heard or read before. But yeah, the target list has many more common words in it than the list of all possible guesses.
I am pretty sure the list is sequential. If you look at the list in the Javascript code of the game and Ctrl-F today's word, you'll see the previous dates' words are right before it. Here's tomorrow's word (spoiler alert, obv): https://pastebin.com/sfQ7x5ya
It was just a random google result, but we definitely did rebuses in pre-school at least.
The rebus concept is actually a precursor to actual writing, historically speaking. Both in Mesopotamia/Egypt and in the Chinese civilisation, writing developed through a rebus phase. So it's a very natural thing, that's why it's so child friendly.
Why? I mean most pre-schoolers probably play with rebuses, simple crosswords etc. It's just a common concept in the world that most people have come across.
>so far what was taking 1GB of RAM in Python is taking, literally 1MB in Rust
Is anybody hiring people to fix situations like this at work? i.e. this script that was written in a hurry needs to be revised to be 10^3 times more efficient, in space or time.
As a data engineer, I sometimes run into problems like this, but I generally find that 99.9% of them can be solved "well enough" in python, or in a database. Moving code from Python to Rust or similar has been required only a handful of times.
This specific task, the next program I'm working on, seems to be a very "algo" based, comp sci type problem. I think most companies want to say that they have these kind of problems, but the reality often is, they need a batch script on a laptop.
So like, yeah, I have done things like this before in my work, but often, solutions in Python are "good enough".
A company I know hired a computer science PhD after putting them through the full programmer interview, including whiteboarding algos, etc. Literally the hardest crap that anyone working there could think up, and then hired the programmer who got the best score on the test, and the process took multiple months to find this magic individual who could score high enough to satisfy everyone.
Once hired (at a fantastic salary), she was then put to work copying data from a pile of CDs into a cloud storage account! Not only that, but she was given a laptop with only 1 CD drive to work with.
I have trouble believing that story. An adult, who is also a super genius just says "I was given..." and quits? That's so passive! You have one CD drive and assume that's it, nothing can be done, that's all the resources in the whole company? Don't talk to anyone?
As I've seen on HN, a lot of people hate SQL and don't think in it, and avoid it by writing something imperative. Replacing a loop with one query is an example of the sort of thing that I've seen make 3 orders of magnitude of difference.
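A toy illustration of that pattern using sqlite3 (made-up table; the point is pushing the aggregation into one GROUP BY instead of looping over rows in the application):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 10.0), ("bob", 5.0), ("alice", 7.5)],
)

# Imperative version: pull every row into Python and loop.
totals = {}
for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
    totals[customer] = totals.get(customer, 0.0) + amount

# Set-based version: one aggregate query; the database does the work
# without materializing the whole table in application memory.
totals_sql = dict(
    conn.execute("SELECT customer, SUM(amount) FROM orders GROUP BY customer")
)

assert totals == totals_sql
```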
Another basic pattern is converting a program or script that loads a whole job into memory into something with fixed size buffers, because specific data made the original explode.
I did this at an internship of mine. You aren't really going to be hired for "please fix our script's performance", but you might end up finding a place where you have the opportunity to do this kind of thing.
>You aren't really going to be hired for "please fix our script's performance"
Not for one script - but imagine everything like that is funneled to a group that supports operations. Such a group exists, as a matter of fact; the question I have is how to find another one.
It looks like the losing cases are those where in hard mode you can easily get 4 of the letters correct but you have more potential words than guesses left.
Yeah, it requires more analysis but I think counterintuitively many of the games that go up to 5-6 are ones where the initial guesses get a lot of greens, effectively making the remaining guesses a set of 50/50s.
Wordle is a perfect example of nerd sniping, I couldn't stop playing once I started. But I wanted to challenge my friends to some more difficult puzzles and wrote a tiny tool this weekend to do just that!
My partner noticed this too, but after a couple of tries the same button will work again. Really hard to debug unfortunately. It usually worked after a couple of tries.
I just pushed a fix that greys out the link if there's no URL, hopefully that helps?
Super embarrassing. I've fixed this now. I used event.target, which was just the SVG of the icon if you clicked that instead of the text. (I also split the share button in two to make it easier to share.)
This solution feels a little like cheating because it uses the "target" list to train. The target list contains only the solutions, while the larger list contains all valid 5-letter words. A human wouldn't know the target list. A fairer solver would train on the two lists combined, and then test on the target list.
Tom's writeup inspired me to dig a little more. I unfortunately (for me) discovered a very simple way to solve the game on the first guess every time. My fault for reading the game's code, but I just can't look at the puzzle the same way anymore.
I came across a post which said to use the words dying, clots, brake, and whump (statistically, the expected worst case is 2 more guesses, since 20 of the 26 letters are covered in the first 4 guesses). It has worked every time for me, although with this strategy you end up needing 5 guesses almost every time and, on rare occasions, the 6th.
Since it is adversarial, it is also deterministic, i.e., for a given play its response will always be the same. So the challenge is no longer "can you guess THE word?" but rather "can you find the shortest play?". Finding a 5-step solution is reasonably challenging, but try for a 4-step solution: it will drive you nuts :) Some have been posted on Twitter, but that's no fun.
It's designed to be, but it isn't necessarily, as it uses the size of the remaining solution set as a heuristic for how many guesses are needed to split it. There are better, more exhaustive searches of the solution space now.
Shameless plug: I've also made a solver, written in Rust. The main goal was to learn the language; the second was the trie structure behind the search: https://github.com/jutaz/wordle-search
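For anyone unfamiliar with the structure, here is a minimal trie sketch (in Python rather than Rust, purely to illustrate the idea: shared prefixes, cheap pruning during search):

```python
def make_trie(words):
    # Nested dicts; "$" marks end-of-word.
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root

def contains(trie, word):
    node = trie
    for ch in word:
        if ch not in node:
            return False  # whole subtree pruned in one step
        node = node[ch]
    return "$" in node

t = make_trie(["crane", "crate", "slate"])
assert contains(t, "crate") and not contains(t, "crank")
```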
The objective function is the weighted average number of guesses, with some penalty factor for >6 (a.k.a. losing), and optionally different weights for getting it in 1, 2, 3, 4, 5, or 6.
And you can solve it with a tree search, at least for the lower levels, once the search space has been reduced.
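A small sketch of such an objective, with hypothetical weights (the distribution is one quoted further down the thread):

```python
def strategy_cost(dist, weights):
    """dist[k] = number of solution words solved in k+1 guesses (dist[6] = losses).
    weights[k] = cost per word in that bucket; weights[6] penalises losing."""
    return sum(w * n for w, n in zip(weights, dist))

# Minimising expected guesses, with losing effectively forbidden:
w_expected = [1, 2, 3, 4, 5, 6, 10**9]

# A guess distribution quoted elsewhere in the thread (no losses):
dist = [1, 103, 861, 1114, 212, 24, 0]
avg = strategy_cost(dist, w_expected) / sum(dist)
assert round(avg, 2) == 3.65
```

Swapping in different weights (e.g. only penalising losses, or penalising the worst case) encodes the different notions of "optimal" discussed above.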
Neat! I came up with a similar statistic (3.65 average moves, hard mode, no failures) using a very simple heuristic.
1) pick a first guess, and partition the remaining words by the hints that they give
2) given any part of a partition, pick a word that minimizes the maximum-size part of the resulting sub-partition
3) repeat (2) until the word is solved.
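The steps above can be sketched in Python; this is a toy illustration with a hand-picked word list, not the actual solver (which would run over the full Wordle lists):

```python
from collections import Counter, defaultdict

def feedback(guess, answer):
    """Standard Wordle scoring: 'g' green, 'y' yellow, '.' grey,
    with yellows limited by the remaining letter counts."""
    marks = ["."] * 5
    remaining = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            marks[i] = "g"
        else:
            remaining[a] += 1
    for i, g in enumerate(guess):
        if marks[i] == "." and remaining[g]:
            marks[i] = "y"
            remaining[g] -= 1
    return "".join(marks)

def partition(guess, candidates):
    # Steps (1)/(2): group remaining candidates by the hint they would give.
    parts = defaultdict(list)
    for answer in candidates:
        parts[feedback(guess, answer)].append(answer)
    return parts

def best_guess(candidates, allowed):
    # Step (2): pick the guess minimising the largest resulting part.
    return min(allowed,
               key=lambda g: max(len(p) for p in partition(g, candidates).values()))

# The '_ound' trap mentioned below: guessing inside the family eliminates one
# word per turn (largest part of size 3), while a probe testing h/m/p/s at once
# splits it completely. Easy mode only, since the probe ignores known greens.
trap = ["hound", "mound", "pound", "sound"]
assert best_guess(trap, trap + ["humps"]) == "humps"
```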
I brute-forced the first guess (that is, generated a tree rooted at each word), and the best one was 'bland'. That has one failure, getting stuck on the chain (hound, mound, pound, sound), which ended up being easy to fix manually.
My tree is quite different from yours, with guess distribution 1, 103, 861, 1114, 212, 24.
Yeah, I'm on easy mode. With hard mode on, I had some failures with words that differ by only one letter. I was happy with the result, though, and didn't care to try to optimize hard mode.
I think your heuristic is pretty similar to mine, just accomplished in a different way.
As it stands, it seems like your algorithm will always use the same first guess. Have you thought about making that configurable and seeing what happens with, e.g., the number of words where it gets stuck and runs out of guesses?
At least in my household (and some group chats) it's become a fun daily ritual: it takes just 5-10 minutes to play and post your result, and you can only play once a day.
Wordle is a language puzzle game which has gained a lot of popularity quite quickly. The rules are very simple, yet allow for quite a lot of strategy. There are many articles about its rise to fame.
The official website [1] has a single puzzle per day which everyone shares, but there are many clones which have variations, or unlimited games.
If you're interested, you can play with it here: http://www.npinsker.me/puzzles/wordle/ and the source (written in neophyte's Rust) is here: https://gist.github.com/npinsker/a495784b9c6eacfe481d8e38963...
(edit: It's posted elsewhere in the thread, but this is actually not optimal and someone else achieved better results! http://sonorouschocolate.com/notes/index.php?title=The_best_...)