Hacker News new | past | comments | ask | show | jobs | submit login
Solving Wordle in 3.64 guesses on average, 99.4% of the time (lockwood.dev)
199 points by tomlockwood on Jan 23, 2022 | hide | past | favorite | 178 comments



I wrote a solver a few weeks ago that solves easy mode in 3.45 guesses and hard mode in 3.55 guesses. Rather than using heuristics, it explored a large portion of the game tree (with a little pruning) and I'm hopeful that it's optimal. Interestingly, the EV-optimal hard mode strategy takes 3.52 guesses on average, but requires 7 guesses a tiny fraction of the time, so I instead exposed the best strategy that always requires 6 or fewer guesses.

If you're interested, you can play with it here: http://www.npinsker.me/puzzles/wordle/ and the source (written in neophyte's Rust) is here: https://gist.github.com/npinsker/a495784b9c6eacfe481d8e38963...

(edit: It's posted elsewhere in the thread, but this is actually not optimal and someone else achieved better results! http://sonorouschocolate.com/notes/index.php?title=The_best_...)


As a point of curiosity, what do you mean by optimal?

To me, I can see at least two (potentially different) cost functions that a Wordle strategy can aim to minimise. Firstly, you can try to minimise the expected number of guesses, and secondly you can try to minimise the expected number of guesses conditional on never losing the game. In principle you could optimise for the first but not the second by allowing a small list of hard words to take > 6 guesses on average.

Part of my point is the lack of an absolute definition of optimal: strategies are only optimal up to the specified coat function.


IMO, never losing the game is the bare minimum for any reasonable cost function: never take > 6 guesses. Beyond that, you can have different optimization functions based on whether you're trying to minimize the worst-case depth of the decision tree (number of guesses) for each position, or the average depth, or some other reasonable function of the distribution over number of guesses.

Although you could have arbitrarily many cost functions, a fairly general class is the following: pick some nonnegative constants (w1, w2, w3, w4, w5, w6, w7), and for a certain strategy, define the cost as:

    w1*n1 + w2*n2 + w3*n3 + w4*n4 + w5*n5 + w6*n6 + w7*n7
where nk (for 1≤k≤6) is the number of hidden (solution) words for which the strategy takes n guesses, and n7 is the number of words for which the strategy loses the game. Then,

• Setting w7 infinite / very high is a way of encoding the condition that you not lose the game. (You can set it finite/small if you don't mind occasionally losing the game for some reason!)

• Setting w1 = 1, w2 = 2, …, w6 = 6 will simply minimize the average number of guesses,

• Having them grow exponentially, e.g. setting wk = c^(k-1), where c is some constant larger than 12972 (the number of acceptable guess words) will make sure that the minimal-cost strategy first minimizes the maximum number of guesses, then break ties by having the fewest words take that many guesses, and so on.


> IMO, never losing the game is the bare minimum for any reasonable cost function: never take > 6 guesses.

I wouldn’t think so. If I get 99.9% of rounds correct with 3 guesses (and fail to find a solution to 0.1% of the rounds), and you get 100% of rounds with 6 guesses, I’d say I’ve soundly defeated you 99.9% of the time.


Right, that point of view makes sense if you imagine two players separately playing Wordle and comparing their results, and indeed that's what the OP may have been thinking. But as far as I know, Wordle is primarily a game for one player to play against the computer, where eventually getting to the word (or not) is the most prominent outcome (the only notion of "win" or "defeat" in the game itself), and solving it in fewer guesses is just a bonus. But sure, if you don't care about occasionally losing, you can just set an appropriate weight for w7 within the same family of cost functions.


Indeed, it’s essentially a single player game, but we’re talking about which bot a player would want to use which is inherently a notion of bots competing against each other.


If you look at the failed attempts in the article, you'll see it's not so easy to get 100% correct with 6 guesses. For example:

SLATE, CRONY, HOUND, BOUND, POUND, MOUND (correct: WOUND)

SLATE, SHARE, SHAME, SHAPE, SHAKE, SHADE (correct: SHAVE)

...so sometimes even if it gets 4 of 5 letters right on the second try, it still has too many options left. Of course in cases like this it could go "one step back" and try a word with (most of) the letters that are different between these words (HBPMW or MPKDV). So try different words until the number of options is narrowed down sufficiently.


> I’d say I’ve soundly defeated you 99.9% of the time.

Except it wouldn't be 99.9% of the time. Just the games were you guessed fewer than GP.

GP's solver would expect to guess the correct word 100% of the time in fewer than 6 guess, but not that it would always take 6 guesses.


I was saying if you hypothetically got every correct guess in exactly 6 tries. You would pass the suggested “bare minimum” while my 99.9% strategy doesn’t, yet I claim my strategy is better.


For some value of better, sure.


Sure, where "some value of better" is "wins in the overwhelming majority of rounds."


> never take > 6 guesses

I suspect that with most languages it’s impossible to achieve 100% win rate for five or more letters. So w7 should be by far the largest weight but I don’t think it can be infinite.


I suspect the opposite. While the theoretical number of combinations is huge, the amount of information given by each guess grows faster than the number of real words of longer lengths.


They specified that in the post, they were able to constrain to a strategy that guarantees 6 or fewer and then optimize for EV to get 3.55, but could get down to 3.52 by allowing some small number of words to take more than 6.

I would be interested to know if 6 is the lowest you can constrain it. Can you guarantee a solution in 5, and if so what is the best EV with that constraint?


Yes, you can -- I believe the EV was around 3.47 for that. The "greedy" strategy (minimizing the worst-case size of the largest set) also ends up using at most 5 words.

Perhaps unsurprisingly, you can't guarantee at most 4 words.


You can get EV of 3.46393 with a guaranteed maximum of five tries. (I don't know whether this bound is tight.)



Good to know.

Another possible question is: What is the best strategy minimizing (at any point) _first_ the maximum number of tries, and second the EV (average number of tries over all possible words)? That's subtly different from either answer.


Funnily enough, I think neither of those notions are the correct thing to optimize. For me, you want to minimize the maximum number of guesses. This leads to an equilibrium. Otherwise, your opponent can just abuse your strategy.


My point is exactly that though - the choice of cost function is subjective. There is no best cost function, just a mapping from costs functions to optimal strategies.


The (exact) optimal EV-minimizing strategies (assuming each target word is equally likely) always solve in 5 of fewer guesses in normal mode, and always solve in 6 guesses in hard mode. If you wish to guarantee 5 guesses in hard mode, there's a different (optimal) strategy with slightly higher EV. See: http://sonorouschocolate.com/notes/index.php?title=The_best_...

I don't think a maximum-number-of-guesses optimizer leads to an equilibrium (if you mean Nash equilibrium), assuming you're playing the game where the score of the game for the setter is the number of guesses. In particular, if the solver's strategy is deterministic, the setter will always pick one of the words that needs 5 guesses. There's no reason to think the nash equilibrium can be easily found.


Adversarial strategies are another type of problem entirely (as in, if you're playing Word Mastermind). Wordle is a 1-player game, so I find hgibbs's comment totally reasonable. I would think that 'optimal' is more accurately his second definition - minimizing avg guesses but always winning (as in, losing is +inf guesses).


Isn't that just #2, but changing the definition of losing to be the least number of rounds where you can always win (which for Wordle is 5)? These small changes to the definition don't seem to change ev (or the code needed to generate it) that much.

If you don't care about optimizing the EV at all, the greedy tree already has the property you're interested in.


What do you mean by opponent?

The entity choosing the word in a game like Wordle or Hang Man is more like the dungeon master in a game of D&D than an opponent in the traditional sense? They are there to help you have fun.


If two players are competing in a single round of wordle, then surely the player who guesses the correct word in the fewer number of guesses wins, right? Then it seems natural to say that the better of two players is the one with the most wins after playing all possible rounds.

The only problem is that I think this definition leads to the possibility of non-transitivity, where A beats B, B beats C, and C beats A.


I've seen a number of solvers; but my partner has forbidden me from writing one, forcing me to do everything in wetware. 3-4 guesses seems right intuitively. I saw some speculation that the dictionary was well-known, is that true?

Seems to me that the problem as-stated allows the guesser to discover a nonword "for free", but otherwise expects the dictionary to be "all five letter words" which is much larger than the real solution space. Though, maybe in practice that doesn't change the solvability by much.


The wordlist is more than well-known, the word for each day is predetermined. The entire game runs in the browser and the selected word is based on the date. Here’s a quick script based on any date you select: https://codepen.io/kevlened/full/NWaOmPw


This program actually inadvertently created a quite philosophical discussion amongst my friends! Some of them contended that writing a program to solve wordle sucked the joy out of the process of solving it, even if I solved the puzzle myself, first.


I once caused a vague division in my secondary school by writing a (looking back now) horrendously hacky OCR solver for our "world maths day" challenge.

Literally half the teachers thought it was cheating and I should be punished, whilst the other half thought (because I hadn't abused it to win prizes) that it showed a creative use of more modern skills to make up for my renowned weak mental-math.

I saw the joy in writing the solver as much more interesting than the math. Same way I can't focus on sudoku given I always try to "solve" solving it.


Yes it's interesting, precisely because it's intuitively "solvable" by machines. There's joy in solving by hand, and joy in creating a solver, but probably no joy in running your solver every day and sharing with your friends. Because every decision is some maximum-likelihood estimate, there's no tension or drama in the sequence of guesses.

In fact, simply the act of creating an infallible solver (100% likelihood of 6 guesses or less) demonstrates the futility of competing with their soft, fleshy skills.

I do think any solver that uses the solution-set is "cheating", but I suspect it makes little difference in the total solution (like... 2-3 bits of additional information needed, just a fraction of a round).


You might find some enjoyment in writing solvers that compete with your friends' solvers?

You could either compete on number of guesses. Or you could fix the hardware (eg a Famicom), and compete on runtime?


Oh dang! I was guessing that the optimal approach would be under 3.5 for easy mode, but was expecting it to be closer to 3! Amazing work. My next go at the problem is in Rust, which I am a total newcomer to. Thanks for sharing!


Nice! I was wondering almost the same thing this morning while my wife was playing Wordle - ended up coding an assistant to help you choose the optimum words.

If anyone wants to check it out, I published it here https://curiosity.ai/wordle/ - source code is also available here https://github.com/theolivenbaum/wordle-assist.


If you just brute force solve with greedy search (always pick the word that results in the highest average reduction in remaining possible words), you get a solution with an average of 3.47. Even only guessing words on the primary word list (which Wordle uses to pick the word of the day) you can get 3.51. That's as far as I've chased the problem.

I'm surprised to hear that optimal play only gets you 3.4212 (according to the link you posted).


Neat!

The graph of every possible game for Hard mode should be pretty tractable. Sounds like you wrote almost exactly what I started but have been too lazy to finish.

> but requires 7 guesses a tiny fraction of the time, so I instead exposed the best strategy that always requires 6 or fewer guesses.

That's awesome. My question is "what is the minimum number of guesses to solve every word while guaranteeing it never loses". For hard mode I'm pretty sure this can be provably brute forced. Easy mode requires enough pruning I'm not sure it can be proved.


That looks great! It would be very interesting to see the set of words "left" after a guess, too.


This is great. But it seems to fail with words containing repeat letters, for example, with my word as BOOKS, not on hard mode:

SOARE JUMBY BOOST (at this point, it claimed to have won!)


books isn't a valid solution. it's on the guesslist but not the wordlist.


Correct. In fact, thousands of plural words are allowed as guesses, but the solution will never be plural. Source: I grep'd the lists of solutions and authorized guesses.


I used the combined solution and valid word list when I was playing around with a program to find good starting words. Feels too much like cheating otherwise.


It fails with the word "frogs", as it tells me it is "gross"



This [1] solves it in 3.4212 average guesses 100% of the time and claims optimality, which is 99% of the difficulty it seems. For the limited 2671-word set of possible words, I think the question is now settled (for the whole 12k-word set, I don't think anyone tried anything).

It was posted here four days ago [2].

[1] http://sonorouschocolate.com/notes/index.php?title=The_best_...

[2] https://news.ycombinator.com/item?id=30006724


Neat, didn't know that existed! Definitely would have saved me some time :)

FWIW, I wrote a similar solution a couple of days ago as well, and come up with the same result (3.421166 guesses on average, with starting word SALET). So that would support that result.

With the following strategy: https://drive.google.com/file/d/1WvxRRzbvDVnHZUczHBZAko3hKft..., code: https://drive.google.com/drive/folders/1Y8k685PS0wxvYulIxsPg...


There is a leaderboard at https://botfights.io/game/wordle for solving with all 12972 words as possible solutions, it might be of interest.


It's a bit disappointing that there's been many writeups of heuristic-based solvers make the front page, but the writeup of an exact solver doesn't make it. [Note: I was the person who submitted the link to the exact solver writeup to HN].

https://news.ycombinator.com/item?id=30006724


Just to make things extra clear, the post we are commenting on has

- 99.4% in 6 guesses or less

- 3.64 guesses average

- no attempt of any formal optimality result,

while the one you posted has:

- not just 100% in 6 guesses or less (so 100% wins not 99.4%), but actually 100% in 5 guesses or less!!!

- 3.42 guesses average

- a fairly convincing optimality proof (as far as average is concerned) which was apparently the only hard part computationally.

- it also had 3 upvotes until i cross-references it here, and barely more now.

- the author seems to have plenty of twitter followers, none of which seemed to care much.

And yet if you look at the "new" page, the flow of heuristics keeps coming. Conclude what you will :-).


Ok, I found this one [1] with 100% wins on the whole 12k-word set. But it optimizes for worst-case, not average.

[1] https://www.poirrier.ca/notes/wordle/


Going for greens is a good heuristic, but the general solution is to seek guesses that shrink the size of the set of compatible words the most. https://langproc.substack.com/p/information-theoretic-analys...


> From an information-theoretic perspective, the optimal initial guess is the one which is maximally informative about the target, given the resulting pattern.

This statement (which motivates the rest of the analysis) isn't correct, because it doesn't take into account the highly irregular search space of what you're actually allowed to guess. Sets of possible words can vary a lot in terms of how easily they can be further cut down: for example, if you're on hard mode and your set of possible words is [BILLY, DILLY, FILLY, HILLY, SILLY, WILLY], then you might be in trouble.


Yeah, the analysis doesn't really work for hard mode as noted in a footnote.


I don't think it works for easy mode either. The overwhelming majority of 6-word sets can be distinguished in two guesses in easy mode (their EV is either 11/6 or 2), while this set has an EV that's around 3.


I think you're right. Although a first guess might reduce the set size a lot, it might lock you into a situation where all the possible second guesses are useless.


for this specific set in easy mode doesn't (bawds, chefs) result in unique responses for all of them? so that's 2.5 guesses total? Where are you getting the EV for 6-word sets?


You're right: the EV is 2.5 (my guess was pessimistic). For (almost all) 6-word sets, there's a single word you can use to delineate all six words, which gives an EV of 1.83 if that word is in the set and 2 if it isn't.


That's exactly my approach.

I wrote a very simple O(N^3) algorithm to determine that: for each possible actual word and the guessed word, figure out whether or not a word is eliminated. Then calculate a score based on how many words can be eliminated from a particular guessed word, averaged over all possible actual words.

(I originally thought O(N^3) would be way too slow to be useful. But it runs fast enough with a few thousand words.)

Like a sibling comment, I also found that the best initial guess is RAISE, although when I tried a different word list, I got different words like LARES (I don't think it's in the Wordle list).

One thing that surprises me is that by the time you get to the mid-game stage (after 2 guesses or so), occasionally it can be more optimal to guess a word with repeated letters; I did not originally expect that because I thought repeated letters are wasted opportunity to test more letters, but it turns out Wordle's handling of repeated letters can give you helpful information about the number of occurrences of a letter. (For example if you picked a word with 2 a's, the first one can be green the second one can be black, essentially telling you there is exactly one letter "a" in the word.)


FWIW, your O(N^3) algorithm can be equivalently implemented in O(N^2) by using the following observation: a word is not eliminated by a guess if and only if the result of testing the word against the solution is the same as the result of testing the guess against the solution. Using this, you can implement the algorithm like so: for a given solution, partition all of the guesses based on the result of testing them against that solution. This partitioning takes O(N). Then, the set of words that will be left over from a guess is equal to all the words in the same partition as the guess. This makes it easy to calculate "how many words can be eliminated from a particular guessed word".

I'm making this comment because I started with the same O(N^3) approach and then realized this. :)


> a word is not eliminated by a guess if and only if the result of testing the word against the solution is the same as the result of testing the guess against the solution

That's not true. Suppose the actual word is "abcdx" but the guessed word is "xefgh". The result of testing is Yellow-Black-Black-Black-Black.

Now suppose you have other words like "xijkl" that will test against the solution in exactly the same way (Yellow-Black-Black-Black-Black), but you know won't be the correct solution because in the first round the misplaced "x" already tells you "x" should be present at a different position.

So now we have a word that can be eliminated even if it tests against the solution in the way the guessed word does.


Yeah, I think I messed up which things you need to partition. Given a set of possible solutions, and a guess, you partition the solutions based on which result Wordle gives back. Then you can calculate the expected number of solutions that will be left after you make that guess. This takes O(N). Going this for all guesses is then O(N^2).


I also implemented essentially this algorithm. For guesses I always consider the full list of possible words, not just the list of possible answers. Doing so requires a check so words that have been excluded are only used if they're better than words that haven't been.


Using essentially this method I get that the best initial guess is RAISE (there are some others that are very close). It seems the author of this post didn’t use the actual word list from the game so got a different word.


I used the actual wordlist, yoinked out of the javascript with my totally elite dev-tools-in-chrome-hacking!

But yeah, given the frequency of letters, in position, in the target list, the answer is SLATE.


Yeah when I ran my script using the real wordlist I got RAISE too. (Just updated the article.)


For me, ARISE and RAISE were equally good.


Or, especially in first guess or so, shrink the set of letters used by the remaining compatible words, which is subtly different right?


I have also hacked around on solving this problem. However, I decided that using the shorter list of answers was cheating, since end users wouldn't have access to this list. From a user's perspective, any word that Wordle lets you guess could be a valid solution.

It's still a work in progress, but my code died when I tried to generate a complete game tree for the full list.

However, in theory it could be done once, to determine the best starting word, and then hardcoded to always use the same start word. So, my plan to explore a game tree approach from guess 2.


No slight to anyone pursuing this strategy, but I also felt like using the shorter list for ranking candidate words was cheating. On the other hand, my solver ends up making silly suggestions that might be appropriate for early on in the game, but not for late guesses.

I've tried to think of what might be a good way of approximating a human's own sense of which words are viable Wordle answers. One thing I attempted was using a word frequency table to help bias the algorithm away from less common words. The funny thing is that some words rank low in word frequency but are otherwise count as "common knowledge". For example, "tapir" was a Wordle solution a few weeks ago, but it ranked lower than some obvious non-answers like "grovy". It could be that the frequency table I was using was weird, but I can believe that there are well-known words that aren't used that often. Maybe I could come up with a better corpus (e.g. only use the NYT or some other newspaper) to use as the basis for a custom frequency table. Seems like a lot of work!


But there are some obvious off-books rules. Like Wordle allows plural but the answer appears to never ever be plural.


Wordle reminds of a more complex version of Mastermind[0]. Donald Knuth proved you could always solve the game in five guesses or fewer![1].

[0]: https://en.wikipedia.org/wiki/Mastermind_(board_game) [1]: https://en.wikipedia.org/wiki/Mastermind_(board_game)#Best_s...


If you right off the bat guess "bound" and then positions 2, 3, 4, and 5 all come up green, then your goal needs to then be to use words like 'morse' to rule out (in one fell swoop) words like 'sound', 'mound' and 'round'. Then, if it's not ruled out, you can then choose words that just have m and s, or s and r.


This is a valid strategy in easy (default) mode, but in "hard" mode, you must use all yellow/green letters in subsequent guesses


This is why hard mode is a less interesting way to play.


Oh, good point. I didn't know that there was a hard mode - I've only played twice.


Why not simply choose a guess which narrows the set of possible solutions the most?

This could be improved by looking one step further, but this would probably require some kind of optimization / heuristic / approximation to make it computationally feasible.


It really depends on what you are trying to optimize. Least guesses on average may not want a 50/50 split it might want say a 60/40 split where the 60% is 2 guesses and the 40% is 3. But that loses to a 48:52 split which gives 2 guess 80% of the time etc.


Narrowing the search space greedily doesn't solve the problem optimally, because the size of the search space is just a heuristic. Some search spaces of size X might be more difficult to reduce further than another search space of size Y (for Y < X). This is more apparent in hard mode where you have a very limited guess-pool. The greedy heuristic approach would work fine if all possible 5-letter combinations were valid answers.


I know, like I said in my comment, this greedy strategy could be improved.


Watch this space :D


Why optimize for greens? Better to rule out more first no matter the position. I think it narrows the search space better.

If you want to go fancy somehow calculate with bigramms. If you guess c you don't have to guess k blindly too, etc.


If you want to get really fancy, a full minimax solution is probably feasible without massive resources. Pretend it’s a two player game with one player guessing and the other player thinking of a potentially different word for each guess subject to the constraint of being consistent with all previous answers.


Somebody had already tweeted about solving it and finding that 3 guesses suffice, using alpha beta search (though he didn’t use the term “alpha beta” because he might have reinvented it).

Unfortunately, I can’t seem to dig up the tweet.


There is a wordle parody (absurdle) [0] that does exactly this.

[0] https://qntm.org/files/wordle/


No it does not. It uses a heuristic at every layer of the tree. Specifically, instead of recursing down each subtree, it chooses the set with the maximum number of remaining elements, which isn't necessarily the set that is most difficult to split.


I see - I misunderstood the parent comment.

I believe you are saying that the best adversary would attempt to maximize the depth of the tree. For example a branch with "{L, B, C, R, ..}ake" requires more guesses than a balanced tree despite the fact that the two trees may have the same number of nodes.


If you want to split hairs, minimax is optimal for worst-case, but not for bringing down the average # of guesses.


> Why optimize for greens? Better to rule out more first no matter the position.

I haven't yet experimentally tested this, but I hope to soon! It may be that a green letter in position rules out more possible targets than a grey letter.


I thought of yellows. Taking frequent letters over words with sub-optimal letters but in the right positions.

But manually I'm no where close to 3.6 either :). Also often too lazy.


Yeah, I think my average is closer to 4, ugh! But the program has shown me some strategies that seem interesting. It always guesses "slate" first, and if that has no greys it guesses "crony", and that one-two punch seems to do me well fairly often.


interesting :) I go for "torah", "lines", ("ducky", but I think the k is far from optimal)

apart from "h" and "i" it's the same letters (and I do have c and y in my optional 3rd).


The generally optimal thing is to choose guesses to reduce the set of possible targets as much as possible. This is the same as choosing guesses with maximal entropy over the resulting patterns (that is, the result you get back from the game should be maximally informative). Because greens are rare, by maximizing the probability of greens, you are approximately maximizing the entropy of patterns, by distributing probability mass among rare patterns. This is why going for greens is a good heuristic.


This sounds like the optimal strategy for a single guess. But shouldn't the optimal strategy for solving the game include that there are multiple guesses? While some specific word might provide the highest reduce in entropy for the first guess, it might actually negatively impact the entropy reduction of remaining possible words compared to another first guess.

E.g. if RAISE is the optimal reduction in search space for the given word list, what's the best 2nd guess for every possible word? Now taking the average 2nd guesses into account, is there a better first guess?


Good point. Because the set of possible guesses is constrained, the totally general solution needs to take future guesses into account. There doesn't seem to be a way to find the optimal guess other than searching the whole game tree (which it looks like someone in another comment did).


It's interesting to see the edge cases where the algorithm struggles to find the last letter.

> shave: [slate: 20202, share: 22202, shame: 22202, shape: 22202, shake: 22202, shade: 22202]

I guess it needs to find the right choice of [r, m, p, k, d, v] to get "shave" (since 'v' is so rare it takes 6 guesses!). Trying "marks" and "paved" would narrow that list down in 2 guesses (and a third to actually try "shave")

Anyone code this up yet? (I haven't, I took a greedy approach too [0]). Wonder how to generalize such a plan.

[0]: https://github.com/jdan/wordle.ml


The solver is playing in "hard" mode, where any yellow/green letters must be used in further guesses


Yeah, I don't want to give the game away on my next program but its along those lines, and there are some challenges to making it work that I'm still figuring out haha.


Thanks for sharing.

I have a feeling that a perfect algorithm should be able to solve all wordles in 4 guesses. (but I might be wrong)

The problem with a lot of approaches is they seem to locally optimise for each guess rather than optimise over all the guesses (if that makes sense?)

I get the feeling guessing the most frequent letters might not be the best approach, because does it cut down the number of possible words much?

My intuition tells me if you have the right first 3 guesses you should be able to cut the dataset down to 1 possible answer every time.

However I don't know how to program it.


A bit ago on HN there was a link to Evil Wordle - https://swag.github.io/evil-wordle/ ( https://news.ycombinator.com/item?id=29864418 ) and there's another variation of it at https://qntm.org/files/wordle/

On NPR ( https://www.npr.org/2022/01/23/1075168693/youve-heard-of-wor... )

> QNTM: So with Wordle, in theory, you can win in one guess. I've won in two guesses a couple of times. Absurdle - you cannot beat it in less than four guesses. Four is the limit. But generally just good luck - you're going to need it.

The worst case for Wordle and evil world or Absurdle should be the same... its just that evil and absurd are always the worst case.


You guys are all here real programmers, but I am just good with Excel & a bit of javascript. I made this sheet to get best possible words, & I am trying to get this converted into a bookmarklet.

My excel sheet is explained in this different hn thread about same game https://news.ycombinator.com/item?id=30037065

The whole word list is available in the javascript code of the game page. I know all of that can can be accessed by javascript runtime, but the javascript wordle uses is not the one I can understand, could somebody please have a look at wordle page source & let me know how can I access the variables defined in his code in js file https://www.powerlanguage.co.uk/wordle/main.e65ce0a5.js in variable named La

My some of the basic js apps I have made for myself are hosted on gitlab at username davchana

In manual mode I try to use first word with 3 vowels & no repeat letters. Hard mode is good for not letting the tryies go waste.

Thanks


Hey! Don't say you aren't a real programmer! I learnt how to program in my spare time at an industrial site, using Excel! Excel runs the world, and a few years ago I got a programming contract because I had that VBA knowledge! That same company was so impressed that I eventually got a full time job with them. I am still a full time programmer to this day. So, keep working at it! Excel is real programming and javascript definitely is.

For your problem though, I just copy pasted the entire list into a new file to use. If you want to use my file, its in my repo in the article under the path ./lib/words.py!

You are a real programmer!!!


Hey thanks for the success story & words of encouragement.. you are the hero!!

The file; I was wondering how can I access the list in my javascript bookmarklet, which I can run on his game page. I mean i can define the whole list in my own variable in my bookmarklet, but sure it is going to be over the character limit of a bookmarklet.

How, when the game page is loaded, one can access that variable in console log? I tried simply typing La, but looks like it was not declared on global scope, it is a part of some object.

I am thinking of using a .js file in my bookmarklet, to bypass the char limit; or adding a script link in dom; but those will be redundant data; the original var is there in runtime.


Ahhh for that problem, I am not sure! Best of luck!


A very naive solver is giving me 3.289 guesses on average.

What I did was simply sort the word_list based on frequency analysis. The guess is always the first word in the list (with the highest score) After each guess, I update the list based on the results (gray, yellow, green)

For the final test, I just looped through each word in the list and played the game for that word. My algorithm got every single word (2315 words) in 7614 total guesses giving an average guess of 3.298

Right now, I am only using the list of final_words. I believe that I can get better results by using the list of attempt_words to reduce the number of possibilities, and then using final_words to give final guesses. So maybe the first 2 guesses will be using the attempt_words and then the guesses after that will be from final_words. Or maybe keep guessing from attempt_words until the list length is below a certain number and then start guessing from final_words. Something along these lines will give better results.

In my freq analysis, I also accounted for the freq on each index.


Why do people need 3 guess when the solution is literally in the localStorage /s


It depends on whether you want to play the game or break it. :D

Most people are taking this as a fun algorithmic exercise, assuming imperfect knowledge.


As many have mentioned, it seems much more efficient to focus on eliminating letters, which means that the second word should not be very green at all.

Also, since there's only one word a day, I think we can assume that it's chosen manually, and it won't be some archaic anglo-saxon tool for shoeing horses or something, it will be a relatively common word.


It’s chosen from a sequential list of words in the source code. Each word was vetted by the woman this game was originally made for.

See the NY Times article: http://web.archive.org/web/20220106042510/https://www.nytime...


I think it's chosen based on a random number generated from the date - but one of the most recent games had the answer "REBUS", which I've never heard or read before. But yeah, the target list has many more common words in it than the list of all possible guesses.


I am pretty sure the list is sequential. If you look at the list in the Javascript code of the game and Ctrl-F today's word, you'll see the previous dates' words are right before it. Here's tomorrow's word (spoiler alert, obv): https://pastebin.com/sfQ7x5ya


That is not an uncommon word though, I'd venture most school children know it. You probably read rebuses in school but just forgot about it!


It's a common word? What's your source for that claim?



This website looks like it's target audience is five year olds, to you?


It was just a random google result, but we definitely did rebuses in pre-school at least.

The rebus concept is actually a precursor to actual writing, historically speaking. Both in Mesopotamia/Egypt and in the Chinese civilisation, writing developed through a rebus phase. So it's a very natural thing, that's why it's so child friendly.


That's all cool, why's it a common word, though?


Why? I mean most pre-schoolers probably play with rebuses, simple crosswords etc. It's just a common concept in the world that most people have come across.

Unlike say an obscure linguistic term like nisba.


Oh I get that they exist, commonly, I just see no evidence to suggest the word is common.


For the creator of Wordle and their partner, who obviously have some love of word puzzles, REBUS (a picture to word puzzle) would be a familiar word.


It says it is designed for hard mode, in which I believe you must use your green (and maybe yellow?) letters.


Oh, didn't know there was modes.


Jake Archibald shared an implementation that aims to narrow the search space the most with each guess.

https://twitter.com/jaffathecake/status/1483057877875101697?...


>so far what was taking 1GB of RAM in Python is taking, literally 1MB in Rust

Is anybody hiring people to fix situations like this at work? i.e. this script that was written in a hurry needs to be revised to be 10^3 times more efficient, in space or time.


As a data engineer, I sometimes run into problems like this, but I generally find that 99.9% of them can be solved "well enough" in python, or in a database. Moving code from Python to Rust or similar has been required only a handful of times.

This specific task, the next program I'm working on, seems to be a very "algo" based, comp sci type problem. I think most companies want to say that they have these kind of problems, but the reality often is, they need a batch script on a laptop.

So like, yeah, I have done things like this before in my work, but often, solutions in Python are "good enough".


A company I know hired a computer science PhD after putting them through the full programmer interview, including whiteboarding algos, etc. Literally the hardest crap that anyone working there could think up, and then hired the programmer who got the best score on the test, and the process took multiple months to find this magic individual who could score high enough to satisfy everyone.

Once hired (at a fantastic salary), she was then put to work copying data from a pile of CDs into a cloud storage account! Not only that, but she was given a laptop with only 1 CD drive to work with.

She quit within a week.


I have trouble believing that story. An adult, who is also a super genius just says "I was given..." and quits? That's so passive! You have one CD drive and assume that's it, nothing can be done, that's all the resources in the whole company? Don't talk to anyone?


Perhaps evaluating the situation and leaving immediately shows just how smart they were!


That would depend on whether the company was doing something that needed to be done, wouldn't it? Which should be clear before a person starts.

If something has to be done, then avoiding it makes you "smart" like avoiding taxes makes you "smart".


Wonder what company that was haha :)


>I generally find that 99.9% of them can be solved "well enough" in python

Sorry, I didn't mean "by translating to a lower level language".

In fact, I especially meant by fixing the script in the existing language.


Oh yeah, I do that all the time - sometimes the volume of data gets higher and there are bottlenecks and such, you know? Much refactoring to be done.


As I've seen on HN, a lot of people hate and do not think in SQL, and avoid it by writing something imperative. Replacing a loop with one query is an example of the sort of thing that I've seen make 3 orders of magnitude difference.

Another basic pattern is converting a program or script that loads a whole job into memory into something with fixed size buffers, because specific data made the original explode.


I did this at an internship of mine. You aren't really going to be hired for "please fix our script's performance", but you might end up finding a place where you have the opportunity to do this kind of thing.


>You aren't really going to be hired for "please fix our script's performance"

Not for one script - but imagine everything like that is funneled to a group that supports operations. Such exists, as a matter of fact, question I have is how to find another one.



It looks like the losing cases are those where in hard mode you can easily get 4 of the letters correct but you have more potential words than guesses left.


Yeah, it requires more analysis but I think counterintuitively many of the games that go up to 5-6 are ones where the initial guesses get a lot of greens, effectively making the remaining guesses a set of 50/50s.


I posted a similar bot and tools and analysis a while back.

It solves in 3.45 moves on average, 5 or fewer moves always, and uses 5 moves less than 2% of the time.

The core parts are very fast, using bitwise tricks for filtering, uses caching throughout to reduce computation, and is fully multithreaded.

https://github.com/ChrisLomont/Wordle


Wordle is a perfect example of nerd sniping, I couldn't stop playing once I started. But I wanted to challenge my friends to some more difficult puzzles and wrote a tiny tool this weekend to do just that!

You can create your own Wordle-like puzzles on https://word.rodeo

I've already received a ton of positive feedback from friends. What are your thoughts?


It's not copying the URL to my clipboard, would it be possible to expose the url as text so manual copying can be done?


My partner noticed this too, but after a couple of tries the same button will work again. Really hard to debug unfortunately. It usually worked after a couple of tries. I just pushed a fix that greys out the link if there's no URL, hopefully that helps?


Can you really not just share the url and at most pre-highlight it?

On my phone I get a share sheet with additional text which makes visiting the url a pain


It's a normal anchor tag so you can right click or long press it to copy the link!


Doesn't work for me either, after many clicks.

macOS 11.6, Chrome 96.0.4664.55.


Super embarrassing. I fixed this now. I used event.target which was just the SVG of the icon if you clicked that instead of the text. (I also split the share button in two to make it easier to share)


Well this is embarrassing.

The event listener was using event.target which wasn't the correct node when you hit the icon. I have fixed this now!


Love it


This solution feels a little like cheating because it's using the "target" list to train. The target list contains all the solutions, while the larger list contains all 5-letter words. A human would not know the target list. For a more fair solver, it should add the two lists together to train, and then test on the target list.


Tom's writeup inspired me to dig a little more. I unfortunately (for me) discovered a very simple way to solve the game on the first guess every time. My fault for reading the game's code, but I just can't look at the puzzle the same way anymore.


It’s because you’re using sets — the reason why the author mentioned things are out of order.

>> A set is a collection which is unordered, unchangeable*, and unindexed. [0]

[0] https://www.w3schools.com/python/python_sets.asp


I came across a post which said use the words dying, clots, brake, and whump (statistically expect worse cast 2 guess since 20/26 alphabets used in 4 guesses, worked every time for me (although with this strategy you are ending up needing 5 guesses all the time and in rare occasion need the 6th guess.


See also Absurdle, an adversarial Wordle that is designed to be as difficult as possible: https://qntm.org/files/wordle/

I won't spoil how it works, but click the "?" button on the page if you want to know.


Since it is adversarial, it is also deterministic, I.e, for a given play, its response will always be the same. So the challenge here is no longer, “can you guess THE word?”, but rather, “can you find the shortest play?”. Finding a 5-step solution is reasonably challenging, but try for a 4-step solution, it will drive you nuts :) Some have been posted on Twitter but that’s no fun.


It's designed to be, but it isn't necessarily, as it uses the size of the remaining solution set as a heuristic for how many guesses are needed to split it. There are better, more exhaustive searches of the solution space now.


Shameless plug: I’ve also made a solver, written in Rust. Main goal was to learn the language, 2nd the trie structure behind the search: https://github.com/jutaz/wordle-search


The objective function is weighted average number of guesses, with some penalty factor for >6 (aka losing), and optionally different weights for getting it in 1,2,3,4,5,6

And you can solve it in a tree- at least for the lower levels when the search space has been reduced


https://github.com/JuanPotato/Wordle-Solver

I made a similar solver, it solves all the words in 3.65 average moves.

Could probably be better. The github explains how it works.


Neat! I came up with a similar statistic (3.65 average moves, hard mode, no failures) using a very simple heuristic.

  1) pick a first guess, and partition the remaining words by the hints that they give

  2) given any part of a partition, pick a word that minimizes the maximum-size part of the resulting sub-partition

  3) repeat (2) until words are solved.
I brute-forced the first guess (that is, generated a tree rooted at each word), and the best one was 'bland'. That has one failure, getting stuck on the chain (hound, mound, pound, sound), which ended up being easy to fix manually.

My tree is quite different from yours, with guess distribution 1, 103, 861, 1114, 212, 24.


Yeah I'm on easy mode. With hard mode on I had some failures with words that only differ by 1 letter. I was happy with the result though and didn't care to try and optimize the hardmode.

I think your heuristic is pretty similar to mine, just accomplished in a different way.


This is really interesting, well done :)

As it stands, it seems like your algo will always use the same first guess, have you thought about making that configurable and seeing what happens with e.g. the number of words where it gets stuck and runs out of time?


Get to work Mr. Hogg!

Yeah, next program will effectively do this but in an utterly insane way. Shall we brunch to discuss soon?


Anyone can please fill me in on why I am seeing a daily wordle post on HN for the past week or so? Is there a competition that I missed?


At least in my household (and some groupchats) its become like a fun daily ritual, it takes just 5-10 minutes to play and post it, and you can only play once a day.


If you like Wordle you should check out https://wordhoot.com/


I thought the UI of Wordle wasn't anything special but this makes me appreciate the subtle attention to detail that went into it

That website you linked doesn't lock the viewport so on my phone it's constantly panning or randomly zooming

And the grid instead a keyboard layout actually matters more than I expected for how easy it is to type

Having to manually start a game is a tiny thing but doesn't feel as smooth


Recreating the game from scratch I noticed a ton of these tiny little details that went into the game which made it even more fun of a challenge.

I'm looking forward to trying to implement these subtle animations, too.


Are you still experiencing the panning/zooming issue? If so please let me know and I will look into it. I've updated keyboard layout. Thanks!


It's click to start because it has a timer. Hmm, I'm not seeing the other issues.


You can solve it with grep and /usr/share/dict/words but where's the fun in that?


The answers are all in it's JavaScript source code, but what's the fun in that?


Or just use `JSON.parse(localStorage.gameState).solution` but that's not the point.


Everyone is having these really interesting conversations... but no one has explained what Wordle is :(


Wordle is a language puzzle game which has gained a lot of popularity quite quickly. The rules are very simple, yet allow for quite a lot of strategy. There are many articles about its rise to fame. The official website [1] has a single puzzle per day which everyone shares, but there are many clones which have variations, or unlimited games.

[1] https://www.powerlanguage.co.uk/wordle/


Its explained in the link. Go play it. its quite fun.


I added the explanation as a direct result of the comment you're responding to.


If you found them interesting it would take literally a single search to find out.


Good point! Sorry! I'll add a link to the article!


Someone plz write a RL-agent for this game, I bet it could go lower than 3 guesses lol


My manual average is 3.28. Smaller sample count than these solvers of course.


Isn't an optimal first word one with plenty of vowels?




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: