Thanks for sharing your code. I had seen other people also mention that these 5 guesses (AESIR, ARISE, RAISE, REAIS, SERAI) all narrow down the list to 168 (in the worst case), but I was curious why the code I wrote to play with this was giving different answers.
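After reading your code, I see an error in it. In fact, only RAISE is optimal; the others are worse, and leave bigger lists (according to my code).
The error in your code is that your "evaluateGuess" function (right at the top, in the first 10 or so lines of https://github.com/christiangenco/wordlesolver/blob/9c3bd94a...) is too simplistic, and not actually what the real game does. In the game, for each letter position, there are three possible responses: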
• Correct (Green), what you call "position"
• Present (Yellow), what you call "included"
• Absent (Grey)
Here are three test cases that you could try out on the real Wordle in recent days:
• When the solution is "FAVOR" and our guess is "ERROR", Wordle's response is [Grey, Grey, Grey, Green, Green] — note that for the first two Rs in "ERROR", the correct response is Grey (Absent), because the last R has already "used up" the "Green" response.
• When the solution is "FAVOR" and our guess is "ROARS", Wordle's response is [Yellow, Yellow, Yellow, Grey, Grey] — note that only the first R in "ROARS" gets a Yellow response and the second one gets Grey, because there's only one R in the solution.
• (As pointed out by @pedrosorio in a sibling comment) When the solution is "ABBEY" and our guess is "APNEA", Wordle's response is [Green, Grey, Grey, Green, Grey], but your solver thinks that the second "A" would get a "Yellow" response too.
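You mentioned Donald Knuth's Mastermind paper; in fact in the paper (http://www.cs.uni.edu/~wallingf/teaching/cs3530/resources/kn...) Knuth points this out on the very first page:
> Rule 2 is somewhat difficult to state precisely and unambiguously, and the manufacturers have in fact not succeeded in doing so on the directions they furnish with the game […]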
and gives an exact rule that you may want to study carefully.
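For what it's worth, the counting form of that kind of rule fits in a few lines. This is my own paraphrase of the usual Mastermind formulation (count exact matches, then take the per-letter minimum of the two letter counts and subtract the exact matches), not a quote from the paper, and the name `peg_counts` is made up for illustration. It only gives totals, not a color per position, so it is a sanity check rather than a replacement for a per-position function.

```python
from collections import Counter


def peg_counts(h, g):
    """Return (exact, misplaced) totals for hidden word h and guess g."""
    assert len(h) == len(g)
    # Letters in the right position ("Green" in Wordle terms).
    exact = sum(a == b for a, b in zip(h, g))
    # Counter & Counter keeps, per letter, the minimum of the two counts,
    # i.e. how many copies of each letter the two words share in total.
    shared = sum((Counter(h) & Counter(g)).values())
    # Shared letters that are not exact matches are the "Yellow" ones.
    return exact, shared - exact


print(peg_counts("FAVOR", "ERROR"))  # (2, 0)
print(peg_counts("FAVOR", "ROARS"))  # (0, 3)
print(peg_counts("ABBEY", "APNEA"))  # (2, 0)
```

The totals agree with the three test cases above: two Greens and no Yellows for ERROR and APNEA, three Yellows for ROARS.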
In my code, the `response` function I use (it's not the most efficient, but we can just memoize it) is:
    def response(h, g):
        '''
        - The hidden word is h.
        - The guess is g.
        For each position in the word g, some color:
        - 'green' if in the same position
        - 'yellow' if present (after subtracting 'green's)
        - 'grey' if absent (after subtracting "green"s and "yellow"s)
        '''
        assert len(h) == len(g)
        L = len(h)
        green = [i for i in range(L) if h[i] == g[i]]
        yellow = []
        for i in range(L):
            # We want to check whether g[i] is "present" in h
            if i in green: continue
            for j in range(L):
                if j in green: continue
                if j in yellow: continue
                if h[j] == g[i]:
                    yellow.append(i)
                    break
        return (green, yellow)
Note the three "continue" statements — they are crucial, to match the behaviour of the real Wordle (or Master Mind) on the three test cases I mentioned above.
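On the "we can just memoize it" remark, here is a minimal sketch of what I mean, assuming the function is always called with plain strings (which are hashable). The names `make_cached` and `greens_only` are hypothetical; `greens_only` is only a stand-in that records its calls so the caching is visible, not a real response function.

```python
from functools import lru_cache


def make_cached(response_fn):
    """Wrap a (hidden, guess) -> (list, list) function with an unbounded cache."""
    @lru_cache(maxsize=None)
    def cached(h, g):
        green, yellow = response_fn(h, g)
        # Return tuples so a caller cannot mutate a cached result by accident.
        return tuple(green), tuple(yellow)
    return cached


calls = []


def greens_only(h, g):
    """Stand-in for illustration: reports exact matches only, logs every call."""
    calls.append((h, g))
    return [i for i in range(len(h)) if h[i] == g[i]], []


cached = make_cached(greens_only)
cached("FAVOR", "ERROR")
cached("FAVOR", "ERROR")  # served from the cache, greens_only is not called again
print(len(calls))  # 1
```

With a few thousand words, every (hidden, guess) pair gets computed at most once this way, which matters more than making the inner loops clever.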
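The test cases notwithstanding, I found a bug in my code as well: the inner loop skipped `j` when it was in `yellow`, but `yellow` holds positions in the guess, not positions in the hidden word. I fixed my `response` function to: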
    def response(h, g):
        assert len(h) == len(g)
        L = len(h)
        correct = [i for i in range(L) if h[i] == g[i]]
        present_h = []
        present_g = []
        for i in range(L):
            # We want to check whether g[i] is "present" in h
            if i in correct: continue
            for j in range(L):
                if j in correct: continue
                if j in present_h: continue
                if h[j] == g[i]:
                    present_g.append(i)
                    present_h.append(j)
                    break
        return (correct, present_g)
and now I too get (AESIR, ARISE, RAISE, REAIS, SERAI) all leaving 168 words. (The test cases in the above comment still hold, though; make sure your code works for them.)
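To make the test cases easy to run, here is a small self-contained check. `response_v1` and `response_v2` are compacted copies of my first and fixed functions (renamed here only so both fit in one file), and `to_colors` is a helper I made up to turn the index lists into the Green/Yellow/Grey form used in the test cases. As far as I can tell from tracing it, the first version gets FAVOR/ROARS wrong, which is exactly the bug described above.

```python
def response_v1(h, g):
    """First version: skips j when j is in `yellow` (a list of guess positions)."""
    L = len(h)
    green = [i for i in range(L) if h[i] == g[i]]
    yellow = []
    for i in range(L):
        if i in green:
            continue
        for j in range(L):
            if j in green or j in yellow:
                continue
            if h[j] == g[i]:
                yellow.append(i)
                break
    return (green, yellow)


def response_v2(h, g):
    """Fixed version: tracks used-up positions of the hidden word separately."""
    L = len(h)
    correct = [i for i in range(L) if h[i] == g[i]]
    present_h, present_g = [], []
    for i in range(L):
        if i in correct:
            continue
        for j in range(L):
            if j in correct or j in present_h:
                continue
            if h[j] == g[i]:
                present_g.append(i)
                present_h.append(j)
                break
    return (correct, present_g)


def to_colors(resp, length=5):
    """Turn (green_indices, yellow_indices) into a per-position color list."""
    green, yellow = resp
    return ["Green" if i in green else "Yellow" if i in yellow else "Grey"
            for i in range(length)]


TEST_CASES = [
    ("FAVOR", "ERROR", ["Grey", "Grey", "Grey", "Green", "Green"]),
    ("FAVOR", "ROARS", ["Yellow", "Yellow", "Yellow", "Grey", "Grey"]),
    ("ABBEY", "APNEA", ["Green", "Grey", "Grey", "Green", "Grey"]),
]

for h, g, expected in TEST_CASES:
    assert to_colors(response_v2(h, g)) == expected, (h, g)

# The first version marks the wrong letters in ROARS:
print(to_colors(response_v1("FAVOR", "ROARS")))
# ['Yellow', 'Yellow', 'Grey', 'Yellow', 'Grey']
```

The other two test cases happen to pass with both versions, so FAVOR/ROARS is the one worth keeping around as a regression test.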