Most mathematical questions one could have about Wordle are settled by now, and a few remain open. I summarize here what is known, as far as I can tell.
First, let’s clarify a few things about the game:
Wordle comes with a dictionary of 12972 words that the player is allowed to use as guesses. They are essentially all 5-letter combinations one could reasonably argue are English words. The “secret” word that the player has to discover is also always in that dictionary.
It is so trivially easy to cheat at Wordle that there is no point to it: The list of secret words is right there in the code of the game. The daily secret words are picked sequentially from that list, so they are all known since 2021-06-19 (“cigar”) and until 2027-10-20 (not going to spoil it for you), for a total of 2315 scheduled secret words.
One could always look up that list for the current date and solve the game immediately at the first guess. However, this is not much fun (mathematically), so we need to define the game somehow. There are two obvious options:
While many focused on finding the best “first word” to play as a guess, the first guess is just a tiny part of what we really want: A decision tree. A decision tree not only tells you which guess to play first, but accompanies you as you play, telling you which guesses to play subsequently in function of the information given you by the game as you play (i.e. all the letter colors 🟩🟨⬜ so far). I only cover here fixed/deterministic decision trees, which will always reach a given secret word in the same way, because it is straightforward to give formal evaluations of how good they are.
The word optimal has taken quite a mauling lately in Wordle-related writings. Just to be to be absolutely clear, a solution is optimal only if no better solution can possibly exist. In particular, there is no such thing as a “more optimal” solution.
Now, regarding what we optimize, there are two natural choices:
As we will see, those are competing objectives, sometimes irreconcilably, sometimes not so.
Buried in the options, Wordle features a hard mode. The game is different in hard mode, in that only words that satisfy the hints given earlier are allowed as player guesses. (For example, if a letter is colored green 🟩 after one guess, that letter must be present in the same position in all subsequent guesses.)
There are subtleties about the rules regarding words with repeated letters, and this particularly affects hard mode. However, it seems that by now everyone agrees what the rules are.
In this section, we assume that we know the list of 2315 scheduled secret words, but we ignore the list’s order (it would be trivially easy otherwise).
For this case, Cyrus Freshman set up a nice automated leader board with a standardized input format. There we can directly find that five distinct people have found decision trees that solve Wordle in 3.4212 average guesses and (simultaneously) 5 worst-case guesses: Jonathan Olson Peter Tseng, CHIsomorphism
, Alex Selby, as well as rahsosprout
.
While I suspect all employed similar general approaches (recursive enumeration / backtracking with tree pruning and some form of caching), I will now mainly defer to Alex Selby’s nice writeup, because it is the most thorough and because the corresponding code can answer more of our questions. Alex and Peter both implemented their algorithms in such a way that not only they get good decision trees, but they can also subsequently perform an accelerated complete enumeration and prove optimality. Alex indicates that the latter is by far the most difficult part computationally: finding good trees (that happen to be optimal) is quick, proving that they are indeed optimal is much more expensive.
The best known decision tree for this version of Wordle lead to a win in 3.4212 guesses on average. In the worst case, this same tree wins the game in 5 guesses (i.e., 100% of the time, never even using the 6th guess). Alex reports on the result of the complete enumeration, formally proving that this is optimal, both in average (no tree has an average of less than 3.4212) and in the worst case (no tree wins all games in 4 guesses at most).
For hard mode, both Alex and Peter found a decision tree with in 3.5085 guesses on average and 6 guesses at worst (i.e., 100% wins in 6 guesses). Hard mode is interesting in that, if we are willing to sacrifice a bit on the average, we can improve the worst case: Alex found a different decision tree with 3.5448 guesses on average, but a worst case of 5 guesses! Again, Alex reports optimality on both fronts.
In terms of worst case, I could confirm the optimality results both for standard Wordle and hard mode.
In this section, we assume that the game can pick secret words from its whole 12972-word dictionary. The number of potential secret words is now about 5.6x as large as in the previous case, dramatically increasing the size of an already large enumeration space.