
What word do you start Wordle with?
My wife and I enjoy a good sparring game of Wordle over our morning coffee, and like all devoted Wordle players, we keep coming back to one of the game’s top conundrums: choosing the best starting word. I’m not the first person to write about this topic, but I think I have taken a different approach and come up with a slightly different answer.
Most of the articles about “best Wordle starting words” reference the bot analysis done by the current owner of Wordle, The New York Times:
- “One mathematician created an automated bot to test over 12,000 words, while online urban legend suggests words like ‘Irate’ or ‘Salet’ are best because of their vowels. In creating their WordleBot, though, New York Times has found ‘Crane’ to be the best place to start while ‘Crate’, ‘Slate’, ‘Slant’ and ‘Trace’ are also very good guesses, as are ‘Lance’, ‘Carte’, ‘Least’ and ‘Trice’.”
But people are not bots. I decided to play card counter in the Wordle casino and simply look at letter frequency by position to optimize the odds of hitting a useful letter. (And if the letter is a miss, it rules out the largest share of candidate words, since the most common letter in a position appears in more words than any other.)
I started with a list of the 700 most common 5-letter words in the English language (to filter out the esoteric and foreign words not included in Wordle). I then cobbled together a little Excel macro to count letter frequency by position (the results are shown in the table below).
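For anyone who would rather not fiddle with Excel, the same counting idea can be sketched in a few lines of Python. This is not the macro itself, just a minimal illustration: the function name `positional_frequencies` and the five-word sample list are made up for the example, and a real run would load the full word list instead.

```python
from collections import Counter

def positional_frequencies(words):
    """Return five Counters, one per letter position (hypothetical helper)."""
    counts = [Counter() for _ in range(5)]
    for word in words:
        word = word.strip().lower()
        # Skip anything that is not a plain 5-letter word.
        if len(word) != 5 or not word.isalpha():
            continue
        for position, letter in enumerate(word):
            counts[position][letter] += 1
    return counts

# Tiny illustrative sample, NOT the 700-word list used for the article.
sample_words = ["shale", "stare", "crane", "slate", "trace"]
freqs = positional_frequencies(sample_words)

# Show the three most frequent letters in each position.
for position, counter in enumerate(freqs, start=1):
    print(position, counter.most_common(3))
```

Each `Counter` plays the role of one column in a frequency table: letters as keys, counts as values.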
With the positional frequencies in hand, the final step was to find the word with all of its letters in the highest-frequency positions. If I just took the top-frequency letter in each position, I would get “SOALE”, which unfortunately is not a word. By using a top-5 position scoring system (like they do at track and field meets), the highest-scoring word I could come up with was “SHALE”, which has the #1 letter in every position except the 2nd (where “H” is the 4th most frequent letter).
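Here is a rough Python sketch of that kind of top-5 scoring. The exact point values are an assumption on my part for this example (5 points for the most frequent letter in a position down to 1 point for the fifth, 0 otherwise), and the names `rank_points`, `score_word`, and `best_starting_words` are hypothetical. The sample list is again a toy stand-in for the real word list.

```python
from collections import Counter

# Assumed point scale: 1st place = 5 points ... 5th place = 1 point.
POINTS = (5, 4, 3, 2, 1)

def rank_points(words):
    """For each position, map the top-5 letters to their points."""
    tables = []
    for position in range(5):
        counter = Counter(word[position] for word in words)
        top = counter.most_common(len(POINTS))
        tables.append({letter: POINTS[i] for i, (letter, _) in enumerate(top)})
    return tables

def score_word(word, tables):
    """Sum the points each letter earns in its own position."""
    return sum(tables[i].get(letter, 0) for i, letter in enumerate(word))

def best_starting_words(words, top_n=3):
    """Return the top_n highest-scoring real words from the list."""
    tables = rank_points(words)
    return sorted(words, key=lambda w: score_word(w, tables), reverse=True)[:top_n]

# Toy sample, NOT the 700-word list used for the article.
sample_words = ["shale", "stare", "crane", "slate", "trace"]
tables = rank_points(sample_words)
for word in best_starting_words(sample_words):
    print(word, score_word(word, tables))
```

Because only words from the list are scored, a non-word like “SOALE” can never win. Letters tied on frequency are ranked in the order they were first seen, which is arbitrary; a more careful version would give tied letters equal points.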
A high-ranking post by Tom’s Guide took a similar approach, but used a much bigger dictionary of “common words” and came up with a different word, namely “STARE”. It too came up with candidates that weren’t real words (e.g., its top possibility was “SOARE”). But the methodology is arguably a bit unscrupulous because it searched the list of known Wordle “solutions”.
The real difference between Tom’s analysis and mine is (a) whether “H” or “T” is the higher-frequency 2nd letter, and (b) whether “L” or “R” is the higher-frequency letter in the 4th position. Another consideration is that “T” and “R” are more common letters in general than “H” or “L” (so they are more likely to earn a gold-shaded letter, meaning one that is in the word but in the wrong position).
