In Words with Friends, it is sometimes helpful to know the probability that your opponent has a certain letter in his hand. While this may sound like a daunting task, it is actually quite simple and can be modelled with a hypergeometric distribution, which describes sampling without replacement. The formula is given as:

$$P(X = k) = \frac{\binom{m}{k}\binom{N-m}{n-k}}{\binom{N}{n}}$$
where N refers to the population size, n the number of draws, m the number of 'successes' in the whole population, and k the number of successes whose probability you want to calculate. The brackets denote a binomial coefficient such that:

$$\binom{a}{b} = \frac{a!}{b!\,(a-b)!}$$
Let's test this with a simple example first. What is the probability that your opponent has an 'X' in the first round, assuming that you don't have one in your hand?
In this case:
n = 7, because each player starts with 7 tiles.
N = 97, because there are 104 tiles in the game and the 7 in my hand (none of them an 'X') are excluded.
m = 1, because there is only one 'X' in the whole game.
N - m = 96, which refers to all the non 'X' tiles in the game.
k = 1, because you want to find the probability that your opponent has one 'X' in his hand.
P(X = 1) = [ 1! / ( 1! x 0! ) ] x [ 96! / ( 6! x 90! ) ] / [ 97! / ( 7! x 90! ) ]
= 7 / 97
= 7.22%
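As a cross-check, the same number can be reproduced in a few lines of Python. This is only a sketch: the function name `hypergeom_pmf` and its argument names are my own choices for illustration, and it relies on `math.comb` from the standard library (Python 3.8 or later) for the binomial coefficients.

```python
from math import comb

def hypergeom_pmf(k, N, m, n):
    """Probability of exactly k successes in n draws without replacement
    from a population of N tiles that contains m successes."""
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)

# 'X' example: 97 unseen tiles, one 'X' among them, opponent holds 7 tiles.
p_x = hypergeom_pmf(k=1, N=97, m=1, n=7)
print(f"{p_x:.2%}")  # 7.22%
```

Using `comb` rather than explicit factorials keeps the intermediate numbers small and mirrors the formula term by term.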
Let's check this with just logic and simple probability instead of the formula. Your opponent has 7 tiles in his hand, so this is equivalent to 7 draws.
Probability that he draws the 'X' on his first attempt = 1 / 97
Probability that he draws the 'X' on his second attempt (having missed it on the first) = 96 / 97 x 1 / 96 = 1 / 97
Probability that he draws the 'X' on his third attempt (having missed it on the first two) = 96 / 97 x 95 / 96 x 1 / 95 = 1 / 97
...and so on until the seventh attempt. The probability is 1 / 97 for each attempt, and because there is only one 'X', these seven outcomes cannot happen together, so their probabilities simply add up. In total, the answer is 7 / 97, which is 7.22%.
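The draw-by-draw argument can also be spelled out in Python with exact fractions. This is a rough sketch of the reasoning above, not a general method; the variable names are made up for illustration, and the numbers 97 and 7 are taken from the example.

```python
from fractions import Fraction

unseen = 97  # tiles not in your hand (from the example above)
draws = 7    # tiles in the opponent's hand

hits = []                  # probability that the 'X' turns up on each draw
miss_so_far = Fraction(1)  # probability that no earlier draw was the 'X'
for i in range(draws):
    remaining = unseen - i
    # Miss on every earlier draw, then pick the 'X' from what is left.
    hits.append(miss_so_far * Fraction(1, remaining))
    miss_so_far *= Fraction(remaining - 1, remaining)

total = sum(hits, Fraction(0))
print(total)  # 7/97
```

Every entry in `hits` comes out as 1/97, which is why the total is simply 7 times 1/97.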
Of course, the hypergeometric distribution can be used to calculate more complicated scenarios. For example, what is the probability that your opponent has exactly 2 'S' tiles in his first round, assuming you have none in your hand?
This time, m = 5 (there are 5 'S' tiles in the game), and k = 2. The other variables remain the same.
P(X = 2) = [ 5! / ( 2! x 3!) ] x [ 92! / ( 5! x 87! ) ] / [ 97! / ( 7! x 90! ) ]
= 3.83%
However, it would be more useful to calculate the probability that your opponent has at least 2 'S' tiles in his first round. This means we want to find P(X ≥ 2), which is P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5). Doing the calculations gives an answer of 4.05%.
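The 'exactly 2' and 'at least 2' figures can be reproduced with a short Python sketch. The helper name `pmf` is my own, and N, m and n are simply the values from the 'S' example; the complement 1 - P(X = 0) - P(X = 1) is included only as a sanity check on the sum.

```python
from math import comb

N, m, n = 97, 5, 7  # unseen tiles, 'S' tiles among them, opponent's hand size

def pmf(k):
    """P(X = k) for the 'S' example, using the hypergeometric formula."""
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)

p_exactly_2 = pmf(2)
p_at_least_2 = sum(pmf(k) for k in range(2, m + 1))
# The same tail probability via the complement.
p_complement = 1 - pmf(0) - pmf(1)

print(f"{p_exactly_2:.2%}")   # 3.83%
print(f"{p_at_least_2:.2%}")  # 4.05%
```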
This distribution can be easily replicated in Microsoft Excel, using the function FACT() for the factorials in each binomial coefficient, or COMBIN() to get a binomial coefficient directly.
It should be noted that this method may not be 100% accurate, because a player can use fewer than 7 tiles in one go. Your opponent is therefore likely to hoard the better tiles, like 'S' or a blank, and wait for an opportunity. The calculations above can thus be thought of as 'the minimum probability' that your opponent has a particular letter.