If I randomly generate a string of length N from an alphabet {A, B, C}, what's the likelihood that exactly k characters will be the same?

Initially, my solution looked like this:
P(k characters are the same) = $(\frac{1}{3})^k * (\frac{2}{3})^{n-k}$
Until I realized that this solution wasn’t robust enough– it doesn’t matter WHICH characters are the same, only that k characters are the same.

Thanks so much in advance for your help.

Solutions Collecting From Web of "If I randomly generate a string of length N from an alphabet {A, B, C}, what's the likelihood that exactly k characters will be the same?"

Let k, l, m be the # of occurrences for different letters in decreasing order.

Using the multinomial distribution formula, to illustrate,

for string AAABBC, Pr = $\frac{6!}{3!2!1!}\cdot\left(\frac{1}{3}\right)^6$

and for string AABBCC, Pr = $\frac{6!}{2!2!2!}\cdot\left(\frac{1}{3}\right)^6$

In general, Pr = $\frac{N!}{k!l!m!}\cdot\left(\frac{1}{3}\right)^N$, k+l+m = N

continued…

I have taken that you want probabilities for particular values of k. For k = 3 and N = 6, for example, you will need to sum up probabilities for permutations of 3-3-0 (3#s) & 3-2-1 (6#s)

edited by the questioner…
The final solution comes down to this.
Imagine that we have three bins, each representing the number of times each character appears in a string. For AAABC, the bins would be {A:3, B:1, C:1}
For:

  • N = the length of the string,
  • k = the maximum bin value (there can be ties),
  • l = the next bin value,
  • m = the last bin value,
  • d = the number of bin values that are different from k’s bin value (max 2)
    • Examples:
      1. ABC: bins = {A: 1, B: 1, C: 1}. d = 0
      2. AAA: bins = {A: 3, B: 0, C: 0}. d = 1
      3. AAC: bins = {A: 2, B: 0, C: 1}. d = 2
  • C = the number of letters in our alphabet (always 3),

Pr = $\frac{N!}{k!l!m!}\cdot\left(\frac{1}{3}\right)^N \cdot\frac{C!}{(C – d)!}$, k+l+m = N