Variance of the number of empty cells

If I place $k$ balls in $n$ cells randomly with uniform probability, then for large enough $n$ and $k/n$ small enough, I expect about $n e^{-k/n}$ cells to be empty (using the Poisson approximation to the Binomial.) However, I cannot find the limit of the variance of the number of empty cells?

One can compute explicit cases by using the probability that exactly $m$ cells are empty,
$$var = \sum_{m=0}^nm^2\binom{n}{m}\sum_{\nu=0}^{n-m}(-1)^\nu\binom{n-m}{\nu}\left(1-\frac{m+\nu}{n}\right)^k – \left(1 – \frac{1}{n}\right)^{2k}$$

but I cannot find the limit of this.

The same question can be asked of singletons, doubletons, etc.

Solutions Collecting From Web of "Variance of the number of empty cells"

We use the method of indicator functions, which is often useful for this kind of problem. Let $X_i=1$ if the $i$-th cell is empty, and $0$ otherwise. Let
$$Y=\sum_{i=1}^n X_i.$$
Then $Y$ is the number of empty cells. We want the variance of $Y$. This is equal to
It is easy to find $E(Y)$, since we can find $P(X_i=1)$, and then use the linearity of expectation.

Now we want $E(Y^2)$. To find it, expand $Y^2$, that is, $(X_1+\cdots +X_n)^2$. We get
$$\sum_{i=1}^n X_i^2 + 2\sum_{1\le i<j\le n} X_iX_j.$$
Finding the expectation of $\sum X_i^2$, is easy, indeed we have already found it, since $X_i^2=X_i$. It remains to find the expectation of $2\sum X_iX_j$.

There are $\binom{n}{2}$ terms in the sum. So we need to find the expectation of a typical mixed term $X_iX_j$, and multiply by $2\binom{n}{2}$.
To find $E(X_iX_j)$, all we need to do is to find the probability that cells $i$ and $j$ are both empty. That is not hard to do.

Comment: The technique used bypasses the difficult problem of finding the probability distribution of $Y$, then getting an expression for the variance, and then somehow simplifying this expression so that we can actually use it for further work.

Note that our random variables $X_i$ are not independent. However, that need not be an impediment. The expectation $E(X_iX_j)$ is, in our problem, somewhat harder to calculate than if we had independence, but it is not much harder.

The technique will also work for the problems involving singletons, doubletons, etc., proposed in the post.

Let $Y$ be the number of empty cells. We can write $Y = \sum_{i=1}^n Y_i$ where $Y_i$ is $1$ if cell number $i$ is empty, $0$ otherwise.

$E(Y_j)$ is the probability that cell number $j$ is empty, namely $(1 – 1/n)^k$, so $E(Y) = n (1 – 1/n)^k$.

For two distinct cells $j_1$ and $j_2$, $E(Y_{j_1} Y_{j_2}$ is the probability that both are empty, namely
$(1 – 2/n)^k$. There are $n(n-1)$ such ordered pairs. On the other hand
$E(Y_j^2) = E(Y_j) = (1 – 1/n)^k$, and there are $n$ of these. So
$E(Y^2) = n(n-1) (1 – 2/n)^k + n (1 – 1/n)^k$, and
$$\text{Var}(Y) = E(Y^2) – E(Y)^2 = n(n-1)(1 – 2/n)^k + n (1 – 1/n)^k – n^2 (1 – 1/n)^{2k}$$

As $n \to \infty$ with $k = \lambda n$, $(1 – 1/n)^k \approx (1 – \frac{\lambda}{2n}) e^{-\lambda} $,
$(1 – 1/n)^{2k} \approx (1 – \frac{\lambda}{n}) e^{-2\lambda}$
and $(1 – 2/n)^k \approx (1 – 2 \lambda/n) e^{-2 \lambda}$. The end result is $\text{Var}(Y) \approx n (e^{-\lambda} – (1+\lambda) e^{-2\lambda})$.