Negative binomial distribution – sum of two random variables

Suppose $X, Y$ are independent random variables with $X\sim NB(r,p)$ and $Y\sim NB(s,p)$. Then $$X + Y \sim NB(r+s,p)$$

How do I go about proving this? I’m not sure where to begin; I’d be glad for any hint.

If $\Pr(X=k)={k+r-1 \choose k}\cdot (1-p)^r p^k$ and $\Pr(Y=k)={k+s-1 \choose k}\cdot (1-p)^s p^k$ then

$$\Pr(X+Y=k)=\sum_{j=0}^k {j+r-1 \choose j}\cdot (1-p)^r p^j \cdot {k-j +s-1 \choose k-j}\cdot (1-p)^s p^{k-j}$$

$$=\sum_{j=0}^k {j+r-1 \choose j}\cdot {k-j +s-1 \choose k-j}\cdot (1-p)^{r+s} p^k$$

and you need to show

$$\Pr(X+Y=k)= {k+r+s-1 \choose k}\cdot (1-p)^{r+s} p^k$$

so it is just a matter of showing $\displaystyle \sum_{j=0}^k {j+r-1 \choose j}\cdot {k-j +s-1 \choose k-j}={k+r+s-1 \choose k}.$
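As a quick sanity check of this Vandermonde-type identity (not part of the original argument, just a sketch using Python's standard library), one can compare both sides exactly over a small grid of parameters:

```python
from math import comb

def lhs(r, s, k):
    """Left side: the convolution sum over j."""
    return sum(comb(j + r - 1, j) * comb(k - j + s - 1, k - j)
               for j in range(k + 1))

def rhs(r, s, k):
    """Right side: the single binomial coefficient."""
    return comb(k + r + s - 1, k)

# Exact integer arithmetic, so equality can be checked directly.
for r in range(1, 5):
    for s in range(1, 5):
        for k in range(10):
            assert lhs(r, s, k) == rhs(r, s, k)
```

Of course this only spot-checks the identity; a proof follows, e.g., by comparing coefficients in $(1-x)^{-r}(1-x)^{-s}=(1-x)^{-(r+s)}$.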

An $NB(r,p)$ random variable can be written as a sum of independent geometric random variables.

Let $X_1, X_2, \ldots$ be i.i.d. with $X_i\sim \mathrm{Geometric}(p)$, where each $X_i$ counts the failures before the first success.

Then $X\sim NB(r,p)$ satisfies $X = X_1 + \cdots +X_r$,

and $Y\sim NB(s,p)$ satisfies $Y= X_{r+1} + \cdots + X_{r+s}.$

Therefore, $X+Y = X_1 + \cdots + X_{r+s}.$

This yields $X+Y \sim NB(r+s, p)$.
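This decomposition can be illustrated by simulation (a rough Monte Carlo sketch, not part of the original answer; the convention here matches the pmf above, where $p$ is the failure probability and the variable counts failures before the $r$-th success):

```python
import random

def geometric(p, rng):
    """Failures (each with prob. p) before the first success (prob. 1-p)."""
    k = 0
    while rng.random() < p:
        k += 1
    return k

def nb_sample(r, p, rng):
    """NB(r, p) drawn as a sum of r independent geometrics."""
    return sum(geometric(p, rng) for _ in range(r))

rng = random.Random(0)
r, s, p, n = 3, 2, 0.4, 100_000
samples = [nb_sample(r, p, rng) + nb_sample(s, p, rng) for _ in range(n)]
mean = sum(samples) / n
# Theoretical mean of NB(r+s, p) under this convention: (r+s) * p / (1-p)
expected = (r + s) * p / (1 - p)
```

With these parameters the empirical mean of $X+Y$ should land close to $(r+s)\,p/(1-p)=10/3$, consistent with $X+Y\sim NB(r+s,p)$.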

Since $X$ and $Y$ are independent, the moment generating function (MGF) of $X+Y$ is the product of the MGFs of $X$ and $Y$. The MGF of $X$ is $\displaystyle M_X(t)=\left(\frac{1-p}{1-pe^t}\right)^r$ (for $pe^t<1$), and similarly $\displaystyle M_Y(t)=\left(\frac{1-p}{1-pe^t}\right)^s$. Hence $$\begin{align} M_{X+Y}(t)&=M_X(t)M_Y(t)\\ &=\left(\frac{1-p}{1-pe^t}\right)^r\left(\frac{1-p}{1-pe^t}\right)^s\\ &=\left(\frac{1-p}{1-pe^t}\right)^{r+s}.\end{align}$$
This is the MGF of an $NB$ distribution with parameters $r+s$ and $p$, and since the MGF determines the distribution, $X+Y\sim NB(r+s,p)$.
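The MGF identity can also be checked numerically (a sketch assuming the pmf parameterization used above, with a truncated series standing in for the infinite expectation):

```python
from math import comb, exp

def nb_pmf(k, r, p):
    """NB(r, p) pmf: C(k+r-1, k) * (1-p)^r * p^k."""
    return comb(k + r - 1, k) * (1 - p)**r * p**k

def mgf_from_pmf(t, r, p, terms=500):
    """Truncated E[e^{tX}] computed term by term; valid when p*e^t < 1."""
    return sum(exp(t * k) * nb_pmf(k, r, p) for k in range(terms))

def mgf_formula(t, r, p):
    """Closed form ((1-p)/(1 - p*e^t))^r."""
    return ((1 - p) / (1 - p * exp(t)))**r

r, s, p, t = 3, 2, 0.3, 0.2   # p*e^t ≈ 0.37 < 1, so the series converges
prod = mgf_formula(t, r, p) * mgf_formula(t, s, p)
combined = mgf_from_pmf(t, r + s, p)
```

The product of the two MGFs agrees with the MGF computed directly from the $NB(r+s,p)$ pmf, up to the (tiny) truncation error.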

Building on the interpretation of $NB(r,p)$ as the waiting time to the $r$-th success in independent Bernoulli trials: waiting for the $r$-th success and then for $s$ further successes is the same as waiting for the $(r+s)$-th success. So the result follows directly, with no algebra.

Have you learnt about the convolution of two independent random variables? That lets you compute the pmf directly, without mentioning the MGF. The method is to condition on one of the variables and apply the law of total probability. For any $k\geq 0$, verify that the resulting sum is the required NB pmf:

$$P(X+Y=k)=\sum_{x=0}^k P(X+Y=k\mid X=x)P(X=x)=\sum_{x=0}^k P(Y=k-x)P(X=x)$$
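For concreteness, here is a short numerical check of this convolution (a sketch, assuming the same pmf parameterization as in the first answer): the convolution sum should reproduce the $NB(r+s,p)$ pmf term by term.

```python
from math import comb

def nb_pmf(k, r, p):
    """NB(r, p) pmf: number of failures before the r-th success."""
    return comb(k + r - 1, k) * (1 - p)**r * p**k

def conv_pmf(k, r, s, p):
    """P(X+Y=k) via the convolution sum, X ~ NB(r,p), Y ~ NB(s,p)."""
    return sum(nb_pmf(x, r, p) * nb_pmf(k - x, s, p) for x in range(k + 1))

r, s, p = 3, 2, 0.4
# Compare the convolution against the claimed NB(r+s, p) pmf for small k.
errors = [abs(conv_pmf(k, r, s, p) - nb_pmf(k, r + s, p)) for k in range(20)]
```

The two pmfs agree to floating-point precision, which is exactly the combinatorial identity from the first answer in disguise.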