# Maximum Likelihood Estimator of parameters of multinomial distribution

Suppose that 50 measuring scales made by a machine are selected at random from the production of the machine and their lengths and widths are measured. It was found that 45 had both measurements within the tolerance limits, 2 had satisfactory length but unsatisfactory width, 2 had satisfactory width but unsatisfactory length, 1 had both length and width unsatisfactory. Each scale may be regarded as a drawing from a multinomial population with density

$$\pi_{11}^{x_{11}} \pi_{12}^{x_{12}} \pi_{21}^{x_{21}}(1-\pi_{11}-\pi_{12}-\pi_{21})^{x_{22}}$$

Obtain the maximum likelihood estimates of the parameters.

I tried it the following way:

the likelihood function is

$L=L(\pi_{11},\pi_{12},\pi_{21},(1-\pi_{11}-\pi_{12}-\pi_{21}))$

$=\prod_{i=1}^{50}[\pi_{11}^{x_{11}} \pi_{12}^{x_{12}} \pi_{21}^{x_{21}}(1-\pi_{11}-\pi_{12}-\pi_{21})^{x_{22}}]$

$=[\pi_{11}^{x_{11}} \pi_{12}^{x_{12}} \pi_{21}^{x_{21}}(1-\pi_{11}-\pi_{12}-\pi_{21})^{x_{22}} ]^{50}$

$=[\pi_{11}^{45} \pi_{12}^{2} \pi_{21}^{2}(1-\pi_{11}-\pi_{12}-\pi_{21})^{1} ]^{50}$

$=\pi_{11}^{2250} \pi_{12}^{100} \pi_{21}^{100}(1-\pi_{11}-\pi_{12}-\pi_{21})^{50}$

Taking logarithm of the likelihood function yields,

$L^*=\log L=\log [\pi_{11}^{2250} \pi_{12}^{100} \pi_{21}^{100}(1-\pi_{11}-\pi_{12}-\pi_{21})^{50}]$

$=2250\log [\pi_{11}]+100\log [\pi_{12}]+100\log [\pi_{21}]+50\log (1-\pi_{11}-\pi_{12}-\pi_{21})$

Now taking the first derivative of $L^*$ with respect to $\pi_{11}$

$\frac{\partial L^*}{\partial \pi_{11}}$
$=\frac{2250}{\pi_{11}}-\frac{50}{(1-\pi_{11}-\pi_{12}-\pi_{21})}$

setting $\frac{\partial L^*}{\partial \pi_{11}}$ equal to $0$,

$\frac{\partial L^*}{\partial \hat\pi_{11}}=0$

$\Rightarrow\frac{2250}{\hat\pi_{11}}-\frac{50}{(1-\hat\pi_{11}-\hat\pi_{12}-\hat\pi_{21})}=0$

$\Rightarrow \hat\pi_{11}=\frac{45(1-\hat\pi_{12}-\hat\pi_{21})}{46}$

$\bullet$ Are the procedure and the estimate of $\pi_{11}$ correct?

$\bullet$ I have another question: if the distribution is multinomial, where is the coefficient $\binom{n}{x_{11},x_{12},x_{21},x_{22}}=\binom{50}{45,2,2,1}$?

#### Solutions Collected From the Web for "Maximum Likelihood Estimator of parameters of multinomial distribution"

Consider a positive integer $n$ and a family of positive real numbers $\mathbf p=(p_x)$ such that $\sum\limits_xp_x=1$. The multinomial distribution with parameters $n$ and $\mathbf p$ is the distribution $f_\mathbf p$ on the set of families of nonnegative integers $\mathbf n=(n_x)$ such that $\sum\limits_xn_x=n$, defined by
$$f_\mathbf p(\mathbf n)=n!\cdot\prod_x\frac{p_x^{n_x}}{n_x!}.$$
For some fixed observation $\mathbf n$, the likelihood is
$L(\mathbf p)=f_\mathbf p(\mathbf n)$ with the constraint $C(\mathbf p)=1$, where $C(\mathbf p)=\sum\limits_xp_x$. To maximize $L$, one asks that the gradient of $L$ and the gradient of $C$ be collinear, that is, that there exist $\lambda$ such that, for every $x$,
$$\frac{\partial}{\partial p_x}L(\mathbf p)=\lambda\frac{\partial}{\partial p_x}C(\mathbf p).$$
In the present case, this reads
$$\frac{n_x}{p_x}L(\mathbf p)=\lambda,$$
that is, $p_x$ should be proportional to $n_x$. Since $\sum\limits_xp_x=1$, one finally gets $\hat p_x=\dfrac{n_x}n$ for every $x$.
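As a quick check of this result on the data from the question, here is a minimal Python sketch. The function names `multinomial_mle` and `log_likelihood` are made up for illustration, and the counts are ordered as $(x_{11}, x_{12}, x_{21}, x_{22})$. The multinomial coefficient $\binom{50}{45,2,2,1}$ is left out of the log-likelihood because it does not depend on the parameters, so it cannot change where the maximum is.

```python
import math
import random

def multinomial_mle(counts):
    """Closed-form MLE: each probability is its observed proportion n_x / n."""
    n = sum(counts)
    return [c / n for c in counts]

def log_likelihood(p, counts):
    """Multinomial log-likelihood without the constant log(n! / prod n_x!)."""
    return sum(c * math.log(q) for c, q in zip(counts, p))

# Counts from the question: both OK, length OK only, width OK only, both bad.
counts = [45, 2, 2, 1]
p_hat = multinomial_mle(counts)
print(p_hat)  # [0.9, 0.04, 0.04, 0.02]

# Sanity check: no randomly chosen point on the simplex beats p_hat.
rng = random.Random(0)
best = log_likelihood(p_hat, counts)
for _ in range(1000):
    w = [rng.random() + 1e-12 for _ in counts]  # strictly positive weights
    s = sum(w)
    q = [wi / s for wi in w]                    # normalise onto the simplex
    assert log_likelihood(q, counts) <= best + 1e-12
```

The random search is only a numerical sanity check, not a proof; the Lagrange argument above is what establishes the maximum.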

Let $\mathbf{X}$ be a random vector following a multinomial distribution. Then, \begin{align}P(\mathbf{X} = \mathbf{x};n,\mathbf{p}) &= n!\,\prod_{k=1}^K \frac{p_k^{x_k}}{x_k!} \end{align}
Here $x_k$ is the number of successes of the $k^{th}$ category in $n$ random draws, and $p_k$ is the probability of success of the $k^{th}$ category. Note that \begin{align}\sum_{k=1}^K x_k &= n\\ \sum_{k=1}^{K} p_k &=1 \end{align}

For the estimation problem, we have $N$ samples $\mathbf{X_1}, \ldots,\mathbf{X_N}$ drawn independently from the above multinomial distribution. The log-likelihood is given as $$\mathcal{L}(\mathbf{p},n) = \sum_{i=1}^N \log P(\mathbf{x_i},n,\mathbf{p})$$
where \begin{align}\log P(\mathbf{x_i},n,\mathbf{p}) &= \log \frac{n!}{\prod_k x_{ik}!} + \sum_{k=1}^{K} x_{ik} \log p_k \\ \sum_{i=1}^N \log P(\mathbf{x_i},n,\mathbf{p}) &= C + \sum_{k=1}^{K} N_k \log p_k \end{align}
where $N_k = \sum_{i=1}^{N} x_{ik}$ is the total number of successes of the $k^{th}$ category in $N$ samples and $C$ collects the terms that do not depend on $\mathbf{p}$.
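The decomposition into a constant plus pooled counts can be checked numerically. The sketch below uses hypothetical data ($N = 3$ samples with $n = 4$ draws each over $K = 3$ categories) and an arbitrary probability vector; the helper names `log_coef` and `log_pmf` are illustrative only.

```python
import math

def log_coef(x):
    """log of the multinomial coefficient n! / prod_k x_k! for one sample."""
    return math.lgamma(sum(x) + 1) - sum(math.lgamma(xk + 1) for xk in x)

def log_pmf(x, p):
    """Exact multinomial log-probability of one count vector x."""
    return log_coef(x) + sum(xk * math.log(pk) for xk, pk in zip(x, p))

# Hypothetical data: each row is one sample's counts per category.
samples = [[2, 1, 1], [3, 0, 1], [1, 2, 1]]
p = [0.5, 0.3, 0.2]

# Left-hand side: sum of the per-sample log-probabilities.
lhs = sum(log_pmf(x, p) for x in samples)

# Right-hand side: constant C plus pooled counts N_k times log p_k.
N_k = [sum(col) for col in zip(*samples)]
C = sum(log_coef(x) for x in samples)
rhs = C + sum(Nk * math.log(pk) for Nk, pk in zip(N_k, p))

print(N_k)                     # [6, 3, 3]
print(math.isclose(lhs, rhs))  # True
```

Only the pooled counts $N_k$ enter the part of the log-likelihood that depends on $\mathbf{p}$, which is why the estimate ends up depending on the data through $N_k$ alone.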

For the MLE of $\mathbf{p}$, assuming $n$ is known, we solve the following optimization problem:
\begin{align} \max_{\mathbf{p}} &\,\, \mathcal{L}(\mathbf{p},n) \\ \text{s.t.} & \,\, \sum_{k=1}^{K} p_k = 1\end{align}
Using the equality constraint to eliminate one variable, $$p_K = 1 - \sum_{k=1}^{K-1} p_k,$$ we obtain an unconstrained problem in $K-1$ variables. Setting the gradient to zero to find the stationary point gives, for $k = 1, \ldots, K-1$,
\begin{align}\frac{\partial\mathcal{L}(\mathbf{p},n)}{\partial p_k} &= \frac{N_k}{p_k} - \frac{N_K}{p_K} = 0 \\ p_k &= \frac{N_k\,p_K}{N_K}\end{align}
Solving together with $\sum_{k=1}^{K} p_k = 1$ gives the MLE of $p_k$: $$\hat p_k = \frac{N_k}{nN}$$
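
The last step can be spelled out as follows. This is a sketch of the omitted algebra, using only the stationarity condition $p_k = N_k\,p_K/N_K$ (which holds trivially for $k=K$ as well) and the fact that every sample contains $n$ draws, so that $\sum_{k=1}^{K} N_k = nN$:
$$1 = \sum_{k=1}^{K} p_k = \frac{p_K}{N_K}\sum_{k=1}^{K} N_k = \frac{p_K\,nN}{N_K} \quad\Rightarrow\quad \hat p_K = \frac{N_K}{nN}, \qquad \hat p_k = \frac{N_k\,\hat p_K}{N_K} = \frac{N_k}{nN}.$$
Applied to the question, assuming each of the 50 scales is treated as one draw (so $n=1$, $N=50$) with pooled counts $(45, 2, 2, 1)$, this gives
$$\hat\pi_{11} = \frac{45}{50} = 0.9, \qquad \hat\pi_{12} = \frac{2}{50} = 0.04, \qquad \hat\pi_{21} = \frac{2}{50} = 0.04, \qquad \hat\pi_{22} = \frac{1}{50} = 0.02.$$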