Why is the gradient of $\log{\det{X}}$ equal to $X^{-1}$, and where did the trace $\operatorname{tr}(\cdot)$ go?

I’m studying Boyd & Vandenberghe’s “Convex Optimization” and ran into a problem on page 642.

According to the definition, the derivative $Df(x)$ gives the first-order approximation
$$f(z) \approx f(x)+Df(x)(z-x).$$

And when $f$ is real-valued (i.e., $f: \mathbb{R}^n \to \mathbb{R}$), the gradient is
$$\nabla{f(x)}=Df(x)^{T}.$$
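As a quick sanity check of this definition, here is a small numerical sketch. The function $f(x) = x^T x$ (whose gradient is $2x$) is a made-up example, not one from the book; the point is only that $f(x)+\nabla f(x)^T(z-x)$ matches $f(z)$ up to second-order error.

```python
import numpy as np

# Hypothetical test function f(x) = x^T x, with gradient 2x.
def f(x):
    return float(x @ x)

def grad_f(x):
    return 2.0 * x

x = np.array([1.0, 2.0, 3.0])
z = x + 1e-5 * np.array([0.3, -0.7, 0.5])   # a nearby point

# First-order approximation: f(x) + Df(x)(z - x) = f(x) + grad_f(x)^T (z - x)
approx = f(x) + grad_f(x) @ (z - x)
assert abs(f(z) - approx) < 1e-9   # the error is O(||z - x||^2)
```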


But when discussing the gradient of the function $f(X)=\log{\det{X}}$, the author writes the first-order approximation as
$$\log\det Z \approx \log\det X + \operatorname{tr}\left(X^{-1}(Z-X)\right),$$
and says “we can identify $X^{-1}$ as the gradient of $f$ at $X$”.


My question is: where did the trace $\operatorname{tr}(\cdot)$ go?

Answer:

First of all, if you write (for a general function $f: U \to \mathbb R$, where $U \subset \mathbb R^n$)

$$f(y) \approx f(x) + Df(x) (y-x),$$

then the term $Df(x) (y-x)$ is really

$$\sum_{i=1}^n D_i f \ (y_i - x_i).$$
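This coordinate-wise sum can be checked numerically. The test function below is a made-up example (not from the book), with the partials $D_i f$ approximated by central differences:

```python
import numpy as np

# Hypothetical test function f(x) = x1*x2 + x3^2.
def f(x):
    return x[0] * x[1] + x[2] ** 2

x = np.array([1.0, 2.0, 3.0])
y = x + np.array([1e-6, 2e-6, -1e-6])   # a nearby point

# Partial derivatives D_i f via central differences
h = 1e-6
partials = np.array([
    (f(x + h * e) - f(x - h * e)) / (2 * h)
    for e in np.eye(3)
])

# sum_i D_i f (y_i - x_i) agrees with f(y) - f(x) to second order
coordinate_sum = np.sum(partials * (y - x))
assert abs((f(y) - f(x)) - coordinate_sum) < 1e-9
```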

Now the function $Z\mapsto \log\det (Z)$ is defined on an open set $S^n_{++}$ (the symmetric positive definite matrices) in $\mathbb R^{n^2}$, so it has $n^2$ coordinates given by $Z_{ij}$, where $i, j = 1, \cdots, n$.

Now take a look at

$$\begin{split}
\text{tr} \left( X^{-1} (Z-X)\right) &= \sum_{i=1}^n \left(X^{-1} (Z-X) \right)_{ii}\\
&= \sum_{i=1}^n \sum_{j=1}^n (X^{-1})_{ij} (Z_{ji}-X_{ji}).
\end{split}$$
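This trace expansion is easy to verify numerically on random symmetric positive definite matrices (the construction below is an illustrative sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

def random_spd(n):
    # A A^T is positive semidefinite; adding n*I makes it positive definite
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

X, Z = random_spd(n), random_spd(n)
Xinv = np.linalg.inv(X)

# tr(X^{-1}(Z - X)) versus the double sum over entries
lhs = np.trace(Xinv @ (Z - X))
rhs = sum(Xinv[i, j] * (Z - X)[j, i] for i in range(n) for j in range(n))
assert abs(lhs - rhs) < 1e-10
```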

Matching coefficients with the coordinate form of $Df(X)(Z-X)$, the partial derivative with respect to the coordinate $Z_{ji}$ is $(X^{-1})_{ij}$. Thus, strictly speaking, we should have identified $(X^{-1})^T$ as the gradient of $\log \det$. The trace did not disappear; it is exactly the coordinate-wise sum written out above. And since $X \in S^n_{++}$ is symmetric, $X^{-1}$ is symmetric as well, so $(X^{-1})^T = X^{-1}$, which is why the book simply writes $X^{-1}$.
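A finite-difference check (again an illustrative sketch, not from the book) confirms both points at once: the entrywise gradient of $\log\det$ is $(X^{-1})^T$, and for symmetric $X$ this coincides with $X^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))
X = A @ A.T + n * np.eye(n)          # symmetric positive definite

def logdet(M):
    return np.linalg.slogdet(M)[1]   # numerically stable log|det M|

# Entrywise partials d/dX_ij log det X via central differences
h = 1e-6
G = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n))
        E[i, j] = h
        G[i, j] = (logdet(X + E) - logdet(X - E)) / (2 * h)

Xinv = np.linalg.inv(X)
assert np.allclose(G, Xinv.T, atol=1e-6)   # gradient entry (i,j) is (X^{-1})_{ji}
assert np.allclose(Xinv, Xinv.T)           # X symmetric, so X^{-1} is symmetric
```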