# Computing matrix-vector calculus derivatives

Let $x, a \in \mathbb R^n$ and $A \in \mathbb R^{n\times n}$. Compute $d(x^T a)/dx$ and $d(x^T A x)/dx$.

I’m not sure how to think about these or how to compute them. Can someone explain how to derive both expressions?

Finally, what happens when $A$ and $X$ are both in $\mathbb R^{n\times n}$ and we want to find $d\,\mathrm{tr}(XA)/dX$?

#### Solutions collected from the web for "Computing matrix-vector calculus derivatives"

Let’s use the convention that members of $\mathbb{R}^{n}$ are column
vectors. Recall that
$$x^{T}a=\sum_{i=1}^{n}x_{i}a_{i}$$
and for a scalar $c$,
$$\frac{dc}{dx}\equiv\left(\begin{array}{c} \frac{dc}{dx_{1}}\\ \frac{dc}{dx_{2}}\\ \vdots\\ \frac{dc}{dx_{n}} \end{array}\right).$$
Therefore,
$$\frac{d\left(x^{T}a\right)}{dx}=\left(\begin{array}{c} a_{1}\\ a_{2}\\ \vdots\\ a_{n} \end{array}\right)=a.$$
Following the same argument component by component gives
$$\frac{d\left(x^{T}Ax\right)}{dx}=\left(A+A^{T}\right)x,$$
which reduces to $2Ax$ when $A$ is symmetric.
See a list of identities at https://en.wikipedia.org/wiki/Matrix_calculus#Scalar-by-matrix_identities.
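
To spell out the quadratic-form step, here is a component-wise sketch in the same column-vector convention (the index $k$ is introduced only for this illustration). Writing
$$x^{T}Ax=\sum_{i=1}^{n}\sum_{j=1}^{n}x_{i}A_{ij}x_{j},$$
the variable $x_{k}$ appears once as the left factor ($i=k$) and once as the right factor ($j=k$), so
$$\frac{d\left(x^{T}Ax\right)}{dx_{k}}=\sum_{j=1}^{n}A_{kj}x_{j}+\sum_{i=1}^{n}x_{i}A_{ik}=\left(Ax\right)_{k}+\left(A^{T}x\right)_{k}.$$
Stacking these entries gives $\left(A+A^{T}\right)x$. The two sums coincide, giving $2\left(Ax\right)_{k}$, only when $A=A^{T}$.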

Note that $x^ta=a^tx$ and so
$$\frac{d(x^ta)}{dx}=\frac{d}{dh}a^t(x+h)|_{h=0}=a^t.$$
Similarly, since $h^tAx=x^tA^th$,
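$$\frac{d(x^tAx)}{dx}=\frac{d}{dh}(x+h)^tA(x+h)|_{h=0}=\frac{d}{dh}(x^tAx+x^tAh+h^tAx+h^tAh)|_{h=0}=x^t(A+A^t).$$
Here the constant term $x^tAx$ drops out, and the second-order term $h^tAh$ has zero derivative at $h=0$.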
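
As a sanity check on the two vector derivatives, the following is a minimal sketch in pure Python that compares central finite differences against $a$ and $(A+A^t)x$. The helper names (`dot`, `matvec`, `numerical_gradient`) and the sample values are illustrative assumptions, not part of the original answers. The gradients are stored as plain lists, so the row-versus-column distinction does not show up here.

```python
# Finite-difference check of d(x^T a)/dx and d(x^T A x)/dx.
# All helper names and sample values are illustrative assumptions.

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(M, v):
    """Matrix-vector product, with M given as a list of rows."""
    return [dot(row, v) for row in M]

def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def numerical_gradient(f, x, eps=1e-6):
    """Approximate each partial derivative of f at x by central differences."""
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += eps
        x_minus[k] -= eps
        grad.append((f(x_plus) - f(x_minus)) / (2 * eps))
    return grad

a = [1.0, -2.0, 0.5]
A = [[1.0, 2.0, 0.0],
     [-1.0, 3.0, 4.0],
     [0.5, 0.0, -2.0]]  # deliberately non-symmetric
x = [0.3, -1.2, 2.0]

# Numerical gradients of the linear form x^T a and the quadratic form x^T A x.
grad_linear = numerical_gradient(lambda v: dot(v, a), x)
grad_quadratic = numerical_gradient(lambda v: dot(v, matvec(A, v)), x)

# Closed forms: (A + A^T) x in general, 2 A x only for symmetric A.
expected_quadratic = [p + q for p, q in zip(matvec(A, x), matvec(transpose(A), x))]
two_Ax = [2 * p for p in matvec(A, x)]

print("linear form:   ", grad_linear, "vs a =", a)
print("quadratic form:", grad_quadratic, "vs (A + A^T) x =", expected_quadratic)
print("2 A x for comparison:", two_Ax)
```

Because the sample $A$ is not symmetric, `two_Ax` differs from the finite-difference result, while `expected_quadratic` agrees with it up to rounding.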
Finally, since
$$\mathrm{tr}(XA)=\sum_{i=1}^n\sum_{j=1}^n X_{ij}A_{ji}$$
we have
$$\frac{\partial\, \mathrm{tr}(XA)}{\partial X_{ij}}=A_{ji},$$
which gives
$$\frac{d\, \mathrm{tr}(XA)}{d X}=A^t.$$
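These results depend on the layout convention. The derivatives of $x^ta$ and $x^tAx$ above are written as row vectors, so they are the transposes of the column-vector gradients $a$ and $(A+A^t)x$. Keep one convention throughout when using them in further computations.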
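
The trace formula can be checked entry by entry in the same spirit. The sketch below, again in pure Python with illustrative helper names (`matmul`, `trace`, `trace_derivative`) and made-up sample matrices, perturbs each $X_{ij}$ and compares the resulting finite difference with $A_{ji}$. It assumes the convention that the $(i,j)$ entry of the derivative is the derivative with respect to $X_{ij}$.

```python
# Finite-difference check that d tr(XA) / dX_ij = A_ji, i.e. the derivative is A^T.
# Helper names and sample matrices are illustrative assumptions.

def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def trace_derivative(X, A, eps=1e-6):
    """Matrix whose (i, j) entry approximates d tr(XA) / dX_ij."""
    n = len(X)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            X_plus = [row[:] for row in X]
            X_minus = [row[:] for row in X]
            X_plus[i][j] += eps
            X_minus[i][j] -= eps
            D[i][j] = (trace(matmul(X_plus, A)) - trace(matmul(X_minus, A))) / (2 * eps)
    return D

A = [[1.0, 2.0],
     [3.0, 4.0]]
X = [[0.5, -1.0],
     [2.0, 0.25]]

D = trace_derivative(X, A)
A_transpose = [list(col) for col in zip(*A)]

print("finite differences:", D)
print("A transposed:      ", A_transpose)
```

Since $\mathrm{tr}(XA)$ is linear in $X$, the result does not depend on the particular `X` chosen, which matches the formula containing no $X$.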