Computing matrix-vector calculus derivatives

Let $x, a \in \mathbb R^n$ and $A \in \mathbb R^{n\times n}$. Compute $d(x^T a)/dx$ and $d(x^T A x)/dx$.

I’m not sure how to think about these derivatives or how to compute them. Can someone explain how to derive the two expressions?

Finally, what happens when $A$ and $X$ are both in $\mathbb R^{n\times n}$ and we want to find $d\operatorname{tr}(XA)/dX$?


Let’s use the convention that members of $\mathbb{R}^{n}$ are column
vectors. Recall that
$$
x^{T}a=\sum_{i=1}^{n}x_{i}a_{i}
$$
and for a scalar $c$,
$$
\frac{dc}{dx}\equiv\left(\begin{array}{c}
\frac{dc}{dx_{1}}\\
\frac{dc}{dx_{2}}\\
\vdots\\
\frac{dc}{dx_{n}}
\end{array}\right).
$$
Therefore,
$$
\frac{d\left(x^{T}a\right)}{dx}=\left(\begin{array}{c}
a_{1}\\
a_{2}\\
\vdots\\
a_{n}
\end{array}\right)=a.
$$
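Applying the same component-wise argument to $x^{T}Ax=\sum_{i,j}x_{i}A_{ij}x_{j}$ gives
$$
\frac{d\left(x^{T}Ax\right)}{dx}=\left(A+A^{T}\right)x,
$$
which reduces to $2Ax$ only when $A$ is symmetric.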
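For readers who want the component-wise step written out, here is a short sketch. The index $k$ (the coordinate being differentiated) is a name introduced here for illustration. Each $x_k$ appears once as the left factor and once as the right factor of the double sum, so the product rule gives two terms:
$$
x^{T}Ax=\sum_{i=1}^{n}\sum_{j=1}^{n}x_{i}A_{ij}x_{j},
\qquad
\frac{\partial\left(x^{T}Ax\right)}{\partial x_{k}}
=\sum_{j=1}^{n}A_{kj}x_{j}+\sum_{i=1}^{n}x_{i}A_{ik}
=\left(Ax\right)_{k}+\left(A^{T}x\right)_{k}.
$$
Stacking these entries for $k=1,\dots,n$ into a column gives $\left(A+A^{T}\right)x$, which equals $2Ax$ exactly when $A=A^{T}$.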
See a list of identities at https://en.wikipedia.org/wiki/Matrix_calculus#Scalar-by-matrix_identities.

Note that $x^ta=a^tx$ and so
$$
\frac{d(x^ta)}{dx}=\frac{d}{dh}a^t(x+h)|_{h=0}=a^t.
$$
Similarly, since $h^tAx=x^tA^th$, the part of $(x+h)^tA(x+h)$ that is linear in $h$ is $x^tAh+h^tAx=x^t(A+A^t)h$, so
$$
\frac{d(x^tAx)}{dx}=\frac{d}{dh}(x+h)^tA(x+h)|_{h=0}=\frac{d}{dh}(x^tAh+h^tAx)|_{h=0}=x^t(A+A^t).
$$
Finally, since
$$
\operatorname{tr}(XA)=\sum_{i=1}^n\sum_{j=1}^n X_{ij}A_{ji}
$$
we have
$$
\frac{\partial \operatorname{tr}(XA)}{\partial X_{ij}}=A_{ji},
$$
which gives
$$
\frac{d \operatorname{tr}(XA)}{d X}=A^t.
$$
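All of this depends on the layout convention. Here the derivative of a scalar with respect to a column vector is written as a row vector, so $a^t$ and $x^t(A+A^t)$ are the transposes of the column-vector gradients $a$ and $(A+A^t)x$. Keep one convention throughout when using these results in other computations!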
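One way to sanity-check identities like these is to compare them against finite differences. The following is a minimal sketch in Python with NumPy, not part of either answer. The helper `numerical_gradient`, the dimension `n = 4`, and the random test data are all assumptions made for illustration. The gradients are compared in the "same shape as the variable" layout, i.e. column-vector gradients for $x$ and an $n\times n$ array for $X$.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference gradient of a scalar function f at x.

    x may be a vector or a matrix; the result has the same shape as x.
    (Hypothetical helper written for this check.)
    """
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        step = np.zeros_like(x)
        step[idx] = eps  # perturb one entry at a time
        grad[idx] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

# Arbitrary test data; A is deliberately NOT symmetric.
rng = np.random.default_rng(0)
n = 4
x = rng.standard_normal(n)
a = rng.standard_normal(n)
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))

grad_linear = numerical_gradient(lambda v: v @ a, x)             # d(x^T a)/dx
grad_quadratic = numerical_gradient(lambda v: v @ A @ v, x)      # d(x^T A x)/dx
grad_trace = numerical_gradient(lambda M: np.trace(M @ A), X)    # d tr(XA)/dX

print(np.allclose(grad_linear, a, atol=1e-5))
print(np.allclose(grad_quadratic, (A + A.T) @ x, atol=1e-5))
print(np.allclose(grad_trace, A.T, atol=1e-5))
```

Replacing `(A + A.T) @ x` with `2 * A @ x` in the second check is a quick way to see that the shorter formula only holds for symmetric `A`.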