If $X$ is an orthogonal matrix, why does $X^TX = I$?

It’s not immediately clear to me why this is true. My notes say that putting $n$ orthonormal vectors $v_1, \ldots, v_n$ in the columns of $X$ gives $X^TX = I$, and it follows from this that the rows of $X$ are orthonormal.

Can you explain this better to me? I tried the $2\times 2$ case, but it isn’t clear to me why this is true.


A more visual, though not rigorous, approach.

Let $X=[v_1|\cdots|v_n]$ be an orthogonal matrix, where $v_1,\ldots ,v_n$ are $n$-dimensional column vectors.

Then

$$X^TX=\begin{bmatrix}v_1^T\\\vdots\\v_n^T\end{bmatrix}[v_1|\cdots|v_n]=\begin{bmatrix}v_1\cdot v_1 & v_1\cdot v_2 &\cdots &v_1 \cdot v_n\\
v_2\cdot v_1 &v_2\cdot v_2 &\cdots &v_2\cdot v_n\\
\vdots &\vdots &\ddots &\vdots\\
v_n\cdot v_1& v_n\cdot v_2 &\cdots &v_n\cdot v_n\end{bmatrix}=I_n$$

The last equality holds because the columns are orthonormal: $v_i\cdot v_i=1$ for every $i$, and $v_i\cdot v_j=0$ whenever $i\ne j$.
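
Since the question mentions the $2\times 2$ case, here is one worked instance as an illustration. The rotation matrix and the angle $\theta$ are assumptions chosen for the example, not something taken from the notes. Its columns $v_1=(\cos\theta,\sin\theta)^T$ and $v_2=(-\sin\theta,\cos\theta)^T$ are orthonormal for every $\theta$, and the product works out entry by entry:

$$X=\begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix},\qquad X^TX=\begin{bmatrix}\cos\theta & \sin\theta\\ -\sin\theta & \cos\theta\end{bmatrix}\begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix}=\begin{bmatrix}\cos^2\theta+\sin^2\theta & -\cos\theta\sin\theta+\sin\theta\cos\theta\\ -\sin\theta\cos\theta+\cos\theta\sin\theta & \sin^2\theta+\cos^2\theta\end{bmatrix}=\begin{bmatrix}1&0\\0&1\end{bmatrix}$$

The diagonal entries are $v_1\cdot v_1$ and $v_2\cdot v_2$, and the off-diagonal entries are $v_1\cdot v_2=v_2\cdot v_1$.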

Multiplying $\,X^tX\,$, for a square matrix $\,X\,$, is the same as taking the inner products of the columns of $\,X\,$ (which are the rows of $\,X^t\,$) with one another, when we look at them as $\,n$-dimensional vectors:

$$X=(a_{ij})\implies X^t=(b_{ij})\;,\;\;b_{ij}=a_{ji}\;,\;\;1\le i,j\le n\implies$$

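$$X^tX=\left(\sum_{k=1}^nb_{ik}a_{kj}\right)=\left(\sum_{k=1}^na_{ki}a_{kj}\right)=\left(\delta_{ij}\right)=I$$

Here the sum $\sum_{k=1}^na_{ki}a_{kj}$ is the inner product of the $i$-th and $j$-th columns of $X$, which equals $\delta_{ij}$ because the columns are orthonormal.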

What is the $(i,j)$ entry of the product $A \cdot B$? Answer: it is the canonical scalar product between the $i$-th row of $A$ and the $j$-th column of $B$. That should answer your question.

Consider the matrix $Y = X^TX$. For every $i,j$ between $1$ and $n$, the $(i,j)$-th entry $y_{ij}$ of $Y$ is simply the dot product (scalar product) of the $i$-th column and the $j$-th column of $X$. (The $i$-th row of $X^T$ is the $i$-th column of $X$.) Evaluate this when $i\ne j$, and when $i=j$.
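
The entry-by-entry description above can also be checked numerically. The following is only a minimal sketch in plain Python. The helper names `column`, `dot` and `gram`, the $2\times 2$ rotation matrix, and the angle `theta` are all assumptions introduced for illustration. It builds $Y = X^TX$ by taking the dot product of the $i$-th and $j$-th columns of $X$, then compares the result with the identity up to floating-point tolerance.

```python
import math

def column(X, j):
    """Return the j-th column of X, where X is stored as a list of rows."""
    return [row[j] for row in X]

def dot(u, v):
    """Standard dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def gram(X):
    """Build Y = X^T X entry by entry: y_ij = (column i of X) . (column j of X)."""
    n = len(X[0])
    return [[dot(column(X, i), column(X, j)) for j in range(n)] for i in range(n)]

# Hypothetical example: a 2x2 rotation matrix, whose columns are orthonormal.
theta = 0.7  # arbitrary angle, chosen only for illustration
X = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

Y = gram(X)

# Compare Y with the identity: 1 on the diagonal (i == j), 0 elsewhere (i != j).
is_identity = all(
    math.isclose(Y[i][j], 1.0 if i == j else 0.0, abs_tol=1e-12)
    for i in range(2) for j in range(2)
)
print(is_identity)
```

Replacing `X` with a matrix whose columns are not orthonormal, such as `[[1, 1], [0, 1]]`, makes the same check fail, because the off-diagonal dot product is then nonzero.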