How do you formally prove that rotation is a linear transformation?

The fact that rotation through an angle is a linear transformation is both important (for example, it is used to prove the sine/cosine angle addition formulas; see How can I understand and prove the "sum and difference formulas" in trigonometry?) and somewhat intuitive geometrically. However, even if this fact seems fairly obvious (at least from a diagram), how does one turn the picture proof into a formal proof? On a related note, it seems likely that many formal proofs using a diagram will end up relying on Euclidean geometry (using angle/side congruence properties), but isn't one of the points of linear algebra to avoid using Euclidean geometry explicitly?


Maybe my first answer was not formal enough, since I didn't say explicitly what I was assuming a "rotation" is, nor an "angle". (Exercise: (1) find the places where I was using those unstated definitions of "rotation" and "angle", and (2) work out what those definitions are. 🙂 )

Well, it was a sort of "high-school proof". The problem with the definition of "rotation", as some answers have already pointed out or implied, is that, if you really want to be formal, you end up saying that a rotation is some kind of linear transformation by definition. Full stop.

Nevertheless, I think I can produce an "undergraduate" proof of this fact, avoiding circular arguments. It has the merit of being valid in any dimension. For this I need the following

Convention. Whatever “rotation” means,

  1. It is a map from some vector space $V$ to itself,
  2. Which has a way of measuring “lengths” and “angles” of its vectors, and
  3. “Rotations” preserve those “lengths” and “angles”.

Now, a fancy way to have "lengths" and "angles" in a vector space is to have a dot product on it. So let's assume that $V$ is a Euclidean vector space; for instance, a real vector space such as $\mathbb{R}^n$ with the standard dot product.

Then, in such a $V$, “length” means “norm”, which is $\| v \| = +\sqrt{v\cdot v}$.

We have to be more careful with “angles” because the standard definition already involves rotations. To avoid a circular argument, we define the (non-oriented) angle determined by two vectors $v, w$ as the unique real number $\theta = \widehat{vw} \in [0, \pi ]$ such that

$$
\cos \theta = \frac{v\cdot w}{\|v\| \|w\|} \ .
$$

Notice that this definition makes sense since, because of the Cauchy-Schwarz inequality, we always have $-1 \leq \frac{v\cdot w}{\|v\| \|w\|} \leq 1$ and $\cos : [0, \pi] \longrightarrow [-1,1]$ is bijective.
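
As a quick numerical sanity check (not part of the proof), here is a minimal Python sketch of this definition of angle; the `clip` call guards against floating-point drift just past the Cauchy-Schwarz bound. The function name `angle` is of course ad hoc.

```python
import numpy as np

def angle(v, w):
    """Non-oriented angle in [0, pi], defined via the dot product."""
    c = np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))
    # Cauchy-Schwarz guarantees |c| <= 1; clip guards against round-off.
    return np.arccos(np.clip(c, -1.0, 1.0))

print(angle(np.array([1.0, 0.0]), np.array([0.0, 2.0])))  # pi/2
print(angle(np.array([1.0, 1.0]), np.array([1.0, 0.0])))  # pi/4
```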

So, “rotation” is some kind of map $f: V \longrightarrow V$ which preserves norms and angles. Since the dot product can be expressed in terms of norms and (cosines of) angles,

$$
v\cdot w = \|v\| \|w\| \cos\widehat{vw}
$$

rotations preserve dot products:

$$
f(v) \cdot f(w) = v\cdot w \ .
$$
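
For a concrete instance, here is a small sketch, assuming the familiar $2\times 2$ rotation matrix (which later answers derive independently), that checks numerically that such a map preserves norms and dot products:

```python
import numpy as np

theta = 0.7  # an arbitrary angle
# Assumed form of a plane rotation; derived later in this page.
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v, w = np.array([3.0, -1.0]), np.array([0.5, 2.0])

# f(v) . f(w) == v . w  and  ||f(v)|| == ||v||
assert np.isclose((R @ v) @ (R @ w), v @ w)
assert np.isclose(np.linalg.norm(R @ v), np.linalg.norm(v))
```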

Now, let’s show that a map that preserves the dot product is necessarily linear:

$$
\begin{align}
\| f(v+w) - f(v) - f(w) \|^2 &= \left( f(v+w) - f(v) - f(w) \right) \cdot \left( f(v+w) - f(v) - f(w) \right) \\
&= \left( f(v+w)\cdot f(v+w) - f(v+w) \cdot f(v) - f(v+w)\cdot f(w) \right) \\
&{} \qquad + \left( -f(v)\cdot f(v+w) + f(v) \cdot f(v) + f(v)\cdot f(w) \right) \\
&{} \qquad + \left( -f(w)\cdot f(v+w) + f(w)\cdot f(v) + f(w)\cdot f(w) \right) \\
&= \left( (v+w)\cdot (v+w) - (v+w) \cdot v - (v+w)\cdot w \right) \\
&{} \qquad + \left( -v\cdot (v+w) + v \cdot v + v\cdot w \right) \\
&{} \qquad + \left( -w\cdot (v+w) + w\cdot v + w\cdot w \right) \\
&= \| v+w - v - w \|^2 = 0 \ .
\end{align}
$$

So

$$
f(v+w) = f(v) + f(w) \ .
$$

A similar computation shows that

$$
\|f(\lambda v ) - \lambda f(v) \|^2 = \|\lambda v - \lambda v\|^2 = 0 \ .
$$

Thus

$$
f(\lambda v) = \lambda f(v) \ .
$$

Hence, our rotation $f$ is a linear transformation.
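
A minimal numerical check of both conclusions, again taking the standard $2\times 2$ rotation matrix as a stand-in for $f$:

```python
import numpy as np

theta, lam = 1.1, -2.5
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
f = lambda u: R @ u  # stand-in for the dot-product-preserving map

v, w = np.array([1.0, 2.0]), np.array([-3.0, 0.5])

# Additivity: f(v + w) == f(v) + f(w)
assert np.allclose(f(v + w), f(v) + f(w))
# Homogeneity: f(lam * v) == lam * f(v)
assert np.allclose(f(lam * v), lam * f(v))
```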

The problem comes in defining rotation so that this whole thing doesn't wind up circular; doing so fully would undoubtedly tie us all in knots. Still, one can prove the linearity of rotations by observing a few choice properties:

  1. Rotations preserve angles
  2. They fix the origin
  3. And they preserve lengths

Now just draw a parallelogram $OACB$ (spanned by vectors $\vec{OA}$ and $\vec{OB}$, with $\vec{OC}=\vec{OA}+\vec{OB}$), rotate, and chase the angles and lengths to observe that $R(\vec{OB})+R(\vec{OA})= R(\vec{OC})= R(\vec{OB}+ \vec{OA})$.

It's sort of Euclidean-ish, but it's the only decent way to play it (especially if we want higher dimensions) without the 'lucky guess' approach championed by Qiaochu.

Addendum:

If you feel pushed, you can turn the angle and length chase into norms and inner products, but that would be a tragic waste of an isomorphism: here we are dealing with a geometric object in the language of vectors, and to treat it non-geometrically would put us at an unnecessary disadvantage.

A good deal of higher mathematics depends on finding 'the best place' to solve a given problem, but the best place need not be the most abstract one: translating something about PDEs into Hilbert space theory may help find a solution, but it may also rob us of our geometric techniques and intuition.

A great example is the h-cobordism theorem, which can (and does; Milnor's take is a gem) take up a small book when proved with formal techniques (Morse theory), but can be done in half the time with some geometric (handlebody) jazz.

Vector spaces are another string to our bow, a new place to do math, and at times a more rigorous place, but they are not (at least from my perspective) the only place.

This depends on your definitions of "rotation" and "angle", but if you are willing to accept the sum and difference formulas from your link How can I understand and prove the "sum and difference formulas" in trigonometry?, then you may follow the reasoning in Qiaochu Yuan's answer there the other way around: given a vector $(x,y) \in \mathbb{R}^2$, let $\alpha$ be the angle determined by $(x,y)$ and the $x$-axis, and let $r= \sqrt{x^2 + y^2}$ be its length. Then, of course,

\begin{align*}
x &= r \cos \alpha \\
y &= r \sin \alpha
\end{align*}

If you rotate $(x,y)$ by an angle $\theta$, you'll obtain the vector $(x',y')$ with

\begin{align*}
x' &= r \cos (\alpha + \theta ) \\
y' &= r \sin (\alpha + \theta ) \ .
\end{align*}

Now you apply those sum and difference formulas and get

\begin{align*}
x' &= r \cos\alpha\cos\theta - r \sin\alpha\sin\theta = \cos\theta \cdot x - \sin\theta \cdot y \\
y' &= r \sin\alpha\cos\theta + r \cos\alpha\sin\theta = \sin\theta \cdot x + \cos\theta \cdot y \ .
\end{align*}

This is the same as saying that $(x',y')$ is obtained from $(x,y)$ by multiplying by the matrix

$$
\begin{pmatrix}
\cos\theta & -\sin\theta \\
\sin\theta & \cos\theta
\end{pmatrix}
$$

Since the rotation $(x,y) \mapsto (x',y')$ is the same as multiplication by a matrix, it is a linear transformation.
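
A short sketch comparing the two descriptions, rotating by adding $\theta$ to the polar angle versus multiplying by the matrix above (the function names are ad hoc):

```python
import numpy as np

def rotate_polar(x, y, theta):
    """Rotate by adding theta to the polar angle (the r, alpha description)."""
    r, alpha = np.hypot(x, y), np.arctan2(y, x)
    return r * np.cos(alpha + theta), r * np.sin(alpha + theta)

def rotate_matrix(x, y, theta):
    """Rotate by multiplying with the 2x2 rotation matrix."""
    c, s = np.cos(theta), np.sin(theta)
    return c * x - s * y, s * x + c * y

x, y, theta = 3.0, 4.0, 0.9
assert np.allclose(rotate_polar(x, y, theta), rotate_matrix(x, y, theta))
```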

As an exercise in trigonometry.

[Figure: the point $P$ shown in the original co-ordinate axes and in axes rotated through the angle $\phi$]

The transformation of rectangular co-ordinates $(x,y)$ of $P$ into the
rectangular co-ordinates $(x',y')$ of $P$ by a rotation of the axes is given by:

$x=x^{\prime }\cos \phi -y^{\prime }\sin \phi $

$y=x^{\prime }\sin \phi +y^{\prime }\cos \phi $.

This transformation is invertible.
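
A quick numerical check of invertibility: the matrix sending $(x', y')$ to $(x, y)$ has determinant $1$, and its inverse is the rotation of the axes back through $-\phi$, i.e. its transpose.

```python
import numpy as np

phi = 0.6
# Matrix sending new co-ordinates (x', y') to old co-ordinates (x, y).
M = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])

# Inverse is rotation through -phi, which equals the transpose.
assert np.allclose(np.linalg.inv(M), M.T)
assert np.isclose(np.linalg.det(M), 1.0)
```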

Edit: This way of proving the transformation of rectangular co-ordinates does not "avoid using Euclidean geometry explicitly" (in this case trigonometry), as written in the question.

However, the same method can be used to derive the transformation of rectangular into spherical co-ordinates, which is a non-linear transformation.

If you want to be formal and avoid Euclidean geometry, it seems to me that the right way to go is the other way around: first define certain linear transformations (the ones which preserve the dot product and have determinant $1$) and then prove that they have the properties you expect them to have. More precisely, the group $\text{SO}(2)$ is abelian with Lie algebra $\mathbb{R}$, and the parameterization $\theta \mapsto \left[ \begin{array}{cc} \cos \theta & \sin \theta \\ -\sin \theta & \cos \theta \end{array} \right]$ is the exponential map from the Lie algebra to the group. The methods for doing this should be covered in any book on Lie theory.
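
Here is a sketch of that claim using SciPy's matrix exponential: exponentiating $\theta$ times a generator of $\mathfrak{so}(2)$ reproduces the parameterization above.

```python
import numpy as np
from scipy.linalg import expm

theta = 0.8
# A generator of so(2): a skew-symmetric matrix with A @ A == -I.
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

# exp(theta * A) should equal the parameterization in the answer above.
R = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
assert np.allclose(expm(theta * A), R)
```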

First, how do you define rotation non-geometrically?

Perhaps in the 2D case, you can define it as multiplication by a complex number of modulus $1$ (which is in fact true!)

Once you do that, use the definition of a linear transformation.

Show that the map $T$ satisfies:

1) $T(a + b) = T(a) + T(b)$ for any vectors $a, b$ (or complex numbers, if you chose to go that route)
2) $T(\alpha b) = \alpha T(b)$ for any scalar $\alpha$ and any vector $b$.

Note that you can define complex numbers as vectors in the abstract, without having to rely on any Euclidean axioms.
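
A minimal sketch of this complex-number route, taking rotation by $\theta$ to be multiplication by the unit complex number $e^{i\theta}$ and checking the two conditions (with a real scalar) plus length preservation:

```python
import cmath

theta = 0.4
T = lambda z: cmath.exp(1j * theta) * z  # rotation as complex multiplication

a, b, alpha = 2 + 1j, -1 + 3j, 2.5  # alpha is a real scalar

assert abs(T(a + b) - (T(a) + T(b))) < 1e-12    # additivity
assert abs(T(alpha * b) - alpha * T(b)) < 1e-12  # real homogeneity
assert abs(abs(T(a)) - abs(a)) < 1e-12           # lengths preserved
```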

You can prove this by looking at the appropriate diagram and using trig.

However, what is interesting is that it is not too difficult to show that any isometry of $\mathbb{R}^n$ (with its usual inner product) that fixes the origin is necessarily linear. So if you accept that rotations are origin-fixing isometries (which to me at least is much more self-evident than them being linear) then you get linearity immediately.
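
To see why the origin-fixing hypothesis matters, here is a small sketch contrasting a rotation (an isometry fixing the origin, hence linear) with a translation (an isometry that is not linear):

```python
import numpy as np

theta = 0.5
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
rotate = lambda u: R @ u
translate = lambda u: u + np.array([1.0, 0.0])

v, w = np.array([1.0, 1.0]), np.array([2.0, -1.0])

# Both maps are isometries: they preserve distances between points.
for f in (rotate, translate):
    assert np.isclose(np.linalg.norm(f(v) - f(w)), np.linalg.norm(v - w))

# But only the origin-fixing one is additive.
assert np.allclose(rotate(v + w), rotate(v) + rotate(w))
assert not np.allclose(translate(v + w), translate(v) + translate(w))
```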