What does it mean to represent a number in terms of a $2\times2$ matrix?

Today my friend showed me that the imaginary number can be represented in terms of a matrix

$$i = \pmatrix{0&-1\\1&0}$$

This was very confusing for me because I have never thought of it as a matrix. But it is apparent that the properties of the imaginary number hold even in this representation, namely $i\cdot i = -1$

Even more confusing is that a bunch of quantities can be represented by matrices

$$e^{i\theta} = \pmatrix{\cos\theta&-\sin\theta\\
\sin\theta&\cos\theta}$$

Naturally I wonder if we can perform this for any number. What is the big picture here? What is the operation of turning a number into a matrix called? What is the deeper implication – how does knowing this help?

100 points to anyone who can answer this in a comprehensive way.


Your friend meant that all complex numbers can be represented by such matrices.

$$a+bi = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}$$

Adding complex numbers matches adding such matrices and multiplying complex numbers matches multiplying such matrices.
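
As a quick sanity check of this claim, here is a minimal sketch in plain Python. The helper names `to_matrix`, `mat_add` and `mat_mul` are made up for this illustration, and the two sample numbers are arbitrary.

```python
def to_matrix(z):
    """Encode the complex number z = a + bi as ((a, -b), (b, a))."""
    a, b = z.real, z.imag
    return ((a, -b), (b, a))

def mat_add(m, n):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(m[i][j] + n[i][j] for j in range(2)) for i in range(2))

def mat_mul(m, n):
    """Ordinary matrix product of two 2x2 matrices."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

# Two arbitrary sample values (an assumption of this example).
z, w = complex(1, 2), complex(3, -1)

# Encoding the sum/product should agree with adding/multiplying the encodings.
sum_matches = mat_add(to_matrix(z), to_matrix(w)) == to_matrix(z + w)
product_matches = mat_mul(to_matrix(z), to_matrix(w)) == to_matrix(z * w)
print(sum_matches, product_matches)
```

The sample values have small integer parts, so the floating-point comparison is exact here; for general inputs a tolerance would be safer.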

This means that the collection of matrices:

$$R = \left\{ \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \;\Bigg|\; a,b \in \mathbb{R} \right\}$$

is “isomorphic” to the field of complex numbers.

Specifically,

$$i = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

Notice that for this matrix $i^2=-I_2=-1$. 🙂

How does this help?

It allows you to construct the complex numbers from matrices over the reals. This allows you to get at some properties of the complex numbers via linear algebra.

For example: The squared modulus of a complex number is $|a+bi|^2=a^2+b^2$. This is the same as the determinant of such a matrix. Now since the determinant of a product is the product of the determinants, you get that $|z_1z_2|^2=|z_1|^2\cdot |z_2|^2$, and hence $|z_1z_2|=|z_1|\cdot |z_2|$, for any two complex numbers $z_1$ and $z_2$.

Another nice tie, transposing matches conjugation. 🙂
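
Both ties can be checked numerically. The sketch below is only an illustration: `to_matrix`, `det` and `transpose` are helper names invented here, and the sample numbers are arbitrary. It compares the determinant with the squared modulus $a^2+b^2$ and the transpose with the conjugate.

```python
def to_matrix(z):
    """Encode z = a + bi as ((a, -b), (b, a))."""
    a, b = z.real, z.imag
    return ((a, -b), (b, a))

def det(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def transpose(m):
    """Transpose of a 2x2 matrix."""
    return ((m[0][0], m[1][0]), (m[0][1], m[1][1]))

z, w = complex(3, 4), complex(1, -2)  # arbitrary sample values

# Determinant versus squared modulus a^2 + b^2.
det_is_squared_modulus = det(to_matrix(z)) == z.real ** 2 + z.imag ** 2

# Multiplicativity of the determinant, read on the complex side.
det_is_multiplicative = det(to_matrix(z * w)) == det(to_matrix(z)) * det(to_matrix(w))

# Transposing the matrix corresponds to conjugating the number.
transpose_is_conjugate = transpose(to_matrix(z)) == to_matrix(z.conjugate())

print(det_is_squared_modulus, det_is_multiplicative, transpose_is_conjugate)
```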

Edit: As per request, a little about Euler’s formula.

The exponential function can be defined in a number of ways. One nice way is via its Maclaurin series: $e^x = 1+x+\frac{x^2}{2!}+\cdots$. If you start thinking of $x$ as some sort of indeterminate, you might start to ask, “What can I plug into this series?” It turns out that the series:
$$e^A = I+A+\frac{A^2}{2!}+\frac{A^3}{3!}+\cdots$$
converges for any square matrix $A$ (you have to make sense out of “a convergent series of matrices”).

Consider a “real” number, $x$, encoded as one of our matrices:
$$x=\begin{pmatrix} x & 0 \\ 0 & x \end{pmatrix} \quad \mbox{then} \quad e^x = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \begin{pmatrix} x & 0 \\ 0 & x \end{pmatrix} + \begin{pmatrix} x^2/2 & 0 \\ 0 & x^2/2 \end{pmatrix} + \cdots$$
$$= \begin{pmatrix} 1+x+x^2/2+\cdots & 0 \\ 0 & 1+x+x^2/2+\cdots \end{pmatrix}
= \begin{pmatrix} e^x & 0 \\ 0 & e^x \end{pmatrix} = e^x$$

So (no surprise) the matrix exponential and the good old real exponential do the same thing.

Now one can ask, “What does the exponential of a complex number get you?” It turns out that…
$$\mbox{Given } a+bi = \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \quad \mbox{then} \quad e^{a+bi} = \begin{pmatrix} e^a\cos(b) & -e^a\sin(b) \\ e^a\sin(b) & e^a\cos(b) \end{pmatrix}$$
…this involves some (?intermediate?) linear algebra.
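
Short of doing that linear algebra, the formula can at least be checked numerically by summing the series directly. This is a rough sketch, not a robust matrix exponential: `mat_exp` is a name made up here, the cutoff of 40 terms is an arbitrary choice that is ample for small entries, and the values of `a` and `b` are arbitrary.

```python
import math

def mat_mul(m, n):
    """Ordinary matrix product of two 2x2 matrices (lists of lists)."""
    return [
        [sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]

def mat_exp(m, terms=40):
    """Truncated series I + m + m^2/2! + ... using `terms` terms."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]  # holds m^n / n!
    for n in range(1, terms):
        term = mat_mul(term, m)
        term = [[entry / n for entry in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

a, b = 0.5, 2.0  # arbitrary sample values
approx = mat_exp([[a, -b], [b, a]])
expected = [
    [math.exp(a) * math.cos(b), -math.exp(a) * math.sin(b)],
    [math.exp(a) * math.sin(b), math.exp(a) * math.cos(b)],
]
max_error = max(abs(approx[i][j] - expected[i][j]) for i in range(2) for j in range(2))
print(max_error)
```

The printed error should be at the level of floating-point rounding.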

Anyway accepting that, we have found that $e^{a+bi} = e^a(\cos(b)+i\sin(b))$. In particular,
$$e^{i\theta} = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}$$
So that
$$e^{i\pi} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = -1$$

We can see this way that complex exponentiation (with pure imaginary exponent) yields a rotation matrix, which leads us down a path toward identifying complex arithmetic with 2-dimensional geometric transformations.

Of course, there are many other ways to arrive at these various relationships. The matrix route is not the fastest/easiest route but it is an interesting one to contemplate.

I hope that helps a little bit. 🙂

The point is that the operator $\phi: \mathbb{C} \to {\cal R}$,
where ${\cal R} = \{ \pmatrix{a&-b\\b&a}\}_{a,b \in \mathbb{R}}$, defined by
$\phi(a+ib) = \pmatrix{a&-b\\b&a}$, satisfies a few conditions, namely
$\phi$ is a bijection,
$\phi(1) = I$, $\phi(z_1 z_2) = \phi(z_1) \phi(z_2)$, where multiplication is complex number multiplication on the left and matrix multiplication on the right,
and similarly $\phi(z_1 + z_2) = \phi(z_1) + \phi(z_2)$ (where $+$ means the
corresponding operation in $\mathbb{C}$ and ${\cal R}$).

Plus you have $|z| = \|\phi(z)\|$ (the induced Euclidean norm), so
if $|z|=1$, we can write $z=e^{i\theta} = \cos \theta + i \sin \theta$ and
so $\phi(e^{i\theta}) = \pmatrix{\cos \theta &- \sin \theta \\ \sin \theta & \cos \theta}$. Also, $\phi(\bar{z}) = \phi(z)^T$.

One can write $\phi(z) = \phi(\operatorname{re} z) + \phi(\operatorname{im} z)\phi(i) = (\operatorname{re} z) I + (\operatorname{im} z) J$,
where $I$ is the identity matrix and $\phi(i) = J= \pmatrix{0&-1\\1&0}$ (a little unfortunate that $i$ maps to $J$). Of course,
${\cal R} = \operatorname{sp} \{ I, J\}$ and $J^2 = -I$.

Since $\phi$ is an isometry, we see that if $f$ has a power series representation $f(z) = \sum_k f_k z^k$ for $z\in B(z_0,R)$, then
we have $\phi(f(z)) = \phi(\sum_k f_k z^k) = \sum_k \phi(f_k) \phi(z)^k$.
Representations for $\exp, \sin, \cos$, etc. follow from this.
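
To make this concrete, the sketch below evaluates the sine series on $\phi(z)$ and compares it with $\phi(\sin z)$. It is an illustration under assumptions: the helpers `phi` and `apply_series` are invented here, only real coefficients $f_k$ are handled (so $\phi(f_k)$ acts as a scalar), the series is cut off after 30 terms, and the sample point is arbitrary.

```python
import cmath
import math

def phi(z):
    """The map phi(a + ib) = [[a, -b], [b, a]]."""
    return [[z.real, -z.imag], [z.imag, z.real]]

def mat_mul(m, n):
    """Ordinary matrix product of two 2x2 matrices."""
    return [
        [sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]

def apply_series(coeffs, m):
    """Evaluate sum_k coeffs[k] * m^k for real coefficients."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]  # m^0
    for c in coeffs:
        result = [[result[i][j] + c * power[i][j] for j in range(2)] for i in range(2)]
        power = mat_mul(power, m)
    return result

# Maclaurin coefficients of sin: 0 for even k, (-1)^((k-1)/2) / k! for odd k.
sin_coeffs = [
    0.0 if k % 2 == 0 else (-1) ** (k // 2) / math.factorial(k)
    for k in range(30)
]

z = complex(0.7, 0.3)  # arbitrary sample point
lhs = apply_series(sin_coeffs, phi(z))   # series applied to the matrix
rhs = phi(cmath.sin(z))                  # matrix of the complex sine
max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_error)
```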

The operation $\phi$ is called a ring isomorphism.

It identifies complex multiplication with scalings and rotations in $\mathbb{R}^2$ which provides some geometric insight.

Let
$$
I=\begin{bmatrix}1&0\\0&1\end{bmatrix}\tag{1}
$$
and
$$
J=\begin{bmatrix}0&-1\\1&0\end{bmatrix}\tag{2}
$$

Since $I$ is the identity matrix, whether multiplying on the left or on the right, it leaves all matrices untouched. Thus, $I$ has the properties of $1$.

The key fact here is that $J^2=-I$. If $I$ represents $1$ then $J$ would represent $i$.

As alluded to earlier, $I$ commutes with all matrices; in particular $J$. That is, $IJ=J=JI$.

Scalar and matrix multiplication distribute over matrix addition. Therefore,
$$
(xI+yJ)+(uI+vJ)=(x+u)I+(y+v)J\tag{3}
$$
and
$$
(xI+yJ)(uI+vJ)=(xu-yv)I+(xv+yu)J\tag{4}
$$
With $I$ representing $1$ and $J$ representing $i$, $(3)$ and $(4)$ correspond exactly with complex addition and multiplication.
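
For readers who want the intermediate step behind $(4)$, it is just distributivity together with $I^2=I$, $IJ=JI=J$ and $J^2=-I$:
$$
\begin{align}
(xI+yJ)(uI+vJ)
&=xu\,I^2+xv\,IJ+yu\,JI+yv\,J^2\\
&=xu\,I+xv\,J+yu\,J-yv\,I\\
&=(xu-yv)I+(xv+yu)J
\end{align}
$$
which mirrors $(x+yi)(u+vi)=(xu-yv)+(xv+yu)i$ term by term.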

Since analytic functions can be written as series involving addition and multiplication of complex numbers, those functions can be translated directly to corresponding functions involving $I$ and $J$.

For example
$$
\begin{align}
\exp(xI+yJ)
&=\sum_{n=0}^\infty\frac{(xI+yJ)^n}{n!}\\
&=\sum_{n=0}^\infty\sum_{k=0}^n\frac1{n!}\binom{n}{k}(xI)^{n-k}(yJ)^k\\
&=\sum_{n=0}^\infty\sum_{k=0}^n\frac{(xI)^{n-k}}{(n-k)!}\frac{(yJ)^k}{k!}\\
&=\sum_{n=0}^\infty\frac{(xI)^n}{n!}\sum_{k=0}^\infty\frac{(yJ)^k}{k!}\\
&=\sum_{n=0}^\infty\frac{(xI)^n}{n!}\left(\sum_{k=0}^\infty\frac{(yJ)^{2k}}{(2k)!}+\sum_{k=0}^\infty\frac{(yJ)^{2k+1}}{(2k+1)!}\right)\\
&=\sum_{n=0}^\infty\frac{x^n}{n!}I\left(\sum_{k=0}^\infty(-1)^k\frac{y^{2k}}{(2k)!}I+\sum_{k=0}^\infty(-1)^k\frac{y^{2k+1}}{(2k+1)!}J\right)\\[9pt]
&=e^xI(\cos(y)I+\sin(y)J)\\[15pt]
&=e^x\cos(y)I+e^x\sin(y)J\tag{5}
\end{align}
$$
We can rewrite $(5)$ as
$$
\exp\left(\begin{bmatrix}x&-y\\y&x\end{bmatrix}\right)
=\begin{bmatrix}e^x\cos(y)&-e^x\sin(y)\\e^x\sin(y)&e^x\cos(y)\end{bmatrix}\tag{6}
$$


In this fashion, we can reformulate almost any formula involving complex numbers in terms of matrices using the isomorphism
$$
x+yi\leftrightarrow\overbrace{\begin{bmatrix}x&-y\\y&x\end{bmatrix}}^{xI+yJ}\tag{7}
$$
For example, the complex conjugate is represented by the transpose
$$
\overline{x+yi}=x-yi\leftrightarrow\begin{bmatrix}x&y\\-y&x\end{bmatrix}=\begin{bmatrix}x&-y\\y&x\end{bmatrix}^T\tag{8}
$$
and the square of the absolute value is represented by the determinant times $I$
$$
\begin{align}
|x+yi|^2
=(x+yi)\overbrace{(x-yi)\vphantom{\begin{bmatrix}x\\y\end{bmatrix}}}^{\text{conjugate}}
&\leftrightarrow\begin{bmatrix}x&-y\\y&x\end{bmatrix}
\overbrace{\begin{bmatrix}x&y\\-y&x\end{bmatrix}}^{\text{transpose}}\\
&=\begin{bmatrix}x^2+y^2&0\\0&x^2+y^2\end{bmatrix}\\
&=\det\begin{bmatrix}x&-y\\y&x\end{bmatrix}I\tag{9}
\end{align}
$$
Not surprisingly, the reciprocal is represented by the matrix inverse
$$
\begin{array}{ccc}
\dfrac1{x+yi}&=&\dfrac{x-yi}{|x+yi|^2}\\
\updownarrow&&\updownarrow\\
\begin{bmatrix}x&-y\\y&x\end{bmatrix}^{-1}
&=&\begin{bmatrix}x&y\\-y&x\end{bmatrix}\left(\det\begin{bmatrix}x&-y\\y&x\end{bmatrix}I\right)^{-1}\tag{10}
\end{array}
$$
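
A small numerical check of the reciprocal correspondence, as a sketch only: `to_matrix` and `inverse` are helper names chosen for this example, the inverse uses the standard $2\times2$ adjugate formula, and the sample value is arbitrary.

```python
def to_matrix(z):
    """Encode z = x + yi as [[x, -y], [y, x]]."""
    return [[z.real, -z.imag], [z.imag, z.real]]

def inverse(m):
    """Inverse of a 2x2 matrix via the adjugate divided by the determinant."""
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

z = complex(3, 4)  # arbitrary nonzero sample value
lhs = inverse(to_matrix(z))   # invert the matrix
rhs = to_matrix(1 / z)        # encode the complex reciprocal
max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_error)
```

The comparison uses a difference rather than `==` because the two sides are computed through different floating-point operations.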

I think that the “big picture” is expressed in the Artin–Wedderburn theorem.

This is (I think) the most important classification theorem in ring theory. In its general form, the theorem classifies all semisimple rings, but as regards this question, it is sufficient to note that, as a consequence:

Every finite-dimensional simple algebra over $\mathbb{R}$ can be
represented as a matrix ring over $\mathbb{R}$, $\mathbb{C}$, or
$\mathbb{H}$ (the quaternions).

We see that $\mathbb{C}$ is a vector space of dimension $2$ over $\mathbb{R}$ and it is a simple algebra since its only ideals are the zero ideal and $\mathbb{C}$ itself.

So the A.W. theorem states that $\mathbb{C}$ must be isomorphic to a matrix ring. Since $\mathbb{C}$ is a commutative algebra, the matrix ring has to be commutative (and in fact a field, because $\mathbb{C}$ is a field), and the matrices in the OP give an example of such a ring.

Note that this matrix representation is not unique (see e.g.
Ambiguous matrix representation of imaginary unit?) but all representations are isomorphic.

I hope this answer gives a sufficiently big picture.

The correct claim is that you can represent $i = \pmatrix{0&-1\\1&0}$ and any complex number $a+bi$ as $a+bi = \pmatrix{a&-b\\b&a}$. The justification is that you can show that addition and multiplication of complex numbers correspond to addition and multiplication of the matrices using the usual rules. This makes an isomorphism, which we consider to be an identity.

So far so good:
$$ i^2 =
\left[ \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right]
\left[ \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right]
= \left[ \begin{array}{cc} -1 & 0 \\ 0 & -1 \end{array} \right] =
- \left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right] = -1
$$
But what I’m missing in the current answers is this:
$$ e^{i\theta} =
e^{\left[ \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right]\theta} = \\
\left(\left[ \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right]\theta\right)^0
+ \left(\left[ \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right]\theta\right)^1
+ \frac{1}{2!}\left(
\left[ \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right]\theta
\right)^2 + \frac{1}{3!}\left(
\left[ \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right]\theta
\right)^3 + \\ \frac{1}{4!}\left(
\left[ \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right]\theta
\right)^4 + \frac{1}{5!}\left(
\left[ \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right]\theta
\right)^5 + \frac{1}{6!}\left(
\left[ \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right]\theta
\right)^6 + \frac{1}{7!}\left(
\left[ \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right]\theta
\right)^7 +\; \cdots \; = \\
\left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right] +
\left[ \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right]\theta
+ \frac{1}{2!}
\left[ \begin{array}{cc} -1 & 0 \\ 0 & -1 \end{array} \right]\theta^2
+ \frac{1}{3!}
\left[ \begin{array}{cc} 0 & 1 \\ -1 & 0 \end{array} \right]\theta^3
+ \\ \frac{1}{4!}
\left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right]\theta^4
+ \frac{1}{5!}
\left[ \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right]\theta^5
+ \frac{1}{6!}
\left[ \begin{array}{cc} -1 & 0 \\ 0 & -1 \end{array} \right]\theta^6
+ \frac{1}{7!}
\left[ \begin{array}{cc} 0 & 1 \\ -1 & 0 \end{array} \right]\theta^7
+ \; \cdots \; = \\
\left[ \; \begin{array}{cc}
1 - \theta^2/2! + \theta^4/4! - \theta^6/6! + \; \cdots \; & \;
- \, \theta + \theta^3/3! - \theta^5/5! + \theta^7/7! + \; \cdots \\
+ \, \theta - \theta^3/3! + \theta^5/5! - \theta^7/7! + \; \cdots \; & \;
1 - \theta^2/2! + \theta^4/4! - \theta^6/6! + \; \cdots
\end{array} \; \right] = \\
\left[ \begin{array}{cc} \cos(\theta) & -\sin(\theta) \\
\sin(\theta) & \cos(\theta) \end{array} \right]
$$

Well, actually, the “big picture” is not the Artin–Wedderburn theorem, but a much simpler theorem: if $K$ is a field, and $L$ is a finite simple extension of $K$, then $L$ is isomorphic to a field of matrices with entries in $K$. The proof is very easy: it suffices to observe that the map that associates to every $x\in L$ the $K$-linear map $t\mapsto xt$ is an injective algebra homomorphism, hence the $K$-algebra $L$ is isomorphic to a sub-algebra of matrices over $K$. If $a$ is a generator of $L/K$ of degree $n$ over $K$, then the matrix representing $a$ in the basis $1,a,a^2,\dots,a^{n-1}$ is the companion matrix of the minimal polynomial of $a$ over $K$. See this document, p. 16 and p. 24.
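
The same recipe works for extensions other than $\mathbb{C}/\mathbb{R}$. As an illustration chosen for this sketch (it is not taken from the answer above), take $K=\mathbb{Q}$ and $L=\mathbb{Q}(\sqrt2)$: in the basis $1,\sqrt2$, multiplication by $a+b\sqrt2$ has matrix $\pmatrix{a&2b\\b&a}$, and $\sqrt2$ itself maps to the companion matrix of $x^2-2$. The names `to_matrix` and `times` are hypothetical helpers.

```python
def to_matrix(a, b):
    """Matrix of multiplication by a + b*sqrt(2) in the basis (1, sqrt(2))."""
    return ((a, 2 * b), (b, a))

def mat_mul(m, n):
    """Ordinary matrix product of two 2x2 matrices."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def times(p, q):
    """Product (a + b*sqrt(2)) * (c + d*sqrt(2)) as a coefficient pair."""
    (a, b), (c, d) = p, q
    return (a * c + 2 * b * d, a * d + b * c)

# sqrt(2) maps to the companion matrix of x^2 - 2, whose square is 2*I.
companion = to_matrix(0, 1)
square_is_two = mat_mul(companion, companion) == ((2, 0), (0, 2))

# Multiplying field elements matches multiplying their matrices.
p, q = (1, 2), (3, 1)  # 1 + 2*sqrt(2) and 3 + sqrt(2), arbitrary samples
product_matches = mat_mul(to_matrix(*p), to_matrix(*q)) == to_matrix(*times(p, q))
print(square_is_two, product_matches)
```

Integer coefficients keep the comparison exact.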

Consider the complex numbers as the vector space $\mathbb R^2$, with $e_1=(1,0)$ corresponding to $1+0i$ and $e_2=(0,1)$ corresponding to $0+1i$. Then multiplication by $a+bi$ in $\mathbb C$ is a linear operation on $\mathbb R^2$, and we can represent it as a matrix:

$$(a+bi)(ce_1+de_2) = (ac-bd)e_1+(bc+ad)e_2$$ yielding the matrix representation:

$$\begin{pmatrix}a&-b\\b&a\end{pmatrix}\begin{pmatrix}c\\d\end{pmatrix}=\begin{pmatrix}ac-bd\\bc+ad\end{pmatrix}$$
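
Here is a minimal sketch of that identity with concrete numbers. The helper `apply` and the sample values of `a`, `b`, `c`, `d` are assumptions made for the illustration.

```python
def apply(m, v):
    """Apply a 2x2 matrix m to a column vector v = (c, d)."""
    return (
        m[0][0] * v[0] + m[0][1] * v[1],
        m[1][0] * v[0] + m[1][1] * v[1],
    )

a, b, c, d = 2.0, 3.0, 5.0, -1.0  # arbitrary sample values

# Matrix of "multiply by a + bi" applied to the vector (c, d) ...
image = apply(((a, -b), (b, a)), (c, d))

# ... compared with the ordinary complex product (a + bi)(c + di).
product = complex(a, b) * complex(c, d)
matches = image == (product.real, product.imag)
print(image, matches)
```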

The complex numbers $\mathbb{C}$ can be identified with the plane $\mathbb{R}^2$, with the complex number $x + iy$ corresponding to the point $(x, y)$ in the plane. If you think about it, you should be able to see that multiplication by $i$ rotates the complex plane by $90^\circ$ counterclockwise. In fact, for any real $\theta$, multiplication by $e^{i\theta}$ rotates the complex plane by $\theta$ counterclockwise.

In addition, multiplication by a real number $r$ scales the complex plane by a factor of $r$. Any complex number can be written as $re^{i\theta}$ with real $r$ and $\theta$, so multiplication by an arbitrary complex number transforms the complex plane by a rotation followed by a scaling.
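
The "rotation followed by a scaling" description can be checked by rebuilding the matrix of $z$ from its polar form. This is a sketch with an arbitrary sample value; `cmath.polar` is the standard-library function returning $(r,\theta)$.

```python
import cmath
import math

z = complex(1, 1)           # arbitrary sample value
r, theta = cmath.polar(z)   # z = r * e^{i*theta}

# Rotation by theta, then scaling by r.
rotation = [
    [math.cos(theta), -math.sin(theta)],
    [math.sin(theta), math.cos(theta)],
]
scaled_rotation = [[r * entry for entry in row] for row in rotation]

# Matrix of multiplication by z = x + iy.
expected = [[z.real, -z.imag], [z.imag, z.real]]

max_error = max(
    abs(scaled_rotation[i][j] - expected[i][j]) for i in range(2) for j in range(2)
)
print(max_error)
```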

For any complex number $z$, let’s say $M_z$ is the transformation of the complex plane given by multiplication by $z$. Here’s a surprising fact: if you know the transformation $M_z$, you can use it to reconstruct the number $z$! This is because multiplication by $z$ sends $1$ to $z$, so $M_z$ sends the point $(1,0)$ to the point $(\operatorname{Re} z, \operatorname{Im} z)$. Since $z$ and $M_z$ determine each other uniquely, they’re essentially the same thing. In other words, a complex number is the same as a certain kind of transformation of the plane.

Multiplying by $wz$ is the same as multiplying by $z$ and then multiplying by $w$: in symbols, $(wz)p = w(zp)$ for any $p \in \mathbb{C}$. That means $M_{wz} = M_w M_z$. Similarly, $(w + z)p = wp + zp$ for any $p \in \mathbb{C}$, so $M_{w+z} = M_w + M_z$. In this way, the addition and multiplication of complex numbers matches the addition and multiplication of the corresponding transformations.

Which transformations of the plane correspond to complex numbers? Let’s say $\mathfrak{C}$ is the set of transformations given by complex numbers: in symbols, $\mathfrak{C} = \{M_z \mid z \in \mathbb{C}\}$. All of the numbers in $\mathbb{C}$ can be written in the form $x + iy$, with $x$ and $y$ real, so all of the transformations in $\mathfrak{C}$ can be written in the form $M_x + M_i M_y$. We figured out earlier that $M_i$ is a $90^\circ$ rotation, and $M_x$ is scaling by $x$. As matrices,
$$ \begin{align*}
M_i & = \left[\begin{array}{rr} 0 & -1 \\ 1 & 0 \end{array} \right] &
M_x & = \left[\begin{array}{rr} x & 0 \\ 0 & x \end{array} \right].
\end{align*} $$
Hence, $\mathfrak{C}$ consists of all the transformations of the plane given by matrices that look like
$$ \left[\begin{array}{rr} x & -y \\ y & x \end{array} \right], $$
where $x$ and $y$ are real.

This discussion shows that, if we didn’t know about complex numbers already, we could have discovered them by studying rotations and scalings of the plane $\mathbb{R}^2$.


Meelo mentioned that the complex numbers can be represented by two-by-two matrices in other ways. These other representations correspond to other ways of identifying $\mathbb{C}$ with $\mathbb{R}^2$. If you identify the number $x + iy$ with the point $(x + y, y)$, for example, the transformation $M_i$ is given by the matrix
$$ \left[\begin{array}{rr} 1 & -2 \\ 1 & -1 \end{array} \right], $$
as in Meelo’s example.
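
To see where that matrix comes from, one can conjugate the standard matrix of $M_i$ by the change of coordinates $(x,y)\mapsto(x+y,y)$. The sketch below assumes the names `P`, `P_inv`, `J` and `M` purely for illustration and uses integer entries so the comparisons are exact.

```python
def mat_mul(m, n):
    """Ordinary matrix product of two 2x2 matrices."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

P = ((1, 1), (0, 1))       # sends (x, y) to (x + y, y)
P_inv = ((1, -1), (0, 1))  # inverse change of coordinates
J = ((0, -1), (1, 0))      # M_i in the standard identification

# M_i expressed in the new coordinates: P * J * P^{-1}.
M = mat_mul(mat_mul(P, J), P_inv)

is_inverse = mat_mul(P, P_inv) == ((1, 0), (0, 1))
squares_to_minus_identity = mat_mul(M, M) == ((-1, 0), (0, -1))
print(M, is_inverse, squares_to_minus_identity)
```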