Multiplication of matrices — taking the dot product of the $i$th row of the first matrix and the $j$th column of the second to yield the $ij$th entry of the product — is not a very intuitive operation: if you were to ask someone how to multiply two matrices, he probably would not think of that method. Of course, it turns out to be very useful: matrix multiplication is precisely the operation that represents composition of transformations. But it’s not intuitive. **So my question is where it came from. Who thought of multiplying matrices in that way, and why?** (Was it perhaps multiplication of a matrix and a vector first? If so, who thought of multiplying *them* in that way, and why?) My question stands no matter whether matrix multiplication was done this way only after it was used as a representation of composition of transformations, or whether, on the contrary, matrix multiplication came first. (Again, I’m not asking about the *utility* of multiplying matrices as we do: this is clear to me. I’m asking a question about history.)

Matrix multiplication is a symbolic way of substituting one linear change of variables into another one. If $x' = ax + by$ and $y' = cx + dy$, and $x'' = a'x' + b'y'$ and $y'' = c'x' + d'y'$, then we can plug the first pair of formulas into the second to express $x''$ and $y''$ in terms of $x$ and $y$:

$$
x'' = a'x' + b'y' = a'(ax + by) + b'(cx + dy) = (a'a + b'c)x + (a'b + b'd)y
$$

and

$$
y'' = c'x' + d'y' = c'(ax + by) + d'(cx + dy) = (c'a + d'c)x + (c'b + d'd)y.
$$

It can be tedious to keep writing the variables, so we use arrays to track the coefficients, with the formulas for $x'$ and $x''$ on the first row and those for $y'$ and $y''$ on the second row. The above two linear substitutions coincide with the matrix product

$$
\left(
\begin{array}{cc}
a'&b'\\c'&d'
\end{array}
\right)
\left(
\begin{array}{cc}
a&b\\c&d
\end{array}
\right)
=
\left(
\begin{array}{cc}
a'a+b'c&a'b+b'd\\c'a+d'c&c'b+d'd
\end{array}
\right).
$$

So matrix multiplication is just a *bookkeeping* device for systems of linear substitutions plugged into one another (order matters). The formulas are not intuitive, but it’s nothing other than the simple idea of combining two linear changes of variables in succession.
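As a sanity check, this bookkeeping can be sketched in a few lines of Python (the function names and sample coefficients below are illustrative, not from the answer): plugging one pair of linear formulas into the other produces exactly the same coefficients as the row-times-column matrix product.

```python
def substitute(outer, inner):
    """Plug x' = a x + b y, y' = c x + d y (inner) into
    x'' = a' x' + b' y', y'' = c' x' + d' y' (outer);
    return the coefficients of x'' and y'' in terms of x and y."""
    (ap, bp), (cp, dp) = outer
    (a, b), (c, d) = inner
    return ((ap * a + bp * c, ap * b + bp * d),
            (cp * a + dp * c, cp * b + dp * d))

def matmul(M, N):
    """2x2 matrix product by the usual row-times-column rule."""
    return tuple(
        tuple(sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

outer = ((1, 2), (3, 4))   # a', b', c', d' (arbitrary sample values)
inner = ((5, 6), (7, 8))   # a,  b,  c,  d
assert substitute(outer, inner) == matmul(outer, inner)
```

The assertion holds for any choice of coefficients, which is the point: the matrix product is nothing but the substitution written without the variables.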

Matrix multiplication was first defined explicitly in print by Cayley in 1858, in order to reflect the effect of composition of linear transformations. See paragraph 3 at http://darkwing.uoregon.edu/~vitulli/441.sp04/LinAlgHistory.html. However, the idea of tracking what happens to coefficients when one linear change of variables is substituted into another (which we view as matrix multiplication) goes back further. For instance, the work of number theorists in the early 19th century on binary quadratic forms $ax^2 + bxy + cy^2$ was full of linear changes of variables plugged into each other (especially linear changes of variable that we would recognize as coming from ${\rm SL}_2({\mathbf Z})$). For more on the background, see the paper by Thomas Hawkins on matrix theory in the 1974 ICM. Google “ICM 1974 Thomas Hawkins” and you’ll find his paper among the top 3 hits.

Here is an answer directly reflecting the historical perspective of the paper *A Memoir on the Theory of Matrices* by Arthur Cayley (1858).

This paper is credited with “containing the first abstract definition of a matrix” and with giving “a matrix algebra defining addition, multiplication, scalar multiplication and inverses”.

In this paper a nonstandard notation is used. I will do my best to place it in a more “modern” (but still nonstandard) notation. The bulk of the contents of this post will come from pages 20-21.

To introduce notation, $$ (X,Y,Z)= \left( \begin{array}{ccc}
a & b & c \\
a' & b' & c' \\
a'' & b'' & c'' \end{array} \right)(x,y,z)$$

will represent the set of linear functions $(ax + by + cz,\; a'x + b'y + c'z,\; a''x + b''y + c''z)$, which are then called $(X,Y,Z)$.

Cayley defines addition and scalar multiplication and then moves to matrix multiplication or “composition”. He specifically wants to deal with the issue of:

$$(X,Y,Z)= \left( \begin{array}{ccc}
a & b & c \\
a' & b' & c' \\
a'' & b'' & c'' \end{array} \right)(x,y,z) \quad \text{where} \quad (x,y,z)= \left( \begin{array}{ccc}
\alpha & \beta & \gamma \\
\alpha' & \beta' & \gamma' \\
\alpha'' & \beta'' & \gamma'' \\ \end{array} \right)(\xi,\eta,\zeta)$$

He now wants to represent $(X,Y,Z)$ in terms of $(\xi,\eta,\zeta)$. He does this by creating another matrix that satisfies the equation:

$$(X,Y,Z)= \left( \begin{array}{ccc}
A & B & C \\
A' & B' & C' \\
A'' & B'' & C'' \\ \end{array} \right)(\xi,\eta,\zeta)$$

He continues to write that the value we obtain is:

$$\begin{align}\left( \begin{array}{ccc}
A & B & C \\
A' & B' & C' \\
A'' & B'' & C'' \\ \end{array} \right) &= \left( \begin{array}{ccc}
a & b & c \\
a' & b' & c' \\
a'' & b'' & c'' \end{array} \right)\left( \begin{array}{ccc}
\alpha & \beta & \gamma \\
\alpha' & \beta' & \gamma' \\
\alpha'' & \beta'' & \gamma'' \\ \end{array} \right)\\[.25cm] &= \left( \begin{array}{ccc}
a\alpha+b\alpha' + c\alpha'' & a\beta+b\beta' + c\beta'' & a\gamma+b\gamma' + c\gamma'' \\
a'\alpha+b'\alpha' + c'\alpha'' & a'\beta+b'\beta' + c'\beta'' & a'\gamma+b'\gamma' + c'\gamma'' \\
a''\alpha+b''\alpha' + c''\alpha'' & a''\beta+b''\beta' + c''\beta'' & a''\gamma+b''\gamma' + c''\gamma''\end{array} \right)\end{align}$$

This is the standard definition of matrix multiplication, and it is hard not to believe that matrix multiplication was defined to deal with exactly this problem of composing substitutions. The paper goes on to mention several properties of matrix multiplication, such as non-commutativity, composition with unity and zero, and exponentiation.

Here is the written rule of composition:

> Any line of the compound matrix is obtained by combining the corresponding line of the first component matrix successively with the several columns of the second matrix. (p. 21)
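Cayley's rule of composition can be sketched as a short Python function (a hypothetical illustration in modern notation, not Cayley's own): each row of the compound matrix comes from combining the corresponding row of the first component matrix with the successive columns of the second.

```python
def compound(first, second):
    """Cayley's rule: entry (i, j) of the compound matrix combines
    row i of the first matrix with column j of the second."""
    n = len(first)
    return [[sum(first[i][k] * second[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)]
```

With `first` holding the $(a, b, c)$ coefficients and `second` the $(\alpha, \beta, \gamma)$ coefficients, the top-left entry works out to $a\alpha + b\alpha' + c\alpha''$, matching the product displayed above.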

\begin{align}
u & = 3x + 7y \\
v & = -2x + 11y \\[1.5em]
p & = 13u - 20v \\
q & = 2u + 6v
\end{align}

Given $x$ and $y$, how do you find $p$ and $q$? How do you write:

\begin{align}

p & = \bullet x + \bullet y \\ q & = \bullet x+\bullet y\quad\text{?}

\end{align}

What numbers go where the four $\bullet$’s are?

That is what matrix multiplication is. The rationale is mathematical, not historical.
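For completeness, here is a small Python sketch (variable names are illustrative) that fills in the four bullets by multiplying the two coefficient arrays, with the matrix of the second substitution on the left:

```python
# Substitute u = 3x + 7y, v = -2x + 11y into p = 13u - 20v, q = 2u + 6v:
# this is the matrix product of the two coefficient arrays.

second = [[13, -20],
          [2,   6]]      # rows: p, q in terms of u, v
first  = [[3,   7],
          [-2, 11]]      # rows: u, v in terms of x, y

product = [[sum(second[i][k] * first[k][j] for k in range(2))
            for j in range(2)]
           for i in range(2)]

print(product)  # [[79, -129], [-6, 80]], i.e. p = 79x - 129y, q = -6x + 80y
```

Expanding by hand confirms it: $p = 13(3x+7y) - 20(-2x+11y) = 79x - 129y$ and $q = 2(3x+7y) + 6(-2x+11y) = -6x + 80y$.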
