Why is the determinant defined in terms of permutations?

Where does the definition of the determinant come from, and is the definition in terms of permutations the first and basic one? What is the deep reason for giving such a definition in terms of permutations?

$$
\det(A)=\sum_{p}\sigma(p)\,a_{1p_1}a_{2p_2}\cdots a_{np_n}.
$$

I have found this one useful:

http://phalanstere.univ-mlv.fr/~al/Classiques/Muir/History_5/VOLUME5_TEXT.PDF


This is only one of many possible definitions of the determinant.

A more “immediately meaningful” definition could be, for example, to define the determinant as the unique function on $\mathbb R^{n\times n}$ such that

  • The identity matrix has determinant $1$.
  • Every singular matrix has determinant $0$.
  • The determinant is linear in each column of the matrix separately.

(Or the same thing with rows instead of columns).

While this seems to connect to high-level properties of the determinant in a cleaner way, it is only half a definition because it requires you to prove that a function with these properties exists in the first place and is unique.

It is technically cleaner to choose the permutation-based definition because it is obvious that it defines something, and then afterwards prove that the thing it defines has all of the high-level properties we’re really after.

The permutation-based definition is also very easy to generalize to settings where the matrix entries are not real numbers (e.g. matrices over a general commutative ring) — in contrast, the characterization above does not generalize easily without a close study of whether our existence and uniqueness proofs will still work with a new scalar ring.
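To make the permutation-based definition concrete, here is a minimal Python sketch of the sum. The names `permutation_sign` and `leibniz_det` are illustrative choices, not from any library, and the sign is computed by counting inversions, which is one of several equivalent ways to get the parity. The trailing assertions spot-check the three characterizing properties listed above on small integer examples; they are checks, not a proof.

```python
from itertools import permutations


def permutation_sign(p):
    """Return +1 for an even permutation and -1 for an odd one (inversion count)."""
    inversions = sum(
        1
        for i in range(len(p))
        for j in range(i + 1, len(p))
        if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


def leibniz_det(a):
    """Determinant of a square matrix (a list of rows) via the permutation sum."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = permutation_sign(p)
        for i in range(n):
            term *= a[i][p[i]]  # one entry from row i, column p[i]
        total += term
    return total


# Property 1: the identity matrix has determinant 1.
assert leibniz_det([[1, 0, 0], [0, 1, 0], [0, 0, 1]]) == 1

# Property 2 (one instance): a matrix with two equal columns is singular.
assert leibniz_det([[1, 1, 2], [3, 3, 4], [5, 5, 6]]) == 0

# Property 3 (one instance): linearity in the first column, other columns fixed.
c2, c3 = [0, 1, 4], [2, 5, 1]


def with_first_column(col):
    """Build the 3x3 matrix whose columns are col, c2, c3."""
    return [[col[i], c2[i], c3[i]] for i in range(3)]


u, w = [1, 0, 2], [3, 1, 1]
combo = [2 * u[i] + 3 * w[i] for i in range(3)]
assert leibniz_det(with_first_column(combo)) == (
    2 * leibniz_det(with_first_column(u)) + 3 * leibniz_det(with_first_column(w))
)
```

Since the function uses only addition and multiplication of entries, it also runs unchanged on other entry types that support those operations, such as `fractions.Fraction`, which loosely mirrors the remark about commutative rings. The sum has $n!$ terms, so this is a sketch for understanding rather than a practical way to compute determinants.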

The amazing fact is that it seems matrices were developed to study determinants. I’m not sure, but I think the “formula” definition of the determinant you have there is known as the Leibniz formula. I am going to quote some lines from Tucker (1993):

Matrices and linear algebra did not grow out of the study of coefficients of systems of linear equations, as one might guess. Arrays of coefficients led mathematicians to develop determinants, not matrices. Leibnitz, co-inventor of calculus, used determinants in 1693 about one hundred and fifty years before the study of matrices in their own right. Cramer presented his determinant-based formula for solving systems of linear equations in 1750. The first implicit use of matrices occurred in Lagrange’s work on bilinear forms in the late 18th century.

In 1848, J. J. Sylvester introduced the term “matrix,” the Latin word for womb, as a name for an array of numbers. He used womb, because he viewed a matrix as a generator of determinants. That is, every subset of k rows and k columns in a matrix generated a determinant (associated with the submatrix formed by those rows and columns).

You would probably have to dig through historical texts and articles to find out exactly why Leibniz devised the definition. Most probably he had some hunch that it could lead to breakthroughs in understanding the connection between the coefficients and the solution of a system of equations.

Hint:

Determinants appear, among other places, in the solution of systems of linear equations. If you permute the equations, the solution cannot change. Hence, the expression of a determinant must be insensitive to row permutations, at least up to sign, and this is why determinants are a combination of terms involving $a_{ip_i}$.

This explains the pattern
$$\sum_p \sigma_p\prod_i a_{ip_i},$$ where each product takes exactly one entry from every row, so the expression is multilinear in the rows, and the entries commute, so the order of the factors does not matter. Also, the form must be antisymmetric so that two equal rows yield a zero determinant (causing failure of the solution), and this explains why $\sigma_p=\pm1$ indicates the parity of the permutation.
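As a small worked case of this pattern, written out here for illustration: for $n=2$ there are two permutations, the identity (even) and the swap (odd), so
$$
\det\begin{pmatrix} a_{11} & a_{12}\\ a_{21} & a_{22}\end{pmatrix} = a_{11}a_{22} - a_{12}a_{21}.
$$
For $n=3$ the six permutations split into three even and three odd ones:
$$
\det(A) = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31}.
$$
In the $2\times 2$ case, setting the second row equal to the first ($a_{21}=a_{11}$, $a_{22}=a_{12}$) gives $a_{11}a_{12}-a_{12}a_{11}=0$, which is the antisymmetry at work.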

Here is a natural path to the idea of determinant (though this is not how they were originally developed).

An alternating $k$-linear function on a vector space $V$ over a field $\Bbb F$ is a map $f\,:\, V^k \to \Bbb F$ which

  • is linear in each argument: $$f(v_1, \ldots, v_{i-1}, av_i + bw_i, v_{i+1}, \ldots, v_k) = af(v_1, \ldots, v_i, \ldots, v_k) + bf(v_1, \ldots, w_i, \ldots, v_k)$$ for all $i$ and all $a, b \in \Bbb F$, and
  • changes sign under exchange of any two arguments: $$f(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_k) = -f(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_k)$$ for all $i \ne j$.

It is easy to see that if $f,g$ are two alternating $k$-linear functions on $V$, then so is $af + bg$ for any $a,b \in \Bbb F$, so the alternating $k$-linear functions on $V$ form another vector space $A^k(V)$. Some development shows that if $V$ has dimension $n$, then $A^k(V)$ has dimension $\binom{n}{k}$. In particular $A^n(V)$ has dimension $1$.

Now if $M\,:\,V \to V$ is linear and if $f\in A^k(V)$, then the map $$M_kf\,:\, V^k \to \Bbb F\,:\,(v_1, \ldots, v_k) \mapsto f(Mv_1, \ldots, Mv_k)$$ is also alternating $k$-linear. And clearly $M_k(af+bg) = aM_kf + bM_kg$, so $M_k$ defines a linear map from $A^k(V)$ to itself (i.e., an endomorphism of $A^k(V)$).

Since $A^n(V)$ is one dimensional, any endomorphism is just multiplication by some element of the field $\Bbb F$. Thus we define the determinant of $M$ to be the unique element $\det(M) \in \Bbb F$ such that $$M_nf = \det(M)f\text{ for all }f \in A^n(V)$$

All the properties of determinants, including the permutation formula, can be developed from this. Certain properties of determinants that are difficult to prove from the Leibniz formula are almost trivial from this definition, in particular that $\det(MN) = \det(M)\det(N)$.
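Here is a sketch of why multiplicativity is nearly immediate, using only the definition of $M_n$ above. For linear maps $M, N$ on $V$ and $f \in A^n(V)$,
$$
\big((MN)_nf\big)(v_1,\ldots,v_n) = f(MNv_1,\ldots,MNv_n) = (M_nf)(Nv_1,\ldots,Nv_n) = \big(N_n(M_nf)\big)(v_1,\ldots,v_n),
$$
so $(MN)_n = N_n \circ M_n$, and therefore
$$
\det(MN)\,f = N_n\big(\det(M)\,f\big) = \det(M)\,N_nf = \det(M)\det(N)\,f.
$$
Taking any nonzero $f$ gives $\det(MN)=\det(M)\det(N)$.

The permutation formula can be recovered along similar lines. The notation here is an assumption for illustration: fix a basis $e_1,\ldots,e_n$ of $V$ and write $Me_j=\sum_i a_{ij}e_i$. Expanding by multilinearity,
$$
(M_nf)(e_1,\ldots,e_n) = \sum_{i_1,\ldots,i_n} a_{i_1 1}\cdots a_{i_n n}\, f(e_{i_1},\ldots,e_{i_n}) = \Big(\sum_p \sigma(p)\, a_{p_1 1}\cdots a_{p_n n}\Big) f(e_1,\ldots,e_n).
$$
The second equality uses two facts: terms with a repeated index vanish, because swapping the two equal arguments gives $f=-f$ (this step assumes $2\neq 0$ in $\Bbb F$), and the remaining index tuples are permutations $p$, for which reordering the arguments costs the sign $\sigma(p)$. The same expansion shows that a nonzero $f$ has $f(e_1,\ldots,e_n)\neq 0$, so $\det(M)=\sum_p \sigma(p)\,a_{p_1 1}\cdots a_{p_n n}$. Re-indexing the sum by $p^{-1}$, which has the same sign as $p$, turns this into the row form $\sum_p \sigma(p)\,a_{1p_1}\cdots a_{np_n}$.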

There is a close connection between the space of alternating $k$-linear functions and the $k$-th exterior power (the $k$-fold wedge product) of a space, so I could have very similarly developed the determinant based on the wedge product, but alternating $k$-linear functions are easier conceptually.

I think Paul’s answer gets the algebraic nub of the issue. There is a geometric side, which gives some motivation for his answer, because it isn’t clear offhand why multilinear alternating functions should be important.

On the real line, the function of two variables $(x,y)$ given by $x-y$ gives you a notion of length. It really gives you a bit more than length, because it is a signed notion of length: it cares about the direction of the line from $x$ to $y$ and gives a positive or negative value based on that direction. If you swap $x$ and $y$, you get the negative of your previous answer.

In $\mathbb R^n$ it is useful to have a similar function that gives the signed volume of the parallelepiped spanned by $n$ vectors. If you swap two vectors, that reverses the orientation of the parallelepiped, so you should get the negative of the previous answer. From a geometric perspective, that is how alternating functions come into play.

The determinant of a matrix with columns $v_1, \ldots, v_n$ calculates the signed volume of the parallelepiped spanned by the vectors $v_1, \ldots, v_n$. Such a function is necessarily alternating. It is also necessarily linear in each variable separately, which can also be seen geometrically.
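The two-dimensional case is easy to try out. The following is a minimal Python sketch, with `signed_area` as a hypothetical helper name, that evaluates the $2\times 2$ determinant with columns $v_1, v_2$ and checks the geometric behavior described above on a few integer examples.

```python
def signed_area(v1, v2):
    """Signed area of the parallelogram spanned by the 2D vectors v1 and v2.

    This is the determinant of the 2x2 matrix whose columns are v1 and v2.
    """
    return v1[0] * v2[1] - v2[0] * v1[1]


e1, e2 = (1, 0), (0, 1)

# The unit square has area 1; swapping the vectors reverses the orientation.
assert signed_area(e1, e2) == 1
assert signed_area(e2, e1) == -1

# Stretching the sides scales the area: a 2-by-3 rectangle has area 6.
assert signed_area((2, 0), (0, 3)) == 6

# Parallel vectors span a degenerate parallelogram of area 0.
assert signed_area((1, 2), (2, 4)) == 0

# Additivity in the first argument, with the second vector held fixed.
a, b = (1, 0), (2, 1)
a_plus_b = (a[0] + b[0], a[1] + b[1])
assert signed_area(a_plus_b, e2) == signed_area(a, e2) + signed_area(b, e2)
```

The last check is the algebraic counterpart of the geometric fact that parallelograms on a common side add up when the other sides are added as vectors.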