Alternative proof that the parity of permutation is well defined?

I learned the following theorem about the properties of permutation from Gallian’s Contemporary Abstract Algebra.

Theorem 5.4 $\;$ Always Even or Always Odd

If a permutation $\alpha$ can be expressed as a product of an even number of $2$-cycles, then every decomposition of $\alpha$ into a product of $2$-cycles must have an even number of $2$-cycles. In symbols, if
$$
\alpha = \beta_1 \beta_2 \dotsm \beta_r
\quad\text{and}\quad
\alpha = \gamma_1 \gamma_2 \dotsm \gamma_s
$$
where the $\beta$’s and the $\gamma$’s are $2$-cycles, then $r$ and $s$ are both even or both odd.

When I tried to reconstruct the proof myself, I found that it suffices to prove the following lemma:

If $\epsilon=\beta_1\beta_2\cdots\beta_r$ where $\beta$’s are $2$-cycles, then $r$ is even.

The original proof for this lemma uses the following key property of the product of $\beta_1\beta_2$:

The product can always be expressed in one of the following forms on the left:
$$\begin{align}(ab)(ab)&=\epsilon\\
(ab)(ac)&=(bc)(ab)\\
(ab)(cd)&=(cd)(ab)\\
(ab)(bc)&=(bc)(ac)\end{align}$$

The proof for the lemma is based on such property and mathematical induction. I found it really hard to remind of such property, so I set it as an exercise to give another proof. However, I have no idea how to actually do it.

So here is my question:

Does anybody know an alternative proof of the lemma or the theorem?

Solutions Collecting From Web of "Alternative proof that the parity of permutation is well defined?"

Here’s a nice proof that uses group actions.

Let $x_1,\ldots,x_n$ be $n$ unknowns, and consider
$$\Delta= \prod_{1\leq i\lt j\leq n} (x_j-x_i).$$

For example, for $n=4$, we would have
$$\Delta = (x_2-x_1)(x_3-x_1)(x_4-x_1)(x_3-x_2)(x_4-x_2)(x_4-x_3).$$

Given a permutation $\sigma\in S_n$, define a function $f_{\sigma}\colon\{\Delta,-\Delta\}$ by letting
$$f_{\sigma}(\Delta) = \prod_{1\leq i\lt j\leq n} (x_{\sigma(j)}-x_{\sigma(i)}),$$
and $f_{\sigma}(-\Delta)=-f_{\sigma}\Delta$.

Note that since $\sigma$ is a permutation, $f_{\sigma}(\Delta)=\Delta$ or $f_{\sigma}(\Delta) = -\Delta$. Also, if $\sigma,\rho$ are two permutations, then $f_{\sigma}\circ f_{\rho} = f_{\sigma\rho}$, as is easy to verify.

Now, let’s consider what a transposition $\tau=(a,b)$ does to $\Delta$. Without loss of generality, say $a\lt b$.

The factors $(x_j-x_i)$ with neither $i$ nor $j$ equal to $a$ nor $b$ are unchanged.

For the pairs with exactly one index in $\{a,b\}$, we have two classes: those in which the other index is between $a$ and $b$, and those where the other index is not between $a$ and $b$.

If the other index is between $a$ and $b$, then $x_j-x_a$ is sent to $-(x_b-x_j)$ and $x_b-x_j$ is sent to $-(x_j-x_a)$; the two sign changes cancel each other out.

If the other index is larger than $b$, then $x_j-x_a$ and $x_j-x_b$ are swapped, with no sign changes.

If the other index is smaller than $a$, then $x_a-x_i$ and $x_b-x_i$ are swapped, with no sign changes.

Finally, the factor $x_b-x_a$ is sent to $-(x_b-x_a)$.

In summary, if $\tau$ is a transposition, then $f_{\tau}(\Delta)=-\Delta$, $f_{\tau}(-\Delta)=\Delta$.

Now take an arbitrary permutation $\sigma$, and express it as a product of transpositions in two different ways:
$$\sigma = \tau_1\cdots \tau_r = \rho_1\cdots\rho_s.$$
Then
$$f_{\sigma}(\Delta) = f_{\tau_1\cdots\tau_r}(\Delta) = f_{\tau_1}\circ\cdots\circ f_{\tau_r}(\Delta) = (-1)^r\Delta$$
and
$$f_{\sigma}(\Delta) = f_{\rho_1\cdots\rho_s}(\Delta) = f_{\rho_1}\circ\cdots\circ f_{\rho_s}(\Delta) = (-1)^s\Delta.$$
Therefore, $(-1)^r\Delta = (-1)^s\Delta$, so $r$ and $s$ have the same parity: both odd, or both even.

Here’s my favorite proof. Represent a permutation of $n$ letters as follows. Write the numbers $1$ to $n$ in a row and then write them in another row somewhat beneath the first row. Now connect the numbers in the top to the numbers in the bottom by curves representing the permutation.

So, if $\sigma$ is the permutation, connect $i$ to $\sigma(i)$ by a curve. For convenience, draw the curves so that they intersect each other in transverse double points. (That is, the tangent vectors to each curve when they intersect are not parallel, and no point has more than two curves passing through it.) Now I claim that the parity of the permutation matches the parity of the number of intersections.

There are two steps to showing this. First is to show that the intersection number is well-defined modulo 2. Second is to show that it matches the number of transpositions. To see that it is well-defined modulo 2, note that we just have to show that the intersection between two given curves is well-defined modulo 2. Indeed if the permutation swaps the endpoints of the arcs, then they should intersect an odd number of times and if the permutation does not reverse the order of the endpoints, then the arcs intersect an even number of times. (This is because an arc locally disconnects the plane, so in order to get from one side to the other I have to cross the arc, and if I do it twice, I’m back on the side I started with.) So we have a well-defined intersection number modulo 2.

Now write $\sigma$ as a product of transpositions $\tau_j$, and correspondingly stack the pictures for each $\tau_j$ to get a picture for $\sigma$. Then we just need to verify that the number of crossings in some standard picture of $\tau_j$ is odd, which is easy. All the arcs are straight except two, which form an X-shape. The X shape has one intersection point in the middle and crosses all of the other straight arcs in pairs. So it is an odd intersection number.

If you’re familiar with the braid group, you’ll notice that this looks likes the braid group, only crossing information has been dropped. Perhaps that’s why it appeals to me as a topologist. In any event, I’ve found that this is a quick way of calculating the parity of a permutation in practice.

A closely related proof, but without nice pictures, is to say that the parity of a permutation is the number of inversions modulo 2, where an inversion is a pair $i<j$ where $\sigma (i)>\sigma(j)$.

I personally like the following proof which sweeps some of the annoying technicalities for some fairly elementary linear algebra.

Consider the action of $S_n$ on $\mathbb R^n$ where $\sigma$ acts by permuting the basis elements, that is, if $v=\sum a_ie_i$ where $\{e_i\}$ is the standard basis for $\mathbb R^n$, then every $\sigma\in S_n$ determines a linear transformation $M_\sigma(v)=\sum a_i e_{\sigma(i)}$. Hence, we have a map $\rho\colon S_n\to GL_n(\mathbb R)$, the group of invertible $n\times n$ real-valued matrices.

This $\rho$ is in fact a group homomorphism since $M_\sigma M_\tau=M_{\sigma\tau}$ by definition of the linear transformations $M_\sigma$.

Now, the determinant in turn is a group homomorphism $GL_n(\mathbb R)\to\{-1,1\}$, the latter being a group under multiplication, so composing it with $\rho$ we get a group homomorphism $\det\circ\rho\colon S_n\to\{-1,1\}$.

Thinking of the determinant as an alternating multilinear form (function on $n$ vectors, linear on each variable, that flips sign when two of its inputs are switched), we see that $\det\circ\rho(\sigma)=\det(M_\sigma)=\det(e_{\sigma(1)}, e_{\sigma(2)},\dots, e_{\sigma(n)})$. Now the fact that transpositions of basis vectors flips the sign of the determinant tells us that transpositions get sent by our map to $-1$, while the identity map gets sent to $1$. Since the determinant is multiplicative, then $\sigma=\tau_1\dots\tau_m$ where $\tau_i$ are transpositions gets sent to $(-1)^m$ which depends only on the parity of $m$, hence $m$ is either always even or always odd.


Update: I now think the linear algebra is too ad-hoc and that counting inversions $\bmod2$ is more natural, as long as its done as follows.

For each permutation $\sigma$, let $N(\sigma)=\{(i,j):\sigma(i)>\sigma(j)$ but $i<j$), i.e. let $N(\sigma)$ be the set of pairs of elements put out of order by the permutation.

Then it follows easily from the definition that

  1. $N((i,i+1))=\{i,i+1\}$ for any adjacent transposition $(i,i+1)$.
  2. $N(\sigma\circ\tau)=\tau(N(\sigma))\Delta N(\tau)$ where $\Delta$ is the symmetric difference of sets, i.e. given $i<j$, we have $\sigma\circ\tau(i)>\sigma\circ\tau(j)$ if and only if either $\{i,j\}\in N(\tau)$ or $\{\tau(i),\tau(j)\}\in N(\sigma)$, but not both.

Then 1. gives that $|N((i,i+1))|=1$ for any adjacent transposition and 2. gives that $|N(\sigma\circ\tau)|=|\tau(N(\sigma))\Delta N(\tau)|=|\tau(N(\sigma))|+|N(\tau)|-2|\tau(N(\sigma))\cap N(\tau)|=|N(\tau)|+|N(\sigma)|-2|\tau N(\sigma)\cap N(\tau)|$ by inclusion-exclusion and permutations being bijections, hence $\tau\mapsto|N(\tau)|\bmod2$ is precisely the parity function on permutations.

In fact, with a little bit of induction, one can show that $|N(\sigma)|$ is the minimimal number of adjacent transpositions needed to write down the permutation $\sigma$. Even better, this point of view suggests what I’ve found to be the most practical definition of the notion of Coxeter group (c.f. M. Dyer’s Reflection Subgroups of Coxeter Systems).

Namely a Coxeter group is a group $G$ generated by a set $S$ of order two elements such that there is a (necessarily unique) well-defined function $G\xrightarrow{N}\bigoplus_{t\in T}\left<t\right>$ where $T=\bigcup_{g\in G}g^{-1}Sg$ such that

  1. $N(s)=s$
  2. $N(gh)=g^{-1}N(h)g+N(g)$

The the above proofs go word for word, characterizing the lengths and parity functions on Coxeter groups. Even better you use this definition as an intermediate step in figuring out easy proofs of the two standard characterizations of Coxeter groups by generators and relations given by a Cartan matrix, and by generators satisfying the (strong) exchange condition. The advantage to this approach is that you don’t have to take a detour through linear algebra and representation theory which is what most expositions do. But even for the representation theory, the $G$-module $T$ is useful as it captures canonically the simple roots of a root system.

There are lots of possible proofs, here’s one based on the cycle decomposition of the permutation. We show that if we multiply a permutation by a transposition from the right (i.e. first applying the transposition), then the parity of the number of cycles always changes.

The number of cycles in the identity permutation is the same as the number of points $n$ (since there are $n$ cycles). Thus for $n$ even, if you multiply an even number of transpositions, you always get a permutation with an even number of cycles; if you multiply an odd number of transpositions, you always get a permutation with an odd number of cycles. For $n$ odd the association is reversed.

Suppose the transposition is $(12)$. There are two cases: either $1,2$ belong to the same cycle or not.

Case 1: $1,2$ belong to the same cycle $(1\alpha2\beta)$. Then in the new permutation the cycle splits to $(1\beta)(2\alpha)$.

Case 2: $1,2$ belong to two different cycles $(1\alpha)(2\beta)$. Then in the new permutation the two cycles join to $(1\beta2\alpha)$.

In the first case we gain one cycle, in the second we lose one cycle. So in both cases, the parity changes.


Here’s another one, from Herstein’s book. For $i<j$ and a permutation $\pi$, let $\psi(\pi,i,j)$ be $1$ if $\pi(i)>\pi(j)$, and $0$ otherwise. Let $\Psi(\pi) = \sum_{i<j} \psi(\pi,i,j)$. So $\Psi(\pi)$ be the number of indices $i<j$ such that $\pi(i) > \pi(j)$. For the identity we have $\Psi(e) = 0$.

Let $k < l$ and $\tau = \pi (k l)$, so that $\tau(t) = \pi(t)$ for $t \neq k,l$, $\tau(k) = \pi(l)$ and $\tau(l) = \pi(k)$. Since $k,l$ are the only changed entries, in order to consider the change in $\Psi$ we only have to consider pairs $i,j$ in which one of the indices (at least) is $k$ or $l$.

Case 1: Pairs $i<k$ and $i<l$. It is easy to see that $\psi(\pi,i,k) = \psi(\tau,i,l)$ and $\psi(\pi,i,l) = \psi(\tau,i,k)$. So the contribution to $\Psi$ is the same.

Case 2: Pairs $k<i$ and $l<i$. Similar to Case 1.

Case 3: Pairs $k<i$ and $i<l$. This time $\psi(\pi,k,i) = 1 – \psi(\tau,i,l)$ and $\psi(\pi,i,l) = 1 – \psi(\tau,k,i)$. In total,
$$ \psi(\pi,k,i) + \psi(\pi,i,l) = 2 – (\psi(\tau,k,i) + \psi(\tau,i,l)). $$

Case 4: The pair $k<l$. Clearly $\psi(\pi,k,l) = 1 – \psi(\tau,k,l)$.

Only Case 4 changes the parity of $\Psi$, and we deduce that multiplying by a transposition changes the parity of $\Psi$. So $\pi$ is a product of an even number of transpositions iff $\Psi(\pi)$ is even.