I’m trying to prove that if $A\neq 0$ is a commutative ring and there is an injective $A$-module homomorphism $A^m\hookrightarrow A^n$ then $m\leq n$ must necessarily hold.
This is exercise 2.11 from Atiyah and MacDonald’s Introduction to Commutative Algebra, but unfortunately all the solutions available online are either very sparse with regard to this question or seem to use (to my surprise) a generalized version of Cramer’s rule. Is there perhaps some other “cleaner” approach to solving this problem?
I’ll permit myself to paraphrase and somewhat expand Balazs Strenner’s argument using the Cayley-Hamilton theorem in the MO thread mentioned in the comment by Dylan Moreland, as maybe the sheer conciseness of that answer makes it less transparent to some.
Supposing for a contradiction $m>n$, compose your injection with the natural injection $A^n\to A^m$ so as to obtain an injective $A$-module endomorphism $\phi:A^m\to A^m$, which moreover when (further) composed with the function $\gamma:A^m\to A$ taking the final coordinate gives the zero map $A^m\to A$.
Let $\chi\in A[X]$ be the characteristic polynomial of $\phi$ (which, being monic, is certainly non-zero), and let $k\in\mathbf N$ be maximal such that $X^k$ divides $\chi$; in other words $\chi=X^kP$ where $P$ has constant term $c\neq0$. (Since the matrix of $\phi$ has its last $m-n$ rows zero, one can see that $k$ is at least $m-n$, but this fact is not used in the argument.) Now by the Cayley-Hamilton theorem $\chi(\phi)=0\in\mathrm{End}(A^m)$, which we can write as $\phi^k\circ P(\phi)=0$. Since $\phi$ is injective, so is $\phi^k$, and it follows that $P(\phi)=0$. But then $\gamma\circ P(\phi)=0$; writing $P=c+XQ$ for some $Q\in A[X]$, we have $\gamma\circ P(\phi)=c\gamma+\gamma\circ\phi\circ Q(\phi)$, and since $\gamma\circ\phi=0$ this collapses to $c\gamma=0$, a contradiction, since $c\gamma$ sends the final basis element of $A^m$ to $c\neq0$.
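The mechanics can be checked on a small made-up example (mine, not from the original argument): take any map $f:\mathbb Z^2\to\mathbb Z$, say $f(x,y)=x+2y$ (necessarily non-injective, which is the whole point), compose with the inclusion $\mathbb Z\to\mathbb Z^2$, and watch where the argument bites: Cayley-Hamilton holds, but $P(\phi)\neq 0$, so $\phi$ must have a kernel, and it does.

```python
# phi(x, y) = (x + 2y, 0): the composite of f(x, y) = x + 2y with the
# inclusion Z -> Z^2.  Its matrix has last row zero, so gamma o phi = 0.

def mat_mul(A, B):
    """Multiply two 2x2 integer matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def scal(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

I = [[1, 0], [0, 1]]
phi = [[1, 2], [0, 0]]

# Characteristic polynomial of a 2x2 matrix: X^2 - tr(phi) X + det(phi) = 
# X^2 - X = X(X - 1).  So k = 1 and P = X - 1, with constant term c = -1.
tr = phi[0][0] + phi[1][1]                              # = 1
det = phi[0][0] * phi[1][1] - phi[0][1] * phi[1][0]     # = 0

# Cayley-Hamilton: chi(phi) = phi^2 - tr*phi + det*I = 0.
chi_phi = mat_add(mat_add(mat_mul(phi, phi), scal(-tr, phi)), scal(det, I))
assert chi_phi == [[0, 0], [0, 0]]

# If phi were injective, the argument would force P(phi) = 0; here
# P(phi) = phi - I is nonzero, so phi has a kernel, e.g. the vector (2, -1).
P_phi = mat_add(phi, scal(-1, I))
print(P_phi)                     # [[0, 2], [0, -1]], nonzero
v = (2, -1)
print(phi[0][0]*v[0] + phi[0][1]*v[1], phi[1][0]*v[0] + phi[1][1]*v[1])
```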
From Lam’s Lectures on Modules and Rings…
This doesn’t directly answer the original post asking for a proof that does not invoke Cramer’s rule (which I interpret to mean a proof that doesn’t use the fact that a system of homogeneous linear equations in more variables than equations has a nontrivial solution). But since other questions have been closed with this question as the duplicate, I thought perhaps we might add some of the standard proofs.
To make up for it, let me discuss four related notions:
A ring $R$ is said to have (right) IBN (invariant basis number) if, for any natural numbers $n$ and $m$, $R^n\cong R^m$ (as right modules) implies $n=m$.
A ring $R$ satisfies the rank condition if, for any $n\lt\infty$, any set of $R$-module generators for $R^n$ has cardinality at least $n$; equivalently, if $\alpha\colon R^k\to R^n$ is an epimorphism of (right) free modules, then $k\geq n$.
A ring $R$ satisfies the strong rank condition if, for any $n\lt\infty$, any set of linearly independent elements in $R^n$ has cardinality at most $n$. Equivalently, if $\beta\colon R^m\to R^n$ is a monomorphism of (right) free modules, then $m\leq n$.
A ring $R$ is stably finite if the matrix rings $\mathbb{M}_n(R)$ are Dedekind finite for all natural numbers $n$ (a ring $S$ is Dedekind finite if, for any $c,d\in S$, $cd=1$ implies $dc=1$).
(Commutative rings are, of course, Dedekind finite; they are also stably finite, as noted below.)
Proposition. If $R$ satisfies the strong rank condition, then it satisfies the rank condition.
Proof. Let $\alpha\colon R^k\to R^n$ be onto; by the universal property of free modules, $\alpha$ splits, so there is a one-to-one map $\beta\colon R^n\to R^k$ such that $\alpha\circ\beta=I_{R^n}$. By the strong rank condition, $n\leq k$, as required. $\Box$
Proposition. The following are equivalent for a ring $R$: (1) $R$ is stably finite; (2) if $R^n\cong R^n\oplus N$ for some $n$ and some $R$-module $N$, then $N=0$; (3) for every $n$, every epimorphism $R^n\to R^n$ is an isomorphism.
Proof. (1)$\Rightarrow$(3): Let $p\colon R^n\to R^n$ be an epimorphism. Since $R^n$ is free, we know that $p$ splits, so there exists $q\colon R^n\to R^n$ such that $p\circ q = I_{R^n}$. Viewing $p$ and $q$ as $n\times n$ matrices $c$ and $d$, we have that $cd=1$ in $\mathbb{M}_n(R)$, hence by stable finiteness $dc=1$. Therefore, $q\circ p = 1$, so $p$ is one-to-one; thus $p$ is an isomorphism.
(3)$\Rightarrow$(2). Compose the isomorphism $R^n\to R^n\oplus N$ with the projection onto the first coordinate; this is an epimorphism, hence by (3) an isomorphism, so $N=0$.
(2)$\Rightarrow$(1). Let $c,d\in\mathbb{M}_n(R)$ be such that $cd=1$. We view $c$ and $d$ as maps $R^n\to R^n$. Then we can write $R^n = d(R^n)\oplus\mathrm{ker}(c)$: indeed, any $x$ can be written $x = dc(x) + (x - dc(x))$ with $c(x - dc(x)) = c(x) - c(x) = 0$, and if $d(y)\in\mathrm{ker}(c)$ then $y = cd(y) = 0$. Since $cd=1$, $d$ is one-to-one, so $d(R^n)\cong R^n$. Hence $R^n\cong R^n\oplus\mathrm{ker}(c)$, so by (2) we have $\mathrm{ker}(c)=0$. Thus, $c$ is one-to-one and onto, hence invertible; since $d$ is a right inverse for $c$, we must have $d=c^{-1}$, so $dc=1$. Thus, $R$ is stably finite. $\Box$
Proposition. If $R$ is nonzero and is stably finite, then $R$ satisfies the rank condition.
Proof. If $R$ does not satisfy the rank condition, then we have an epimorphism $\alpha\colon R^k\to R^n$ with $k\lt n\lt\infty$. Then (by an argument similar to that for $R^n = d(R^n) \oplus \mathrm{ker}(c)$ in the proof of (2)$\Rightarrow$(1) above) we get
$$R^k \cong R^n\oplus\mathrm{ker}(\alpha) \cong R^k\oplus (R^{n-k}\oplus \mathrm{ker}(\alpha)),$$
where $R^{n-k}\oplus \mathrm{ker}(\alpha)\neq 0$, which proves that $R$ is not stably finite. $\Box$
Proposition. If $R$ satisfies the rank condition, then $R$ has IBN.
Proof. If $R^n\cong R^m$, the rank condition gives $n\leq m$ and $m\leq n$, hence $n=m$. $\Box$
Proposition.
A ring $R$ fails to satisfy the rank condition if and only if for some $n\gt k\geq 1$ there exist an $n\times k$ matrix $A$ and a $k\times n$ matrix $B$ over $R$ such that $AB=I_n$.
A ring $R$ satisfies the strong rank condition if and only if any homogeneous system of $n$ linear equations over $R$ with $m\gt n$ unknowns has a nontrivial solution over $R$.
Proof.
If $A$ and $B$ are matrices as given, then multiplication by $A$ gives an epimorphism $R^k\to R^n$ with $n\gt k$, proving that $R$ does not have the rank condition. Conversely, if the rank condition fails, then an epimorphism $\alpha\colon R^k\to R^n$ with $n\gt k$ yields $A$, and a splitting map for $\alpha$ gives $B$.
Write $R^n = \bigoplus_{i=1}^n e_iR$ (as right modules); let $u_1,\ldots,u_m$ be vectors of $R^n$, and write $u_j = \sum_{i=1}^n e_ia_{ij}$. Then for $x_j\in R$ we have
$$\sum_j u_jx_j = \sum_i e_i\left(\sum_j a_{ij}x_j\right);$$
this combination is zero if and only if $x_1,\ldots,x_m$ are a solution of the system $A\mathbf{x}=\mathbf{0}$, where $A=(a_{ij})$ and $\mathbf{x}=(x_1,\ldots,x_m)^t$. Thus, if every such system with $m\gt n$ has a nontrivial solution, then $u_1,\ldots,u_m$ linearly independent implies $m\leq n$, which is the strong rank condition; and conversely, a system with $m\gt n$ unknowns and no nontrivial solution produces $m\gt n$ linearly independent vectors in $R^n$. $\Box$
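The correspondence in (2) can be checked by brute force on a toy ring of my choosing, $R=\mathbb Z/4$ (which is finite and has the nilpotent element $2$): every homogeneous system of $2$ equations in $3$ unknowns has a nontrivial solution, i.e. any $3$ vectors in $R^2$ are linearly dependent, as the strong rank condition requires.

```python
# Exhaustive check of the strong rank condition for n = 2, m = 3 over Z/4.
from itertools import product

N = 4  # the ring Z/4

def nontrivial_solution(A):
    """Return some x != 0 in (Z/N)^3 with A x = 0 (mod N), for a 2x3 A."""
    for x in product(range(N), repeat=3):
        if any(x) and all(sum(a * xi for a, xi in zip(row, x)) % N == 0
                          for row in A):
            return x
    return None

# Every one of the 4^6 = 4096 possible 2x3 matrices over Z/4 admits one.
assert all(nontrivial_solution([e[:3], e[3:]]) is not None
           for e in product(range(N), repeat=6))
print("every 2x3 homogeneous system over Z/4 has a nontrivial solution")
```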
Theorem. A (right) noetherian ring $R\neq 0$ satisfies the strong rank condition.
Proof. Let $R\neq 0$ be (right) noetherian. For any $n$, $R^n$ is a noetherian module (finitely generated over a noetherian ring). If $m\gt n$, then $R^m = R^n\oplus R^{m-n}$. If we could embed $R^m$ into $R^n$, then iterating this embedding (each copy of $R^n$ contains a further copy of $R^{m-n}\oplus R^n$) would produce an ascending chain of submodules of $R^n$, $$R^{m-n}\subset R^{m-n}\oplus R^{m-n}\subset R^{m-n}\oplus R^{m-n}\oplus R^{m-n}\subset\cdots,$$ which is strictly ascending since $R^{m-n}\neq 0$. But this is impossible in a noetherian module. $\Box$
Corollary. Every commutative unital ring $R\neq 0$ satisfies the strong rank condition.
Proof. Let $A\mathbf{x}=\mathbf{0}$ be a system of $n$ linear equations in $m$ unknowns, $m\gt n$, and let $a_{ij}$ be the entries of $A$. Let $R_0$ be the subring of $R$ generated by $1_R$ and the $a_{ij}$. By the Hilbert Basis Theorem, this is a nonzero noetherian ring (it is a quotient of a polynomial ring over $\mathbb Z$ in finitely many unknowns), hence the system has a nontrivial solution in $R_0$ (as $R_0$ satisfies the strong rank condition), hence the system has a nontrivial solution in $R$. Thus, $R$ has the strong rank condition. $\Box$
Corollary. Any unital commutative ring $R$ is stably finite.
Proof. Let $C,D\in\mathbb{M}_n(R)$ be such that $CD=I_n$. Then $\det(C)\det(D)=1$, so $\det C$ is a unit in $R$; hence $C$ is invertible with inverse $(\det C)^{-1}\mathrm{adj}(C)$ (the classical adjoint of $C$); since $D$ is a right inverse of an invertible matrix, it is the inverse, so $DC=I_n$. Thus, $R$ is stably finite. $\Box$
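The adjugate identity driving this proof can be sketched in the $2\times 2$ case; the ring $\mathbb Z/10$ and the matrix below are illustrative choices of mine. For $C=\begin{pmatrix}a&b\\c&d\end{pmatrix}$ the classical adjoint is $\mathrm{adj}(C)=\begin{pmatrix}d&-b\\-c&a\end{pmatrix}$, and $C\,\mathrm{adj}(C)=(\det C)I$.

```python
# Dedekind finiteness via the adjugate, over the toy ring Z/10.
N = 10

def mat_mul_mod(A, B, n):
    """Multiply two 2x2 matrices with entries reduced mod n."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % n for j in range(2)]
            for i in range(2)]

C = [[1, 2], [3, 7]]                          # det C = 1*7 - 2*3 = 1, a unit
detC = (C[0][0] * C[1][1] - C[0][1] * C[1][0]) % N
adjC = [[C[1][1] % N, -C[0][1] % N],
        [-C[1][0] % N, C[0][0] % N]]          # classical adjoint of C

# Since det C = 1, D = adj(C) is a two-sided inverse: CD = I forces DC = I.
D = adjC
I = [[1, 0], [0, 1]]
assert mat_mul_mod(C, D, N) == I
assert mat_mul_mod(D, C, N) == I              # the Dedekind-finite conclusion
print("CD = DC = I in M_2(Z/10)")
```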
I have wanted to prove this using elementary row operations for some
time, and finally found a way. Previously, I liked the proof using
Cayley-Hamilton best.
First translate to matrix language. It must be shown that if $T$ is
an $m$ by $n$ matrix with $m < n$, then its columns are linearly
dependent. Equivalently, the linear equations corresponding to the
matrix have a nontrivial solution.
If $A$ is a field, then the proof using elementary row operations
is well known (reduce the matrix to an echelon form so that the
equations are easy to solve). Similarly if $A$ is an integral domain:
then we can work in the fraction field and clear denominators to get
a solution in $A$, or work directly in $A$ after admitting
multiplication of rows by nonzero scalars as elementary row operations.
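Here is a minimal sketch of the integral-domain recipe for $A=\mathbb Z$ (the matrix is a made-up example): row-reduce over the fraction field $\mathbb Q$ using exact rationals, then clear denominators to land back in $\mathbb Z$.

```python
# Clearing denominators: 2 equations in 3 unknowns over Z.
from fractions import Fraction
from math import lcm

T = [[2, 3, 5],
     [4, 1, 7]]

M = [[Fraction(a) for a in row] for row in T]

# One elimination step over Q: row2 <- row2 - (4/2) * row1.
M[1] = [b - (M[1][0] / M[0][0]) * a for a, b in zip(M[0], M[1])]

# Back-substitute with free variable x3 = 1, giving a rational solution.
x3 = Fraction(1)
x2 = -M[1][2] * x3 / M[1][1]
x1 = (-M[0][1] * x2 - M[0][2] * x3) / M[0][0]
x = [x1, x2, x3]

# Clear denominators to get a nontrivial integer solution.
d = lcm(*(xi.denominator for xi in x))
xz = [(xi * d).numerator for xi in x]
print(xz)                          # an integer solution of T x = 0
assert any(xz)
assert all(sum(t * v for t, v in zip(row, xz)) == 0 for row in T)
```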
For general $A$, the same method would work if we could avoid
multiplying by zero-divisors. This is impossible in general, but
multiplication by suitable non-nilpotent elements works almost as
simply.
Lemma: If all of the entries $c_i$ of a column vector $c$ are nilpotent,
then $c$ is linearly dependent; that is, $rc = 0$ for some nonzero scalar $r$.
Proof: Use induction. Let $r$ be a nonzero scalar that kills the first
$k$ entries (start with $r = 1$ and $k = 0$). To kill the first $k+1$
entries, replace $r$ by $r c_{k+1}^p$, where $p$ is the largest integer
(possibly $0$) such that $r c_{k+1}^p$ is nonzero; such a $p$ exists since
$c_{k+1}$ is nilpotent, and by maximality the new $r$ also kills $c_{k+1}$.
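A concrete run of this induction, in the toy ring $\mathbb Z/16$ with the nilpotent column $c=(2,4)$ (my own example, not from the answer):

```python
# The lemma's induction over Z/16, where 2 and 4 are nilpotent (2^4 = 0).
N = 16

def killing_scalar(col, n):
    """Follow the lemma's proof: r starts at 1; for each entry c_i, keep
    multiplying r by c_i while the product stays nonzero.  (This assumes
    every entry is nilpotent; otherwise the loop need not terminate.)"""
    r = 1
    for ci in col:
        while (r * ci) % n != 0:
            r = (r * ci) % n
    return r

c = [2, 4]
r = killing_scalar(c, N)
print(r)                            # 8: indeed 8*2 = 16 = 0 and 8*4 = 32 = 0
assert r != 0 and all((r * ci) % N == 0 for ci in c)
```

So $r=8$ gives the nontrivial dependence $8c=0$, as the lemma promises.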
If the first column of $T$ is linearly dependent, then we are done.
Otherwise, by the lemma, one of its entries is non-nilpotent.
Rearrange the rows so that the top left corner entry $d_1$ is
non-nilpotent. We could now work entirely in the ring of fractions
$S^{-1}A$, where $S$ is the multiplicative set consisting of powers of
$d_1$, but I want to make everything explicit. Working in $A$, use
multiplication of rows by $d_1$ and other more elementary row
operations to reduce the first column to $0$ except for its top
left entry which remains as $d_1$. If $d_1$ is a zero-divisor,
this gives some extra solutions to the equations and we must be
careful not to use these. This will be done later, to keep
everything explicit.
Continue reducing by induction. Keep in mind the ring of fractions
that we are otherwise trying to avoid. To keep control of
zero-divisors, at each step we must consider nilpotence in the ring
of fractions set up by the previous step. If we reach a (reduced) column
that has no non-nilpotent entry, then we are done for the purposes of
solving the equations although not for the reduction to echelon form.
Assume for simplicity that the reduction completes. Then the echelon
form has entries of $0$ below the diagonal, and diagonal entries
$d_i$ whose product is non-nilpotent; moreover, the problematic
multiplications were only by the $d_i$. (The ring of fractions
expanded at each step by allowing 1 more $d_i$ in denominators.)
Solving the equations for the echelon matrix is easy, as usual,
since there are more unknowns than equations. Take $x_{m+1}$ to be
the product of the $d_i$, and $x_j = 0$ for $j$ larger than $m+1$.
Solve as usual for $x_j$ with $j \leq m$, working from the bottom
row up. The critical point here is that the product of the $d_i$'s
is nonzero. We had to take this product to get a value that can be
divided by all of the $d_i$. If we had not expanded the ring of
fractions at each step, then we still would have obtained an echelon
matrix with non-nilpotent $d_i$, but then the $d_i$ would have been
unrelated, so their product might be $0$.
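Assuming the reduction has completed, the solving recipe can be run concretely; the echelon matrix over $\mathbb Z/12$ below is a made-up example (with zero divisors: $2\cdot 6=0$) whose diagonal entries $2$ and $3$ are non-nilpotent.

```python
# The final solving step over Z/12: 2 equations, 3 unknowns, echelon form.
N = 12
U = [[2, 1, 5],
     [0, 3, 1]]

d = [U[0][0], U[1][1]]            # diagonal entries d_1 = 2, d_2 = 3
x = [0, 0, (d[0] * d[1]) % N]     # last free variable = product of the d_i

# Back-substitute from the bottom row up: in row i, solve
# d_i * x_i = -(known terms).  In this example the choice of the last
# variable makes each equation solvable, and we find x_i by a small search
# (in general the argument runs in the ring of fractions instead).
for i in (1, 0):
    rhs = (-sum(U[i][j] * x[j] for j in range(3) if j != i)) % N
    x[i] = next(t for t in range(N) if (d[i] * t) % N == rhs)

print(x)                          # a nontrivial solution of U x = 0
assert any(x)
assert all(sum(U[i][j] * x[j] for j in range(3)) % N == 0 for i in range(2))
```

Note that the last entry of the solution is $2\cdot 3=6\neq 0$ in $\mathbb Z/12$, exactly the "product of the $d_i$ is nonzero" point above.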