Is the square root of a triangular matrix necessarily triangular?

$X^2 = L$, with $L$ lower triangular, but $X$ is not lower triangular. Is it possible?

I know that a lower triangular matrix $L$ (not a diagonal matrix for this question),
$$L_{nm} \cases{=0 & for all $m > n$ \\ \ne 0 & for some $ m<n$} $$
when squared is lower triangular. But is the square root, when it exists, always lower triangular? I have found some examples that give a diagonal matrix:
$$\pmatrix{1 & 0 & 1 \\ 0 & e & 0 \\ c & 0 & -1}\pmatrix{1 & 0 & 1 \\ 0 & e & 0 \\ c & 0 & -1}=\pmatrix{c+1 & 0 & 0 \\ 0 & e^2 & 0 \\ 0 & 0 & c+1}=L$$
But I am wondering about the case where the square is genuinely triangular, i.e. not diagonal.
I believe the answer is yes: if a square root of such a lower triangular matrix exists, then it is also lower triangular. I am looking for a good argument as to why, or for any counterexamples.

EDIT:

Also, for any $2 \times 2$ matrix, if the upper-right entry of the square is zero, then the lower-left entry of the square is zero as well.
$$\pmatrix{a & c \\ d & b\\}\pmatrix{a & c \\ d & b\\}=\pmatrix{a^2 + cd & ac + cb \\ ad + bd & cd + b^2\\}$$

$$(ac+cb= 0) \Rightarrow (a = -b) \Rightarrow (ad+bd=0)$$
(assuming $c \ne 0$; if $c = 0$ the matrix is already lower triangular, so there is nothing to show).
So a counterexample will necessarily be of higher dimension, but I am thinking that the same logic will still apply somehow, along the lines of $2 \times 2$ sub-matrices or something.
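
For what it's worth, the $2\times2$ computation can be checked symbolically; here is a minimal sketch using sympy (the variable names mirror the matrix above):

```python
# Symbolic check of the 2x2 argument: if the upper-right entry of the square
# vanishes and c != 0, then the lower-left entry of the square vanishes too.
from sympy import symbols, Matrix, solve, simplify

a, b, c, d = symbols('a b c d')
M = Matrix([[a, c], [d, b]])
M2 = M * M

print(M2[0, 1])                             # a*c + b*c
sol = solve(M2[0, 1], a)                    # [-b]  (implicitly assumes c != 0)
print(simplify(M2[1, 0].subs(a, sol[0])))   # 0
```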

Answers collected from the web:

$$ \left( \begin{matrix} 1 & 1 & 1 \\ 4 & 1 & 2 \\ 1 & -2 & -3 \end{matrix} \right)^2 = \left( \begin{matrix} 6 & 0 & 0 \\ 10 & 1 & 0 \\ -10 & 5 & 6 \end{matrix} \right)$$

This example was found more or less by

  • Picking an arbitrary top row
  • Filling out the middle column to make a 0 in the product
  • Filling out the right column to make a 0 in the product
  • Filling out the middle row to make a 0 in the product
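
A quick numerical sanity check of this example (a sketch using numpy, not part of the construction):

```python
import numpy as np

# The non-triangular matrix found above ...
X = np.array([[1,  1,  1],
              [4,  1,  2],
              [1, -2, -3]])
# ... squares to a lower triangular (and non-diagonal) matrix.
L = X @ X
print(L)                                  # [[6 0 0] [10 1 0] [-10 5 6]]
print(np.allclose(np.triu(L, k=1), 0))    # True: everything above the diagonal is 0
```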

$$\pmatrix{0 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 1}\pmatrix{0 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 1}=\pmatrix{1 & 0 & 0 \\ 0 & 1 &0 \\ 2 & 2 & 1}$$

I simply started from $\pmatrix{0 & 1 \\ 1 & 0 }^2= I_2$, and then extended it to a 3×3 so that the square is no longer diagonal. The key here is that if you start with a matrix whose square is lower triangular, add $(0, 0, \ldots, 0, 1)$ as the last column, and put anything else in the last row, the square stays lower triangular…
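
Here is a small numerical illustration of that extension step (a sketch; the last row below is an arbitrary choice):

```python
import numpy as np

# Start from a matrix whose square is lower triangular ...
A = np.array([[0, 1],
              [1, 0]])               # A @ A = I, in particular lower triangular
# ... append (0, ..., 0, 1)^T as last column and anything as last row.
B = np.block([[A,                   np.zeros((2, 1))],
              [np.array([[5, -7]]), np.ones((1, 1))]])
print(B @ B)
print(np.allclose(np.triu(B @ B, k=1), 0))   # True: the square stays lower triangular
```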

With counterexamples abounding, I’ll try here to explain why nonetheless for most triangular $L$, any solution to $X^2=L$ will be similarly triangular, and investigate properties of $L$ (replacing that of having a non-zero below-diagonal entry) that will ensure this. The key observations are

  • Lower triangularity of $A$ means that the subspaces $\langle e_i,\ldots,e_n\rangle$ for $i=1,\ldots,n$ are $A$-stable,
  • Any solution $X$ of $X^2=L$ will commute with $L$,
  • Therefore the kernel of any polynomial $P_L$ in $L$ will be $X$-stable.

(For the last point, if $P_L(v)=0$, then $P_L(X(v))=X(P_L(v))=0$.)

Now the subspaces $\langle e_i,\ldots,e_n\rangle$ are $L$-stable if $L$ is triangular, but they need not be kernels of polynomials in $L$; however should they all be such kernels, then they will all be $X$-stable for any solution $X$ by the last point, and $X$ will have to be lower triangular. For this to happen it suffices that the diagonal coefficients $a_{1,1},\ldots,a_{n,n}$ of $L$ are all distinct. For in that case $ \langle e_i,\ldots,e_n\rangle$ is precisely $\ker((L-a_{i,i}Id)\circ\cdots\circ(L-a_{n,n}Id))$, as can easily be checked. The reason that this argument fails when some $a_{j,j}=a_{k,k}$ with $j<k$ is that the eigenspace for this $a_{j,j}$ might be of dimension${}>1$, in which case the kernel of any polynomial in $L$ that contains a factor $L-a_{j,j}$ will kill not only $e_k$ but also $e_j$, so that there is no way for the polynomial to have exactly $ \langle e_k,\ldots,e_n\rangle$ as kernel.
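
To illustrate the claim about those kernels, here is a small sympy sketch: for one arbitrarily chosen lower triangular $L$ with distinct diagonal entries, the kernel of $(L-a_{i,i}Id)\circ\cdots\circ(L-a_{n,n}Id)$ has dimension $n-i+1$ and is spanned by vectors supported on coordinates $i,\ldots,n$, i.e. it equals $\langle e_i,\ldots,e_n\rangle$:

```python
from sympy import Matrix, eye

# An arbitrary lower triangular matrix with distinct diagonal entries 1, 2, 3, 4.
L = Matrix([[1, 0, 0, 0],
            [5, 2, 0, 0],
            [7, 1, 3, 0],
            [2, 4, 6, 4]])
n = L.shape[0]

for i in range(n):                       # i = 0, ..., n-1 (0-indexed version of the text)
    P = eye(n)
    for j in range(i, n):
        P = P * (L - L[j, j] * eye(n))   # (L - a_{i,i} I) ... (L - a_{n,n} I)
    ker = P.nullspace()
    supported_on_tail = all(v[k] == 0 for v in ker for k in range(i))
    print(i, len(ker) == n - i, supported_on_tail)   # True, True for every i
```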

In fact an even weaker condition on $L$ can be given that forces $X$ to be triangular: if the minimal polynomial of $L$ is equal to its characteristic polynomial $(t-a_{1,1})\cdots(t-a_{n,n})$, then the $n$ kernels $\ker((L-a_{i,i}Id)\circ\cdots\circ(L-a_{n,n}Id))$ are all distinct, and therefore necessarily equal to $\langle e_i,\ldots,e_n\rangle$ respectively. Another argument for this case is that here the only matrices commuting with $L$ are polynomials in $L$, and therefore triangular; this applies in particular to $X$. The condition on the minimal polynomial can be seen to be equivalent to all eigenspaces of $L$ having dimension $1$.

I believe that this sufficient condition for $L$ to force $X$ to be triangular is also necessary; in other words, once there is an eigenspace of dimension${}>1$, this can be exploited to construct a solution $X$ that is not triangular.
Here is an example of this. Suppose we want $L$ to have a double eigenvalue $1$, and a single eigenvalue $4$ (which makes taking square roots easier). It will help to make the entries $1$ on the diagonal nonadjacent, so take $L$ of the form
$$
L=\begin{pmatrix}1&0&0\\x&4&0\\y&z&1\end{pmatrix}
$$
Now we need $L$ to have an eigenspace for $1$ of dimension $2$ (in other words to be diagonalisable), and so $L-I_3$ should have rank $1$, which here means $xz-3y=0$. Let’s take $x=y=z=3$ to have this. This also leads to an easy second eigenvector $e_1-e_2$ in addition to the inevitable eigenvector $e_3$ at $\lambda=1$. Computing $L-4I_3$ shows that for these values $e_2+e_3$ is the eigenvector at $\lambda=4$. Now it happens that we can choose the matrix $P$ with eigenvectors as columns (which we will use for base change) to itself be lower triangular; this is of no relevance, but it is fun and it makes the inverse easy to find:
$$
P=\begin{pmatrix}1&0&0\\-1&1&0\\0&1&1\end{pmatrix}
\quad\text{for which}\quad
P^{-1}=\begin{pmatrix}1&0&0\\1&1&0\\-1&-1&1\end{pmatrix}.
$$
The most important point is to decide what $X$ does with the eigenspace of $L$ for $\lambda=1$. As a linear map, $X$ must stabilize this eigenspace globally, and its restriction to this subspace must square to the identity (it must be an involution); taking the restriction itself to be plus or minus the identity will not give a counterexample, so let us take it to have both eigenvalues $1$ and $-1$. A simple way to do this is simply to interchange the two eigenvectors we found. We still have the choice of square roots $\pm2$ on the eigenspace for $\lambda=4$, so one gets
$$
X=P\cdot\begin{pmatrix}0&0&1\\0&\pm2&0\\1&0&0\end{pmatrix}\cdot P^{-1}
$$
giving
$$
X=\begin{pmatrix}-1&-1&1\\3&3&-1\\3&2&0\end{pmatrix}
\quad\text{and}\quad
X=-\begin{pmatrix}1&1&-1\\1&1&1\\1&2&0\end{pmatrix},
$$
which indeed both square to $L$. You can experiment with variants, like changing the double eigenvalue of $L$ to $\lambda=-1$ instead of $\lambda=1$; one can still find a real square root of the restriction of $L$ to this eigenspace, though the flavour is a bit different.
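
For completeness, a quick numerical confirmation of this construction (a minimal numpy sketch rebuilding $X = PDP^{-1}$ for both sign choices and checking $X^2=L$):

```python
import numpy as np

L = np.array([[1, 0, 0],
              [3, 4, 0],
              [3, 3, 1]])
P = np.array([[ 1, 0, 0],
              [-1, 1, 0],
              [ 0, 1, 1]])
Pinv = np.linalg.inv(P)

for s in (2, -2):                       # the two square roots of 4 on its eigenspace
    D = np.array([[0, 0, 1],
                  [0, s, 0],
                  [1, 0, 0]])           # swaps the two lambda=1 eigenvectors
    X = P @ D @ Pinv
    print(np.round(X).astype(int))
    print(np.allclose(X @ X, L))        # True in both cases
```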

Expanding on @Hurkyl’s answer: in a 3×3 matrix $M$ whose square is to be lower triangular, we have six entries which in principle can be chosen freely. Let’s denote the freely choosable entries of $M$ with capital letters and the dependent entries with small letters; then we can state it this way:
$$ M=\begin{bmatrix} A&g&i \\ D&B&h\\ F&E&C\end{bmatrix} $$ Let’s restrict to the case where $D \ne 0$, $E \ne 0$ (and $A+B \ne 0$, $B+C \ne 0$). The entry $F$ is completely irrelevant for the triangularity of $M^2$. If in $M$ we set
$$ \begin{aligned}
i&=- {(A+B)(A+C)(B+C) \over DE } \\
g&=-i\, {E \over A+B} = {(B+C)(A+C) \over D } \\
h&=-i\, {D \over B+C} = {(A+B)(A+C) \over E } \\
\end{aligned} $$
we get the result
$$ M^2 = \text{ <lower triangular> }$$
If we set $F=0$, $D=E=1$, and also all other free variables to $1$: $A=B=C=1$, we get
$$ M=\begin{bmatrix} 1&4 &- 8 \\ 1&1&4\\ 0&1&1\end{bmatrix} $$
$$ M^2=\begin{bmatrix} 5&. & . \\ 2&9&.\\ 1&2&5\end{bmatrix} $$
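
These formulas can be verified symbolically; here is a minimal sympy sketch confirming that the three above-diagonal entries of $M^2$ vanish identically:

```python
from sympy import symbols, Matrix, simplify

A, B, C, D, E, F = symbols('A B C D E F')

# Dependent entries as given above.
i = -(A + B)*(A + C)*(B + C) / (D*E)
g = (B + C)*(A + C) / D
h = (A + B)*(A + C) / E

M = Matrix([[A, g, i],
            [D, B, h],
            [F, E, C]])
M2 = M * M
print([simplify(M2[r, c]) for r, c in [(0, 1), (0, 2), (1, 2)]])   # [0, 0, 0]
```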

Now I’m curious how many free variables we have for bigger matrix-sizes…

Take your matrix $M$ (such that $M$ is not triangular but $M^2$ is) and any 2 by 2 lower triangular matrix $L$ such that $L^2$ is not diagonal and consider the 5 by 5 matrix $M\oplus L.$ Then $(M \oplus L)^2 = M^2\oplus L^2$ is lower triangular but not diagonal.

For example

$$\pmatrix{1 & 0 & 1& 0 & 0 \\ 0 & e & 0 & 0 & 0 \\ c & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 1& 0\\ 0 & 0 & 0 & 1& 1}$$ squares to a lower triangular matrix that is not diagonal.
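
A short numerical check of the direct-sum trick (a sketch; the values $c=2$, $e=3$ below are arbitrary choices just for the test):

```python
import numpy as np

c, e = 2.0, 3.0                     # arbitrary values for the check
M = np.array([[1, 0,  1],
              [0, e,  0],
              [c, 0, -1]])          # the 3x3 from the question: M @ M is diagonal
L = np.array([[1, 0],
              [1, 1]])              # lower triangular, with L @ L not diagonal
S = np.block([[M, np.zeros((3, 2))],
              [np.zeros((2, 3)), L]])
print(S @ S)                        # block-diagonal: M @ M on top, L @ L below
print(np.allclose(np.triu(S @ S, k=1), 0))   # True: lower triangular but not diagonal
```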

The fact that you needed to explicitly require a nonzero coefficient below the diagonal in $L$ should have been a warning. It smells of a condition that pushes away the simplest counterexamples without doing anything that would help you to prove the result in general. In such cases you can usually resuscitate the counterexample by adding something independent just for the sake of the extra condition. The first such idea that came to my mind was a block matrix
$$
X=\begin{pmatrix}0&-1&0&0\\1&0&0&0\\0&0&1&0\\0&0&1&1\end{pmatrix},
$$
but as N.S. shows you need not even go to $4\times 4$ matrices. Thinking up such a counterexample might take less effort than writing up the question…
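
For what it's worth, this block matrix is easy to check numerically (a small numpy sketch; the top block is a quarter turn, which squares to $-I_2$):

```python
import numpy as np

X = np.array([[0, -1, 0, 0],
              [1,  0, 0, 0],
              [0,  0, 1, 0],
              [0,  0, 1, 1]])
print(X @ X)
# [[-1  0  0  0]
#  [ 0 -1  0  0]
#  [ 0  0  1  0]
#  [ 0  0  2  1]]   -- lower triangular, not diagonal, while X itself is not triangular
```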

Edit: From the comments here and at the question itself I understand that my wording was interpreted as derision, which I regret because it was not meant as such. Instead I was trying to bring across a general lesson that can be learned from this kind of question.

It happens that you seem to be near a general fact, but then there is an obvious counterexample showing that in your formulation it is wrong. Then you can add a condition to the statement that bars this counterexample, and then conjecture that this is now a general fact. This sometimes works, as some general statements need a technical clause to be true; for instance “every non-constant polynomial in $\mathbf C[X]$ has at least one root” really needs the “non-constant”. However whether you are in this situation or not depends on the extra condition actually being useful in a proof that you could imagine giving. In this case the extra condition is some coefficient(s) in $X^2$ being non-zero, while what you want to prove is that a whole bunch of coefficients in $X$ must be zero. It is not impossible that there could be some implication of this kind, as the argument you gave for the $2\times2$ case illustrates, but deducing equalities from inequalities is not an obvious kind of thing in algebra. This could lead you to suspect the extra condition was not the proper way to rule out the counterexample, and being able to resuscitate it is the simplest way to be sure it is not.

I will give an answer which is more conceptual than the ones already given. First, let $X$ be a square (real) matrix, and call a (real) matrix $L$ a square root of $X$ if $X=L L^T$ (note that this is a different notion of square root from the $X^2=L$ used in the question). You ask: if $X$ is triangular, must $L$ then be triangular? The short answer is “no”.

In fact, we can characterize all possible square roots of $X$. Let $L$ and $M$ be two square roots. Then they are orthogonally related! (This is called “unitary freedom of square roots”.) Let us prove this. First, it is easy to see that if $L$ is a square root, then so is $LQ$, where $Q$ is any orthogonal matrix (of the right size). Conversely, this gives all the solutions: writing $Q=M^{-1} L$ (assuming $M$ is invertible), it is easy to check that $Q$ is an orthogonal matrix, which proves the claim.

So, returning to your question, let $L$ be a triangular square root. It is easy to see that $LQ$ may well fail to be triangular for an orthogonal $Q$. If you think about the QR decomposition of a matrix, this becomes immediate.
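
Here is a small numerical illustration of this unitary freedom (a sketch with numpy; the matrices are arbitrary, and $X$ is taken symmetric positive definite so that a factorisation $X = LL^T$ exists):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
X = A @ A.T + 3 * np.eye(3)            # an arbitrary symmetric positive definite matrix

L = np.linalg.cholesky(X)              # lower triangular factor: X = L @ L.T
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # a random orthogonal matrix

M = L @ Q
print(np.allclose(M @ M.T, X))         # True: M is another "square root" in this sense
print(np.allclose(np.triu(M, k=1), 0)) # False (generically): M is not triangular
```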