Math Help - Matrix of a Nilpotent Operator Proof

1. Matrix of a Nilpotent Operator Proof

I'm trying to understand this one proof about the matrix of a nilpotent operator and I'm stuck on this one part. Here is the theorem I'm trying to prove and how the proof goes:

Suppose N is a nilpotent operator on a vector space V. Then there is a basis of V with respect to which the matrix of N has the form:

$$\begin{bmatrix} 0 & & * \\ & \ddots & \\ 0 & & 0 \end{bmatrix}$$

where all entries on and under the diagonal are 0 and the asterisk indicates the other entries.

The proof starts by creating a basis for V. First a basis for $null(N)$ is chosen and then extended to a basis of $null(N^2)$. Then our basis is extended to a basis for $null(N^3)$ and so on until $null(N^{\dim{V}})=V$. Now we have a basis of V.

The text then says:
"Now let’s think about the matrix of N with respect to this basis. The
first column, and perhaps additional columns at the beginning, consists
of all 0’s because the corresponding basis vectors are in $null(N)$. The
next set of columns comes from basis vectors in $null(N^2)$. Applying N
to any such vector, we get a vector in $null(N)$; in other words, we get a
vector that is a linear combination of the previous basis vectors. Thus
all nonzero entries in these columns must lie above the diagonal. The
next set of columns come from basis vectors in $null(N^3)$. Applying N
to any such vector, we get a vector in $null(N^2)$; in other words, we get a
vector that is a linear combination of the previous basis vectors. Thus,
once again, all nonzero entries in these columns must lie above the
diagonal. Continue in this fashion to complete the proof."

I understand why the entries in the first set of columns are all 0. However, I do not understand how the second set of columns is determined.

2. Originally Posted by Anthonny
I understand why the entries in the first set of columns are all 0. However, I do not understand how the second set of columns is determined.
Think about it like this. Let $(x_1,\cdots,x_m)$ be the selected basis for $\ker N$ and $\{x_1,\cdots,x_n\}$ the full basis constructed. Then, as the author said, each basis vector $x_j$ in the second set (so $j>m$ and $x_j\in\ker N^2$) satisfies $N(x_j)\in \ker N$, so there exist $\alpha_1,\cdots,\alpha_m\in F$ such that $\displaystyle N(x_j)=\sum_{r=1}^{m}\alpha_r x_r\quad\mathbf{(1)}$. The $j^{\text{th}}$ column is formed by listing the coefficients $\beta_1,\cdots,\beta_n$ of $\displaystyle N(x_j)=\sum_{r=1}^{n}\beta_r x_r$, but since the representation with respect to a basis is unique we may conclude by $\mathbf{(1)}$ that $\beta_r=0$ for $r>m$. Since $j>m$, this means every entry of the column on or below the diagonal (rows $r\geq j$) is zero. Make sense?

3. ok, remember we have picked a basis B = {v1,...,vk} for V. so we can write v in V as: v = a1v1 + a2v2 +....+ akvk.

hence N(v) = a1N(v1) + a2N(v2) + a3N(v3) +...+ akN(vk)

now the columns of N will be the images N(vj), in the basis B. for illustration suppose dim(null(N)) = 3.

relative to the basis B, v1,v2 and v3 have coordinates (1,0,0,0,...,0), (0,1,0,0,...,0) and (0,0,1,0,...,0). clearly N(v1) and N(v2) and N(v3) are all 0-vectors (in the basis B!), since v1,v2,v3 are in null(N).

now, in the basis B, v4 has coordinates (0,0,0,1,0,...,0). v4 is not in null(N), so N(v4) is not 0, and the 4th column for N cannot be all 0's.

but N^2(v4) = 0. this means that N(v4) is in null(N), so N(v4) = b1v1 + b2v2 + b3v3. so the B-coordinates of N(v4) are (b1,b2,b3,0,....,0).

again, for the sake of illustration, suppose that v5 isn't in null(N^2), but is in null(N^3).

then N(v5) is in null(N^2) so N(v5) = c1v1 + c2v2 + c3v3 + c4v4, so the B-coordinates of N(v5) are (c1,c2,c3,c4,0,...,0)

(remember {v1,v2,v3,v4} form a basis for null(N^2)).
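
to picture it, here is what the matrix of N looks like in this illustration. i'm assuming dim(V) = 5 purely so there is something finite to draw (that number is my assumption, it isn't part of the theorem); the b's and c's are the coefficients from above:

$$\begin{bmatrix} 0 & 0 & 0 & b_1 & c_1 \\ 0 & 0 & 0 & b_2 & c_2 \\ 0 & 0 & 0 & b_3 & c_3 \\ 0 & 0 & 0 & 0 & c_4 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

column 4 stops at row 3 and column 5 stops at row 4, so every entry on or below the diagonal is 0.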

do you see it now?

The jth column of the matrix for $N$ consists of the coordinates of $N(\mathbf{e}_j)$ with respect to the constructed basis for $V$.

Suppose $\mathbf{e}_j \in \text{null}N^2$. Then $N(\mathbf{e}_j)\in \text{null} N$. Hence $N(\mathbf{e}_j)$ can be written as a linear combination of the basis for $\text{null} N$. This implies that the coordinates with respect to the rest of the basis for $V$ are zero. That is, everything on and below the diagonal of the matrix for $N$ in the jth column will be zero. You then continue arguing like this for columns corresponding to basis elements in $\text{null} N^3$, $\text{null} N^4$, etc.
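
If you want to check the argument on a concrete matrix, here is a minimal sketch in Python (standard library only) that follows the construction from the proof: take a basis of $\text{null} N$, extend it to $\text{null} N^2$, and so on, then write $N$ in that basis. The helper names (`rref`, `null_space`, `adapted_basis`, `matrix_in_basis`, `is_strictly_upper`) and the sample matrix `N` are my own choices for illustration and do not come from the text.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def rref(M):
    """Return the reduced row echelon form of M and its pivot columns."""
    M = [[Fraction(x) for x in row] for row in M]
    pivots, r = [], 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        lead = M[r][c]
        M[r] = [x / lead for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivots


def null_space(A):
    """Basis of null(A): one vector per free column of the RREF."""
    R, pivots = rref(A)
    n = len(A[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for row, pc in zip(R, pivots):
            v[pc] = -row[free]
        basis.append(v)
    return basis


def adapted_basis(N):
    """Extend a basis of null(N) to null(N^2), null(N^3), ... as in the proof."""
    n = len(N)
    basis, power = [], N
    for _ in range(n):  # N^n = 0 whenever N is nilpotent
        cols = basis + null_space(power)
        if cols:
            # Pivot columns of [basis | candidates] give the extended basis;
            # the old basis vectors come first, so they are always kept.
            _, pivots = rref([list(row) for row in zip(*cols)])
            basis = [cols[j] for j in pivots]
        power = matmul(power, N)
    if len(basis) != n:
        raise ValueError("N does not appear to be nilpotent")
    return basis


def matrix_in_basis(N, basis):
    """Matrix X of N in `basis`, found by row reducing [P | N P] to [I | X]."""
    n = len(N)
    P = [list(row) for row in zip(*basis)]  # basis vectors as columns
    R, _ = rref([p + q for p, q in zip(P, matmul(N, P))])
    return [row[n:] for row in R]


def is_strictly_upper(X):
    """True if every entry on or below the diagonal is zero."""
    return all(X[i][j] == 0 for j in range(len(X)) for i in range(j, len(X)))


# Sample nilpotent matrix (an assumption for illustration): strictly LOWER
# triangular in the standard basis, so the change of basis actually matters.
N = [[0, 0, 0],
     [1, 0, 0],
     [2, 3, 0]]
basis = adapted_basis(N)
X = matrix_in_basis(N, basis)
print(is_strictly_upper(X))  # True
```

For this `N` the construction picks $\mathbf{e}_3$ (which spans $\text{null} N$), then $\mathbf{e}_2$, then $\mathbf{e}_1$, and the columns of the new matrix are $(0,0,0)$, $(3,0,0)$ and $(2,1,0)$, which matches the column-by-column argument above.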

5. Originally Posted by Drexel28
Make sense?

Yes, I completely understand how the proof works now.
Thank you! The other replies made sense to me as well.