# Thread: Quick one on stochastic matrices

1. ## Quick one on stochastic matrices

Let $P$ be a stochastic matrix and $\pi$ a non-zero row vector solution of the equation $\pi P = \pi$ (such a solution necessarily exists, since $\det(P-I)=\det(P^T-I)=0$).

If $\pi = (\pi_1,\dots,\pi_n)$, how do you prove that $(|\pi_1|,\dots,|\pi_n|)$ is also a solution?

I was wondering whether the result can be obtained without elaborate appeals to ergodicity results or to the decomposition of the state space.
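(For intuition, here is a small numerical illustration of the statement, not a proof, assuming NumPy. A reducible $P$ is used on purpose: for an irreducible $P$ the invariant row vector already has constant sign, so a mixed-sign $\pi$ is built as a difference of the invariant vectors of two blocks.)

```python
import numpy as np

# A reducible stochastic matrix: two irreducible 2x2 blocks.
P = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.3, 0.7, 0.0, 0.0],
    [0.0, 0.0, 0.2, 0.8],
    [0.0, 0.0, 0.6, 0.4],
])

# Invariant row vectors of each block, extended by zeros (found by hand).
pi1 = np.array([3.0, 5.0, 0.0, 0.0])   # invariant for the first block
pi2 = np.array([0.0, 0.0, 3.0, 4.0])   # invariant for the second block
assert np.allclose(pi1 @ P, pi1) and np.allclose(pi2 @ P, pi2)

# Their difference is an invariant row vector with mixed signs...
pi = pi1 - pi2
assert np.allclose(pi @ P, pi)

# ...and its entrywise absolute value is again invariant, as claimed.
assert np.allclose(np.abs(pi) @ P, np.abs(pi))
```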

2. Originally Posted by akbar
Let $P$ be a stochastic matrix and $\pi$ a non-zero row vector solution of the equation $\pi P = \pi$ (such a solution necessarily exists, since $\det(P-I)=\det(P^T-I)=0$).

If $\pi = (\pi_1,\dots,\pi_n)$, how do you prove that $(|\pi_1|,\dots,|\pi_n|)$ is also a solution?

I was wondering whether the result can be obtained without elaborate appeals to ergodicity results or to the decomposition of the state space.

I don't think there is an immediate proof, but there are several proofs that don't involve Markov chains (although I prefer the Markov chain proof).

First you can reduce to a matrix with positive coefficients. This is like the decomposition of a Markov chain, but in an "algebraic" context: you reduce to the case where $P$ is irreducible (on a smaller state space), and then, replacing $P$ by $\frac12(I+P)$ if necessary to rule out periodicity (this changes neither stochasticity nor the solutions of $\pi P=\pi$), there exists a power $k$ such that $P^k$ has positive coefficients; so we may study $P^k$ (still stochastic, and $\pi P^k=\pi$).

Then, for stochastic matrices with positive coefficients, one proves rather simply (considering right eigenvectors) that 1 is a simple eigenvalue and that it is the only one of greatest absolute value. More precisely, if $Px=\lambda x$, apply the triangle inequality to the row where $x$ has its component of greatest modulus (using stochasticity); you get $|\lambda|\leq 1$, and if $|\lambda|=1$ the previous inequalities are in fact equalities, hence the equality cases give $x_i=x_j$ for all $i,j$, which is what was needed.
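This eigenvalue picture is easy to check numerically; a minimal sketch assuming NumPy, on a random positive stochastic matrix (not anything specific from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random stochastic matrix with strictly positive entries.
A = rng.uniform(0.1, 1.0, size=(5, 5))
P = A / A.sum(axis=1, keepdims=True)

w = np.linalg.eigvals(P)
w = w[np.argsort(-np.abs(w))]   # sort by decreasing modulus

# 1 is an eigenvalue (rows sum to 1, so P @ ones = ones)...
assert np.isclose(w[0], 1.0)
# ...it is simple, and every other eigenvalue is strictly smaller in modulus.
assert np.abs(w[1]) < 1.0
```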

Then (using triangularization), $P^n$ converges to a rank-1 matrix $\Pi$, which has to be stochastic and nonnegative; therefore all the rows of $\Pi$ are equal, stochastic and nonnegative. Since $\Pi P=\Pi$ (let $n\to\infty$ in $P^n P =P^{n+1}$), this common row is a left eigenvector of $P$ for the eigenvalue 1. And because 1 is a simple eigenvalue, this nonnegative vector spans the whole eigenspace.
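The convergence of $P^n$ to a rank-1 stochastic matrix with identical rows can also be observed directly; a sketch assuming NumPy, again on a random positive stochastic matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.uniform(0.1, 1.0, size=(4, 4))
P = A / A.sum(axis=1, keepdims=True)   # positive stochastic matrix

Pn = np.linalg.matrix_power(P, 50)     # P^n for large n

# All rows of the (near-)limit coincide: Pi is a rank-1 stochastic matrix.
assert np.allclose(Pn, Pn[0])
assert np.linalg.matrix_rank(Pn) == 1

# The common row is a left eigenvector for the eigenvalue 1.
pi = Pn[0]
assert np.allclose(pi @ P, pi)
assert np.isclose(pi.sum(), 1.0)
```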

This is what you need. There are other proofs, but they all involve the same ideas: study the right eigenvalues via triangle inequalities, use the fact that they coincide with the left eigenvalues (with the same multiplicities), and conclude in one way or another using the spectral radius or the limit of $P^n$ as I did. Now you see why the Markov chain proof is not really elaborate (no ergodicity involved, by the way, only the Markov property; and it is much more visual).

Speaking of simple proofs: take it as a challenge to prove from scratch that there exists a non-zero $\pi$ such that $\pi P=\pi$ (i.e. a left eigenvector for the eigenvalue 1). By "from scratch" I mean without determinants (or else you prove every property you use, but that is not a clean way to proceed, since the determinant is a "strong" tool compared with this simple statement...). This is not very different from the fact that $A$ and $A^T$ have the same rank for any $A$; once you have done it, the Markov chain version should look even more elementary.
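One determinant-free construction, offered as a standard illustration rather than as the intended solution to the challenge, is Cesàro averaging: the averages $\frac1N\sum_{k=0}^{N-1} e P^k$ stay in a compact set, and applying $P$ shifts the sum by the boundary term $\frac1N(eP^N-e)$, which vanishes as $N\to\infty$, so any limit point is invariant. A sketch assuming NumPy, on a periodic chain where $P^n$ itself does not converge:

```python
import numpy as np

# A periodic stochastic matrix: P^n oscillates, but Cesaro averages converge.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])

N = 1000
e = np.array([1.0, 0.0])               # start from the first state
avg = sum(e @ np.linalg.matrix_power(P, k) for k in range(N)) / N

# The Cesaro average is invariant up to an O(1/N) boundary term.
assert np.allclose(avg @ P, avg, atol=2.0 / N)
```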

3. Thanks for your answer. Given a stochastic matrix $P$, the original problem is to prove the existence of a vector $\pi$ with non-negative entries such that $\pi P=\pi$.

My initial answer was to start as you did, i.e. to restrict to a subset $C$ of (persistent) states such that the matrix $P_C$ is irreducible. For this, you use the decomposition theorem for Markov chains (see for example Grimmett & Stirzaker, p. 224) and the existence of at least one persistent state, which is immediate for a finite state space.

Once in this subset, you end up with an ergodic matrix, to which you can apply the ergodicity theorem, i.e. the existence of $\pi_C$ with positive entries such that $\pi_C P_C = \pi_C$.
Complete the vector with zeros to obtain a solution to the problem.

Such a proof uses rather heavy results, and I doubted it was the simplest one. Starting from right eigenvectors seemed a simpler approach, hence my question.

Your useful proof also uses the decomposition theorem, but replaces the ergodicity theorem with results on matrix reduction. Other possible proofs involve finding a vector which is also a probability distribution ($\sum_i \pi_i =1$), which comes down to using a fixed point theorem or (worse) Brouwer's theorem.

So in the end I fully agree with you: there is no straightforward solution to what initially seemed a simple problem.

Just to check, I would appreciate it if you could say more about the "Markov chain proof" you mentioned, since it seems to differ from the one I came up with.

Thanks again for your precious help.

4. Originally Posted by akbar
Thanks for your answer. Given a stochastic matrix $P$, the original problem is to prove the existence of a vector $\pi$ with non-negative entries such that $\pi P=\pi$.

My initial answer was to start as you did, i.e. to restrict to a subset $C$ of (persistent) states such that the matrix $P_C$ is irreducible. For this, you use the decomposition theorem for Markov chains (see for example Grimmett & Stirzaker, p. 224) and the existence of at least one persistent state, which is immediate for a finite state space.

Once in this subset, you end up with an ergodic matrix, to which you can apply the ergodicity theorem, i.e. the existence of $\pi_C$ with positive entries such that $\pi_C P_C = \pi_C$.
Complete the vector with zeros to obtain a solution to the problem.
My proof is just the same, except that I don't call that the ergodicity theorem (for me, the ergodicity theorem is the one stating that positive recurrent Markov chains are ergodic, i.e. satisfy a "law of large numbers" with respect to the invariant probability).

The existence of an invariant measure is very simple for Markov chains, since we have an explicit formula. Reduce the state space, as you said, to an irreducible component, choose a state $o$ there, and define $\pi(x)=E_o\left[\sum_{n=0}^{H_o-1}{\bf 1}_{(X_n=x)}\right]$: the average number of visits to state $x$ before the first return to the initial state $o$. The fact that $\pi$ is stationary follows quickly from the Markov property; and $\pi(x)<\infty$ follows since $\pi(o)=1$ and $\pi(o)=(\pi P)(o)$ imply $\pi(x)<\infty$ for the states $x$ with $p_{xo}>0$, and so on inductively for every state $x$ (by irreducibility).
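This formula can be checked directly, computing the expected visit counts exactly through the fundamental matrix $(I-Q)^{-1}$ rather than by simulation (a sketch assuming NumPy; the random positive matrix guarantees irreducibility):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.uniform(0.1, 1.0, size=(4, 4))
P = A / A.sum(axis=1, keepdims=True)   # irreducible (positive) stochastic matrix

o = 0                                  # reference state
others = [i for i in range(4) if i != o]

# Expected visits to a state y != o before hitting o:
# N = (I - Q)^{-1}, with Q the restriction of P to the states != o.
Q = P[np.ix_(others, others)]
N = np.linalg.inv(np.eye(len(others)) - Q)

# pi(x) = E_o[ number of visits to x before the first return to o ]:
# one step from o, then visits counted by N; state o itself is visited
# exactly once (at time 0), so pi(o) = 1.
pi = np.ones(4)
pi[others] = P[o, others] @ N

# Stationarity: pi P = pi.
assert np.allclose(pi @ P, pi)
```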

5. This explicit construction of an invariant distribution was not mentioned in the book the problem came from (it does appear in Grimmett & Stirzaker, pp. 227-228). Maybe that is what the problem was aiming at, though I suspect it is rather your proof using triangularization.
Thanks again for your help. Merry Christmas.