My bad, the transition matrix should be
I have been reading a few things about Markov chains and I want to clarify a few things using a made-up problem as an example. Any help would be appreciated greatly.
A group of scientists create an artificial lake to study the lifespan of a certain type of fish. The scientists populate a transition matrix describing the movements of the fish from one length division to another at the end of each month.
Let's call the states:
and include a final state E: denoting the fish has been removed from the system because the scientists got hungry and went fishing, or the fish died of natural causes. (Also, due to some chemical in the water, the fish cannot breed).
Let $P$ be the transition matrix, where the states read A, B, C, D, E across the top.
Okay, now here are my questions:
(1) Would these probabilities represent that a randomly selected fish in state A in the current month, will move to state D in the next month with probability 0.1?
(2) Let's say the current population on the 30th of September in each state is as follows:
A: 500 fish
B: 400 fish
C: 300 fish
D: 100 fish
E: 0 fish
Can I use this transition matrix to predict/calculate (and if so how):
(i) the probability that there will be 500 fish in state B at the end of October?
(ii) The probability that by the end of October there will be 500 fish in state B and 400 fish in state C?
(iii) The number of months it would take for all the fish population to die (state E)?
(iv) The number of months it would take to have 200 fish in state D and 380 fish in state C?
(v) What would the population look like in 3 months (in terms of the overall fish distribution across the states)?
I am really not sure if the last two are possible to predict from the matrix.
If anyone could provide me with any insight, you would make me very happy.
The answer to question (1) is yes, that's what the matrix says. You may understand 0.1 as the probability that a random fish in state A will grow to D (in one month's time). If there is a very large number of fishes, you may also interpret it approximately as the proportion of fishes that grow from size A to size D.
Your answer to (v) is correct. If the row vector $\mu_0 = (\mu_0(A), \mu_0(B), \mu_0(C), \mu_0(D), \mu_0(E))$ describes the initial proportions, then after one month the proportion of fishes in state A must be $\mu_0(A)\,p_{AA}$ (where $p_{AA}$ is the entry A,A of the transition matrix $P$), while the proportion of fishes in B is the sum $\mu_0(A)\,p_{AB} + \mu_0(B)\,p_{BB}$ (either fishes from A growing to B, or fishes remaining in B), and so on. As you can check, this amounts to saying that after one month the new proportion is described by the matrix product $\mu_1 = \mu_0 P$ (it is a row matrix). In the same way, after two months you get $\mu_2 = \mu_1 P = \mu_0 P^2$, and in general the distribution after $n$ months is $\mu_n = \mu_0 P^n$.
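The repeated row-vector product described above can be sketched in a few lines of plain Python. The matrix `P` below is a hypothetical placeholder: only the entries A→D = 0.1, B→B = 0.4, D→D = 0.4 and D→E = 0.6 are quoted in this thread, and all the other values are made up so that each row sums to 1. The function names are illustrative too.

```python
# Hypothetical transition matrix, rows and columns ordered A, B, C, D, E.
# Only P[0][3] = 0.1, P[1][1] = 0.4, P[3][3] = 0.4 and P[3][4] = 0.6 come
# from the thread; every other entry is an assumed placeholder.
P = [
    [0.3, 0.4, 0.1, 0.1, 0.1],
    [0.0, 0.4, 0.3, 0.2, 0.1],
    [0.0, 0.0, 0.5, 0.3, 0.2],
    [0.0, 0.0, 0.0, 0.4, 0.6],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]


def step(mu, P):
    """Return the row vector mu * P (one month of evolution)."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]


def distribution_after(mu0, P, months):
    """Apply `step` repeatedly, i.e. compute mu0 * P**months."""
    mu = list(mu0)
    for _ in range(months):
        mu = step(mu, P)
    return mu


# Starting counts on the 30th of September (from the question).
counts = [500, 400, 300, 100, 0]
expected_after_3 = distribution_after(counts, P, 3)
for state, value in zip("ABCDE", expected_after_3):
    print(state, round(value, 1))
```

Feeding in counts instead of proportions gives expected numbers of fishes; dividing by 1300 gives proportions. With the real matrix in place of the placeholder, the same call answers question (v).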
As for question (iii) (the eventual death of all fishes), you won't have the distribution after $n$ months equal to $(0, 0, 0, 0, 1)$ (every fish in state E) for any $n$, because there is always a small probability that a fish stays in state A (for instance) for a very long time. One could consider that a proportion smaller than $1/N$, where $N$ is the total number of fishes, corresponds to 0 fish, but that is not very satisfying, since such a proportion depicts a rare but not impossible event.
Several kinds of answers could be given. For instance, it would be interesting for "fish growers" to know at what time the proportion of dead fishes exceeds 95% (or 99%, or...). To that aim, compute the distribution $\mu_n$ after $n$ months recursively, $\mu_{n+1} = \mu_n P$ for $n = 0, 1, 2, \ldots$, where $P$ is the transition matrix and $\mu_0$ the initial proportions (with a computer you can use SciLab for instance, which is a free numerical computation program, or any other one), until the last component (state E) becomes larger than 0.95.
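As a rough sketch of that recursion in Python rather than SciLab: the function below multiplies by the matrix month after month and stops once the last component passes the threshold. The matrix is again a hypothetical placeholder (only A→D = 0.1, B→B = 0.4, D→D = 0.4 and D→E = 0.6 are from this thread), and the function name and the `max_months` safety cap are assumptions of this example.

```python
# Hypothetical transition matrix, states ordered A, B, C, D, E.
# Entries other than P[0][3], P[1][1], P[3][3], P[3][4] are assumed.
P = [
    [0.3, 0.4, 0.1, 0.1, 0.1],
    [0.0, 0.4, 0.3, 0.2, 0.1],
    [0.0, 0.0, 0.5, 0.3, 0.2],
    [0.0, 0.0, 0.0, 0.4, 0.6],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]


def months_until_dead_fraction(mu0, P, threshold=0.95, max_months=1000):
    """Smallest n such that the E-component of mu0 * P**n exceeds threshold."""
    n_states = len(P)
    mu = list(mu0)
    for month in range(max_months + 1):
        if mu[-1] > threshold:
            return month
        # One more month: mu <- mu * P
        mu = [sum(mu[i] * P[i][j] for i in range(n_states))
              for j in range(n_states)]
    raise ValueError("threshold not reached within max_months")


# Initial proportions from the question: 500, 400, 300, 100, 0 out of 1300.
mu0 = [c / 1300 for c in (500, 400, 300, 100, 0)]
months_95 = months_until_dead_fraction(mu0, P, 0.95)
months_99 = months_until_dead_fraction(mu0, P, 0.99)
print(months_95, months_99)
```

For a population that starts entirely in state D, which only uses the row quoted in the thread, the dead fraction after $n$ months is $1 - 0.4^n$, so the 95% mark is passed at month 4.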
Note that this result about proportions is independent of the total number of fishes. It is more difficult to say something about the extinction of all fishes, which depends heavily on the size of the population. What is for sure is that "eventually", i.e. at some random, possibly large time $T$, all fishes will have passed away. It is of course impossible to give a specific value for $T$, but it could be interesting to find the expected value of $T$. This seems however difficult to obtain explicitly; not impossible, but very tedious. You can perform simulations on computer to get an approximate value.
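A minimal Monte Carlo sketch of such a simulation, assuming each fish moves independently according to its row of the matrix. The matrix is a hypothetical placeholder (only A→D = 0.1, B→B = 0.4, D→D = 0.4 and D→E = 0.6 appear in this thread), and the function names, seed and number of runs are arbitrary choices for illustration.

```python
import random

# Hypothetical transition matrix, states ordered A, B, C, D, E.
# Entries other than P[0][3], P[1][1], P[3][3], P[3][4] are assumed.
P = [
    [0.3, 0.4, 0.1, 0.1, 0.1],
    [0.0, 0.4, 0.3, 0.2, 0.1],
    [0.0, 0.0, 0.5, 0.3, 0.2],
    [0.0, 0.0, 0.0, 0.4, 0.6],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]


def next_state(state, P, rng):
    """Sample the next state of one fish from row `state` of P."""
    u = rng.random()
    cumulative = 0.0
    for j, p in enumerate(P[state]):
        cumulative += p
        if u < cumulative:
            return j
    return len(P) - 1  # fallback if round-off leaves the row sum below 1


def extinction_time(counts, P, rng):
    """Simulate one run; return the month in which the last fish reaches E."""
    dead = len(P) - 1
    # One list entry per living fish, holding its current state index.
    alive = [s for s, c in enumerate(counts) if s != dead for _ in range(c)]
    month = 0
    while alive:
        month += 1
        moved = (next_state(s, P, rng) for s in alive)
        alive = [s for s in moved if s != dead]
    return month


rng = random.Random(0)  # fixed seed so the run is repeatable
runs = [extinction_time([500, 400, 300, 100, 0], P, rng) for _ in range(100)]
print("estimated expected extinction time:", sum(runs) / len(runs), "months")
```

Averaging more runs tightens the estimate; the spread of `runs` also shows how much the extinction time varies from one realisation to the next.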
A simpler computation is the average lifespan of a random fish. Denote by $t_A$ the average lifespan of one fish in state A, and similarly $t_B, t_C, t_D, t_E$ for the other states, and write $p_{xy}$ for the entry x,y of the transition matrix. Of course, $t_E = 0$ since state E is death. Then $t_D = 1 + 0.4\,t_D + 0.6\,t_E$ (1 month + (with prob 0.4, the fish stays in D and the situation is brought back to the beginning, or with prob 0.6 it gets killed)), hence $t_D = 1/0.6 = 5/3 \approx 1.67$ months. You may note that in this case the lifespan is a geometric random variable. Then $t_C = 1 + p_{CC}\,t_C + p_{CD}\,t_D$ and you deduce the value of $t_C$. Same with B and A. (Since the matrix is triangular, the values are computed very easily one after another; in general it would involve a system of equations)
Using these data, you can compute the average lifespan of a random fish in the pool: $\frac{500\,t_A + 400\,t_B + 300\,t_C + 100\,t_D}{1300}$, where $t_x$ is the average lifespan of a fish currently in state $x$. It gives you a rough idea of how old your fishes can get.
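The one-after-another computation for a triangular matrix can be sketched as a back-substitution, starting from D and working up to A. As before, the matrix is a hypothetical placeholder (only A→D = 0.1, B→B = 0.4, D→D = 0.4 and D→E = 0.6 are from this thread), so only the value for state D below reflects the thread's numbers; the function name is illustrative.

```python
# Hypothetical transition matrix, states ordered A, B, C, D, E.
# Entries other than P[0][3], P[1][1], P[3][3], P[3][4] are assumed.
P = [
    [0.3, 0.4, 0.1, 0.1, 0.1],
    [0.0, 0.4, 0.3, 0.2, 0.1],
    [0.0, 0.0, 0.5, 0.3, 0.2],
    [0.0, 0.0, 0.0, 0.4, 0.6],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]


def expected_lifespans(P):
    """Solve t_x = 1 + sum_y P[x][y] * t_y with t_E = 0.

    Assumes P is upper triangular with the absorbing state last, so each
    t_x only depends on itself and on states that come after it.
    """
    n = len(P)
    t = [0.0] * n  # t[n - 1] stays 0: state E is death
    for x in range(n - 2, -1, -1):
        later = sum(P[x][y] * t[y] for y in range(x + 1, n))
        t[x] = (1.0 + later) / (1.0 - P[x][x])
    return t


t = expected_lifespans(P)
counts = [500, 400, 300, 100, 0]  # populations from the question
pool_average = sum(c * tx for c, tx in zip(counts, t)) / sum(counts)
for state, value in zip("ABCDE", t):
    print(state, round(value, 3))
print("pool average:", round(pool_average, 3), "months")
```

For a matrix that is not triangular, the same equations would have to be solved together as a linear system instead of one after another.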
Question (i). After one month, the average proportions are given by $\mu_0 P$, where $\mu_0$ is the row vector of initial proportions and $P$ the transition matrix. However, in order to find the probability that 500 fishes have size B, you need more than the expected number of fishes of size B. You can see that the number of fishes of size B after one month is $N_B = N_{AB} + N_{BB}$, where $N_{xy}$ is the number of fishes that went from size x to size y. $N_{BB}$ is a binomial r.v. of parameters $(400, 0.4)$ (each of the 400 fishes in state B has probability 0.4 to stay in B), $N_{AB}$ is a binomial r.v. of parameters $(500, p_{AB})$, where $p_{AB}$ is the entry A,B of $P$, and these r.v. are independent. Thus in principle one can compute $\mathbb{P}(N_{AB} + N_{BB} = 500)$ explicitly but there is no simple formula... Again simulations work best to get an approximate value.
In the same way, question (ii) is even more complicated to answer; in principle, the matrix contains all one needs to find the answer, but the answer is again a huge sum that is hard to compute.
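Although there is no simple closed formula, the sum over the two independent binomials is finite and can be evaluated numerically as an alternative to simulation. The sketch below does this with the standard library; the value `P_AB = 0.4` is an assumed placeholder because the A→B entry is not quoted in this thread, while `P_BB = 0.4` is the value given above. Function names are illustrative.

```python
from math import comb


def binom_pmf(n, p):
    """List of P(X = k) for k = 0..n, where X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def prob_sum_equals(target, pmf_x, pmf_y):
    """P(X + Y = target) for independent X and Y given by their pmf lists."""
    total = 0.0
    for k, px in enumerate(pmf_x):
        j = target - k
        if 0 <= j < len(pmf_y):
            total += px * pmf_y[j]
    return total


P_AB = 0.4  # ASSUMED placeholder for the A -> B entry of the matrix
P_BB = 0.4  # from the thread: a fish in B stays in B with probability 0.4

pmf_ab = binom_pmf(500, P_AB)  # fishes moving from A to B
pmf_bb = binom_pmf(400, P_BB)  # fishes staying in B
prob_500_in_B = prob_sum_equals(500, pmf_ab, pmf_bb)
print(prob_500_in_B)
```

Direct use of `comb` with floating-point powers is fine for a few hundred fishes; for much larger populations the terms would be better computed in log space.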
As for question (iv), it raises the same problem as the extinction question: because of the randomness of the model, there is not one answer: the number of months may vary (here we may also imagine that state D never reaches a population of 200 fishes because fishes can "jump" from C to E).
I hope this clarifies things.
(NB for future threads: it is usually more advisable to edit your previous post if you want to add details (as long as you don't erase your initial questions) than to post a reply to yourself)