I guess what I am saying is why they even did that in the first place?
We want to minimize subject to .
is an matrix of fixed constants, , (both are vectors of fixed constants) and is the -vector of nonnegative values that is to be chosen to minimize . So if , then the optimal can always be chosen to have components equal to .
So in a Markov chain, suppose that the algorithm is at the th best extreme point, then after the next pivot the resulting extreme point is equally likely to be any of the best. We want to find the expected number of transitions needed to go from state to state , or .
Then why does ? This is only for the initial transition right? They probably conditioned on some variable, but I am not seeing it.
Ultimately .
Source: An Introduction to Probability Models by Sheldon Ross