I was reading some proofs about the convergence of random variables, and here are the little bits that I couldn't figure out...

1) Let X_n be a sequence of random variables, and let X_(n_k) be a subsequence of it. If X_n converges in probability to X, then X_(n_k) also converges in probability to X. Why?
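
For reference, here is the definition of convergence in probability that I am working from (I believe it is the standard one; ε is just my label for the tolerance). X_n converges in probability to X if, for every ε > 0,

\[
\lim_{n \to \infty} P\big( |X_n - X| > \varepsilon \big) = 0 ,
\]

so the claim about the subsequence would read, for every ε > 0,

\[
\lim_{k \to \infty} P\big( |X_{n_k} - X| > \varepsilon \big) = 0 .
\]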

2) I was looking at a theorem: if E(Y)<∞, then Y<∞ almost surely. Now I am puzzled by the notation. What does it MEAN to say that Y=∞ or Y<∞?

For example, if Y is a Poisson random variable, then the possible values are 0, 1, 2, ... (there is no upper bound). Is it correct to say that Y=∞ in this case?

3) If (X_n)^4 converges to 0 almost surely, is it true that X_n also converges to 0 almost surely? Why or why not?

4) The moment generating function (mgf) determines the distribution uniquely, so we can use the mgf to find the distributions of random variables. If the mgf already does the job, what is the point of introducing the "characteristic function"?
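
To fix notation, these are the two definitions as I understand them, for a real-valued random variable X and real t (the symbols M_X and φ_X are my own labels, and i is the imaginary unit):

\[
M_X(t) = E\big[ e^{tX} \big], \qquad \varphi_X(t) = E\big[ e^{itX} \big] .
\]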

Can someone please explain?

Any help is much appreciated!

[note: also under discussion in talk stats forum]