The same proof works for every convergent sequence: write out the definition of convergence and it should be clear. (Quick explanation: every element of the subsequence is an element of the sequence, so if n_k > N, then whatever holds for the sequence beyond N also holds for the subsequence.)
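Spelled out (with a_n → L, subsequence indices n_1 < n_2 < ⋯, and hence n_k ≥ k), the argument is:

```latex
\text{Given } \varepsilon > 0,\ \exists N:\ |a_n - L| < \varepsilon \ \text{ for all } n > N.
\text{For } k > N:\quad n_k \ge k > N \implies |a_{n_k} - L| < \varepsilon,
\quad\text{hence } a_{n_k} \to L.
```

The only fact about subsequences used is n_k ≥ k, which follows from the indices being strictly increasing.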

2) I was looking at a theorem: if E(Y) < ∞, then Y < ∞ almost surely. Now I am puzzled by the notation. What does it MEAN to say that Y = ∞ or Y < ∞?

For example, if Y is a Poisson random variable, then the possible values are 0, 1, 2, … (there is no upper bound). Is it true to say that Y = ∞ in this case?

No: if N is Poisson, then N < ∞ almost surely.
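As a quick numerical sanity check (a minimal sketch; the choice λ = 4 and the truncation at k = 200 are arbitrary), the Poisson pmf sums to 1 over the nonnegative integers, so no probability mass sits at infinity:

```python
import math

lam = 4.0
# Poisson pmf: P(N = k) = e^{-lam} * lam^k / k!, computed iteratively
# to avoid forming huge factorials.
p = math.exp(-lam)   # P(N = 0)
total = 0.0
for k in range(200):
    total += p
    p *= lam / (k + 1)   # recurrence: P(N = k+1) from P(N = k)

# The pmf sums to 1, so P(N = infinity) = 0: N takes arbitrarily
# large values, but it is finite almost surely.
print(total)  # very close to 1.0
```

Every individual value N can take is a finite integer; "no upper bound on the values" is not the same thing as "takes the value ∞".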

The reason is the same as for a convergent sequence. Pick ε > 0; then there exists an N (a.s.) such that |X_n − X| < ε a.s. for all n > N, so the X_n are, in particular, eventually finite.

3) If (X_n)^4 converges to 0 almost surely, then is it true to say that X_n also converges to 0 almost surely? Why or why not?
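For 3) the answer is yes; a sketch of the pointwise argument (working outcome by outcome on the almost-sure event where the fourth powers converge):

```latex
\text{Fix } \omega \text{ with } X_n(\omega)^4 \to 0
\ \text{(such } \omega \text{ form an event of probability } 1\text{).}
\text{Then } |X_n(\omega)| = \bigl(X_n(\omega)^4\bigr)^{1/4} \to 0
\ \text{by continuity of } t \mapsto t^{1/4} \text{ at } 0,
\text{so } X_n \to 0 \text{ on the same almost-sure event.}
```

The exceptional null set is unchanged, which is why the almost-sure qualifier carries over.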

4) The moment generating function (mgf) determines the distribution uniquely, so we can use the mgf to find the distributions of random variables. If the mgf already does the job, what is the point of introducing the "characteristic function"?

The point is that mgfs do not always exist! The characteristic function is a Fourier transform, which exists for every distribution: since |e^{itX}| = 1, the expectation E[e^{itX}] is always finite, with no moment condition needed. Also, Fourier analysis has been studied in depth and has quite a few tools that you can use.
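To illustrate, here is a small sketch using the standard Cauchy distribution, the classic example: its mgf is infinite for every t ≠ 0, yet its characteristic function is e^{-|t|}. (The inverse-CDF sampler tan(π(U − 1/2)), the seed, the sample size, and t = 1 are all choices of this sketch, not anything from the question.)

```python
import math
import random

random.seed(0)

# Standard Cauchy: no mean, no mgf (E[e^{tX}] = +inf for any t != 0),
# but the characteristic function exists and equals exp(-|t|).
# Sample via the inverse CDF: X = tan(pi * (U - 1/2)), U ~ Uniform(0, 1).
n = 200_000
samples = [math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)]

t = 1.0
# Empirical characteristic function at t; the imaginary part vanishes
# by symmetry, so the cosine average is enough.
ecf = sum(math.cos(t * x) for x in samples) / n
print(ecf, math.exp(-abs(t)))  # both close to 0.368
```

The empirical average converges even though the samples themselves have heavy tails, because cos(tX) is bounded — exactly the reason the characteristic function never fails to exist.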

I hope this was clear.