Hey avisccs.

The reason is that all we have to do is differentiate the MGF n times and plug in t = 0 to get the nth moment. That is generally a lot easier than calculating Integral x^n*f(x)dx directly.
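Here is a rough sketch of that idea in Python, using a fair six-sided die as a made-up example (the die, the step size h and the function names are my own choices, nothing standard). In practice you would differentiate the MGF by hand or symbolically; the finite differences below are just a stand-in so the example runs with only the standard library.

```python
import math

# Hypothetical example: a fair six-sided die, X in {1, ..., 6}, each with probability 1/6.
values = range(1, 7)
prob = 1 / 6

def mgf(t):
    """M(t) = E[exp(t*X)] for the fair die."""
    return sum(prob * math.exp(t * x) for x in values)

def direct_moment(n):
    """E[X^n] computed straight from the definition."""
    return sum(prob * x**n for x in values)

h = 1e-3  # step size for the finite differences (an arbitrary small choice)

# Central differences approximate M'(0) and M''(0).
first_from_mgf = (mgf(h) - mgf(-h)) / (2 * h)
second_from_mgf = (mgf(h) - 2 * mgf(0) + mgf(-h)) / h**2

# Each pair should agree up to a small numerical error.
print(first_from_mgf, direct_moment(1))
print(second_from_mgf, direct_moment(2))
```

The derivatives of the MGF at t = 0 should come out close to E[X] = 3.5 and E[X^2] = 91/6, which are the values the direct sums give.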

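The probability generating function is for discrete distributions (strictly, ones on the non-negative integers). The MGF itself works for both discrete and continuous distributions, whenever the expectation E[e^(tX)] is finite. The closely related characteristic function, E[e^(itX)], is the "complex" (as in i = sqrt(-1)) form of the MGF, and it has the advantage that it always exists.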

The intuition of the MGF is to get the moments in a systematic fashion. It's also used to characterize a distribution: if two random variables have the same MGF, finite on an open interval around t = 0, then they have the same distribution, so the correspondence is one to one. That is why you can identify a distribution by matching its MGF to a known expression. We also use MGFs to prove results about sums of independent random variables (IID ones, for example), among others.

To get the intuition of MGFs, write out the Taylor series of the exponential function e^(tx) and replace x with a random variable X. Take expectations term by term using linearity, E[aX + bY] = a*E[X] + b*E[Y], then differentiate n times and plug in t = 0. Differentiating kills all the terms below order n, setting t = 0 kills all the terms above it, and you are left with E[X^n].
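Written out, the steps look roughly like this. It assumes the MGF is finite near t = 0 so that the expectation can be taken term by term; M_X(t) is just my notation for the MGF of X.

```latex
\begin{align*}
M_X(t) &= E\left[e^{tX}\right]
        = E\left[\sum_{k=0}^{\infty} \frac{(tX)^k}{k!}\right]
        = \sum_{k=0}^{\infty} \frac{t^k}{k!}\, E\left[X^k\right], \\
\frac{d^n}{dt^n} M_X(t) &= \sum_{k=n}^{\infty} \frac{t^{k-n}}{(k-n)!}\, E\left[X^k\right], \\
\left.\frac{d^n}{dt^n} M_X(t)\right|_{t=0} &= E\left[X^n\right].
\end{align*}
```

In the second line the terms with k < n have already been differentiated away, and in the third line every term with k > n still carries a factor of t, so it vanishes at t = 0.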