Why is the ML estimator for the variance of a normal distribution biased?

Hi,

I'm a bit perplexed by this and I wonder if someone can clarify things at all.

I understand (from my textbook) that the maximum likelihood estimator for the variance of a normal distribution is

$\displaystyle \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (X_{i} - \bar{X})^2 $

But this is biased. Now that's all well and good, but I had understood that ML estimators are unbiased, yet plainly this is not always true. When is it true?

Or to put things another way: Why is the (unbiased) sample variance not the ML solution to the variance estimation problem?

Thanks in advance to anyone that can clear this up for me. MD

Re: Why is the ML estimator for the variance of a normal distribution biased?

Hey Mathsdog.

The condition for unbiasedness is that E[B_hat] = b, where B_hat is the estimator for the parameter b. If that condition holds then the estimator is unbiased; otherwise it is biased.

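Basically, the reason this one is biased is that the variance estimate depends on the estimate of the mean: it measures deviations from the sample mean X_bar rather than from the true mean mu. Because X_bar is fitted to the same data, those deviations come out a little too small on average, and you get E[sigma_hat^2] = ((n-1)/n) sigma^2 instead of sigma^2.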
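If you want to see the bias numerically, here is a quick simulation sketch in plain Python. The function name, the sample size n = 5, and the true values mu = 10 and sigma = 2 are arbitrary choices of mine for illustration, nothing more.

```python
import random

def average_variance_estimates(n=5, mu=10.0, sigma=2.0, trials=20000, seed=0):
    """Average the 1/n (ML) and 1/(n-1) variance estimates over many samples."""
    rng = random.Random(seed)
    ml_total = 0.0
    unbiased_total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        # Sum of squared deviations from the sample mean, not the true mean.
        ss = sum((x - xbar) ** 2 for x in sample)
        ml_total += ss / n              # maximum likelihood estimator
        unbiased_total += ss / (n - 1)  # usual sample variance
    return ml_total / trials, unbiased_total / trials

ml_avg, unbiased_avg = average_variance_estimates()
print("average of 1/n estimator:    ", ml_avg)
print("average of 1/(n-1) estimator:", unbiased_avg)
```

With these settings the true variance is 4, so the 1/n average should land near 0.8 * 4 and the 1/(n-1) average near 4, up to simulation noise.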

There is a technique for the variance known as restricted maximum likelihood (REML) and it works in the following way:

You take your random variables and transform out the mean by taking linear combinations of them whose distribution no longer involves the mean, so that each combination is Normal with mean 0 and variance V. Then you do the MLE on these to get an estimate for your variance.

By transforming out the other parameters you get a better estimate for your sigma^2 term.
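To make that concrete for the i.i.d. case, here is a sketch of the calculation. The matrix $A$ and the contrasts $Y_1, \dots, Y_{n-1}$ are my own notation, not something from your textbook. Pick an $(n-1) \times n$ matrix $A$ whose rows are orthonormal and each sum to zero, and set $Y = AX$. The mean then drops out and the sum of squares is preserved:

$\displaystyle Y = AX \sim \text{Normal}(0, \sigma^2 I_{n-1}), \qquad \sum_{j=1}^{n-1} Y_j^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 $

The log-likelihood of $Y$ involves only $\sigma^2$:

$\displaystyle \ell(\sigma^2) = -\frac{n-1}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{j=1}^{n-1} Y_j^2 $

Setting the derivative with respect to $\sigma^2$ to zero gives

$\displaystyle \hat{\sigma}^2_{REML} = \frac{1}{n-1} \sum_{j=1}^{n-1} Y_j^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 $

which is the usual unbiased sample variance.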

You can use the same idea whenever the estimate of one parameter depends on other parameters: choose a transformation that removes those other parameters first, then estimate.

You do this all the time when you, say, calculate a Z-score by doing Z = (X - mu)/sigma. This gets rid of the parameters and transforms a Normal(mu,sigma^2) to a Normal(0,1).

You should try creating a new random variable by transforming out the mean and do MLE on that new variable and see what happens.
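If you would like to check your answer numerically, here is one possible sketch in plain Python. It uses Helmert-style contrasts as the transformation, which is just one convenient choice on my part; the function names and the sample values (8 draws with mu = 10, sigma = 2) are made up for illustration.

```python
import math
import random

def helmert_contrasts(xs):
    """Return n-1 orthonormal contrasts of xs; each has mean zero whatever mu is."""
    ys = []
    running_sum = 0.0
    for k in range(1, len(xs)):
        running_sum += xs[k - 1]  # x_1 + ... + x_k
        # Coefficients (1, ..., 1, -k) sum to zero, so mu cancels out.
        ys.append((running_sum - k * xs[k]) / math.sqrt(k * (k + 1)))
    return ys

def zero_mean_mle_variance(ys):
    """ML variance estimate for i.i.d. zero-mean normal data: the mean of squares."""
    return sum(y * y for y in ys) / len(ys)

rng = random.Random(1)
xs = [rng.gauss(10.0, 2.0) for _ in range(8)]
ys = helmert_contrasts(xs)

xbar = sum(xs) / len(xs)
sample_variance = sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

print("MLE on transformed data:", zero_mean_mle_variance(ys))
print("unbiased sample variance:", sample_variance)
```

The two printed numbers should agree up to rounding, because the transformed data has only n - 1 values and the ML divisor becomes n - 1.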