-
Degrees of freedom
Hi! Can someone explain this to me:
The definition of the $\chi^2$-distribution, taken from my statistics book, is:
Quote:
If $X_1, \dots, X_f$ are independent and each $X_i$ is standard normal, $N(0,1)$, then
$$\sum_{i=1}^{f} X_i^2 \sim \chi^2(f).$$
f is the number of degrees of freedom.
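The book also says (but it doesn't prove it) that if $X_1, \dots, X_n$ are independent and each $X_i$ is normally distributed with mean $\mu$ and standard deviation $\sigma$, then
$$\sum_{i=1}^{n}\left(\frac{X_i-\bar{X}}{\sigma}\right)^2\ \sim\chi^2(n-1),$$
where $\bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_i$ is the sample mean.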
I would really like to see what the proof looks like. How can this be proven?
-
First of all that's a lame definition of a chi-square.
The real definition is through the density $f(x)=\frac{1}{2^{k/2}\Gamma(k/2)}\,x^{(k/2)-1}e^{-x/2}$ for $x>0$,
WHERE the df $k$ need not be an integer.
It's easy to prove that if you square a standard normal you get a chi-square with 1 df, and then via MGFs you can show that a sum of independent chi-squares gives you a chi-square whose df is the sum of the individual dfs.
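The MGF step can be sketched as follows. This is only an outline; $Z$, $Z_1, \dots, Z_f$ (independent standard normals) and $M$ (the moment generating function) are notation introduced here, not taken from the book. For $t < 1/2$:

$$
M_{Z^2}(t) = E\left[e^{tZ^2}\right]
= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-(1-2t)z^2/2}\, dz
= (1-2t)^{-1/2},
$$

$$
M_{Z_1^2+\dots+Z_f^2}(t) = \prod_{i=1}^{f} M_{Z_i^2}(t) = (1-2t)^{-f/2}.
$$

The first expression is the MGF of a chi-square with 1 df, and the product (which uses independence) is the MGF of a chi-square with $f$ df.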
NOW, when you subtract the sample mean you do lose that 1 df.
It's not a simple proof and I couldn't find it on the web.
I'm sure it's here and I'll look again.
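In the meantime, a quick simulation can at least make the lost df plausible. This is a rough sketch, not a proof: it only compares the first two moments of the centered sum of squares with those of a chi-square with n-1 df (mean n-1, variance 2(n-1)). The function names and the parameter values (n = 5, mu = 10, sigma = 2) are made up for illustration.

```python
import random
import statistics


def centered_sum_of_squares(n, mu, sigma, rng):
    """Draw n normal values and return sum((x - xbar)^2) / sigma^2."""
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.fmean(xs)
    return sum((x - xbar) ** 2 for x in xs) / sigma ** 2


def simulate(n=5, mu=10.0, sigma=2.0, reps=20000, seed=1):
    """Return the simulated mean and variance of the centered statistic."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    draws = [centered_sum_of_squares(n, mu, sigma, rng) for _ in range(reps)]
    return statistics.fmean(draws), statistics.variance(draws)


n = 5
mean, var = simulate(n=n)
# A chi-square with k df has mean k and variance 2k; here k = n - 1.
print(f"simulated mean     {mean:.2f}  vs  n - 1    = {n - 1}")
print(f"simulated variance {var:.2f}  vs  2(n - 1) = {2 * (n - 1)}")
```

If the mean were not subtracted (use mu in place of xbar), the same check should land near n and 2n instead.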
-
I just realized it wasn't the definition of chi-square :P It was just a theorem; the definition was some function containing the gamma function, like you wrote. I think the definition was
$$
f(x;k)=
\begin{cases}\displaystyle
\frac{1}{2^{k/2}\Gamma(k/2)}\,x^{(k/2) - 1} e^{-x/2}&\text{for }x>0,\\
0&\text{for }x\le0,
\end{cases}
$$
(the same as that on wikipedia). What I think is kind of strange - my book (our course literature) states a lot of things, but it proves few of them. Another thing that it states but it doesn't prove is that the test variable in the chi-square test is chi-square distributed:
If Z is distributed in r states with probabilities $p_1, \dots, p_r$, and $X_i$ is the number of times Z, out of n observations, ended up in state i, then the test variable:
$$\sum_{i=1}^r \frac{(X_i-np_i)^2}{np_i}=\sum_{i=1}^r \frac{(X_i-E_i)^2}{E_i}$$
is chi-square(r-1)-distributed (here $E_i = np_i$ is the expected number of times Z will end up in state i). The formula however is not motivated, although they prove it is chi-square(r-1) distributed for r = 2. If you look at a single term:
$$\frac{(X_i-E_i)^2}{E_i}$$
it doesn't look chi-square distributed. If $Y_j$ is 1 if Z ends up in state i on the j-th observation, and 0 otherwise, then $Y_j$ has mean $p_i$ and variance $p_i(1-p_i)$. The sum $X_i=\sum_{j=1}^n Y_j$ will be approximately normally distributed with mean $np_i$ and variance $np_i(1-p_i)$. Now
$$\frac{(X_i-E_i)^2}{E_i(1-p_i)}\sim\chi^2(1)$$
approximately. That is why I wonder why the formula doesn't look like
$$\sum_{i=1}^r \frac{(X_i-E_i)^2}{E_i(1-p_i)}$$
instead (which I think should be chi-square(r) or possibly chi-square(r-1) distributed), so to me it looks like someone has forgotten a factor in the denominator (although I know that's not the case). Does anyone know why it looks the way it does and how the formula was obtained?
-
I figured it was just a theorem.
But I don't have your book in front of me.
I teach out of Wackerly and Walpole all the time.
The second thing you posted is the Pearson goodness of fit test.
http://en.wikipedia.org/wiki/Pearson's_chi-square_test
It is important to note that this is not exactly a chi-square, as mentioned in that link:
'the distribution of the test statistic is not exactly that of a chi-square random variable'
This link seems reasonable too: http://www.statsdirect.com/help/chi_...s/chi_good.htm
Note that the denominators are all different, see
http://www.stat.wisc.edu/~mchung/tea.../lecture24.pdf
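On the denominators, a quick check of expectations may help. This is only a sketch, not the derivation from any of the links. It uses the fact that each count $X_i$ is binomial with parameters $n$ and $p_i$, so that $E[(X_i-E_i)^2]=np_i(1-p_i)=E_i(1-p_i)$:

$$
E\left[\sum_{i=1}^r \frac{(X_i-E_i)^2}{E_i}\right]=\sum_{i=1}^r (1-p_i)=r-1,
\qquad
E\left[\sum_{i=1}^r \frac{(X_i-E_i)^2}{E_i(1-p_i)}\right]=r.
$$

The first value is the mean of a chi-square with r-1 df. The r terms are not independent, because $X_1+\dots+X_r=n$, so you cannot simply add up r approximately chi-square(1) terms. For r = 2, where $X_2-E_2=-(X_1-E_1)$ and $p_2=1-p_1$, this is visible directly:

$$
\frac{(X_1-E_1)^2}{np_1}+\frac{(X_2-E_2)^2}{np_2}
=\frac{(X_1-E_1)^2}{np_1(1-p_1)},
\qquad
\frac{(X_1-E_1)^2}{E_1(1-p_1)}+\frac{(X_2-E_2)^2}{E_2(1-p_2)}
=2\,\frac{(X_1-E_1)^2}{np_1(1-p_1)}.
$$

With denominators $E_i$ the statistic collapses to a single approximately chi-square(1) variable. With the extra factor $(1-p_i)$ it becomes twice that variable, which is neither chi-square(1) nor chi-square(2).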
-
No, I know, it's an approximation and only good for large n, or for large $E_i$. Besides, this distribution is discrete (there are only finitely many possible outcomes for each n), while the chi-square distribution is continuous. It seems to be difficult to prove that these sums are chi-square distributed, or in the latter case, distributed similarly to the chi-square distribution. I still think it's bad, though, that we are taught things that are not proven. It forces you to trust blindly in different mathematical expressions.
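For r = 2 the effect of discreteness can be checked exactly, since the test variable is then a function of a single binomial count. The sketch below is an illustration under assumptions of my own: p = 0.3 and the sample sizes are arbitrary, and the function names are made up. It computes the exact probability that the statistic falls at or below the 95% point of chi-square(1), which would be 0.95 if the approximation were perfect.

```python
from math import comb

# 95% quantile of chi-square(1), i.e. the square of the normal 97.5% quantile
CHI2_1_Q95 = 3.841458820694124


def pearson_values(n, p):
    """For r = 2, list (statistic value, exact probability) for each count x."""
    out = []
    for x in range(n + 1):
        q = (x - n * p) ** 2 / (n * p * (1 - p))
        prob = comb(n, x) * p ** x * (1 - p) ** (n - x)
        out.append((q, prob))
    return out


def exact_coverage(n, p, cutoff=CHI2_1_Q95):
    """Exact P(statistic <= cutoff) under the binomial model."""
    return sum(prob for q, prob in pearson_values(n, p) if q <= cutoff)


# n is kept at 1000 or below so comb(n, x) still converts to a float
for n in (10, 100, 1000):
    print(f"n = {n:4d}: {n + 1} possible counts, "
          f"exact P(Q <= {CHI2_1_Q95:.3f}) = {exact_coverage(n, 0.3):.4f}")
```

The statistic can take at most n + 1 distinct values, so its distribution function moves in jumps; the printed probabilities show how close those jumps come to 0.95 as n grows.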