Degrees of freedom
Hi! Can someone explain this to me:
The definition of the chi-square distribution, taken from my statistics book, is:

X = X_1^2 + X_2^2 + ... + X_f^2

where f is the number of degrees of freedom and X_1, ..., X_f are independent and N(0,1)-distributed.

The book also says (but it doesn't prove it) that if X_1, ..., X_n are independent and N(m, s)-distributed, then

(X_1 - Xbar)^2/s^2 + ... + (X_n - Xbar)^2/s^2

is chi-square(n - 1)-distributed, where Xbar is the sample mean.
I would really like to see what the proof looks like. How can this be proven?
First of all, that's a lame definition of a chi-square.
The real definition is via the density

x^(k/2 - 1) e^(-x/2) / (2^(k/2) Γ(k/2)), x > 0,

WHERE the df k need not be an integer.
It's easy to prove that if you square a standard normal you get a chi-square with 1 df, and then via MGFs you can show that a sum of independent chi-squares gives you a chi-square.
NOW, when you subtract the sample mean you do lose that 1 df.
That's not a simple proof and I couldn't find it on the web.
I'm sure it's here and I'll look again.
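In the meantime, both facts are easy to check numerically. This is just a sketch using numpy/scipy (not a proof): square a large sample of standard normals and compare it to chi-square(1), then sum f independent squared normals and compare to chi-square(f), e.g. with a Kolmogorov–Smirnov test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, f = 100_000, 5

# Square of a standard normal: should follow chi-square with 1 df.
z2 = rng.standard_normal(n) ** 2
ks1 = stats.kstest(z2, stats.chi2(df=1).cdf)

# Sum of f independent squared standard normals: chi-square with f df.
s = (rng.standard_normal((n, f)) ** 2).sum(axis=1)
ksf = stats.kstest(s, stats.chi2(df=f).cdf)

# Small KS statistics (order 1/sqrt(n)) mean the samples are
# indistinguishable from the claimed chi-square distributions.
print(ks1.statistic, ksf.statistic)
```

The sample means should also land near the chi-square means (1 and f respectively), since a chi-square(k) variable has mean k.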
I just realized it wasn't the definition of chi-square :P It was just a theorem; the definition was some function containing the gamma function, like you wrote. I think the definition was the density

x^(f/2 - 1) e^(-x/2) / (2^(f/2) Γ(f/2)), x > 0,

with f degrees of freedom (the same as the one on Wikipedia). What I think is kind of strange is that my book (our course literature) states a lot of things but proves few of them. Another thing that it states but doesn't prove is that the test variable in the chi-square test is chi-square distributed:
If Z is distributed over r states with probabilities p_1, ..., p_r, and N_i is the number of times Z, out of n observations, ended up in state i, then the test variable

Q = sum_{i=1}^{r} (N_i - n p_i)^2 / (n p_i)

is chi-square(r-1)-distributed (here n p_i is the expected number of times Z will end up in state i). The formula, however, is not motivated, although they prove it is chi-square(r-1)-distributed for r = 2. If you look at a single term,

(N_i - n p_i)^2 / (n p_i),

it doesn't look chi-square distributed. If Y_k is 1 if Z ends up in state i in the k-th observation, and 0 otherwise, then Y_k has mean p_i and variance p_i(1 - p_i). The sum N_i = Y_1 + ... + Y_n will be approximately N(n p_i, n p_i (1 - p_i))-distributed. Now

(N_i - n p_i) / sqrt(n p_i (1 - p_i)) ~ N(0, 1)

approximately. That is why I wonder why the formula doesn't look like

Q' = sum_{i=1}^{r} (N_i - n p_i)^2 / (n p_i (1 - p_i))

instead (which I think should be chi-square(r) or possibly chi-square(r-1) distributed), so to me it looks like someone has forgotten a factor (1 - p_i) in the denominator (although I know that's not the case). Does anyone know why it looks as it does and how the formula has been obtained?
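For what it's worth, the r = 2 case (which the book does prove) shows how the two formulas are reconciled: the (1 - p_i) factor is there after all, but the two dependent terms collapse into a single squared standard normal. A sketch:

```latex
% r = 2: N_1 ~ Bin(n, p), N_2 = n - N_1, p_1 = p, p_2 = 1 - p.
% The two terms of Pearson's statistic share the same numerator:
\begin{align*}
Q &= \frac{(N_1 - np)^2}{np} + \frac{(N_2 - n(1-p))^2}{n(1-p)} \\
  &= (N_1 - np)^2 \left( \frac{1}{np} + \frac{1}{n(1-p)} \right)
     && \text{since } N_2 - n(1-p) = -(N_1 - np) \\
  &= \frac{(N_1 - np)^2}{np(1-p)}
  \approx \left( \frac{N_1 - np}{\sqrt{np(1-p)}} \right)^{\!2}
  \sim \chi^2(1).
\end{align*}
```

So the "missing" factor 1 - p is hiding in the collapse of two dependent terms into one, which is also where the lost degree of freedom goes. The general case is analogous: the N_i are dependent (they sum to n), and the resulting quadratic form has rank r - 1.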
I figured it was just a theorem.
But I don't have your book in front of me.
I teach out of Wackerly and Walpole all the time.
The second thing you posted is the Pearson goodness-of-fit test.
It is important to note that this is not exactly a chi-square, as mentioned in that link:
'the distribution of the test statistic is not exactly that of a chi-square random variable'.
This link seems reasonable too...http://www.statsdirect.com/help/chi_...s/chi_good.htm
Note that the denominators are all different; see the formula in that link.
No, I know, it's an approximation and only good for large n, or for large E_i's. Besides, this distribution is discrete (there is only a finite number of possible outcomes for each n), while the chi-square distribution is continuous. It seems to be difficult to prove that these sums are chi-square distributed, or, in the latter case, approximately chi-square distributed. I still think it's bad, though, that we are taught things that are not proven. It forces you to trust blindly in different mathematical expressions.
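The large-n claim, and the "missing factor" puzzle, can at least be checked by simulation. A sketch with numpy/scipy (r = 3 and the probabilities are chosen arbitrarily): Pearson's Q has mean close to r - 1 and matches chi-square(r - 1), while the modified statistic Q' with n p_i (1 - p_i) in the denominators has mean close to r, because its r dependent terms are each approximately a squared standard normal.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, p = 1000, np.array([0.2, 0.3, 0.5])  # r = 3 states
r = len(p)
trials = 20_000

# Each row holds the counts N_1, ..., N_r from n observations of Z.
N = rng.multinomial(n, p, size=trials)
E = n * p  # expected counts n * p_i

Q = ((N - E) ** 2 / E).sum(axis=1)               # Pearson's statistic
Qp = ((N - E) ** 2 / (E * (1 - p))).sum(axis=1)  # "intuitive" version

print(Q.mean())   # close to r - 1 = 2
print(Qp.mean())  # close to r = 3
# KS distance between Q and chi-square(r - 1): small for large n.
print(stats.kstest(Q, stats.chi2(df=r - 1).cdf).statistic)
```

In fact E[Q] = r - 1 and E[Q'] = r hold exactly for any n, since Var(N_i) = n p_i (1 - p_i); only the full distributional claim needs n large.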