Even if you post on 12 more websites, you should continue to pursue the "Goodness of Fit" angle. When was your last Statistics Class?
Hello,
What can I use instead of the ‘paired t-test’?
I have two variables both of which are recording temperature of the same house at 1 hr intervals over the same time period. One of the variables is the real-world measurements and the other is temperature modelled by myself in a program.
I need an indication of how far apart the two sets of measurements are. They were taken over 2 months, so I have 1300 measurements, i.e. n = 1300.
These temperatures oscillate over the day like a sine wave and so aren’t normally distributed.
Can anyone please help?
Let's say the measured temperatures are $x_i$ and the modeled temperatures are $y_i$. My suggestion would be to compute the differences $d_i = x_i - y_i$ and apply a Z-test for zero mean. That way your assumption is that the values $d_i$ are normally distributed, which seems reasonable, and the fact that the temperatures have a pattern may be irrelevant.
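The suggestion above (test whether the measured-minus-modeled differences have zero mean) could be sketched as follows. This is only an illustration under stated assumptions: the function name `z_test_zero_mean` and the sample readings are made up, and the test treats the differences as independent, which hourly readings may not be.

```python
import math
from statistics import NormalDist, mean, stdev


def z_test_zero_mean(measured, modeled):
    """Two-sided Z-test that the mean of (measured - modeled) is zero."""
    diffs = [m - s for m, s in zip(measured, modeled)]
    n = len(diffs)
    d_bar = mean(diffs)
    # Standard error of the mean difference, from the sample standard deviation.
    se = stdev(diffs) / math.sqrt(n)
    z = d_bar / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up illustrative readings (degrees C), not data from this thread.
measured = [20.1, 20.8, 21.5, 22.0, 21.4, 20.6, 19.9, 19.5]
modeled = [20.0, 20.9, 21.3, 22.2, 21.5, 20.4, 20.0, 19.4]
z, p = z_test_zero_mean(measured, modeled)
print(f"z = {z:.3f}, p = {p:.3f}")
```

A small |z| (large p) means the data give no evidence of a systematic offset between model and measurement; it does not by itself show that the two series agree hour by hour.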
Yes, my last statistics class was a couple of years ago, hence my somewhat wandering in the dark.
From what I've read, I need to use the chi-square statistic, which is a sum of differences between observed and expected outcome frequencies, each squared and divided by the expectation: $\chi^2 = \sum_i (O_i - E_i)^2 / E_i$. The degrees of freedom are df = (rows-1)(columns-1).
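As a rough sketch of the statistic described in words above (each observed-minus-expected difference squared, divided by the expected value, then summed), assuming the inputs are frequencies with no zero expected values. The function name and the counts are invented for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up counts, purely illustrative.
observed = [18, 22, 20]
expected = [20, 20, 20]
chi2 = chi_square_statistic(observed, expected)
print(f"chi-square = {chi2:.2f}")
```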
My problem is that my degrees of freedom are (1200-1)(2-1) = 1199, i.e. about 1200.
All the tables I've seen only go up to 9 or so df's.
Does this mean I should just take a random sample of 10 measurements and apply it to that rather than the entire data set?
Thanks for your patience.
Throwing away perfectly good data is totally absurd.
The chi-squared distribution is asymptotically normal as the number of degrees of freedom becomes infinite. With n degrees of freedom it has mean n and variance 2n; for a large number of degrees of freedom (and 1200 is large) you can get a very good approximation using the normal distribution with this mean and variance.
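To make the approximation concrete, here is a minimal sketch that estimates an upper-tail chi-squared critical value from a normal distribution with mean n and variance 2n. The function name `chi2_critical_normal_approx` is made up for this example, and the result is an approximation, not an exact table value.

```python
import math
from statistics import NormalDist


def chi2_critical_normal_approx(df, alpha=0.05):
    """Approximate upper-tail chi-squared critical value via N(df, 2*df)."""
    # Standard normal quantile for the upper-tail probability alpha.
    z = NormalDist().inv_cdf(1.0 - alpha)
    return df + z * math.sqrt(2.0 * df)


crit_1200 = chi2_critical_normal_approx(1200)
crit_100 = chi2_critical_normal_approx(100)
print(f"df = 1200: approx. critical value {crit_1200:.1f}")
print(f"df = 100:  approx. critical value {crit_100:.1f}")
```

For 100 degrees of freedom this gives roughly 123, already close to the tabulated 124.3, and the agreement improves as the degrees of freedom grow.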
That's why tables are usually not made for more than 100 degrees of freedom.
Thanks for that
Ok, I think I’ve got it now. Here it goes:
My null hypothesis is that there is no difference between the observed and expected temperatures.
My calculated chi-square statistic = 2.76
I have 1276 rows, which effectively gives 100 degrees of freedom (the tables' limit).
The critical value for the chi-square at a significance level of 0.05 is 124.3; if the calculated chi-square value is equal to or greater than this critical value, I can conclude that the probability of the null hypothesis being correct is 0.05, i.e. a very low probability. But it isn’t! It is in fact much less than 124.3. This means that because my degrees of freedom are so high and my chi-square statistic so low, I can accept the null hypothesis with great confidence.
Is my method of calculation correct or should I be using a different statistic because my degrees of freedom are so large?