# Math Help - Pair t test or....?

1. ## Pair t test or....?

Hello,

What can I use instead of the paired t-test?

I have two variables, both recording the temperature of the same house at 1-hour intervals over the same time period. One variable is the real-world measurements and the other is the temperature I modelled in a program.

I need an indication of how far apart the two sets of measurements are. They are measured over 2 months, so I have 1300 measurements, i.e. n = 1300.

These temperatures oscillate over the day like a sine wave and so aren’t normally distributed.

2. Even if you post on 12 more websites, you should continue to pursue the "Goodness of Fit" angle. When was your last Statistics Class?

3. Originally Posted by grain
Hello,

What can I use instead of the paired t-test?

I have two variables, both recording the temperature of the same house at 1-hour intervals over the same time period. One variable is the real-world measurements and the other is the temperature I modelled in a program.

I need an indication of how far apart the two sets of measurements are. They are measured over 2 months, so I have 1300 measurements, i.e. n = 1300.

These temperatures oscillate over the day like a sine wave and so aren’t normally distributed.

Let's say the measured temperatures are $T_i,\, i=1,2,\dots,1300$ and the modeled temperatures are $M_i,\, i=1,2,\dots,1300$. My suggestion would be to compute the values $T_i - M_i$ and apply a Z-test for zero mean. That way your assumption is that the $T_i - M_i$ values are normally distributed, which seems reasonable, and the fact that the temperatures have a pattern may be irrelevant.
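A minimal sketch of that suggestion in Python, using only the standard library. The function name `z_test_zero_mean` and the synthetic data are illustrative assumptions, not part of the original post; the test also assumes the differences $T_i - M_i$ are independent, which hourly readings may not satisfy.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def z_test_zero_mean(measured, modelled):
    """Two-sided Z-test of H0: the mean of (measured - modelled) is zero.

    Returns (mean difference, z statistic, two-sided p-value).
    """
    if len(measured) != len(modelled):
        raise ValueError("both series must have the same length")
    diffs = [t - m for t, m in zip(measured, modelled)]
    n = len(diffs)
    d_bar = mean(diffs)
    s = stdev(diffs)  # sample standard deviation of the differences
    z = d_bar / (s / math.sqrt(n))
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return d_bar, z, p_value


# Synthetic stand-in for the 1300 hourly readings (assumed data, not the poster's).
random.seed(0)
hours = range(1300)
modelled = [20.0 + 3.0 * math.sin(2 * math.pi * h / 24) for h in hours]
measured = [m + random.gauss(0.0, 0.5) for m in modelled]

d_bar, z, p = z_test_zero_mean(measured, modelled)
print(f"mean difference = {d_bar:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Because the test works on the differences, the daily sine-like pattern shared by both series cancels out before any distributional assumption is made.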

4. Originally Posted by TKHunny
Even if you post on 12 more websites, you should continue to pursue the "Goodness of Fit" angle. When was your last Statistics Class?
Yes, my last Statistics class was a couple of years ago, hence my somewhat wandering in the dark.

From what I've read I need to use the chi-square statistic, which is a sum of differences between observed and expected outcome frequencies, each squared and divided by the expectation: $\chi^2 = \sum_i (O_i - E_i)^2 / E_i$. The degrees of freedom are df = (rows-1)(columns-1).

My problem is that my degrees of freedom are (1200-1)(2-1) = 1199

All the tables I've seen only go up to 9 or so df's.

Does this mean I should just take a random sample of 10 measurements and apply it to that rather than the entire data set?
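The statistic described above (squared observed-minus-expected differences, each divided by the expected value, then summed) can be sketched as follows. The function name and the three sample readings are hypothetical; treating the measured temperatures as "observed" and the modelled ones as "expected" follows this thread and is an assumption, since the statistic is normally defined for frequencies.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all rows."""
    if len(observed) != len(expected):
        raise ValueError("both series must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected values must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical readings: measured as 'observed', modelled as 'expected'.
observed = [21.3, 22.1, 19.8]
expected = [21.0, 22.0, 20.0]

stat = chi_square_statistic(observed, expected)
df = (len(observed) - 1) * (2 - 1)  # (rows - 1)(columns - 1), as in the post
print(f"chi-square = {stat:.4f}, df = {df}")
```

Nothing in the calculation limits the number of rows, so the full data set can be used rather than a sample of 10.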

5. Originally Posted by grain
Yes, my last Statistics class was a couple of years ago, hence my somewhat wandering in the dark.

From what I've read I need to use the chi-square statistic, which is a sum of differences between observed and expected outcome frequencies, each squared and divided by the expectation: $\chi^2 = \sum_i (O_i - E_i)^2 / E_i$. The degrees of freedom are df = (rows-1)(columns-1).

My problem is that my degrees of freedom are (1200-1)(2-1) = 1199

All the tables I've seen only go up to 9 or so df's.

Mr F says: Then you're looking in the wrong places. Tables of critical values routinely go up to 100 degrees of freedom. See PlanetMath: table of critical values of chi-squared distributions for example.

Does this mean I should just take a random sample of 10 measurements and apply it to that rather than the entire data set?

Throwing away perfectly good data is totally absurd.

The chi-squared distribution is asymptotically normal as the number of degrees of freedom becomes infinite. It has mean = n and variance = 2n, where n is the number of degrees of freedom (not the sample size); for a large number of degrees of freedom (and 1200 is large) you can get a very good approximation using the normal distribution with this mean and variance.

That's why tables are usually not made for more than 100 degrees of freedom.
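As a rough illustration of that approximation, the sketch below estimates an upper-tail critical value from a normal distribution with mean df and variance 2·df. The function name `chi2_critical_normal_approx` is an assumption made for this example, and the result is only an approximation to the exact chi-squared quantile.

```python
import math
from statistics import NormalDist


def chi2_critical_normal_approx(df, alpha=0.05):
    """Approximate upper-tail chi-squared critical value for large df.

    Uses a normal distribution with mean df and variance 2*df.
    """
    if df <= 0:
        raise ValueError("df must be positive")
    z = NormalDist().inv_cdf(1.0 - alpha)  # standard normal quantile
    return df + z * math.sqrt(2.0 * df)


for df in (100, 1200):
    approx = chi2_critical_normal_approx(df)
    print(f"df = {df}: approximate 5% critical value = {approx:.1f}")
```

The approximation improves as the degrees of freedom grow, so it is better suited to the 1200 case than to the 100 case where tables are still available.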

6. Originally Posted by mr fantastic
Throwing away perfectly good data is totally absurd.

The chi-squared distribution is asymptotically normal as the number of degrees of freedom becomes infinite. It has mean = n and variance = 2n, where n is the number of degrees of freedom (not the sample size); for a large number of degrees of freedom (and 1200 is large) you can get a very good approximation using the normal distribution with this mean and variance.

That's why tables are usually not made for more than 100 degrees of freedom.
Thanks for that

Ok, I think I’ve got it now. Here it goes:

My null hypothesis is that there is no difference between the observed and expected temperatures.

My calculated chi-square statistic = 2.76

I have 1276 rows which effectively gives a degree of freedom of 100 (tables limit).

The critical value for the chi-square at a significance level of 0.05 is 124.3; if the calculated chi-square value is equal to or greater than this critical value, I can conclude that the probability of the null hypothesis being correct is 0.05, i.e. a very low probability. But it isn't: it is in fact much less than 124.3. This means that because my degrees of freedom are so high and my chi-square statistic so low, I can accept the null hypothesis with great confidence.

Is my method of calculation correct or should I be using a different statistic because my degrees of freedom are so large?
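For comparison, the normal approximation mentioned earlier in the thread can be worked through by hand instead of capping the table at 100. This is only a sketch and assumes df = 1276 - 1 = 1275, taken from the (rows-1)(columns-1) formula with two columns:

$$\text{mean} = 1275, \qquad \text{standard deviation} = \sqrt{2 \times 1275} = \sqrt{2550} \approx 50.5$$

$$\text{critical value at } 0.05 \approx 1275 + 1.645 \times 50.5 \approx 1358.1$$

$$z = \frac{2.76 - 1275}{50.5} \approx -25.2$$

Under that assumption the calculated statistic of 2.76 is still far below the critical value, so the comparison comes out the same way as with the 100-df table entry.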

7. Originally Posted by grain
Thanks for that

Ok, I think I’ve got it now. Here it goes:

My null hypothesis is that there is no difference between the observed and expected temperatures.

My calculated chi-square statistic = 2.76

I have 1276 rows which effectively gives a degree of freedom of 100 (tables limit).

The critical value for the chi-square at a significance level of 0.05 is 124.3; if the calculated chi-square value is equal to or greater than this critical value, I can conclude that the probability of the null hypothesis being correct is 0.05, i.e. a very low probability. But it isn't: it is in fact much less than 124.3. This means that because my degrees of freedom are so high and my chi-square statistic so low, I can accept the null hypothesis with great confidence.

Is my method of calculation correct or should I be using a different statistic because my degrees of freedom are so large?
If you're testing goodness of fit between your model and the data, your calculation looks OK to me.

8. Originally Posted by mr fantastic
If you're testing goodness of fit between your model and the data, your calculation looks OK to me.
Great, Thanks for that.

Goodness of fit is what I'm trying to achieve so that's a relief!