# Need Help With Hypothesis Test

• Mar 28th 2011, 09:28 AM
elleg
Let's just say it's been a while since college, and I can't remember everything about hypothesis testing, and now I'm faced with a situation at work where I need to refresh some of that old knowledge. Before I get too far, let me also state that I have no access to any statistical analysis software packages (like Minitab). The only thing I have is Excel.

I have some data regarding item failures along with a number of other variables. I have the date range in which the item was manufactured, and some other variables like environmental variables (temp, humidity, etc.) during the test of the items. The item is tested with a result of 'pass' or 'fail'. For each range variable, I want to see if the failures show any correlation. (I don't want to check combinations between variables, at least, not yet.)

For example,
Code:

```
Date Range      Pass    Fail    Total
-------------------------------------
Jan 2010          14       1       15
Feb 2010           1       0        1
Mar 2010          19       2       21
Apr 2010          59       1       60
May 2010          13       0       13
Jun 2010          17       4       21
-------------------------------------
Total            123       8      131
```
After much research, I determined that I ought to find the Pearson chi-squared statistic, and then Cramér's V. (I also calculated it with Yates' correction, since many expected counts were < 5.) For the example above, I chose the null hypothesis that the failures are independent of date, with the alternative hypothesis being that the failures are dependent on date.

So, these are the values I calculated:
Code:

```
Date Range      Expected Pass    Expected Fail    Total
-----------------------------------------------------
Jan 2010               14.084            0.916       15
Feb 2010                0.939            0.061        1
Mar 2010               19.718            1.282       21
Apr 2010               56.336            3.664       60
May 2010               12.206            0.794       13
Jun 2010               19.718            1.282       21
-----------------------------------------------------
Total                     123                8      131

Date Range      Deg/Free    Chi-2    Chi-Yates        V    Yates-V
-----------------------------------------------------------------------
Jan 2010               1    0.008        0.201
Feb 2010               1    0.065        3.360
Mar 2010               1    0.428        0.039
Apr 2010               1    2.063        1.361
May 2010               1    0.846        0.116
Jun 2010               0    6.133        4.084
-----------------------------------------------------------------------
Total                  5    9.543        9.162    0.270      0.264
```
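For anyone who wants to check these figures outside Excel, here is a minimal Python sketch of the same arithmetic. It assumes the usual formulas: expected count = row total × column total / grand total, Pearson's statistic as the sum of (O − E)² / E over all cells, the Yates variant as the sum of (|O − E| − 0.5)² / E, and Cramér's V = sqrt(chi² / (N × min(rows − 1, columns − 1))). The variable names are illustrative only.

```python
import math

# Observed (pass, fail) counts per manufacturing month, copied from the table above.
observed = {
    "Jan 2010": (14, 1),
    "Feb 2010": (1, 0),
    "Mar 2010": (19, 2),
    "Apr 2010": (59, 1),
    "May 2010": (13, 0),
    "Jun 2010": (17, 4),
}

n = sum(p + f for p, f in observed.values())        # grand total (131)
col_totals = (
    sum(p for p, _ in observed.values()),           # total passes (123)
    sum(f for _, f in observed.values()),           # total failures (8)
)

chi2 = 0.0
chi2_yates = 0.0
for month, counts in observed.items():
    row_total = sum(counts)
    for o, col_total in zip(counts, col_totals):
        e = row_total * col_total / n               # expected count under independence
        chi2 += (o - e) ** 2 / e                    # Pearson contribution of this cell
        chi2_yates += (abs(o - e) - 0.5) ** 2 / e   # Yates-corrected contribution

# Degrees of freedom for an r x c table: (r - 1) * (c - 1)
df = (len(observed) - 1) * (len(col_totals) - 1)

# Cramér's V, using min(rows - 1, columns - 1) as the scaling factor
k = min(len(observed), len(col_totals)) - 1
cramers_v = math.sqrt(chi2 / (n * k))
cramers_v_yates = math.sqrt(chi2_yates / (n * k))

print(f"df={df}  chi2={chi2:.3f}  chi2_yates={chi2_yates:.3f}")
print(f"V={cramers_v:.3f}  V_yates={cramers_v_yates:.3f}")
```

The totals should agree with the bottom row of the second table to three decimal places.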
Now, I go look in a chi-square table, and I see there is a 90% probability that chi-square will be greater than 1.61 for the 5 degrees of freedom in my example. Clearly, my chi-square value is greater than that, whether I look at the Yates-corrected one or not. So, my null hypothesis cannot be disproved, which effectively tells me nothing.
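As a way to double-check a table lookup like this, the sketch below computes the upper-tail area P(X > x) of a chi-square distribution directly, using only Python's standard library. The function name `chi2_upper_tail` is made up for this illustration, and it relies on the closed-form series that holds for whole-number degrees of freedom. Pearson's test conventionally rejects for large values of the statistic, so the upper-tail area at the observed statistic is the usual p-value.

```python
import math

def chi2_upper_tail(x, df):
    """P(X > x) for a chi-square variable with positive integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(chi) * sum_{r=1}^{(df-1)/2} chi^(2r-1) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)   # standard normal density at chi
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(x / 2)) + 2 * phi * total

p_table = chi2_upper_tail(1.61, 5)    # the table entry quoted above
p_stat = chi2_upper_tail(9.543, 5)    # the statistic computed from the data

print(f"P(X > 1.61)  = {p_table:.3f}")
print(f"P(X > 9.543) = {p_stat:.3f}")
```

The first value comes out at about 0.90, matching the table entry, and the second at about 0.09. In Excel, `CHIDIST(9.543, 5)` (or `CHISQ.DIST.RT` in newer versions) returns the same upper-tail area.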

So, here are my issues:
1) To be perfectly honest, I can't remember how I'm supposed to choose which "side" of the table to look up the value on, and I'm assuming I chose the right value from the table, but I really don't know if I did. Did I? How do I know which side to choose? I recall graphs of a bell curve and regions in the tails on either side of the curve representing where these types of values fall, but this is a vague recollection...
--EDIT: From what I recall, if you do a two-tailed test, you have to cut your significance in half (0.05 becomes 0.025), because there is a region of significance at each end of the probability curve. I think the chi-squared test for independence is two-tailed, but every example I find online still only looks at one p-value, and appears to use the full significance (0.05 instead of 0.025). Can someone explain this?
--
2) Assuming I picked the right value, and my test really does tell me nothing, how can I reverse my null hypothesis and alternative hypothesis, such that I'm testing a null hypothesis that the failures are dependent on date?

Thanks in advance for help on this!
• Mar 28th 2011, 11:26 AM
elleg