# Thread: Chi Squared Assistance

1. ## Chi Squared Assistance

Hi all, I'm having trouble getting my head around a prescribed problem. Chi squared is mandatory. No room to move on that.

See attached. I need to use chi squared to describe whether there's a statistically significant difference in the occurrence of the thing observed based on vegetation species (the row labels). After googling furiously I'm still a bit at sea on how you use chi squared to do that... all tips appreciated!

2. ## Re: Chi Squared Assistance

You want to use a chi-square test of independence. That is, you are testing whether frequency depends on vegetation type.

Your null hypothesis is $H_0$: Frequency and vegetation type are independent.

The alternative hypothesis is $H_1$: $H_0$ is false.

The test statistic is $\chi^2=\sum \dfrac{(O-E)^2}{E}$.

Since there are three rows and three columns, your test has $(r-1)(c-1) = (3-1)(3-1)=4$ degrees of freedom.

For each combination of vegetation type and frequency, we need to establish expected values along with the observed values that you are already given in the problem. To compute the expected value of a cell, multiply the column total by the row total and divide that product by the grand total N (here N = 50). For example, the expected value for acacia vegetation and 0 frequency is $\frac{11\cdot19}{50} = 4.18$. I will compute another expected value for you: the expected value of rainforest vegetation and frequency 1 to 5 is $\frac{15\cdot10}{50}=3$.
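If it helps to see the expected-value rule applied to a whole table at once, here is a minimal sketch in plain Python. The 3x3 counts are made up purely for illustration (they are NOT the numbers from the attachment), and the function name `expected_counts` is my own choice.

```python
def expected_counts(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical counts (rows = vegetation types, columns = frequency classes).
# These are invented for the example, not taken from the attached data.
observed = [
    [10, 6, 4],
    [8, 12, 10],
    [2, 2, 6],
]

expected = expected_counts(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

Each row of expected counts adds up to the same total as the matching row of observed counts, which is a handy check on hand calculations.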

Note that your professor assigned you a bad problem because the general rule is that each cell's expected value should be at least 5 for the chi-square test to be appropriate. In practice, you would want to design your study better, perhaps changing the frequency classes or collecting more data.
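To make the rule-of-thumb check concrete, here is a small sketch that flags expected values below 5. It only uses the two cells worked out above, built from the same totals; the dictionary layout and the name `too_small` are just my own illustration.

```python
# The two expected values computed earlier in this post.
expected = {
    ("acacia", "0"): 11 * 19 / 50,
    ("rainforest", "1 to 5"): 15 * 10 / 50,
}

# Keep only the cells whose expected value is under the usual threshold of 5.
too_small = {cell: e for cell, e in expected.items() if e < 5}

for cell, e in too_small.items():
    print(cell, round(e, 2))
```

Both of these cells come out under 5, which is why the problem is a poor fit for the test as stated.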

Once you have computed the expected values for all 9 cells, you will want to compute $\dfrac{(O-E)^2}{E}$ for each cell. The sum of these values gives you your chi-square test statistic.
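Here is a rough sketch of that sum in plain Python, in case you want to check your hand arithmetic. The counts are hypothetical (made up for this example, not the attached data), and `chi_square_statistic` is just a name I picked.

```python
def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over every cell of a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count for this cell
            statistic += (o - e) ** 2 / e
    return statistic

# Hypothetical 3x3 counts, invented for illustration only.
observed = [
    [10, 6, 4],
    [8, 12, 10],
    [2, 2, 6],
]

statistic = chi_square_statistic(observed)
dof = (len(observed) - 1) * (len(observed[0]) - 1)  # (r - 1)(c - 1)
print(round(statistic, 3), dof)
```

Swap in your own observed counts to get the statistic for the real problem.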

Finally, you look up where your test statistic falls on a chi-square table for 4 degrees of freedom. Since I didn't do this problem, let's say that your test statistic is 10 (I'm making this number up). For 4 degrees of freedom, 10 falls between the critical values for P=0.95 (9.488) and P=0.975 (11.143), where P is the cumulative probability listed on the chart I am using. In this case, you would reject $H_0$ at the 0.05 significance level, but you would NOT reject it at the 0.025 level. If you are not given a significance level, the typical one to use is $\alpha=0.05$, corresponding to a cumulative probability of $1-0.05=0.95$. Using my example test statistic of 10, you would conclude at the 0.05 significance level that vegetation type and frequency are NOT independent.
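If you would rather see a p-value than read a range off a table, here is a sketch for the 4-degrees-of-freedom case. It relies on the fact that with 4 degrees of freedom the upper-tail probability has the closed form $e^{-x/2}(1 + x/2)$, so it only covers that case. The function name `chi2_sf_df4` is my own, and the statistic of 10 is the same made-up value as above.

```python
import math

def chi2_sf_df4(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 4 df.

    Valid only for 4 degrees of freedom, where the tail is exp(-x/2) * (1 + x/2).
    """
    return math.exp(-x / 2) * (1 + x / 2)

statistic = 10.0  # made-up example value, not computed from real data
p_value = chi2_sf_df4(statistic)
print(round(p_value, 4))

for alpha in (0.05, 0.025):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(alpha, decision)
```

The p-value comes out at about 0.04, which sits between 0.025 and 0.05 and matches the table lookup: reject at 0.05, do not reject at 0.025.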

I hope this helps! Good luck!
-Andy