Quote:

Originally Posted by **askmemath**

I started studying for an MBA degree and I was told it would be helpful to study Statistics. So I did. And now I am stuck.

$\displaystyle \sum_{i=1}^{n} \frac{(O_i-E_i)^2}{E_i}$

I need to use this formula to work out the frequencies expected under the hypothesis that the number of absentees is independent of the day of the week.

No, that's not what this formula is used for. This is what we call a test statistic:

something whose distribution we know (approximately). It is a function of

the data, and the data expected under the test hypothesis. The $\displaystyle O_i$'s are the

observed frequencies in the analysis bins (in this case days), and the $\displaystyle E_i$ are the

frequencies we would have expected if our test hypothesis were true.

Quote:

Below is the number of employees absent for a single day during a particular period of time

Mon (121 absent), Tues (87 absent), Weds (87 absent), Thur (91 absent), Fri (114 absent). TOTAL 500

Then I have to "test at the 5% level whether the differences in the observed and expected data are significant".

Our test hypothesis is that there is no difference between the absentee rate

on different days of the week. Under this hypothesis we expect the same

number of absentees on each day, with a total number of $\displaystyle 500$ absentees.

So we have expected numbers $\displaystyle E_i$ of $\displaystyle 100, 100, 100, 100, 100$ for $\displaystyle i=1,\dots,5$,

where $\displaystyle i=1$ denotes Monday, $\displaystyle i=2$ denotes Tuesday etc.

The observed frequencies $\displaystyle O_i$ are $\displaystyle 121,87,87,91,114$, so now we have

everything we need to plug into the formula for the test statistic.

$\displaystyle \sum_{i=1}^5 \frac{(O_i-E_i)^2}{E_i}=\frac{(121-100)^2}{100}+\frac{(87-100)^2}{100}+\frac{(87-100)^2}{100}+\frac{(91-100)^2}{100}+\frac{(114-100)^2}{100}$

so:

$\displaystyle \sum_{i=1}^5 \frac{(O_i-E_i)^2}{E_i}=4.41+1.69+1.69+0.81+1.96=10.56$
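If you want to check the arithmetic, here is a minimal Python sketch of the same calculation. The variable names are just illustrative choices; the observed counts are the ones from the question, and the expected counts assume the total is split evenly across the five days.

```python
# Observed absentees, Monday through Friday (from the question).
observed = [121, 87, 87, 91, 114]

# Under the hypothesis of no day-of-week effect, the total is split evenly.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Contribution of each day, (O_i - E_i)^2 / E_i, and their sum.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_squared = sum(terms)

print("total:", total, "expected per day:", expected[0])
print("terms:", [round(t, 2) for t in terms])
print("test statistic:", round(chi_squared, 2))
```

The per-day terms make it easy to see that Monday contributes the most to the statistic.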

Now, if I recall correctly, the test statistic has (approximately) a $\displaystyle \chi^2$ (Chi squared)

distribution with $\displaystyle 4$ degrees of freedom (number of bins minus one).

Now we look up the value that a $\displaystyle \chi^2$ variable with $\displaystyle 4$ degrees of freedom falls

below 95% of the time; this is $\displaystyle 9.49$. Thus, if the test hypothesis were true, we would

expect our test statistic to exceed this value no more than 5% of the time. In this case

our test statistic ($\displaystyle 10.56$) does exceed this value, so "at the 5% level the differences in

the observed and expected data are significant", and we reject the hypothesis that

absenteeism is independent of the day of the week.
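The table lookup can also be checked without a table. This is a sketch that relies on the fact that, for exactly $\displaystyle 4$ degrees of freedom, the upper-tail probability of a $\displaystyle \chi^2$ variable has the closed form $\displaystyle P(X>x)=e^{-x/2}(1+x/2)$. The helper name `chi2_sf_4df` is made up for illustration and only works for this number of degrees of freedom.

```python
import math


def chi2_sf_4df(x):
    """Upper-tail probability P(X > x) for a chi-squared variable with 4 d.f.

    Uses the closed form exp(-x/2) * (1 + x/2), valid only for 4 d.f.
    """
    return math.exp(-x / 2) * (1 + x / 2)


statistic = 10.56      # test statistic computed from the data
critical_value = 9.49  # tabulated 95% point for 4 d.f.
alpha = 0.05           # significance level from the question

p_value = chi2_sf_4df(statistic)

print("tail probability at the critical value:", round(chi2_sf_4df(critical_value), 3))
print("p-value of the observed statistic:", round(p_value, 3))
print("reject at the 5% level:", p_value < alpha)
```

Comparing the statistic with the critical value and comparing the p-value with 5% are two ways of making the same decision.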

RonL