1. ## Statistics

I started studying for an MBA degree and I was told it would be helpful to study Statistics. So I did. And now I am stuck.

Summation(i=1,n) (Oi - Ei) squared divided by Ei

I need to use this formula to work out the frequencies expected under the hypothesis that the number of absentees is independent of the day of the week. Below is the number of employees absent for a single day during a particular period of time:
Mon (121 absent), Tues (87 absent), Weds (87 absent), Thur (91 absent), Fri (114 absent), TOTAL 500
Then I have to "test at the 5% level whether the differences in the observed and expected data are significant".

I want to be really dense and say "Eh??"

I started studying for an MBA degree and I was told it would be helpful to study Statistics. So I did. And now I am stuck.

Summation(i=1,n) (Oi - Ei) squared divided by Ei

I need to use this formula to work out the frequencies expected under the hypothesis that the number of absentees is independent of the day of the week.
No, that's not what this formula is used for. This is what we call a test statistic:
something whose distribution we know (approximately). It is a function of
the data and of the data expected under the test hypothesis. The $O_i$'s are the
observed frequencies in the analysis bins (in this case days), and the $E_i$'s are the
frequencies we would have expected if our test hypothesis were true.

Below is the number of employees absent for a single day during a particular period of time:
Mon (121 absent), Tues (87 absent), Weds (87 absent), Thur (91 absent), Fri (114 absent), TOTAL 500
Then I have to "test at the 5% level whether the differences in the observed and expected data are significant".
Our test hypothesis is that there is no difference between the absentee rates
for different days of the week. Under this hypothesis we expect the same
number of absentees on each day, with a total number of $500$ absentees.

So we have expected numbers $E_i = 500/5 = 100$ for each of $i=1,\dots,5$,
where $i=1$ denotes Monday, $i=2$ denotes Tuesday, etc.
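
If you would rather let a computer do the sharing-out, here is a minimal Python sketch of the step above. The variable names (`observed`, `expected`) are my own choice, and the equal split is just the "no day effect" hypothesis written as code.

```python
# Observed absences per weekday, copied from the question.
observed = {"Mon": 121, "Tue": 87, "Wed": 87, "Thu": 91, "Fri": 114}

# Total number of absences over all five days.
total = sum(observed.values())

# Under the hypothesis that the day makes no difference,
# every day gets an equal share of the total.
expected = {day: total / len(observed) for day in observed}

print(total)
print(expected)
```

This should report a total of 500 and an expected count of 100 for every day, matching the $E_i$ above.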

The observed frequencies $O_i$ are $121,87,87,91,114$, so now we have
everything we need to plug into the formula for the test statistic.

$
\sum_{i=1}^5 \frac{(O_i-E_i)^2}{E_i}=\frac{(121-100)^2}{100}+\frac{(87-100)^2}{100}+\frac{(87-100)^2}{100}+\frac{(91-100)^2}{100}+\frac{(114-100)^2}{100}
$
so:
$
\sum_{i=1}^5 \frac{(O_i-E_i)^2}{E_i}=4.41+1.69+1.69+0.81+1.96=10.56
$
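
The same arithmetic can be checked with a few lines of Python. This is only a sketch of the sum above; the names `terms` and `chi_sq` are mine, and the expected counts are the 100-per-day figures from the test hypothesis.

```python
# Observed and expected absences, Monday to Friday.
observed = [121, 87, 87, 91, 114]
expected = [100] * 5

# One term (O_i - E_i)^2 / E_i per day.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# The test statistic is the sum of those terms.
chi_sq = sum(terms)

print([round(t, 2) for t in terms])
print(round(chi_sq, 2))
```

The individual terms show which days contribute most to the statistic; here Monday's term is the largest.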

Now if I recall correctly the test statistic has (approximately) a $\chi^2$ (chi-squared)
distribution with $4$ degrees of freedom (number of bins minus one).

Now we look up the value that 95% of all observations from a $\chi^2$ distribution
with $4$ degrees of freedom should fall below; this is $9.49$. Thus, if the test
hypothesis were true, we would expect our test statistic to exceed this no more
than 5% of the time. In this case our test statistic, $10.56$, does exceed this
value, so "at the 5% level the differences in the observed and expected data are
significant".
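
If you do not have tables to hand, the comparison can be sketched in Python as well. For $4$ degrees of freedom the upper-tail probability of a $\chi^2$ variable has the closed form $e^{-x/2}(1+x/2)$, so no statistics library is needed. The helper name `chi2_sf_4df` is made up for this example, and the formula only holds for exactly $4$ degrees of freedom.

```python
import math


def chi2_sf_4df(x):
    """Upper-tail probability P(X > x) for a chi-squared variable with 4 degrees of freedom.

    Uses the closed form exp(-x/2) * (1 + x/2), valid only for 4 degrees of freedom.
    """
    return math.exp(-x / 2) * (1 + x / 2)


chi_sq = 10.56    # test statistic computed from the absentee data
critical = 9.49   # tabulated 95% point for 4 degrees of freedom

# Probability of seeing a statistic at least this large if the hypothesis is true.
p_value = chi2_sf_4df(chi_sq)

print(round(p_value, 3))
print(chi_sq > critical)                 # True means "significant at the 5% level"
print(round(chi2_sf_4df(critical), 3))   # sanity check: should be close to 0.05
```

The tail probability comes out at roughly 0.03, which is below 0.05 and so agrees with the table lookup.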

RonL

3. WOW!!!

p.s. Thank you so very much!!