# Thread: Goodness of fit - explaining standard deviations

1. Evening folks.

Hopefully someone can help me with this, as stats isn't my forte and hours of internet trawling haven't been any help!

I've performed an experiment to find out whether a certain scenario fits a negative binomial distribution. After collecting the data, I performed a chi-squared goodness of fit test. The result was that there was no evidence to reject the null, so the data are consistent with a Neg. Bin. Distribution.

My problem lies with the mean and standard deviations. The means of the observed and expected data are equal. Why is that so? Is it because the expected values are calculated from the observed values?

Secondly, the standard deviations of the observed and expected are not equal (but are close: 5.6 and 6.5 approximately). Why is there a difference in the SD?

*Edit* - just to clarify, I know what SD is, it's just the reasoning behind the observed and expected SDs being different that I'm unclear on.

2. In this situation it would be beneficial to see your data set or at least some summary statistics.

As for the means being the same and the standard deviations being different, this is just a property of the data sets you have. I would just accept this.

For example, consider two binomial distributions with $\displaystyle n_1 = 5, p_1=0.2 \implies \mu_1 = 1, \sigma_1 = 0.89$ and $\displaystyle n_2 = 10, p_2=0.1 \implies \mu_2 = 1, \sigma_2 = 0.95$. You can see that $\displaystyle \mu_1 =\mu_2, \sigma_1 \neq \sigma_2$. This is just an example off the top of my head, but it shows that this can happen.
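For anyone who wants to check the numbers, here is a minimal Python sketch. It assumes the example refers to binomial distributions, so that the mean is np and the standard deviation is sqrt(np(1-p)); the function name `binomial_mean_sd` is just an illustrative helper.

```python
import math

def binomial_mean_sd(n, p):
    """Return (mean, standard deviation) of a Binomial(n, p) variable."""
    return n * p, math.sqrt(n * p * (1 - p))

# The two parameter sets from the example above
mu1, sd1 = binomial_mean_sd(5, 0.2)
mu2, sd2 = binomial_mean_sd(10, 0.1)

print(f"n=5,  p=0.2: mean={mu1:.2f}, sd={sd1:.2f}")
print(f"n=10, p=0.1: mean={mu2:.2f}, sd={sd2:.2f}")
```

Both means come out as 1 while the standard deviations differ, which is the point of the example.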

3. Hi, thanks for the reply.

Chi-square value was 17.24. Critical values are 22.36 at p<.05 and 27.69 at p<.01.

Observed & expected means are 12.8

Obs, SD = 5.29
Exp, SD = 6.47

4. Still not making any headway with this.

Any assistance appreciated.

5. I think I might have the answer I was looking for and, if I'm right, then I'm stupid for not seeing this in the first place. If I'm wrong, I'm stupid for thinking this might be the answer. Either way, stats is not my thing!

So, here's why I think the difference in the SD can be explained (hopefully the graph shows).

The expected distribution includes values of X > 30. In the experiment, the observed values were always X < 27. So the expected distribution takes into account the possibility of seeing values of X greater than those observed.

Wouldn't this explain the greater SD for the expected?
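One way to put a number on this idea is the rough Python sketch below. It uses figures given elsewhere in the thread (3 successes required, mean of 12.8) and assumes that the mean counts total trials per observation, so that p is about 3/12.8. That reading is an assumption, and `pmf_trials` is only an illustrative helper name.

```python
from math import comb

r = 3                 # successes required per observation (from the thread)
mean_trials = 12.8    # reported mean; ASSUMED to count total trials
p = r / mean_trials   # estimate of p that makes the fitted mean match the data

def pmf_trials(t, r, p):
    """P(the r-th success occurs on trial t) under a negative binomial model."""
    return comb(t - 1, r - 1) * p**r * (1 - p)**(t - r)

# Probability the fitted model places above the observed range (X >= 27)
tail = 1.0 - sum(pmf_trials(t, r, p) for t in range(r, 27))

print(f"P(X >= 27) = {tail:.4f}")
print(f"Expected count out of 100 observations: {100 * tail:.1f}")
```

If the tail probability is small but not negligible, the fitted distribution expects a few observations beyond the largest value actually seen, and those would pull the expected SD above the observed one.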

6. How did you select the expected distribution?

CB

7. I ran an experiment and estimated the probability from that data. I then used the negative binomial formula to work out the probabilities for the expected values, with n = no. of required successful observations, x = no. of failures, p = estimated probability.
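The formula itself has not come through in the post. Assuming the usual parameterisation that matches the symbols defined here (an assumption, not something the post confirms), the negative binomial probability of x failures before the n-th success would be:

$\displaystyle P(X = x) = \binom{x+n-1}{x}\, p^{n} (1-p)^{x}, \quad x = 0, 1, 2, \dots$

Multiplying each probability by the number of observations gives the expected frequencies used in the chi-squared test.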

8. There are two parameters, p and n. How did you estimate them from your data?

CB

9. n was the number of successes I needed (3).

p was estimated from the observed data. 100 observations were made (needing 3 successes each). Therefore estimate of p was given by (3 x 100)/total number of trials.

10. OK, the means are the same because you have set things up that way: p was estimated so that the mean of the fitted distribution equals the observed mean. The standard deviations were not matched, so they will not usually be equal.

CB
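To see this with the thread's own numbers, here is a minimal Python sketch. It assumes the reported mean of 12.8 is the mean number of trials per observation, so the total number of trials (not reported in the thread) is implied as 100 × 12.8. The variable names are illustrative only.

```python
import math

r = 3                # successes required per observation (from the thread)
n_obs = 100          # number of observations (from the thread)
mean_trials = 12.8   # reported observed mean; ASSUMED to count total trials
total_trials = n_obs * mean_trials   # implied value, not reported directly

# Estimate of p as described in the thread: (3 x 100) / total number of trials
p_hat = r * n_obs / total_trials

# Negative binomial mean (in trials) and SD with the estimated p
fitted_mean = r / p_hat
fitted_sd = math.sqrt(r * (1 - p_hat)) / p_hat

print(f"p_hat       = {p_hat:.4f}")
print(f"fitted mean = {fitted_mean:.2f}")
print(f"fitted SD   = {fitted_sd:.2f}")
```

The fitted mean equals the observed mean by construction, because p was chosen from that mean. The fitted SD is then fixed by n and p alone and comes out close to the reported expected SD of 6.47, whereas the observed SD of 5.29 is computed from the sample and never enters the fit.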

11. Thank you very much pickslides and CaptainBlack.