# Making Population Inferences From Samples

• Apr 8th 2010, 01:16 PM
CogitoErgoCogitoSum
How do you infer about the population from a sample? This how-to is the bulk of my question.

Now, do you need just one sample to infer about the population? Or do you need a multitude of samples? Do you need the sample mean and standard deviation? Or do you need the mean of the sample means and the standard deviation of the standard deviations among this multitude of samples? Every reference I find leaves this point open to assumption.

Do you then need a large sample size? Is that relevant? Or do you need a large sample set of samples? I mean, is having a large multitude of small samples sufficient? Or is that the same as fewer samples of a larger size?

Suppose then you had the population standard deviation. According to my reference, you can use the standard normal distribution and a z-score test to infer about the population. But it doesn't say how.

Likewise, if your sample is large enough (or if the quantity of samples is large enough, I'm not sure), you can also infer about a population from a sample. This time you don't need the population standard deviation, but you do need the sample standard deviation (or the standard deviation of the standard deviations of the samples, I'm not sure which).

But if you don't know your population standard deviation and your sample size is small, then use the t-test. But my reference doesn't tell me how to do that either... up until now there was no mention of the Student t-distribution.

What exactly are you inferring about the population? Whether you use the z-score or the t-score test, either way, you still need to know something about the population... be it the standard deviation or the mean. So it seems to me that you already need to have tallied up the population in order to make any sense of the sample's deviation. What exactly are you inferring about the population from the sample?.. When you already know these particular variables about the population anyway, what is the point?

What I mean to say is... if you know neither the population standard deviation NOR the population mean... but you only have one (or more) samples... THEN how do you infer about the population?
• Apr 8th 2010, 07:25 PM
macosxnerd101
Quote:

Originally Posted by CogitoErgoCogitoSum
How do you infer about the population from a sample? This how-to is the bulk of my question.

Now, do you need just one sample to infer about the population? Or do you need a multitude of samples? Do you need the sample mean and standard deviation? Or do you need the mean of the sample means and the standard deviation of the standard deviations among this multitude of samples? Every reference I find leaves this point open to assumption.

Do you then need a large sample size? Is that relevant? Or do you need a large sample set of samples? I mean, is having a large multitude of small samples sufficient? Or is that the same as fewer samples of a larger size?

When making inferences from a sample, remember that it is a SIN not to check conditions.
SRS- The data must come from a simple random sample.
Independence- The units or subjects must be independent of each other.
Normality- The sampling distribution of the statistic must be approximately normal. So if you are working with means, then the sample size needs to be >= 30. If it isn't, then graph the data and check for skewness. If there is no significant skew (kind of a subjective thing), then you can proceed under the assumption of normality. If you are working with proportions, then np >= 10 and n(1-p) >= 10; the sample should also be no more than 10% of the population (10n <= population size) so that the draws can be treated as independent.
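
For what it's worth, the rule-of-thumb checks above can be sketched in a few lines of Python. This is only an illustration: the function names and the example numbers are made up, and the thresholds (30, 10, and the 10% rule) are the ones quoted in this post, not hard limits.

```python
def check_proportion_conditions(n, p, population_size):
    """Rule-of-thumb checks for inference on a sample proportion.

    n: sample size, p: proportion, population_size: size of the population.
    Returns a dict mapping each condition to True/False.
    """
    return {
        "np >= 10": n * p >= 10,
        "n(1-p) >= 10": n * (1 - p) >= 10,
        "10n <= population": 10 * n <= population_size,
    }


def mean_sample_large_enough(n, threshold=30):
    """Rule of thumb for means: n >= 30 (otherwise graph the data and check skew)."""
    return n >= threshold


# Hypothetical example: 200 people sampled from a town of 50,000, p = 0.12.
checks = check_proportion_conditions(n=200, p=0.12, population_size=50_000)
for condition, ok in checks.items():
    print(condition, "->", "ok" if ok else "FAILED")
```

If any check fails, the z-based methods for proportions are on shaky ground and the result should be read with caution.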

Quote:

Suppose then you had the population standard deviation. According to my reference, you can use the standard normal distribution and a z-score test to infer about the population. But it doesn't say how.

Likewise, if your sample is large enough (or if the quantity of samples is large enough, I'm not sure), you can also infer about a population from a sample. This time you don't need the population standard deviation, but you do need the sample standard deviation (or the standard deviation of the standard deviations of the samples, I'm not sure which).

But if you don't know your population standard deviation and your sample size is small, then use the t-test. But my reference doesn't tell me how to do that either... up until now there was no mention of the Student t-distribution.
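Z-scores are for when you work with proportions, or with means when the population standard deviation is known. T-scores are for means when you do not know the population standard deviation and have to estimate it from the sample. As n grows, the t-distribution approaches the standard normal, so for very large samples (say n > 1000) the two give practically the same answer.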
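
To show the "how" the reference left out, here is a rough sketch of the two test statistics for a mean. It assumes the usual textbook convention (z when the population standard deviation sigma is given, t when the sample standard deviation s stands in for it). The data, the hypothesized mean `mu0`, and the value of `sigma` are all invented for illustration.

```python
import math
import statistics


def z_statistic(sample, mu0, sigma):
    """z = (x_bar - mu0) / (sigma / sqrt(n)), with sigma known."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (sigma / math.sqrt(n))


def t_statistic(sample, mu0):
    """t = (x_bar - mu0) / (s / sqrt(n)), with s estimated from the sample.

    Returns the statistic together with its degrees of freedom, n - 1.
    """
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return (statistics.mean(sample) - mu0) / (s / math.sqrt(n)), n - 1


# Made-up measurements; the hypothesized population mean is 5.0.
sample = [4.8, 5.1, 5.3, 4.9, 5.4, 5.0, 5.2, 5.1]

z = z_statistic(sample, mu0=5.0, sigma=0.25)  # sigma assumed known
t, df = t_statistic(sample, mu0=5.0)          # sigma unknown, use s instead
print("z =", round(z, 3))
print("t =", round(t, 3), "with df =", df)
```

Note that only one sample is used here, and the only population quantity that appears is the hypothesized mean, which is a claim being tested rather than something measured.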

Quote:

What exactly are you inferring about the population? Whether you use the z-score or the t-score test, either way, you still need to know something about the population... be it the standard deviation or the mean. So it seems to me that you already need to have tallied up the population in order to make any sense of the sample's deviation. What exactly are you inferring about the population from the sample?.. When you already know these particular variables about the population anyway, what is the point?
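Z- and t-scores tell you how many standard errors your sample statistic lies from a hypothesized population value. From that score you find the probability of a result at least that extreme occurring, given the hypothesized mean and the standard deviation (and the df in the case of a t-test). They are usually used in hypothesis testing, where that probability (the p-value) is compared against an alpha value to see if the data are significant.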
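
As a small sketch of that last step, a z-score can be turned into a two-sided p-value with `statistics.NormalDist` from the Python standard library and then compared against alpha. The z value and the alpha of 0.05 below are just example numbers; a t-score would need the t-distribution instead, which the standard library does not provide.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability of a z-score at least this far from 0, in either direction."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


alpha = 0.05  # example significance level
z = 2.1       # example test statistic

p = two_sided_p_value(z)
significant = p < alpha
print("p-value:", round(p, 4))
print("reject the null hypothesis" if significant else "fail to reject the null hypothesis")
```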

Quote:

What I mean to say is... if you know neither the population standard deviation NOR the population mean... but you only have one (or more) samples... THEN how do you infer about the population?
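In a nutshell, this is why we use confidence intervals. Basically, a confidence interval gives a range of plausible values for the population parameter, built around the sample statistic, along with how confident we are that the range has captured the parameter. The formula is as follows: CI = statistic ± (critical value) × (standard error). For a mean with unknown population standard deviation, the standard error is s/sqrt(n), where s is the sample standard deviation, and the critical value comes from the t-distribution with n - 1 df. Note that with the ± operator you get a range of values (an interval) from this.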
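
Here is a minimal sketch of that formula for a mean, using nothing but one sample. The data are invented, and the critical value 2.262 is the two-sided 95% value for 9 degrees of freedom read from a standard t table; for a different sample size or confidence level you would look up a different value.

```python
import math
import statistics


def mean_confidence_interval(sample, critical_value):
    """statistic +/- critical value * standard error, for a sample mean."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)  # s / sqrt(n)
    margin = critical_value * standard_error
    return x_bar - margin, x_bar + margin


# Ten made-up measurements, so df = 10 - 1 = 9.
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.0, 12.1]
T_CRIT_95_DF9 = 2.262  # from a t table: 95% confidence, 9 degrees of freedom

low, high = mean_confidence_interval(sample, T_CRIT_95_DF9)
print("95% CI for the population mean:", round(low, 3), "to", round(high, 3))
```

Neither the population mean nor the population standard deviation is needed: both are estimated from the single sample, and the t critical value accounts for the extra uncertainty in using s.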