As described in "Sampling distributions" (under Inferential statistics), a larger sample contains more information. Hence, a parameter (such as the population mean) can be estimated with more precision (a smaller standard error) as the sample size increases.
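As a small illustration of this point, the standard error of the sample mean is sigma divided by the square root of n, so it shrinks as n grows. The sketch below assumes sigma is known; the function name and the sigma value of 10 are made up for illustration only.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean for a sample of size n."""
    return sigma / math.sqrt(n)

# Hypothetical sigma = 10: quadrupling n halves the standard error.
for n in (25, 100, 400):
    print(n, standard_error(10, n))
```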

The number of observations in a sample helps to control the probability of making a Type II error (the probability of failing to reject a false null hypothesis).

For the purpose of hypothesis testing about the population mean, the following rule is applied in practice.

Is n large (n>=30)?

- no -- Is the population approximately normal?
    - no -- increase sample size to 30 or more
    - yes -- Is the value of sigma known?
        - no -- estimate sigma -- use the t-distribution
        - yes -- use the normal distribution (z)

- yes -- Is the value of sigma known?
    - no -- estimate sigma -- use the normal distribution
    - yes -- use the normal distribution (z)
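The rule above can be sketched as a small function. This is only an illustration of the decision tree as stated here; the function name, argument names, and return labels are hypothetical.

```python
def choose_distribution(n, population_normal, sigma_known):
    """Return the choice suggested by the rule of thumb above.

    Returns "z", "t", or "increase n" (hypothetical labels).
    """
    if n >= 30:
        # Large sample: the normal distribution is used whether
        # sigma is known or has to be estimated from the sample.
        return "z"
    if not population_normal:
        # Small sample from a non-normal population.
        return "increase n"
    # Small sample from an approximately normal population.
    return "z" if sigma_known else "t"

print(choose_distribution(12, population_normal=True, sigma_known=False))
```

For a sample of 12 from an approximately normal population with sigma unknown, the rule points to the t-distribution.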

The determination of an "appropriate" sample size depends on the parameter in question. For the population mean the following information is necessary:

- The population standard deviation (or some estimate, e.g., the range divided by four).
- The degree of confidence.
- Some specified bound on the error of estimation for the sample mean.

Therefore, the estimate of the sample size can be obtained by applying the following formula:

n = ( z(a/2) * sigma / bound ) ** 2

where z(a/2) is the value from the table of the normal distribution that leaves an upper-tail area of alpha over two (a/2), with alpha equal to one minus the degree of confidence; sigma is the population standard deviation; and bound is the limit of the interval for the sample mean. The whole expression is squared (** 2).

Suppose we are interested in estimating the mean GPA at UWF, with 95% confidence, to within 0.25 of a point.

z(a/2) = z(0.025) = 1.96
bound = 0.25

Suppose that sigma = 0.725 (for instance, assuming that the lowest GPA is 1.1, the range would be 2.9 = 4 - 1.1, and 2.9/4 = 0.725).

The estimated sample size would be:

n = (1.96 * 0.725 / 0.25) ** 2 = 32.308

Because the sample size must be a whole number, we round up. Therefore, we need to sample about 33 students.
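The same calculation can be checked in code. The sketch below is a minimal illustration, not a definitive implementation: the function name is hypothetical, and instead of the table value 1.96 it takes z(a/2) from Python's standard-library `statistics.NormalDist`, so the unrounded result differs slightly in the third decimal place.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, bound, confidence=0.95):
    """Return (unrounded n, n rounded up) for estimating a mean."""
    alpha = 1 - confidence
    # z(a/2): the value leaving an upper-tail area of alpha / 2
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n_exact = (z * sigma / bound) ** 2
    # Round up, since a fractional observation cannot be sampled
    return n_exact, math.ceil(n_exact)

# GPA example: sigma estimated as range / 4 = (4 - 1.1) / 4
n_exact, n_needed = sample_size_for_mean(sigma=0.725, bound=0.25)
print(n_exact, n_needed)
```

Halving the bound to 0.125 roughly quadruples the required sample size, because the bound enters the formula squared.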