The population variance indicates how spread out a data set is. Unfortunately, it is typically impossible to know exactly what this population parameter is. To compensate for our lack of knowledge, we use a tool from inferential statistics called a confidence interval. We will work through an example of how to calculate a confidence interval for a population variance.

## Confidence Interval Formula

The formula for the (1 - α) confidence interval about the population variance is given by the following string of inequalities:

[ (*n* - 1)*s*^{2}] / *B* < σ^{2} < [ (*n* - 1)*s*^{2}] / *A*.

Here *n* is the sample size and *s*^{2} is the sample variance. The number *A* is the point of the chi-square distribution with *n* - 1 degrees of freedom at which exactly α/2 of the area under the curve is to the left of *A*. In a similar way, the number *B* is the point of the same chi-square distribution with exactly α/2 of the area under the curve to the right of *B*.

## Preliminaries

We begin with a data set with 10 values. This set of data values was obtained by a simple random sample:

97, 75, 124, 106, 120, 131, 94, 97, 96, 102

Some exploratory data analysis is needed to check that there are no outliers. A stem and leaf plot suggests that the data likely come from an approximately normal distribution. This matters because the chi-square method used here assumes a normally distributed population. With that check done, we can proceed with finding a 95% confidence interval for the population variance.
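As a rough illustration of that exploratory step, the following Python sketch prints a simple stem and leaf plot of the ten values, using the leading digits as stems and the ones digit as leaves. The variable names are illustrative choices and are not part of the original example.

```python
from collections import defaultdict

data = [97, 75, 124, 106, 120, 131, 94, 97, 96, 102]

# Group each value's ones digit (leaf) under its leading digits (stem).
stems = defaultdict(list)
for value in sorted(data):
    stems[value // 10].append(value % 10)

# Print every stem in the range, including empty ones, so gaps stay visible.
for stem in range(min(stems), max(stems) + 1):
    leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem:>2} | {leaves}")
```

A plot like this is only an informal check; with ten values it can suggest, but not prove, that a normal model is reasonable.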

## Sample Variance

We need to estimate the population variance with the sample variance, denoted by *s*^{2}. So we begin by calculating this statistic. Essentially we are averaging the squared deviations from the mean, except that we divide their sum by *n* - 1 rather than by *n*.

We find that the sample mean is 104.2. Using this, we have the sum of squared deviations from the mean given by:

(97 – 104.2)^{2} + (75 – 104.2)^{2} + . . . + (96 – 104.2)^{2} + (102 – 104.2)^{2} = 2495.6

We divide this sum by 10 – 1 = 9 to obtain a sample variance of approximately 277.
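The arithmetic in this section can be checked with a short Python sketch. It uses only the standard library, and the variable names are illustrative assumptions rather than anything from the original example.

```python
from statistics import mean, variance

data = [97, 75, 124, 106, 120, 131, 94, 97, 96, 102]
n = len(data)

# Sample mean, then the sum of squared deviations from it.
sample_mean = mean(data)
sum_sq_dev = sum((x - sample_mean) ** 2 for x in data)

# Divide by n - 1 (not n) to get the sample variance.
sample_variance = sum_sq_dev / (n - 1)

# statistics.variance also divides by n - 1, so it serves as a cross-check.
assert abs(sample_variance - variance(data)) < 1e-9

print(f"mean = {sample_mean:.1f}")
print(f"sum of squared deviations = {sum_sq_dev:.1f}")
print(f"sample variance = {sample_variance:.2f}")
```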

## Chi-Square Distribution

We now turn to our chi-square distribution. Since we have 10 data values, we have 9 degrees of freedom. Since we want the middle 95% of our distribution, we need 2.5% in each of the two tails. We consult a chi-square table or software and see that the table values of 2.7004 and 19.023 enclose 95% of the distribution’s area. These numbers are *A* and *B*, respectively.
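For readers without a chi-square table at hand, here is a minimal sketch of how such critical values could be computed with only the Python standard library. It evaluates the chi-square cumulative area with a series for the incomplete gamma function and then inverts it by bisection. The helper names `chi2_cdf` and `chi2_quantile` are hypothetical, and in practice a statistics library routine (for example `scipy.stats.chi2.ppf`) would normally be used instead.

```python
import math

def chi2_cdf(x, df):
    """Area under the chi-square curve (df degrees of freedom) left of x."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    # Series expansion of the regularized lower incomplete gamma P(a, z).
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-15:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def chi2_quantile(p, df):
    """Point with area p to its left, found by bisection on chi2_cdf."""
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < p:  # widen the bracket until it holds the answer
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

alpha = 0.05  # 95% confidence leaves 2.5% in each tail
df = 9        # n - 1 degrees of freedom for n = 10
A = chi2_quantile(alpha / 2, df)      # 2.5% of the area to the left
B = chi2_quantile(1 - alpha / 2, df)  # 2.5% of the area to the right
print(round(A, 4), round(B, 3))
```

Up to rounding, the two printed numbers should agree with the table values quoted above.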

We now have everything that we need, and we are ready to assemble our confidence interval. The formula for the left endpoint is [ (*n* - 1)*s*^{2}] / *B*. This means that our left endpoint is:

(9 x 277)/19.023 ≈ 131

The right endpoint is found by replacing *B* with *A*:

(9 x 277)/2.7004 ≈ 923

And so we are 95% confident that the population variance lies between 131 and 923.
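The endpoint calculation can be sketched in Python as follows, plugging the rounded sample variance and the table values from this example into the formula. The variable names are illustrative assumptions.

```python
n = 10      # sample size
s2 = 277    # sample variance, rounded as in the text
A = 2.7004  # chi-square point with 2.5% of the area to its left (9 df)
B = 19.023  # chi-square point with 2.5% of the area to its right (9 df)

# Dividing by the larger critical value B gives the smaller (left) endpoint.
lower = (n - 1) * s2 / B
upper = (n - 1) * s2 / A

print(f"{lower:.1f} < sigma^2 < {upper:.1f}")
```

Note how wide the interval is: with only ten observations, the data pin down the variance rather loosely.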

## Population Standard Deviation

Of course, since the standard deviation is the square root of the variance, this method could be used to construct a confidence interval for the population standard deviation. All that we would need to do is to take square roots of the endpoints. The result would be a 95% confidence interval for the standard deviation.
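As a brief sketch of that last step, the Python below recomputes the variance endpoints from the values used in this example and takes their square roots. The variable names are illustrative assumptions.

```python
import math

n, s2 = 10, 277          # sample size and rounded sample variance
A, B = 2.7004, 19.023    # chi-square table values for 9 degrees of freedom

# Endpoints of the 95% confidence interval for the variance.
var_lower = (n - 1) * s2 / B
var_upper = (n - 1) * s2 / A

# Square roots give the matching interval for the standard deviation.
sd_lower = math.sqrt(var_lower)
sd_upper = math.sqrt(var_upper)

print(f"{sd_lower:.1f} < sigma < {sd_upper:.1f}")
```

Taking square roots preserves the order of the endpoints because the square root function is increasing on positive numbers.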