In inferential statistics one of the major goals is to estimate an unknown population parameter. We start with a statistical sample and from this determine a range of values for the parameter. This range of values is called a confidence interval.
Confidence intervals are all similar to one another in a few ways. First, many two-sided confidence intervals have the same form:
Estimate ± Margin of Error
Secondly, the steps for calculating confidence intervals are very similar, no matter what type of confidence interval we are trying to find. The specific type of confidence interval that we will look at is a two-sided confidence interval for a population mean when we know the population standard deviation. We also assume that we are working with a population that is normally distributed.
Process for Confidence Interval for Mean - Known Sigma
Below is a process to find our desired confidence interval. Although all of the steps are important, the first one is particularly so:
- Check Conditions: Begin by making sure that the conditions for our confidence interval have been met. We assume that we know the value of the population standard deviation, denoted by the Greek letter sigma σ. We also assume a normal distribution.
- Calculate Estimate: We estimate our population parameter, in this case the population mean, by use of a statistic, in this case the sample mean. This involves forming a simple random sample from our population. Sometimes we can suppose that our sample is a simple random sample, even if it does not meet the strict definition.
- Critical Value: We obtain the critical value z^{*} that corresponds to our confidence level. These values are found by consulting a table of z-scores or by using software. We use a z-score table because we know the value of the population standard deviation, and we assume the population is normally distributed. Common critical values are 1.645 for a 90% confidence level, 1.960 for a 95% confidence level, and 2.576 for a 99% confidence level.
- Margin of Error: Calculate the margin of error z^{*} σ /√n, where n is the size of the simple random sample that we formed.
- Conclude: Finish by putting together the estimate and margin of error. This can be expressed as either Estimate ± Margin of Error or as Estimate - Margin of Error to Estimate + Margin of Error. Be sure to clearly state the level of confidence that is attached to your confidence interval.
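The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a definitive implementation: the function names critical_value and z_interval are made up for this sketch, and it assumes the conditions from the first step (known σ, normal population) already hold. It uses NormalDist from Python's standard statistics module in place of a printed z-score table.

```python
from math import sqrt
from statistics import NormalDist


def critical_value(confidence):
    """Two-sided critical value z* for a confidence level such as 0.90."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)


def z_interval(sample_mean, sigma, n, confidence):
    """Return (lower, upper) for Estimate ± Margin of Error with known sigma."""
    margin = critical_value(confidence) * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Compare the software values with the common table values listed above.
for level in (0.90, 0.95, 0.99):
    print(level, round(critical_value(level), 3))
```

Rounded to three decimal places, the computed critical values agree with the table values 1.645, 1.960, and 2.576.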
Example
To see how we can construct a confidence interval, we will work through an example. Suppose we know that the IQ scores of all incoming college freshmen are normally distributed with a standard deviation of 15. We have a simple random sample of 100 freshmen, and the mean IQ score for this sample is 120. Find a 90% confidence interval for the mean IQ score for the entire population of incoming college freshmen.
We will work through the steps that were outlined above:
- Check Conditions: The conditions have been met as we have been told that the population standard deviation is 15 and that we are dealing with a normal distribution.
- Calculate Estimate: We have been told that we have a simple random sample of size 100. The mean IQ for this sample is 120, so this is our estimate.
- Critical Value: The critical value for a confidence level of 90% is given by z^{*} = 1.645.
- Margin of Error: Now we use the margin of error formula and obtain z^{*} σ /√n = (1.645)(15) /√(100) = 2.4675.
- Conclude: We conclude by putting everything together. A 90% confidence interval for the population’s mean IQ score is 120 ± 2.4675. Alternatively, we could state this confidence interval as 117.5325 to 122.4675.
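As a quick check on the arithmetic in this example, the calculation can be reproduced in Python. The variable names are chosen only for this sketch; the numbers are the ones given in the example, with the table value 1.645 used for the critical value.

```python
from math import sqrt

sample_mean = 120   # mean IQ of the sample
sigma = 15          # known population standard deviation
n = 100             # sample size
z_star = 1.645      # table critical value for 90% confidence

margin = z_star * sigma / sqrt(n)
lower = sample_mean - margin
upper = sample_mean + margin

print(f"margin of error = {margin:.4f}")
print(f"90% confidence interval: {lower:.4f} to {upper:.4f}")
```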
Practical Considerations
Confidence intervals of the above type are not very realistic. It is very rare to know the population standard deviation but not know the population mean. There are ways that this unrealistic assumption can be removed.
While we have assumed a normal distribution here, this assumption does not need to hold. When a sample shows no strong skewness and no outliers, and the sample size is large enough, we can invoke the central limit theorem.
As a result we are justified in using a table of z scores, even for populations that are not normally distributed.
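A small simulation can make this claim concrete. The sketch below is an illustration under assumptions that are not part of the discussion above: it draws samples from an exponential population with rate 1 (a skewed, non-normal distribution whose mean and standard deviation both equal 1), builds the 90% z-interval from each sample of size 100, and counts how often the interval contains the true mean. The function name coverage and the choice of 2000 trials are arbitrary.

```python
import random
from math import sqrt


def coverage(trials=2000, n=100, z_star=1.645, seed=0):
    """Fraction of simulated 90% z-intervals that contain the true mean."""
    rng = random.Random(seed)
    true_mean = 1.0   # mean of an exponential population with rate 1
    sigma = 1.0       # its standard deviation, treated as known
    margin = z_star * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.expovariate(1.0) for _ in range(n)) / n
        # The interval sample_mean ± margin covers true_mean exactly when
        # the sample mean lies within one margin of the true mean.
        if abs(sample_mean - true_mean) <= margin:
            hits += 1
    return hits / trials


print(coverage())
```

If the central limit theorem approximation is working, the printed fraction should land close to 0.90 even though the population is far from normal.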