Four Confidence Interval Mistakes


Confidence intervals are a key part of inferential statistics. Using a sample together with some facts about a probability distribution, we can estimate a population parameter. A confidence interval is stated in a way that is easily misunderstood. We will look at the correct interpretation of confidence intervals and investigate four mistakes that are made concerning this area of statistics.

What Is a Confidence Interval?

A confidence interval can be expressed either as a range of values, or in the following form:

Estimate ± Margin of Error

A confidence interval is typically stated with a level of confidence. Common confidence levels are 90%, 95%, and 99%.

We will look at an example where we want to use a sample mean to infer the mean of a population. Suppose that this results in a confidence interval from 25 to 30. If we say that we are 95% confident that the unknown population mean is contained in this interval, then we are really saying that we found the interval using a method that is successful in giving correct results 95% of the time. In the long run, our method will be unsuccessful 5% of the time. In other words, we will fail at capturing the true population mean only one out of every 20 times.
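The long-run reading of "95% confident" can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the population is normal with a known standard deviation, the interval is the standard z interval, and the true mean of 27.5, standard deviation of 5, and sample size of 30 are hypothetical values chosen for the demonstration. The function names `z_interval` and `coverage` are likewise made up here.

```python
import math
import random
from statistics import NormalDist, fmean


def z_interval(sample, sigma, level=0.95):
    """Return (low, high) for a z interval, assuming sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # critical value for the level
    margin = z * sigma / math.sqrt(len(sample))
    center = fmean(sample)
    return center - margin, center + margin


def coverage(mu, sigma, n, trials, level=0.95, seed=0):
    """Fraction of intervals, over repeated samples, that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma, level)
        if low <= mu <= high:
            hits += 1
    return hits / trials


# Hypothetical population: mean 27.5, standard deviation 5, samples of size 30.
rate = coverage(mu=27.5, sigma=5.0, n=30, trials=10_000)
print(rate)
```

The printed fraction should land close to 0.95. Each individual interval either contains 27.5 or it does not; the 95% only emerges across the many repetitions.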

Mistake One

We will now look at a series of different mistakes that can be made when dealing with confidence intervals. One incorrect statement that is often made about a confidence interval at a 95% level of confidence is that there is a 95% chance that the confidence interval contains the true mean of the population.

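The reason that this is a mistake is quite subtle. The probability attached to a confidence interval enters the picture with the method used to construct the interval, not with any one interval that the method produces. Once a particular interval such as 25 to 30 has been calculated, the population mean is a fixed, though unknown, number that either lies in that interval or does not. The 95% describes how often the method succeeds over many samples.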

Mistake Two

A second mistake is to interpret a 95% confidence interval as saying that 95% of all of the data values in the population fall within the interval. Again, the 95% describes the method used to construct the interval, not the spread of the individual data values.

To see why the above statement is incorrect, consider a normal population with a mean of 5 and a standard deviation of 1. A sample of two data points, each with a value of 6, has a sample mean of 6. Using the known standard deviation, a 95% confidence interval for the population mean would be 4.6 to 7.4. The values between 4.6 and 7.4 make up less than two thirds of this normal distribution, so the interval does not contain 95% of the population.
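The numbers in this example can be reproduced with a few lines of Python. This is a sketch that assumes the interval is the usual z interval with the population standard deviation treated as known; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma = 5.0, 1.0        # population mean and standard deviation from the example
sample = [6.0, 6.0]         # the two data points
n = len(sample)
xbar = sum(sample) / n      # sample mean

# 95% z interval: sample mean plus or minus z * sigma / sqrt(n)
z = NormalDist().inv_cdf(0.975)
margin = z * sigma / math.sqrt(n)
low, high = xbar - margin, xbar + margin

# Share of individual population values that lie inside the interval
population = NormalDist(mu, sigma)
share_of_population = population.cdf(high) - population.cdf(low)

print(round(low, 1), round(high, 1))
print(round(share_of_population, 2))
```

The first line of output gives the endpoints of the interval and the second gives the share of the population between them, which is well below 0.95.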

Mistake Three

A third mistake is to say that a 95% confidence interval implies that 95% of all possible sample means fall within the range of the interval. Reconsider the example from the last section. Any sample of size two that was composed only of values less than 4.6 would have a mean that was less than 4.6. Thus these sample means would fall outside of this particular confidence interval. Samples that match this description account for more than 5% of all possible samples. So it is a mistake to say that this confidence interval captures 95% of all sample means.
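The claim that more than 5% of samples fall entirely below 4.6 can be checked directly. The sketch below assumes the two values in each sample are drawn independently from the normal population with mean 5 and standard deviation 1, and it uses the interval endpoints as rounded in the text.

```python
import math
from statistics import NormalDist

mu, sigma, n = 5.0, 1.0, 2
low, high = 4.6, 7.4        # interval endpoints as rounded in the text

population = NormalDist(mu, sigma)

# Probability that both independent draws are below the lower endpoint
both_below = population.cdf(low) ** n

# Sample means of size-n samples are normal with standard deviation sigma / sqrt(n)
sample_means = NormalDist(mu, sigma / math.sqrt(n))
means_inside = sample_means.cdf(high) - sample_means.cdf(low)

print(round(both_below, 3))
print(round(means_inside, 3))
```

The first value exceeds 0.05, and the second, the share of all sample means that land inside this particular interval, is well short of 0.95.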

Mistake Four

A fourth mistake in dealing with confidence intervals is to think that the margin of error accounts for every source of error. While there is a margin of error associated with a confidence interval, there are other places where errors can creep into a statistical analysis. Examples include an incorrect design of the experiment, bias in the sampling, or an inability to obtain data from a certain subset of the population.
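A simulation can show how such an error escapes the margin of error. The sketch below is a hypothetical scenario, not taken from any real study: the population is normal with mean 5 and standard deviation 1, but values below a cutoff of 4 can never be observed. The cutoff, the sample size of 100, and the function name `biased_coverage` are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean


def biased_coverage(mu=5.0, sigma=1.0, n=100, cutoff=4.0, trials=2_000, seed=1):
    """Coverage of a nominal 95% z interval when low values are never sampled."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = []
        while len(sample) < n:
            x = rng.gauss(mu, sigma)
            if x >= cutoff:          # values below the cutoff are unobtainable
                sample.append(x)
        center = fmean(sample)
        if center - margin <= mu <= center + margin:
            hits += 1
    return hits / trials


biased_rate = biased_coverage()
print(biased_rate)
```

Under these assumptions the interval is still labeled 95%, yet it captures the true mean far less often than that, because the margin of error measures only random sampling variation and not the bias in how the data were collected.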