Sometimes in statistics, it is helpful to see worked-out examples of problems. These examples can help us figure out similar problems. In this article, we will walk through the process of conducting inferential statistics for a result concerning two population means. Not only will we see how to conduct a hypothesis test about the difference of two population means, we will also construct a confidence interval for this difference. The methods that we use are sometimes called a two-sample t-test and a two-sample t confidence interval.

## The Statement of the Problem

Suppose we wish to test the mathematical aptitude of grade school children. One question that we may have is whether higher grade levels have higher mean test scores.

A simple random sample of 27 third graders is given a math test, their answers are scored, and the results are found to have a mean score of 75 points with a sample standard deviation of 3 points.

A simple random sample of 20 fifth graders is given the same math test and their answers are scored. The mean score for the fifth graders is 84 points with a sample standard deviation of 5 points.

Given this scenario we ask the following questions:

- Does the sample data provide us with evidence that the mean test score of the population of all fifth graders exceeds the mean test score of the population of all third graders?
- What is a 95% confidence interval for the difference in mean test scores between the populations of third graders and fifth graders?

## Conditions and Procedure

We must select which procedure to use, and in doing so we must check that the conditions for this procedure have been met. We are asked to compare two population means. One collection of methods that can be used to do this is the two-sample t-procedures.

In order to use these t-procedures for two samples, we need to make sure that the following conditions hold:

- We have two simple random samples from the two populations of interest.
- Our simple random samples do not constitute more than 5% of the population.
- The two samples are independent of one another, and there is no matching between the subjects.
- The variable is normally distributed.
- Both the population mean and standard deviation are unknown for both of the populations.

We see that most of these conditions are met. We were told that we have simple random samples. The populations that we are studying are large as there are millions of students in these grade levels.

The one condition that we cannot automatically assume is that the test scores are normally distributed. Because our sample sizes are large enough, the robustness of the t-procedures means that we do not necessarily need the variable to be normally distributed.

Since the conditions are satisfied, we perform a couple of preliminary calculations.

## Standard Error

The standard error is an estimate of a standard deviation. For this statistic, we divide each sample variance by its sample size, add the two results, and then take the square root. This gives the formula:

$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

By using the values above, with sample 1 as the fifth graders and sample 2 as the third graders, we see that the value of the standard error is

$$\sqrt{\frac{5^2}{20} + \frac{3^2}{27}} = \sqrt{\frac{5}{4} + \frac{1}{3}} \approx 1.2583$$
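The arithmetic above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `standard_error` and its parameter names are illustrative choices, not part of the original problem.

```python
import math

def standard_error(s1, n1, s2, n2):
    """Unpooled standard error of the difference of two sample means."""
    # Each sample variance is divided by its own sample size before adding.
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

# Summary statistics from the problem statement:
# fifth graders: s = 5, n = 20; third graders: s = 3, n = 27
se = standard_error(s1=5, n1=20, s2=3, n2=27)
print(round(se, 4))
```

Because the formula is symmetric in the two samples, swapping which sample is labeled 1 and which is labeled 2 does not change the result.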

## Degrees of Freedom

We can use the conservative approximation for our degrees of freedom. This may underestimate the number of degrees of freedom, but it is much easier to calculate than using Welch's formula. We use the smaller of the two sample sizes, and then subtract one from this number.

For our example, the smaller of the two samples is 20. This means that the number of degrees of freedom is 20 - 1 = 19.
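To see how the conservative approximation compares with Welch's formula for this data, the following sketch computes both. The function names are hypothetical, and the Welch computation uses the usual Welch-Satterthwaite expression, which is assumed here since the article does not write it out.

```python
def conservative_df(n1, n2):
    """Conservative degrees of freedom: smaller sample size minus one."""
    return min(n1, n2) - 1

def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation to the degrees of freedom."""
    v1 = s1**2 / n1  # variance of the first sample mean
    v2 = s2**2 / n2  # variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Third graders: s = 3, n = 27; fifth graders: s = 5, n = 20
df_conservative = conservative_df(27, 20)
df_welch = welch_df(3, 27, 5, 20)
print(df_conservative, round(df_welch, 2))
```

For these samples the Welch value comes out near 29, so the conservative choice of 19 does understate the degrees of freedom, which makes the resulting test and interval somewhat more cautious.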

## Hypothesis Test

We wish to test the hypothesis that fifth-grade students have a mean test score that is greater than the mean score of third-grade students. Let $\mu_1$ be the mean score of the population of all fifth graders. Similarly, we let $\mu_2$ be the mean score of the population of all third graders.

The hypotheses are as follows:

- $H_0: \mu_1 - \mu_2 = 0$
- $H_a: \mu_1 - \mu_2 > 0$

The test statistic is the difference between the sample means divided by the standard error. Since we are using the sample standard deviations to estimate the population standard deviations, the test statistic follows a t-distribution.

The value of the test statistic is (84 - 75)/1.2583. This is approximately 7.15.

We now determine the p-value for this hypothesis test. We look at where the value of the test statistic falls on a t-distribution with 19 degrees of freedom. For this distribution, the p-value is approximately $4.2 \times 10^{-7}$. (One way to determine this is to use the T.DIST.RT function in Excel.)
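As an alternative to a spreadsheet, the test statistic and its right-tail p-value can be approximated with the Python standard library alone. The sketch below integrates the t-distribution density numerically with Simpson's rule. The helper names `t_pdf` and `t_right_tail`, the integration width, and the step count are all assumptions made for illustration, and a statistics library would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_right_tail(t, df, width=200.0, steps=20000):
    """Approximate P(T > t) with Simpson's rule on [t, t + width].

    The tail beyond t + width is ignored; this assumes the density has
    decayed to a negligible level there. steps must be even.
    """
    h = width / steps
    total = t_pdf(t, df) + t_pdf(t + width, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(t + i * h, df)
    return total * h / 3

se = math.sqrt(3**2 / 27 + 5**2 / 20)    # standard error
t_stat = (84 - 75) / se                  # difference of sample means over SE
p_value = t_right_tail(t_stat, df=19)    # one-sided, conservative df
print(round(t_stat, 2), p_value)
```

The right tail is integrated directly, rather than subtracting a cumulative probability from 1, because a p-value this small would otherwise be lost to rounding error.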

Since we have such a small p-value, we reject the null hypothesis. The conclusion is that the mean test score for fifth graders is higher than the mean test score for third graders.

## Confidence Interval

Since we have established that there is a difference between the mean scores, we now determine a confidence interval for the difference between these two means. We already have much of what we need. The confidence interval for the difference needs to have both an estimate and a margin of error.

The estimate for the difference of two means is straightforward to calculate. We simply find the difference of the sample means. This difference of the sample means estimates the difference of the population means.

For our data, the difference in sample means is 84 – 75 = 9.

The margin of error is slightly more difficult to compute. For this, we need to multiply the appropriate statistic by the standard error. The statistic that we need is found by consulting a table or statistical software.

Again using the conservative approximation, we have 19 degrees of freedom. For a 95% confidence interval we see that $t^* = 2.09$. We could use the T.INV function in Excel to calculate this value, for example with T.INV(0.975, 19).

We now put everything together and see that our margin of error is 2.09 × 1.2583, which is approximately 2.63. The confidence interval is 9 ± 2.63. In other words, the interval runs from 6.37 to 11.63 points on the test that the fifth and third graders took.
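The interval calculation can be reproduced in a short Python sketch. The critical value 2.09 is taken from the text as a table value rather than computed, and the variable names are illustrative only.

```python
import math

# Summary statistics from the problem statement
se = math.sqrt(3**2 / 27 + 5**2 / 20)   # standard error of the difference
estimate = 84 - 75                       # difference of the sample means

# Critical value for 95% confidence with 19 degrees of freedom,
# hard-coded from the table value quoted in the text.
t_star = 2.09

margin = t_star * se
lower, upper = estimate - margin, estimate + margin
print(round(margin, 2), round(lower, 2), round(upper, 2))
```

Using a more precise critical value or Welch's degrees of freedom would shift the endpoints slightly, but the interval would still lie well above zero.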