Many statistical inference problems require us to find the number of degrees of freedom. The number of degrees of freedom selects a single probability distribution from among infinitely many. This step is an often overlooked but crucial detail in both the calculation of confidence intervals and the workings of hypothesis tests.

There is no single general formula for the number of degrees of freedom. However, there are specific formulas used for each type of procedure in inferential statistics. In other words, the setting that we are working in determines the number of degrees of freedom. What follows is a partial list of some of the most common inference procedures, along with the number of degrees of freedom used in each situation.

## Standard Normal Distribution

Procedures involving the standard normal distribution are listed for completeness and to clear up some misconceptions. These procedures do not require us to find the number of degrees of freedom, because there is only one standard normal distribution. They include procedures for a population mean when the population standard deviation is already known, as well as procedures concerning population proportions.

## One Sample T Procedures

Sometimes statistical practice requires us to use Student’s t-distribution. For these procedures, such as those dealing with a population mean with unknown population standard deviation, the number of degrees of freedom is one less than the sample size. Thus if the sample size is *n*, then there are *n* - 1 degrees of freedom.
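As a minimal sketch of the rule above, the following Python computes a one-sample t statistic together with its degrees of freedom. The function name `one_sample_t`, the sample values, and the hypothesized mean `mu0` are illustrative assumptions, not part of any particular library.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for a one-sample t procedure."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1  # degrees of freedom: one less than the sample size

# Hypothetical measurements, tested against a hypothesized mean of 5.0
t_stat, df = one_sample_t([5.1, 4.9, 5.6, 5.2, 4.8, 5.3], mu0=5.0)
print(df)  # 5, since the sample size is 6
```

The degrees of freedom depend only on the sample size, not on the data values themselves.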

## T Procedures With Paired Data

Many times it makes sense to treat data as paired. The pairing typically reflects a connection between the first and second value in each pair; for example, we often pair before and after measurements. The two values within a pair are not independent of each other; however, the differences computed from the pairs are treated as independent. Thus if the sample has *n* pairs of data points (for a total of 2*n* values), then there are *n* - 1 degrees of freedom.
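The paired case can be sketched in a few lines of Python: the procedure works with the list of differences, so the count that matters is the number of pairs rather than the number of individual values. The helper name `paired_differences` and the before/after numbers are hypothetical.

```python
def paired_differences(before, after):
    """Return (differences, degrees of freedom) for paired data."""
    if len(before) != len(after):
        raise ValueError("paired data require two lists of equal length")
    diffs = [a - b for b, a in zip(before, after)]
    return diffs, len(diffs) - 1  # n pairs give n - 1 degrees of freedom

# Hypothetical before/after measurements on the same five subjects
before = [72, 68, 75, 80, 66]
after = [70, 65, 76, 74, 63]
diffs, df = paired_differences(before, after)
print(len(before) + len(after), df)  # 10 values in total, 4 degrees of freedom
```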

## T Procedures for Two Independent Populations

For these types of problems, we are still using a t-distribution. This time there is a sample from each of our populations. Although it is preferable to have these two samples be of the same size, this is not necessary for our statistical procedures. Thus we can have two samples of size *n*₁ and *n*₂. There are two ways to determine the number of degrees of freedom. The more accurate method is to use Welch’s formula, a computationally cumbersome formula involving the sample sizes and sample standard deviations. Another approach, referred to as the conservative approximation, can be used to quickly estimate the degrees of freedom. This is simply the smaller of the two numbers *n*₁ - 1 and *n*₂ - 1.
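Both approaches can be sketched in Python. The first function uses the standard Welch-Satterthwaite expression, df = (s₁²/n₁ + s₂²/n₂)² / [(s₁²/n₁)²/(n₁ - 1) + (s₂²/n₂)²/(n₂ - 1)], where s₁ and s₂ are the sample standard deviations. The function names and the sample figures are illustrative assumptions.

```python
def welch_df(s1, n1, s2, n2):
    """Degrees of freedom from Welch's formula (usually not an integer)."""
    v1 = s1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

def conservative_df(n1, n2):
    """Conservative approximation: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)

# Hypothetical summary statistics for two independent samples
print(welch_df(s1=2.0, n1=10, s2=3.0, n2=15))
print(conservative_df(10, 15))  # 9
```

Welch's value always falls between the conservative approximation and *n*₁ + *n*₂ - 2, which is why the conservative figure is a safe lower bound.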

## Chi-Square for Independence

One use of the chi-square test is to see if two categorical variables, each with several levels, exhibit independence. The information about these variables is logged in a two-way table with *r* rows and *c* columns. The number of degrees of freedom is the product (*r* - 1)(*c* - 1).
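A short Python sketch of this count, assuming the two-way table is stored as a list of rows. The name `independence_df` and the observed counts are made up for illustration.

```python
def independence_df(table):
    """Degrees of freedom for a chi-square test of independence."""
    r = len(table)     # number of rows
    c = len(table[0])  # number of columns
    if any(len(row) != c for row in table):
        raise ValueError("every row must have the same number of columns")
    return (r - 1) * (c - 1)

# Hypothetical 2 x 3 table of observed counts
observed = [
    [12, 18, 30],
    [20, 15, 25],
]
print(independence_df(observed))  # 2, from (2 - 1)(3 - 1)
```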

## Chi-Square Goodness of Fit

Chi-square goodness of fit starts with a single categorical variable with a total of *n* levels. We test the hypothesis that this variable matches a predetermined model. The number of degrees of freedom is one less than the number of levels. In other words, there are *n* - 1 degrees of freedom.
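As a rough illustration, the sketch below computes the chi-square statistic alongside the degrees of freedom for a goodness of fit test. It assumes the model is supplied as a list of expected counts, one per level; the function name `goodness_of_fit` and the die-roll counts are hypothetical.

```python
def goodness_of_fit(observed, expected):
    """Return (chi-square statistic, degrees of freedom)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one count per level")
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_sq, len(observed) - 1  # one less than the number of levels

# Hypothetical counts from 60 rolls of a die, against a fair-die model
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6
chi_sq, df = goodness_of_fit(observed, expected)
print(df)  # 5, since the variable has 6 levels
```

Note that the degrees of freedom come from the number of levels, not from the number of rolls.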

## One Factor ANOVA

One factor analysis of variance (ANOVA) allows us to make comparisons between several groups, eliminating the need for multiple pairwise hypothesis tests. Since the test requires us to measure both the variation between the groups and the variation within each group, we end up with two separate numbers of degrees of freedom. The F-statistic, which is used for one factor ANOVA, is a fraction, and its numerator and denominator each have their own degrees of freedom. Let *c* be the number of groups and *n* be the total number of data values. The number of degrees of freedom for the numerator is one less than the number of groups, or *c* - 1. The number of degrees of freedom for the denominator is the total number of data values minus the number of groups, or *n* - *c*.
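The two counts for one factor ANOVA can be sketched as follows, assuming each group is given as a list of data values. The name `anova_df` and the data are illustrative only.

```python
def anova_df(groups):
    """Return (numerator df, denominator df) for one factor ANOVA."""
    c = len(groups)                        # number of groups
    n = sum(len(group) for group in groups)  # total number of data values
    return c - 1, n - c

# Hypothetical data: three groups with 4, 5, and 3 values
groups = [
    [23, 25, 21, 24],
    [30, 28, 27, 31, 29],
    [19, 22, 20],
]
print(anova_df(groups))  # (2, 9)
```

The groups do not need to be the same size; only the number of groups and the total count of values enter the calculation.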

Clearly, we must take care to know which inference procedure we are working with, because the procedure determines the correct number of degrees of freedom to use.