Statistics: Degrees of Freedom

Teacher at chalkboard. JGI/Jamie Grille/Blend Images/Getty Images

In statistics, the degrees of freedom are the number of independent quantities that are free to vary in a calculation, and they are used to specify which member of a family of statistical distributions applies. This number is typically a positive whole number that indicates how many values remain unrestricted when we work out the missing quantities in a statistical problem.

Degrees of freedom act as variables in the final calculation of a statistic and are used to determine the outcome of different scenarios in a system. In mathematics, degrees of freedom describe the number of dimensions in a domain that are needed to determine the full vector.

To illustrate the concept of a degree of freedom, we will look at a basic calculation concerning the sample mean. To find the mean of a list of data, we add all of the data values and divide by the total number of values.

An Illustration with a Sample Mean

Suppose for a moment that we know the mean of a data set is 25 and that the values in this set are 20, 10, 50, and one unknown number. The formula for a sample mean gives us the equation (20 + 10 + 50 + x)/4 = 25, where x denotes the unknown. Using some basic algebra, one can then determine that the missing number, x, is equal to 20. Because the mean fixes this last value, there are no free choices left to make.
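Written out step by step, the algebra for this first scenario is short: multiply both sides by 4, then subtract the sum of the known values.

```latex
\begin{aligned}
\frac{20 + 10 + 50 + x}{4} &= 25 \\
80 + x &= 100 \\
x &= 20
\end{aligned}
```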

Let's alter this scenario slightly. Again we suppose that we know the mean of a data set is 25. However, this time the values in the data set are 20, 10, and two unknown values. These unknowns could be different, so we use two different variables, x and y, to denote them. The resulting equation is (20 + 10 + x + y)/4 = 25. With some algebra, we obtain y = 70 - x. The formula is written in this form to show that once we choose a value for x, the value for y is completely determined. We have one choice to make, and this shows that there is one degree of freedom.
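To see the single free choice in action, here is a minimal Python sketch. The function name y_given_x and the particular trial values of x are illustrative assumptions, not part of the original example. Whatever x we pick, y is forced and the mean stays at 25.

```python
def y_given_x(x, known=(20, 10), mean=25, n=4):
    """Return the value y that the mean forces once x has been chosen freely."""
    # The n values must sum to mean * n; subtract everything already fixed.
    return mean * n - sum(known) - x


# Arbitrary trial choices for x (hypothetical, any number would work).
for x in (0, 35, 70, -10):
    y = y_given_x(x)
    values = [20, 10, x, y]
    print(f"x = {x}, y = {y}, mean = {sum(values) / len(values)}")
```

Each pass through the loop makes one free choice (x) and gets one forced value (y), which is what "one degree of freedom" means here.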

Now we'll look at a sample size of one hundred. If we know that the mean of this sample data is 20, but do not know the values of any of the data, then there are 99 degrees of freedom. All values must add up to a total of 20 × 100 = 2000. Once we have the values of 99 elements in the data set, the last one is determined.
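The same idea can be sketched for the sample of one hundred. In the Python snippet below, the 99 free values are drawn at random purely for illustration (the range 0 to 40 and the seed are arbitrary assumptions), and the helper last_value is a hypothetical name. The hundredth value is then computed rather than chosen.

```python
import random


def last_value(free_values, mean, n):
    """Return the final value forced by requiring n values to average to `mean`."""
    if len(free_values) != n - 1:
        raise ValueError("expected exactly n - 1 freely chosen values")
    return mean * n - sum(free_values)


random.seed(0)  # arbitrary seed, used only so the run is repeatable
free = [random.randint(0, 40) for _ in range(99)]  # 99 free choices
forced = last_value(free, mean=20, n=100)          # no choice left here
sample = free + [forced]

print(len(sample), sum(sample), sum(sample) / len(sample))  # 100 2000 20.0
```

Note that the forced value may fall outside the range used for the free values. The constraint on the mean decides it, not us.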

Student t-score and Chi-Square Distribution

Degrees of freedom play an important role when using the Student t-score table. There is actually a whole family of t-distributions rather than a single one. We differentiate between these distributions by their degrees of freedom.

Here the probability distribution that we use depends upon the size of our sample. If our sample size is n, then the number of degrees of freedom is n-1. For instance, a sample size of 22 would require us to use the row of the t-score table with 21 degrees of freedom.
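As a rough sketch of where this number comes from in practice, the Python function below computes a one-sample t statistic together with its n - 1 degrees of freedom, using only the standard library. The function name one_sample_t, the made-up data, and the hypothesized mean mu0 are assumptions for illustration. Looking up the critical value in a t-table is still left to the reader.

```python
import math
import statistics


def one_sample_t(data, mu0):
    """Return (t statistic, degrees of freedom) for a one-sample t procedure."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    xbar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    t = (xbar - mu0) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical sample of 22 values; mu0 = 10 is an arbitrary hypothesized mean.
data = list(range(1, 23))
t, df = one_sample_t(data, mu0=10)
print(f"t = {t:.3f}, degrees of freedom = {df}")
```

With 22 observations the function reports 21 degrees of freedom, which is the row of the t-table we would consult.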

The use of a chi-square distribution also requires the use of degrees of freedom. In some procedures, such as a confidence interval for a population variance, the sample size determines which distribution to use in the same manner as with the t-score distribution. If the sample size is n, then there are n-1 degrees of freedom.

Standard Deviation and Advanced Techniques

Another place where degrees of freedom show up is in the formula for the standard deviation. This occurrence is not as overt, but we can see it if we know where to look. To find a standard deviation we are looking for the "average" deviation from the mean. However, after subtracting the mean from each data value and squaring the differences, we end up dividing by n-1 rather than n as we might expect.

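The presence of the n-1 comes from the number of degrees of freedom. The formula uses the n data values together with the sample mean, and the deviations from the sample mean must add up to zero. Once n-1 of the deviations are known, the last one is determined, so there are n-1 degrees of freedom.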
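A short Python sketch can make the n-1 visible. It reuses the four values 20, 10, 50, and 20 from the sample mean illustration. The variable names are illustrative, and the comparison against the standard library's statistics.stdev and statistics.pstdev is there only as a cross-check.

```python
import math
import statistics

data = [20, 10, 50, 20]
n = len(data)
mean = sum(data) / n  # 25.0

# Deviations from the sample mean always sum to zero,
# so only n - 1 of them are free to vary.
deviations = [x - mean for x in data]
print(deviations, sum(deviations))

sum_sq = sum(d ** 2 for d in deviations)
sample_sd = math.sqrt(sum_sq / (n - 1))   # divide by n - 1
population_sd = math.sqrt(sum_sq / n)     # divide by n

print(sample_sd, statistics.stdev(data))       # these two agree
print(population_sd, statistics.pstdev(data))  # and so do these two
```

Dividing by n - 1 gives the value that statistics.stdev reports. Dividing by n gives the smaller value from statistics.pstdev, which treats the data as an entire population.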

More advanced statistical techniques use more complicated ways of counting the degrees of freedom. When calculating the test statistic for two means with independent samples of n1 and n2 elements, the number of degrees of freedom has quite a complicated formula. It can be estimated conservatively by using the smaller of n1-1 and n2-1.
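The text does not name the complicated formula. A common choice for two independent samples with possibly unequal variances is the Welch-Satterthwaite approximation, and the Python sketch below assumes that is the one intended. The function names and the summary statistics (s1, n1, s2, n2) are hypothetical values chosen for illustration.

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom.

    s1, s2 are sample standard deviations; n1, n2 are sample sizes.
    """
    v1 = s1 ** 2 / n1
    v2 = s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


def conservative_df(n1, n2):
    """The simpler estimate: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


# Hypothetical summary statistics for two independent samples.
s1, n1 = 4.0, 12
s2, n2 = 9.0, 20

print("Welch-Satterthwaite:", welch_df(s1, n1, s2, n2))
print("Conservative:", conservative_df(n1, n2))
```

The approximation generally returns a non-integer that lies between the conservative estimate and n1 + n2 - 2, which is why the smaller of n1-1 and n2-1 is a safe shortcut when working from a table.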

Another example of a different way to count the degrees of freedom comes with an F test. In conducting an F test we have k samples, each of size n. The number of degrees of freedom in the numerator is k-1, and in the denominator it is k(n-1).
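The two counts for the F test can be captured in a few lines of Python. The function name f_test_df and the example values k = 3 and n = 10 are assumptions made for illustration.

```python
def f_test_df(k, n):
    """Return (numerator df, denominator df) for k samples, each of size n."""
    if k < 2 or n < 2:
        raise ValueError("need at least two samples with at least two values each")
    return k - 1, k * (n - 1)


numerator_df, denominator_df = f_test_df(k=3, n=10)
print(numerator_df, denominator_df)  # 2 27
```

Together the two counts add up to kn - 1, one less than the total number of data values, which mirrors the n-1 seen with a single sample.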