Degrees of Freedom in Statistics and Mathematics

In statistics, the degrees of freedom are the number of independent quantities that are free to vary in a calculation once any constraints on the data are taken into account. This number is typically a positive whole number, and it counts how many values can be chosen freely before the remaining ones are fixed by those constraints.

Degrees of freedom appear as a parameter in the final calculation of a statistic, and they determine which of a family of distributions applies in a given scenario. In mathematics, the degrees of freedom are the number of dimensions of a domain that are needed to specify a vector fully.

To illustrate the concept of a degree of freedom, we will look at a basic calculation concerning the sample mean. To find the mean of a list of data, we add all of the data values and divide by the total number of values.

An Illustration with a Sample Mean

Suppose for a moment that we know the mean of a data set is 25 and that the values in this set are 20, 10, 50, and one unknown number. The formula for a sample mean gives us the equation (20 + 10 + 50 + x)/4 = 25, where x denotes the unknown. Multiplying both sides by 4 gives 80 + x = 100, so the missing number, x, is equal to 20. Nothing is left to choose, so there are no degrees of freedom here.
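
The algebra above can be checked with a short Python sketch. The helper name missing_value is hypothetical and chosen only for illustration; it assumes exactly one value is unknown.

```python
def missing_value(known_values, mean, n):
    """Return the one unknown value that makes n values average to `mean`.

    Hypothetical helper for illustration; assumes exactly one value is missing.
    """
    # The n values must sum to mean * n, so subtract what is already known.
    return mean * n - sum(known_values)


x = missing_value([20, 10, 50], mean=25, n=4)
print(x)  # 20
```

Because the mean fixes the total, the last value is computed rather than chosen.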

Let's alter this scenario slightly. Again we suppose that we know the mean of a data set is 25. However, this time the values in the data set are 20, 10, and two unknown values. These unknowns could be different, so we use two different variables, x and y, to denote them. The resulting equation is (20 + 10 + x + y)/4 = 25. With some algebra, we obtain y = 70 - x. The formula is written in this form to show that once we choose a value for x, the value for y is completely determined. We have one choice to make, and this shows that there is one degree of freedom.
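
As a rough illustration of that single free choice, the sketch below tries a few arbitrary values of x and computes the y that each one forces. The function name determined_y and the sample choices of x are assumptions made for this example.

```python
def determined_y(x, known_values=(20, 10), mean=25, n=4):
    """Return the y forced by a free choice of x (hypothetical helper)."""
    # Required total, minus the known values, minus the freely chosen x.
    return mean * n - sum(known_values) - x


# Arbitrary choices of x; each one leaves exactly one possible y.
for x in (0, 35, 70):
    y = determined_y(x)
    print(x, y, (20 + 10 + x + y) / 4)
# 0 70 25.0
# 35 35 25.0
# 70 0 25.0
```

Whatever x is picked, the mean stays at 25, which is the sense in which only one value is free.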

Now we'll look at a sample size of one hundred. If we know that the mean of this sample data is 20, but do not know the values of any of the data, then there are 99 degrees of freedom. All values must add up to a total of 20 x 100 = 2000. Once we have chosen the values of 99 elements in the data set, the last one is determined: it must equal 2000 minus the sum of the other 99.
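
The same idea can be sketched for the sample of one hundred. The code below assumes, purely for illustration, that the 99 free values are random integers between -50 and 50; any other choice would work equally well.

```python
import random

random.seed(0)  # arbitrary seed, used only to make the run repeatable
n, target_mean = 100, 20

# Choose the first 99 values freely (here: arbitrary integers).
free_values = [random.randint(-50, 50) for _ in range(n - 1)]

# The 100th value is forced: the total must equal 20 * 100 = 2000.
last_value = target_mean * n - sum(free_values)

data = free_values + [last_value]
print(len(data), sum(data) / len(data))  # 100 20.0
```

Ninety-nine values were chosen and one was computed, matching the 99 degrees of freedom.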

Student t-score and Chi-Square Distribution

Degrees of freedom play an important role when using the Student's t-score table. There is not a single t-distribution but a whole family of them. We differentiate between these distributions by their degrees of freedom.

Here the probability distribution that we use depends upon the size of our sample. If our sample size is n, then the number of degrees of freedom is n-1. For instance, a sample size of 22 would require us to use the row of the t-score table with 21 degrees of freedom.
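
To see how the degrees of freedom select a distribution, here is a minimal sketch that evaluates the standard Student's t density using only Python's math module. The function names t_pdf and degrees_of_freedom are hypothetical, and the point t = 3.0 is an arbitrary choice for comparison.

```python
import math


def degrees_of_freedom(sample_size):
    """Degrees of freedom for a one-sample t procedure: n - 1."""
    return sample_size - 1


def t_pdf(t, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)


df = degrees_of_freedom(22)
print(df)  # 21

# Same point, different degrees of freedom, different density values.
for nu in (1, 5, df):
    print(nu, t_pdf(3.0, nu))
```

Smaller degrees of freedom give heavier tails, which is why the t-score table has a separate row for each value.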

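The use of a chi-square distribution also requires the use of degrees of freedom. When the chi-square distribution is used for inference about a population variance, the sample size determines which distribution to use, just as with the t-distribution: if the sample size is n, then there are n-1 degrees of freedom. In other chi-square procedures, such as goodness-of-fit tests, the degrees of freedom are counted differently.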
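
A comparable sketch for the chi-square family is given below, again using only the math module. The name chi2_pdf is hypothetical, and the sample size of 22 is simply reused from the t-score example as an assumption; the paragraph above does not specify one.

```python
import math


def chi2_pdf(x, df):
    """Density of the chi-square distribution with `df` degrees of freedom.

    Valid for x > 0; hypothetical helper for illustration.
    """
    half = df / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))


n = 22        # assumed sample size, borrowed from the t-score example
df = n - 1    # n - 1 degrees of freedom

# The density at the same point changes with the degrees of freedom.
for k in (2, 10, df):
    print(k, chi2_pdf(15.0, k))
```

As with the t-distribution, each value of the degrees of freedom picks out a different curve.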