Degrees of Freedom for Independence of Variables in Two-Way Table

by Courtney Taylor

Courtney K. Taylor, Ph.D., is a professor of mathematics at Anderson University and the author of "An Introduction to Abstract Algebra."

Updated March 06, 2017

The number of degrees of freedom for a test of independence of two categorical variables is given by a simple formula: (r - 1)(c - 1). Here r is the number of rows and c is the number of columns in the two-way table of the values of the categorical variables. Read on to learn more about this topic and to understand why this formula gives the correct number.

Background

One step in the process of many hypothesis tests is the determination of the number of degrees of freedom. This number is important because for probability distributions that involve a family of distributions, such as the chi-square distribution, the number of degrees of freedom pinpoints the exact distribution from the family that we should be using in our hypothesis test.

Degrees of freedom represent the number of free choices that we can make in a given situation. One of the hypothesis tests that requires us to determine the degrees of freedom is the chi-square test for independence for two categorical variables.

Tests for Independence and Two-Way Tables

The chi-square test for independence calls for us to construct a two-way table, also known as a contingency table. This type of table has r rows and c columns, representing the r levels of one categorical variable and the c levels of the other categorical variable.
Thus, if we do not count the row and column in which we record totals, there are a total of rc cells in the two-way table.

The chi-square test for independence allows us to test the hypothesis that the categorical variables are independent of one another. As we mentioned above, the r rows and c columns in the table give us (r - 1)(c - 1) degrees of freedom. But it may not be immediately clear why this is the correct number of degrees of freedom.

The Number of Degrees of Freedom

To see why (r - 1)(c - 1) is the correct number, we will examine this situation in more detail. Suppose that we know the marginal totals for each of the levels of our categorical variables. In other words, we know the total for each row and the total for each column. For the first row, there are c columns in our table, so there are c cells. Once we know the values of all but one of these cells, then because we know the row total, it is a simple algebra problem to determine the value of the remaining cell. If we were filling in these cells of our table, we could enter c - 1 of them freely, but then the remaining cell is determined by the total of the row. Thus there are c - 1 degrees of freedom for the first row.

We continue in this manner for the next row, and there are again c - 1 degrees of freedom. This process continues until we get to the penultimate row. Each of the rows except for the last one contributes c - 1 degrees of freedom to the total. Once we have filled in all but the last row, the column totals determine every entry of the final row. This gives us r - 1 rows with c - 1 degrees of freedom in each of these, for a total of (r - 1)(c - 1) degrees of freedom.

Example

We see this with the following example. Suppose that we have a two-way table with two categorical variables. One variable has three levels and the other has two.
Furthermore, suppose that we know the row and column totals for this table:

          Level A   Level B   Total
Level 1                         100
Level 2                         200
Level 3                         300
Total         200       400     600

The formula predicts that there are (3 - 1)(2 - 1) = 2 degrees of freedom. We see this as follows. Suppose that we fill in the upper left cell with the number 80. This will automatically determine the entire first row of entries:

          Level A   Level B   Total
Level 1        80        20     100
Level 2                         200
Level 3                         300
Total         200       400     600

Now if we know that the first entry in the second row is 50, then the rest of the table is filled in, because we know the total of each row and column:

          Level A   Level B   Total
Level 1        80        20     100
Level 2        50       150     200
Level 3        70       230     300
Total         200       400     600

The table is entirely filled in, but we only had two free choices. Once these values were known, the rest of the table was completely determined.

Although we do not typically need to know why there are this many degrees of freedom, it is good to know that we are really just applying the concept of degrees of freedom to a new situation.
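The counting argument can also be checked with a few lines of code. The following is a minimal Python sketch, not part of the original article. The function names `degrees_of_freedom` and `complete_table` are made up for this illustration, and the sketch assumes that the free choices are the cells in the first r - 1 rows and first c - 1 columns, as in the argument above.

```python
def degrees_of_freedom(r, c):
    """Degrees of freedom for an r x c two-way table: (r - 1)(c - 1)."""
    return (r - 1) * (c - 1)


def complete_table(row_totals, col_totals, free_cells):
    """Fill in an r x c table from its marginal totals and the freely
    chosen cells (hypothetical helper, for illustration only).

    free_cells holds the (r - 1) x (c - 1) block of chosen values:
    the first c - 1 entries of each of the first r - 1 rows.
    """
    r, c = len(row_totals), len(col_totals)
    if sum(row_totals) != sum(col_totals):
        raise ValueError("row totals and column totals must have the same sum")
    if len(free_cells) != r - 1 or any(len(row) != c - 1 for row in free_cells):
        raise ValueError("free_cells must have shape (r - 1) x (c - 1)")

    table = []
    for i in range(r - 1):
        row = list(free_cells[i])
        # The last cell of each row is forced by the row total.
        row.append(row_totals[i] - sum(row))
        table.append(row)

    # Every cell of the final row is forced by its column total.
    last_row = [col_totals[j] - sum(row[j] for row in table) for j in range(c)]
    table.append(last_row)
    return table


# The example from the article: row totals 100, 200, 300 and column
# totals 200, 400, with the two free choices 80 and 50.
table = complete_table([100, 200, 300], [200, 400], [[80], [50]])
for row in table:
    print(row)
print(degrees_of_freedom(3, 2))
```

Run on the example above, the sketch reproduces the completed table (rows 80 and 20, 50 and 150, 70 and 230) from just two supplied values. It does not check that the forced cells are nonnegative, so a poor choice of free cells can yield a table that could not be actual counts.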