One of the goals of statistics is to arrange data in a meaningful way. Two-way tables are an important way to organize a particular type of paired data. As with the construction of any graph or table in statistics, it is very important to know the types of variables that we are working with. If we have quantitative data, then a graph such as a histogram or stem-and-leaf plot should be used. If we have categorical data, then a bar graph or pie chart is appropriate.
When working with paired data we must be careful. A scatterplot exists for paired quantitative data, but what kind of graph is there for paired categorical data? Whenever we have two categorical variables, we should use a two-way table.
Description of a Two-Way Table
First, we recall that categorical data relates to traits or to categories. It is not quantitative and does not have numerical values.
A two-way table involves listing all of the values or levels for two categorical variables. All of the values for one of the variables are listed in a vertical column. The values for the other variable are listed along a horizontal row. If the first variable has m values and the second variable has n values, then there will be a total of mn entries in the table. Each of these entries corresponds to a particular value for each of the two variables.
Along each row and along each column, the entries are totaled. These totals are important when determining marginal and conditional distributions. These totals are also important when we conduct a chi-square test for independence.
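The description above can be sketched in a few lines of Python. This is only an illustration: the function name two_way_table and the six observations are made up for the example and do not come from any study. It tallies paired categorical observations into cells, then totals each row and each column.

```python
from collections import Counter

def two_way_table(pairs):
    """Tally (row_value, column_value) pairs into cell counts plus totals."""
    cells = Counter(pairs)
    row_levels = sorted({r for r, _ in pairs})
    col_levels = sorted({c for _, c in pairs})
    # One entry for every combination of levels: m x n cells in all.
    table = {r: {c: cells[(r, c)] for c in col_levels} for r in row_levels}
    row_totals = {r: sum(table[r].values()) for r in row_levels}
    col_totals = {c: sum(table[r][c] for r in row_levels) for c in col_levels}
    return table, row_totals, col_totals

# Made-up observations, for illustration only: (pet, housing) for six people.
observations = [
    ("cat", "apartment"), ("dog", "house"), ("cat", "house"),
    ("dog", "house"), ("cat", "apartment"), ("dog", "apartment"),
]

table, row_totals, col_totals = two_way_table(observations)
print(table)
print(row_totals)
print(col_totals)
```

Here each variable has two values, so the table has 2 x 2 = 4 entries, and the row totals and the column totals both add up to the number of observations.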
Example of a Two-Way Table
For example, we will consider a situation in which we look at several sections of a statistics course at a university. We want to construct a two-way table to determine what differences, if any, there are between the males and females in the course. To achieve this, we count the number of each letter grade that was earned by members of each gender.
We note that the first categorical variable is gender, and there are two possible values in the study: male and female. The second categorical variable is letter grade, and there are five values: A, B, C, D and F. This means that we will have a two-way table with 2 x 5 = 10 entries, plus an additional row and an additional column that will be needed to tabulate the row and column totals.
Our investigation shows that:
- 50 males earned an A, while 60 females earned an A.
- 60 males earned a B, and 80 females earned a B.
- 100 males earned a C, and 50 females earned a C.
- 40 males earned a D, and 50 females earned a D.
- 30 males earned an F, and 20 females earned an F.
This information is entered into the two-way table below. The total of each row tells us how many students earned each grade. The column totals tell us the number of males and the number of females.
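As a check on the arithmetic, the counts listed above can be entered into a small Python structure and the totals computed from them. This is a minimal sketch; the variable names are our own choice, and only the ten counts come from the example.

```python
grades = ["A", "B", "C", "D", "F"]
genders = ["Male", "Female"]

# The ten cell counts from the investigation, one row per letter grade.
counts = {
    "A": {"Male": 50, "Female": 60},
    "B": {"Male": 60, "Female": 80},
    "C": {"Male": 100, "Female": 50},
    "D": {"Male": 40, "Female": 50},
    "F": {"Male": 30, "Female": 20},
}

# Row totals: how many students earned each grade.
row_totals = {g: sum(counts[g].values()) for g in grades}
# Column totals: how many males and how many females are in the study.
col_totals = {s: sum(counts[g][s] for g in grades) for s in genders}
grand_total = sum(row_totals.values())

# Print the table with its extra total row and total column.
print(f"{'Grade':<6}{'Male':>6}{'Female':>8}{'Total':>7}")
for g in grades:
    print(f"{g:<6}{counts[g]['Male']:>6}{counts[g]['Female']:>8}{row_totals[g]:>7}")
print(f"{'Total':<6}{col_totals['Male']:>6}{col_totals['Female']:>8}{grand_total:>7}")
```

The row totals and the column totals should each sum to the same grand total, which is a quick way to catch a data-entry mistake.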
Importance of Two-Way Tables
Two-way tables help to organize our data when we have two categorical variables. A two-way table can be used to compare two different groups in our data. For example, we could compare the performance of males in the statistics course with the performance of females in the course.
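One way to make such a comparison fair, since the two groups are not the same size, is to convert each column to proportions. The sketch below computes the conditional distribution of letter grade within each gender, using the counts from the example. The variable names are our own.

```python
# Counts from the example, grouped by gender.
counts = {
    "Male": {"A": 50, "B": 60, "C": 100, "D": 40, "F": 30},
    "Female": {"A": 60, "B": 80, "C": 50, "D": 50, "F": 20},
}

# Divide each cell by its column total to get the share of each grade
# within a gender (a conditional distribution).
conditional = {}
for gender, by_grade in counts.items():
    total = sum(by_grade.values())
    conditional[gender] = {grade: n / total for grade, n in by_grade.items()}

for grade in ["A", "B", "C", "D", "F"]:
    male = conditional["Male"][grade]
    female = conditional["Female"][grade]
    print(f"{grade}: {male:6.1%} of males, {female:6.1%} of females")
```

For instance, 50 of 280 males (about 18%) earned an A, compared with 60 of 260 females (about 23%).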
Next Steps
After forming a two-way table, the next step may be to analyze the data statistically. We may ask if the variables that are in the study are independent of one another or not. To answer this question we can use a chi-square test on the two-way table.
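A rough sketch of the chi-square calculation, written in plain Python so that each step is visible, might look like the following. It assumes the usual formula for the test of independence: each expected count is the row total times the column total divided by the grand total, and the statistic sums (observed - expected)^2 / expected over all cells. The variable names are our own, and the sketch stops short of a p-value, which would come from a chi-square table or statistical software.

```python
# Observed counts from the example: one row per grade (A, B, C, D, F),
# with columns for males and females.
observed = [
    [50, 60],
    [60, 80],
    [100, 50],
    [40, 50],
    [30, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Count we would expect in this cell if grade and gender were independent.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

# Degrees of freedom: (number of rows - 1) * (number of columns - 1).
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"chi-square statistic: {chi_square:.2f}")
print(f"degrees of freedom: {degrees_of_freedom}")
```

A large statistic relative to the chi-square distribution with these degrees of freedom would suggest that grade and gender are not independent in this data.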
Two-Way Table for Grades and Genders
Grade | Male | Female | Total |
A | 50 | 60 | 110 |
B | 60 | 80 | 140 |
C | 100 | 50 | 150 |
D | 40 | 50 | 90 |
F | 30 | 20 | 50 |
Total | 280 | 260 | 540 |