What Is Panel Data?

The Definition and Relevance of Panel Data in Economic Research

Panel data, also known as longitudinal data or, in some special cases, cross-sectional time series data, is data derived from a (usually small) number of observations over time on a (usually large) number of cross-sectional units such as individuals, households, firms, or governments.

In the disciplines of econometrics and statistics, panel data refers to multi-dimensional data that generally involves measurements over some period of time. As such, panel data consists of a researcher's observations of multiple phenomena, collected over several time periods for the same group of units or entities. For example, a panel data set may follow a given sample of individuals over time and record observations or information on each individual in the sample.

Basic Examples of Panel Data Sets

The following are very basic examples of two panel data sets for two to three individuals over the course of several years in which the data collected or observed includes income, age, and sex:

Panel Data Set A

Panel Data Set B

Both Panel Data Set A and Panel Data Set B above show the data collected (the characteristics of income, age, and sex) over the course of several years for different people. Panel Data Set A shows the data collected for two people (person 1 and person 2) over the course of three years (2013, 2014, and 2015). This example data set would be considered a balanced panel because each person is observed for the defined characteristics of income, age, and sex in each year of the study. Panel Data Set B, on the other hand, would be considered an unbalanced panel because data does not exist for each person in each year. Characteristics of person 1 and person 2 were collected in both 2013 and 2014, but person 3 is observed only in 2014.
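The balanced versus unbalanced distinction can be checked mechanically. The following is a minimal sketch in Python, assuming each panel is stored in "long" format as one (person, year) pair per observation. The pairs mirror only the person and year layout described above; the function name is_balanced is hypothetical, and no income, age, or sex values are assumed.

```python
from collections import defaultdict

# Hypothetical long-format panels: one (person, year) pair per observation.
# Only the person/year layout follows the text; no measured values are assumed.
panel_a = [(1, 2013), (1, 2014), (1, 2015), (2, 2013), (2, 2014), (2, 2015)]
panel_b = [(1, 2013), (1, 2014), (2, 2013), (2, 2014), (3, 2014)]


def is_balanced(observations):
    """Return True if every unit is observed in every period found in the panel."""
    periods_by_unit = defaultdict(set)
    for unit, period in observations:
        periods_by_unit[unit].add(period)
    # Every period that appears anywhere in the panel
    all_periods = set().union(*periods_by_unit.values())
    return all(periods == all_periods for periods in periods_by_unit.values())


print(is_balanced(panel_a))  # True
print(is_balanced(panel_b))  # False
```

Here a panel counts as balanced only when every person appears in every year found anywhere in the data, which is why the single missing 2013 observation for person 3 is enough to make the second panel unbalanced.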

Analysis of Panel Data in Economic Research

There are two distinct sets of information that can be derived from cross-sectional time series data. The cross-sectional component of the data set reflects the differences observed between the individual subjects or entities, whereas the time series component reflects the differences observed for one subject over time. For instance, researchers could focus on the differences in data between each person in a panel study and/or the changes in observed phenomena for one person over the course of the study (e.g., the changes in income over time of person 1 in Panel Data Set A above).
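To make the two components concrete, here is a small sketch in Python. The income figures are invented purely for illustration and are not taken from the example data sets; the helper name yearly_changes is likewise hypothetical. The sketch computes a cross-sectional summary (average income per person) and a time series summary (year-to-year change for each person).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical incomes, invented for illustration only: (person, year, income)
rows = [
    (1, 2013, 20000), (1, 2014, 22000), (1, 2015, 24000),
    (2, 2013, 30000), (2, 2014, 30000), (2, 2015, 33000),
]

# Group the observations by person: {person: {year: income}}
income_by_person = defaultdict(dict)
for person, year, income in rows:
    income_by_person[person][year] = income

# Cross-sectional view: compare people using each person's average income
person_means = {p: mean(list(v.values())) for p, v in income_by_person.items()}


def yearly_changes(series):
    """Time series view: change from each observed year to the next."""
    years = sorted(series)
    return {later: series[later] - series[earlier]
            for earlier, later in zip(years, years[1:])}


changes = {p: yearly_changes(v) for p, v in income_by_person.items()}

print(person_means)  # {1: 22000, 2: 31000}
print(changes)       # {1: {2014: 2000, 2015: 2000}, 2: {2014: 0, 2015: 3000}}
```

The first result compares people with one another, while the second tracks how each person changes from one year to the next.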

Panel data regression methods are what allow economists to use both of these sets of information. Analysis of panel data can therefore become extremely complex, but this flexibility is precisely the advantage of panel data sets for economic research over conventional cross-sectional or time series data. Panel data gives researchers a large number of unique data points, which increases the researcher's degrees of freedom to explore explanatory variables and relationships.
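As one illustration of such a method, the sketch below contrasts a pooled least-squares slope with a fixed-effects ("within") slope for a single explanatory variable. It is a simplified sketch under stated assumptions, not a full panel regression: the data are invented so that each person has their own baseline level and the outcome rises by exactly 2 for each unit of x, and the function names ols_slope and within_slope are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical panel, invented for illustration: (person, year, x, y).
# y is built as a person-specific level plus 2 * x, so the true slope is 2.
rows = [
    (1, 2013, 1, 12), (1, 2014, 2, 14), (1, 2015, 3, 16),
    (2, 2013, 2, 54), (2, 2014, 3, 56), (2, 2015, 4, 58),
]


def ols_slope(xs, ys):
    """Slope of a one-regressor least-squares fit with an intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx


def within_slope(rows):
    """Fixed-effects (within) slope: subtract each person's own means, then fit."""
    by_person = defaultdict(list)
    for person, _year, x, y in rows:
        by_person[person].append((x, y))
    xs_dm, ys_dm = [], []
    for obs in by_person.values():
        x_bar = mean([x for x, _ in obs])
        y_bar = mean([y for _, y in obs])
        xs_dm.extend(x - x_bar for x, _ in obs)
        ys_dm.extend(y - y_bar for _, y in obs)
    return ols_slope(xs_dm, ys_dm)


pooled = ols_slope([r[2] for r in rows], [r[3] for r in rows])
within = within_slope(rows)

print(round(pooled, 1))  # 12.9
print(within)            # 2.0
```

In this constructed example the pooled slope is far from 2 because it mixes differences between people with changes over time, whereas the within slope uses only each person's changes over time and recovers the value built into the data.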

