Understanding Path Analysis

A Brief Introduction

Path analysis is a form of multiple regression used to evaluate causal models by examining the relationships between a dependent variable and two or more independent variables. With this method, one can estimate both the magnitude and the statistical significance of the hypothesized causal connections between variables.

There are two main requirements for path analysis:

1. All causal relationships between variables must run in one direction only (no pair of variables can cause each other).

2. The variables must have a clear time ordering, since one variable cannot be said to cause another unless it precedes it in time.
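
The first requirement amounts to saying that the arrows in the model never loop back on themselves. As a minimal sketch, assuming the model is written down as a list of (cause, effect) pairs (a hypothetical representation chosen here for illustration, with placeholder variable names), the check could look like this in Python:

```python
def is_recursive(edges):
    """Return True if the (cause, effect) pairs contain no causal loops."""
    # Record each variable's effects and count its incoming arrows.
    children = {}
    indegree = {}
    for cause, effect in edges:
        children.setdefault(cause, []).append(effect)
        children.setdefault(effect, [])
        indegree[effect] = indegree.get(effect, 0) + 1
        indegree.setdefault(cause, 0)

    # Kahn's algorithm: repeatedly remove variables with no remaining causes.
    ready = [v for v, d in indegree.items() if d == 0]
    removed = 0
    while ready:
        v = ready.pop()
        removed += 1
        for child in children[v]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    # Every variable gets removed only when there are no loops.
    return removed == len(indegree)


# Hypothetical models: the first runs one way, the second has a causes b causes a.
one_way = is_recursive([("a", "b"), ("b", "c"), ("a", "c")])
looped = is_recursive([("a", "b"), ("b", "a")])
```

Here one_way is True and looped is False. The order in which variables are removed is also one time ordering consistent with the arrows, which is what the second requirement asks the researcher to justify.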

Path analysis is theoretically useful because, unlike many other techniques, it forces us to specify the relationships among all of the independent variables. The result is a model of the causal mechanisms through which independent variables produce both direct and indirect effects on a dependent variable.
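
To make the idea of direct and indirect effects concrete, here is a small sketch of the arithmetic. The variables x, m, and y and the coefficient values are made up for illustration; in a real study they would be standardized coefficients estimated from data. The indirect effect along a route is the product of the coefficients on that route, and the total effect is the direct effect plus all indirect effects. The sketch assumes the model has no causal loops.

```python
# Hypothetical standardized path coefficients (illustrative values only).
# Keys are (cause, effect) pairs.
paths = {
    ("x", "m"): 0.5,  # x -> m
    ("m", "y"): 0.4,  # m -> y
    ("x", "y"): 0.2,  # x -> y (the direct path)
}


def effects(paths, source, target):
    """Return (direct, indirect, total) effects of source on target.

    Assumes a model with no causal loops, so the recursion terminates.
    """
    direct = paths.get((source, target), 0.0)

    def all_routes(node):
        # Sum of coefficient products over every directed route node -> target.
        total = 0.0
        for (cause, effect), coef in paths.items():
            if cause == node:
                total += coef if effect == target else coef * all_routes(effect)
        return total

    total = all_routes(source)
    return direct, total - direct, total


direct, indirect, total = effects(paths, "x", "y")
```

With these made-up numbers the direct effect is 0.2, the indirect effect through m is 0.5 × 0.4 = 0.2, and the total effect is 0.4.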

Path analysis was developed by Sewall Wright, a geneticist, in 1918. Over time, the method has been adopted across the natural and social sciences, including sociology. Today, one can conduct path analysis with statistical programs including SPSS and Stata, among others. Path analysis is a special case of structural equation modeling, a family of methods also referred to as causal modeling, analysis of covariance structures, and latent variable modeling.

How to Use Path Analysis

Typically, path analysis involves constructing a path diagram in which the relationships between all variables, and the causal direction of each, are explicitly laid out. When conducting path analysis, one might first construct an input path diagram, which illustrates the hypothesized relationships. After the statistical analysis has been completed, the researcher would then construct an output path diagram, which illustrates the relationships as estimated by the analysis.

Examples of Path Analysis in Research

Let's consider an example in which path analysis might be useful. Say you hypothesize that age has a direct, positive effect on job satisfaction: the older people are, the more satisfied they will be with their jobs. A good researcher will realize that other independent variables, such as autonomy and income, also influence the dependent variable in this situation (job satisfaction).

Using path analysis, one can create a diagram that charts the relationships between age and autonomy (because older workers typically have a greater degree of autonomy) and between age and income (again, there tends to be a positive relationship between the two). The diagram should also show the relationships between each of these three independent variables and the dependent variable, job satisfaction. After using a statistical program to evaluate these relationships, one can then redraw the diagram to indicate the magnitude and significance of each relationship.
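
As a rough sketch of what the statistical program does in this example, the Python code below estimates each arrow with an ordinary least squares regression on standardized variables, using NumPy. The data are simulated, and the coefficients used to generate them are arbitrary assumptions, not findings about real workers. Significance tests are left out to keep the sketch short.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated data under an assumed (made-up) causal structure:
# age -> autonomy, age -> income, and all three -> job satisfaction.
age = rng.normal(size=n)
autonomy = 0.5 * age + rng.normal(size=n)
income = 0.4 * age + rng.normal(size=n)
satisfaction = 0.1 * age + 0.3 * autonomy + 0.2 * income + rng.normal(size=n)


def standardize(x):
    """Rescale a variable to mean 0 and standard deviation 1."""
    return (x - x.mean()) / x.std()


def path_coefficients(outcome, *predictors):
    """Standardized OLS coefficients of outcome on the given predictors."""
    X = np.column_stack([standardize(p) for p in predictors])
    y = standardize(outcome)
    # No intercept is needed because every variable has mean 0.
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs


# One regression per variable that has arrows pointing into it.
(p_age_aut,) = path_coefficients(autonomy, age)
(p_age_inc,) = path_coefficients(income, age)
p_age_sat, p_aut_sat, p_inc_sat = path_coefficients(
    satisfaction, age, autonomy, income
)

# Decompose the effect of age on job satisfaction.
direct = p_age_sat
indirect = p_age_aut * p_aut_sat + p_age_inc * p_inc_sat
total = direct + indirect
```

The five estimated coefficients are the numbers one would write on the arrows of the output path diagram. In this particular model, the total effect of age works out to the simple correlation between age and job satisfaction, which offers a quick check on the arithmetic.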

While path analysis is useful for evaluating causal hypotheses, it cannot determine the direction of causality. It clarifies patterns of correlation and indicates how well the data support a causal hypothesis, but the direction of each path has to come from theory and from the time ordering of the variables, not from the analysis itself.

Students wishing to learn more about path analysis and how to conduct it should refer to Quantitative Data Analysis for Social Scientists by Bryman and Cramer.