Structural Equation Modeling

Ashley Crossman

Structural equation modeling is an advanced statistical technique with many layers and many complex concepts. Researchers who use it need a good understanding of basic statistics, regression analysis, and factor analysis. Building a structural equation model requires rigorous logic as well as a deep knowledge of the field’s theory and prior empirical evidence. This article provides a very general overview of structural equation modeling without digging into the intricacies involved.

Structural equation modeling is a collection of statistical techniques for examining a set of relationships between one or more independent variables and one or more dependent variables. Both independent and dependent variables can be either continuous or discrete and can be either factors or measured variables. Structural equation modeling also goes by several other names: causal modeling, causal analysis, simultaneous equation modeling, and analysis of covariance structures. Path analysis and confirmatory factor analysis are often listed as synonyms as well, although they are more accurately special types of structural equation modeling.

When exploratory factor analysis is combined with multiple regression analyses, the result is structural equation modeling (SEM). SEM can answer questions that involve multiple regression analyses of factors. At the simplest level, the researcher posits a relationship between a single measured variable and other measured variables. The purpose of SEM is to explain the “raw” correlations among directly observed variables.
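
One conventional way to write this combination of factor analysis and regression is the LISREL-style notation sketched below. The symbols are not used elsewhere in this article and are introduced here only as an illustration: x and y are vectors of measured variables, ξ and η are the independent and dependent factors, and δ, ε, and ζ are error terms.

```latex
\begin{aligned}
x    &= \Lambda_x \xi + \delta        && \text{(measurement model: indicators of the independent factors)} \\
y    &= \Lambda_y \eta + \varepsilon  && \text{(measurement model: indicators of the dependent factors)} \\
\eta &= B\eta + \Gamma\xi + \zeta     && \text{(structural model: regressions among the factors)}
\end{aligned}
```

The first two lines are the factor-analysis part and the third line is the regression part; when every variable is measured directly, only a regression-style system among observed variables remains.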

Path Diagrams

Path diagrams are fundamental to SEM because they allow the researcher to diagram the hypothesized model, or set of relationships. These diagrams are helpful in clarifying the researcher’s ideas about the relationships among variables and can be directly translated into the equations needed for analysis.

Path diagrams follow several conventions:

  • Measured variables are represented by squares or rectangles.
  • Factors, which are made up of two or more indicators, are represented by circles or ovals.
  • Relationships between variables are indicated by lines; lack of a line connecting the variables implies that no direct relationship is hypothesized.
  • All lines have either one or two arrows. A line with one arrow represents a hypothesized direct relationship between two variables, and the variable with the arrow pointing toward it is the dependent variable. A line with an arrow at both ends indicates an unanalyzed relationship with no implied direction of effect.
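
The claim that a diagram translates directly into equations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable names, the `single_headed` list, and the `equations_from_paths` helper are all hypothetical, and the coefficient labels are placeholders rather than estimates.

```python
from collections import defaultdict

# Hypothetical path diagram (names are illustrative, not from the article).
# Each single-headed arrow is stored as (predictor, outcome).
single_headed = [
    ("motivation", "study_hours"),
    ("motivation", "exam_score"),
    ("study_hours", "exam_score"),
]


def equations_from_paths(paths):
    """Return one regression-style equation per variable that has an arrow pointing at it."""
    predictors = defaultdict(list)
    for predictor, outcome in paths:
        predictors[outcome].append(predictor)

    equations = {}
    for outcome, sources in predictors.items():
        # One coefficient per incoming arrow, plus an error term for the outcome.
        terms = " + ".join(f"b[{outcome}<-{src}]*{src}" for src in sources)
        equations[outcome] = f"{outcome} = {terms} + e[{outcome}]"
    return equations


equations = equations_from_paths(single_headed)
for line in equations.values():
    print(line)
```

Every variable with an arrow pointing toward it gets its own equation, so it is a dependent variable in the sense described above; `motivation` receives no arrows and therefore appears only on the right-hand side.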

Research Questions Addressed by Structural Equation Modeling

The main question asked by structural equation modeling is, “Does the model produce an estimated population covariance matrix that is consistent with the sample (observed) covariance matrix?” After this, there are several other questions that SEM can address.

  • Adequacy of the model: Parameters are estimated to create an estimated population covariance matrix. If the model is good, the parameter estimates will produce an estimated matrix that is close to the sample covariance matrix. This is evaluated primarily with the chi-square test statistic and fit indices.
  • Testing theory: Each theory, or model, generates its own covariance matrix. So which theory is best? Models representing competing theories in a specific research area are estimated, pitted against each other, and evaluated.
  • Amount of variance in the variables accounted for by the factors: How much of the variance in the dependent variables is accounted for by the independent variables? This is answered through R-squared-type statistics.
  • Reliability of the indicators: How reliable is each of the measured variables? SEM can estimate the reliability of each measured variable as well as internal-consistency measures of reliability.
  • Parameter estimates: SEM generates parameter estimates, or coefficients, for each path in the model, which can be used to determine whether one path is more or less important than other paths in predicting the outcome measure.
  • Mediation: Does an independent variable affect a specific dependent variable directly, or does it affect the dependent variable through a mediating variable? This is called a test of indirect effects.
  • Group differences: Do two or more groups differ in their covariance matrices, regression coefficients, or means? Multiple group modeling can be done in SEM to test this.
  • Longitudinal differences: Differences within and across people over time can also be examined. The time interval can be years, days, or even microseconds.
  • Multilevel modeling: Here, independent variables collected at different nested levels of measurement (for example, students nested within classrooms nested within schools) are used to predict dependent variables at the same or other levels of measurement.
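
The main question, and the mediation question along with it, can be illustrated with a minimal sketch. It assumes three standardized variables X, M, and Y, a full-mediation model (X affects Y only through M), made-up correlations, and a made-up sample size. It uses the standard maximum-likelihood discrepancy function, with chi-square taken as (N - 1) times that value. The helper names are hypothetical, and working from a correlation matrix instead of a covariance matrix is a simplification.

```python
import math

# Hypothetical observed correlations among X, M, and Y (illustrative values only).
r_xm, r_my, r_xy = 0.50, 0.40, 0.30
n = 200  # hypothetical sample size

sample = [[1.0, r_xm, r_xy],
          [r_xm, 1.0, r_my],
          [r_xy, r_my, 1.0]]

# Full-mediation model X -> M -> Y with standardized variables:
# each path estimate equals the correlation between the adjacent variables.
a, b = r_xm, r_my
indirect = a * b  # indirect effect of X on Y through M
implied = [[1.0, a, indirect],
           [a, 1.0, b],
           [indirect, b, 1.0]]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (p, q, r), (s, t, u), (v, w, x) = m
    return p * (t * x - u * w) - q * (s * x - u * v) + r * (s * w - t * v)


def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (p, q, r), (s, t, u), (v, w, x) = m
    d = det3(m)
    adj = [[t * x - u * w, r * w - q * x, q * u - r * t],
           [u * v - s * x, p * x - r * v, r * s - p * u],
           [s * w - t * v, q * v - p * w, p * t - q * s]]
    return [[value / d for value in row] for row in adj]


def ml_discrepancy(s_mat, sigma):
    """F = ln|Sigma| + tr(S Sigma^-1) - ln|S| - p for 3x3 matrices."""
    size = len(s_mat)
    sigma_inv = inv3(sigma)
    trace = sum(s_mat[i][k] * sigma_inv[k][i]
                for i in range(size) for k in range(size))
    return math.log(det3(sigma)) + trace - math.log(det3(s_mat)) - size


residual_xy = r_xy - indirect      # part of the X-Y correlation the model misses
f_ml = ml_discrepancy(sample, implied)
chi_square = (n - 1) * f_ml
df = 1                             # 6 distinct matrix elements minus 5 free parameters

print(f"indirect effect a*b = {indirect:.2f}")
print(f"residual for r_xy   = {residual_xy:.2f}")
print(f"chi-square({df})       = {chi_square:.2f}")
```

With these made-up numbers the model reproduces the X-Y correlation as about 0.20 against an observed 0.30, and the chi-square of roughly 3.18 on 1 degree of freedom falls below the .05 critical value of 3.84, so the full-mediation model would not be rejected. In practice the estimation is iterative and is handled by dedicated SEM software.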

Weaknesses of Structural Equation Modeling

Relative to alternative statistical procedures, structural equation modeling has several weaknesses:

  • It requires a relatively large sample size (a common rule of thumb is an N of 150 or greater).
  • It requires much more formal training in statistics before SEM software programs can be used effectively.
  • It requires a well-specified measurement and conceptual model. SEM is theory-driven, so one must have well-developed a priori models.

