What it Means When a Variable Is Spurious

Definition, Overview and Examples

Spurious is a term used to describe a statistical relationship between two variables that, at first glance, appear to be causally related but, upon closer examination, appear so only by coincidence or because of the role of a third variable (often called a confounding or lurking variable). When this occurs, the two original variables are said to have a "spurious relationship."

This is an important concept to understand within the social sciences, and in all sciences that rely on statistics as a research method, because scientific studies are often designed to test whether there is a causal relationship between two things. When one tests a hypothesis, this is generally what one is looking for. Therefore, to interpret the results of a statistical study accurately, one must understand spuriousness and be able to spot it in one's findings.

How to Spot a Spurious Relationship

The best tool for spotting a spurious relationship in research findings is common sense. If you start from the assumption that two things co-occurring does not mean they are causally related, you are off to a good start. Any researcher worth her salt will examine her findings with a critical eye, knowing that failing to account for all possibly relevant variables in a study can distort the results. A researcher or critical reader must therefore scrutinize the research methods employed in any study to truly understand what the results mean.

The best way to eliminate spuriousness in a research study is to control for it, in a statistical sense, from the start. This involves carefully accounting for all variables that might impact the findings and including them in your statistical model to control their impact on the dependent variable.
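
As a rough illustration of what "controlling for" a variable means, here is a minimal sketch in Python using simulated data. Everything in it is an assumption made for the example: the variable names x, y and z, the helper functions simulate and ols_coefficients, and the numeric effect sizes are all hypothetical and not drawn from any real study. In the simulation, z drives both x and y, while x has no effect on y at all.

```python
import numpy as np

def simulate(n=5000, seed=0):
    """Generate hypothetical data in which z causes both x and y."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)            # the third, "hidden" variable
    x = 2.0 * z + rng.normal(size=n)  # x depends on z only
    y = 3.0 * z + rng.normal(size=n)  # y depends on z only, never on x
    return x, y, z

def ols_coefficients(y, *predictors):
    """Fit ordinary least squares; returns [intercept, slope_1, slope_2, ...]."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

x, y, z = simulate()

raw_corr = np.corrcoef(x, y)[0, 1]      # correlation between x and y
naive = ols_coefficients(y, x)          # model that leaves z out
controlled = ols_coefficients(y, x, z)  # model that includes z

print("correlation of x and y:", round(raw_corr, 3))
print("slope on x without z:  ", round(naive[1], 3))
print("slope on x with z:     ", round(controlled[1], 3))
```

In this made-up setup, x and y are strongly correlated and the model that omits z assigns x a clearly nonzero slope. Once z is included in the model, the slope on x shrinks to roughly zero, which is the signature of a spurious relationship. Real data are rarely this clean, and controlling only helps for third variables that were actually measured.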

Example of a Spurious Relationship Between Variables

Many social scientists have focused their attention on identifying which variables impact the dependent variable of educational attainment. In other words, they are interested in studying what factors influence how much formal schooling a person will complete and how many degrees they will earn in their lifetime.

When you look at historical trends in educational attainment broken down by race, you see that Asian Americans between the ages of 25 and 29 are the most likely to have completed college (a full 60 percent of them have done so), while the rate of completion for white people is 40 percent. For Black people, the rate of college completion is much lower, just 23 percent, while the Hispanic population has a rate of just 15 percent.

Looking at these two variables, educational attainment and race, one might surmise that race has a causal effect on completion of college. But this is an example of a spurious relationship. It is not race itself that impacts educational attainment, but racism, which is the third "hidden" variable that mediates the relationship between these two.

Racism impacts the lives of people of color deeply and in many ways, shaping everything from where they live and which schools they attend to how they are sorted within those schools, how much their parents work, and how much money they earn and save. It also affects how teachers perceive their intelligence and how frequently and harshly they are punished in schools. In all of these ways and many others, racism is a causal variable that impacts educational attainment, but race, in this statistical equation, is a spurious one.