Extrapolation and interpolation are both used to estimate hypothetical values for a variable based on other observations. There are a variety of interpolation and extrapolation methods, chosen according to the overall trend observed in the data. Because the two names are so similar, they are easy to confuse. We will examine the differences between them.

## Prefixes

To tell the difference between extrapolation and interpolation, we need to look at the prefixes “extra” and “inter.” The prefix “extra” means “outside” or “in addition to.” The prefix “inter” means “in between” or “among.” Just knowing these meanings (from their Latin origins) goes a long way toward distinguishing between the two methods.

## The Setting

For both methods, we assume a few things. We have identified an independent variable and a dependent variable. Through sampling or a collection of data, we have a number of pairings of these variables. We also assume that we have formulated a model for our data. This may be a least squares line of best fit, or it could be some other type of curve that approximates our data. In any case, we have a function that relates the independent variable to the dependent variable.
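As a minimal sketch of this setting, the snippet below fits a least squares line to a small set of pairings. The data values and the helper name `fit_line` are made up for illustration; they are not taken from any real study.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and sum of cross-deviations of x and y.
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: x from 0 to 10, y scattered around 2x + 5.
xs = list(range(11))
noise = [0.4, -0.3, 0.1, -0.2, 0.3, 0.0, -0.4, 0.2, -0.1, 0.3, -0.3]
ys = [2 * x + 5 + e for x, e in zip(xs, noise)]

slope, intercept = fit_line(xs, ys)
print(f"fitted line: y = {slope:.3f}x + {intercept:.3f}")
```

The fitted slope and intercept define the function that relates the independent variable to the dependent variable.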

The goal is not just the model for its own sake; we typically want to use our model for prediction. More specifically, given a value of the independent variable, what will the predicted value of the corresponding dependent variable be? The value that we enter for our independent variable will determine whether we are working with extrapolation or interpolation.

## Interpolation

We could use our function to predict the value of the dependent variable for a value of the independent variable that lies within the range of our data. In this case, we are performing interpolation.

Suppose that data with *x* between 0 and 10 is used to produce a regression line *y* = 2*x* + 5. We can use this line of best fit to estimate the *y* value corresponding to *x* = 6. Simply plug this value into our equation and we see that *y* = 2(6) + 5 = 17. Because our *x* value is within the range of values used to make the line of best fit, this is an example of interpolation.
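The same calculation can be sketched in a few lines of code. The names `X_MIN`, `X_MAX`, and `predict` are chosen here for illustration only.

```python
# Range of x values assumed to have been used to fit the line.
X_MIN, X_MAX = 0, 10


def predict(x):
    """Evaluate the regression line y = 2x + 5 from the example."""
    return 2 * x + 5


x_query = 6
# The query counts as interpolation when it lies inside the fitted range.
is_interpolation = X_MIN <= x_query <= X_MAX
y_hat = predict(x_query)
print(x_query, y_hat, is_interpolation)
```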

## Extrapolation

We could also use our function to predict the value of the dependent variable for a value of the independent variable that is outside the range of our data. In this case, we are performing extrapolation.

Suppose as before that data with *x* between 0 and 10 is used to produce a regression line *y* = 2*x* + 5. We can use this line of best fit to estimate the *y* value corresponding to *x* = 20. Simply plug this value into our equation and we see that *y* = 2(20) + 5 = 45. Because our *x* value is outside the range of values used to make the line of best fit, this is an example of extrapolation.
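One way to make the distinction explicit is a small helper that labels a query by comparing it with the range of the data. This is only a sketch; `classify_query` is a hypothetical name, and endpoints of the range are treated as interpolation by assumption.

```python
def classify_query(x, x_min, x_max):
    """Label a query point relative to the range of x used to fit the model."""
    if x_min <= x <= x_max:
        return "interpolation"
    return "extrapolation"


def predict(x):
    """Evaluate the regression line y = 2x + 5 from the example."""
    return 2 * x + 5


# The line was fit on x between 0 and 10, so 6 is inside and 20 is outside.
for x in (6, 20):
    print(x, predict(x), classify_query(x, 0, 10))
```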

## Caution

Of the two methods, interpolation is preferred because it is more likely to give a valid estimate: the model was fit to data that surround the value we are predicting. When we use extrapolation, we are making the assumption that our observed trend continues for values of *x* outside the range we used to form our model. This may not be the case, and so we must be very careful when using extrapolation techniques.
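To see how extrapolation can go wrong, consider a purely hypothetical process that follows *y* = 2*x* + 5 up to *x* = 10 and then levels off. The function `hypothetical_truth` below is an invented assumption used only to illustrate the risk; it does not describe any real data.

```python
X_MAX = 10  # upper end of the range used to fit the line


def model(x):
    """The fitted regression line from the examples."""
    return 2 * x + 5


def hypothetical_truth(x):
    """Assumed true process: linear up to X_MAX, then flat (illustration only)."""
    return 2 * min(x, X_MAX) + 5


def prediction_error(x):
    """Absolute gap between the line's estimate and the assumed true value."""
    return abs(model(x) - hypothetical_truth(x))


# Inside the fitted range the line matches; outside it, the gap grows.
for x in (6, 20):
    print(x, model(x), hypothetical_truth(x), prediction_error(x))
```

Under this assumption the line is exact at *x* = 6 but overshoots at *x* = 20, because the trend it was fit to no longer holds there.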