One of the goals of statistics is the organization and display of data. Often a good way to do this is with a graph, chart or table. When working with paired data, a useful type of graph is a scatterplot. This type of graph allows us to explore our data by examining a scattering of points in the plane.

## Paired Data

A scatterplot is a type of graph that is used for paired data. This is a type of data set in which each of our data points has two numbers associated with it. Common examples of such pairings include:

- A measurement before and after a treatment. This could take the form of a student’s performance on a pretest and then later a posttest.
- A matched pairs experimental design. Here one individual is in the control group and another similar individual is in the treatment group.
- Two measurements from the same individual. For example, we may record the weight and height of 100 people.
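
As a minimal sketch of what paired data can look like in code, the snippet below stores each data point as an `(x, y)` tuple. The pretest and posttest scores are made-up values used only for illustration, and the names `pretest`, `posttest` and `pairs` are hypothetical.

```python
# Hypothetical pretest and posttest scores for five students (made-up values).
pretest = [62, 70, 55, 81, 90]
posttest = [68, 74, 61, 85, 88]

# Paired data needs exactly one value from each list per individual.
if len(pretest) != len(posttest):
    raise ValueError("paired data needs the same number of values in each list")

# Each student contributes one (before, after) pair.
pairs = list(zip(pretest, posttest))

for before, after in pairs:
    print(f"pretest={before}, posttest={after}, change={after - before}")
```

Keeping the two numbers together in one tuple makes it harder to accidentally match one student's pretest with another student's posttest.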

## 2D Graphs

The blank canvas that we will start with for our scatterplot is the Cartesian coordinate system. This is also called the rectangular coordinate system due to the fact that every point can be located by drawing a particular rectangle. A rectangular coordinate system can be set up by:

- Starting with a horizontal number line. This is called the *x*-axis.
- Adding a vertical number line that crosses the *x*-axis so that the zero points of both lines coincide. This second number line is called the *y*-axis.
- Marking the point where the zeroes of our number lines intersect. This point is called the origin.

Now we can plot our data points. The first number in our pair is the *x*-coordinate. It is the horizontal distance away from the *y*-axis. Starting at the origin, we move to the right for positive values of *x* and to the left for negative values of *x*.

The second number in our pair is the *y*-coordinate. It is the vertical distance away from the *x*-axis. Starting at the point we reached on the *x*-axis, we move up for positive values of *y* and down for negative values of *y*.

The location on our graph is then marked with a dot. We repeat this process over and over for each point in our data set. The result is a scattering of points, which gives the scatterplot its name.
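
The plotting procedure described above can be sketched in plain Python by drawing the points on a small text grid. This is only an illustration under simplifying assumptions: the coordinates are integers, they fall inside the chosen ranges, and the function name `text_scatterplot` and the sample data are hypothetical. In practice a plotting library would normally be used instead.

```python
def text_scatterplot(pairs, x_range=(-5, 5), y_range=(-5, 5)):
    """Return a text grid with '*' at each (x, y) pair and axes through the origin.

    Assumes integer coordinates that fall inside the given ranges.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    points = set(pairs)
    rows = []
    # Walk from the top row (largest y) down to the bottom row (smallest y).
    for y in range(y_max, y_min - 1, -1):
        row = []
        # Walk from the left column (smallest x) to the right column (largest x).
        for x in range(x_min, x_max + 1):
            if (x, y) in points:
                row.append("*")   # a data point
            elif x == 0 and y == 0:
                row.append("+")   # the origin
            elif y == 0:
                row.append("-")   # the x-axis
            elif x == 0:
                row.append("|")   # the y-axis
            else:
                row.append(" ")
        rows.append("".join(row))
    return "\n".join(rows)


# Made-up paired data for illustration.
data = [(-4, -3), (-2, -1), (1, 0), (2, 2), (4, 3)]
plot = text_scatterplot(data)
print(plot)
```

Each pair is located exactly as in the text: the first number selects the column (left or right of the *y*-axis) and the second number selects the row (above or below the *x*-axis).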

## Explanatory and Response

One important point that remains is to be careful about which variable goes on which axis. If our paired data consists of an explanatory and response pairing, then the explanatory variable is plotted on the *x*-axis and the response variable on the *y*-axis. If both variables are considered to be explanatory, then we may choose which one is to be plotted on the *x*-axis and which one on the *y*-axis.

## Features of a Scatterplot

There are several important features of a scatterplot. By identifying these traits we can uncover more information about our data set. These features include:

- The overall trend among our variables. As we read from left to right, what is the big picture? Is the pattern upward, downward or cyclical?
- Any outliers from the overall trend. Are these outliers from the rest of our data, or are they influential points?
- The shape of any trend. Is this linear, exponential, logarithmic or something else?
- The strength of any trend. How closely do the data fit the overall pattern that we identified?
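
When the trend is linear, one common numerical summary of its direction and strength is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it from a list of `(x, y)` pairs using only the standard library. The function name `pearson_r` and the sample data are hypothetical, and this measure is not meaningful for cyclical or other strongly nonlinear patterns.

```python
import math


def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up data: a perfectly linear upward trend and a perfectly linear downward trend.
upward = [(1, 2), (2, 4), (3, 6)]
downward = [(1, 6), (2, 4), (3, 2)]
r_up = pearson_r(upward)
r_down = pearson_r(downward)
print(r_up, r_down)
```

Values near 1 or -1 indicate that the points lie close to a line, while values near 0 indicate little or no linear trend.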

## Related Topics

Scatterplots that exhibit a linear trend can be analyzed with the statistical techniques of linear regression and correlation. Regression can also be performed for nonlinear trends.
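
As a rough illustration of linear regression, the sketch below fits a least squares line to paired data using the standard formulas for slope and intercept. The function name `least_squares_line` and the sample data are hypothetical, and the sketch assumes the *x*-values are not all identical.

```python
def least_squares_line(pairs):
    """Return (slope, intercept) of the least squares line for (x, y) pairs."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("slope is undefined when all x-values are equal")
    slope = sxy / sxx
    # The least squares line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying exactly on the line y = 2x + 1.
data = [(0, 1), (1, 3), (2, 5)]
slope, intercept = least_squares_line(data)
print(f"y = {slope} * x + {intercept}")
```

Here the explanatory variable is the first number in each pair and the response variable is the second, matching the axis convention used for the scatterplot.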