Calculating the Correlation Coefficient

By Courtney Taylor, ThoughtCo
Updated January 28, 2020

Courtney K. Taylor, Ph.D., is a professor of mathematics at Anderson University and the author of "An Introduction to Abstract Algebra."

There are many questions to ask when looking at a scatterplot. One of the most common is how well a straight line approximates the data. To help answer this, there is a descriptive statistic called the correlation coefficient. We will see how to calculate this statistic.

The Correlation Coefficient

The correlation coefficient, denoted by r, tells us how closely data in a scatterplot fall along a straight line. The closer the absolute value of r is to one, the better the data are described by a linear equation. If r = 1 or r = -1, the data points lie exactly on a straight line. Data sets with values of r close to zero show little to no straight-line relationship.

Because the calculations are lengthy, it is best to calculate r with a calculator or statistical software. However, it is always worthwhile to know what your calculator is doing when it calculates. What follows is a process for calculating the correlation coefficient mainly by hand, with a calculator used for the routine arithmetic steps.

Steps for Calculating r

We will begin by listing the steps to the calculation of the correlation coefficient.
The data we are working with are paired data, each pair of which will be denoted by (x_i, y_i). We begin with a few preliminary calculations (steps 1 through 4). The quantities from these calculations will be used in the subsequent steps of our calculation of r:

1. Calculate x̄, the mean of all of the first coordinates x_i of the data.
2. Calculate ȳ, the mean of all of the second coordinates y_i of the data.
3. Calculate s_x, the sample standard deviation of all of the first coordinates x_i.
4. Calculate s_y, the sample standard deviation of all of the second coordinates y_i.
5. Use the formula (z_x)_i = (x_i - x̄) / s_x to calculate a standardized value for each x_i.
6. Use the formula (z_y)_i = (y_i - ȳ) / s_y to calculate a standardized value for each y_i.
7. Multiply corresponding standardized values: (z_x)_i (z_y)_i.
8. Add the products from the last step together.
9. Divide the sum from the previous step by n - 1, where n is the total number of points in our set of paired data. The result is the correlation coefficient r.

This process is not hard, and each step is fairly routine, but the collection of all of these steps is quite involved. The calculation of a standard deviation is tedious enough on its own, and the correlation coefficient requires not only two standard deviations but a multitude of other operations.

An Example

To see exactly how the value of r is obtained, we look at an example. Again, it is important to note that for practical applications we would want to use a calculator or statistical software to calculate r for us.

We begin with a listing of paired data: (1, 1), (2, 3), (4, 5), (5, 7). The mean of the x values 1, 2, 4, and 5 is x̄ = 3. Similarly, the mean of the y values 1, 3, 5, and 7 is ȳ = 4. The sample standard deviations are s_x ≈ 1.83 for the x values and s_y ≈ 2.58 for the y values. The table below summarizes the other calculations needed for r. The sum of the products in the rightmost column is 2.969848.
Since there are a total of four points and 4 - 1 = 3, we divide the sum of the products by 3. This gives us a correlation coefficient of r = 2.969848 / 3 = 0.989949.

Table for Example of Calculation of Correlation Coefficient

x    y    z_x             z_y             z_x z_y
1    1    -1.09544503     -1.161894958    1.272792057
2    3    -0.547722515    -0.387298319    0.212132009
4    5     0.547722515     0.387298319    0.212132009
5    7     1.09544503      1.161894958    1.272792057
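
The steps for calculating r described in this article can also be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the function name `correlation_coefficient` is hypothetical, and the sketch relies on the standard library's `statistics.mean` and `statistics.stdev` (the sample standard deviation, with an n - 1 denominator). It assumes at least two pairs and that neither coordinate is constant, since a standard deviation of zero would make the standardized values undefined.

```python
from statistics import mean, stdev


def correlation_coefficient(pairs):
    """Return r for a list of (x, y) pairs using standardized values."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    n = len(pairs)

    # Means of the first and second coordinates.
    x_bar, y_bar = mean(xs), mean(ys)

    # Sample standard deviations (n - 1 in the denominator).
    s_x, s_y = stdev(xs), stdev(ys)

    # Standardized value for each coordinate.
    z_x = [(x - x_bar) / s_x for x in xs]
    z_y = [(y - y_bar) / s_y for y in ys]

    # Multiply corresponding standardized values, add, divide by n - 1.
    products = [a * b for a, b in zip(z_x, z_y)]
    return sum(products) / (n - 1)


# The paired data from the worked example in the article.
data = [(1, 1), (2, 3), (4, 5), (5, 7)]
r = correlation_coefficient(data)
print(round(r, 6))  # 0.989949
```

Keeping each intermediate list (`z_x`, `z_y`, `products`) separate mirrors the columns of the example table, so the values can be printed and compared row by row. In practice, as the article recommends, a library routine is the better choice; Python 3.10 and later provide `statistics.correlation` for this purpose.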