# Least-Squares Regression

Regression is a technique for predicting unknown values from known ones. For instance, linear regression lets us predict the Y value for a given X value, based on a series of known X and Y pairs.

In the following data, it’s easy to see the pattern. But when no obvious pattern exists, regression can help us estimate what the Y value will be for a given X value.

| X | Y |
|---|---|
| 2 | 3 |
| 4 | 6 |
| 6 | 9 |
| 8 | 12 |
| 10 | 15 |
| 12 | 14 |

The X value is known as the independent variable, or the “predictor variable”, while the Y value is the dependent variable, the value being predicted.

The linear regression (or “least squares regression”) equation is Y’ = a + bX

• Y’ (Y-prime) is the predicted Y value for the X value
• a is the intercept (the estimated value of Y when X is 0)
• b is the slope (the average change in Y’ for each one-unit change in X)
• X is any value of the independent variable

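Both a and b are calculated from the data with their own formulas, where n is the number of data points:

• b = (n(∑XY) – (∑X)(∑Y)) / (n(∑X²) – (∑X)²)
• a = (∑Y / n) – b(∑X / n)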

Let’s take a look at the following data set, which compares the number of calls made for a product against the number of sales:

| Calls (X) | Sales (Y) |
|---|---|
| 20 | 30 |
| 40 | 60 |
| 20 | 40 |
| 30 | 60 |
| 10 | 30 |
| 10 | 40 |
| 20 | 40 |
| 20 | 50 |
| 20 | 30 |
| 30 | 70 |
| Total: 220 | 450 |

First we need to calculate X², Y² and XY for each row, and then the sum of each column:

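| | Calls (X) | Sales (Y) | X² | Y² | XY |
|---|---|---|---|---|---|
| | 20 | 30 | 400 | 900 | 600 |
| | 40 | 60 | 1600 | 3600 | 2400 |
| | 20 | 40 | 400 | 1600 | 800 |
| | 30 | 60 | 900 | 3600 | 1800 |
| | 10 | 30 | 100 | 900 | 300 |
| | 10 | 40 | 100 | 1600 | 400 |
| | 20 | 40 | 400 | 1600 | 800 |
| | 20 | 50 | 400 | 2500 | 1000 |
| | 20 | 30 | 400 | 900 | 600 |
| | 30 | 70 | 900 | 4900 | 2100 |
| Total | 220 | 450 | 5600 | 22100 | 10800 |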
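The column totals can be checked with a few lines of Python. This is only a sketch for verifying the arithmetic; the variable names (`calls`, `sales`, `sum_xy` and so on) are chosen here for illustration and are not part of the original worked example.

```python
# Calls (X) and sales (Y), copied row by row from the table above.
calls = [20, 40, 20, 30, 10, 10, 20, 20, 20, 30]
sales = [30, 60, 40, 60, 30, 40, 40, 50, 30, 70]

n = len(calls)                                      # number of data points
sum_x = sum(calls)                                  # sum of X
sum_y = sum(sales)                                  # sum of Y
sum_x2 = sum(x * x for x in calls)                  # sum of X squared
sum_y2 = sum(y * y for y in sales)                  # sum of Y squared
sum_xy = sum(x * y for x, y in zip(calls, sales))   # sum of X * Y

print(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)
# prints: 10 220 450 5600 22100 10800
```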

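The full equation for the slope is b = (n(∑XY) – (∑X)(∑Y)) / (n(∑X²) – (∑X)²), where n is the number of data points (10 here). Starting with the top half of the equation, we simply fill in the values from our chart:

n(∑XY) – (∑X)(∑Y)

= 10(10800) – (220)(450)
= 108,000 – 99,000
= 9,000

So far, then, b = 9,000 / (n(∑X²) – (∑X)²).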

Now we have to do the bottom half of the equation:

n(∑X²) – (∑X)²

= 10(5600) – (220)²
=56,000 – 48,400
=7,600

Returning to our equation:

b = 9,000 / 7,600
b = 1.1842
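As a cross-check, the slope can be computed directly from the totals. The snippet below is a minimal sketch that hard-codes the sums from the chart; the names `numerator` and `denominator` are just labels for the top and bottom halves of the equation.

```python
# Totals taken from the chart.
n = 10
sum_x, sum_y = 220, 450
sum_x2, sum_xy = 5600, 10800

numerator = n * sum_xy - sum_x * sum_y    # 108000 - 99000 = 9000
denominator = n * sum_x2 - sum_x ** 2     # 56000 - 48400 = 7600

b = numerator / denominator
print(round(b, 4))  # prints: 1.1842
```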

Now let’s move on to a, which is the mean of Y minus b times the mean of X, or a = (∑Y / n) – b(∑X / n):

a = 450 / 10 – 1.1842 * (220 / 10)
a = 45 – (1.1842 * 22)
a = 45 – 26.0524
a = 18.9476
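The intercept can be checked the same way. One caveat worth seeing in code: the hand calculation uses b rounded to 1.1842, so it differs slightly in the fourth decimal place from the result obtained with the unrounded slope. The names `a_rounded` and `a_exact` are illustrative only.

```python
# Totals taken from the chart.
n = 10
sum_x, sum_y = 220, 450

# Intercept using the slope rounded to four decimals, as in the hand calculation.
a_rounded = sum_y / n - 1.1842 * (sum_x / n)
print(round(a_rounded, 4))  # prints: 18.9476

# Intercept using the unrounded slope 9000 / 7600.
b = 9000 / 7600
a_exact = sum_y / n - b * (sum_x / n)
print(round(a_exact, 4))    # prints: 18.9474
```

The difference is too small to matter for the prediction here, but carrying full precision until the final step avoids this kind of drift.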

So, going back to our original regression equation, Y’ = a + bX, and plugging in our numbers, we get:

Y’ = 18.9476 + (1.1842)X

To use this equation, we now put our desired value in for X. With an estimated 20 calls:

Y’ = 18.9476 + (1.1842)*20
Y’ = 18.9476 + 23.684
Y’ = 42.63

So, a salesperson who makes 20 calls can be expected to make about 42.6 sales, or roughly 43.
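The whole procedure can be collected into one small function. This is a sketch under the assumptions of this article (a single predictor and the sums-based formulas for a and b); the function name `fit_least_squares` is hypothetical, and the input checks are an addition that the worked example does not discuss.

```python
def fit_least_squares(xs, ys):
    """Return (a, b) for the line Y' = a + bX using the sums-based formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (X, Y) pairs of equal length")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # Every X is identical, so no slope can be estimated.
        raise ValueError("X values must not all be the same")

    b = (n * sum_xy - sum_x * sum_y) / denominator   # slope
    a = sum_y / n - b * (sum_x / n)                  # intercept
    return a, b


calls = [20, 40, 20, 30, 10, 10, 20, 20, 20, 30]
sales = [30, 60, 40, 60, 30, 40, 40, 50, 30, 70]

a, b = fit_least_squares(calls, sales)
predicted = a + b * 20          # predicted sales for 20 calls
print(round(predicted, 2))      # prints: 42.63
```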

Cite this article as: MacDonald, D.K., (2015), "Least-Squares Regression," retrieved on December 9, 2022 from http://dustinkmacdonald.com/least-squares-regression/.