Least-Squares Regression

Regression is a technique used to predict unknown values from known ones. For instance, linear regression lets us predict an unknown Y value for a given X value, based on a series of known X and Y pairs.

In the following table the pattern is easy to see: each Y is 1.5 times its X. When no obvious pattern exists, regression can help us estimate the Y value for a given X value from the data we already have.

X     Y
2     3
4     6
6     9
8     12
10    15
12    ?
14    ?

 

The X value is known as the independent variable, or “predictor variable”, while the Y value is the dependent variable: the value being predicted.

The linear regression (or “least squares regression”) equation is Y’ = a + bX

  • Y’ (Y-prime) is the predicted Y value for the X value
  • a is the estimated value of Y when X is 0
  • b is the slope (the average change in Y’ for each one-unit change in X)
  • X is any value of the independent variable

There are additional formulas for both a and b.

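a = (∑Y / n) – b(∑X / n)

b = (n(∑XY) – (∑X)(∑Y)) / (n(∑X²) – (∑X)²)

Here n is the number of data points and ∑ means “the sum of”.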
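As a rough sketch of how the two formulas fit together, the Python snippet below computes the slope b as (n∑XY – ∑X∑Y) / (n∑X² – (∑X)²) and the intercept a as ∑Y/n – b(∑X/n), the same calculations that are worked through by hand later in this article. The function name `least_squares` is made up for illustration. It is tried here on the five complete rows of the first table.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line Y' = a + bX fitted by least squares."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Slope: b = (n*sum(XY) - sum(X)*sum(Y)) / (n*sum(X^2) - sum(X)^2)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: a = sum(Y)/n - b * sum(X)/n
    a = sum_y / n - b * (sum_x / n)
    return a, b


# The five complete rows from the first table.
xs = [2, 4, 6, 8, 10]
ys = [3, 6, 9, 12, 15]
a, b = least_squares(xs, ys)

# Predict Y for the two X values whose Y was left blank.
for x in (12, 14):
    print(x, a + b * x)
```

For this perfectly linear data the fit gives a = 0 and b = 1.5, so the two missing Y values come out as 18.0 and 21.0.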

Let’s take a look at the following data set, which compares the number of calls made for a product against the number of sales:

        Calls (X)   Sales (Y)
        20          30
        40          60
        20          40
        30          60
        10          30
        10          40
        20          40
        20          50
        20          30
        30          70
Total   220         450

 

First we need to calculate the sum of X-squared, Y-squared and X*Y:

        Calls (X)   Sales (Y)   X²      Y²      XY
        20          30          400     900     600
        40          60          1600    3600    2400
        20          40          400     1600    800
        30          60          900     3600    1800
        10          30          100     900     300
        10          40          100     1600    400
        20          40          400     1600    800
        20          50          400     2500    1000
        20          30          400     900     600
        30          70          900     4900    2100
Total   220         450         5600    22100   10800
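The totals in the last row can be double-checked with a few lines of Python. This is only a sketch; the variable names (`calls`, `sales`, `sum_xy` and so on) are chosen for illustration.

```python
calls = [20, 40, 20, 30, 10, 10, 20, 20, 20, 30]  # X
sales = [30, 60, 40, 60, 30, 40, 40, 50, 30, 70]  # Y

n = len(calls)
sum_x = sum(calls)
sum_y = sum(sales)
sum_x2 = sum(x * x for x in calls)                 # sum of X squared
sum_y2 = sum(y * y for y in sales)                 # sum of Y squared
sum_xy = sum(x * y for x, y in zip(calls, sales))  # sum of X*Y

print(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)
```

The printed values match the Total row: 10 data points, then 220, 450, 5600, 22100 and 10800.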

 

Returning to our formula, let’s start with b first:

b = (n(∑XY) – (∑X)(∑Y)) / (n(∑X²) – (∑X)²)

The top of the equation (the numerator) is n(∑XY) – (∑X)(∑Y). We simply fill in the values from our chart, with n = 10 data points:

n(∑XY) – (∑X)(∑Y)

= 10(10800) – (220)(450)
= 108,000 – 99,000
= 9,000

Now we have to do the bottom half of the equation:

n(∑X²) – (∑X)²

= 10(5600) – (220)²
= 56,000 – 48,400
= 7,600

Returning to our equation:

b = 9,000 / 7,600
b = 1.1842
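The same arithmetic for b can be reproduced in Python from the chart totals. This is a minimal sketch; the variable names are illustrative only.

```python
# Totals taken from the chart above.
n = 10
sum_x, sum_y = 220, 450
sum_x2, sum_xy = 5600, 10800

numerator = n * sum_xy - sum_x * sum_y   # top half of the equation
denominator = n * sum_x2 - sum_x ** 2    # bottom half of the equation
b = numerator / denominator

print(numerator, denominator, round(b, 4))
```

This prints 9000, 7600 and 1.1842, matching the hand calculation.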

Now let’s move on to a:

a = (∑Y / n) – b(∑X / n)

a = 450 / 10 – 1.1842 * (220 / 10)
a = 45 – (1.1842 * 22)
a = 45 – 26.0524
a = 18.9476
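The intercept can be checked the same way. The sketch below (with illustrative variable names) computes a twice: once with b rounded to 1.1842, as in the hand calculation, and once with the unrounded b = 9,000 / 7,600, to show how much the rounding affects the result.

```python
n, sum_x, sum_y = 10, 220, 450

b_rounded = 1.1842      # slope as rounded in the text
b_exact = 9000 / 7600   # slope before rounding

# a = sum(Y)/n - b * sum(X)/n
a_from_rounded = sum_y / n - b_rounded * (sum_x / n)
a_from_exact = sum_y / n - b_exact * (sum_x / n)

print(round(a_from_rounded, 4))
print(round(a_from_exact, 4))
```

With the rounded slope the result is 18.9476, as above. With the unrounded slope it is 18.9474, so rounding b first only changes the fourth decimal place of a.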

So, going back to our original regression equation, Y’ = a + bX, and plugging in our numbers, we get:

Y’ = 18.9476 + (1.1842)X

To use this equation, we now put our desired value in for X. With an estimated 20 calls:

Y’ = 18.9476 + (1.1842)*20
Y’ = 18.9476 + 23.684
Y’ = 42.63

So, a salesperson who makes 20 calls can expect to make about 43 sales (42.63, rounded to the nearest whole sale).
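Once a and b are known, the prediction step is a one-line function. The sketch below uses the fitted values from this article as defaults. The name `predict_sales` is hypothetical, and the call counts in the loop are picked from within the range seen in the data (10 to 40 calls).

```python
def predict_sales(calls, a=18.9476, b=1.1842):
    """Predicted sales Y' = a + bX for a given number of calls X."""
    return a + b * calls


for calls in (10, 20, 30, 40):
    print(calls, round(predict_sales(calls), 2))
```

For 20 calls this gives 42.63, the same value as the hand calculation.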



Cite this article as: MacDonald, D.K., (2015), "Least-Squares Regression," retrieved on November 17, 2017 from http://dustinkmacdonald.com/least-squares-regression/.
