Regression is a technique for predicting unknown values from known ones. For instance, linear regression lets us predict an unknown Y value for a given X value, based on a series of known X and Y pairs.

In the following table, the pattern is easy to see: each Y is 1.5 times its X. But when no obvious pattern exists, regression can help us estimate the Y value for a given X value.

| X | Y |
|---|---|
| 2 | 3 |
| 4 | 6 |
| 6 | 9 |
| 8 | 12 |
| 10 | 15 |
| 12 | |
| 14 | |

The X value is known as the independent variable, or “predictor variable”, while the Y value is the dependent variable: the value being predicted.

The linear regression (or “least squares regression”) equation is Y’ = a + bX

- Y’ (Y-prime) is the predicted Y value for the X value
- a is the intercept (the estimated value of Y when X is 0)
- b is the slope (the average change in Y’ for each one-unit change in X)
- X is any value of the independent variable

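*a* and *b* each have their own formula, where n is the number of data points:

- *b* = (n(∑XY) – (∑X)(∑Y)) / (n(∑X^{2}) – (∑X)^{2})
- *a* = (∑Y / n) – *b*(∑X / n)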

Let’s take a look at the following dataset, which compares the number of calls made for a product against the number of sales:

| Calls (X) | Sales (Y) |
|---|---|
| 20 | 30 |
| 40 | 60 |
| 20 | 40 |
| 30 | 60 |
| 10 | 30 |
| 10 | 40 |
| 20 | 40 |
| 20 | 50 |
| 20 | 30 |
| 30 | 70 |
| **Total: 220** | **450** |

First we need to calculate the sum of X-squared, Y-squared and X*Y:

| Calls (X) | Sales (Y) | X^{2} | Y^{2} | XY |
|---|---|---|---|---|
| 20 | 30 | 400 | 900 | 600 |
| 40 | 60 | 1600 | 3600 | 2400 |
| 20 | 40 | 400 | 1600 | 800 |
| 30 | 60 | 900 | 3600 | 1800 |
| 10 | 30 | 100 | 900 | 300 |
| 10 | 40 | 100 | 1600 | 400 |
| 20 | 40 | 400 | 1600 | 800 |
| 20 | 50 | 400 | 2500 | 1000 |
| 20 | 30 | 400 | 900 | 600 |
| 30 | 70 | 900 | 4900 | 2100 |
| **Total: 220** | **450** | **5600** | **22100** | **10800** |
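The column totals in the chart can be double-checked with a few lines of Python. This is only a sketch: the variable names (`calls`, `sales`, `sum_x2` and so on) are illustrative choices, not part of the original walkthrough.

```python
# Calls (X) and sales (Y), copied from the table above.
calls = [20, 40, 20, 30, 10, 10, 20, 20, 20, 30]
sales = [30, 60, 40, 60, 30, 40, 40, 50, 30, 70]

n = len(calls)                                     # number of data points
sum_x = sum(calls)                                 # ∑X
sum_y = sum(sales)                                 # ∑Y
sum_x2 = sum(x * x for x in calls)                 # ∑X^2
sum_y2 = sum(y * y for y in sales)                 # ∑Y^2
sum_xy = sum(x * y for x, y in zip(calls, sales))  # ∑XY

print(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)
# prints: 10 220 450 5600 22100 10800
```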

Returning to our formulas, let’s start with *b*:

*b* = (n(∑XY) – (∑X)(∑Y)) / (n(∑X^{2}) – (∑X)^{2})

The top of the equation (the numerator) is n(∑XY) – (∑X)(∑Y). Filling in the values from our chart:

n(∑XY) – (∑X)(∑Y)

= 10(10800) – (220)(450)

= 108,000 – 99,000

= 9,000

So far we have:

*b* = 9,000 / (n(∑X^{2}) – (∑X)^{2})

Now we have to do the bottom half of the equation:

n(∑X^{2}) – (∑X)^{2}

= 10(5600) – (220)^{2}

= 56,000 – 48,400

= 7,600

Returning to our equation:

*b* = 9,000 / 7,600

*b* = 1.1842
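The slope arithmetic above can be reproduced in Python as a quick check. This is a minimal sketch that starts from the chart totals; the names `numerator` and `denominator` are just labels for the top and bottom halves of the equation.

```python
# Totals taken from the chart above.
n = 10
sum_x, sum_y = 220, 450
sum_x2, sum_xy = 5600, 10800

numerator = n * sum_xy - sum_x * sum_y   # 108,000 - 99,000
denominator = n * sum_x2 - sum_x ** 2    # 56,000 - 48,400
b = numerator / denominator

print(numerator, denominator, round(b, 4))
# prints: 9000 7600 1.1842
```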

Now let’s move on to *a:*

a = 450 / 10 – 1.1842 * (220 / 10)

a = 45 – (1.1842 * 22)

a = 45 – 26.0524

a = 18.9476
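The same check works for the intercept. One detail worth noticing is that the result depends slightly on when *b* is rounded: the sketch below computes *a* once with the rounded slope used in the text and once with the unrounded slope. The variable names are illustrative.

```python
# Totals taken from the chart above.
n = 10
sum_x, sum_y = 220, 450

b_rounded = 1.1842     # slope rounded to four decimals, as in the text
b_exact = 9000 / 7600  # slope before rounding

a_from_rounded = sum_y / n - b_rounded * (sum_x / n)
a_from_exact = sum_y / n - b_exact * (sum_x / n)

print(round(a_from_rounded, 4))  # prints: 18.9476
print(round(a_from_exact, 4))    # prints: 18.9474
```

The difference only appears in the fourth decimal place, so it does not change the prediction once that is rounded to two decimals.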

So, going back to our original regression equation, Y’ = a + bX, and plugging in our numbers, we get:

Y’ = 18.9476 + (1.1842)X

To use this equation, we now put our desired value in for X. With an estimated 20 calls:

Y’ = 18.9476 + (1.1842)*20

Y’ = 18.9476 + 23.684

Y’ = 42.63

So, a salesperson who makes 20 calls can expect to make about 43 sales (42.63, rounded to the nearest whole sale).
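Putting the steps together, the whole calculation might be wrapped up as follows. This is a sketch under the formulas used in this post; `fit_line` and `predict` are hypothetical helper names, and the guard against identical X values is an added assumption rather than something covered above.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y' = a + bX."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # Happens when every X is the same; no slope can be computed.
        raise ValueError("all X values are identical; the slope is undefined")

    b = (n * sum_xy - sum_x * sum_y) / denominator
    a = sum_y / n - b * (sum_x / n)
    return a, b


def predict(a, b, x):
    """Predicted Y' for a given X."""
    return a + b * x


# The calls/sales data from this post.
calls = [20, 40, 20, 30, 10, 10, 20, 20, 20, 30]
sales = [30, 60, 40, 60, 30, 40, 40, 50, 30, 70]
a, b = fit_line(calls, sales)
print(round(predict(a, b, 20), 2))  # prints: 42.63

# The first table in this post, where Y is always 1.5 times X.
a2, b2 = fit_line([2, 4, 6, 8, 10], [3, 6, 9, 12, 15])
print(predict(a2, b2, 12), predict(a2, b2, 14))  # prints: 18.0 21.0
```

Applied to the first table, the fitted line has an intercept of 0 and a slope of 1.5, so the two missing Y values come out as 18 and 21, matching the pattern visible by eye.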
