Regression Line

Last Updated: 21 Aug, 2024

Blog Author: N/A

Edited by: Collins Enosh

Reviewed by: Dheeraj Vaidya

Regression Line Definition

A regression line indicates a linear relationship between the dependent variable on the y-axis and the independent variable on the x-axis. The relationship is estimated from the pattern formed by the data points.

The regression line is plotted as close as possible to the data points in a regression graph. This statistical tool helps analyze how a dependent variable y behaves when the independent variable x changes, by substituting different values of x into the regression equation.

  • The regression line establishes a linear relationship between two sets of variables. The change in one variable is dependent on the changes to the other (independent variable).
  • The Least Squares Regression Line (LSRL) is plotted nearest to the data points (x, y) on a regression graph.
  • Regression is widely used in financial models like CAPM and investing measures like Beta to determine the feasibility of a project. It is also used for creating projections of investments and financial returns.
  • If Y is the dependent variable and X is the independent variable, the Y on X regression line equation is represented as follows:
     'Y = a + bX + ɛ.'

Regression Line Explained

A regression line is a statistical tool that depicts the correlation between two variables. Specifically, it is used when variation in one (dependent variable) depends on the change in the value of the other (independent variable).

There can be two cases of simple linear regression:

  1. The equation is Y on X, where the value of Y changes with a variation in the value of X.
  2. The equation is X on Y, where the change in X variable depends upon the Y variable's deviation.
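
The two cases generally produce two different lines. The following is a minimal sketch, assuming the least squares formulas given in the Formula section of this article; the helper name `fit` and the five observations are made up for illustration only.

```python
def fit(independent, dependent):
    """Least squares fit of dependent = a + b * independent; returns (a, b)."""
    n = len(independent)
    sum_i = sum(independent)
    sum_d = sum(dependent)
    sum_id = sum(i * d for i, d in zip(independent, dependent))
    sum_i2 = sum(i * i for i in independent)
    b = (n * sum_id - sum_i * sum_d) / (n * sum_i2 - sum_i ** 2)
    a = (sum_d - b * sum_i) / n
    return a, b


# Made-up illustrative observations, not taken from the article.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a_yx, b_yx = fit(x, y)   # Y on X: Y = a + bX
a_xy, b_xy = fit(y, x)   # X on Y: X = a' + b'Y

print(f"Y on X: Y = {a_yx:.2f} + {b_yx:.2f}X")
print(f"X on Y: X = {a_xy:.2f} + {b_xy:.2f}Y")
```

Rearranging the Y on X line to solve for X would not reproduce the X on Y line; the two coincide only when every point lies exactly on a single straight line.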

[Figure: Regression Line Explanation]

Regression is extensively applied to real-world scenarios in business, investment, finance, and marketing. For example, in finance, regression is widely used to estimate Beta and in the Capital Asset Pricing Model (CAPM) for estimating returns and budgeting.

Capital Asset Pricing Model (CAPM)

Using regression, the company can determine the appropriate asset price with respect to the cost of capital. In the stock market, it is used for determining the impact of stock price changes on the price of underlying commodities. 

In marketing, regression analysis can be used to determine how price fluctuation results in the increase or decrease in goods sales. It is very effective in creating sales projections for a future period—by correlating market conditions, weather predictions, economic conditions, and past sales.

Formula

The formula to determine the Least Squares Regression Line (LSRL) of Y on X is as follows:

Y = a + bX + ɛ

Here,

  • Y is the dependent variable.
  • a is the Y-intercept.
  • b is the slope of the regression line.
  • X is the independent variable.
  • ɛ is the residual (error).

Also,

b = (N∑XY - (∑X)(∑Y)) / (N∑X² - (∑X)²)

And,

a = (∑Y - b ∑X) / N

Where N is the total number of observations.
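
The two formulas translate almost directly into code. The following is a minimal sketch in plain Python; the function name `least_squares_line` and the four sample points are assumptions made for illustration, not part of the original article.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line Y = a + bX fitted by least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two or more paired observations")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all X values are identical; slope is undefined")

    b = (n * sum_xy - sum_x * sum_y) / denominator   # slope
    a = (sum_y - b * sum_x) / n                      # Y-intercept
    return a, b


# Illustrative points that lie exactly on Y = 1 + 2X.
a, b = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)
```

Because the sample points sit exactly on a line, the fit recovers that line's intercept and slope; with real data the points scatter around the line and ɛ captures the difference.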

Example

Let us look at a hypothetical example to understand real-world applications of the theory.

The finance manager of ABC Motors wants to correlate variation in sales and variation in the price of electric bikes. For this purpose, he analyzes data pertaining to the last five years.

We assume there is no error. The price and sales volume for the previous five years are as follows:

Year | Price (in $) | Sales Volume
2017 | 2100 | 15000
2018 | 2050 | 16500
2019 | 2000 | 21000
2020 | 2200 | 19000
2021 | 2050 | 20000

Based on the given data, determine the regression line of Y on X.

Solution:

Let us determine the regression line of Y on X:

Given:

  • Y = Sales Volume
  • X = Price
  • N = 5
  • ɛ = 0

Year | Price (in $) (X) | Sales Volume (Y) | X² | XY
2017 | 2100 | 15000 | 4410000 | 31500000
2018 | 2050 | 16500 | 4202500 | 33825000
2019 | 2000 | 21000 | 4000000 | 42000000
2020 | 2200 | 19000 | 4840000 | 41800000
2021 | 2050 | 20000 | 4202500 | 41000000
Total | 10400 | 91500 | 21655000 | 190125000

Y = a + bX + ɛ

Let us first find out the value of b and a:

b = (N∑XY - (∑X)(∑Y)) / (N∑X² - (∑X)²)

  • b = ((5 × 190125000) - (10400 × 91500)) / ((5 × 21655000) - 10400²)
  • b = (950625000 - 951600000) / (108275000 - 108160000)
  • b = -975000 / 115000
  • b ≈ -8.478

a = (∑Y - b ∑X) / N

  • a = (91500 - (-8.478 × 10400)) / 5
  • a ≈ 35935 (using the unrounded value of b)
  • Y = 35935 + ( - 8.478 X) + 0
  • Y = 35935 – 8.478X
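
The arithmetic in this solution can be checked with a few lines of Python. This is only a sketch that applies the same two formulas to the ABC Motors data; the what-if price of $2,150 is a hypothetical value added for illustration and is not part of the original example.

```python
# Price (X) and sales volume (Y) for 2017-2021, copied from the example data.
prices = [2100, 2050, 2000, 2200, 2050]
sales = [15000, 16500, 21000, 19000, 20000]

n = len(prices)
sum_x = sum(prices)                                  # 10400
sum_y = sum(sales)                                   # 91500
sum_xy = sum(x * y for x, y in zip(prices, sales))   # 190125000
sum_x2 = sum(x * x for x in prices)                  # 21655000

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

print(f"b = {b:.3f}")   # b = -8.478
print(f"a = {a:.0f}")   # a = 35935

# Hypothetical what-if price, not part of the original data.
what_if_price = 2150
predicted_sales = a + b * what_if_price
print(f"predicted sales at ${what_if_price}: {predicted_sales:.0f}")
```

Substituting a price into the fitted equation gives the sales volume the line would predict at that price, which is how the regression line is used for projections.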

The data is represented as a regression line graph: 

[Figure: Regression Line Example]

Visualization of collected data makes data interpretation easier. The regression line is sometimes called the line of best fit.

It is important to note that real-world data cannot always be expressed with a regression equation. If the majority of observations follow a pattern, the outliers can be eliminated. But sometimes there is no obvious pattern. If the collected data shows only random irregularities, the regression method is not suitable.
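
One rough way to judge how well a line describes the data is to compare each observed value with the value the line predicts; the difference is the residual ɛ from the formula. The sketch below does this for the ABC Motors data, assuming the rounded coefficients a = 35935 and b = -8.478 from the example. The example set ɛ to 0 for simplicity, so the residuals printed here show what that simplification leaves out.

```python
# ABC Motors data from the example: price (X) and sales volume (Y).
prices = [2100, 2050, 2000, 2200, 2050]
sales = [15000, 16500, 21000, 19000, 20000]

# Rounded coefficients taken from the worked example.
a, b = 35935, -8.478

residuals = []
for x, y in zip(prices, sales):
    predicted = a + b * x          # value on the regression line
    residual = y - predicted       # observed minus predicted
    residuals.append(residual)
    print(f"price={x}  actual={y}  predicted={predicted:.0f}  residual={residual:.0f}")

# For a least squares line the residuals sum to roughly zero
# (not exactly here, because a and b are rounded).
print(f"sum of residuals: {sum(residuals):.1f}")
```

Large residuals relative to the values being predicted suggest that the line captures only part of the variation, which is a signal to treat projections from it with caution.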

Frequently Asked Questions (FAQs)

What is a regression line?

A regression line depicts the relationship between two variables. It is applied in scenarios where the change in the value of the independent variable causes changes in the value of the dependent variable.

How to find a regression line?

The formula of the regression line for Y on X is as follows:

Y = a + bX + ɛ
Here Y is the dependent variable, a is the Y-intercept, b is the slope of the regression line, X is the independent variable, and ɛ is the residual (error).

What is the slope of a regression line?

The slope of a regression line is denoted by 'b,' which shows the variation in the dependent variable y brought about by changes in the independent variable x. The formula to determine the slope of the regression line for Y on X is as follows:

b = (N∑XY - (∑X)(∑Y)) / (N∑X² - (∑X)²)
