Basics of simple linear regression in machine learning

Shari Nair
3 min read · May 18, 2021


Before I get to linear regression, I want to introduce regression and regression analysis.

Regression Analysis

In general usage, regression means returning to a former state, but in statistics it is a technique for understanding the relationship between variables.

Regression analysis is a statistical technique for estimating the relationship between a dependent variable and one or more independent variables. The dependent variable is also called the criterion variable, response variable, or endogenous variable. An independent variable is also called a predictor variable, exogenous variable, or regressor.

Regression analysis explains how the dependent variable changes as the independent variables change.

Regression analysis is used for three types of applications:

i. Finding the effect of an input variable on the target variable

ii. Finding the change in the target variable with respect to one or more independent variables

iii. Finding trends

Types of Regression

• Linear regression

• Logistic regression

• Polynomial regression

• Ridge regression

• Lasso regression

• Elastic net regression

In this post I will explain the basic idea of linear regression, which I have been learning from different online sources.

Linear Regression

Linear regression is a statistical method that falls under supervised machine learning and is used for predictive analysis. It models the relationship between a dependent variable and one or more independent variables. It focuses on the conditional probability distribution of the response given the values of the predictors.

In linear regression, the output (the dependent variable) should be continuous. In other words, linear regression predicts continuous numeric values such as sales or price.

Linear regression models a linear relationship between the dependent variable, say Y, and the independent variable, say X.

The general form of the linear regression equation is:

Y = β0 + β1*X1 + β2*X2 + … + βn*Xn

where β0 = intercept

β1, β2, …, βn = coefficients of the independent variables

X1, X2, …, Xn = independent features
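To make the general form concrete, here is a minimal sketch in Python that evaluates it for a single observation. The function name `predict` and the numbers passed to it are assumptions chosen only for illustration; they do not come from any real dataset.

```python
def predict(intercept, coefficients, features):
    """Evaluate Y = β0 + β1*X1 + ... + βn*Xn for one observation."""
    if len(coefficients) != len(features):
        raise ValueError("need exactly one coefficient per feature")
    # Multiply each coefficient by its feature, sum the products, add the intercept.
    return intercept + sum(b * x for b, x in zip(coefficients, features))


# Hypothetical values: β0 = 2.0, β1 = 0.5, β2 = -1.0, X1 = 4.0, X2 = 3.0
y_hat = predict(2.0, [0.5, -1.0], [4.0, 3.0])
print(y_hat)  # 2.0 + 0.5*4.0 - 1.0*3.0 = 1.0
```

With a single coefficient and a single feature, the same function gives the simple linear regression case.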

The goal of linear regression is to find the best-fit line, the one that minimizes the error between predicted values and actual values. This best-fit line is called the regression line.

Types of linear regression

• Simple linear regression

• Multiple linear regression

Simple linear regression

In this type of linear regression, only one independent variable (X1) is used to predict the value of a continuous dependent variable (Y).

The equation of simple linear regression is:

Y = β0 + β1*X1

The line is chosen so that the error between predicted and actual values is as small as possible. In other words, the values of β0 and β1 must be chosen so that the error is minimized.

Error = ∑ (actual - prediction)^2

The individual differences (actual - prediction) are also called residuals. If we did not square them, positive and negative residuals would cancel each other out in the sum.
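The cancellation is easy to see with a small sketch. The helper names and the three data points here are made up for illustration: one prediction is too high, one is exact, and one is too low.

```python
def sum_of_raw_errors(actual, predicted):
    """Sum of (actual - prediction) without squaring."""
    return sum(a - p for a, p in zip(actual, predicted))


def sum_of_squared_errors(actual, predicted):
    """Error = sum of (actual - prediction)^2."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


# Hypothetical data: residuals are -1.0, 0.0 and +1.0
actual = [3.0, 5.0, 7.0]
predicted = [4.0, 5.0, 6.0]

print(sum_of_raw_errors(actual, predicted))      # 0.0, the misses cancel out
print(sum_of_squared_errors(actual, predicted))  # 2.0, the misses are counted
```

The unsquared sum reports zero error even though two of the three predictions are wrong, which is why the squared version is the one being minimized.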

Minimizing this error for the line y = β0 + β1*x gives:

β0 = y‾ - β1 * x‾

and,

β1 = ∑ (xi - x‾)*(yi - y‾) / ∑ (xi - x‾)^2

where β0 = constant or intercept

β1 = coefficient of x

xi = values of x, for i = 1 to n

yi = values of y, for i = 1 to n

x‾ = mean of xi

y‾ = mean of yi
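The two formulas for β0 and β1 translate almost directly into code. The following is a minimal sketch in plain Python, not a production implementation; the function name `fit_simple_linear_regression` and the sample data (generated from y = 1 + 2x with no noise) are assumptions for illustration.

```python
def fit_simple_linear_regression(x, y):
    """Return (beta0, beta1) for the least-squares line y = beta0 + beta1*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    x_mean = sum(x) / n
    y_mean = sum(y) / n

    # Numerator and denominator of the β1 formula.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")

    beta1 = sxy / sxx
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1


# Hypothetical data lying exactly on y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]

beta0, beta1 = fit_simple_linear_regression(x, y)
print(beta0, beta1)  # 1.0 2.0
```

Because the sample points sit exactly on a line, the fit recovers the intercept and slope used to generate them. With noisy data the result would be the line with the smallest sum of squared errors instead.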

If β1 > 0, then x (predictor) and y (target) have a positive relationship.

Positive Correlation

If β1 < 0, then x (predictor) and y (target) have a negative relationship.

Negative Correlation

Thanks for reading my article. Happy reading!!! Stay safe.
