Basics of simple linear regression in machine learning
Before I explain what linear regression is, I want to introduce regression and regression analysis.
Regression Analysis
In general, regression means to return to a former state, but in statistics it is a technique for understanding the relationship between two variables.
Regression analysis is a statistical technique for estimating the relationship between a dependent variable and one or more independent variables. The dependent variable is also called the criterion, response, or endogenous variable. An independent variable is also called a predictor, exogenous variable, or regressor.
Regression analysis explains changes in the dependent variable with respect to changes in the independent variables.
Regression analysis is used for three main types of applications:
i. Finding the effect of an input variable on the target variable
ii. Estimating the change in the target variable with respect to one or more independent variables
iii. Identifying trends
Types of Regression
Ø Linear regression
Ø Logistic regression
Ø Polynomial regression
Ø Ridge regression
Ø Lasso regression
Ø Elastic net regression
In this post I am going to explain the basic idea of linear regression, which I have learned (and am still learning) from different online sources.
Linear Regression
Linear regression is a statistical method that falls under supervised machine learning and is used for predictive analysis. It models the relationship between a dependent variable and one or more independent variables, focusing on the conditional distribution of the response given the predictors.
In linear regression, the output or dependent variable should be continuous in nature. In other words, linear regression makes predictions for continuous or numeric values such as sales, price, etc.
Linear regression shows a linear relationship between a dependent variable, say Y, and independent variables, say X.
The general form of linear regression equation is:
Y = β0 + β1*X1 + β2*X2 + … + βn*Xn
Where, β0 = intercept
β1, β2, …, βn = coefficients of the independent variables
X1, X2, …, Xn = independent variables (features)
The goal of linear regression is to find the best-fit line that minimizes the error between predicted values and actual values. That best-fit line is called the regression line.
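As a quick illustration, here is a minimal sketch (assuming NumPy and scikit-learn are available, and using a small made-up dataset) that fits a linear regression with two independent variables and prints the intercept β0 and the coefficients β1 and β2:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Small made-up dataset: two independent variables (X1, X2) and a target Y
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]])
Y = np.array([6, 5, 12, 11, 17])

model = LinearRegression()
model.fit(X, Y)  # finds the best-fit line by minimizing the squared error

print("Intercept (beta0):", model.intercept_)
print("Coefficients (beta1, beta2):", model.coef_)
print("Prediction for X1=6, X2=5:", model.predict(np.array([[6, 5]])))
```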
Types of linear regression
Ø Simple linear regression
Ø Multiple linear regression
Simple linear regression
In this type of linear regression, only one independent variable (X1) is used to predict the value of a continuous dependent variable (Y).
The equation of simple linear regression is
Y = β0 + β1*X1
The line is chosen so that the error between the predicted and actual values is minimized. In other words, the values of β0 and β1 must be chosen so that the error is minimized.
Error = ∑ (actual − predicted)^2
Errors are also called residuals. In the above equation, if we did not square the errors, the positive and negative values would cancel each other out.
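To make this concrete, here is a small sketch (plain NumPy, with made-up numbers and an arbitrary candidate line) that generates predictions from the simple linear regression equation and computes the residuals and the sum of squared errors:

```python
import numpy as np

# Made-up observations of x and y
x = np.array([1.0, 2.0, 3.0, 4.0])
actual = np.array([3.0, 5.2, 6.9, 9.1])

# Predictions from a candidate line y = beta0 + beta1*x, e.g. beta0 = 1, beta1 = 2
beta0, beta1 = 1.0, 2.0
predicted = beta0 + beta1 * x

residuals = actual - predicted   # errors (residuals); can be positive or negative
sse = np.sum(residuals ** 2)     # squaring prevents them from cancelling out

print("Residuals:", residuals)
print("Sum of squared errors:", sse)
```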
From the equation y = β0 + β1*x,
β0 = ȳ − β1 * x̄
and,
β1 = ∑ (xi − x̄) * (yi − ȳ) / ∑ (xi − x̄)^2
where, β0 = constant or intercept
β1 = coefficient of x (slope)
xi = values of x, for i = 1 to n
yi = values of y, for i = 1 to n
x̄ = mean of the xi values
ȳ = mean of the yi values
If β1 > 0, then x (predictor) and y (target) have a positive relationship.
If β1 < 0, then x (predictor) and y (target) have a negative relationship.
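Putting these formulas together, here is a minimal sketch (plain NumPy, made-up data with an upward trend) that computes β1 and β0 from the formulas above and reports the sign of the relationship:

```python
import numpy as np

# Made-up data with an upward trend
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

x_bar, y_bar = x.mean(), y.mean()

# beta1 = sum((xi - x_bar) * (yi - y_bar)) / sum((xi - x_bar)^2)
beta1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
# beta0 = y_bar - beta1 * x_bar
beta0 = y_bar - beta1 * x_bar

print("beta1 (slope):", beta1)
print("beta0 (intercept):", beta0)
print("Relationship:", "positive" if beta1 > 0 else "negative")
```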
Thanks for reading my article. Happy reading!!! Stay safe.