Linear Regression
Linear regression models a linear relationship between two or more variables: if we plot this relationship between two variables in two-dimensional space, we get a straight line.
Y = 𝛉₁ + 𝛉₂X
where X is the explanatory variable
and Y is the dependent variable.
The slope of the line is 𝛉₂, and 𝛉₁ is the intercept (the value of Y when X = 0).
Disadvantage:
It assumes that a single straight line is appropriate as a summary of the data.
Example:
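As a rough illustration, here is a minimal sketch of fitting Y = 𝛉₁ + 𝛉₂X with scikit-learn; the synthetic data and the library choice are assumptions made purely for illustration:

```python
# Minimal sketch: fit a straight line to toy data with scikit-learn.
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: y is roughly 2 + 3x plus noise (illustrative numbers).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))
y = 2 + 3 * X[:, 0] + rng.normal(0, 1, size=50)

model = LinearRegression().fit(X, y)
print("intercept (theta1):", model.intercept_)
print("slope (theta2):", model.coef_[0])
```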
Ordinary Least Squares (OLS)
Ordinary Least Squares (OLS) is one of the methods for finding the best-fit line in linear regression. For every data point it computes the difference between the predicted and the actual value and squares it; the sum of these squared errors measures how well the model fits the data.
The parameters of the model are then adjusted to minimize this sum of squared errors until there is no further improvement.
Disadvantage:
- It is sensitive to outliers.
- For the usual statistical inference to be valid, the errors (residuals) are assumed to be normally distributed.
Example:
OLS Model using Boston Housing Dataset
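A minimal sketch of such an OLS model, assuming the Boston Housing data can be fetched from OpenML (recent scikit-learn versions no longer bundle load_boston) and that statsmodels is installed; the two features chosen are illustrative:

```python
# Sketch: OLS fit on the Boston Housing data via statsmodels.
# Assumptions: the dataset is available on OpenML; feature choice is illustrative.
import statsmodels.api as sm
from sklearn.datasets import fetch_openml

boston = fetch_openml(name="boston", version=1, as_frame=True)
X = boston.data[["RM", "LSTAT"]].astype(float)  # two illustrative features
y = boston.target.astype(float)                 # median home value (MEDV)

X = sm.add_constant(X)      # adds the intercept term theta1
ols = sm.OLS(y, X).fit()    # minimizes the sum of squared residuals
print(ols.summary())
```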
Polynomial Regression
Most datasets cannot be summarized by a straight line; forcing one results in underfitting. To overcome underfitting, we increase the complexity of the model: instead of representing the data by a straight line, we generate a higher-order equation by adding powers of the original features as new features.
Thus, Y = 𝛉₁ + 𝛉₂X can be transformed into
Y = 𝛉₁ + 𝛉₂X + 𝛉₃X²
Disadvantage:
- The presence of one or two outliers in the data can seriously affect the results of the polynomial fit.
- There are fewer model-validation tools for detecting outliers in polynomial regression than in simple linear regression.
Example:
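A minimal sketch of this idea, using scikit-learn's PolynomialFeatures to add the X² column before an ordinary linear fit; the synthetic data and the degree are assumptions for illustration:

```python
# Sketch: fit Y = theta1 + theta2*X + theta3*X^2 by adding X^2 as a feature.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic quadratic data (illustrative numbers).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = 1 + 2 * X[:, 0] + 0.5 * X[:, 0] ** 2 + rng.normal(0, 0.3, size=100)

# degree=2 appends the squared feature; include_bias=False leaves the
# constant column to LinearRegression's own intercept.
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)
model = LinearRegression().fit(X_poly, y)
print("theta1 (intercept):", model.intercept_)
print("theta2, theta3:", model.coef_)
```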
Gradient Descent Regression
The gradient descent regression algorithm uses gradient descent to fit the model by minimizing the cost function J(𝛉₁, 𝛉₂). We start with some values of 𝛉₁ and 𝛉₂ and repeatedly update each parameter in the direction of steepest descent, 𝛉ⱼ := 𝛉ⱼ − α·∂J/∂𝛉ⱼ (where α is the learning rate), until J(𝛉₁, 𝛉₂) reaches its minimum, i.e. the best fit for the line that passes through the data points.
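A minimal sketch of this update loop in plain NumPy; the learning rate, iteration count, and synthetic data are illustrative assumptions:

```python
# Sketch: minimize J(theta1, theta2) = (1/(2m)) * sum((pred - y)^2)
# by gradient descent on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2 + 3 * x + rng.normal(0, 1, size=100)

theta1, theta2 = 0.0, 0.0   # initial guesses
alpha = 0.01                # learning rate (illustrative choice)

for _ in range(5000):
    pred = theta1 + theta2 * x
    error = pred - y
    # Partial derivatives of J with respect to each parameter.
    grad1 = error.mean()
    grad2 = (error * x).mean()
    theta1 -= alpha * grad1
    theta2 -= alpha * grad2

print("theta1:", theta1, "theta2:", theta2)  # should approach 2 and 3
```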
Disadvantage:
- The learning rate must be chosen carefully: too small and convergence is slow, too large and the updates can overshoot and diverge.
- It is an iterative method, so it may need many passes over the data (and feature scaling) to converge well.
Example:
Gradient Descent Model in R
Decision Tree Regression
Decision tree regression splits the feature space into regions through a series of threshold tests and predicts, for each region, the average of the training targets that fall into it. This lets it capture nonlinear relationships without specifying an equation.
Disadvantage:
- Trees easily overfit the training data if grown deep without pruning or depth limits.
- Small changes in the data can produce a very different tree (high variance).
Example: Decision Tree Regression Model using Boston Housing Dataset
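A minimal sketch of such a model, again assuming the Boston Housing data is available via OpenML; the depth limit and train/test split are illustrative choices:

```python
# Sketch: a decision tree regressor on the Boston Housing data.
from sklearn.datasets import fetch_openml
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

boston = fetch_openml(name="boston", version=1, as_frame=True)
X = boston.data.select_dtypes(include="number")  # keep numeric columns
y = boston.target.astype(float)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

tree = DecisionTreeRegressor(max_depth=4, random_state=0)  # depth is illustrative
tree.fit(X_train, y_train)
print("test MSE:", mean_squared_error(y_test, tree.predict(X_test)))
```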
Applications of Regression:
Predicting stock prices, house prices, sales, etc.