Linear regression with python for machine learning | Verzeo

How to implement linear regression in machine learning using python

This is a question nearly every candidate entering the field gets asked, and the answers given vary widely. This article walks you through a clear, practical answer.


BY Kaushik

14th August 2020


Machine Learning can be defined as the set of techniques that enable a standalone machine or system to understand different computing processes and perform them on its own, without human intervention. That does not mean it requires no human sleight of hand at all: the system does not gain this capability by itself. It must be fed data repeatedly, and that data must be processed through complex, specialised procedures called algorithms, which allow the system to learn from these repetitions. Hence the name Machine Learning.

In this article I will explain linear regression using Python. But before I go ahead and do that, I'll briefly go over classification and regression.

Techniques in Machine Learning

Machine Learning majorly focuses on two methods, namely, Regression and Classification.

Classification

Classification is defined as the technique where the system learns about the data by “classifying”, or categorising, it into different sections. It is a supervised learning approach in Machine Learning. Because it categorises, it works mainly on discrete variables (those without a continuous probability distribution). An example of classification is predicting whether a tumour is present or not.

There are a few types of classification methods, namely - Naive Bayes Classifier, Nearest Neighbor, Support Vector Machines, Decision Trees, Boosted Trees, Random Forest, Neural Networks.
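To make the idea of categorising into discrete classes concrete, here is a toy sketch of one of the listed methods, nearest neighbour, on made-up one-dimensional data (the feature values and 0/1 labels are purely illustrative):

```python
# toy 1-nearest-neighbour classifier on made-up one-dimensional data
points = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]  # (feature value, class label)

def classify(x):
    # predict the label of the closest known point
    nearest = min(points, key=lambda p: abs(p[0] - x))
    return nearest[1]

print(classify(1.5))  # 0 -- closer to the class-0 cluster
print(classify(8.5))  # 1 -- closer to the class-1 cluster
```

Note that the output is a discrete category (0 or 1), never a value in between; this is exactly the discrete-variable behaviour described above.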

Regression

Regression is defined as the technique used for analysing and learning from continuous variables. Regression does not categorise like classification does; rather, it works by modelling the relationship between the independent input variables and the dependent output variables. Like classification, regression is a supervised Machine Learning technique. Examples include variation in housing prices, probability of winning or losing, and sales forecasting.

The main types of regression methods are - Linear Regression, Polynomial Regression, Ridge Regression, Stepwise Regression, Logistic Regression etc.

Now that you have understood what classification and regression are, I will focus on regression and its techniques. Primarily you will be learning about linear regression and its types.

Linear Regression in python

Linear Regression is the simplest and most commonly used regression technique in Machine Learning. Linear regression checks for the relationship between input and output variables, and when the relation is plotted on a graph it is denoted by a straight inclined line.

The straight line indicates that the independent and dependent variables have a linear relationship, where x is the independent input variable and y is the dependent output variable.

The relationship between x and y is defined by the following equation:-

y=mx+c

Here, m and c are numerical constants: “m” is the slope of the line and “c” is the y-intercept.

[Image: standard linear regression graph]

In regression notation, the same relationship is more commonly written as:-

y = b0 + b1x + c

where b0 is the y-intercept, b1 is the slope, x is the independent variable, y is the dependent variable, and c is the unwanted error (“noise”) term in the regression process.
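The coefficients b0 and b1 can be computed in closed form from the data. As a tiny worked example (the three data points are made up and chosen so the arithmetic is easy; they lie exactly on y = 2x):

```python
# worked example of the least-squares formulas for y = b0 + b1*x
# data chosen to lie exactly on y = 2x, so we expect b0 = 0 and b1 = 2
xs = [1, 2, 3]
ys = [2, 4, 6]
n = len(xs)

m_x = sum(xs) / n                                   # mean of x
m_y = sum(ys) / n                                   # mean of y
SS_xy = sum(x*y for x, y in zip(xs, ys)) - n*m_x*m_y  # cross-deviation
SS_xx = sum(x*x for x in xs) - n*m_x*m_x              # deviation about x

b1 = SS_xy / SS_xx   # slope
b0 = m_y - b1*m_x    # intercept

print(b0, b1)  # 0.0 2.0
```

These are the same formulas the full implementation later in this article uses with NumPy.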

Steps involved in Simple Linear Regression

Now that we have understood the basic idea behind Simple Linear Regression and learnt its equation, we come to the main part of the process: the steps that make up the algorithm. Note that we use Python, which is well equipped with the requisite packages. The steps involved are:-

  • Import required packages and library functions (mainly used is scikit-learn)
  • Import the dataset
  • Create the regression model to fit in the data
  • Evaluate the model based on the relation between inputs and concerned outputs
  • Obtain predicted response
  • Assess the accuracy of the evaluation using training and testing data
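As a minimal sketch of the steps above using scikit-learn (the library named in the first step), with made-up illustrative data in place of an imported dataset:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# illustrative data standing in for an imported dataset
x = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)  # independent variable (2-D, as scikit-learn expects)
y = np.array([2, 4, 5, 4, 5])                 # dependent variable

# create the regression model and fit it to the data
model = LinearRegression()
model.fit(x, y)

# obtain predicted responses and assess the fit
print("intercept (b0):", model.intercept_)
print("slope (b1):", model.coef_[0])
print("R^2 score:", model.score(x, y))
```

In a real project, the data would come from a dataset and the accuracy would be assessed on a held-out test split rather than on the training data itself.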

import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)

    # mean of x and y vectors
    m_x, m_y = np.mean(x), np.mean(y)

    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x

    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x

    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # plotting the actual points as a scatter plot
    plt.scatter(x, y, color="m", marker="o", s=30)

    # predicted response vector
    y_pred = b[0] + b[1]*x

    # plotting the regression line
    plt.plot(x, y_pred, color="g")

    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')

    # function to show plot
    plt.show()

def main():
    # observations
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])

    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))

    # plotting regression line
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()

Plotting the graphs

Once we are done with our programming and have obtained the results, we move on to plotting them graphically. For this we use the Python plotting library “Matplotlib”.

Once matplotlib is imported, we pass our data values to the plotting function and display the final output.

The graph of the linear regression equation can be expressed as the following:-

[Image: linear regression plot produced by the code above]

Other forms of Regression (Multiple and Polynomial Regression)

Apart from Linear Regression there are other forms of Regression techniques as well called Multiple Linear Regression and Polynomial Regression.

Multiple linear regression works similarly to simple linear regression, but is used to assess the relation between a single dependent variable and multiple independent variables, whereas simple linear regression assesses the relation between a single independent variable and a single dependent variable.

The equation for Multiple Linear Regression is as follows:-

y = b0 + b1x1 + b2x2 + b3x3 + ... + bnxn
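The equation above can be fitted in one shot as a least-squares problem. Here is an illustrative sketch with two independent variables, using NumPy's least-squares solver (the data values and the true coefficients 1, 2, 3 are made up for demonstration):

```python
import numpy as np

# made-up data: y depends on two independent variables x1 and x2
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
y = 1.0 + 2.0*x1 + 3.0*x2   # exact relation: y = b0 + b1*x1 + b2*x2

# design matrix with a leading column of ones for the intercept b0
X = np.column_stack([np.ones_like(x1), x1, x2])

# solve for (b0, b1, b2) by least squares
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coeffs)  # recovers approximately [1., 2., 3.]
```

The same pattern extends to any number of independent variables: each one adds a column to the design matrix and a coefficient to the solution.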

Polynomial Regression is the technique where we consider multiple powers of one variable: the relationship between the independent and dependent variables is modelled as a polynomial equation. It is particularly useful for representing a non-linear relationship between the two.

Conclusion

Having gone through this blog, we have covered the following topics:-

  • What is Regression and Classification?
  • The types and techniques involved in Regression analysis
  • Plotting the results
  • Multiple and polynomial regression

So that covers the basics of Regression and Linear Regression techniques in Python programming language. We hope you have got a rough idea of the same.

If this blog has piqued your interest in knowing more about Machine Learning, you can check out Verzeo’s Machine Learning Pro-Degree Certification Program or Internship program. These programs can help you excel in the field of Machine Learning and build a successful career, with internationally acclaimed certifications and a job in hand when you finish.