Linear regression is one of the first statistical methods we learn on the journey into machine learning.

The moment we talk about it, we remember the statistics we learned in 11th or 12th grade. Yes, this is that same linear regression, used here to predict variables with continuous values.

When we start using libraries for linear regression, we feel like we have mastered the art of machine learning. But the reality is that we have only just started to learn what machine learning is. It is just a cup of water from the ocean.

Before we get into the implementation of linear regression in machine learning, let’s revise linear regression in statistics.

What is linear regression?

When a model fits a linear equation between two variables (one independent and one dependent) and uses it to predict data, it is called simple linear regression.

The topic of linear regression always reminds us of the equation Y = mX + C, where Y is the dependent variable, X is the independent variable, m is the slope, and C is the constant (the y-intercept). The main goal of linear regression is to fit the two variables to this regression equation.

But this is just the definition of linear regression. For a better understanding, we need a real-life example: we can estimate a person’s shoe size from their height.

Given a set of heights and shoe sizes, we can approximate how shoe size varies with height. To see this clearly, we use a scatter plot, which gives an overview of the data. So, the task is to find a person’s shoe size when their height is given. From the given dataset, we can guess the shoe size, and the model does the same thing: it uses previous examples to estimate the actual shoe size.

And now, we are going to get into the actual concepts of linear regression.

* How to solve a linear regression problem?

* What are the types of linear regression?

* A detailed explanation.

* Linear Regression Model.

Let’s dive into our first question:

How to solve a linear regression problem?


To solve a linear regression problem, you only need a little basic statistics. So let’s start.

  1) Find the linear regression equation of Y on X from the given data, and predict the value of Y when X is 90.


The formula for linear regression is Y = mX + C

→ The first step is to find the mean of X and of Y. The mean value of X is 78 (X’), and the mean value of Y is 77 (Y’). (Note: the mean is the sum of all the numbers divided by the count of the numbers.)

→ The next step is to find the deviations, that is, to subtract the mean (X’ or Y’) from each value of X or Y. Then square the deviations, and also find the product of the deviations of X and Y for each pair.

→ The next step is to find the slope, m.

m = ∑ [ (X − X’) (Y − Y’) ] / ∑ [ (X − X’)² ]

m = 470/730

m ≈ 0.644


→ The final step is to find the y-intercept of the equation.

C = Y’ -mX’

C = 77 -(0.644) (78)

C = 26.76.

Thus, the required regression equation is Y = 26.76 + 0.644X

2) From the equation, we can find the value of Y by substituting the value of X into it.

When X = 90,

Y = 26.76 + 0.644(90)

Y = 84.72

Hence, the value of Y when X = 90 is 84.72.
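
The hand calculation above can also be sketched in a few lines of Python. The data table is not shown in this post, so the five (X, Y) pairs below are an assumption: they were chosen because they reproduce the quoted means (78 and 77) and sums (470 and 730). The function name `fit_line` is just an illustrative choice.

```python
def fit_line(xs, ys):
    """Return (slope m, intercept c) of the least-squares line Y = mX + C."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations of X.
    sum_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_mean) ** 2 for x in xs)
    m = sum_xy / sum_xx
    c = y_mean - m * x_mean
    return m, c


# Assumed data (not from the post): means are 78 and 77, sums are 470 and 730.
xs = [95, 85, 80, 70, 60]
ys = [85, 95, 70, 65, 70]

m, c = fit_line(xs, ys)
y_at_90 = m * 90 + c  # prediction for X = 90, without intermediate rounding
print(f"m = {m:.3f}, c = {c:.3f}, Y(90) = {y_at_90:.2f}")
```

Because the code does not round m before computing C, its last digits can differ slightly from a calculation done by hand with m rounded to 0.644.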

I hope solving linear regression problems feels manageable now. Let’s go to the next question.

What are the types of linear regression?

Linear regression is a method used to predict continuous values. When the prediction uses only one independent variable, it is called SIMPLE LINEAR REGRESSION. If several independent variables are used to predict a particular value, it is called MULTIPLE LINEAR REGRESSION.

1) Simple linear regression examples: Some good examples of simple linear regression are:

  • Predicting soil erosion from the rate of rainfall in a particular region. Here, only one parameter, the rate of precipitation, is used to assess the soil erosion rate.
  • Predicting a person’s shoe size from their height.
  • Predicting the fuel required for a trip from the distance to the destination.

2) Multiple linear regression examples: Some good examples of multiple linear regression are:

  • Purchasing a villa based on price, facilities, and distance to the city.
  • Acquiring a job based on age, field of experience, and language fluency.
  • Calculating the rate of suicides in a country based on age, sex, and generation.
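
To make the difference concrete, here is a minimal sketch of how a prediction is formed in each case. The function name `predict` and the weights in the second call are made up for illustration. Only the first call reuses numbers from the worked example above.

```python
def predict(features, weights, bias):
    """Linear prediction: the weighted sum of the features plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


# Simple linear regression: one feature, one weight (Y = 26.76 + 0.644X).
simple = predict([90], [0.644], 26.76)

# Multiple linear regression: several features, one weight per feature.
# These weights are hypothetical and only illustrate the shape of the model.
multiple = predict([2.0, 3.0], [1.5, -0.5], 4.0)
```

The only structural change is the number of features and weights. The model is still a straight-line (linear) combination of its inputs.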


A detailed explanation

Everything discussed so far is just an introduction to the in-depth explanation.

Linear regression is defined as continuous value prediction. Why is it described this way? Because the values are predicted from the values fed in as input when training the model. The model is given known values of the data, which we can inspect with a scatter plot, and it trains by fitting a line to those points.

The equation for linear regression is Y = MX + C

The term Y is the value to be predicted, M is the weight, X is the value provided for prediction, and C is the bias. But this is not how these terms were explained earlier.

That is because in machine learning, the coefficient is called the weight and the y-intercept is called the bias. The meanings remain the same, but the terminology differs. Our model will try to learn the weight and bias that best fit the line to the data.

So how do we optimize our weights here? This is where the cost function comes in. The cost function helps us optimize our weights.

For linear regression, the cost function we use is MSE (Mean Squared Error). MSE measures the average of the squared differences between the predicted and the actual values. Its output is a single number representing the cost, or score, of the current weights. The main goal of the model is to minimize the MSE, which improves the model’s accuracy.

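Let us calculate the cost function using our linear equation: y = mx + c.

Then MSE is,

MSE = (1/N) ∑ (yᵢ − (mxᵢ + c))², summed over i = 1 to N

where N is the total number of observations, (1/N) ∑ is the mean over all observations, yᵢ is the actual value of an observation, and mxᵢ + c is the predicted value for it.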
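
As a small sketch, assuming the usual form of MSE (the mean of the squared differences between each actual value and the prediction mx + c), the cost can be computed like this. The function name and the toy numbers are illustrative only.

```python
def mean_squared_error(xs, ys, m, c):
    """Average squared difference between actual ys and predictions m*x + c."""
    n = len(xs)
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / n


# Toy data lying exactly on y = 2x, made up for illustration.
xs = [1, 2, 3]
ys = [2, 4, 6]

perfect_cost = mean_squared_error(xs, ys, m=2, c=0)  # line matches the data
worse_cost = mean_squared_error(xs, ys, m=1, c=0)    # line is too flat
```

A line that passes through every point has a cost of zero, and the cost grows as the line moves away from the data.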

For model accuracy, we minimize the value of MSE. But how do we minimize it?

Here we use gradient descent. Gradient descent looks at the error that our current weight gives us and uses the derivative of the cost function to find the gradient (the slope of the cost function at our current weight). It then changes the weight to move in the opposite direction of the gradient, because the gradient points up the slope rather than down it, and so decreases the error.
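
Here is a minimal sketch of that idea in plain Python. It assumes the cost is the MSE of the line y = mx + c and uses the standard partial derivatives of that cost. The function names, the learning rate, the number of steps, and the toy data are all choices made for this illustration.

```python
def gradient_step(xs, ys, m, c, learning_rate):
    """Move m and c one step against the gradient of the MSE."""
    n = len(xs)
    # Partial derivatives of MSE with respect to m and c.
    grad_m = sum(-2 * x * (y - (m * x + c)) for x, y in zip(xs, ys)) / n
    grad_c = sum(-2 * (y - (m * x + c)) for x, y in zip(xs, ys)) / n
    # Step in the opposite direction of the gradient to reduce the error.
    return m - learning_rate * grad_m, c - learning_rate * grad_c


def fit(xs, ys, learning_rate=0.05, steps=2000):
    """Start from m = 0, c = 0 and repeat the gradient step."""
    m, c = 0.0, 0.0
    for _ in range(steps):
        m, c = gradient_step(xs, ys, m, c, learning_rate)
    return m, c


# Toy data generated from y = 2x + 1, made up for illustration.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

m, c = fit(xs, ys)
```

The learning rate controls the step size. If it is too large for the data, the updates overshoot and the error grows instead of shrinking, so it usually needs some tuning.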

Let us calculate the gradient of the cost function for our linear equation.

Since both the weight and the bias affect the prediction, we use partial derivatives, one with respect to each.

Now split the derivative,


Using the chain rule,

Connecting the parts to get the derivative,

We can now calculate the gradient of this cost function as follows:
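
A sketch of these steps, assuming the cost function f(m, c) is the MSE of the line y = mx + c over N observations. The symbols x_i and y_i for individual observations, and A and B for the two parts of the split, are notation introduced here.

```latex
\begin{aligned}
f(m, c) &= \frac{1}{N} \sum_{i=1}^{N} \bigl(y_i - (m x_i + c)\bigr)^2 \\[4pt]
\text{Split:}\quad A(B) &= B^2, \qquad B(m, c) = y_i - (m x_i + c) \\[4pt]
\text{Chain rule:}\quad \frac{dA}{dB} &= 2B, \qquad
\frac{\partial B}{\partial m} = -x_i, \qquad
\frac{\partial B}{\partial c} = -1 \\[4pt]
\frac{\partial f}{\partial m} &= \frac{1}{N} \sum_{i=1}^{N} -2\, x_i \bigl(y_i - (m x_i + c)\bigr) \\[4pt]
\frac{\partial f}{\partial c} &= \frac{1}{N} \sum_{i=1}^{N} -2 \bigl(y_i - (m x_i + c)\bigr)
\end{aligned}
```

The gradient is the pair of these two partial derivatives. Gradient descent subtracts a small multiple of each from the current weight and bias.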

Now we have covered most of the topics related to linear regression. Let us implement a linear regression model to get a clear view of the topic.


Linear Regression Model

Simple linear regression predicts a variable using previously provided data. We have discussed how to solve a linear regression problem in statistics and have seen the types of linear regression. Now we will take on the machine learning part of this topic: how to implement a linear regression model.

The linear regression model here predicts a person’s shoe size from their height, irrespective of gender.

So, to begin, we have to install the libraries required for the model. Here we are using scikit-learn (also called sklearn), NumPy, pandas, and Matplotlib. First, pip install any packages that are not already available.

Next, import all the necessary packages.

After importing the libraries, give inputs to the model. Here I have used the heights of my classmates as X and their shoe sizes as Y.

Next, I use the LinearRegression class from scikit-learn, which fits an ordinary least squares linear model to the training data. To visualize the results, we plot a graph using Matplotlib. The color of the line can be chosen as you like.

Then the model is used for prediction, and its accuracy is checked.
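
The steps so far can be sketched as follows. The heights and shoe sizes below are hypothetical values made up for this illustration, not the class data used for the post. The helper name `predict_shoe_size` is also just an illustrative choice.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data for illustration only: heights in cm and shoe sizes.
# scikit-learn expects X as a 2D array with one row per sample.
heights_cm = np.array([150, 155, 160, 165, 170, 175, 180, 185]).reshape(-1, 1)
shoe_sizes = np.array([36, 37, 38, 39, 41, 42, 43, 45])

# Fit an ordinary least squares line: shoe_size = slope * height + intercept.
model = LinearRegression()
model.fit(heights_cm, shoe_sizes)

slope = model.coef_[0]          # the weight
intercept = model.intercept_    # the bias
# R^2 score on the training data (1.0 would be a perfect fit).
r_squared = model.score(heights_cm, shoe_sizes)


def predict_shoe_size(height_cm):
    """Predict the shoe size for a single height, given in cm."""
    return float(model.predict(np.array([[height_cm]]))[0])


print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, R^2 = {r_squared:.3f}")
print(f"predicted size for 172 cm: {predict_shoe_size(172):.1f}")
```

Note that the score here is measured on the same data the model was trained on. With more data, holding some of it back for testing gives a more honest estimate of accuracy.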

Here is the output of our model. This is a simple linear regression model. Very little data is provided for training here. You can add more data for better accuracy and prediction.

To get a prediction, enter the height for which you would like to predict the shoe size, and the model returns the result.

The graph below shows the best fit line of this simple linear regression model.
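
A plot like this can be sketched with Matplotlib as shown below. The data is hypothetical, and the line is computed with the closed-form least-squares formula from the statistics section rather than taken from the scikit-learn model, so that the example stands on its own. The function names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt

# Hypothetical data for illustration only: heights in cm and shoe sizes.
heights_cm = [150, 155, 160, 165, 170, 175, 180, 185]
shoe_sizes = [36, 37, 38, 39, 41, 42, 43, 45]


def best_fit_line(xs, ys):
    """Closed-form least-squares slope and intercept."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    m = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
    return m, y_mean - m * x_mean


def plot_best_fit(xs, ys, line_color="red"):
    """Scatter the data and draw the fitted line over it."""
    m, c = best_fit_line(xs, ys)
    fig, ax = plt.subplots()
    ax.scatter(xs, ys, label="observed data")
    ends = [min(xs), max(xs)]
    ax.plot(ends, [m * x + c for x in ends], color=line_color, label="best fit line")
    ax.set_xlabel("Height (cm)")
    ax.set_ylabel("Shoe size")
    ax.legend()
    return fig, ax


fig, ax = plot_best_fit(heights_cm, shoe_sizes)
```

To keep the image, call `fig.savefig("best_fit_line.png")`. The `line_color` argument changes the color of the fitted line.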

That’s the end of this topic. A simple blog for beginners to start the journey of Machine Learning.

A Machine Learning aspirant
