
Machine Learning Resources

What is Backpropagation? 



Backpropagation, Deep Learning Training

Backpropagation, short for “backward propagation of errors,” is a fundamental algorithm for training artificial neural networks, and deep learning models in particular. It computes the gradient of the cost function with respect to the parameters of the network (i.e. weights and biases). A gradient descent optimization algorithm then uses those gradients to update the parameters, so that each iteration moves them one step closer to a minimum of the cost function. This reduces the error between the predicted and the actual output, thereby improving the predictive accuracy of the network.

Backpropagation consists of two main steps:

  1. Calculate gradients of the loss with respect to all model parameters
    In the backward pass, the algorithm works backward through the network to compute the gradients of the error with respect to the weights and biases. This is done using the chain rule from calculus, which allows for the calculation of how changes in the networkʼs parameters affect the error.
  2. Update the network weights using an optimization algorithm
    The computed gradients are used to adjust the weights and biases in the network, aiming to reduce the error (or loss). This is typically done using optimization algorithms like stochastic gradient descent (SGD) or its variants, such as Adam or RMSprop.
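
The two steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the 3 → 5 → 1 network shape, the tanh activation, the mean squared error loss, the learning rate, and the random toy data are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (made up for illustration): 4 samples, 3 features, 1 target.
X = rng.normal(size=(4, 3))
y = rng.normal(size=(4, 1))

# Parameters of an assumed 3 -> 5 -> 1 network.
W1 = rng.normal(scale=0.5, size=(3, 5))
b1 = np.zeros((1, 5))
W2 = rng.normal(scale=0.5, size=(5, 1))
b2 = np.zeros((1, 1))


def forward(X, W1, b1, W2, b2):
    """Forward pass: returns the hidden activations and the prediction."""
    a1 = np.tanh(X @ W1 + b1)
    y_hat = a1 @ W2 + b2
    return a1, y_hat


def mse(y_hat, y):
    """Mean squared error over all samples."""
    return float(np.mean((y_hat - y) ** 2))


# Forward pass with the current parameters.
a1, y_hat = forward(X, W1, b1, W2, b2)
loss_before = mse(y_hat, y)

# Step 1: backward pass. Apply the chain rule from the output layer
# back toward the input layer.
n = X.shape[0]
d_yhat = 2.0 * (y_hat - y) / n           # dL/dy_hat for the MSE loss
dW2 = a1.T @ d_yhat                      # dL/dW2
db2 = d_yhat.sum(axis=0, keepdims=True)  # dL/db2
d_a1 = d_yhat @ W2.T                     # error propagated to the hidden layer
d_z1 = d_a1 * (1.0 - a1 ** 2)            # tanh'(z) = 1 - tanh(z)^2
dW1 = X.T @ d_z1                         # dL/dW1
db1 = d_z1.sum(axis=0, keepdims=True)    # dL/db1

# Step 2: parameter update with plain gradient descent (assumed learning rate).
lr = 0.01
W1 -= lr * dW1
b1 -= lr * db1
W2 -= lr * dW2
b2 -= lr * db2

# A new forward pass with the updated parameters gives the new loss.
_, y_hat_new = forward(X, W1, b1, W2, b2)
loss_after = mse(y_hat_new, y)
print(f"loss before update: {loss_before:.4f}, after update: {loss_after:.4f}")
```

Optimizers such as Adam or RMSprop would replace only the update in step 2; the gradients from step 1 are computed the same way.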

Backpropagation in a Neural Network (Source: Research)

This procedure is called backpropagation because the gradient computation starts at the output layer and uses the chain rule to calculate derivatives layer by layer as it works its way backward to the input layer.
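
As a worked illustration of this backward flow, consider a network with one hidden layer and a single training example. The notation here is an assumption made for the example and is not taken from the figure: input $x$, hidden pre-activation $z_1$, activation function $\sigma$, hidden activation $a_1$, scalar prediction $\hat{y}$, and squared-error loss $L$.

```latex
\begin{align*}
\text{Forward pass:}\quad
  & z_1 = W_1 x + b_1, \qquad a_1 = \sigma(z_1), \qquad
    \hat{y} = W_2 a_1 + b_2, \qquad L = (\hat{y} - y)^2 \\[4pt]
\text{Output layer:}\quad
  & \delta_2 = \frac{\partial L}{\partial \hat{y}} = 2(\hat{y} - y), \qquad
    \frac{\partial L}{\partial W_2} = \delta_2 \, a_1^{\top}, \qquad
    \frac{\partial L}{\partial b_2} = \delta_2 \\[4pt]
\text{Hidden layer:}\quad
  & \delta_1 = \left( W_2^{\top} \delta_2 \right) \odot \sigma'(z_1), \qquad
    \frac{\partial L}{\partial W_1} = \delta_1 \, x^{\top}, \qquad
    \frac{\partial L}{\partial b_1} = \delta_1
\end{align*}
```

The hidden-layer error $\delta_1$ reuses the output-layer error $\delta_2$, which is why the derivatives are computed from the output backward: each layer's gradient depends on the one after it.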

After each backpropagation step and parameter update, another forward pass occurs, during which the input data is passed through the network using the updated weights and biases. If gradient descent is working correctly, the value of the cost function should generally be lower than it was before the update. With mini-batch methods such as SGD, the loss can fluctuate from one iteration to the next, but it should trend downward over training.
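
The alternation between forward pass, backward pass, and update can be sketched as a small training loop. This is a hedged example under stated assumptions: the 1 → 8 → 1 tanh network, the synthetic sine-shaped data, the learning rate, and the helper names `forward` and `backward` are all hypothetical choices for illustration, and the loop uses full-batch gradient descent rather than mini-batch SGD.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic regression data (made up): a noisy nonlinear function of x.
X = rng.uniform(-1.0, 1.0, size=(32, 1))
y = np.sin(3.0 * X) + 0.05 * rng.normal(size=(32, 1))

# Assumed 1 -> 8 -> 1 network with a tanh hidden layer.
params = {
    "W1": rng.normal(scale=0.5, size=(1, 8)),
    "b1": np.zeros((1, 8)),
    "W2": rng.normal(scale=0.5, size=(8, 1)),
    "b2": np.zeros((1, 1)),
}


def forward(p, X):
    """Forward pass: returns hidden activations and predictions."""
    a1 = np.tanh(X @ p["W1"] + p["b1"])
    return a1, a1 @ p["W2"] + p["b2"]


def backward(p, X, y, a1, y_hat):
    """Backward pass: gradients of the MSE loss for every parameter."""
    d_yhat = 2.0 * (y_hat - y) / X.shape[0]
    d_z1 = (d_yhat @ p["W2"].T) * (1.0 - a1 ** 2)
    return {
        "W1": X.T @ d_z1,
        "b1": d_z1.sum(axis=0, keepdims=True),
        "W2": a1.T @ d_yhat,
        "b2": d_yhat.sum(axis=0, keepdims=True),
    }


lr = 0.02
losses = []
for step in range(300):
    a1, y_hat = forward(params, X)                   # forward pass
    losses.append(float(np.mean((y_hat - y) ** 2)))  # record the current loss
    grads = backward(params, X, y, a1, y_hat)        # backpropagation
    for name in params:                              # gradient descent update
        params[name] -= lr * grads[name]

print(f"loss at first step: {losses[0]:.4f}, at last step: {losses[-1]:.4f}")
```

Plotting `losses` against the step number is a common sanity check: a curve that rises or oscillates wildly usually points to a learning rate that is too large or a bug in the gradient computation.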

For a complete picture of neural network training, please refer to this article: Describe briefly the training process of a Neural Network model

Video Explanation

  • The 3-pack Backpropagation video series by Deep Lizard explains the intuition and mathematics behind backpropagation succinctly (Total Runtime: 33 mins)
  • The 3-pack Backpropagation video series by 3Blue1Brown covers backpropagation in more detail, starting from gradient descent and moving on to the mechanics of backprop and the math behind it (Total Runtime: 43 mins)
  • If you want a hands-on understanding of Backpropagation, check out the video by Andrej Karpathy, former AI Director of Tesla. In this video, Andrej explains the basics of Backpropagation by working through an example in a Jupyter notebook. You’ll never forget Backprop once you do this. (Total Runtime: 2.5 hrs)
