What is Backpropagation? 

Backpropagation (short for backward propagation of errors) computes how the cost function changes with respect to every parameter of a neural network (all of the weight and bias terms). A gradient descent optimization algorithm then uses these gradients to update the parameters so that, on each iteration, they move one step closer to a minimum of the cost function. Concretely, backpropagation consists of taking the derivatives of the cost function with respect to all of the weights and biases. It is called backpropagation because the computation starts at the output layer and uses the chain rule to reuse each layer's derivatives as it works its way back to the input layer. After each backpropagation step and parameter update, another step of forward propagation occurs, in which the input data is fed forward through the network using the updated weights and biases. If gradient descent is working properly (in particular, if the learning rate is small enough), the value of the cost function should be lower than it was before the update took place.
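The loop described above (forward pass, backward pass, parameter update, repeat) can be sketched in a few lines of NumPy. This is only a minimal illustration under assumptions that are not in the text: a single hidden layer with sigmoid activations, a linear output, a mean squared error cost, and XOR as toy data. The function names, the hidden layer size, the learning rate, and the step count are all hypothetical choices made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, X):
    """Forward propagation: feed the inputs through the network."""
    W1, b1, W2, b2 = params
    Z1 = X @ W1 + b1          # hidden pre-activations, shape (n, hidden)
    A1 = sigmoid(Z1)          # hidden activations
    Y_hat = A1 @ W2 + b2      # linear output layer, shape (n, 1)
    return Z1, A1, Y_hat

def cost(Y_hat, Y):
    """Assumed cost: half the mean squared error."""
    return 0.5 * np.mean((Y_hat - Y) ** 2)

def backward(params, X, Y, cache):
    """Backpropagation: chain rule from the output layer back to the input."""
    W1, b1, W2, b2 = params
    Z1, A1, Y_hat = cache
    n = X.shape[0]
    dY_hat = (Y_hat - Y) / n            # dC/dY_hat at the output layer
    dW2 = A1.T @ dY_hat                 # dC/dW2
    db2 = dY_hat.sum(axis=0)            # dC/db2
    dA1 = dY_hat @ W2.T                 # pass the error back to the hidden layer
    dZ1 = dA1 * A1 * (1.0 - A1)         # sigmoid'(Z1) = A1 * (1 - A1)
    dW1 = X.T @ dZ1                     # dC/dW1
    db1 = dZ1.sum(axis=0)               # dC/db1
    return dW1, db1, dW2, db2

def train(X, Y, hidden=4, lr=0.1, steps=500, seed=0):
    """Plain full-batch gradient descent; returns final params and cost history."""
    rng = np.random.default_rng(seed)
    params = [
        rng.normal(size=(X.shape[1], hidden)), np.zeros(hidden),
        rng.normal(size=(hidden, 1)), np.zeros(1),
    ]
    costs = []
    for _ in range(steps):
        cache = forward(params, X)                  # forward propagation
        costs.append(cost(cache[2], Y))             # cost before this update
        grads = backward(params, X, Y, cache)       # backpropagation
        params = [p - lr * g for p, g in zip(params, grads)]  # descent step
    return params, costs

# Toy data (XOR), chosen only for illustration.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
Y = np.array([[0.0], [1.0], [1.0], [0.0]])
params, costs = train(X, Y)
print(f"cost before training: {costs[0]:.4f}, after: {costs[-1]:.4f}")
```

Note that `backward` only computes gradients; the update itself is the single line in `train` that subtracts `lr * g` from each parameter. Keeping the two apart mirrors the distinction between backpropagation (computing derivatives) and gradient descent (using them).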