How does gradient descent differ from coordinate descent?
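Gradient descent updates every coordinate at once by stepping along the negative gradient, so it needs the derivative of the objective function. Coordinate descent instead minimizes the objective along one coordinate at a time while holding the others fixed. In its basic form, each of these one-dimensional minimizations can be done with a line search, so it does not require knowledge of or computation of the derivative.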
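To make the contrast concrete, here is a minimal sketch that runs both methods on the same problem. The test function `f`, the step size, the search interval `[-10, 10]`, and the choice of golden-section search as the derivative-free line search are all assumptions made for illustration; they are not taken from the answer above.

```python
import math

def f(x, y):
    # Hypothetical convex test function; its minimum is at (8/3, -10/3).
    return (x - 1.0) ** 2 + (y + 2.0) ** 2 + x * y

def grad_f(x, y):
    # Analytic gradient of f. Only gradient descent uses this.
    return (2.0 * (x - 1.0) + y, 2.0 * (y + 2.0) + x)

def gradient_descent(x, y, step=0.1, iters=500):
    # Move all coordinates at once along the negative gradient.
    for _ in range(iters):
        gx, gy = grad_f(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y

def golden_section_min(g, lo, hi, tol=1e-10):
    # Derivative-free 1-D minimization of a unimodal function g on [lo, hi].
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if g(a) < g(b):
            hi = b  # the minimum lies in [lo, b]
        else:
            lo = a  # the minimum lies in [a, hi]
    return (lo + hi) / 2.0

def coordinate_descent(x, y, sweeps=50, lo=-10.0, hi=10.0):
    # Minimize over one coordinate at a time, holding the other fixed.
    # Only function values are used; grad_f is never called.
    for _ in range(sweeps):
        x = golden_section_min(lambda t: f(t, y), lo, hi)
        y = golden_section_min(lambda t: f(x, t), lo, hi)
    return x, y

gd_x, gd_y = gradient_descent(0.0, 0.0)
cd_x, cd_y = coordinate_descent(0.0, 0.0)
print("gradient descent:  ", gd_x, gd_y)
print("coordinate descent:", cd_x, cd_y)
```

Both runs should land near the same minimizer. Many practical coordinate descent variants do use partial derivatives or closed-form coordinate updates; the line-search version here is only the simplest way to show that derivatives are optional.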
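Gradient descent is an iterative optimization algorithm: it repeatedly steps in the direction of the negative gradient to move toward a minimum of the objective function.
It has three common variants, batch gradient descent, stochastic gradient descent, and mini-batch gradient descent, which differ in how much of the data is used to compute each update.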
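The three variants can be sketched with a single training loop in which only the batch size changes. This is an illustrative example under stated assumptions: the noise-free dataset `y = 2x + 1`, the learning rate, the epoch count, and the helper names `gradient_step` and `fit` are all hypothetical.

```python
import random

def gradient_step(w, b, batch, lr):
    # One update of the line y = w*x + b using the mean squared error
    # gradient computed on the given batch only.
    n = len(batch)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in batch) / n
    return w - lr * grad_w, b - lr * grad_b

def fit(data, batch_size, lr=0.1, epochs=500, seed=0):
    # batch_size == len(data): batch gradient descent
    # batch_size == 1:         stochastic gradient descent
    # anything in between:     mini-batch gradient descent
    rng = random.Random(seed)
    data = list(data)  # copy so the caller's list is not shuffled
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            w, b = gradient_step(w, b, data[start:start + batch_size], lr)
    return w, b

# Synthetic, noise-free data on the line y = 2x + 1 (an assumption).
xs = [i / 10.0 for i in range(-10, 11)]
data = [(x, 2.0 * x + 1.0) for x in xs]

batch = fit(data, batch_size=len(data))
stochastic = fit(data, batch_size=1)
mini_batch = fit(data, batch_size=4)
print("batch:     ", batch)
print("stochastic:", stochastic)
print("mini-batch:", mini_batch)
```

Because the data here contains no noise, all three variants should recover roughly `w = 2` and `b = 1`. On noisy data, the stochastic and mini-batch variants would typically fluctuate around the minimum unless the learning rate is decayed.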