How does gradient descent differ from coordinate descent?

Gradient descent updates all variables at once, stepping in the direction of the negative gradient (the direction of steepest descent), so it needs the full gradient of the objective function at every iteration. Coordinate descent instead minimizes the function along one coordinate axis at a time, holding the other variables fixed, and cycles through the coordinates until it converges. In its derivative-free form it needs no derivative at all, only function evaluations along each axis; other variants use a single partial derivative per step rather than the whole gradient. Gradient descent can be pictured as a ball rolling down a hill until it reaches the bottom, while coordinate descent is more like someone crossing a city whose streets form a grid, moving along only one street direction at a time. Coordinate descent is often preferred when computing the function’s full gradient is time consuming, or when each one-dimensional subproblem is cheap to solve.
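To make the contrast concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of the answer above: the test objective f(x, y) = x^2 + y^2 + xy - 3x, the step size, the iteration counts, the search interval [-10, 10], and the choice of golden-section search as the one-dimensional minimizer. The gradient descent routine calls the gradient on every step, while the coordinate descent routine only evaluates f itself.

```python
import math


def f(x, y):
    # Hypothetical convex test objective; its minimizer is (2, -1).
    return x * x + y * y + x * y - 3.0 * x


def grad_f(x, y):
    # Analytic gradient of f; only gradient descent uses it.
    return (2.0 * x + y - 3.0, 2.0 * y + x)


def gradient_descent(x, y, step=0.1, iters=500):
    # Move both coordinates at once along the negative gradient.
    for _ in range(iters):
        gx, gy = grad_f(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y


def golden_section_min(g, lo, hi, tol=1e-8):
    # Derivative-free 1-D minimizer for a unimodal function g on [lo, hi].
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if g(a) < g(b):
            hi = b  # the minimum lies in [lo, b]
        else:
            lo = a  # the minimum lies in [a, hi]
    return (lo + hi) / 2.0


def coordinate_descent(x, y, sweeps=50, lo=-10.0, hi=10.0):
    # Minimize along one axis at a time, holding the other variable fixed.
    # Assumes each 1-D minimizer lies inside [lo, hi].
    for _ in range(sweeps):
        x = golden_section_min(lambda t: f(t, y), lo, hi)
        y = golden_section_min(lambda t: f(x, t), lo, hi)
    return x, y
```

Starting both routines from the same point, for example `gradient_descent(0.0, 0.0)` and `coordinate_descent(0.0, 0.0)`, they should approach the same minimizer (2, -1) by different paths: gradient descent along a diagonal route, coordinate descent in axis-aligned moves like the grid of city streets.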