What is Gradient Boosting (GBM)?

GBM is another ensemble-based supervised machine learning algorithm, suitable for both regression and classification problems. The algorithm gets its name from boosting, an iterative process that combines many shallow decision trees, known as weak learners, into a single strong predictor. While bagging methods like Random Forest primarily seek to reduce prediction variance, boosting also reduces bias, which in theory yields a highly accurate model that generalizes well to unseen data.
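To make this concrete, here is a minimal sketch using scikit-learn's GradientBoostingRegressor on synthetic data; the dataset and hyperparameter values are illustrative assumptions, not prescriptions.

```python
# Minimal GBM example with scikit-learn; data and settings are illustrative.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

gbm = GradientBoostingRegressor(
    n_estimators=500,    # number of sequential trees
    learning_rate=0.01,  # shrinkage applied to each tree's contribution
    max_depth=1,         # stumps as weak learners
    random_state=42,
)
gbm.fit(X_train, y_train)
print("Test MSE:", mean_squared_error(y_test, gbm.predict(X_test)))
```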

In GBM, trees are built sequentially, and the final model is a weighted sum of all of the individual trees. Instead of fitting each tree to the outcome directly, the i-th tree is trained on the residuals of the model built from the first i-1 trees. For squared-error loss, these residuals are the negative gradient of the loss function, which is where the "gradient" in the name comes from. This allows the algorithm to learn from its errors over time, and the process continues in this sequential manner until a stopping criterion is reached.

Furthermore, the trees used for boosting are small, and in many cases are just stumps (depth = 1). Each tree is added to the model scaled by a learning rate parameter, which is generally set to a small value, often in the range 0.001 to 0.1. The small trees, along with a low learning rate, allow the model to learn slowly and avoid overfitting, leading to a better-performing model. Finally, in GBM there is no bootstrapping (or resampling) of the training data while building individual trees, since the goal is for each tree to fit the residuals computed over all the available data.
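The residual-fitting loop described above can be sketched from scratch. The following is a simplified illustration for squared-error regression, assuming scikit-learn's DecisionTreeRegressor as the weak learner; the function names and default values are hypothetical choices for this sketch, not a reference implementation.

```python
# From-scratch sketch of the sequential residual-fitting loop, simplified
# for squared-error loss (where residuals equal the negative gradient).
# Names and defaults are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gbm(X, y, n_trees=500, learning_rate=0.01, max_depth=1):
    base = y.mean()                        # initial constant prediction
    prediction = np.full(len(y), base)
    trees = []
    for _ in range(n_trees):
        residual = y - prediction          # errors of the current ensemble
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residual)              # fit on the full data (no bootstrapping)
        prediction += learning_rate * tree.predict(X)  # shrunken update
        trees.append(tree)
    return base, trees

def predict_gbm(X, base, trees, learning_rate=0.01):
    pred = np.full(X.shape[0], base)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred
```

Note that every tree in the loop sees the entire training set, in line with the no-resampling point above, and each tree's contribution is shrunk by the learning rate before being added to the ensemble.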