### What does Gradient in Gradient Boosted Trees refer to?

The “gradient” in Gradient Boosting Machine is a reference to the concept of gradient descent: at each boosting round, the new tree is fit to the negative gradient of the loss function with respect to the current model’s predictions, so adding the tree acts like a small gradient-descent step in function space.
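A minimal sketch of this idea in pure Python, assuming squared-error loss and decision stumps as the weak learners (the helper names `fit_stump` and `gbm_fit` are invented for this example): for L(y, F) = (y − F)²/2, the negative gradient is simply the residual y − F, so each round fits a stump to the residuals.

```python
# Illustrative sketch: gradient boosting for squared-error loss.
# With L(y, F) = (y - F)^2 / 2, the negative gradient -dL/dF is the
# residual y - F, so each round fits a weak learner to the residuals
# and takes a small "gradient descent" step in function space.

def fit_stump(x, r):
    """Fit the best single-split stump (threshold + two leaf means) to r."""
    best = None
    for t in sorted(set(x)):
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((ri - lm) ** 2 for ri in left)
               + sum((ri - rm) ** 2 for ri in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm

def gbm_fit(x, y, n_trees=50, learning_rate=0.1):
    f0 = sum(y) / len(y)                                  # initial prediction
    pred, stumps = [f0] * len(y), []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]  # negative gradient
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for pi, xi in zip(pred, x)]
    return lambda xi: f0 + learning_rate * sum(s(xi) for s in stumps)

x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9]  # a noisy step function
model = gbm_fit(x, y)
```

After 50 small steps the ensemble closely tracks both levels of the step, which is the shrinkage-based descent a real GBM performs with full regression trees.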

XGBoost is a modern, optimized implementation of Gradient Boosting Machine; it follows the same boosting procedure as a standard GBM but adds regularization and engineering optimizations for speed.

AdaBoost (Adaptive Boosting) is a simple boosting technique that predates modern algorithms like GBM and its offshoots; it adapts by re-weighting the training examples so that each successive weak learner concentrates on the examples its predecessors misclassified.

A weak learner refers to a prediction mechanism whose results are only slightly more predictive than those of a random-chance model.
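For example, a single decision stump, one if/else test on one feature, is the classic weak learner; on the made-up data below it reaches 75% accuracy, only modestly above the 50% baseline of random guessing.

```python
# A decision stump (one if/else on a single feature) is the classic
# weak learner. On this made-up data it reaches 75% accuracy, only
# modestly above the 50% baseline of random guessing.

data = [(1, 0), (2, 0), (3, 1), (4, 0), (5, 1), (6, 1), (7, 0), (8, 1)]

def stump(x, threshold=4):
    return 1 if x > threshold else 0

accuracy = sum(stump(x) == y for x, y in data) / len(data)  # 0.75
```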

For any decision-tree based method, feature importance can be measured in a couple of ways: by the total reduction in impurity (e.g., Gini or squared error) attributable to each feature's splits, or by permutation importance, i.e., how much predictive performance drops when that feature's values are randomly shuffled.
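A sketch of the permutation approach in pure Python, with a hand-written rule standing in for a fitted tree (the names `model` and `permuted_drop`, and the toy data, are invented for this example):

```python
# Sketch of permutation importance: shuffle one feature's column and
# measure how much accuracy drops. A hand-written rule stands in for
# a fitted tree; only feature 0 is informative by construction.
import random

random.seed(1)
# rows: (feature_0, feature_1, label); feature_1 is pure noise
rows = [(x, random.random(), int(x > 5)) for x in range(1, 11)]

def model(f0, f1):
    return int(f0 > 5)  # a "tree" that only ever splits on feature 0

def accuracy(data):
    return sum(model(f0, f1) == y for f0, f1, y in data) / len(data)

base = accuracy(rows)  # 1.0 by construction

def permuted_drop(col):
    """Accuracy lost when column `col` is randomly shuffled."""
    values = [r[col] for r in rows]
    random.shuffle(values)
    shuffled = [tuple(values[k] if i == col else v for i, v in enumerate(r))
                for k, r in enumerate(rows)]
    return base - accuracy(shuffled)
```

Shuffling the noise feature leaves accuracy unchanged, while shuffling the informative one degrades it; the size of that drop is the feature's importance score.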

Tuning the combination of the number of trees and the learning rate is a good way to ensure you are creating a model with appropriate complexity; the two interact, since a smaller learning rate generally requires more trees to reach the same performance.

On structured (tabular) datasets, a well-tuned GBM typically outperforms a Random Forest.

In a Random Forest, by contrast, decision trees are constructed independently of one another, and the results are aggregated after all trees are created, through averaging for regression or majority vote for classification.
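A minimal sketch of that aggregation, using stump "trees" trained independently on bootstrap samples (a real Random Forest also subsamples features at each split, which this omits; the data and helper names are made up for illustration):

```python
# Sketch of Random Forest aggregation: stump "trees" are trained
# independently on bootstrap samples, then combined by majority vote.
# (A real Random Forest also subsamples features at each split.)
import random
from collections import Counter

random.seed(0)
data = [(x, 0) for x in range(1, 6)] + [(x, 1) for x in range(6, 11)]

def train_stump(sample):
    """Pick the threshold that best separates the classes in the sample."""
    best_t, best_acc = None, -1.0
    for t in range(1, 11):
        acc = sum((x > t) == (y == 1) for x, y in sample) / len(sample)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Each tree sees its own bootstrap sample and is built independently.
thresholds = [train_stump([random.choice(data) for _ in data])
              for _ in range(25)]

def forest_predict(x):
    votes = Counter(int(x > t) for t in thresholds)  # one vote per tree
    return votes.most_common(1)[0][0]                # majority class
```

Because each tree trains on its own resample, the trees could in principle be built in parallel, unlike boosting, where each tree depends on the predictions of those before it.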

Advantages: High accuracy

Disadvantages: Requires some computing power and time spent in parameter tuning

Key hyper-parameters for a GBM are the number of trees, the learning rate, and the maximum tree depth.
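As a sketch, these map directly onto parameter names in common implementations; the dictionary below assumes scikit-learn's `GradientBoostingClassifier` naming, and the values shown are illustrative starting points, not recommendations.

```python
# Illustrative GBM hyper-parameter settings, using scikit-learn's
# GradientBoostingClassifier parameter names (values are examples only).
params = {
    "n_estimators": 300,    # number of trees in the ensemble
    "learning_rate": 0.05,  # shrinkage applied to each tree's contribution
    "max_depth": 3,         # maximum depth of each weak learner
}
```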
