
Machine Learning Resources

What are the options for reporting feature importance from a decision-tree based model?


For any decision-tree-based method, feature importance can be measured in a couple of ways. The most common approach is based on how much each feature contributes to the construction of the individual trees during training: features that produce the largest reduction in impurity tend to be chosen for the top split points of a tree. The numeric measure of purity/impurity depends on the loss function (for example, Gini impurity or entropy for classification, variance reduction for regression), but the same general intuition holds in both settings. An overall importance score for each feature is obtained by averaging its importance across all trees in the ensemble.
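As a sketch of the first approach, the snippet below fits a gradient-boosted model on a small synthetic dataset and reads off the impurity-based importances. It assumes scikit-learn; the dataset and all parameter values are illustrative, not from the original question.

```python
# Sketch: impurity-based feature importances from a fitted GBM (scikit-learn).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic data: 5 features, of which only 2 are actually informative.
X, y = make_classification(n_samples=500, n_features=5, n_informative=2,
                           n_redundant=0, random_state=0)

model = GradientBoostingClassifier(n_estimators=50, random_state=0)
model.fit(X, y)

# feature_importances_ averages each feature's impurity reduction across
# all trees in the ensemble; the values are normalized to sum to 1.
for i, imp in enumerate(model.feature_importances_):
    print(f"feature {i}: {imp:.3f}")
```

Because the scores sum to 1, they are best read as relative rankings rather than absolute effect sizes.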

Feature importances can be extracted from a fitted GBM in most software packages; in Python's scikit-learn, for example, they are exposed through the feature_importances_ attribute. An alternative is the permutation-based approach: after the model is fit, the values of each feature are randomly shuffled in turn, and the most influential features are those whose shuffled values cause the largest drop in model performance. Because the permutation method is model-agnostic, it is a useful tool for interpreting otherwise black-box machine learning models.
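The permutation idea can be written out in a few lines, which makes its model-agnostic nature clear: nothing below depends on the model being tree-based. This is a from-scratch sketch on assumed synthetic data; scikit-learn also ships a ready-made version as sklearn.inspection.permutation_importance, and in practice the drop should be measured on held-out data rather than the training set used here.

```python
# Sketch: permutation importance, written from scratch to show the idea.
# Model-agnostic: works with any fitted model that exposes .predict.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=500, n_features=5, n_informative=2,
                           n_redundant=0, random_state=0)
model = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)

rng = np.random.default_rng(0)
baseline = accuracy_score(y, model.predict(X))

importances = []
for j in range(X.shape[1]):
    X_perm = X.copy()
    rng.shuffle(X_perm[:, j])   # break this feature's link to the target
    drop = baseline - accuracy_score(y, model.predict(X_perm))
    importances.append(drop)    # bigger performance drop = more important

for j, drop in enumerate(importances):
    print(f"feature {j}: accuracy drop {drop:.3f}")
```

Repeating the shuffle several times per feature and averaging the drops gives a more stable estimate, which is what the scikit-learn implementation does.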

