XGBoost feature importance calculation
XGBoost provides three ways to compute feature importance:
‘weight’ - the number of times a feature is used to split the data across all trees.
‘gain’ - the average gain of the feature when it is used in trees.
‘cover’ - the average coverage of the feature when it is used in trees.
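Put simply:
weight is the total number of nodes, across all trees, at which the feature is used to split;
gain is the average gain over the splits that use the feature.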
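To make the definitions above concrete, here is a minimal sketch that computes the three scores from a list of split records. The records, the feature names (f0, f1, f2), and the helper feature_importance are all hypothetical and the numbers are made up; this is not XGBoost's internal code. Each split's gain and cover value is simply taken as given.

```python
from collections import defaultdict

# Hypothetical split records, one per internal node across all trees:
# (feature used for the split, gain of the split, cover of the split).
# The numbers are made up for illustration only.
splits = [
    ("f0", 4.0, 10.0),
    ("f1", 1.0, 6.0),
    ("f0", 2.0, 4.0),
    ("f2", 3.0, 8.0),
]


def feature_importance(splits):
    """Return {feature: {'weight': ..., 'gain': ..., 'cover': ...}}."""
    count = defaultdict(int)        # number of splits that use the feature
    gain_sum = defaultdict(float)   # total gain over those splits
    cover_sum = defaultdict(float)  # total cover over those splits
    for feature, gain, cover in splits:
        count[feature] += 1
        gain_sum[feature] += gain
        cover_sum[feature] += cover
    return {
        f: {
            "weight": count[f],                 # how many times it splits
            "gain": gain_sum[f] / count[f],     # average gain per split
            "cover": cover_sum[f] / count[f],   # average cover per split
        }
        for f in count
    }


importance = feature_importance(splits)
for name, scores in sorted(importance.items()):
    print(name, scores)
```

In this toy data f0 splits twice, so its weight is 2 while its gain and cover are averages over those two splits. In the XGBoost Python package the same three names are accepted as the importance_type argument of Booster.get_score.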
The definition of cover is somewhat obscure; a more detailed explanation can be found in [R-package/man/xgb.plot.tree.Rd].
Returning to the example from Li Hang's book, we draw the figure below, using a different color for each feature.