XGBoost Feature Importance Calculation

阿新 • Published: 2018-11-13


XGBoost provides three ways to compute feature importance:

‘weight’ - the number of times a feature is used to split the data across all trees.
‘gain’ - the average gain of the feature when it is used in trees
‘cover’ - the average coverage of the feature when it is used in trees

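In short:
weight is the total number of nodes, across all trees, at which the feature is used to split;
gain is the average gain over the splits that use the feature;
cover is harder to interpret. R-package/man/xgb.plot.tree.Rd (https://github.com/dmlc/xgboost/blob/f5659e17d5200bd7471a2e735177a81cb8d3012b/R-package/man/xgb.plot.tree.Rd) gives a fuller explanation: "the sum of second order gradient of training data classified to the leaf, if it is square loss, this simply corresponds to the number of instances in that branch. Deeper in the tree a node is, lower this metric will be". In other words, the coverage of a node is the sum of the second-order derivatives of the samples assigned to that node, and the importance of a feature is the average coverage over the nodes that split on it.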
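The three definitions above can be sketched in plain Python. This is only an illustration of the arithmetic, not XGBoost's internal code: the split records, the feature names `age` and `income`, and the helper `feature_importance` are hypothetical. Squared loss is assumed, so every sample's second-order derivative is 1 and a node's cover equals the number of samples that reach it.

```python
from collections import defaultdict

# Hypothetical split records: (feature, gain, Hessians of the training
# rows that reach the node). Under squared loss each Hessian is 1, so
# the sum of Hessians is just the number of rows at that node.
splits = [
    ("age", 0.40, [1.0] * 10),    # root node: all 10 rows
    ("income", 0.25, [1.0] * 6),  # deeper node: 6 rows
    ("age", 0.10, [1.0] * 4),     # deeper node: 4 rows
]


def feature_importance(splits):
    """Aggregate per-split records into weight, gain, and cover scores."""
    count = defaultdict(int)
    gain_sum = defaultdict(float)
    cover_sum = defaultdict(float)
    for feature, gain, hessians in splits:
        count[feature] += 1                   # one more split on this feature
        gain_sum[feature] += gain             # accumulate split gain
        cover_sum[feature] += sum(hessians)   # node cover = sum of Hessians
    return {
        "weight": dict(count),
        "gain": {f: gain_sum[f] / count[f] for f in count},
        "cover": {f: cover_sum[f] / count[f] for f in count},
    }


scores = feature_importance(splits)
print(scores)
```

Here `age` is used in two splits with covers 10 and 4, so its cover score is their average, while its weight is simply 2. With a loss such as logistic loss the Hessians are no longer all 1, and cover stops being a plain sample count.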

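Taking the same example from Li Hang's book again, we draw the figure below, using a different color for each feature.
[Figure: tree for the example, with each feature shown in a different color]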
