
[Task4 (2 days)] Model Evaluation


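Record a table of scores for five models (logistic regression, SVM, decision tree, random forest, XGBoost) on accuracy, precision, recall, F1-score, and AUC, and plot the ROC curve for each. Time: 2 days.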

The results can follow the format below:

[image: example score table]

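Note: this is a financial dataset (not the raw data; it has already been preprocessed). The goal is to predict whether a loan customer will become overdue. The "status" column in the table is the label: 0 means not overdue, 1 means overdue.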
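As a quick reference for the scores named in the task, here is a minimal sketch of how accuracy, precision, recall, and F1 follow from the confusion counts. The labels are invented for illustration (they are not from the loan dataset), and `binary_metrics` is a hypothetical helper name.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # overdue, predicted overdue
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # not overdue, predicted overdue
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # overdue, predicted not overdue
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not overdue, predicted not overdue

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        'accuracy': (tp + tn) / len(pairs),
        'precision': precision,
        'recall': recall,
        'f1': 2 * precision * recall / (precision + recall),
    }


# Made-up labels: 1 = overdue, 0 = not overdue.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

print(binary_metrics(y_true, y_pred))
```

This sketch does not guard against division by zero (for example, a model that never predicts 1), which the scikit-learn functions handle with a warning.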

1. Plotting and table functions

This uses the processed data and the models defined in the previous post.
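Since the data and models come from the previous post and are not shown here, the following is only a hypothetical sketch of the objects the evaluation code expects: scaled train/test arrays and a dict named `models` mapping a display name to an unfitted estimator. The synthetic data from `make_classification`, the split ratio, and every hyperparameter are assumptions, not the author's actual setup.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data; y plays the role of "status".
X, y = make_classification(n_samples=500, n_features=10,
                           weights=[0.75, 0.25], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Display name -> unfitted estimator.
models = {
    'LogisticRegression': LogisticRegression(max_iter=1000),
    'SVM': SVC(),
    'DecisionTree': DecisionTreeClassifier(random_state=0),
    'RandomForest': RandomForestClassifier(random_state=0),
    # An XGBoost model would go here as well; it needs the separate
    # xgboost package, so it is left out of this sketch.
}
```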

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.metrics import (recall_score, precision_score, f1_score,
                             accuracy_score, roc_curve, roc_auc_score)


def plot_roc_curve(fpr_train, tpr_train, fpr_test, tpr_test, name=None):
    # Draw the train and test ROC curves on one set of axes.
    plt.plot(fpr_train, tpr_train, linewidth=2, c='r', label='train')
    plt.plot(fpr_test, tpr_test, linewidth=2, c='b', label='test')
    plt.plot([0, 1], [0, 1], 'k--')  # chance diagonal
    plt.axis([0, 1, 0, 1])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title(name)
    plt.legend(loc='best')
    plt.show()


def metrics(models, X_train_scaled, X_test_scaled, y_train, y_test):
    # models is a dict: display name -> unfitted estimator.
    columns = ['recall_score', 'precision_score', 'f1_score', 'accuracy_score', 'AUC']
    results_test = pd.DataFrame(columns=columns)
    results_train = pd.DataFrame(columns=columns)
    for key, model in models.items():
        name = str(key)
        model.fit(X_train_scaled, y_train)
        y_pre_test = model.predict(X_test_scaled)
        y_pre_train = model.predict(X_train_scaled)

        # scikit-learn metrics take the true labels first, then the predictions.
        result_test = [
            round(recall_score(y_test, y_pre_test), 2),
            round(precision_score(y_test, y_pre_test), 2),
            round(f1_score(y_test, y_pre_test), 2),
            round(accuracy_score(y_test, y_pre_test), 2),
            round(roc_auc_score(y_test, y_pre_test), 2),
        ]
        result_train = [
            round(recall_score(y_train, y_pre_train), 2),
            round(precision_score(y_train, y_pre_train), 2),
            round(f1_score(y_train, y_pre_train), 2),
            round(accuracy_score(y_train, y_pre_train), 2),
            round(roc_auc_score(y_train, y_pre_train), 2),
        ]

        # Note: hard 0/1 predictions give a ROC curve with a single operating
        # point; continuous scores would give a fuller curve.
        fpr_train, tpr_train, thresholds_train = roc_curve(y_train, y_pre_train)
        fpr_test, tpr_test, thresholds_test = roc_curve(y_test, y_pre_test)
        plot_roc_curve(fpr_train, tpr_train, fpr_test, tpr_test, name)

        results_test.loc[name] = result_test
        results_train.loc[name] = result_train
    return results_test, results_train


results_test, results_train = metrics(models, X_train_scaled, X_test_scaled, y_train, y_test)
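The code above feeds hard class predictions into `roc_curve` and `roc_auc_score`. These functions are usually given a continuous score for the positive class instead, so that the curve is traced over many thresholds. A minimal sketch of one way to obtain such a score is shown here; `positive_scores` is a hypothetical helper, the data is synthetic, and whether this matches the author's intent is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def positive_scores(model, X):
    """Return a continuous score for class 1 from a fitted binary classifier."""
    if hasattr(model, 'predict_proba'):
        # Column 1 holds the probability of the positive class (label 1).
        return model.predict_proba(X)[:, 1]
    # e.g. SVC without probability=True exposes only a signed margin.
    return model.decision_function(X)


# Synthetic data, for illustration only.
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

roc_results = {}
for model in (LogisticRegression(max_iter=1000), SVC()):
    model.fit(X_train, y_train)
    scores = positive_scores(model, X_test)
    fpr, tpr, thresholds = roc_curve(y_test, scores)  # true labels first
    auc = roc_auc_score(y_test, scores)
    roc_results[type(model).__name__] = (fpr, tpr, auc)
    print(type(model).__name__, 'test AUC:', round(auc, 2))
```

The resulting `fpr` and `tpr` arrays can be passed to `plot_roc_curve` unchanged; the other four metrics still need the hard predictions from `model.predict`.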

The results are as follows.

Training set: [image: training-set score table] (the tree models overfit badly!)

Test set: [image: test-set score table]

Model ROC curves:

[images: ROC curves for the five models]
