Machine Learning, One Small Goal: Task4

阿新 • Published: 2018-11-21

Task: Model Evaluation

Record a table of scores for five models on precision, recall, f1 and AUC, and plot the ROC curve (annotated with its AUC) for each model.
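The metrics named in the task are easy to mix up, accuracy and precision in particular. The following is a minimal pure-Python sketch of how they derive from the confusion counts, assuming binary labels with 1 as the positive class. The helper names `confusion_counts` and `basic_metrics` and the toy labels are hypothetical, made up for illustration; they are not part of the experiment script or of sklearn.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def basic_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against empty denominators (no predicted or no actual positives)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels, not taken from the loan dataset
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0]
metrics = basic_metrics(y_true, y_pred)
print(metrics)
```

On an imbalanced sample like this one, accuracy stays fairly high while precision and recall are much lower, which is why the two should not be reported under the same name.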

Difficulties encountered

Experiment code

#!/usr/bin/env python3.6
#-*- coding:utf-8 -*-
# @File    : Model_evaluation.py
# @Date    : 2018-11-20
# @Author  : 黑桃
# @Software: PyCharm

import pickle
from matplotlib import pyplot as plt
# sklearn.externals.joblib was removed in scikit-learn 0.23;
# on newer versions use `import joblib` instead.
from sklearn.externals import joblib
from sklearn.metrics import accuracy_score, recall_score, f1_score, roc_auc_score, roc_curve

path = "E:/MyPython/Machine_learning_GoGoGo/"
"""=====================================================================================================================
1 Load features
"""
print("1 Load features")
# The pickle holds the train/test split produced in the earlier tasks
with open(path + 'feature/feature_V1.pkl', 'rb') as f:
    train, test, y_train, y_test = pickle.load(f)

"""=====================================================================================================================
2 Load models
"""
print("2 Load models")
SVM_linear = joblib.load(path + "model/SVM_linear.pkl")
SVM_poly = joblib.load(path + "model/SVM_poly.pkl")
SVM_rbf = joblib.load(path + "model/SVM_rbf.pkl")
SVM_sigmoid = joblib.load(path + "model/SVM_sigmoid.pkl")
lg_120 = joblib.load(path + "model/lg_120.pkl")
DT = joblib.load(path + "model/DT.pkl")
xgb_sklearn = joblib.load(path + "model/xgb_sklearn.pkl")
lgb_sklearn = joblib.load(path + "model/lgb_sklearn.pkl")
xgb = joblib.load(path + "model/xgb.pkl")
lgb = joblib.load(path + "model/lgb.pkl")




"""=====================================================================================================================
3 Model evaluation
"""

def model_evalua(clf, X_train, X_test, y_train, y_test, clf_name):
    # Hard class labels, used for accuracy, recall and F1
    y_train_pred = clf.predict(X_train)
    y_test_pred = clf.predict(X_test)
    # Positive-class probabilities, used for AUC, the ROC curve and KS
    # (the SVM models only expose predict_proba if trained with probability=True)
    y_train_pred_proba = clf.predict_proba(X_train)[:, 1]
    y_test_pred_proba = clf.predict_proba(X_test)[:, 1]

    """[AUC Score]"""
    # AUC is computed from probabilities here. The results table in this post
    # was produced with hard labels (y_*_pred) instead, so those numbers differ.
    auc_tr = roc_auc_score(y_train, y_train_pred_proba)
    auc_te = roc_auc_score(y_test, y_test_pred_proba)
    print('AUC Score')
    print("Train_AUC Score: {:.4f}".format(auc_tr))
    print("Test_AUC Score: {:.4f}".format(auc_te))

    """[Accuracy]"""
    print('Accuracy:')
    print('Train_accuracy: {:.4f}'.format(accuracy_score(y_train, y_train_pred)))
    print('Test_accuracy: {:.4f}'.format(accuracy_score(y_test, y_test_pred)))

    """[Recall]"""
    print('Recall:')
    print('Train_recall: {:.4f}'.format(recall_score(y_train, y_train_pred)))
    print('Test_recall: {:.4f}'.format(recall_score(y_test, y_test_pred)))

    """[f1_score]"""
    print('f1_score:')
    print('Train_f1_score: {:.4f}'.format(f1_score(y_train, y_train_pred)))
    print('Test_f1_score: {:.4f}'.format(f1_score(y_test, y_test_pred)))

    # Points of the ROC curve
    fpr_tr, tpr_tr, _ = roc_curve(y_train, y_train_pred_proba)
    fpr_te, tpr_te, _ = roc_curve(y_test, y_test_pred_proba)
    # KS: largest gap between TPR and FPR along the curve
    ks_tr = max(abs(fpr_tr - tpr_tr))
    ks_te = max(abs(fpr_te - tpr_te))
    print('KS:')
    print('Train: {:.4f}'.format(ks_tr))
    print('Test: {:.4f}'.format(ks_te))
    plt.plot(fpr_tr, tpr_tr, 'r-',
             label="Train:AUC: {:.3f} KS:{:.3f}".format(auc_tr, ks_tr))
    # The test legend uses the test-set KS (the original reused the train value)
    plt.plot(fpr_te, tpr_te, 'g-',
             label="Test:AUC: {:.3f} KS:{:.3f}".format(auc_te, ks_te))
    plt.plot([0, 1], [0, 1], 'd--')
    plt.legend(loc='best')
    plt.title(clf_name + " ROC curve")
    plt.savefig(path + 'picture/' + clf_name + '.jpg')
    plt.show()

print('-------------------SVM_linear-------------------')
model_evalua(SVM_linear, train, test, y_train, y_test, 'SVM_linear')

print('-------------------SVM_poly-------------------')
model_evalua(SVM_poly, train, test, y_train, y_test, 'SVM_poly')

print('-------------------SVM_rbf-------------------')
model_evalua(SVM_rbf, train, test, y_train, y_test, 'SVM_rbf')

print('-------------------SVM_sigmoid-------------------')
model_evalua(SVM_sigmoid, train, test, y_train, y_test, 'SVM_sigmoid')

print('-------------------lg_120-------------------')
model_evalua(lg_120, train, test, y_train, y_test, 'lg_120')

print('-------------------DT-------------------')
model_evalua(DT, train, test, y_train, y_test, 'DT')

print('-------------------xgb_sklearn-------------------')
model_evalua(xgb_sklearn, train, test, y_train, y_test, 'xgb_sklearn')

# print('-------------------xgb-------------------')
# model_evalua(xgb, train, test, y_train, y_test, 'xgb')

print('-------------------lgb_sklearn-------------------')
model_evalua(lgb_sklearn, train, test, y_train, y_test, 'lgb_sklearn')

# print('-------------------lgb-------------------')
# model_evalua(lgb, train, test, y_train, y_test, 'lgb')
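
The KS value printed by `model_evalua` is the largest gap between the true-positive rate and the false-positive rate along the ROC curve. Here is a minimal pure-Python sketch of that calculation, assuming binary 0/1 labels and one score per sample. The names `roc_points` and `ks_statistic` and the toy data are hypothetical; this is an illustration of the idea, not sklearn's `roc_curve` implementation.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) lists, one point per distinct score threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    fpr, tpr = [0.0], [0.0]  # start at the origin (nothing predicted positive)
    for thr in sorted(set(scores), reverse=True):
        # Everything scoring at or above the threshold is predicted positive
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    return fpr, tpr


def ks_statistic(y_true, scores):
    """KS = max |TPR - FPR| over all thresholds."""
    fpr, tpr = roc_points(y_true, scores)
    return max(abs(t - f) for f, t in zip(fpr, tpr))


# Hypothetical toy data, not taken from the loan dataset
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
ks = ks_statistic(y_true, scores)
print("KS: {:.4f}".format(ks))
```

Because KS depends on the whole score ranking, it has to be computed separately from the train and test ROC points; reusing one set's value for the other would mislabel the plot legend.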

Experiment results

Each cell shows train / test.

Model	Accuracy	Recall	f1_score	KS	ROC_AUC (from predicted labels)
SVM_linear	0.7878 / 0.7442	0.1683 / 0.3377	0.2781 / 0.4160	0.4519 / 0.2590	0.5774 / 0.6160
SVM_poly	0.7815 / 0.7267	0.1027 / 0.0597	0.1859 / 0.1055	0.7099 / 0.3082	0.5510 / 0.5164
SVM_rbf	0.7971 / 0.7589	0.1894 / 0.1455	0.3119 / 0.2456	0.6474 / 0.3723	0.5907 / 0.5655
SVM_sigmoid	0.7265 / 0.7092	0.2809 / 0.1584	0.3328 / 0.2272	0.2216 / 0.1235	0.5752 / 0.5356
lg_120	0.4355 / 0.4590	0.6671 / 0.7117	0.3647 / 0.4152	0.0695 / 0.0907	0.5142 / 0.5387
DT	0.7920 / 0.7505	0.4245 / 0.3169	0.4978 / 0.4067	0.4126 / 0.3524	0.6672 / 0.6138
xgb_sklearn	0.8452 / 0.7765	0.4691 / 0.3065	0.5954 / 0.4252	0.6167 / 0.3763	0.7175 / 0.6283
lgb_sklearn	1.0000 / 0.7680	1.0000 / 0.3117	1.0000 / 0.4203	1.0000 / 0.3761	1.0000 / 0.6242
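
The ROC_AUC values in the table were produced by passing the hard class labels from `predict()` to `roc_auc_score`, not the probabilities from `predict_proba()`. With only 0/1 predictions the ROC curve has a single interior point, and the resulting AUC collapses to (TPR + TNR) / 2. The sketch below illustrates the difference with a pure-Python, rank-based AUC, assuming binary labels and a 0.5 decision threshold. The function `auc_rank` and the toy data are hypothetical, made up for illustration; this is not sklearn's implementation.

```python
def auc_rank(y_true, scores):
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the loan dataset
y_true = [0, 0, 1, 1, 0, 1]
proba = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
# Thresholding at 0.5 throws away the ranking within each predicted class
labels = [1 if p >= 0.5 else 0 for p in proba]

auc_from_proba = auc_rank(y_true, proba)
auc_from_labels = auc_rank(y_true, labels)

# For hard labels the AUC equals the mean of recall (TPR) and specificity (TNR)
tpr = sum(1 for t, l in zip(y_true, labels) if t == 1 and l == 1) / sum(y_true)
tnr = sum(1 for t, l in zip(y_true, labels) if t == 0 and l == 0) / (len(y_true) - sum(y_true))

print("AUC from probabilities: {:.4f}".format(auc_from_proba))
print("AUC from hard labels:   {:.4f}".format(auc_from_labels))
print("(TPR + TNR) / 2:        {:.4f}".format((tpr + tnr) / 2))
```

This is why the ROC_AUC column above tracks recall so closely and can disagree with the AUC shown in the ROC plot legends, which is computed from probabilities.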

