
Notes on Problems with SVM-Based Feature Selection

E:\Project_CAD\venv\lib\site-packages\sklearn\svm\base.py:922: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
  "the number of iterations.", ConvergenceWarning)

When an optimization algorithm does not converge, it is usually because the problem is not well-conditioned, perhaps due to poor scaling of the decision variables. There are a few things you can try.

  1. Normalize your training data so that the problem becomes better conditioned, which in turn can speed up convergence. One option is to scale the data to zero mean and unit standard deviation, for example with Scikit-Learn's StandardScaler. Note that the StandardScaler fitted on the training data must also be applied to the test data.
  2. Related to 1), make sure the other arguments, such as the regularization weight C, are set appropriately.
  3. Set max_iter to a larger value. The default is 1000.
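
The three suggestions above can be sketched roughly as follows. This is only an illustration under assumptions: the data is a synthetic stand-in generated with make_classification (the original dataset is not shown), and the values C=1.0 and max_iter=10000 are arbitrary starting points rather than recommended settings.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for the real data (assumption: the original dataset is not shown).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_redundant=2,
    n_clusters_per_class=1, random_state=0,
)
X[:, 0] *= 1000.0  # deliberately put one feature on a much larger scale

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Suggestion 1: fit the scaler on the training data only, then reuse it on the test data.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Suggestions 2 and 3: set C explicitly and raise max_iter above the default of 1000.
clf = LinearSVC(C=1.0, max_iter=10000, random_state=0)

# Record warnings so we can check whether liblinear still complains.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    clf.fit(X_train_s, y_train)

converged = not any(issubclass(w.category, ConvergenceWarning) for w in caught)
test_accuracy = clf.score(X_test_s, y_test)
print(f"converged: {converged}, test accuracy: {test_accuracy:.3f}")
```

If the warning persists after scaling, trying a smaller C or a larger max_iter is the next thing to check.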

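2. Problems encountered while writing the normalization function
1) Improved the z-score normalization function (in Python), following the two blog posts mentioned under normalization in section 1.
2) A list has no shape attribute. Convert it to a numpy.array to use the shape attribute.
3) Python 2 and Python 3 differ in division, rounding, and modulo.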
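a. / is true division, // is floor division, % is modulo (Python 3)
b. % follows the floor-division rule (Python 3)
c. round rounds to the nearest integer (ties go to the nearest even integer), int truncates toward zero, and math.floor and math.ceil round down and up (Python 3)
d. // and math.floor differ in CPython: with float operands // returns a float, while math.floor returns an int
e. In Python 2, / is floor division when both operands are integers
f. In C, % is based on division that truncates toward zero.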
4) An index cannot be a float: use features[m//2], since m//2 yields an integer.
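5) **Pay special attention to variable types:** for a NumPy array, the data type is fixed when the array is created. An array L initialized with 1.0 is floating point, and results written back into it stay floating point. An array initialized with 1 is integer, and floating-point results written into its elements are silently truncated to integers. (A plain Python variable has no fixed type and simply rebinds to whatever value is assigned to it, although in Python 2 / between two integers still yields an integer.)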
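6) Binary 0/1 features can be normalized together with the other continuous features; doing so just maps 0 and 1 to two other numbers.
7) The new_scale function cannot be called on slice-typed data. Fix: take the values of the slice and normalize those. See https://stackoverflow.com/questions/43290202/python-typeerror-unhashable-type-slice-for-encoding-categorical-data for details: data.iloc[:,:55]=new_scale(data.values[:,:55])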
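
The pitfalls in items 2) to 7) can be checked with a short script. It is a sketch under assumptions: zscore_scale is a hypothetical helper written for illustration, not the author's new_scale, and the tiny rows table is made-up data. With a pandas DataFrame, the array would be obtained via data.values[:, :k] as in item 7).

```python
import math

import numpy as np


def zscore_scale(features):
    """Column-wise z-score, (x - mean) / std. Hypothetical helper, not the author's new_scale."""
    arr = np.asarray(features, dtype=float)  # lists have no .shape; float dtype avoids truncation
    mean = arr.mean(axis=0)
    std = arr.std(axis=0)
    std[std == 0] = 1.0  # constant columns become all zeros instead of dividing by zero
    return (arr - mean) / std


# Item 3): Python 3 division, rounding, and modulo rules
assert 7 / 2 == 3.5                      # true division
assert -7 // 2 == -4                     # floor division rounds toward negative infinity
assert -7 % 2 == 1                       # modulo follows the floor-division rule
assert int(-3.5) == -3                   # int() truncates toward zero
assert math.floor(-3.5) == -4 and math.ceil(-3.5) == -3
assert round(2.5) == 2 and round(3.5) == 4  # ties go to the nearest even integer
assert isinstance(7.0 // 2, float) and isinstance(math.floor(7.0 / 2), int)

# Item 4): indices must be integers, so use // rather than /
rows = [[1, 0, 10], [2, 1, 20], [3, 0, 30], [4, 1, 40]]  # made-up data; column 1 is binary
m = len(rows)
middle_row = rows[m // 2]                # rows[m / 2] would raise TypeError

# Item 5): an integer array silently truncates floats written into it
int_arr = np.array([1, 2, 3])
int_arr[:] = np.array([0.5, 1.0, 1.5])   # stored as 0, 1, 1

# Items 6) and 7): normalize a column slice of a 2-D array, binary column included
data = np.array(rows, dtype=float)
data[:, :2] = zscore_scale(data[:, :2])  # the third column is left untouched
scaled = zscore_scale(rows)

print(data)
```

The binary column ends up holding just two distinct values after scaling, which is all that item 6) claims.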

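3. Even with the training and test sets fixed, accuracy still varies from run to run. The cause is that the dataset used for feature selection is not fixed, so the selected features differ between runs. How can the training and test sets be fixed, and how can the feature-selection dataset be fixed as well?
4. Training still sometimes fails to converge. How can this be resolved?
5. Plot the ROC curve and compute the three model evaluation metrics.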
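
One possible way to approach items 3 and 5 is to pin every source of randomness with a fixed seed and to fit the feature selector on the training split only. The sketch below rests on several assumptions: the data is synthetic, the selector is an L1-penalized LinearSVC wrapped in SelectFromModel (the notes do not say which SVM-based selection method was used), and run_once and SEED are names invented for illustration. The notes also do not say which three evaluation metrics are meant, so only the ROC curve and its AUC are computed here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

SEED = 42  # hypothetical fixed seed shared by every random step


def run_once(X, y, seed=SEED):
    """Split, scale, select features, train, and score with a fixed seed."""
    # Fixed random_state: the same rows land in train and test on every run.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    scaler = StandardScaler().fit(X_tr)
    X_tr_s, X_te_s = scaler.transform(X_tr), scaler.transform(X_te)

    # Feature selection sees only the (fixed) training split.
    selector = SelectFromModel(
        LinearSVC(C=0.1, penalty="l1", dual=False, max_iter=10000, random_state=seed)
    ).fit(X_tr_s, y_tr)
    mask = selector.get_support()

    clf = LinearSVC(C=1.0, dual=False, max_iter=10000, random_state=seed)
    clf.fit(X_tr_s[:, mask], y_tr)

    # ROC needs continuous scores, so use decision_function rather than predict.
    scores = clf.decision_function(X_te_s[:, mask])
    fpr, tpr, _ = roc_curve(y_te, scores)
    return mask, fpr, tpr, roc_auc_score(y_te, scores)


# Synthetic stand-in for the real data (assumption).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_redundant=2,
    n_clusters_per_class=1, random_state=0,
)

mask_a, fpr_a, tpr_a, auc_a = run_once(X, y)
mask_b, fpr_b, tpr_b, auc_b = run_once(X, y)

print("selected features:", np.flatnonzero(mask_a))
print("same selection on both runs:", np.array_equal(mask_a, mask_b))
print(f"AUC: {auc_a:.3f}")
```

The fpr and tpr arrays can be passed directly to a plotting library to draw the curve.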