SVM Iris Classification: Investigating KeyError: "None of [Int64Index([0, 1, 2, 3], dtype='int64')] are in the [columns]"
阿新 • Published: 2020-12-22
While using a support vector machine to classify the iris dataset,
my code failed with the following KeyError:
KeyError: "None of [Int64Index([0, 1, 2, 3], dtype='int64')] are in the [columns]"
The original code was:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.metrics import accuracy_score
import numpy as np
# header=None is missing here, so pandas uses the first data row as the column names
data = pd.read_csv(r"G:\實驗6/iris.csv")
# The KeyError is raised on this line: the column labels are strings, not the integers 0-4
x, y = data[range(4)], data[4]
y = pd.Categorical(y).codes
x = x[[0,1,2,3]]
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1, train_size=0.6)
clf=svm.SVC(C=0.4,kernel='rbf',gamma=20,decision_function_shape='ovr')
clf.fit(x_train, y_train.ravel())
print('Training set accuracy:', accuracy_score(y_train, clf.predict(x_train)))
print('Test set accuracy:', accuracy_score(y_test, clf.predict(x_test)))
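The failure can be reproduced without the original file. The sketch below is only an illustration: `csv_text` is a hypothetical three-row sample that assumes iris.csv holds four measurements followed by a species label and has no header row.

```python
import io

import pandas as pd

# Hypothetical sample in the assumed iris.csv layout (no header row).
csv_text = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)

# Default behaviour: the first line of the file becomes the column names.
data = pd.read_csv(io.StringIO(csv_text))
print(list(data.columns))  # string labels taken from the first data row
print(len(data))           # one row fewer than the file contains

# Selecting by the integers 0-3 fails because no column has those labels.
try:
    data[[0, 1, 2, 3]]
    raised = False
except KeyError as err:
    raised = True
    print("KeyError:", err)
```

The exact wording of the message depends on the pandas version; newer releases print `Index` rather than `Int64Index`.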
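At first I read a lot of blog posts
and assumed the cause was a data type problem or the pandas version. I followed the changes they suggested, but the error was still there.
Further digging showed that neither was the cause. Looking at the iris data file itself, I found that
it has no header row, so read_csv treats the first row of data as the column names and the integer labels 0 to 3 never exist. The post linked below explains this in detail: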
https://www.cnblogs.com/komean/p/10629311.html
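One way to check what pandas made of the file is to compare the column labels with and without `header=None`. This is a small sketch rather than the linked post's code; `csv_text` is a hypothetical sample that assumes the same headerless layout as iris.csv.

```python
import io

import pandas as pd

# Hypothetical sample in the assumed iris.csv layout (no header row).
csv_text = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)

with_default = pd.read_csv(io.StringIO(csv_text))
with_header_none = pd.read_csv(io.StringIO(csv_text), header=None)

# Default: first row consumed as labels, so one sample is lost.
print(with_default.shape, list(with_default.columns))
# header=None: integer labels 0..4 and every row kept as data.
print(with_header_none.shape, list(with_header_none.columns))

# Integer-based selection now works as the original script expects.
x = with_header_none[[0, 1, 2, 3]]
y = pd.Categorical(with_header_none[4]).codes
```

Printing `data.columns` or `data.head()` right after loading is a quick way to catch this kind of mismatch before indexing.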
After the following change the code no longer raises the error:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.metrics import accuracy_score
import numpy as np
# header=None: the file has no header row, so pandas labels the columns 0-4
data = pd.read_csv(r"G:\實驗6/iris.csv",header=None)
# Columns 0-3 are the features, column 4 is the species label
x, y = data[range(4)], data[4]
# Encode the species names as integer class codes
y = pd.Categorical(y).codes
x = x[[0,1,2,3]]
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1, train_size=0.6)
clf=svm.SVC(C=0.4,kernel='rbf',gamma=20,decision_function_shape='ovr')
clf.fit(x_train, y_train.ravel())
print('Training set accuracy:', accuracy_score(y_train, clf.predict(x_train)))
print('Test set accuracy:', accuracy_score(y_test, clf.predict(x_test)))
The error is gone and the script runs successfully.
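For readers who do not have the CSV file, the same pipeline can be tried with the copy of the iris data bundled with scikit-learn. This is a sketch under that assumption: `load_iris` stands in for the original `read_csv` call, and the frame is built by hand so that it has the integer column labels 0 to 4 that the script relies on.

```python
import pandas as pd
from sklearn import svm
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Stand-in for the CSV: columns 0-3 are measurements, column 4 the species name.
iris = load_iris()
data = pd.DataFrame(iris.data)
data[4] = iris.target_names[iris.target]

x = data[[0, 1, 2, 3]]
y = pd.Categorical(data[4]).codes  # species names to integer class codes

# Same split and SVC settings as the script above.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, random_state=1, train_size=0.6
)
clf = svm.SVC(C=0.4, kernel='rbf', gamma=20, decision_function_shape='ovr')
clf.fit(x_train, y_train)

train_acc = accuracy_score(y_train, clf.predict(x_train))
test_acc = accuracy_score(y_test, clf.predict(x_test))
print('Training set accuracy:', train_acc)
print('Test set accuracy:', test_acc)
```

The bundled data may not be identical to the file used in the post, so the accuracy figures can differ.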