k-Nearest Neighbors 7 - Case Study: Predicting Iris Species (Workflow Implementation)
阿新 • Published: 2021-09-13
1 Dataset
2 Method
sklearn.neighbors.KNeighborsClassifier(n_neighbors=5,algorithm='auto')
algorithm ('auto', 'ball_tree', 'kd_tree', 'brute') -- which algorithm is used to search for the nearest neighbors
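As a rough sketch of what the algorithm parameter changes, the snippet below fits the same classifier with each of the four values on the full iris data. The names X, y and scores are illustrative and do not appear in the original post. The parameter selects the neighbor-search strategy rather than the model itself, so the scores should come out essentially the same; tie-breaking among equidistant neighbors is the one place where results could in principle differ.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target  # illustrative names, not from the post

# Fit the same 5-nearest-neighbor model with each search strategy
# and record its accuracy on the data it was fitted on.
scores = {}
for algorithm in ("auto", "ball_tree", "kd_tree", "brute"):
    model = KNeighborsClassifier(n_neighbors=5, algorithm=algorithm)
    model.fit(X, y)
    scores[algorithm] = model.score(X, y)

for algorithm, score in scores.items():
    print(algorithm, score)
```

On a dataset as small as iris the choice makes little practical difference; the tree-based options are mainly of interest for larger datasets.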
3 Case implementation
- Import modules
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
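- Load the sklearn dataset and split it
# 1. Load the dataset
iris = load_iris()
# 2. Basic data processing
# x_train, x_test, y_train, y_test are the training features, test features, training targets, and test targets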
x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=22)
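To make the effect of test_size and random_state concrete, here is a small sketch that repeats the split and inspects the result. It assumes the standard 150-sample, 4-feature iris data returned by load_iris; the second split and the x_train_again name exist only for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=22
)

# 20% of the 150 samples are held out for testing.
print(x_train.shape, x_test.shape)  # (120, 4) (30, 4)

# Repeating the call with the same random_state reproduces the same split.
x_train_again, _, _, _ = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=22
)
print(np.array_equal(x_train, x_train_again))  # True
```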
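- Standardize the feature values
# 3. Feature engineering: standardization
transfer = StandardScaler()
# Fit the scaler on the training set only, then reuse its statistics on the test set
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)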
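The following sketch checks what the standardization step does. It is self-contained, so it repeats the split, and the *_scaled names are introduced here only to keep the raw arrays available for comparison. After fit_transform, each training feature has mean close to 0 and standard deviation close to 1. The test set is transformed with the training statistics, so its means are only approximately 0.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=22
)

transfer = StandardScaler()
x_train_scaled = transfer.fit_transform(x_train)  # learns mean/std from the training set
x_test_scaled = transfer.transform(x_test)        # reuses the training mean/std

# The learned statistics come from the training set alone.
print(np.allclose(transfer.mean_, x_train.mean(axis=0)))

# Per-feature mean and standard deviation after scaling.
print(x_train_scaled.mean(axis=0), x_train_scaled.std(axis=0))
print(x_test_scaled.mean(axis=0))
```

Calling fit_transform on the test set instead would compute separate statistics for it and let information from the test data shape the preprocessing, which is why the post uses transform there.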
- Train the model and predict
# 4. Machine learning (model training)
estimator = KNeighborsClassifier(n_neighbors=9)
estimator.fit(x_train, y_train)
# 5. Model evaluation
# Method 1: compare the predictions with the true values
y_predict = estimator.predict(x_test)
print("Predictions:\n", y_predict)
print("Predictions match true values:\n", y_predict == y_test)
# Method 2: compute the accuracy directly
score = estimator.score(x_test, y_test)
print("Accuracy:\n", score)
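The two evaluation methods are closely related: the accuracy returned by score is the fraction of True entries in the element-wise comparison from method 1. The sketch below runs the whole pipeline end to end and checks this. The manual_accuracy name is introduced here for illustration, and no particular accuracy value is claimed.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=22
)

transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)

estimator = KNeighborsClassifier(n_neighbors=9)
estimator.fit(x_train, y_train)

# Method 1, taken one step further: the share of matching predictions.
y_predict = estimator.predict(x_test)
manual_accuracy = np.mean(y_predict == y_test)

# Method 2: the accuracy computed by the estimator itself.
score = estimator.score(x_test, y_test)

print(manual_accuracy, score)
```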
Note: use the stable scikit-learn release 0.19.0.