
k-Nearest Neighbors 7 - Case Study: Predicting Iris Species (Workflow Implementation)

1 Dataset
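As a quick orientation, the dataset used in this case can be inspected as follows. This is a minimal sketch that assumes only the copy of the iris data bundled with scikit-learn (load_iris); nothing here is specific to the author's setup.

```python
from sklearn.datasets import load_iris

# Load the iris data bundled with scikit-learn
iris = load_iris()

# Feature matrix: one row per flower, one column per measurement
print("shape of feature matrix:", iris.data.shape)
print("feature names:", iris.feature_names)

# Target vector: integer class labels that index into target_names
print("target names:", list(iris.target_names))
print("first five targets:", iris.target[:5])
```

The bundled data holds 150 samples with 4 numeric features (sepal and petal length and width) and 3 species labels.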

2 Method

sklearn.neighbors.KNeighborsClassifier(n_neighbors=5,algorithm='auto')
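n_neighbors -- the number of neighbors consulted for each prediction (default 5)
algorithm (auto, ball_tree, kd_tree, brute) -- which algorithm is used to search for the nearest neighbors; 'auto' lets scikit-learn choose based on the data passed to fit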
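To make the algorithm parameter concrete, here is a small sketch that fits the same classifier once per search strategy and records the test accuracy. The split settings mirror the ones used later in this article; the scores dictionary is just an illustrative name. The options differ mainly in how neighbors are searched (and therefore in speed), so the accuracies are usually the same, though ties in distance can in principle be broken differently.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=22
)

# Fit one classifier per neighbor-search strategy and collect its accuracy
scores = {}
for algorithm in ("auto", "ball_tree", "kd_tree", "brute"):
    estimator = KNeighborsClassifier(n_neighbors=5, algorithm=algorithm)
    estimator.fit(x_train, y_train)
    scores[algorithm] = estimator.score(x_test, y_test)

for algorithm, score in scores.items():
    print(algorithm, score)
```

On a dataset this small the choice hardly matters; the tree-based options are aimed at larger datasets, where a brute-force search becomes slow.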

3 Case implementation

  • Import modules
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
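  • Load the scikit-learn dataset and split it
# 1. Load the dataset
iris = load_iris()

# 2. Basic data processing
# x_train, x_test, y_train, y_test are the training features, test features, training targets, and test targets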
x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=22)
  • Standardize the feature values
# 3. Feature engineering: standardization
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)
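The standardization step fits the scaler on the training set only and then reuses those statistics on the test set. The sketch below checks this by hand: it assumes the same split as above and compares StandardScaler's output with a manual (x - mean) / std computed from the training data. The names x_train_scaled, x_test_scaled and manual_test are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = load_iris()
x_train, x_test, _, _ = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=22
)

transfer = StandardScaler()
# fit_transform learns the per-feature mean and std from the training set
x_train_scaled = transfer.fit_transform(x_train)
# transform reuses those training statistics; it does not refit on the test set
x_test_scaled = transfer.transform(x_test)

# Manual equivalent using the training mean and (population) standard deviation
manual_test = (x_test - x_train.mean(axis=0)) / x_train.std(axis=0)

print("matches manual computation:", np.allclose(x_test_scaled, manual_test))
print("train mean per feature:", x_train_scaled.mean(axis=0).round(6))
print("train std per feature:", x_train_scaled.std(axis=0).round(6))
```

Calling fit_transform on the test set instead would scale it with its own mean and standard deviation, so the test features would no longer be on the same scale the model was trained on.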
  • Train the model and predict
# 4. Machine learning (model training)
estimator = KNeighborsClassifier(n_neighbors=9)
estimator.fit(x_train, y_train)
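# 5. Model evaluation
# Method 1: compare the predicted values with the true values
y_predict = estimator.predict(x_test)
print("Predictions:\n", y_predict)
print("Predictions compared with true values:\n", y_predict == y_test)
# Method 2: compute the accuracy directly
score = estimator.score(x_test, y_test)
print("Accuracy:\n", score)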
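The two evaluation methods are closely related: for a classifier, score returns the mean accuracy, which is the fraction of True values in the element-wise comparison from Method 1. The sketch below reruns the workflow end to end and checks that relationship; manual_accuracy is an illustrative name, not part of scikit-learn.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=22
)

# Standardize using statistics learned from the training set only
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)

estimator = KNeighborsClassifier(n_neighbors=9)
estimator.fit(x_train, y_train)

# Method 1: element-wise comparison, then the share of correct predictions
y_predict = estimator.predict(x_test)
manual_accuracy = np.mean(y_predict == y_test)

# Method 2: built-in mean accuracy
score = estimator.score(x_test, y_test)

print("manual accuracy:", manual_accuracy)
print("estimator.score:", score)
```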

Note: use a stable release of scikit-learn; this walkthrough targets version 0.19.0.