
Cross-validation with sklearn.model_selection.GridSearchCV

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# compare the results of these 3 * 3 parameter combinations
tree_param_grid = {'min_samples_split': [3, 6, 9], 'n_estimators': [10, 50, 100]}
grid = GridSearchCV(RandomForestRegressor(), param_grid=tree_param_grid, cv=5)
grid.fit(data_train, target_train)

grid.best_score_   # attribute: the best cross-validation score found
Out[45]: 0.80685458833050794
grid.best_params_  # attribute: the parameter combination that produced the best score
Out[46]: {'min_samples_split': 3, 'n_estimators': 100}
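
Beyond best_score_ and best_params_, the fitted grid object also exposes the refitted best model and the full table of cross-validation results. A minimal sketch continuing the example above (data_test is a hypothetical held-out feature set, not defined in the original code):

import pandas as pd

best_model = grid.best_estimator_            # best model, refit on all of data_train (refit=True)
predictions = best_model.predict(data_test)  # data_test: hypothetical held-out features

results = pd.DataFrame(grid.cv_results_)     # one row per parameter combination
print(results[['params', 'mean_test_score', 'std_test_score', 'rank_test_score']])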

sklearn.model_selection.GridSearchCV(estimator,
                                     param_grid,
                                     scoring=None,
                                     fit_params=None,
                                     n_jobs=None,
                                     iid='warn',
                                     refit=True,
                                     cv='warn',
                                     verbose=0,
                                     pre_dispatch='2*n_jobs',
                                     error_score='raise-deprecating',
                                     return_train_score='warn')

Only the most important parameters are introduced here:
estimator : estimator object
This is assumed to implement the scikit-learn estimator interface. Either estimator needs to provide a score function, or scoring must be passed.
The model whose parameters are to be tuned.
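
If the estimator's default score is not the metric you care about, a scoring string or callable can be passed alongside it. A minimal sketch, assuming mean absolute error is the metric of interest (the parameter values are only illustrative):

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

grid = GridSearchCV(
    RandomForestRegressor(),
    param_grid={'n_estimators': [10, 50, 100]},
    scoring='neg_mean_absolute_error',  # overrides the estimator's default R^2 score
    cv=5,
)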

param_grid : dict or list of dictionaries
Dictionary with parameter names (string) as keys and lists of parameter settings to try as values, or a list of such dictionaries, in which case the grids spanned by each dictionary in the list are explored.
The model parameters to be searched, expressed as a dictionary (see the sketch below).
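
When param_grid is a list of dictionaries, each dictionary defines a separate sub-grid to search, which is useful when some parameter combinations do not make sense together. A minimal sketch (the parameter values are only illustrative):

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

param_grid = [
    {'n_estimators': [10, 50], 'max_features': ['sqrt']},
    {'n_estimators': [100],    'max_features': ['sqrt', 'log2']},
]
grid = GridSearchCV(RandomForestRegressor(), param_grid=param_grid, cv=5)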

cv : int, cross-validation generator or an iterable, optional

Determines the cross-validation splitting strategy. Possible inputs for cv are:
•None, to use the default 3-fold cross validation,
•integer, to specify the number of folds in a (Stratified)KFold,
•an object to be used as a cross-validation generator,
•an iterable yielding (train, test) splits.
The number of cross-validation folds, or the splitting strategy itself; an example with an explicit splitter object is sketched below.
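
As an example of passing a splitter object instead of an integer, a minimal sketch with a shuffled KFold and a fixed random_state for reproducible splits (parameter values are only illustrative):

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold

cv = KFold(n_splits=5, shuffle=True, random_state=0)
grid = GridSearchCV(RandomForestRegressor(),
                    param_grid={'n_estimators': [10, 50, 100]},
                    cv=cv)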