MATLAB 2012b/MATLAB 2013b classifier roundup (SVM, KNN, random forest, and more)
阿新 • Published: 2019-02-17
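train_data is the training feature data and train_label holds the class labels; test_data and test_label are the corresponding test features and labels.
predict_label is the predicted label.
MATLAB is trained on the data and returns the semantic label vector Scores (probability output).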
1. Logistic regression (multinomial logistic regression)
Factor = mnrfit(train_data, train_label);
Scores = mnrval(Factor, test_data);
Scores is the semantic vector (probability output). This approach cannot cope with high-dimensional features.
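As a minimal sketch of the two calls above, the following trains on synthetic three-class data and turns each probability row of Scores into a hard label by picking the most probable class. The data, the seed, and the sample sizes are assumptions made for illustration. It assumes the Statistics Toolbox is installed and that train_label holds positive integers 1..K, which is one of the label forms mnrfit accepts.

```matlab
% Synthetic 3-class data (illustrative assumption, not from the original post).
rng(0);                                   % fix the seed for repeatability
n = 60;                                   % samples per class
train_data  = [randn(n,2); randn(n,2)+1.5; randn(n,2)+3];
train_label = [ones(n,1); 2*ones(n,1); 3*ones(n,1)];   % integer labels 1..3
test_data   = [randn(5,2); randn(5,2)+3];

Factor = mnrfit(train_data, train_label);   % coefficients of the nominal model
Scores = mnrval(Factor, test_data);         % one row per sample, one column per class

% Hard prediction: pick the class with the highest probability in each row.
[~, predict_label] = max(Scores, [], 2);
```

Each row of Scores sums to 1, and column k corresponds to label k.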
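2. Random forest classifier
Factor = TreeBagger(nTree, train_data, train_label);
[predict_label, Scores] = predict(Factor, test_data);
Scores is the semantic vector (probability output). nTree = 500 in the experiment.
It works well but is rather slow: 2,500 rows of data took 400 seconds. What would happen with 5 million rows of big data? Have a novel ready to read while you wait ^_^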
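One practical detail when scoring TreeBagger output: for classification, predict returns the labels as a cell array of strings, so comparing them with numeric labels using == does not work directly. The sketch below converts them back with str2double. The synthetic data, the seed, and the small nTree of 50 (chosen only for speed) are assumptions for illustration. The columns of Scores follow the order in Factor.ClassNames.

```matlab
% Synthetic 2-class data (illustrative assumption, not from the original post).
rng(0);
n = 100;
train_data  = [randn(n,2); randn(n,2)+3];
train_label = [ones(n,1); 2*ones(n,1)];
test_data   = [randn(20,2); randn(20,2)+3];
test_label  = [ones(20,1); 2*ones(20,1)];

nTree  = 50;                                        % the original experiment used 500
Factor = TreeBagger(nTree, train_data, train_label);
[label_str, Scores] = predict(Factor, test_data);   % label_str is a cell array of strings

predict_label = str2double(label_str);              % back to numeric labels
accuracy = mean(predict_label == test_label) * 100; % percent correct
```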
3. Naive Bayes classification
Factor = NaiveBayes.fit(train_data, train_label);
Scores = posterior(Factor, test_data);
[Scores, predict_label] = posterior(Factor, test_data);
predict_label = predict(Factor, test_data);
accuracy = length(find(predict_label == test_label))/length(test_label)*100;
Results were poor.
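The accuracy expression used here can be written more compactly, since the mean of a logical vector is the fraction of true entries. A small sketch with made-up labels (the values are hypothetical) shows that both forms agree:

```matlab
% Hypothetical labels for illustration only; 4 of the 5 predictions are correct.
test_label    = [1; 2; 2; 3; 1];
predict_label = [1; 2; 3; 3; 1];

% Form used in the text: count the matches, divide by the number of samples.
accuracy_long  = length(find(predict_label == test_label)) / length(test_label) * 100;

% Equivalent shorter form: the mean of the logical match vector.
accuracy_short = mean(predict_label == test_label) * 100;
```

Both vectors need the same shape (both columns or both rows) for the element-wise comparison to mean what is intended.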
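4. Support vector machine (SVM) classification
Factor = svmtrain(train_data, train_label);
predict_label = svmclassify(Factor, test_data);
This route gives no semantic vector Scores (probability output).
Support vector machine with LIBSVM (its svmtrain shares a name with the MATLAB function above but takes the labels first):
Factor = svmtrain(train_label, train_data, '-b 1');
[predicted_label, accuracy, Scores] = svmpredict(test_label, test_data, Factor, '-b 1');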
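Because MATLAB's own svmtrain and the LIBSVM one share a name but expect their arguments in a different order, it helps to check which one the path resolves to before running either call. The following is a small sketch of that check; the variable name hits is made up for illustration.

```matlab
% List every svmtrain visible on the path. When several are found,
% the first entry is the one MATLAB will actually call.
hits = which('svmtrain', '-all');
if isempty(hits)
    disp('No svmtrain found on the path.');
else
    disp(hits);
end
```

If the wrong one comes first, reordering the path (for example with addpath on the LIBSVM matlab folder) changes which function is picked up.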
5. K-nearest-neighbor classifier (KNN)
predict_label = knnclassify(test_data, train_data, train_label, num_neighbors);
accuracy = length(find(predict_label == test_label))/length(test_label)*100;
This route gives no semantic vector Scores (probability output).
IDX = knnsearch(train_data, test_data);
IDX = knnsearch(train_data, test_data, 'K', num_neighbors);
[IDX, Dist] = knnsearch(train_data, test_data, 'K', num_neighbors);
IDX holds the indices of the nearest training samples and Dist the corresponding distances.
To get the probability output Scores from these, you have to write the code yourself.
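The original post does not show the hand-written probability output. One possible way to do it, offered here as an assumption rather than the author's method, is to use the fraction of the K neighbors that belong to each class as that class's score. The toy data and the helper variables neighbor_label, classes, and best are made up for illustration.

```matlab
% Toy data (illustrative assumption): two well-separated classes.
train_data    = [0 0; 0 1; 1 0; 5 5; 5 6; 6 5];
train_label   = [1; 1; 1; 2; 2; 2];
test_data     = [0.2 0.2; 5.5 5.5];
num_neighbors = 3;

IDX = knnsearch(train_data, test_data, 'K', num_neighbors);

% Labels of the neighbors, one row per test sample.
% reshape keeps the layout correct even when there is a single test sample.
neighbor_label = reshape(train_label(IDX), size(IDX));

classes = unique(train_label);
Scores  = zeros(size(test_data, 1), numel(classes));
for c = 1:numel(classes)
    % Fraction of the K neighbors that carry class c.
    Scores(:, c) = mean(neighbor_label == classes(c), 2);
end

% Hard prediction: the class with the largest neighbor fraction.
[~, best] = max(Scores, [], 2);
predict_label = classes(best);
```

With this scheme the scores can only take the values 0, 1/K, ..., 1, so a larger K gives a finer-grained output.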
In the newer MATLAB 2012 release:
Factor = ClassificationKNN.fit(train_data, train_label, 'NumNeighbors', num_neighbors);
predict_label = predict(Factor, test_data);
[predict_label, Scores] = predict(Factor, test_data);
6. Ensemble learners (ensembles for boosting, bagging, or random subspace)
In the newer MATLAB 2012 release:
Factor = fitensemble(train_data, train_label, 'AdaBoostM2', 100, 'tree');
Factor = fitensemble(train_data, train_label, 'AdaBoostM2', 100, 'tree', 'type', 'classification');
Factor = fitensemble(train_data, train_label, 'Subspace', 50, 'KNN');
predict_label = predict(Factor, test_data);
[predict_label, Scores] = predict(Factor, test_data);
Results were far worse than expected.
7. Discriminant analysis classifier
Factor = ClassificationDiscriminant.fit(train_data, train_label);
Factor = ClassificationDiscriminant.fit(train_data, train_label, 'discrimType', 'pseudoLinear');  % discriminant type: pseudo-linear, ...
predict_label = predict(Factor, test_data);
[predict_label, Scores] = predict(Factor, test_data);
Reposted from: http://blog.csdn.net/xuhaijiao99/article/details/15027093