Using the caret Package, Part 4: Model Prediction and Evaluation
阿新 • Published: 2019-01-01
Original article: http://xccds.github.io/2011/09/caret_9105.html/
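Once a model has been built, we can use the predict function to make predictions, for example for the first five samples of the test set:
predict(gbmFit1, newdata = testx)[1:5]
To compare different models, we can also build a second model with bagged decision trees, named gbmFit2:
gbmFit2 = train(trainx, trainy, method = "treebag", trControl = fitControl)
Another way to obtain predictions is the extractPrediction function. Part of its output is shown below: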
models = list(gbmFit1, gbmFit2)
predValues = extractPrediction(models,testX = testx, testY = testy)
head(predValues)
     obs     pred model dataType  object
1 Active   Active   gbm Training Object1
2 Active   Active   gbm Training Object1
3 Active Inactive   gbm Training Object1
4 Active   Active   gbm Training Object1
5 Active   Active   gbm Training Object1
6 Active   Active   gbm Training Object1
From this we can extract the predictions for the test samples:
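testValues = subset(predValues, dataType == "Test")
To obtain predicted probabilities instead, use the extractProb function:
probValues = extractProb(models, testX = testx, testY = testy)
testProbs = subset(probValues, dataType == "Test")
For evaluating performance on a classification problem, the most important step is to inspect the confusion matrix of the predictions:
Pred1 = subset(testValues, model == "gbm")
Pred2 = subset(testValues, model == "treebag")
confusionMatrix(Pred1$pred, Pred1$obs)
confusionMatrix(Pred2$pred, Pred2$obs)
The results are shown below. The first model is slightly more accurate than the second: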
          Reference
Prediction Active Inactive
  Active       65       12
  Inactive      9       45
Accuracy : 0.8397

          Reference
Prediction Active Inactive
  Active       63       12
  Inactive     11       45
Accuracy : 0.8244
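
The accuracy figures above can be checked by hand: accuracy is the sum of the diagonal of the confusion matrix divided by the total number of test samples. The following is a minimal base R sketch that re-enters the counts from the two tables. The names cm_gbm, cm_bag and accuracy are introduced here for illustration only and are not part of caret.

```r
# Counts copied from the two confusion matrices above.
# matrix() fills column by column, so each pair is one Reference column.
lev <- c("Active", "Inactive")
cm_gbm <- matrix(c(65, 9, 12, 45), nrow = 2,
                 dimnames = list(Prediction = lev, Reference = lev))
cm_bag <- matrix(c(63, 11, 12, 45), nrow = 2,
                 dimnames = list(Prediction = lev, Reference = lev))

# Accuracy = correctly classified samples / all samples
# (hypothetical helper, not a caret function).
accuracy <- function(cm) sum(diag(cm)) / sum(cm)

acc_gbm <- accuracy(cm_gbm)  # 110 / 131
acc_bag <- accuracy(cm_bag)  # 108 / 131
print(round(c(gbm = acc_gbm, treebag = acc_bag), 4))
```

Both test sets contain 131 samples, so the gap between the two models comes down to two test samples, which is why the difference is only a slight one.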
Finally, we use the ROCR package to plot the ROC curve:
prob1 = subset(testProbs, model == "gbm")
prob2 = subset(testProbs, model == "treebag")
library(ROCR)
prob1$lable = ifelse(prob1$obs == 'Active', yes = 1, 0)
pred1 = prediction(prob1$Active, prob1$lable)
perf1 = performance(pred1, measure="tpr", x.measure="fpr" )
plot( perf1 )
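
To make clear what a ROC curve contains, here is a minimal base R sketch that computes the (fpr, tpr) points by hand and the area under the curve with the trapezoidal rule. It is an illustration, not ROCR's implementation: the scores and labels are made-up toy values, roc_points is a hypothetical helper, and the sketch assumes there are no tied scores.

```r
# Toy data (made up for illustration): 1 = Active, 0 = Inactive.
labels <- c(1, 1, 0, 1, 0, 0)
scores <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.2)  # predicted probability of Active

# Hypothetical helper: lower the threshold one sample at a time and
# record the true positive rate and false positive rate at each step.
# Assumes no tied scores.
roc_points <- function(scores, labels) {
  ord <- order(scores, decreasing = TRUE)
  labels <- labels[ord]
  tpr <- c(0, cumsum(labels == 1) / sum(labels == 1))
  fpr <- c(0, cumsum(labels == 0) / sum(labels == 0))
  data.frame(fpr = fpr, tpr = tpr)
}

roc <- roc_points(scores, labels)

# Area under the curve by the trapezoidal rule.
auc <- sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
print(roc)
print(auc)
```

Calling plot(roc$fpr, roc$tpr, type = "s") on the result draws the same kind of step curve. With real data, prob1$Active and prob1$lable from the code above would take the place of scores and labels.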