RF: Using a random forest to predict with high accuracy whether a breast tumor is malignant or benign from its feature vector (Jason niu)

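% RF: use a random forest to predict, from breast tumor feature vectors,
% whether a tumor is malignant or benign.

% Load the data set: 569 cases, class label in column 2
% (1 = benign, 2 = malignant), features from column 3 onward.
load data.mat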
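
% Optional sketch (not in the original script): fix the random seed so the
% train/test split below is repeatable, and check the data layout the script
% relies on. Assumes a MATLAB release that provides rng; the seed value 0 is
% arbitrary.
rng(0);
assert(size(data,1) >= 569, 'Expected at least 569 rows in data');
assert(all(ismember(data(:,2),[1 2])), 'Expected labels 1 (benign) or 2 (malignant) in column 2');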

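% Randomly split the 569 cases into 500 training cases and 69 test cases.
a = randperm(569);
Train = data(a(1:500),:);
Test = data(a(501:end),:);

% Training features and labels.
P_train = Train(:,3:end);
T_train = Train(:,2);

% Test features and labels.
P_test = Test(:,3:end);
T_test = Test(:,2);

% Train the random forest with the toolbox default settings.
model = classRF_train(P_train,T_train);

% Predict the test set; votes holds the number of trees voting for each class.
[T_sim,votes] = classRF_predict(P_test,model);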

% Count benign (1) and malignant (2) cases in the training set, the full
% data set, and the test set.
count_B = length(find(T_train == 1));
count_M = length(find(T_train == 2));
total_B = length(find(data(:,2) == 1));
total_M = length(find(data(:,2) == 2));
number_B = length(find(T_test == 1));
number_M = length(find(T_test == 2));
% Count the test cases of each class that were predicted correctly.
number_B_sim = length(find(T_sim == 1 & T_test == 1));
number_M_sim = length(find(T_sim == 2 & T_test == 2));
disp(['Total cases: ' num2str(569)...
      '  Benign: ' num2str(total_B)...
      '  Malignant: ' num2str(total_M)]);
disp(['Training set cases: ' num2str(500)...
      '  Benign: ' num2str(count_B)...
      '  Malignant: ' num2str(count_M)]);
disp(['Test set cases: ' num2str(69)...
      '  Benign: ' num2str(number_B)...
      '  Malignant: ' num2str(number_M)]);
disp(['Benign tumors correctly diagnosed: ' num2str(number_B_sim)...
      '  Misdiagnosed: ' num2str(number_B - number_B_sim)...
      '  Correct diagnosis rate p1 = ' num2str(number_B_sim/number_B*100) '%']);
disp(['Malignant tumors correctly diagnosed: ' num2str(number_M_sim)...
      '  Misdiagnosed: ' num2str(number_M - number_M_sim)...
      '  Correct diagnosis rate p2 = ' num2str(number_M_sim/number_M*100) '%']);
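
% Optional sketch (not in the original script): summarize the same test
% results as a 2x2 confusion matrix and an overall accuracy.
% Assumes T_test and T_sim contain only the labels 1 (benign) and 2
% (malignant); conf_mat and overall_acc are names introduced here for
% illustration.
conf_mat = accumarray([double(T_test(:)) double(T_sim(:))], 1, [2 2]);
overall_acc = sum(diag(conf_mat)) / sum(conf_mat(:));
disp('Confusion matrix (rows = true class, columns = predicted class):');
disp(conf_mat);
disp(['Overall test accuracy = ' num2str(overall_acc*100) '%']);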
  
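% Plot the per-class vote counts for every test sample.
figure

% Misclassified samples as red stars.
index = find(T_sim ~= T_test);
plot(votes(index,1),votes(index,2),'r*')
hold on

% Correctly classified samples as blue circles.
index = find(T_sim == T_test);
plot(votes(index,1),votes(index,2),'bo')
hold on

legend('Red *: misclassified samples','Blue o: correctly classified samples')

% Reference line x + y = 500 (the two vote counts sum to the number of
% trees, assuming the forest was trained with 500 trees).
plot(0:500,500:-1:0,'r-.')
hold on

% Reference line x = y, where both classes receive the same number of votes.
plot(0:500,0:500,'r-.')
hold on

% Box marking the region where each class receives between 100 and 400 votes.
line([100 400 400 100 100],[100 100 400 400 100])

xlabel('Number of trees voting for class 1')
ylabel('Number of trees voting for class 2')
title('Random forest classifier performance analysis (Jason niu)')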


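% Effect of the number of trees: for each forest size from 50 to 1000 in
% steps of 50, train 100 forests and average their test accuracy.
Accuracy = zeros(1,20);
for i = 50:50:1000
    i  % print the current forest size as a progress indicator
    accuracy = zeros(1,100);
    for k = 1:100
        model = classRF_train(P_train,T_train,i);
        T_sim = classRF_predict(P_test,model);
        accuracy(k) = length(find(T_sim == T_test)) / length(T_test);
    end
    Accuracy(i/50) = mean(accuracy);
end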
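
% Optional sketch (not in the original script): report which forest size gave
% the highest mean accuracy in the sweep above. ntree_grid, best_acc,
% best_idx and best_ntree are names introduced here for illustration.
% The accuracy is measured on the test set, so it is an optimistic guide
% rather than an unbiased estimate.
ntree_grid = 50:50:1000;
[best_acc, best_idx] = max(Accuracy);
best_ntree = ntree_grid(best_idx);
disp(['Best mean accuracy = ' num2str(best_acc*100) '% at ' num2str(best_ntree) ' trees']);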


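% Plot mean test accuracy against the number of trees.
figure
plot(50:50:1000,Accuracy)
xlabel('Number of decision trees in the random forest')
ylabel('Classification accuracy')
title('Effect of the number of decision trees on performance (Jason niu)')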

  
