Coursera Machine Learning (Andrew Ng), Week 4 Programming Assignment
阿新 • Published: 2019-02-04
1 Multi-class Classification
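If this problem is cast as a neural network, the model has only two layers: an input layer and an output layer. The input layer consists of 400 neurons (one per pixel) and the output layer of 10 neurons. The output neurons are numbered 1 to 10 and represent the digits 1 to 9 and 0 (label 10 stands for 0). Each output neuron gives the predicted probability that the input image shows the digit matching that neuron's number, and the number of the neuron with the highest probability is taken as the predicted digit.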
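The selection rule can be sketched on a single image as follows. This is only an illustration: the vector `probs` holds made-up output values and the name `digit` is hypothetical, neither comes from the assignment files.

```octave
% Hypothetical output-layer values for one image, one entry per label 1..10.
probs = [0.01 0.02 0.05 0.01 0.03 0.02 0.01 0.04 0.02 0.90];

% The predicted label is the index of the largest entry.
[~, label] = max(probs);

% Label 10 stands for the digit 0; labels 1..9 map to themselves.
digit = mod(label, 10);
```

Here the tenth entry is the largest, so the predicted label is 10 and the corresponding digit is 0.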
1.3 Vectorizing Logistic Regression
function [J, grad] = lrCostFunction(theta, X, y, lambda)
% Initialize some useful values
m = length(y); % number of training examples
% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));
% Regularized cost; the bias term theta(1) is excluded from the penalty
J = (-y' * log(sigmoid(X * theta)) - (1 - y)' * log(1 - sigmoid(X * theta))) / m ...
    + lambda / 2 / m * sum(theta(2 : end) .^ 2);
% Copy of theta with the bias entry zeroed, so it is not regularized
temp = theta;
temp(1) = 0;
grad = (X' * (sigmoid(X * theta) - y) + lambda * temp) / m;
grad = grad(:);
end
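For reference, the quantities that the code computes can be written out as below. This is a transcription of the code into math notation, not a quotation from the course handout; the symbol \tilde{\theta} is introduced here for the copy of theta with its first entry set to zero (`temp` in the code), and \theta_0 corresponds to `theta(1)` because Octave indexes from 1.

```latex
\begin{aligned}
h &= g(X\theta), \qquad g(z) = \frac{1}{1 + e^{-z}} \\
J(\theta) &= \frac{1}{m}\Bigl[-y^{\top}\log h - (1 - y)^{\top}\log(1 - h)\Bigr]
             + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^{2} \\
\nabla_{\theta} J &= \frac{1}{m}\Bigl[X^{\top}(h - y) + \lambda\,\tilde{\theta}\Bigr],
  \qquad \tilde{\theta} = (0, \theta_1, \dots, \theta_n)^{\top}
\end{aligned}
```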
1.4 One-vs-all Classification
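Note that when calling fmincg, the initial value of theta is all_theta(c,:)' , with a transpose: lrCostFunction expects theta as a column vector, while each row of all_theta is a row vector. The column vector returned by fmincg is transposed again before being stored back in row c.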
function [all_theta] = oneVsAll(X, y, num_labels, lambda)
% Some useful variables
m = size(X, 1); % number of rows: 5000
n = size(X, 2); % number of columns: 400
% You need to return the following variables correctly
all_theta = zeros(num_labels, n + 1); % 10 * 401
% Add ones to the X data matrix
X = [ones(m, 1) X]; % 5000 * 401
cost = 0;
options = optimset('GradObj', 'on', 'MaxIter', 50);
for c = 1 : num_labels
    % lrCostFunction() expects theta as a column vector, hence all_theta(c,:)'
    all_theta(c,:) = fmincg(@(t)(lrCostFunction(t, X, (y == c), lambda)), all_theta(c,:)', options)';
end
end
1.4.1 One-vs-all Prediction
function p = predictOneVsAll(all_theta, X)
m = size(X, 1);
num_labels = size(all_theta, 1);
% You need to return the following variables correctly
p = zeros(size(X, 1), 1);
% Add ones to the X data matrix
X = [ones(m, 1) X];
% X: 5000 * 401, all_theta: 10 * 401
% X * all_theta': 5000 * 10
[maxx, p] = max(X * all_theta', [], 2);
end
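The prediction above takes the maximum over the raw scores X * all_theta' without applying the sigmoid. That works because the sigmoid is strictly increasing, so it does not change which column is largest. A small sketch on made-up data (the values of `toy_theta` and `toy_X`, and all variable names here, are hypothetical):

```octave
% Toy setup: 2 classes, 1 feature plus a bias term (values are made up).
toy_theta = [ 1 -1;    % parameters for class 1: [bias, weight]
             -1  1];   % parameters for class 2
toy_X = [-2; 3];       % two examples with one feature each

sigmoid = @(z) 1 ./ (1 + exp(-z));

Xb = [ones(size(toy_X, 1), 1) toy_X];   % add the bias column
scores = Xb * toy_theta';               % 2 x 2 matrix of raw scores

[~, p_scores] = max(scores, [], 2);           % argmax over raw scores
[~, p_probs]  = max(sigmoid(scores), [], 2);  % argmax over probabilities

same = isequal(p_scores, p_probs);   % true: both rules pick the same labels
```

The row-wise maximum with `max(..., [], 2)` returns one label per example, which is why the result has the same number of rows as the input.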
2 Neural Networks
function p = predict(Theta1, Theta2, X)
%PREDICT Predict the label of an input given a trained neural network
% p = PREDICT(Theta1, Theta2, X) outputs the predicted label of X given the
% trained weights of a neural network (Theta1, Theta2)
% Useful values
m = size(X, 1);
num_labels = size(Theta2, 1);
% You need to return the following variables correctly
p = zeros(size(X, 1), 1);
X = [ones(m, 1) X];
a2 = [ones(m, 1) sigmoid(X * Theta1')];
%size(a2)
[maxx, p] = max(sigmoid(a2 * Theta2'), [], 2);
end
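For a single example x, the forward pass that predict carries out can be written as below. This is a restatement of the code in math notation, assuming g is the sigmoid function and writing \Theta^{(1)}, \Theta^{(2)} for Theta1 and Theta2; the code performs the same computation for all rows of X at once, which is why it multiplies by the transposed weight matrices.

```latex
\begin{aligned}
a^{(1)} &= \begin{bmatrix} 1 \\ x \end{bmatrix} \\
a^{(2)} &= \begin{bmatrix} 1 \\ g\bigl(\Theta^{(1)} a^{(1)}\bigr) \end{bmatrix} \\
h_{\Theta}(x) &= g\bigl(\Theta^{(2)} a^{(2)}\bigr) \\
p &= \arg\max_{k}\; \bigl(h_{\Theta}(x)\bigr)_k
\end{aligned}
```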