Softmax & SVM loss & gradient: formulas, diagrams, and Python implementations
阿新 · Published: 2018-12-10
Softmax and SVM are both used to classify data. Softmax is commonly used as the output layer of a neural network, while a linear SVM is often trained directly with SGD for object classification. Both need to compute a loss and a gradient during training, and the two computations turn out to be quite similar, so this post puts them side by side for comparison.
Formulas
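The formula images from the original post are not reproduced here; the following is a LaTeX reconstruction consistent with the vectorized code below, where the scores are s = XW, there are N examples, and the SVM margin is Δ = 1.

Softmax (cross-entropy) loss and its gradient with respect to the scores:

\[
L = \frac{1}{N}\sum_{i=1}^{N} -\log\frac{e^{s_{i,y_i}}}{\sum_{j} e^{s_{i,j}}},
\qquad
\frac{\partial L}{\partial s_{i,j}} = \frac{1}{N}\bigl(p_{i,j} - \mathbf{1}[j = y_i]\bigr),
\quad
p_{i,j} = \frac{e^{s_{i,j}}}{\sum_{k} e^{s_{i,k}}}
\]

SVM (hinge) loss and its gradient with respect to the scores:

\[
L = \frac{1}{N}\sum_{i=1}^{N}\,\sum_{j \ne y_i} \max\bigl(0,\; s_{i,j} - s_{i,y_i} + \Delta\bigr)
\]

\[
\frac{\partial L}{\partial s_{i,j}} = \frac{1}{N}\,\mathbf{1}\bigl[s_{i,j} - s_{i,y_i} + \Delta > 0\bigr]\ (j \ne y_i),
\qquad
\frac{\partial L}{\partial s_{i,y_i}} = -\frac{1}{N}\sum_{j \ne y_i} \mathbf{1}\bigl[s_{i,j} - s_{i,y_i} + \Delta > 0\bigr]
\]

In both cases the gradient with respect to the weights follows from the chain rule through s = XW:

\[
\frac{\partial L}{\partial W} = X^{\top}\,\frac{\partial L}{\partial s}
\]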
Diagrams
(original diagram images not reproduced)
Python implementation
""" Structured softmax and SVM loss function. Inputs have dimension D, there are C classes, and we operate on minibatches of N examples. Inputs: - W: A numpy array of shape (D, C) containing weights. - X: A numpy array of shape (N, D) containing a minibatch of data. - y: A numpy array of shape (N,) containing training labels; y[i] = c means that X[i] has label c, where 0 <= c < C. Returns a tuple of: - loss as single float - gradient with respect to weights W; an array of same shape as W """ def softmax_loss_vectorized(W, X, y): loss = 0.0 dW = np.zeros_like(W) num_train = X.shape[0] score = X.dot(W) shift_score = score - np.max(score, axis=1, keepdims=True) # 對資料做了一個平移 shift_score_exp = np.exp(shift_score) shift_score_exp_sum = np.sum(shift_score_exp, axis=1, keepdims=True) score_norm = shift_score_exp / shift_score_exp_sum loss = np.sum(-np.log(score_norm[range(score_norm.shape[0]), y])) / num_train # dW d_score = score_norm d_score[range(d_score.shape[0]), y] -= 1 dW = X.T.dot(score_norm) / num_train return loss, dW def svm_loss_vectorized(W, X, y): delta = 1 num_training = X.shape[0] scores = X.dot(W) scores_gt_cls = scores[range(num_training), y][..., np.newaxis] scores_dis = scores - scores_gt_cls + delta scores_dis[range(num_training), y] -= delta scores_norm = np.maximum(0, scores_dis) loss = np.sum(scores_norm) / num_training d_scores = scores_norm d_scores[d_scores > 0] = 1 # 出現錯誤得分的地方統統設為1 row_sum=np.sum(d_scores, axis=1) d_scores[range(num_training), y] -= row_sum dW = X.T.dot(d_scores)/num_training return loss, dW