cs231n assignment1--Softmax
With the SVM part finished, this section is relatively easy; most of it mirrors the SVM implementation.
For the gradient derivation I mainly referred to this article:
http://www.jianshu.com/p/004c99623104multiclass
Gradient derivation:
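In brief, for a single example $x_i$ with label $y_i$ and scores $s = x_i W$ (so $s_j = x_i \cdot w_j$), the softmax loss is

$$L_i = -\log\frac{e^{s_{y_i}}}{\sum_j e^{s_j}} = -s_{y_i} + \log\sum_j e^{s_j}.$$

Writing $p_j = e^{s_j}/\sum_k e^{s_k}$ for the predicted probability of class $j$, differentiating gives $\partial L_i/\partial s_j = p_j - \mathbb{1}\{j = y_i\}$, and by the chain rule

$$\frac{\partial L_i}{\partial w_j} = \left(p_j - \mathbb{1}\{j = y_i\}\right)\, x_i^T.$$

That is, every column $j$ of $dW$ accumulates $p_j\, x_i$, and the correct-class column additionally receives $-x_i$; averaging over the minibatch and adding the regularization gradient $\mathrm{reg}\cdot W$ completes the gradient.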
The vectorized implementation is analogous to the SVM one, so having implemented the SVM, softmax is not hard to write.
For reference, the svm.py code from the previous part is listed below; softmax.py follows the same skeleton (a sketch of it is given after the listing):
import numpy as np
from random import shuffle
def svm_loss_naive(W, X, y, reg):
    """
    Structured SVM loss function, naive implementation (with loops).
    Inputs have dimension D, there are C classes, and we operate on minibatches
    of N examples.
    Inputs:
    - W: A numpy array of shape (D, C) containing weights.
    - X: A numpy array of shape (N, D) containing a minibatch of data.
    - y: A numpy array of shape (N,) containing training labels; y[i] = c means
      that X[i] has label c, where 0 <= c < C.
    - reg: (float) regularization strength
    Returns a tuple of:
    - loss as single float
    - gradient with respect to weights W; an array of same shape as W
    """
    dW = np.zeros(W.shape)  # initialize the gradient as zero

    # compute the loss and the gradient
    num_classes = W.shape[1]  # C
    num_train = X.shape[0]    # N
    loss = 0.0
    for i in xrange(num_train):
        scores = X[i].dot(W)
        correct_class_score = scores[y[i]]
        for j in xrange(num_classes):
            if j == y[i]:
                continue
            margin = scores[j] - correct_class_score + 1  # note delta = 1
            if margin > 0:
                loss += margin
                dW[:, y[i]] -= X[i, :]
                dW[:, j] += X[i, :]

    # Right now the loss is a sum over all training examples, but we want it
    # to be an average instead so we divide by num_train.
    loss /= num_train
    dW /= num_train

    # Add regularization to the loss (and its gradient).
    loss += 0.5 * reg * np.sum(W * W)
    dW += reg * W

    #############################################################################
    # TODO:                                                                     #
    # Compute the gradient of the loss function and store it in dW.             #
    # Rather than first computing the loss and then computing the derivative,   #
    # it may be simpler to compute the derivative at the same time that the     #
    # loss is being computed. As a result you may need to modify some of the    #
    # code above to compute the gradient.                                       #
    #############################################################################

    return loss, dW
def svm_loss_vectorized(W, X, y, reg):
    """
    Structured SVM loss function, vectorized implementation.
    Inputs and outputs are the same as svm_loss_naive.
    """
    loss = 0.0
    num_train = X.shape[0]
    dW = np.zeros(W.shape)  # initialize the gradient as zero

    #############################################################################
    # TODO:                                                                     #
    # Implement a vectorized version of the structured SVM loss, storing the    #
    # result in loss.                                                           #
    #############################################################################
    scores = np.dot(X, W)
    correct_class_scores = scores[np.arange(num_train), y]
    correct_class_scores = np.reshape(correct_class_scores, (num_train, -1))
    margin = scores - correct_class_scores + 1.0  # numpy broadcasting
    margin[np.arange(num_train), y] = 0.0
    margin[margin <= 0] = 0.0
    loss += np.sum(margin) / num_train
    loss += 0.5 * reg * np.sum(W * W)
    #############################################################################
    #                             END OF YOUR CODE                              #
    #############################################################################

    #############################################################################
    # TODO:                                                                     #
    # Implement a vectorized version of the gradient for the structured SVM     #
    # loss, storing the result in dW.                                           #
    #                                                                           #
    # Hint: Instead of computing the gradient from scratch, it may be easier    #
    # to reuse some of the intermediate values that you used to compute the     #
    # loss.                                                                     #
    #############################################################################
    margin[margin > 0] = 1.0
    row_sum = np.sum(margin, axis=1)
    margin[np.arange(num_train), y] = -row_sum
    dW = 1.0 / num_train * np.dot(X.T, margin) + reg * W
    #############################################################################
    #                             END OF YOUR CODE                              #
    #############################################################################

    return loss, dW
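For completeness, here is a minimal sketch of what the two functions in softmax.py could look like, following the same interface as the SVM functions above. This is the standard approach (shifting scores by their row maximum for numerical stability), not necessarily the exact code that was submitted:

import numpy as np

def softmax_loss_naive(W, X, y, reg):
    """
    Softmax loss function, naive implementation (with loops).
    Inputs and outputs follow svm_loss_naive above.
    """
    dW = np.zeros_like(W)
    num_classes = W.shape[1]
    num_train = X.shape[0]
    loss = 0.0
    for i in range(num_train):
        scores = X[i].dot(W)
        scores -= np.max(scores)                    # shift for numerical stability
        probs = np.exp(scores) / np.sum(np.exp(scores))
        loss += -np.log(probs[y[i]])
        for j in range(num_classes):
            # dL_i/dw_j = (p_j - 1{j == y_i}) * x_i
            dW[:, j] += (probs[j] - (j == y[i])) * X[i]
    loss = loss / num_train + 0.5 * reg * np.sum(W * W)
    dW = dW / num_train + reg * W
    return loss, dW

def softmax_loss_vectorized(W, X, y, reg):
    """
    Softmax loss function, vectorized version.
    """
    num_train = X.shape[0]
    scores = X.dot(W)                                # shape (N, C)
    scores -= np.max(scores, axis=1, keepdims=True)  # numerical stability
    exp_scores = np.exp(scores)
    probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)
    loss = -np.sum(np.log(probs[np.arange(num_train), y])) / num_train
    loss += 0.5 * reg * np.sum(W * W)
    dscores = probs                                  # dL/ds = p - one_hot(y)
    dscores[np.arange(num_train), y] -= 1.0
    dW = X.T.dot(dscores) / num_train + reg * W
    return loss, dW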
Notebook code for the assignment:
Softmax exercise
Complete and hand in this completed worksheet (including its outputs and any supporting code outside of the worksheet) with your assignment submission. For more details see the assignments page on the course website.
This exercise is analogous to the SVM exercise. You will:
implement a fully-vectorized loss function for the Softmax classifier
implement the fully-vectorized expression for its analytic gradient
check your implementation with numerical gradient
use a validation set to tune the learning rate and regularization strength
optimize the loss function with SGD
visualize the final learned weights
import random
import numpy as np
from cs231n.data_utils import load_CIFAR10
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'
# for auto-reloading external modules
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2
def get_CIFAR10_data(num_training=49000, num_validation=1000, num_test=1000, num_dev=500):
    """
    Load the CIFAR-10 dataset from disk and perform preprocessing to prepare
    it for the linear classifier. These are the same steps as we used for the
    SVM, but condensed to a single function.
    """
    # Load the raw CIFAR-10 data
    cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'
    X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

    # subsample the data
    mask = range(num_training, num_training + num_validation)
    X_val = X_train[mask]
    y_val = y_train[mask]
    mask = range(num_training)
    X_train = X_train[mask]
    y_train = y_train[mask]
    mask = range(num_test)
    X_test = X_test[mask]
    y_test = y_test[mask]
    mask = np.random.choice(num_training, num_dev, replace=False)
    X_dev = X_train[mask]
    y_dev = y_train[mask]

    # Preprocessing: reshape the image data into rows
    X_train = np.reshape(X_train, (X_train.shape[0], -1))
    X_val = np.reshape(X_val, (X_val.shape[0], -1))
    X_test = np.reshape(X_test, (X_test.shape[0], -1))
    X_dev = np.reshape(X_dev, (X_dev.shape[0], -1))

    # Normalize the data: subtract the mean image
    mean_image = np.mean(X_train, axis=0)
    X_train -= mean_image
    X_val -= mean_image
    X_test -= mean_image
    X_dev -= mean_image

    # add bias dimension and transform into columns
    X_train = np.hstack([X_train, np.ones((X_train.shape[0], 1))])
    X_val = np.hstack([X_val, np.ones((X_val.shape[0], 1))])
    X_test = np.hstack([X_test, np.ones((X_test.shape[0], 1))])
    X_dev = np.hstack([X_dev, np.ones((X_dev.shape[0], 1))])

    return X_train, y_train, X_val, y_val, X_test, y_test, X_dev, y_dev
# Invoke the above function to get our data.
X_train, y_train, X_val, y_val, X_test, y_test, X_dev, y_dev = get_CIFAR10_data()
print 'Train data shape: ', X_train.shape
print 'Train labels shape: ', y_train.shape
print 'Validation data shape: ', X_val.shape
print 'Validation labels shape: ', y_val.shape
print 'Test data shape: ', X_test.shape
print 'Test labels shape: ', y_test.shape
print 'dev data shape: ', X_dev.shape
print 'dev labels shape: ', y_dev.shape
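With the default arguments (49,000 training, 1,000 validation, 1,000 test and 500 dev examples, each image flattened to 32x32x3 = 3072 pixels plus one bias dimension), these prints should report:

Train data shape:  (49000, 3073)
Train labels shape:  (49000,)
Validation data shape:  (1000, 3073)
Validation labels shape:  (1000,)
Test data shape:  (1000, 3073)
Test labels shape:  (1000,)
dev data shape:  (500, 3073)
dev labels shape:  (500,)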
Softmax Classifier
Your code for this section will all be written inside cs231n/classifiers/softmax.py.
# First implement the naive softmax loss function with nested loops.
# Open the file cs231n/classifiers/softmax.py and implement the
# softmax_loss_naive function.
from cs231n.classifiers.softmax import softmax_loss_naive
import time
# Generate a random softmax weight matrix and use it to compute the loss.
W = np.random.randn(3073, 10) * 0.0001
loss, grad = softmax_loss_naive(W, X_dev, y_dev, 0.0)
# As a rough sanity check, our loss should be something close to -log(0.1).
print 'loss: %f' % loss
print 'sanity check: %f' % (-np.log(0.1))
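Why $-\log(0.1)$? With small random weights all ten class scores are close to zero, so the softmax assigns each class a probability of roughly $1/10$, and the expected loss per example is

$$L \approx -\log\frac{1}{10} = \log 10 \approx 2.302.$$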
# Complete the implementation of softmax_loss_naive and implement a (naive)
# version of the gradient that uses nested loops.
loss, grad = softmax_loss_naive(W, X_dev, y_dev, 0.0)
# As we did for the SVM, use numeric gradient checking as a debugging tool.
# The numeric gradient should be close to the analytic gradient.
from cs231n.gradient_check import grad_check_sparse
f = lambda w: softmax_loss_naive(w, X_dev, y_dev, 0.0)[0]
grad_numerical = grad_check_sparse(f, W, grad, 10)
# similar to SVM case, do another gradient check with regularization
loss, grad = softmax_loss_naive(W, X_dev, y_dev, 1e2)
f = lambda w: softmax_loss_naive(w, X_dev, y_dev, 1e2)[0]
grad_numerical = grad_check_sparse(f, W, grad, 10)
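grad_check_sparse samples a few random entries of W and compares a centered finite difference against the analytic gradient at each of them. A rough sketch of the idea (not the course's exact implementation; the name sparse_grad_check is just for illustration):

import numpy as np

def sparse_grad_check(f, x, analytic_grad, num_checks=10, h=1e-5):
    # Compare the analytic gradient with a centered numerical estimate
    # at a few randomly chosen coordinates of x.
    for _ in range(num_checks):
        ix = tuple([np.random.randint(m) for m in x.shape])
        oldval = x[ix]
        x[ix] = oldval + h
        fxph = f(x)                      # f(x + h)
        x[ix] = oldval - h
        fxmh = f(x)                      # f(x - h)
        x[ix] = oldval                   # restore the original value
        grad_numerical = (fxph - fxmh) / (2 * h)
        grad_analytic = analytic_grad[ix]
        rel_error = (abs(grad_numerical - grad_analytic) /
                     (abs(grad_numerical) + abs(grad_analytic) + 1e-12))
        print('numerical: %f analytic: %f, relative error: %e'
              % (grad_numerical, grad_analytic, rel_error))

As a rule of thumb, relative errors around 1e-7 or smaller indicate the analytic gradient is correct.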
# Now that we have a naive implementation of the softmax loss function and its gradient,
# implement a vectorized version in softmax_loss_vectorized.
# The two versions should compute the same results, but the vectorized version should be
# much faster.
tic = time.time()
loss_naive, grad_naive = softmax_loss_naive(W, X_dev, y_dev, 0.00001)
toc = time.time()
print 'naive loss: %e computed in %fs' % (loss_naive, toc - tic)
from cs231n.classifiers.softmax import softmax_loss_vectorized
tic = time.time()
loss_vectorized, grad_vectorized = softmax_loss_vectorized(W, X_dev, y_dev, 0.00001)
toc = time.time()
print 'vectorized loss: %e computed in %fs' % (loss_vectorized, toc - tic)
# As we did for the SVM, we use the Frobenius norm to compare the two versions
# of the gradient.
grad_difference = np.linalg.norm(grad_naive - grad_vectorized, ord='fro')
print 'Loss difference: %f' % np.abs(loss_naive - loss_vectorized)
print 'Gradient difference: %f' % grad_difference
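The Frobenius norm treats the difference of the two gradient matrices as one long vector, $\|A\|_F = \sqrt{\sum_{i,j} A_{ij}^2}$; if both implementations are correct, the loss and gradient differences printed here should be essentially zero (up to floating-point round-off).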
# Use the validation set to tune hyperparameters (regularization strength and
# learning rate). You should experiment with different ranges for the learning
# rates and regularization strengths; if you are careful you should be able to
# get a classification accuracy of over 0.35 on the validation set.
from cs231n.classifiers import Softmax
results = {}
best_val = -1
best_softmax = None
learning_rates = [1e-7, 5e-7]
regularization_strengths = [5e4, 1e8]
for i in range(np.shape(learning_rates)[0]):
    for j in range(np.shape(regularization_strengths)[0]):
        softmax = Softmax()
        learning_rate = learning_rates[i]
        reg = regularization_strengths[j]
        loss_hist = softmax.train(X_train, y_train, learning_rate, reg,
                                  num_iters=1500, verbose=True)
        y_train_pred = softmax.predict(X_train)
        training_accuracy = np.mean(y_train == y_train_pred)
        y_val_pred = softmax.predict(X_val)
        validation_accuracy = np.mean(y_val == y_val_pred)
        results[(learning_rate, reg)] = (training_accuracy, validation_accuracy)
        if best_val < validation_accuracy:
            best_val = validation_accuracy
            best_softmax = softmax
################################################################################
# TODO: #
# Use the validation set to set the learning rate and regularization strength. #
# This should be identical to the validation that you did for the SVM; save #
# the best trained softmax classifer in best_softmax. #
################################################################################
pass
################################################################################
# END OF YOUR CODE #
################################################################################
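The Softmax class used above lives in cs231n/classifiers/linear_classifier.py, which is not listed in this post; its train method runs plain mini-batch SGD on the softmax loss. A minimal sketch of that loop, assuming a loss_fn with the same signature as softmax_loss_vectorized (the standalone helper train_sgd is made up for illustration):

import numpy as np

def train_sgd(W, X, y, loss_fn, learning_rate=1e-7, reg=5e4,
              num_iters=1500, batch_size=200, verbose=False):
    # Plain mini-batch stochastic gradient descent.
    num_train = X.shape[0]
    loss_history = []
    for it in range(num_iters):
        # Sample a mini-batch (sampling with replacement keeps this simple).
        idx = np.random.choice(num_train, batch_size)
        X_batch, y_batch = X[idx], y[idx]
        loss, grad = loss_fn(W, X_batch, y_batch, reg)
        loss_history.append(loss)
        W -= learning_rate * grad        # vanilla gradient step
        if verbose and it % 100 == 0:
            print('iteration %d / %d: loss %f' % (it, num_iters, loss))
    return W, loss_history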
# Print out results.
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print 'lr %e reg %e train accuracy: %f val accuracy: %f' % (
        lr, reg, train_accuracy, val_accuracy)
print 'best validation accuracy achieved during cross-validation: %f' % best_val
# evaluate on test set
# Evaluate the best softmax on test set
y_test_pred = best_softmax.predict(X_test)
test_accuracy = np.mean(y_test == y_test_pred)
print 'softmax on raw pixels final test set accuracy: %f' % (test_accuracy, )
# Visualize the learned weights for each class
w = best_softmax.W[:-1,:] # strip out the bias
w = w.reshape(32, 32, 3, 10)
w_min, w_max = np.min(w), np.max(w)
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
for i in xrange(10):
    plt.subplot(2, 5, i + 1)
    # Rescale the weights to be between 0 and 255
    wimg = 255.0 * (w[:, :, :, i].squeeze() - w_min) / (w_max - w_min)
    plt.imshow(wimg.astype('uint8'))
    plt.axis('off')
    plt.title(classes[i])