CS231n Assignment 1 Summary
Lecture 3 covers some basic background on the chain rule.
Below is a summary of some key points and interesting implementation details in the assignment1 code.
Reference solutions: https://github.com/sharedeeply/cs231n-assignment-solution/blob/master/assignment1/
Reading the dataset
For the dataset used in assignment1, each image is read out flattened into a one-dimensional vector, so the number of rows of X_train and X_test equals the number of samples, and the number of columns equals the number of pixels per image. y is a one-dimensional array whose length equals the number of samples, and each entry is the class label of the corresponding image.
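As a rough sketch of the shapes involved (assuming the load_CIFAR10 helper from cs231n/data_utils.py and a standard dataset path):
import numpy as np
from cs231n.data_utils import load_CIFAR10  # helper shipped with the assignment

# Load the raw CIFAR-10 data; the path below is an assumption, adjust it to your setup
X_train, y_train, X_test, y_test = load_CIFAR10('cs231n/datasets/cifar-10-batches-py')

# Flatten each 32x32x3 image into one row of 3072 pixel values
X_train = np.reshape(X_train, (X_train.shape[0], -1))  # (50000, 3072)
X_test = np.reshape(X_test, (X_test.shape[0], -1))     # (10000, 3072)
print(X_train.shape, y_train.shape)                    # (50000, 3072) (50000,)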
KNN
Visualizing a few training examples
# Visualize some examples from the dataset.
# We show a few examples of training images from each class.
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
num_classes = len(classes)
samples_per_class = 7
for y, cls in enumerate(classes):
    idxs = np.flatnonzero(y_train == y)  # indices of the nonzero entries, i.e. samples of class y
    idxs = np.random.choice(idxs, samples_per_class, replace=False)  # draw samples_per_class indices without replacement; the same function can also sample with given probabilities
    for i, idx in enumerate(idxs):
        plt_idx = i * num_classes + y + 1  # subplot numbering starts at 1
        plt.subplot(samples_per_class, num_classes, plt_idx)
        plt.imshow(X_train[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls)
plt.show()
The KNearestNeighbor class is imported and used in the notebook via
from cs231n.classifiers import KNearestNeighbor
# Create a kNN classifier instance.
# Remember that training a kNN classifier is a noop:
# the Classifier simply remembers the data and does no further processing
classifier = KNearestNeighbor()
classifier.train(X_train, y_train)
This works thanks to the __init__.py file, whose role is worth introducing:
- It marks the directory as a Python package and must not be deleted.
- It can define __all__ to control what a wildcard import (from package import *) exposes.
- It can contain Python code, though this is not recommended; keep __init__.py simple and put real modules in separate files inside the package.
Reference: https://www.cnblogs.com/AlwinXu/p/5598543.html
The __init__.py file in the classifiers folder is what makes the import above work.
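A minimal sketch of what that __init__.py might look like (illustrative only; the actual file in the assignment may differ):
# cs231n/classifiers/__init__.py
from cs231n.classifiers.k_nearest_neighbor import *
from cs231n.classifiers.linear_classifier import *
This is why from cs231n.classifiers import KNearestNeighbor resolves even though the class lives in k_nearest_neighbor.py.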
The implementation of the KNearestNeighbor class itself is fairly simple and is not covered in detail here; instead, a few statements I had not seen before are highlighted.
# Broadcast train_sq into a matrix of shape (num_train, num_test)
train_sq = np.broadcast_to(train_sq, shape=(num_train, num_test))  # train_sq must be a 2-D array here, not of shape (n,)
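For reference, a minimal sketch of the fully vectorized distance computation based on the expansion (a - b)^2 = a^2 - 2ab + b^2 (variable names here are illustrative, not necessarily those used in the assignment):
# dists[i, j] = L2 distance between the i-th test image and the j-th training image
test_sq = np.sum(X_test ** 2, axis=1, keepdims=True)   # (num_test, 1)
train_sq = np.sum(X_train ** 2, axis=1)                # (num_train,)
cross = np.dot(X_test, X_train.T)                      # (num_test, num_train)
dists = np.sqrt(test_sq - 2 * cross + train_sq)        # broadcasting handles both squared terms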
Implementation of cross-validation:
for k in k_choices:
    # run cross-validation for this k
    acc = []
    for i in range(num_folds):
        x = X_train_folds[0:i] + X_train_folds[i+1:]
        x = np.concatenate(x, axis=0)  # concatenate the remaining folds into one training set
        y = y_train_folds[0:i] + y_train_folds[i+1:]
        y = np.concatenate(y)  # same operation for the labels
        test_x = X_train_folds[i]
        test_y = y_train_folds[i]

        classifier = KNearestNeighbor()  # build the model
        classifier.train(x, y)  # "training" just memorizes the data
        dist = classifier.compute_distances_no_loops(test_x)  # distance matrix
        y_pred = classifier.predict_labels(dist, k)  # predictions
        accuracy = np.mean(y_pred == test_y)  # accuracy on the held-out fold
        acc.append(accuracy)
    k_to_accuracies[k] = acc  # store the per-fold accuracies for this k
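After the loop, averaging over folds gives one score per k, from which the best k can be picked; a small sketch on top of k_to_accuracies:
# Average the per-fold accuracies for each k and keep the best one
mean_acc = {k: np.mean(v) for k, v in k_to_accuracies.items()}
best_k = max(mean_acc, key=mean_acc.get)
print('best k = %d with mean cross-validation accuracy %.4f' % (best_k, mean_acc[best_k]))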
SVM
np.hstack()  # stack arrays horizontally (column-wise)
np.vstack()  # stack arrays vertically (row-wise)
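These come up, for example, in the bias trick, where a column of ones is appended to the data so the bias can be folded into W; a sketch under that assumption:
# Append a column of ones so that scores = X.dot(W) already includes the bias term
X_train = np.hstack([X_train, np.ones((X_train.shape[0], 1))])  # (N, D+1)
X_test = np.hstack([X_test, np.ones((X_test.shape[0], 1))])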
About the SVM parameter update (I thought this would be simple, but while implementing it I found plenty of unclear points; implementing the algorithm really is the best way to learn it):
def svm_loss_vectorized(W, X, y, reg):
    """
    Structured SVM loss function, vectorized implementation.

    Inputs and outputs are the same as svm_loss_naive.
    """
    loss = 0.0
    dW = np.zeros(W.shape)  # initialize the gradient as zero

    #############################################################################
    # TODO:                                                                     #
    # Implement a vectorized version of the structured SVM loss, storing the    #
    # result in loss.                                                           #
    #############################################################################
    num_train = X.shape[0]  # number of training samples
    scores = np.dot(X, W)  # all class scores at once
    y_score = scores[np.arange(num_train), y].reshape((-1, 1))  # score of the correct class for each sample
    mask = (scores - y_score + 1) > 0  # positions whose margin contributes to the loss
    scores = (scores - y_score + 1) * mask  # keep only the contributing margins
    loss = (np.sum(scores) - num_train * 1) / num_train  # each correct class contributed an extra margin of 1, so subtract num_train before averaging (the correct-class score should not count toward the loss)
    loss += reg * np.sum(W * W)
    #############################################################################
    #                             END OF YOUR CODE                              #
    #############################################################################

    #############################################################################
    # TODO:                                                                     #
    # Implement a vectorized version of the gradient for the structured SVM     #
    # loss, storing the result in dW.                                           #
    #                                                                           #
    # Hint: Instead of computing the gradient from scratch, it may be easier    #
    # to reuse some of the intermediate values that you used to compute the     #
    # loss.                                                                     #
    #############################################################################
    # dW = X.T * dL/ds
    ds = np.ones_like(scores)  # initialize dL/ds
    ds *= mask  # gradient is 1 at the contributing (margin-violating) scores, 0 elsewhere
    ds[np.arange(num_train), y] = -1 * (np.sum(mask, axis=1) - 1)  # at the correct class the gradient is minus the number of contributing margins; subtract 1 because the correct class itself was counted in mask
    dW = np.dot(X.T, ds) / num_train  # average over the samples
    dW += 2 * reg * W  # gradient of the regularization term
    #############################################################################
    #                             END OF YOUR CODE                              #
    #############################################################################

    return loss, dW
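To make sure dW is right, the analytic gradient can be compared against a numerical one; a minimal sketch, assuming the grad_check_sparse helper from cs231n/gradient_check.py and the small development split X_dev/y_dev used in the notebook:
from cs231n.gradient_check import grad_check_sparse  # assignment helper (assumed import path)

W = np.random.randn(3073, 10) * 0.0001  # bias-trick shape: D+1 = 3073 features, 10 classes
loss, grad = svm_loss_vectorized(W, X_dev, y_dev, 0.0)

# Compare the analytic gradient with a numerical estimate at a few random entries
f = lambda w: svm_loss_vectorized(w, X_dev, y_dev, 0.0)[0]
grad_check_sparse(f, W, grad)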
Softmax
def softmax_loss_vectorized(W, X, y, reg):
    """
    Softmax loss function, vectorized version.

    Inputs and outputs are the same as softmax_loss_naive.
    """
    # Initialize the loss and gradient to zero.
    loss = 0.0
    dW = np.zeros_like(W)

    #############################################################################
    # TODO: Compute the softmax loss and its gradient using no explicit loops.  #
    # Store the loss in loss and the gradient in dW. If you are not careful     #
    # here, it is easy to run into numeric instability. Don't forget the        #
    # regularization!                                                           #
    #############################################################################
    scores = np.dot(X, W)  # class scores
    scores -= np.max(scores, axis=1, keepdims=True)  # for numerical stability; subtracting a per-row constant does not change the gradient
    scores = np.exp(scores)  # exponentiate
    scores /= np.sum(scores, axis=1, keepdims=True)  # softmax probabilities
    ds = np.copy(scores)  # gradient of the loss w.r.t. the scores
    ds[np.arange(X.shape[0]), y] -= 1  # differentiating softmax + cross-entropy gives p - 1 at the correct class (see https://www.jianshu.com/p/c02a1fbffad6)
    dW = np.dot(X.T, ds)  # gradient w.r.t. W
    loss = scores[np.arange(X.shape[0]), y]  # probability of the correct class
    loss = -np.log(loss).sum()  # cross-entropy
    loss /= X.shape[0]
    dW /= X.shape[0]
    loss += reg * np.sum(W * W)
    dW += 2 * reg * W
    #############################################################################
    #                              END OF YOUR CODE                             #
    #############################################################################

    return loss, dW
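A quick sanity check: with a small random W and no regularization, every class gets probability about 0.1, so the loss should be close to -log(0.1) ≈ 2.3 for 10 classes. A sketch, again assuming the small X_dev/y_dev split:
W = np.random.randn(3073, 10) * 0.0001
loss, _ = softmax_loss_vectorized(W, X_dev, y_dev, 0.0)
print('loss: %f, expected roughly: %f' % (loss, -np.log(0.1)))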
Two-layer network
The backward pass here essentially combines what was done for softmax and SVM: differentiating the softmax, plus differentiating the max (ReLU) function.
def loss(self, X, y=None, reg=0.0):
    """
    Compute the loss and gradients for a two layer fully connected neural
    network.

    Inputs:
    - X: Input data of shape (N, D). Each X[i] is a training sample.
    - y: Vector of training labels. y[i] is the label for X[i], and each y[i] is
      an integer in the range 0 <= y[i] < C. This parameter is optional; if it
      is not passed then we only return scores, and if it is passed then we
      instead return the loss and gradients.
    - reg: Regularization strength.

    Returns:
    If y is None, return a matrix scores of shape (N, C) where scores[i, c] is
    the score for class c on input X[i].

    If y is not None, instead return a tuple of:
    - loss: Loss (data loss and regularization loss) for this batch of training
      samples.
    - grads: Dictionary mapping parameter names to gradients of those parameters
      with respect to the loss function; has the same keys as self.params.
    """
    # Unpack variables from the params dictionary
    W1, b1 = self.params['W1'], self.params['b1']
    W2, b2 = self.params['W2'], self.params['b2']
    N, D = X.shape

    # Compute the forward pass
    scores = None
    #############################################################################
    # TODO: Perform the forward pass, computing the class scores for the input. #
    # Store the result in the scores variable, which should be an array of      #
    # shape (N, C).                                                             #
    #############################################################################
    s1 = np.dot(X, W1) + b1  # (N, H)
    s1_act = (s1 > 0) * s1   # ReLU
    scores = np.dot(s1_act, W2) + b2  # (N, C)
    #############################################################################
    #                              END OF YOUR CODE                             #
    #############################################################################

    # If the targets are not given then jump out, we're done
    if y is None:
        return scores

    # Compute the loss
    loss = None
    #############################################################################
    # TODO: Finish the forward pass, and compute the loss. This should include  #
    # both the data loss and L2 regularization for W1 and W2. Store the result  #
    # in the variable loss, which should be a scalar. Use the Softmax           #
    # classifier loss.                                                          #
    #############################################################################
    scores -= np.max(scores, axis=1, keepdims=True)  # numerical stability
    scores = np.exp(scores)
    scores /= np.sum(scores, axis=1, keepdims=True)  # softmax
    loss = -np.log(scores[np.arange(N), y]).sum()
    loss /= X.shape[0]
    loss += reg * np.sum(W1**2)
    loss += reg * np.sum(W2**2)
    #############################################################################
    #                              END OF YOUR CODE                             #
    #############################################################################

    # Backward pass: compute gradients
    grads = {}
    #############################################################################
    # TODO: Compute the backward pass, computing the derivatives of the weights #
    # and biases. Store the results in the grads dictionary. For example,       #
    # grads['W1'] should store the gradient on W1, and be a matrix of same size #
    #############################################################################
    ds2 = np.copy(scores)  # gradient of the loss w.r.t. the output scores
    ds2[np.arange(X.shape[0]), y] -= 1
    ds2 = ds2 / X.shape[0]
    grads['W2'] = np.dot(s1_act.T, ds2) + 2 * reg * W2
    grads['b2'] = np.sum(ds2, axis=0)

    ds1 = np.dot(ds2, W2.T)
    ds1 = (s1 > 0) * ds1  # backprop through the ReLU: pass the gradient only where s1 > 0
    grads['W1'] = np.dot(X.T, ds1) + 2 * reg * W1
    grads['b1'] = np.sum(ds1, axis=0)
    #############################################################################
    #                              END OF YOUR CODE                             #
    #############################################################################

    return loss, grads
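Once loss and grads are available, a vanilla SGD step just moves every parameter against its gradient; a minimal sketch (net, X_batch, y_batch and learning_rate are placeholder names, not part of the code above):
# One vanilla SGD step using the gradients returned by loss()
loss_val, grads = net.loss(X_batch, y=y_batch, reg=5e-6)
for param_name in grads:
    net.params[param_name] -= learning_rate * grads[param_name]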
Some image-feature implementations
HOG for a single image
This ignores color information and extracts texture information.
def hog_feature(im):
    """Compute Histogram of Gradient (HOG) feature for an image

    Modified from skimage.feature.hog
    http://pydoc.net/Python/scikits-image/0.4.2/skimage.feature.hog

    Reference:
    Histograms of Oriented Gradients for Human Detection
    Navneet Dalal and Bill Triggs, CVPR 2005

    Parameters:
      im : an input grayscale or rgb image

    Returns:
      feat: Histogram of Gradient (HOG) feature
    """
    # convert rgb to grayscale if needed
    if im.ndim == 3:
        image = rgb2gray(im)
    else:
        image = np.atleast_2d(im)

    sx, sy = image.shape  # image size
    orientations = 9  # number of gradient bins
    cx, cy = (8, 8)  # pixels per cell

    gx = np.zeros(image.shape)
    gy = np.zeros(image.shape)
    gx[:, :-1] = np.diff(image, n=1, axis=1)  # compute gradient on x-direction
    gy[:-1, :] = np.diff(image, n=1, axis=0)  # compute gradient on y-direction
    grad_mag = np.sqrt(gx ** 2 + gy ** 2)  # gradient magnitude
    grad_ori = np.arctan2(gy, (gx + 1e-15)) * (180 / np.pi) + 90  # gradient orientation

    n_cellsx = int(np.floor(sx / cx))  # number of cells in x
    n_cellsy = int(np.floor(sy / cy))  # number of cells in y
    # compute orientations integral images
    orientation_histogram = np.zeros((n_cellsx, n_cellsy, orientations))
    for i in range(orientations):
        # create new integral image for this orientation
        # isolate orientations in this range
        # np.where keeps grad_ori where the condition holds and uses 0 elsewhere
        temp_ori = np.where(grad_ori < 180 / orientations * (i + 1),
                            grad_ori, 0)
        temp_ori = np.where(grad_ori >= 180 / orientations * i,
                            temp_ori, 0)
        # select magnitudes for those orientations
        cond2 = temp_ori > 0
        temp_mag = np.where(cond2, grad_mag, 0)
        # uniform_filter is a mean filter: each output pixel is the average of a
        # (cx, cy) window around it, and the output has the same size as the input.
        # Sampling at int(cx/2)::cx, int(cy/2)::cy therefore picks one averaged value
        # per cell; the first cx/2 rows/columns are computed from padding, hence the
        # offset. A neat trick for computing per-cell histograms.
        orientation_histogram[:, :, i] = uniform_filter(temp_mag, size=(cx, cy))[int(cx/2)::cx, int(cy/2)::cy]

    return orientation_histogram.ravel()
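To see what the uniform_filter trick does, here is a tiny illustration of my own (not from the assignment): mean-filtering with a (2, 2) window and then sampling every other pixel starting at index 1 yields the mean of each non-overlapping 2x2 cell.
from scipy.ndimage import uniform_filter
import numpy as np

a = np.arange(16.0).reshape(4, 4)
# Mean filter with a 2x2 window, then pick one value per non-overlapping cell;
# each selected entry is the average of one 2x2 block of `a`
cell_means = uniform_filter(a, size=(2, 2))[1::2, 1::2]
print(cell_means)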
HSV
This ignores texture information and extracts color information.
def color_histogram_hsv(im, nbin=10, xmin=0, xmax=255, normalized=True):
    """
    Compute color histogram for an image using hue.

    Inputs:
    - im: H x W x C array of pixel data for an RGB image.
    - nbin: Number of histogram bins. (default: 10)
    - xmin: Minimum pixel value (default: 0)
    - xmax: Maximum pixel value (default: 255)
    - normalized: Whether to normalize the histogram (default: True)

    Returns:
      1D vector of length nbin giving the color histogram over the hue of the
      input image.
    """
    ndim = im.ndim
    bins = np.linspace(xmin, xmax, nbin + 1)
    hsv = matplotlib.colors.rgb_to_hsv(im / xmax) * xmax  # convert RGB to HSV
    imhist, bin_edges = np.histogram(hsv[:, :, 0], bins=bins, density=normalized)
    imhist = imhist * np.diff(bin_edges)

    # return histogram
    return imhist
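In the notebook these two feature functions are applied to every image in the dataset; a sketch, assuming the extract_features helper from cs231n/features.py and that X_train here is still the raw (N, 32, 32, 3) image array:
from cs231n.features import extract_features, hog_feature, color_histogram_hsv

num_color_bins = 10  # number of bins for the hue histogram
feature_fns = [hog_feature, lambda img: color_histogram_hsv(img, nbin=num_color_bins)]
X_train_feats = extract_features(X_train, feature_fns, verbose=True)  # one row of features per image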