
CS231n Assignment1 Summary

Lecture 3 covers some basics of the chain rule.
Below is a summary of key points and some interesting implementation details in the assignment1 code.
Reference solutions: https://github.com/sharedeeply/cs231n-assignment-solution/blob/master/assignment1/

Reading the dataset

For the dataset used in assignment1, each image is read out already flattened into a 1-D vector, so X_train and X_test have one row per input sample and one column per pixel. y is an array of shape (n,), with one row per sample, each entry holding that sample's class label.
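A minimal sketch of this loading and flattening step, assuming the load_CIFAR10 helper from cs231n.data_utils (the dataset path is illustrative; in the notebook the flattening happens after the visualization shown below):

from cs231n.data_utils import load_CIFAR10
import numpy as np

cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'  # illustrative path
X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

# Flatten each 32x32x3 image into a single row vector
X_train = np.reshape(X_train, (X_train.shape[0], -1))  # (num_train, 3072)
X_test = np.reshape(X_test, (X_test.shape[0], -1))     # (num_test, 3072)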

KNN

Displaying a few training examples

# Visualize some examples from the dataset.
# We show a few examples of training images from each class.
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
num_classes = len(classes)
samples_per_class = 7
for y, cls in enumerate(classes):
    idxs = np.flatnonzero(y_train == y)  # indices of the non-zero entries, i.e. the samples of class y
    idxs = np.random.choice(idxs, samples_per_class, replace=False)  # draw samples_per_class samples without replacement (the same function also supports weighted sampling)
    for i, idx in enumerate(idxs):
        plt_idx = i * num_classes + y + 1  # subplot numbering starts at 1
        plt.subplot(samples_per_class, num_classes, plt_idx)
        plt.imshow(X_train[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls)
plt.show()

The kNN classifier object is created and used in the notebook through the following statements:

from cs231n.classifiers import KNearestNeighbor

# Create a kNN classifier instance. 
# Remember that training a kNN classifier is a noop: 
# the Classifier simply remembers the data and does no further processing 
classifier = KNearestNeighbor()
classifier.train(X_train, y_train)

This import works because of the __init__.py file, whose purposes are worth explaining:

  1. Marks a directory as a Python package; it must not be deleted.

  2. Can define __all__ to control wildcard ("from package import *") imports.

  3. Can contain Python code (writing modules inside __init__.py is discouraged; create separate modules within the package and keep __init__.py simple).

Reference: https://www.cnblogs.com/AlwinXu/p/5598543.html
The __init__.py file inside the classifiers package is what makes the import above work (a sketch of what it might contain follows below).
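A hedged sketch of what that __init__.py might look like; the module and class names here are my assumption about the repo layout, not copied from it:

# Hypothetical cs231n/classifiers/__init__.py: re-export the classifier classes
from cs231n.classifiers.k_nearest_neighbor import KNearestNeighbor
from cs231n.classifiers.linear_classifier import LinearSVM, Softmax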
The kNN classifier itself is fairly simple, so it is not covered in detail here; instead, a few statements I had not seen before are highlighted.

# Broadcast train_sq to a matrix of shape (num_train, num_test)
train_sq = np.broadcast_to(train_sq, shape=(num_train, num_test))  # train_sq must be a 2-D array, not of shape (n,)
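For context, here is a sketch (not the referenced solution's exact code) of the standard fully vectorized L2 distance computation this broadcasting serves, using (a - b)^2 = a^2 - 2ab + b^2:

# X_test: (num_test, D), X_train: (num_train, D)
test_sq = np.sum(X_test ** 2, axis=1, keepdims=True)      # (num_test, 1)
train_sq = np.sum(X_train ** 2, axis=1, keepdims=True).T  # (1, num_train)
dists = np.sqrt(test_sq - 2 * np.dot(X_test, X_train.T) + train_sq)  # broadcasts to (num_test, num_train)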

Implementation of cross-validation:

for k in k_choices:
    # cross-validation for this value of k
    acc = []
    for i in range(num_folds):
        x = X_train_folds[0:i] + X_train_folds[i+1:]
        x = np.concatenate(x, axis=0)  # concatenate the remaining folds into one training set
        y = y_train_folds[0:i] + y_train_folds[i+1:]
        y = np.concatenate(y)  # same operation for the labels
        test_x = X_train_folds[i]
        test_y = y_train_folds[i]

        classifier = KNearestNeighbor()  # build the model
        classifier.train(x, y)  # read in the training data
        dist = classifier.compute_distances_no_loops(test_x)  # compute the distance matrix
        y_pred = classifier.predict_labels(dist, k)  # predicted labels
        accuracy = np.mean(y_pred == test_y)  # accuracy on the held-out fold
        acc.append(accuracy)
    k_to_accuracies[k] = acc  # store the per-fold accuracies for this k
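The folds themselves are typically built with np.array_split; a minimal sketch, assuming 5 folds:

num_folds = 5
X_train_folds = np.array_split(X_train, num_folds)  # list of num_folds sub-arrays
y_train_folds = np.array_split(y_train, num_folds)
k_to_accuracies = {}  # maps each k to its list of per-fold accuracies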

SVM

np.hstack()  # stack arrays horizontally (along columns)
np.vstack()  # stack arrays vertically (along rows)
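In the SVM notebook these are used, for instance, for the bias trick of appending a column of ones to the data so the bias can be folded into W (a sketch of that preprocessing, under my reading of the notebook):

X_train = np.hstack([X_train, np.ones((X_train.shape[0], 1))])  # append a bias column of ones
X_test = np.hstack([X_test, np.ones((X_test.shape[0], 1))])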

On the SVM parameter update (I had assumed it would be simple, but implementing it revealed many unclear spots; implementing an algorithm really is the best way to learn it):
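For reference, the per-sample hinge loss that the code below vectorizes, and its gradient with respect to the scores, are

$$L_i = \sum_{j \neq y_i} \max(0,\, s_j - s_{y_i} + 1), \qquad
\frac{\partial L_i}{\partial s_j} = \mathbb{1}[\,s_j - s_{y_i} + 1 > 0\,] \ (j \neq y_i), \qquad
\frac{\partial L_i}{\partial s_{y_i}} = -\sum_{j \neq y_i} \mathbb{1}[\,s_j - s_{y_i} + 1 > 0\,]$$

which is exactly what mask and ds implement below.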

def svm_loss_vectorized(W, X, y, reg):
   """
   Structured SVM loss function, vectorized implementation.
   Inputs and outputs are the same as svm_loss_naive.
   """
   loss = 0.0
   dW = np.zeros(W.shape)  # initialize the gradient as zero

   #############################################################################
   # TODO:
   # Implement a vectorized version of the structured SVM loss, storing the    #
   # result in loss.                                                           #
   #############################################################################
   num_train = X.shape[0]  # number of training samples
   scores = np.dot(X, W)  # all class scores, shape (num_train, num_classes)
   y_score = scores[np.arange(num_train), y].reshape((-1, 1))  # for each row, pick out the score at that sample's own label y
   mask = (scores - y_score + 1) > 0  # positions with a positive margin
   scores = (scores - y_score + 1) * mask  # keep only the positive margins
   loss = (np.sum(scores) - num_train * 1) / num_train  # subtract the margin of 1 added at each correct class, then average (the correct class itself should not contribute to the loss)
   loss += reg * np.sum(W * W)
   #############################################################################
   #                             END OF YOUR CODE                              #
   #############################################################################

   #############################################################################
   # TODO:
   # Implement a vectorized version of the gradient for the structured SVM     #
   # loss, storing the result in dW.                                           #
   #                                                                           #
   # Hint: Instead of computing the gradient from scratch, it may be easier    #
   # to reuse some of the intermediate values that you used to compute the     #
   # loss.                                                                     #
   #############################################################################
   # dW = X.T * dL/ds
   ds = np.ones_like(scores)  # initialize dL/ds
   ds *= mask  # gradient is 1 where the margin is positive, 0 elsewhere (only positions where the max is non-zero contribute)
   ds[np.arange(num_train), y] = -1 * (np.sum(mask, axis=1) - 1)  # each correct-class entry gets minus the number of positive margins; subtract 1 because the correct class itself was counted in mask
   dW = np.dot(X.T, ds) / num_train   # average over the batch
   dW += 2 * reg * W  # gradient of the regularization term
   #############################################################################
   #                             END OF YOUR CODE                              #
   #############################################################################

   return loss, dW

Softmax
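The per-sample softmax cross-entropy loss and its gradient with respect to the scores are

$$L_i = -\log\frac{e^{s_{y_i}}}{\sum_j e^{s_j}}, \qquad
\frac{\partial L_i}{\partial s_j} = p_j - \mathbb{1}[j = y_i], \quad p_j = \frac{e^{s_j}}{\sum_k e^{s_k}}$$

which is why the code below subtracts 1 at each sample's correct-class position of ds.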

def softmax_loss_vectorized(W, X, y, reg):
   """
   Softmax loss function, vectorized version.
   Inputs and outputs are the same as softmax_loss_naive.
   """
   # Initialize the loss and gradient to zero.
   loss = 0.0
   dW = np.zeros_like(W)

   #############################################################################
   # TODO: Compute the softmax loss and its gradient using no explicit loops.  #
   # Store the loss in loss and the gradient in dW. If you are not careful     #
   # here, it is easy to run into numeric instability. Don't forget the        #
   # regularization!                                                           #
   #############################################################################
   scores = np.dot(X, W)  # class scores
   scores -= np.max(scores, axis=1, keepdims=True)  # numerical stability; subtracting a per-row constant changes neither the result nor its gradient
   scores = np.exp(scores)  # exponentiate
   scores /= np.sum(scores, axis=1, keepdims=True)  # softmax probabilities
   ds = np.copy(scores)  # initialize the gradient of the loss w.r.t. scores
   ds[np.arange(X.shape[0]), y] -= 1  # softmax derivative: subtract 1 at each sample's correct class (see https://www.jianshu.com/p/c02a1fbffad6)
   dW = np.dot(X.T, ds)  # gradient w.r.t. W
   loss = scores[np.arange(X.shape[0]), y]  # probability assigned to the correct class
   loss = -np.log(loss).sum()  # cross-entropy
   loss /= X.shape[0]
   dW /= X.shape[0]
   loss += reg * np.sum(W * W)
   dW += 2 * reg * W
   #############################################################################
   #                          END OF YOUR CODE                                 #
   #############################################################################

   return loss, dW

Two-layer network

Backpropagation here essentially combines the softmax case with the max function from the SVM part: the softmax derivative at the output layer plus the derivative of the max (ReLU) at the hidden layer.

def loss(self, X, y=None, reg=0.0):
       """
       Compute the loss and gradients for a two layer fully connected neural
       network.
       Inputs:
       - X: Input data of shape (N, D). Each X[i] is a training sample.
       - y: Vector of training labels. y[i] is the label for X[i], and each y[i] is
         an integer in the range 0 <= y[i] < C. This parameter is optional; if it
         is not passed then we only return scores, and if it is passed then we
         instead return the loss and gradients.
       - reg: Regularization strength.
       Returns:
       If y is None, return a matrix scores of shape (N, C) where scores[i, c] is
       the score for class c on input X[i].
       If y is not None, instead return a tuple of:
       - loss: Loss (data loss and regularization loss) for this batch of training
         samples.
       - grads: Dictionary mapping parameter names to gradients of those parameters
         with respect to the loss function; has the same keys as self.params.
       """
       # Unpack variables from the params dictionary
       W1, b1 = self.params['W1'], self.params['b1']
       W2, b2 = self.params['W2'], self.params['b2']
       N, D = X.shape

       # Compute the forward pass
       scores = None
       #############################################################################
       # TODO: Perform the forward pass, computing the class scores for the input. #
       # Store the result in the scores variable, which should be an array of      #
       # shape (N, C).                                                             #
       #############################################################################
       s1 = np.dot(X, W1) + b1  # (N, H)
       s1_act = (s1 > 0) * s1
       scores = np.dot(s1_act, W2) + b2  # (N, C)
       #############################################################################
       #                              END OF YOUR CODE                             #
       #############################################################################

       # If the targets are not given then jump out, we're done
       if y is None:
           return scores

       # Compute the loss
       loss = None
       #############################################################################
       # TODO: Finish the forward pass, and compute the loss. This should include  #
       # both the data loss and L2 regularization for W1 and W2. Store the result  #
       # in the variable loss, which should be a scalar. Use the Softmax           #
       # classifier loss.                                                          #
       #############################################################################
        scores -= np.max(scores, axis=1, keepdims=True)  # numerical stability
       scores = np.exp(scores)
       scores /= np.sum(scores, axis=1, keepdims=True)  # softmax
       loss = -np.log(scores[np.arange(N), y]).sum()
       loss /= X.shape[0]
       loss += reg * np.sum(W1**2)
       loss += reg * np.sum(W2**2)
       #############################################################################
       #                              END OF YOUR CODE                             #
       #############################################################################
       # Backward pass: compute gradients
       grads = {}
       #############################################################################
       # TODO: Compute the backward pass, computing the derivatives of the weights #
       # and biases. Store the results in the grads dictionary. For example,       #
       # grads['W1'] should store the gradient on W1, and be a matrix of same size #
       #############################################################################
        ds2 = np.copy(scores)  # gradient of the loss w.r.t. the output scores
       ds2[np.arange(X.shape[0]), y] -= 1
       ds2 = ds2 / X.shape[0]
       grads['W2'] = np.dot(s1_act.T, ds2) + 2 * reg * W2
       grads['b2'] = np.sum(ds2, axis=0)

       ds1 = np.dot(ds2, W2.T)
       ds1 = (s1 > 0) * ds1
       grads['W1'] = np.dot(X.T, ds1) + 2 * reg * W1
       grads['b1'] = np.sum(ds1, axis=0)
       #############################################################################
       #                              END OF YOUR CODE                             #
       #############################################################################

       return loss, grads
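A minimal sketch of how the returned gradients might be consumed in a plain SGD step (hypothetical usage: net, X_batch, y_batch, reg and learning_rate are assumed to be defined, and the assignment's actual train method additionally handles minibatch sampling and learning-rate decay):

loss, grads = net.loss(X_batch, y=y_batch, reg=reg)
for param_name in ('W1', 'b1', 'W2', 'b2'):
    net.params[param_name] -= learning_rate * grads[param_name]  # vanilla SGD update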

Some image-feature implementations

HOG features for a single image

Ignores color information and extracts texture information.

def hog_feature(im):
 """Compute Histogram of Gradient (HOG) feature for an image
 
      Modified from skimage.feature.hog
      http://pydoc.net/Python/scikits-image/0.4.2/skimage.feature.hog
    
    Reference:
      Histograms of Oriented Gradients for Human Detection
      Navneet Dalal and Bill Triggs, CVPR 2005
    
   Parameters:
     im : an input grayscale or rgb image
     
   Returns:
     feat: Histogram of Gradient (HOG) feature
   
 """
 
 # convert rgb to grayscale if needed
 if im.ndim == 3:
   image = rgb2gray(im)
 else:
    image = np.atleast_2d(im)

 sx, sy = image.shape # image size
 orientations = 9 # number of gradient bins
 cx, cy = (8, 8) # pixels per cell

 gx = np.zeros(image.shape)
 gy = np.zeros(image.shape)
  gx[:, :-1] = np.diff(image, n=1, axis=1) # gradient in the x direction (along columns)
  gy[:-1, :] = np.diff(image, n=1, axis=0) # gradient in the y direction (along rows)
 grad_mag = np.sqrt(gx ** 2 + gy ** 2) # gradient magnitude
 grad_ori = np.arctan2(gy, (gx + 1e-15)) * (180 / np.pi) + 90 # gradient orientation

 n_cellsx = int(np.floor(sx / cx))  # number of cells in x
 n_cellsy = int(np.floor(sy / cy))  # number of cells in y
 # compute orientations integral images
 orientation_histogram = np.zeros((n_cellsx, n_cellsy, orientations))
 for i in range(orientations):
   # create new integral image for this orientation
   # isolate orientations in this range
    temp_ori = np.where(grad_ori < 180 / orientations * (i + 1),
                        grad_ori, 0)  # np.where keeps grad_ori where the condition holds and puts 0 elsewhere
    temp_ori = np.where(grad_ori >= 180 / orientations * i,
                        temp_ori, 0)
   # select magnitudes for those orientations
   cond2 = temp_ori > 0
   temp_mag = np.where(cond2, grad_mag, 0)
    # uniform_filter is a mean filter: each output value is the average of a (cx, cy) window
    # centred on that pixel (output has the same size as the input; border values are computed
    # from padding, which is why the values before cx/2 are padding-influenced). Sampling at
    # [int(cx/2)::cx, int(cy/2)::cy] then picks one value per cell, starting at the cell centre
    # and stepping by the cell size - a neat way to build the histogram.
    orientation_histogram[:,:,i] = uniform_filter(temp_mag, size=(cx, cy))[int(cx/2)::cx, int(cy/2)::cy]
 
 return orientation_histogram.ravel()
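A toy illustration (not part of the assignment code) of the mean-filter plus strided-sampling trick used above, assuming scipy.ndimage.uniform_filter as in the assignment's features module:

import numpy as np
from scipy.ndimage import uniform_filter

mag = np.arange(64, dtype=float).reshape(8, 8)  # pretend gradient magnitudes: one 8x8 cell
cell_mean = uniform_filter(mag, size=(8, 8))[4::8, 4::8]  # mean filter, then sample once per cell at its centre
print(cell_mean.item(), mag.mean())  # both give the mean of the 8x8 block (31.5)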

HSV

Ignores texture information and extracts color information.

def color_histogram_hsv(im, nbin=10, xmin=0, xmax=255, normalized=True):
 """
 Compute color histogram for an image using hue.
 Inputs:
 - im: H x W x C array of pixel data for an RGB image.
 - nbin: Number of histogram bins. (default: 10)
 - xmin: Minimum pixel value (default: 0)
 - xmax: Maximum pixel value (default: 255)
 - normalized: Whether to normalize the histogram (default: True)
 Returns:
   1D vector of length nbin giving the color histogram over the hue of the
   input image.
 """
 ndim = im.ndim
 bins = np.linspace(xmin, xmax, nbin+1)
  hsv = matplotlib.colors.rgb_to_hsv(im/xmax) * xmax  # convert RGB to HSV (hue scaled back to [0, xmax])
  imhist, bin_edges = np.histogram(hsv[:,:,0], bins=bins, density=normalized)  # histogram over the hue channel
  imhist = imhist * np.diff(bin_edges)  # with density=True, multiplying by the bin widths gives the fraction of pixels per bin

 # return histogram
 return imhist
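In the assignment notebook these two functions are then passed as feature extractors to a helper that stacks each image's features; a hedged sketch of that usage (the helper name extract_features and the exact parameters reflect my recollection of the notebook and are not verified here):

from cs231n.features import hog_feature, color_histogram_hsv, extract_features

num_color_bins = 10  # number of hue-histogram bins
feature_fns = [hog_feature, lambda img: color_histogram_hsv(img, nbin=num_color_bins)]
X_train_feats = extract_features(X_train, feature_fns, verbose=True)
X_test_feats = extract_features(X_test, feature_fns)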