cs231n assignment: assignment1 - features
GitHub repo: https://github.com/ZJUFangzh/cs231n
Personal blog: fangzh.top
Extract HOG and HSV features from the images.
For each image we compute a Histogram of Oriented Gradients (HOG) feature and a color histogram over the hue channel of the HSV (Hue, Saturation, Value) color space. The HOG and color histograms of each image are concatenated to form its final feature vector.
Roughly speaking, HOG captures an image's texture while ignoring color information, whereas the color histogram represents the image's color while ignoring texture (details in the linked article). So we expect the combination of the two to work better than either feature alone. Verifying this hypothesis is a good candidate for the bonus section later on.
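As a quick sanity check of that intuition, here is a minimal numpy sketch (toy solid-color images of my own making, not the assignment's actual features): two images with different colors have identical gradient statistics, so a texture feature cannot separate them, while color trivially does.

```python
import numpy as np

# Two solid-color "images": different colors, identical (zero) gradients.
red = np.zeros((4, 4, 3)); red[..., 0] = 1.0
blue = np.zeros((4, 4, 3)); blue[..., 2] = 1.0

gray = lambda im: im.mean(axis=2)
# Crude stand-in for HOG: total gradient energy of the grayscale image.
grad_energy = lambda g: np.abs(np.diff(g, axis=0)).sum() + np.abs(np.diff(g, axis=1)).sum()

print(grad_energy(gray(red)) == grad_energy(gray(blue)))  # True: texture can't tell them apart
print((red == blue).all())                                # False: color can
```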
The hog_feature and color_histogram_hsv functions each operate on a single image and return its feature vector. The extract_features function takes a set of images and a list of feature functions, applies every feature function to every image, and stores the results in a matrix in which each row is the concatenation of all feature vectors for one image.
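A minimal sketch of what extract_features does, using two made-up feature functions (mean intensity and a 4-bin histogram) in place of hog_feature and color_histogram_hsv; the function name here is my own:

```python
import numpy as np

def extract_features_sketch(imgs, feature_fns):
    """Toy version: run each feature fn on each image, concatenate per row."""
    rows = []
    for img in imgs:
        rows.append(np.concatenate([fn(img) for fn in feature_fns]))
    return np.vstack(rows)

# Two hypothetical feature functions standing in for hog_feature /
# color_histogram_hsv: mean intensity, and a 4-bin intensity histogram.
mean_fn = lambda img: np.array([img.mean()])
hist_fn = lambda img: np.histogram(img, bins=4, range=(0, 1))[0].astype(float)

imgs = [np.random.rand(8, 8) for _ in range(3)]
feats = extract_features_sketch(imgs, [mean_fn, hist_fn])
print(feats.shape)  # (3, 5): 1 mean feature + 4 histogram bins per image
```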
Both feature extractors are implemented in features.py. The HOG code is adapted from scikit-image's hog interface and first converts the image to grayscale. The color histogram uses the matplotlib.colors.rgb_to_hsv interface to convert the image from RGB to HSV, extracts the hue channel, and bins the hue values into a histogram. For the theory behind HOG, a web search will turn up plenty of references.
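For reference, a toy version of the hue-histogram idea (the function name and exact normalization here are my own and may not match features.py line for line):

```python
import numpy as np
import matplotlib.colors

def hue_histogram(im, nbin=10, xmin=0.0, xmax=255.0):
    """Toy hue histogram: RGB -> HSV, then bin the hue channel."""
    bins = np.linspace(xmin, xmax, nbin + 1)
    # rgb_to_hsv expects values in [0, 1]; rescale back to [0, xmax] after.
    hsv = matplotlib.colors.rgb_to_hsv(im / xmax) * xmax
    imhist, _ = np.histogram(hsv[:, :, 0], bins=bins, density=True)
    return imhist * np.diff(bins)  # normalized so the bins sum to 1

np.random.seed(0)
img = np.random.randint(0, 256, size=(32, 32, 3)).astype(float)
h = hue_histogram(img)
print(h.shape)  # (10,)
```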
If this line raises an error:
orientation_histogram[:,:,i] = uniform_filter(temp_mag, size=(cx, cy))[cx/2::cx, cy/2::cy].T
with "TypeError: slice indices must be integers or None or have an __index__ method", change it to: orientation_histogram[:,:,i] = uniform_filter(temp_mag, size=(cx, cy))[int(cx/2)::cx, int(cy/2)::cy].T
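The root cause is that in Python 3 the / operator always returns a float, and floats are not valid slice indices. A small standalone demonstration with a plain list in place of the HOG arrays:

```python
lst = list(range(16))
cx = 2

# Python 3: cx / 2 == 1.0 (a float), which is rejected as a slice index.
try:
    lst[cx / 2 :: cx]
    msg = ""
except TypeError as e:
    msg = str(e)
print(msg)  # slice indices must be integers or None or have an __index__ method

# int(cx / 2) (or floor division cx // 2) restores integer indices:
print(lst[int(cx / 2) :: cx])  # [1, 3, 5, 7, 9, 11, 13, 15]
```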
After this step, the original datasets are replaced by their extracted features: X_train_feats, X_val_feats, and X_test_feats.
from cs231n.features import *
num_color_bins = 10 # Number of bins in the color histogram
feature_fns = [hog_feature, lambda img: color_histogram_hsv(img, nbin=num_color_bins)]
X_train_feats = extract_features(X_train, feature_fns, verbose=True)
X_val_feats = extract_features(X_val, feature_fns)
X_test_feats = extract_features(X_test, feature_fns)
# Preprocessing: Subtract the mean feature
mean_feat = np.mean(X_train_feats, axis=0, keepdims=True)
X_train_feats -= mean_feat
X_val_feats -= mean_feat
X_test_feats -= mean_feat
# Preprocessing: Divide by standard deviation. This ensures that each feature
# has roughly the same scale.
std_feat = np.std(X_train_feats, axis=0, keepdims=True)
X_train_feats /= std_feat
X_val_feats /= std_feat
X_test_feats /= std_feat
# Preprocessing: Add a bias dimension
X_train_feats = np.hstack([X_train_feats, np.ones((X_train_feats.shape[0], 1))])
X_val_feats = np.hstack([X_val_feats, np.ones((X_val_feats.shape[0], 1))])
X_test_feats = np.hstack([X_test_feats, np.ones((X_test_feats.shape[0], 1))])
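A quick sketch on synthetic data (not the real feature matrices) showing what this preprocessing guarantees: each feature column ends up with mean ~0 and standard deviation ~1, plus a constant bias column appended at the end.

```python
import numpy as np

rng = np.random.default_rng(0)
feats = rng.normal(loc=3.0, scale=2.0, size=(100, 5))  # fake feature matrix

# Subtract the mean feature, divide by the standard deviation, add bias.
mean_feat = np.mean(feats, axis=0, keepdims=True)
feats = feats - mean_feat
std_feat = np.std(feats, axis=0, keepdims=True)
feats = feats / std_feat
feats = np.hstack([feats, np.ones((feats.shape[0], 1))])

print(feats.shape)  # (100, 6): five normalized features plus one bias column
print(np.allclose(feats[:, :5].mean(axis=0), 0))  # columns are centered
print(np.allclose(feats[:, :5].std(axis=0), 1))   # and unit-scaled
```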
SVM
Same as before; just swap the training data for the *_feats versions.
# Use the validation set to tune the learning rate and regularization strength
from cs231n.classifiers.linear_classifier import LinearSVM
learning_rates = [1e-9, 1e-8, 1e-7]
regularization_strengths = [5e4, 5e5, 5e6]
results = {}
best_val = -1
best_svm = None
# Override the template's coarse grid with a finer search:
learning_rates = [5e-9, 7.5e-9, 1e-8]
regularization_strengths = [(5+i)*1e6 for i in range(-3,4)]
################################################################################
# TODO: #
# Use the validation set to set the learning rate and regularization strength. #
# This should be identical to the validation that you did for the SVM; save #
# the best trained classifer in best_svm. You might also want to play #
# with different numbers of bins in the color histogram. If you are careful #
# you should be able to get accuracy of near 0.44 on the validation set. #
################################################################################
for learning_rate in learning_rates:
    for regularization_strength in regularization_strengths:
        svm = LinearSVM()
        loss_hist = svm.train(X_train_feats, y_train, learning_rate=learning_rate,
                              reg=regularization_strength, num_iters=1500, verbose=False)
        y_train_pred = svm.predict(X_train_feats)
        y_val_pred = svm.predict(X_val_feats)
        y_train_acc = np.mean(y_train_pred == y_train)
        y_val_acc = np.mean(y_val_pred == y_val)
        results[(learning_rate, regularization_strength)] = [y_train_acc, y_val_acc]
        if y_val_acc > best_val:
            best_val = y_val_acc
            best_svm = svm
################################################################################
# END OF YOUR CODE #
################################################################################
# Print out results.
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print('lr %e reg %e train accuracy: %f val accuracy: %f' % (
        lr, reg, train_accuracy, val_accuracy))
print('best validation accuracy achieved during cross-validation: %f' % best_val)
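The best (lr, reg) pair can also be read straight off the results dict; a toy example with made-up accuracies in the notebook's format of (lr, reg) -> [train_acc, val_acc]:

```python
# Hypothetical results dict; the accuracy values here are invented.
results = {
    (1e-8, 5e6): [0.41, 0.40],
    (5e-9, 2e6): [0.44, 0.42],
    (7.5e-9, 8e6): [0.43, 0.41],
}
# Pick the key whose validation accuracy (index 1) is highest.
best_lr, best_reg = max(results, key=lambda k: results[k][1])
print(best_lr, best_reg)  # the pair with the highest validation accuracy
```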
Neural Network on image features
from cs231n.classifiers.neural_net import TwoLayerNet
input_dim = X_train_feats.shape[1]
hidden_dim = 500
num_classes = 10
net = TwoLayerNet(input_dim, hidden_dim, num_classes)
best_net = None
################################################################################
# TODO: Train a two-layer neural network on image features. You may want to #
# cross-validate various parameters as in previous sections. Store your best #
# model in the best_net variable. #
################################################################################
best_val = -1
results = {}
learning_rates = [1.2e-3, 1.5e-3, 1.75e-3]
regularization_strengths = [1, 1.25, 1.5, 2]
for lr in learning_rates:
    for reg in regularization_strengths:
        # Re-create the network each iteration; otherwise every run keeps
        # training the same model and best_net always points to it.
        net = TwoLayerNet(input_dim, hidden_dim, num_classes)
        loss_hist = net.train(X_train_feats, y_train, X_val_feats, y_val,
                              num_iters=1000, batch_size=200,
                              learning_rate=lr, learning_rate_decay=0.95,
                              reg=reg, verbose=False)
        y_train_pred = net.predict(X_train_feats)
        y_val_pred = net.predict(X_val_feats)
        y_train_acc = np.mean(y_train_pred == y_train)
        y_val_acc = np.mean(y_val_pred == y_val)
        results[(lr, reg)] = [y_train_acc, y_val_acc]
        if y_val_acc > best_val:
            best_val = y_val_acc
            best_net = net
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print('lr %e reg %e train accuracy: %f val accuracy: %f' % (
        lr, reg, train_accuracy, val_accuracy))
print('best validation accuracy achieved during cross-validation: %f' % best_val)
################################################################################
# END OF YOUR CODE #
################################################################################