Image Classification with an RNN

阿新 • Published: 2019-01-13

Background

How do we classify MNIST with an RNN? In RNN terms, this is simply a sequence classification problem.
First, look at this figure from the RNN part of CS231n:

[Figure: rnn_cell]

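Mapped onto the figure above, image classification is a many-to-one problem. An MNIST image is 28*28; if we treat it as 28 steps with 28 values per step, it fits the figure exactly. Once we have the final output, we apply one linear transform and then softmax to classify, which is quite simple.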
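To make the many-to-one idea concrete, here is a minimal, dependency-free sketch. It is not the TensorFlow model used in this post: it assumes a vanilla tanh RNN with random weights, and the function names, the hidden size of 16, and the weight scale are illustrative assumptions. Each of the 28 image rows is fed in as one step, and only the final hidden state goes through the linear transform and softmax.

```python
import math
import random


def rnn_many_to_one(image, w_xh, w_hh, b_h, w_hy, b_y):
    """Run a vanilla RNN over the image rows and classify from the last state."""
    hidden = [0.0] * len(b_h)
    for row in image:  # one time step per image row
        hidden = [
            math.tanh(
                sum(w_xh[j][i] * row[i] for i in range(len(row)))
                + sum(w_hh[j][k] * hidden[k] for k in range(len(hidden)))
                + b_h[j]
            )
            for j in range(len(b_h))
        ]
    # linear transform of the final hidden state only (many-to-one)
    logits = [
        sum(w_hy[c][j] * hidden[j] for j in range(len(hidden))) + b_y[c]
        for c in range(len(b_y))
    ]
    # softmax, shifted by the max logit for numerical stability
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    """Hypothetical helper: small uniform random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
steps, step_size, hidden_size, num_classes = 28, 28, 16, 10
# random stand-in for one 28*28 MNIST image
image = [[rng.random() for _ in range(step_size)] for _ in range(steps)]
probs = rnn_many_to_one(
    image,
    w_xh=random_matrix(hidden_size, step_size, rng),
    w_hh=random_matrix(hidden_size, hidden_size, rng),
    b_h=[0.0] * hidden_size,
    w_hy=random_matrix(num_classes, hidden_size, rng),
    b_y=[0.0] * num_classes,
)
print(len(probs), round(sum(probs), 6))
```

The result is one probability per digit class for the whole image, which is what "many to one" means here: 28 inputs, a single output.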

Implementation

A summary of how the common RNN cells are used:

[Figure: rnn_cell_in_tf]

Getting the data

This is easy: TensorFlow ships a helper for it, so we just call it.

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# load MNIST with one-hot labels (downloads into data/mnist if it is missing)
mnist_data = input_data.read_data_sets('data/mnist', one_hot=True)

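If the data/mnist directory does not exist, the MNIST data is downloaded automatically. If your network connection is poor, you can also download the files from the MNIST website yourself and put them in that directory.

How thoughtful is TensorFlow? It even provides the batch generator: call next_batch to get the next batch of data.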

train_x, train_y = mnist_data.train.images, mnist_data.train.labels
test_x, test_y = mnist_data.test.images, mnist_data.test.labels
# next_batch returns the next (images, labels) mini-batch
batch_x, batch_y = mnist_data.train.next_batch(batch_size)

There are 55000 training examples, 10000 test examples, and 5000 validation examples.
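For readers who want to see roughly what a batch generator does, the following is a small sketch in plain Python. It is not TensorFlow's next_batch implementation; iterate_batches and to_rows are hypothetical names, and the toy data stands in for the real images, which are stored as flat vectors of 784 values.

```python
import random


def iterate_batches(images, labels, batch_size, rng):
    """Yield shuffled (images, labels) mini-batches covering one epoch."""
    order = list(range(len(images)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        chosen = order[start:start + batch_size]
        yield [images[i] for i in chosen], [labels[i] for i in chosen]


def to_rows(flat, width=28):
    """Split a flat 784-value image into 28 rows of 28 values (one row per RNN step)."""
    return [flat[r * width:(r + 1) * width] for r in range(len(flat) // width)]


# toy stand-in data: 10 flat "images" of 784 values with integer labels
rng = random.Random(0)
images = [[float(i)] * 784 for i in range(10)]
labels = list(range(10))

batches = list(iterate_batches(images, labels, batch_size=4, rng=rng))
batch_sizes = [len(batch_x) for batch_x, _ in batches]
print(batch_sizes)  # [4, 4, 2]

rows = to_rows(images[0])
print(len(rows), len(rows[0]))  # 28 28
```

The last batch is smaller because 10 is not a multiple of 4; a real generator has to decide whether to keep, drop, or wrap such a remainder.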

Defining the network

We use a 3-layer GRU with 200 hidden units and dropout as the network for MNIST classification. The code is as follows:

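def model(data, target, dropout, num_layers=3, num_hidden=200):
    # signature inferred from the call model(input_x, input_y, dropout);
    # the defaults follow the 3-layer, 200-unit description
    # stack num_layers GRU cells, each wrapped with output dropout
    cells = list()
    for _ in range(num_layers):
        cell = tf.nn.rnn_cell.GRUCell(num_units=num_hidden)
        cell = tf.nn.rnn_cell.DropoutWrapper(cell=cell, output_keep_prob=1.0 - dropout)
        cells.append(cell)
    network = tf.nn.rnn_cell.MultiRNNCell(cells=cells)
    outputs, last_state = tf.nn.dynamic_rnn(cell=network, inputs=data, dtype=tf.float32)

    # get last output: (batch, time, hidden) -> (time, batch, hidden), then take the final step
    outputs = tf.transpose(outputs, (1, 0, 2))
    last_output = tf.gather(outputs, int(outputs.get_shape()[0]) - 1)

    # linear transform (initialize_weight_bias is the author's helper, not shown here)
    out_size = int(target.get_shape()[1])
    weight, bias = initialize_weight_bias(in_size=num_hidden, out_size=out_size)
    logits = tf.add(tf.matmul(last_output, weight), bias)

    return logits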

Because MNIST is so easy, this simple network is already enough for the classification task; test accuracy reaches about 0.985 within 3 epochs.
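The network passes output_keep_prob = 1.0 - dropout to DropoutWrapper, so the dropout value is the fraction of outputs that gets dropped. As a rough illustration of what that means, here is a plain-Python sketch of inverted dropout, where kept values are scaled by 1 / keep_prob. The function name and the rate of 0.5 are assumptions for the example; the post does not state the dropout rate it trained with.

```python
import random


def dropout_outputs(values, dropout, rng):
    """Inverted dropout: drop each value with probability `dropout`, rescale the rest."""
    keep_prob = 1.0 - dropout
    if keep_prob >= 1.0:
        return list(values)  # dropout = 0 leaves the outputs untouched
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
outputs = [1.0] * 10000

train_out = dropout_outputs(outputs, dropout=0.5, rng=rng)  # assumed training rate
test_out = dropout_outputs(outputs, dropout=0.0, rng=rng)   # evaluation: nothing dropped

kept_fraction = sum(1 for v in train_out if v != 0.0) / len(train_out)
mean_train = sum(train_out) / len(train_out)
print(round(kept_fraction, 2), round(mean_train, 2))
```

Because the kept values are rescaled, the expected output stays the same with and without dropout, which is why feeding dropout = 0 at test time needs no extra correction.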

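Training and testing

This is classification, so we use cross entropy as the loss and also compute the error rate. The hyperparameters are batch_size = 64 and lr = 0.001, and the code is as follows: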

# placeholders
input_x = tf.placeholder(tf.float32, shape=(None, 28, 28))  # images as 28 steps of 28 values
input_y = tf.placeholder(tf.float32, shape=(None, 10))      # one-hot labels
dropout = tf.placeholder(tf.float32)                        # fraction of outputs to drop
input_logits = model(input_x, input_y, dropout)

# loss and error rate op
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=input_logits, labels=input_y))
train_op = tf.train.RMSPropOptimizer(0.001).minimize(loss)
input_prob = tf.nn.softmax(input_logits)
# a sample counts as an error when the predicted class differs from the label
error_count = tf.not_equal(tf.argmax(input_prob, 1), tf.argmax(input_y, 1))
error_rate_op = tf.reduce_mean(tf.cast(error_count, tf.float32))

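input_x and input_y are the input images and labels, and model is the 3-layer GRU model defined above. You can use tf.summary to view the training error rate, loss, and other information in TensorBoard.

Training code: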
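To show what the loss and the error rate op compute, here is a plain-Python sketch of softmax cross entropy and the error rate on a tiny hand-made batch. The helper names are illustrative, and the example uses 3 classes instead of 10 to keep it short.

```python
import math


def softmax(logits):
    """Softmax, shifted by the max logit for numerical stability."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, one_hot):
    """Cross entropy between softmax(logits) and a one-hot label."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


def batch_loss_and_error(batch_logits, batch_labels):
    """Mean cross entropy and fraction of misclassified samples in a batch."""
    pairs = list(zip(batch_logits, batch_labels))
    losses = [cross_entropy(logits, label) for logits, label in pairs]
    errors = [float(argmax(logits) != argmax(label)) for logits, label in pairs]
    return sum(losses) / len(losses), sum(errors) / len(errors)


batch_logits = [[2.0, 0.5, 0.1], [0.2, 0.1, 3.0], [1.5, 1.0, 0.0], [0.0, 2.0, 0.3]]
batch_labels = [[1, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 0]]

loss, error_rate = batch_loss_and_error(batch_logits, batch_labels)
print(error_rate)  # 0.25: only the third sample is misclassified
```

Taking the argmax of the logits gives the same prediction as taking the argmax of the softmax probabilities, since softmax preserves the ordering of its inputs.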

# epoch comes from an enclosing loop over epochs that is not shown
for step in range(total_steps):
    train_x, train_y = mnist_data.train.next_batch(default_batch_size)
    # flat 784-value images -> 28 steps of 28 values
    train_x = train_x.reshape(-1, 28, 28)
    feed_dict = {input_x: train_x,
                 input_y: train_y,
                 dropout: default_dropout}
    _, summary = session.run([train_op, merge_summary_op], feed_dict=feed_dict)
    # write logs
    summary_writer.add_summary(summary, global_step=epoch*total_steps+step)

Testing code:

# test
if step > 0 and (step % test_freq == 0):
    avg_error = 0
    for test_step in range(total_test_steps):
        test_x, test_y = mnist_data.test.next_batch(default_batch_size)
        test_x = test_x.reshape(-1, 28, 28)
        feed_dict = {input_x: test_x,
                     input_y: test_y,
                     dropout: 0}
        test_error = session.run(error_rate_op, feed_dict=feed_dict)
        avg_error += test_error / total_test_steps
    print('epoch: %d, steps: %d, avg_test_error: %.4f' % (epoch, step, avg_error))
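The loops above rely on total_steps and total_test_steps, which the post does not define. The sketch below shows one plausible way to derive them, assuming integer division of the dataset size by the batch size, together with the global step used for logging and the running average used for the test error. The function names are hypothetical.

```python
def steps_per_epoch(num_examples, batch_size):
    """Number of full batches in one pass over the data (assumed definition)."""
    return num_examples // batch_size


def global_step(epoch, step, total_steps):
    """Step index across epochs, as passed to summary_writer.add_summary."""
    return epoch * total_steps + step


def average_error(batch_errors):
    """Accumulate per-batch errors the same way the test loop does."""
    avg_error = 0.0
    for test_error in batch_errors:
        avg_error += test_error / len(batch_errors)
    return avg_error


total_steps = steps_per_epoch(55000, 64)
total_test_steps = steps_per_epoch(10000, 64)
print(total_steps, total_test_steps)  # 859 156
print(global_step(epoch=2, step=10, total_steps=total_steps))  # 1728
print(round(average_error([0.02, 0.01, 0.015]), 4))  # 0.015
```

With this definition a few examples at the end of each pass are left out of the count (55000 is not a multiple of 64), which is harmless for a demo of this size.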

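Results

Training loss and error_rate:

[Figure: train_loss]

Test error_rate:

[Figure: test_error]

I only ran 3 epochs; the error rate dropped to about 1.5%, i.e. an accuracy of about 98.5%. Running a few more epochs would probably lower the error rate further, but this is enough for our demo.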
