TensorFlow Study Notes: Face Recognition on the ORL Dataset with a LeNet-5-Based Network

阿新 • Published: 2019-01-08

Reference:
《基於卷積神經網路的人臉識別研究》 ("Research on Face Recognition Based on Convolutional Neural Networks"), 李春利, 柳振東, 惠康華

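Starting from the classic LeNet-5 architecture, the paper proposes a CNN structure suited to the ORL dataset and reports a high recognition rate on it.

This post follows that paper and implements its ideas in TensorFlow.

After downloading and unpacking the ORL dataset, you will find 40 classes (one per person), each containing 10 images in BMP format.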

[Figure: the images in folder s1]
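
Before going further it can help to confirm that the unpacked folder really has the 40 × 10 layout described above. The following is only a sketch: the helper names `count_images_per_class` and `check_orl_layout` are made up for illustration, and it assumes the images sit in one subfolder per person (such as `s1`) with a `.bmp` extension.

```python
import os
from collections import Counter


def count_images_per_class(root, ext=".bmp"):
    """Return {subfolder name: number of files ending in ext} under root."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        n = sum(1 for name in filenames if name.lower().endswith(ext))
        if n:
            counts[os.path.basename(dirpath)] += n
    return dict(counts)


def check_orl_layout(root, n_classes=40, per_class=10):
    """True if root holds exactly n_classes folders with per_class images each."""
    counts = count_images_per_class(root)
    return len(counts) == n_classes and all(c == per_class for c in counts.values())
```

Calling `check_orl_layout("./orl")` before generating the split catches a partial download early, since the splitting code further down silently assumes 10 images per folder.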

The first step is to read these images and build our own training and test sets.

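import os
import random

import cv2

input_path = "./orl"
train_path = "./train"
test_path = "./test"

if not os.path.exists(train_path):
    os.mkdir(train_path)

if not os.path.exists(test_path):
    os.mkdir(test_path)

# One output folder per class (1..40) under both train and test
for i in range(1, 41):
    if not os.path.exists(train_path + '/' + str(i)):
        os.mkdir(train_path + '/' + str(i))
    if not os.path.exists(test_path + '/' + str(i)):
        os.mkdir(test_path + '/' + str(i))


# Generate the training and test data
def generate_data(train_path, test_path):
    index = 1          # position of the current image within its class (1..10)
    output_index = 1   # class number used for the output folder
    # Note: classes are numbered in the order os.walk visits the folders,
    # which is not guaranteed to match the s1..s40 folder names.
    for (dirpath, dirnames, filenames) in os.walk(input_path):
        # Shuffle the file list, i.e. pick 8 training and 2 test images at random
        random.shuffle(filenames)
        for filename in filenames:
            if filename.endswith('.bmp'):
                img_path = dirpath + '/' + filename
                # Read the image with OpenCV
                img_data = cv2.imread(img_path)
                # Resize to 28 * 28 as in the paper
                img_data = cv2.resize(img_data, (28, 28), interpolation=cv2.INTER_AREA)
                if index < 3:
                    # The first 2 shuffled images go to the test set
                    cv2.imwrite(test_path + '/' + str(output_index) + '/' + str(index) + '.jpg', img_data)
                    index += 1
                elif 10 >= index >= 3:
                    # The remaining 8 go to the training set
                    cv2.imwrite(train_path + '/' + str(output_index) + '/' + str(index) + '.jpg', img_data)
                    index += 1
                if index > 10:
                    # All 10 images of this class are done; move to the next class
                    output_index += 1
                    index = 1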

After running this we have 320 training images and 80 test images, all chosen at random (8 and 2 per class, respectively).
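
The 320/80 figure follows from the per-class split: 40 classes × 8 training images and 40 classes × 2 test images. As a rough illustration (not the author's code), the same split can be written as a small pure function. The name `split_class` and the `seed` parameter are assumptions added here so that the split is reproducible.

```python
import random


def split_class(filenames, n_test=2, seed=None):
    """Shuffle one class's file names and split them into (train, test)."""
    rng = random.Random(seed)
    shuffled = sorted(filenames)  # sort first so a given seed always gives the same split
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


# Simulate 40 classes with 10 images each and count the resulting set sizes.
total_train = total_test = 0
for class_id in range(1, 41):
    names = ["%d.bmp" % k for k in range(1, 11)]
    train_names, test_names = split_class(names, n_test=2, seed=class_id)
    total_train += len(train_names)
    total_test += len(test_names)

print(total_train, total_test)  # 320 80
```

Keeping the split in a function that returns lists, instead of writing files as a side effect of a counter, also makes it easy to check that no image ends up in both sets.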

Training set:

[Figure: training set folders]

[Figure: images of the first class]

Test set:

Next, write the train and test images into TFRecord files, labeling each image along the way.

import os

import cv2
import tensorflow as tf  # TensorFlow 1.x API (tf.python_io)


# Build an integer feature
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


# Build a bytes (string) feature
def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


train_path = "./train/"
test_path = "./test/"
classes = {i: i for i in range(1, 41)}
writer_train = tf.python_io.TFRecordWriter("orl_train.tfrecords")
writer_test = tf.python_io.TFRecordWriter("orl_test.tfrecords")


def generate():
    # Iterate over the class dictionary; index runs 0..39, name runs 1..40
    for index, name in enumerate(classes):
        train = train_path + str(name) + '/'
        test = test_path + str(name) + '/'
        for img_name in os.listdir(train):
            img_path = train + img_name  # path of each image
            img = cv2.imread(img_path)
            img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            img_raw = img.tobytes()  # raw uint8 pixels, 28 * 28 bytes
            example = tf.train.Example(features=tf.train.Features(feature={
                'label': _int64_feature(index + 1),  # labels are 1-based (1..40)
                'img_raw': _bytes_feature(img_raw)
            }))
            writer_train.write(example.SerializeToString())
        for img_name in os.listdir(test):
            img_path = test + img_name  # path of each image
            img = cv2.imread(img_path)
            img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            img_raw = img.tobytes()
            example = tf.train.Example(features=tf.train.Features(feature={
                'label': _int64_feature(index + 1),
                'img_raw': _bytes_feature(img_raw)
            }))
            writer_test.write(example.SerializeToString())
    writer_test.close()
    writer_train.close()
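
Note that `img_raw` holds only the raw pixel bytes. The record stores neither shape nor dtype, so the reader has to know that each image is 28 × 28 `uint8`. The sketch below shows that round trip with NumPy alone; `encode_image` and `decode_image` are illustrative names, and a synthetic array stands in for a real face image.

```python
import numpy as np


def encode_image(img):
    """Serialize a 2-D uint8 image to raw bytes, as img.tobytes() does above."""
    return np.ascontiguousarray(img, dtype=np.uint8).tobytes()


def decode_image(raw, height=28, width=28):
    """Rebuild a (height, width, 1) uint8 array from raw bytes."""
    flat = np.frombuffer(raw, dtype=np.uint8)
    if flat.size != height * width:
        raise ValueError("expected %d bytes, got %d" % (height * width, flat.size))
    return flat.reshape(height, width, 1)


# Synthetic stand-in for a 28x28 grayscale image.
img = (np.arange(28 * 28) % 256).astype(np.uint8).reshape(28, 28)
raw = encode_image(img)
restored = decode_image(raw)
```

This is the NumPy counterpart of decoding the bytes as `uint8` and reshaping to `[28, 28, 1]` on the TensorFlow side. If the stored image had three channels or a different size, the reshape would fail or silently scramble pixels, which is why the grayscale conversion before `tobytes()` matters.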

Now we can start training:

import os

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

import orl_inference

# BATCH_SIZE, SIZE, REGULARIZATION_RATE, MOVING_AVERAGE_DECAY, LEARNING_RATE_BASE,
# LEARNING_RATE_DECAY, TRAINING_STEPS, MODEL_SAVE_PATH, MODEL_NAME and get_label
# are defined elsewhere in the full code and are not shown in this post.


def train(data, label):
    x = tf.placeholder(tf.float32,
                       [BATCH_SIZE, SIZE, SIZE, orl_inference.NUM_CHANNELS],
                       name='x-input')

    y_ = tf.placeholder(tf.float32, [None, orl_inference.OUTPUT_NODE], name='y-output')

    # L2 regularizer, added to the loss below
    regularizer = tf.contrib.layers.l2_regularizer(REGULARIZATION_RATE)

    # Use tf.train.shuffle_batch to group images and labels into training batches
    min_after_dequeue = 100
    capacity = min_after_dequeue + 3 * BATCH_SIZE
    image_batch, label_batch = tf.train.shuffle_batch(
        [data, label], batch_size=BATCH_SIZE,
        capacity=capacity, min_after_dequeue=min_after_dequeue
    )

    y = orl_inference.inference(x, False, regularizer)

    global_step = tf.Variable(0, trainable=False)

    variable_averages = tf.train.ExponentialMovingAverage(
        MOVING_AVERAGE_DECAY, global_step
    )

    variable_averages_op = variable_averages.apply(tf.trainable_variables())

    # Cross entropy measures the gap between the predictions and the true labels
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1))

    # Mean cross entropy over all examples in the batch
    cross_entropy_mean = tf.reduce_mean(cross_entropy)

    # Total loss = cross-entropy loss + regularization losses
    loss = cross_entropy_mean + tf.add_n(tf.get_collection('losses'))

    # Exponentially decaying learning rate; 320 is the training set size
    learning_rate = tf.train.exponential_decay(
        LEARNING_RATE_BASE,
        global_step,
        320 / BATCH_SIZE,
        LEARNING_RATE_DECAY,
        staircase=True
    )

    # Minimize the loss
    train_step = tf.train.GradientDescentOptimizer(learning_rate) \
        .minimize(loss, global_step=global_step)

    # Run the gradient step and the moving-average update together
    with tf.control_dependencies([train_step, variable_averages_op]):
        train_op = tf.no_op(name='train')
    saver = tf.train.Saver()

    with tf.Session() as sess:
        tf.global_variables_initializer().run()
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        # Train the network iteratively
        for i in range(TRAINING_STEPS):
            xs, ys = sess.run([image_batch, label_batch])
            xs = xs / 255.0  # scale pixel values to [0, 1]
            reshaped_xs = np.reshape(xs, (BATCH_SIZE,
                                          SIZE,
                                          SIZE,
                                          orl_inference.NUM_CHANNELS))
            # Convert the integer labels into the format expected by y_
            ys = get_label(ys)
            _, loss_value, step = sess.run([train_op, loss, global_step],
                                           feed_dict={x: reshaped_xs, y_: ys})

            if i % 100 == 0:
                # Every 100 steps, print the loss on the current training batch
                # and save a checkpoint
                print("After %d training step[s], loss on training"
                      " batch is %g. " % (step, loss_value))

                saver.save(
                    sess, os.path.join(MODEL_SAVE_PATH, MODEL_NAME),
                    global_step=global_step
                )
        coord.request_stop()
        coord.join(threads)
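
The training loop calls `get_label(ys)`, but the helper itself is not shown in this post. Since `y_` has shape `[None, OUTPUT_NODE]` and the loss recovers the class with `tf.argmax(y_, 1)`, it presumably turns the 1-based labels stored in the TFRecord into one-hot rows. The version below is a guess at what such a helper could look like, not the author's code; `NUM_CLASSES = 40` is assumed to match `orl_inference.OUTPUT_NODE`.

```python
import numpy as np

NUM_CLASSES = 40  # assumption: equal to orl_inference.OUTPUT_NODE


def get_label(labels, num_classes=NUM_CLASSES):
    """Convert 1-based integer labels (1..num_classes) to float32 one-hot rows."""
    labels = np.asarray(labels, dtype=np.int64).reshape(-1)
    if labels.size and (labels.min() < 1 or labels.max() > num_classes):
        raise ValueError("labels must lie in 1..%d" % num_classes)
    one_hot = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Label k sets column k - 1, so argmax over a row yields k - 1.
    one_hot[np.arange(labels.size), labels - 1] = 1.0
    return one_hot
```

With this convention, label 1 maps to column 0 and label 40 to column 39, which is the 0-based class index that `sparse_softmax_cross_entropy_with_logits` expects after the `argmax`.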

[Figure: output during training]
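
The training code uses an exponentially decaying learning rate with `staircase=True` and a decay period of `320 / BATCH_SIZE` steps, which is one pass over the 320 training images. The schedule can be reproduced in plain Python to see what it does. The function name and the example values (base rate 0.1, decay 0.5, batch size 32) are assumptions chosen for readability, since the post does not list its hyperparameters.

```python
import math


def decayed_learning_rate(base_rate, global_step, decay_steps, decay_rate, staircase=True):
    """Return base_rate * decay_rate ** (global_step / decay_steps).

    With staircase=True the exponent is floored, so the rate drops in discrete
    steps once every decay_steps updates instead of shrinking continuously.
    """
    exponent = global_step / decay_steps
    if staircase:
        exponent = math.floor(exponent)
    return base_rate * decay_rate ** exponent


# Assumed example values: 320 training images, batch size 32 -> 10 steps per epoch.
decay_steps = 320 / 32
for step in (0, 9, 10, 25):
    print(step, decayed_learning_rate(0.1, step, decay_steps, 0.5))
```

In staircase mode the rate stays constant within an epoch and is multiplied by the decay factor once per epoch, which is why the decay period is tied to the training set size.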

Run the evaluation:

import time

import cv2
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

import orl_inference
import orl_train

# EVAL_INTERVAL_SECS and get_label are defined elsewhere in the full code.


def evaluate():
    with tf.Graph().as_default() as g:
        # Read and decode the test TFRecord file
        filename_queue = tf.train.string_input_producer(["orl_test.tfrecords"])
        reader = tf.TFRecordReader()
        _, serialized_example = reader.read(filename_queue)
        features = tf.parse_single_example(serialized_example,
                                           features={
                                               'label': tf.FixedLenFeature([], tf.int64),
                                               'img_raw': tf.FixedLenFeature([], tf.string),
                                           })
        img = tf.decode_raw(features['img_raw'], tf.uint8)
        img = tf.reshape(img, [28, 28, 1])
        label = tf.cast(features['label'], tf.int32)
        min_after_dequeue = 100
        capacity = min_after_dequeue + 3 * 200
        # One batch of 80 images, the size of the test set
        image_batch, label_batch = tf.train.shuffle_batch(
            [img, label], batch_size=80,
            capacity=capacity, min_after_dequeue=min_after_dequeue
        )

        x = tf.placeholder(tf.float32,
                           [80,
                            orl_inference.IMAGE_SIZE,
                            orl_inference.IMAGE_SIZE,
                            orl_inference.NUM_CHANNELS],
                           name='x-input')
        y_ = tf.placeholder(
            tf.float32, [None, orl_inference.OUTPUT_NODE], name='y-input'
        )

        y = orl_inference.inference(x, None, None)

        # Fraction of examples whose predicted class matches the label
        correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        # Restore the moving-average (shadow) values of the variables
        variable_averages = tf.train.ExponentialMovingAverage(
            orl_train.MOVING_AVERAGE_DECAY
        )
        variable_to_restore = variable_averages.variables_to_restore()
        saver = tf.train.Saver(variable_to_restore)

        # Re-evaluate the latest checkpoint every EVAL_INTERVAL_SECS seconds
        while True:
            with tf.Session() as sess:
                # Single test image, used only by the commented-out prediction below
                test = cv2.imread('./data/20/10.jpg')
                test = cv2.cvtColor(test, cv2.COLOR_BGR2GRAY)
                test = np.array(test)
                test = test / 255.0
                test_re = np.reshape(test, (1, 28, 28, 1))

                coord = tf.train.Coordinator()
                threads = tf.train.start_queue_runners(sess=sess, coord=coord)
                xs, ys = sess.run([image_batch, label_batch])
                # Stop the queue threads started for this session
                coord.request_stop()
                coord.join(threads)
                ys = get_label(ys)
                xs = xs / 255.0
                validate_feed = {x: xs,
                                 y_: ys}

                cpkt = tf.train.get_checkpoint_state(
                    orl_train.MODEL_SAVE_PATH
                )
                if cpkt and cpkt.model_checkpoint_path:
                    # Load the model
                    saver.restore(sess, cpkt.model_checkpoint_path)
                    # Recover the training step count from the checkpoint file name
                    global_step = cpkt.model_checkpoint_path \
                        .split('/')[-1].split('-')[-1]
                    # result = sess.run(y, feed_dict={x: test_re})
                    # re = np.where(result == np.max(result))
                    # ss = tf.argmax(result, 1)
                    # tt = np.argmax(result, 1)
                    # print('result is %d'%(tt[0] + 1))
                    accuracy_score = sess.run(accuracy, feed_dict=validate_feed)
                    print("After %s training steps, validation "
                          "accuracy = %g" % (global_step, accuracy_score))
                else:
                    print("No checkpoint file found")
                    return
            time.sleep(EVAL_INTERVAL_SECS)

[Figure: validation results]
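
The accuracy reported by the evaluation code is simply the fraction of test images whose highest-scoring output matches the one-hot label. A NumPy sketch of the same computation, with made-up scores for four images and three classes, looks like this; `batch_accuracy` is an illustrative name.

```python
import numpy as np


def batch_accuracy(logits, one_hot_labels):
    """Fraction of rows where argmax(logits) equals argmax(one_hot_labels)."""
    predicted = np.argmax(logits, axis=1)
    expected = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == expected))


# Made-up network outputs for 4 images and 3 classes.
logits = np.array([[2.0, 1.0, 0.0],
                   [0.0, 3.0, 1.0],
                   [1.0, 0.0, 4.0],
                   [5.0, 0.0, 0.0]])
# True classes 0, 1, 2, 1 as one-hot rows; the last prediction is wrong.
labels = np.eye(3)[[0, 1, 2, 1]]
print(batch_accuracy(logits, labels))  # 0.75
```

Because only the position of the largest score matters, the logits do not have to pass through softmax before the comparison.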

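This experiment draws on the book 《Tensorflow 實戰Google深度學習框架》. Using what I learned from it, I reproduced the experiment from the paper, which also deepened my understanding.
Full code: rumor has it that people who star the repository become better looking.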
