(Face Recognition 2-5): Preparing the Training Set for a Face Recognition Model

阿新 • Published: 2018-12-31

Preparing the training set for the face recognition model

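The previous parts captured face images and saved them in the traindata folder, but that alone is not enough. The images still have to be resized: they differ in dimensions and aspect ratio, which makes them awkward to feed into model training later, and larger image files also take far more computation to process.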
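
To put a rough number on the computation argument, the sketch below compares how many values one image contributes before and after resizing. The 640 x 480 frame size is only an assumed example of a webcam capture, not a figure from this series, and values_per_image is a hypothetical helper; the 64 x 64 target matches IMAGE_SIZE in the script that follows.

```python
def values_per_image(height, width, channels=3):
    """Number of values in one color image of the given size."""
    return height * width * channels


# Assumed example: a 640 x 480 color frame (not a figure from the original post).
full_frame = values_per_image(480, 640)
# The 64 x 64 target size used by the script in this post.
resized = values_per_image(64, 64)

print(full_frame, resized, full_frame // resized)
```

Under that assumption each resized image carries 75 times fewer values than the raw frame, which adds up quickly over a training set of more than a thousand images.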

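A reminder about the folder layout: the script looks for the traindata folder in the parent of the directory it runs from. If the layout differs, the images cannot be read and the program will not run. Once you understand the code, though, small changes are enough to adapt it to a different layout.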

The code comes first; all of the detailed explanation is in the comments.

# -*- coding: utf-8 -*-

import os
import numpy as np
import cv2

IMAGE_SIZE = 64


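# Resize an image to the given size, padding it to a square first

def resize_image(image, height=IMAGE_SIZE, width=IMAGE_SIZE):
    top, bottom, left, right = (0, 0, 0, 0)

    # Get the image dimensions
    h, w, _ = image.shape

    # For images whose height and width differ, find the longer edge
    longest_edge = max(h, w)

    # Work out how many pixels to add to the shorter edge so it matches the longer one
    if h < longest_edge:
        dh = longest_edge - h
        top = dh // 2
        bottom = dh - top
    elif w < longest_edge:
        dw = longest_edge - w
        left = dw // 2
        right = dw - left

    # Border color (black is the same in RGB and in OpenCV's BGR order)
    BLACK = [0, 0, 0]

    # Pad the image so its height and width are equal; cv2.BORDER_CONSTANT fills the border with the color given by value
    constant = cv2.copyMakeBorder(image, top, bottom, left, right, cv2.BORDER_CONSTANT, value=BLACK)

    # Resize and return; cv2.resize expects the target size as (width, height)
    return cv2.resize(constant, (width, height))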


# Read the training data
images = []
labels = []

def file_exit(path_name, son_path_name):  # check whether path_name contains an entry named son_path_name
    lists = os.listdir(path_name)  # every entry in that directory
    for name in lists:  # scan the entries; return 1 if one matches son_path_name, meaning the dataset folder was found
        if name == son_path_name:
            print('file exists')
            return 1
    return 0


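def read_path(path_name, son_path_name):  # read the dataset under the path and attach a class label to each image
    parent_path = os.path.abspath(os.path.join(path_name, '..'))
    exit_code = file_exit(parent_path, son_path_name)  # check whether the dataset folder exists there

    if exit_code == 1:  # if it does, the full path is the parent directory plus the dataset folder name
        fullpath = os.path.join(parent_path, son_path_name)

        for dir_item in os.listdir(fullpath):  # iterate over every file in the folder

            if dir_item.endswith('.jpg'):  # only .jpg images are loaded and resized
                image = cv2.imread(os.path.join(fullpath, dir_item))
                if image is None:  # cv2.imread returns None when a file cannot be decoded
                    continue
                image = resize_image(image, IMAGE_SIZE, IMAGE_SIZE)

                images.append(image)
                labels.append(son_path_name)
    print(labels)
    return images, labels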


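# Load training data from the given location: path_name is the directory of the current file, son_path_name is the name of the folder that holds the data
def load_dataset(path_name, son_path_name):
    images, labels = read_path(path_name, son_path_name)  # read every image in that folder

    # Convert the list of images into a 4-D array of shape (number of images, IMAGE_SIZE, IMAGE_SIZE, 3)
    # IMAGE_SIZE is 64, so in my case the shape is 1200 * 64 * 64 * 3
    # Each image is 64 * 64 pixels with 3 color values per pixel (OpenCV stores them in BGR order)
    images = np.array(images)
    print(images.shape)

    # Label the data: face images in the 'traindata' folder are the training set and all get 0; images from the other folder are the test set and all get 1
    labels = np.array([0 if label == 'traindata' else 1 for label in labels])
    print(labels)
    return images, labels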


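if __name__ == '__main__':
    # os.getcwd() returns the current working directory, so run the script from its own folder;
    # every training image in the traindata folder one level above that directory is then loaded
    path_name = os.getcwd()
    images, labels = load_dataset(path_name, 'traindata')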
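
To make the padding step in resize_image easier to follow, here is a minimal sketch of the arithmetic on its own, with no OpenCV dependency. The helper name square_padding is hypothetical and is not part of the script above; it only reproduces how the four border widths are derived from the height and width.

```python
def square_padding(h, w):
    """Return (top, bottom, left, right) border widths that make an h x w image square."""
    longest_edge = max(h, w)
    dh = longest_edge - h  # rows missing when the image is wider than it is tall
    dw = longest_edge - w  # columns missing when the image is taller than it is wide
    top = dh // 2
    left = dw // 2
    # Any odd leftover pixel goes to the bottom or right border.
    return top, dh - top, left, dw - left


# A landscape 480 x 640 frame needs 80 rows above and 80 rows below.
print(square_padding(480, 640))
# A portrait 5 x 2 image needs 1 column on the left and 2 on the right.
print(square_padding(5, 2))
```

Padding before resizing keeps the face's proportions intact; resizing a non-square image straight to 64 x 64 would stretch it.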

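The original blogger handled all of the folder operations in a single pass, which I do not think is a good approach. I rewrote that part of the code and added some functions of my own for locating and reading the folders. For background on the folder operations involved, see the reposted article on Python os file and directory operations.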

The main file-handling code is:

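def file_exit(path_name, son_path_name):  # check whether path_name contains an entry named son_path_name
    lists = os.listdir(path_name)  # every entry in that directory
    for name in lists:  # scan the entries; return 1 if one matches son_path_name, meaning the dataset folder was found
        if name == son_path_name:
            print('file exists')
            return 1
    return 0


def read_path(path_name, son_path_name):  # read the dataset under the path and attach a class label to each image
    parent_path = os.path.abspath(os.path.join(path_name, '..'))
    exit_code = file_exit(parent_path, son_path_name)  # check whether the dataset folder exists there

    if exit_code == 1:  # if it does, the full path is the parent directory plus the dataset folder name
        fullpath = os.path.join(parent_path, son_path_name)

        for dir_item in os.listdir(fullpath):  # iterate over every file in the folder

            if dir_item.endswith('.jpg'):  # only .jpg images are loaded and resized
                image = cv2.imread(os.path.join(fullpath, dir_item))
                if image is None:  # cv2.imread returns None when a file cannot be decoded
                    continue
                image = resize_image(image, IMAGE_SIZE, IMAGE_SIZE)

                images.append(image)
                labels.append(son_path_name)
    print(labels)
    return images, labels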

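There are two functions here. The first checks whether the given path contains a folder named son_path_name, which in this case is the traindata folder. The second takes the directory one level above the current Python file's path, which is the openvideo_test folder shown in the earlier screenshot; we have to step up to that parent directory before the traindata folder can be reached. See the code above for the details.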
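
As an alternative, the same lookup can be sketched with the standard library's os.path helpers instead of scanning the directory listing by hand. The function name find_sibling_images and its extensions parameter are hypothetical, and unlike read_path this sketch only returns the file paths and loads no images.

```python
import os


def find_sibling_images(path_name, son_path_name, extensions=('.jpg',)):
    """List image files in the folder son_path_name located one level above path_name."""
    parent_path = os.path.abspath(os.path.join(path_name, os.pardir))
    fullpath = os.path.join(parent_path, son_path_name)

    # os.path.isdir replaces the manual scan done by file_exit.
    if not os.path.isdir(fullpath):
        return []

    # Compare extensions case-insensitively so that .JPG files are included as well.
    return sorted(
        os.path.join(fullpath, name)
        for name in os.listdir(fullpath)
        if name.lower().endswith(extensions)
    )
```

Calling find_sibling_images(os.getcwd(), 'traindata') would return the sorted paths of the .jpg files in the traindata folder, or an empty list when that folder is missing.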

Next, we will use the keras library to train a model on this training data.
