Cropping Images from the CelebA Dataset

阿新 • Published: 2018-12-13

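The CelebA face dataset (see the official website) is an open dataset from The Chinese University of Hong Kong. It contains 202,599 face images covering 10,177 celebrity identities, all with feature annotations, which makes it a very convenient dataset for training face-related models.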
Every image is also annotated with face attributes.

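Sometimes, however, we only need the part of the image where the face is. The dataset provides two kinds of annotation for this: the coordinates of five facial landmarks and a bounding box (bbox) for each face. Using these, the dataset can be processed into a new dataset that contains only the faces.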
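As a minimal sketch of how those annotations can be read: each data row in the bbox and landmark files is assumed to be whitespace-separated, with the file name first and integer values after it (x, y, width, height for a bbox; five x/y pairs for the landmarks). The helper name `parse_annotation_line` and the sample rows are illustrative only and are not taken from the dataset.

```python
from typing import List, Tuple


def parse_annotation_line(line: str) -> Tuple[str, List[int]]:
    """Split one annotation row into (filename, list of integer values)."""
    fields = line.split()
    return fields[0], [int(v) for v in fields[1:]]


# Hypothetical rows laid out like the bbox and landmark annotation files.
bbox_row = "000001.jpg    95  71 226 313"
landmark_row = "000001.jpg 165 184 244 176 196 249 194 271 266 260"

name, (x, y, w, h) = parse_annotation_line(bbox_row)
_, values = parse_annotation_line(landmark_row)

# Group the flat landmark list into (x, y) pairs.
pairs = list(zip(values[0::2], values[1::2]))

print(name, x, y, w, h)
print(pairs)
```

Because the rows are split on any whitespace, the sketch does not depend on how many spaces or tabs separate the columns.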
The complete script for processing the dataset is below:

# encoding:utf-8

import cv2
import numpy as np
import os
import sys
from tqdm import tqdm

# Directory of the images to process
img_path = 'img_celeba/'
# Directory where the new images are saved
new_img_path = 'CelebA_img/'
# Path of the face landmark annotation file
landmark_anno_file_path = 'Anno/list_landmarks_celeba.txt'
# Path of the face bbox annotation file
face_boundingbox_anno_file_path = 'Anno/list_bbox_celeba.txt'
# Path of the new face landmark annotation file
new_landmark_anno_file_path = 'Anno/new_list_landmarks_celeba.txt'


# Height and width of the new images
new_h = 256
new_w = 256

if not os.path.exists(img_path):
    print("image path not exist.")
    sys.exit(1)

if not os.path.exists(landmark_anno_file_path):
    print("landmark_anno_file not exist.")
    sys.exit(1)

if not os.path.exists(face_boundingbox_anno_file_path):
    print("face_boundingbox_anno_file not exist.")
    sys.exit(1)

if not os.path.exists(new_img_path):
    os.makedirs(new_img_path)
else:
    # Remove files left over from a previous run
    for name in os.listdir(new_img_path):
        old_file = os.path.join(new_img_path, name)
        if os.path.isfile(old_file):
            os.remove(old_file)

# Load the annotation files
landmark_anno_file = open(landmark_anno_file_path, 'r')
face_boundingbox_anno_file = open(face_boundingbox_anno_file_path, 'r')
new_landmark_anno_file = open(new_landmark_anno_file_path, 'w')
landmark_anno = landmark_anno_file.readlines()
face_bbox = face_boundingbox_anno_file.readlines()
# The first two lines of each annotation file are the image count and the column header
for i in tqdm(range(2, len(landmark_anno))):
    landmark_split = landmark_anno[i].split()
    face_bbox_split = face_bbox[i].split()
    filename = landmark_split[0]
    # Both files must list the images in the same order
    if filename != face_bbox_split[0]:
        print(filename, face_bbox_split[0])
        break
    # Five landmark points, one (x, y) pair per row
    landmarks = np.array([int(v) for v in landmark_split[1:]]).reshape(5, 2)
    # Bounding box as (x, y, width, height)
    face = np.array([int(v) for v in face_bbox_split[1:]])

    try:
        path = os.path.join(img_path, filename)
        new_path = os.path.join(new_img_path, filename)
        if not os.path.exists(path):
            print(path, 'not exist')
            continue
        img = cv2.imread(path)

        # Crop the image: rows y..y+height, columns x..x+width
        newImg = img[face[1]:face[3]+face[1], face[0]:face[2]+face[0]]

        # Recompute the landmark coordinates for the new image and store them
        new_landmark_str = ""
        new_landmark_str += filename+'\t'
        for point in landmarks:
            # Shift to the crop's origin, then scale to the new size
            point[0] -= face[0]
            point[1] -= face[1]
            point[0] = round(point[0]*(new_w*1.0/newImg.shape[1]))
            point[1] = round(point[1]*(new_h*1.0/newImg.shape[0]))
            new_landmark_str += str(point[0])+'\t'+str(point[1])+'\t'
        new_landmark_str += '\n'
        new_landmark_anno_file.write(new_landmark_str)
        new_landmark_anno_file.flush()
        # cv2.resize takes the target size as (width, height)
        resizeImg = cv2.resize(newImg, (new_w, new_h))
        # Save the new image
        cv2.imwrite(new_path, resizeImg)
    except Exception as e:
        print("filename:%s process failed: %s" % (filename, e))

landmark_anno_file.close()
face_boundingbox_anno_file.close()
new_landmark_anno_file.close()
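
The landmark recalculation in the script can also be written as a small standalone function, which makes it easy to check by hand. This is only a sketch: the name `remap_landmarks` and the sample numbers are hypothetical, and it assumes the bbox lies entirely inside the image, so the crop has exactly the bbox's width and height.

```python
from typing import List, Sequence, Tuple


def remap_landmarks(
    points: Sequence[Tuple[int, int]],
    bbox: Tuple[int, int, int, int],
    out_size: Tuple[int, int],
) -> List[Tuple[int, int]]:
    """Map landmarks from original-image coordinates to the resized crop.

    bbox is (x, y, width, height); out_size is (width, height).
    """
    x, y, w, h = bbox
    out_w, out_h = out_size
    scale_x = out_w / w
    scale_y = out_h / h
    # Shift each point to the crop's origin, then scale to the output size.
    return [(round((px - x) * scale_x), round((py - y) * scale_y))
            for px, py in points]


# Hypothetical values: a 128x128 box resized to 256x256 doubles every offset.
bbox = (100, 50, 128, 128)
points = [(132, 82), (196, 82), (164, 114)]
print(remap_landmarks(points, bbox, (256, 256)))
# → [(64, 64), (192, 64), (128, 128)]
```

A landmark that falls outside the bbox maps to a negative coordinate or one beyond the output size, so it may be worth checking the new annotation file for such values.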
