Data augmentation for object detection: augmenting XML annotation files (rotation implementation)

阿新 • Published: 2019-01-05

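1. Background:

When training a CNN for object detection with too little data, the dataset can be expanded by rotating the source images.

Example:
The source image is shown below:
[Figure: source image]
The object information in the annotated XML file is as follows: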

<object>
    <name>airplane</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
        <xmin>431</xmin>
        <ymin>367</ymin>
        <xmax>607</xmax>
        <ymax>453</ymax>
    </bndbox>
</object>
<object>
    <name>airplane</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
        <xmin>570</xmin>
        <ymin>419</ymin>
        <xmax>768</xmax>
        <ymax>512</ymax>
    </bndbox>
</object>

To rotate the source image by an arbitrary angle, the bndbox information in the corresponding XML file must be updated to match.
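As a starting point, the boxes can be read out of the annotation with the standard library. The sketch below is illustrative only: the `<annotation>` root element and the `read_boxes` helper are assumptions (the excerpt above shows only the `<object>` elements), and the sample is trimmed to the fields that matter here.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation: the two objects from the example above, wrapped in
# an assumed <annotation> root so the snippet parses on its own.
SAMPLE_XML = """
<annotation>
    <object>
        <name>airplane</name>
        <bndbox>
            <xmin>431</xmin>
            <ymin>367</ymin>
            <xmax>607</xmax>
            <ymax>453</ymax>
        </bndbox>
    </object>
    <object>
        <name>airplane</name>
        <bndbox>
            <xmin>570</xmin>
            <ymin>419</ymin>
            <xmax>768</xmax>
            <ymax>512</ymax>
        </bndbox>
    </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return a list of (name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter('object'):
        bnd = obj.find('bndbox')
        coords = [int(bnd.find(tag).text)
                  for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
        boxes.append((obj.find('name').text,) + tuple(coords))
    return boxes


boxes = read_boxes(SAMPLE_XML)
for name, xmin, ymin, xmax, ymax in boxes:
    print(name, xmin, ymin, xmax, ymax)
```

These four numbers per object are the values that have to be recomputed whenever the image is rotated.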

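2. Approach:

Find the midpoints of the four edges of each annotated box in the original image, compute where those points land after rotation, then use the cv2.boundingRect function to find the bounding rectangle of the four new points and write it to the new XML file as the bndbox values.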
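To see why edge midpoints are used rather than corners, the geometry can be tried out without OpenCV. This is a minimal sketch under stated assumptions: `rotate_point`, `bounding_rect` and `rotated_box` are hypothetical helpers, the rotation is taken about the box's own center for simplicity, and plain min/max stands in for cv2.boundingRect. The sign convention follows cv2.getRotationMatrix2D (positive angles rotate counter-clockwise with the origin at the top left).

```python
import math


def rotate_point(x, y, cx, cy, angle_deg):
    """Rotate (x, y) about (cx, cy), same sign convention as cv2.getRotationMatrix2D."""
    a = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    return (cx + math.cos(a) * dx + math.sin(a) * dy,
            cy - math.sin(a) * dx + math.cos(a) * dy)


def bounding_rect(points):
    """Axis-aligned rectangle (xmin, ymin, xmax, ymax) enclosing the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


def rotated_box(xmin, ymin, xmax, ymax, cx, cy, angle_deg, use_midpoints=True):
    """Rotate either the edge midpoints or the corners and re-box them."""
    mx, my = (xmin + xmax) / 2.0, (ymin + ymax) / 2.0
    if use_midpoints:
        pts = [(mx, ymin), (xmax, my), (mx, ymax), (xmin, my)]
    else:
        pts = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    return bounding_rect([rotate_point(x, y, cx, cy, angle_deg) for x, y in pts])


box = (431, 367, 607, 453)          # first airplane from the example
cx, cy = 519.0, 410.0               # its center, used as the rotation center
mid = rotated_box(*box, cx, cy, 45)
cor = rotated_box(*box, cx, cy, 45, use_midpoints=False)
print('midpoint box width:', round(mid[2] - mid[0], 2))
print('corner box width:  ', round(cor[2] - cor[0], 2))
```

At 45° the corner-based box comes out noticeably wider than the midpoint-based one, while at multiples of 90° the two agree. The midpoint box is a heuristic: it is tighter, but for objects that fill the corners of their original box it can cut off part of the object.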

3. Implementation:

# coding:utf-8
# Copyright@hitzym, Dec,09,2017 at HIT
# blog:http://blog.csdn.net/yinhuan1649/article/category/7330626
import cv2
import math
import numpy as np
import xml.etree.ElementTree as ET
import os

def rotate_image(src, angle, scale=1):
    w = src.shape[1]
    h = src.shape[0]
    # convert the angle from degrees to radians
    rangle = np.deg2rad(angle)
    # now calculate new image width and height
    nw = (abs(np.sin(rangle) * h) + abs(np.cos(rangle) * w)) * scale
    nh = (abs(np.cos(rangle) * h) + abs(np.sin(rangle) * w)) * scale
    # ask OpenCV for the rotation matrix
    rot_mat = cv2.getRotationMatrix2D((nw * 0.5, nh * 0.5), angle, scale)
    # calculate the move from the old center to the new center combined
    # with the rotation
    rot_move = np.dot(rot_mat, np.array([(nw - w) * 0.5, (nh - h) * 0.5, 0]))
    # the move only affects the translation, so update the translation
    # part of the transform
    rot_mat[0, 2] += rot_move[0]
    rot_mat[1, 2] += rot_move[1]
    # apply the affine transform onto a canvas large enough for the whole rotated image
    dst = cv2.warpAffine(src, rot_mat, (int(math.ceil(nw)), int(math.ceil(nh))), flags=cv2.INTER_LANCZOS4)
    return dst

# Update the bounding box from the XML file to match the rotated image
def rotate_xml(src, xmin, ymin, xmax, ymax, angle, scale=1.):
    w = src.shape[1]
    h = src.shape[0]
    rangle = np.deg2rad(angle)  # angle in radians
    # width and height of the rotated image
    nw = (abs(np.sin(rangle)*h) + abs(np.cos(rangle)*w))*scale
    nh = (abs(np.cos(rangle)*h) + abs(np.sin(rangle)*w))*scale
    # ask OpenCV for the rotation matrix
    rot_mat = cv2.getRotationMatrix2D((nw*0.5, nh*0.5), angle, scale)
    # calculate the move from the old center to the new center combined
    # with the rotation
    rot_move = np.dot(rot_mat, np.array([(nw-w)*0.5, (nh-h)*0.5, 0]))
    # the move only affects the translation, so update the translation
    # part of the transform
    rot_mat[0, 2] += rot_move[0]
    rot_mat[1, 2] += rot_move[1]                                   # rot_mat is now the final transform matrix
    # Transforming the four corners instead produces a box that is visibly too large:
    # point1 = np.dot(rot_mat, np.array([xmin, ymin, 1]))
    # point2 = np.dot(rot_mat, np.array([xmax, ymin, 1]))
    # point3 = np.dot(rot_mat, np.array([xmax, ymax, 1]))
    # point4 = np.dot(rot_mat, np.array([xmin, ymax, 1]))
    # Take the midpoints of the four edges of the original box and map them
    # into the coordinate system of the rotated image.
    point1 = np.dot(rot_mat, np.array([(xmin+xmax)/2, ymin, 1]))
    point2 = np.dot(rot_mat, np.array([xmax, (ymin+ymax)/2, 1]))
    point3 = np.dot(rot_mat, np.array([(xmin+xmax)/2, ymax, 1]))
    point4 = np.dot(rot_mat, np.array([xmin, (ymin+ymax)/2, 1]))
    concat = np.vstack((point1, point2, point3, point4))            # stack the points into one array
    # convert to an integer array, as cv2.boundingRect expects
    concat = concat.astype(np.int32)
    # rx, ry: top-left corner of the new bounding box; rw, rh: its width and height.
    # The new xmax is rx + rw and the new ymax is ry + rh.
    rx, ry, rw, rh = cv2.boundingRect(concat)
    return rx, ry, rw, rh

# Rotate every image by 60, 90, 120, 150, 180, 210, 240 and 300 degrees
xmlpath = './xml/'          # directory of the XML files that match the source images
imgpath = './imgs/'         # directory of the source images
rotated_imgpath = './rotatedimg/'
rotated_xmlpath = './rotatedxml/'
for angle in (60, 90, 120, 150, 180, 210, 240, 300):
    for i in os.listdir(imgpath):
        a, b = os.path.splitext(i)                            # a is the file name without its extension
        img = cv2.imread(imgpath + a + '.jpg')
        rotated_img = rotate_image(img, angle)
        cv2.imwrite(rotated_imgpath + a + '_' + str(angle) + 'd.jpg', rotated_img)
        print(str(i) + ' has been rotated for ' + str(angle) + '°')
        tree = ET.parse(xmlpath + a + '.xml')
        root = tree.getroot()
        for box in root.iter('bndbox'):
            xmin = float(box.find('xmin').text)
            ymin = float(box.find('ymin').text)
            xmax = float(box.find('xmax').text)
            ymax = float(box.find('ymax').text)
            x, y, w, h = rotate_xml(img, xmin, ymin, xmax, ymax, angle)
            # Uncomment to check here that the new box is drawn in the right place:
            # cv2.rectangle(rotated_img, (x, y), (x+w, y+h), [0, 0, 255], 2)
            # cv2.imshow('xmlbnd', rotated_img)
            # cv2.waitKey(200)
            box.find('xmin').text = str(x)
            box.find('ymin').text = str(y)
            box.find('xmax').text = str(x+w)
            box.find('ymax').text = str(y+h)
        tree.write(rotated_xmlpath + a + '_' + str(angle) + 'd.xml')
        print(str(a) + '.xml has been rotated for ' + str(angle) + '°')
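The canvas-size formula used in rotate_image and rotate_xml can be checked in isolation. The sketch below is not part of the original script: `rotated_canvas_size` and `update_size` are hypothetical helpers, the rounding before `ceil` is an added guard against floating-point noise (which can otherwise add a pixel at angles such as 90°), and the `<size>` element is assumed to follow the Pascal VOC layout, which the excerpt above does not show.

```python
import math
import xml.etree.ElementTree as ET


def rotated_canvas_size(w, h, angle_deg, scale=1.0):
    """Width and height of the canvas that holds the whole rotated image."""
    a = math.radians(angle_deg)
    nw = (abs(math.sin(a) * h) + abs(math.cos(a) * w)) * scale
    nh = (abs(math.cos(a) * h) + abs(math.sin(a) * w)) * scale
    # round away float noise before taking the ceiling (assumption, see above)
    return int(math.ceil(round(nw, 6))), int(math.ceil(round(nh, 6)))


def update_size(root, new_w, new_h):
    """Write the new dimensions into <size>, if the annotation has one."""
    size = root.find('size')
    if size is None:
        return False
    size.find('width').text = str(new_w)
    size.find('height').text = str(new_h)
    return True


for angle in (60, 90, 180):
    print(angle, rotated_canvas_size(800, 600, angle))
```

If the annotation files do record the image size, it changes with every rotation that is not a multiple of 180°, so it may be worth updating it alongside the bndbox values.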

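4. Testing the rotation results

Draw the bounding boxes from the XML files on the images to check that the rotated results are correct.

Notes:
- Each XML file must have the same file name as its corresponding jpg file
- e.g. monkey001.jpg corresponds to monkey001.xml

The code: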

# coding:utf-8
# Copyright@hitzym, Dec,09,2017 at HIT
# blog:http://blog.csdn.net/yinhuan1649/article/category/7330626
import cv2
import xml.etree.ElementTree as ET
import os

imgpath = './testimgs/'          # directory of the rotated images
xmlpath = './testxml/'           # directory of the rotated XML files
for name in os.listdir(imgpath):
    a, b = os.path.splitext(name)
    img = cv2.imread(imgpath + a + '.jpg')
    tree = ET.parse(xmlpath + a + '.xml')
    root = tree.getroot()
    for box in root.iter('bndbox'):
        x1 = int(box.find('xmin').text)
        y1 = int(box.find('ymin').text)
        x2 = int(box.find('xmax').text)
        y2 = int(box.find('ymax').text)
        cv2.rectangle(img, (x1, y1), (x2, y2), [0, 255, 0], 2)
    cv2.imshow("test", img)
    # cv2.waitKey(1000)          # alternative: advance automatically after one second
    cv2.waitKey(0)               # wait for a key press before showing the next image
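For a large dataset, looking at every image by eye gets slow, so a non-visual sanity check can complement the viewer above. This is only a sketch: `check_boxes` is a hypothetical helper, the image size is passed in by the caller, and the `<annotation>` wrapper in the sample is an assumption.

```python
import xml.etree.ElementTree as ET


def check_boxes(xml_text, img_w, img_h):
    """Return a list of (box index, problem) pairs; an empty list means all boxes look sane."""
    problems = []
    root = ET.fromstring(xml_text)
    for i, box in enumerate(root.iter('bndbox')):
        x1, y1, x2, y2 = (int(box.find(tag).text)
                          for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
        if x1 >= x2 or y1 >= y2:
            problems.append((i, 'empty or inverted box'))
        if x1 < 0 or y1 < 0 or x2 > img_w or y2 > img_h:
            problems.append((i, 'outside image'))
    return problems


# Hypothetical annotation with one box inside and one box partly outside a 100x100 image.
sample = ('<annotation>'
          '<object><bndbox><xmin>10</xmin><ymin>20</ymin>'
          '<xmax>50</xmax><ymax>60</ymax></bndbox></object>'
          '<object><bndbox><xmin>90</xmin><ymin>20</ymin>'
          '<xmax>130</xmax><ymax>60</ymax></bndbox></object>'
          '</annotation>')
print(check_boxes(sample, 100, 100))
```

Files that come back with a non-empty list are the ones worth opening in the viewer.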

Original image:
Result image:
This is the result of a 60° rotation.

Slightly modified.

Thanks!
