Faster RCNN Source Code Study, Part 4

阿新 • Published: 2018-12-11

bbox_transform.py

# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick
# --------------------------------------------------------

import numpy as np

# Purpose: return the four regression targets (dx, dy, dw, dh) of each anchor
# relative to its GT box; the result has shape (len(anchors), 4).
def bbox_transform(ex_rois, gt_rois):
    # Width and height of each anchor
    ex_widths = ex_rois[:, 2] - ex_rois[:, 0] + 1.0
    ex_heights = ex_rois[:, 3] - ex_rois[:, 1] + 1.0
    # Center (x, y) of each anchor
    ex_ctr_x = ex_rois[:, 0] + 0.5 * ex_widths
    ex_ctr_y = ex_rois[:, 1] + 0.5 * ex_heights
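    # Note: gt_rois is not the full list of GT boxes originally passed in. It holds
    # the best-matching GT box for each anchor, so rows may be repeated.
    # Width and height of each GT box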
    gt_widths = gt_rois[:, 2] - gt_rois[:, 0] + 1.0
    gt_heights = gt_rois[:, 3] - gt_rois[:, 1] + 1.0
    # Center (x, y) of each GT box
    gt_ctr_x = gt_rois[:, 0] + 0.5 * gt_widths
    gt_ctr_y = gt_rois[:, 1] + 0.5 * gt_heights
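    # Bounding-box regression needs four quantities: dx and dy (translation of the
    # center along x and y) and dw and dh (scaling of the width and height).
    # The parameterization is the same as in Fast R-CNN; what differs is the
    # reference box P. Fast R-CNN regresses from the CNN features of each proposal,
    # whereas here the targets are computed for anchors laid out in original-image
    # coordinates, which is easier to picture.
    # Definitions: T_x = P_w * d_x(P) + P_x,  T_y = P_h * d_y(P) + P_y,
    #              T_w = P_w * exp(d_w(P)),   T_h = P_h * exp(d_h(P))
    # P is the anchor and T is the target; the goal is T ~ G, where G is the ground truth.
    # Setting T = G and solving gives the regression targets d_x(P), d_y(P), d_w(P),
    # d_h(P), i.e. dx, dy, dw, dh below.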
    targets_dx = (gt_ctr_x - ex_ctr_x) / ex_widths
    targets_dy = (gt_ctr_y - ex_ctr_y) / ex_heights
    targets_dw = np.log(gt_widths / ex_widths)
    targets_dh = np.log(gt_heights / ex_heights)
    # targets_dx, targets_dy, targets_dw, targets_dh each have shape (anchors.shape[0],),
    # so targets has shape (anchors.shape[0], 4).
    targets = np.vstack(
        (targets_dx, targets_dy, targets_dw, targets_dh)).transpose()
    return targets

# boxes holds the anchors; deltas is the output of the 'rpn_bbox_pred' layer.
# Purpose: return the refined anchors as (x1, y1, x2, y2).
def bbox_transform_inv(boxes, deltas):
    # boxes.shape[0] = K * A = Height * Width * A
    # (K feature-map positions, A anchors per position)
    if boxes.shape[0] == 0:
        return np.zeros((0, deltas.shape[1]), dtype=deltas.dtype)

    boxes = boxes.astype(deltas.dtype, copy=False)
    # Width, height, and center (x, y) of each of the Height * Width * A anchors
    widths = boxes[:, 2] - boxes[:, 0] + 1.0
    heights = boxes[:, 3] - boxes[:, 1] + 1.0
    ctr_x = boxes[:, 0] + 0.5 * widths
    ctr_y = boxes[:, 1] + 0.5 * heights
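    # Here deltas has exactly 4 columns, (dx, dy, dw, dh), with one row per anchor.
    # 0::4 starts at column 0 and then takes every 4th column (0, 4, 8, 12, 16, ...).
    # With only 4 columns it selects a single column, kept as an (N, 1) array.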
    dx = deltas[:, 0::4]
    dy = deltas[:, 1::4]
    dw = deltas[:, 2::4]
    dh = deltas[:, 3::4]
    # Predicted center, width, and height
    pred_ctr_x = dx * widths[:, np.newaxis] + ctr_x[:, np.newaxis]
    pred_ctr_y = dy * heights[:, np.newaxis] + ctr_y[:, np.newaxis]
    pred_w = np.exp(dw) * widths[:, np.newaxis]
    pred_h = np.exp(dh) * heights[:, np.newaxis]
    # Store the predicted (x1, y1, x2, y2) in pred_boxes
    pred_boxes = np.zeros(deltas.shape, dtype=deltas.dtype)
    # x1
    pred_boxes[:, 0::4] = pred_ctr_x - 0.5 * pred_w
    # y1
    pred_boxes[:, 1::4] = pred_ctr_y - 0.5 * pred_h
    # x2
    pred_boxes[:, 2::4] = pred_ctr_x + 0.5 * pred_w
    # y2
    pred_boxes[:, 3::4] = pred_ctr_y + 0.5 * pred_h

    return pred_boxes

# Purpose: clip boxes so that they lie inside the image.
def clip_boxes(boxes, im_shape):
    """
    Clip boxes to image boundaries.
    """
    # im_shape[0] is the image height; im_shape[1] is the image width.
    # Each coordinate is clamped to [0, width - 1] or [0, height - 1].
    # x1 >= 0
    boxes[:, 0::4] = np.maximum(np.minimum(boxes[:, 0::4], im_shape[1] - 1), 0)
    # y1 >= 0
    boxes[:, 1::4] = np.maximum(np.minimum(boxes[:, 1::4], im_shape[0] - 1), 0)
    # x2 < im_shape[1]
    boxes[:, 2::4] = np.maximum(np.minimum(boxes[:, 2::4], im_shape[1] - 1), 0)
    # y2 < im_shape[0]
    boxes[:, 3::4] = np.maximum(np.minimum(boxes[:, 3::4], im_shape[0] - 1), 0)
    return boxes
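
The encoder and decoder above are near inverses of each other, and a small round trip makes the formulas concrete. The sketch below restates both functions compactly for the 4-column case only, and the anchor and GT coordinates are made-up values chosen so the arithmetic is easy to follow by hand. It is an illustration under those assumptions, not a replacement for the listing.

```python
import numpy as np

def bbox_transform(ex_rois, gt_rois):
    """Compact restatement of the encoder: boxes -> (dx, dy, dw, dh)."""
    ex_w = ex_rois[:, 2] - ex_rois[:, 0] + 1.0
    ex_h = ex_rois[:, 3] - ex_rois[:, 1] + 1.0
    ex_cx = ex_rois[:, 0] + 0.5 * ex_w
    ex_cy = ex_rois[:, 1] + 0.5 * ex_h
    gt_w = gt_rois[:, 2] - gt_rois[:, 0] + 1.0
    gt_h = gt_rois[:, 3] - gt_rois[:, 1] + 1.0
    gt_cx = gt_rois[:, 0] + 0.5 * gt_w
    gt_cy = gt_rois[:, 1] + 0.5 * gt_h
    return np.stack([(gt_cx - ex_cx) / ex_w, (gt_cy - ex_cy) / ex_h,
                     np.log(gt_w / ex_w), np.log(gt_h / ex_h)], axis=1)

def bbox_transform_inv(boxes, deltas):
    """Compact restatement of the decoder, restricted to 4-column deltas."""
    w = boxes[:, 2] - boxes[:, 0] + 1.0
    h = boxes[:, 3] - boxes[:, 1] + 1.0
    cx = boxes[:, 0] + 0.5 * w
    cy = boxes[:, 1] + 0.5 * h
    pred_cx = deltas[:, 0] * w + cx
    pred_cy = deltas[:, 1] * h + cy
    pred_w = np.exp(deltas[:, 2]) * w
    pred_h = np.exp(deltas[:, 3]) * h
    return np.stack([pred_cx - 0.5 * pred_w, pred_cy - 0.5 * pred_h,
                     pred_cx + 0.5 * pred_w, pred_cy + 0.5 * pred_h], axis=1)

# Hypothetical values: a 10x10 anchor and a 20x10 GT box.
anchor = np.array([[0.0, 0.0, 9.0, 9.0]])
gt = np.array([[5.0, 5.0, 24.0, 14.0]])

targets = bbox_transform(anchor, gt)           # (dx, dy, dw, dh) = (1, 0.5, ln 2, 0)
decoded = bbox_transform_inv(anchor, targets)  # (x1, y1, x2, y2) = (5, 5, 25, 15)
print(targets)
print(decoded - gt)                            # x2 and y2 come back one pixel larger
```

The decoded x1 and y1 match the GT exactly, while x2 and y2 are larger by 1. This follows from the conventions in the listing: widths are computed with `+ 1.0`, but the decoder sets x2 to `pred_ctr_x + 0.5 * pred_w` without subtracting 1.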

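The `0::4` strided indexing used in `bbox_transform_inv` and `clip_boxes` also works when each row holds several boxes side by side, with 4 columns per box. The sketch below illustrates this with `clip_boxes` on a made-up row of two boxes and a made-up image size of 100 (height) by 200 (width); both are assumptions for illustration only.

```python
import numpy as np

def clip_boxes(boxes, im_shape):
    """Clip boxes of shape (N, 4*K) to the image; im_shape is (height, width)."""
    boxes[:, 0::4] = np.maximum(np.minimum(boxes[:, 0::4], im_shape[1] - 1), 0)
    boxes[:, 1::4] = np.maximum(np.minimum(boxes[:, 1::4], im_shape[0] - 1), 0)
    boxes[:, 2::4] = np.maximum(np.minimum(boxes[:, 2::4], im_shape[1] - 1), 0)
    boxes[:, 3::4] = np.maximum(np.minimum(boxes[:, 3::4], im_shape[0] - 1), 0)
    return boxes

# Hypothetical input: one row holding K = 2 boxes, laid out as
# (x1, y1, x2, y2, x1, y1, x2, y2).
boxes = np.array([[-10.0, 20.0, 250.0, 120.0, 5.0, -3.0, 50.0, 60.0]])

x1_columns = boxes[:, 0::4]                     # columns 0 and 4, shape (1, 2)
clipped = clip_boxes(boxes.copy(), (100, 200))  # copy: clip_boxes writes in place
print(x1_columns)
print(clipped)
```

Note that `clip_boxes` modifies its argument in place and also returns it, which is why the example passes a copy.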