PyTorch Mask R-CNN implementation details

DataLoader

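When the built-in datasets do not meet your needs, write your own by subclassing torch.utils.data.Dataset. The subclass must override __init__, __getitem__, and __len__; if these are missing, DataLoader fails with NotImplementedError when it loads the custom Dataset.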
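As a minimal sketch of the three overrides, the hypothetical `SquaresDataset` below stores a list of numbers and returns `(x, x * x)` pairs. The class name and the data are illustrative assumptions, and the fallback to `object` is only there so the snippet also runs where PyTorch is not installed.

```python
try:
    from torch.utils.data import Dataset
except ImportError:  # fallback so the sketch runs without PyTorch
    Dataset = object


class SquaresDataset(Dataset):
    """Hypothetical dataset that yields (x, x squared) pairs."""

    def __init__(self, values):
        # Keep a private copy of the raw samples.
        self.values = list(values)

    def __getitem__(self, index):
        # Return the sample stored at the given index.
        x = self.values[index]
        return x, x * x

    def __len__(self):
        # Number of samples; samplers use it to enumerate valid indices.
        return len(self.values)


dataset = SquaresDataset([1, 2, 3])
print(len(dataset), dataset[2])
```

With PyTorch installed, an instance like this can be passed to `torch.utils.data.DataLoader(dataset, batch_size=2)` to obtain batches.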

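NumPy broadcasting rules:

All input arrays are aligned with the array that has the longest shape; any array with a shorter shape is padded with 1s at the front of its shape.

The shape of the output array is the maximum, along each axis, of the input shapes.

An input array can take part in the computation only if, on every axis, its length either equals the length of the output array's corresponding axis or is 1; otherwise an error is raised.

When an input array has length 1 along an axis, the first (and only) set of values on that axis is reused for every step along that axis.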
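The rules above can be sketched in plain Python, without NumPy itself. The helper name `broadcast_shape` is an assumption made for this illustration, and the sketch assumes every axis length is at least 1.

```python
from itertools import zip_longest


def broadcast_shape(*shapes):
    """Return the broadcast result shape, or raise ValueError."""
    result = []
    # Walk the axes from the last one backwards; a missing leading
    # axis is treated as length 1 (the "pad with 1 at the front" rule).
    for dims in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        target = max(dims)  # output length on this axis
        # Every input must either match the output length or be 1.
        if any(d != 1 and d != target for d in dims):
            raise ValueError(f"shapes {shapes} cannot be broadcast together")
        result.append(target)
    return tuple(reversed(result))


print(broadcast_shape((3, 1), (4,)))       # (3, 4)
print(broadcast_shape((2, 1, 5), (3, 1)))  # (2, 3, 5)
```

Shapes such as `(3,)` and `(4,)` fail the third rule, so the helper raises `ValueError` for them.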

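CUDA extensions in PyTorch:

Extensions are built with create_extension from torch.utils.ffi (deprecated as of PyTorch 1.0 in favour of torch.utils.cpp_extension):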

 def create_extension(name,headers,sources,verbose=True,with_cuda=False,package=False,relative_to='.',**kwargs):
 """Creates and configures a cffi.FFI object,that builds PyTorch extension.

 Arguments:
  name (str): package name. Can be a nested module e.g. ``.ext.my_lib``.
  headers (str or List[str]): list of headers,that contain only exported
   functions
  sources (List[str]): list of sources to compile.
  verbose (bool,optional): if set to ``False``,no output will be printed
   (default: True).
  with_cuda (bool,optional): set to ``True`` to compile with CUDA headers
   (default: False)
  package (bool,optional): set to ``True`` to build in package mode (for modules
   meant to be installed as pip packages) (default: False).
  relative_to (str,optional): path of the build file. Required when
   ``package is True``. It's best to use ``__file__`` for this argument.
  kwargs: additional arguments that are passed to ffi to declare the
   extension. See `Extension API reference`_ for details.

 .. _`Extension API reference`: https://docs.python.org/3/distutils/apiref.html#distutils.core.Extension
 """
 base_path = os.path.abspath(os.path.dirname(relative_to))
 name_suffix,target_dir = _create_module_dir(base_path,name)
 if not package:
  cffi_wrapper_name = '_' + name_suffix
 else:
  cffi_wrapper_name = (name.rpartition('.')[0] +
        '.{0}._{0}'.format(name_suffix))

 wrapper_source,include_dirs = _setup_wrapper(with_cuda)
 include_dirs.extend(kwargs.pop('include_dirs',[]))

 if os.sys.platform == 'win32':
  library_dirs = glob.glob(os.getenv('CUDA_PATH','') + '/lib/x64')
  library_dirs += glob.glob(os.getenv('NVTOOLSEXT_PATH','') + '/lib/x64')

  here = os.path.abspath(os.path.dirname(__file__))
  lib_dir = os.path.join(here,'..','lib')

  library_dirs.append(os.path.join(lib_dir))
 else:
  library_dirs = []
 library_dirs.extend(kwargs.pop('library_dirs',[]))

 if isinstance(headers,str):
  headers = [headers]
 all_headers_source = ''
 for header in headers:
  with open(os.path.join(base_path,header),'r') as f:
   all_headers_source += f.read() + '\n\n'

 ffi = cffi.FFI()
 sources = [os.path.join(base_path,src) for src in sources]
 # NB: TH headers are C99 now
 kwargs['extra_compile_args'] = ['-std=c99'] + kwargs.get('extra_compile_args',[])
 ffi.set_source(cffi_wrapper_name,wrapper_source + all_headers_source,sources=sources,include_dirs=include_dirs,library_dirs=library_dirs,**kwargs)
 ffi.cdef(_typedefs + all_headers_source)

 _make_python_wrapper(name_suffix,'_' + name_suffix,target_dir)

 def build():
  _build_extension(ffi,cffi_wrapper_name,target_dir,verbose)
 ffi.build = build
 return ffi

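Supplementary notes: a walkthrough of resnet.py in maskrcnn-benchmark

1 ResNet structure

A ResNet is usually divided into five convolutional (conv) layers, each of which is one stage. Every stage is made of a number of identical blocks, and that number is block_count. The first stage is built completely differently from the others and can be seen as a single standalone block. The second to fifth layers (stage2-stage5, or conv2-conv5), which are built by stacking blocks, are therefore numbered index 1 to index 4. Like building with toy bricks, these four layers are assembled from one basic block. The figure below shows the basic structure of ResNet:

[Figure: basic structure of ResNet]

The following code builds different ResNets (including ResNet-50) by controlling the number of blocks in each stage: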

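# -----------------------------------------------------------------------------
# Standard ResNet models
# -----------------------------------------------------------------------------
# ResNet-50 (including all stages)
# ResNet has five stages, but the first one is always the same; the variation
# starts at the second stage, so index below counts from the second stage.
# block_count is the number of blocks in that stage.
ResNet50StagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 6, False), (4, 3, True))
)
# ResNet-50 up to stage 4 (excludes stage 5)
ResNet50StagesTo4 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 6, True))
)
# ResNet-101 (including all stages)
ResNet101StagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 23, False), (4, 3, True))
)
# ResNet-101 up to stage 4 (excludes stage 5)
ResNet101StagesTo4 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 23, True))
)
# ResNet-50-FPN (including all stages)
ResNet50FPNStagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, True), (2, 4, True), (3, 6, True), (4, 3, True))
)
# ResNet-101-FPN (including all stages)
ResNet101FPNStagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, True), (2, 4, True), (3, 23, True), (4, 3, True))
)
# ResNet-152-FPN (including all stages)
ResNet152FPNStagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, True), (2, 8, True), (3, 36, True), (4, 3, True))
)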
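To see how block counts relate to the familiar model names, here is a small sketch. It assumes `StageSpec` is a namedtuple with the three fields used above and uses the standard block counts for ResNet-50, ResNet-101, and ResNet-152. The helper `resnet_depth` is hypothetical: it counts three conv layers per Bottleneck plus the stem conv and the final fully connected layer of the classification network.

```python
from collections import namedtuple

# Assumed definition: one entry per stage that follows the stem.
StageSpec = namedtuple("StageSpec", ["index", "block_count", "return_features"])


def make_specs(block_counts):
    """Build specs that return features only from the last stage."""
    last = len(block_counts)
    return tuple(
        StageSpec(index=i, block_count=c, return_features=(i == last))
        for i, c in enumerate(block_counts, start=1)
    )


def resnet_depth(specs):
    """3 convs per Bottleneck, plus the stem conv and the final FC layer."""
    return 3 * sum(spec.block_count for spec in specs) + 2


for counts in ((3, 4, 6, 3), (3, 4, 23, 3), (3, 8, 36, 3)):
    print(counts, "->", resnet_depth(make_specs(counts)))
```

The three tuples give depths of 50, 101, and 152, which is where the model names come from.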

With these different combinations, maskrcnn-benchmark can build different backbones:

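def _make_stage(
    transformation_module,
    in_channels,
    bottleneck_channels,
    out_channels,
    block_count,
    num_groups,
    stride_in_1x1,
    first_stride,
    dilation=1,
    dcn_config={}
):
    blocks = []
    stride = first_stride
    # Build the conv blocks of this stage according to the given configuration
    for _ in range(block_count):
        blocks.append(
            transformation_module(
                in_channels,
                bottleneck_channels,
                out_channels,
                num_groups,
                stride_in_1x1,
                stride,
                dilation=dilation,
                dcn_config=dcn_config
            )
        )
        # Only the first block of a stage downsamples and changes the channel count
        stride = 1
        in_channels = out_channels
    return nn.Sequential(*blocks)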
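The bookkeeping inside the loop is easy to miss: only the first block of a stage receives `first_stride` and the stage's incoming channel count, and every later block runs at stride 1 with `out_channels` in and out. The sketch below traces this without PyTorch; `stage_plan` is a hypothetical helper, not part of maskrcnn-benchmark.

```python
def stage_plan(in_channels, out_channels, block_count, first_stride):
    """List (in_channels, out_channels, stride) for each block of one stage."""
    plan = []
    stride = first_stride
    for _ in range(block_count):
        plan.append((in_channels, out_channels, stride))
        # Same updates as in _make_stage: later blocks keep the resolution
        # and take the previous block's output as their input.
        stride = 1
        in_channels = out_channels
    return plan


for block in stage_plan(in_channels=256, out_channels=512, block_count=4, first_stride=2):
    print(block)
```

For this example the first block maps 256 to 512 channels at stride 2, and the remaining three map 512 to 512 at stride 1.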

These backbones are then collected into a single registry object so that they can be looked up by name:

_STAGE_SPECS = Registry({
    "R-50-C4": ResNet50StagesTo4,
    "R-50-C5": ResNet50StagesTo5,
    "R-101-C4": ResNet101StagesTo4,
    "R-101-C5": ResNet101StagesTo5,
    "R-50-FPN": ResNet50FPNStagesTo5,
    "R-50-FPN-RETINANET": ResNet50FPNStagesTo5,
    "R-101-FPN": ResNet101FPNStagesTo5,
    "R-101-FPN-RETINANET": ResNet101FPNStagesTo5,
    "R-152-FPN": ResNet152FPNStagesTo5,
})
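The idea behind such a registry is a dictionary keyed by the string that appears in the config. The sketch below is a simplified, hypothetical stand-in (`SimpleRegistry` is not the library's `Registry` class), and plain block-count tuples stand in for the full stage specs.

```python
class SimpleRegistry(dict):
    """Hypothetical stand-in for a registry: a dict with a guarded register()."""

    def register(self, name, obj):
        # Refuse silent overwrites so two backbones cannot share one key.
        if name in self:
            raise KeyError(f"{name!r} is already registered")
        self[name] = obj
        return obj


# Block counts per stage stand in for the full StageSpec tuples.
STAGE_BLOCKS = SimpleRegistry({
    "R-50-C4": (3, 4, 6),
    "R-50-C5": (3, 4, 6, 3),
})
STAGE_BLOCKS.register("R-101-C4", (3, 4, 23))

conv_body = "R-101-C4"  # would normally come from cfg.MODEL.BACKBONE.CONV_BODY
print(conv_body, STAGE_BLOCKS[conv_body])
```

Looking implementations up by a config string keeps the model-building code free of long if/else chains over backbone names.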

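2 Block structure

2.1 Bottleneck structure

As mentioned above, the first conv layer of ResNet can be seen as one kind of block, while the second to fifth layers are built by stacking blocks called Bottleneck. The first layer can be seen as a stem block. The Bottleneck structure is as follows:

[Figure: structure of the Bottleneck block]

In maskrcnn-benchmark, the code that builds this structure is: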

class Bottleneck(nn.Module):
    def __init__(
        self,
        in_channels,
        bottleneck_channels,
        out_channels,
        num_groups,
        stride_in_1x1,
        stride,
        dilation,
        norm_func,
        dcn_config
    ):
        super(Bottleneck, self).__init__()
        # Shortcut branch beside the main path of the block
        self.downsample = None
        if in_channels != out_channels:
            # Stride of the shortcut conv. A 1x1 conv is applied to the input so
            # that its channel count matches the output of the main path.
            down_stride = stride if dilation == 1 else 1
            self.downsample = nn.Sequential(
                Conv2d(
                    in_channels, out_channels,
                    kernel_size=1, stride=down_stride, bias=False
                ),
                norm_func(out_channels),
            )
            for modules in [self.downsample,]:
                for l in modules.modules():
                    if isinstance(l, Conv2d):
                        nn.init.kaiming_uniform_(l.weight, a=1)

        if dilation > 1:
            stride = 1  # reset to be 1

        # The original MSRA ResNet models have stride in the first 1x1 conv
        # The subsequent fb.torch.resnet and Caffe2 ResNe[X]t implementations have
        # stride in the 3x3 conv
        # Strides of the 1x1 and 3x3 convs
        stride_1x1, stride_3x3 = (stride, 1) if stride_in_1x1 else (1, stride)
        # Main path of the block; this part has a fixed structure
        # First, a 1x1 conv
        self.conv1 = Conv2d(
            in_channels,
            bottleneck_channels,
            kernel_size=1,
            stride=stride_1x1,
            bias=False,
        )
        self.bn1 = norm_func(bottleneck_channels)
        # TODO: specify init for the above
        with_dcn = dcn_config.get("stage_with_dcn", False)
        if with_dcn:
            # Use a deformable convolution (DCN)
            deformable_groups = dcn_config.get("deformable_groups", 1)
            with_modulated_dcn = dcn_config.get("with_modulated_dcn", False)
            self.conv2 = DFConv2d(
                bottleneck_channels,
                bottleneck_channels,
                with_modulated_dcn=with_modulated_dcn,
                kernel_size=3,
                stride=stride_3x3,
                groups=num_groups,
                dilation=dilation,
                deformable_groups=deformable_groups,
                bias=False
            )
        else:
            # Then a 3x3 conv
            self.conv2 = Conv2d(
                bottleneck_channels,
                bottleneck_channels,
                kernel_size=3,
                stride=stride_3x3,
                padding=dilation,
                bias=False,
                groups=num_groups,
                dilation=dilation
            )
            nn.init.kaiming_uniform_(self.conv2.weight, a=1)

        self.bn2 = norm_func(bottleneck_channels)

        # Finally, a 1x1 conv that expands to out_channels
        self.conv3 = Conv2d(
            bottleneck_channels, out_channels, kernel_size=1, bias=False
        )
        self.bn3 = norm_func(out_channels)

        for l in [self.conv1, self.conv3,]:
            nn.init.kaiming_uniform_(l.weight, a=1)

    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = F.relu_(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = F.relu_(out)

        out0 = self.conv3(out)
        out = self.bn3(out0)

        if self.downsample is not None:
            identity = self.downsample(x)

        # Residual connection: add the (possibly projected) input to the main path
        out += identity
        out = F.relu_(out)

        return out
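The stride handling in `__init__` is compact, so a standalone trace may help. The hypothetical helper below repeats the same three decisions in plain Python: the shortcut stride is chosen before the reset, dilation greater than 1 forces the main-path stride to 1, and `stride_in_1x1` decides which conv carries the stride.

```python
def bottleneck_strides(stride, dilation, stride_in_1x1):
    """Return (down_stride, stride_1x1, stride_3x3) as chosen in Bottleneck."""
    # Stride of the 1x1 shortcut conv (only used when channel counts differ).
    down_stride = stride if dilation == 1 else 1
    # A dilated block keeps the spatial resolution.
    if dilation > 1:
        stride = 1
    # MSRA-style models stride in the first 1x1 conv, others in the 3x3 conv.
    stride_1x1, stride_3x3 = (stride, 1) if stride_in_1x1 else (1, stride)
    return down_stride, stride_1x1, stride_3x3


print(bottleneck_strides(2, 1, True))   # (2, 2, 1)
print(bottleneck_strides(2, 1, False))  # (2, 1, 2)
print(bottleneck_strides(2, 2, True))   # (1, 1, 1)
```

In every case the shortcut and the main path downsample by the same factor, which is what allows `out += identity` to add tensors of equal shape.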

2.2 Stem structure

As mentioned above, the first layer of ResNet can be seen as a stem. Its code is:

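class BaseStem(nn.Module):
    def __init__(self, cfg, norm_func):
        super(BaseStem, self).__init__()
        # Number of output channels of the stem, set by the user in the config
        out_channels = cfg.MODEL.RESNETS.STEM_OUT_CHANNELS
        # The input has 3 channels (the RGB image) and the output has out_channels;
        # this part is fixed and defined by the ResNet paper
        self.conv1 = Conv2d(
            3, out_channels, kernel_size=7, stride=2, padding=3, bias=False
        )
        self.bn1 = norm_func(out_channels)

        for l in [self.conv1,]:
            nn.init.kaiming_uniform_(l.weight, a=1)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = F.relu_(x)
        x = F.max_pool2d(x, kernel_size=3, stride=2, padding=1)
        return x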
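As a rough check of how much the stem shrinks its input, the sketch below applies the usual output-size formula for convolution and pooling windows. It assumes the 7x7, stride-2, padding-3 conv shown above and the 3x3, stride-2, padding-1 max pool of the standard ResNet stem. Both function names are hypothetical.

```python
def conv_out_size(size, kernel_size, stride, padding, dilation=1):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def stem_out_size(size):
    """Side length after the stem's conv and max pool."""
    size = conv_out_size(size, kernel_size=7, stride=2, padding=3)  # conv1
    size = conv_out_size(size, kernel_size=3, stride=2, padding=1)  # max pool
    return size


print(stem_out_size(224))  # 56
print(stem_out_size(800))  # 200
```

Under these assumptions the stem reduces each spatial side by a factor of 4 before the first Bottleneck stage runs.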

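2.3 Derived classes and wrappers for the two structures

maskrcnn-benchmark subclasses and wraps the two block structures above. Bottleneck and Stem each get one variant with (frozen) Batch Normalization and one with Group Normalization: BottleneckWithFixedBatchNorm, StemWithFixedBatchNorm, BottleneckWithGN, and StemWithGN. The code is simple enough that it is left without comments: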

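class BottleneckWithFixedBatchNorm(Bottleneck):
    def __init__(
        self,
        in_channels,
        bottleneck_channels,
        out_channels,
        num_groups=1,
        stride_in_1x1=True,
        stride=1,
        dilation=1,
        dcn_config={}
    ):
        super(BottleneckWithFixedBatchNorm, self).__init__(
            in_channels=in_channels,
            bottleneck_channels=bottleneck_channels,
            out_channels=out_channels,
            num_groups=num_groups,
            stride_in_1x1=stride_in_1x1,
            stride=stride,
            dilation=dilation,
            norm_func=FrozenBatchNorm2d,
            dcn_config=dcn_config
        )


class StemWithFixedBatchNorm(BaseStem):
    def __init__(self, cfg):
        super(StemWithFixedBatchNorm, self).__init__(
            cfg, norm_func=FrozenBatchNorm2d
        )


class BottleneckWithGN(Bottleneck):
    def __init__(
        self,
        in_channels,
        bottleneck_channels,
        out_channels,
        num_groups=1,
        stride_in_1x1=True,
        stride=1,
        dilation=1,
        dcn_config={}
    ):
        super(BottleneckWithGN, self).__init__(
            in_channels=in_channels,
            bottleneck_channels=bottleneck_channels,
            out_channels=out_channels,
            num_groups=num_groups,
            stride_in_1x1=stride_in_1x1,
            stride=stride,
            dilation=dilation,
            norm_func=group_norm,
            dcn_config=dcn_config
        )


class StemWithGN(BaseStem):
    def __init__(self, cfg):
        super(StemWithGN, self).__init__(cfg, norm_func=group_norm)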
 
Next, the four BN and GN variants of these two structures are registered so that they can be looked up by name:

_TRANSFORMATION_MODULES = Registry({
    "BottleneckWithFixedBatchNorm": BottleneckWithFixedBatchNorm,
    "BottleneckWithGN": BottleneckWithGN,
})

_STEM_MODULES = Registry({
    "StemWithFixedBatchNorm": StemWithFixedBatchNorm,
    "StemWithGN": StemWithGN,
})

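3 Overall ResNet structure

3.1 ResNet structure

On this basis we can assemble the actual ResNet, which consists of the first conv layer (the stem) and the four stages that follow. The code is: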

class ResNet(nn.Module):
    def __init__(self, cfg):
        super(ResNet, self).__init__()

        # If we want to use the cfg in forward(), then we should make a copy
        # of it and store it for later use:
        # self.cfg = cfg.clone()

        # Translate string names to implementations
        # The first conv layer, i.e. the first stage, implemented as the stem
        stem_module = _STEM_MODULES[cfg.MODEL.RESNETS.STEM_FUNC]
        # Stage specs of the selected backbone
        stage_specs = _STAGE_SPECS[cfg.MODEL.BACKBONE.CONV_BODY]
        # The concrete Bottleneck class, i.e. the type of basic block in the backbone
        transformation_module = _TRANSFORMATION_MODULES[cfg.MODEL.RESNETS.TRANS_FUNC]

        # Construct the stem module
        self.stem = stem_module(cfg)

        # Construct the specified ResNet stages
        # Number of groups of the grouped 3x3 conv (passed on as groups=num_groups)
        num_groups = cfg.MODEL.RESNETS.NUM_GROUPS
        # Number of channels in each group
        width_per_group = cfg.MODEL.RESNETS.WIDTH_PER_GROUP
        # The stem is the first layer; its output channel count is the input
        # channel count of the stacked stages that follow and is configurable
        in_channels = cfg.MODEL.RESNETS.STEM_OUT_CHANNELS
        # Internal (bottleneck) channels of the basic block:
        # number of groups times channels per group
        stage2_bottleneck_channels = num_groups * width_per_group
        # Output channels of the second stage
        stage2_out_channels = cfg.MODEL.RESNETS.RES2_OUT_CHANNELS
        self.stages = []
        self.return_features = {}
        for stage_spec in stage_specs:
            name = "layer" + str(stage_spec.index)
            # Channel counts of every later stage follow from stage 2:
            # they double with each stage
            stage2_relative_factor = 2 ** (stage_spec.index - 1)
            bottleneck_channels = stage2_bottleneck_channels * stage2_relative_factor
            out_channels = stage2_out_channels * stage2_relative_factor
            stage_with_dcn = cfg.MODEL.RESNETS.STAGE_WITH_DCN[stage_spec.index - 1]
            # Build the conv structure of this stage
            module = _make_stage(
                transformation_module,
                in_channels,
                bottleneck_channels,
                out_channels,
                stage_spec.block_count,
                num_groups,
                cfg.MODEL.RESNETS.STRIDE_IN_1X1,
                first_stride=int(stage_spec.index > 1) + 1,
                dcn_config={
                    "stage_with_dcn": stage_with_dcn,
                    "with_modulated_dcn": cfg.MODEL.RESNETS.WITH_MODULATED_DCN,
                    "deformable_groups": cfg.MODEL.RESNETS.DEFORMABLE_GROUPS,
                }
            )
            in_channels = out_channels
            self.add_module(name, module)
            self.stages.append(name)
            self.return_features[name] = stage_spec.return_features

        # Optionally freeze (requires_grad=False) parts of the backbone
        self._freeze_backbone(cfg.MODEL.BACKBONE.FREEZE_CONV_BODY_AT)

    # Freeze the first freeze_at stages so their parameters are no longer updated
    def _freeze_backbone(self, freeze_at):
        if freeze_at < 0:
            return
        for stage_index in range(freeze_at):
            if stage_index == 0:
                m = self.stem  # stage 0 is the stem
            else:
                m = getattr(self, "layer" + str(stage_index))
            for p in m.parameters():
                p.requires_grad = False

    def forward(self, x):
        outputs = []
        x = self.stem(x)
        for stage_name in self.stages:
            x = getattr(self, stage_name)(x)
            if self.return_features[stage_name]:
                outputs.append(x)
        return outputs
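The per-stage arithmetic in the constructor can be traced without building any layers. In the sketch below, `stage_channels` is a hypothetical helper, and the default arguments (1 group, 64 channels per group, 256 output channels for stage 2) are assumptions about the config values, matching a plain ResNet rather than anything read from `cfg`.

```python
def stage_channels(index, num_groups=1, width_per_group=64, res2_out_channels=256):
    """Return (bottleneck_channels, out_channels, first_stride) for one stage."""
    # Channel counts double with every stage after stage 2 (index 1).
    factor = 2 ** (index - 1)
    bottleneck_channels = num_groups * width_per_group * factor
    out_channels = res2_out_channels * factor
    # The first stage after the stem keeps the resolution; later ones halve it.
    first_stride = int(index > 1) + 1
    return bottleneck_channels, out_channels, first_stride


for index in range(1, 5):
    print("layer%d" % index, stage_channels(index))
```

With these assumed defaults the output channels run 256, 512, 1024, 2048, and every stage except `layer1` starts with a stride-2 block.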

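3.2 ResNet head structure

A head, as I understand it, is a network structure that carries out a particular function. A ResNet head is a functional sub-network built by stacking the same Bottleneck blocks that make up ResNet, so its internal structure is similar to the stages above. It is not covered in detail here because it is a sub-structure of the ResNet described above.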

class ResNetHead(nn.Module):
    def __init__(
        self,
        block_module,
        stages,
        num_groups=1,
        width_per_group=64,
        stride_in_1x1=True,
        stride_init=None,
        res2_out_channels=256,
        dilation=1,
        dcn_config={}
    ):
        super(ResNetHead, self).__init__()

        # Channel counts are derived from the index of the first stage in the head
        stage2_relative_factor = 2 ** (stages[0].index - 1)
        stage2_bottleneck_channels = num_groups * width_per_group
        out_channels = res2_out_channels * stage2_relative_factor
        in_channels = out_channels // 2
        bottleneck_channels = stage2_bottleneck_channels * stage2_relative_factor

        block_module = _TRANSFORMATION_MODULES[block_module]

        self.stages = []
        stride = stride_init
        for stage in stages:
            name = "layer" + str(stage.index)
            if not stride:
                stride = int(stage.index > 1) + 1
            module = _make_stage(
                block_module,
                in_channels,
                bottleneck_channels,
                out_channels,
                stage.block_count,
                num_groups,
                stride_in_1x1,
                first_stride=stride,
                dilation=dilation,
                dcn_config=dcn_config
            )
            stride = None
            self.add_module(name, module)
            self.stages.append(name)
        self.out_channels = out_channels

    def forward(self, x):
        for stage in self.stages:
            x = getattr(self, stage)(x)
        return x
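A short sketch of the channel arithmetic at the top of the constructor, using the default arguments shown in the signature (`width_per_group=64`, `res2_out_channels=256`) and assuming `num_groups=1`. The helper `head_channels` is hypothetical.

```python
def head_channels(first_stage_index, num_groups=1, width_per_group=64,
                  res2_out_channels=256):
    """Return (in_channels, bottleneck_channels, out_channels) for a head."""
    factor = 2 ** (first_stage_index - 1)
    out_channels = res2_out_channels * factor
    # The head expects the output of the previous stage, which has half
    # as many channels as the stage the head implements.
    in_channels = out_channels // 2
    bottleneck_channels = num_groups * width_per_group * factor
    return in_channels, bottleneck_channels, out_channels


print(head_channels(4))  # (1024, 512, 2048)
```

For a head that starts at stage index 4, the input is therefore expected to carry 1024 channels, which is the output width of stage index 3 under the same assumptions.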
