PP-YOLO: An Effective and Efficient Implementation of Object Detector

PP-YOLO-V1

https://arxiv.org/abs/2007.12099

---------------------------------------------------------

2021-09-07

backbone: ResNet50-vd + DCN (deformable convolution)

  In the original ResNet, the downsampling block applies a 1x1 convolution with stride=2; a 1x1 kernel combined with stride=2 simply skips three quarters of the input positions and loses information. The "vd" variant moves the stride-2 downsampling onto the 3x3 convolution instead, as sketched below.
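A minimal sketch of the idea (my own illustration, not PaddleDetection's actual backbone code); it also shows the shortcut-path average pooling that the "d" variant adds:

import torch
import torch.nn as nn

# Hypothetical ResNet-vd-style bottleneck: stride=2 sits on the 3x3 conv,
# and the shortcut average-pools before its 1x1 conv, so no pixels are skipped.
class BottleneckVD(nn.Module):
    def __init__(self, inchannel, midchannel, outchannel, stride=2):
        super(BottleneckVD, self).__init__()
        self.body = nn.Sequential(
            nn.Conv2d(inchannel, midchannel, 1, 1, 0, bias=False),        # 1x1 keeps stride=1
            nn.BatchNorm2d(midchannel), nn.ReLU(inplace=True),
            nn.Conv2d(midchannel, midchannel, 3, stride, 1, bias=False),  # 3x3 does the downsampling
            nn.BatchNorm2d(midchannel), nn.ReLU(inplace=True),
            nn.Conv2d(midchannel, outchannel, 1, 1, 0, bias=False),
            nn.BatchNorm2d(outchannel),
        )
        self.shortcut = nn.Sequential(
            nn.AvgPool2d(stride, stride),
            nn.Conv2d(inchannel, outchannel, 1, 1, 0, bias=False),
            nn.BatchNorm2d(outchannel),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + self.shortcut(x))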

           

Training strategies:

  EMA (exponential moving average of weights): keep a shadow copy of the parameters, updated each step as shadow = decay * shadow + (1 - decay) * param, and swap it in at evaluation time (implementation below)

  DropBlock: structured dropout that zeroes contiguous spatial blocks of the feature map instead of independent activations (implementation below)

  Bounding box regression (Tx, Ty, Tw, Th are the raw network outputs; a decoding sketch for the YOLO form follows this list):

    YOLO (Cx, Cy are the grid-cell offsets, Aw, Ah the anchor sizes):

      Bx = sigmoid(Tx) + Cx
      By = sigmoid(Ty) + Cy
      Bw = Aw * exp(Tw)
      Bh = Ah * exp(Th)

    R-CNN (Ax, Ay are the anchor/proposal center, Aw, Ah its width and height):

      Bx = Aw * Tx + Ax
      By = Ah * Ty + Ay
      Bw = Aw * exp(Tw)
      Bh = Ah * exp(Th)

  CoordConv: convolution is translation-invariant, which makes absolute position hard to express (a coordinate-transform problem); CoordConv concatenates normalized x/y coordinate channels onto the input before convolving (implementation below)
  SPP: spatial pyramid pooling pools the feature map over a pyramid of grid sizes into a fixed-length vector; ROI pooling is the per-region counterpart of the same idea (implementation below)
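To make the YOLO decoding concrete, here is a minimal sketch (my own illustration, not the paper's code); txywh holds the raw outputs for one scale and anchors holds the anchor sizes in grid units:

import torch

def decode_yolo(txywh, anchors, grid_h, grid_w):
    # txywh: (N, A, grid_h, grid_w, 4) raw outputs; anchors: (A, 2) as (Aw, Ah) in grid units
    tx, ty, tw, th = txywh.unbind(dim=-1)
    cy, cx = torch.meshgrid(torch.arange(grid_h), torch.arange(grid_w), indexing="ij")
    bx = torch.sigmoid(tx) + cx                            # Bx = sigmoid(Tx) + Cx
    by = torch.sigmoid(ty) + cy                            # By = sigmoid(Ty) + Cy
    bw = anchors[:, 0].view(1, -1, 1, 1) * torch.exp(tw)   # Bw = Aw * exp(Tw)
    bh = anchors[:, 1].view(1, -1, 1, 1) * torch.exp(th)   # Bh = Ah * exp(Th)
    return torch.stack([bx, by, bw, bh], dim=-1)           # centers and sizes in grid units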

PyTorch reference implementations (MixUp, EMA, DropBlock, SPP, PAN, CoordConv, IoU / soft-NMS):

import math

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F


def MixUP(x, y, alpha=1.0, use_cuda=True):
    # Sample the mixing coefficient lambda ~ Beta(alpha, alpha)
    if alpha > 0:
        lam = np.random.beta(alpha, alpha)
    else:
        lam = 1.

    batch = x.size()[0]
    # A random permutation pairs each sample with another one from the batch
    if use_cuda:
        idx = torch.randperm(batch).cuda()
    else:
        idx = torch.randperm(batch)
    # Convex combination of the inputs; both labels are returned with the weight
    mixup_x = lam * x + (1 - lam) * x[idx]
    y_a, y_b = y, y[idx]
    return mixup_x, y_a, y_b, lam


def mixup_criterion(y_a, y_b, lam):
    # The loss is mixed with the same coefficient as the inputs
    return lambda criterion, pred: lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b)
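A hedged sketch of a MixUp training step on toy data (the linear model and random tensors are placeholders of my own):

# Hypothetical MixUp training step (CPU toy example, so use_cuda=False).
model = nn.Linear(32, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

x = torch.randn(16, 32)
y = torch.randint(0, 10, (16,))
mixed_x, y_a, y_b, lam = MixUP(x, y, alpha=1.0, use_cuda=False)
pred = model(mixed_x)
loss = mixup_criterion(y_a, y_b, lam)(criterion, pred)
optimizer.zero_grad()
loss.backward()
optimizer.step()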
class EMA():
    # Exponential moving average of weights: shadow = decay * shadow + (1 - decay) * param
    def __init__(self, model, decay):
        self.model = model
        self.decay = decay
        self.shadow = {}
        self.backup = {}

    def register(self):
        # Snapshot the current weights as the initial shadow copy
        for name, param in self.model.named_parameters():
            if param.requires_grad:
                self.shadow[name] = param.data.clone()

    def update(self):
        # Fold the latest weights into the shadow copy after each optimizer step
        for name, param in self.model.named_parameters():
            if param.requires_grad:
                assert name in self.shadow
                new_average = (1.0 - self.decay) * param.data + self.decay * self.shadow[name]
                self.shadow[name] = new_average.clone()

    def apply_shadow(self):
        # Swap the shadow weights in (e.g. before evaluation), backing up the live ones
        for name, param in self.model.named_parameters():
            if param.requires_grad:
                assert name in self.shadow
                self.backup[name] = param.data
                param.data = self.shadow[name]

    def restore(self):
        # Swap the original training weights back in
        for name, param in self.model.named_parameters():
            if param.requires_grad:
                assert name in self.backup
                param.data = self.backup[name]
        self.backup = {}
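A hedged end-to-end sketch of driving the EMA class (toy model and data, my own placeholders):

# Hypothetical EMA usage around a training loop.
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()
ema = EMA(model, decay=0.999)
ema.register()                      # snapshot the initial weights

for _ in range(100):
    x, y = torch.randn(8, 4), torch.randn(8, 2)
    loss = criterion(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ema.update()                    # fold the new weights into the shadow copy

ema.apply_shadow()                  # evaluate with the averaged weights
# ... run validation here ...
ema.restore()                       # back to the raw weights to continue training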
class DropBlock2D(nn.Module):
    # Structured dropout: zero out block_size x block_size regions of the feature map
    def __init__(self, drop_prob, block_size):
        super(DropBlock2D, self).__init__()
        self.drop_prob = drop_prob
        self.block_size = block_size

    def forward(self, x):
        if not self.training or self.drop_prob < 1e-9:
            return x
        gamma = self.compute_gamma(x)
        # Bernoulli seed points; each seed is dilated into a full block below
        mask = (torch.rand(x.shape[0], *x.shape[2:]) < gamma).float()
        mask = mask.to(x.device)
        block_mask = self.compute_block_mask(mask)
        out = x * block_mask[:, None, :, :]
        # Rescale so the expected activation magnitude stays the same
        out = out * block_mask.numel() / block_mask.sum()
        return out

    def compute_gamma(self, x):
        return self.drop_prob / (self.block_size ** 2)

    def compute_block_mask(self, mask):
        # Max-pooling dilates each seed into a block_size x block_size region
        block_mask = F.max_pool2d(input=mask[:, None, :, :],
                                  kernel_size=(self.block_size, self.block_size),
                                  stride=(1, 1),
                                  padding=self.block_size // 2)
        if self.block_size % 2 == 0:
            block_mask = block_mask[:, :, :-1, :-1]
        block_mask = 1 - block_mask.squeeze(1)
        return block_mask


class SPPLayer(nn.Module):
    # Spatial pyramid pooling: max-pool over level x level grids and concatenate
    def __init__(self, num_levels):
        super(SPPLayer, self).__init__()
        self.num_levels = num_levels

    def forward(self, x):
        n, c, h, w = x.size()
        for i in range(self.num_levels):
            level = i + 1
            kernel_size = (math.ceil(h / level), math.ceil(w / level))
            stride = (math.ceil(h / level), math.ceil(w / level))
            padding = (math.floor((kernel_size[0] * level - h + 1) / 2),
                       math.floor((kernel_size[1] * level - w + 1) / 2))
            tensor = F.max_pool2d(x, kernel_size=kernel_size, stride=stride, padding=padding)
            if i == 0:
                out = tensor.view(n, -1)
            else:
                out = torch.cat((out, tensor.view(n, -1)), 1)
        return out


class Conv(nn.Module):
    # Conv + BN + LeakyReLU building block
    def __init__(self, inchannel, outchannel, kernel_size, stride=1):
        super(Conv, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(inchannel, outchannel, kernel_size, stride, kernel_size // 2, bias=False),
            nn.BatchNorm2d(outchannel),
            nn.LeakyReLU(negative_slope=0.1)
        )

    def forward(self, x):
        return self.conv(x)


class Upsample(nn.Module):
    def __init__(self, inchannel, outchannel):
        super(Upsample, self).__init__()
        self.upsample = Conv(inchannel, outchannel, 1)

    def forward(self, x, target_size):
        x = self.upsample(x)
        x = F.interpolate(x, target_size, mode="bilinear", align_corners=False)
        return x


class Downsample(nn.Module):
    def __init__(self, inchannel, outchannel):
        super(Downsample, self).__init__()
        self.downsample = Conv(inchannel, outchannel, 3, 2)

    def forward(self, x):
        return self.downsample(x)


class PAN(nn.Module):
    # Path Aggregation Network: a top-down pass followed by a bottom-up pass
    def __init__(self, feature_channels):
        super(PAN, self).__init__()
        self.init_trans = nn.ModuleList(
            [Conv(channel, channel // 2, 1) for channel in feature_channels[:-1]] +
            [nn.Sequential(
                Conv(feature_channels[-1], feature_channels[-1] // 2, 1),
                Conv(feature_channels[-1] // 2, feature_channels[-1], 3),
                Conv(feature_channels[-1], feature_channels[-1] // 2, 1)
            )])
        self.up_trans = nn.ModuleList([self.trans(channel) for channel in feature_channels[:-1]] +
                                      [nn.Identity()])
        self.down_trans = nn.ModuleList([nn.Identity()] +
                                        [self.trans(channel) for channel in feature_channels[1:]])
        self.upsamples = nn.ModuleList([Upsample(high // 2, low // 2)
                                        for high, low in zip(feature_channels[1:], feature_channels[:-1])])
        self.downsamples = nn.ModuleList([Downsample(low // 2, high // 2)
                                          for low, high in zip(feature_channels[:-1], feature_channels[1:])])

    def trans(self, channel):
        # Five-conv block alternating 1x1 bottlenecks and 3x3 convs
        return nn.Sequential(
            Conv(channel, channel // 2, 1),
            Conv(channel // 2, channel, 3),
            Conv(channel, channel // 2, 1),
            Conv(channel // 2, channel, 3),
            Conv(channel, channel // 2, 1)
        )

    def forward(self, features):
        features = [layer(f) for layer, f in zip(self.init_trans, features)]
        # Top-down: upsample the deeper level and concatenate with the shallower one
        features[-1] = self.up_trans[-1](features[-1])
        for idx in range(len(features) - 1, 0, -1):
            features[idx - 1] = torch.cat([features[idx - 1],
                                           self.upsamples[idx - 1](features[idx], features[idx - 1].shape[2:])],
                                          dim=1)
            features[idx - 1] = self.up_trans[idx - 1](features[idx - 1])
        # Bottom-up: downsample the shallower level and concatenate with the deeper one
        features[0] = self.down_trans[0](features[0])
        for idx in range(0, len(features) - 1):
            features[idx + 1] = torch.cat([self.downsamples[idx](features[idx]), features[idx + 1]], dim=1)
            features[idx + 1] = self.down_trans[idx + 1](features[idx + 1])
        return features


class Mish(nn.Module):
    # Mish activation: x * tanh(softplus(x))
    def __init__(self):
        super(Mish, self).__init__()

    def forward(self, x):
        return x * torch.tanh(F.softplus(x))


class AddCoords(nn.Module):
    # Concatenate normalized coordinate channels (and optionally a radius channel)
    def __init__(self, with_r=False):
        super(AddCoords, self).__init__()
        self.with_r = with_r

    def forward(self, input_tensor):
        N, C, H, W = input_tensor.size()
        h_channel = torch.arange(H).repeat(1, W, 1)
        w_channel = torch.arange(W).repeat(1, H, 1).transpose(1, 2)
        # Normalize to [-1, 1]
        h_channel = h_channel.float() / (H - 1)
        w_channel = w_channel.float() / (W - 1)
        h_channel = h_channel * 2 - 1
        w_channel = w_channel * 2 - 1
        h_channel = h_channel.repeat(N, 1, 1, 1).transpose(2, 3)
        w_channel = w_channel.repeat(N, 1, 1, 1).transpose(2, 3)
        out_tensor = torch.cat([input_tensor,
                                h_channel.type_as(input_tensor),
                                w_channel.type_as(input_tensor)], dim=1)
        if self.with_r:
            r = torch.sqrt(torch.pow(h_channel.type_as(input_tensor) - 0.5, 2) +
                           torch.pow(w_channel.type_as(input_tensor) - 0.5, 2))
            out_tensor = torch.cat([out_tensor, r], dim=1)
        return out_tensor


class CoordConv(nn.Module):
    def __init__(self, with_r, inchannel, outchannel, kernel_size):
        super(CoordConv, self).__init__()
        self.addcoord = AddCoords(with_r)
        inchannel += 2          # two coordinate channels
        if with_r:
            inchannel += 1      # plus the radius channel
        self.conv = nn.Sequential(
            nn.Conv2d(inchannel, outchannel, kernel_size, 1, kernel_size // 2, bias=False),
            nn.BatchNorm2d(outchannel),
            nn.LeakyReLU(0.1, inplace=True)
        )

    def forward(self, x):
        x = self.addcoord(x)
        x = self.conv(x)
        return x
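A quick shape check for CoordConv and PAN (my own smoke test with made-up channel sizes, not from the original post):

# Hypothetical smoke test.
x = torch.randn(2, 64, 32, 32)
coordconv = CoordConv(with_r=False, inchannel=64, outchannel=128, kernel_size=3)
print(coordconv(x).shape)              # torch.Size([2, 128, 32, 32])

feats = [torch.randn(2, 256, 52, 52),
         torch.randn(2, 512, 26, 26),
         torch.randn(2, 1024, 13, 13)]
pan = PAN([256, 512, 1024])
print([tuple(o.shape) for o in pan(feats)])
# [(2, 128, 52, 52), (2, 256, 26, 26), (2, 512, 13, 13)]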
def box_area(boxes: torch.Tensor) -> torch.Tensor:
    # Boxes are (x1, y1, x2, y2)
    return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])


def box_iou(boxes1: torch.Tensor, boxes2: torch.Tensor) -> torch.Tensor:
    # Pairwise IoU matrix of shape (len(boxes1), len(boxes2))
    area1 = box_area(boxes1)
    area2 = box_area(boxes2)
    lt = torch.max(boxes1[:, None, :2], boxes2[:, :2])
    rb = torch.min(boxes1[:, None, 2:], boxes2[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, :, 0] * wh[:, :, 1]
    iou = inter / (area1[:, None] + area2 - inter)
    return iou


def soft_nms(boxes: torch.Tensor, scores: torch.Tensor, soft_thre, iou_thre, weight_method, sigma):
    # Soft-NMS: decay the scores of overlapping boxes instead of discarding them
    keep = []
    idxs = scores.argsort()
    while idxs.numel() > 0:
        idxs = scores.argsort()                 # re-sort, since scores are decayed in place
        if idxs.size(0) == 1:
            keep.append(idxs[-1])
            break
        keep_len = len(keep)
        max_score_idx = idxs[-(keep_len + 1)]   # highest-scoring box not yet kept
        max_score_box = boxes[max_score_idx][None, :]
        idxs = idxs[:-(keep_len + 1)]
        other_boxes = boxes[idxs]
        keep.append(max_score_idx)
        ious = box_iou(max_score_box, other_boxes)
        if weight_method == "linear":
            # Linear decay, applied only above the IoU threshold
            thre_bool = ious[0] >= iou_thre
            thre_idxs = idxs[thre_bool]
            scores[thre_idxs] *= (1. - ious[0][thre_bool])
        elif weight_method == "gauss":
            # Gaussian decay, applied to all remaining boxes
            scores[idxs] *= torch.exp(-(ious[0] * ious[0]) / sigma)
    keep = torch.stack(keep)                    # collect kept indices into a tensor
    keep = keep[scores[keep] > soft_thre]       # final score threshold
    boxes = boxes[keep]
    scores = scores[keep]
    return boxes, scores
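A small sanity check for soft_nms (values invented for illustration):

# Hypothetical demo: two heavily overlapping boxes plus one far away.
boxes = torch.tensor([[0., 0., 10., 10.],
                      [1., 1., 11., 11.],
                      [50., 50., 60., 60.]])
scores = torch.tensor([0.9, 0.8, 0.7])
kept_boxes, kept_scores = soft_nms(boxes, scores, soft_thre=0.3,
                                   iou_thre=0.5, weight_method="gauss", sigma=0.5)
print(kept_scores)  # the overlapping box's score decays from 0.8 to about 0.32 instead of being dropped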