[CVPR 2016] Weakly Supervised Deep Detection Networks: Paper Notes

阿新 • Published: 2018-04-02


Weakly Supervised Deep Detection Networks, Hakan Bilen, Andrea Vedaldi

https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Bilen_Weakly_Supervised_Deep_CVPR_2016_paper.pdf

Highlights

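The paper frames weakly supervised detection as a proposal-ranking problem: comparing the class scores of all proposals yields a reasonably correct ranking. This idea is consistent with how the evaluation metric for detection is computed.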

Related work

The MIL strategy results in a non-convex optimization problem; in practice, solvers tend to get stuck in local optima such that the quality of the solution strongly depends on the initialization.

One line of work develops various initialization strategies [19, 5, 32, 4]:

[19] propose a self-paced learning strategy
[5] initialize object locations based on the objectness score.
[4] propose a multi-fold split of the training data to escape local optima.

Another line of work focuses on regularizing the optimization problem [31, 1]:

[31] apply Nesterov’s smoothing technique to the latent SVM formulation
[1] propose a smoothed version of MIL that softly labels object instances instead of choosing the highest scoring ones.

Another line of research in WSD is based on the idea of identifying the similarity between image parts.

[31] propose a discriminative graph-based algorithm that selects a subset of windows such that each window is connected to its nearest neighbors in positive images.
[32] extend this method to discover multiple co-occurring part configurations.
[36] propose an iterative technique that applies a latent semantic clustering via probabilistic Latent Semantic Analysis (pLSA).
[2] propose a formulation that jointly learns a discriminative model and enforces the similarity of the selected object regions via a discriminative convex clustering algorithm

Method

The method used in this paper is simple and easy to follow. It consists of three main parts:

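The features and the region proposals are fed into a spatial pyramid pooling layer, which extracts a feature vector for each region; these vectors are then passed through two fc layers.
Classification: the fc output goes through a softmax classifier, which computes the class of each region.
Detection: the fc output also goes through a softmax, but the normalization runs over the scores of all regions rather than over classes, so that comparing regions against each other finds the region that carries the most information about a given class.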

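The score of region r for class c is the product of the outputs of the latter two parts (classification and detection).
The image-level score for a class is the sum of the scores of all regions for that class.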
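
The two-stream scoring described above can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the function names `softmax` and `wsddn_scores`, the (regions, classes) array layout, and the random inputs are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def wsddn_scores(cls_logits, det_logits):
    # Both inputs have shape (R, C): R regions, C classes (assumed layout).
    cls_prob = softmax(cls_logits, axis=1)    # classification: normalize over classes
    det_prob = softmax(det_logits, axis=0)    # detection: normalize over regions
    region_scores = cls_prob * det_prob       # score of region r for class c
    image_scores = region_scores.sum(axis=0)  # image-level score per class
    return region_scores, image_scores

# Hypothetical example: 5 regions, 3 classes.
rng = np.random.default_rng(0)
region_scores, image_scores = wsddn_scores(rng.normal(size=(5, 3)),
                                           rng.normal(size=(5, 3)))
```

Because the detection stream sums to 1 over regions and each classification probability is below 1, every image-level score falls strictly between 0 and 1, so it can be compared directly against the binary image-level labels.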


The training loss function is as follows:

[Image: training loss function]

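The last term is a regularization term (I changed it slightly according to my own understanding, since the paper's notation seems a little off). Its purpose is to enforce smoothness of the solution by pulling features closer together, i.e. proposals close to the correct solution should also receive high scores.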
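
A rough sketch of such a regularization term for one positive class in one image is given below. Everything specific here is an assumption made for illustration: the helper names `iou` and `spatial_penalty`, the IoU threshold of 0.6, the (x1, y1, x2, y2) box format, and the weighting by the squared score of the top region. It is not the paper's exact formula, nor the modified version in these notes.

```python
import numpy as np

def iou(box, boxes):
    # IoU between one box (4,) and many boxes (R, 4), format (x1, y1, x2, y2).
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def spatial_penalty(boxes, feats, scores, iou_thresh=0.6):
    # Pull the features of boxes that overlap the top-scoring box towards it.
    top = int(np.argmax(scores))
    neighbours = np.flatnonzero(iou(boxes[top], boxes) >= iou_thresh)
    neighbours = neighbours[neighbours != top]
    sq_dist = ((feats[neighbours] - feats[top]) ** 2).sum(axis=1)
    # Assumed weighting: squared score of the top region.
    return 0.5 * float(scores[top] ** 2 * sq_dist.sum())

# Hypothetical example: box 1 overlaps the top box 0, box 2 does not.
boxes = np.array([[0, 0, 10, 10], [0, 0, 10, 9], [20, 20, 30, 30]], dtype=float)
feats = np.array([[1.0, 0.0], [0.0, 0.0], [5.0, 5.0]])
scores = np.array([0.9, 0.05, 0.05])
penalty = spatial_penalty(boxes, feats, scores)
```

Only the single top-scoring box and its neighbours enter the penalty, which matches the observation in these notes that the regulariser constrains just the maximum and the boxes around it.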

Experimental results

Depending on the base network, the paper presents four models: S (VGG-F), M (VGG-M-1024), L (VGG-VD16), and Ens. (an ensemble of the first three).

Ablation:

Object proposal

Baseline mAP: Selective Search S 31.1%, M 30.9%, L 24.3%, Ens. 33.3%
Edge Box: +0~1.2%
Edge Box + Edge Box Score: +1.8~5.9%

Spatial regulariser (compared with Edge Box + Edge Box Score) mAP +1.2~4.4%

VOC2007

mAP on test: S +2.9%, M +3.3%, L +3.2%, Ens. +7.7% compared with [36] + context
CorLoc on trainval: S +5.7%, M +7.6%, L +5%, Ens. +9.5% compared with [36]
Classification AP on test: S +7.9% compared with VGG-F, M +6.5% compared with VGG-M-1024, L +0.4% compared with VGG-VD16, Ens. -0.3% compared with VGG-VD16

VOC2010

mAP on test: +8.8% compared with [4]
CorLoc on trainval: +4.5% compared with [4]

Weaknesses

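An obvious weakness of the paper is that it only considers the case where an object of a given class appears once in an image (the regulariser only constrains the top-scoring box and the boxes around it). This also shows up in the failure cases presented in the paper.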
