OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

阿新 • Published: 2021-08-30

Sermanet P., Eigen D., Zhang X., Mathieu M., Fergus R., LeCun Y. OverFeat: Integrated recognition, localization and detection using convolutional networks. In International Conference on Learning Representations (ICLR), 2014.

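Overview

Conventional sliding-window detection is computationally expensive: each region (window) must first be cropped out and then classified on its own, so the cost grows quickly as the number of windows increases.

The authors observe that convolution lets all of these regions be evaluated in a single pass, which avoids a large amount of repeated computation. I find this idea quite interesting.

Main Content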

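As the figure above shows, the first row illustrates how an ordinary image is classified:

1. Input: \(14 \times 14 \times *\). A \(5 \times 5\) convolution (stride=1, padding=0) yields
2. \(10 \times 10 \times *\) feature maps; a \(2 \times 2\) pooling (stride=2, padding=0) then yields
3. \(5 \times 5 \times *\) feature maps. This completes the feature-extraction stage.

Next comes the classifier, which was originally a stack of fully connected layers. Viewed as fully connected layers first:

4. Let \(d_1 = 5 \times 5 \times *\) be the length of the flattened features;
5. \(W \in \mathbb{R}^{d_2 \times d_1}\) maps the features to a vector of length \(d_2\);
6. \(W' \in \mathbb{R}^{C \times d_2}\) maps that vector to a vector of length \(C\) (the number of classes).

Since a fully connected layer is a special case of a convolution, step 4 corresponds to \(d_1\) convolutions of size \(5 \times 5\) applied to the features, step 5 to \(d_2\) convolutions of size \(1 \times 1\), and step 6 to \(C\) convolutions of size \(1 \times 1\). Equivalently, steps 4 and 5 can be fused into \(d_2\) convolutions of size \(5 \times 5\).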
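The spatial sizes in the walk-through above can be checked with the usual output-size formula \(\lfloor (n + 2p - k) / s \rfloor + 1\). The following is a minimal sketch; the names `out_size`, `LAYERS` and `trace` are illustrative and not taken from the paper, and only the spatial size is tracked (the channel count `*` is ignored).

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (n + 2 * padding - kernel) // stride + 1

# (kernel, stride) per layer of the toy network described above:
# 5x5 conv, 2x2 pool (stride 2), then the classifier written as
# a 5x5 conv followed by two 1x1 convs. Padding is 0 everywhere.
LAYERS = [(5, 1), (2, 2), (5, 1), (1, 1), (1, 1)]

def trace(n):
    """Return the spatial size after every layer, starting from an n x n input."""
    sizes = [n]
    for kernel, stride in LAYERS:
        n = out_size(n, kernel, stride)
        sizes.append(n)
    return sizes

print(trace(14))  # [14, 10, 5, 1, 1, 1]
```

A \(14 \times 14\) input therefore collapses to a single \(1 \times 1\) position holding the class scores, which is what the fully connected view produces as well.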

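Now consider the second row. Its input is a \(16 \times 16\) image and its output is \(2 \times 2 \times C\), with the blue regions corresponding to one another. Suppose we cut the \(16 \times 16\) image into four \(14 \times 14\) images using sliding windows (stride=2). The logits the network produces for these four images are exactly the entries at the corresponding positions of the final \(2 \times 2\) output. In other words, all windows are evaluated in a single pass, with only a modest increase in computation.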
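This correspondence can be verified numerically. The sketch below is a single-channel toy version under stated assumptions: random integer weights, max pooling as the pooling operation, and the classifier reduced to one \(5 \times 5\) convolution. The helpers `conv2d`, `max_pool2` and `net` are hypothetical names written for this illustration, not code from the paper.

```python
import random

def conv2d(x, k):
    """Stride-1, padding-0 2-D cross-correlation of image x with kernel k."""
    kh, kw = len(k), len(k[0])
    return [[sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
             for j in range(len(x[0]) - kw + 1)]
            for i in range(len(x) - kh + 1)]

def max_pool2(x):
    """2x2 max pooling with stride 2 and no padding."""
    return [[max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
             for j in range(0, len(x[0]) - 1, 2)]
            for i in range(0, len(x) - 1, 2)]

def net(x, k1, k2):
    """Toy network: 5x5 conv -> 2x2 pool -> 5x5 conv (the 'fully connected' layer)."""
    return conv2d(max_pool2(conv2d(x, k1)), k2)

rng = random.Random(0)

def rand_square(n):
    """n x n matrix of small random integers (integers keep the comparison exact)."""
    return [[rng.randint(-3, 3) for _ in range(n)] for _ in range(n)]

image, k1, k2 = rand_square(16), rand_square(5), rand_square(5)

# One pass over the whole 16x16 image gives a 2x2 output map.
dense = net(image, k1, k2)

# Reference: crop each 14x14 window (stride 2) and run the network on it alone.
windows = [[net([row[c:c + 14] for row in image[r:r + 14]], k1, k2)[0][0]
            for c in (0, 2)]
           for r in (0, 2)]

print(dense == windows)  # True
```

The dense pass shares every intermediate feature map among the four windows, which is where the savings come from: the overlapping regions are convolved once instead of four times.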

So how are the corresponding windows laid out?

If the layers of the network have strides \(s_1, s_2, \cdots, s_k\), then the stride between adjacent windows is

\[s_1 \times s_2 \times \cdots \times s_k. \]
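As a quick check of this product rule on the toy network (one \(5 \times 5\) convolution, one stride-2 pooling, then three stride-1 classifier convolutions), here is a small sketch. The function names `window_stride` and `num_windows` are assumptions made for illustration.

```python
from math import prod

def window_stride(layer_strides):
    """Effective stride between adjacent input windows: the product of all layer strides."""
    return prod(layer_strides)

def num_windows(image_size, window_size, stride):
    """Number of window positions along one axis (no padding)."""
    return (image_size - window_size) // stride + 1

toy_strides = [1, 2, 1, 1, 1]  # conv, pool, and the three classifier convs
stride = window_stride(toy_strides)

print(stride)                       # 2
print(num_windows(16, 14, stride))  # 2 positions per axis, i.e. a 2x2 output
```

Only the pooling layer has a stride larger than 1 here, so the windows are 2 pixels apart, matching the \(2 \times 2\) output obtained for the \(16 \times 16\) input.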

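Note: the stride is fixed, but the image size need not be. In ResNet, for example, an average pooling operation precedes the fully connected layer, so images of varying size can be fed in.

Question: some convolutions also use padding; how should that be understood in this picture? (As a small error, perhaps?)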
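To see why averaging over the spatial dimensions removes the dependence on input size, consider this minimal sketch. It assumes the pooling is global (it averages each channel over its entire map); `global_avg_pool` and `make_maps` are hypothetical helpers, not ResNet code.

```python
def global_avg_pool(feature_maps):
    """Average each channel's 2-D map down to a single number."""
    return [sum(map(sum, fm)) / (len(fm) * len(fm[0])) for fm in feature_maps]

def make_maps(channels, height, width):
    """Dummy feature maps with value c + i + j at channel c, row i, column j."""
    return [[[float(c + i + j) for j in range(width)] for i in range(height)]
            for c in range(channels)]

small = global_avg_pool(make_maps(3, 5, 5))  # from a smaller input image
large = global_avg_pool(make_maps(3, 7, 9))  # from a larger input image

print(len(small), len(large))  # 3 3
```

Both results have one entry per channel, so the fully connected layer that follows always receives a vector of the same length regardless of the image size.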
