Overview: End-to-End Deep Learning Networks for Super-Resolution (to be continued)

阿新 • Published: 2018-10-02


1. SRCNN
- Contribution
- Inspiration
- Network
  - O. Pre-processing
  - I. Patch extraction and representation
  - II. Non-linear mapping
  - III. Reconstruction
- Story
- Further learning

1. SRCNN

Home page
http://mmlab.ie.cuhk.edu.hk/projects/SRCNN.html

ECCV 2014, TPAMI 2015.

Contribution

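SRCNN is the pioneering work that applies end-to-end deep learning to super-resolution; by 2018 it had been cited more than 1000 times. (For non-end-to-end approaches, see Story.3.)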

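It points out the relationship between traditional methods (sparse-coding-based SR methods) and deep learning methods, which is instructive for later work.
The SRCNN network is very simple, yet it yields a small improvement in PSNR, SSIM, and other metrics (less than 1 dB in PSNR). Specifically: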
- The method (SRCNN) directly learns an end-to-end mapping between the low/high-resolution images.
- Because the network is end-to-end, training jointly optimizes the whole framework (see Inspiration.2).
At test time the network is purely feed-forward, so it is fast.
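
To make the "less than 1 dB" figure concrete, here is a minimal sketch of how PSNR is computed, using the standard definition PSNR = 10 log10(peak^2 / MSE). The tiny images and the names `ground_truth`, `baseline`, and `improved` are made up for illustration and are not results from the paper.

```python
import math

def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if len(ref) != len(est):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

# Hypothetical 2x2 images: the baseline is off by 10 grey levels per pixel,
# the improved reconstruction by 9 grey levels per pixel.
ground_truth = [[100, 110], [120, 130]]
baseline = [[110, 100], [130, 120]]
improved = [[109, 101], [129, 121]]

gain_db = psnr(ground_truth, improved) - psnr(ground_truth, baseline)
print(f"PSNR gain: {gain_db:.2f} dB")  # about 0.92 dB
```

Because PSNR is logarithmic, reducing the per-pixel error from 10 to 9 grey levels already accounts for roughly 0.9 dB, which gives a feel for the size of a sub-1 dB gain.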

Inspiration

This problem (SR) is inherently ill-posed since a multiplicity of solutions exist for any given low-resolution pixel.
Such a problem is typically mitigated by constraining the solution space by strong prior information.
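Note: training a CNN amounts to learning this prior knowledge.
Traditional methods focus on learning and optimizing the dictionaries, but rarely optimize the other parts of the pipeline.
In a CNN, by contrast, the convolutional layers handle patch extraction and aggregation and the hidden layers act as the dictionaries, so all of them are optimized jointly.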

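In other words, very little pre/post-processing is needed.
In the past, patches were represented with a set of pre-trained bases such as PCA, DCT, Haar.
Now, different convolution kernels give us a diverse set of representations.
Because the patches overlap, convolution uses more pixel information than a simple dictionary mapping.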

Network


O. Pre-processing

Upscale the low-resolution image with bicubic interpolation to obtain \(\mathbf Y\). Note that we still call \(\mathbf Y\) a low-resolution image.

I. Patch extraction and representation

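Extract overlapping patches from \(\mathbf Y\); each patch is represented as a high-dimensional vector.
Together, these vectors form a set of feature maps.
The dimensionality of each vector equals both the total number of features and the number of feature maps.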

\[ F_1(\mathbf Y) = \max(0, W_1 * \mathbf Y + B_1) \]
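
As a rough, dependency-free sketch of this first layer: convolving \(\mathbf Y\) with n1 filters and applying ReLU gives n1 feature maps, and the values at one spatial position across all maps form the n1-dimensional vector of the patch at that position. The toy sizes, random weights, and the helper name `conv_relu` are assumptions for illustration only.

```python
import random

def conv_relu(image, filters, biases):
    """Valid convolution (cross-correlation, as in CNN libraries) of one
    grayscale image with several f x f filters, followed by ReLU."""
    f = len(filters[0])
    out_h = len(image) - f + 1
    out_w = len(image[0]) - f + 1
    maps = []
    for kernel, bias in zip(filters, biases):
        fmap = [[0.0] * out_w for _ in range(out_h)]
        for i in range(out_h):
            for j in range(out_w):
                acc = bias
                for u in range(f):
                    for v in range(f):
                        acc += kernel[u][v] * image[i + u][j + v]
                fmap[i][j] = max(0.0, acc)  # ReLU: max(0, W1 * Y + B1)
        maps.append(fmap)
    return maps

random.seed(0)
H, W, f1, n1 = 6, 6, 3, 4  # toy sizes, chosen only to keep the example small
Y = [[random.random() for _ in range(W)] for _ in range(H)]
W1 = [[[random.uniform(-1, 1) for _ in range(f1)] for _ in range(f1)]
      for _ in range(n1)]
B1 = [0.0] * n1

F1 = conv_relu(Y, W1, B1)  # n1 feature maps, each (H - f1 + 1) x (W - f1 + 1)

# The n1-dimensional vector representing the patch whose top-left corner is (1, 2).
patch_vector = [fmap[1][2] for fmap in F1]
print(len(F1), len(F1[0]), len(F1[0][0]), len(patch_vector))
```

Neighbouring output positions read overlapping f1 x f1 windows of \(\mathbf Y\), which is how the convolution handles the overlapping patches mentioned above.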

II. Non-linear mapping

A non-linear transformation maps each high-dimensional vector to another high-dimensional vector.
These new vectors again form a set of feature maps, and each one conceptually represents a high-resolution patch.

\[ F_2(\mathbf Y) = \max(0, W_2 * F_1(\mathbf Y) + B_2) \]
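
A minimal sketch of this mapping, assuming 1 x 1 filters (the filter size is not stated in these notes; it is the choice in the paper's basic setting). With 1 x 1 filters the layer is just a small matrix applied independently to the n1-dimensional vector at every position, followed by ReLU. The numbers and the helper name `map_vectors` are made up for illustration.

```python
def map_vectors(F1, W2, B2):
    """Apply an n2 x n1 matrix W2 (a 1x1 convolution) plus ReLU at every position.

    F1 is a list of n1 feature maps; the result is a list of n2 feature maps.
    """
    h, w = len(F1[0]), len(F1[0][0])
    out = []
    for weights, bias in zip(W2, B2):
        fmap = [[max(0.0, bias + sum(wk * F1[k][i][j]
                                     for k, wk in enumerate(weights)))
                 for j in range(w)]
                for i in range(h)]
        out.append(fmap)
    return out

# Toy input: n1 = 3 feature maps of size 2 x 2.
F1 = [[[1.0, 2.0], [0.0, 1.0]],
      [[0.0, 1.0], [2.0, 0.0]],
      [[3.0, 0.0], [1.0, 1.0]]]
W2 = [[1.0, -1.0, 0.5],   # produces output map 0
      [-1.0, 0.0, 0.0]]   # produces output map 1
B2 = [0.0, 0.5]

F2 = map_vectors(F1, W2, B2)  # n2 = 2 feature maps of size 2 x 2
print(F2)  # [[[2.5, 1.0], [0.0, 1.5]], [[0.0, 0.0], [0.5, 0.0]]]
```

Each 3-dimensional input vector, for example (1, 0, 3) at position (0, 0), is turned into a 2-dimensional output vector, here (2.5, 0).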

III. Reconstruction

Generate an output close to the ground truth \(\mathbf X\).

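In the past, the overlapping patches were usually averaged. In fact, averaging is just a special case of convolution.
So we may as well use a convolution directly.
The output patch is then no longer a plain average; it can also be, for example, an average in the frequency domain (depending on the nature of the high-dimensional vectors).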

\[ F_3(\mathbf Y) = W_3 * F_2(\mathbf Y) + B_3 \]

Note that no non-linearity is applied after this layer.
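
The claim that averaging is a special case of convolution can be checked with a small sketch: a kernel whose weights are all 1/f^2 returns the mean of each f x f window, while any other kernel gives a weighted combination that training is free to choose. The single-channel image, the kernels, and the helper name `conv2d` are illustrative assumptions; the real reconstruction layer acts on all feature maps of \(F_2(\mathbf Y)\).

```python
def conv2d(image, kernel, bias=0.0):
    """Valid 2-D cross-correlation of a single-channel image with one kernel.

    There is no activation function, matching the linear reconstruction layer.
    """
    f = len(kernel)
    out_h = len(image) - f + 1
    out_w = len(image[0]) - f + 1
    return [[bias + sum(kernel[u][v] * image[i + u][j + v]
                        for u in range(f) for v in range(f))
             for j in range(out_w)]
            for i in range(out_h)]

image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]

# Uniform weights: every output value is the plain average of a 2 x 2 window.
mean_kernel = [[0.25, 0.25],
               [0.25, 0.25]]
averaged = conv2d(image, mean_kernel)
print(averaged)  # [[3.0, 4.0], [6.0, 7.0]]

# Hypothetical learned weights: still a convolution, no longer a plain average.
learned_kernel = [[0.7, 0.1],
                  [0.1, 0.1]]
weighted = conv2d(image, learned_kernel)
```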

Story

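Three main factors behind the growing popularity of deep CNNs:
- more powerful GPUs;
- more data (e.g. ImageNet);
- the introduction of ReLU, which speeds up convergence while maintaining good quality.
CNNs had previously been used for natural image denoising and removing noisy patterns (dirt/rain); this was the first time they were used for SR.
This shows the importance of telling a good story: the only real difference is the pairs being mapped.
Auto-encoders had also been used in super-resolution networks, but they still suffered from the drawbacks of a separated framework.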

Further learning

Traditional sparse-coding-based SR methods
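The step from the low-resolution image to \(\mathbf Y\) uses bicubic interpolation, which is in fact also a convolution. So why not treat it as a convolutional layer?
The paper explains that the output of this step is larger than its input, so in order to take advantage of well-optimized implementations such as cuda-convnet, it is not treated as a "layer" for now.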
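
To see why bicubic interpolation counts as a convolution, here is a 1-D sketch using the Keys cubic convolution kernel. The parameter a = -0.5, the replicated borders, the pixel-centre alignment, and the helper names are assumptions for illustration; actual bicubic implementations differ in these details. Each output sample is a weighted sum of four neighbouring input samples, and the output is longer than the input, which is the property the paper points to.

```python
import math

def cubic_kernel(x, a=-0.5):
    """Keys cubic convolution kernel; a = -0.5 is an assumed, common choice."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x ** 3 - (a + 3) * x ** 2 + 1
    if x < 2:
        return a * x ** 3 - 5 * a * x ** 2 + 8 * a * x - 4 * a
    return 0.0

def upsample_1d(signal, scale):
    """Upsample a 1-D signal by an integer factor with cubic convolution."""
    n = len(signal)
    out = []
    for k in range(n * scale):
        # Position of output sample k in input coordinates (pixel centres aligned).
        x = (k + 0.5) / scale - 0.5
        base = math.floor(x)
        value = 0.0
        for m in range(base - 1, base + 3):   # the four nearest input samples
            idx = min(max(m, 0), n - 1)       # replicate border samples
            value += signal[idx] * cubic_kernel(x - m)
        out.append(value)
    return out

low_res = [1.0, 2.0, 4.0, 3.0]
Y_row = upsample_1d(low_res, 2)
print(len(low_res), len(Y_row))  # 4 8
```

Applying the same 1-D operation along rows and then along columns gives the 2-D (bicubic) case.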
