Paper Notes - RRU-Net: The Ringed Residual U-Net for Image Splicing Forgery Detection

阿新 • Published: 2022-04-07

Abstract

The proposed RRU-Net is an end-to-end image essence attribute segmentation network that is independent of the human visual system; it can accomplish forgery detection without any preprocessing or post-processing. The core idea of RRU-Net is to strengthen the learning way of CNNs, which is inspired by the recall and consolidation mechanisms of the human brain and implemented by the propagation and feedback processes of the residual in a CNN. The residual propagation recalls the input feature information to solve the gradient degradation problem in deeper networks; the residual feedback consolidates the input feature information to make the differences in image attributes between the un-tampered and tampered regions more obvious.

1. Introduction

To improve the detected tampered regions, the detection methods in [1, 27] use non-overlapping image patches as the input of CNNs. However, when an image patch comes entirely from a tampered region, the patch will be labeled as un-tampered. In [15], the authors use larger image patches to reveal the image attributes of the tampered regions; however, the detection method may fail if the forgery image is small. Because existing CNN-based detection methods use image patches as the network input, the contextual spatial information is lost, which easily causes incorrect predictions.

Moreover, when the network architecture is deeper, the gradient degradation problem appears and the discrimination of features becomes weaker, which makes splicing forgery detection more difficult or even causes it to fail.

To overcome the drawbacks of traditional feature extraction-based methods and to further solve the problems of current CNN-based detection methods, a ringed residual U-Net (RRU-Net) is proposed in this paper. RRU-Net is an end-to-end image essence attribute segmentation network that is independent of the human visual system; it can directly locate the forgery regions without any preprocessing or post-processing. Furthermore, RRU-Net can effectively decrease incorrect predictions since it makes better use of the contextual spatial information in an image.

Most importantly, the ringed residual structure in RRU-Net can strengthen the learning way of CNNs and simultaneously prevent the gradient degradation problem of deeper networks, which ensures that the image essence attribute features remain discriminative while they are extracted across the layers of the network.

3. The Ringed Residual U-Net (RRU-Net)

3.1. Residual Propagation

According to the discussion above, the differences in image essence attributes are the significant basis for detecting image splicing forgery; however, the gradient degradation problem will destroy this basis when the network architecture gets deeper. To solve the gradient degradation problem, we add residual propagation to each group of stacked layers. A building block is shown in Fig. 2, which consists of two convolutional (dilated convolution [31], dconv) layers and residual propagation. The output of the building block is defined as:

\[y_{f}=F\left(x,\left\{W_{i}\right\}\right)+W_{s} * x \tag{2}\]

where $x$ and $y_{f}$ are the input and output of the building block, $W_{i}$ represents the weights of layer $i$, and the function $F\left(x,\left\{W_{i}\right\}\right)$ represents the residual mapping to be learned. For the example in Fig. 2, which has two convolutional layers, $F=W_{2} \sigma\left(W_{1} * x\right)$, in which $\sigma$ denotes ReLU [19] and the biases are omitted to simplify the notation. The linear projection $W_{s}$ is used to change the dimension of $x$ to match the dimension of $F\left(x,\left\{W_{i}\right\}\right)$. The operation $F + W_{s} * x$ is performed by a shortcut connection and element-wise addition.

The residual propagation resembles the recall mechanism of the human brain. We may forget previous knowledge as we learn more new knowledge, so we need the recall mechanism to help us revive those fuzzy earlier memories.
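The building-block equation for $y_f$ in this section can be sketched in NumPy as follows. This is a minimal illustration and not the authors' implementation: the helper names (`conv2d_same`, `residual_propagation`), the tensor shapes, the 3×3 kernel size, and the use of a 1×1 kernel for the projection $W_s$ are all assumptions made for the example. Biases are omitted, as in the text.

```python
import numpy as np

def conv2d_same(x, w, dilation=1):
    """Zero-padded 2D cross-correlation (hypothetical helper).

    x: (C_in, H, W) feature map; w: (C_out, C_in, k, k) with odd k.
    """
    c_out, _, k, _ = w.shape
    pad = dilation * (k // 2)
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, wd = x.shape[1:]
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Shifted view of the padded input for kernel tap (i, j).
            patch = xp[:, i * dilation:i * dilation + h,
                          j * dilation:j * dilation + wd]
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], patch)
    return out

def relu(z):
    return np.maximum(z, 0.0)

def residual_propagation(x, w1, w2, ws, dilation=1):
    """y_f = F(x, {W_i}) + W_s * x, with F = W_2 * relu(W_1 * x)."""
    f = conv2d_same(relu(conv2d_same(x, w1, dilation)), w2, dilation)
    shortcut = conv2d_same(x, ws)  # projection matches the channel count of F
    return f + shortcut            # element-wise addition

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))               # assumed 3-channel input
w1 = rng.standard_normal((16, 3, 3, 3)) * 0.1
w2 = rng.standard_normal((16, 16, 3, 3)) * 0.1
ws = rng.standard_normal((16, 3, 1, 1)) * 0.1    # assumed 1x1 projection
y_f = residual_propagation(x, w1, w2, ws)
print(y_f.shape)  # (16, 8, 8)
```

If the residual mapping $F$ contributes nothing (all-zero `w1` and `w2`), the block reduces to the shortcut $W_s * x$, which is the sense in which the input is "recalled" regardless of what the stacked layers learn.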

3.2. Residual Feedback

It is obvious that, in splicing forgery detection, if the differences in image essence attributes between the un-tampered and tampered regions can be further strengthened, the detection performance can be further improved. In [36], the proposed method superposes the additional difference of the noise attribute by passing the forgery image through an SRM filter layer to enhance detection results. The SRM filter layer has a certain effect; however, it is a manually chosen method and can only be used for RGB image forgery detection. Moreover, when the un-tampered and tampered regions come from cameras of the same brand and model, the effectiveness of the SRM filter layer drops sharply, since the regions have the same noise attribute. To further strengthen the differences in image essence attributes, the residual feedback is proposed, which is an automatic learning method and does not focus on just one or several specific image attributes. Furthermore, we design a simple and effective attention mechanism, which takes advantage of the ideas of Hu et al. [9], and add it to the residual feedback to pay more attention to the discriminative features of the input information. In this attention mechanism, we opt to employ a simple gating mechanism with a sigmoid activation function to learn a nonlinear interaction between discriminative feature channels and avoid diffusion of feature information, and then we superpose the response values obtained by the sigmoid activation on the input information to amplify the differences in image essence attributes between the un-tampered and tampered regions. The residual feedback in a building block is shown in Fig. 3 and is defined as Eq. (3),

\[y_{b}=\left(s\left(G\left(y_{f}\right)\right)+1\right) * x \tag{3}\]

where $x$ is the input, $y_{f}$ is the output of residual propagation defined in Eq. (2), and $y_{b}$ is the enhanced input. The function $G$ is a linear projection, which is used to change the dimensions of $y_{f}$. The function $s$ is a sigmoid activation function. In contrast to the recall mechanism imitated by the residual propagation, the residual feedback acts like the consolidation mechanism of the human brain: we need to consolidate the knowledge we have already learned to obtain new feature comprehension. The residual feedback can amplify the differences in image essence attributes between the un-tampered and tampered regions in the input; as shown in Fig. 1(c), the tampered region 'eagle' is amplified to the global maximal response values by the residual feedback. Furthermore, it also has two far-reaching effects:

(1) the strengthening of the discriminative features can simultaneously be viewed as the repression of the negative label features;

(2) the convergence rate of the network during training is faster.
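The feedback equation for $y_b$ can be illustrated with a short NumPy sketch. It rests on two assumptions that the text does not spell out: the `*` in that equation is read as element-wise multiplication (unlike the convolution `*` in the propagation equation), and the linear projection $G$ is modeled as a 1×1 convolution, written here as a per-pixel channel-mixing matrix `g`. The function name and shapes are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def residual_feedback(x, y_f, g):
    """y_b = (s(G(y_f)) + 1) * x, with * read as element-wise product.

    x:   (C_in, H, W)  block input
    y_f: (C_out, H, W) output of residual propagation
    g:   (C_in, C_out) assumed 1x1 projection mapping y_f back to x's channels
    """
    gate = sigmoid(np.einsum("ic,chw->ihw", g, y_f))  # values in (0, 1)
    return (gate + 1.0) * x                           # scale x by a factor in (1, 2)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))
y_f = rng.standard_normal((16, 8, 8))
g = rng.standard_normal((3, 16))
y_b = residual_feedback(x, y_f, g)
print(y_b.shape)  # (3, 8, 8)
```

Under this reading, the `+ 1` means the gate can only amplify the input, by a factor between 1 and 2 per element, and never suppress it below its original magnitude. Positions where the gate is close to 1 are nearly doubled, which is one way to picture how differences between regions are enlarged.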

3.3. Ringed Residual Structure and Network Architectures

The proposed ringed residual structure that combines the residual propagation and the residual feedback is shown in Fig. 4.

To sum up, the ringed residual structure ensures that the image essence attribute features remain discriminative while they are extracted across the layers of the network, which achieves better and more stable detection performance than traditional feature extraction-based detection methods and existing CNN-based detection methods.

The network architecture of RRU-Net is shown in Fig. 5. It is an end-to-end image essence attribute segmentation network that detects splicing forgery directly, without any preprocessing or post-processing.
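One possible reading of how the two mechanisms form a "ring" inside a single building block is sketched below. The ordering (propagate, feed back to enhance the input, then propagate again), the reuse of the same weights in both passes, and the reduction of every convolution to a 1×1 channel mix are assumptions made to keep the data flow visible; the text here only states that the structure combines residual propagation and residual feedback.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mix(w, x):
    """1x1 convolution as per-pixel channel mixing (simplifying assumption)."""
    return np.einsum("oc,chw->ohw", w, x)

def propagate(x, w1, w2, ws):
    """Residual propagation: y_f = W_2 relu(W_1 x) + W_s x."""
    return mix(w2, relu(mix(w1, x))) + mix(ws, x)

def ringed_residual_block(x, w1, w2, ws, g):
    y_f = propagate(x, w1, w2, ws)           # first forward pass
    y_b = (sigmoid(mix(g, y_f)) + 1.0) * x   # residual feedback enhances the input
    return propagate(y_b, w1, w2, ws)        # second pass on the enhanced input
                                             # (weight reuse is an assumption)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))
w1 = rng.standard_normal((16, 3)) * 0.1
w2 = rng.standard_normal((16, 16)) * 0.1
ws = rng.standard_normal((16, 3)) * 0.1
g = rng.standard_normal((3, 16)) * 0.1
out = ringed_residual_block(x, w1, w2, ws, g)
print(out.shape)  # (16, 8, 8)
```

In a U-Net style architecture, a block of this kind would stand in for the plain double-convolution block at each encoder and decoder level, with pooling, upsampling, and skip connections left unchanged.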

4.1. Detection at Pixel Level

4.2. Detection at Image Level

5. Conclusion

In this paper, we propose a ringed residual U-Net (RRU-Net) for image splicing forgery detection, which is an end-to-end image essence property segmentation network and can achieve forgery detection without any preprocessing or post-processing. Inspired by the recall and consolidation mechanisms of the human brain, the proposed RRU-Net strengthens the learning way of CNNs through the propagation and feedback processes of the residual. We also demonstrate the validity of the ringed residual structure in RRU-Net through theoretical analysis and experimental comparison. In future work, we will further explore and visualize the latent discriminative features between tampered and un-tampered regions to explain the key issues of image splicing forgery detection.

