Image Paragraph Paper Collection

阿新 • Published: 2018-12-03

A Hierarchical Approach for Generating Descriptive Image Paragraphs (CVPR 2017) Li Fei-Fei.

Dataset: http://cs.stanford.edu/people/ranjaykrishna/im2p/index.html

Workflow:

1. Decompose the input image by detecting objects and other regions of interest.

2. Aggregate features across these regions to produce a pooled representation that richly expresses the image semantics.

3. Feed this feature vector to a hierarchical recurrent neural network composed of two levels: a sentence RNN and a word RNN.

4. The sentence RNN receives the image features, decides how many sentences to generate in the resulting paragraph, and produces an input topic vector for each sentence.

5. The word RNN uses this topic vector to generate the words of a single sentence.

Region Detector:

CNN+RPN

Resize the image --> pass it through a CNN to get feature maps --> a region proposal network (RPN) processes the resulting feature maps --> the regions of interest are projected onto the convolutional feature maps --> the corresponding region of the feature map is resized to a fixed size using bilinear interpolation and processed by two fully-connected layers to give a vector of dimension D for each region.

Given a dataset of images and ground-truth regions of interest, the region detector can be trained end-to-end for object detection and for dense captioning.
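The last stage of the detector (crop a region from the feature map, resize it to a fixed size with bilinear interpolation, then apply two fully-connected layers) can be sketched in NumPy as below. This is only an illustration, not the authors' code: the function names, the 7x7 default size, the ReLU activations, the corner-aligned sampling grid, and the box format (x_min, y_min, x_max, y_max in feature-map coordinates) are all assumptions.

```python
import numpy as np

def bilinear_resize(feat, out_h, out_w):
    """Resize an (H, W, C) crop to (out_h, out_w, C) with bilinear interpolation."""
    h, w, _ = feat.shape
    # Sample positions in source coordinates (corners aligned; an assumption).
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]   # vertical interpolation weights
    wx = (xs - x0)[None, :, None]   # horizontal interpolation weights
    top = feat[y0][:, x0] * (1 - wx) + feat[y0][:, x1] * wx
    bottom = feat[y1][:, x0] * (1 - wx) + feat[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def region_vector(fmap, box, params, size=7):
    """Map one region of interest to a D-dimensional vector."""
    x_min, y_min, x_max, y_max = box            # hypothetical box format
    crop = fmap[y_min:y_max + 1, x_min:x_max + 1]
    fixed = bilinear_resize(crop, size, size).reshape(-1)
    W1, b1, W2, b2 = params                     # two fully-connected layers
    hidden = np.maximum(W1 @ fixed + b1, 0.0)   # ReLU is an assumption
    return np.maximum(W2 @ hidden + b2, 0.0)
```

Every region ends up with the same vector size D regardless of its box size, which is what lets the next stage pool across regions.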

Region Pooling:

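v_p = max_i (W_pool v_i + b_pool), where the maximum is taken elementwise over all regions, W_pool and b_pool are learned parameters, and v_i is one of the vectors produced by the region detector. The result v_p is the pooled region vector.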
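A minimal NumPy sketch of this pooling step, assuming each region vector is projected by W_pool and b_pool before the elementwise maximum is taken across regions. The function name and the array shapes are illustrative assumptions.

```python
import numpy as np

def pool_regions(V, W_pool, b_pool):
    """Pool M region vectors into one vector.

    V:      (M, D) array, one row per region vector v_i
    W_pool: (P, D) learned projection matrix
    b_pool: (P,)   learned bias
    Returns a (P,) vector: the elementwise max over regions of W_pool v_i + b_pool.
    """
    projected = V @ W_pool.T + b_pool   # (M, P): one projected vector per region
    return projected.max(axis=0)        # elementwise maximum across the M regions

# Toy usage: two regions, identity projection, zero bias.
V = np.array([[1.0, 0.0],
              [0.0, 2.0]])
v_p = pool_regions(V, np.eye(2), np.zeros(2))
```

Because the maximum runs over the region axis, the output size does not depend on how many regions were detected.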

Hierarchical Recurrent Network:

Why Hierarchical?

1. It reduces the length of time over which the recurrent networks must reason.

2. Because the generated paragraphs contain several sentences, both the sentence-level and word-level RNNs need only reason over much shorter time scales, which makes learning an appropriate representation much more tractable.

Sentence RNN: takes the pooled region vector v_p as input and produces a sequence of hidden states h₁,h₂,...,h_S, one for each sentence in the paragraph. Each hidden state is used in two ways: to produce a distribution p_i that determines whether to stop generating, and to produce the topic vector t_i for the i-th sentence of the paragraph, which is the input to the word RNN.

Word RNN: the same as the LSTM component used in image captioning models.
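The control flow of the two-level network can be sketched as follows. This is a toy illustration under several assumptions: a plain tanh recurrence stands in for the sentence RNN, the topic vector is a single linear map of the hidden state, the word RNN is an arbitrary callable supplied by the caller, and the parameter names, `max_sentences`, and `stop_threshold` are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def generate_paragraph(v_p, params, word_rnn, max_sentences=6, stop_threshold=0.5):
    """Run the sentence-level loop and hand each topic vector to word_rnn."""
    h = np.zeros(params["W_h"].shape[0])
    sentences = []
    for _ in range(max_sentences):
        # Sentence RNN step: the pooled vector v_p is the input at every step.
        h = np.tanh(params["W_h"] @ h + params["W_v"] @ v_p + params["b"])
        # Stopping distribution p_i over {CONTINUE, STOP}.
        p_continue, p_stop = softmax(params["W_stop"] @ h)
        # Topic vector t_i for the i-th sentence.
        topic = params["W_topic"] @ h
        sentences.append(word_rnn(topic))
        if p_stop > stop_threshold:
            break  # sentence i was the last one
    return sentences
```

The word RNN only ever sees one topic vector at a time, so each of its sequences is one sentence long rather than one paragraph long.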

Training and Sampling:

The training loss l(x,y) for an example (x,y) is a weighted sum of two cross-entropy terms: a sentence loss l_sent on the stopping distribution p_i, and a word loss l_word on the word distribution p_ij.
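As a rough sketch, the weighted sum can be computed as below. The weights `lambda_sent` and `lambda_word`, their default values, and the convention that index 0 means CONTINUE and index 1 means STOP (with STOP as the target only for the final sentence) are assumptions made for illustration.

```python
import numpy as np

def cross_entropy(probs, target):
    """Negative log-probability assigned to the target index."""
    return -np.log(probs[target])

def paragraph_loss(stop_probs, word_probs, word_targets,
                   lambda_sent=1.0, lambda_word=1.0):
    """Weighted sum of the sentence (stopping) loss and the word loss.

    stop_probs:   list of S arrays [p_continue, p_stop], one per sentence
    word_probs:   word_probs[i][j] is the vocabulary distribution p_ij
    word_targets: word_targets[i][j] is the index of the ground-truth word
    """
    S = len(stop_probs)
    # Assumed target: STOP (1) for the last sentence, CONTINUE (0) otherwise.
    sent_loss = sum(cross_entropy(p_i, int(i == S - 1))
                    for i, p_i in enumerate(stop_probs))
    word_loss = sum(cross_entropy(p_ij, y_ij)
                    for probs_i, targets_i in zip(word_probs, word_targets)
                    for p_ij, y_ij in zip(probs_i, targets_i))
    return lambda_sent * sent_loss + lambda_word * word_loss
```

Raising `lambda_sent` relative to `lambda_word` makes errors in the predicted paragraph length more expensive than errors in individual words.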

Experiments:

Recurrent Topic-Transition GAN for Visual Paragraph Generation (ICCV 2017)
Xiaodan Liang, Zhiting Hu, Hao Zhang, Chuang Gan, Eric Xing
RTT-GAN

Towards Diverse and Natural Image Descriptions via a Conditional GAN (ICCV 2017)

Diverse and Coherent Paragraph Generation from Images (ECCV 2018)

github: https://github.com/metro-smiles/CapG_RevG_Code

The authors propose to augment paragraph generation techniques with "coherence vectors," "global topic vectors," and modeling of the inherent ambiguity of associating paragraphs with images, via a variational auto-encoder formulation.

Training for Diversity in Image Paragraph Captioning (EMNLP 2018)

github: https://github.com/lukemelas/image-paragraph-captioning
