
[ECCV 2018 Notes] Learning Class Prototypes via Structure Alignment for Zero-Shot Recognition

Abstract

Zero-shot learning (ZSL) aims to recognize objects of novel classes without any training samples of specific classes, which is achieved by exploiting the semantic information and auxiliary datasets. Recently most ZSL approaches focus on learning visual-semantic embeddings to transfer knowledge from the auxiliary datasets to the novel classes. However, few works study whether the semantic information is discriminative or not for the recognition task. To tackle such problem, we propose a coupled dictionary learning approach to align the visual-semantic structures using the class prototypes, where the discriminative information lying in the visual space is utilized to improve the less discriminative semantic space. Then, zero-shot recognition can be performed in different spaces by the simple nearest neighbor approach using the learned class prototypes. Extensive experiments on four benchmark datasets show the effectiveness of the proposed approach.

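The abstract, restated: ZSL recognizes objects from novel classes that have no training samples of their own, relying on semantic information and auxiliary datasets. Most recent ZSL methods learn a visual-semantic embedding so that knowledge can be transferred from the auxiliary datasets to the novel classes, but few ask whether the semantic information is itself discriminative for the task. The paper addresses this with a coupled dictionary learning method that uses class prototypes to align the visual and semantic structures, so that the discriminative information in the visual space strengthens the less discriminative semantic space. Zero-shot recognition is then a simple nearest-neighbor search over the learned class prototypes, and it can be carried out in different spaces. Experiments on four benchmark datasets support the method's effectiveness.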
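The final step described in the abstract, nearest-neighbor search over class prototypes, can be sketched in a few lines. This is only an illustration of that recognition step, not the paper's coupled dictionary learning: the prototypes, class names, and test features below are made-up 2-D values, and Euclidean distance is an assumed choice of metric.

```python
import numpy as np

def nearest_prototype(features, prototypes, class_names):
    """Assign each feature vector to the class with the closest prototype."""
    # Broadcast to pairwise differences of shape (n_samples, n_classes, dim).
    diffs = features[:, None, :] - prototypes[None, :, :]
    # Squared Euclidean distance (an assumed metric), shape (n_samples, n_classes).
    dists = np.sum(diffs ** 2, axis=2)
    nearest = np.argmin(dists, axis=1)
    return [class_names[i] for i in nearest]

# Hypothetical prototypes for two unseen classes. In the paper these would be
# learned; here they are hard-coded purely for illustration.
prototypes = np.array([[0.0, 0.0], [4.0, 4.0]])
class_names = ["zebra", "whale"]

# Hypothetical test features, assumed to live in the same space as the prototypes.
test_features = np.array([[0.5, -0.2], [3.6, 4.1], [1.0, 1.5]])

predictions = nearest_prototype(test_features, prototypes, class_names)
print(predictions)  # ['zebra', 'whale', 'zebra']
```

Because classification reduces to a distance comparison, the quality of the result depends entirely on how well separated the prototypes are, which is why the paper focuses on making them more discriminative.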

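Question 1: What is the visual-semantic structure, and how is it aligned?

Question 2: Is the semantic information discriminative, and how can that be measured? If it is discriminative, what does that gain for the ZSL task? Is it interpretable?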