VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
2021-07-22 08:54:20
Paper: https://arxiv.org/pdf/2104.11178.pdf
1. Background and Motivation:
This paper attempts to learn feature representations for three modalities with a single shared backbone, using the transformer framework and training in a self-supervised manner. The authors argue that supervised learning suffers from two problems:
1) it cannot fully exploit the massive amounts of unlabeled data;
2) for many computer vision tasks, obtaining labeled data is very difficult.
The paper therefore takes a self-supervised approach and proposes the VATT model.
As shown in the figure above, even more boldly, the authors have all three modalities share one and the same backbone network. Experiments show that this modality-agnostic backbone achieves results comparable to separate modality-specific backbones.
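The shared-backbone idea above can be sketched in a few lines. This is a hypothetical, heavily simplified illustration (not the authors' implementation): each modality has its own tokenizer that projects raw patches into a common d-dimensional token space, after which a single backbone with one set of weights processes tokens from every modality. The patch sizes, dimensions, and the `shared_backbone` stand-in for the transformer are all illustrative assumptions.

```python
# Hypothetical sketch of VATT's modality-agnostic backbone idea.
# Only the per-modality tokenizers differ; the backbone weights are shared.
import random

D = 8  # common token dimension (illustrative choice)

def linear(x, w):
    """Apply a weight matrix w (rows of len(x)) to vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def make_weights(rows, cols, seed):
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

# Modality-specific projections: raw inputs of different sizes -> D-dim tokens
W_video = make_weights(D, 48, seed=0)   # e.g. a flattened 4x4x3 video patch
W_audio = make_weights(D, 16, seed=1)   # e.g. a 16-sample waveform segment
W_text  = make_weights(D, 32, seed=2)   # e.g. a 32-dim word embedding

# One shared backbone (a single linear layer standing in for the transformer)
W_backbone = make_weights(D, D, seed=3)

def shared_backbone(tokens):
    """Process a token sequence with the same weights for every modality."""
    return [linear(t, W_backbone) for t in tokens]

# Tokenize each modality, then run all of them through the identical backbone
video_tokens = [linear([0.5] * 48, W_video)]
audio_tokens = [linear([0.1] * 16, W_audio)]
text_tokens  = [linear([1.0] * 32, W_text)]

video_out = shared_backbone(video_tokens)
audio_out = shared_backbone(audio_tokens)
text_out  = shared_backbone(text_tokens)
```

The key point the sketch makes is that after tokenization, nothing downstream knows which modality the tokens came from: `shared_backbone` is literally the same function with the same parameters for video, audio, and text.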