VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text

阿新 • Published: 2021-07-22

2021-07-22 08:54:20

Paper: https://arxiv.org/pdf/2104.11178.pdf

1. Background and Motivation:

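This paper attempts to learn feature representations for three modalities (video, audio and text) with a single shared backbone, built on the Transformer framework and trained in a self-supervised manner. The authors argue that supervised learning has the following two problems:

  1). It cannot make full use of the massive amount of unlabeled data available;

  2). For many CV tasks, labeled data is very difficult to obtain.

The paper therefore approaches the problem from the unsupervised-learning side and proposes the VATT model.

As shown in the figure above, the authors go one step further and have all three modalities share the same backbone network. Experiments show that this modality-agnostic backbone achieves results similar to those of separate, modality-specific backbones.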
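To make the shared-backbone idea concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The per-modality linear projections, the token feature sizes, the width `D`, the single-head attention, the ReLU feed-forward layer and the mean pooling are all assumptions chosen for illustration. The only point it shows is that one set of Transformer weights (`shared`) is reused for video, audio and text tokens.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # shared embedding width (hypothetical value)

# Hypothetical per-modality token feature sizes. Each modality gets its own
# linear projection into the shared width D (an assumption of this sketch).
TOKEN_DIMS = {"video": 48, "audio": 32, "text": 24}
projections = {m: rng.normal(0.0, 0.02, (d, D)) for m, d in TOKEN_DIMS.items()}

# A single set of backbone weights, reused for every modality.
shared = {
    "wq": rng.normal(0.0, 0.02, (D, D)),
    "wk": rng.normal(0.0, 0.02, (D, D)),
    "wv": rng.normal(0.0, 0.02, (D, D)),
    "wo": rng.normal(0.0, 0.02, (D, D)),
    "w1": rng.normal(0.0, 0.02, (D, 4 * D)),
    "w2": rng.normal(0.0, 0.02, (4 * D, D)),
}

def layer_norm(x, eps=1e-6):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def shared_backbone(tokens, w=shared):
    """One pre-norm, single-head Transformer block; tokens has shape (n, D)."""
    h = layer_norm(tokens)
    q, k, v = h @ w["wq"], h @ w["wk"], h @ w["wv"]
    attn = softmax(q @ k.T / np.sqrt(D))           # (n, n) attention weights
    x = tokens + (attn @ v) @ w["wo"]              # residual around attention
    h = layer_norm(x)
    return x + np.maximum(h @ w["w1"], 0.0) @ w["w2"]  # residual around ReLU MLP

def encode(modality, raw_tokens):
    """Project modality tokens to width D, run the shared block, mean-pool."""
    tokens = raw_tokens @ projections[modality]
    return shared_backbone(tokens).mean(axis=0)

# Random stand-ins for tokenized inputs with different lengths and widths.
features = {
    m: encode(m, rng.normal(size=(n, TOKEN_DIMS[m])))
    for m, n in {"video": 8, "audio": 6, "text": 4}.items()
}
```

Every modality ends up as a vector of the same width `D`, produced by the same backbone weights. Only the thin input projections differ per modality, which is what makes the backbone modality-agnostic in this sketch.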

Stay Hungry, Stay Foolish ...
