Paper Notes: Divide and Conquer: A Deep CASA Approach to Talker-Independent Monaural Speaker Separation

阿新 • Published: 2022-04-17

Divide and Conquer: A Deep CASA Approach to Talker-Independent Monaural Speaker Separation

Introduction

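Inspired by computational auditory scene analysis (CASA), the paper proposes a deep CASA approach for separating two speakers. Talker-independent separation has to solve the permutation problem: the network's outputs carry no fixed order, so it is ambiguous which output should be matched to which speaker during training. The two main approaches to this problem are permutation invariant training (PIT) and deep clustering (DC).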
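To make the permutation problem concrete, here is a minimal sketch of a PIT-style loss for two speakers. It is not the paper's implementation: the function name `pit_loss`, the use of mean squared error, and the random toy spectrograms are all assumptions made for illustration.

```python
import itertools
import numpy as np

def pit_loss(estimates, targets):
    """Try every output-to-speaker permutation over the whole utterance
    and keep the one with the lowest mean squared error (assumed loss)."""
    n_spk = targets.shape[0]
    best_loss, best_perm = np.inf, None
    for perm in itertools.permutations(range(n_spk)):
        # estimates[perm[i]] is compared against targets[i]
        loss = np.mean((estimates[list(perm)] - targets) ** 2)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm

rng = np.random.default_rng(0)
targets = rng.standard_normal((2, 100, 129))  # toy data: (speaker, frame, frequency)
# Simulate a network whose two outputs come out in the opposite order.
estimates = targets[::-1] + 0.01 * rng.standard_normal(targets.shape)
loss, perm = pit_loss(estimates, targets)
print(perm, loss)
```

A fixed output order would give a large error here even though the separation is nearly perfect; searching over permutations removes that ambiguity.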

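In the simultaneous grouping stage, the proposed method uses a UNet convolutional neural network (CNN) with densely connected layers (Dense-UNet) to improve frame-level separation. To overcome the effect of noisy phase in the inverse STFT, the authors explore new training objectives defined on the complex STFT and in the time domain. In the sequential grouping stage, a temporal convolutional network (TCN) is used to improve performance; it does well at speaker tracking.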

Deep CASA

Simultaneous Grouping Stage

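This stage separates the spectrum of each frame into two speakers, producing an STFT estimate for the c-th speaker (c = 1, 2). Training follows the tPIT (frame-level PIT) criterion. The Dense-UNet outputs estimated T-F masks for the different speakers; multiplying the mixture spectrum by each mask yields the separated speakers:

![image-20220406101124843](Divide and Conquer A Deep CASA Approach to Talker-Independent Monaural Speaker Separation.assets/image-20220406101124843.png)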
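The masking step and the frame-level assignment can be sketched as follows. This is a toy illustration under simplifying assumptions: it works on magnitudes only (the paper trains on complex STFT and time-domain objectives), uses a squared-error criterion, and the name `tpit_assign` and all data are hypothetical.

```python
import numpy as np

def tpit_assign(est, ref):
    """est, ref: (2, T, F) spectrograms. For every frame, pick whichever
    pairing of the two outputs with the two speakers has the lower error."""
    keep = np.sum((est - ref) ** 2, axis=(0, 2))        # (T,) error with outputs as-is
    swap = np.sum((est[::-1] - ref) ** 2, axis=(0, 2))  # (T,) error with outputs swapped
    swapped = swap < keep                               # per-frame assignment
    organized = np.where(swapped[None, :, None], est[::-1], est)
    loss = np.mean(np.minimum(keep, swap))
    return organized, swapped, loss

rng = np.random.default_rng(0)
T, F = 6, 4
ref = rng.random((2, T, F))                  # toy speaker magnitudes
mixture = ref.sum(axis=0)                    # toy mixture (phase ignored)
masks = ref / mixture                        # stand-in for the network's T-F masks
masks[:, ::2] = masks[::-1, ::2].copy()      # outputs swapped on even frames
est = masks * mixture                        # masking: mixture times each mask
organized, swapped, loss = tpit_assign(est, ref)
print(swapped)
```

The per-frame flags in `swapped` are the kind of assignment that the sequential grouping stage later has to predict without access to the reference signals.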

Sequential Grouping Stage

The goal of this stage is to track all the frame-level spectral estimates and assign them to the different speakers.

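The mixture spectrum and the estimated speaker spectra are fed together into the network. After training, the network maps each frame-level input to a D-dimensional embedding vector V(t). The target label A(t) represents the output assignment chosen by tPIT. The training objective for this stage is then:

![image-20220406102010224](Divide and Conquer A Deep CASA Approach to Talker-Independent Monaural Speaker Separation.assets/image-20220406102010224.png)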
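One loss with the behaviour these notes describe is a deep-clustering-style affinity loss, ||V V^T - A A^T||_F^2. Whether this matches the paper's formula exactly is an assumption (any per-frame weighting, for instance, is left out), and the name `affinity_loss` and the toy embeddings below are hypothetical.

```python
import numpy as np

def affinity_loss(V, A):
    """V: (T, D) embeddings, A: (T, 2) one-hot assignment labels.
    Squared Frobenius norm of V V^T - A A^T (assumed form of the objective)."""
    diff = V @ V.T - A @ A.T
    return float(np.sum(diff ** 2))

labels = np.array([0, 0, 1, 1])
A = np.eye(2)[labels]                         # one-hot labels, shape (4, 2)

V_good = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
V_bad = np.tile([1.0, 0.0], (4, 1))           # every frame gets the same embedding

good = affinity_loss(V_good, A)
bad = affinity_loss(V_bad, A)
print(good, bad)
```

Embeddings that agree with the labels give a zero loss, while collapsing all frames onto one vector is penalized for every pair of frames with different assignments.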

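Training with this objective pulls together the V(t) that correspond to the same assignment and pushes apart those that correspond to different assignments. At inference time, the K-means algorithm therefore clusters the V(t) to produce a binary label for each frame, which is used to organize the frame-level outputs of the Simultaneous Grouping Stage.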
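The inference step can be sketched with a small two-cluster K-means written in NumPy. This is an illustrative sketch, not the authors' code: the helper names `two_means` and `regroup`, the deterministic initialization, and the synthetic embeddings are all assumptions.

```python
import numpy as np

def two_means(V, n_iter=20):
    """Cluster (T, D) embeddings into two groups; returns labels in {0, 1}."""
    c0 = V[0]                                           # first centroid: frame 0
    c1 = V[np.argmax(np.sum((V - c0) ** 2, axis=1))]    # second: farthest frame
    centroids = np.stack([c0, c1])
    for _ in range(n_iter):
        dists = np.sum((V[:, None, :] - centroids[None]) ** 2, axis=2)  # (T, 2)
        labels = np.argmin(dists, axis=1)
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = V[labels == k].mean(axis=0)
    return labels

def regroup(frame_outputs, labels):
    """frame_outputs: (2, T, F). Swap the two outputs in frames labelled 1."""
    swap = labels.astype(bool)[None, :, None]
    return np.where(swap, frame_outputs[::-1], frame_outputs)

rng = np.random.default_rng(0)
true_swap = np.array([0, 1, 1, 0, 1, 0])                # hidden per-frame assignment
V = np.where(true_swap[:, None] == 1, [0.0, 1.0], [1.0, 0.0])
V = V + 0.05 * rng.standard_normal(V.shape)             # noisy toy embeddings
ref = rng.random((2, 6, 4))                             # toy speaker spectra
frame_outputs = np.where(true_swap.astype(bool)[None, :, None], ref[::-1], ref)

labels = two_means(V)
streams = regroup(frame_outputs, labels)
print(labels)
```

Cluster indices are arbitrary, so in general the two output streams may come out globally exchanged; that is acceptable for talker-independent separation, which only needs each stream to stay with one speaker. Here frame 0 seeds cluster 0, so the labels line up with `true_swap`.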
