[Paper Notes] Mutual Information Neural Estimation

阿新 • Published: 2021-10-02

Mutual Information Neural Estimation

Definition of mutual information:

\(I(X;Z) = \int_{\mathcal{X} \times \mathcal{Z}} \log\frac{d\mathbb{P}_{XZ}}{d\mathbb{P}_{X} \otimes \mathbb{P}_{Z}} \, d\mathbb{P}_{XZ}\)

The CPC paper defines it with the following discrete formula, which is easier to read; the two definitions are equivalent:

\[I(x;z) = \sum_{x,z}p(x,z) \log \frac{p(x,z)}{p(x)p(z)} \]

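The larger the mutual information, the stronger the dependence between the two variables; the smaller it is, the closer the two random variables are to being independent. It is zero exactly when they are independent.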
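As a small illustration of the discrete formula, the sketch below computes mutual information (in nats) for two hand-made 2x2 joint tables. The function name `mutual_information_discrete` and both tables are illustrative assumptions, not taken from the paper.

```python
import math


def mutual_information_discrete(joint):
    """Mutual information in nats for a table joint[i][j] = p(x_i, z_j)."""
    px = [sum(row) for row in joint]          # marginal p(x)
    pz = [sum(col) for col in zip(*joint)]    # marginal p(z)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxz in enumerate(row):
            if pxz > 0:                       # 0 * log 0 is taken as 0
                mi += pxz * math.log(pxz / (px[i] * pz[j]))
    return mi


# Hypothetical example tables (assumptions for illustration only).
independent = [[0.25, 0.25], [0.25, 0.25]]    # p(x, z) = p(x) p(z)
dependent = [[0.5, 0.0], [0.0, 0.5]]          # z is fully determined by x

mi_independent = mutual_information_discrete(independent)
mi_dependent = mutual_information_discrete(dependent)
print(mi_independent, mi_dependent)
```

The independent table gives zero, while the fully dependent one gives log 2, the entropy of a fair binary variable.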

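Dual representation of the KL divergence (the Donsker-Varadhan representation):

\[D_{K L}(\mathbb{P} \| \mathbb{Q}) = \sup _{T: \Omega \rightarrow \mathbb{R}} \mathbb{E}_{\mathbb{P}}[T]-\log \left(\mathbb{E}_{\mathbb{Q}}\left[e^{T}\right]\right) \]

Here the supremum runs over all functions T for which both expectations are finite. Restricting T to a function family \(\mathcal{F}\) can only lower the supremum, so from the relationship between the KL divergence and its dual we obtain: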

\[D_{K L}(\mathbb{P} \| \mathbb{Q}) \geq \sup _{T \in \mathcal{F}} \mathbb{E}_{\mathbb{P}}[T]-\log \left(\mathbb{E}_{\mathbb{Q}}\left[e^{T}\right]\right) \]
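The inequality can be checked numerically on a small discrete example. In the sketch below, the distributions `p` and `q`, the critic values, and the helper names `dv_bound` and `kl` are all assumptions chosen for illustration. An arbitrary critic T should give a value no larger than the KL divergence, and the critic T = log(p/q) should attain it.

```python
import math


def dv_bound(p, q, t):
    """E_P[T] - log E_Q[exp(T)] for discrete p, q and critic values t."""
    e_p = sum(pi * ti for pi, ti in zip(p, t))
    e_q = sum(qi * math.exp(ti) for qi, ti in zip(q, t))
    return e_p - math.log(e_q)


def kl(p, q):
    """KL divergence D_KL(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical distributions on three outcomes (illustration only).
p = [0.7, 0.2, 0.1]
q = [0.2, 0.3, 0.5]

kl_pq = kl(p, q)
arbitrary_t = [1.0, 0.0, -1.0]                                # any critic
optimal_t = [math.log(pi / qi) for pi, qi in zip(p, q)]       # log-density ratio

bound_arbitrary = dv_bound(p, q, arbitrary_t)
bound_optimal = dv_bound(p, q, optimal_t)
print(kl_pq, bound_arbitrary, bound_optimal)
```

This is why maximizing the right-hand side over a flexible enough family of critics tightens the bound toward the true divergence.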

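Mutual information is itself a KL divergence, \(I(X;Z) = D_{KL}(\mathbb{P}_{XZ} \| \mathbb{P}_{X} \otimes \mathbb{P}_{Z})\), so the inequality above yields a lower bound on mutual information that can be optimized: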

\[I(X ; Z) \geq I_{\Theta}(X, Z) \]\[I_{\Theta}(X, Z)=\sup _{\theta \in \Theta} \mathbb{E}_{\mathbb{P}_{X Z}}\left[T_{\theta}\right]-\log \left(\mathbb{E}_{\mathbb{P}_{X} \otimes \mathbb{P}_{Z}}\left[e^{T_{\theta}}\right]\right) \]

Optimization algorithm:

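In general, z is given a Gaussian distribution, so the marginal distributions of x and z are both easy to sample from;

For the joint distribution, a neural network F(x,z) is used to model it, and its output is treated as a sample from the joint distribution.

Then plug the samples into the formula above to compute the bound.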

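import torch
import torch.nn as nn
import torch.nn.functional as F


class Mine(nn.Module):
    """Statistics network T_theta: maps a batch of concatenated (x, z) samples to scalar scores."""

    def __init__(self, input_size=2, hidden_size=100):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, hidden_size)
        self.fc3 = nn.Linear(hidden_size, 1)

    def forward(self, input):
        output = F.elu(self.fc1(input))
        output = F.elu(self.fc2(output))
        output = self.fc3(output)
        return output


def mutual_information(joint, marginal, mine_net):
    # joint: samples from P_XZ; marginal: samples from P_X (x) P_Z
    t = mine_net(joint)
    et = torch.exp(mine_net(marginal))
    # Lower bound from the formula above: E_P[T] - log(E_Q[e^T])
    mi_lb = torch.mean(t) - torch.log(torch.mean(et))
    return mi_lb, t, et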
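To see what the bound evaluates to without training a network, the following sketch plugs samples into the same formula using only the standard library. It assumes a toy setting that is not in the original post: a standard bivariate Gaussian with correlation `rho`, whose exact mutual information and optimal critic are known in closed form. The names `sample_batches`, `optimal_critic`, and `dv_estimate` are hypothetical, and shuffling z within the batch to approximate the product of marginals is a common practice assumed here.

```python
import math
import random


def sample_batches(n, rho, rng):
    """Return (joint, marginal) lists of (x, z) pairs for a correlated Gaussian pair."""
    xs, zs = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        xs.append(x)
        zs.append(z)
    joint = list(zip(xs, zs))        # paired samples, drawn from P_XZ
    shuffled = zs[:]
    rng.shuffle(shuffled)            # break the pairing: approx. P_X (x) P_Z
    marginal = list(zip(xs, shuffled))
    return joint, marginal


def optimal_critic(x, z, rho):
    """log p(x, z) / (p(x) p(z)) for a standard bivariate Gaussian."""
    var = 1.0 - rho ** 2
    return (-0.5 * math.log(var)
            - (x * x - 2.0 * rho * x * z + z * z) / (2.0 * var)
            + 0.5 * (x * x + z * z))


def dv_estimate(joint, marginal, critic):
    """Sample version of E_P[T] - log E_Q[exp(T)]."""
    mean_t = sum(critic(x, z) for x, z in joint) / len(joint)
    mean_et = sum(math.exp(critic(x, z)) for x, z in marginal) / len(marginal)
    return mean_t - math.log(mean_et)


rho = 0.8                            # assumed correlation for the toy example
rng = random.Random(0)
joint, marginal = sample_batches(20000, rho, rng)
estimate = dv_estimate(joint, marginal, lambda x, z: optimal_critic(x, z, rho))
true_mi = -0.5 * math.log(1.0 - rho ** 2)   # closed form for Gaussians
print(f"estimate={estimate:.3f}  true={true_mi:.3f}")
```

In MINE the closed-form critic is not available, so a network such as `Mine` takes its place and its parameters are updated by gradient ascent on the same quantity.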
