
Neural Factorization Machines for Sparse Predictive Analytics


He X. and Chua T. Neural factorization machines for sparse predictive analytics. In International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2017.

The paper introduces a Bi-Interaction layer to model second-order feature interactions, and then extracts high-order information through an MLP. Is the difference from DeepFM essentially parallel versus serial composition of the two parts?

Main content

  1. Start from the sparse feature vector \(\bm{x}\);
  2. Obtain the embedded representation via an embedding layer:
\[\mathcal{V}_x = \{x_1 \bm{v}_1, x_2 \bm{v}_2, \cdots, x_n \bm{v}_n\}; \]
  3. Obtain the interaction features via the Bi-Interaction layer:
\[f_{BI}(\mathcal{V}_x) = \sum_{i=1}^n \sum_{j = i + 1}^n x_i \bm{v}_i \odot x_j \bm{v}_j, \]

where \(\odot\) denotes element-wise (Hadamard) multiplication;
  4. Extract high-order information through an MLP:

\[\bm{z}_1 = \sigma_1(W_1 f_{BI}(\mathcal{V}_x) + \bm{b}_1), \\ \bm{z}_2 = \sigma_2(W_2 \bm{z}_1 + \bm{b}_2), \\ \vdots \\ \bm{z}_L = \sigma_L(W_L \bm{z}_{L-1} + \bm{b}_L). \\ \]
  5. The NFM prediction is then:
\[\hat{y}_{NFM}(\bm{x}) = w_0 + \bm{w}^T\bm{x} + \bm{h}^T \bm{z}_L. \]
  6. For score prediction (regression), the model can be trained with
\[L_{reg} = \sum_{\bm{x} \in \mathcal{X}} (\hat{y}(\bm{x}) - y(\bm{x}))^2 \]

for classification, the log loss can be used instead.
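The steps above can be sketched in NumPy. Note that the Bi-Interaction pooling admits an efficient reformulation, \(f_{BI} = \frac{1}{2}\big[(\sum_i x_i \bm{v}_i)^2 - \sum_i (x_i \bm{v}_i)^2\big]\) (squares taken element-wise), which costs \(O(nk)\) instead of \(O(n^2 k)\). This is a minimal illustrative sketch, not the official implementation; the function names and shapes are my own assumptions.

```python
import numpy as np

def bi_interaction(Vx):
    """Bi-Interaction pooling: sum_{i<j} (x_i v_i) ⊙ (x_j v_j).
    Vx: (n, k) array whose rows are x_i * v_i.
    Uses the identity 0.5 * ((Σ a_i)^2 - Σ a_i^2), element-wise."""
    sum_sq = np.sum(Vx, axis=0) ** 2   # (Σ x_i v_i)^2
    sq_sum = np.sum(Vx ** 2, axis=0)   # Σ (x_i v_i)^2
    return 0.5 * (sum_sq - sq_sum)     # shape (k,)

def bi_interaction_naive(Vx):
    """Direct double loop over pairs i < j, for checking the identity."""
    n, k = Vx.shape
    out = np.zeros(k)
    for i in range(n):
        for j in range(i + 1, n):
            out += Vx[i] * Vx[j]
    return out

def nfm_forward(x, V, w0, w, Ws, bs, h, act=np.tanh):
    """NFM prediction: ŷ = w0 + wᵀx + hᵀ z_L.
    x: (n,) sparse features; V: (n, k) embedding table;
    Ws, bs: MLP weights/biases; h: (k_L,) output projection.
    (Hypothetical signature for illustration only.)"""
    Vx = x[:, None] * V          # rows x_i v_i
    z = bi_interaction(Vx)       # Bi-Interaction layer
    for W, b in zip(Ws, bs):     # MLP layers z_l = σ(W_l z_{l-1} + b_l)
        z = act(W @ z + b)
    return w0 + w @ x + h @ z
```

A quick check that the efficient pooling matches the pairwise definition: `np.allclose(bi_interaction(Vx), bi_interaction_naive(Vx))` holds for any `Vx`.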

Code

[official]
[PyTorch]
[TensorFlow]