Network Embedding Algorithms: LINE/LANE/M-NMF
Structure of This Article
- M-NMF
- LANE
- LINE
What is Network Embedding?
LINE
- [Information Network] An information network is defined as $G = (V, E)$, where $V$ is the set of vertices, each representing a data object, and $E$ is the set of edges between the vertices, each representing a relationship between two data objects. Each edge $e \in E$ is an ordered pair $e = (u, v)$ and is associated with a weight $w_{uv} > 0$, which indicates the strength of the relation. If $G$ is undirected, we have $(u, v) \equiv (v, u)$ and $w_{uv} \equiv w_{vu}$; if $G$ is directed, we have $(u, v) \not\equiv (v, u)$ and $w_{uv} \not\equiv w_{vu}$.
- [First-order Proximity] The first-order proximity in a network is the local pairwise proximity between two vertices. For each pair of vertices linked by an edge $(u, v)$, the weight on that edge, $w_{uv}$, indicates the first-order proximity between $u$ and $v$. If no edge is observed between $u$ and $v$, their first-order proximity is 0. The first-order proximity usually implies the similarity of two nodes in a real-world network.
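The two definitions above can be made concrete with a small data structure. The following is a minimal sketch, not code from the LINE authors: the class name `InformationNetwork` and its methods are hypothetical names chosen for illustration, and the toy edges are made-up data. It stores the weights $w_{uv}$ in a nested dictionary and returns 0 as the first-order proximity when no edge is observed.

```python
from collections import defaultdict


class InformationNetwork:
    """Weighted graph G = (V, E) stored as a nested adjacency dict (illustrative)."""

    def __init__(self, directed=False):
        self.directed = directed
        self.weights = defaultdict(dict)  # weights[u][v] = w_uv

    def add_edge(self, u, v, w=1.0):
        if w <= 0:
            raise ValueError("edge weights must be positive")
        self.weights[u][v] = w
        self.weights[v]  # touch v so that it is registered as a vertex
        if not self.directed:
            self.weights[v][u] = w  # undirected: w_uv equals w_vu

    def vertices(self):
        return set(self.weights)

    def first_order_proximity(self, u, v):
        # Weight of the edge (u, v), or 0 if no edge is observed.
        return self.weights.get(u, {}).get(v, 0.0)


# Toy undirected network (made-up data).
g = InformationNetwork(directed=False)
g.add_edge("a", "b", 3.0)
g.add_edge("b", "c", 1.0)

print(g.first_order_proximity("a", "b"))  # linked by an edge of weight 3.0
print(g.first_order_proximity("a", "c"))  # no edge observed between a and c
```

In the directed case the same structure keeps `weights[u][v]` and `weights[v][u]` independent, which matches the statement that $(u, v)$ and $(v, u)$ are then different edges.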
LINE with First-order Proximity: The first-order proximity refers to the local pairwise proximity between the vertices in the network. For each undirected edge $(i, j)$, the joint probability between vertex $v_i$ and $v_j$ is defined as follows:
$$p_1(v_i, v_j) = \frac{1}{1 + \exp(-u_i^{T} u_j)}$$
where $u_{i} \in R^{d}$ is the low-dimensional vector representation of vertex $v_i$. This defines a distribution $p_1(\cdot, \cdot)$ over the space $V \times V$.
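The corresponding empirical probability can be defined as $\hat{p}_1(i, j) = \frac{w_{ij}}{W}$, where $W = \sum_{(i,j) \in E} w_{ij}$. To preserve the first-order proximity we can minimize the following objective function:
$$O_1 = d\left(\hat{p}_1(\cdot, \cdot),\, p_1(\cdot, \cdot)\right)$$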
where $d(\cdot, \cdot)$ is the distance between two distributions. We choose to minimize the KL-divergence of two probability distributions. Replacing $d(\cdot, \cdot)$ with KL-divergence and omitting some constants, we have:
$$O_1 = -\sum_{(i,j) \in E} w_{ij} \log p_1(v_i, v_j)$$
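To make the first-order objective concrete, here is a minimal NumPy sketch under stated assumptions: the function names `joint_prob` and `first_order_objective`, the toy edge list, and the random initial embeddings are all illustrative and not taken from the LINE implementation. It evaluates the sigmoid joint probability, the empirical probability $w_{ij} / W$, and the weighted negative log-likelihood, then checks numerically that the KL-divergence over the observed edges differs from that objective divided by $W$ only by a term that does not depend on the embeddings. This is what "omitting some constants" refers to.

```python
import numpy as np


def joint_prob(U, i, j):
    """p_1(v_i, v_j) = 1 / (1 + exp(-u_i^T u_j)) for arrays of edge endpoints."""
    scores = np.sum(U[i] * U[j], axis=1)  # row-wise inner products u_i^T u_j
    return 1.0 / (1.0 + np.exp(-scores))


def first_order_objective(U, i, j, w):
    """O_1 = -sum over edges of w_ij * log p_1(v_i, v_j)."""
    return -np.sum(w * np.log(joint_prob(U, i, j)))


# Toy undirected network with 4 vertices and 3 weighted edges (made-up data).
src = np.array([0, 1, 2])
dst = np.array([1, 2, 3])
w = np.array([3.0, 1.0, 2.0])

rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(4, 2))  # one d=2 embedding per vertex

p_hat = w / w.sum()              # empirical probability w_ij / W
p = joint_prob(U, src, dst)      # model probability for each edge
o1 = first_order_objective(U, src, dst, w)

# KL(p_hat || p_1) over the observed edges equals O_1 / W plus a term that
# depends only on the weights, so both are minimized by the same embeddings.
kl = np.sum(p_hat * np.log(p_hat / p))
const = np.sum(p_hat * np.log(p_hat))
assert np.isclose(kl, const + o1 / w.sum())
```

The sketch only evaluates the objective; an optimizer that updates `U` to minimize it is outside its scope.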
- [Second-order Proximity] The second-order proximity between a pair of vertices $(u, v)$ in a network is the similarity between their neighborhood network structures. Mathematically, let