
Spark GraphX: Connected Components


Connected components

1. What is a connected component

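A connected component is a subgraph in which any two vertices are connected to each other by an edge or a sequence of edges. Its vertices are a subset of the original graph's vertex set, and its edges are a subset of the original graph's edge set. It is also maximal: no further vertex of the original graph can be added to it while keeping it connected.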
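To make the definition concrete, here is a minimal sketch in plain Scala (no Spark) that groups vertices into connected components with a breadth-first search. The object name `ComponentsSketch` and the helper `components` are made up for this illustration. The sample data reuses the vertex IDs and edges from the example in section 3, and edge direction is ignored.

```scala
import scala.collection.mutable

object ComponentsSketch {
  // Hypothetical helper: returns the connected components of an undirected graph
  // as a set of vertex sets.
  def components(vertices: Seq[Long], edges: Seq[(Long, Long)]): Set[Set[Long]] = {
    // Build an undirected adjacency map; isolated vertices get an empty neighbour set.
    val adj = mutable.Map.empty[Long, mutable.Set[Long]]
    vertices.foreach(v => adj.getOrElseUpdate(v, mutable.Set.empty[Long]))
    edges.foreach { case (a, b) =>
      adj.getOrElseUpdate(a, mutable.Set.empty[Long]) += b
      adj.getOrElseUpdate(b, mutable.Set.empty[Long]) += a
    }

    val seen = mutable.Set.empty[Long]
    val result = mutable.ListBuffer.empty[Set[Long]]
    for (start <- adj.keys.toSeq.sorted if !seen(start)) {
      // Breadth-first search collects every vertex reachable from `start`.
      val queue = mutable.Queue(start)
      val component = mutable.Set(start)
      seen += start
      while (queue.nonEmpty) {
        val v = queue.dequeue()
        for (n <- adj(v) if !seen(n)) {
          seen += n
          component += n
          queue.enqueue(n)
        }
      }
      result += component.toSet
    }
    result.toSet
  }

  def main(args: Array[String]): Unit = {
    val vertices = Seq(1L, 2L, 3L, 4L, 5L, 6L, 7L)
    val edges = Seq((2L, 1L), (2L, 4L), (3L, 2L), (3L, 6L),
                    (4L, 1L), (5L, 2L), (5L, 3L), (5L, 6L))
    // One line per component, members in ascending order.
    components(vertices, edges).foreach(c => println(c.toSeq.sorted.mkString(", ")))
  }
}
```

With this data, vertices 1 to 6 end up in one component and vertex 7, which has no edges, forms a component of its own.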

2. How to compute connected components

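class Graph[VD, ED] {
  // Returns a graph with the same edges; each vertex attribute is replaced by
  // the lowest vertex ID in that vertex's connected component.
  def connectedComponents(): Graph[VertexId, ED]
}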
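The `VertexId` attribute in the returned graph is a component label: the lowest vertex ID in the component. The following plain-Scala sketch shows the idea behind such labels by repeatedly pushing the smaller label across every edge until nothing changes. It is only a sequential illustration under that assumption, not GraphX's distributed implementation, and `MinLabelSketch` and `minLabels` are names invented here.

```scala
object MinLabelSketch {
  // Hypothetical helper, not part of GraphX: maps each vertex ID to the smallest
  // vertex ID in its connected component. Every edge endpoint must be in `vertices`.
  def minLabels(vertices: Seq[Long], edges: Seq[(Long, Long)]): Map[Long, Long] = {
    // Every vertex starts out labelled with its own ID.
    var labels: Map[Long, Long] = vertices.map(v => v -> v).toMap
    var changed = true
    while (changed) {
      changed = false
      // For each edge, both endpoints adopt the smaller of their two labels.
      for ((a, b) <- edges) {
        val m = math.min(labels(a), labels(b))
        if (labels(a) != m || labels(b) != m) {
          labels = labels + (a -> m) + (b -> m)
          changed = true
        }
      }
    }
    labels
  }

  def main(args: Array[String]): Unit = {
    val vertices = Seq(1L, 2L, 3L, 4L, 5L, 6L, 7L)
    val edges = Seq((2L, 1L), (2L, 4L), (3L, 2L), (3L, 6L),
                    (4L, 1L), (5L, 2L), (5L, 3L), (5L, 6L))
    // Prints (vertexId, label) pairs in ascending vertex order.
    minLabels(vertices, edges).toSeq.sorted.foreach(println)
  }
}
```

For the sample data, vertices 1 to 6 all receive label 1, while the isolated vertex 7 keeps its own ID as its label.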

3. Example

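The graph in this example has seven vertices. Vertex 7 has no edges, so it is unrelated to every other vertex and does not appear when the triplets of the result are printed. connectedComponents() itself does not delete any vertex: vertex 7 is still present in the result's vertices, as a component of its own.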

package cn.kgc.spark.graphx

import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object Demo10_ConnectCompents {
  def main(args: Array[String]): Unit = {
    // Create the SparkSession
    val spark: SparkSession = SparkSession.builder()
      .appName(this.getClass.getName)
      .master("local[4]")
      .getOrCreate()

    // Get the SparkContext
    val sc: SparkContext = spark.sparkContext

    // Vertices: (id, (name, Int attribute)); vertex 7 has no edges
    val users: RDD[(VertexId, (String, Int))] = sc.parallelize(Array(
      (1L, ("Alice", 28)),
      (2L, ("Bob", 27)),
      (3L, ("Charlie", 65)),
      (4L, ("David", 42)),
      (5L, ("Ed", 55)),
      (6L, ("Fran", 50)),
      (7L, ("zhsang", 41))
    ))

    // Edges: Edge(srcId, dstId, attribute)
    val cntCall: RDD[Edge[Int]] = sc.parallelize(Array(
      Edge(2L, 1L, 7),
      Edge(2L, 4L, 2),
      Edge(3L, 2L, 4),
      Edge(3L, 6L, 3),
      Edge(4L, 1L, 1),
      Edge(5L, 2L, 2),
      Edge(5L, 3L, 8),
      Edge(5L, 6L, 3)
    ))

    val graph: Graph[(String, Int), Int] = Graph(users, cntCall)

    // Call the API to compute connected components. Each vertex attribute in the
    // result is the lowest vertex ID in its component. Vertex 7 has no edges, so
    // it does not appear in any triplet.
    graph.connectedComponents().triplets.foreach(println)
  }
}

// Output (line order may vary between runs); each line is
// ((srcId, componentId), (dstId, componentId), edgeAttribute)
// ((3,1),(2,1),4)
// ((5,1),(3,1),8)
// ((2,1),(1,1),7)
// ((2,1),(4,1),2)
// ((4,1),(1,1),1)
// ((5,1),(6,1),3)
// ((3,1),(6,1),3)
// ((5,1),(2,1),2)
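Because triplets only cover vertices that have at least one edge, the isolated vertex 7 is easier to see through the result's `vertices`. The sketch below groups vertex IDs by component ID. The object name `ComponentMembership` and its helpers `exampleGraph` and `members` are made up for this illustration, and it assumes Spark with GraphX on the classpath, running in local mode.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object ComponentMembership {
  // Rebuilds the example graph, keeping only the name as the vertex attribute.
  def exampleGraph(sc: SparkContext): Graph[String, Int] = {
    val users: RDD[(VertexId, String)] = sc.parallelize(Seq(
      (1L, "Alice"), (2L, "Bob"), (3L, "Charlie"), (4L, "David"),
      (5L, "Ed"), (6L, "Fran"), (7L, "zhsang")))
    val calls: RDD[Edge[Int]] = sc.parallelize(Seq(
      Edge(2L, 1L, 7), Edge(2L, 4L, 2), Edge(3L, 2L, 4), Edge(3L, 6L, 3),
      Edge(4L, 1L, 1), Edge(5L, 2L, 2), Edge(5L, 3L, 8), Edge(5L, 6L, 3)))
    Graph(users, calls)
  }

  // Hypothetical helper: componentId -> sorted member vertex IDs.
  def members(graph: Graph[String, Int]): Map[VertexId, List[VertexId]] = {
    // vertices of the result are (vertexId, componentId) pairs.
    graph.connectedComponents().vertices
      .map { case (vertexId, componentId) => (componentId, vertexId) }
      .groupByKey()
      .mapValues(_.toList.sorted)
      .collect()
      .toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ComponentMembership")
      .master("local[2]")
      .getOrCreate()

    members(exampleGraph(spark.sparkContext)).toSeq.sortBy(_._1).foreach {
      case (componentId, ids) => println(s"component $componentId: ${ids.mkString(", ")}")
    }

    spark.stop()
  }
}
```

For this graph the grouping should show component 1 with vertices 1 to 6 and component 7 with only vertex 7, which confirms that the isolated vertex is kept rather than removed.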