
Spark MLlib (1): SVM

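SVM (Support Vector Machine) is a common discriminative method. In machine learning it is a supervised learning model, typically used for pattern recognition, classification, and regression analysis. The example that follows is the one given on the Spark website; the original page is http://spark.apache.org/docs/latest/mllib-linear-methods.html#classification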
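As background for that example, the linked MLlib page describes the linear SVM as a linear model trained by minimizing a regularized hinge loss. The formulation can be sketched as follows. The symbols are not defined in this post and follow that page's notation as best recalled: w is the weight vector, x_i a feature vector, y_i the label written as -1 or +1, n the number of training examples, R(w) the regularizer, and λ the regularization parameter.

```latex
\min_{w} \; f(w) = \lambda\, R(w) + \frac{1}{n} \sum_{i=1}^{n} L(w; x_i, y_i),
\qquad
L(w; x, y) = \max\{0,\; 1 - y\, w^{T} x\},
\qquad y \in \{-1, +1\}
```

Note that the formula writes the negative label as -1, while the LIBSVM sample file loaded in the code uses the labels 0 and 1.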

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.classification.{SVMModel, SVMWithSGD}
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.mllib.util.MLUtils

object spark_svm {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local").setAppName("testTansformition")
    val sc = new SparkContext(sparkConf)

    // Load the training data, which is in LIBSVM format.
    val data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_libsvm_data.txt")

    // Split the data into a training set (60%) and a test set (40%).
    val splits = data.randomSplit(Array(0.6, 0.4), seed = 11L)
    val training = splits(0).cache()
    val test = splits(1)

    // Train the model.
    val numIterations = 100
    val model = SVMWithSGD.train(training, numIterations)

    // Clear the default threshold so that predict returns raw scores
    // instead of 0/1 labels.
    model.clearThreshold()

    // Compute a raw score for every point in the test set.
    val scoreAndLabels = test.map { point =>
      val score = model.predict(point.features)
      (score, point.label)
    }

    // Compute the evaluation metric.
    val metrics = new BinaryClassificationMetrics(scoreAndLabels)
    val auROC = metrics.areaUnderROC()
    println(s"Area under ROC = $auROC")

    // Example of saving and loading the model.
    model.save(sc, "target/tmp/scalaSVMWithSGDModel")
    val sameModel = SVMModel.load(sc, "target/tmp/scalaSVMWithSGDModel")

    // Sleep for 15 minutes (30 * 30 * 1000 ms) before the program exits.
    Thread.sleep(30 * 30 * 1000)
  }
}
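The clearThreshold() call matters for the evaluation step: with the default threshold in place, predict returns a 0/1 label, and after clearing it predict returns the raw score, which is what the ROC computation needs. The following is a minimal sketch of that behavior in plain Scala with no Spark dependency. LinearModel, its fields, and the weight values are hypothetical stand-ins invented for illustration; they are not Spark classes and not the output of SVMWithSGD.

```scala
object ThresholdSketch {
  // Hypothetical stand-in for a trained linear model (not a Spark class).
  // threshold = Some(t) means "return a 0/1 label"; None means "return the raw score".
  final case class LinearModel(
      weights: Array[Double],
      intercept: Double,
      threshold: Option[Double]) {

    // Raw margin: dot product of weights and features, plus the intercept.
    def margin(x: Array[Double]): Double = {
      require(x.length == weights.length, "feature and weight dimensions differ")
      weights.zip(x).map { case (w, xi) => w * xi }.sum + intercept
    }

    // With a threshold: 1.0 if the margin exceeds it, else 0.0.
    // Without a threshold: the margin itself.
    def predict(x: Array[Double]): Double = threshold match {
      case Some(t) => if (margin(x) > t) 1.0 else 0.0
      case None    => margin(x)
    }
  }

  def main(args: Array[String]): Unit = {
    // Made-up weights and intercept, chosen only to keep the arithmetic easy.
    val withThreshold = LinearModel(Array(0.5, -1.0), 0.25, Some(0.0))
    val cleared = withThreshold.copy(threshold = None) // analogous to clearThreshold()

    val x = Array(2.0, 0.5)
    println(withThreshold.predict(x)) // 1.0  (margin 0.75 is above the threshold 0.0)
    println(cleared.predict(x))       // 0.75 (the raw margin)
  }
}
```

If class labels rather than scores are wanted from the real model, for example to compute accuracy, the threshold can be set again with model.setThreshold before calling predict.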