Spark MLlib Models (1): Support Vector Machine
阿新 • Published: 2017-12-08
Contents
Support Vector Machine Principles
Support Vector Machine Code (Spark Python)
Support Vector Machine Principles
To be continued...
Support Vector Machine Code (Spark Python)
Data used in the code: https://pan.baidu.com/s/1jHWKG4I (password: acq1)
# -*- coding: utf-8 -*-
from pyspark import SparkConf, SparkContext
from pyspark.mllib.classification import SVMWithSGD, SVMModel
from pyspark.mllib.regression import LabeledPoint

sc = SparkContext('local')

# Load and parse the data: convert every value to a float.
# The first value on each line is the label; the rest are the features.
def parsePoint(line):
    values = [float(x) for x in line.split(' ')]
    return LabeledPoint(values[0], values[1:])

data = sc.textFile("data/mllib/sample_svm_data.txt")
print(data.collect()[0])
# 1 0 2.52078447201548 0 0 0 2.004684436494304 2.00034729926846.....

parsedData = data.map(parsePoint)
print(parsedData.collect()[0])
# (1.0,[0.0,2.52078447202,0.0,0.0,0.0,2.00468....

# Build the model
model = SVMWithSGD.train(parsedData, iterations=100)

# Evaluate the model's error on the training data
labelsAndPreds = parsedData.map(lambda p: (p.label, model.predict(p.features)))
trainErr = labelsAndPreds.filter(lambda lp: lp[0] != lp[1]).count() / float(parsedData.count())
print("Training Error = " + str(trainErr))

# Save and load the model
model.save(sc, "pythonSVMWithSGDModel")
sameModel = SVMModel.load(sc, "pythonSVMWithSGDModel")
print(sameModel.predict(parsedData.collect()[0].features))
# 1
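The parsing step and the training-error calculation in the script can be tried without a Spark installation. The following is only a rough pure-Python sketch of those two steps. The helper names parse_line and training_error, the sample line, and the (label, prediction) pairs are all made up for illustration and are not part of the original script or of the MLlib API.

```python
# Pure-Python sketch of two steps from the Spark script; no Spark required.
# parse_line and training_error are hypothetical helper names.

def parse_line(line):
    """Split a space-separated line into (label, features).

    Mirrors parsePoint: the first value is the label,
    the remaining values are the features.
    """
    values = [float(x) for x in line.split(' ')]
    return values[0], values[1:]


def training_error(labels_and_preds):
    """Return the fraction of (label, prediction) pairs that disagree."""
    pairs = list(labels_and_preds)
    wrong = sum(1 for label, pred in pairs if label != pred)
    return wrong / float(len(pairs))


# Made-up line in the same layout as sample_svm_data.txt.
label, features = parse_line("1 0 2.5 0 0 0 2.0")
print(label, features)

# Made-up (label, prediction) pairs: one mismatch out of four.
pairs = [(1.0, 1), (0.0, 0), (1.0, 0), (0.0, 0)]
print("Training Error = " + str(training_error(pairs)))
```

In the Spark version the same logic runs over an RDD: filter keeps the mismatched pairs and count divides by the total number of records, so the result is the misclassification rate on the training set rather than an estimate of error on unseen data.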