Multilayer Perceptron (MLP): Algorithm Principles and Spark MLlib Usage Examples (Scala/Java/Python)

阿新 • Published: 2018-12-30

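Multilayer Perceptron

Algorithm overview:

The multilayer perceptron (MLP) is a classifier based on a feedforward artificial neural network. It consists of multiple layers of nodes, and every layer is fully connected to the next layer in the network. Nodes in the input layer represent the input data. Nodes in every other layer produce that layer's output by linearly combining their inputs with the node weights w and bias b and then applying an activation function. The model is learned by backpropagation, using the logistic loss function and L-BFGS as the optimization routine. A multilayer perceptron classifier with K+1 layers can be written in matrix form as follows: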

$y(x) = {f_K}(...{f_2}(w_2^T{f_1}(w_1^Tx + {b_1}) + {b_2})... + {b_K})$

Nodes in the intermediate layers use the sigmoid (logistic) function:

$f({z_i}) = \frac{1}{{1 + {e^{ - {z_i}}}}}$

Nodes in the output layer use the softmax function:

$f({z_i}) = \frac{{{e^{{z_i}}}}}{{\sum\limits_{k = 1}^N {{e^{{z_k}}}} }}$

Here N, the number of nodes in the output layer, equals the number of classes.
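The formulas above can be sketched in plain Python. This is only an illustration of the forward pass and the logistic loss for a single sample, not Spark's implementation; the helper names (`affine`, `forward`, `log_loss`) and the toy weights are assumptions made for this example.

```python
import math


def sigmoid(z):
    """Element-wise sigmoid, used by the intermediate layers."""
    return [1.0 / (1.0 + math.exp(-v)) for v in z]


def softmax(z):
    """Softmax over the output nodes; subtracting the max avoids overflow."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def affine(w, b, x):
    """Compute w^T x + b, where w has one row per input and one column per output."""
    return [sum(w[i][j] * x[i] for i in range(len(x))) + b[j]
            for j in range(len(b))]


def forward(x, weights, biases):
    """Apply sigmoid after every layer except the last, which uses softmax."""
    a = x
    last = len(weights) - 1
    for k, (w, b) in enumerate(zip(weights, biases)):
        z = affine(w, b, a)
        a = softmax(z) if k == last else sigmoid(z)
    return a


def log_loss(probs, label):
    """Logistic (cross-entropy) loss of one sample whose true class is `label`."""
    return -math.log(probs[label])


# Hypothetical toy network: 2 inputs -> 2 hidden nodes -> 3 classes.
weights = [
    [[0.1, -0.2], [0.4, 0.3]],             # w_1: 2 inputs x 2 hidden nodes
    [[0.5, -0.1, 0.2], [-0.3, 0.8, 0.1]],  # w_2: 2 hidden nodes x 3 classes
]
biases = [[0.0, 0.1], [0.0, 0.0, 0.0]]

probs = forward([1.0, 2.0], weights, biases)
print(probs)               # three class probabilities that sum to 1
print(log_loss(probs, 2))  # loss if the true class is 2
```

Training would then adjust the weights and biases to reduce this loss over all samples; the sketch stops at evaluating the network.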

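Parameters:

featuresCol:

Type: string.

Meaning: name of the features column.

labelCol:

Type: string.

Meaning: name of the label column.

layers:

Type: integer array.

Meaning: layer sizes, from the input layer size to the output layer size.

maxIter:

Type: integer.

Meaning: maximum number of iterations (>= 0).

predictionCol:

Type: string.

Meaning: name of the prediction column.

seed:

Type: long.

Meaning: random seed.

stepSize:

Type: double.

Meaning: step size used for each optimization iteration.

tol:

Type: double.

Meaning: convergence tolerance of the iterative algorithm.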
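To make the `layers` parameter concrete, the sketch below counts the trainable values (weights plus biases) that a fully connected network with given layer sizes has. The function name `count_parameters` is an assumption made for this illustration and is not part of the Spark API.

```python
def count_parameters(layers):
    """Count weights and biases of a fully connected network.

    Each pair of adjacent layers contributes n_in * n_out weights
    plus n_out biases, i.e. (n_in + 1) * n_out values.
    """
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layers, layers[1:]))


# Layer sizes used in the examples of this article:
# 4 input features, hidden layers of 5 and 4 nodes, 3 output classes.
print(count_parameters([4, 5, 4, 3]))  # (4+1)*5 + (5+1)*4 + (4+1)*3 = 64
```

The first entry of `layers` has to match the number of features and the last entry the number of classes, so only the middle entries are free design choices.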

Examples (each snippet assumes an existing SparkSession named spark):

Scala:

import org.apache.spark.ml.classification.MultilayerPerceptronClassifier
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator

// Load the data stored in LIBSVM format as a DataFrame.
val data = spark.read.format("libsvm")
  .load("data/mllib/sample_multiclass_classification_data.txt")
// Split the data into train and test
val splits = data.randomSplit(Array(0.6, 0.4), seed = 1234L)
val train = splits(0)
val test = splits(1)
// specify layers for the neural network:
// input layer of size 4 (features), two intermediate of size 5 and 4
// and output of size 3 (classes)
val layers = Array[Int](4, 5, 4, 3)
// create the trainer and set its parameters
val trainer = new MultilayerPerceptronClassifier()
  .setLayers(layers)
  .setBlockSize(128)
  .setSeed(1234L)
  .setMaxIter(100)
// train the model
val model = trainer.fit(train)
// compute accuracy on the test set
val result = model.transform(test)
val predictionAndLabels = result.select("prediction", "label")
val evaluator = new MulticlassClassificationEvaluator()
  .setMetricName("accuracy")
println("Accuracy: " + evaluator.evaluate(predictionAndLabels))

Java:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel;
import org.apache.spark.ml.classification.MultilayerPerceptronClassifier;
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator;

// Load training data
String path = "data/mllib/sample_multiclass_classification_data.txt";
Dataset<Row> dataFrame = spark.read().format("libsvm").load(path);
// Split the data into train and test
Dataset<Row>[] splits = dataFrame.randomSplit(new double[]{0.6, 0.4}, 1234L);
Dataset<Row> train = splits[0];
Dataset<Row> test = splits[1];
// specify layers for the neural network:
// input layer of size 4 (features), two intermediate of size 5 and 4
// and output of size 3 (classes)
int[] layers = new int[] {4, 5, 4, 3};
// create the trainer and set its parameters
MultilayerPerceptronClassifier trainer = new MultilayerPerceptronClassifier()
  .setLayers(layers)
  .setBlockSize(128)
  .setSeed(1234L)
  .setMaxIter(100);
// train the model
MultilayerPerceptronClassificationModel model = trainer.fit(train);
// compute accuracy on the test set
Dataset<Row> result = model.transform(test);
Dataset<Row> predictionAndLabels = result.select("prediction", "label");
MulticlassClassificationEvaluator evaluator = new MulticlassClassificationEvaluator()
  .setMetricName("accuracy");
System.out.println("Accuracy = " + evaluator.evaluate(predictionAndLabels));

Python:

from pyspark.ml.classification import MultilayerPerceptronClassifier
from pyspark.ml.evaluation import MulticlassClassificationEvaluator

# Load training data
data = spark.read.format("libsvm")\
    .load("data/mllib/sample_multiclass_classification_data.txt")
# Split the data into train and test
splits = data.randomSplit([0.6, 0.4], 1234)
train = splits[0]
test = splits[1]
# specify layers for the neural network:
# input layer of size 4 (features), two intermediate of size 5 and 4
# and output of size 3 (classes)
layers = [4, 5, 4, 3]
# create the trainer and set its parameters
trainer = MultilayerPerceptronClassifier(maxIter=100, layers=layers, blockSize=128, seed=1234)
# train the model
model = trainer.fit(train)
# compute accuracy on the test set
result = model.transform(test)
predictionAndLabels = result.select("prediction", "label")
evaluator = MulticlassClassificationEvaluator(metricName="accuracy")
print("Accuracy: " + str(evaluator.evaluate(predictionAndLabels)))
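All three examples report the "accuracy" metric on the test set. As a rough sketch of what that number means, the plain-Python function below computes the fraction of rows whose prediction equals the label. The function name `accuracy` and the sample values are made up for illustration; this is not how the Spark evaluator is implemented.

```python
def accuracy(predictions, labels):
    """Return the fraction of predictions that match their labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical predictions and labels for four test rows.
print(accuracy([0.0, 1.0, 2.0, 1.0], [0.0, 1.0, 1.0, 1.0]))  # 0.75
```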
