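ALINK (37): Model Evaluation (2), Regression Evaluation (EvalRegressionBatchOp)
阿新 • Published: 2021-06-19
Java class name: com.alibaba.alink.operator.batch.evaluation.EvalRegressionBatchOp
Python class name: EvalRegressionBatchOp
Overview
Regression evaluation measures how well a regression algorithm's predictions match the true labels. It supports the metrics listed below, where y_i is the label of sample i, f_i is its predicted value, \bar{y} is the mean of the labels, and N is the number of samples.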
SST: Sum of Squares for Total
SST=\sum_{i=1}^{N}(y_i-\bar{y})^2
SSE: Sum of Squares for Error
SSE=\sum_{i=1}^{N}(y_i-f_i)^2
SSR: Sum of Squares for Regression
SSR=\sum_{i=1}^{N}(f_i-\bar{y})^2
R^2: Coefficient of Determination
R^2=1-\dfrac{SSE}{SST}
R: Multiple Correlation Coefficient
R=\sqrt{R^2}
MSE: Mean Squared Error
MSE=\dfrac{1}{N}\sum_{i=1}^{N}(f_i-y_i)^2
RMSE: Root Mean Squared Error
RMSE=\sqrt{MSE}
SAE/SAD: Sum of Absolute Error/Difference
SAE=\sum_{i=1}^{N}|f_i-y_i|
MAE/MAD: Mean Absolute Error/Difference
MAE=\dfrac{1}{N}\sum_{i=1}^{N}|f_i-y_i|
MAPE: Mean Absolute Percentage Error
MAPE=\dfrac{100}{N}\sum_{i=1}^{N}\left|\dfrac{f_i-y_i}{y_i}\right|
count: number of rows
Explained variance
ExplainedVariance=\dfrac{SSR}{N}
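The formulas above can be checked by hand with a few lines of plain Python. The following is a minimal sketch, not Alink code: the function name `regression_metrics` and the dictionary keys are hypothetical, chosen only for illustration. Returning NaN when MAPE or R^2 is undefined (a zero label, or SST equal to zero) is a choice made for this sketch and says nothing about how Alink handles those cases.

```python
import math


def regression_metrics(labels, preds):
    """Compute the regression metrics defined above from two equal-length lists.

    Hypothetical helper for illustration; not part of Alink.
    """
    if not labels or len(labels) != len(preds):
        raise ValueError("labels and preds must be non-empty and of equal length")

    n = len(labels)
    y_bar = sum(labels) / n  # mean of the labels
    pairs = list(zip(labels, preds))

    sst = sum((y - y_bar) ** 2 for y in labels)
    sse = sum((y - f) ** 2 for y, f in pairs)
    ssr = sum((f - y_bar) ** 2 for f in preds)
    sae = sum(abs(f - y) for y, f in pairs)
    mse = sse / n

    # R^2 is undefined when every label is identical (SST == 0).
    r2 = 1 - sse / sst if sst != 0 else float("nan")

    # MAPE divides by each label, so it is undefined if any label is zero.
    # Returning NaN here is an assumption of this sketch.
    if all(y != 0 for y in labels):
        mape = 100 / n * sum(abs((f - y) / y) for y, f in pairs)
    else:
        mape = float("nan")

    return {
        "count": n,
        "SST": sst,
        "SSE": sse,
        "SSR": ssr,
        "R2": r2,
        "R": math.sqrt(r2) if r2 >= 0 else float("nan"),
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "SAE": sae,
        "MAE": sae / n,
        "MAPE": mape,
        "ExplainedVariance": ssr / n,
    }


# Usage with a small made-up data set (labels first, predictions second).
metrics = regression_metrics(labels=[2.0, 4.0, 6.0], preds=[2.5, 3.5, 6.0])
for name, value in metrics.items():
    print(name, value)
```

Keeping every metric in one pass over the same `pairs` list makes it easy to compare each line against its formula; for large data sets the operator itself is the appropriate tool.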
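Parameters
| Name | Display name | Description | Type | Required? | Default |
| --- | --- | --- | --- | --- | --- |
| labelCol | Label column name | Name of the label column in the input table | String | ✓ | |
| predictionCol | Prediction column name | Name of the prediction result column | String | ✓ | |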
Code Examples
Python code
from pyalink.alink import *
import pandas as pd

useLocalEnv(1)

# Each row is (pred, label).
df = pd.DataFrame([
    [0, 0],
    [8, 8],
    [1, 2],
    [9, 10],
    [3, 1],
    [10, 7]
])

inOp = BatchOperator.fromDataframe(df, schemaStr='pred int, label int')

metrics = EvalRegressionBatchOp().setPredictionCol("pred").setLabelCol("label").linkFrom(inOp).collectMetrics()

print("Total Samples Number:", metrics.getCount())
print("SSE:", metrics.getSse())
print("SAE:", metrics.getSae())
print("RMSE:", metrics.getRmse())
print("R2:", metrics.getR2())
Java code
import org.apache.flink.types.Row;

import com.alibaba.alink.operator.batch.BatchOperator;
import com.alibaba.alink.operator.batch.evaluation.EvalRegressionBatchOp;
import com.alibaba.alink.operator.batch.source.MemSourceBatchOp;
import com.alibaba.alink.operator.common.evaluation.RegressionMetrics;
import org.junit.Test;

import java.util.Arrays;
import java.util.List;

public class EvalRegressionBatchOpTest {

    @Test
    public void testEvalRegressionBatchOp() throws Exception {
        // Each row is (pred, label).
        List<Row> df = Arrays.asList(
            Row.of(0, 0),
            Row.of(8, 8),
            Row.of(1, 2),
            Row.of(9, 10),
            Row.of(3, 1),
            Row.of(10, 7)
        );

        BatchOperator<?> inOp = new MemSourceBatchOp(df, "pred int, label int");

        RegressionMetrics metrics = new EvalRegressionBatchOp()
            .setPredictionCol("pred")
            .setLabelCol("label")
            .linkFrom(inOp)
            .collectMetrics();

        System.out.println("Total Samples Number:" + metrics.getCount());
        System.out.println("SSE:" + metrics.getSse());
        System.out.println("SAE:" + metrics.getSae());
        System.out.println("RMSE:" + metrics.getRmse());
        System.out.println("R2:" + metrics.getR2());
    }
}
Output
Total Samples Number: 6.0
SSE: 15.0
SAE: 7.0
RMSE: 1.5811388300841898
R2: 0.8282442748091603
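As a sanity check, the printed values can be reproduced by hand from the metric definitions and the six (pred, label) rows of the example. The derivation below is a worked illustration only; it uses the identity SST = \sum y_i^2 - N\bar{y}^2, which is equivalent to the SST definition in this article.

```latex
\begin{align*}
(f_i - y_i)_{i=1}^{6} &= (0,\ 0,\ -1,\ -1,\ 2,\ 3) \\
SSE &= 0 + 0 + 1 + 1 + 4 + 9 = 15 \\
SAE &= 0 + 0 + 1 + 1 + 2 + 3 = 7 \\
MSE &= \dfrac{15}{6} = 2.5, \qquad RMSE = \sqrt{2.5} \approx 1.5811 \\
\bar{y} &= \dfrac{0 + 8 + 2 + 10 + 1 + 7}{6} = \dfrac{14}{3} \\
SST &= \sum_{i=1}^{6} y_i^2 - 6\,\bar{y}^2 = 218 - \dfrac{392}{3} = \dfrac{262}{3} \\
R^2 &= 1 - \dfrac{15}{262/3} = \dfrac{217}{262} \approx 0.8282
\end{align*}
```

Note that the first row has label y_1 = 0, so the MAPE term for that row divides by zero; MAPE is not well defined for this particular data set.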