ALINK (37): Model Evaluation (2): Regression Evaluation (EvalRegressionBatchOp)

Java class name: com.alibaba.alink.operator.batch.evaluation.EvalRegressionBatchOp

Python class name: EvalRegressionBatchOp

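Overview

Regression evaluation measures how well the predictions of a regression algorithm match the true labels. The metrics below are supported. In the formulas, y_i is the label of sample i, f_i is its predicted value, \bar{y} is the mean of the labels, and N is the number of samples.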

SST: total sum of squares (Sum of Squares for Total)

SST=\sum_{i=1}^{N}(y_i-\bar{y})^2

SSE: error sum of squares (Sum of Squares for Error)

SSE=\sum_{i=1}^{N}(y_i-f_i)^2

SSR: regression sum of squares (Sum of Squares for Regression)

SSR=\sum_{i=1}^{N}(f_i-\bar{y})^2
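
As a quick check of the three sums of squares, the following minimal sketch evaluates them on the six (prediction, label) pairs used in this article's code example. It is plain Python with illustrative variable names, not Alink's own implementation.

```python
# Sample data from this article's code example: label y_i and prediction f_i.
labels = [0, 8, 2, 10, 1, 7]
preds = [0, 8, 1, 9, 3, 10]

n = len(labels)
y_mean = sum(labels) / n  # \bar{y}, the mean of the labels

sst = sum((y - y_mean) ** 2 for y in labels)            # total sum of squares
sse = sum((y - f) ** 2 for y, f in zip(labels, preds))  # error sum of squares
ssr = sum((f - y_mean) ** 2 for f in preds)             # regression sum of squares

print("SST:", sst)
print("SSE:", sse)
print("SSR:", ssr)
```

On this data SSE comes out to 15, which agrees with the SSE reported in the output section. Note that the identity SST = SSE + SSR is only guaranteed for a least-squares fit with an intercept; for arbitrary predictions such as these, the two sides generally differ.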

R^2: coefficient of determination

R^2=1-\dfrac{SSE}{SST}

R: multiple correlation coefficient

R=\sqrt{R^2}
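
The two formulas above can be sketched in plain Python on the sample data from this article's code example. The handling of a negative R^2 (returning NaN for R) is an assumption made for this sketch, not documented Alink behavior.

```python
import math

# Sample data from this article's code example.
labels = [0, 8, 2, 10, 1, 7]
preds = [0, 8, 1, 9, 3, 10]

y_mean = sum(labels) / len(labels)
sst = sum((y - y_mean) ** 2 for y in labels)
sse = sum((y - f) ** 2 for y, f in zip(labels, preds))

# Coefficient of determination: 1 - SSE / SST.
r2 = 1 - sse / sst

# R^2 is negative when SSE > SST; the square root is then undefined,
# so this sketch falls back to NaN (an assumption, not Alink's behavior).
r = math.sqrt(r2) if r2 >= 0 else float("nan")

print("R2:", r2)
print("R:", r)
```

The R2 value agrees with the one reported in the output section (about 0.8282).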

MSE: mean squared error

MSE=\dfrac{1}{N}\sum_{i=1}^{N}(f_i-y_i)^2

RMSE: root mean squared error

RMSE=\sqrt{MSE}
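
A minimal plain-Python sketch of MSE and RMSE, again using the sample data from this article's code example (variable names are illustrative):

```python
import math

# Sample data from this article's code example.
labels = [0, 8, 2, 10, 1, 7]
preds = [0, 8, 1, 9, 3, 10]

# Mean of the squared prediction errors.
mse = sum((f - y) ** 2 for f, y in zip(preds, labels)) / len(labels)

# RMSE is the square root of MSE, so it is in the same units as the label.
rmse = math.sqrt(mse)

print("MSE:", mse)
print("RMSE:", rmse)
```

The RMSE agrees with the value reported in the output section (about 1.5811).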

SAE/SAD: sum of absolute errors (Sum of Absolute Error/Difference)

SAE=\sum_{i=1}^{N}|f_i-y_i|

MAE/MAD: mean absolute error (Mean Absolute Error/Difference)

MAE=\dfrac{1}{N}\sum_{i=1}^{N}|f_i-y_i|

MAPE: mean absolute percentage error

MAPE=\dfrac{100}{N}\sum_{i=1}^{N}\left|\dfrac{f_i-y_i}{y_i}\right|
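
The three absolute-error metrics can be sketched as follows. The data here is hypothetical and chosen so that every label is nonzero: MAPE divides by y_i, so it is undefined for a zero label, and the sample data in this article's code example contains one.

```python
# Hypothetical data with nonzero labels (not from the article's example).
labels = [2, 4, 5, 10]
preds = [1, 5, 5, 8]
n = len(labels)

# Sum of absolute errors.
sae = sum(abs(f - y) for f, y in zip(preds, labels))

# Mean absolute error.
mae = sae / n

# Mean absolute percentage error; requires every label to be nonzero.
mape = 100 / n * sum(abs((f - y) / y) for f, y in zip(preds, labels))

print("SAE:", sae)
print("MAE:", mae)
print("MAPE:", mape)
```

Because MAPE is relative to the label, the same absolute error counts for more on a small label than on a large one.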

count: number of rows

explained variance

explained Variance=\dfrac{SSR}{N}
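
Following the definition above (SSR divided by N), a minimal plain-Python sketch on the sample data from this article's code example looks like this. Some other libraries use the name "explained variance" for a different, normalized quantity, so the number below should only be compared against this formula.

```python
# Sample data from this article's code example.
labels = [0, 8, 2, 10, 1, 7]
preds = [0, 8, 1, 9, 3, 10]

n = len(labels)
y_mean = sum(labels) / n

# Regression sum of squares: spread of the predictions around the label mean.
ssr = sum((f - y_mean) ** 2 for f in preds)

# Explained variance as defined in this article: SSR / N.
explained_variance = ssr / n

print("explained variance:", explained_variance)
```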

Parameters

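| Name | Chinese name | Description | Type | Required? | Default |
| --- | --- | --- | --- | --- | --- |
| labelCol | 標籤列名 | Name of the label column in the input table | String | | |
| predictionCol | 預測結果列名 | Name of the prediction result column | String | | |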

Code Examples

Python code

from pyalink.alink import *
import pandas as pd

# Run Alink in a local environment with parallelism 1.
useLocalEnv(1)

# Each row is (prediction, label).
df = pd.DataFrame([
    [0, 0],
    [8, 8],
    [1, 2],
    [9, 10],
    [3, 1],
    [10, 7]
])

# Wrap the DataFrame as a batch source with an explicit schema.
inOp = BatchOperator.fromDataframe(df, schemaStr='pred int, label int')

# Evaluate the predictions against the labels and collect the metrics.
metrics = EvalRegressionBatchOp().setPredictionCol("pred").setLabelCol("label").linkFrom(inOp).collectMetrics()

print("Total Samples Number:", metrics.getCount())
print("SSE:", metrics.getSse())
print("SAE:", metrics.getSae())
print("RMSE:", metrics.getRmse())
print("R2:", metrics.getR2())

Java code

import org.apache.flink.types.Row;
import com.alibaba.alink.operator.batch.BatchOperator;
import com.alibaba.alink.operator.batch.evaluation.EvalRegressionBatchOp;
import com.alibaba.alink.operator.batch.source.MemSourceBatchOp;
import com.alibaba.alink.operator.common.evaluation.RegressionMetrics;
import org.junit.Test;
import java.util.Arrays;
import java.util.List;
public class EvalRegressionBatchOpTest {
  @Test
  public void testEvalRegressionBatchOp() throws Exception {
    List <Row> df = Arrays.asList(
      Row.of(0, 0),
      Row.of(8, 8),
      Row.of(1, 2),
      Row.of(9, 10),
      Row.of(3, 1),
      Row.of(10, 7)
    );
    BatchOperator <?> inOp = new MemSourceBatchOp(df, "pred int, label int");
    RegressionMetrics metrics = new EvalRegressionBatchOp().setPredictionCol("pred").setLabelCol("label").linkFrom(
      inOp).collectMetrics();
    System.out.println("Total Samples Number: " + metrics.getCount());
    System.out.println("SSE: " + metrics.getSse());
    System.out.println("SAE: " + metrics.getSae());
    System.out.println("RMSE: " + metrics.getRmse());
    System.out.println("R2: " + metrics.getR2());
  }
}

Output

Total Samples Number: 6.0
SSE: 15.0
SAE: 7.0
RMSE: 1.5811388300841898
R2: 0.8282442748091603