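Java class name: com.alibaba.alink.operator.batch.evaluation.EvalBinaryClassBatchOp

Python class name: EvalBinaryClassBatchOp

Overview

Binary classification evaluation measures how well the predictions of a binary classifier match the true labels.

It supports plotting the ROC curve, the lift chart, the K-S curve, and the Recall-Precision curve.

Stream jobs support both cumulative and windowed statistics. In addition to the four curves above, they report how AUC, Kappa, Accuracy, and LogLoss change over time.

The overall evaluation metrics include AUC, K-S, and PRC, as well as Precision, Recall, F-Measure, Sensitivity, Accuracy, Specificity, and Kappa at different thresholds.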

Confusion matrix

[Figure: confusion matrix]
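As a rough illustration of how a confusion matrix feeds the threshold-dependent metrics, the sketch below counts TP, FP, TN, and FN at one threshold and derives Precision, Recall, Specificity, Accuracy, F-Measure, and Kappa from them. It is plain Python, not Alink's implementation. The helper names (`confusion_at`, `ratio`, `threshold_metrics`), the `score >= threshold` decision rule, and returning 0.0 for undefined ratios are assumptions made for this example. The data mirrors the sample used in the code examples of this document.

```python
from collections import namedtuple

Confusion = namedtuple("Confusion", ["tp", "fp", "tn", "fn"])


def confusion_at(labels, scores, positive, threshold=0.5):
    """Count TP/FP/TN/FN, predicting positive when score >= threshold (assumed rule)."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if label == positive:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return Confusion(tp, fp, tn, fn)


def ratio(numerator, denominator):
    """Divide, returning 0.0 for a zero denominator (a convention chosen for this sketch)."""
    return numerator / denominator if denominator else 0.0


def threshold_metrics(c):
    """Derive the usual binary metrics from one confusion matrix."""
    total = c.tp + c.fp + c.tn + c.fn
    precision = ratio(c.tp, c.tp + c.fp)
    recall = ratio(c.tp, c.tp + c.fn)        # also called sensitivity or TPR
    specificity = ratio(c.tn, c.tn + c.fp)   # equals 1 - FPR
    accuracy = ratio(c.tp + c.tn, total)
    f_measure = ratio(2 * precision * recall, precision + recall)
    # Cohen's kappa: observed agreement (accuracy) corrected for chance agreement.
    chance = ratio(
        (c.tp + c.fp) * (c.tp + c.fn) + (c.tn + c.fn) * (c.tn + c.fp),
        total * total,
    )
    kappa = ratio(accuracy - chance, 1 - chance)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_measure": f_measure,
        "kappa": kappa,
    }


# True labels and the predicted probability of the positive label "prefix1".
labels = ["prefix1", "prefix1", "prefix1", "prefix0", "prefix0"]
scores = [0.9, 0.8, 0.7, 0.75, 0.6]

matrix = confusion_at(labels, scores, positive="prefix1", threshold=0.5)
metrics = threshold_metrics(matrix)
print(matrix)
print(metrics)
```

At threshold 0.5 every sample is predicted positive, so the matrix has 3 true positives and 2 false positives. Raising the threshold trades false positives for false negatives, which is what the curves in the following sections trace out.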

ROC curve

X-axis: FPR (false positive rate)

Y-axis: TPR (true positive rate)

AUC

The area under the ROC curve.
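The area under the ROC curve can be approximated with the trapezoid rule over the (FPR, TPR) points obtained by sweeping the threshold from high to low. The following is a minimal pure-Python sketch of that idea, not Alink's implementation. `roc_points` and `trapezoid_area` are hypothetical helper names, and the data mirrors the sample in this document's code examples. On that sample the sketch gives 5/6, in line with the AUC shown in the output at the end of this document.

```python
def roc_points(labels, scores, positive):
    """Return (FPR, TPR) points, lowering the threshold one distinct score at a time.

    Assumes both classes occur at least once, otherwise FPR or TPR is undefined.
    """
    n_pos = sum(1 for label in labels if label == positive)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == positive:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_area(points):
    """Integrate y over x with the trapezoid rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# True labels and the predicted probability of the positive label "prefix1".
labels = ["prefix1", "prefix1", "prefix1", "prefix0", "prefix0"]
scores = [0.9, 0.8, 0.7, 0.75, 0.6]

roc = roc_points(labels, scores, positive="prefix1")
auc = trapezoid_area(roc)
print(roc)
print("AUC:", auc)
```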

K-S

X-axis: threshold

Y-axis: TPR and FPR

KS

The maximum vertical gap between the TPR and FPR curves on the K-S plot.
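The KS value can be sketched as the largest gap between TPR and FPR over all candidate thresholds. The snippet below is a plain-Python illustration under that definition, not Alink's implementation. `ks_statistic` is a hypothetical helper name, the `score >= threshold` rule is an assumption, and the data mirrors the sample in this document's code examples.

```python
def ks_statistic(labels, scores, positive):
    """Return (largest |TPR - FPR|, threshold where it occurs).

    Assumes both classes occur at least once.
    """
    n_pos = sum(1 for label in labels if label == positive)
    n_neg = len(labels) - n_pos
    best_gap, best_threshold = 0.0, None
    # Every distinct score is a candidate threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(
            1 for label, s in zip(labels, scores)
            if s >= threshold and label == positive
        )
        fp = sum(
            1 for label, s in zip(labels, scores)
            if s >= threshold and label != positive
        )
        gap = abs(tp / n_pos - fp / n_neg)
        if gap > best_gap:
            best_gap, best_threshold = gap, threshold
    return best_gap, best_threshold


# True labels and the predicted probability of the positive label "prefix1".
labels = ["prefix1", "prefix1", "prefix1", "prefix0", "prefix0"]
scores = [0.9, 0.8, 0.7, 0.75, 0.6]

ks, ks_threshold = ks_statistic(labels, scores, positive="prefix1")
print("KS:", ks, "at threshold", ks_threshold)
```

On this sample the gap peaks at threshold 0.8, where two of the three positives and none of the negatives are predicted positive, giving 2/3.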

Recall-Precision curve

X-axis: Recall

Y-axis: Precision

PRC

The area under the Recall-Precision curve.
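One way to approximate the area under the Recall-Precision curve is the trapezoid rule over (Recall, Precision) points, one per distinct threshold. The sketch below is plain Python, not Alink's implementation. `recall_precision_points` is a hypothetical helper name, and starting the curve at (0, 1) is an assumption of this example. With that starting point the sample from this document's code examples gives 65/72, about 0.9028, which agrees with the PRC value in the output at the end of this document.

```python
def recall_precision_points(labels, scores, positive):
    """Return (recall, precision) points, lowering the threshold step by step.

    Assumes at least one positive sample. The curve is started at (0, 1),
    which is a convention chosen for this sketch.
    """
    n_pos = sum(1 for label in labels if label == positive)
    points = [(0.0, 1.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(
            1 for label, s in zip(labels, scores)
            if s >= threshold and label == positive
        )
        fp = sum(
            1 for label, s in zip(labels, scores)
            if s >= threshold and label != positive
        )
        # tp + fp >= 1 here because the threshold is one of the scores.
        points.append((tp / n_pos, tp / (tp + fp)))
    return points


def trapezoid_area(points):
    """Integrate y over x with the trapezoid rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# True labels and the predicted probability of the positive label "prefix1".
labels = ["prefix1", "prefix1", "prefix1", "prefix0", "prefix0"]
scores = [0.9, 0.8, 0.7, 0.75, 0.6]

curve = recall_precision_points(labels, scores, positive="prefix1")
prc = trapezoid_area(curve)
print(curve)
print("PRC:", prc)
```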

 

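Parameters

| Name | Display name | Description | Type | Required? | Default |
| --- | --- | --- | --- | --- | --- |
| predictionDetailCol | Prediction detail column name | Name of the prediction detail column | String | Yes | |
| labelCol | Label column name | Name of the label column in the input table | String | Yes | |
| positiveLabelValueString | Positive label | String form of the positive-class label value | String | No | null |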

Code examples

Python code

from pyalink.alink import *
import pandas as pd

# Run in a local environment with parallelism 1.
useLocalEnv(1)

# Each row: the true label, then a JSON string mapping each label to its predicted probability.
df = pd.DataFrame([
    ["prefix1", "{\"prefix1\": 0.9, \"prefix0\": 0.1}"],
    ["prefix1", "{\"prefix1\": 0.8, \"prefix0\": 0.2}"],
    ["prefix1", "{\"prefix1\": 0.7, \"prefix0\": 0.3}"],
    ["prefix0", "{\"prefix1\": 0.75, \"prefix0\": 0.25}"],
    ["prefix0", "{\"prefix1\": 0.6, \"prefix0\": 0.4}"]
])
inOp = BatchOperator.fromDataframe(df, schemaStr='label string, detailInput string')

# Evaluate the predictions and collect the metrics.
metrics = EvalBinaryClassBatchOp() \
    .setLabelCol("label") \
    .setPredictionDetailCol("detailInput") \
    .linkFrom(inOp) \
    .collectMetrics()

print("AUC:", metrics.getAuc())
print("KS:", metrics.getKs())
print("PRC:", metrics.getPrc())
print("Accuracy:", metrics.getAccuracy())
print("Macro Precision:", metrics.getMacroPrecision())
print("Micro Recall:", metrics.getMicroRecall())
print("Weighted Sensitivity:", metrics.getWeightedSensitivity())

Java code

import org.apache.flink.types.Row;
import com.alibaba.alink.operator.batch.BatchOperator;
import com.alibaba.alink.operator.batch.evaluation.EvalBinaryClassBatchOp;
import com.alibaba.alink.operator.batch.source.MemSourceBatchOp;
import com.alibaba.alink.operator.common.evaluation.BinaryClassMetrics;
import org.junit.Test;
import java.util.Arrays;
import java.util.List;

public class EvalBinaryClassBatchOpTest {
  @Test
  public void testEvalBinaryClassBatchOp() throws Exception {
    // Each row: the true label, then a JSON string mapping each label to its predicted probability.
    List<Row> df = Arrays.asList(
      Row.of("prefix1", "{\"prefix1\": 0.9, \"prefix0\": 0.1}"),
      Row.of("prefix1", "{\"prefix1\": 0.8, \"prefix0\": 0.2}"),
      Row.of("prefix1", "{\"prefix1\": 0.7, \"prefix0\": 0.3}"),
      Row.of("prefix0", "{\"prefix1\": 0.75, \"prefix0\": 0.25}"),
      Row.of("prefix0", "{\"prefix1\": 0.6, \"prefix0\": 0.4}")
    );
    BatchOperator<?> inOp = new MemSourceBatchOp(df, "label string, detailInput string");
    // Evaluate the predictions and collect the metrics.
    BinaryClassMetrics metrics = new EvalBinaryClassBatchOp()
      .setLabelCol("label")
      .setPredictionDetailCol("detailInput")
      .linkFrom(inOp)
      .collectMetrics();
    System.out.println("AUC: " + metrics.getAuc());
    System.out.println("KS: " + metrics.getKs());
    System.out.println("PRC: " + metrics.getPrc());
    System.out.println("Accuracy: " + metrics.getAccuracy());
    System.out.println("Macro Precision: " + metrics.getMacroPrecision());
    System.out.println("Micro Recall: " + metrics.getMicroRecall());
    System.out.println("Weighted Sensitivity: " + metrics.getWeightedSensitivity());
  }
}

Output

AUC: 0.8333333333333334
KS: 0.6666666666666666
PRC: 0.9027777777777777
Accuracy: 0.6
Macro Precision: 0.8
Micro Recall: 0.6
Weighted Sensitivity: 0.6
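The Accuracy, Macro Precision, Micro Recall, and Weighted Sensitivity values above can be reproduced by hand. The sketch below is plain Python rather than Alink code, and it rests on assumptions inferred from the printed numbers, not from the library source: predictions are made at a 0.5 threshold on the probability of "prefix1", and a class that is never predicted is given precision 1.0 (this is what makes the macro average come out at 0.8). The helper name `per_class_counts` is hypothetical.

```python
def per_class_counts(labels, predictions, cls):
    """Return (tp, fp, fn) when `cls` is treated as the positive class."""
    tp = sum(1 for l, p in zip(labels, predictions) if l == cls and p == cls)
    fp = sum(1 for l, p in zip(labels, predictions) if l != cls and p == cls)
    fn = sum(1 for l, p in zip(labels, predictions) if l == cls and p != cls)
    return tp, fp, fn


labels = ["prefix1", "prefix1", "prefix1", "prefix0", "prefix0"]
scores = [0.9, 0.8, 0.7, 0.75, 0.6]  # predicted probability of "prefix1"

# Assumed decision rule: predict "prefix1" when its probability is at least 0.5.
predictions = ["prefix1" if s >= 0.5 else "prefix0" for s in scores]
classes = ["prefix1", "prefix0"]
counts = {c: per_class_counts(labels, predictions, c) for c in classes}
n = len(labels)

accuracy = sum(1 for l, p in zip(labels, predictions) if l == p) / n


def precision(tp, fp):
    # Assumption: a class with no predicted samples gets precision 1.0.
    return tp / (tp + fp) if tp + fp else 1.0


# Macro average: unweighted mean of the per-class precision.
macro_precision = sum(precision(tp, fp) for tp, fp, _ in counts.values()) / len(classes)

# Micro average: pool the counts of all classes, then compute recall once.
micro_recall = (
    sum(tp for tp, _, _ in counts.values())
    / sum(tp + fn for tp, _, fn in counts.values())
)

# Weighted average: per-class recall (sensitivity) weighted by class frequency.
weighted_sensitivity = sum(
    ((tp + fn) / n) * (tp / (tp + fn)) for tp, _, fn in counts.values()
)

print("Accuracy:", accuracy)
print("Macro Precision:", macro_precision)
print("Micro Recall:", micro_recall)
print("Weighted Sensitivity:", weighted_sensitivity)
```

Because every sample is predicted as "prefix1", the three correct predictions out of five drive Accuracy, Micro Recall, and Weighted Sensitivity to the same value.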