Flink applications commonly write results in real time: for every incoming record, the updated aggregation result is written to the database. In production the database is shared by many applications, so as more Flink jobs are deployed the write pressure keeps growing, until jobs block on the sink and back-pressure builds up; in severe cases checkpoints time out and the job restarts.

Starting from this observation, we can optimize Flink applications in certain scenarios to lower the write frequency and relieve the pressure on the database.


Code before optimization:

Table clearTransTable = tableEnv.fromDataStream(srcStream,$("stlBranchId"),$("stlOrgId"),$("txnAmt"),$("payTimeTs"),$("bigmccId"));
tableEnv.createTemporaryView("clearTransTable", clearTransTable);
tableEnv.executeSql(
                "  CREATE TABLE rts_clear_mcc_day_trans(\n" +
                        "    stl_branch_id VARCHAR,\n" +
                        "    stl_org_id VARCHAR,\n" +
                        "    bigmcc_id VARCHAR,\n" +
                        "    time_date VARCHAR,\n" +
                        "    trans_count BIGINT,\n" +
                        "    trans_amount DOUBLE,\n" +
                        "    primary key(time_date,stl_branch_id,stl_org_id) NOT ENFORCED\n"+
                        ") WITH (\n" +
                        "    'connector' = 'jdbc',\n" +
                        "    'url' = '" + SINK_JDBC_URL + "',\n" +
                        "    'table-name' = '" + SINK_TABLE_NAME + "',\n" +
                        "    'username' = '"+ SINK_JDBC_USERNAME +"',\n" +
                        "    'password' = '" + SINK_JDBC_PWD + "',\n" +
                        "    'scan.auto-commit' = 'true',\n" +
                        "    'sink.buffer-flush.max-rows' = '100',\n" +
                        "    'sink.buffer-flush.interval' = '1s'\n" +
                        ")");
tableEnv.executeSql("INSERT INTO rts_clear_mcc_day_trans\n" +
                "       SELECT stlBranchId as stl_branch_id, " +
                "              stlOrgId as stl_org_id,\n" +
                "              bigmccId as bigmcc_id,\n" +
                "              DATE_FORMAT(payTimeTs ,'yyyy-MM-dd') as time_day," +
                "              count(1) as trans_count,\n" +
                "              sum(txnAmt) as trans_amount\n" +
                "       FROM clearTransTable\n" +
                "       GROUP BY stlBranchId,stlOrgId,bigmccId,\n" +
                "                DATE_FORMAT(payTimeTs,'yyyy-MM-dd')\n");

Why plain GROUP BY instead of a window:

Because the upstream transaction records cannot be guaranteed to arrive 100% in event-time order, we did not use an event-time window (late records could be dropped and go uncounted); instead we used a plain GROUP BY, which means every single record triggers a re-aggregation plus a database write. (A configuration-only alternative, mini-batch aggregation, is sketched below.)
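As an aside, recent Flink versions offer mini-batch aggregation, a configuration-only way to attack the same symptom; it is not the approach taken in this article, but it is worth knowing. A minimal sketch, with illustrative latency and batch-size values:

import org.apache.flink.configuration.Configuration;

// Mini-batch aggregation buffers input records and fires the aggregate
// once per buffered batch instead of once per record, which reduces both
// state accesses and the number of updates sent to the sink.
Configuration conf = tableEnv.getConfig().getConfiguration();
conf.setString("table.exec.mini-batch.enabled", "true");
conf.setString("table.exec.mini-batch.allow-latency", "5 s");  // max time records may be buffered
conf.setString("table.exec.mini-batch.size", "5000");          // max records per buffered batch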


Why optimize:

This job computes category-level (MCC) transaction statistics. The key space is small, on the order of a few hundred to a few thousand keys, which means the job rewrites the aggregates of the same small set of keys every second. The business does not require these category-level statistics to be 100% real time; minute-level freshness is sufficient.
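To make the expected gain concrete, a back-of-the-envelope estimate (the throughput figure is an assumption for illustration): at 2,000 records per second, the plain GROUP BY issues roughly 2,000 upserts per second. With about 1,000 distinct keys and a one-minute window, the optimized job issues at most about 1,000 upserts per minute, i.e. roughly 17 per second, a reduction of two orders of magnitude.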


Code after optimization:

Table clearTransTable = tableEnv.fromDataStream(srcStream, $("stlBranchId"), $("stlOrgId"), $("txnAmt"), $("payTimeTs"), $("bigmccId"));

tableEnv.createTemporaryView("clearTransTable", clearTransTable);

String aggregateSql = "SELECT stlBranchId , " +
                "stlOrgId , " +
                "bigmccId , " +
                "DATE_FORMAT(payTimeTs ,'yyyy-MM-dd') as timeDay," +
                "count(1) as transCount," +
                "sum(txnAmt) as transAmount " +
                "FROM clearTransTable " +
                "group by stlBranchId,stlOrgId,bigmccId," +
                "DATE_FORMAT(payTimeTs,'yyyy-MM-dd')";

Table aggregatedTable = tableEnv.sqlQuery(aggregateSql);
aggregatedTable.printSchema();

SingleOutputStreamOperator<MccResult> res = tableEnv.toRetractStream(aggregatedTable, MccResult.class).flatMap(new FlatMapFunction<Tuple2<Boolean, MccResult>, MccResult>() {
            @Override
            public void flatMap(Tuple2<Boolean, MccResult> value, Collector<MccResult> out) throws Exception {
                if (value.f0) {
                    out.collect(value.f1);
                }
            }
        }).returns(MccResult.class);
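The MccResult POJO referenced above is not shown in the original. For toRetractStream(aggregatedTable, MccResult.class) to map columns by name, its fields must match the query aliases; a hypothetical reconstruction, with field types inferred from the SQL, might look like this:

// Hypothetical reconstruction; the original class is not shown. Flink
// POJOs require a public no-arg constructor plus public getters/setters
// (or public fields) whose names match the column aliases.
public class MccResult {
    private String stlBranchId;
    private String stlOrgId;
    private String bigmccId;
    private String timeDay;
    private long transCount;
    private double transAmount;

    public MccResult() {}

    public String getStlBranchId() { return stlBranchId; }
    public void setStlBranchId(String stlBranchId) { this.stlBranchId = stlBranchId; }
    public String getStlOrgId() { return stlOrgId; }
    public void setStlOrgId(String stlOrgId) { this.stlOrgId = stlOrgId; }
    public String getBigmccId() { return bigmccId; }
    public void setBigmccId(String bigmccId) { this.bigmccId = bigmccId; }
    public String getTimeDay() { return timeDay; }
    public void setTimeDay(String timeDay) { this.timeDay = timeDay; }
    public long getTransCount() { return transCount; }
    public void setTransCount(long transCount) { this.transCount = transCount; }
    public double getTransAmount() { return transAmount; }
    public void setTransAmount(double transAmount) { this.transAmount = transAmount; }
}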

tableEnv.createTemporaryView("aggregatedTable", res);

Table result = tableEnv.sqlQuery("SELECT stlBranchId,stlOrgId,bigmccId,timeDay,LAST_VALUE(transCount) AS transCount,LAST_VALUE(transAmount) AS transAmount FROM aggregatedTable GROUP BY TUMBLE(proctime(), INTERVAL '1' MINUTES),stlBranchId,stlOrgId,bigmccId,timeDay ");

// Upsert each windowed result into MySQL; ON DUPLICATE KEY UPDATE keeps a
// single row per (day, branch, org, mcc) key up to date.
tableEnv.toAppendStream(result, MccResult.class).addSink(JdbcSink.sink(
        "insert into rts_clear_mcc_day_trans(stl_branch_id, stl_org_id, bigmcc_id, time_date, trans_count, trans_amount) " +
                "values(?,?,?,?,?,?) on duplicate key update trans_count = VALUES(`trans_count`), trans_amount = VALUES(`trans_amount`)",
        (JdbcStatementBuilder<MccResult>) (ps, t) -> {
            ps.setString(1, t.getStlBranchId());
            ps.setString(2, t.getStlOrgId());
            ps.setString(3, t.getBigmccId());
            ps.setString(4, t.getTimeDay());
            ps.setLong(5, t.getTransCount());
            ps.setDouble(6, t.getTransAmount());
        },
        new JdbcExecutionOptions.Builder().withBatchSize(10).withMaxRetries(3).build(),
        new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
                .withUrl(SystemConstant.SINK_JDBC_URL)
                .withDriverName(SystemConstant.SINK_JDBC_DRIVER)
                .withUsername(SystemConstant.SINK_JDBC_USERNAME)
                .withPassword(SystemConstant.SINK_JDBC_PWD)
                .build()
)).uid("jdbcSink").name("jdbcSink");

env.execute("RtsClearMccDayTrans");

1. The GROUP BY aggregation result is no longer written straight to the database; instead it is materialized as an intermediate table, aggregatedTable;

2. The intermediate table aggregatedTable is converted into a DataStream;

3. A flatMap operator filters the retraction records out of that stream, leaving res, a stream that carries only the updated aggregation results;

4. res is grouped into one-minute tumbling windows, and within each window only the most recent record per aggregation key is kept via LAST_VALUE(). MAX or MIN would be wrong here: refunds make txnAmt negative, so the running trans_amount is not monotonic and the extreme value inside a window is not necessarily the latest one (see the worked sequence after this list);

5. The windowed result is converted back into a DataStream and sunk to the database via JDBC.
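A concrete sequence shows why only LAST_VALUE() is safe here. Suppose one key receives three records in the same window with amounts +100, -60 (a refund), and +30: the running transAmount emitted by the upstream GROUP BY is 100, then 40, then 70. MAX(transAmount) returns the stale 100 and MIN(transAmount) the stale 40; only LAST_VALUE(transAmount) returns 70, the genuinely latest total.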


This optimization rests on two preconditions:

1. The key space involved in the aggregation is small;

2. The statistics do not need to be strictly real time; minute-level freshness is acceptable.
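One caveat worth adding: the unbounded GROUP BY keeps state for every (day, branch, org, mcc) key indefinitely, so state for past days accumulates. On recent Flink versions this can be bounded with idle state retention; a minimal sketch, where the 36-hour value is an assumption chosen to outlast one day plus the expected record lateness (older versions use setIdleStateRetentionTime instead):

import java.time.Duration;

// Expire aggregate state for keys that receive no records for 36 hours,
// so yesterday's day-keys are cleaned up once their traffic stops.
tableEnv.getConfig().setIdleStateRetention(Duration.ofHours(36));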