一、需求背景

最近在搞即席查询平台时,有个分析师提交了一个简单带有字段排序的sql语句,查询了一张总10亿的数据,很快就把impala集群cpu资源打爆,各种报警此起彼伏,大量的查询都被阻塞。运维同学在检查了impala的n多个节点以后才发现该sql语句,通过在ui界面 cancel掉该查询,才解除报警,cpu,内存等检测指标主键恢复正常。通过这次事件反映出至少俩个问题:

1.如何预发重sql提交到impala集群

2.如何快速定位并取消重sql查询

关于检测重sql以及怎么预防提交,以及提交后的惩罚机制,在本文中不做讨论。本文将重点,调研下,如果快速将running中的重sql查询任务取消掉,减轻impala集群的压力。

二、方案调研

2.1 impala版本以及特性

在本文中impala使用的2.8的社区版本。在impala 指导说明中交互式取消查询。

sql 语句调试 断点 mysql_bc

 1.通过web ui 查询running中的query ,点击cancel取消查询

sql 语句调试 断点 mysql_java_02

  由于impala是分布式的,每个impalad 都是中央协调器,都可以做 query sql接收者;当前jdbc  访问impala,通过负载均衡,轮询到每一个impalad节点。

   通过web ui 方式抓取cancel接口  再遍历每个节点,请求接口cancel 重查询。这种方式前提,我们需要获取到 queryId。在通过hive jdbc方式提交的查询,在hivestatement中没有get到queryId的方法。因此该 方式可行性也比较差。

2.在impala-shell 中cancel 掉

   然而这种方式并不友好,不是所有用户都可以用户impala-shell的权限,尤其是分析师。

3.通过调度jdbc的statament的cancel方式,从客户端取消查询

  参考下 HiveStatement 的cancel方法。

sql 语句调试 断点 mysql_大数据_03

/*
   * (non-Javadoc)
   *
   * @see java.sql.Statement#cancel()
   */

  @Override
  public void cancel() throws SQLException {
    checkConnection("cancel");
    if (isCancelled) {
      return;
    }

    transportLock.lock();
    try {
      if (stmtHandle != null) {
        TCancelOperationReq cancelReq = new TCancelOperationReq(stmtHandle);
        TCancelOperationResp cancelResp = client.CancelOperation(cancelReq);
        Utils.verifySuccessWithInfo(cancelResp.getStatus());
      }
    } catch (SQLException e) {
      throw e;
    } catch (Exception e) {
      throw new SQLException(e.toString(), "08S01", e);
    } finally {
      transportLock.unlock();
    }
    isCancelled = true;
  }

 废话不多说,直接上代码

/**
 * @Description impalajdbc 测试类
 * @Date 2022/8/6 13:17
 */
@Slf4j
public class ImpalaJdbcTest {


    private static final String JDBC_DRIVER = "org.apache.hive.jdbc.HiveDriver";

    private HiveConnection connection;

    private static final String USER_NAME = "testUser";

    private static final String PWD = "eyJ0eXAiOiJKV1QiL";

    private static final String URL = "jdbc:hive2://yourimpalaurl/impala_test;auth=noSasl";


    @Before
    public void initConnenction() throws Exception {
        Class.forName(JDBC_DRIVER);
        connection = (HiveConnection) DriverManager.getConnection(URL, USER_NAME, PWD);
    }


    @Test
    public void testQeury2() {
        String sql = "select * from dw_dim.dim_member_d_his where dt = '2022-07-30'  \n" +
                "  order by mem_id desc\n" +
                "  limit 1000";
        Statement state = null;
        long begin = 0;
        try {
            state = connection.createStatement();
            begin = System.currentTimeMillis();
            Thread cancelThread = new Thread(new StatementCancelHandle(state, begin));
            cancelThread.start();
            ResultSet resultSet = state.executeQuery(sql);
            List<String> columns = JdbcUtils.getAllColumns(resultSet);
            log.info("columns:{}", columns);
            int columnCount = resultSet.getMetaData().getColumnCount();
            List<String[]> rese = new ArrayList<>(4);
            int n = 0;
            while (resultSet.next()) {
                String[] row = new String[columnCount];
                for (int i = 1; i < columnCount; i++) {
                    row[i - 1] = resultSet.getString(i);
                }

                rese.add(row);
                n++;
                log.info("row:{}, val:{}", n, row);
            }

        } catch (Exception e) {
            log.error("eoor:", e);
        } finally {
            closedStatement(state);
            closed();
            log.info("userd time:{}", (System.currentTimeMillis() - begin) / 1000);
        }

    }


    private void closedStatement(Statement state) {
        try {
            if (!Objects.isNull(state)) {
                state.close();
            }
        } catch (SQLException throwables) {
            log.error("closedStatement_error:", throwables);
        }
    }

    private void closed() {
        try {
            connection.close();
        } catch (SQLException throwables) {
            log.error("closedStatement_error:", throwables);
        }
    }


    public class StatementCancelHandle implements Runnable {

        private Statement statement;

        private Long mainBeginTime;

        public StatementCancelHandle(Statement statement, Long mainBeginTime) {
            this.statement = statement;
            this.mainBeginTime = mainBeginTime;
        }

        @Override
        public void run() {
            log.info("begin to StatementCancelHandle, init take time:{}", (System.currentTimeMillis() - this.mainBeginTime));

            try {
                log.info("StatementCancelHandle begin time:{}", System.currentTimeMillis());
                int n = 0;
                Thread.sleep(3000);
                long be = System.currentTimeMillis();
                statement.cancel();
                long end = System.currentTimeMillis();
                log.info("userd time:{}", (end - be));
            } catch (Exception e) {
                log.error("statement cancel error, e:", e);
            }


        }
    }

}

  通过测试:

2022-08-06 14:04:00 [INFO] com.impalatest.ImpalaJdbcTest$StatementCancelHandle.run(ImpalaJdbcTest.java:117) - begin to StatementCancelHandle, init take time:0
2022-08-06 14:04:00 [INFO] com.impalatest.ImpalaJdbcTest$StatementCancelHandle.run(ImpalaJdbcTest.java:120) - StatementCancelHandle begin time:1659765840647
2022-08-06 14:04:03 [INFO] com.impalatest.ImpalaJdbcTest$StatementCancelHandle.run(ImpalaJdbcTest.java:126) - userd time:34
2022-08-06 14:04:03 [ERROR] com.impalatest.ImpalaJdbcTest.testDorisQeury2(ImpalaJdbcTest.java:75) - eoor:
java.sql.SQLException: Cancelled

	at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:296) ~[hive-jdbc-1.2.1.jar:1.2.1]
	at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:392) ~[hive-jdbc-1.2.1.jar:1.2.1]
	at com.impalatest.ImpalaJdbcTest.testDorisQeury2(ImpalaJdbcTest.java:57) [test-classes/:?]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_281]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_281]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_281]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_281]
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) [junit-4.12.jar:4.12]
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) [junit-4.12.jar:4.12]
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) [junit-4.12.jar:4.12]
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) [junit-4.12.jar:4.12]
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) [junit-4.12.jar:4.12]
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) [junit-4.12.jar:4.12]
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) [junit-4.12.jar:4.12]
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) [junit-4.12.jar:4.12]
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) [junit-4.12.jar:4.12]
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) [junit-4.12.jar:4.12]
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) [junit-4.12.jar:4.12]
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) [junit-4.12.jar:4.12]
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) [junit-4.12.jar:4.12]
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363) [junit-4.12.jar:4.12]
	at org.junit.runner.JUnitCore.run(JUnitCore.java:137) [junit-4.12.jar:4.12]
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) [junit-rt.jar:?]
	at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) [junit-rt.jar:?]
	at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:220) [junit-rt.jar:?]
	at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53) [junit-rt.jar:?]
2022-08-06 14:04:03 [INFO] com.impalatest.ImpalaJdbcTest.testQeury(ImpalaJdbcTest.java:79) - userd time:3

Process finished with exit code 0

通过测试,使用hive jdbc方式查询impala时,可以通过用查询生成的statement,调用cancel方法,能够中断服务端查询。

至于再即席查询平台中,对于时间运行的大任务,可以将每次查询的statement 保存在服务端本地的map中,将手动取消查询任务id,放在redis 中,查询服务定时轮询redis中取消查询的id, 再调用本地的statement的cancel方法,取消查询任务。

四、参考文档

1.Apache Impala Guide