一、需求背景
最近在搞即席查询平台时,有个分析师提交了一个简单带有字段排序的sql语句,查询了一张总10亿的数据,很快就把impala集群cpu资源打爆,各种报警此起彼伏,大量的查询都被阻塞。运维同学在检查了impala的n多个节点以后才发现该sql语句,通过在ui界面 cancel掉该查询,才解除报警,cpu,内存等检测指标主键恢复正常。通过这次事件反映出至少俩个问题:
1.如何预发重sql提交到impala集群
2.如何快速定位并取消重sql查询
关于检测重sql以及怎么预防提交,以及提交后的惩罚机制,在本文中不做讨论。本文将重点,调研下,如果快速将running中的重sql查询任务取消掉,减轻impala集群的压力。
二、方案调研
2.1 impala版本以及特性
在本文中impala使用的2.8的社区版本。在impala 指导说明中交互式取消查询。
1.通过web ui 查询running中的query ,点击cancel取消查询
由于impala是分布式的,每个impalad 都是中央协调器,都可以做 query sql接收者;当前jdbc 访问impala,通过负载均衡,轮询到每一个impalad节点。
通过web ui 方式抓取cancel接口 再遍历每个节点,请求接口cancel 重查询。这种方式前提,我们需要获取到 queryId。在通过hive jdbc方式提交的查询,在hivestatement中没有get到queryId的方法。因此该 方式可行性也比较差。
2.在impala-shell 中cancel 掉
然而这种方式并不友好,不是所有用户都可以用户impala-shell的权限,尤其是分析师。
3.通过调度jdbc的statament的cancel方式,从客户端取消查询
参考下 HiveStatement 的cancel方法。
/*
* (non-Javadoc)
*
* @see java.sql.Statement#cancel()
*/
@Override
public void cancel() throws SQLException {
checkConnection("cancel");
if (isCancelled) {
return;
}
transportLock.lock();
try {
if (stmtHandle != null) {
TCancelOperationReq cancelReq = new TCancelOperationReq(stmtHandle);
TCancelOperationResp cancelResp = client.CancelOperation(cancelReq);
Utils.verifySuccessWithInfo(cancelResp.getStatus());
}
} catch (SQLException e) {
throw e;
} catch (Exception e) {
throw new SQLException(e.toString(), "08S01", e);
} finally {
transportLock.unlock();
}
isCancelled = true;
}
废话不多说,直接上代码
/**
* @Description impalajdbc 测试类
* @Date 2022/8/6 13:17
*/
@Slf4j
public class ImpalaJdbcTest {
private static final String JDBC_DRIVER = "org.apache.hive.jdbc.HiveDriver";
private HiveConnection connection;
private static final String USER_NAME = "testUser";
private static final String PWD = "eyJ0eXAiOiJKV1QiL";
private static final String URL = "jdbc:hive2://yourimpalaurl/impala_test;auth=noSasl";
@Before
public void initConnenction() throws Exception {
Class.forName(JDBC_DRIVER);
connection = (HiveConnection) DriverManager.getConnection(URL, USER_NAME, PWD);
}
@Test
public void testQeury2() {
String sql = "select * from dw_dim.dim_member_d_his where dt = '2022-07-30' \n" +
" order by mem_id desc\n" +
" limit 1000";
Statement state = null;
long begin = 0;
try {
state = connection.createStatement();
begin = System.currentTimeMillis();
Thread cancelThread = new Thread(new StatementCancelHandle(state, begin));
cancelThread.start();
ResultSet resultSet = state.executeQuery(sql);
List<String> columns = JdbcUtils.getAllColumns(resultSet);
log.info("columns:{}", columns);
int columnCount = resultSet.getMetaData().getColumnCount();
List<String[]> rese = new ArrayList<>(4);
int n = 0;
while (resultSet.next()) {
String[] row = new String[columnCount];
for (int i = 1; i < columnCount; i++) {
row[i - 1] = resultSet.getString(i);
}
rese.add(row);
n++;
log.info("row:{}, val:{}", n, row);
}
} catch (Exception e) {
log.error("eoor:", e);
} finally {
closedStatement(state);
closed();
log.info("userd time:{}", (System.currentTimeMillis() - begin) / 1000);
}
}
private void closedStatement(Statement state) {
try {
if (!Objects.isNull(state)) {
state.close();
}
} catch (SQLException throwables) {
log.error("closedStatement_error:", throwables);
}
}
private void closed() {
try {
connection.close();
} catch (SQLException throwables) {
log.error("closedStatement_error:", throwables);
}
}
public class StatementCancelHandle implements Runnable {
private Statement statement;
private Long mainBeginTime;
public StatementCancelHandle(Statement statement, Long mainBeginTime) {
this.statement = statement;
this.mainBeginTime = mainBeginTime;
}
@Override
public void run() {
log.info("begin to StatementCancelHandle, init take time:{}", (System.currentTimeMillis() - this.mainBeginTime));
try {
log.info("StatementCancelHandle begin time:{}", System.currentTimeMillis());
int n = 0;
Thread.sleep(3000);
long be = System.currentTimeMillis();
statement.cancel();
long end = System.currentTimeMillis();
log.info("userd time:{}", (end - be));
} catch (Exception e) {
log.error("statement cancel error, e:", e);
}
}
}
}
通过测试:
2022-08-06 14:04:00 [INFO] com.impalatest.ImpalaJdbcTest$StatementCancelHandle.run(ImpalaJdbcTest.java:117) - begin to StatementCancelHandle, init take time:0
2022-08-06 14:04:00 [INFO] com.impalatest.ImpalaJdbcTest$StatementCancelHandle.run(ImpalaJdbcTest.java:120) - StatementCancelHandle begin time:1659765840647
2022-08-06 14:04:03 [INFO] com.impalatest.ImpalaJdbcTest$StatementCancelHandle.run(ImpalaJdbcTest.java:126) - userd time:34
2022-08-06 14:04:03 [ERROR] com.impalatest.ImpalaJdbcTest.testDorisQeury2(ImpalaJdbcTest.java:75) - eoor:
java.sql.SQLException: Cancelled
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:296) ~[hive-jdbc-1.2.1.jar:1.2.1]
at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:392) ~[hive-jdbc-1.2.1.jar:1.2.1]
at com.impalatest.ImpalaJdbcTest.testDorisQeury2(ImpalaJdbcTest.java:57) [test-classes/:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_281]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_281]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_281]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_281]
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) [junit-4.12.jar:4.12]
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) [junit-4.12.jar:4.12]
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) [junit-4.12.jar:4.12]
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) [junit-4.12.jar:4.12]
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) [junit-4.12.jar:4.12]
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) [junit-4.12.jar:4.12]
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) [junit-4.12.jar:4.12]
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) [junit-4.12.jar:4.12]
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) [junit-4.12.jar:4.12]
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) [junit-4.12.jar:4.12]
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) [junit-4.12.jar:4.12]
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) [junit-4.12.jar:4.12]
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) [junit-4.12.jar:4.12]
at org.junit.runners.ParentRunner.run(ParentRunner.java:363) [junit-4.12.jar:4.12]
at org.junit.runner.JUnitCore.run(JUnitCore.java:137) [junit-4.12.jar:4.12]
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) [junit-rt.jar:?]
at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) [junit-rt.jar:?]
at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:220) [junit-rt.jar:?]
at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53) [junit-rt.jar:?]
2022-08-06 14:04:03 [INFO] com.impalatest.ImpalaJdbcTest.testQeury(ImpalaJdbcTest.java:79) - userd time:3
Process finished with exit code 0
通过测试,使用hive jdbc方式查询impala时,可以通过用查询生成的statement,调用cancel方法,能够中断服务端查询。
至于再即席查询平台中,对于时间运行的大任务,可以将每次查询的statement 保存在服务端本地的map中,将手动取消查询任务id,放在redis 中,查询服务定时轮询redis中取消查询的id, 再调用本地的statement的cancel方法,取消查询任务。