Problem
Connecting Flink SQL to Hive and Hudi fails with java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V.
Research shows this is caused by a guava version conflict:
Hive 3.1.2 bundles guava 19.0, Hadoop ships guava 27.0-jre, and Flink itself carries several guava versions, so they clash with each other. The checkArgument overload in the error message only exists in guava 20.0 and later, so the call blows up whenever the older 19.0 classes win on the classpath.
Flink SQL> CREATE CATALOG myhive WITH (
> 'type' = 'hive',
> 'default-database' = 'default',
> 'hive-conf-dir' = '/data/xxx/hive/conf'
> );
2021-10-12 17:12:58,816 INFO org.apache.hadoop.hive.conf.HiveConf [] - Found configuration file file:/data/module/hive/conf/hive-site.xml
Exception in thread "main" org.apache.flink.table.client.SqlClientException: Unexpected exception. This is a bug. Please consider filing an issue.
at org.apache.flink.table.client.SqlClient.main(SqlClient.java:215)
Caused by: java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:518)
at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:536)
at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:430)
at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5141)
at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:5109)
at org.apache.flink.table.catalog.hive.HiveCatalog.createHiveConf(HiveCatalog.java:230)
at org.apache.flink.table.catalog.hive.HiveCatalog.<init>(HiveCatalog.java:169)
at org.apache.flink.table.catalog.hive.factories.HiveCatalogFactory.createCatalog(HiveCatalogFactory.java:97)
at org.apache.flink.table.api.internal.TableEnvironmentImpl.createCatalog(TableEnvironmentImpl.java:1121)
at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeOperation(TableEnvironmentImpl.java:1019)
at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:666)
at org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeSql$1(LocalExecutor.java:315)
at org.apache.flink.table.client.gateway.local.ExecutionContext.wrapClassLoader(ExecutionContext.java:256)
at org.apache.flink.table.client.gateway.local.LocalExecutor.executeSql(LocalExecutor.java:315)
at org.apache.flink.table.client.cli.CliClient.callDdl(CliClient.java:739)
at org.apache.flink.table.client.cli.CliClient.callDdl(CliClient.java:734)
at org.apache.flink.table.client.cli.CliClient.callCommand(CliClient.java:381)
at java.util.Optional.ifPresent(Optional.java:159)
at org.apache.flink.table.client.cli.CliClient.open(CliClient.java:214)
at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:144)
at org.apache.flink.table.client.SqlClient.start(SqlClient.java:115)
at org.apache.flink.table.client.SqlClient.main(SqlClient.java:201)
Shutting down the session...
done.
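Before fixing anything, it helps to see which jars actually bundle an un-relocated copy of guava. A minimal scan of Flink's lib directory (the path and FLINK_HOME fallback are illustrative, adjust to your install):

```shell
# Scan the Flink lib directory for jars that bundle an un-relocated
# copy of guava's Preconditions class (illustrative path).
libdir="${FLINK_HOME:-/opt/flink}/lib"
found=0
for j in "$libdir"/*.jar; do
  [ -e "$j" ] || continue            # no jars matched the glob
  if unzip -l "$j" 2>/dev/null | grep -q 'com/google/common/base/Preconditions.class'; then
    echo "bundles guava: $j"
    found=$((found + 1))
  fi
done
echo "jars bundling guava: $found"
```

The same scan can be pointed at the Hadoop and Hive lib directories to see all the copies involved in the conflict.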
Solution
After a long investigation, the culprit turned out to be the guava shaded inside hive-exec:
hive-exec shades guava 19.0 into its own jar, flink-sql-connector-hive in turn shades hive-exec, and hudi-flink-bundle_2.12-0.9.0.jar also shades hive-exec.
So the fix is to remove the guava shading from these jars.
Method 1
Download the Hive source code:
wget https://dlcdn.apache.org/hive/hive-3.1.2/apache-hive-3.1.2-src.tar.gz --no-check-certificate
Remove the guava inclusion from the pom of the ql module (this is precisely why guava under ql could not be excluded before: it is shaded in, not pulled in as a normal dependency), as shown in the figure:
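For reference, the shading in ql/pom.xml is done by the maven-shade-plugin, and the line to delete is the guava entry in its artifactSet. Roughly (a sketch reconstructed from memory, verify against the actual 3.1.2 source tree):

```xml
<!-- ql/pom.xml, maven-shade-plugin <artifactSet> (sketch; other entries omitted) -->
<includes>
  <!-- remove this include so hive-exec stops bundling guava 19.0 -->
  <include>com.google.guava:guava</include>
  <!-- ... leave the remaining includes as they are ... -->
</includes>
```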
Then recompile and package:
# mvn clean package -Pdist -DskipTests
# cd target/
# ll
total 64096
drwxr-xr-x 2 root root 4096 Oct 12 16:45 antrun
drwxr-xr-x 4 root root 4096 Oct 12 16:45 classes
drwxr-xr-x 5 root root 4096 Oct 12 16:45 generated-sources
drwxr-xr-x 4 root root 4096 Oct 12 16:46 generated-test-sources
-rw-r--r-- 1 root root 12608602 Oct 12 16:46 hive-exec-3.1.2-core.jar
-rw-r--r-- 1 root root 13389 Oct 12 16:46 hive-exec-3.1.2-fallbackauthorizer.jar
-rw-r--r-- 1 root root 38317402 Oct 12 16:46 hive-exec-3.1.2.jar
-rw-r--r-- 1 root root 2031632 Oct 12 16:46 hive-exec-3.1.2-tests.jar
drwxr-xr-x 2 root root 4096 Oct 12 16:46 maven-archiver
drwxr-xr-x 3 root root 4096 Oct 12 16:45 maven-shared-archive-resources
-rw-r--r-- 1 root root 12608602 Oct 12 16:46 original-hive-exec-3.1.2.jar
drwxr-xr-x 4 root root 4096 Oct 12 16:46 test-classes
drwxr-xr-x 7 root root 4096 Oct 12 16:46 testconf
drwxr-xr-x 2 root root 4096 Oct 12 16:46 tmp
drwxr-xr-x 2 root root 4096 Oct 12 16:46 warehouse
Then place the generated hive-exec-3.1.2.jar into the local Maven repository.
This step is important.
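One way to install it (assuming mvn is on the PATH and you run this from ql/target/ after the Hive build) is mvn install:install-file with hive-exec's standard coordinates:

```shell
# Install the rebuilt jar into the local repository (~/.m2) so the
# flink-sql-connector-hive build below resolves it instead of the
# Maven Central artifact. Run from ql/target/ after the Hive build.
if command -v mvn >/dev/null 2>&1 && [ -f hive-exec-3.1.2.jar ]; then
  mvn install:install-file \
      -Dfile=hive-exec-3.1.2.jar \
      -DgroupId=org.apache.hive \
      -DartifactId=hive-exec \
      -Dversion=3.1.2 \
      -Dpackaging=jar
else
  echo "skipped: mvn or hive-exec-3.1.2.jar not present"
fi
```

If `mvn clean install` was used instead of `mvn clean package` when building Hive, this step happens automatically.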
Download the Flink source package:
wget https://archive.apache.org/dist/flink/flink-1.12.2/flink-1.12.2-src.tgz
After extracting:
cd /data/software/flink-compile/flink-1.12.2/flink-connectors/flink-sql-connector-hive-3.1.2
Build flink-sql-connector-hive-3.1.2:
mvn clean install -Dfast -Dhadoop.version=3.1.3 -Dscala-2.12 -DskipTests -T 4 -Dmaven.compile.fork=true -Dmaven.javadoc.skip=true -Dcheckstyle.skip=true
The build output:
[root@cdh06 target]# ll
total 43936
drwxr-xr-x 3 root root 4096 Oct 12 17:09 classes
-rw-r--r-- 1 root root 4997 Oct 12 17:10 dependency-reduced-pom.xml
-rw-r--r-- 1 root root 44937242 Oct 12 17:10 flink-sql-connector-hive-3.1.2_2.12-1.12.2.jar
-rw-r--r-- 1 root root 1632 Oct 12 17:10 flink-sql-connector-hive-3.1.2_2.12-1.12.2-tests.jar
drwxr-xr-x 2 root root 4096 Oct 12 17:09 maven-archiver
drwxr-xr-x 3 root root 4096 Oct 12 17:09 maven-shared-archive-resources
-rw-r--r-- 1 root root 18888 Oct 12 17:09 original-flink-sql-connector-hive-3.1.2_2.12-1.12.2.jar
drwxr-xr-x 3 root root 4096 Oct 12 17:09 test-classes
Place flink-sql-connector-hive-3.1.2_2.12-1.12.2.jar in Flink's lib directory, restart the Flink cluster, and run the Flink SQL again; the problem is solved.
Note that no other jar in Flink's lib directory may carry a conflicting guava version.
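A quick sanity check before restarting: confirm the rebuilt connector no longer carries an un-relocated guava (the jar path below is illustrative):

```shell
# Expect 0 hits: after the rebuild, the connector jar should not
# contain un-relocated com.google.common classes anymore.
jar="flink-sql-connector-hive-3.1.2_2.12-1.12.2.jar"   # adjust path
hits=0
if [ -f "$jar" ]; then
  hits=$(unzip -l "$jar" | grep -c 'com/google/common/base/Preconditions.class' || true)
fi
echo "un-relocated Preconditions entries: $hits"
```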
Method 2
Comment out the hive-exec include tag in the shade configuration of flink-sql-connector-hive and of hudi-flink-bundle_2.12-0.9.0.jar, then rebuild both jars.
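In both projects this means editing the maven-shade-plugin section of the bundle's pom. Roughly (a sketch, not the exact pom contents of either project):

```xml
<!-- maven-shade-plugin <artifactSet> in the connector/bundle pom (sketch) -->
<includes>
  <!-- commented out so hive-exec (and its shaded guava 19.0) is no longer bundled -->
  <!-- <include>org.apache.hive:hive-exec</include> -->
  <!-- ... other includes unchanged ... -->
</includes>
```

With hive-exec no longer bundled, it must instead be provided on the classpath at runtime (e.g. via HADOOP_CLASSPATH or Flink's lib directory).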