1. System setup

Hadoop, Hive, YARN, ZooKeeper, and HBase are installed and deployed. Hadoop, Hive, and YARN have already been integrated with Kerberos and start successfully.

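2. ZooKeeper error

ZooKeeper starts successfully, and ./bin/zkServer.sh status reports a healthy cluster.

However, running ./bin/zkCli.sh -server hadoop0:2181 fails to connect to ZooKeeper. The client reports the following SASL authentication errors: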

WatchedEvent state:SyncConnected type:None path:null

2021-12-13 05:15:01,371 [myid:hadoop2:2181] - ERROR [main-SendThread(hadoop2:2181):ZooKeeperSaslClient@341] - An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)]) occurred when evaluating Zookeeper Quorum Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED state.

2021-12-13 05:15:01,372 [myid:hadoop2:2181] - ERROR [main-SendThread(hadoop2:2181):ClientCnxn$SendThread@1151] - SASL authentication with Zookeeper Quorum member failed: javax.security.sasl.SaslException: An error:(java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)]) occurred when evaluating Zookeeper Quorum Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED state. [Caused by java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)]]

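The KDC log /var/log/krb5kdc.log shows why: when zkCli.sh connects to ZooKeeper, it requests by default a service ticket for a server principal whose primary is zookeeper (zookeeper/<host>@HADOOP.COM), whereas the Kerberos principals I had created use the primary hdfs. The KDC therefore cannot find the requested server principal: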

Dec 13 06:28:29 hadoop0 krb5kdc[8606](info): TGS_REQ (4 etypes {aes256-cts-hmac-sha1-96(18), aes128-cts-hmac-sha1-96(17), UNSUPPORTED:des3-hmac-sha1(16), DEPRECATED:arcfour-hmac(23)}) 172.18.0.4: LOOKING_UP_SERVER: authtime 0, etypes {rep=UNSUPPORTED:(0)} hdfs/hadoop2@HADOOP.COM for zookeeper/hadoop1@HADOOP.COM, Server not found in Kerberos database

Dec 13 06:28:29 hadoop0 krb5kdc[8606](info): TGS_REQ (4 etypes {aes256-cts-hmac-sha1-96(18), aes128-cts-hmac-sha1-96(17), UNSUPPORTED:des3-hmac-sha1(16), DEPRECATED:arcfour-hmac(23)}) 172.18.0.3: LOOKING_UP_SERVER: authtime 0, etypes {rep=UNSUPPORTED:(0)} hdfs/hadoop1@HADOOP.COM for zookeeper/hadoop0@HADOOP.COM, Server not found in Kerberos database
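
To see at a glance which service principals the KDC is rejecting, the log can be filtered. The following is a minimal sketch and not part of the original troubleshooting session: the function name missing_service_principals is made up for illustration, and it assumes the log lines have the same layout as the two entries above.

```shell
# Print the distinct service principals that the KDC reported as missing.
# Reads krb5kdc log lines on stdin (hypothetical helper, not a Kerberos tool).
missing_service_principals() {
    grep 'LOOKING_UP_SERVER' \
        | sed -n 's/.* for \([^ ,]*\), Server not found in Kerberos database.*/\1/p' \
        | sort -u
}

# Usage on the KDC host (commented out so the snippet can be sourced safely):
# missing_service_principals < /var/log/krb5kdc.log
```

If the output lists zookeeper/<host> principals that were never created, running kadmin.local -q "listprincs" on the KDC host shows which principals do exist.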

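3. Fix

Add the following setting to zkEnv.sh. It tells the client the name (the primary of the Kerberos principal) of the server it should authenticate against: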

CLIENT_JVMFLAGS=" -Dzookeeper.sasl.client.username=zookeeperkrb $CLIENT_JVMFLAGS"
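
The value of zookeeper.sasl.client.username is the primary of the ZooKeeper server's principal, that is, the part before the / in primary/instance@REALM; ZooKeeper falls back to zookeeper when the property is not set. Below is a small sketch for extracting the primary from a principal name. principal_primary is a hypothetical helper, and hdfs/hadoop1@HADOOP.COM is used only because it is one of the principals that appears in the KDC log.

```shell
# Print the primary of a Kerberos principal (hypothetical helper).
principal_primary() {
    p=${1%%@*}                 # drop the realm, e.g. "@HADOOP.COM"
    printf '%s\n' "${p%%/*}"   # drop the instance, e.g. "/hadoop1"
}

principal_primary "hdfs/hadoop1@HADOOP.COM"   # prints: hdfs
```

Whatever primary the ZooKeeper servers' keytab actually contains is the value to pass in the flag.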

Restart the ZooKeeper cluster. After that, zkCli.sh connects to the cluster normally.

4. HBase startup error

On startup, the HBase master log reports an error and the master process exits.

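The KDC log /var/log/krb5kdc.log points to the same cause: when HBase connects to ZooKeeper, it also requests the default server primary zookeeper, whereas the Kerberos principal I created uses hdfs.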

5. Fix

Edit conf/hbase-env.sh and add -Dzookeeper.sasl.client.username=hdfs to HBASE_OPTS:

export HBASE_OPTS="-XX:+UseConcMarkSweepGC -Djava.security.auth.login.config=/usr/local/hbase/conf/zk-jaas.conf -Dzookeeper.sasl.client.username=hdfs"

Running bin/start-hbase.sh again now starts HBase successfully.
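
Since the ZooKeeper client and HBase should both point at the primary that the ZooKeeper servers actually run as, it can help to check what each config file sets. This is a rough sketch under a few assumptions: the flag sits on a single line as in the settings above, sasl_username_in is a hypothetical helper, and the paths in the commented usage are guesses about the install layout. It does not skip commented-out lines.

```shell
# Print the value of -Dzookeeper.sasl.client.username found in a file.
# Prints nothing if the flag is absent; only the first match is reported.
sasl_username_in() {
    sed -n 's/.*-Dzookeeper\.sasl\.client\.username=\([^ "]*\).*/\1/p' "$1" \
        | head -n 1
}

# Usage (paths are assumptions; adjust to the actual install directories):
# sasl_username_in ./bin/zkEnv.sh
# sasl_username_in /usr/local/hbase/conf/hbase-env.sh
```

An empty result means the file does not set the flag, in which case the client uses the default primary zookeeper.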