Error:
Running jps on every node of the cluster shows the following:
==================== hadoop01 jps ===================
2561 FsShell
1971 ResourceManager
2452 NameNode
2606 Jps
==================== hadoop02 jps ===================
1570 NodeManager
1363 DataNode
1462 JournalNode
1303 QuorumPeerMain
1722 Jps
==================== hadoop03 jps ===================
1573 NodeManager
1366 DataNode
1465 JournalNode
1305 QuorumPeerMain
1725 Jps
==================== hadoop04 jps ===================
1458 JournalNode
1302 QuorumPeerMain
1718 Jps
1566 NodeManager
1359 DataNode
==================== hadoop05 jps ===================
1574 Jps
1295 NameNode
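One thing worth noticing in the jps listing is that neither NameNode host (hadoop01, hadoop05) shows a DFSZKFailoverController process, which is the name the ZKFC daemon normally appears under in jps when automatic failover is enabled. A minimal sketch of a check for that follows. The function name missing_daemons is made up for illustration, and the list of expected daemons is an assumption you should adapt to your own cluster.

```shell
# Hypothetical helper: reads jps output on stdin and prints every expected
# daemon name (given as arguments) that does not appear in it.
missing_daemons() {
  jps_out=$(cat)
  for daemon in "$@"; do
    # jps prints "<pid> <MainClass>", so compare against the second column
    if ! printf '%s\n' "$jps_out" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      printf '%s\n' "$daemon"
    fi
  done
}
```

On a NameNode host it could be used as: jps | missing_daemons NameNode DFSZKFailoverController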
Checking the logs:
Symptom:
All NameNodes are in standby state, which means the ZooKeeper-based automatic failover is not taking effect.
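The "all standby" state can be confirmed from the command line with hdfs haadmin -getServiceState, which prints active or standby for a given NameNode ID. A minimal sketch, assuming the NameNode IDs are passed as arguments (nn1 and nn2 in the usage line are placeholders; use the IDs from dfs.ha.namenodes.<nameservice> in your hdfs-site.xml). The function name count_active is made up for illustration.

```shell
# Hypothetical helper: prints how many of the given NameNode IDs report "active".
# Per-NameNode states go to stderr so that stdout carries only the count.
count_active() {
  active=0
  for nn in "$@"; do
    state=$(hdfs haadmin -getServiceState "$nn" 2>/dev/null)
    echo "$nn: ${state:-unknown}" >&2
    if [ "$state" = "active" ]; then
      active=$((active + 1))
    fi
  done
  echo "$active"
}
```

Usage: count_active nn1 nn2
In the situation described here it should print 0; a healthy HA pair should print 1.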
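Attempt 1: manually force one NameNode to become active
Action: on one of the NameNode hosts, run hdfs haadmin -transitionToActive --forcemanual nn1 (nn1 is the NameNode ID of one of your NameNodes, as listed under dfs.ha.namenodes.<nameservice> in hdfs-site.xml; it is not the nameservice ID itself)
Result: nn1 was switched to active successfully. However, after running stop-dfs.sh and then start-dfs.sh again, all NameNodes were back in standby.
Conclusion: as suspected, this looks like a ZK problem.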
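Attempt 2: initialize the HA state in ZK
Action: on one of the NameNode hosts, run hdfs zkfc -formatZK
Result: after running start-dfs.sh again, everything works normally.
NOTE: ZooKeeper must already be running when you initialize it with -formatZK, otherwise the command fails with an error.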
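The note above can be turned into a small guard that refuses to format until ZooKeeper answers. This is only a sketch under several assumptions: passwordless SSH from the NameNode host to the ZooKeeper hosts, zkServer.sh being on the PATH of the remote non-interactive shell, and zkServer.sh status returning a non-zero exit code when the server is not running. The function name is made up for illustration. It also insists on every listed host being up, which is stricter than the majority ZooKeeper actually needs.

```shell
# Hypothetical helper: runs "hdfs zkfc -formatZK" only if ZooKeeper reports
# a status on every host passed as an argument.
format_zk_when_quorum_up() {
  for host in "$@"; do
    # -n keeps ssh from swallowing stdin, which formatZK may need for its prompt
    if ! ssh -n "$host" zkServer.sh status >/dev/null 2>&1; then
      echo "ZooKeeper is not running on $host; start it first" >&2
      return 1
    fi
  done
  hdfs zkfc -formatZK
}
```

With the cluster layout shown in the jps listing, the call would be: format_zk_when_quorum_up hadoop02 hadoop03 hadoop04
Run it while HDFS is stopped, then bring HDFS back up with start-dfs.sh.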