ZooKeeper does not respond on startup

1. Problem one

The error message is as follows:

2018-08-16 21:51:35,557 [myid:1] - ERROR [main:QuorumPeerMain@92] - Unexpected exception, exiting abnormally
java.lang.RuntimeException: My id 1 not in the peer list
    at org.apache.zookeeper.server.quorum.QuorumPeer.startLeaderElection(QuorumPeer.java:718)
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:637)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)

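Cause: the log shows that the myid file is misconfigured. This server reads its ID as 1, but no peer with ID 1 (no server.1 entry) appears in the peer list, so startup aborts.
Solution: set the value in the myid file to match the entries in conf/zoo.cfg.
Check the conf/zoo.cfg file: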


[root@server4 conf]# cat zoo.cfg 
# The number of milliseconds of each tick
tickTime=2000

# The number of ticks that the initial synchronization phase can take
initLimit=10

# The number of ticks that can pass between sending a request and getting an acknowledgement
syncLimit=5


# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just example sakes.
dataDir=/tmp/zookeeper


# the port at which the clients will connect
clientPort=2181

# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.4=server4:2888:3888
server.5=server5:2888:3888
server.6=server6:2888:3888

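Look at the last three lines: the 4 in server.4 is the value that belongs in the myid file. So on the host server4, ZooKeeper's myid file must contain 4. The same applies to server5 and server6.
Once the myid file has been corrected, ZooKeeper starts normally.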
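The lookup described above can be sketched as two small shell functions. This is only an illustration under stated assumptions: the function names expected_myid and check_myid are hypothetical, the config is assumed to use the plain server.N=host:port:port form shown earlier, and the myid file is assumed to live directly in dataDir.

```shell
# Hypothetical helper: print the N of the server.N entry whose host matches.
# Usage: expected_myid /path/to/zoo.cfg server4
# Note: the host name is used as a regular expression, so dots match any character.
expected_myid() {
    cfg="$1"
    host="$2"
    # Match lines such as server.4=server4:2888:3888 and print the 4.
    sed -n "s/^server\.\([0-9][0-9]*\)=$host:.*/\1/p" "$cfg"
}

# Hypothetical helper: check that dataDir/myid has a matching server.N entry.
# Usage: check_myid /path/to/zoo.cfg
check_myid() {
    cfg="$1"
    # Read dataDir from the config; if it is set more than once, use the last one.
    data_dir=$(sed -n 's/^dataDir=//p' "$cfg" | tail -n 1)
    if [ ! -f "$data_dir/myid" ]; then
        echo "missing: $data_dir/myid"
        return 1
    fi
    # Strip whitespace and newlines from the stored ID.
    id=$(tr -d '[:space:]' < "$data_dir/myid")
    if grep -q "^server\.$id=" "$cfg"; then
        echo "ok: myid $id matches server.$id"
    else
        echo "mismatch: myid $id has no server.$id entry"
        return 1
    fi
}
```

On server4, for example, expected_myid conf/zoo.cfg server4 would print the value to write into the myid file, and check_myid conf/zoo.cfg would report whether the current file agrees with the peer list.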

2. Problem two

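After fixing the startup failure above, I rebooted the machine and found that ZooKeeper would not start again. Puzzled about why it had failed a second time, I checked my myid file.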

[root@server6 /]# ll
total 20
lrwxrwxrwx.   1 root root    7 Jun 16 01:12 bin -> usr/bin
····
dr-xr-xr-x.  13 root root    0 Sep  2 10:52 sys
drwxrwxrwt.  15 root root 4096 Sep  2 11:09 tmp
[root@server6 /]# cd tmp
[root@server6 tmp]# cd zookeeper/
[root@server6 zookeeper]# ll
total 4
-rw-r--r--. 1 root root 4 Sep  2 11:09 zookeeper_server.pid

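The myid file I had added earlier was gone. Then I remembered that /tmp is a special directory that the system cleans up automatically, so the myid file stored there had been deleted, and that is why ZooKeeper could not start. The correct fix is to change the dataDir property in zoo.cfg.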
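Before changing anything, a quick check like the following can confirm whether dataDir points under /tmp. It is a minimal sketch: the function name datadir_under_tmp is hypothetical, and it only inspects the dataDir line of the config file it is given.

```shell
# Hypothetical helper: report whether dataDir in zoo.cfg lies under /tmp.
# Returns 0 (and prints a warning) when it does, 1 otherwise.
# Usage: datadir_under_tmp /path/to/zoo.cfg
datadir_under_tmp() {
    cfg="$1"
    # Read dataDir from the config; if it is set more than once, use the last one.
    data_dir=$(sed -n 's/^dataDir=//p' "$cfg" | tail -n 1)
    case "$data_dir" in
        /tmp|/tmp/*)
            echo "warning: dataDir=$data_dir is under /tmp"
            return 0
            ;;
        *)
            echo "dataDir=$data_dir is outside /tmp"
            return 1
            ;;
    esac
}
```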

[root@server4 conf]# cat zoo.cfg 
# The number of milliseconds of each tick
tickTime=2000

···

# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just example sakes.
dataDir=/tmp/zookeeper

····

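Pay attention to the comment above dataDir: do not use /tmp for storage, since /tmp here is only an example. Change the setting to a persistent location, for example dataDir=/usr/local/zookeeper-3.4.11/data. The myid file must then be created again in the new directory.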
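The change can be sketched as one shell function that creates the new directory, writes myid into it, and rewrites the dataDir line. This is an illustration, not a definitive procedure: the function name set_data_dir is hypothetical, and the directory path is assumed to contain no | or & characters, because it is substituted into a sed expression.

```shell
# Hypothetical helper: point dataDir at a persistent directory and recreate myid.
# Usage: set_data_dir conf/zoo.cfg /usr/local/zookeeper-3.4.11/data 4
set_data_dir() {
    cfg="$1"
    new_dir="$2"
    id="$3"
    # Create the new data directory and write the server ID into myid.
    mkdir -p "$new_dir" || return 1
    echo "$id" > "$new_dir/myid"
    # Rewrite the dataDir line through a temporary file,
    # which avoids differences between sed -i implementations.
    sed "s|^dataDir=.*|dataDir=$new_dir|" "$cfg" > "$cfg.new" && mv "$cfg.new" "$cfg"
}
```

The ID passed in must be the one that matches this host's server.N entry (4 on server4 in the configuration above). If the old directory already holds snapshot data that should be kept, it would need to be copied over as well; that step is not shown here.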

References