How the SecondaryNameNode Helps Manage FSImage and Edits

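Each time the SecondaryNameNode merges the NameNode's fsimage and edits, it first copies both files from the NameNode, so a copy of the fsimage and edits is also kept on the SecondaryNameNode. If the fsimage and edits files on the NameNode are damaged, we can copy the SecondaryNameNode's fsimage and edits over to the NameNode and keep using them, although some data may be lost.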

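The editlog records every operation performed on HDFS while the cluster is running, so the file grows very large.
When the cluster is shut down and started again, the Fsimage and editlog are loaded into memory and merged to restore the cluster to its previous state.
Because the editlog is so large, restarting the cluster takes a long time.
To shorten startup, the SecondaryNameNode helps the NameNode merge the Fsimage and editlog.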

Purpose

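Fsimage and Edits permanently store the HDFS file system image and its operation log. When the cluster starts again, it uses the Fsimage and Edits to restore the state it had before shutdown.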

[Figure: HDFS NameNode failure recovery (fsimage and edits files on the NameNode corrupted)]

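1. The SecondaryNameNode tells the NameNode to roll over to a new editlog.
2. The SecondaryNameNode fetches the FSImage and editlog from the NameNode (over HTTP).
3. The SecondaryNameNode loads the FSImage into memory and merges the editlog into it, producing a new fsimage.
4. The SecondaryNameNode sends the new fsimage back to the NameNode.
5. The NameNode replaces the old fsimage with the new one.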
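The five steps above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: the class `NameNode`, the functions `replay` and `checkpoint`, and the two edit operations are made up for this sketch and are not Hadoop code. The namespace is modelled as a plain dictionary.

```python
from dataclasses import dataclass, field


def replay(image, edits):
    """Apply edit entries, in order, to a copy of the image."""
    merged = dict(image)
    for op, path in edits:
        if op == "create":
            merged[path] = True
        elif op == "delete":
            merged.pop(path, None)
    return merged


@dataclass
class NameNode:
    """Toy NameNode: namespace = fsimage snapshot + edits written since then."""
    fsimage: dict = field(default_factory=dict)
    edits: list = field(default_factory=list)       # segment currently being written
    finalized: list = field(default_factory=list)   # closed segment awaiting merge

    def apply(self, op, path):
        self.edits.append((op, path))

    def roll_edit_log(self):
        # Close the current segment; later operations go to a fresh one.
        self.finalized, self.edits = self.finalized + self.edits, []

    def namespace(self):
        # What a restart would rebuild: fsimage plus every edit not yet merged.
        return replay(self.fsimage, self.finalized + self.edits)


def checkpoint(nn):
    """Steps 1-5 as a toy SecondaryNameNode; returns the copy it keeps."""
    nn.roll_edit_log()                                     # step 1
    image, edits = dict(nn.fsimage), list(nn.finalized)    # step 2: fetch copies
    new_image = replay(image, edits)                       # step 3: merge in memory
    nn.fsimage, nn.finalized = dict(new_image), []         # steps 4-5: send back, replace
    return new_image


nn = NameNode()
nn.apply("create", "/a")
nn.apply("create", "/b")
snn_copy = checkpoint(nn)      # the SecondaryNameNode keeps this copy
nn.apply("delete", "/a")       # happens after the checkpoint
nn.apply("create", "/c")

live_state = nn.namespace()    # state the NameNode itself would restore
recovered = dict(snn_copy)     # state restored from the SecondaryNameNode's copy
```

In this sketch `recovered` still contains `/a` and lacks `/c`: anything written after the last checkpoint exists only in the NameNode's own editlog, which is why restoring from the SecondaryNameNode's copy can lose data.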

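This afternoon a developer called me out of the blue to say that production had a problem. I checked the errors on the platform and found the following.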

3:23:20.682 PM ERROR NameNode Failed to start namenode.
java.io.IOException: NameNode is not formatted.
 at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:237)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1084)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:709)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:665)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:727)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:950)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:929)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1653)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1720)

The NameNode runs on a single node. I checked the files under /data/ and found that everything under /data/dfs/nn was gone, so I looked at the command history on the server.

history 

The history showed the following commands:

991  cd /data
992  rm -rf dfs
993  cd /
994  ll
995  df -h

The HDFS web UI showed the following:

[Screenshot: HDFS web UI after the failure]

I then logged in to the SecondaryNameNode host and found that its files were still there.

[Screenshot: files on the SecondaryNameNode]
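Before copying anything, it can help to confirm that the SecondaryNameNode's directory really holds a usable checkpoint. The sketch below is a hypothetical helper, not a Hadoop tool: it assumes the usual storage layout of a `current/` subdirectory containing a `VERSION` file and `fsimage_<txid>` files, and the demo paths are temporary stand-ins for whatever `dfs.namenode.name.dir` and `dfs.namenode.checkpoint.dir` point to on a real cluster.

```python
import re
import tempfile
from pathlib import Path

# Matches fsimage_<txid> but not the accompanying fsimage_<txid>.md5 files.
FSIMAGE_RE = re.compile(r"^fsimage_(\d+)$")


def latest_fsimage_txid(storage_dir):
    """Return the highest txid among fsimage_<txid> files, or None if unusable."""
    current = Path(storage_dir) / "current"
    if not (current / "VERSION").is_file():
        return None                      # missing or never formatted
    txids = []
    for entry in current.iterdir():
        match = FSIMAGE_RE.match(entry.name)
        if match:
            txids.append(int(match.group(1)))
    return max(txids, default=None)


# Demo on throwaway directories that mimic the situation in this incident.
with tempfile.TemporaryDirectory() as tmp:
    wiped_nn = Path(tmp) / "nn"          # NameNode dir after the accidental rm -rf
    wiped_nn.mkdir()
    snn_current = Path(tmp) / "snn" / "current"
    snn_current.mkdir(parents=True)
    for name in ("VERSION",
                 "fsimage_0000000000000000017",
                 "fsimage_0000000000000000042",
                 "fsimage_0000000000000000042.md5"):
        (snn_current / name).touch()

    nn_txid = latest_fsimage_txid(wiped_nn)
    snn_txid = latest_fsimage_txid(snn_current.parent)
```

Here `nn_txid` is `None`, which corresponds to the "NameNode is not formatted" situation, while `snn_txid` reports the most recent checkpointed transaction available for recovery.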

Then I packed the directory with tar -czvf dfs.tar.gz dfs.

I copied the archive to the NameNode host with scp dfs.tar.gz root@xx-01:/data.

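In CDH, I put HDFS into safe mode, then on the NameNode host unpacked the archive under /data and gave ownership back to the hdfs user: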

tar -xvf dfs.tar.gz

chown -R hdfs:hdfs  dfs

Then I went to the CDH console and started the NameNode.

Once it was up, I opened the NameNode web UI and confirmed that the directories had been restored.

[Screenshots: NameNode web UI after recovery]

Checking the logs:

IPC Server handler 28 on 8020, call Call#3 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.delete from 11.96.55.29:52484: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /user/hdfs/.sparkStaging/application_1628829434336_0419. Name node is in safe mode.
The reported blocks 2203938 needs additional 7660 blocks to reach the threshold 0.9990 of total blocks 2213812.
The number of live datanodes 8 has reached the minimum number 1. Name node detected blocks with generation stamps in future. This means that Name node metadata is inconsistent. This can happen if Name node metadata files have been manually replaced. Exiting safe mode will cause loss of 269358283994 byte(s). Please restart name node with right metadata or use "hdfs dfsadmin -safemode forceExit" if you are certain that the NameNode was started with the correct FsImage and edit logs. If you encountered this during a rollback, it is safe to exit with -safemode forceExit. NamenodeHostName:xx-01

IPC Server handler 34 on 8022, call Call#41 Retry#0 org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol.rollEditLog from 11.96.55.30:52676
org.apache.hadoop.hdfs.server.namenode.SafeModeException: Log not rolled. Name node is in safe mode.
The reported blocks 2203938 needs additional 7660 blocks to reach the threshold 0.9990 of total blocks 2213812.
The number of live datanodes 8 has reached the minimum number 1. Name node detected blocks with generation stamps in future. This means that Name node metadata is inconsistent. This can happen if Name node metadata files have been manually replaced. Exiting safe mode will cause loss of 269358283994 byte(s). Please restart name node with right metadata or use "hdfs dfsadmin -safemode forceExit" if you are certain that the NameNode was started with the correct FsImage and edit logs. If you encountered this during a rollback, it is safe to exit with -safemode forceExit. NamenodeHostName:xx-01
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1448)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1435)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4600)
 at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1276)
 at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146)
 at org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:12974)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.
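The numbers in the safe mode message can be checked by hand. The sketch below parses the block report line from the log above and recomputes the gap. It assumes the required block count is the threshold times the total, truncated to an integer; that assumption matches the figures in this log but is not taken from the Hadoop source here. The function name `safe_mode_gap` is made up for this example.

```python
import re

LOG_LINE = ("The reported blocks 2203938 needs additional 7660 blocks "
            "to reach the threshold 0.9990 of total blocks 2213812.")

PATTERN = re.compile(
    r"reported blocks (\d+) needs additional (\d+) blocks "
    r"to reach the threshold ([\d.]+) of total blocks (\d+)"
)


def safe_mode_gap(line):
    """Recompute how many blocks are still needed before safe mode can end."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a safe mode block report line")
    reported = int(match.group(1))
    logged_missing = int(match.group(2))
    threshold = float(match.group(3))
    total = int(match.group(4))
    needed = int(total * threshold)   # assumption: truncated product
    return {
        "needed": needed,
        "missing": needed - reported,
        "logged_missing": logged_missing,
        "unreported": total - reported,
    }


gap = safe_mode_gap(LOG_LINE)

# Size of the data the log says would be lost on a forced exit, in GiB.
at_risk_gib = 269358283994 / 2**30
```

With these inputs the recomputed `missing` value agrees with the 7660 in the log, and the byte count in the warning works out to roughly 250 GiB. That is the data written after the SecondaryNameNode's last checkpoint, which the restored metadata does not know about.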

The NameNode's edit log data is now being restored.

Recovering this way loses some data: the changes made after the SecondaryNameNode's last checkpoint.