Before building anything, I'd like to share how Hadoop NameNode HA works when backed by QJM (the Quorum Journal Manager).
A typical HA cluster runs two NameNodes: one in the active state, which serves all client requests, and one in standby, ready so that the cluster can fail over quickly when the active NameNode goes down. The active NameNode writes its namespace edits to a set of JournalNodes, while the standby reads those edits back from the JournalNodes and applies them to its own namespace; only by staying in sync this way can the standby safely take over when the active node fails. Note that only the active NameNode is allowed to commit data to the JournalNodes. The architecture diagram below illustrates the setup.
Now let's build the Hadoop HA cluster, using hadoop-2.6.2.
Host information:
IP | Hostname | Roles | Notes
---|---|---|---
192.168.2.10 | bi10 | namenode, datanode, JournalNode | primary namenode
192.168.2.12 | bi12 | namenode, resourcemanager, datanode, JournalNode | primary resourcemanager, standby namenode
192.168.2.13 | bi13 | resourcemanager, datanode, JournalNode | standby resourcemanager
192.168.4.33 | bi3 | zookeeper | |
192.168.4.34 | bi4 | zookeeper | |
192.168.4.35 | bi5 | zookeeper | |
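Every node must resolve these hostnames consistently. A minimal sketch of the matching /etc/hosts entries, assuming the cluster relies on /etc/hosts rather than DNS (written to a scratch file here so the sketch stays runnable):

```shell
# /etc/hosts entries implied by the host table; on a real node these
# lines are appended to /etc/hosts (with sudo) on every machine.
cat > hosts.fragment <<'EOF'
192.168.2.10 bi10
192.168.2.12 bi12
192.168.2.13 bi13
192.168.4.33 bi3
192.168.4.34 bi4
192.168.4.35 bi5
EOF
wc -l < hosts.fragment   # one entry per host
```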
Key directories:
Host | Mounts | HDFS data dirs
---|---|---
bi10 | /dev/sda→/data1, /dev/sdb→/data2, /dev/sdc→/data3, /dev/sdd→/data4 | /data1/hdfsdata, /data2/hdfsdata, /data3/hdfsdata, /data4/hdfsdata
bi12 | /dev/sda→/data1, /dev/sdb→/data2, /dev/sdc→/data3, /dev/sdd→/data4 | /data1/hdfsdata, /data2/hdfsdata, /data3/hdfsdata, /data4/hdfsdata
bi13 | /dev/sda→/data1, /dev/sdb→/data2, /dev/sdc→/data3, /dev/sdd→/data4, /dev/sde→/data5, /dev/sdf→/data6 | /data1/hdfsdata, /data2/hdfsdata, /data3/hdfsdata, /data4/hdfsdata, /data5/hdfsdata, /data6/hdfsdata

On every host, create the same local Hadoop directories; they back the config keys used later:

mkdir -p /home/hadoop/work/hadoop-2.6.2/temp/            (hadoop.tmp.dir)
mkdir -p /home/hadoop/work/hadoop-2.6.2/data/journal/    (dfs.journalnode.edits.dir)
mkdir -p /home/hadoop/work/hadoop-2.6.2/data/hdfs/name/  (dfs.namenode.name.dir)

Then create each /dataN/hdfsdata directory listed in the last column.
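The per-host mkdir commands above are identical except for the number of data disks, so they can be scripted. A sketch, using a configurable prefix so it can be tried outside the cluster (on the real hosts PREFIX is empty, HADOOP_HOME is /home/hadoop/work/hadoop-2.6.2, and DISKS is 4 on bi10/bi12 and 6 on bi13):

```shell
# Create the shared Hadoop directories plus one hdfsdata dir per disk.
PREFIX=${PREFIX:-./sandbox}            # empty on a real node
HADOOP_HOME=${HADOOP_HOME:-$PREFIX/home/hadoop/work/hadoop-2.6.2}
mkdir -p "$HADOOP_HOME/data/hdfs/name" \
         "$HADOOP_HOME/data/journal" \
         "$HADOOP_HOME/temp"
DISKS=${DISKS:-4}                      # bi10/bi12 have 4 disks, bi13 has 6
i=1
while [ "$i" -le "$DISKS" ]; do
  mkdir -p "$PREFIX/data$i/hdfsdata"
  i=$((i + 1))
done
```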
hadoop-env.sh configuration (set the Java environment):
# The java implementation to use.
export JAVA_HOME=/home/hadoop/work/jdk1.7.0_75
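Before starting any daemon, it is worth verifying on each node that this JDK path actually exists. A small hedged check (the jdk1.7.0_75 path is simply the one used on this cluster):

```shell
# Returns success iff the given JAVA_HOME contains an executable java.
java_ok() { [ -x "$1/bin/java" ]; }

if java_ok /home/hadoop/work/jdk1.7.0_75; then
  echo "JDK found"
else
  echo "no JDK at /home/hadoop/work/jdk1.7.0_75 -- fix JAVA_HOME first"
fi
```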
core-site.xml configuration:
<configuration>
<!-- Set the HDFS nameservice to "masters" -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://masters</value>
</property>
<!-- Hadoop temp directory -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/work/hadoop-2.6.2/temp/</value>
</property>
<!-- ZooKeeper quorum addresses -->
<property>
<name>ha.zookeeper.quorum</name>
<value>bi3:2181,bi4:2181,bi5:2181</value>
</property>
</configuration>
hdfs-site.xml configuration:
<configuration>
<!-- HDFS nameservice "masters"; must match core-site.xml -->
<property>
<name>dfs.nameservices</name>
<value>masters</value>
</property>
<!-- The "masters" nameservice contains two NameNodes: bi10 and bi12 -->
<property>
<name>dfs.ha.namenodes.masters</name>
<value>nn1,nn2</value>
</property>
<!-- RPC address of nn1 (bi10) -->
<property>
<name>dfs.namenode.rpc-address.masters.nn1</name>
<value>bi10:9000</value>
</property>
<!-- HTTP address of nn1 (bi10) -->
<property>
<name>dfs.namenode.http-address.masters.nn1</name>
<value>bi10:50070</value>
</property>
<!-- RPC address of nn2 (bi12) -->
<property>
<name>dfs.namenode.rpc-address.masters.nn2</name>
<value>bi12:9000</value>
</property>
<!-- HTTP address of nn2 (bi12) -->
<property>
<name>dfs.namenode.http-address.masters.nn2</name>
<value>bi12:50070</value>
</property>
<!-- Where the NameNodes' shared edit log lives on the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://bi10:8485;bi12:8485;bi13:8485/masters</value>
</property>
<!-- Local directory where each JournalNode stores its data -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/hadoop/work/hadoop-2.6.2/data/journal</value>
</property>
<!-- Enable automatic NameNode failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- How clients locate the active NameNode -->
<property>
<name>dfs.client.failover.proxy.provider.masters</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fencing methods; to list several, put one per line -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<!-- sshfence requires passwordless SSH -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<!-- Timeout (ms) for the sshfence SSH connection -->
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>128m</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/work/hadoop-2.6.2/data/hdfs/name</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/data1/hdfsdata,file:/data2/hdfsdata,file:/data3/hdfsdata,file:/data4/hdfsdata</value>
</property>
</configuration>
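One easy mistake is letting fs.defaultFS in core-site.xml drift out of sync with dfs.nameservices in hdfs-site.xml. A rough grep/sed consistency check, run here against minimal sample excerpts of the two files so the sketch is self-contained (on a real node, point it at the files under the Hadoop conf directory instead):

```shell
# Minimal excerpts standing in for the real config files.
cat > core-site.sample.xml <<'EOF'
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://masters</value></property>
</configuration>
EOF
cat > hdfs-site.sample.xml <<'EOF'
<configuration>
  <property><name>dfs.nameservices</name><value>masters</value></property>
</configuration>
EOF

# Pull the authority out of fs.defaultFS and the nameservice id.
fs=$(grep -o 'hdfs://[^<]*' core-site.sample.xml | sed 's|hdfs://||')
ns=$(grep -o '<value>[^<]*' hdfs-site.sample.xml | sed 's|<value>||')
if [ "$fs" = "$ns" ]; then
  echo "nameservice consistent: $ns"
else
  echo "MISMATCH: $fs vs $ns"
fi
```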
yarn-site.xml configuration:
<configuration>
<!-- Site specific YARN configuration properties -->
<!-- Enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Cluster id for the RM pair -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>RM_HA_ID</value>
</property>
<!-- Logical ids of the two RMs -->
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<!-- Hostname of each RM -->
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>bi12</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>bi13</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<!-- ZooKeeper quorum addresses -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>bi3:2181,bi4:2181,bi5:2181</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
mapred-site.xml configuration; tell MapReduce to run on YARN:
<configuration>
<!-- Use YARN as the MapReduce framework -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Configure the slaves file:
bi10
bi12
bi13
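The hadoop-daemons.sh and start-yarn.sh helpers loop over this slaves file via SSH, so passwordless SSH from the control node to every slave must already work. A sketch of a check loop; the real ssh call is shown as an echo so the sketch runs anywhere:

```shell
# Iterate over a slaves file the way the hadoop-daemons.sh helper does.
SLAVES=${SLAVES:-./slaves.sample}
printf 'bi10\nbi12\nbi13\n' > "$SLAVES"
checked=0
while read -r host; do
  # Real check would be: ssh -o BatchMode=yes "$host" hostname
  echo "would ssh to: $host"
  checked=$((checked + 1))
done < "$SLAVES"
echo "checked $checked slaves"
```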
Starting the cluster
Start the NameNodes:
1. On the primary NameNode (bi10), start all three JournalNodes:
[hadoop@bi10 ~]$ hadoop-daemons.sh start journalnode
bi10: starting journalnode, logging to /home/hadoop/work/hadoop-2.6.2/logs/hadoop-hadoop-journalnode-bi10.out
bi12: starting journalnode, logging to /home/hadoop/work/hadoop-2.6.2/logs/hadoop-hadoop-journalnode-bi12.out
bi13: starting journalnode, logging to /home/hadoop/work/hadoop-2.6.2/logs/hadoop-hadoop-journalnode-bi13.out
2. On the primary NameNode (bi10), format the primary NameNode:
[hadoop@bi10 ~]$ hdfs namenode -format
3. On the primary NameNode (bi10), format the ZKFC state in ZooKeeper:
[hadoop@bi10 ~]$ hdfs zkfc -formatZK
4. On the primary NameNode (bi10), start the primary NameNode:
[hadoop@bi10 ~]$ hadoop-daemon.sh start namenode
5. On the standby NameNode (bi12), sync the NameNode metadata:
[hadoop@bi12 ~]$ hdfs namenode -bootstrapStandby
6. On the standby NameNode (bi12), start the standby NameNode:
[hadoop@bi12 ~]$ hadoop-daemon.sh start namenode
[hadoop@bi10 ~]$ jps
1914 JournalNode
2294 Jps
2109 NameNode
[hadoop@bi12 ~]$ jps
12063 NameNode
12141 Jps
11843 JournalNode
[hadoop@bi13 ~]$ jps
22197 JournalNode
22323 Jps
Check the HA znodes in ZooKeeper:
[zk: localhost:2181(CONNECTED) 13] ls /hadoop-ha
[ns1, masters]
7. On both bi10 and bi12, start automatic NameNode failover (zkfc):
[hadoop@bi10 ~]$ hadoop-daemon.sh start zkfc
starting zkfc, logging to /home/hadoop/work/hadoop-2.6.2/logs/hadoop-hadoop-zkfc-bi10.out
[hadoop@bi12 ~]$ hadoop-daemon.sh start zkfc
starting zkfc, logging to /home/hadoop/work/hadoop-2.6.2/logs/hadoop-hadoop-zkfc-bi12.out
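Once the zkfc daemons are up, ZooKeeper elects one NameNode active, which can take a few seconds. A small polling sketch; `state_of` is stubbed here so the sketch is self-contained, and on a real cluster it would shell out to `hdfs haadmin -getServiceState`:

```shell
# Poll until a NameNode reports "active" (stubbed state for illustration).
state_of() { echo active; }    # real: hdfs haadmin -getServiceState "$1"

wait_active() {
  tries=0
  while [ "$tries" -lt 10 ]; do
    if [ "$(state_of "$1")" = active ]; then
      echo "$1 is active"
      return 0
    fi
    tries=$((tries + 1))
    sleep 1
  done
  echo "$1 never became active"
  return 1
}

wait_active nn1
```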
Start the three DataNodes
1. On the primary NameNode (bi10), run:
[hadoop@bi10 ~]$ hadoop-daemons.sh start datanode
bi10: starting datanode, logging to /home/hadoop/work/hadoop-2.6.2/logs/hadoop-hadoop-datanode-bi10.out
bi12: starting datanode, logging to /home/hadoop/work/hadoop-2.6.2/logs/hadoop-hadoop-datanode-bi12.out
bi13: starting datanode, logging to /home/hadoop/work/hadoop-2.6.2/logs/hadoop-hadoop-datanode-bi13.out
Start YARN
1. On the primary ResourceManager (bi12), start YARN:
[hadoop@bi12 ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/work/hadoop-2.6.2/logs/yarn-hadoop-resourcemanager-bi12.out
bi10: starting nodemanager, logging to /home/hadoop/work/hadoop-2.6.2/logs/yarn-hadoop-nodemanager-bi10.out
bi12: starting nodemanager, logging to /home/hadoop/work/hadoop-2.6.2/logs/yarn-hadoop-nodemanager-bi12.out
bi13: starting nodemanager, logging to /home/hadoop/work/hadoop-2.6.2/logs/yarn-hadoop-nodemanager-bi13.out
2. On the standby ResourceManager (bi13), start the ResourceManager:
yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /home/hadoop/work/hadoop-2.6.2/logs/yarn-hadoop-resourcemanager-bi13.out
[hadoop@bi10 ~]$ jps
2659 NodeManager
1914 JournalNode
2784 Jps
2347 DFSZKFailoverController
2515 DataNode
2109 NameNode
[hadoop@bi12 ~]$ jps
12063 NameNode
12403 DataNode
11843 JournalNode
12569 ResourceManager
12270 DFSZKFailoverController
12678 NodeManager
13031 Jps
[hadoop@bi13 ~]$ jps
22729 Jps
22383 DataNode
22197 JournalNode
22553 NodeManager
22691 ResourceManager
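Rather than eyeballing jps on every host, the expected daemon set can be checked mechanically. A small helper, fed here from a captured sample of the bi10 output above; on a real node, pipe `jps` into it directly:

```shell
# check_daemons "A B C" reads a jps listing on stdin and verifies that
# every named daemon appears in it.
check_daemons() {
  out=$(cat)
  for want in $1; do
    echo "$out" | grep -qw "$want" || { echo "MISSING: $want"; return 1; }
  done
  echo "all daemons up"
}

printf '2659 NodeManager\n1914 JournalNode\n2347 DFSZKFailoverController\n2515 DataNode\n2109 NameNode\n' |
  check_daemons "NameNode DataNode JournalNode NodeManager DFSZKFailoverController"
```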
Cluster tests
WordCount test
1. Upload a test file:
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -mkdir /user
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -mkdir /user/hadoop
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -mkdir /user/hadoop/wordcount
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -mkdir /user/hadoop/wordcount/input
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -put ./LICENSE.txt /user/hadoop/wordcount/input
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -ls /user/hadoop/wordcount/input
Found 1 items
-rw-r--r-- 2 hadoop supergroup 15429 2016-02-16 15:38 /user/hadoop/wordcount/input/LICENSE.txt
2. Run the WordCount example:
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar wordcount /user/hadoop/wordcount/input /user/hadoop/wordcount/output
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -ls /user/hadoop/wordcount/output
Found 2 items
-rw-r--r-- 2 hadoop supergroup 0 2016-02-16 15:45 /user/hadoop/wordcount/output/_SUCCESS
-rw-r--r-- 2 hadoop supergroup 8006 2016-02-16 15:45 /user/hadoop/wordcount/output/part-r-00000
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -cat /user/hadoop/wordcount/output/part-r-00000
NameNode failover test
1. Check the current NameNode states:
[hadoop@bi10 hadoop-2.6.2]$ hdfs haadmin -getServiceState nn1
active
[hadoop@bi10 hadoop-2.6.2]$ hdfs haadmin -getServiceState nn2
standby
2. Simulate a crash of the active NameNode by stopping it on bi10, then check that nn2 has taken over:
[hadoop@bi10 hadoop-2.6.2]$ hadoop-daemon.sh stop namenode
stopping namenode
[hadoop@bi10 hadoop-2.6.2]$ jps
2659 NodeManager
3691 Jps
1914 JournalNode
2347 DFSZKFailoverController
2515 DataNode
[hadoop@bi10 hadoop-2.6.2]$ hdfs haadmin -getServiceState nn1
16/02/16 15:53:48 INFO ipc.Client: Retrying connect to server: bi10/192.168.2.10:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From bi10/192.168.2.10 to bi10:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
[hadoop@bi10 hadoop-2.6.2]$ hdfs haadmin -getServiceState nn2
active
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -ls
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2016-02-16 15:45 wordcount
3. Restart the NameNode on bi10 and check the NameNode states again:
[hadoop@bi10 hadoop-2.6.2]$ hadoop-daemon.sh start namenode
starting namenode, logging to /home/hadoop/work/hadoop-2.6.2/logs/hadoop-hadoop-namenode-bi10.out
[hadoop@bi10 hadoop-2.6.2]$ hdfs haadmin -getServiceState nn1
standby
[hadoop@bi10 hadoop-2.6.2]$ hdfs haadmin -getServiceState nn2
active
ResourceManager failover test
1. Simulate a crash of the active ResourceManager, then check the states:
[hadoop@bi12 ~]$ yarn rmadmin -getServiceState rm1
active
[hadoop@bi12 ~]$ yarn rmadmin -getServiceState rm2
standby
[hadoop@bi12 ~]$ yarn-daemon.sh stop resourcemanager
stopping resourcemanager
[hadoop@bi12 ~]$ yarn rmadmin -getServiceState rm2
active
[hadoop@bi12 ~]$ yarn rmadmin -getServiceState rm1
16/02/16 16:11:36 INFO ipc.Client: Retrying connect to server: bi12/192.168.2.12:8033. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From bi12/192.168.2.12 to bi12:8033 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2. Restart the ResourceManager on bi12 and check the states:
[hadoop@bi12 ~]$ yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /home/hadoop/work/hadoop-2.6.2/logs/yarn-hadoop-resourcemanager-bi12.out
[hadoop@bi12 ~]$ yarn rmadmin -getServiceState rm1
standby
[hadoop@bi12 ~]$ yarn rmadmin -getServiceState rm2
active
3. Run WordCount again:
[hadoop@bi10 hadoop-2.6.2]$ hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar wordcount /user/hadoop/wordcount/input /user/hadoop/wordcount/output1
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -ls wordcount/output1
Found 2 items
-rw-r--r-- 2 hadoop supergroup 0 2016-02-16 16:14 wordcount/output1/_SUCCESS
-rw-r--r-- 2 hadoop supergroup 8006 2016-02-16 16:14 wordcount/output1/part-r-00000
Shutting down the cluster:
yarn-daemons.sh stop nodemanager
yarn-daemons.sh stop resourcemanager
hadoop-daemons.sh stop datanode
hadoop-daemons.sh stop zkfc
hadoop-daemons.sh stop namenode
hadoop-daemons.sh stop journalnode
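The shutdown commands above mirror the startup order in reverse. A wrapper sketch that only prints what it would run unless DRY_RUN=0, so it can be read and tried without a live cluster (on a real cluster, the per-host daemons such as resourcemanager, zkfc, and namenode can instead be stopped with the singular *-daemon.sh on the relevant host):

```shell
# Stop the whole cluster, top of the stack first.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi
}
run yarn-daemons.sh   stop nodemanager
run yarn-daemons.sh   stop resourcemanager
run hadoop-daemons.sh stop datanode
run hadoop-daemons.sh stop zkfc
run hadoop-daemons.sh stop namenode
run hadoop-daemons.sh stop journalnode
```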