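Table of Contents

  • 1 Alibaba Cloud Hosts (3 Machines)
  • 1.1 Select Elastic Compute Service (ECS)
  • 1.2 Create Instances
  • 1.3 Basic Configuration
  • 1.4 Network and Security Group
  • 1.5 System Configuration
  • 1.6 Confirm the Instances
  • 1.7 Instance List
  • 2 Environment Preparation
  • 2.1 Users and Directories
  • 2.2 Software
  • 2.3 Bind IPs to Hostnames
  • 2.4 Passwordless SSH Between the 3 Machines
  • 3 Install the JDK
  • 4 Install ZooKeeper
  • 5 Install Hadoop
  • 5.1 Extract (3 Machines)
  • 5.2 Configure Environment Variables (3 Machines)
  • 5.3 Configure hadoop-env.sh
  • 5.4 Configure core-site.xml
  • 5.5 Configure hdfs-site.xml
  • 5.6 Configure mapred-site.xml
  • 5.7 Configure yarn-site.xml
  • 5.8 slaves
  • 6 Start Hadoop
  • 6.1 Start the JournalNodes (3 Machines)
  • 6.2 Format the NameNode
  • 6.3 Synchronize the Metadata
  • 6.4 Initialize ZKFC
  • 6.5 Start HDFS
  • 6.6 Start YARN
  • 6.7 Start the JobHistory Server
  • 7 Shut Down the Cluster
  • 8 Restart the Cluster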


1 Alibaba Cloud Hosts (3 Machines)

Registration and account top-up are not shown here. We need to buy three machines; choose pay-as-you-go billing so the instances can be released once the cluster has been built.

1.1 Select Elastic Compute Service (ECS)

1.2 Create Instances

1.3 Basic Configuration

1.4 Network and Security Group

1.5 System Configuration

1.6 Confirm the Instances

1.7 Instance List

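2 Environment Preparation

Host plan (x marks the hosts each process runs on, matching the configuration files in section 5)

Process                    hadoop001  hadoop002  hadoop003
ZooKeeper                  x          x          x
NameNode                   x          x
DataNode                   x          x          x
JournalNode                x          x          x
ResourceManager            x          x
NodeManager                x          x          x
DFSZKFailoverController    x          x
JobHistoryServer           x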

2.1 Users and Directories

Create the user:

useradd hadoop

Switch to the hadoop user:

su - hadoop

Create the working directories in the hadoop user's home directory:

mkdir app data lib maven_repos script software source tmp

2.2 Software

Baidu Netdisk link: https://pan.baidu.com/s/1NUghNdmkjiC6sRenfxAKlg password: ffj4

Download the JDK, ZooKeeper, and Hadoop archives, then upload them with SecureCRT or Xshell to /home/hadoop/software on all three machines.

2.3 Bind IPs to Hostnames

Switch back to the root user on all three machines, then run the commands below. Note: use the private (intranet) IP addresses.

echo '# IP-to-hostname bindings' >> /etc/hosts
echo '172.19.94.117 hadoop001' >> /etc/hosts
echo '172.19.94.119 hadoop002' >> /etc/hosts
echo '172.19.94.118 hadoop003' >> /etc/hosts

Check that the entries were added, for example with cat /etc/hosts.
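Running the echo commands above a second time would append duplicate lines. The sketch below shows one way to make the step repeatable. The helper name add_host_entry is made up for this illustration, and the demo writes to a scratch file rather than the real /etc/hosts.

```shell
# add_host_entry FILE IP HOSTNAME (hypothetical helper)
# Appends "IP HOSTNAME" to FILE unless some non-comment line already maps HOSTNAME.
add_host_entry() {
    file=$1 ip=$2 name=$3
    if awk -v h="$name" '
        $0 !~ /^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
        END { exit !found }
    ' "$file"; then
        return 0   # hostname already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Demo on a scratch file, not the real /etc/hosts.
hosts_file=$(mktemp)
add_host_entry "$hosts_file" 172.19.94.117 hadoop001
add_host_entry "$hosts_file" 172.19.94.119 hadoop002
add_host_entry "$hosts_file" 172.19.94.118 hadoop003
add_host_entry "$hosts_file" 172.19.94.117 hadoop001   # repeated call is a no-op
cat "$hosts_file"
```

To use it for real, pass /etc/hosts as the first argument while logged in as root.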

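2.4 Passwordless SSH Between the 3 Machines

1. Switch to the hadoop user:

su - hadoop

2. Run the command below and press Enter three times at the prompts:

ssh-keygen

A .ssh directory then appears in the home directory. It contains two files: id_rsa is the private key and id_rsa.pub is the public key.

3. In the .ssh directory, create an authorized_keys file and append the id_rsa.pub public keys of all three machines to it, so that every machine's authorized_keys holds all three keys.

4. Run the following commands on each of the three machines:

ssh hadoop001 date
ssh hadoop002 date
ssh hadoop003 date

The first connection to each host asks for yes/no because the host key has to be verified once; answer yes.

Passwordless SSH trust among the three machines is now in place.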
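Step 3 above can be scripted once the three id_rsa.pub files have been gathered in one place. The following is a minimal sketch, not the author's method: merge_pubkeys is a hypothetical helper, the keys in the demo are fake placeholders, and the chmod is there because sshd with its default StrictModes setting rejects an authorized_keys file that group or others can write.

```shell
# merge_pubkeys AUTH_FILE PUBKEY_FILE... (hypothetical helper)
# Appends each public key to AUTH_FILE unless that exact line is already present.
merge_pubkeys() {
    auth=$1; shift
    touch "$auth"
    for pub in "$@"; do
        # -F fixed string, -x whole-line match, -f read the pattern from the key file
        grep -qxF -f "$pub" "$auth" || cat "$pub" >> "$auth"
    done
    chmod 600 "$auth"   # owner-only access, as sshd StrictModes expects
}

# Demo with placeholder keys in a scratch directory.
work=$(mktemp -d)
printf 'ssh-rsa AAAA1 hadoop@hadoop001\n' > "$work/h1.pub"
printf 'ssh-rsa AAAA2 hadoop@hadoop002\n' > "$work/h2.pub"
printf 'ssh-rsa AAAA3 hadoop@hadoop003\n' > "$work/h3.pub"
merge_pubkeys "$work/authorized_keys" "$work/h1.pub" "$work/h2.pub" "$work/h3.pub"
merge_pubkeys "$work/authorized_keys" "$work/h1.pub"   # duplicate key is skipped
wc -l < "$work/authorized_keys"
```

On a real machine the first argument would be ~/.ssh/authorized_keys, and the resulting file would then be copied to the other two hosts.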

3 Install the JDK

First switch to the root user on all three machines.

1. Create the directory:

mkdir /usr/java

2. Extract the JDK:

tar -zxvf /home/hadoop/software/jdk-8u45-linux-x64.gz -C /usr/java

3. Configure the environment variables:

echo 'export JAVA_HOME=/usr/java/jdk1.8.0_45' >> /etc/profile
echo 'export PATH=$JAVA_HOME/bin:$PATH' >> /etc/profile

4. Apply the environment:

source /etc/profile

5. Change the owner and group of the JDK files to root:

chown -R root:root /usr/java/*

6. Verify the installation:

java -version

4 Install ZooKeeper

First switch to the hadoop user on all three machines:

su - hadoop

1. Extract ZooKeeper (on all three machines):

tar -zxvf ~/software/zookeeper-3.4.6.tar.gz -C ~/app/

2. Change to the app directory (on all three machines):

cd ~/app

3. Create a symbolic link (on all three machines):

ln -s zookeeper-3.4.6 zookeeper

4. Edit the configuration file (on hadoop001)

Go to the conf directory:

cd ~/app/zookeeper/conf

Make a copy of zoo_sample.cfg:

cp zoo_sample.cfg zoo.cfg

Edit zoo.cfg:

vi zoo.cfg

Change the dataDir path:

dataDir=/home/hadoop/data/zookeeper

Add the server addresses:

server.1=hadoop001:2888:3888
server.2=hadoop002:2888:3888
server.3=hadoop003:2888:3888

5. Copy zoo.cfg to the other two machines (run on hadoop001):

scp ~/app/zookeeper/conf/zoo.cfg hadoop002:/home/hadoop/app/zookeeper/conf/
scp ~/app/zookeeper/conf/zoo.cfg hadoop003:/home/hadoop/app/zookeeper/conf/

6. Create the dataDir directory (on all three machines)
The dataDir configured above does not exist yet:

mkdir ~/data/zookeeper

7. Create myid (different on each machine)
Each machine needs its own ID, equal to the N of its server.N line in zoo.cfg; without it a server cannot tell which member of the ensemble it is.
On hadoop001:

echo 1 > ~/data/zookeeper/myid

On hadoop002:

echo 2 > ~/data/zookeeper/myid

On hadoop003:

echo 3 > ~/data/zookeeper/myid
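Since the myid value has to agree with the server.N lines, it can be derived from zoo.cfg instead of typed by hand. The sketch below assumes short hostnames without dots, as used in this article; myid_for_host is a hypothetical helper, and the demo uses a scratch copy of the configuration.

```shell
# myid_for_host CFG HOST (hypothetical helper)
# Prints N from the line "server.N=HOST:port:port" in CFG, or nothing if HOST is not listed.
myid_for_host() {
    # Splitting on '.', '=' and ':' turns "server.2=hadoop002:2888:3888"
    # into the fields: server, 2, hadoop002, 2888, 3888
    awk -F'[.=:]' -v h="$2" '$1 == "server" && $3 == h { print $2 }' "$1"
}

# Demo against a scratch zoo.cfg with the same entries as above.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
dataDir=/home/hadoop/data/zookeeper
server.1=hadoop001:2888:3888
server.2=hadoop002:2888:3888
server.3=hadoop003:2888:3888
EOF
myid_for_host "$cfg" hadoop002
```

On a machine whose hostname command returns one of the three names, the output could be redirected straight into ~/data/zookeeper/myid. Whether the ECS hostname actually matches is an assumption to check first.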

8. Configure the environment variables (on all three machines):

echo '# ZooKeeper environment variables' >> ~/.bash_profile
echo 'export ZOOKEEPER_HOME=/home/hadoop/app/zookeeper' >> ~/.bash_profile
echo 'export PATH=$ZOOKEEPER_HOME/bin:$PATH' >> ~/.bash_profile

9. Apply the environment variables (on all three machines):

source ~/.bash_profile

10. Start ZooKeeper (on all three machines):

zkServer.sh start

11. Check the ZooKeeper status:

zkServer.sh status

Run it on each of hadoop001, hadoop002, and hadoop003. One machine should report Mode: leader and the other two Mode: follower.
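The status check can be turned into a pass/fail test by counting the Mode: lines from all three hosts. This is only a sketch: quorum_ok is a hypothetical helper, and the demo feeds it sample text rather than live zkServer.sh output.

```shell
# quorum_ok N (hypothetical helper)
# Reads combined `zkServer.sh status` output on stdin and succeeds only if it
# shows exactly one leader and N-1 followers.
quorum_ok() {
    awk -v n="$1" '
        /^Mode: leader/   { leaders++ }
        /^Mode: follower/ { followers++ }
        END { exit !(leaders == 1 && followers == n - 1) }
    '
}

# Demo with sample status lines for a three-node ensemble.
printf 'Mode: follower\nMode: leader\nMode: follower\n' | quorum_ok 3 && echo "quorum healthy"

# Against the real cluster the input could come from something like:
#   for h in hadoop001 hadoop002 hadoop003; do
#       ssh "$h" 'source ~/.bash_profile; zkServer.sh status'
#   done 2>/dev/null | quorum_ok 3
```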

5 Install Hadoop

First switch to the hadoop user (on all three machines):

su - hadoop

5.1 Extract (3 Machines)

tar -zxvf ~/software/hadoop-2.6.0-cdh5.15.1.tar.gz -C ~/app/

Create a symbolic link in the app directory:

cd ~/app
ln -s hadoop-2.6.0-cdh5.15.1 hadoop

List the directory (ls -l) to confirm the link.

5.2 Configure Environment Variables (3 Machines)

Enter the following in the terminal to append the settings to .bash_profile:

echo '# Hadoop environment variables' >> ~/.bash_profile
echo 'export HADOOP_HOME=/home/hadoop/app/hadoop' >> ~/.bash_profile
echo 'export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH' >> ~/.bash_profile

Check that the lines were appended, for example with tail ~/.bash_profile.

Then apply them:

source ~/.bash_profile

Check that they took effect, for example with hadoop version.

5.3 Configure hadoop-env.sh

On hadoop001, go to the directory that holds the configuration files:

cd ~/app/hadoop/etc/hadoop

Set the JAVA_HOME variable in hadoop-env.sh to the JDK installed in section 3 (export JAVA_HOME=/usr/java/jdk1.8.0_45).

Send the modified file to the other two machines, hadoop002 and hadoop003:

scp hadoop-env.sh hadoop002:/home/hadoop/app/hadoop/etc/hadoop/
scp hadoop-env.sh hadoop003:/home/hadoop/app/hadoop/etc/hadoop/
5.4 Configure core-site.xml

core-site.xml is fairly long, so edit it on Windows or a Mac and then upload it to the servers.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- YARN needs fs.defaultFS to specify the NameNode URI -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ruozeclusterg7</value>
    </property>
    <!-- ============================== Trash ============================== -->
    <property>
        <!-- Minutes between trash checkpoints. The checkpointer running on the NameNode
             creates each checkpoint from the Current folder. Default: 0, which means
             the value of fs.trash.interval is used. -->
        <name>fs.trash.checkpoint.interval</name>
        <value>0</value>
    </property>
    <property>
        <!-- Minutes after which a checkpoint directory under .Trash is deleted. The
             server-side setting takes precedence over the client's. Default: 0, which
             disables the trash feature. -->
        <name>fs.trash.interval</name>
        <value>1440</value>
    </property>

    <!-- Hadoop temporary directory. hadoop.tmp.dir is a base setting that many other
         paths depend on. If hdfs-site.xml does not set the NameNode and DataNode
         storage locations, they default to paths under this directory. -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/tmp/hadoop</value>
    </property>

    <!-- ZooKeeper addresses -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
    </property>
    <!-- ZooKeeper session timeout, in milliseconds -->
    <property>
        <name>ha.zookeeper.session-timeout.ms</name>
        <value>2000</value>
    </property>
    <!-- If your user is not hadoop, replace hadoop in the two property names below
         with your user name -->
    <property>
        <name>hadoop.proxyuser.hadoop.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hadoop.groups</name>
        <value>*</value>
    </property>

    <property>
        <name>io.compression.codecs</name>
        <value>org.apache.hadoop.io.compress.GzipCodec,
            org.apache.hadoop.io.compress.DefaultCodec,
            org.apache.hadoop.io.compress.BZip2Codec,
            org.apache.hadoop.io.compress.SnappyCodec
        </value>
    </property>
</configuration>

The tmp directory named in the configuration must be created and given 777 permissions (run on all three machines):

mkdir ~/tmp/hadoop
chmod -R 777 ~/tmp/hadoop

5.5 Configure hdfs-site.xml

hdfs-site.xml is fairly long, so edit it on Windows or a Mac and then upload it to the servers.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- HDFS superuser group -->
    <property>
        <name>dfs.permissions.superusergroup</name>
        <value>hadoop</value>
    </property>

    <!-- Enable WebHDFS -->
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/hadoop/data/dfs/name</value>
        <description>Local directory where the NameNode stores the name table (fsimage); adjust to your environment</description>
    </property>
    <property>
        <name>dfs.namenode.edits.dir</name>
        <value>${dfs.namenode.name.dir}</value>
        <description>Local directory where the NameNode stores the transaction files (edits); adjust to your environment</description>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/hadoop/data/dfs/data</value>
        <description>Local directory where the DataNode stores blocks; adjust to your environment</description>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- Block size 128 MB (the default is 128 MB) -->
    <property>
        <name>dfs.blocksize</name>
        <value>134217728</value>
    </property>
    <!-- ======================================================================= -->
    <!-- HDFS high-availability configuration -->
    <!-- The HDFS nameservice is ruozeclusterg7; it must match the one in core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>ruozeclusterg7</value>
    </property>
    <property>
        <!-- NameNode IDs; this version supports at most two NameNodes -->
        <name>dfs.ha.namenodes.ruozeclusterg7</name>
        <value>nn1,nn2</value>
    </property>

    <!-- HDFS HA: dfs.namenode.rpc-address.[nameservice ID].[NameNode ID], RPC address -->
    <property>
        <name>dfs.namenode.rpc-address.ruozeclusterg7.nn1</name>
        <value>hadoop001:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ruozeclusterg7.nn2</name>
        <value>hadoop002:8020</value>
    </property>

    <!-- HDFS HA: dfs.namenode.http-address.[nameservice ID].[NameNode ID], HTTP address -->
    <property>
        <name>dfs.namenode.http-address.ruozeclusterg7.nn1</name>
        <value>hadoop001:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.ruozeclusterg7.nn2</name>
        <value>hadoop002:50070</value>
    </property>

    <!-- ================== NameNode edit log synchronization ================== -->
    <!-- Ensures the data can be recovered -->
    <property>
        <name>dfs.journalnode.http-address</name>
        <value>0.0.0.0:8480</value>
    </property>
    <property>
        <name>dfs.journalnode.rpc-address</name>
        <value>0.0.0.0:8485</value>
    </property>
    <property>
        <!-- JournalNode server addresses; QuorumJournalManager stores the edit log there -->
        <!-- Format: qjournal://<host1:port1>;<host2:port2>;<host3:port3>/<journalId>
             The port is the same as in dfs.journalnode.rpc-address -->
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop001:8485;hadoop002:8485;hadoop003:8485/ruozeclusterg7</value>
    </property>

    <property>
        <!-- Directory where the JournalNode stores its data -->
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hadoop/data/dfs/jn</value>
    </property>
    <!-- ================== Client failover ================== -->
    <property>
        <!-- Strategy that DataNodes and clients use to find the active NameNode -->
        <!-- Implementation class for automatic failover -->
        <name>dfs.client.failover.proxy.provider.ruozeclusterg7</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- ================== NameNode fencing ================== -->
    <!-- After a failover, keep the stopped NameNode from coming back and producing two active services -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <property>
        <!-- Milliseconds after which the SSH fencing connection is considered failed -->
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>

    <!-- ======== NameNode automatic failover based on ZKFC and ZooKeeper ======== -->
    <!-- Enable ZooKeeper-based automatic failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- List of DataNodes permitted to connect to the NameNode -->
    <property>
        <name>dfs.hosts</name>
        <value>/home/hadoop/app/hadoop/etc/hadoop/slaves</value>
    </property>
</configuration>
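The nameservice in dfs.nameservices has to match the authority part of fs.defaultFS in core-site.xml, and a typo in either file is easy to miss. The sketch below compares the two. It assumes each name and value sits on its own line, as in the files above; get_prop is a hypothetical helper and not a Hadoop tool, and the demo runs on trimmed-down scratch copies of the two files.

```shell
# get_prop FILE NAME (hypothetical helper)
# Prints the <value> that follows <name>NAME</name> in a Hadoop XML config file.
get_prop() {
    awk -v want="<name>$2</name>" '
        index($0, want)  { hit = 1; next }
        hit && /<value>/ { sub(/.*<value>/, ""); sub(/<\/value>.*/, ""); print; exit }
    ' "$1"
}

# Demo on scratch copies that hold only the two relevant properties.
conf=$(mktemp -d)
cat > "$conf/core-site.xml" <<'EOF'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ruozeclusterg7</value>
    </property>
</configuration>
EOF
cat > "$conf/hdfs-site.xml" <<'EOF'
<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>ruozeclusterg7</value>
    </property>
</configuration>
EOF

fs=$(get_prop "$conf/core-site.xml" fs.defaultFS)
ns=$(get_prop "$conf/hdfs-site.xml" dfs.nameservices)
if [ "${fs#hdfs://}" = "$ns" ]; then
    echo "nameservice matches: $ns"
else
    echo "mismatch: fs.defaultFS=$fs dfs.nameservices=$ns" >&2
fi
```

Pointing conf at $HADOOP_HOME/etc/hadoop instead of the scratch directory would run the same comparison on the real files.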

5.6 Configure mapred-site.xml

Edit mapred-site.xml on Windows or a Mac as well, then upload it to the servers.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- Run MapReduce applications on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- JobHistory Server ==================== -->
    <!-- MapReduce JobHistory Server address; default port 10020 -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop001:10020</value>
    </property>
    <!-- MapReduce JobHistory Server web UI address; default port 19888 -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop001:19888</value>
    </property>

    <!-- Compress map output with Snappy -->
    <property>
        <name>mapreduce.map.output.compress</name>
        <value>true</value>
    </property>

    <property>
        <name>mapreduce.map.output.compress.codec</name>
        <value>org.apache.hadoop.io.compress.SnappyCodec</value>
    </property>
</configuration>

5.7 Configure yarn-site.xml

yarn-site.xml is fairly long, so edit it on Windows or a Mac and then upload it to the servers.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- NodeManager configuration ================================================= -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.nodemanager.localizer.address</name>
        <value>0.0.0.0:23344</value>
        <description>Address where the localizer IPC is.</description>
    </property>
    <property>
        <name>yarn.nodemanager.webapp.address</name>
        <value>0.0.0.0:23999</value>
        <description>NM Webapp address.</description>
    </property>

    <!-- HA configuration =============================================================== -->
    <!-- Resource Manager Configs -->
    <property>
        <name>yarn.resourcemanager.connect.retry-interval.ms</name>
        <value>2000</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- Enable embedded automatic failover; in an HA setup it works with ZKRMStateStore to handle fencing -->
    <property>
        <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
        <value>true</value>
    </property>
    <!-- Cluster name, so that the HA election applies to the right cluster -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yarn-cluster</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>


    <!-- The ID of the local RM can be set separately on each RM host (optional)
    <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>rm2</value>
    </property>
    -->

    <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
        <value>5000</value>
    </property>
    <!-- ZKRMStateStore configuration -->
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk.state-store.address</name>
        <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
    </property>
    <!-- RPC address clients use to reach the RM (applications manager interface) -->
    <property>
        <name>yarn.resourcemanager.address.rm1</name>
        <value>hadoop001:23140</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address.rm2</name>
        <value>hadoop002:23140</value>
    </property>
    <!-- RPC address ApplicationMasters use to reach the RM (scheduler interface) -->
    <property>
        <name>yarn.resourcemanager.scheduler.address.rm1</name>
        <value>hadoop001:23130</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address.rm2</name>
        <value>hadoop002:23130</value>
    </property>
    <!-- RM admin interface -->
    <property>
        <name>yarn.resourcemanager.admin.address.rm1</name>
        <value>hadoop001:23141</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address.rm2</name>
        <value>hadoop002:23141</value>
    </property>
    <!-- RPC address NodeManagers use to reach the RM -->
    <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
        <value>hadoop001:23125</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
        <value>hadoop002:23125</value>
    </property>
    <!-- RM web application address -->
    <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>hadoop001:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>hadoop002:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.https.address.rm1</name>
        <value>hadoop001:23189</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.https.address.rm2</name>
        <value>hadoop002:23189</value>
    </property>
    <!-- Log aggregation -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log.server.url</name>
        <value>http://hadoop001:19888/jobhistory/logs</value>
    </property>

    <!-- Resource configuration -->
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>2048</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>1024</value>
        <description>Minimum memory a single task can request; default 1024 MB</description>
    </property>

    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>2048</value>
        <description>Maximum memory a single task can request; default 8192 MB</description>
    </property>

    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>2</value>
    </property>

</configuration>

5.8 slaves

The slaves file lists the worker hosts, one per line:

hadoop001
hadoop002
hadoop003
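With three worker hosts in slaves and the memory limits from yarn-site.xml, a quick calculation shows how little room this cluster has for containers. The sketch below looks at memory only and ignores the vcore limit and any scheduler rounding, so treat the numbers as a rough upper bound. The variable names are chosen for this illustration.

```shell
# Values copied from yarn-site.xml and slaves above.
nm_memory_mb=2048     # yarn.nodemanager.resource.memory-mb
min_alloc_mb=1024     # yarn.scheduler.minimum-allocation-mb
max_alloc_mb=2048     # yarn.scheduler.maximum-allocation-mb
nodes=3               # hosts listed in slaves

# How many minimum-size containers fit, by memory alone.
per_node=$((nm_memory_mb / min_alloc_mb))
cluster_total=$((per_node * nodes))
echo "minimum-size containers per node: $per_node"
echo "minimum-size containers in cluster: $cluster_total"

# A request larger than one NodeManager's memory could never be placed.
if [ "$max_alloc_mb" -le "$nm_memory_mb" ]; then
    echo "largest allowed request fits on a single node"
fi
```

Raising yarn.nodemanager.resource.memory-mb on larger instances changes the first value and everything derived from it.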

6 Start Hadoop

6.1 Start the JournalNodes (3 Machines)

First start the JournalNode process on each JournalNode host:

hadoop-daemon.sh start journalnode

6.2 Format the NameNode

Format the NameNode on hadoop001:

hadoop namenode -format

Formatting is needed only before the first start.

6.3 Synchronize the Metadata

Copy the metadata from hadoop001 to hadoop002 so that the two NameNodes hold identical metadata:

scp -r ~/data/dfs/name hadoop002:/home/hadoop/data/dfs

6.4 Initialize ZKFC

Run this once, from one of the NameNode hosts:

hdfs zkfc -formatZK

6.5 Start HDFS

Run on hadoop001:

start-dfs.sh

The first start does not come up cleanly. The DataNodes are worker nodes, and start-dfs.sh finds them by reading the slaves file, so check that file's contents and type (for example with cat and file).

The file has a problem: it was written on Windows, so it has Windows line endings and must be converted with dos2unix (install the package as root):

yum install -y dos2unix
dos2unix $HADOOP_HOME/etc/hadoop/slaves

Checking the file again shows that it is now fine.
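The same problem can be detected before starting anything by looking for carriage returns at the ends of lines. The sketch below is one way to do it; has_crlf is a hypothetical helper, the demo works on a scratch file, and the tr command is an assumed stand-in for dos2unix on machines where that package is not installed.

```shell
# has_crlf FILE (hypothetical helper)
# Succeeds if any line in FILE ends with a carriage return (Windows line ending).
has_crlf() {
    cr=$(printf '\r')
    grep -q "${cr}\$" "$1"
}

# Demo: a scratch slaves file saved with Windows line endings.
f=$(mktemp)
printf 'hadoop001\r\nhadoop002\r\nhadoop003\r\n' > "$f"
has_crlf "$f" && echo "CRLF line endings found"

# Strip the carriage returns, then check again.
tr -d '\r' < "$f" > "$f.unix" && mv "$f.unix" "$f"
has_crlf "$f" || echo "file is clean"
```

Running has_crlf over every file uploaded from Windows, including the four XML files, would catch the issue in one pass.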


Copy the file to the other two machines:

scp $HADOOP_HOME/etc/hadoop/slaves hadoop002:/home/hadoop/app/hadoop/etc/hadoop
scp $HADOOP_HOME/etc/hadoop/slaves hadoop003:/home/hadoop/app/hadoop/etc/hadoop

Then start HDFS again:

start-dfs.sh

This time it starts successfully.

The NameNode web UI on hadoop001 (port 50070) shows the state active.

The NameNode web UI on hadoop002 shows the state standby.
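The active/standby state can also be read from the command line with hdfs haadmin -getServiceState, which avoids opening the web UI. The wrapper below is a small sketch around that command. active_namenode and the HDFS_BIN override are names invented here, and the override exists only so the function can be exercised with a stub on a machine without Hadoop.

```shell
# active_namenode ID... (hypothetical helper)
# Prints the first NameNode ID whose HA state is "active"; fails if none is.
# HDFS_BIN may name a replacement for the hdfs command (assumption, for testing).
active_namenode() {
    for id in "$@"; do
        state=$("${HDFS_BIN:-hdfs}" haadmin -getServiceState "$id" 2>/dev/null)
        if [ "$state" = "active" ]; then
            echo "$id"
            return 0
        fi
    done
    return 1
}
```

On the cluster it would be called as active_namenode nn1 nn2, using the IDs from dfs.ha.namenodes.ruozeclusterg7. A non-zero exit status means neither NameNode reported active, which is worth investigating before starting YARN.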

6.6 Start YARN

Run on hadoop001:

start-yarn.sh

The output suggests that the ResourceManager on hadoop002 did not start.

Checking hadoop002 (for example with jps) confirms that no ResourceManager is running there.

This is a pitfall: start-yarn.sh starts the ResourceManager only on the machine where it is run, so the one on hadoop002 has to be started manually:

yarn-daemon.sh start resourcemanager

The web UI of hadoop001: http://47.103.149.67:8088

The web UI of hadoop002: http://47.103.146.169:8088/cluster/cluster

6.7 Start the JobHistory Server

Run on hadoop001, the host set in mapred-site.xml:

mr-jobhistory-daemon.sh start historyserver

It is mainly used to inspect the history of finished jobs. Its web UI: http://47.103.149.67:19888/jobhistory

7 Shut Down the Cluster

1. Stop Hadoop (YARN first, then HDFS)

[hadoop@hadoop001 ~]$ mr-jobhistory-daemon.sh stop historyserver
[hadoop@hadoop001 sbin]$ stop-yarn.sh
[hadoop@hadoop002 sbin]$ yarn-daemon.sh stop resourcemanager
[hadoop@hadoop001 sbin]$ stop-dfs.sh

2. Stop ZooKeeper

[hadoop@hadoop001 bin]$ zkServer.sh stop
[hadoop@hadoop002 bin]$ zkServer.sh stop
[hadoop@hadoop003 bin]$ zkServer.sh stop

8 Restart the Cluster

1. Start ZooKeeper

[hadoop@hadoop001 bin]$ zkServer.sh start
[hadoop@hadoop002 bin]$ zkServer.sh start
[hadoop@hadoop003 bin]$ zkServer.sh start

2. Start Hadoop

[hadoop@hadoop001 sbin]$ start-dfs.sh
[hadoop@hadoop001 sbin]$ start-yarn.sh
[hadoop@hadoop002 sbin]$ yarn-daemon.sh start resourcemanager
[hadoop@hadoop001 ~]$ mr-jobhistory-daemon.sh start historyserver
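The order matters in both directions: ZooKeeper comes up first and goes down last. The sketch below writes the two sequences down as data so they can be reviewed or fed to a loop. cluster_plan is a hypothetical helper that only prints the plan; it does not run anything on the cluster.

```shell
# cluster_plan start|stop (hypothetical helper)
# Prints "host: command" lines in the order used in sections 7 and 8.
cluster_plan() {
    case "$1" in
        start)
            for h in hadoop001 hadoop002 hadoop003; do echo "$h: zkServer.sh start"; done
            echo "hadoop001: start-dfs.sh"
            echo "hadoop001: start-yarn.sh"
            echo "hadoop002: yarn-daemon.sh start resourcemanager"
            echo "hadoop001: mr-jobhistory-daemon.sh start historyserver"
            ;;
        stop)
            echo "hadoop001: mr-jobhistory-daemon.sh stop historyserver"
            echo "hadoop001: stop-yarn.sh"
            echo "hadoop002: yarn-daemon.sh stop resourcemanager"
            echo "hadoop001: stop-dfs.sh"
            for h in hadoop001 hadoop002 hadoop003; do echo "$h: zkServer.sh stop"; done
            ;;
        *)
            echo "usage: cluster_plan start|stop" >&2
            return 1
            ;;
    esac
}

cluster_plan start
```

One possible next step is to split each line at the colon and pass the command to ssh -n on the named host. That relies on the passwordless SSH from section 2.4 and on the remote shell loading ~/.bash_profile.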