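The goal is to build a cluster from three servers: hadoop103, hadoop104, and hadoop105. First configure hadoop102, then sync its contents to those three servers.
Environment preparation
1. Edit hadoop-env.sh
Configure the JDK (set JAVA_HOME)
2. Edit core-site.xml
Configure the NameNode address and the directory where Hadoop stores the files it generates at runtime
<!-- Address of the HDFS NameNode -->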
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop102:9000</value>
</property>
<!-- Directory where data generated at runtime is stored -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/module/hadoop-2.7.2/data/tmp</value>
</property>
3. Edit hdfs-site.xml
Set the replication factor. The default is 3; use 1 for pseudo-distributed mode.
<!-- HDFS replication factor -->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
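Starting hadoop102:
1. Format the NameNode (only before the first start; do not reformat on later starts)
bin/hdfs namenode -format
2. Start the NameNode
sbin/hadoop-daemon.sh start namenode
Run jps to list the Java processes and confirm that the NameNode started
3. Start the DataNode
sbin/hadoop-daemon.sh start datanode
The logs are written to: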
/opt/module/hadoop-2.7.2/logs
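The warning in step 1 matters because formatting generates a new cluster ID, and DataNodes that still hold data from the old cluster ID will refuse to register with the NameNode. The following is a minimal sketch of a guard that formats only when no NameNode metadata exists yet. It assumes dfs.namenode.name.dir has been left at its default of ${hadoop.tmp.dir}/dfs/name and that it is run from the Hadoop installation directory; the function names and the NAME_DIR variable are made up for this illustration.

```shell
# Default NameNode metadata directory (assumption: dfs.namenode.name.dir is not
# overridden, so it resolves to ${hadoop.tmp.dir}/dfs/name from core-site.xml).
NAME_DIR="${NAME_DIR:-/opt/module/hadoop-2.7.2/data/tmp/dfs/name}"

# Succeed if the directory already holds a formatted NameNode image.
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Format the NameNode only when no existing metadata is found.
format_once() {
    if namenode_is_formatted "$NAME_DIR"; then
        echo "NameNode already formatted at $NAME_DIR; skipping format"
    else
        bin/hdfs namenode -format
    fi
}
```

Calling format_once in place of the raw format command makes the first-start step safe to repeat.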
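scp: copying files between servers
Push to another server:
[root@hadoop102 opt]# scp -r /opt/module root@hadoop105:/opt
Pull from another server:
[root@hadoop103 opt]# scp root@hadoop102:/etc/profile /etc/profile
ssh: passwordless login between servers
Generate the public/private key pair
[atguigu@hadoop102 .ssh]$ pwd
/home/atguigu/.ssh
[atguigu@hadoop102 .ssh]$ ssh-keygen -t rsa
Press Enter three times to accept the defaults and generate the keys
Set up passwordless login from hadoop103 to hadoop104: [atguigu@hadoop103 .ssh]$ ssh-copy-id hadoop104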
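The cluster start scripts use ssh to launch daemons on every host listed in the slaves file, including the local one, so the node that runs them needs passwordless access to all cluster hosts rather than to a single peer. A minimal sketch of repeating ssh-copy-id for each host follows. The host list is taken from the cluster plan, and the CLUSTER_HOSTS and COPY_ID_CMD variables and the function name are hypothetical names chosen for this example.

```shell
# Hosts from the cluster plan (assumption; adjust to your machines).
CLUSTER_HOSTS="${CLUSTER_HOSTS:-hadoop103 hadoop104 hadoop105}"
# Command that installs the public key; set to "echo ssh-copy-id" for a dry run.
COPY_ID_CMD="${COPY_ID_CMD:-ssh-copy-id}"

# Install the current user's public key on every cluster host, stopping on the first failure.
copy_key_to_cluster() {
    local host
    for host in $CLUSTER_HOSTS; do
        $COPY_ID_CMD "$host" || return 1
    done
}
```

Run copy_key_to_cluster as atguigu on each node that will start cluster services, after ssh-keygen has been run on that node.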
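rsync (remote synchronization tool): syncing files between servers
Writing the xsync script:
cd /usr/local/bin
[root@hadoop103 bin]# touch xsync    # create the script file
[root@hadoop103 bin]# chmod 777 xsync    # set permissions: read, write, and execute for all users
[root@hadoop103 bin]# chown atguigu:atguigu xsync    # change the file's owner and group
[root@hadoop103 bin]# vi xsync    # edit the script
Write the script to suit your needs, so that content on one machine is synced to every machine. This completes the environment setup.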
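The source leaves the script body to the reader. One possible shape, shown here only as a sketch, copies a file or directory to the same absolute path on every host with rsync (-r recursive, -v verbose, -l keep symlinks as symlinks). The host list and the XSYNC_HOSTS and XSYNC_DRY_RUN variables are assumptions made for this example, not part of the original script.

```shell
#!/bin/bash
# Target hosts (assumption taken from the cluster plan; adjust as needed).
XSYNC_HOSTS="${XSYNC_HOSTS:-hadoop103 hadoop104 hadoop105}"

# Copy one file or directory to the same absolute path on every host.
# With XSYNC_DRY_RUN set, only print the rsync commands.
xsync() {
    if [ $# -lt 1 ]; then
        echo "usage: xsync <file-or-directory>" >&2
        return 1
    fi
    local name dir user host
    name=$(basename "$1")
    # Resolve the parent directory to an absolute physical path.
    dir=$(cd -P "$(dirname "$1")" && pwd) || return 1
    user=$(whoami)
    for host in $XSYNC_HOSTS; do
        if [ -n "$XSYNC_DRY_RUN" ]; then
            echo "rsync -rvl $dir/$name $user@$host:$dir"
        else
            rsync -rvl "$dir/$name" "$user@$host:$dir"
        fi
    done
}
```

To use this as /usr/local/bin/xsync, add a final line xsync "$@" so the script calls the function with its command-line arguments. Running it with XSYNC_DRY_RUN=1 first shows what would be copied and where.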
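Fully distributed cluster plan
     | hadoop103          | hadoop104                    | hadoop105                   |
HDFS | NameNode, DataNode | DataNode                     | SecondaryNameNode, DataNode |
YARN | NodeManager        | ResourceManager, NodeManager | NodeManager                 |
The NameNode is critical, so it gets a server to itself among the master daemons. The SecondaryNameNode assists the NameNode.
Every node that runs a DataNode also runs a NodeManager, so the two counts always match.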
Configuring the cluster:
HDFS:
- core-site.xml (change the NameNode host to hadoop103)
- hadoop-env.sh (JDK)
- hdfs-site.xml (HDFS replication factor 3; SecondaryNameNode host: hadoop105)
- slaves (DataNode hosts: every machine)
YARN:
- yarn-env.sh
- yarn-site.xml (how reducers fetch data; ResourceManager address)
MapReduce:
- mapred-env.sh
- mapred-site.xml (run MapReduce on YARN)
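Of the files listed above, slaves is the simplest: one hostname per line, which the start scripts read to decide where to launch DataNodes and NodeManagers. A small sketch of generating it follows; the write_slaves function name is hypothetical, and the configuration directory in the usage line assumes the installation path used elsewhere in this guide.

```shell
# Write one hostname per line to <conf_dir>/slaves, replacing any existing file.
write_slaves() {
    local conf_dir="$1"
    shift
    printf '%s\n' "$@" > "$conf_dir/slaves"
}

# Example for the planned cluster:
# write_slaves /opt/module/hadoop-2.7.2/etc/hadoop hadoop103 hadoop104 hadoop105
```

Generating the file this way avoids stray spaces and blank lines, which are easy to introduce when editing by hand.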
Distribute all of the files above across the cluster
[atguigu@hadoop103 etc]$ xsync hadoop
Remove the data and logs that were copied over from hadoop102
[atguigu@hadoop103 hadoop-2.7.2]$ rm -rf data/ logs/
[atguigu@hadoop104 hadoop-2.7.2]$ rm -rf data/ logs/
[atguigu@hadoop105 hadoop-2.7.2]$ rm -rf data/ logs/
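After deleting the data, the NameNode must be formatted before the cluster is started for the first time
Format on the node that hosts the NameNode: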
[atguigu@hadoop103 hadoop-2.7.2]$ bin/hdfs namenode -format
Starting the cluster services
1. Start HDFS from the node that hosts the NameNode
[atguigu@hadoop103 sbin]$ start-dfs.sh
Output:
Starting namenodes on [hadoop103]
hadoop103: starting namenode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-atguigu-namenode-hadoop103.out
hadoop105: starting datanode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-atguigu-datanode-hadoop105.out
hadoop104: starting datanode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-atguigu-datanode-hadoop104.out
hadoop103: starting datanode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-atguigu-datanode-hadoop103.out
Starting secondary namenodes [hadoop105]
hadoop105: starting secondarynamenode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-atguigu-secondarynamenode-hadoop105.out
hadoop103 runs NameNode and DataNode
[atguigu@hadoop103 sbin]$ jps
1640 NameNode
1736 DataNode
1962 Jps
hadoop104 runs only DataNode
[atguigu@hadoop104 hadoop-2.7.2]$ jps
1495 DataNode
1566 Jps
hadoop105 runs DataNode and SecondaryNameNode
[atguigu@hadoop105 hadoop-2.7.2]$ jps
1495 DataNode
1639 Jps
1552 SecondaryNameNode
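All three match the cluster plan table above, so HDFS started successfully
2. Start YARN from the node that hosts the ResourceManager. start-yarn.sh starts the ResourceManager on the machine where it is run, so run it on hadoop104; it then starts a NodeManager on every node.
[atguigu@hadoop104 hadoop-2.7.2]$ sbin/start-yarn.sh
Output: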
starting yarn daemons
starting resourcemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-atguigu-resourcemanager-hadoop104.out
hadoop105: starting nodemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-atguigu-nodemanager-hadoop105.out
hadoop103: starting nodemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-atguigu-nodemanager-hadoop103.out
hadoop104: starting nodemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-atguigu-nodemanager-hadoop104.out
Running jps again on each node shows processes that match the cluster plan table exactly. The cluster has started successfully.
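Comparing jps output against the plan by eye is easy to get wrong, partly because the name SecondaryNameNode contains NameNode, so a plain substring search can report a NameNode that is not there. The sketch below matches the process-name column exactly. Both function names are hypothetical, and check_daemons assumes jps is on the PATH.

```shell
# Read jps output on stdin; succeed only if a process with exactly this name is listed.
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Report each expected daemon on this host as OK or MISSING.
check_daemons() {
    local listing daemon status=0
    listing=$(jps)
    for daemon in "$@"; do
        if printf '%s\n' "$listing" | has_daemon "$daemon"; then
            echo "OK      $daemon"
        else
            echo "MISSING $daemon"
            status=1
        fi
    done
    return $status
}
```

On hadoop103, for example, check_daemons NameNode DataNode NodeManager would cover the roles the plan assigns to that host.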
Checking the cluster:
Testing the cluster:
Create a directory: [atguigu@hadoop103 hadoop-2.7.2]$ hadoop fs -mkdir -p /user/atguigu/input
List the created directories: [atguigu@hadoop103 hadoop-2.7.2]$ hadoop fs -lsr /
lsr: DEPRECATED: Please use 'ls -R' instead.
drwxr-xr-x - atguigu supergroup 0 2018-09-05 06:20 /user
drwxr-xr-x - atguigu supergroup 0 2018-09-05 06:20 /user/atguigu
drwxr-xr-x - atguigu supergroup 0 2018-09-05 06:20 /user/atguigu/input