Hadoop Pseudo-Distributed Mode Deployment
Hadoop 2.x:
Official site: hadoop.apache.org
Three components:
HDFS: distributed file system (storage)
MapReduce: distributed computation
YARN: resource management (CPU + memory) and job scheduling/monitoring
Documentation:
http://hadoop.apache.org/docs/r2.8.2/
Deployment modes:
1. Standalone mode: a single Java process
2. Pseudo-Distributed Mode: multiple Java processes on one machine; for development/learning
3. Cluster Mode: multiple Java processes across multiple machines; for production
Pseudo-distributed deployment: HDFS
1. Create a user for the hadoop service
# useradd hadoop
# id hadoop
# vi /etc/sudoers
hadoop ALL=(root) NOPASSWD:ALL
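A quick check that the sudoers entry took effect; sudo -n refuses to prompt, so it only succeeds if NOPASSWD is in force:
# su - hadoop -c 'sudo -n true' && echo OK
OK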
2. Deploy Java
Use: Oracle JDK 1.8
Avoid if possible: OpenJDK
2.1 Unpack + set environment variables
2.2 Per the CDH course convention, install under /usr/java
Verify with: which java
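A minimal sketch of steps 2.1/2.2, assuming the JDK 8u45 tarball (the resulting jdk1.8.0_45 directory matches what step 8 configures; adjust the filename to your actual download):
# mkdir -p /usr/java
# tar -zxvf jdk-8u45-linux-x64.tar.gz -C /usr/java   # unpacks to /usr/java/jdk1.8.0_45
# vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_45
export PATH=$JAVA_HOME/bin:$PATH
# source /etc/profile
# which java
/usr/java/jdk1.8.0_45/bin/java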
3. Deploy SSH and make sure it is running
Check (it is installed by default):
service sshd status
4. Unpack Hadoop
Unpack:
software]# tar -zxvf hadoop-2.8.1.tar.gz
Change ownership:
chown -R root:root hadoop-2.8.1
Configure environment variables:
vi /etc/profile
export HADOOP_HOME=/opt/software/hadoop-2.8.1
export PATH=$HADOOP_HOME/bin:$PROTOC_HOME/bin:$FINDBUGS_HOME/bin:$MAVEN_HOME/bin:$JAVA_HOME/bin:$PATH
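Reload the profile so the new variables take effect in the current shell:
# source /etc/profile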
# which hadoop
/opt/software/hadoop-2.8.1/bin/hadoop
Symlink:
software]# ln -s /opt/software/hadoop-2.8.1 hadoop
Change owner and group:
software]# chown -R hadoop:hadoop hadoop
changes the symlink itself
software]# chown -R hadoop:hadoop hadoop/*
changes the contents behind the symlinked directory
software]# chown -R hadoop:hadoop hadoop-2.8.1
changes the original directory
chown -R hadoop:hadoop <directory>: changes the directory and everything inside it
chown -R hadoop:hadoop <symlinked directory>: changes the symlink itself, not the contents behind it
chown -R hadoop:hadoop <symlinked directory>/*: leaves the symlink unchanged and only changes the contents behind it
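To confirm which ownership actually changed, plain ls is enough (ls -ld shows the entry itself; a trailing slash follows the symlink):
software]# ls -ld hadoop          # owner of the symlink itself
software]# ls -ld hadoop-2.8.1    # owner of the real directory
software]# ls -l hadoop/          # owners of the contents behind the symlink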
software]# cd hadoop-2.8.1
# Delete the .cmd (Windows) scripts to avoid confusion
hadoop-2.8.1]# rm -f bin/*.cmd
hadoop-2.8.1]# rm -f sbin/*.cmd
bin: shell scripts for running commands
etc: configuration files
lib: libraries
sbin: scripts to start and stop Hadoop components (daemons)
share/hadoop/hdfs: jar files
5. Configure Hadoop's module components
hadoop-env.sh: Hadoop environment settings
core-site.xml: Hadoop core configuration file
hdfs-site.xml: configuration for the HDFS daemons
[mapred-site.xml: configuration needed for MapReduce computation] only comes into play when running jar-based jobs
yarn-site.xml: configuration for the YARN daemons
slaves: hostnames of the cluster machines
Configure the core-site.xml file:
etc/hadoop/core-site.xml:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
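To sanity-check that both files are being picked up, hdfs getconf prints the effective value of a configuration key:
hadoop-2.8.1]# bin/hdfs getconf -confKey fs.defaultFS
hdfs://localhost:9000
hadoop-2.8.1]# bin/hdfs getconf -confKey dfs.replication
1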
6. Configure passwordless SSH trust for the root user
~]# rm -rf .ssh
~]# ssh-keygen
~]# cd .ssh
.ssh]# cat id_rsa.pub >> authorized_keys    # append your own public key to the trusted list
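If the login test below still prompts for a password, the usual culprit is permissions; sshd's StrictModes rejects key files that are group/world accessible:
~]# chmod 700 ~/.ssh
~]# chmod 600 ~/.ssh/authorized_keys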
# First run
~]# ssh localhost date
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is ec:85:86:32:22:94:d1:a9:f2:0b:c5:12:3f:ba:e2:61.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Sun May 13 21:49:14 CST 2018
# Second run
~]# ssh localhost date
Sun May 13 21:49:17 CST 2018
7. Format the file system
$ bin/hdfs namenode -format
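The format writes NameNode metadata under hadoop.tmp.dir (default /tmp/hadoop-${user.name} unless overridden in core-site.xml); on success the output ends with a line like:
INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.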
8. Configure JAVA_HOME in hadoop-env.sh
hadoop]# vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_45
9.Start NameNode daemon and DataNode daemon:
hadoop-2.8.1]# sbin/start-dfs.sh    # run the startup shell script
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-root-namenode-hadoop000.out
localhost: starting datanode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-root-datanode-hadoop000.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
RSA key fingerprint is ec:85:86:32:22:94:d1:a9:f2:0b:c5:12:3f:ba:e2:61.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-root-secondarynamenode-hadoop000.out
/]# jps
9861 DataNode
9768 NameNode
10056 SecondaryNameNode
10492 Jps
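With the three daemons up, two quick checks (the port and commands are Hadoop 2.x defaults; /user/root is just an example path):
Browse the NameNode web UI: http://localhost:50070/
/]# hdfs dfs -mkdir -p /user/root
/]# hdfs dfs -put /etc/hosts /user/root
/]# hdfs dfs -ls /user/root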